Love when people perform on such a high level where the lay person can understand. Great Job
YOU ARE THE GREATEST PERSON ALIVE!!! I HAVE BEEN SEARCHING FOR HELP ON DUMMY VARIABLES FOR WHAT TODAY WOULD BE THE FOURTH DAY FOR MY PROJECT AND EVEN MY PROFESSOR WAS OF NO HELP! i really appreciate this video...
I'm glad you found it helpful. ;)
Great many thanks Dr Delaney! It would be nice if you discuss two more related issues. 1) Explanation of the coefficients in a regression w/o the intercept term. 2) If we define dummies differently then how do we interpret the coefficients? For example, consider the regression y= a1*D1+a2*D2+a3*D3+a4*D4+u where Ds are dummies for season but defined differently- value of D1 is 1 for all observations, value of D2 is 1 for all observations except Spring, value of D3 is 1 for all observations except Spring and Summer, and value of D4 is 1 only for observations in Winter. Thanks again for allowing questions and discussions.
I'm into 7:02 and I've been nodding since the video began...thank you man! :)
THANK YOU SO MUCH. I've been so confused with how to do this for ages and now I finally understand it! I couldn't be more grateful.
Thank you very much for this tutorial - so intuitive, and guides us directly to what's important. My question is: do you know a similarly intuitive way to run a regression with a dummy dependent variable? I'm trying to analyze survey responses, much of which is discrete data. Thank you!
Thank you! This video was great, it is explained in a way that makes a lot of sense! The book for my business analytics class made it way more complicated!
You have no idea how much this video helped me to do my thesis work! Thank you so much!
This is the best explanation that I've found so far. Thank you so much!
Thank you very much for the video. It saves my dissertation.
Gayathri Ravichandran It's not letting me reply directly to your comment. But the answer is yes, you should be able to check which independent variable contributes more.
One is to do a series of 8 separate regressions with 1 independent variable in each, and check the R^2. The other is to do 8 separate regressions with all but 1 in each, and check the R^2.
Finally, you can do the full regression and just see which has the largest coefficient (in magnitude)...this runs into the problem of different scales, so you may want to measure your variables in #'s of standard deviations from the mean value of that variable.
Caveat: all of this assumes you have enough observations to run all these tests without running into overfitting problems. To safely run 8 regressions here (or 17, maybe), you'll want to make sure that you have at least 17*8*15 = 2040 observations.
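A rough sketch of the first and third suggestions above (one-variable R^2 comparisons, and rescaling each variable to standard deviations from its mean). The data and the names x1, x2 are made up for illustration; with eight predictors you would add six more entries.

```python
from statistics import mean, pstdev

def r_squared_single(x, y):
    """R^2 from regressing y on one predictor x (equals the squared correlation)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def standardize(x):
    """Express x in standard deviations from its own mean."""
    m, s = mean(x), pstdev(x)
    return [(a - m) / s for a in x]

# Hypothetical data: two predictors measured on very different scales.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [10, 30, 20, 50, 40, 60],
}
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

# One regression per predictor, ranked by R^2.
r2 = {name: r_squared_single(x, y) for name, x in predictors.items()}
ranking = sorted(r2, key=r2.get, reverse=True)

# Standardized copies, ready for a full regression with comparable coefficients.
z = {name: standardize(x) for name, x in predictors.items()}

for name in ranking:
    print(name, round(r2[name], 3))
```

Rescaling a single predictor does not change its one-variable R^2; the standardized copies only matter when comparing coefficient magnitudes in the full regression.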
Hi Jason,
I watched your YouTube video about using dummy variables with the regression tool in Excel. I studied math in college so I was really excited about it.
I’m trying to use it to forecast sales and I set it up where I had my Y values as previous sales and my X values as weeks 1 to 52, where it would be a 1 if it matched the sales week and 0 otherwise. I also included holidays like Easter x week, 4th of July x week etc.
It gave me an error that I can only have 16 columns used in the X values, so I tried it with just 16 weeks and the p values were really big. I’m wondering if you know of another way I can do this to include the seasonality from the weeks and the impact of the holidays.
Thanks so much!
Great explanation for new users and to refresh. Much appreciated!
I was working on my thesis and these materials were precisely helpful. Thanks!
Hi Fang, it's definitely a good idea to run it, and then you can use an F test for a subset of variables to see which model is better. If you search YouTube for "F test for subset", you'll see the video that outlines the process.
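For reference, a minimal sketch of an F test for a subset of variables, assuming NumPy is available. The data are simulated and the names x1, x2, x3 are hypothetical; this is not the dataset from the video.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

# Simulated data (an assumption for illustration): only x1 truly matters.
rng = np.random.default_rng(1)
n = 100
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

ones = np.ones(n)
X_full = np.column_stack([ones, x1, x2, x3])  # unrestricted model
X_small = np.column_stack([ones, x1])         # restricted model drops x2 and x3

ssr_full = ssr(X_full, y)
ssr_small = ssr(X_small, y)
q = 2             # number of coefficients restricted to zero
df_resid = n - 4  # observations minus parameters in the full model

# F = [(SSR_restricted - SSR_full) / q] / [SSR_full / df_resid]
f_stat = ((ssr_small - ssr_full) / q) / (ssr_full / df_resid)
print(f"F({q}, {df_resid}) = {f_stat:.3f}")
```

A large F statistic favors keeping the dropped variables. If SciPy is installed, `scipy.stats.f.sf(f_stat, q, df_resid)` gives the p-value.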
Thx! You saved me a lot of time to re-take the course of statistics
I did not understand this at all, but within the first ten minutes it made so much sense. Makes me want to go play around with an actual data set. Curious if there's anyone with videos on how to do this in R?
So useful. May need to watch more than once to master, but it's worth it.
Great breakdown of multiple regression and helped me greatly with educated forecasting.
Your explanation is awesome.
It helps me understand interaction a lot
Thank You!
Thanks for sharing this brilliant video online. If I want to calculate the coefficient for Firefox as an independent variable, which browser should be excluded as the dummy variable? Many thanks
Hi Dr D.
Is it possible to run a multiple regression if I have all categorical variables (both my independent variables and my dependent variable are categorical, two-level variables)?
Thank you so much for this video! I was having quite a bit of trouble grasping this concept but now I get it! Great explanation!
Wow, thank you so much!! I learned so much!
Hi Dr. Delaney, thanks for the video! Would these rules apply for moderation? For instance, if the predictor had many dummy variables, the outcome didn't, and the moderator didn't, would it work the same way? Thank you!
thank you so much!! best tutorial I've seen by far
I have a question about general (non-linear) multiple regression. I understand general MR just needs to change x and y into some functions. But my question is: do I need to change the cross dummy into the same functions as well? Take your data as an example, if I use 1/educ as the new x, for the educ*fem dummy, should it be 1/educ*1 or still educ*1? Thanks.
Thank you very much! It's really helpful! But I wonder if we can get the coefficients without a dependent variable and only with two independent dummies in the equation. And how do we apply constraints on the equation? For example, we want to examine how much of y results from factor b and how much from factor c. We have a time series of y and the equation Y = a + b1*d1 + b2*d2 + ... + b50*d50 + c1*e1 + c2*e2 + ... + c34*e34, where d and e are the dummy variables. The condition is that the weighted sum of b1~b50 is 0 and the weighted sum of c1~c34 is 0. In this case, how can we get the series of b1~b50 and c1~c34?
Dr. D,
How can I create a dummy variable model using 1 and -1? For example, I want to run a bunch of observations to estimate how the market feels about each NFL team coming into every season. I want to use the Vegas point spread as the expected value of "y". I want to assign 1 to the home team and -1 to the away team: a bonus for home, a penalty for away. I'm going to essentially run a bunch of "fake" observations with this model to figure out a rough point differential score for each team prior to the beginning of the first game. Can you help me?
In a regression model given as,
logpgp95_i = γ0 + γ1 avexpr_i + γ2 lat_abst_i + γ3 africa + γ4 asia + γ5 other + ν_i
where logpgp95 is log GDP per capita of country i in 1995, africa = 1 if country i is in Africa, asia = 1 if country i is in Asia, and other = 1 if country i is not in Asia, Africa, or the Americas.
The regression coefficient for dummy africa is -0.9163864. How to interpret this coefficient? If I interpret "As other factors being equal, African countries have 91.6% less GDP per capita than non-African countries", is it the right interpretation?
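A quick arithmetic check on that interpretation, assuming the dependent variable is the natural log of GDP per capita (suggested by the name logpgp95, but an assumption here). With a log outcome, a dummy coefficient β corresponds to a percent difference of 100*(exp(β) - 1), and reading β directly as a percentage is only a good approximation when β is small.

```python
import math

beta_africa = -0.9163864  # coefficient quoted in the question

# Exact percent difference implied by a dummy coefficient in a log-outcome model.
pct_diff = (math.exp(beta_africa) - 1) * 100

# Naive reading that treats the coefficient itself as a percentage.
naive_pct = beta_africa * 100

print(f"exact: {pct_diff:.1f}%   naive: {naive_pct:.1f}%")
```

Under that assumption the gap is roughly 60% rather than 91.6%. The comparison is also against the omitted category (the Americas, given how the dummies are defined), not against all non-African countries.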
You can use an MLE method to estimate it directly, or nonlinear least squares (Stata has an "nl" command for just such a purpose), but for Cobb-Douglas, I'm not sure why you'd want to. If you have Q = A * K^a * L^b and you take logs, you get ln(Q) = ln(A) + a*ln(K) + b*ln(L), and you can just regress that in a straightforward fashion and get estimates for your production shares...unless you know the error distribution is wrong...but the ease of this is the whole point of using Cobb-Douglas.
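A minimal sketch of the log-linear approach, assuming NumPy is available. The data are simulated with made-up values A = 2, a = 0.3, b = 0.7 and a multiplicative error, which is the case where taking logs gives a well-behaved linear regression.

```python
import numpy as np

# Simulated inputs and output (all values here are assumptions for illustration).
rng = np.random.default_rng(0)
n = 200
K = rng.uniform(1, 10, n)   # capital
L = rng.uniform(1, 10, n)   # labor
A_true, a_true, b_true = 2.0, 0.3, 0.7
Q = A_true * K**a_true * L**b_true * np.exp(rng.normal(0, 0.05, n))

# ln(Q) = ln(A) + a*ln(K) + b*ln(L): an ordinary linear regression in logs.
X = np.column_stack([np.ones(n), np.log(K), np.log(L)])
coef, *_ = np.linalg.lstsq(X, np.log(Q), rcond=None)
ln_A_hat, a_hat, b_hat = coef

print(f"A ~ {np.exp(ln_A_hat):.3f}, a ~ {a_hat:.3f}, b ~ {b_hat:.3f}")
```

If the error were additive rather than multiplicative, the logged regression would be misspecified, which is the situation where the nonlinear least squares route becomes worth the trouble.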
Hi Dr. D. For your example you explained interactions with a quantitative and a dummy variable, so what I understand is that the reference (Firefox or Male) is always omitted. Does this apply to interactions of 2 dummy variables? For instance, I would like to investigate if there is an interaction between Gender and Browser, so for my interactions, will Firefox and male be omitted?
Regards,
Fang
Dummy variables are just a way to account for every possible combination, to allow for a full complement of different intercepts, for example. In the case you mention, Gender (G) and Browser (B), if you want full interactions, you can see that you could have:
G B = 0 0 (Male, Firefox)
G B = 0 1 (Male, Chrome)
G B = 1 0 (Female, Firefox)
G B = 1 1 (Female, Chrome)
If you had a third browser, say IE, you'd need to add another dummy just because there are more than 4 combinations, and 2 binary variables can only give you 4 combinations. Let's say we had B1 = 1 if Chrome, B2 = 1 if IE:
G B1 B2 = 0 0 0 (Male, Firefox)
G B1 B2 = 0 1 0 (Male, Chrome)
G B1 B2 = 0 0 1 (Male, IE)
G B1 B2 = 1 0 0 (Female, Firefox)
G B1 B2 = 1 1 0 (Female, Chrome)
G B1 B2 = 1 0 1 (Female, IE)
You can see that we never use 011 or 111, because that would imply Chrome AND IE, which are mutually exclusive by assumption. In principle, though, you should let your intuition help you--you just want a different intercept (or slope term depending on your application) for each case.
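The coding scheme above can be sketched as follows. The function names are hypothetical, and the two product columns G*B1 and G*B2 are one common way to get "full interactions", so that each of the six gender/browser cells ends up with its own intercept.

```python
def encode(gender, browser):
    """Return (G, B1, B2) with Male and Firefox as the omitted reference levels."""
    g = 1 if gender == "Female" else 0
    b1 = 1 if browser == "Chrome" else 0
    b2 = 1 if browser == "IE" else 0
    return g, b1, b2

def with_interactions(gender, browser):
    """Append the gender-by-browser products G*B1 and G*B2 to the main dummies."""
    g, b1, b2 = encode(gender, browser)
    return (g, b1, b2, g * b1, g * b2)

cells = [(g, b) for g in ("Male", "Female") for b in ("Firefox", "Chrome", "IE")]
codes = {cell: with_interactions(*cell) for cell in cells}

for cell, row in codes.items():
    print(cell, row)
```

Six distinct rows come out, one per cell, and the all-zero row is the reference group (Male, Firefox) whose intercept is the regression constant.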
Thanks a lot Dr D. Very insightful
Hi Jason I have a problem in hand, i exactly do not know the function of a model, but using the dataset i have i must find out the function. i have three inputs in hand and i have an output, i must find relationship between these input variables and find the output. could i have a short guidance over this.
Excellent video. Very helpful.
So when do we actually interact our variables? Is there a way to see if it is necessary or do we just do it and then see if the coefficient on the interaction term is statistically significant?
Thank you so much for this video. I have not seen anything else on the web that concisely explains the underlying math, concept, and real world how to.
Is it possible to do this type of analysis with grouped data? How would you 'weigh' the groups?
Brilliant video - explained really well !! You mentioned in passing that another one of your series explained some of the theory behind dummy variables. I'm interested in how contrasts can be specified, say whether there is a significant difference between each of the browsers with each other and not just with reference to Firefox as per your example? Thanks again
Thank you mate! Really helpful video
Thank you very much. I think that we are able to center only quantitative variables and not dummy variables. May I ask if you have other videos about ridge regression or partial least squares regression?
Sir,
In the case of browser, if we introduce a fourth dummy variable for Firefox (which is against the theory), then what difference will it make?
Hi Dr D, I am wondering whether I can look at the interaction between 2 dummy variables? Thanks.
Hello, there are 3 separate dummy variable columns for Internet Explorer, Safari, Chrome...
Is there any way to put these 3 in a single column with discrete values like 0, 1, 2...? Please help me with this.
Hi there, when I have a dummy variable, a continuous variable and interaction term, does the coefficient of the dummy variable still indicate the results of when it equals 1 (regardless of the continuous variable) unlike the coefficient for the continuous variable, which only represents the values for the continuous variable when dummy =0?
Thanks a lot for this video! Very clear and explicit. Great job. :)
Big thanks from Switzerland!
can your dependent variable be categorical? for example if my hypothesis is that males are more likely to use chrome than females. (relationship between gender and browser) both coded categorical variables.
Thank you very much for this video .. Very clear
Thanks for the informative video. It really helped me. I have some questions. I want to fit a quadratic model with one categorical and one continuous variable, including an interaction term and a squared term, but Minitab did not accept the squared term of the categorical variable. Can you please explain why that is? My second question: I want to know the theory behind model fitting with categorical variables, along with the procedure to estimate the regression coefficients. Where can I find the material? Thanks in advance
Sir, I'm having one dependent variable and eight independent variables. can i use regression to see which one of the independent variable contribute more to the dependent variable?
ridiculously helpful video, thank you
Brilliant!
Thanks for the tutorial
Please, I want to do regression analysis between waiting time in a restaurant and profit made, to find out if an automated system can reduce the waiting time. What data do I need to collect?
You would need: Waiting time and whether the associated waiting time was using the automated system. You don't even need to use regression if it's just System A v. System B. You can make fewer assumptions and use a 2-sample t-test, or MANY fewer assumptions and use something like a Mann-Whitney (Wilcoxon) test if all you care about is the average, or a two-sample Kolmogorov-Smirnov test if you want the full distributional test.
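A small sketch of the two-sample comparison described above, using only the Python standard library. The waiting times are invented, and the Welch (unequal-variance) form of the t statistic is an assumption; the reply does not say which variant it has in mind.

```python
import math
from statistics import mean, variance

# Hypothetical waiting times in minutes under each system.
manual = [12.0, 15.5, 11.0, 14.0, 13.5, 16.0]
automated = [9.0, 10.5, 8.0, 11.0, 9.5, 10.0]

def welch_t(a, b):
    """Welch two-sample t statistic and its approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

diff = mean(manual) - mean(automated)
t_stat, df = welch_t(manual, automated)
print(f"mean difference = {diff:.2f} min, t = {t_stat:.2f}, df = {df:.1f}")
```

The difference in means is the same number a regression of waiting time on a 0/1 "automated" dummy would report as the dummy's coefficient (with the opposite sign here). If SciPy is installed, `scipy.stats.ttest_ind`, `scipy.stats.mannwhitneyu`, and `scipy.stats.ks_2samp` cover the three tests named in the reply and also return p-values.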
Thanks for your wonderful explanation !!!
Shouldn't the regression equation include the original educ AND browser variables when testing interactions?
Hi Katie,
For qualitative variables, you want to use a dummy variable. I have several videos on the topic. I hope that helps!
Best regards,
Dr. D.
At 26:55, how did you insert the columns so quickly? What is the shortcut for that? Thx!
Hi, it's very helpful :) I want to do a regression analysis of home prices and whether they are affected by the bank interest rate. I also have some other variables which will be included, such as population and wages, but I want to check the relation between interest and prices. How can I do that? Thanks a lot
Thanks! I'm glad it was helpful!
Thanks a lot Dr Delaney, really helpful!
Hi Jason, here years of education is an independent variable right? and if that is the case, then how can we put it in the X range while doing the regression?
Independent variables all go on the right hand side (i.e. are x's). Dependent variables go on the left (i.e. are y's). If you're concerned about endogeneity (probably not a huge issue in this application), you would want to take a different modeling approach.
Because I’m in a Managerial Decision Making class and we have some problems to solve. I need some help! It’s a combo of statistics and business calc.
Excellent video, thanks!
Can interactions be between two dummy variables, for example "female" and "has mobile", and how would we write the equation?
Could you tell me how to find the correlation between 1500 categorical variables after dummy encoding?
A really great tool! Thank you!
It's very good video .Thanks for help .
Thank you very much. It has been very helpful
So will this work if x1 is squared, or if we take e to the power of a constant times x1, i.e. e^(A*x1)?
Hi, I am interested in learning how to graph a linear regression for 3 variables, as in: is weight a function of height and thickness?
Great video !
Thanks. The video has been very helpful!!
Thanks for the video! My question is - for the later variables such as Male Female, if you are analyzing just gender, why do you still include the previous variables in the regression table? Does that make a difference? I think you said "holding all else constant"?
Jason, please share the data set used in the video if possible
Thank you so much!
Good video.
You should have posted the data set so we could follow along.
Thanks.
Hi Dr D,
This is a great video! Can I ask, for the last example of everything, if we found that some variables are statistically significant, and others are not, is it a good idea to run another regression analysis of only those significant variables?
Kind Regards,
Fang
After putting the interactions there, I found that one of the main effects became not significant (which previously was significant). How do we interpret this? Thanks in advance.
Hey Jason,
This is excellent and really helpful. Thanks. Moreover, I'd like to ask for more. Could you please do a video on exponential regression with multiple variables? E.g., the Cobb-Douglas function. I am aware you could do a log-linear model, but is there a way of doing this directly?
life savior!!!
Hi Jason,
First of all thank you for your great video.
I have a question as to why we need an omitted variable? In your video, you didn't develop a dummy variable for Firefox. May I ask why?
I need an answer to your question too. How would I know the effect of FireFox?
Hi Dr Delaney,
Can we have interaction like this for example, Educ x IE X FEM which means Education across Internet explorer browser and across female? Thanks.
Yes, and then you need to compare the value of that estimated coefficient to that of the particular comparison group you care about.
Thank-you for this video, it really helped me in my project. I have a question though: how would you do this analysis if y were qualitative (i.e., y is either yes or no?)
How did you do that 10:05? Filling them down so quickly?
Suppose your cursor is at E3. Now, do this step by step: 1) Ctrl+Down arrow to get to the last row of data; 2) press Right arrow once to move to the F column; 3) Shift+Ctrl+Up arrow to select all the cells above, up to cell F2, which has the formula you want to fill down; 4) Ctrl+D. All of this takes very little time once you are practiced.
hey Jason where can I get your data from or create automatically data like that?
But can you please tell me what happens if there are two independent (non-categorical) variables and 2 dummy variables? In the above example there is only one independent variable, education, but what happens if there is another independent variable?
This is really helpful thanks a lot. do you have other videos on working with eviews, and stuffs like that. thanks again
Great video it is very helpful!
Would you help me with a regression model interpretation?
Where can I send it to you for your review?
WOW! GREAT video, and you have my respect Jason! You are GOOD! =)
thank you for this video
Thanks for the tutorial, but what about multicollinearity? You have variables in interactions, so maybe the VIF is more than 10.
The short answer is get more data. :D
Yeah, including interaction terms can definitely lead to higher VIFs, but it's generally not something to be concerned about with interactions of dummy variables. If you are concerned, you can recenter the variable, particularly if it's a quantitative variable you're interacting with a dummy. But dealing with collinearity is more concerning with other, ostensibly unrelated variables than with interactions, in which the relationship is explicitly stated.
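To illustrate the recentering point, here is a small sketch with invented education and gender data. It compares how strongly the dummy correlates with the interaction term before and after centering the quantitative variable; the names educ and fem echo the video's example, but the numbers are made up.

```python
from math import sqrt
from statistics import mean

def corr(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

# Hypothetical data: years of education and a female dummy.
educ = [8, 10, 12, 12, 14, 16, 16, 18]
fem = [0, 1, 0, 1, 0, 1, 0, 1]

# Interaction built from the raw variable.
raw_inter = [e * f for e, f in zip(educ, fem)]

# Interaction built after centering educ at its mean.
educ_mean = mean(educ)
cen_inter = [(e - educ_mean) * f for e, f in zip(educ, fem)]

corr_raw = corr(fem, raw_inter)
corr_cen = corr(fem, cen_inter)
print(f"corr(fem, educ*fem) raw = {corr_raw:.3f}, centered = {corr_cen:.3f}")
```

Centering leaves the interaction coefficient and the overall fit unchanged. It changes what the dummy's own coefficient means (the gap at average education rather than at zero years) and pulls the dummy and the interaction column apart, which is what lowers the VIF.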
17.37 is interactions for those who want to know
Hi TheMasterkyle79. Fair enough. I recommend the video on interpreting models which may help clear things up. Good luck and let me know if I can help at all!
Hey! I am a freshman who has just studied regression and SPSS for my dissertation. I am doing questionnaire research with 45 items. I need to combine 2-4 items to get one independent or dependent variable, and, if possible, combine 2-4 independent variables to get a second-order independent variable and find its relationship with another second-order dependent variable. Is that possible to achieve with SPSS?
Thank you that is a great video
Thank you!
Of course!
Hey Jason,
Thank you for this video, it was very helpful! I do have a question though... how would you run a regression model if your dependent variable was also a categorical variable?
Thanks!!
You can do logistic regression
Hi Kirstin,
You would want to use a dummy variable. If you search youtube for "dummy variable" you should find a few videos (some of which are mine). Good luck!
--Dr. D.
Thank you, very helpful!