I came here for a good explanation of the assumptions of multiple regression, and left with statistics wisdom. Plus the long lost Andy Field table which I couldn't for the life of me find in the book. All in all great video
Thank you very much for you kind comments, Ella. I'm glad that you found the video useful.
This was SO helpful - having each assumption discussed individually, but all in the same place, with a clear explanation of what each accomplishes. And the way you have the slides set up (using colored text and boxes) is helpful as well. Thank you so much for posting! I will be viewing many more of your videos.
This was SO helpful wow. I've been trying to find a video that explains assumption testing clearly and yours is spot on. Thanks so much!
Hey, quick question: can I do a multiple regression analysis with one continuous DV and multiple categorical IVs (that have 2 or more categories each)?
Thank you for your kind comments. I really appreciate them.
@@emmasplantz Yes, you can, but you'll need to look at using dummy variables.
@Emma's plants I have a video using dummy variables and qualitative variables that may help. It happens to be in Excel, not SPSS, but if you're familiar with both, you shouldn't have much trouble transferring the ideas to SPSS: ua-cam.com/video/y5WN_lz95DE/v-deo.html
@@weislearners thanks a lot !
Thank you for helping me with my dissertation analysis!
Big help, now I've got to figure out how to run the analysis :D
Thanks for the video, especially for the remedial actions included to avoid assumption violations, and for summing everything up nicely. Keep up the good work!
One of the best videos on multi regression. Thanks so much. Great job!
COULDN'T AGREE MORE!
This is one of the best videos on checking assumptions. Thank you so much!
Thank you very much!
Very clear explanations, well done, and many thanks for the effort.
Thank you very much for your kind words. There is much room for improvement and your comments encourage me to do so, Mo!
You're welcome.
Thank you for your kind words.
Thank you, sir, this video helped me a lot. You explained it very well and your slides are so helpful. Once again, thank you, sir.
You are most welcome! Thank you for your kind comments!
I am confused about how to tell the difference between checking independence and checking linearity. Both times it is said that the points should be scattered without a clear pattern. Am I misunderstanding something here? What is the exact difference when graphically checking those two assumptions?
This is saving my dissertation
I'm glad it helps. Keep grinding! You'll finish!
@@weislearners Thank you Im doing my best :D
Hello, does anyone know how to test for linearity and homoscedasticity in SPSS when you have a binary independent variable?
Thanks for this great video.
You're welcome! I appreciate your comments.
Very impressive and full of knowledge
Thank you very much for your comments! I hope you found it helpful.
I have one question: when doing regression, do all the variables need to be correlated with the dependent variable? My dependent variable does not have a significant correlation with two of the independent variables, and when I do hierarchical multiple regression and remove them, I get a slightly larger R-squared than when I include them.
Very well explained, thank you very much.
You're welcome. Thank you for visiting my page.
This is an excellent video!
The best lecture thank you👍👍👍
I really appreciate your compliment! Have a great rest of the week!
Do you have the same explanation using EXCEL? Thanks
Although the focus isn't directly on assumptions, I talk about the assumptions in a Correlation video (ua-cam.com/video/qmgiMZOerVM/v-deo.html), and a simple linear regression video (ua-cam.com/video/PU5_VR8sSxs/v-deo.html). Please take a look at those. If they don't suffice, please let me know. I'm in the middle of a move, but I'll see if I can put something together that is more to your liking.
Thank you for commenting!
Have a great week!
Thank you very much for your amazing video. Sorry for asking, but I cannot find a simple answer to the following problem. I want to check whether 2 correlation coefficients in a multiple regression (1 analysis, 1 sample) are significantly different from each other. Do you know if there is an online formula or another way to find out?
Thank you in advance.
Very informative, thanks
Thank you for the very helpful video. I am still not clear on whether these assumptions need to be met on the sample data or on the whole population data.
Thank you, Jonathan.
The assumptions are for the sample data. If you have the population data, you don't need to make inferences (use inferential statistics), you can just calculate the values you need.
Sorry, I am new at this and my question wasn't clear. The population data set has 4000 data points, and I am randomly selecting 400 data points as the sample data set to build the model and make inferences about the population. My question was: are the assumptions tested on the sample data set or on the population data set? Thank you.
No need to apologize at all.
If I understand you correctly, you would test the assumptions on the 400 data points, the sample data.
Excellent Video... Thank You very much...
A Great Piece
Thank you!
This is a good video. I also have a particular problem with a simple linear regression where my data has plenty of outliers. I identified the outliers and high-leverage points and removed them, but on rerunning the regression a new set of outliers appears. I have tried this three times, and every time I remove outliers a new set emerges. I am not sure how many I should remove.
To be clear, I am not a statistician - I just teach stats. My expertise is in information systems and healthcare. Now that the disclaimer is out of the way :) - it is not sound statistical practice to simply remove outliers and other extreme values. You have to have a justification for removing them...and wanting your regression to work is not an acceptable reason.
An example of a reasonable justification is that you find the value of 233 years in a data set of ages. Unless you are including biblical characters, people don't live to 233 years in today's world. So, the 233 could be a typo when someone meant to type 23, 33, 32, etc. Unless you have a way of determining the actual age of the participant, you can remove it.
Without knowing more about your data set, I suspect you may not have enough observations (data points) in your data set. Each time you remove outliers/extreme values, the "new" data points in the data are sufficiently spread out that you end up with new outliers/extreme values. You may want to run the Sample Size calculator--specifically, "A-priori Sample Size Calculator for Multiple Regression"--at danielsoper.com (disclaimer: I receive nothing to refer you to the site) to confirm that you have an adequate sample size. The site is free and has a lot of helpful statistics tools and information.
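As a toy illustration of that "new outliers keep appearing" effect, here is a minimal Python sketch. The numbers are made up, and the z-score cutoff of 1.8 is arbitrary, chosen only so the effect shows up in a tiny sample:

```python
import statistics

def zscore_outliers(data, threshold):
    """Flag values whose z-score (distance from the mean in
    standard deviations) exceeds the threshold."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mean) / sd > threshold]

# Made-up, strongly skewed data set.
data = [1, 2, 4, 8, 16, 32, 64, 128]
rounds = 0
while True:
    flagged = zscore_outliers(data, threshold=1.8)
    if not flagged:
        break
    rounds += 1
    data = [x for x in data if x not in flagged]
    print(f"round {rounds}: removed {flagged}, remaining {data}")

# Three successive rounds each flag a "new" outlier (128, then 64,
# then 32): removing points reshapes the distribution rather than
# cleaning it, which is exactly why a justification is needed.
```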
thank you. your video is very helpful.
Hi, thanks for an informative video. I have a question, though. I am studying the relationship between 13 variables (one independent and 12 dependents). Each variable is measured using 4 or 5 items on a questionnaire, so in total I have 61 indicators or items. How would I go about checking the linearity assumption, given that the DV consists of 5 items and the IVs have 56 items? I appreciate your feedback. Thanks.
To be clear, I am not a statistician. With that in mind, I believe you would have to resort to hierarchical linear modeling or structural equation modeling, rather than multiple regression. Multiple regression is used for a single dependent variable, not multiple.
Have you done a video on how to do ridge regression in SPSS? I am struggling to find out how to do it.
Hi Monica! No, I haven't. I found this macro, put out by IBM, that might help: www.ibm.com/support/knowledgecenter/SSLVMB_20.0.0/com.ibm.spss.statistics.help/synmac_ridgereg.htm
INCLUDE '[installdir]/Samples/English/Ridge regression.sps'.
RIDGEREG DEP=varname /ENTER = varlist
  [/START={0**}]  [/STOP={1**}]  [/INC={0.05**}]
          {value}        {value}         {value}
  [/K=value].
([installdir] is the installation directory.)
Here are instructions about the macros: www.ibm.com/support/knowledgecenter/en/SSLVMB_20.0.0/com.ibm.spss.statistics.help/synmac_caution.htm
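Macro aside, here is a minimal single-predictor sketch in Python of the idea behind ridge regression: a penalty k (analogous in spirit to the macro's /K value) shrinks the ordinary least-squares slope toward zero. The data are invented for illustration; this is not the SPSS macro itself.

```python
import statistics

def ridge_slope(x, y, k=0.0):
    """Ridge estimate of the slope for a single predictor.
    With k = 0 this is the ordinary least-squares slope;
    larger k shrinks the estimate toward zero."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / (sxx + k)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]   # exact slope of 2 when k = 0
print(ridge_slope(x, y, k=0.0))   # 2.0 (plain OLS)
print(ridge_slope(x, y, k=10.0))  # 1.0 (shrunk toward zero)
```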
I am very much interested in this video; it's the best one. Can I use categorical variables as independent variables in the case of simple linear regression?
Yes, but unless your categorical variables are binary, you'll have to use dummy coding (re-code the categorical variable into separate binary variables). Here's a pretty good explanation of dummy coding: www.psychstat.missouristate.edu/multibook/mlt08m.html (full disclosure: I have no affiliation with the site).
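To make the re-coding concrete, here is a minimal Python sketch (the category names are invented for illustration) that turns a three-category variable into two binary indicator columns, leaving one level out as the reference category:

```python
# Dummy-code a 3-category variable ("low", "medium", "high")
# into two binary indicator columns, with "low" as the omitted
# reference category.
categories = ["low", "medium", "high", "medium", "low"]

levels = ["medium", "high"]  # reference category "low" is left out
dummies = {
    level: [1 if value == level else 0 for value in categories]
    for level in levels
}

print(dummies["medium"])  # [0, 1, 0, 1, 0]
print(dummies["high"])    # [0, 0, 1, 0, 0]
```

The two dummy columns would then enter the regression in place of the original categorical variable; the reference category is represented by both dummies being 0.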
I feel like I'm learning stats from Owen Wilson
I wish I could take that as a compliment.
I am Pius
Wow, you're right!
Thanks. It is very interesting
Thank Youuuu !
You are very welcome, Apple!
Explain like I'm five.
Do you mean I need to explain it more clearly, or that I explained it clearly enough?
Thank you for watching my video!