DEAR VIEWERS: I wanted to share that I have created a new PowerPoint presentation (March 2020), called "Binary logistic regression: A deeper dive into understanding and interpreting your SPSS results", which you can freely download here: drive.google.com/open?id=1JP4oSflFxf0ZhIWC6Brthv_V5HMOQhE1 . As always, I hope you find it useful & share it with others! Cheers everyone!
It shows how the odds for the gene (Y/N) change for category 2 compared to your reference category. The reference category is set as the first or the last category; you can set this in the categories tab, under Contrast. Select Last or First and press Change. If it's set to First, then the 4 categories you see in the output basically say how the odds change from category 1->2, 1->3, 1->4 and 1->5. That is why you don't see 5 categories. The results are arranged in order, so category(1) in the results is actually your second category of bacterial strain, giving you how the odds change compared to strain 1.
Hi. Thanks for your question. The decision boils down to the scale of measurement associated with your variables. If you have a variable that's nominal or ordered-categorical then you use categorical option. If your variable is continuous, then you wouldn't be using the aforementioned approach. Hope this helps.
@@mikecrowson2462 Thanks Mike! Can all predictors be categorical covariates? I am looking at age group, tobacco consumption and alcohol consumptions groups on cancer (absent/present).
Normality of the independent variables is not a requirement for the procedure. That said, it is probably worth checking for outliers on your continuous IV's. best wishes.
Thank you Sir, this is very helpful, I appreciate it. I have a question: I tried a 2x2 crosstab and BLR to check the OR of my predictor on my outcome. But why is my crosstab OR different from my Exp(B) in BLR? It should be exactly the same, right? Crosstab OR=1.18, Exp(B)=0.784. Thank you for your attention.
Hello from China from a Russian student! I've watched your video and find it very helpful! I have a question, what if I have 10 continuous variables and 1 binary outcome. When I try to check for linearity of logit for 10 IV's, it shows me very low values of Exp(B) in Variables in the equation table, therefore the assumption is not met. I am confused about how to deal with this situation.
Are you using the Box-Tidwell procedure: X*ln(X)? [See e.g., pg 18 of www.lexjansen.com/mwsug/2018/AA/MWSUG-2018-AA-91.pdf ]. It looks like if the interaction term is significant for a given predictor, then that would be an indication of non-linearity in the logit. If that's the case, then you'd need to account for the non-linearity in the model. Stoltzfus (2011; p. 4 from onlinelibrary.wiley.com/doi/epdf/10.1111/j.1553-2712.2011.01185.x) suggests either transforming the predictor (essentially to linearize the relationship between X and the logit) or dummy coding the IV. I hope this helps!
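A minimal Python sketch of how the X*ln(X) term mentioned above could be computed before it is added to the model as an extra predictor alongside X. The function name `box_tidwell_term` and the sample values are hypothetical; the model itself would still be fitted in SPSS, and the natural log requires strictly positive values of X.

```python
import math

def box_tidwell_term(x):
    """Return x * ln(x), the extra predictor used in a Box-Tidwell check."""
    if x <= 0:
        # ln(x) is undefined for x <= 0, so the predictor must be positive
        # (or shifted to be positive) before the term is built.
        raise ValueError("Box-Tidwell term requires x > 0")
    return x * math.log(x)

# Hypothetical continuous predictor values.
ages = [21.0, 35.0, 48.0]
terms = [box_tidwell_term(a) for a in ages]
print(terms)
```

Each continuous predictor gets its own term; a significant coefficient on a term is what would point to non-linearity in the logit for that predictor.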
Hi there. This is an older video. You might check out my newest one on the topic at ua-cam.com/video/vab9NezxpBc/v-deo.html . You can also download a data file and PowerPoint that should provide additional clarification. Best wishes
Hi feuriger, thanks for your posting. I actually have a much newer video on logistic regression that I believe you might find more useful. There are also supplemental files you can download and study over that you might find helpful. Please visit: ua-cam.com/video/vab9NezxpBc/v-deo.html Best wishes!
Thank you for this helpful video! I was wondering how to do this: I have five conditions which were randomly assigned to the respondents of my survey. Now I want to check which condition resulted in the highest advertising recognition (my binary dependent variable, where the participants had to click yes or no). How can I filter the whole data so that I can compare each condition with the logistic regression? Do I create 5 categorical dummy variables for covariates for the five conditions for the whole data?! (I have 400 complete surveys at the moment)
Hi Georgia, I'm so glad you found this helpful. By the way, I do have a newer video on binary logistic regression I put out in March. You might find some additional useful information in it: ua-cam.com/video/vab9NezxpBc/v-deo.html Best wishes!
Graduate student studying Epidemiology. Thank you so much! You make this topic a lot easier to understand!
I am new in spss.And you just saved me....... Thank you verrrrry much☺
You are very welcome! Thanks for visiting!
Thank you!!! your presentation style as well as your mode of teaching is very wonderful for me....
Thank you for your kind words, Berhanu! By the way I have a newer presentation on binary logistic regression (at ua-cam.com/video/vab9NezxpBc/v-deo.html) I put up a few weeks ago. I hope you consider visiting it and sharing it. Best wishes!
Thank you very much, really great that you also spelled out how to report the results in the end of the video.
Hi Artyom, you are very welcome. Thanks for visiting!
I highly appreciate Mike. So useful
Thanks for visiting Anthony. So glad you found it helpful. Best wishes!
Literally saved my life with this. Thanks!
You are very welcome!
Thanks for bringing such a useful video.
Hi Saroj, thank you for your comment and for visiting! Just an FYI, I have a more recent video on binary logistic regression at ua-cam.com/video/vab9NezxpBc/v-deo.html&lc=UgwUD_hkPgquBCdjHi14AaABAg
Best wishes!
Your videos are always helpful
I am really thankful to have discovered this video. It helps me a lot. Thank you very much.
thank you very much Dr Mike. Godsend
You, Sir, have saved my research.
Thank you. This was very helpful.
You are very welcome Dian! Thanks for visiting!
Awesome as always. Thank you
You made my day!!!
Thank you very much for really good explanation
You are very welcome! Thanks for visiting! cheers
Thank you so much for this lecture. It's very helpful!
Thank You for doing this!
So at 6:05, when the reference category is picked to be "First", does it refer to the lowest number of the categorical variable? in this case, there are 0 and 1, which represent male and female respectively. so when "first" is selected, does it refer to 0? conversely, if "last" were to be selected, then the largest number (in this case, 1) is chosen to be the reference? Would appreciate clarity. thanks!
Hi, the reference category in any dummy coding system is the category that is coded 0 across all dummy variables. In the case of gender, where I coded male=0 and female=1, gender is the only dummy variable. The reference category therefore is male (coded 0) and the comparison group is female. If you want to learn more about dummy coding, please visit: ua-cam.com/video/XGlbGaOsV9U/v-deo.html
By the way, which group you assign as the reference category is arbitrary. The only thing that designates a group as the reference category is how you coded your dummy variable(s). Cheers!
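A small Python sketch of the dummy (indicator) coding idea described above, assuming category codes are ordered from lowest to highest so that "first" means the lowest code and "last" the highest. The function `indicator_code` is hypothetical and only illustrates the logic; it is not SPSS's own implementation.

```python
def indicator_code(value, categories, reference="first"):
    """Return the 0/1 dummy variables for `value`, omitting the reference category."""
    ordered = sorted(categories)
    ref = ordered[0] if reference == "first" else ordered[-1]
    # One dummy per non-reference category; the reference group is 0 on all of them.
    return [1 if value == c else 0 for c in ordered if c != ref]

# Gender coded 0 = male, 1 = female, with the first category (male) as reference.
gender_codes = [0, 1]
male_row = indicator_code(0, gender_codes, reference="first")
female_row = indicator_code(1, gender_codes, reference="first")

# A five-level predictor yields four dummy variables (k - 1).
five_level = indicator_code(3, [1, 2, 3, 4, 5], reference="first")
print(male_row, female_row, five_level)
```

With "first" selected, males are 0 on the single dummy variable, so they serve as the reference group. The same logic explains why a predictor with five categories shows only four rows in the output.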
Hi, @05:55 you say that the reference category is male, which is coded "1", but you coded female as "1", not males... Am I wrong? I am trying to figure out whether I should choose "last" or "first" for the reference category.
I probably misspoke in the video. I usually code male=0 and female=1. It should be clarified in the PowerPoint (linked under video description). Cheers
Hello Mike, a very educational tutorial, I highly appreciate it. Just a comment: I noted that at 5:59 in the video you mention the male category was coded one; however, it was coded zero, hence the reference category should be last? Or am I wrong?
Hi there. I probably misspoke in the video (there was a lot of ground to cover!). I believe males should have been referred to as group 0 instead of 1. They would be the reference category. The powerpoint (see under video description) should not contain this error. Thanks for letting me know. Best wishes!
thank you for pointing that out, Njagi!
Great video!
Thank you sir for this informative video.
Would love to ask a few simple questions about categorical variables in BLR models.
But what happens when we have for example a positive coefficient for a predictor and the odds ratio for that predictor is smaller than 1? In that case, for every unit of increment on the predictor, would the odds of terminating increase by a factor of Exp(B) or decrease by a factor of Exp(B)?
Hi there. A positive regression coefficient is always associated with an odds ratio > 1, and a negative coefficient is always associated with an odds ratio < 1. A slope of 0 would be associated with an odds ratio = 1. They're basically saying the same thing, except using different metrics. The slope represents the relationship between your independent variable and your dependent variable as the predicted change in log-odds (logits) per unit increment on the independent variable. The odds ratio represents the relationship as the predicted multiplicative change in odds (of target group membership) per unit increase on the independent variable. Logits can be transformed into odds through exponentiation, whereas odds can be transformed into logits by taking their natural log. Because logits are not very intuitive, most folks just use the regression slope for testing for statistical significance, and use the odds ratio as a kind of effect size to describe the effect of the IV on predicted group membership. I would be careful about saying the odds 'increase' or 'decrease' by a factor of OR (there's a good discussion on the language in Osborne's book on logistic regression). Basically though, if the OR is > 1 (which would correspond also to b > 0), then the odds are increasing with increasing values on the IV; and if the OR < 1 (which will correspond to b < 0), then the odds are decreasing with increasing values on the IV.
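The link between the slope and the odds ratio described above can be checked with a few lines of Python. This is only an illustrative sketch; the helper names `odds_ratio` and `slope_from_odds_ratio` and the slope values are made up for the example.

```python
import math

def odds_ratio(b):
    """Exponentiate a logistic regression slope (in logits) to get Exp(B)."""
    return math.exp(b)

def slope_from_odds_ratio(or_value):
    """Take the natural log of an odds ratio to recover the slope."""
    return math.log(or_value)

# Hypothetical slopes: negative, zero, and positive.
for b in (-0.5, 0.0, 0.5):
    print(b, round(odds_ratio(b), 3))
```

Because the exponential function is greater than 1 exactly when its argument is positive, a positive slope cannot go together with an odds ratio below 1. If an output appears to show that combination, it is worth re-checking which column is being read.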
@@mikecrowson2462 Sir!! Or professor! I am not sure how to call you! Thank you so much for such an elaboration and quick reply!! That was really helpful. And generally your videos are super nice! I like the way you teach and your voice! Thank you for this channel !!
what about the assumptions for doing binary logistic regression? Do you have any video showing how to test if there is any violation of the assumptions? I am stuck ( do not know how) to check that the observations are independent from each other.. maybe perform 8 times separately independent samples t-test for all my 8 predictors?
Right now I'm working on credit risk modelling using BLR and historical data from P2P lending platforms, like LendingClub, for my Bachelor's thesis. Just wanted to ask: in my model I have more than 30 independent variables, and when I run a regression in SPSS, some variables just disappear. Why is this happening? Thank you!
Aren't we supposed to delete 'income' variable from the model since it's non-significant and then rerun the binary logistic regression to interpret the accuracy etc
Hi Oskar, this kind of boils down to your view of whether one should include non-significant variables in the final model or not. I have no problem with pruning out non-significant predictors (in the manner you indicate), so long as it is clear to the reader that you started with a full model (where 'income' was non-significant) and then (after pruning) report on the reduced model. This kind of approach is used quite often in the context of structural equation modeling, where a full model is described along with a reduced model. What WOULD be problematic is simply reporting on the pruned model 'as if' income was never included in the first place. That, I would not be able to get behind. I hope this helps to answer your question.
Cheers!
Hi Mike, this is really helpful! Thank you. Any tips on how to do this with a 80/20 train test split on SPSS?
If my ordinal and binary independent variables are already encoded, then I don't need to specify them as categorical variables?
If I do, I end up getting some large Exp(B) values, like 40,000.
If you have a categorical variable with more than 2 levels you either (a) have to create dummy variables and input them as predictors in your model (in which case you don't need to use the Categorical option) or (b) use the Categorical option (so that SPSS effectively does the dummy coding for you). I generally prefer option (a) because I prefer having greater control over my dummy variables (but that's just me). On the issue of the large Exp(B) values, this is likely due to a problem with complete or quasi-complete separation in your data. There is a nice discussion of these issues here: stats.idre.ucla.edu/other/mult-pkg/faq/general/faqwhat-is-complete-or-quasi-complete-separation-in-logisticprobit-regression-and-how-do-we-deal-with-them/ . In cases of quasi-separation, an option is to use the Firth logistic regression procedure. Cheers!
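One common way separation shows up with a categorical predictor is a category in which only one outcome value ever occurs. The Python sketch below is a rough screen for that situation, not a formal diagnostic. The function name `categories_with_one_outcome` and the toy data are hypothetical.

```python
from collections import defaultdict

def categories_with_one_outcome(predictor, outcome):
    """List predictor categories in which only a single outcome value is observed."""
    seen = defaultdict(set)
    for x, y in zip(predictor, outcome):
        seen[x].add(y)
    return sorted(cat for cat, ys in seen.items() if len(ys) == 1)

# Toy data: every case in group "b" has outcome 1.
group = ["a", "a", "a", "b", "b", "c", "c", "c"]
event = [0, 1, 0, 1, 1, 0, 1, 1]
flagged = categories_with_one_outcome(group, event)
print(flagged)
```

A flagged category, especially one with very few cases, is a candidate explanation for an extremely large (or near-zero) Exp(B) and inflated standard errors.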
@@mikecrowson2462 Thanks so much, I realized that having too few observations in one category was maybe the problem. Choosing the indicator reference for the categorical variable as first if the last category had too few observations, and vice versa, showed an improvement. Removing any independent variable that was constant or close to it seemed to help as well.
Thank you sir for this amazing video!
You are very welcome. Best wishes!
very useful video. Thanks a lot. I am only not sure the idea when setting the "last" and "first" of income and gender.
Hi Alex. Thanks for visiting. It's been a while, so I'm assuming that what you mean by 'first' and 'last' category pertains to the selection of the reference group/category for categorical independent variables. Basically, by way of this selection you are instructing SPSS to assign either the first group or the last group the status of reference category (in a dummy coding scheme). The choice of which group to use as a reference category is up to you, but ideally I would think you would pick the group that you would most like to compare against the remaining groups. For gender, you only have two categories for that variable. I typically code it as 'identified female'=1, 'identified male'=0. But that's an arbitrary decision based on my preferred coding approach. I hope this helps!
@@mikecrowson2462 Dr. Crowson, thank you for your enthusiastic explanation. I appreciate it very much. In your ppt, page 4, the context is "('genderid', coded 0=identified as male, 1=identified as female)", but in your video 5:55, you mentioned "the male is coded as 1," and then you choose "first" because "male" is important than "female." For this part, I get a little bit confused.
I am not sure my idea is correct or not. If "male" is important than "female," it is coded as 1. I need to set it "first" from "last." As such, the income level, "(ordinal variable, coded 1=low, 2=medium, 3=high)," I set low income as "first" because it is important as a reference. Am I right? Thank you for your kind explanation.
@@erasuas Hi there. I think that I misspoke in the video. I typically code female 1 and male 0. I would stick with the Powerpoint description on the video description on that point. Cheers!
@@mikecrowson2462 Dr. Crowson, thank you for your clarification. The idea is clear to me. Have a good day and God Bless You.
thank you
What does the 'constant' mean in the results table?
Hi there. The constant is the intercept. In standard OLS regression, it is the mean on Y when all predictors are 0. In the context of logistic regression it will be the predicted logit when all predictors are 0.
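Since the constant in logistic regression is a predicted logit, it can be converted to a predicted probability for a case with all predictors at 0. A minimal Python sketch, where the constant of -1.2 is a made-up value and `logit_to_probability` is a hypothetical helper name:

```python
import math

def logit_to_probability(logit):
    """Convert a logit (log-odds) to a probability via the odds."""
    odds = math.exp(logit)
    return odds / (1 + odds)

constant = -1.2  # hypothetical intercept taken from a "Variables in the Equation" table
p_at_zero = logit_to_probability(constant)
print(round(p_at_zero, 3))
```

A constant of 0 corresponds to a probability of .50; negative constants give probabilities below .50 and positive constants give probabilities above it.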
@@mikecrowson2462 Thank you for replying!
Thank you for your easy explanation of binary logistic. If I have one categorical independent variable (Gender) and other continuous dependent variable then which regression test will be applied?
Hi Nazia. If you have an IV that's binary (gender) and a DV that's continuous, you have a couple of options if you don't have any other IVs in your model. The easiest analysis is an independent samples t-test to test the mean difference between groups. You could do the same with a simple linear regression with gender predicting the DV.
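To illustrate why the two options above lead to the same place: with a 0/1 predictor, the simple regression slope equals the difference between the two group means. The Python sketch below uses made-up scores, and `ols_slope` is a hypothetical helper written out by hand rather than a library call.

```python
from statistics import mean

def ols_slope(x, y):
    """Ordinary least squares slope for a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Made-up data: gender coded 0 = male, 1 = female, with a continuous DV.
gender = [0, 0, 0, 1, 1, 1]
score = [10.0, 12.0, 14.0, 15.0, 17.0, 19.0]

slope = ols_slope(gender, score)
mean_diff = (mean(s for g, s in zip(gender, score) if g == 1)
             - mean(s for g, s in zip(gender, score) if g == 0))
print(slope, mean_diff)
```

The t-test on the mean difference (assuming equal variances) and the t-test on the slope then test the same hypothesis.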
@@mikecrowson2462 Thank you so much sir for your reply. I have to check the impact of some of the IVs on one DV. The t-test gives me only the mean difference, not the effect. In linear regression, can I check the impact of gender (categorical) on the DV (continuous)? My DV is not normally distributed.
The download link for this PowerPoint contains the data file...kindly check
The Powerpoint link was there. The data link and powerpoint links were mislabeled (but now corrected). A copy of the Powerpoint used in the video can be downloaded here (drive.google.com/open?id=1atjwuodokqqNE98oCjbrOpO6SuC-cwWf). A copy of the data used in the video can be downloaded here (drive.google.com/open?id=1Etmudy8b6SZRykSPxCyFzG8ANQZwv966).
Thanks for the video! In this example when you choose reference category 'first' are we basically saying that we will use the sub-category coded with the lowest value (i.e. male) as a reference to increases or decreases in the value coded higher than it (i.e. female)?
Hi Lawson, yes, basically the male group (coded 0) is the reference category. The slope associated with genderid variable is the difference in logits between groups. If the slope is positive, it indicates that the marginal mean of logits for females is greater than that for males. If it is negative, it indicates that the marginal mean of logits for females is less than that for males. By the way, I have a more recent video/presentation that you might also find useful: ua-cam.com/video/vab9NezxpBc/v-deo.html
best wishes!
@@mikecrowson2462 Thanks a lot for this detailed explanation and further learning materials!
Thank you for this. I was just wondering about something. If I were to test the presence-absence of a target species at 17 different locations against environmental variables such as temp, rainfall and humidity. Would such a test be appropriate?
Hello, I have a binary dependent variable and 2 categorical independent variables only. is it necessary to have a quantitative variable as an independent?
Hi Leo. No, you can include nominal or ordinal variables as predictors. In the video and powerpoint, I cover the inclusion of categorical variables using the 'categorical' button. Be sure to go back over that information.
Can you send the link again? The given link seems inaccessible.
Hello, Sanoj, the link should be working now. Cheers!
@@mikecrowson2462 Thanks a lot for your quick feedback. By the way, can you do a video lesson on multiple correspondence analysis, or do you have any lecture material on it that I can download (with R or Python code)?
Thanks for the informative videos. I'm new to logistic regression and I find myself trying to wrap my head around how you choose the reference category and, based on this, what the output means. In the example you used for gender, male is the reference category, and male is coded as 1 and female as 0. In lay terms, what does it mean and how would you interpret an Exp(B) of .221? Other videos have expressed it as a probability of males vs. females. The example I'm using is the likelihood of an inmate being successful upon release (1 = successful, 0 = unsuccessful, i.e. return to prison) and the impact of participation in education programs while incarcerated (0 = didn't participate, 1 = did participate). Which would be the reference category? Any assistance is appreciated.
Hey there! If you want to run a BLR with one independent variable (a 4-level ordinal) and you want to adjust for several covariates, what method do you use and where do you put each variable in SPSS?
Hello! You move all variables over to the covariates box. For the categorical variable, you click on the tab for Categorical and indicate the type of contrasts desired (I typically prefer to set the Contrast to 'Indicator', which will give you a set of simple contrasts of one group versus a baseline group for the IV). SPSS will do the dummy coding for you. best wishes!
Hi I'm encountering a weird issue on SPSS, my independent variable has 5 categories, think of it like multiple genders (but really it's serogroup data on a bacteria) and my dependent is just positive or negative for a certain gene. When I run the same settings you did, it only shows 4 categories in "Variables in the equation", how do I make it show all 5?
Each row shows how the odds for the gene (Y/N) change for one category compared to your reference category. The reference category is set as either the first or the last category; you can set this in the Categorical options, under Contrast: select Last or First and press Change. If it's set to First, then the 4 categories you see in the output describe how the odds change from category 1 to 2, 1 to 3, 1 to 4, and 1 to 5. That is why you don't see 5 categories. The results are arranged in order, so category(1) in the results is actually your second bacterial strain, giving you how the odds change compared to strain 1.
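As a rough illustration of why a 5-category predictor yields only 4 rows, here is a small Python sketch of indicator (dummy) coding. The helper name `indicator_code` and the category codes 1 to 5 are hypothetical, chosen only for this example; this is not SPSS's own code, just the same coding scheme written out by hand.

```python
def indicator_code(value, categories, reference="first"):
    """Return the k-1 dummy values for one observation.

    The reference category gets no dummy of its own, so it is
    coded 0 on every dummy variable.
    """
    ref = categories[0] if reference == "first" else categories[-1]
    return [1 if value == c else 0 for c in categories if c != ref]

categories = [1, 2, 3, 4, 5]  # hypothetical serogroup codes

for value in categories:
    print(value, indicator_code(value, categories, reference="first"))
# prints:
# 1 [0, 0, 0, 0]
# 2 [1, 0, 0, 0]
# 3 [0, 1, 0, 0]
# 4 [0, 0, 1, 0]
# 5 [0, 0, 0, 1]
```

Each of the four dummies gets its own slope, which compares that category with the all-zero reference row. Switching `reference` to `"last"` makes category 5 the all-zero row instead.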
Hi Mike,
How do you know which predictors to treat as covariates and which to treat as categorical covariates?
Hi. Thanks for your question. The decision boils down to the scale of measurement of your variables. If a variable is nominal or ordered-categorical, then you use the Categorical option. If a variable is continuous, then you leave it as an ordinary covariate and don't use the Categorical option. Hope this helps.
@@mikecrowson2462 Thanks Mike! Can all predictors be categorical covariates? I am looking at the effects of age group, tobacco consumption group and alcohol consumption group on cancer (absent/present).
@@morgs55555 yes, they can all be categorical. Cheers!
Sir, should I check the normality of the continuous variables, or should I use them as they are?
Normality of the independent variables is not a requirement for the procedure. That said, it is probably worth checking for outliers on your continuous IV's. best wishes.
Thanks for your presentation video. It was great. Next, I would like to view one on ordinal logistic regression, if you have one.
Hi Kyaw, thanks for visiting. Yes, I have a video on ordinal logistic regression here: ua-cam.com/video/rSCdwZD1DuM/v-deo.html . Enjoy! best wishes
Thank you, sir, this is very helpful, I appreciate it. I have a question: I tried a 2x2 crosstab and BLR to check the OR of my predictor for my outcome. But why is my crosstab OR different from my Exp(B) in BLR? They should be exactly the same, right? Crosstab OR = 1.18, Exp(B) = 0.784.
Thank you for your attention.
And I already set the reference category to "last".
Why would gender ID be in the model in the first place?
It's in there to demonstrate inclusion of a binary predictor. The intention was not actually to test a substantive hypothesis.
Hello from China from a Russian student! I've watched your video and find it very helpful!
I have a question, what if I have 10 continuous variables and 1 binary outcome. When I try to check for linearity of logit for 10 IV's, it shows me very low values of Exp(B) in Variables in the equation table, therefore the assumption is not met. I am confused about how to deal with this situation.
Are you using the Box-Tidwell procedure: X*ln(X)? [See e.g., pg 18 of www.lexjansen.com/mwsug/2018/AA/MWSUG-2018-AA-91.pdf ]. It looks like if the interaction term is significant for a given predictor, then that would be an indication of non-linearity in the logit. If that's the case, then you'd need to account for the non-linearity in the model. Stoltzfus (2011; p. 4 from onlinelibrary.wiley.com/doi/epdf/10.1111/j.1553-2712.2011.01185.x) suggests either transforming the predictor (essentially to linearize the relationship between X and the logit) or dummy coding the IV. I hope this helps!
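For anyone building the Box-Tidwell term by hand, here is a minimal Python sketch of the X*ln(X) computation. The function name `box_tidwell_term` and the example values are hypothetical. The sketch only constructs the extra term; the term would then be entered into the logistic model alongside X itself, and its significance examined as described above.

```python
import math

def box_tidwell_term(x):
    """Return x * ln(x), the extra predictor used in the Box-Tidwell check.

    ln(x) is only defined for x > 0, so non-positive values are rejected
    here rather than silently producing an error value.
    """
    if x <= 0:
        raise ValueError("x must be > 0 to compute x * ln(x)")
    return x * math.log(x)

# Hypothetical values of one continuous predictor (illustration only).
x_values = [18.0, 25.0, 40.0, 63.0]
bt_terms = [box_tidwell_term(x) for x in x_values]

for x, term in zip(x_values, bt_terms):
    print(f"x = {x:5.1f}   x*ln(x) = {term:8.3f}")
```

One practical caveat: a predictor that includes zeros or negative values cannot be used directly, so it would need to be shifted to positive values first. That shift is an assumption of this sketch and should be reported if used.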
Anybody got a video that isn't just somebody reading a wall of text? Like... simple explanation so idiots like me can understand?
Hi there. This is an older video. You might check out my newest one on the topic at ua-cam.com/video/vab9NezxpBc/v-deo.html . You can also download a data file and PowerPoint that should provide additional clarification. Best wishes
Sorry. you speak too fast.
Hi feuriger, thanks for your posting. I actually have a much newer video on logistic regression that I believe you might find more useful. There are also supplemental files you can download and study over that you might find helpful. Please visit:
ua-cam.com/video/vab9NezxpBc/v-deo.html
Best wishes!
Thank you for this helpful video!
I was wondering how to do this:
I have five conditions which were randomly assigned to the respondents of my survey.
Now I want to check which condition resulted in the highest advertising recognition (my binary dependent variable, where the participants had to click yes or no). How can I filter the whole data so that I can compare each condition with the logistic regression?
Do I create 5 categorical dummy variables for covariates for the five conditions for the whole data?! (I have 400 complete surveys at the moment)
Thank you so much! This is very helpful!
Hi Georgia, I'm so glad you found this helpful. By the way, I have a newer video on binary logistic regression that I put out in March. You might find some additional useful information in it: ua-cam.com/video/vab9NezxpBc/v-deo.html
Best wishes!