Dear Editor of the TileStats series, sincere thanks for this video! I have read and heard many different explanations of the logistic regression model, but never really understood the intuition behind it. This is very well done; I finally understand the sense of the model. I look forward to seeing your other videos.
Thank you!
Your illustrations are easy to understand and also cover all the important points! And the subtitles are extremely helpful for non-native English speakers like myself.
The best explanation ever encountered.
OMG you made it so enjoyable and easy to follow. I re-learned in a few minutes what took me hours upon hours to understand from reading the relevant literature. Thanks a lot!
Thank you!
This channel is pure gold. Very clear explanation, thank you.
Thank you!
Great job and really awesome videos.
We owe you. God bless you and your family.
Thank you!
Do the probabilities in all scenarios form a sigmoid curve when plotted?
Amazing video and explanation
Very clearly explained... Thank you 🥰
Thank you!
Many many thanks for this wonderful video with clear explanation!
Thank you!
You did a great job with your explanation. Thanks a lot.
Thank you!
Thanks for the amazing video!!
Sir, how much of statistics is required for the business analytics program?
Amazing and very simple to understand, thanks for this great video :)
Thank you!
Great videos. Can I have the slides to refer to alongside the transcript?
I'm planning to put the lectures as pdfs on my homepage after the summer.
Thanks a lot for this great video!
I understand how we get from probability to odds to log-odds. However, I don't understand what the purpose of this is. In maximum likelihood estimation, we adapt b1 so that the log-likelihood is maximized. But this process does not seem to depend on log-odds, right? Is log-odds only necessary for better interpretation of b0 and b1?
You actually fit a model that is linear on the log-odds scale, which explains why the response variable must be expressed as log-odds. See for example this page:
arunaddagatla.medium.com/maximum-likelihood-estimation-in-logistic-regression-f86ff1627b67
@@tilestats Thanks a lot, I really appreciate your response! I have read the article and other articles from the author. However, I don't understand why it is necessary to fit a linear model to the data?
You can fit a nonlinear model to the data, the sigmoid function in this case, but then you have to use nonlinear regression which is not that easy to work with. It is, for example, hard to find the global minimum of the error function for large nonlinear models. I'm actually working on a video about nonlinear regression.
@@tilestats Thank you, looking forward!
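To make the exchange above concrete, here is a minimal sketch of how probability, log-odds, and the linear part of the model relate. It assumes that the values -5.75 and 2.75 quoted in a later comment are the video's b0 and b1; the x values are made up for illustration. It only shows that the predicted log-odds fall on a straight line in x while the probabilities follow the sigmoid curve, so the likelihood that is maximized depends on b0 and b1 through exactly this linear log-odds relationship.

```python
import math

def sigmoid(z):
    """Map a log-odds value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Map a probability p (0 < p < 1) to its log-odds, log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Assumption: these are the b0 and b1 quoted in the comments (-5.75 and 2.75).
B0, B1 = -5.75, 2.75

def predicted_probability(x, b0=B0, b1=B1):
    """Probability predicted by the logistic model for a given x."""
    return sigmoid(b0 + b1 * x)

# Hypothetical x values, chosen only for illustration.
for x in [1.0, 2.0, 3.0]:
    p = predicted_probability(x)
    # The log-odds column increases by exactly b1 per unit of x.
    print(x, round(p, 3), round(log_odds(p), 3))
```

Converting the predicted probability back to log-odds recovers b0 + b1*x, which is the sense in which the fitted model is linear.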
Super fucking clear explanation. I am so glad I learned this from you, sir. Thank you!
Nice video. I have a question about using logistic regression with low prevalence (23:25): does NPV decrease, due to so many false negatives? However, in the example of the video dedicated to PPV and NPV, false negatives decrease and false positives increase with low prevalence
Do you mean low prevalence in the sample or in the population? With a low prevalence in the sample, you can adjust for this by changing the cutoff value.
@@tilestats OK, I agree. Is PPV calculated using the prevalence in the population or in the sample? In the latter case, should I take the prevalence of the population into account when I'm sampling?
@@tilestats Please, tell me if I'm right: even considering a low prevalence in the population, I take a sample with a prevalence of 50% and I set the cutoff value that maximizes accuracy. Finally, I calculate PPV considering the prevalence in the population.
@giovannibrufani3603 Yes, sounds right to me. I would also try to calculate the accuracy based on a test data set, which I explain in the video about validation.
@@tilestats Sure. I'm not missing any videos in the playlist. Thank you very much for your work and for clarifying my doubt!!!
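The last step in this thread, calculating PPV from the population prevalence rather than the sample prevalence, can be sketched with Bayes' rule. The sensitivity, specificity, and prevalence values below are hypothetical and not taken from the video; the function names are illustrative only.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: P(disease | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

def npv(sensitivity, specificity, prevalence):
    """Negative predictive value: P(no disease | negative test)."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Hypothetical test characteristics, e.g. estimated at a chosen cutoff
# from a balanced (50% prevalence) sample.
SENS, SPEC = 0.90, 0.90

# Compare the balanced sample with a hypothetical low population prevalence.
for prev in [0.50, 0.01]:
    print(prev, round(ppv(SENS, SPEC, prev), 4), round(npv(SENS, SPEC, prev), 4))
```

With sensitivity and specificity held fixed, lowering the prevalence lowers PPV and raises NPV, which is why the population prevalence, not the 50% sample prevalence, should go into the PPV calculation.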
Thanks a lot for the videos ... very helpful. Wondering if the data used in this video is available to download to replicate the analysis being done? Thanks
The data is the one you see in the video.
@@tilestats Thanks, I guess I can use the data presented in the tables (middle of the video).
thank you so much
How can we estimate the parameters of this model?
Can we just use the OLS method on the linear model (b0 + b1·x), which is used as the exponent of "e" here?
No, have a look at this video:
ua-cam.com/video/J0yuLu3oLuU/v-deo.html
How can this apply to qualitative variables? For instance, I'm reading an article on how social determinants can affect the probability of an adolescent girl being pregnant, but I don't really get how this can be interpreted. There is for example a determinant called "Age of onset of sexual relations" and there is an "estimate value" that is negative 0. And other values are positive and so on. I don't get it. Help
Let's say that we have a variable gender (men and women). If women are set as baseline (coded as zero) and men are coded as one, then the estimated parameter says how much higher, or lower, the log-odds are for the men compared to the women. If that value is positive, the OR is greater than one. If that value is negative, the OR is less than one (see 18:18 for how to calculate and interpret the OR).
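As a small illustration of the reply above: for a variable coded 0/1, the odds ratio is the exponential of the estimated parameter. The coefficient values below are hypothetical and not taken from the article or the video.

```python
import math

def sigmoid(z):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

def odds_ratio(b):
    """Odds ratio implied by a logistic regression coefficient b."""
    return math.exp(b)

# Hypothetical estimates: baseline group coded 0, comparison group coded 1.
B0, B1 = -1.0, 0.7

p_baseline = sigmoid(B0)         # predicted probability for the group coded 0
p_comparison = sigmoid(B0 + B1)  # predicted probability for the group coded 1

# The ratio of the two groups' odds equals exp(B1).
RATIO = odds(p_comparison) / odds(p_baseline)
print(round(RATIO, 4), round(odds_ratio(B1), 4))

# A negative coefficient gives an OR below one, a positive one an OR above one.
print(odds_ratio(-0.7) < 1.0, odds_ratio(0.7) > 1.0)
```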
How do you determine the quality of the fitted curve?
Not sure what you mean by quality, but maybe this video helps:
ua-cam.com/video/J0yuLu3oLuU/v-deo.html
How did you get -5.75 and 2.75?
I used the least squares formula and I got -0.34 and 0.39!
You should use the maximum likelihood method.
ua-cam.com/video/J0yuLu3oLuU/v-deo.html
What statistical software are you referring to?
I use R and SPSS, but other tools also work fine.
Can you give the exact formula for your coefficients (b0 and b1)? We badly need it for a manual computation 😭
ua-cam.com/video/J0yuLu3oLuU/v-deo.html
You estimate the parameters by maximizing the likelihood. Unlike in linear regression, there is no simple closed-form formula for the estimates.
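Since there is no closed-form formula, the estimates are found iteratively. The sketch below uses Newton-Raphson, which is one common way to maximize the log-likelihood; it is an assumption here, not necessarily the procedure shown in the video or used by any particular software. The data are hypothetical and will not reproduce the video's coefficients.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Log-likelihood of a logistic model with one predictor."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def gradient(b0, b1, xs, ys):
    """First derivatives of the log-likelihood with respect to b0 and b1."""
    g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys))
    g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys))
    return g0, g1

def fit_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Estimate b0 and b1 by Newton-Raphson, starting from (0, 0)."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0, g1 = gradient(b0, b1, xs, ys)
        # Information matrix (negative Hessian) entries, weights w = p(1 - p).
        h00 = h01 = h11 = 0.0
        for x in xs:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) < tol and abs(step1) < tol:
            break
    return b0, b1

# Hypothetical data (not from the video): the classes overlap, so the
# maximum likelihood estimates are finite.
XS = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
YS = [0, 0, 1, 0, 1, 0, 1, 1]

B0_HAT, B1_HAT = fit_logistic(XS, YS)
print(round(B0_HAT, 3), round(B1_HAT, 3))
```

At the solution, both derivatives of the log-likelihood are approximately zero. Least squares applied to the 0/1 outcomes optimizes a different criterion, so it generally gives different numbers.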
Do you have the slides?
If you go to my home page www.tilestats.com, you can buy some of the videos as PDFs.
How do you calculate b0 and b1?
By the maximum likelihood method:
ua-cam.com/video/J0yuLu3oLuU/v-deo.html
Please check the audio volume of your video before uploading it, and increase it if it is too low.