Just to add how the logit inverse can be derived. It is a swapping of the graph between x and y: you have to isolate x (p in this case) and express it in terms of y.
So, if the logit is y, then
y = ln(p/(1-p))
e^y = p/(1-p)
p = e^y(1-p) = e^y - e^y(p)
Rearranging the terms gives
p(1 + e^y) = e^y
p = e^y/(1+e^y)
Now your p (or x) has taken the role of y, and y has taken the role of x, so you can swap variables:
y = e^x/(1+e^x)
What a nice contribution. Thank you.
what a legend
can anyone add exactly why we take the inverse?
You are my hero
I've been meaning to comment on your videos for a while. They are exceptionally well done! You (IMO) have such a knack for explaining what I had previously considered to be difficult concepts. Thank you!
I do truly appreciate all your amazing tutorials... I can not be more grateful.
Thank you Brandon! No matter whether it's advanced statistics or machine learning, whenever I'm having trouble I can always get a perfect explanation from your lectures!
I wish I could keep clicking the thumbs up a thousand times. Your explanation of the logit function and WHY it's used in logistic regression makes it so simple an 8th grader could do it!
THANKS is not enough.. I should say more than that. I started my day with logistic regression, planning to complete it in 2 to 3 hours. I had gone through a lot of material, which led to confusion between the logit and sigmoid functions..... I felt bad and sad that I couldn't make it, and then your video came along. It is a SAVIOUR.. You explained everything very clearly and in a simpler manner. I ended my day with a SMILE thanks to your video.. APPRECIATE.. APPRECIATE.. APPRECIATE... Expecting material on machine learning algorithms too.
Hey thank you so much for making these, they really, really helped me clarify my class.
Marvelous! I wish my professors taught like this, it would be so simple :)
A really good professor.
The way you explain is amazing! I urge you to please write a book... it will be a blessing for those who find these things ungraspable in college/trainings! Please, please! More power to you, Sir!👍👍
Very helpful presentation, especially when publishing papers, allowing one to interrogate the statistician from a platform of knowledge.
Logit is the natural log of the odds, not of the odds ratio
Phillip Healy Hello! Thanks for pointing that out. Yes, I'm sure I slipped and added "ratio" since the odds is itself a ratio. I've read textbooks that do the same thing, unfortunately. I will try to avoid doing that, and hopefully your comment cleared up any confusion for other viewers.
***** Thank you very much, Brandon. I think you have a unique talent for teaching these materials. I would suggest that you try to find a way to upload these materials to a popular education platform, like Khan Academy or Udacity or any other! I have watched many lectures given by highly decorated academics, and I can tell you that they were nowhere near your style. Basically, your methods and lectures are undoubtedly qualified to earn a place on the best education platforms. I hope I see you succeed even more!
Phillip Healy +Brandon Foltz Thanks for pointing that out!
Philip Healy, which slide are you talking about?
This is a very good video and I really like your style of teaching. I have a doubt as to why or how we equated the logit of p with the linear combination of the independent variables.
This is awesome! A very intuitive flow for explaining the equation. I can understand it without needing to know much algebra and stats for now.
Very clear explanations. I have a few suggestions to make though:
- Sometimes while explaining examples or points verbally that are a bit abstract, like the probability plot in the previous logistic regression video, a simultaneous visualization could make them much clearer and easier for the audience to follow.
- It would be great if your videos were complemented with implementations in R (a bit of a biased suggestion since I am learning to implement these machine learning algorithms in R, haha). But I believe you explain things really well, so a complementary video or series of videos would benefit us all.
The explanation of logit is very clear!
Perfect video to link the Bernoulli distribution of Y, its estimated mean \hat{p_i} = E(Y_i), and the linear combination of betas.
I have been wondering how the link function came about in terms of GLMs.
Thanks to this video, I understand that logistic regression is nothing but modeling \hat{p_i} as a logistic function, with the logit used to link this probability model to a linear combination!
Thank you very much!
It would also have been amazing if you could add something about MLE in another video, not in detail, but how the assumption that Y follows a Bernoulli distribution is linked with MLE to estimate the coefficients.
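Since this and several other comments ask about MLE: here is a tiny Python sketch (the toy data and coefficient values are made up for illustration) of the Bernoulli negative log-likelihood that maximum likelihood estimation minimizes to find the coefficients:

```python
import math

def neg_log_likelihood(b0, b1, xs, ys):
    """Bernoulli negative log-likelihood for a one-predictor logistic
    model: each y_i ~ Bernoulli(p_i) with p_i = 1/(1 + e^-(b0 + b1*x_i)).
    MLE picks the (b0, b1) that make the observed 0/1 outcomes most
    probable, i.e. that minimize this quantity."""
    nll = 0.0
    for x, y in zip(xs, ys):
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))
        nll -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return nll

# Toy data: larger x tends to go with y = 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 1]

# A slope in the right direction fits these data better than a flat model.
assert neg_log_likelihood(-1.5, 1.0, xs, ys) < neg_log_likelihood(0.0, 0.0, xs, ys)
```

In practice an optimizer searches over (b0, b1) for the minimum; the sketch only shows the quantity being optimized.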
Thank you so much Brandon, these videos are amazing.
Hi Brandon, thank you for the fantastic videos on Logistic Regression. Best explanation ive come across.
I am glad to have found your channel, Brandon. I watched all the videos in this series. I have one question: when doing the modeling, are we using the logit function or the logit inverse function? My understanding is that we are using the logit inverse function, since the y axis represents the 0-1 range. But in later videos, when we put the coefficients into the function, we are using the logit function. So I am confused. Thanks!
You are the first and only person I subscribed to and also turned on notifications for.
One of the best explanations of Logit!!!
Thank you so much for this video. Really aided to my understanding.
In case anyone, like me, is confused by the part at 9:20, you might want to skip it and look at 13:45 instead, which is the same thing but clearer (for me at least).
Awesome videos...! this time on logistic regression.... so lucidly explained... THANK YOU VERY MUCH ..Sir...
King of statistics....
Great explanations!
Could you please make a video about maximum likelihood for logit models?
have you found it out ?
Thank you, you are amazing! I cannot thank you enough!
Congrats and thanks for making these videos! SO informative
made the concept crystal clear ... Thanks
How do you derive the inverse logit function?
Everything I needed. Thanks.
Brilliant and easy to understand 🙂
Great mind, what an explanation!!
I didn't understand how the origins of the sigmoid function link with the nature of the problem. I did understand that for logistic regression we want a parameterized S-shaped function, and we want to find the parameters that minimize the SSE against the data points. The sigmoid is one such function, but there are others as well.
i feel like the other two were well paced, but I got a bit lost here. Maybe just me.
Agree, a big jump for me!
Love it!! thank you so much, you saved my day again!
Just a quick question, I am confused. At 14:44 it is mentioned that logit(p) = ln(p/(1-p)) is the log of the odds ratio and not the log of the odds. But ln(p/(1-p)) is the log of the odds, not the log of the odds ratio.
You made the mathy part less messy..thanks
Please cover MLE in your future videos.
Quick question: is the antilog the same as inv(log), the inverse of the log?
Subscribed. Keep up the good work! :)
do something on multiple logistic regression especially with age classes
I can't figure out why the logit is equal to the linear combination of the independent variables. Can you explain that to me?
Стилиян Валериев same here. What's the reason behind equating the two?
Стилиян Валериев did you figure it out? I have the same doubt.
We are trying to get an equation with a similar range on both sides; both the logit function and the linear combination of independent variables have infinite range.
Hey Akhil, thanks for the reply, but I still didn't get why the logit is equal to the linear combination of the independent variables.
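To put a number on Akhil's point: the logit stretches the (0, 1) probability scale out over the entire real line, which is also the natural range of b0 + b1*x1 + ... + bk*xk. A quick Python check:

```python
import math

def logit(p):
    # log-odds: maps a probability in (0, 1) to the whole real line
    return math.log(p / (1 - p))

# Probabilities near 0 map far into the negatives, near 1 far into the
# positives, and 0.5 maps to exactly 0 -- no bounded interval involved.
assert logit(0.5) == 0.0
assert logit(0.001) < -6
assert logit(0.999) > 6
```

So equating logit(p) with the linear combination at least matches two quantities that can both take any real value, which a raw probability cannot.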
Appreciate the content, thank you!
Awesome! It is very clear!
+Brandon Foltz Nice explanation, Brandon. One question (pardon my ignorance): what is the intuition behind taking the natural log of the odds for the logit instead of using the odds itself?
very nice video, good work!
Where can we learn more about the MLE method used here?
you are simply awesome
God bless you
The equation displayed at 14:32 is not entirely clear to me: why can we write logit(p) as a linear combination of the independent variables? I do not see it as a straightforward equation. I mean, I am used to seeing the utility as b0 + b1X, but why the logit(p)? Is that a definition? Thanks
Great videos, sir. Just one doubt: what if the dependent variable has more than 2 categories? In that case the Bernoulli distribution will not be applicable, right? If so, what is the way out?
Amazing video, thank you so much.
another amazing video
Hello, thank you for the nice explanations of logistic regression. Very useful and easy to follow. Just one question: after having fitted a logistic regression model (say, binary), if one wishes to calculate the probability of the positive outcome on the basis of the results, does one feed into the equation just the predictors that turned out to be significant, or all the original predictors even though they proved not significant?
I ask because, while I would lean toward the first solution, I see that software like XLSTAT returns a logistic regression equation that also comprises the non-significant predictors. I am rather confused about that.
Hi Brandon, thank you for nice tutorials.
Which tool do you use to plot the odds ratios ?
great help thanks!
p/(1-p) is defined as the odds, so why are you defining the logit as the log of the odds ratio? Shouldn't it be log(odds) only, and not the ratio? Can you please clarify?
I got my answer in the comments below .. thank you :)
independent variables could be binary or continuous right?
Awesome - thanks!
The "antilog" is also called the function e^x :D
Anyone could explain why logit(p) can be expressed as the linear combination of independent variables?
Can you tell me the difference between logit and probit?
Thank you !
Excellent video amigo
Thanks!
Dear, could you share your slides as well, so that I could review them in the future?
thank you
Thanks a lot
What is the value of e?
Hey, can someone show me how the inverse of the logit is calculated? Thank you in advance.
wow! thanks
Let us all unite unanimously and send a formal request to the Vatican, so this guy can be canonized in the future. His deeds have saved innumerable individuals from the unforgiving claws of despair
Excellent video - do you want me to translate this in to russian?
niiiice
🤗
I love you sir (no homo)!
I almost shared this with a large audience, but it says the owner has disabled embedding.
+Miller Intel appreciate the thought. The link works just fine as well. Hope you would consider sharing the video using a direct link. Thanks!
In case somebody's wondering, the inverse of the logit is found the following way:
1. logit: ln(p/(1-p)) = y
2. exponentiate both sides: e^y = p/(1-p)
3. p/(1-p) = e^y
4. p = e^y(1-p)
5. p = e^y - e^y(p)
6. p + e^y(p) = e^y
7. p(1+e^y) = e^y
8. p = e^y/(1+e^y)
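The derivation can be checked numerically; here is a small Python sketch (the function names are my own) confirming that p = e^y/(1+e^y) undoes the logit:

```python
import math

def logit(p):
    # forward direction: ln(p/(1-p))
    return math.log(p / (1 - p))

def inv_logit(y):
    # the result derived above: p = e^y / (1 + e^y)
    return math.exp(y) / (1 + math.exp(y))

# Round trip: applying inv_logit after logit recovers the original p.
for p in [0.1, 0.25, 0.5, 0.9]:
    assert abs(inv_logit(logit(p)) - p) < 1e-12
```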
Hi Kaysar777. You did not find the inverse, as far as I can tell. What you did was raise e to both sides. That is a legitimate algebraic operation and does not give an inverse. For example, if y = x, then sqrt(y) = sqrt(x); taking the square root of both sides is a legitimate algebraic operation, but doing that does not give the inverse. Legitimate algebraic operations DON'T change the function (the solution set is the same). However, obtaining an inverse gives an entirely new function (a completely different set of solutions). The inverse can be found by switching x and y. That is not what you did, is it?
This is amazing .
So in other words , logistic regression is just another curve fitting but the curve is sigmoid-like .
I was completely OK through videos one and 2, and then this third video got me :( I have no idea what's happening after 3:30, which really sucks. When you put up the big graph, you didn't explain what was happening; from about 5:30 onwards I literally had no idea what I was looking at on the graph, and it has sadly all gone downhill from there. I would love a clearer, simpler version of the material from the logit onwards... so lost.
Cool explanation! It is really helpful!
Hi Brandon, first of all a huge thank you for making this great video series on logistic regression. These videos are very well made and also really great from a didactic standpoint. I felt like I could feel the synapses forming while watching these videos :).
(subscribed)
too logit to quit
For my fellow friends wondering why the logit function can be linked to a linear combination of independent variables, this link gives a relatively good short explanation:
Why is the logistic classification model specified in this manner? Why is the logistic function used to transform the linear combination of inputs x?
The simple answer is that we would like to do something similar to what we do in a linear regression model: use a linear combination of the inputs as our prediction of the output. However, our prediction needs to be a probability and there is no guarantee that the linear combination x is between 0 and 1. Thus, we use the logistic function because it provides a convenient way of transforming x and forcing it to lie in the interval between 0 and 1.
We could have used other functions that enjoy properties similar to the logistic function. As a matter of fact, other popular classification models can be obtained by simply substituting the logistic function with another function and leaving everything else in the model unchanged. For example, by substituting the logit function with the cumulative distribution function of a standard normal distribution, we obtain the so-called probit model.
www.statlect.com/fundamentals-of-statistics/logistic-classification-model
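To make the quoted explanation concrete, here is a minimal Python sketch (the coefficients b0 and b1 are made up for illustration) showing that the logistic transform forces any linear combination of inputs into the interval (0, 1):

```python
import math

def predict_probability(x, b0, b1):
    """One-predictor logistic model: the linear part b0 + b1*x can be any
    real number, but the logistic function squeezes it into (0, 1) so it
    can be interpreted as a probability."""
    linear = b0 + b1 * x
    return 1 / (1 + math.exp(-linear))

# Even fairly extreme inputs yield a valid probability with made-up coefficients.
for x in [-10.0, -1.0, 0.0, 1.0, 10.0]:
    p = predict_probability(x, b0=0.5, b1=0.8)
    assert 0 < p < 1
```

Swapping the logistic function here for the standard normal CDF would give the probit model mentioned in the quote, with everything else unchanged.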
Well. Maybe this is premature, but I have to say this video is very well done and clear. Believe me, most are not. Even finding an explanation for logistic regression via google does not produce good results. My only negative here is that the video refers to "Video 4", but the video titles do not seem to contain any "numbers" other than "101".
Excellent explanation, thank you very much. Many books omit this relation.
How or why do we think of estimating the probability using a linear combination of the independent variables? Also, how can we be sure that this method of trying to predict a probability using a regression on the variables will work?
A quick question: why does the logistic function use 1/(1+e^-x) instead of some other similar S-shaped curve, like -cotangent?
Can you please make a video of Ada boost, random forest and SVM regression?
It is the natural logarithm of the odds and not the odds ratio, as we are taking p/(1-p). Am I correct?
Hi Brandon, thanks for all your videos great help. Can you please add videos on decision trees and random forests as well.
Why do you sometimes use odds and odds ratio interchangeably
This is a great video - really explains the concepts in simple, easy to understand terms. Thanks Brandon!
This was really insightful. Thanks!
you are just awesome....
Hi, I want to ask you: I used a linear regression model and tested it with ROC, and got better predictability; but when I used the logit function, the ROC was no better. Why?
I really appreciate that at some point you started including the play list and video # directly at the beginning of the video.
Thank you for the video. I still don't clearly get what drives the assumption that the logit is represented as a linear equation. This is, for me, the core of logistic regression, but all the readings assume this relation without giving any motivation behind it.