Deriving the least squares estimators of the slope and intercept (simple linear regression)
- Published 21 Mar 2019
- I derive the least squares estimators of the slope and intercept in simple linear regression (using summation notation, and no matrices). I assume that the viewer has already been introduced to the linear regression model, but I do provide a brief review in the first few minutes. I assume that you have a basic knowledge of differential calculus, including the power rule and the chain rule.
If you are already familiar with the problem, and you are just looking for help with the mathematics of the derivation, the derivation starts at 3:26.
At the end of the video, I illustrate that sum (X_i - X bar)(Y_i - Y bar) = sum X_i(Y_i - Y bar) = sum Y_i(X_i - X bar), and that sum (X_i - X bar)^2 = sum X_i(X_i - X bar).
There are, of course, a number of ways of expressing the formula for the slope estimator, and I make no attempt to list them all in this video.
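For reference, here is a quick sketch of why the second identity holds, using only the fact that deviations from the mean sum to zero:

\[
\sum_{i=1}^n (X_i - \bar{X})^2
= \sum_{i=1}^n X_i (X_i - \bar{X}) - \bar{X} \sum_{i=1}^n (X_i - \bar{X})
= \sum_{i=1}^n X_i (X_i - \bar{X}),
\]

since \(\sum_{i=1}^n (X_i - \bar{X}) = \sum_{i=1}^n X_i - n\bar{X} = 0\). The chain of identities for the cross-product term follows from the same trick.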
Finally, someone who made it simple to understand! Thank you!
Right! I went through like a million videos trying to understand this one segment, and this was the first to do it.
True.
Please, how is it possible to consider beta a variable (when taking derivatives) and then consider beta a constant (to take it out of the sum)?
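Not the author, but here is the standard resolution: while minimizing, beta_0 and beta_1 are the variables of the function being differentiated; but neither depends on the summation index i, so with respect to the sum they are constants and can be pulled out. In symbols (a sketch):

\[
S(\beta_0, \beta_1) = \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2,
\qquad
\sum_{i=1}^n \beta_0 = n\beta_0,
\qquad
\sum_{i=1}^n \beta_1 X_i = \beta_1 \sum_{i=1}^n X_i.
\]

"Constant" here means constant in i, not constant in beta.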
Best video . i was looking for something like this
This is an underrated video.
@Isaiah Matias is it?
My god, you explained this so easily. It took me hours trying to understand this before watching this video but still couldn’t understand it properly. After watching this video, it's crystal-clear now. ❤️
Me too. I spent a whole morning figuring this out. He is a savior.
I don't usually comment on teaching videos. But this really deserves thanks for how clearly and simply you explained everything. The lecture I had at university left much to be desired.
I can't thank you enough for this brilliant explanation!
I never thought that I could understand simple linear regression using this approach. Thank you
My Physical Chemistry teacher spent ~1.5 hrs showing this derivation and I got completely lost. Watching your video, it's so clear now. Thank you for your phenomenal work.
Unbelievably perfect video, one of the best videos I have watched in the statistics field. So rare to find high-quality content in this field, idk why.
Absolutely beautiful derivation!
Crystal clear!
Thanks very much.
Can I apply this in a Mumbai University exam?
I have gone through tons of materials on this topic and they either skip the derivation process or go straight into some esoteric matrix arithmetic. This video explains everything I need to know. Thanks.
Phenomenal video. Thank you for taking the time to explain each step of the derivation, such as the sum rule for derivatives. Thank you for helping me learn.
You are very welcome!
Amazing video! Slight bumps where my own knowledge was patchy but you provided enough steps for me to work those gaps out.
Thanks so much, this was so easy to follow and comprehend!
Glad you're back!
Thanks! Glad to be back! Just recording and editing as I type this!
You have no idea how you saved my life. I was struggling so hard to find out why sum x_i(x_i - x bar) = sum (x_i - x bar)^2, etc., and you are the first one I found who explained that.
Really, really good explanation!! Thank you!!
Thank you so much! This explanation is literally perfect, helped me so much!
Thanks for the kind words! I'm glad to be of help!
This is FANTASTIC. THANK YOU!
You made it simpler than my lecturer does. Thank you!
Absolutely brilliant video!!! Thanks so much
one video on youtube that actually explains something properly
Thank you for actually explaining it, most videos are just like "hi, if you want to solve this, plug in this awesome formula and that's it, thank you for watching :)"
The best part of this video is finally figuring out where that "n" came from in the equation for beta-naught-hat. Thank you so very much for making this available.
I'm glad to be of help!
Thank you so much for such a clear explanation! It helps me a lot in preparing for my upcoming final exam.
Thank you very much! Very clear and interesting explanation!
Thank you for explaining in such details ❤️
Great explanation! Step by step...
Thanks a lot, sir, this is really what I needed. 🙏🙏🙏🙏🙏🙏🙏🙏 There are no words to express my appreciation for your efforts.
Beautiful video, good explanation
Very helpful video to understand. Many thanks!
Thank you so much, this video has cleared all my confusion, cuz the book I'm reading just says 'by doing some simple calculus'.
thanks a lot for simplifying the derivation
This is talent. Thank you so much 😊
Awesome video sir! Thank you!
Exactly what I was looking for. Thank you so much!
You are very welcome!
Grateful; you are wonderful, sir. You have made me understand economics.
Great sir, very helpful!
You are awesome! I am not a native speaker and still struggling with the master's program courses in the US, but your instruction is so helpful. I appreciate your great help.
Thanks! I'm happy to be of help!
Thank u soooo much for explaining this! You made my day.
Best explanation, thank you so much
finally, I've understood this bloody thing. Thank u sooooo much m8.
I'm glad you found it helpful!
This was really helpful thanks!
This is incredible, thank you so much! :)
You are very welcome!
Excellent video!
Excellent video
Thank you. This is very clear
Thank you very much. This video helped me a lot.
Great job! Thank you sir!
Yes! New stuff 👍🏼👍🏼
Greatly explained
Thank you so much, I am really enjoying and understanding what you're teaching.
holy hell I wish you were my econometrics professor. mine is useless
Thanks, so helpful!
Very well explained
Question at 6:24: why and how is beta zero hat multiplied by n? Does n mean the sample size? What's the reasoning behind n appearing alongside beta zero hat?
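For anyone else stuck at this step: yes, n is the sample size. Beta zero hat carries no index i, so summing it over i = 1, ..., n contributes n beta zero hat. A sketch of the step:

\[
\sum_{i=1}^n \left(Y_i - \hat\beta_0 - \hat\beta_1 X_i\right) = 0
\;\Longrightarrow\;
\sum_{i=1}^n Y_i = n\hat\beta_0 + \hat\beta_1 \sum_{i=1}^n X_i
\;\Longrightarrow\;
\bar{Y} = \hat\beta_0 + \hat\beta_1 \bar{X},
\]

where the last step divides through by n, turning the sums into means.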
Thanks Dr. Balka! Is it computationally expensive to estimate the parameters in this manner for models with many independent variables or very large datasets? Is that the reason why iterative methods such as gradient descent are sometimes employed instead?
The matrix inversion in OLS is computationally expensive, hence numerical methods like gradient descent are useful.
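To make that concrete, here is a minimal sketch (assuming NumPy; the toy data, variable names, and settings are made up for illustration, not from the video) comparing the closed-form estimates with plain gradient descent on the same least squares objective:

import numpy as np

# Hypothetical toy data: y = 2 + 3x + noise
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(size=200)

# Closed-form least squares estimates (the formulas derived in the video)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Gradient descent on the mean squared residuals (illustrative settings)
beta0, beta1, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    resid = y - beta0 - beta1 * x
    beta0 += lr * 2 * resid.mean()        # minus the gradient w.r.t. beta0
    beta1 += lr * 2 * (resid * x).mean()  # minus the gradient w.r.t. beta1

print(b0, b1)        # closed form
print(beta0, beta1)  # approximately the same after enough iterations

For one or two predictors the closed form is essentially free; iterative methods pay off when the model is large enough that forming and inverting X^T X becomes the bottleneck.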
good explanation!
you make it sooo easy
Wow. This is great. Thank u so much.
Really nice derivation!
Thanks!
Hi, do you have a video on deriving coefficients in multiple regression?
That is a fun derivation using linear algebra and calculus. The first step is the same as here: take the first derivative and set it equal to zero. The book "The Elements of Statistical Learning" has a good proof. I'd say one needs a calc 1 and linear algebra background first, though.
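For reference, the standard multiple regression result (stated here, not derived): stacking the predictors into a design matrix X with a leading column of ones, minimizing the residual sum of squares gives the normal equations and their solution

\[
\min_{\beta} \,(y - X\beta)^\top (y - X\beta)
\;\Longrightarrow\;
X^\top X \hat\beta = X^\top y
\;\Longrightarrow\;
\hat\beta = (X^\top X)^{-1} X^\top y,
\]

provided \(X^\top X\) is invertible.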
Amazing and super helpful video! Extremely simple and easy to follow! But please, quick question: Why did you switch the Xi and Xbar at 7:51? This drastically changes the ending solution.
When he removes the inner parentheses, the X_i term becomes negative and the X bar term becomes positive. So when you multiply by the negative beta, the signs of both terms reverse.
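In symbols, a sketch of what that step appears to be doing:

\[
-\hat\beta_1 \sum_{i=1}^n (\bar{X} - X_i) = \hat\beta_1 \sum_{i=1}^n (X_i - \bar{X}),
\]

so swapping X_i and X bar inside the parentheses just absorbs the minus sign; the final solution is unchanged.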
Easiest subscribe of my life.
Great video Brother
Thanks so much!
THANK GOD, FINALLY SOMEONE TRIED TO DERIVE THE FORMULA,
INSANE THAT NEARLY ALL OTHER RESOURCES OMIT THIS SHIT
LEGEND, HAVE TO SAY YOU ARE BETTER THAN A PROFESSOR
I *am* a professor!!!
definitely the best video out there on this topic
makes me wonder why it's not the top recommendation / search result
maybe because of the title
Thanks! How about "Finding the formulas for the slope and intercept the EASY WAY! (When I got to step 8, my jaw DROPPED!)" :)
@@jbstatistics no idea, I'm not good at making up titles. Maybe something like this?
Simple Linear Regression | Derivation
@@robin-bird I was just joking, but thanks for the input :)
@@jbstatistics I was wondering - thanks for clearing that up ^^
Nice trick! Adding an intelligent zero huh?
Thanks for this video!
Thank you so much.
Thanks a lot, it really helped.
u are a life saver
finally got my doubt resolved.😊
Thank you very much for the clear explanation ❤
You, sir, are AMAZING.
Thanks for the video. Just wondering why x and y can be considered constants when differentiating with respect to B0 or B1? Is it because of partial differentiation, or because X and Y are known numbers?
I think you are asking why B0 hat and B1 hat should be considered constants for the sample. They are clearly not going to change for that sample.
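Both readings are right, for what it's worth: the data (x_i, y_i) are observed numbers, and the partial derivative treats everything except the chosen beta as fixed. For example, differentiating the sum of squared residuals with respect to beta_1 (a sketch):

\[
\frac{\partial}{\partial \beta_1} \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2
= -2 \sum_{i=1}^n x_i \,(y_i - \beta_0 - \beta_1 x_i),
\]

treating y_i, x_i, and beta_0 as fixed.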
Good video. Thanks!
Awesome!
Thank you very much sir !
You are very welcome!
Thank you so much
Wow, many university lecturers can't explain it this well!
At 10:52 in the timeline, how can we switch the roles of X sub i and Y sub i? Could you help explain how this happens?
In the first step, we choose to expand (Xi - Xbar) but we could have chosen to expand (Yi - Ybar) and it would follow a similar route.
At 2:24, where did you discuss why it makes sense to minimize the sum of squared residuals?
Squaring makes it more sensitive to bigger errors, and it's differentiable at all points. The mod (absolute value) function is not differentiable at the point where it pivots up.
@@aakarshan01 But why not power 1.5? Why not power 4? Why is it exactly power 2?
@@SuperYtc1 You can, but there's no need to. Differentiability is already achieved with the square. Why compute a bigger number that could cause problems? The fourth power of a number is more likely to overflow or underflow a float than the square is, and a non-integer power like 1.5 isn't even real-valued for negative residuals. But in theory, you can.
Thanks a lot!!!
The result represents the minimum, since the function we are minimizing is convex and opens upwards, so any critical point must be a minimum.
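Concretely, this can be verified with the second-order conditions (a sketch, assuming not all x_i are equal):

\[
\frac{\partial^2 S}{\partial \beta_0^2} = 2n, \qquad
\frac{\partial^2 S}{\partial \beta_1^2} = 2\sum_{i=1}^n x_i^2, \qquad
\frac{\partial^2 S}{\partial \beta_0 \, \partial \beta_1} = 2\sum_{i=1}^n x_i,
\]

so the determinant of the Hessian is \(4n\sum_i x_i^2 - 4(\sum_i x_i)^2 = 4n\sum_i (x_i - \bar{x})^2 > 0\), which together with \(2n > 0\) confirms the critical point is a minimum.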
Did we consider beta naught hat and beta one hat as variables for the partial derivatives in this problem? Usually they are constants in a straight line, right? Why did we take them as variables? If anyone knows the answer, please reply.
this is great
THANK YOU
Nailed it
Great video. My summary just gave the formula with the text 'just remember this'. Hate that.
Thank you
At 10:43, can you please tell me why we can so easily swap the roles of x and y? Is it based on any properties or formulas?
The initial term is sum (X_i - X bar)(Y_i - Y bar). While in the video I split up the (Y_i - Y bar) term, leaving (X_i - X bar) intact, I could have just as easily split up the (X_i - X bar) term instead, and using the same steps as I did in the video, end up with sum (Y_i - Y bar)X_i.
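Written out, the version described above (a sketch):

\[
\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})
= \sum_{i=1}^n (X_i - \bar{X})\, Y_i - \bar{Y} \sum_{i=1}^n (X_i - \bar{X})
= \sum_{i=1}^n Y_i (X_i - \bar{X}),
\]

since the deviations of the X_i from their mean sum to zero; splitting the other factor gives sum X_i(Y_i - Y bar) by the same argument.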
Thanks so much
Why do we take the sum of squared residuals, rather than just the residuals, before taking the partial derivatives with respect to alpha and beta?
In which video does he discuss why we use squared residuals?
thank you 100^100 times
Wait, at 6:45, how do you divide the summations by n and get y bar itself? y sub i isn't a constant, so how does the division even work?
OOHHH NOOOO, IT'S THE MEAN. NOW I GOT IT. JUST GONNA LEAVE THIS HERE TO SHOW HOW STUPID I CAN BE
I'm new to this formula and the big data field. What mathematical knowledge should I learn prior to watching this video? Thank you
@@noopyx3414 Oh man, you're lucky, I just logged in to YouTube. Prior to this formula, I'd really suggest you check out Brandon Foltz's Statistics 101: Linear Regression series. There he explains what this formula, and other stuff regarding the topic, is all about.
@@kaanaltug455 Thank you very much!
thank u so much.
How can we find the values of the intercept and slope, B0 and B1?
He says that he describes why we square it elsewhere. Does anyone know which video that is?