Linear Regression and Multiple Regression
- Published 16 Nov 2024
- In this video, I will be talking about a parametric regression method called "Linear Regression" and its extension to multiple features/covariates, "Multiple Regression". You will gain an understanding of how to estimate coefficients using the least squares approach (scalar and matrix form), which is fundamental to many other statistical learning methods.
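As a rough illustration of the two forms mentioned in the description, here is a minimal NumPy sketch. The toy data, the seed, and all variable names are assumptions made for this example and do not come from the video; it only shows that the scalar formulas and the matrix formula give the same coefficients.

```python
import numpy as np

# Hypothetical toy data: y is roughly 2 + 3*x plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5, size=50)

# Scalar form (simple linear regression).
x_bar, y_bar = x.mean(), y.mean()
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar

# Matrix form: solve (X^T X) beta = X^T y, with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print("scalar form:", b0, b1)
print("matrix form:", beta_hat)
```

Solving the linear system with `np.linalg.solve` is used here instead of forming the inverse explicitly; both correspond to the same formula on paper.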
More on Matrix Calculus: atmos.washingt...
Fantastic work. Usually tutorial videos about linear regression or multiple regression simply give the formulas out of nowhere, without explaining the rationale behind them. Thanks for taking the time to dive into the underlying maths :)
Pretty amazing, especially since nobody really covers the mathematics behind ML, really appreciate the math based content.
Yesss! Math is underappreciated
this is a very well made video but this is always covered in statistics
This exposition is timely. I have battled over the disappearance of Y transpose Y in the matrix approach of a least squares for months until I came across this video. This is awesome. I am speechless.
After multiplying out the brackets at 9:00, the third term of the result should contain the transpose of B hat, not just B hat.
Correct
yes just a typo....
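For reference, the expansion this thread is discussing can be written out as follows. This is a sketch in standard notation, assuming y is n×1, X is n×p, and β̂ is p×1; the symbols may not match the slides exactly.

```latex
\begin{aligned}
\mathrm{RSS}
  &= (y - X\hat\beta)^{T}(y - X\hat\beta) \\
  &= y^{T}y - y^{T}X\hat\beta - \hat\beta^{T}X^{T}y + \hat\beta^{T}X^{T}X\hat\beta \\
  &= y^{T}y - 2\,\hat\beta^{T}X^{T}y + \hat\beta^{T}X^{T}X\hat\beta , \\[4pt]
\frac{\partial\,\mathrm{RSS}}{\partial\hat\beta}
  &= -2\,X^{T}y + 2\,X^{T}X\hat\beta = 0
  \;\Longrightarrow\;
  \hat\beta = (X^{T}X)^{-1}X^{T}y .
\end{aligned}
```

The two middle terms merge because y^T X β̂ is a scalar and therefore equals its own transpose β̂^T X^T y. The y^T y term does not depend on β̂, so it drops out when differentiating.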
very well explained. I have been searching such video for many days. Now, the concept is crystal clear.
This is a great video - I was looking for the math behind calculating the co-efficients in multiple linear regression and this explains it perfectly. Thank you!
Thanks Sidharth! Glad it helped! Mind sharing the video to help others like you? :P
Thank you for the video! And I'd love to share it with others :)
Also, you just got a subscriber! Let's see you get to 1K soon !
Thank you! Much Appreciated! I'm trying to upload more regularly than I have done in the past. There should be a lot more where that came from very soon.
Yup! Any platform I can network with you on by the way? Quora for example?
Sidharth Ramanan Quora is good. I'm under the name "Ajay Halthor".
Excellent video, highly illuminating to finally see a comprehensive explanation of things that are too often left unexplained. I wish far more people, books, and videos explained statistics in similar detail.
Wow! This is the best video to quickly understand the derivation of linear regression formulas!
Needed some refresher on a math class from grad school, and this really hit the spot. Thank you!
this is the best video on multiple linear regression
Thanks so much!
I have tried many ways to find a decent derivation for multiple regression. I found that the key is understanding the matrix differentiation rules, which I was missing all those times. This is the first time I got a clear understanding of the formula. Thanks a lot.
After a week of searching, I finally found you. Thank you so much!
Great explanation. Keep going!
The search is over. Join me in turning this world into -- nah just kidding. Glad you Finally found me. Hope you stick around
One of the best explanations on this topic. And the presentation is superb
Splendid, and no words are sufficient for such a lucid explanation
Thanks for the compliments! :)
I am binging the concepts and might forget to like - great channel.
Excellent explanation with precise terminology!
You explained this 1000000000000000000000000000x better than my professor. Thank you!
Ryan Smith Thanks! So glad it was useful!
Very nice explanation! Very clear! I was looking for exactly the same.
Thanks a lot. This is the most comprehensive regression video on YouTube.
Kunt's Bro Thanks! Regression is an important topic, thought I'd take time explaining it
Incredible video for the derivation!
Finally a video which makes perfect sense. Thanks a lot bro.
Making sense is what I do :)
Excellently explained. Very lucid
Glad this is useful. Thank you :)
This is a great video and explained things so clearly! Thanks!
Thanks for the compliments! :)
Excellent!
Very nice to see the scalar and matrix approach :)
This is one of the best videos
Wonderful video! very useful and clear!
Amazing, thanks to the map you just drew I feel confident to learn the deeper concepts!
Extremely clear. Bang on!
Hats off for your efforts! Really fun way to learn algorithms. Please post more videos on other machine learning algorithms.
At 11:10, the quadratic form rule for matrix differentiation should be x^T(A^T + A). Only when A is symmetric does the derivative reduce to 2 x^T A (as used in the last term of d(RSS)/dx).
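The point about the quadratic form can be checked numerically. The sketch below compares a finite-difference gradient of x^T A x with (A + A^T) x, which is the column-vector version of the row form x^T(A^T + A) in the comment. The matrix, the vector, and the step size are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))   # deliberately NOT symmetric
x = rng.normal(size=4)

def quad(v):
    """Quadratic form v^T A v."""
    return v @ A @ v

# Central finite differences, one coordinate direction at a time.
eps = 1e-6
numeric = np.array([(quad(x + eps * e) - quad(x - eps * e)) / (2 * eps)
                    for e in np.eye(4)])

general = (A + A.T) @ x       # holds for any square A
symmetric_only = 2 * A @ x    # holds only when A is symmetric

print("matches (A + A^T) x:", np.allclose(numeric, general, atol=1e-5))
print("matches 2 A x:      ", np.allclose(numeric, symmetric_only, atol=1e-5))

# With a symmetric matrix the two formulas coincide.
S = (A + A.T) / 2
print("symmetric case agrees:", np.allclose((S + S.T) @ x, 2 * S @ x))
```

In the regression derivation the matrix in question is X^T X, which is always symmetric, so the shorter rule is safe there.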
Super helpful and very clear! Thank you so so much!
Great video, exactly what i was searching for,
how did they get that matrix equation was exactly what i needed!
thanks a lot man!
Hands down my dawg❤️❤️ Very well explained
This video just made my day.
Absolutely loved it...
This was really helpful. I'm taking a unit on data mining with no statistics background. Thanks for sharing your knowledge 👊
wow good luck!!!
Thanks....put more videos on regression analysis
Glad you enjoyed it! Will think of more Regression based videos in the future
At minute 10:27: x is m×1 and A is m×n. The 3rd differentiation rule is about y = x*A. But, given the sizes of the matrices, how can you multiply x*A?
I have the same question. Were you able to clear it up?
here the differentiation rule should be: let scalar y=x^T A, then dy/dx = A^T
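One way to reconcile the dimension question is to write the rule with an explicit transpose. The sketch below assumes x is m×1 and A is m×n, as stated in the question; whether the result is written as A or as A^T depends only on the layout convention chosen for the derivative.

```latex
\begin{aligned}
y &= x^{T}A \in \mathbb{R}^{1\times n},
\qquad y_j = \sum_{i=1}^{m} x_i A_{ij},
\qquad \frac{\partial y_j}{\partial x_i} = A_{ij}, \\[4pt]
\frac{\partial y}{\partial x} &= A
  \quad \text{(denominator layout, an } m \times n \text{ array)},
\qquad
\frac{\partial y}{\partial x} = A^{T}
  \quad \text{(numerator layout, an } n \times m \text{ array)}, \\[4pt]
\text{special case } n = 1:&\quad
\frac{\partial\,(x^{T}a)}{\partial x} = a .
\end{aligned}
```

In the regression derivation the special case applies with x = β̂ and a = X^T y, which gives the −2 X^T y term of the gradient.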
It's nice that the video shows some matrix differentiation rules, but I recommend the more serious propositions in: atmos.washington.edu/~dennis/MatrixCalculus.pdf
Very clear explanation … better than doing it by considering the projection on the model space and using the projection formula (t(X)X)^-1t(X)Y
thank you so much from korea
Thanks! You just saved my life!
Raquel Morales Anytime. Saving lives is what I do.
Thanks, this is an amazing video. It was very helpful.
Many thanks!
Very well explained. Thank you.
Thank you very much it was very helpful
Really nice explanation, you have deep knowledge. How can we minimize the error term?
Amazing work
Great explanation
My friend, you saved my bachelor's presentation.
great job man !
This video is so good, it explained several weeks of a course to me in 12 minutes smh
Glad this is useful :)
Is there something wrong at @8:58? Shouldn't B(hat) be B(hat)(Transpose)?
Super good
Very well explained!! Tq❤
beginning from 8:54 the RSS should have the third term as -(β_hat)^T X^T y instead of -(β_hat) X^T y, the transpose sign is missing here
You're right. And I think it should be: "y = x^T A => dy/dx = A".
Very nicely explained
Great video, thanks for your effort 😁
I just have two questions:
1. in the last RSS equation, why is T removed from beta_hat in the third term
2. how is y = xA feasible given x has dimension (m x 1) and A has dim (n x m)
Appreciate your help please. Thanks!
Way too cool!!! I am enjoying this video!
Thank you! It helped me a lot.
1 question.
The method you described above is the normal equation, as in Andrew Ng's machine learning course. The other ways to find the coefficients are gradient descent, BFGS, L-BFGS, etc.
Correct me if I am wrong.
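To make the comparison in this comment concrete, here is a small sketch that fits the same model with the normal equation and with plain batch gradient descent. The data, the "true" coefficients, the learning rate, and the iteration count are all assumptions picked for this toy problem, not values from the video or from any course.

```python
import numpy as np

# Hypothetical data: an intercept column plus two standard-normal features.
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
true_beta = np.array([1.0, -2.0, 0.5])   # made-up coefficients
y = X @ true_beta + rng.normal(0.0, 0.1, size=200)

# Normal equation: one linear solve.
beta_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent on the mean squared error.
beta_gd = np.zeros(3)
learning_rate = 0.1                       # assumed; must be small enough to converge
for _ in range(2000):
    grad = (2.0 / len(y)) * X.T @ (X @ beta_gd - y)
    beta_gd -= learning_rate * grad

print("normal equation: ", beta_closed)
print("gradient descent:", beta_gd)
```

Both approaches minimize the same objective, so they should land on the same coefficients here. Iterative methods are mainly attractive when the number of features makes the linear solve expensive.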
There's a mistake at minute 9:00, the third term of the expanded version of RSS is -(beta' x' y)
In your logistic regression video, I am not sure how you came up with the two exponents when you formed the product of p(x) and 1-p(x)
Nice video. What software do you use for writing those math expressions? I mean, is it the equation editor from MS Word? Thank you.
Great video! Probably the best in explanation of math behind linear regression. Is there a way to do multiple non-linear regression?
9:40 How does the third formula work?
Here, the dimensions of xA do not satisfy the condition for matrix multiplication
thank you, you also saved my life :)
Can you upload a PDF of the formulae you show in this video?
Good job, thanks
thank you very much for this amazing video, it was really helpful
do you have any other videos about : polynomial regression and non linear regression ?
Great math , thank you
Great video, thank you!
Something I don't understand is this: in simple linear regression, you take the mean of the squared error, but in multiple regression, what happened to taking the mean?
Do X and y in the resulting formula have components with the mean?
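A short worked step may help with this question. Assuming the usual setup with n observations, dividing the sum of squares by n only rescales the objective by a positive constant, so the minimizing coefficients are unchanged:

```latex
\begin{aligned}
\arg\min_{\beta}\; \frac{1}{n}\,(y - X\beta)^{T}(y - X\beta)
  &= \arg\min_{\beta}\; (y - X\beta)^{T}(y - X\beta), \\[4pt]
\frac{1}{n}\left(-2\,X^{T}y + 2\,X^{T}X\beta\right) = 0
  &\iff -2\,X^{T}y + 2\,X^{T}X\beta = 0 .
\end{aligned}
```

So the matrix formula has no separate mean term: the factor 1/n cancels when the gradient is set to zero.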
27.6K subscribers on 13 July 2020... is that close enough to your prediction?
When you removed the brackets, what happened to the transposes on B and X when multiplying with Y? In the last line of the simplification there is just B, not B transpose.
why is that, I am very confused
Which book you consulted??
You're the GOAT
Thanks!
Hey, in 2nd example, you got y = xA, how can you even multiply those two when dimensions don't match? (m x 1) * ( n x m) , thus 1 != n
Similar for 4th example where you got y = transpose(x) Ax ... I think A should be square matrix in this case (mxm).
2nd example: y = Ax, not xA.
4th example: You're right here. x^T A x has shape (1 x m) * (n x m) * (m x 1). This is valid only if n = m, i.e. A is a square matrix. Good catch! Should have mentioned that. In the derivation, we use it with X^T X -- which is square.
Hey,
sorry, my typo: I was referring to the 3rd example, y = xA, not the 2nd one.
And also are you sure that the last term B^T * X^T * X * B is the case of your 4th example. Because you can rewrite that expression as (X*B)^T * (X * B) and then it's a norm squared of matrix, and you say g(X) = X * B, and then you can apply derivative with respect to beta given by this formula: 2 * g(X)^T * d(X*B) / dX, which in this case would yield the same result, so after all you might be correct as well.
All the best.
I come from Psychology and am following data science courses rn. The completely different way of approaching regression was a mystery to me, but this video helped me a lot. I do feel like I should practise stuff like this myself too, do you have any suggestions for places where to find exercises.
Thanks for commenting and watching! Maybe a textbook might be good for establishing a foundation. You can check out "Introduction to Statistical Learning". Aside from that, I have a playlist on linear regression, though I admit it hops around some concepts. It still might be worth your watch.
Good, but I think I need to review some general math and sit down to work it out -- solving it's not hard, but it's good to know why it works.
For sure. One can always use builtin libraries to code it in a single line. But understanding why it works the way it does will help understand when to use it.
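As an example of the "single line" route mentioned in this reply, NumPy's `np.linalg.lstsq` solves the least squares problem directly. The tiny data set below is made up so that y = 1 + 2x holds exactly; it is only meant to show the call.

```python
import numpy as np

# Made-up data where y = 1 + 2*x exactly; first column of X is the intercept.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# The single line: returns coefficients, residual sum, rank, and singular values.
beta_hat, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)   # approximately [1, 2]
```

The returned rank is a quick hint about whether the columns of X are linearly independent.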
Thanks
Welcome!
Even the most simple things are hard to understand in depth
Can you give us the reference for the matrix differentiation used here?
Thank you!
9:44 Actually, it should be 2AX only if A is a symmetric matrix. Am I correct? Can anyone help, please?
Amazing. Thanks
Welcome! :)
Thank you so much, this video helps a lot :)
Yunqiang Gan Thanks! Glad you liked it!
Why is y predicted = beta hat X?
Shouldn't beta naught (the intercept) also be included?
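A common convention, which the video may or may not state explicitly (this is an assumption here), is to absorb the intercept into the matrix product by giving X a leading column of ones:

```latex
X =
\begin{bmatrix}
1 & x_{11} & \cdots & x_{1p} \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_{n1} & \cdots & x_{np}
\end{bmatrix},
\qquad
\hat\beta =
\begin{bmatrix}
\hat\beta_0 \\ \hat\beta_1 \\ \vdots \\ \hat\beta_p
\end{bmatrix},
\qquad
\hat y_i = (X\hat\beta)_i = \hat\beta_0 + \sum_{j=1}^{p} \hat\beta_j\, x_{ij} .
```

With that convention, the intercept is simply the first entry of β̂ and needs no separate term.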
How can we obtain intercept and slope of B0 and B1 after shifting line l to l'
Is it possible that there is a little error at 12.54 min, in the 3rd term of RSS: shouldn't beta be transposed?
this video made me more confused.
It's mostly for ML shit it's actually really helpful
GREAT!
Why, when calculating b_1, does the 1/n become n?
8.54 min: I cannot follow the 3rd term in the last line. Could anybody clear this up for me, please?
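One possible reading of this question, since the exact slide is not reproduced here, is that the slope formula was rewritten by multiplying numerator and denominator by n. Under that assumption the step looks like this:

```latex
b_1
= \frac{\sum_i x_i y_i - \frac{1}{n}\sum_i x_i \sum_i y_i}
       {\sum_i x_i^{2} - \frac{1}{n}\left(\sum_i x_i\right)^{2}}
= \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}
       {n\sum_i x_i^{2} - \left(\sum_i x_i\right)^{2}} .
```

Multiplying top and bottom by the same n leaves the value unchanged; it only moves the factor from the second term of each part to the first.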
THANKS
Is there a video that explains how this "arg min" works graphically? Like how it actually minimizes the residuals
....don't tell me you also have AMS 210 finals?
@norbertramaj3024 nope, I'm actually from Tunisia, not the States, but there are similar materials between AMS and what I study.
Minute 8:54. How can the third term be minus beta-hat X-transposed y?? I thought it should've been minus beta-hat-TRANSPOSED X-transposed y... can you help me?
You're mostly right. Might have missed that transpose out. There is a lot to keep track of here. If matrix multiplication works out, then that's good :)
@CodeEmporium thank you so much! your video is pure gold to me :) lots of doubts finally solved :)
Glad to help :)
How did you get A transpose as the derivative of y = xA? x and A don't have compatible dimensions to be multiplied.
Does the inverse of transpose(X)*X always exist in the formula? Why?
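On the existence question: X^T X is invertible exactly when the columns of X are linearly independent (X has full column rank), so the inverse does not always exist. The sketch below checks this with `np.linalg.matrix_rank`; the helper name `has_unique_ols_solution` and both matrices are hypothetical, chosen only for illustration.

```python
import numpy as np

def has_unique_ols_solution(X):
    """True when X has full column rank, i.e. X^T X is invertible."""
    return np.linalg.matrix_rank(X) == X.shape[1]

# Columns are linearly independent.
X_full = np.array([[1.0, 1.0],
                   [1.0, 2.0],
                   [1.0, 3.0]])

# Third column is exactly twice the second (perfect collinearity).
X_collinear = np.array([[1.0, 2.0, 4.0],
                        [1.0, 3.0, 6.0],
                        [1.0, 5.0, 10.0]])

print(has_unique_ols_solution(X_full))        # True
print(has_unique_ols_solution(X_collinear))   # False
```

Typical causes of a rank-deficient X are duplicated or perfectly correlated features, or having fewer observations than coefficients.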
Thanks a lottt
It's a pity we have not defined what the inverse of a non-square matrix would be.
If we had, (XT . X)^-1 . XT . y would be X^-1 . XT^-1 . XT . y = X^-1 . y , and I'd have more time to play video games.
Moore-Penrose. Pseudo Inverse.
@justahardworkingjoe Thanks. Will read!
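To connect the pseudo-inverse mentioned above with the formula from the video: when X has full column rank, the Moore-Penrose pseudo-inverse of X equals (X^T X)^{-1} X^T, so applying it to y gives the same coefficients. The sketch below uses random data (an assumption for illustration) and NumPy's `np.linalg.pinv`.

```python
import numpy as np

# Random tall matrix; with continuous random entries it has full column rank.
rng = np.random.default_rng(3)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)

beta_normal = np.linalg.solve(X.T @ X, X.T @ y)   # (X^T X)^{-1} X^T y
beta_pinv = np.linalg.pinv(X) @ y                 # X^+ y

print(np.allclose(beta_normal, beta_pinv))
```

So the non-square "inverse" in the joke does exist in a generalized sense, and it reproduces the least squares solution rather than shortening it.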
why is sum(sqr(e)) = e^T * e
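Writing the product out entry by entry answers this, assuming e is the n×1 column vector of residuals:

```latex
e^{T}e
= \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}
  \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
= e_1^{2} + e_2^{2} + \cdots + e_n^{2}
= \sum_{i=1}^{n} e_i^{2} .
```

A row vector times a column vector is the sum of the products of matching entries, and here each entry is multiplied by itself.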
Well, it's almost 2020 and you have almost 40k subs. Is this what you predicted?