I recommend this video for those who understand the general concept of linear regression, but want to know what happens 'under the hood'
Amazing tutorial. Difficult concepts were explained with such ease. Kudos team Dataquest!
Today you will be my teacher. I'm from Vietnam. Thank you so much
Very explicit. You are a wonderful teacher. Thanks so much
This is an absolutely amazing and great video. I can't wait to see more great work
this is a great tutorial. Beautifully explained.
Thanks so much. Better than any E-books 🙂
Hi Vikas, which is better for GLM models in Python: the sklearn or statsmodels package?
One correction, not relevant to the actual regression, but worth saying nonetheless. The number of medals one athlete can win is not limited to one; rather, it is limited to the number of events the athlete competes in (a maximum of one per event). In fact, numerous athletes have won multiple medals in a single Olympics. Just wanted to clarify that. Of course, beyond a certain number of athletes, it becomes impossible for a smaller team to compete in as many events as a larger team, making it more likely that the larger team wins more medals.
Great and very clear explanation. The only thing missed at the end is the regression visualisation 😉. It would be nice to have both the initial data and the fitted regression plotted.
Is there a reason you chose to implement the normal equation over gradient descent? I'm quite curious as I am more familiar with gradient descent.
Perfect, thanks a lot
Thanks for the lesson, but just a question: when building the model, the data was not split into X/y train and X/y test sets. Why wouldn't that be necessary, and if it is necessary, how would it be done?
thanks
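If you did want a holdout split, here is a minimal sketch using scikit-learn, assuming the `teams` DataFrame and the column names used in the video:

```python
# A minimal sketch of a holdout split, assuming a DataFrame `teams` with
# "athletes", "prev_medals", and "medals" columns as in the video.
from sklearn.model_selection import train_test_split

X = teams[["athletes", "prev_medals"]]
y = teams["medals"]

# Hold out 20% of the rows for testing; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)
```

You would then solve the normal equation on the training rows only and measure error on the test rows.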
Do you have an example like this with multiple x-values or features?
Why do we need to add those "1"s when solving the matrix?
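The column of 1s is what carries the intercept b0 into every prediction; a tiny sketch with made-up numbers:

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0])
X = np.column_stack([np.ones(3), x])  # [[1, 10], [1, 20], [1, 30]]
B = np.array([5.0, 2.0])              # [intercept b0, slope b1]
print(X @ B)                          # [25. 45. 65.] == b0 + b1 * x
```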
Very well explained!!!
Thank you. It was well explained.
Would the solution for B be considered a least squares solution? Also, if we wanted to construct, say, a 95% confidence interval for each coefficient, would we take B for the intercept, athletes, and prev_medals (-1.96, 0.07, 0.73) and add/subtract the product of their respective standard errors and t-scores? Would the formula be as follows: B(k) ± t(n-k-1, alpha = 0.05/2) * SE(B(k)), or does this require more linear algebra? Great tutorial btw, thanks for the help.
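For what it's worth, a minimal sketch of how such intervals are usually computed, assuming X is the design matrix (including the column of ones), y the targets, and B the normal-equation solution from the video; the variable names are hypothetical:

```python
import numpy as np
from scipy import stats

# Assumes X (n x p, with the column of ones), y (length n), and
# B = inv(X'X) X'y, all as numpy arrays. Names are hypothetical.
n, p = X.shape
residuals = y - X @ B
sigma2 = residuals @ residuals / (n - p)                # residual variance estimate
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))  # SE of each coefficient
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - p)            # two-sided 95% t-value

lower = B - t_crit * se  # CI: B(k) +/- t * SE(B(k))
upper = B + t_crit * se
```

Here p counts the intercept, so n - p matches the n - k - 1 degrees of freedom in the question.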
"if you only enter one athlete, the most medals you can win is one" - Michael Phelps has entered the chat.
Hey, that is a great, beautiful demonstration of linear regression. Thank you. But I didn't understand where prev_medals comes from when building the X matrix at the beginning.
Can someone explain to me how those values appear inside the X matrix?
Can you please make a video demonstrating the multivariate regression analysis with the following information taken into consideration?
Performs multiple linear regression trend analysis of an arbitrary time series. OPTIONAL: error analysis for regression coefficients (uses standard multivariate noise model).
Form of general regression trend model used in this procedure (t = time index = 0,1,2,3,...,N-1):
T(t)=ALPHA(t) + BETA(t)*t + GAMMA(t)*QBO(t) + DELTA(t)*SOLAR(t) + EPS1(t)*EXTRA1(t) + EPS2(t)*EXTRA2(t) + RESIDUAL_FIT(t),
where ALPHA represents the 12-month seasonal fit, BETA is the 12-month seasonal trend coefficient, RESIDUAL_FIT(t) represents the error time series, and GAMMA, DELTA, EPS1, and EPS2 are 12-month coefficients corresponding to the ozone-driving quantities QBO (quasi-biennial oscillation), SOLAR (solar-UV proxy), and the proxies EXTRA1 and EXTRA2 (for example, these latter two might be ENSO, vorticity, geopotential heights, or temperature), respectively.
The general model above assumes simple linear relationships between T(t) and the surrogates, which is hopefully valid as a first approximation. Note that for total ozone trends based on chemical species such as chlorine, the trend term BETA(t)*t could be replaced (ignored by setting m2=0 in the procedure call) with EPS1(t)*EXTRA1(t), where EXTRA1(t) is the chemical proxy time series.
This procedure assumes the following form for the coefficients (ALPHA, BETA, GAMMA, ...) in an effort to approximate realistic seasonal dependence of the sensitivity between T(t) and each surrogate.
The expansion shown below is for ALPHA(t) - similar expansions for BETA(t), GAMMA(t), DELTA(t), EPS1(t), and EPS2(t):
ALPHA(t) = A0
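Not a full video, but a minimal sketch of the non-seasonal core of that model (constant coefficients, no 12-month expansion), fit with the same least-squares machinery as in the video; the proxy series here are hypothetical placeholders:

```python
import numpy as np

# Hypothetical placeholder series; in practice use the real QBO and
# solar-UV proxy time series aligned with T(t).
rng = np.random.default_rng(0)
N = 240
t = np.arange(N)
qbo = np.sin(2 * np.pi * t / 28)     # rough ~28-month QBO-like cycle
solar = np.cos(2 * np.pi * t / 132)  # rough ~11-year solar-like cycle
T = 300 + 0.05 * t + 2.0 * qbo + 1.5 * solar + rng.normal(0, 1, N)

# Design matrix [1, t, QBO(t), SOLAR(t)] -> coefficients [ALPHA, BETA, GAMMA, DELTA]
X = np.column_stack([np.ones(N), t, qbo, solar])
coef, *_ = np.linalg.lstsq(X, T, rcond=None)
residual_fit = T - X @ coef          # RESIDUAL_FIT(t) in the notation above
```

Extending this toward the seasonal version would mean expanding each column with harmonic (sine/cosine) terms, which is what the ALPHA(t) expansion above is driving at.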
It is such a fantastic explanation of Linear Regression. My question is, is there any possibility that we can't obtain the inverse of matrix X?
Hi Hameed - yes, some matrices are singular, and cannot be inverted. This happens when columns or rows are linear combinations of each other. In those cases, ridge regression is a good alternative. Here is a ridge regression explanation - ua-cam.com/video/mpuKSovz9xM/v-deo.html .
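For reference, the closed-form ridge solution is a one-line change to the normal equation; a minimal sketch, assuming the X and y from the video:

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Ridge regression: B = (X'X + alpha*I)^-1 X'y.
    The alpha*I term makes X'X invertible even when columns are
    linear combinations of each other. (This simple version also
    penalizes the intercept; sklearn's Ridge does not.)"""
    p = X.shape[1]
    return np.linalg.inv(X.T @ X + alpha * np.eye(p)) @ X.T @ y
```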
Hi, this is a wonderful explanation. Great job putting this together. The only thing that really confuses me is how you factor in previous medals in the predictive model. What would that look like in the linear equation at 1:54?
You would add a second term, b2x2, so the full equation would be b0 + b1x1 + b2x2. x1 would be athletes and x2 previous medals. Then you'd have separate coefficients (b1 and b2) for each.
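In matrix form that just means one extra column in X; a minimal sketch, assuming 1-D arrays for the two features and the target:

```python
import numpy as np

# Assumes 1-D arrays `athletes`, `prev_medals`, and `medals` of equal length.
X = np.column_stack([np.ones(len(athletes)), athletes, prev_medals])
B = np.linalg.inv(X.T @ X) @ X.T @ medals  # B = [b0, b1, b2]
predictions = X @ B                        # b0 + b1*x1 + b2*x2 for each row
```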
Hi guys! Very interesting indeed! There is one thing I don't understand, though. The identity matrix, as you mentioned, behaves like a one in matrix multiplication when you multiply it with a matrix of the same size. But in this particular case (around 13:08) the matrix B doesn't have the same size. So how come you can eliminate the identity matrix from the equation here? Thanks!
Hi Josué - I shouldn't have said "of the same size". Multiplying the identity matrix by another matrix behaves like normal matrix multiplication. So if the identity matrix (I) is 2x2, and you multiply by a 2x1 matrix B, you end up with a 2x1 matrix (equal to B).
The number of columns in the first matrix you multiply has to match the number of rows in the second matrix. And the final matrix has the same row count as the first matrix, and the same column count as the second matrix.
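A quick numpy check of exactly that case:

```python
import numpy as np

I = np.eye(2)                 # 2x2 identity matrix
B = np.array([[3.0], [7.0]])  # 2x1 matrix
print(I @ B)                  # [[3.] [7.]] -- the result equals B
```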
In the predictions we got values like 0.24, -1.6, and -1.39, so can you explain whether -1.6 medals is valid? Or do I need to use some other dataset to perform regression, like house price prediction? Can you suggest a dataset on which I can apply ridge regression?
Hi Sunil - with the way linear regression works, you can get numbers that don't make sense with the dataset. The best thing to do is to truncate the range (anything below 0 gets set to 0). Other algorithms that don't make assumptions about linearity can avoid this problem (like decision trees, k-nn, etc).
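For example, with the predictions from the question above:

```python
import numpy as np

predictions = np.array([0.24, -1.6, -1.39])
predictions = np.clip(predictions, 0, None)  # anything below 0 becomes 0
print(predictions)                           # [0.24 0.   0.  ]
```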
@@Dataquestio Thank you for your message. Since the predictions for this data (medals) come out as decimals, do you have any suggestions for another dataset where I can make predictions that make sense using ridge regression?
Thanks so much for the video
great job
Hi Team... Very well explained Linear Regression from scratch... Do you have any video for Ridge Regression from Scratch using Python?
Hi Sunil - we don't. I'll look into doing ridge regression in a future video! -Vik
@@Dataquestio Thank you
@@Dataquestio thanks you are awesome
@@Dataquestio thanks
THANKS A LOT 🤯
Thank you for this video. Could you please share the ppt slides of this lesson?
Hi Yousif - this was done using video animations, so there aren't any powerpoint slides, unfortunately. -Vik
I am wondering, is it okay to have a model that predicts a country will receive a negative number of medals? Isn't that just impossible?
This is one of the weaknesses of linear regression. Due to the y-intercept term, you can get predictions that don't make sense in the real world. An easy solution is to replace negative predictions with 0.
I'd rather use statsmodels than this method, which makes things complex.
Thanks, sir
Sorry, teacher. I guess you confused SSR with SSE: R2 = 1 - (SSE/SST) = SSR/SST.
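In code, the distinction looks like this (y and y_hat are hypothetical actual/predicted arrays):

```python
import numpy as np

# Assumes 1-D arrays y (actual) and y_hat (predicted). Names are hypothetical.
sse = np.sum((y - y_hat) ** 2)         # sum of squared errors (residuals)
sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression (explained) sum of squares
r2 = 1 - sse / sst                     # equals ssr / sst for OLS with an intercept
```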
I guess another question I have is how to invert a matrix
Hi Oluwamuyiwa - there are a few ways to invert a matrix. The easiest to do by hand is Gaussian elimination - en.wikipedia.org/wiki/Gaussian_elimination . That said, there isn't a lot of benefit to knowing how to invert a matrix by hand, so I wouldn't worry too much about it.
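In practice you would let numpy handle it; a quick sketch:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
A_inv = np.linalg.inv(A)  # LAPACK does the elimination for you
print(A @ A_inv)          # approximately the 2x2 identity matrix
```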
This guy is old, young, sleepy and awake all at the same time.