Very useful for MLE Interview! Thanks Emma :)
I have not finished this video, but this is the best I have seen so far. Though you didn't talk about multicollinearity, everything here is so clear and in simple English. Thank you!
Awesome video!
Nice tips. Thanks a lot.🎉
What about multicollinearity?
What about the assumptions that features are uncorrelated with the error term (exogeneity) and that features are uncorrelated with each other (no multicollinearity)?
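Not from the video, but here is a minimal sketch of how one might check the no-multicollinearity part of that question, assuming numpy and statsmodels are available; the generated data and the VIF threshold are purely illustrative.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Illustrative data: x2 is deliberately made almost collinear with x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
X = sm.add_constant(np.column_stack([x1, x2, x3]))  # add intercept column

# A VIF above roughly 5-10 is a common rule of thumb for problematic collinearity.
for i in range(1, X.shape[1]):  # skip the intercept column
    print(f"VIF for feature {i}: {variance_inflation_factor(X, i):.2f}")
```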
Correct me if I am wrong, but how are features that are uncorrelated with the error term useful for a model? That would mean that no matter what we do to a particular feature's weight, the error term cannot be controlled with it, since it is independent of the feature. So we might as well remove it from our model 😅
Isn't the second assumption more applicable to Naive Bayes? I am not sure Linear Regression is especially sensitive when this assumption does not hold, since it would just mean switching the signs and values of the weights and letting the correlated features converge towards a lower error. It would mean less overall information, but it probably does not affect the performance negatively.
@@venkateshgunda25 If the features are correlated with the error (a.k.a. the residuals), it means we can predict the error using the features; and if a model can predict the error, it has overfitted. Our model should only learn the signal, not the noise.
Refer to the Gauss-Markov theorem.
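To make "features correlated with the residuals" concrete, here is a hypothetical toy sketch (not from the video), assuming only numpy: a straight line is fit to data that actually contain a quadratic term, and the residuals end up clearly correlated with that omitted term.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = 2.0 * x + 1.5 * x**2 + rng.normal(size=300)  # true relation has a quadratic part

# Fit a straight line y ~ a + b*x with ordinary least squares
A = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ coef

# The residuals still carry predictable signal from x**2
print(np.corrcoef(x, residuals)[0, 1])      # near zero (x is a regressor)
print(np.corrcoef(x**2, residuals)[0, 1])   # clearly non-zero: assumption violated
```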
Her second point says "residuals are independent". So we can deduce that features are not correlated with the errors.
@@xiaofeichen5530 Yes. The errors must be independent, otherwise it violates the first assumption, linearity.
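As an aside, one common way to check the "residuals are independent" assumption in practice is the Durbin-Watson statistic; here is a rough sketch assuming statsmodels is installed (the data are made up, and values near 2 suggest no first-order autocorrelation).

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 200)
y = 3.0 + 0.5 * x + rng.normal(scale=0.3, size=200)

# Fit a line by ordinary least squares and compute residuals
A = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ coef

print(durbin_watson(residuals))  # ~2 here, since the errors were drawn independently
```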
These are assumptions of Ordinary Least Squares (OLS), not assumptions of linear regression!!!
How are they different?
@@devanshverma5395 Because we can use other least-squares methods in linear regression, like total least squares or partial least squares. So we cannot call these the assumptions of linear regression; we should call them the assumptions of OLS. Other least-squares methods have their own assumptions!
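A minimal sketch of that distinction, assuming scikit-learn is available: the same linear model fit by ordinary least squares and by partial least squares (total least squares is not shown). The data and the number of PLS components are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.2, size=200)

# Same linear model, two different estimators with different assumptions
ols = LinearRegression().fit(X, y)
pls = PLSRegression(n_components=2).fit(X, y)  # projects onto 2 latent components

print("OLS coefficients:", ols.coef_)
print("PLS coefficients:", pls.coef_.ravel())
```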