I really enjoyed listening to and learning from your explanation of this topic: it's very informative and easy to understand. I can now do my research paper with one doubt cleared. Thank you for your help!
Very easy to understand. Thank you!
Very cool, thank you!
This is a really good video
Brilliant explanation of a complex topic 👍
Excellent video
Great video.
Do you know a good scientific paper that describes how LASSO regression works, which I could cite in my papers?
Not a paper but I highly recommend the ISLR textbook: hastie.su.domains/ISLR2/ISLRv2_website.pdf
@@lesliemyint1865 dang this is awesome thank you!
Superb lecture! But I have one doubt I hope you'll clear up. All of the videos related to LASSO use a regression model to explain it. Since LASSO also performs feature selection, can we use it for feature selection in a classification problem, the way PCA is used for both regression and classification? Moreover, classification involves categorical data, which by default gets converted into numerical values. So can we use LASSO as a feature selector for classification problems too? If yes, why isn't there any example of it? Thanks in advance.
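In principle, yes: LASSO's L1 penalty carries over directly to classification through L1-penalized logistic regression, which shrinks coefficients of uninformative features exactly to zero just as in the regression case. A minimal sketch using scikit-learn, where the synthetic dataset and the value of C are illustrative assumptions:

```python
# Hedged sketch: LASSO-style feature selection for classification via
# L1-penalized logistic regression. Dataset and C are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 20 predictors, only 5 of which actually carry signal about the class label.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           n_redundant=0, random_state=0)

# penalty="l1" is the LASSO penalty; C is the inverse of lambda,
# so a smaller C means stronger shrinkage toward zero.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)

# Predictors whose coefficients were shrunk exactly to zero are "deselected".
print("kept predictors:", np.flatnonzero(clf.coef_[0]))
```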
great
Thank you❤
I really enjoyed listening. But is it possible to give a real example?
That is the real question; I can't figure out where to start calculating this or what to expect from the formula.
Could you clarify what you mean when you say "real example"? Around 8:44 in the video, I go through an example of how results from LASSO modeling could be interpreted in the context of an applied problem: predicting credit card balance.
@@omegajoctan2938 LASSO models aren't meant to be fit by hand (the level of computation involved requires a computer), so there really aren't hand calculations that would be useful to show. For this reason, this video focuses on the concepts underlying this tool.
@@lesliemyint1865 Thanks for the reply. I'm looking for a way to code this logic, so a coding example would be great: not using the scikit-learn library, but coding it from scratch. I understand that the computations are too hard to do manually.
@@omegajoctan2938 An example of coding up LASSO from scratch is available here at the top of the page: lmyint.github.io/253_spring_2019/shrinkageregularization.html. This isn't what appears in actual LASSO implementations but I hope it helps get the ideas across.
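For a rough flavor of the from-scratch approach (a simplified sketch, not the code from the linked page): the standard way to fit LASSO is coordinate descent, cycling through the coefficients and soft-thresholding each one in turn. The toy data and fixed iteration count below are illustrative assumptions:

```python
# Bare-bones LASSO via coordinate descent, minimizing
# 0.5 * ||y - X @ beta||^2 + lam * sum(|beta_j|).
# Toy data and fixed iteration count are illustrative, not a production solver.
import numpy as np

def soft_threshold(rho, lam):
    # Soft-thresholding: this is the step that sets a coefficient
    # exactly to zero whenever |rho| <= lam.
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iters=100):
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iters):
        for j in range(p):
            # Residual with feature j's current contribution removed.
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r
            beta[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j])
    return beta

# Tiny demo: one informative predictor, one pure-noise predictor.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3 * X[:, 0] + rng.normal(size=100)
print(lasso_coordinate_descent(X, y, lam=50.0))  # second coefficient shrinks to ~0
```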
How does a positive beta multiplied by a large positive alpha become zero? I can't get it.
Did you mean when a beta coefficient is multiplied by lambda (the penalty parameter)? If there is a large lambda penalty, then there is a very large contribution to the penalized sum of squared residuals - that indeed can be very large. But this is exactly what mathematically incentivizes the beta coefficient to move towards zero. The smaller the coefficient magnitude, the smaller the penalty incurred. (And when the coefficient is zero, there is no penalty incurred.)
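A toy numeric check of that incentive, with made-up numbers and the criterion RSS + lambda * |beta|: once lambda is large enough, the penalized criterion is smallest at beta = 0, even though the RSS by itself would prefer a nonzero beta.

```python
# Made-up one-predictor example: compare RSS alone with the penalized
# criterion RSS + lam * |beta| at a few candidate slopes.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.9])   # roughly y = 1 * x
lam = 70.0                            # large enough that beta = 0 wins

for beta in [0.0, 0.5, 1.0]:
    rss = np.sum((y - beta * x) ** 2)
    print(f"beta={beta}: RSS={rss:.2f}, penalized={rss + lam * abs(beta):.2f}")
# RSS alone is lowest at beta = 1.0, but the penalized criterion
# is lowest at beta = 0.0: the large penalty pushes the fit to zero.
```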
Why do we need the penalty term? can we not just have the RSS without it?
Yes, we could use just the RSS, but RSS alone can lead to an overfit model when we have lots of predictors (some of which are likely uninformative in predicting the outcome). The penalty term encourages predictors that help little or not at all to be eliminated from the model.
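As a quick sketch of that point (the synthetic data and alpha value are assumptions, not from the video): with many pure-noise predictors, ordinary least squares assigns every predictor a nonzero coefficient, while LASSO zeroes most of the noise predictors out.

```python
# Hedged comparison: OLS keeps all predictors, LASSO drops most noise ones.
# Synthetic data and alpha are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 30))                     # 30 predictors...
y = 2 * X[:, 0] - X[:, 1] + rng.normal(size=60)   # ...only 2 informative

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

print("OLS nonzero coefficients:  ", np.sum(ols.coef_ != 0))   # all 30
print("LASSO nonzero coefficients:", np.sum(lasso.coef_ != 0)) # just a handful
```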