Regularization - Explained!
- Published 17 Dec 2024
- We will explain Ridge, Lasso and a Bayesian interpretation of both.
ABOUT ME
⭕ Subscribe: www.youtube.co...
📚 Medium Blog: / dataemporium
💻 Github: github.com/ajh...
👔 LinkedIn: / ajay-halthor-477974bb
RESOURCES
[1] Graphing calculator to plot nice charts: www.desmos.com
[2] Refer to section 6.2 on "Shrinkage Methods" for mathematical details: hastie.su.doma...
[3] Karush-Kuhn-Tucker conditions for constrained optimization with inequality constraints: en.wikipedia.o...
[4] Stack Exchange discussions on [3]: stats.stackexc...
[5] Proof of ridge regression: stats.stackexc...
[6] Laplace distribution (or double exponential distribution) used for lasso prior: en.wikipedia.o...
[7] @ritvikmath's amazing video for the Bayesian interpretation of lasso and ridge regression: • Bayesian Linear Regres...
[8] Distinction between Maximum "Likelihood" Estimations and Maximum "A Posteriori" Estimations: agustinus.kris...
MATH COURSES (7 day free trial)
📕 Mathematics for Machine Learning: imp.i384100.ne...
📕 Calculus: imp.i384100.ne...
📕 Statistics for Data Science: imp.i384100.ne...
📕 Bayesian Statistics: imp.i384100.ne...
📕 Linear Algebra: imp.i384100.ne...
📕 Probability: imp.i384100.ne...
OTHER RELATED COURSES (7 day free trial)
📕 ⭐ Deep Learning Specialization: imp.i384100.ne...
📕 Python for Everybody: imp.i384100.ne...
📕 MLOps Course: imp.i384100.ne...
📕 Natural Language Processing (NLP): imp.i384100.ne...
📕 Machine Learning in Production: imp.i384100.ne...
📕 Data Science Specialization: imp.i384100.ne...
📕 Tensorflow: imp.i384100.ne...
Why is this so underrated? This should be on everyone's playlist for linear regression.
Hats off, man :)
My man looks sharp and dapper
Haha. Thanks! I think this shirt looked better on camera than in person. :)
Excellent videos! Great graphing for the intuition of L1 regularization, where parameters become exactly zero (9:45), compared with the behavior of L2 regularization.
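As a small illustration of that point (a sketch, not from the video; the synthetic data, feature count, and alpha values below are made-up assumptions), scikit-learn's Lasso drives irrelevant coefficients to exactly zero while Ridge only shrinks them:

```python
# Sketch: L1 (Lasso) vs L2 (Ridge) on synthetic data.
# Dataset, noise level, and alpha values are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                      # 10 features, only 3 informative
true_coef = np.array([3.0, -2.0, 1.5] + [0.0] * 7)
y = X @ true_coef + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.5).fit(X, y)                  # L1 penalty
ridge = Ridge(alpha=10.0).fit(X, y)                 # L2 penalty

print("Lasso:", np.round(lasso.coef_, 3))           # irrelevant features typically exactly 0
print("Ridge:", np.round(ridge.coef_, 3))           # irrelevant features small but nonzero
```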
I had to watch it twice to truly digest it, but I like your approach to the contour plot in particular. I hope to boost your channel with my comments a tiny bit ;). tyvm!
What I was taught and what is helpful to know, IMO:
1) Speaking on an abstract level about what regularization achieves: it penalizes large weights, which in effect punishes overly complex (e.g., high-degree) terms.
2) The notion of L1 and L2 regularization: when you talk about the "Gaussian" prior for Ridge, you could also talk about the "Laplace" distribution for the Lasso prior instead of calling it the double exponential distribution (a short sketch of this Bayesian view follows below).
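As a sketch of that Bayesian view (assuming a linear model y = Xθ + ε with Gaussian noise of variance σ², along the lines of references [6] and [7]), the MAP estimate turns the prior into the penalty:

```latex
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta}\ p(y \mid X, \theta)\,p(\theta)
  = \arg\min_{\theta}\ \lVert y - X\theta \rVert_2^2 \;-\; 2\sigma^2 \log p(\theta)

% Gaussian prior:
p(\theta) \propto e^{-\lVert\theta\rVert_2^2 / (2\tau^2)}
  \;\Rightarrow\; \arg\min_{\theta}\ \lVert y - X\theta \rVert_2^2 + \lambda \lVert\theta\rVert_2^2
  \quad (\text{Ridge},\ \lambda = \sigma^2/\tau^2)

% Laplace (double exponential) prior:
p(\theta) \propto e^{-\lVert\theta\rVert_1 / b}
  \;\Rightarrow\; \arg\min_{\theta}\ \lVert y - X\theta \rVert_2^2 + \lambda \lVert\theta\rVert_1
  \quad (\text{Lasso},\ \lambda = 2\sigma^2/b)
```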
Thanks so much for your comments Paul! And yea, I feel like I have seen similar contour plots in books but never truly understood “why” they were like that until I started diving into details myself. Hopefully in the future I can explain it in a way that you’d be able to get it in a single pass through the video too :)
Hi Ajay, great video, as always. One suggestion, with your permission ;) I think it might be worthwhile introducing the concept of regularization by comparing:
Feature elimination (which is equivalent to making the weight zero) vs. reducing the weight (which is regularization), elaborating on this, and then drifting towards Lasso and Ridge. ;)
Well hello everyone right back at you Ajay! These are fire, the live viz is on point!
Thank you for noticing ma guy. I will catch up to the 100K gang soon. Pls wait for me 😂
@CodeEmporium 😂 you're one hunnit in my eyes 🙏
Thank you very much for this answer, I have been looking for it for a while: 7:42
8:27 yi < -(lambda/2)
Such an awesome video! Can't believe I hadn't made the connection between ridge and Lagrangians, it literally has a lambda in it lol!
With the lasso intuition, the stepwise function you get for theta, how do you get the conditions on the right, i.e. yi < lambda/2? I thought perhaps instead of writing theta < 0, you are just using the implied relationship between yi and lambda. E.g. if theta < 0, then |theta| = -theta, which after optimising gives theta = y - lambda/2, i.e. y = lambda/2 + theta. But then I get the opposite conditions to yours... i.e. as theta is negative in this case, wouldn't that give y = lambda/2 + theta < lambda/2?
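For anyone puzzling over this, here is the one-dimensional sub-problem I believe the timestamps above (8:27, 9:45) refer to, worked case by case (assuming the objective is (y_i − θ)² + λ|θ|):

```latex
\min_{\theta}\ f(\theta) = (y_i - \theta)^2 + \lambda\,\lvert\theta\rvert

\theta > 0:\quad f'(\theta) = -2(y_i - \theta) + \lambda = 0
  \ \Rightarrow\ \theta = y_i - \tfrac{\lambda}{2},\ \text{consistent only if } y_i > \tfrac{\lambda}{2}

\theta < 0:\quad f'(\theta) = -2(y_i - \theta) - \lambda = 0
  \ \Rightarrow\ \theta = y_i + \tfrac{\lambda}{2},\ \text{consistent only if } y_i < -\tfrac{\lambda}{2}

\lvert y_i \rvert \le \tfrac{\lambda}{2}:\quad \theta = 0 \quad (\text{soft thresholding})
```

So in the θ < 0 branch the stationary point is y_i + λ/2 rather than y_i − λ/2, and requiring that candidate to actually be negative is what yields the condition y_i < −λ/2 noted at 8:27.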
Good explanation!
Always find interesting things here. Keep going. Good luck.
Hah! Glad that is the case. I am here to pique that interest :)
Nice explanation of the Bayesian interpretation.
Isn't regularization just the Lagrange multiplier? The optimum point is where the gradient of the constraint is proportional to the gradient of the cost function.
It is mathematically written in the same way, but they are not the same. Lagrange multipliers are used when you need to min/max a given function subject to a constraint, and then you solve for the value of lambda; in regularisation, we set the lambda value ourselves. Regularisation gives us a penalty if we take steps towards the non-minimum direction and thus lets us come back to the correct direction in the following iteration.
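To put that connection in symbols (a sketch using ridge; see the KKT reference [3] and the Stack Exchange threads [4], [5]), the constrained and the penalized problems are:

```latex
\text{constrained:}\quad \min_{\theta}\ \lVert y - X\theta \rVert_2^2
  \quad \text{subject to} \quad \lVert\theta\rVert_2^2 \le t

\text{penalized (Lagrangian):}\quad \min_{\theta}\ \lVert y - X\theta \rVert_2^2 + \lambda\,\lVert\theta\rVert_2^2
```

By the KKT conditions, every constraint level t corresponds to some multiplier λ ≥ 0 and vice versa, so both problems trace out the same family of solutions; the practical difference the reply points out is that in regularization we choose λ directly as a hyperparameter instead of deriving it from a given t.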
Love your awesome videos! Salute! Thank you so much!
You are so welcome! I am happy this helps
How does Gauss-Newton for nonlinear regression change with (L2) regularization?
Nice video, thanks! The only thing I think is slightly incorrect is calling polynomials with increasing degree "complex". Since you are talking about maths, I was expecting to see the imaginary unit when I first heard "complex".
Great content on your channel. I just found it! Heh, I used Desmos to debug/visualize too!
I just added a video explaining easy multilayer backpropagation. The book math with all the subscripts is confusing, so I did it without any. Much simpler to understand.
Thank you! And Solid work on that explanation :)
Thank you
AWESOME!!!!! thanks!
thx !