Some details for those who'd like to go really deep: regarding the L1 penalty, we can't really choose w2 = 0 and w1 = 1, since the loss consists of 2 parts: the loss from the diamond shape (L1 penalty) + the loss from the ellipse shape (initial loss function). Since for the (w1, w2) pairs (0, 1) and (1, 0) the L1 penalty values are the same and equal 1 + 0 = 0 + 1 = 1, we now look at the initial loss function, as it also depends on the (w1, w2) values. For w1 = 0 and w2 = 1 (closer to the center point) the loss is going to be less than for w1 = 1 and w2 = 0 (farther from the center point); we can see that from the contour plot. Therefore, the optimizer won't go there and will converge on w1 = 0 and w2 = 1, and that's it.
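For anyone who wants to check the comparison above numerically, here is a minimal sketch. The ellipse center (0.5, 2.0), the axis weights (2.0, 1.0), and alpha = 1.0 are made-up values chosen only so that (0, 1) sits closer to the center than (1, 0); they are not taken from the video.

```python
# Hypothetical elliptical loss: center and axis weights are assumptions for illustration.
CENTER = (0.5, 2.0)
AXIS_WEIGHTS = (2.0, 1.0)


def initial_loss(w1, w2):
    """Elliptical (quadratic) loss whose contour lines are ellipses around CENTER."""
    return AXIS_WEIGHTS[0] * (w1 - CENTER[0]) ** 2 + AXIS_WEIGHTS[1] * (w2 - CENTER[1]) ** 2


def l1_penalty(w1, w2, alpha=1.0):
    """L1 penalty: alpha * (|w1| + |w2|), constant along each diamond."""
    return alpha * (abs(w1) + abs(w2))


def total_loss(w1, w2, alpha=1.0):
    """Regularized loss = initial loss + L1 penalty."""
    return initial_loss(w1, w2) + l1_penalty(w1, w2, alpha)


for corner in [(0.0, 1.0), (1.0, 0.0)]:
    print(corner, "penalty:", l1_penalty(*corner), "total:", total_loss(*corner))
```

Both corners pay the same penalty, so the initial loss alone decides between them, and the corner nearer the center has the lower total.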
How would increasing alpha decrease the weight? Can you please explain? 0.1 is bigger than 0.001. If I have a weight of 0.4, then 0.1 * 0.4 = 0.04, whereas 0.001 * 0.4 = 0.0004, so the smaller the alpha, the smaller the weight, which will be near zero, correct? I feel that what you mean by a bigger alpha is an alpha with a bigger negative power. Isn't that correct? Can you please clarify?
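"Increasing the alpha would decrease the weight" does not refer to the calculation of alpha * weights; it refers to what happens when you minimize your regularized loss function. The penalty term is added to the loss, so a bigger alpha makes large weights more costly, and the weights that minimize the total loss end up smaller.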
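A small sketch of that effect, assuming a one-feature linear model with no intercept and toy data invented for illustration. In that setting the minimizers have closed forms: for the L2 penalty, w = Sxy / (Sxx + alpha); for the L1 penalty, w is Sxy shrunk toward zero by alpha / 2 and then divided by Sxx. The function names are hypothetical.

```python
def ridge_weight(x, y, alpha):
    """Minimizer of sum((y_i - w*x_i)^2) + alpha * w^2."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / (sxx + alpha)


def lasso_weight(x, y, alpha):
    """Minimizer of sum((y_i - w*x_i)^2) + alpha * |w| (soft-thresholding)."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    shrunk = max(abs(sxy) - alpha / 2.0, 0.0)  # exactly 0 once alpha is large enough
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * shrunk / sxx


# Toy data (made up); the unpenalized weight is about 0.41.
x = [1.0, 2.0, 3.0, 4.0]
y = [0.5, 0.9, 1.1, 1.7]

for alpha in [0.001, 0.1, 10.0, 100.0]:
    print(alpha, ridge_weight(x, y, alpha), lasso_weight(x, y, alpha))
```

As alpha grows, the L2 weight keeps shrinking toward zero, while the L1 weight reaches exactly zero for a large enough alpha.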
Many of you have asked me to share my presentation notes, and now… I have them for you! Download all the PDFs of my Notion pages at www.emmading.com/get-all-my-free-resources. Enjoy!
Hello Emma, I have started switching to AI/ML and noticed your website and courses. Are your trainings or courses suitable for beginners? Also, I'm not sure if you have special courses on statistics and mathematics for AI/ML. Thank you.
Man, I tried to find videos and blog posts about this topic, and most of them just scratch the surface. Thanks for the deep analysis and comparison!
So glad you found it helpful, Minh! Thanks for watching. 😊
Thank you so much for this clear explanation. It has helped me more than the Coursera course.
Much clearer than what I learned elsewhere. I also noticed that you slowed your speaking pace, which makes it easier for people to follow.
Great vid! It's like a fast recap of a college stat class.
Great explanation, Emma! Have a nice day!