This is EXACTLY HOW I needed to learn: Maths + Visualization with equations! Thank you so much!
So happy to help 🤗
Very few of them even explain what momentum is made up of or what its equation is. You took just 2 minutes to add that explanation, but it helped so much to understand the remaining 10 minutes of the video without pausing. Great work. Please keep it up.
Thank you Pranay! Glad to hear that you found the videos helpful
Very few resources on the internet explain these concepts with this kind of depth and clarity. Either they are in-depth but not understandable, or clear but not in-depth. Loved your explanation.
Thank you so much! It means a lot to me
Your lectures are very short and easy to understand. Hope you will make more videos like this about optimization algorithms in deep learning. Thank you, very useful video!
Good to hear! I will keep uploading more videos
Greatly explained! Thank you!! (I find it even better than Andrew's video on momentum.) Keep it up!!
Thank you! Glad you found it valuable. Good to hear this!
This was exactly what I was looking for. Thanks a lot!!
Glad I could help!
Very nice explanation, thank you... The mathematics from scratch is what I was looking for... This really helped!!
Glad to help!
Wonderful video. Made the concept look very easy...
Glad it helped! 🙂
You're a great man, dude! Thanks a lot.
oh my god, this was clearly explained, thanks for this perfect insight.
Thank You. Glad it helped you!
This is the best explanation. Thank you
Very informative video, brother. Thank you very much for the explanation, it was great!
Thank You!
Thank you for a detailed video! I'm not an expert in this area. Could you explain what are W and B? From my understanding, W is the vector of parameters in the cost function, e.g., we want to minimize f(W). Is that correct? If so, what is B? How is it different from W? Thanks!
Hello... I think your understanding is not quite correct. I have explained W and B in this video: ua-cam.com/video/mlk0rddP3L4/v-deo.html
Please have a look at the entire video and you might understand. I have also explained W and B in my initial videos in the Linear Regression playlist. If you watch my playlists from the beginning, you will get a clearer idea.
Hope it's helpful.
At 7:00, I think the difference formula is supposed to be V(t) = Beta*Theta(t) + (1-Beta)*V(t-1) rather than V(t) = Beta*V(t-1) + (1-Beta)*Theta(t). Am I seeing that correctly?
At 5:27, when computing V3, aren't you missing the factor (1-beta) on V2?
Yeah, I missed writing it... thanks for letting me know!
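For anyone following this thread, here is a minimal sketch of the exponentially weighted moving average recursion being discussed, V(t) = beta*V(t-1) + (1-beta)*theta(t). Initializing V at 0 is an assumption; the expansion shows the (1-beta) factor that was missing on V2:

```python
def ewma(points, beta):
    """Exponentially weighted moving average: V(t) = beta*V(t-1) + (1-beta)*theta(t)."""
    v = 0.0  # assumption: V(0) = 0
    history = []
    for theta in points:
        v = beta * v + (1 - beta) * theta
        history.append(v)
    return history

points = [1.0, 2.0, 3.0]
beta = 0.9
v3 = ewma(points, beta)[-1]

# Expanding the recursion: every past point carries the (1-beta) factor,
# scaled down by an extra power of beta the further back it is.
expanded = ((1 - beta) * points[2]
            + beta * (1 - beta) * points[1]
            + beta**2 * (1 - beta) * points[0])

print(abs(v3 - expanded) < 1e-12)  # True: the two forms agree
```

So V3 = (1-beta)*theta3 + beta*(1-beta)*theta2 + beta^2*(1-beta)*theta1, with (1-beta) on every term.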
You are so good, but some of your videos have no subtitles (captions unavailable). Please enable them for all your videos. Thanks a lot!
Great video, thanks, man!
Very well explained, sir. Can you please start a playlist on DSA for Python?
What would be the difference between this and adadelta?
At 3:09 you say we give higher weightage to new points and lower weightage to old points, but at 7:47 you say the opposite, so I'm confused about this. I would appreciate it if you could resolve it.
The words seem opposite, but if you observe carefully, I am talking about two different things there.
So at 3:09 you are talking about points, but at 7:47 you are talking about the average? Did I understand correctly?
@@Ankit-hs9nb At 7:47, lower weightage is still given to the older points compared to the newer points. The difference is that the weightage given to older points with beta = 0.95 is greater than the weightage given to older points with beta = 0.6. Both still give more weightage to the newer points; at 7:47 I was just comparing the weightage for beta = 0.95 versus beta = 0.6.
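A quick numeric check of this point. Expanding the EWMA recursion, the weight on a point k steps in the past works out to beta^k * (1-beta) (a sketch, assuming the standard form of the recursion):

```python
def weight(beta, k):
    # Weight the EWMA V(t) = beta*V(t-1) + (1-beta)*theta(t)
    # assigns to the point k steps in the past.
    return (beta ** k) * (1 - beta)

# Both betas give more weight to newer points (the weight decays as k grows)...
print(weight(0.95, 0), weight(0.95, 10))  # 0.05 vs ~0.0299
print(weight(0.6, 0), weight(0.6, 10))    # 0.4  vs ~0.0024

# ...but the *old* point carries more weight under beta = 0.95 than under 0.6.
print(weight(0.95, 10) > weight(0.6, 10))  # True
```

So both statements in the thread are consistent: every beta favors newer points, while a larger beta spreads relatively more weight onto the older ones.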
Thank you so much!
You are a god! Thanks from Argentina!
Hi Santiago... It's great to see that people from different lands are learning from these videos. Thanks for the compliment 😇
Thank you so much
You're welcome!
Amazing
You are the best! :-)
Thank you so much
good
I want to contact you regarding business work.
Hello Ali, here is my email address: codeboosterjp@gmail.com
@@CodingLane I sent you an email