How Gradient Descent Works. Simple Explanation

  • Published 22 Aug 2024
  • The video explains what gradient descent is and how it works, with a simple example. Basic intuition and explanation are revealed in the video. The contents are:
    0:09 - What is gradient descent?
    0:30 - Example.
    0:39 - Step 1: Start with a random point and find the gradient (derivative) of the given function.
    1:28 - Step 2: Set the learning rate, which determines how big a step to take in the direction opposite to the gradient.
    1:47 - Step 3: Perform the calculations over iterations.
    1:58 - Initialize the parameters.
    2:20 - Calculations on the 1st iteration.
    3:20 - Calculations on the 2nd iteration.
    4:26 - ...until we reach a global or local minimum.
    I hope this explanation of how gradient descent works is useful for deep learning beginners and a good reminder for machine learning / deep learning experts. (A minimal code sketch of these steps follows below.)
    #deeplearning #gradientdescent #ai
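    A minimal sketch of the steps above, assuming the values from the video's worked example (f(x) = (x + 5)^2, starting point x = -3, learning rate 0.01 — values also echoed in the comments below):

```python
# Gradient descent on f(x) = (x + 5)^2, whose derivative is 2(x + 5).

def gradient(x):
    # Step 1: the gradient (derivative) of the given function.
    return 2 * (x + 5)

x = -3.0              # the starting point
learning_rate = 0.01  # Step 2: how big a step to take against the gradient

# Step 3: perform the calculations over iterations.
for i in range(1, 501):
    x = x - learning_rate * gradient(x)  # move opposite to the gradient
    if i <= 2:
        print(f"iteration {i}: x = {x:.4f}")  # -3.0400, then -3.0792

print(f"after {i} iterations: x = {x:.4f}")  # approaches the minimum at x = -5
```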

COMMENTS • 188

  • @lovekildetoft5658
    @lovekildetoft5658 3 years ago +6

    All other videos on gradient descent are at least 20 minutes long. This one is five, and it made me understand more than any of those videos. Thank you!

  • @blendaguedes
    @blendaguedes 4 years ago +45

    Sometimes we just need two loops to understand the whole thing. Thank you!

  • @uncoded0
    @uncoded0 2 years ago +12

    Thank you! After many hours of trying to understand gradient descent, I finally get it, thanks to this video. Thank you!

  • @anamikabhowmick6322
    @anamikabhowmick6322 3 years ago +11

    This is one of the best and easiest ways to learn and understand gradient descent. Thank you so much for this!

  • @impzhu3088
    @impzhu3088 3 years ago +9

    That's the way to explain a concept! An example with detailed steps. Thank you so much!

  • @riki2404
    @riki2404 3 years ago +2

    Thank you for such a clear explanation. Short and precise, no unnecessary talk.

  • @DataScienceGarage
    @DataScienceGarage 5 years ago +4

    If you found this video useful, I highly recommend checking out these related ones:
    -- Calculate Convolutional Layer Volume in ConvNet (ua-cam.com/video/3uEd0ErqGzU/v-deo.html)
    -- Adam. Rmsprop. Momentum. Optimization Algorithm. - Principles in Deep Learning (ua-cam.com/video/YacPECoI5SY/v-deo.html)
    -- Numpy Argsort - np.argsort() - function. Simple Example (ua-cam.com/video/6W8UHvn8ckg/v-deo.html)
    -- Python Regular Expression (RegEx). Extract Dates from Strings in Pandas DataFrame (ua-cam.com/video/E4avBXbNOGc/v-deo.html)

  • @wtry0067
    @wtry0067 4 years ago +14

    It's very short and very useful. I got the clarity I was looking for.
    Thanks once again.

  • @abdelrahmane657
    @abdelrahmane657 1 year ago

    Oh my god, you are excellent. You make the difference on YouTube. Thank you so much. 🎉🙏👏🙌👌👍✌🏼

  • @hemantsah8567
    @hemantsah8567 4 years ago +2

    It is easy... I spent 2 days learning gradient descent... then I came to your video... Thanks, bro!

  • @aalishanhunzai
    @aalishanhunzai 4 months ago

    Bro, thank you so much for your efforts. I couldn't find a simpler explanation of gradient descent than this one.

  • @sukanya4498
    @sukanya4498 2 years ago +4

    Love this video ❤️, very simple and precise! Thank you!

  • @Elementiah
    @Elementiah 2 months ago

    Thank you so much for this! This is the perfect explanation! 😄

  • @TingBie
    @TingBie 5 months ago

    Thanks for this example, simple and spot-on!

  • @Alex-pd5xc
    @Alex-pd5xc 1 year ago

    Wow dude, very clearly explained, and you made it simple for me to understand. Cheers, man!

  • @samvanoye
    @samvanoye 6 months ago

    Perfectly explained, thanks!

  • @alexbarq1900
    @alexbarq1900 2 years ago +2

    I get the idea, but is there any reason for not doing the simple math to find the local min?
    dy/dx = 2(x+5)
    If we want to find the min, we just set dy/dx = 0... then:
    0 = 2(x+5)
    x = -5

  • @glenfernandes253
    @glenfernandes253 2 years ago +1

    How do you know how many iterations to run before reaching the global/local minimum? What if it reaches the minimum and starts climbing up the other side?

  • @strykeregziadahmed9562
    @strykeregziadahmed9562 1 year ago

    2 hours into a DL course and I didn't get it.
    5 minutes made my day. This is how learning should actually be done.

  • @muhammadhashir7949
    @muhammadhashir7949 2 years ago +1

    Thank you so much, your work was practical. I loved it a lot and understood gradient descent. Before that I spent lots of time but didn't understand it properly.

  • @phaniraju0456
    @phaniraju0456 4 years ago +2

    I bow to you for this great clarification... loved it!

  • @josephsmy1994
    @josephsmy1994 2 years ago +1

    Awesome explanation! Straight to the point.

  • @eramitvajpeyee85
    @eramitvajpeyee85 3 years ago

    Thank you so much for explaining it in a short and easy way!! Please keep uploading content like this.

  • @yaminikommi5406
    @yaminikommi5406 2 years ago +2

    We can take any number as the initial parameters and learning rate.

  • @mattk6182
    @mattk6182 3 years ago +2

    Using x as your means of showing multiplication is confusing; it makes it look like you took the derivative wrong, as 2x(x+5). Maybe in future videos leave the x out so the multiplication is implied.

  • @zafarnasim9267
    @zafarnasim9267 2 years ago +1

    You made it so simple. Great Job!

  • @sanurcucuyeva7040
    @sanurcucuyeva7040 1 year ago +1

    Hi, thanks for the explanation. If our function is hard, at what point in the iteration should we stop to find the minimum point?

  • @murat2073
    @murat2073 2 years ago +1

    Thank you, Sir! You are a HERO!!!

  • @twicestay6683
    @twicestay6683 9 months ago

    Thanks a lot!!! But I'd like to ask: why is the learning rate 0.01? Is it a random number? Thanks!

  • @michaelscott8572
    @michaelscott8572 4 years ago +1

    What I don't get is: when we use this method in a neural net, we don't know the error function. We just have some points. So how can I build the derivative?

  • @omkarkadam5715
    @omkarkadam5715 3 years ago

    Thanks, mate. Finally enlightened!

  • @smurfNA
    @smurfNA 1 year ago

    Hey! So do we choose the learning rate? And the gradient is simply just the function, right?

  • @abdellatifmarghan7521
    @abdellatifmarghan7521 7 months ago

    Thank you. Great explanation!

  • @mohamedelkhanche707
    @mohamedelkhanche707 2 years ago

    Ohhhh, wonderful! I was shocked, this is insane. Thank you from all my heart!

  • @pwan3971
    @pwan3971 1 year ago

    Thanks a lot, really appreciate the video. This makes so much sense now.

  • @yasamannazemi6706
    @yasamannazemi6706 3 years ago +2

    It was so simple and helped me a lot :)
    Thanks👍🏻

  • @praneethcj6544
    @praneethcj6544 4 years ago +1

    Simple and clear... yet it needs more detail!

  • @Slendich
    @Slendich 2 years ago

    Really great and simple explanation. Thank you!

  • @ajaykushwaha4233
    @ajaykushwaha4233 3 years ago

    Best explanation ever.

  • @luisurena1770
    @luisurena1770 3 years ago +2

    Damn, there's always an Indian guy who helps me understand everything 🔥🔥🔥🔥

  • @abdanettaye8217
    @abdanettaye8217 3 years ago

    A good start, thank you!

  • @hindbelkharchiche1654
    @hindbelkharchiche1654 3 years ago

    Thank you... the explanation is as simple as it is useful.

  • @blinky1892
    @blinky1892 1 year ago

    How do we know what the y value of the parabola is at any given x? 😊

  • @mbogitechconpts
    @mbogitechconpts 2 years ago

    Beautiful video. I have to like it.

  • @nawaab9275
    @nawaab9275 3 years ago

    Thanks for saving the semester!

  • @fmikael1
    @fmikael1 2 years ago

    Thanks for the great explanation. Everyone else always complicates it.

  • @SuperYtc1
    @SuperYtc1 1 year ago +1

    This is a good video.

  • @radhar5349
    @radhar5349 2 years ago

    Great explanation. Easy to grasp the concept.

  • @eliashossain4327
    @eliashossain4327 1 year ago

    Best explanation.

  • @dennisjoseph4528
    @dennisjoseph4528 4 years ago

    Great job explaining this as simply as possible, Sir.

  • @9891676610
    @9891676610 2 years ago

    Awesome explanation. Thanks a lot!!

  • @basheeralwaely9658
    @basheeralwaely9658 3 years ago

    Well done, sir, very easy to understand.

  • @tevinwright5109
    @tevinwright5109 1 month ago

    GREAT VIDEO

  • 2 years ago

    Perfect !

  • @bharatcreations7154
    @bharatcreations7154 2 years ago

    Can we compute the same thing without getting into the learning rate?

  • @george4746
    @george4746 3 years ago

    Thanks, it was very clear and concise.

  • @RayhanAhmedsimanto
    @RayhanAhmedsimanto 5 years ago

    Amazing Practical Explanation. Great work.

  • @bhavikdudhrejiya4478
    @bhavikdudhrejiya4478 4 years ago

    Very good video. I appreciate your hard work. Keep uploading more videos.

  • @supantha118
    @supantha118 1 year ago

    Thank you so much

  • @sandipmaity2687
    @sandipmaity2687 4 years ago +1

    Amazing Explanation :) Really simple and to the point 😀

  • @colton3000
    @colton3000 2 years ago +1

    How do we find the learning rate?

  • @denisplotnikov6875
    @denisplotnikov6875 2 years ago

    How would we use this example for stochastic gradient descent?

  • @machinelearningid3931
    @machinelearningid3931 4 years ago +2

    Thanks, this gave me light in the darkness.

  • @TheJayenz
    @TheJayenz 2 years ago

    Thank you so much!

  • @Snetter
    @Snetter 2 years ago

    Nice work! Thanks!

  • @muhammadhilmirozan1266
    @muhammadhilmirozan1266 3 years ago

    Thanks for the explanation!

  • @thankyouthankyou1172
    @thankyouthankyou1172 2 years ago

    Useful, thank you!

  • @mastan775
    @mastan775 4 years ago

    Very good explanation... thanks a lot.

  • @ericklestrange6255
    @ericklestrange6255 4 years ago +5

    Didn't explain how to calculate the direction we are moving in (the minus sign), why the derivatives, etc.

    • @ak-ot2wn
      @ak-ot2wn 4 years ago +3

      That's what I've been looking for for several days, and nobody mentions it. Anyway, I still think it is trivial: if your derivative is negative, you have to "move" to the right (in the two-variable case); if it is positive, you have to "move" to the left. (A code sketch of this sign logic follows this thread.)

    • @debayondharchowdhury2680
      @debayondharchowdhury2680 4 years ago

      He also didn't talk about loss calculation. Why do we need to calculate the loss at all if we can simply use gradient descent on the function?

    • @blendaguedes
      @blendaguedes 4 years ago +1

      @@debayondharchowdhury2680 Your loss function is the one measuring the difference between your output and your "y". You calculate the gradient of your loss function. In his example, he shows something that looks like a 'mean squared error' loss function to me, and he is doing linear regression with only one input "x".
      I recommend the Andrew Ng classes on Coursera. Have a good time!

    • @blendaguedes
      @blendaguedes 4 years ago

      @@ak-ot2wn I totally agree with what you are saying; the only issue is that when you are programming, you don't see which direction your vector is going. So basically, if the error is going down, keep going; if it starts to increase, go back. You can just stop, or you can make your learning rate smaller to increase your accuracy.

    • @rssaiganesh
      @rssaiganesh 4 years ago

      I think this comment thread is looking for the math behind the gradient descent formula. Apologies if I misunderstood. But here is a link that helped me: towardsdatascience.com/understanding-the-mathematics-behind-gradient-descent-dde5dc9be06e
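      A minimal sketch of the sign logic from the reply above, assuming the video's example f(x) = (x + 5)^2 with derivative 2(x + 5): the minus sign in the update moves x to the right when the derivative is negative and to the left when it is positive.

```python
# The minus sign in x_next = x - lr * dy/dx moves x opposite to the slope.
# Assumes the video's example f(x) = (x + 5)^2, so dy/dx = 2(x + 5).

def gradient(x):
    return 2 * (x + 5)

lr = 0.01
for x in (-8.0, -3.0):  # one point left of the minimum at -5, one right of it
    g = gradient(x)
    x_next = x - lr * g
    direction = "right" if g < 0 else "left"
    print(f"x = {x}: dy/dx = {g}, so the update moves {direction}, to {x_next}")
```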

  • @user-qj1lm1xh2z
    @user-qj1lm1xh2z 2 years ago

    Well done 👏

  • @arvinds7182
    @arvinds7182 1 year ago

    On point👏

  • @shankaks7217
    @shankaks7217 1 year ago

    Why did we choose 0.01 as the learning rate?

  • @davidbarnwell_virtual_clas6729
    @davidbarnwell_virtual_clas6729 2 years ago

    How do we choose the learning rate? Good video, but it's things like that I'd love to know.

    • @DataScienceGarage
      @DataScienceGarage 2 years ago +1

      Hi! Choosing a learning rate is often not an easy task. I usually run experiments on model performance with multiple learning rates (manual tuning, grid search hyperparameter tuning, Bayesian search, etc.). (A small sketch of this idea follows this thread.)

    • @davidbarnwell_virtual_clas6729
      @davidbarnwell_virtual_clas6729 2 years ago

      @@DataScienceGarage Ahh...ok...I get you...it's very interesting.
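      A minimal sketch of the "try multiple learning rates" idea from the reply above, reusing the video's toy function f(x) = (x + 5)^2; in a real model the score would be a validation loss rather than f(x) itself, and the candidate rates might come from a grid or a tuner.

```python
# Try several learning rates on the toy function f(x) = (x + 5)^2 and
# compare where 100 steps of gradient descent end up.

def f(x):
    return (x + 5) ** 2

def gradient(x):
    return 2 * (x + 5)

def run(lr, steps=100, x=-3.0):
    for _ in range(steps):
        x -= lr * gradient(x)
    return x

for lr in (0.1, 0.01, 0.001, 0.0001):  # commonly tried values
    x_final = run(lr)
    print(f"lr={lr}: x after 100 steps = {x_final:.4f}, f(x) = {f(x_final):.6f}")
```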

  • @gerleenjosuegoya8777
    @gerleenjosuegoya8777 1 year ago

    Thank you!

  • @boniface385
    @boniface385 1 year ago

    Hi, why is the learning rate 0.01? Can it be any random learning rate, for example 0.2, 0.02, or anything else? I'd appreciate a fast reply, thank you 😊🙏🏻🙏🏻🙏🏻

    • @DataScienceGarage
      @DataScienceGarage 1 year ago +1

      Hello! Thanks for watching this video, I'm glad it was useful for you. When modelling an ML system, you can specify any learning rate. However, good practice is to use 0.1, 0.01, 0.001, or 0.0001. Each ML model has its own architecture, different training data, hyperparameters, etc., so the learning rate can be adapted separately for each case.
      Here, I used 0.01 just for demonstration purposes.

    • @boniface385
      @boniface385 1 year ago

      @@DataScienceGarage thank you so much for the explanation. 🫶🏻

  • @dveerraju1852
    @dveerraju1852 1 month ago

    How can you know the learning rate?

  • @ydkmusic
    @ydkmusic 4 years ago

    Great video! There is a typo around 3:50. The bottom equation should be x_2 = .... instead of x_1.

  • @davidkayode6679
    @davidkayode6679 3 years ago

    Wonderful Video!!! Thank You!

  • @kronlogic2408
    @kronlogic2408 3 years ago

    For iteration 2, shouldn't the second line be x2 = and not x1 =?

  • @karthiklogan9384
    @karthiklogan9384 3 years ago

    Really helpful, sir. Thank you so much!

  • @darkman8939
    @darkman8939 3 years ago

    Thanks, very helpful.

  • @grinfacelaxu
    @grinfacelaxu 2 months ago

    Nice!

  • @bhavya2301
    @bhavya2301 3 years ago

    Thank you.

  • @seathru1232
    @seathru1232 3 years ago

    GREAT!

  • @emrecik9882
    @emrecik9882 1 year ago

    Thanks

  • @bernardaslabutis5098
    @bernardaslabutis5098 3 years ago

    Thank you, it helped!

  • @AJ-et3vf
    @AJ-et3vf 3 years ago

    Very useful! Awesome ❤️

  • @pearlsofwisdom2416
    @pearlsofwisdom2416 4 years ago

    Good explanation, but it would have been better if you had elaborated on the formula and why it is used to reach the next step. Why is the derivative multiplied by the learning rate, and why is it then subtracted from the first point's value?

    • @blendaguedes
      @blendaguedes 4 years ago

      The learning rate makes the decay slow. Without it, his first iteration would give -3 - 4 = -7, overshooting the minimum.
      Can you see where this is going? By going slowly, he keeps decreasing his "y" until he gets as close as possible to -5. Sometimes, to reach the minimum, you have to make your learning rate smaller while computing your weights. (A worked version of this update follows below.)
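      A worked version of that first update, assuming the video's values (starting point x0 = -3, derivative dy/dx = 2(x + 5), learning rate 0.01):
      x1 = x0 - (learning_rate) * (dy/dx)
      x1 = -3 - (0.01) * (2 * (-3 + 5))
      x1 = -3 - 0.04 = -3.04
      Without the learning rate, the step would be the full derivative: -3 - 4 = -7, jumping past the minimum at -5.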

  • @Hasasinful
    @Hasasinful 3 years ago

    Thanks, just what I needed!

  • @diegososa5280
    @diegososa5280 3 years ago

    Thank you very much!

  • @MuditDahiya
    @MuditDahiya 4 years ago

    Very nice explanation!!

  • @mohsinjunaid8454
    @mohsinjunaid8454 1 year ago

    Thanks!

  • @govardhan3099
    @govardhan3099 3 years ago

    Greatly explained...

  • @harshithbangera7905
    @harshithbangera7905 3 years ago +1

    How do we know -5 is the global minimum... is it when the gradient or derivative becomes 0?

    • @explovictinischool2234
      @explovictinischool2234 1 year ago

      Hello, better late than never.
      Let's assume we have reached -5 at step Xn. However, we don't yet know that we have reached the local minimum.
      We perform another step Xn+1 with the formula, which gives:
      Xn+1 = Xn - (learning_rate) * (dy/dx)
      Xn+1 = -5 - (0.01) * (2 * (-5 + 5))
      Xn+1 = -5 - (0.01) * 0
      Xn+1 = -5
      And so we have Xn+1 = Xn, which means we cannot progress any further, i.e. we have reached the local minimum. (A small stopping-rule sketch follows this thread.)
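      A minimal sketch of that stopping rule, assuming the same example function: iterate until an update no longer changes x by more than a small tolerance (i.e. the gradient is effectively 0).

```python
# Stop when x barely moves between iterations: dy/dx ~ 0, a (local) minimum.
# Assumes the video's example f(x) = (x + 5)^2, so dy/dx = 2(x + 5).

def gradient(x):
    return 2 * (x + 5)

x, lr, tol = -3.0, 0.01, 1e-8
steps = 0
while True:
    x_next = x - lr * gradient(x)
    steps += 1
    if abs(x_next - x) < tol:  # the update no longer changes x
        break
    x = x_next

print(f"Stopped after {steps} steps at x = {x_next:.6f}")  # ~ -5
```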

  • @dtakamalakirthidissanayake9770
    @dtakamalakirthidissanayake9770 4 years ago

    Thank You So Much. Great Simple Explanation!!!

  • @jimyang8824
    @jimyang8824 4 years ago

    Good explanation!

  • @moazelsawaf2000
    @moazelsawaf2000 4 years ago +1

    Thanks a lot, sir!

  • @AlfredEssa
    @AlfredEssa 3 years ago

    Good job!

  • @codingtamilan
    @codingtamilan 4 years ago

    How did you draw that curve so it is fixed at -5?
    Is its centre always at -5?

    • @blendaguedes
      @blendaguedes 4 years ago +1

      First you decide which your loss function will be. In his case it was (5+x)^2, or x^2 + 10x + 25. Then you program gradient descent to find the minimum of the function. It depends on your function.

    • @codingtamilan
      @codingtamilan 4 years ago +1

      @@blendaguedes Thank you... pleasure to meet you!

  • @MrAnindyabanerjee
    @MrAnindyabanerjee 4 years ago

    Thank you