Tutorial 12- Stochastic Gradient Descent vs Gradient Descent
- Published 5 Oct 2024
- Below are the various playlists created on ML, Data Science and Deep Learning. Please subscribe and support the channel. Happy Learning!
Deep Learning Playlist: • Tutorial 1- Introducti...
Data Science Projects playlist: • Generative Adversarial...
NLP playlist: • Natural Language Proce...
Statistics Playlist: • Population vs Sample i...
Feature Engineering playlist: • Feature Engineering in...
Computer Vision playlist: • OpenCV Installation | ...
Data Science Interview Question playlist: • Complete Life Cycle of...
You can buy my book on Finance with Machine Learning and Deep Learning from the below url
amazon url: www.amazon.in/...
🙏🙏🙏🙏🙏🙏🙏🙏
YOU JUST NEED TO DO
3 THINGS to support my channel
LIKE
SHARE
&
SUBSCRIBE
TO MY YouTube CHANNEL
Krish sir, your YouTube channel is just like the GITA for me: as one gets all the answers to life in the GITA, I get all my doubts cleared on your channel.
Thank you, Sir.
after becoming member how can i get the data science material, can you please tell me?
Amazing explanation Sir! You'll always be the hero for the AI Enthusiasts. Thanks a lot!
Whenever I am confused about a topic, I come back to this channel and watch your videos, and it helps me a lot, sir. Thank you, sir, for an amazing explanation.
Best Deep Learning playlist on youtube
what an amazing teacher you are. Crystal clear.
This is an excellent explanation that anyone can understand, with such a granular level of detail.
My God! Finally am clear about GD SGD and mini batch SGD!
I saw many videos but this one is quite comprehensible and informative
This video is such a light bulb moment for me :D Thank you so very much!!
Excellent explanation, it was really helpful thank you.
Good Explanation! But you did not speak much about when to use SGD although you clarified better on GD and Mini Batch SGD
There is nothing much to explain about SGD when you are talking about 1 datapoint at a time while considering dataset of 1000 datapoints.
You are a HERO sir
such a clean way of explanation
How are you so good at explaining 😭😭😭😭😭 Thanks a lot ♥♥♥
Really, this is the best video I've ever seen; it explains the concept better than famous schools.
Krish, you condense the subject most meaningfully.
Superb... simply superb. I now understand the concept from the loss function. Well done, Krish.
Thanks Krish. Good video. I want to use all this knowledge in my next batch of deep learning by ineuron.
Good, good, clearly explained. Nobody can explain like this.
Thank you, It is clear explanation. I got it!
Hi Krish, one request to you ...like this playlist, please make long videos for the ML Playlist with the Loss Functions , Optimizers used in various ML Algorithms --> mainly in case of Classification Algorithms
So simply explained
Your videos are an excellent reference to brush up on these concepts.
Great explanation. Very clear. Thanks!
Sir your videos are amazing. Can you please explain about latest methodologies such as BERT , ELMO
Negative and positive slopes are best explained as follows:
on the left side of the curve the tangent makes an angle of more than 90 degrees with the weight axis, so the slope (the derivative) is negative; on the right side the angle is less than 90 degrees, so the slope is positive.
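The point in the comment above can be checked numerically. This is only a minimal sketch, assuming a hypothetical convex loss `(w - 3)^2` that is not taken from the video; `loss`, `slope`, and `gd_step` are illustrative names. It shows that a negative slope on the left of the minimum pushes the weight up, and a positive slope on the right pushes it down.

```python
def loss(w):
    # Hypothetical convex loss with its minimum at w = 3 (an assumption).
    return (w - 3.0) ** 2


def slope(w):
    # Analytic derivative of the loss above.
    return 2.0 * (w - 3.0)


def gd_step(w, lr=0.1):
    # Standard update: w_new = w_old - learning_rate * dLoss/dw
    return w - lr * slope(w)


left, right = 1.0, 5.0  # one point on each side of the minimum

# Left of the minimum: slope is negative, so the update increases w.
print(left, slope(left), gd_step(left))
# Right of the minimum: slope is positive, so the update decreases w.
print(right, slope(right), gd_step(right))
```

In both cases the step moves the weight toward the minimum, which is why the same update rule works on either side of the curve.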
thanks for this! great explanation
yes i really liked this explanation thanks
Excellent explanation Sir!
What an explanation 🧡 . Great !! Awesome !! .
Awesome @KrishNaik Sir.
Just wanted to ask if you could also suggest some good online resources that we can read for more clarity.
Explained very well. Thankyou.
Sir, please solve my problem. In my view, we do gradient descent to find the best value of m (the slope in linear regression, taking b = 0). If we use all the points, we should already know the value of m at which the loss is lowest, so why do we need a learning rate to update the weight when we already know the best value?
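One possible way to look at the question above, as a sketch rather than the video's own answer: using every data point gives the loss and its slope only at the current value of m, not the location of the minimum, so the learning rate decides how far to move along that slope. The toy data (points on y = 2x), the function names, and both learning rates are assumptions for illustration. For this simple one-parameter case a closed-form solution does exist, and it is included for comparison.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # assumed toy data lying on y = 2x


def grad(m):
    # Derivative of the mean squared error with respect to m (b fixed at 0),
    # computed from ALL data points, i.e. full-batch gradient descent.
    return sum(-2.0 * x * (y - m * x) for x, y in zip(xs, ys)) / len(xs)


def fit(lr, steps=100, m=0.0):
    # Repeatedly move m against the gradient; lr scales each move.
    for _ in range(steps):
        m -= lr * grad(m)
    return m


# Closed-form least-squares slope for the b = 0 case.
closed_form = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

m_small_lr = fit(lr=0.05)  # small enough step: settles near the best m
m_large_lr = fit(lr=0.2)   # too large a step: overshoots further each time

print(closed_form, m_small_lr, m_large_lr)
```

With many weights and a non-linear model there is usually no such closed form, which is where the iterative update and its learning rate become necessary.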
very good explanation
Thanks Krish !!! very nice explanation
Top notch explanation!
Hello Sir, could you share the link to the code you explained? This video series is very nice; in a short period we can cover so many concepts. :)
Great video man 👍👍..Please keep it up. I am waiting for next videos
the only video that explains
If we use a sample of output to find the loss, will we use its derivative for changing whole weight or change the weights of the respective output
Hi, is it completely theoretical or will you code in further sessions?
Good attempt 👍. Please record with camera on manual focus.
You mentioned that SGD takes place in linear regression. I didn't understand that comment. Even in your linear regression videos, the mean squared error is a sum of squares over all data points. So how is SGD linked to linear regression?
4:17 SGD have minimum 256 records to find error / minima you said it's 1 record at a time
I read a few articles which say "In SGD, one data point is picked at random from the whole data set at each iteration". The 256 records you're talking about may be mini-batch SGD: "It is also common to sample a small number of data points instead of just one point at each step and that is called “mini-batch” gradient descent."
@pramodyadav4422 Yeah, I have also read that in SGD only one data point is selected and used for the update in each iteration, instead of all of them.
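The distinction discussed in these replies can be sketched with a single training loop in which only the batch size changes. This is an illustrative sketch, not the code from the video: the synthetic data (y ≈ 2x plus noise), the names `make_data` and `train`, the learning rate, and the epoch count are all assumptions. A batch size equal to the dataset gives gradient descent, 1 gives SGD in the strict sense, and anything in between (256 here) gives mini-batch SGD.

```python
import random


def make_data(n=1000, seed=0):
    # Synthetic data for illustration: y = 2x + small Gaussian noise.
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def train(xs, ys, batch_size, epochs=20, lr=0.5, seed=0):
    # Fits the slope m of y = m * x and counts how many weight updates ran.
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    m, updates = 0.0, 0
    for _ in range(epochs):
        rng.shuffle(indices)  # new random order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Mean-squared-error gradient estimated on this batch only.
            g = sum(-2.0 * xs[i] * (ys[i] - m * xs[i]) for i in batch) / len(batch)
            m -= lr * g
            updates += 1
    return m, updates


xs, ys = make_data()
m_gd, n_gd = train(xs, ys, batch_size=len(xs))  # gradient descent: all 1000 points
m_mb, n_mb = train(xs, ys, batch_size=256)      # mini-batch SGD
m_sgd, n_sgd = train(xs, ys, batch_size=1)      # SGD: one point per update

print("GD        :", m_gd, n_gd)
print("mini-batch:", m_mb, n_mb)
print("SGD       :", m_sgd, n_sgd)
```

The update counts make the trade-off visible: gradient descent makes one expensive update per epoch, while SGD makes one cheap, noisy update per data point.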
thank you very much for your efforts. please how can we solve a portfolio allocation problem using this algorithm? please answer me
Thank you, sir!
good explanation! thankss
Thank you sir.
I have 1 question regarding this topic. Is this concept applicable to linear regression, right?
Great Sir
Excellent
Great video, excellent effort. Appreciated!!
@12:02 Sir, it should be mini-batch stochastic GD.
which have more convergence speed SGD or GD ?
This is what i was looking for
thanks Krish
8:58 , using GD it converge quickly and while using mini-batch SGD it follows zigzag path, How??
In case of mini batch sgd, we are considering only some points so some deviations will be there in the calculation compared to usual gradient descent where we are considering all values. Simple example GD is like total population and mini SGD is like sample population, it will never be equal and in sample population some deviation always will be there in distribution compared to total population distribution.
We cant use GD everywhere, due to time computation factor, using mini SGD will give approximate correct result.
@kannanparthipan7907 Deviation will be there in the final output, or in the final converged result. The question is why we have it during the process of convergence. Also, if we consider different samples for every epoch, then it is understood that there can be zigzag results in the process of convergence. But if only one sample of k records is considered, then why is there a zigzag during convergence?
OK, now I got it. For every iteration, samples are picked at random, hence the zigzag. Just went through other articles.
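The conclusion reached in this thread can be illustrated with a small sketch, assuming synthetic data and hypothetical names (`grad`, `batches`) that do not come from the video. At one fixed value of the weight, each randomly drawn mini-batch gives a somewhat different gradient estimate, while their average over a full pass matches the full-data gradient. That batch-to-batch spread is what produces the zigzag path.

```python
import random

rng = random.Random(0)
# Synthetic data for illustration: y = 2x + noise.
xs = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]


def grad(m, indices):
    # Mean-squared-error gradient with respect to the slope m on a subset.
    return sum(-2.0 * xs[i] * (ys[i] - m * xs[i]) for i in indices) / len(indices)


m = 0.0  # evaluate every gradient at the same weight value
all_idx = list(range(len(xs)))
full_gradient = grad(m, all_idx)  # what gradient descent would use

rng.shuffle(all_idx)
# Ten disjoint random mini-batches of 100 points each.
batches = [all_idx[s:s + 100] for s in range(0, len(all_idx), 100)]
batch_gradients = [grad(m, b) for b in batches]

print("full gradient      :", full_gradient)
print("mini-batch gradients:", [round(g, 3) for g in batch_gradients])
print("spread (max - min)  :", max(batch_gradients) - min(batch_gradients))
```

Because the batches here are equal-sized and cover the data exactly once, their mean reproduces the full gradient; individual steps deviate from it, but they point in the right direction on average.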
Please work out your camera issue; it seems to be set to auto focus, which causes a little disturbance.
At 9:28 you said SGD will take more time to converge than GD. Then which is faster, SGD or GD?
Have you made videos on the practical implementation of all this work? If so, please share the links.
please do tell about stochastic gradient ascent also
Great one!
Awesome KRISHHHHHH
can you share the paper for reference and also can you share the resources for deep learning for image processing.
Thank u sir 🙏🙏🙌🧠🐈
1000 likes for you man👏👍
Thanks
Thanks buddy
Great guy.
Excellent!
How can we take k inputs at the same time
sir can you explain me SPGD algorithm please
hats off man
please what the difference between GD and Batch GD !
what is resource of data point?
py:28: RuntimeWarning: overflow encountered in scalar power
cost = (1/n)*sum([value**2 for value in(y-y_predicted)])
Hey bro, I am stuck here with this error. I could not understand the error itself; could you suggest a solution? I have just started practicing an ML algorithm.
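The commenter's full script is not shown, so the cause can only be guessed. One common reason for an overflow in a squared-error cost is a learning rate that is too large: each update overshoots, the predictions grow without bound, and squaring them eventually exceeds the floating-point range. The sketch below reproduces that behaviour under assumed data (points on y = 2x + 3) and assumed learning rates; `gradient_descent` and `cost_limit` are hypothetical names, and the loop stops early instead of overflowing.

```python
import math


def gradient_descent(xs, ys, lr, steps, cost_limit=1e100):
    # Fits y = m * x + b by full-batch gradient descent on mean squared error.
    m, b = 0.0, 0.0
    n = len(xs)
    costs = []
    for _ in range(steps):
        errors = [y - (m * x + b) for x, y in zip(xs, ys)]
        cost = sum(e * e for e in errors) / n
        costs.append(cost)
        # Stop once the cost explodes, rather than running into an overflow.
        if not math.isfinite(cost) or cost > cost_limit:
            break
        m_grad = -(2.0 / n) * sum(x * e for x, e in zip(xs, errors))
        b_grad = -(2.0 / n) * sum(errors)
        m -= lr * m_grad
        b -= lr * b_grad
    return m, b, costs


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 7.0, 9.0, 11.0, 13.0]  # assumed toy data on y = 2x + 3

m_ok, b_ok, costs_ok = gradient_descent(xs, ys, lr=0.08, steps=1000)
m_bad, b_bad, costs_bad = gradient_descent(xs, ys, lr=0.5, steps=1000)

print("lr=0.08:", m_ok, b_ok, costs_ok[-1])
print("lr=0.5 : stopped after", len(costs_bad), "steps, last cost", costs_bad[-1])
```

If the cost printed at each iteration keeps increasing, lowering the learning rate or scaling the input features is usually the first thing to try.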
Perfect ..!!!
You are AWESOME! :)
great video !!
Nice bro
good video
Switch the auto focus feature in your camera. It is distracting.
Confusing one !
Do change your method of teaching; it seems like someone has read a book and is just trying to copy that content from their side. Use your own ideas for it.
:)
Your videos are good but the camera was bad
Why dont you buy him a new one ?
Pora eri poka