What is Gradient Descent?

  • Published 5 Sep 2024
  • Gradient descent (en.wikipedia.o...) is a simple yet powerful optimization algorithm. It is widely used in machine learning and artificial intelligence because of its flexibility and effectiveness. The iterative nature of gradient descent makes it well suited for many problems where a closed-form solution is either not available or computationally hard to find. (A minimal sketch of the iterative update is included below the credits.)
    Credits:
    All animations were made using the manim library: docs.manim.com...
    Source code for the animations in the video: github.com/Alg...
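
A minimal sketch of the iterative update described above, assuming a made-up quadratic example f(x, y) = x^2 + y^2; the learning rate and step count are arbitrary illustration choices, not values from the video.

```python
import numpy as np

def gradient_descent(grad_f, x0, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient: the core loop of gradient descent."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - learning_rate * grad_f(x)  # move downhill along -grad f(x)
    return x

# Hypothetical example: f(x, y) = x^2 + y^2 has gradient (2x, 2y) and its minimum at (0, 0).
grad_f = lambda x: 2 * x
print(gradient_descent(grad_f, x0=[3.0, -4.0]))  # converges toward [0. 0.]
```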

COMMENTS • 42

  • @algoneural
    @algoneural  1 year ago +4

    The source code for all animations in this video is available here github.com/AlgoNeural/what-is-gradient-descent.

  • @rryk
    @rryk 1 year ago +3

    This is a great visual way to explain such an important algorithm. It can be used for so many different applications. Looking forward to more advanced versions with more animations. I would love to also learn more about various approaches to picking the starting point.

    • @algoneural
      @algoneural  1 year ago

      Thank you so much! Yes, these videos are already in the pipeline. Stay tuned for more.

  • @jaceharrison6487
    @jaceharrison6487 1 year ago +1

    Great video. You will definitely grow your channel fast with this quality of video.

    • @algoneural
      @algoneural  1 year ago

      Thank you! I'll try my best to keep videos coming.

  • @eranfeit
    @eranfeit 1 year ago

    Very nice explanation.

  • @janerikjakstein
    @janerikjakstein 1 year ago

    Great visuals! 👍

  • @araqweyr
    @araqweyr 1 year ago +1

    Great video. Even though I don't know much about this method (or neural networks in general), I happened to know that it might not always find the optimal solution. I believe there is even a term for this. It would be great if you could tell us more about this and how people are trying to solve this problem.

    • @algoneural
      @algoneural  1 year ago +1

      Spot on! Gradient descent might get stuck in a local minimum, which means that it finds a solution that is better than others in its immediate neighborhood, but there is still some other solution that is better. There are various ways of dealing with this, one of them being gradient descent with momentum. I am planning to cover these topics in the future (probably in a separate video).
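
A rough sketch of the momentum variant mentioned in the reply above; the decay factor beta and the other constants are illustrative defaults, not values from the video.

```python
import numpy as np

def gd_with_momentum(grad_f, x0, learning_rate=0.01, beta=0.9, n_steps=500):
    """Gradient descent with momentum: a velocity term accumulates past gradients,
    which can carry the iterate over shallow bumps and out of small local minima."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_steps):
        v = beta * v - learning_rate * grad_f(x)  # decayed "inertia" plus the new downhill step
        x = x + v
    return x
```

With beta = 0 this reduces to plain gradient descent; a larger beta means more inertia, which helps escape small pits but can also overshoot.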

  • @dydx8407
    @dydx8407 1 year ago +1

    Wow, I thought I was watching a 1M-sub channel, amazing quality

    • @algoneural
      @algoneural  1 year ago +1

      Thank you so much! I'll try to keep it up.

  • @havocthehobbit
    @havocthehobbit 1 year ago

    This was a great visual explanation, thank you. Can't wait to see your next video.

    • @algoneural
      @algoneural  1 year ago +1

      Thank you! The next one is coming in a couple of days

  • @giostechnologygiovannyv.ri489

    Yup, there are limitations of GD, like saddle points or points with vanishing gradients, so it can get stuck at those points for a long time, or even converge but to a local minimum rather than the global minimum. That is why other variations were created, like GD with momentum, NAG, RMSProp, Adam, variations of Adam, etc. Thanks for the video. I know all these concepts; I'm just looking for a graph in 3D, since most YT videos use the simple 2D representation XD hahaha
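
For reference, a compact, generic sketch of the Adam update mentioned in this comment, with the standard defaults beta1 = 0.9, beta2 = 0.999, eps = 1e-8; this is a textbook version, not code from the video.

```python
import numpy as np

def adam(grad_f, x0, learning_rate=0.001, beta1=0.9, beta2=0.999, eps=1e-8, n_steps=1000):
    """Adam: per-coordinate step sizes derived from bias-corrected running moments of the gradient."""
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)  # running mean of gradients (first moment)
    v = np.zeros_like(x)  # running mean of squared gradients (second moment)
    for t in range(1, n_steps + 1):
        g = grad_f(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero-initialized moments
        v_hat = v / (1 - beta2 ** t)
        x = x - learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return x
```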

    • @algoneural
      @algoneural  1 year ago +1

      3D is more fun though

    • @giostechnologygiovannyv.ri489
      @giostechnologygiovannyv.ri489 1 year ago

      @@algoneural Yeah exactly! And more realistic for problems with more than one variable. It also includes the idea of saddle points. I was thinking about what happens if you follow the gradient straight into a saddle point, right in the middle of it: it will never escape that point, even with momentum. I looked for many videos talking about this and no one does, but I found 2 papers that acknowledge the issue, although it is less likely to happen :3 .... Btw, have you already done the video explaining the physical intuition behind momentum? ^^!

    • @algoneural
      @algoneural  1 year ago

      @@giostechnologygiovannyv.ri489 I have a short about the momentum, but it is really short :)
      ua-cam.com/users/shortsfQEQYVWYIQY

  • @AK56fire
    @AK56fire 1 year ago

    Brilliant animation and very good explanation.. Keep up the great work..

  • @mrnarason
    @mrnarason 1 year ago

    Great video, been subbing to a lot of smaller math channels

  • @badphysics4604
    @badphysics4604 1 year ago

    Beautifully explained!

    • @algoneural
      @algoneural  1 year ago

      Thanks a lot! Cool avatar btw :)

  • @NavanshuPandey
    @NavanshuPandey 1 year ago

    Brilliantly explained. Subscribed 👍

    • @algoneural
      @algoneural  1 year ago

      Thank you so much! I appreciate it 😊

  • @jackieliu1700
    @jackieliu1700 1 year ago

    One drawback of gradient descent is local minima. There are times when steepest descent does not reach the actual minimum of the function.

    • @algoneural
      @algoneural  1 year ago +2

      Yes, precisely. Using the analogy of descending from a mountain, this looks like getting stuck in a little pit while not reaching a deep valley. One interesting way to deal with this is to use momentum. Like in the real world, having enough inertia might help us bounce out of a small local minimum and continue on the way to the deep valley. But there is no universal method that would work for all functions.

  • @subramanyam2699
    @subramanyam2699 1 year ago

    Please do more videos..

  • @blank_bow
    @blank_bow 1 year ago

    awesome

  • @ja6920
    @ja6920 1 year ago

    How can you perform gradient descent on an unknown "black box" function?

    • @algoneural
      @algoneural  1 year ago

      I would first consider other optimization methods, such as surrogate models or genetic algorithms. As for gradient descent, you might try to approximate the partial derivatives using finite differences. But this can be computationally expensive, and you also need to consider whether the underlying black box function is differentiable.
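
A small sketch of the finite-difference idea from the reply above: approximate each partial derivative of a black-box scalar function by perturbing one coordinate at a time (central differences). The step size h and the toy objective are arbitrary, and the two function evaluations per coordinate per step are exactly the computational cost mentioned above.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-5):
    """Central-difference approximation of the gradient of a black-box scalar function f."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (f(x + e) - f(x - e)) / (2 * h)  # two evaluations of f per coordinate
    return grad

# Toy usage with plain gradient descent on a function we pretend not to know analytically:
f = lambda x: float(np.sum((x - 1.0) ** 2))
x = np.array([5.0, -3.0])
for _ in range(200):
    x -= 0.1 * numerical_gradient(f, x)
print(x)  # approaches [1. 1.]
```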

    • @ja6920
      @ja6920 1 year ago +1

      @@algoneural Thank you for the response. My challenge is that the function is not just a mathematical expression; it has if/then branches that dictate how the results are computed, so it is likely not differentiable. I tried variations on binary search to explore the results from the cost function, but since it is calculated against real data (estimated vs actual using R-squared), it is not very easy to "descend" as it has ups and downs. Most explanations I find are always based on some nicely defined function where partial derivatives can be used for the gradient. I will look into the other suggestions mentioned (surrogate models, genetic algorithms).

    • @algoneural
      @algoneural  1 year ago

      @@ja6920 You could try some other derivative-free optimization methods, e.g. simulated annealing
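
A minimal sketch of the simulated annealing suggestion above, usable even when the cost function is non-differentiable; the proposal scale, cooling schedule, and toy objective are placeholder choices.

```python
import numpy as np

def simulated_annealing(f, x0, n_steps=5000, step_scale=0.5, t0=1.0, seed=0):
    """Derivative-free search: propose random moves, always accept improvements,
    and occasionally accept worse points (more often while the temperature is high)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    best_x, best_fx = x.copy(), fx
    for k in range(1, n_steps + 1):
        temp = t0 / k  # simple cooling schedule
        candidate = x + step_scale * rng.standard_normal(x.shape)
        f_cand = f(candidate)
        if f_cand < fx or rng.random() < np.exp(-(f_cand - fx) / temp):
            x, fx = candidate, f_cand
            if fx < best_fx:
                best_x, best_fx = x.copy(), fx
    return best_x

# Toy objective with a non-differentiable kink at x = 2:
print(simulated_annealing(lambda x: float(np.abs(x - 2.0).sum()), x0=[10.0]))
```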

    • @ja6920
      @ja6920 1 year ago +1

      @@algoneural Thank you, these are all new to me so I will look into them.

  • @AK56fire
    @AK56fire 1 year ago

    Also, it would be great if you shared your code too..

    • @algoneural
      @algoneural  1 year ago +2

      Absolutely. I will do so as soon as I'm done cleaning up the code.

  • @myworldAI
    @myworldAI 1 year ago

    👍👍👍👍👍