What is Vanishing/Exploding Gradients Problem in NNs

  • Published 6 Nov 2022
  • Vanishing/Exploding Gradients are two of the main problems we face when building neural networks. Before jumping into fixes, it is important to understand what they mean, why they happen, and what problems they cause for our neural networks. In this video, we will learn what it means for gradients to vanish or explode, and we will take a quick look at the techniques available for dealing with them (a short illustrative sketch of two common fixes follows this description).
    Previous lesson: • How to Implement Regul...
    Next lesson: TBA
    📙 Here is a lesson notes booklet that summarizes everything you learn in this course in diagrams and visualizations. You can get it here 👉 misraturp.gumroad.com/l/fdl
    👩‍💻 You can get access to all the code I develop in this course here: github.com/misraturp/Deep-lea...
    ❓To get the most out of the course, don't forget to answer the end of module questions:
    fishy-dessert-4fc.notion.site...
    👉 You can find the answers here:
    fishy-dessert-4fc.notion.site...
    RESOURCES:
    🏃‍♀️ Data Science Kick-starter mini-course: www.misraturp.com/courses/dat...
    🐼 Pandas cheat sheet: misraturp.gumroad.com/l/pandascs
    📥 Streamlit template (updated in 2023, now for $5): misraturp.gumroad.com/l/stemp
    📝 NNs hyperparameters cheat sheet: www.misraturp.com/nn-hyperpar...
    📙 Fundamentals of Deep Learning in 25 pages: misraturp.gumroad.com/l/fdl
    COURSES:
    👩‍💻 Hands-on Data Science: Complete your first portfolio project: www.misraturp.com/hods
    🌎 Website - misraturp.com/
    🐥 Twitter - / misraturp
  • Science & Technology
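
    The description above only previews the remedies. As a rough illustration (assuming the TensorFlow/Keras setup used elsewhere in the course and a made-up toy architecture, not code from this video), the sketch below combines two of the most common fixes: ReLU activations with He initialization against vanishing gradients, and gradient clipping on the optimizer against exploding gradients.

        import tensorflow as tf

        # Hypothetical toy model: ReLU activations plus He initialization keep
        # gradients from shrinking layer after layer the way saturating sigmoids do.
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(20,)),
            tf.keras.layers.Dense(64, activation="relu", kernel_initializer="he_normal"),
            tf.keras.layers.Dense(64, activation="relu", kernel_initializer="he_normal"),
            tf.keras.layers.Dense(1, activation="sigmoid"),
        ])

        # clipnorm caps the norm of each gradient, guarding against exploding
        # gradients during training.
        model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0),
            loss="binary_crossentropy",
        )
        model.summary()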

COMMENTS • 9

  • @deniz.7200
    @deniz.7200 9 months ago +2

    The best explanation of this issue!
    Thank you very much Mısra!

  • @ProgrammingCradle
    @ProgrammingCradle 1 year ago +2

    Wonderful explanation, Misra.
    It's a simple concept, but I have seen many people get confused by it. I am sure this will help many learners.

  • @angelmcorrea1704
    @angelmcorrea1704 6 months ago

    Thank you so much, excellent explanation.

  • @mmacaulay
    @mmacaulay 1 year ago +1

    Hi Misra, your videos are amazing and your explanations are usually very accessible. However, while the vanishing/exploding gradient problem in NNs is a complex concept, I unfortunately found the explanation in this video confusing. Would it be possible to provide another video on the vanishing/exploding gradient problem? Many thanks.

    • @Sickkkkiddddd
      @Sickkkkiddddd 10 months ago

      Essentially, deeper networks increase the risk of wonky gradients because of the multiplicative effects of the chain rule during back-propagation. With vanishing gradients, the earliest layers of the network receive gradients that shrink towards zero, so their neurons learn essentially nothing during backprop and the network takes forever to train. In the reverse case, the earliest layers receive exploding gradients, which destabilise the training process and produce unreliable parameters.
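
      To see that multiplication at work, here is a minimal pure-Python sketch (an illustration assumed for this rewrite, not code from the video; the depth and weight magnitudes are made up) that multiplies one factor per layer, the way backprop's chain rule does, and prints the gradient magnitude that reaches the first layer:

          def gradient_magnitude_at_first_layer(n_layers, weight_magnitude):
              # Each layer contributes one factor to the chain rule:
              # |weight| times the activation derivative (at most 0.25 for sigmoid).
              grad = 1.0
              for _ in range(n_layers):
                  grad *= weight_magnitude * 0.25
              return grad

          # Hypothetical 50-layer network, identical except for the weight scale
          print(gradient_magnitude_at_first_layer(50, weight_magnitude=1.0))  # ~7.9e-31, vanishes
          print(gradient_magnitude_at_first_layer(50, weight_magnitude=8.0))  # ~1.1e+15, explodes

      Same depth in both runs; only the weight magnitude changes, yet the gradient reaching the first layer differs by roughly 45 orders of magnitude, which is exactly the instability described above.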

  • @bay-bicerdover
    @bay-bicerdover 1 year ago

    There is only one video on YouTube about Dynamic Length Factorization Machines. If you could explain how that machine works in a hands-on video for novices like me, it would be much appreciated.

  • @khyatipatni6117
    @khyatipatni6117 4 months ago

    Hi Misra, I bought the notes for deep learning. I did not know it was a one-time download; I downloaded it at the time and then lost the file. How can I re-download it without paying again? Please help.

    • @misraturp
      @misraturp  4 months ago

      Hello, it is not a one-time download. Did you try the link sent to your email again?

  • @bay-bicerdover
    @bay-bicerdover 1 year ago

    The maximum derivative value of the sigmoid function is 0.25.
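
    That figure follows from the sigmoid derivative formula sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), which peaks at x = 0 where sigmoid(0) = 0.5, giving 0.5 * 0.5 = 0.25. A quick numpy check (an illustrative sketch added here, not code from the video):

        import numpy as np

        x = np.linspace(-10.0, 10.0, 10001)        # grid that includes x = 0
        sigmoid = 1.0 / (1.0 + np.exp(-x))
        sigmoid_deriv = sigmoid * (1.0 - sigmoid)  # sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))

        print(sigmoid_deriv.max())                 # 0.25
        print(x[sigmoid_deriv.argmax()])           # 0.0

    Since a sigmoid layer can pass on at most a quarter of the incoming gradient, stacking many of them is one direct route to the vanishing-gradient problem discussed in the video.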