Back Propagation in training neural networks step by step

  • Published 12 Jun 2024
  • This video follows on from the previous video, Neural Networks: Part 1 - Forward Propagation.
    I present a simple numerical example of how backprop works (a minimal code sketch of the same workflow follows the chapter list below).
    0:00 Introduction
    0:35 Our silly dataset
    0:55 Recap of forward propagation
    2:00 Backpropagation beginning
    3:00 Intuition behind backpropagation
    4:45 The best way to carry out backprop is by using gradient descent
    4:50 What is gradient descent?
    7:00 What is a partial derivative?
    7:30 What is a cost function?
    8:05 Partial derivative formula using the chain rule
    13:35 Update the weights and biases using gradient descent
    14:00 What is a learning rate?
    14:10 Gradient descent formula and full examples
    24:26 Updated weights
    25:00 Stochastic gradient descent
    26:30 What is an epoch?
    27:10 Unresolved questions: learning rate; stochastic gradient descent; activation function
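
    A minimal, self-contained sketch of the workflow the chapters above walk through: one forward pass through a tiny network with a sigmoid hidden neuron, a squared-error cost, and a single gradient-descent update of an output weight via the chain rule. The inputs (60, 80, 5) and target (~82) echo values mentioned in the comments below; the weights, biases and learning rate here are made-up placeholders, not the values used in the video.

        import math

        def sigmoid(z):
            return 1.0 / (1.0 + math.exp(-z))

        # hypothetical training example and parameters (illustrative only)
        x1, x2, x3 = 60.0, 80.0, 5.0            # inputs
        y_actual = 82.0                         # target output
        w1, w2, w3, b1 = 0.1, -0.2, 0.3, 0.5    # hidden-neuron weights and bias
        w7, b2 = 12.0, 1.0                      # output weight and bias
        lr = 0.01                               # learning rate

        # forward propagation
        z1 = w1 * x1 + w2 * x2 + w3 * x3 + b1
        g1 = sigmoid(z1)                        # hidden activation
        y_pred = w7 * g1 + b2                   # linear output neuron

        # cost and backpropagation for w7 via the chain rule:
        #   dCost/dw7 = dCost/dy_pred * dy_pred/dw7 = 2*(y_pred - y_actual) * g1
        cost = (y_pred - y_actual) ** 2
        dcost_dw7 = 2.0 * (y_pred - y_actual) * g1

        # gradient-descent update of w7
        w7 = w7 - lr * dcost_dw7
        print(cost, dcost_dw7, w7)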

COMMENTS • 124

  • @oussamaoussama6364
    @oussamaoussama6364 1 year ago +46

    This is the best video on YT that I know of that explains back propagation and gradient descent clearly. I've tried so many, but this one is by far the best. Thanks for putting this together.

    • @rogerpitcher2636
      @rogerpitcher2636 10 months ago +1

      I agree, by far the best description of these processes.

  • @user-ou7dq1bu9v
    @user-ou7dq1bu9v 1 month ago +2

    Two years after publishing, your video is still a gem 💥

  • @user-ph3qi5to4r
    @user-ph3qi5to4r 7 months ago +3

    I have seen so many YT videos about backprop & gradient descent these days; yours is the clearest among them, including a video with millions of views. This video deserves more exposure. Thank you!

  • @xintang7741
    @xintang7741 7 months ago +3

    Genius! Better than any professor in my school. Most helpful lecture I have ever found. Thanks a lot!!!!!!!

  • @marcusnewman8639
    @marcusnewman8639 1 year ago +3

    I am doing my bachelor in insurance mathematics, and one of my tasks was to model a feed-forward neural network. Had no clue what it was. Watched 50 minutes of this guy and now I understand everything. Really great videos!

  • @luisreynoso1734
    @luisreynoso1734 1 month ago

    This is the very best video on explaining Back Propagation! It is very clear and well-designed for anyone needing to learn more about AI. I look forward to seeing other videos from Bevan.

  • @rahuldevgun8703
    @rahuldevgun8703 25 days ago

    The best I have seen to date... superb

  • @carlhopkinson
    @carlhopkinson 1 year ago +3

    You really have a talent for explaining this difficult subject clearly, so that it makes sense and links up with the intuitive notions.

  • @ScampSkin
    @ScampSkin 9 months ago +2

    There are a lot of nice-looking videos on BP, but this one finally makes it clear that the dependency is really only on the last layer, not on all the previous neurons. It was intimidating and overwhelming to think that I had to keep track of all the neurons and their derivatives, but now it is clear that I can do everything one step at a time. I might sound chaotic and incoherent, but I'm just so excited to finally find a video that is not too simple and not too heavy on math notation, yet still makes things clear.

  • @wagsman9999
    @wagsman9999 11 months ago +1

    This is one of the clearest explanations of back propagation I've come across. Thanks!

  • @MrFindmethere
    @MrFindmethere 9 months ago

    One of the best videos, covering the whole scenario with an example, not just part of the process.

  • @sgrimm7346
    @sgrimm7346 8 months ago +2

    Just subbed to your channel because of the extremely clear explanations. Backprop has always been a sticking point for me, mostly due to the fact that no one else is willing to get down from their 'math jargon' throne and actually explain the variables and functions in human language. You, Sir, are a gem and deserve all the kudos you can get. Years ago I wrote a couple of ANN programs with BP, but I didn't understand it; I just wrote the calculations. Now I can't wait to try it out again with a new understanding of the subject. Thank you again.

  • @karthikrajeshwaran1997
    @karthikrajeshwaran1997 3 months ago

    Just outstanding - rewatched it and it made it so clear!

  • @user-vc6uk1eu8l
    @user-vc6uk1eu8l 9 months ago +1

    I am completely amazed!! The clarity of explanation is at the highest level possible! Thank you very much, Sir, for this video and for all the efforts you put to make it so clear! Such a great talent to explain complex ideas in a clear and concise manner is indeed very rarely seen!

  • @Abdul-Mutalib
    @Abdul-Mutalib 1 year ago +1

    What great teaching! The way you have explained all the nuts and bolts is just amazing.

  • @msalmanai62
    @msalmanai62 2 years ago +4

    Very clear, the easiest explanation. Not only did you explain backward propagation in an easy way, but you also clarified a lot of other concepts as well. Thanks and a lot of love ❤❤❤

    • @bevansmithdatascience9580
      @bevansmithdatascience9580  2 years ago +1

      Glad it was helpful Muhammed! Please do comment on what other concepts you were helped on. Thanks

  • @Be1pheg0r_
    @Be1pheg0r_ 8 months ago +1

    Amazing video. Well explained and easy to understand. By far the best thing I have found so far that explains everything with good, supportive visuals.

  • @asaad3138
    @asaad3138 5 months ago

    By far this is the best explanation. Clear, precise, detailed instructions. Very good, and thank you so much 🙏

  • @nawaldaoudi2625
    @nawaldaoudi2625 7 months ago

    The best video I've seen so far. Such a clear and concise explanation. I finally understood, in a smooth way, the different concepts related to back and forward propagation. I'm grateful.

  • @MultiNeurons
    @MultiNeurons 1 year ago

    Finally, a very well done job on back propagation.

  • @Pouncewound
    @Pouncewound 1 year ago +2

    Amazing video. I was really confused by other videos but yours really explained every bit of it simply. thanks!

  • @sisumon91
    @sisumon91 3 months ago

    Best video I have found for BP! Thanks for all your efforts.

  • @karthikrajeshwaran1997
    @karthikrajeshwaran1997 3 months ago

    Thanks so much for the clarity. Helps tremendously! Love this.

  • @wanna_die_with_me
    @wanna_die_with_me 1 month ago

    THE BEST VIDEO FOR UNDERSTANDING BACK PROPAGATION!!!! Thank you sir

  • @magnus.t02
    @magnus.t02 9 months ago

    Best explanation of gradient descent available. Thank you!

  • @techgamer4291
    @techgamer4291 3 months ago

    Thank you so much, Sir.
    Best explanation I have seen on this platform.

  • @karthikeyanak9460
    @karthikeyanak9460 2 years ago

    This is the best explanation of back propagation I ever came across.

  • @kumardeepankar
    @kumardeepankar 7 months ago

    The best explanation on back propagation. Many thanks!

  • @user-ts5vd9fp1g
    @user-ts5vd9fp1g 2 months ago

    This channel is so underrated...

  • @mlealevangelista
    @mlealevangelista 1 year ago

    Amazing. I finally learned it. Thank you so much.

  • @farnooshteymourzadeh8874
    @farnooshteymourzadeh8874 1 year ago

    Really clarifying, not too long, not too short, just enough! Thanks a lot!

  • @gowthamreddy2236
    @gowthamreddy2236 2 years ago

    My gosh... The clarity is amazing... Thanks Bevan

  • @abdelmfougouonnjupoun4614
    @abdelmfougouonnjupoun4614 1 year ago

    Thank you so much for such an amazing explanation, the best I have ever seen.

  • @minerodo
    @minerodo 7 months ago

    I really appreciate this video!! Believe me, I have been looking in books and in other videos, but this is the only one that tells the entire story in a very clear way (besides the StatQuest channel)! Thanks a lot!! God bless you!

  • @dallochituyi6577
    @dallochituyi6577 2 years ago

    Absolutely enjoyed your explanation. Good job sir.

  • @ywbc1217
    @ywbc1217 8 months ago

    YOU ARE REALLY THE BEST ONE 🤗

  • @rajFPV
    @rajFPV 11 months ago +1

    Just beautiful!
    Your math notation combined with your teaching skill just made it so simple! Forever indebted. Thank you so much.

    • @sksayid6406
      @sksayid6406 5 months ago

      Please teach us more. It was a great explanation, and you made this so easy for us. Thanks a lot.

  • @danjohn-dj3tr
    @danjohn-dj3tr 8 months ago

    Awesome, clearly explained in a simple way 👍

  • @zarmeza1
    @zarmeza1 7 months ago

    This is the best explanation I found, thanks a lot.

  • @PLAYWW
    @PLAYWW 5 months ago

    You are the only YouTuber I have met who can explain all the specific calculation processes clearly and patiently. I appreciate you creating this video. It helps a lot. I wonder if you could make a video about collaborative filtering?

  • @sma92878
    @sma92878 2 months ago

    This is amazing, so clear and easy to understand!

  • @linolium3109
    @linolium3109 1 year ago

    That's such a good video. I had this in a lecture and didn't understand anything. This was really a relief for me. Thank you for that!

  • @isurusubasinghe2038
    @isurusubasinghe2038 2 years ago

    The best and simplest explanation ever

  • @samuelwondemu6972
    @samuelwondemu6972 1 year ago

    Best way of teaching. Thanks a lot!

  • @adiai6083
    @adiai6083 2 years ago

    The simplest and clearest explanation ever.

  • @DanielRamBeats
    @DanielRamBeats 10 months ago

    You are an amazing teacher; thank you for taking the time to create this and to share your knowledge with us. I am grateful.

  • @LakshmiDevi_jul31
    @LakshmiDevi_jul31 4 months ago

    Thank you so much. It was so simple.

  • @markuscwatson
    @markuscwatson 6 months ago

    Great presentation. Good job 👍

  • @robertpollock8617
    @robertpollock8617 7 months ago

    EXCELLENT!!! JOB WELL DONE!!! Wish you would make a video on batch gradient descent.

  • @VladimirDjokic
    @VladimirDjokic 7 months ago

    Excellent explanation. Thank you ❤

  • @mychangeforchange7946
    @mychangeforchange7946 1 year ago

    The best explanation ever

  • @1622roma
    @1622roma 1 year ago

    Best of the best. Thank you 🙏

  • @jesuseliasurieles8053
    @jesuseliasurieles8053 9 months ago

    Sir, awesome job explaining this amazing topic.

  • @Samurai-in5nr
    @Samurai-in5nr 2 years ago +1

    The best and simplest explanation ever. Thanks man :)

  • @lawrence8597
    @lawrence8597 2 years ago +1

    Thanks very much, God bless you.

  • @Zinab8850
    @Zinab8850 1 year ago

    Your explanation is fantastic!! Thanks

  • @shobhabhatt3602
    @shobhabhatt3602 2 years ago

    Thanks a lot for such a video. The simplest, easiest, and most thorough explanation for both beginners and advanced learners.

  • @MrSt4gg
    @MrSt4gg 1 year ago

    Thank you very much for the video!

  • @tkopec125
    @tkopec125 1 year ago

    Finally! Got It! :) Thank You Sir very much

  • @kennethcarvalho3684
    @kennethcarvalho3684 4 months ago

    Finally I understood something on this topic.

  • @user-xg1cj7wh1m
    @user-xg1cj7wh1m 6 months ago

    Mister, you have saved my life lol, thank you!!!

  • @levieuxperesiscolagazelle2684

    God bless you, thank you, you saved my life!!!!!

  • @DanielRamBeats
    @DanielRamBeats 8 months ago

    I had to pause the video at the point where you mentioned the chain rule and go back to learn calculus. I took Krista King's calc class on Udemy, and I am finally back to understand these concepts! One month later :)

  • @Koyaanisqatsi2000
    @Koyaanisqatsi2000 10 months ago

    Great content! Thank you!

  • @Terra2206
    @Terra2206 1 year ago

    I was reading a book about neural networks, and in one of the last steps I had a big question about how a number was calculated, so I got frustrated. The book had an error, and I could find it thanks to this video. Thanks a lot, very good explanation.

  • @KolomeetsAV
    @KolomeetsAV 10 months ago

    Thanks a lot! Really helpful!

  • @faisalrahman3608
    @faisalrahman3608 2 years ago +9

    You have got the skill to explain ML to even an 8-year-old.

  • @alinajokeb6930
    @alinajokeb6930 2 years ago

    Every learning channel should follow you. Your teaching method is amazing, sir; also, the last questions were actually on my mind, but you cleared them up. Thanks a lot 🌼

  • @abdulhadi8594
    @abdulhadi8594 1 year ago

    Excellent, Sir!

  • @depressivepumpkin7312
    @depressivepumpkin7312 5 months ago

    Man, this is at least the 15th video on the topic I have watched, plus several books related to back propagation, and this is the best one. All the previous videos just skip a lot of the explanation, focusing on how important and crucial backpropagation is and what it allows you to do, instead of giving a step-by-step overview. This video contains zero BS and only clear explanations. Thank you.

  • @PrashantThakre
    @PrashantThakre 2 years ago +1

    You are great... thanks for such amazing videos.

  • @Amit-mq4ne
    @Amit-mq4ne 11 months ago

    great!! thank you

  • @yazou4564
    @yazou4564 7 months ago

    well done!

  • @AlperenK.
    @AlperenK. 8 months ago

    Awesome

  • @debjeetbanerjee871
    @debjeetbanerjee871 2 years ago +1

    Best explanation on the internet... could you please make videos on the different activation functions that you mentioned (tanh and ReLU)? It would be really nice of you!!

  • @surojitkarmakar3452
    @surojitkarmakar3452 1 year ago

    At last I understood 😌

  • @alonmalka8008
    @alonmalka8008 6 months ago

    illegally underrated

  • @DanielRamBeats
    @DanielRamBeats 8 months ago

    I had gotten confused by the notion of "with respect to x", which, after some study, means that when you take the derivative of a multi-variable function, you only differentiate with respect to x and keep y exactly the same (treat it as a constant). God, this learning is taking forever! :/
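
    A standard worked example of that idea (basic calculus, not taken from the video): if f(x, y) = x²·y, then ∂f/∂x = 2xy, treating y as a constant, while ∂f/∂y = x², treating x as a constant. In backprop, the weight being updated plays the role of x, and every other weight and bias is held fixed.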

  • @anupmandal5396
    @anupmandal5396 9 months ago

    Best

  • @Richard-bt6uk
    @Richard-bt6uk 2 months ago

    Hello Bevan
    Thank you for your excellent videos on neural networks.
    I have a question pertaining to this video covering Back Propagation. At about 14:30 you present the equation for determining the updated weight, W7. You are subtracting the product of η and the partial derivative of the Cost (Error) Function with respect to W7. However, this product does not yield a delta W7, i.e., a change in W7. It would seem that the result of this product is more like a delta of the Cost Function, not W7, and it is not mathematically consistent to adjust W7 by a change in the Cost Function. Rather, we should adjust W7 by a small change in W7. Put another way, if these quantities had physical units, the equation would not be consistent in units. From this perspective, it would be more consistent to use the reciprocal of the partial derivative shown. I'm unsure if this would yield the same results. Can you explain how using the derivative as shown to get the change in W7 (or indeed in any of the weights) is mathematically consistent?

  • @ammarjagadhita3189
    @ammarjagadhita3189 3 months ago

    I was just wondering: in the last part, when I try to calculate the partial derivative of w4, the result I get is -3711, but in the video it is -4947. Then, to make sure, I changed the last part of the equation to x1 (60) and it gives me the same result as in the video, which is -2783, so I'm not sure if I missed something, since he didn't write out the calculation for w4.

  • @batuhantongarlak3490
    @batuhantongarlak3490 2 years ago

    besttt

  • @chillax1629
    @chillax1629 1 year ago +3

    I believe you used a learning rate of 0.001, not 0.01. Or you did use a learning rate of 0.01 and there is an error in the updated w7, as it should be 12.42 and not 12.04.

  • @StudioFilmoweAlpha
    @StudioFilmoweAlpha 4 months ago +1

    22:53 Why is Z1 equal to -0.5?

  • @neinehmnein5701
    @neinehmnein5701 2 years ago +2

    Hi, thanks for the video! This question might be a little stupid, but aren't the weight updates at 24:27 wrongly computed? Shouldn't the first value be 0,29303 rather than 193?
    Thanks for an answer!

    • @bevansmithdatascience9580
      @bevansmithdatascience9580  2 years ago

      Thanks for the message. It could be that I have made a mistake. But I hope that you can understand the principle of how to do it. Thanks

    • @waleedbarazanchi4503
      @waleedbarazanchi4503 2 years ago

      It is stated correctly. You may have confused 19,303 with 19.303. Best

  • @Princess-wq7wk
    @Princess-wq7wk 1 year ago

    How do you update b1? I don't know how to update it.

  • @joyajoya4674
    @joyajoya4674 1 year ago

    😍😍

  • @YAPJOHNSTON
    @YAPJOHNSTON 1 year ago

    How are b1 and b2 calculated?

  • @giorgosmaragkopoulos9110
    @giorgosmaragkopoulos9110 2 months ago +1

    So what is the clever part of backprop? Why does it have a special name, and why isn't it just called "gradient estimation"? How does it save time? It looks like it just calculates all the derivatives one by one.

    • @bevansmithdatascience9580
      @bevansmithdatascience9580  2 months ago

      It is the main reason why we can train neural nets. The idea in training neural nets is to obtain the weights and biases throughout the network that will give us good predictions. The gradients you speak of get propagated back through the network in order to update the weights to be more accurate each time we add in more training data.
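
      A small sketch of the reuse that reply describes (illustrative network and numbers, not the video's): the output-layer error signal is computed once and then reused, first for the output-layer parameters and then, scaled by the connecting weight and the sigmoid slope, for the hidden-layer parameters, instead of re-deriving each gradient from scratch.

        import math

        def sigmoid(z):
            return 1.0 / (1.0 + math.exp(-z))

        # illustrative numbers only
        x, y_actual = 2.0, 1.0
        w1, b1, w7, b2 = 0.5, 0.0, 1.5, 0.0

        # forward pass: one input, one sigmoid hidden neuron, one linear output
        z1 = w1 * x + b1
        g1 = sigmoid(z1)
        y_pred = w7 * g1 + b2

        # backward pass: the output error signal is computed once...
        delta_out = 2.0 * (y_pred - y_actual)
        # ...and reused for every parameter feeding the output neuron,
        dcost_dw7 = delta_out * g1
        dcost_db2 = delta_out
        # ...then propagated one layer back (scaled by w7 and the sigmoid slope)
        # and reused again for the hidden-layer parameters.
        delta_hidden = delta_out * w7 * g1 * (1.0 - g1)
        dcost_dw1 = delta_hidden * x
        dcost_db1 = delta_hidden
        print(dcost_dw7, dcost_db2, dcost_dw1, dcost_db1)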

  • @codematrix
    @codematrix 1 year ago

    Hi Bevan, I plugged in your original weights and biases and got 24.95, which is correct, using inputs 60, 80, 5. I then entered all your modified values at frame 24:39 and got 42.19. I was hoping to get very close to ~82. Are you sure that you applied the local minimum value during gradient descent?

    • @bevansmithdatascience9580
      @bevansmithdatascience9580  1 year ago +1

      It could be a mistake on my part. I unfortunately don't have time now to go back and check. However, the important thing is that you understand how it all works. Cheers

    • @codematrix
      @codematrix 1 year ago

      @@bevansmithdatascience9580 I think I need to re-forward pass the features back into the NN with the adjusted values and recalculate the cost function until it reaches an acceptable local minimum. I’ll give that a try.

  • @FPChris
    @FPChris 2 years ago

    So you do a new forward pass after back propagation for each X? X1 forward, back... X2 forward, back... X3 forward, back.

    • @bevansmithdatascience9580
      @bevansmithdatascience9580  2 years ago

      Please check out my video on What is an epoch? That should clear things up. If not, ask again.

  • @orgdejavu8247
    @orgdejavu8247 2 years ago

    g1/z1: okay, dz1 is the derivative of the sigmoid function, but what happens with the dg1 in the numerator?

    • @bevansmithdatascience9580
      @bevansmithdatascience9580  2 years ago

      Please check 20:50. Let me know if that helps?

    • @orgdejavu8247
      @orgdejavu8247 2 years ago

      @@bevansmithdatascience9580 I asked the question badly. What I meant is: what happened to dellZ1, why does it disappear? Why does dellg1/dellz1 equal only the derivative of the sigmoid function?
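
      For what it's worth (standard calculus, assuming, as in this thread, that g1 = sigmoid(z1)): the expression dellg1/dellz1 is a single symbol for the derivative of the sigmoid evaluated at z1, so the dg1 in the numerator does not "disappear"; that derivative works out to sigmoid(z1)·(1 − sigmoid(z1)) = g1·(1 − g1).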

  • @effortlessjapanese123
    @effortlessjapanese123 2 months ago

    haha South African accent. baie dankie Bevan!

  • @apppurchaser2268
    @apppurchaser2268 1 year ago

    Great explanation, thanks. But I think 12 - 0.01 * 42.2 is 11.578 and not 12.04 (@15:37). By the way, amazing job, well explained concepts 🙏

    • @Kevoshea
      @Kevoshea 1 year ago +1

      Watch out when multiplying two negative numbers.

  • @MrRaminfar
    @MrRaminfar 2 years ago

    Do you have an entire course?

  • @wis-labbahasainggris8956
    @wis-labbahasainggris8956 23 days ago

    Why does weight updating use a minus sign, instead of a plus sign? 24:34

    • @bevansmithdatascience9580
      @bevansmithdatascience9580  21 days ago

      In gradient descent we want to tweak the weights/biases until we obtain a minimum error in our cost function. So for that we need to compute the negative of the gradient of the cost function, multiply it by a learning rate and add it to the previous value. This negative means we are moving downhill in the cost function (so to speak)
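
      A quick numeric illustration of that reply (made-up numbers, not from the video): with learning rate η = 0.1, if ∂Cost/∂w = +4 the update w ← w − 0.1·4 lowers w by 0.4, while if ∂Cost/∂w = −4 the update w ← w − 0.1·(−4) raises w by 0.4. Either way the weight moves in the direction that reduces the cost, which is exactly what the minus sign buys.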

  • @DanielRamBeats
    @DanielRamBeats 8 months ago

    Ugh, I am slowly getting it. First you take the derivative of the cost function with respect to Ypred, which is 2 times the difference between Ypred and Yact, and then you multiply that by the partial derivative of Ypred with respect to w7, which is just g1. Multiplying them together gives the partial derivative of the cost function with respect to w7. Then you take that partial derivative, multiply it by a learning rate, and subtract that from the original value of w7; this then becomes the new value of w7 (see the sketch after this thread).
    I am still confused about the next part, calculating the partial derivatives of the hidden layers, but hey, that is some progress so far, right? :/

    • @Luca_040
      @Luca_040 5 months ago

      And how to calculate the first partial derivative?
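
      Restating the steps from the comment above in symbols (standard chain rule; Ypred, Yact, g1, w7 and η as used in the thread, and assuming an output neuron that is linear in w7, so that ∂Ypred/∂w7 = g1 as the comment says): with Cost = (Ypred − Yact)², the first factor is ∂Cost/∂Ypred = 2·(Ypred − Yact), obtained by differentiating the square — this is also the "first partial derivative" asked about in the reply — the second factor is ∂Ypred/∂w7 = g1, so ∂Cost/∂w7 = 2·(Ypred − Yact)·g1, and the update is w7 ← w7 − η·∂Cost/∂w7.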

  • @ayoubtech6930
    @ayoubtech6930 5 months ago

    Could you please send me the PPT file?

  • @jameshopkins3541
    @jameshopkins3541 9 months ago

    BUT SMITH WHY G? WHY DCOST? PLEASE DON'T DO MORE VIDS!!!!