The spelled-out intro to neural networks and backpropagation: building micrograd

  • Published 22 Dec 2024

COMMENTS • 1.9K

  • @georgioszampoukis1966 8 months ago +661

    The fact that this video is free to watch feels illegal. It really speaks volumes about Andrej. What a stunning explanation. It takes incredible skill and expertise to be able to explain such a complex topic this intuitively and simply. All I can say is thank you from the bottom of my heart that you offer videos like this for free. What an amazing man!

    • @Rumblerist 6 months ago +2

      Wholeheartedly agree. There are lots of videos that take a stab at explaining the core of how a neural network works; this is by far the simplest, yet it conveys the fundamentals of how neural networks work. Thanks @Andrej

    • @ian-haggerty 6 months ago +1

      @@Rumblerist All the best math boils down to timesing stuff and adding stuff.

    • @ian-haggerty 6 months ago

      And timesing stuff is basically an abstraction of adding stuff.

    • @yusuf.isyaku 6 months ago

      Thank you for writing this.

    • @magnetsec 5 months ago

      time to go to jail ig

  • @MoAlamili 10 days ago +19

    Andrej, your ability to explain complex topics so clearly and simply is remarkable. Thank you for your generosity in sharing this knowledge; it's deeply appreciated!

  • @ShahafAbileah 6 months ago +127

    From GitHub: "Potentially useful for educational purposes." What an understatement. Thank you so much for this video.

  • @nyariimani7281 1 year ago +300

    This reminds me of my college courses, except it's way better in three ways: 1) Here the speaker really does know what he's talking about. 2) I can stop and rewind, get definitions, and practice myself before moving on to the next step over and over, so I can get the most out of the next step because I actually had the time to understand the last step. 3) I can do this over several days so I can keep coming back when I'm fresh and present. You are a gem and I really, really appreciate you creating this.

  • @kemalatayev 2 years ago +442

    Just an FYI for those following along at home: if you get an error at 1:54:47, you should add __radd__ to your Value class, similar to __rmul__. It allows the order of addition not to matter. I don't think it was shown in the earlier sections. (See the sketch after this thread.)

    • @adamderose9468 2 years ago +16

      ty, i needed this at t=6422 in order to sum(wi*xi for wi, xi in zip(self.w, x)) + self.b

    • @CarlosGranier 1 year ago +7

      @@adamderose9468 Thanks Adam. This had me stumped.

    • @karanshah1698 1 year ago +7

      Underrated comment...

    • @lidiahyunjinkwon7138 1 year ago +2

      OMG, thank you so much. It was driving me nuts.

    • @jamesb43 1 year ago +1

      That’s comforting. I thought I just missed it. Good on you for sharing this
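
    A minimal sketch of the __radd__ fix discussed above, assuming a micrograd-style Value class as in the video (illustrative, not Andrej's exact code). Python's sum() starts from the plain int 0, so `0 + Value(...)` only works if Value defines __radd__:

        def __radd__(self, other):  # called for `other + self`, e.g. 0 + Value(2.0) inside sum()
            return self + other     # reuse __add__, which wraps plain numbers in Value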

  • @peterdann643 1 year ago +628

    Simply stunning. I'm a 72 year old fiction writer with rudimentary computer programming skills whose son works professionally in this area. I wanted to gain a better understanding of the technology he's working with, and writes scientific papers about, and now I feel I've made a great start in that direction. Wonderful!

    • @BarDots315 1 year ago +74

      You're a great father, never change!!

    • @hamdanalameri2885 1 year ago +36

      What an amazing father you are. My dad also tries to keep up with all the technologies just so he can understand and bond with his children. I want you to know that we appreciate you guys and love you.

    • @ClayMole 1 year ago +5

      You're awesome!
      May I ask, how has it been to work as a fiction writer? Would you recommend it?

    • @manishj5154 1 year ago +7

      All the people fawning over him, you do realize he started with saying he is a "fiction writer".
      Granted that it's pretty cool if this isn't fiction.

    • @semtex6412 11 months ago

      @@manishj5154 he's likely backpropagating

  • @DrKnowitallKnows 2 years ago +1308

    Andrej, the fact that you're making videos like this is AMAZING! Thank you so much for doing this. I will be spending some quality time with this one tonight (and probably tomorrow lol) and can't wait for the next one. Thank you, thank you, thank you!

    • @2ndfloorsongs 2 years ago +13

      And thank you for your videos, Dr Know It All. Always appreciate them.

    • @mattphorwich 2 years ago +6

      I was stoked to discover Andrej sharing the knowledge on these videos as well!

    • @lonnybulldozer8426 2 years ago +1

      You made love to the video?

    • @0GRANATE0 2 years ago +1

      And what happened? Do you now understand DNNs?

  • @carlosgruss7289 1 month ago +16

    Crazy that some of the most talented/knowledgeable people on the planet just come out here on YouTube and offer to teach the world for free. Makes you feel hopeful about humanity in a way :)

  • @fhools 1 year ago +193

    When I'm confused about deep learning, I go back to this video and it calms me. It shows that there is a simple explanation waiting for someone like Andrej to show the light.

    • @FireFly969 7 months ago +7

      Yep, I watched a 52-hour course on PyTorch. I was learning how to build a neural network but not how a neural network works, which was a mistake on my part, and a good lesson: learn how things work if you want to really learn them.

    • @mehulsuthar7554 6 months ago +1

      @@FireFly969 Mind sharing the course, please?

    • @derekcarday 2 months ago +1

      im more confused now

  • @harshmalik3470 6 months ago +17

    I can't even comprehend the level of mastery it must take to be able to distill such a complex topic into such a simple format, and the humility to give it out for free so that others may learn.
    Thank you so much Andrej for doing this, you're truly amazing.

  • @ThetaPhiPsi 2 years ago +205

    This is the single best explanation of backprop in code that I've seen so far. I once implemented a neural network from scratch, except for autograd, so micrograd is a good fit, and it's so clear and accessible. Thanks Andrej!

    • @leslietetteh7292 1 year ago +1

      Actually true. And exactly the same for me: I once implemented a neural network from scratch, and I broadly understood it, but this is the best explanation of backpropagation I've seen. Excellent work.

  • @shubh9207 6 months ago +32

    I don't understand why I understood each and every thing that Andrej explained. Such a gem of an instructor. Loved how he showed the actual implementation of tanh in the PyTorch library. This video is around 2 hours and 30 minutes long, but I took 2 weeks to understand it completely.

  • @bycloudAI 2 years ago +52

    This is literally gold, you have explained everything so intuitively and made it so much easier to understand!
    Thank you so much Andrej for sharing this in-depth knowledge for free!

    • @ophello 2 years ago +8

      You literally don’t know what “literally” means.

    • @flflflflflfl 1 year ago

      @@ophello Not necessarily. One can use a word incorrectly while still knowing its true meaning.

    • @SOMEONE-jg6jg 1 year ago

      love your videos bro

    • @writethatdown100 9 months ago

      @@ophello I know this is a year old comment, and my reply is pointless, but _technically_ 🤓Merriam Webster lists "used in an exaggerated way to emphasize a statement or description that is not literally true or possible" as one of the definitions. People define the dictionary. Not the other way around. And yes, it *literally* doesn't matter at all, but it annoyed me that you were wrong when trying to _correct_ somebody else's well meaning compliment.

  • @robl39 1 year ago +248

    Finally… someone who understands it well enough to explain it to a beginner. This is hands down the best NN video on the Internet. Thanks a ton!

  • @bergonius 2 years ago +82

    Great teacher with great background and expertise. We're lucky to have him spending his time to share his knowledge with anyone who wants to learn, for free. Looking forward to more videos.

    • @notkamara 2 years ago +4

      He's great! He even has an old YouTube cubing channel (Badmephisto) and his tutorials there are awesome too!

    • @alisaad679 1 year ago +1

      @@notkamara omg I learned how to solve a Rubik's cube from him decades ago, that's crazy; now I'm learning neural networks. Crazy how the world works

  • @kuoldeng4568 10 months ago +6

    Thank you for taking the time to do this. I'm an MSc Economics grad hoping to understand how neural networks work to start an AI startup, and your lecture is a perfect balance between depth and simplicity. Not everyone possesses a natural talent for teaching, and you have it!

  • @gabrieldornelles9310 2 years ago +234

    I'm really inspired by you as an educator, and I'm very happy to see you sharing your knowledge in a lecture after a long time!

  • @peters972 6 months ago +17

    You walked the razor-thin edge between going too fast or leaving steps out, and going too slow and making us want to skip ahead, like a magician. (You must have backpropagated the consequence of almost every word to come up with the perfect lecture with the lowest loss!) Thank you so much Andrej, even I was able to keep up, and I am going to show off my knowledge at the pub and library.

  • @nkhuang1390 2 years ago +141

    It takes real talent, dedication and complete mastery of the subject matter to break down difficult technical topics so clearly. It's also clear that Andrej is having fun while he elucidates. This is simply the most amazing series of educational videos on the internet on these topics. I hope you continue to put out more material like this.

  • @__amkhrjee__ 4 months ago +4

    I am mind-blown by the sheer simplicity & clarity of your explanation. You are an inspiration.

  • @GregX999 2 years ago +313

    OMG! This is the first time I've ever TRUELY understood what's actually going on when training a NN. I've tried to learn so many times, but everyone else seems to make it so unnecessarily complex. Thanks for this!

    • @ophello 2 years ago +13

      It’s spelled “truly.”

    • @khaldrogo9451 1 year ago +3

      @@ophello giggity

    • @pooroldnostradamus 1 year ago +30

      @@ophello Excuse his poor training dataset

    • @veganath 1 year ago

      @@ophello an example of backpropagation, you have no doubt adjusted Greg's waits...lol

    • @sidg11 1 year ago +2

      @@veganath it's spelled weights ...

  • @OMGinsane 4 months ago +6

    I'm an extreme beginner and am learning so much from this video! Only 24 minutes in, and I'm learning so much by listening to parts, then writing the code, and finally asking ChatGPT to dissect the code further so I can learn how specific things work. Thanks so much!

  • @krebul 2 years ago +81

    I'm a traditional dev and I have tried a lot of different guides and tutorials on neural networks. This is the first time I have been able to understand it. I'm about 1/3 through the video and it's 2am. Thanks for your excellent breakdown!!!

    • @0GRANATE0 2 years ago +2

      Did you finish the Video? Do you now understand it? Are you able to read papers in this area and implement them?

    • @brenok 1 year ago +5

      @@0GRANATE0 Is this some kind of suggestive question?

  • @JonnyDaenen 9 months ago +2

    Wow, I have been following several courses and trainings on neural nets, but this is exactly what I needed: a non-black-box approach that shows me in code how things work. All the abstractions are much clearer to me now! E.g. why a loss function needs to be differentiable, why you would need batches, etc. It's all just one big expression…
    Awesome work, Andrej! 🚀 thank you for making this available! 🙏

  • @treksis 2 years ago +12

    This video reminds me of my old numerical analysis professor who forced us to draw every interpolation problem by hand as an assignment. We drew all the tangent lines with a ruler and a protractor like a kid in primary school. We were complaining because that was just a few lines of code in Matlab, but in the end, that really helped us to develop true intuition behind it. Thanks for the intuitive video.

  • @ChenqianJing-g5w 1 year ago +9

    Our professor highly recommended your video for learning more about backpropagation. You explained it so well. Thank you so much for making this video; it really helped our study and understanding!

    • @SrikarDurgi 7 months ago

      You've got a good prof.
      Many feel insecure and don't recommend anything good.

  • @anatolianlofi 2 years ago +11

    This is probably the simplest, most well-paced explanation of back-propagation I've seen on YouTube.
    I wish everyone would break down information in this way. Thank you.

    • @Kobe29261 9 months ago

      It's really sad; we don't pay the smartest people enough to be teachers. You think about it and it's atrocious; the problem is actually worse - people like Andrej are co-opted by big corporations where their expertise and research can be hidden behind a wall of NDAs

  • @Snehilw 2 months ago

    Nothing but gratitude for you Andrej. I love to watch all your talks and explanations. Very refreshing, energizing, motivating, clear and concise. I can watch 100 hrs of your talks at a stretch without losing attention. Very captivating! Kudos! And thank you for this great community service!

  • @raphaelkalandadze9691 2 years ago +11

    What an astonishing lecture: the best explanation of backprop, and the whole cycle is so intuitive and easy to understand.
    I wish I had had a teacher like that; I would know everything 100 times better than I do now.
    Some still say that a child will learn everything on his own,
    but I bet everyone would be a genius if Andrej taught them, and all of you who attended his lectures at Stanford should be happy you did.
    I hope to have such explanation skills one day.
    I'm glad to see you on YouTube, and I hope you continue this series in the future.
    I have a lot more to say, but I hope to tell you in person one day.
    Thank you 100 times

  • @李国瀚-n1x 11 months ago +1

    Thank you for implementing backpropagation and automatic differentiation in such an elegant and easy to understand way. This is the most detailed and in-depth beginner's course I have ever seen.

  • @siddharthdhirde 10 months ago +11

    I appreciate that you did not edit out your mistakes in the video.
    It helped me understand the common pitfalls in building a neural network.

  • @wangcwy 7 months ago +1

    The best ML tutorial video I have watched this year. I really like the detailed examples and how these difficult concepts are explained in a simple manner. What a treat for me to watch and learn!

  • @OlabodeAdedoyin 2 years ago +51

    The weird sense of accomplishment I felt when I visualized (draw_dot) all the operations in their full glory is unreal 😂. Thank you for this 🙏. I'm a software engineer who has been trying to truly understand neural nets for a year now. I'm not sure you understand how much this means to me.
    I really appreciate you 🙏

  • @notderek7408 1 year ago +3

    Hey Andrej, idk if you'll read this but I wanted to echo others' appreciation for this fantastic introduction. I've been a SWE for many years but always ML-adjacent despite a maths background. This simple video has instilled a lot of intuition and confidence that I actually grasp what these NN's are doing and it's a lot of fuel in my engine to keep diving in. Thank you!

  • @DrRosik 1 year ago +11

    This video is by far the best and most educational I've seen regarding how NNs work, and I have looked for a very long time. I thought I knew how NNs work and operate, at least to the point that I could use the tools out there to build a simple one, but I never fully grasped the details and was always frustrated that I didn't know WHY I had to use a specific setup for my NN; I just "knew" that this was the way to do it. This video explains the basics in such a simple and logical way. Thank you Andrej! And please keep up the good work.

  • @hagenfinley8112 1 year ago +1

    I was a Philosophy major at CAL (I tell people in spite of that I have prospered ;-) with really no math or computer science training and weak Python skills. I mention philosophy because the primary skill one acquires studying philosophy is to read things you don't understand. With that in mind, I embarked on reading Ian Goodfellow's Deep Learning and Andrew Ng's Deep Learning Coursera courses. On that flimsy foundation (there's so much my weak mind still doesn't understand), I began this video's journey. Andrej's very generous explanation is so helpful. I especially appreciate the lack of mathematical notation, which is prevalent in the other works. Andrej employs seemingly simple math equations in Python, which made his training much more accessible to a novice like me. I'm deeply grateful he took the time to create this simple step-by-step explanation. I should only have to watch it 100 more times.

  • @caseyleemiller1 2 years ago +4

    This is an excellent tutorial not only on neural networks but on Python and Jupyter notebooks as well. I lost my sense of self for 2.5 hours and learned a ton! Thank you.

  • @waldof86 7 months ago +1

    I've learned more following along with this for a few hours than I've learned in a year's worth of classes. Thank you for being so open with your knowledge.

  • @ChrisOffner 2 years ago +13

    Amazing tutorial, very cool! I love that you patiently walked through a lot of manual examples - too often educators get self-conscious about showing simple steps more than once and then yada yada yada their way through it, which helps nobody. Love your teaching style and hope to see more.

  • @FireFly969 7 months ago +1

    Thank you so much Mr Andrej Karpathy. I watched and practiced a PyTorch course of like 52 hours and it was awesome, but after watching your video, it seems that I was learning more how to build a neural network than how a neural network works.
    With your video I know exactly how it works, and I am planning to watch all of this playlist and read almost all your blog posts ❤ Thank you and have a nice day.

  • @6Azamorn9 1 year ago +3

    This is enormously valuable knowledge and I'm grateful for your insight and for how exceptional you are at teaching the fundamentals.
    Thank you Andrej

  • @tradunskih 9 months ago

    I feel a lot of gratitude toward you, Andrej.
    Your teaching skills are exceptional! Approaching an explanation in a way that considers broad backgrounds, going into the basics with refreshers from school days, is hard work. I very much respect the time and effort you put into this course, and that you made it available for free, for humanity to improve.
    Thanks a lot Andrej!

  • @mingjunzhang 4 months ago +3

    • @AndrejKarpathy 4 months ago +6

      Umm wow that’s a lot thank you! -> Eureka

  • @alirezamogharabi8733 1 year ago +2

    I have been watching educational videos about neural networks for years, but no one had ever taught me like this. I've known you since 2016, from your machine learning class videos. Thank you very much for these wonderful tutorials.

  • @Qattea 2 years ago +8

    Thanks for this Andrej! I love the direction you are taking. I’ve been wanting to learn this and now I get to learn from the best

  • @onthefall 10 months ago +1

    No amount of thanks will be enough to show how much I appreciate your lectures. They have inspired me so much!
    One day I will make amazing things with this, just as you said, to show my appreciation for you.
    Thank you sooooo much. 🔥

  • @sanjay-89 1 year ago +13

    This was an exceptional lecture. Just wanted to say thank you for taking the time to make this.
    I have spent time in university courses, reading books, doing assignments and yet, I truly understood more out of this single lecture than from anything else prior.

  • @nyariimani7281 1 year ago +4

    You are really fun to watch. It's so nice to learn this from someone who really understands how everything works.

  • @agehall 2 years ago +12

    When I took my AI course some 20 years ago, people were pretty depressed because things were too hard to compute and there was very little future in this type of thing. Awesome to see the simplicity in this and how powerful it is. We’ve really come far in the field of AI.

  • @swathichadalavada9244 3 months ago

    Thank you Andrej! Your implementation of neural networks from scratch is impressive. The clarity and simplicity of your code make complex concepts like backpropagation much easier to grasp.

  • @harrypotter6505 1 year ago +5

    Wow, the op nodes you demonstrated this through made it so damn easy to intuit, it's crazy. I have no math recall from school and didn't do any advanced math after school, nor coding,
    and I had to watch this whole video in 3 or 5 passes to completely grasp what was happening. It was such an amazing journey: to be absolutely intimidated by the length of the video, the code, the math... I knew neither, I don't even know Python to begin with, yet I was able to derive exactly all the concepts necessary to understand this video.
    Thank you so much Andrej!

    • @galactic_dust42 11 months ago

      "Derive", yes, i think that's the word ! haha

  • @gamingvillage3414 8 months ago

    This is the best walkthrough with explanations on Neural Nets. This actually explains what happens behind the functions we use in DL libraries. Amazing work by Andrej.

  • @mohit9920 2 years ago +10

    That was incredible. Never has anyone been able to simplify neural networks in this manner for me. Please keep making such videos, you're doing god's work. By god, I mean the imminent AGI :)

  • @incognito7722 6 months ago +2

    Thanks so much sir.
    This really goes a long way for me in my career, not only in ML but in everything else.
    I will make sure to implement this in every single language I learn or have learnt.
    Once again, thank you.

  • @SuperOnlyP 1 year ago +3

    1:54:01 If you get the error "TypeError: unsupported operand type(s) for -: 'float' and 'Value'", try adding

        def __rsub__(self, other):  # called for `other - self`
            return other + (-self)

    to the Value class.
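
    (This sketch relies on the Value class also defining __neg__, e.g. def __neg__(self): return self * -1 as in the video, so that -self itself produces a Value.)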

  • @AhmAsaduzzaman 1 year ago

    This video on neural network creation is truly enlightening!
    It brilliantly captures the essence of how neural networks are built, providing a comprehensive understanding of their intricate architecture and function. The way it delves into the creative process behind designing neural networks showcases the immense skill and ingenuity of their creators.
    Watching this video is an enlightening and inspiring experience that leaves me with a newfound understanding of the creative process behind these powerful computational models.

  • @TylerMeester 1 year ago +3

    As a new CS grad trying to prepare for my career and job interviews, it was a real pleasure following alongside you in this video! I had no idea how integral (pun intended) calculus was to neural networks and backpropagation! Mind = BLOWN!

  • @freedmoresidume 8 months ago

    This was truly a spelled-out, exceptional presentation; I was able to code alongside and successfully completed the tutorial. It has significantly enhanced my comprehension of Neural Networks and their learning processes. Greatly appreciated!

  • @keikaku9298 2 years ago +4

    CS231n was life-changing for me. You are a fantastic educator. I hope this new endeavor works out for you!!

  • @leobianco9116 1 month ago

    This is truly valuable, thank you.

  • @chingizabilkasov6625 2 years ago +6

    Thank you very much for that lecture Andrej! It really helped me understand and combine different pieces I had learned separately into one structured concept. I especially appreciated that you left in the part with the bug on gradient zeroing and made an explanation for it at 2:10:24. Making mistakes and learning from them is so effective and undervalued imo. Thanks a lot! (A sketch of the loop, with the zeroing step, follows this thread.)

    • @王項-c6p 1 month ago

      Special thanks for your comment! I was thinking for a long time about why zero_grad() wasn't used, and it was only because of your comment that I realized this is a bug that will be explained later.🤣
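
    A hedged sketch of the training-loop pattern being discussed, following the micrograd MLP API from the video (the names n, xs, ys and the learning rate are assumptions for illustration):

        for k in range(20):
            ypred = [n(x) for x in xs]                                   # forward pass
            loss = sum((yout - ygt)**2 for ygt, yout in zip(ys, ypred))  # squared-error loss
            for p in n.parameters():
                p.grad = 0.0                   # zero the grads -- the bug was skipping this
            loss.backward()                    # backward pass
            for p in n.parameters():
                p.data += -0.05 * p.grad       # gradient descent step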

  • @deveshbhatt4063 10 months ago +2

    Man, this is an absolute masterpiece. I can finish at my own pace, and the intricate details and possible bugs are explained clearly. Feels like Morgan Freeman narrating. I can listen to Andrej all day long.

  • @notkamara 2 years ago +5

    Amazing how you taught me how to solve Rubik's cubes all those years ago when I was 12, and now at 22 I'm back here again learning backpropagation. You're doing God's work!

    • @aqgi7 9 months ago

      Same! I learnt F2L all those years ago and now backprop!

  • @rohanshah9593 1 year ago +1

    It feels good to learn from someone well known in the industry. Thank you for sharing. I have learned a lot from a practical perspective and look forward to more. Really appreciate these videos!

  • @aamerabbas 1 year ago +5

    This was an incredible video. I wish I'd had this when I very first tried to learn about NNs; I would have been able to start my journey with so much more intuition about how things work. Thank you for making this - I will wholeheartedly suggest this to anyone who wants to start learning about ML.

  • @zachli3070 9 months ago

    It's the clearest and most straightforward explanation of backpropagation and the training of neural networks I have ever learned from, and it takes little effort to understand even with a minor background in CS and math!

  • @lawrenceadu-gyamfi4179 1 year ago +4

    Just wanted to say a big thanks to you Andrej and the team working on this. Truly amazing; the clarity with which you explain these things is impressive and inspiring! Looking forward to seeing the remaining videos and even more. Thanks again!

  • @jiahaosu 17 days ago

    I'm mind-blown by the sheer clarity & simplicity of the explanation.

  • @kerwinmarkgordo3458 1 year ago +8

    Thank you so much for doing a step-by-step simulation of how gradient descent works. I am grateful for the passion and effort you put into teaching. These lessons are essential as we continue to dive deep into learning.

  • @ivanalejandrogarciaramirez8976 4 months ago

    You rock, man. I love how you split backpropagation into very simple steps and the visualization of the gradient at every step. I love it. Thank you.

  • @necbranduc 2 years ago +5

    I already know this is going to be good. I remember watching and enjoying your Stanford lessons on YouTube.

  • @TheArrowster 2 months ago

    What a memorable video! It was exactly the explanation I was looking for! I literally have tears in my eyes because I finally learned neural networks the right way! Thank you so much for sharing, Andrej! Greetings from Brazil!!!

  • @rhydderc127 1 year ago +4

    Thank you so much Andrej. I'm good at coding and bad at maths and this is the first time I've been able to properly understand a lot of this stuff... I just needed someone to explain it in my language :) I went along with the first half in detail, then in the second half I built the same MLP using pytorch tensors just to prove to myself that I understood how this all applies to pytorch, and it all worked great :)

  • @DarokCx 1 year ago

    Wooow, what an introduction! It is by far the best and the easiest to understand. The way you break up and simplify things without losing the main focus on WHY we are doing this is absolutely impressive! Thanks for sharing your knowledge.

  • @roddlez 2 years ago +66

    As someone who took Stanford's CS231n back in 2016 purely through watching the lectures and working through each assignment, this definitely strikes me as getting back to your educational roots. I do wonder how difficult it will be for others who do not have the background to approach this material.
    Conceptually, thinking about the training process of forward pass, calculating the loss, backprop, nudge on modern NNs with billions of parameters: from a computation standpoint, this requires many simultaneous reads from and updates to memory to/from the CPU/GPU, even for a single pass of data. Thinking from first principles, it would seem advantageous to assemble a custom computer architecture that would allow the entire NN math function (with billions of parameters) to remain in computational memory (registers) while doing forward and back propagation, thus saving the time spent trying to load/unload/store updated values for weights and biases.
    Is there a company that's attempting to accomplish such a feat?

    • @AndrejKarpathy 2 years ago +123

      your insight is exactly right. current computer architectures spend most of their time and energy shuttling data to/from memory, through the "von Neumann bottleneck", and calculating little pieces of the neural net at a time. this is not how it should be laid out and it is not how the brain works either, and yes many people are aware and working out various improvements. basically, "classical software" and "neural net software" need very different hardware for optimum efficiency and all the neural nets today run in "emulation mode".

    • @roddlez 2 years ago +8

      @@AndrejKarpathy Amazing. Super excited for the future of ML.

    • @elonfc 2 years ago +15

      @@AndrejKarpathy do you have elons phone number?😂

    • @The_Special_Educator 2 years ago

      @@AndrejKarpathy You never responded with Elon's phone number. If you want to maintain your credibility you must post Elon's personal phone number, address, and underwear size in the YouTube comment section.

    • @unchaineddreameralpa 1 year ago

      Joking

  • @ernietam6202 1 year ago +1

    Excellent. Instead of spending money and time going through Deep Learning books and papers, I now understand how gradient descent works and how PyTorch does it. Thanks a million for making this video.

  • @NFT2 2 years ago +15

    I've been working with Python for years and never implemented classes with those operator overrides. It's never too late to go back to the basics. Great video man.

    • @cantor_dust 1 year ago +1

      Fluent Python is a good book if you'd like to understand these overrides, otherwise known as dunder (double-underscore) methods.

  • @emmanueladebiyi2109 9 months ago

    Amazing how you broke this down into first principles. I understood a lot of these concepts before now, but I'm pleasantly surprised at how much clarity I gained by watching this video. Thanks.

  • @2ndfloorsongs 2 years ago +4

    Thanks, really cleared up a few confusions for me... And added new ones. Perfect. I'm looking forward to your future videos.

  • @anhtudo4713 9 months ago +1

    I'm a little late now, but it's never too late to start learning, I think :))) First I was simply interested in your GPT video, but after watching it halfway, I realized that I was still missing some foundational knowledge on neural nets and wanted to build everything strong from the ground up. That's why I'm here. It's amazing how you can explain neural nets and gradients in such simple and visual ways, unlike the crazy formulas taught in school. Thank you!

  • @sam.rodriguez 1 year ago +4

    This is fantastic. Thank you Andrej

  • @KobeeFinsac 6 months ago

    Thank you Andrej for this incredible and detailed video. The clarity with which you explain backpropagation and the construction of micrograd is exceptional. Bravo, and thank you for sharing your knowledge with us. You are an immeasurable source of inspiration.

  • @imtexaspete 2 years ago +559

    "remember back in your calculus class?...." nope. I'm subscribing anyway whenever I need a humble reminder that I don't know anything and there are people way way smarter than I am.

    • @omkarajagunde4175 2 years ago +9

      W
      O
      W
      Same realisation 🙌🙌😔😔😔

    • @Forrest_dev 2 years ago +30

      It's never too late to learn.

    • @vidbina 2 years ago +26

      The beautiful part of tech is the feeling of constantly being mind blown when realizing how little one knows and how much there is to learn. Studying micrograd has been on my list for a while thanks to George Hotz and this series is making the owning of this context so much easier. Loving it. ❤️

    • @ycombine1053 1 year ago +30

      Not smarter, more experienced. You are capable of understanding all of this given enough time and dedication.

    • @pastuh 1 year ago +4

      If someone can explain it, that means it's simple

  • @mrabbottizer 1 year ago

    Hands down, the most concise and easiest-to-understand lecture on backpropagation. Thank you for posting!

  • @antweb9 1 year ago +3

    I would like to thank you from the bottom of my heart for making this. I'm a developer myself, but this new advent in AI seemed unapproachable. Thanks for making it clear that no subject is tough if you have a great teacher. I am seriously going to consider this area as the thing I want to do next.

  • @JingchaoChen 5 months ago

    Best 2.5 hours spent during weekend ❤

  • @markonjegomir8714 2 years ago +4

    Nice to see Karpathy going back to making some educational content! 🙂 This is a must-watch!

  • @bluearctik3980 1 month ago

    The level of pedagogy here is incredible -- thank you so much for putting this out to the world!

  • @punto-y-coma7890 9 months ago +4

    By far the best neural networks introduction and tutorial ever made on YouTube. Thank you Andrej for sharing your valuable knowledge.

  • @aditya_kalaga99 6 months ago

    This video is a gift for anyone trying to understand what happens behind the wall of a PyTorch library, especially when starting out in the field of ML. Thanks, Andrej!!

  • 2 years ago +8

    As a dev, I sometimes follow along with an A.I. course, so I'm looking forward to following this one!
    Thanks for sharing your knowledge.

    • 2 years ago

      Just watched the last bit.
      The Python notation got really intense when PyTorch was introduced. As a non-regular Python dev, I'm going to work out some code snippets to fully grasp the example.
      But dang, so all the trouble of the gradient descent is "just" to know the direction of the optimization for each node?
      Very insightful!
      Is the mathematical capability of computing the local derivative the root cause of the local maximum trap?

  • @heliosobsidian 4 months ago

    Wanted to say thanks for that awesome backpropagation video. I've been scratching my head over this stuff for a while now - had all these bits and pieces floating around in my brain but couldn't quite connect the dots. Your explanation was like a lightbulb moment for me! Everything finally clicked into place. Really appreciate you putting this out there for us to learn from.🙌🙌🙌

  • @ajmeryexperiences4186 1 year ago +3

    Every morning I just visit this channel to check whether any new video has been uploaded, waiting for the next lectures.

  • @NBbhanu 2 months ago +1

    This is a great tutorial. For those who are new to the mathematics, it's best to take a calculus class first and come back to this to understand the lecture better. I did so, and it made a lot more sense afterwards.

  • @karthikbhaskar974 2 years ago +4

    Wow Andrej! Big fan of your work. Looking to learn more from you about neural networks, and you being on YouTube teaching is going to reach billions of people who want to get into AI and learn from the most amazing instructor the world has ever seen. Really loved your Stanford computer vision course; now looking forward to more awesome content here on YouTube.

  • @PeterCrabtree 1 year ago

    Most understandable thing on the basics of neural nets "from scratch" I've ever seen! And you even kept the mistakes in.

  • @TheAIEpiphany 2 years ago +4

    58:41 "As long as you know how to create the local derivative - then that's all you need". Ok Karpathy. Next paper title "Local derivatives are all you need".
    Nice to see you on YouTube! :))

  • @hxxzxtf 9 months ago +2

    🎯 Key Takeaways for quick navigation:
    00:42 *🧠 Micrograd is an autograd engine that implements backpropagation, essential for efficiently evaluating the gradient of a loss function with respect to neural network weights.*
    02:34 *📊 Micrograd allows building mathematical expressions, enabling operations like addition, multiplication, exponentiation, and more, forming an expression graph.*
    03:16 *🔍 Backpropagation in micrograd initializes at a node, recursively applying the chain rule to evaluate derivatives of internal nodes and inputs, crucial for understanding how inputs affect the output.*
    05:50 *💡 Micrograd operates at the scalar level for pedagogical reasons, but real-world neural network training employs tensors for efficiency, maintaining the same mathematical principles.*
    07:57 *⚙️ Micrograd's simplicity is highlighted by its small codebase: the autograd engine comprises just 100 lines of Python code, while the entire neural network library built atop it is only around 150 lines.*
    08:12 *📈 Understanding derivatives intuitively is essential for grasping their significance in neural network training, demonstrated through numerical approximation and analysis of slope changes.*
    19:11 *🛠️ Building data structures like the 'value' object is crucial for maintaining expressions in neural network training, laying the foundation for constructing more complex networks.*
    19:26 *🛠️ The Value class in micrograd is a simple wrapper for a single scalar value, allowing operations like addition and multiplication.*
    20:04 *➕ Python's special double underscore methods are used to define operators like addition for custom objects such as the Value class.*
    22:50 *📊 To maintain expression graphs in micrograd, each value object keeps track of its children, predecessors, and the operation that created it.*
    25:00 *📈 Graph visualization in micrograd helps to understand complex expression graphs, aiding in debugging and analysis.*
    29:40 *⏪ Backpropagation in micrograd involves computing gradients backward from the output to the input, enabling derivative calculations for optimization algorithms like gradient descent.*
    31:05 *🔄 The grad attribute in micrograd's Value class tracks the derivative of the output with respect to each value, facilitating gradient computation during backpropagation.*
    38:16 *🔢 Deriving gradients during backpropagation involves understanding how changes in intermediate values affect the final output, following the chain rule of calculus.*
    41:46 *📚 The chain rule in calculus is fundamental for differentiating through function compositions, expressing how derivatives are multiplied together correctly.*
    43:07 *🚗 The chain rule allows for calculating the rate of change of a composite function by multiplying the rates of change of its components, akin to a car's speed being a product of its individual speed changes.*
    45:33 *🔀 Plus nodes in neural networks effectively route gradients, distributing derivatives to all child nodes, as indicated by the chain rule.*
    47:28 *🔄 Backpropagation involves recursively applying the chain rule backward through the computation graph, multiplying local derivatives along the way.*
    51:14 *⏫ Adjusting inputs in the direction of the gradient during optimization can increase the output of a neural network, demonstrating the power of gradients in influencing outcomes.*
    53:10 *🧠 Understanding backpropagation through neurons lays the groundwork for building neural networks, utilizing mathematical models to propagate gradients and optimize network performance.*
    01:04:08 *🧠 Understanding the local derivative of the hyperbolic tangent function (tanh) is crucial in backpropagation, where it's expressed as 1 - tanh²(x).*
    01:04:45 *🔄 Backpropagating through a plus node involves distributing the gradient equally to both inputs, as the local derivative is 1.*
    01:06:13 *🔢 For a times node, the local derivative is the other term. Calculating gradients involves multiplying the local and global derivatives.*
    01:07:36 *📉 Understanding why certain gradients are zero is crucial; in this case, if the input doesn't influence the output, the gradient is zero.*
    01:10:11 *🔄 Defining backward functions for addition and multiplication operations in neural networks involves applying chain rule and accumulating gradients.*
    01:17:46 *📊 Implementing backpropagation involves a topological sort to ensure proper ordering of gradient calculations, especially in complex networks.*
    01:27:08 *🔢 Tanh function can be broken down into simpler expressions, aiding in understanding and implementation.*
    01:28:32 *➕✖️ Implementing addition and multiplication operations in a neural network library involves handling different data types and ensuring compatibility for arithmetic operations.*
    01:29:54 *➗ Understanding the concept of __rmul__ (right multiplication) in Python helps handle arithmetic operations in neural networks efficiently.*
    01:31:38 *📈 Implementing exponentiation and division operations in neural networks requires understanding their mathematical derivatives and chain rule for backpropagation.*
    01:35:57 *➖ Implementing subtraction in a neural network involves expressing it in terms of addition and negation for efficient computation.*
    01:38:43 *🧠 The design and implementation of neural network operations are flexible, allowing developers to choose the level of abstraction for efficient computation and backpropagation.*
    01:52:20 *🧠 Loss in neural networks measures the performance, aiming to minimize it; mean squared error is a common loss function.*
    01:57:30 *🔍 Gradients are crucial for adjusting weights in neural networks through techniques like gradient descent.*
    02:02:08 *🔄 In gradient descent, parameters are adjusted iteratively in the opposite direction of the gradient to minimize loss.*
    02:05:28 *📉 Gradient descent involves a cycle of forward pass, backward pass (backpropagation), and parameter updates to improve neural network predictions.*
    02:07:45 *⚖️ Finding the right learning rate in training neural networks is crucial; too low leads to slow convergence, while too high can cause instability and loss explosion.*
    02:10:23 *🐛 Forgetting to zero gradients before backward pass can lead to subtle bugs in neural network training, causing gradients to accumulate and affect optimization.*
    02:11:58 *🔄 Resetting gradients to zero before backward pass prevents accumulation and ensures accurate gradient updates during optimization.*
    02:13:12 *🧠 Training neural networks can be challenging due to potential bugs, but simple problems may mask issues; thorough testing and debugging are essential.*
    02:14:11 *🤖 Neural networks consist of mathematical expressions processing input data through layers to minimize loss via gradient descent, enabling them to learn complex patterns.*
    02:16:46 *🔧 While understanding neural network principles remains consistent, real-world implementations like PyTorch can be complex, with extensive codebases and nuances in functions like backward passes.*
    02:21:39 *📊 Exploring PyTorch's codebase reveals complexities in functions like backward passes, where implementation details vary based on hardware and data types.*
    Made with HARPA AI
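
    A compact sketch tying together the takeaways above: a micrograd-style Value class with backward passes for +, *, and tanh, plus the topological sort that backward() uses. This follows the design shown in the video but is an illustration, not the exact source:

        import math

        class Value:
            def __init__(self, data, _children=()):
                self.data = data
                self.grad = 0.0
                self._backward = lambda: None
                self._prev = set(_children)

            def __add__(self, other):
                other = other if isinstance(other, Value) else Value(other)
                out = Value(self.data + other.data, (self, other))
                def _backward():
                    self.grad += out.grad                # a plus node routes the gradient
                    other.grad += out.grad
                out._backward = _backward
                return out

            def __mul__(self, other):
                other = other if isinstance(other, Value) else Value(other)
                out = Value(self.data * other.data, (self, other))
                def _backward():
                    self.grad += other.data * out.grad   # local derivative is the other term
                    other.grad += self.data * out.grad
                out._backward = _backward
                return out

            def tanh(self):
                t = math.tanh(self.data)
                out = Value(t, (self,))
                def _backward():
                    self.grad += (1 - t**2) * out.grad   # d/dx tanh(x) = 1 - tanh(x)^2
                out._backward = _backward
                return out

            def backward(self):
                topo, visited = [], set()                # topological order of the graph
                def build(v):
                    if v not in visited:
                        visited.add(v)
                        for child in v._prev:
                            build(child)
                        topo.append(v)
                build(self)
                self.grad = 1.0                          # dL/dL = 1
                for v in reversed(topo):                 # chain rule, output back to inputs
                    v._backward()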

  • @yuyangzhu 9 months ago +16

    Andrej spent 10 hours making 1 hour of content, and that 1 hour of content is actually worth 10 hours, going through it multiple times.

  • @michaelgohome 3 months ago

    Such an amazing video. For too long I walked the earth only vaguely assuming how autograd works, but now, thanks to you, I'm so much smarter and things are so much clearer. Thank you Andrej!!