Reinforcement Learning from scratch

  • Published 21 May 2024
  • How does Reinforcement Learning work? A short cartoon that intuitively explains this amazing machine learning approach, and how it was used in AlphaGo and ChatGPT.
    Part 1 of 3.
    0:00 - intro
    0:13 - pong
    0:28 - the policy
    0:51 - policy as neural network
    1:32 - supervised learning
    2:51 - reinforcement learning using policy gradient
    4:24 - minimizing error using gradient descent
    4:45 - probabilistic policy
    5:01 - pong from pixels
    6:58 - visualizing learned weights
    8:18 - pointer to Karpathy "pong from pixels" blogpost

COMMENTS • 38

  • @darthvader4899
    @darthvader4899 1 month ago +5

    This video is super underrated. In fact, the whole channel is underrated.

  • @themathguy3149
    @themathguy3149 7 months ago +3

    Your channel IS SO GREAT, I share it with all my eng friends so you get more visibility!

  • @metaljacket8102
    @metaljacket8102 1 month ago +2

    This is really awesome! It's the best video that explains DRL in such an easy-to-understand way!

  • @tushargupta1999
    @tushargupta1999 2 months ago +2

    This video is amazing. You explained everything in such a simple manner. I am feeling really motivated to learn more about reinforcement learning and neural networks after watching this.

  • @themax2go
    @themax2go 2 months ago +1

    agi: 1. ai develops understanding of win-loss conditions and sets policy params (inputs & actions) accordingly. 2. ai creates (= designs & builds) training env(s). 3. ai iterates, avals & adjusts policy parameters accordingly 4. done (or validation run(s) w/ human(s))

  • @ashketchum1244
    @ashketchum1244 9 months ago +4

    I don't know how I stumbled upon this video but that was very interesting and intuitive to understand. Thank you.

  • @a.aspden
    @a.aspden 8 months ago +2

    Your videos are great. Looking forward to more!

  • @marcinstrzesak346
    @marcinstrzesak346 7 months ago +1

    Great video, very helpful, easy to understand.

  • @gmjammin4367
    @gmjammin4367 9 months ago +1

    Amazing video as always :)!

  • @moldo800
    @moldo800 4 months ago +1

    Excellent. Congratulations ❤

  • @mado.madeleine
    @mado.madeleine 9 months ago +1

    Super helpful! Thank you 🙏🏽

  • @CptDoge-rn3ou
    @CptDoge-rn3ou 6 months ago +1

    I really like the way you visualize what you are talking about. Thank you for putting in the effort!

  • @cloudysh
    @cloudysh 1 month ago +1

    This was so surprisingly great :3

  • @luiseduardocraizer7416
    @luiseduardocraizer7416 1 day ago

    Excellent content!

  • @jameslibby5215
    @jameslibby5215 8 months ago +5

    Very very underrated channel

  • @mohajeramir
    @mohajeramir 1 month ago +1

    Excellent

  • @nikbivation
    @nikbivation 9 months ago +1

    thank you for this!

  • @ireoluwaTH
    @ireoluwaTH 9 months ago +1

    Thank you!!!

  • @BlueBirdgg
    @BlueBirdgg 8 months ago +1

    Can you playlist each one of your topics plz?
    I wanted to post on Twitter(X) your video topics but could only post a single video at a time.
    Great content by the way. Ty very much.
    Your perspective on some topics helped me a lot to get a more intuitive understanding.

    • @g5min
      @g5min 8 months ago

      Good idea! Here's one on generative AI:
      ua-cam.com/play/PLWfDJ5nla8UoR8P7AGqVw7ZPjXajUFLMo.html
      Here's one on reinforcement learning:
      ua-cam.com/play/PLWfDJ5nla8UoexEaLqVMw7q3Ft0vRYscL.html
      Here's one on LLMs + text-to-image:
      ua-cam.com/play/PLWfDJ5nla8UoG2mvvHs_OS0asAKC5HJeu.html

    • @BlueBirdgg
      @BlueBirdgg 8 months ago

      @g5min Ty!

  • @edvinbeqari7551
    @edvinbeqari7551 4 months ago

    What is your reward function for the pong game? I did a similar pong game and I couldn't get it to learn.

  • @solveigberling1662
    @solveigberling1662 2 months ago +1

    That was dope

  • @kniv0gaffel
    @kniv0gaffel 6 months ago +1

    Brilliant

  • @bombur9007
    @bombur9007 1 month ago

    How many layers should such a network have?

  • @mineq4967
    @mineq4967 1 month ago

    But by what number do you change the weights? You never told us.

  • @axe863
    @axe863 6 months ago +2

    Simple Reinforcement learning is extremely dangerous in certain nonstationary environments 😅

  • @nischalyou
    @nischalyou 8 months ago

    What's the name of this video game?

  • @maxim_ml
    @maxim_ml 10 days ago

    that was good

  • @FRANKONATOR123
    @FRANKONATOR123 8 months ago

    Can you share the source code for this project?

    • @g5min
      @g5min 8 months ago

      You can follow the link to the Karpathy site at the end of the video, repeated here:
      karpathy.github.io/2016/05/31/rl/

  • @herikaniugu
    @herikaniugu 7 months ago

    Imagine using reinforcement learning in quantitative finance 😊

  • @macratak
    @macratak 9 months ago

    ah yes, reinforcement learning. a fundamental computer graphics technology

    • @g5min
      @g5min 9 months ago +5

      I think that character/game-AI is pretty central to graphics

    • @pw7225
      @pw7225 9 months ago +1

      Why so negative?

    • @revimfadli4666
      @revimfadli4666 9 months ago

      @g5min especially AI image generation or processing nowadays