[Classic] Playing Atari with Deep Reinforcement Learning (Paper Explained)

  • Published 24 Dec 2024

COMMENTS • 55

  • @aa-xn5hc
    @aa-xn5hc 4 years ago +37

    Totally love your historical papers reviews

  • @mahermokhtar
    @mahermokhtar 2 years ago +4

    I literally watched 1000s of videos and couldn't fully understand DRL until I watched this video .. very impressive, detailed explanation .. thank you for it

  • @bikrammajhi3020
    @bikrammajhi3020 1 year ago +1

    I am loving it. Thank you so much. YOU DESERVE A MILLION SUBSCRIBERS. HOPE YOU GET THERE SOON.

  • @chris--tech
    @chris--tech 4 years ago +4

    Recently I've been learning RL painfully; I didn't understand what's happening in DQN until I watched your videos, thanks a lot.

  • @kumarsubham2078
    @kumarsubham2078 3 years ago +3

    Thanks for the historical papers series, Yannic. Great explanation of the contents, with plenty of citations of related work. It helps in understanding the evolution of DL. Hope to see more coming soon!

  • @sebastianrada4107
    @sebastianrada4107 9 months ago

    What a great video! Please keep doing this kind of content 😀

  • @zerorusher
    @zerorusher 1 year ago +1

    It's November 2023 and you hear the magic name everybody is talking about: 20:52

  • @DefiantElf
    @DefiantElf 3 years ago +4

    Thanks for the great explanation! Regarding sticky actions (29:05), I think those were proposed afterwards, in the paper "Revisiting the Arcade Learning Environment..." by Machado et al. in 2018, to add stochasticity to the Atari problem
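
    The sticky-actions mechanism described above can be sketched roughly as follows (an illustrative sketch, not code from the paper or the ALE; `env_step` and its signature are my own assumptions, while 0.25 is the stickiness value Machado et al. recommend):

```python
import random

def sticky_step(env_step, action, prev_action, stickiness=0.25):
    """With probability `stickiness`, the emulator repeats the previous
    action instead of the one the agent just chose, making the
    environment stochastic. `env_step` maps an action to an observation."""
    executed = prev_action if random.random() < stickiness else action
    return env_step(executed), executed
```

    With `stickiness=0` this reduces to the fully deterministic setting the original DQN paper was evaluated in.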

  • @coderboy4683
    @coderboy4683 3 years ago

    I came to understand the paper, and I realized a lot of things in RL that I used to find very difficult. Awesome explanation, sir. Thank you.

  • @genesisevolution4243
    @genesisevolution4243 2 years ago

    Damn! This was exactly what I wanted to learn!! Thank you so much...

  • @alexwhb122
    @alexwhb122 4 years ago

    Absolutely love your videos! Thank you for making these. I've learned a lot!

  • @heyrmi
    @heyrmi 4 years ago +11

    AlphaGo did to RL what AlexNet did to DL.
    David Silver got me interested in this field. Though I am a beginner, I too want to contribute to this field.
    Thanks for covering this.

    • @TheThirdLieberkind
      @TheThirdLieberkind 4 years ago +1

      I wouldn't entirely agree with this; in my opinion, AlphaGo presented very few novel ideas, but was able to package four clever networks together into something very practical, something reinforcement learning hadn't had before.
      AlphaZero, on the other hand, did have a couple of major novel ideas, though even then, debatably, its authors were not the inventors of those ideas.
      In my opinion, most of the Alpha projects, while more practically impressive than most research projects, did not invent their network architectures, but rather improved them and were able to unload a massive amount of compute on them.

    • @Rhannmah
      @Rhannmah 4 years ago

      @@TheThirdLieberkind Having the AI play against itself and learn from that was pretty novel, and definitely at the core of AlphaGo's success.

    • @danielguffey
      @danielguffey 4 years ago +1

      @@Rhannmah Wasn't RL founded with self-play in checkers?

    • @Rhannmah
      @Rhannmah 4 years ago

      @@danielguffey Was it? I thought it was trained on human play.

    • @danielguffey
      @danielguffey 4 years ago

      @@Rhannmah "The Samuel Checkers-playing Program was among the world's first successful self-learning programs"

  • @MrjbushM
    @MrjbushM 4 years ago

    Thanks, very useful for those of us learning deep learning!!!!!! I love the classic papers series

  • @marekdziubinski850
    @marekdziubinski850 1 year ago

    Nice joystick you’ve got there, Yannic 😂. But seriously, I enjoy your work - thank you for the contributions 😊

  • @snehalraj6898
    @snehalraj6898 4 years ago

    This was really awesome! Thanks

  • @RinkuYadav-pn4jo
    @RinkuYadav-pn4jo 1 year ago

    Yeahh....nice review..thankx

  • @dark808bb8
    @dark808bb8 4 years ago +1

    Great video! I just coded a dqn type neural net to play Othello. It has only fully connected layers with a 64 dim input vector and 64 dim output vector. I hope to do some experiments with it in the future.
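
    A network like the one described in this comment (fully connected, 64-dimensional board vector in, 64 move values out) might look roughly like this in NumPy; the hidden size, initialization, and board encoding are my own assumptions, not details from the comment:

```python
import numpy as np

# Minimal sketch of a fully connected Q-network for Othello:
# 64 board squares in, 64 move Q-values out.
rng = np.random.default_rng(0)

def init_net(hidden=128):
    return {
        "W1": rng.normal(0.0, 0.1, (64, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, 64)),
        "b2": np.zeros(64),
    }

def q_values(net, board):
    """board: length-64 vector of {-1, 0, +1}; returns 64 Q-values."""
    h = np.maximum(0.0, board @ net["W1"] + net["b1"])  # ReLU hidden layer
    return h @ net["W2"] + net["b2"]
```

    In play, the agent would mask the 64 outputs down to the legal moves before taking the argmax.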

  • @PyTechVision
    @PyTechVision 2 years ago

    Thanks for the great explanation.

  • @utku_yucel
    @utku_yucel 4 years ago +1

    Thanks!

  • @foodmart5122
    @foodmart5122 10 months ago +1

    What does he mean by LaTeX savagery around 2:30?

  • @CHINNOJISANTOSHKUMARNITAP
    @CHINNOJISANTOSHKUMARNITAP 1 year ago

    Thanks for the explanation. Can I expect a video on Rainbow DQN?

  • @michelprins
    @michelprins 1 year ago

    thx great video

  • @MMc9081
    @MMc9081 4 years ago +1

    @Yannic - Great video as always, and it really helped me get a grip on the basics of RL.
    Just wondering though, did you mean to have adverts throughout the video? Up to now I have only seen them at the beginning, and maybe the end too, I can't remember. But this video had one at the start and then three during. I appreciate you need to generate some income from these videos (and you deserve it), but having adverts during the video is very off-putting. Would you consider having several at the start instead (if possible)?

    • @YannicKilcher
      @YannicKilcher  4 years ago +1

      Thanks for the feedback. I turned them on in the middle during this video just to see the effect, but I agree they're annoying.

  • @davidromero1373
    @davidromero1373 1 year ago

    Which program do you use on your iPad to make those annotations outside the margins of the papers?

  • @ThinkTank255
    @ThinkTank255 1 year ago

    Does anyone know what he is talking about at 2:10? LaTeX savagery???

    • @TruMystery
      @TruMystery 1 year ago

      Did you understand??

    • @nikhilgv9
      @nikhilgv9 2 months ago +1

      Those two lines are well outside the margin of the page. I noticed it when I tried to crop the PDF.

  • @jesschil266
    @jesschil266 4 years ago

    Hi Yannic! Love your videos so much! But there was one thing I'm not clear about: is y_i equal to the Q function approximated at the (i-1)th step by the weights of a neural network? Best

    • @YannicKilcher
      @YannicKilcher  4 years ago

      It's the target value, so yes, the Q value to approximate
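
For reference, the target in the 2013 paper is y_i = r for terminal transitions and y_i = r + γ·max_a' Q(s', a'; θ_{i-1}) otherwise, with Q evaluated using the previous iteration's weights, matching the question above. A minimal sketch (the function name and signature are my own, not from the paper):

```python
import numpy as np

def dqn_target(reward, next_q_values, terminal, gamma=0.99):
    """Compute the Q-learning target y for one transition (s, a, r, s').
    `next_q_values` are Q(s', .) from the previous iteration's network;
    terminal states bootstrap nothing and the target is just the reward."""
    if terminal:
        return reward
    return reward + gamma * float(np.max(next_q_values))
```

The squared error (y - Q(s, a; θ_i))² is then minimized with respect to the current weights θ_i only; y is treated as a constant.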

  • @HappyDancerInPink
    @HappyDancerInPink 4 years ago +8

    What would you replace LaTeX with? Surely not Word?😂

    • @herp_derpingson
      @herp_derpingson 4 years ago +7

      Markdown with MathJax. Or just use Jupyter Notebooks with inline code.

    • @snippletrap
      @snippletrap 4 years ago +4

      @@herp_derpingson Exactly. Paperswithcode and distill.pub are already moving in this direction. There's no reason papers can't be interactive.

    • @SuperEmanuel98
      @SuperEmanuel98 4 years ago

      Surely there are alternatives, but the thing is that everyone knows LaTeX, so it is easy to collaborate and it is fast. Getting math formulas done quickly and looking good is easy. LaTeX has some quirks, but it is not hard to work around and fix those things. I would say that there are alternatives, but nothing comes close.

  • @billykotsos4642
    @billykotsos4642 4 years ago

    niceeeee

  • @mikhailkhlyzov6205
    @mikhailkhlyzov6205 4 years ago +3

    what happened in Pong? C'mon, David!

  • @iliasp4275
    @iliasp4275 3 years ago

    ai lob yiu

  • @JoaoVitor-mf8iq
    @JoaoVitor-mf8iq 4 years ago +2

    Savagery is OK if it doesn't decrease the quality of the research; formatting is so boring...

  • @sui-chan.wa.kyou.mo.chiisai
    @sui-chan.wa.kyou.mo.chiisai 4 years ago

    y13 really ooold paper

  • @lawchakra7813
    @lawchakra7813 4 years ago +1

    I can't share this gold mine of content with anyone. I don't know anybody who would be interested in all this.

    • @42nb
      @42nb 4 years ago +2

      But you can always find someone in this community later on, just stay interested :D
