Totally love your historical papers reviews
I literally watched 1000s of videos and couldn't fully understand deep RL until I watched this one. Very impressive, detailed explanation. Thank you for it.
Same here!
I am loving it. Thank you so much. YOU DESERVE A MILLION SUBSCRIBERS. HOPE YOU GET THERE SOON.
I've recently been learning RL, painfully, and I didn't understand what was happening in DQN until I watched your videos. Thanks a lot.
Thanks for the historical papers series, Yannic. Great explanation of the contents, with plenty of citations of related happenings. It helps in understanding the evolution of DL. Hope to see more coming soon!
What a great video! Please keep doing this kind of content 😀
It's November 2023 and you hear the magic name everybody is talking about: 20:52
Thanks for the great explanation! Regarding sticky actions (29:05), I think those were proposed later, in the paper "Revisiting the Arcade Learning Environment..." by Machado et al. in 2018, to add stochasticity to the Atari problem.
I came here to understand the paper, but I ended up realising a lot of things in RL that I used to find very difficult. Awesome explanation, sir. Thank you.
Damn! This was exactly what I wanted to learn! Thank you so much...
Absolutely love your videos! Thank you for making these. I've learned a lot!
AlphaGo did to RL what Alex-net did to DL.
David Silver got me interested in this field. Though I am only a beginner, I too want to contribute to this field.
Thanks for covering this.
I wouldn't entirely agree with this. In my opinion, AlphaGo presented very few novel ideas, but it was able to package four clever networks together into something very practical, something reinforcement learning hadn't had before.
AlphaZero, on the other hand, did have a couple of major novel ideas, though even then the authors were arguably not the inventors of those ideas.
In my opinion, most of the Alpha projects, while more practically impressive than most research projects, did not invent the network architectures, but rather improved them and were able to unload a massive amount of compute on them.
@@TheThirdLieberkind Having the AI play against itself and learn from that was pretty novel, and definitely at the core of the success of AlphaGo.
@@Rhannmah Wasn't RL founded with self-play in checkers?
@@danielguffey Was it? I thought it was trained on human play.
@@Rhannmah "The Samuel Checkers-playing Program was among the world's first successful self-learning programs"
Thanks, very useful for those of us learning deep learning! I love the classic papers series.
Nice joystick you’ve got there, Yannic 😂. But seriously, I enjoy your work - thank you for the contributions 😊
This was really awesome! Thanks
Yeahh... nice review... thanks!
Great video! I just coded a DQN-style neural net to play Othello. It has only fully connected layers, with a 64-dim input vector and a 64-dim output vector. I hope to do some experiments with it in the future.
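For anyone curious what such a network might look like, here is a minimal NumPy sketch, assuming one hidden ReLU layer (the comment only fixes the 64-dim input and output; the hidden size, board encoding, and masking scheme below are my own guesses):

```python
import numpy as np

class TinyOthelloDQN:
    """Sketch of a fully connected Q-network for an 8x8 Othello board.

    Input: 64-dim board vector (+1 own piece, -1 opponent, 0 empty).
    Output: 64 Q-values, one per board square.
    The hidden size is an assumption, not from the original comment.
    """

    def __init__(self, hidden=128, seed=0):
        rng = np.random.default_rng(seed)
        # He-style initialization for the ReLU layer.
        self.w1 = rng.normal(0, np.sqrt(2 / 64), (64, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0, np.sqrt(2 / hidden), (hidden, 64))
        self.b2 = np.zeros(64)

    def q_values(self, board):
        h = np.maximum(0, board @ self.w1 + self.b1)  # ReLU hidden layer
        return h @ self.w2 + self.b2                  # one Q-value per square

    def act(self, board, legal_mask):
        # Greedy action: best Q-value among the legal moves only.
        q = np.where(legal_mask, self.q_values(board), -np.inf)
        return int(np.argmax(q))

net = TinyOthelloDQN()
board = np.zeros(64)
board[[27, 36]] = 1.0   # own pieces on the opening diagonal
board[[28, 35]] = -1.0  # opponent pieces
legal = np.zeros(64, dtype=bool)
legal[[19, 26, 37, 44]] = True  # the four legal opening moves
move = net.act(board, legal)
```

Masking illegal moves before the argmax is a common trick for board games, since the raw 64-dim output head scores every square regardless of legality.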
Thanks for great explanation.
Thanks!
What does he mean by LaTeX savagery around 2:30?
Thanks for the explanation. Can I expect a video on Rainbow DQN?
thx great video
@Yannic - Great video as always and really helped me get a grip on the basics of RL.
Just wondering though, did you mean to have adverts throughout the video? Until now I had only seen them at the beginning, and maybe at the end too, I can't remember. But this video had one at the start and then three during. I appreciate that you need to generate some income from these videos (and you deserve it), but having the adverts during the video is very off-putting. Would you consider having several at the start instead (if possible)?
Thanks for the feedback. I turned them on in the middle during this video just to see the effect, but I agree they're annoying.
Which program do you use on your iPad to make those annotations outside the margins of the papers?
Does anyone know what he is talking about at 2:10? LaTeX savagery???
Did you understand?
Those two lines are well outside the margin of the page. I noticed it when I tried to crop the PDF.
Hi Yannic! Love your videos so much! But there was one thing I am not clear about: is y_i equal to the Q function approximated at the (i-1)th iteration, i.e. using the previous weights of the neural network? Best
It's the target value, so yes, the Q value to approximate
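To make that concrete: in the DQN paper the target is y_i = r + γ · max over a' of Q(s', a'; θ_{i-1}) for non-terminal transitions (just r at terminal states), i.e. it bootstraps from the Q-network's previous-iteration weights. A tiny NumPy sketch with made-up numbers:

```python
import numpy as np

def td_target(reward, next_q_values, done, gamma=0.99):
    """DQN target y_i = r + gamma * max_a' Q(s', a'; theta_{i-1}).

    `next_q_values` are the next state's Q-values under the *previous*
    (frozen) network parameters; at terminal states the bootstrap
    term is dropped and the target is just the reward.
    """
    if done:
        return reward
    return reward + gamma * np.max(next_q_values)

# Toy example (numbers are illustrative, not from the paper):
q_next = np.array([0.5, 2.0, -1.0])  # Q(s', a'; theta_{i-1}) for 3 actions
y = td_target(reward=1.0, next_q_values=q_next, done=False)
# y = 1.0 + 0.99 * 2.0 = 2.98
```

The current network's prediction Q(s, a; θ_i) is then regressed toward this y_i, which is why the target looks like "the Q function from the previous weights".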
What would you replace LaTeX with? Surely not Word?😂
Markdown with MathJax. Or just use Jupyter Notebooks with inline code.
@@herp_derpingson Exactly. Paperswithcode and distill.pub are already moving in this direction. There's no reason papers can't be interactive.
Surely there are alternatives, but the thing is that everyone knows LaTeX, so it is easy to collaborate and it is fast. Getting math formulas done quickly and looking good is easy. LaTeX has some quirks, but it is not hard to work around and fix those things. I would say there are alternatives, but nothing comes close.
niceeeee
what happened in Pong? C'mon, David!
ai lob yiu
Savagery is OK if it doesn't decrease the quality of the research; formatting is so boring...
2013? Really old paper.
I can't share this gold mine of content with anyone. I don't know anybody who would be interested in all this.
But you can always find someone in this community later on, just stay interested :D