Deep RL is my favorite subfield of AI, because it asks some fundamental questions about what it takes to build safe and intelligent robots that operate in the real world. So many open problems and interesting challenges to solve!
we love you Lex!
It's my favorite one, too. Thanks for the lecture, I really enjoyed watching it.
With these lectures and interviews you are sharing and creating immense value: knowledge. Thank you!
Thank you Lex, awesome presentation!
Love your spirit in sharing the latest information. Much appreciated!
I wish you still did videos like this, we appreciate you sharing such knowledge.
Deep RL is the field that excites me the most. Thank you Lex.
I really like that tongue in cheek chuckle when Lex talked about that multiverse and whoever created it.....
Thank you for bringing these lectures to us.
Since 2017, Lex has improved his lessons spectacularly! Now (2019), I watch a more fluid video with the feeling that this guy knows exactly what he's talking about, without hesitating. Once again, thanks, Lex, for sharing these videos. Congratulations and thanks from Brazil.
Seriously the best Deep RL lecture out there to date.
"Every type of machine learning is supervised learning", cannot agree more!!!
In fact, learning itself is a supervised process, otherwise it is acquiring not learning.
1:04:40 Best part, that grin after he just casually dropped that line in an MIT lecture.. All of infinite universes being Simulations
Every Lecture has a historical context, evolution, mathematics and inspiration, Technical overview, Network Architecture overview. Well Summarized!!
Certainly, one of the best videos on deep learning I have come across.
Professor Lex, can we get the entirety of 6.S091 on MIT OCW? This is an incredibly interesting topic that I've been working on (Evolutionary Computing), and I am currently enrolled in a project with thorough knowledge of Deep RL as a requisite. This research field has very few online resources besides Stanford's CS 234 and Berkeley's CS 285.
Your explanations are immensely helpful and intuitive. Humanity will present its gratitude if this whole course is made available! AGI and AI safety issues need more attention before they become the greatest immediate existential risk; your courses can help raise general AI awareness and advance our civilization to higher dimensions. Loved the fact that you grinned while just casually mentioning the Simulation Hypothesis..
Haha he says "that is super exciting", without being excited! He is a robot!
Thanks for the open lectures
Loved the lecture. Definitely recommend his podcast. Quality.
I honestly don't care about AlphaGo or Dota 2 or the robots, I just cannot get over how incredible the thought structure is behind this. What I mean by thought structure is the strategy behind how to quantify the right things, ask the right questions, and model the policy upon which growth can be created. IT IS SICK
I have tried to study and understand Deep RL using several books and lectures over the last few years, but I only feel like I understood something in RL after listening to this lecture. Thanks, Lex. I am grateful to you for posting this lecture on YouTube. Thank you!
As always, Lex, a generous share, which will be a useful resource for loads of folks. Thanks.
I've seen a lot of these videos & read some of the books in ML; Lex has a clarity that's rare.
Thanks Lex!! Deep Reinforcement Learning opens up a new world... Life is not that complex, like the baby in your video taking his first steps... unsupervised learning. Take into account the amount of time and effort (brains + USD) it takes to get an AI to do what the baby is doing: WALK in a few days, and in the years to come be a professor and continue with this subject.
The baby is the moral of the story... what we are doing is not working... we need a radical way of thinking... Your radical way is the way forward.
I appreciate the philosophical insights sprinkled throughout the lecture!
Lex Fridman, I just love your videos. I am your great fan sir. Carry on.
Important detail when trying to transfer from a simulation to the real world: give the simulation many random variations in its behavior/mechanics during runtime (such as drag, gravity, friction, size of the agent, random perturbations, etc.). This forces the agent to generalize more and not over-optimize on the details of the sim, which makes it easier to transfer the agent's capabilities to a real-world environment.
gravity is fake buddy ;)
@user-sc8ph2ds2m Take a brick, stand still, throw it straight up, then you can observe if gravity exists, or not. Very simple experiment to administer.
@vast634 you will experience buoyancy 🤦
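To make the domain-randomization tip from the top of this thread concrete, here is a minimal Python sketch. It assumes a hypothetical simulator whose physics can be reset at the start of every episode; `SimParams`, `sample_sim_params`, `perturb`, and all the numeric ranges are illustrative names and values, not something taken from the lecture or from any real library.

```python
import random
from dataclasses import dataclass


@dataclass
class SimParams:
    """Hypothetical physics settings for one training episode."""
    gravity: float
    friction: float
    drag: float
    agent_scale: float


def sample_sim_params(rng: random.Random) -> SimParams:
    """Draw one randomized set of physics parameters (ranges are made up)."""
    return SimParams(
        gravity=rng.uniform(8.8, 10.8),
        friction=rng.uniform(0.5, 1.5),
        drag=rng.uniform(0.0, 0.2),
        agent_scale=rng.uniform(0.8, 1.2),
    )


def perturb(force: float, rng: random.Random, sigma: float = 0.05) -> float:
    """Add a small Gaussian disturbance to a force applied by the agent."""
    return force + rng.gauss(0.0, sigma)


# Resample the physics at the start of every episode, so the agent never
# sees exactly the same simulator twice.
rng = random.Random(0)
episode_params = [sample_sim_params(rng) for _ in range(5)]
for episode, params in enumerate(episode_params):
    print(episode, params)
```

In a real training loop the sampled parameters would be handed to whatever simulator is in use before each reset; how that is done depends entirely on the simulator.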
I like his lecture because it's more understandable. And I also like his tones.
Back again to enjoy the lecture.
This is interesting for me as this is my favorite Mathematics n Statistics combined Algebraic equations 🎉 MATHEMATICS is the Best Subject in World 🌎 👌 ❤🎉🎉
I like how you quoted many theorems from Dostoevsky and also a few axioms from Nietzsche's texts
I had never seen Lex's lecture videos other than the philosophical podcasts. This is my first. And I just wrote the above comment as a joke without seeing the video and three minutes in, I found Socrates, Kant, Nietzsche... 😂😂 That's very Lex👌
One of the best lectures, I have ever watched ....Keep it up.
🐸🐸🐸🐸🐸
Loved this lecture. I learned a lot. Thank you.
Since I began my journey in data science, machine learning, and AI, I have been seeing patterns. Am I the only one who sees that we are probably just programs seeking a never-ending end to this simulation? The way Q-learning is designed is the most realistic comparison to human thought, so in order to maximize my output, do I have to reconsider my reward mechanism? (Taking some info from Huberman also.)
27:24 - There is no "reload" in Doom =)
It would be appreciated if anyone could give specific, detailed steps for getting all the directions on the map from 18:51 to 21:32.
Lex, I heard that DL professionals are now using simulations with nature-based environments to teach AI agents, for example having an agent learn how to walk or run by itself.
Yoshua Bengio said the next evolution will be based on simulation environments for AI.
Would you have any ideas or information to share on that?
I really, really appreciate all your work and your spirit. Everyone in the world with an interest in AI really appreciates your work and sharing. Thank you! :)
Do you mean something like Gym-based simulations? MuJoCo is based on physical laws, and you can already train in it with RL methods.
Hi Lex, thanks for this great lecture! Which books of Nietzsche did you have on your mind around 4:33?
@50:06 DQN can't learn stochastic policies. DQN has a softmax output on actions... isn't that a stochastic policy in itself?
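On the question above: in the standard DQN setup, the network outputs one Q-value per action (a linear output, not a softmax), and the policy acts greedily on those values, with epsilon-greedy exploration during training. A softmax over Q-values (Boltzmann exploration) is a separate design choice, not part of the learned policy itself. The sketch below is only an illustration of that difference; the function names and the example Q-values are made up.

```python
import math
import random


def greedy(q_values):
    """Deterministic policy: pick the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def boltzmann_probs(q_values, temperature=1.0):
    """Softmax over Q-values: an optional exploration scheme, not standard DQN."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


q = [1.0, 2.5, 0.3]  # made-up Q-values for three actions
rng = random.Random(0)
print(greedy(q))                    # index of the largest Q-value
print(epsilon_greedy(q, 0.1, rng))  # usually the greedy action
print(boltzmann_probs(q))           # a probability distribution over actions
```

The randomness in epsilon-greedy is fixed exploration noise added on top of the argmax; the network is not trained to output action probabilities, which is the sense in which a value-based method does not learn a stochastic policy.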
I have been waiting for this.
Another detail I have noticed in many presentations ... those agents are not trying to model the environment ... that is semantically impossible ... what they are trying to do instead, I believe, is to model AN INSTANCE OF A DUAL SPACE associated with the environmental space. It is very common to use linear regressions, for instance ...
Kevvy Kim hmmm ... we are saying the same thing ... it seems that practitioners and lecturers keep it short, perhaps without realizing that a big conceptual gap is being created.
very helpful, thanks a lot dear prof.
This guy is a better Siraj Raval.
"when the reward flips, the optimal path is grad school, taking as long as possible and never reaching the destination....pffff" lol 21:20
Very good explanation of RL, thanks to the speaker!
Love the Digital Physics reference at 1:04:00 :)
Which Nietzsche book is he recommending at 4:12 ?
Love this. Thanks Lex.
30:30 "you know, MIT does better than Stanford that kind of thing" xD
very good lecture! Thanks.
thank you so much for the lecture!
Amazing work. Excellent lecture
Thank you for this awesome video
Amazing lecture on DRL, can you also show us how we can implement the Q-function in a neural network?
Thanks for the last slide.
This guy should do podcasting.
Much better than the Stanford University lecture, where the lady basically only reads the equations without giving any real intuition for what's going on.
Great lecture!
THANK YOU MIT
Great lecture! Thank you.
Excellent talk.
Really great lecture, learned a lot
Great overview of DRL
I took the engineering plasma class taught by your dad at Drexel :D
Remove the human factor. Have the traffic be free of human crossing
Thank you for this amazing video. I want to know if I can use DRL principles to improve QoS in vehicular networks?
Any suggestions?
very good intro
You are my idol lex
Is the entire deep RL course available?
Wonderful lecture.
where are the rest of the lectures?
check out - "Spinning up with Deep RL by openai"
Brilliant!!
The funniest part is where he was trying to explain the abilities of the human brain through evolution at 6:33! And he literally said, "it is somehow being encoded", which contradicts the rewards concept he is introducing!
Son, the most logical reason for having a predefined encoding scheme that has never been trained is the existence of a creator!
Can we take ideas or clues from biomimicry architecture in reinforcement learning?
that DeepRL animation looks like something out of Bamzooki
great lecture
very good!!
Are we really morally equipped to build AI that is safe, and to build it for the right reasons?
This is my problem with contemporary science/technology... We don't focus on whether we SHOULD do something. Just because it's doable doesn't mean it should be made.
Awesome
Nice.
Why is no one asking any questions? Or was that portion edited out? Nice lecture
27:07 lex is scared of Doom
Super
How/why can you even upload this for free? Doesn't university cost loads in the US?
Great stuff though!
I guess that's because MIT is focusing more on workshops/hands-on learning, AND to raise the bar for other universities/institutes... hhh
But if a diploma is your goal, it is sometimes helpful to sit in on a class before you take it for credit. It can make it easier, but sometimes it just makes it boring and counterproductive the second time around.
Sitting in doesn't get you credits or a diploma.
Nice video...
Thank you very much
Isn't that the guy from Joe Rogan's podcast? It takes me a week to grasp what he says in 5 minutes.
oh shit i listen to your podcast lmao
Couldn't always follow. Was distracted by the two cats and then later by the fool who fell in the water. 🙂
My naive perception is that every frame of "video" entering into each of our eyes and every second of sensory data we receive from birth constitutes a rather large data set for our brains to train on (although having the possibility to constantly train and update the network)
Yes, but my perception is that the brain is already a trained model, trained on the data from all our ancestors. At the time of birth, we have only the trained model with all the necessary weights, without the dataset it was trained on (our ancestors' lives), and it can be retrained on the go, based on our experiences. : )
I think this may be mostly true for other mammals: the less intelligent, the more hard-wired. When it comes to humans, I'm not so sure how much we rely on genetic wiring vs. neural plasticity, aka training. Not sure if any ethical experiments can bring any insight.
very good
just create a large number of random simulations. if you're successful in a large number of other realities then this one should be easy. o_o
nice tie
I am having a hard ... very hard time believing that the brain uses back propagation as a learning mechanism ... it just makes no sense in a space-time governed universe ... god damn good lecture ... by the way ...
Lex doesn't say the brain uses it (he says it's a mystery). And more generally most cognitive neurologists don't believe it does - although some think there are similar biological correlates. But it's a very efficient algorithm for ANNs to perform gradient descent.
As far as I know, biological brains don't use back propagation. But there are neural circuits where the flow of information goes in the opposite direction. There is also the chemical side of things, integrating many levels of homeostasis from hunger to pain to emotion.
I would say the combination of those two is the mysterious correlate of back propagation, back propagation being the obviously oversimplified version.
@petevenuti7355 got it ...
nice
Nice yo
the robots won't be happy to see that we kicked them in our labs instead of being friendly :/
34:12
2:15
hugest*
Lex is super low on cortisol and super high on GABA. So much so he sounds quite sleepy sometimes.
Could bet he is a fan of Jordan Peterson.
What makes you think so?
Insert comment
There are a lot fewer viewers here than on the first clip