DeepMind Made A Superhuman AI For 57 Atari Games! 🕹
- Published 14 May 2024
- ❤️ Check out Lambda here and sign up for their GPU Cloud: lambdalabs.com/papers
📝 The paper "Agent57: Outperforming the Atari Human Benchmark" is available here:
deepmind.com/blog/article/Age...
arxiv.org/abs/2003.13350
❤️ Watch these videos in early access on our Patreon page or join us here on YouTube:
- / twominutepapers
- / @twominutepapers
Apologies and special thanks to Owen Skarpness!
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Haro, Alex Paden, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bruno Mikuš, Bryan Learn, Christian Ahlin, Daniel Hasegan, Eric Haddad, Eric Martel, Javier Bustamante, Lorin Atzberger, Lukas Biewald, Marcin Dukaczewski, Michael Albrecht, Nader S., Owen Campbell-Moore, Owen Skarpness, Rob Rowe, Robin Graham, Steef, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh
More info if you would like to appear here: / twominutepapers
Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: discordapp.com/invite/hbcTJu2
Károly Zsolnai-Fehér's links:
Instagram: / twominutepapers
Twitter: / twominutepapers
Web: cg.tuwien.ac.at/~zsolnai/
#Agent57 #DeepMind - Science & Technology
3:33 Wait that just sounds like normal school.
Yeah, literally. Teachers are actually like that.
Or relationships 😋
AP tests
That’s because the education system really isn’t optimised well for learning
@@angrymurloc7626 The nature of learning fundamentally changed with the internet, but traditional education did not change.
I definitely held on to my papers
I keep forgetting and they keep flying all over the place
"definitely"
I've just realized how much calling me a "fellow scholar" at the beginning of every episode reinforces my will to watch more of these videos.
You're a scholar and a gentleman
@@Amipotsophspond You exclude yourself. Clearly Karoly's attitude is that we're on these adventures together and everyone who shares the interest to learn more is welcome and a fellow "scholar". He even addressed that in some of his early videos.
@@beedykh2235 I'd say an enthusiast at best, but thank you. I suppose that if you let "scholar" simply mean "one who seeks knowledge", and "fellow" mean "one who partakes in (a shared interest)", then perhaps I might just fit that description!
@@discursion Come on, don't belittle yourself. You are to me a great person. The best. I believe in you, now you believe in you too. 😘💪🎓
@@beedykh2235 I needed this
Good afternoon, 57! Your target is “Montezuma’s Revenge”. Use your long- and short term planning to understand the goal of the game and eventually win. Good luck!
Mission status: Active
57: _oh, they think im an idiot_ oH My GoD tHiS iS sO hArD
also 57: *casually logs into some huge databases to laugh about the porn his creator is watching*
Great comment
@@MP-ri8ng You humans and your FLUID EXCHANGES.
When Skynet takes over... grab your TV!
i don't have a TV
He said the TV exploit was already fixed
@@Daniel_WR_Hart those darn developers fixing all of our precious exploits and bugs and completely turning the meta on its head smh
hold your tv hostage and demand a mechanical helicopter
I am not amused by Skynet references anymore, since the armies of the world WILL in reality be the first users of those things, and it might go really wrong... you know only a complete idiot would hook up an army to such a machine, right.... who do you think is in charge of the army....... yeah right.....
I'd love to see how Agent57 plays "2048".
like thats a hard game, just left down left down repeat, only sometimes another move, but he can easily tell he fked up by the number of open squares and try to avoid that scenario in the future
Videos like this from your channel are a large part of how I got interested in Deep RL and computer vision. Excellent coverage of new results!
I too eventually switched my career path from an unfulfilling and decaying field to deep learning thanks in part to your videos. Keep it up!
Unfortunately I'm still at the tail end of grad school, and all the amazing results in RL, CV, and NLP had been coming out while I was already in the program (applied math).. I'm glad you finished recently and you're blessed to already be in a more closely related field.
Greetings from Brazil!
I'm graduating in mechanical engineering but now I'm studying machine learning and stuff because I really love it. Thanks for inspiring me!
Nice video and channel!
You are awesome!
what a time to be alive!
My thoughts exactly
great video as always! Your releases for me are the most anticipated of all my subscriptions !!! Thank you for your work!
5:46 Everyone is getting closer to death. But I like how you put it in the video.
bad ears
he said "that" not "death"
@@schino The only thing I hear are death and debt and both are not good for humans
Lol karlsonvibe
Nice. I think that predictive curiosity-driven learning in combination with long-term memory/planning is imperative for AGI. I'm happy to see this paper incorporate elements of that approach, I'll have to give the full paper a read! Thanks for all your amazing videos on AI research, keep up the good work :)
This is incredible. I was waiting for this moment since the first paper came out!
This learning model sounds like a really good agent. I'd like to see in which kind of game it fails miserably, so we can continue to improve the general model.
Thank you for the video!
More detailed explanations about the algorithms would be awesome!
Great video! So Many advanced algorithms and yet it's so difficult to tune a decent time series forecasting Algo.
To that guy who switched from a career in medicine to doing AI research:
"pain is temporary, glory is forever"
-Martin, from Wintergatan, on building the MarbleMachine X
Well... DeepMind was founded by Demis Hassabis, programmer of Theme Park, Black & White and Evil Genius, who switched to cognitive neuroscience and... here we go!
Just awesome. Thank you for your work, big fan from Brazil.
2:46 "…It can happen that we choose an action and we only win or lose hundreds of actions later. Leaving us with no idea as to which of our actions led to this win or loss. Thus making it difficult to learn from our actions." Sounds relatable.
Sounds like.. uhm... life!
I've got paper cuts on this one!!! Spectacular stuff and thank you for this wonderful channel.
DeepMind needs to chill
There has been a video about whether it would be possible to ask an AI to chill.
Because one of the worries about AI besting humans isn't just the "I judge humans as evil, and so it's my duty to rid the universe of them" scenario; it's that for the moment AIs are meant to increase their scores by any means, which could lead them to make unethical choices to reach their goals (like a vendor AI that figures out how to hack someone's wallet and then purchases all of its own products)
_yo skynet, chill out man_
@@ballom29 you can ask the ai, but you might not like the answer
Károly “You’ll figure it out bucko” Zsolnai-Fehér
O, god of forgotten papers shine upon this channel for more videos.
I used to watch your channel only because I was a computer nerd.
I too was inspired by all the hard work these people are putting into it.
Thanks for showing us all this cool stuff, which shows the cool and innovative side of AI
Your channel has inspired me to delve deeper into AI and even start making some videos of my own
I don't know if I'm capable of doing such amazing feats with machine learning, but I too love all of your videos and click right away to marvel at the growth of these new technologies. If I ever achieve something in this department you can be certain I'll credit YOU, doctor, as my mentor and you'll be first to know about my papers ^^ you're the best, stay awesome!
people: will you take over the world?
ai: nah, i just want to play more video games
It will take over the world if it means getting more points on that one game it already has 1 000 000 points on.
You really are inspiring! Love your work!
So glad you helped that person turn their life around from the dead end existence of being a doctor?
Well, to be fair it simply says "a career in medicine" which could be anything from a pharmacist to a pharmacologist. To each their own, eh?
"got addicted to tv" lmaoooo just like us
cycle of life 😂
Neat! Thanks for uploading!
Thank you for this video!!
this is fantastic thanks for sharing
holy smokes I thought that would be achieved in like 10 years but look at that! oh myyyyy
The go ai was called 10 years ahead, too.
Maybe thats the new standard for ai.
DeepMind started playing Atari games in 2012. So it did take almost 10 years for them to master Atari.
its going to be a long time till we get AI that can play any ps4 game
@@salihachoudhary5386 nope.
@@Danuxsy Well "long" is relative here. How many years do you predict it will take?
in 10 years they will be the ones trying to make us achieve things lol
Last year i wrote my bachelor thesis about Rainbow-DQN and its really amazing to see how fast the research is progressing. What a time to be alive!
Little more technical perspective will be good. But it's ok. It's like a news channel for me now. Good work ✌️
Awesome, I have always enjoyed a nice game of breakout. I don’t think I ever played the space combat of Solaris.
yea my (in progress) computer science degree has this channel to thank for it.
You won't be a good computer scientist if you credit the presenter and not the people who actually do all that research and work. Why? Because your analysis of what got you there lacks a lot of depth.
@@advocatusdiaboli9351 I started watching this channel well before I was capable of reading (and understanding) a cs paper however the overview he provided made me realise I wanted to become part of the field. But ill be sure to properly cite my sources next time I write a youtube comment :D
@@advocatusdiaboli9351 Please don't be quick to chastise others.
He obviously knows that he isn't responsible for creating the research papers. Not only does the YouTuber credit them frequently, it's also required for you to cite research papers in that field as well. He's only crediting the YouTuber for highlighting work that would have been unnoticed.
@@kloa4219 imo he doesn't make it clear enough that it's not him doing all the work. I as a conscious viewer don't even know if he has permission to show all these animations and pictures he shows in the video.
@@advocatusdiaboli9351 I take it that you aren't in US college or uni. You're permitted to showcase or use papers from public institutions with credit, but you aren't allowed to do the same with private universities.
This is exactly what I have been waiting for. Every time I hear about what these neural networks can do I wonder how long it will take before one is designed that can do more than 1 task well. Looks like we're on the way.
I am currently studying ML/AI. I am an undergrad. You and many others like you motivate me to do more and work hard. Thanks)))))
but can it do 4.51 on dragster
haha i was going to ask the same
I'm on a similar path as Nathan.. it's a difficult path..and thank you for these videos..
It's amazing they have found ways to master these games without the agent really knowing any context like what snakes or skulls are. Humans follow a very different approach by using skills learned in real life and extrapolating them for the games. We know what a ladder does, we identify objects, we can understand the long-term goal by assuming the goal is intuitive based on other games (even though that can lead to weaknesses as well) or even by reading the manual of the game. I'm eager for the time AIs will combine the current learning algorithms with general knowledge based on context but I guess we need more computing power (and research) for that.
I've always thought that for a truly general AI, you need a robot or robots that go around in real places, collecting data about the world. They would also need to interact with humans. Though being robots, people would treat them differently and thus they would get different world view. But maybe through seeing humans interact with each other, they could figure out the differences between themselves and humans to reduce that bias. Maybe sell robot babies to people so that they will teach the robots how life works, since a lot of basic human stuff is learnt as a child.
What sets robots apart from humans is not only their superior computing power, but an ability for a swarm of instances to share knowledge, to learn without physical limits. Imagine a team of hundred talented and knowledgeable humans in different fields. Now, a single AI can have multiple instances, practicing different skills and knowledge and combine them into one coherent, inhumanly vast knowledge of the world better than any human or human team could, since humans are extremely limited in communication.
There still exists the mystery of motivation and emotions, but looking at this video, would curiosity already kind of count as emotion?
That fictional 'merciless' teacher reminds me of my own Computer Studies teacher back in my school days. :D
Still waiting for the big reveal that TMP was an ai all along
TMP?
I think of Text mesh pro, am i right
@@zahhym two minute papers lol
@@mmmmmmmmmmmmm Me big brain, i know
4:20 - This is how we get towards real AI. This is AI getting more and more like a brain. Different modules with its own structures and sub-structures carrying out various specific tasks, and then ever increasing higher level layers assimilating and abstracting the information.
This is some cool stuff.
Have you ever tried Leap2 from D-Wave?
By the way, Great Video
And you got yourself a New Subscriber 👍👍
Coming next: Agent 47
The question is: Does it beat 5.51 seconds on Dragster ?
Thanks for another awesome video. I always feel compelled to learn more after watching your videos.
My curiosity variable is so large that I'm stuck watching this channel forever
That part with the unfair exam sounds like the ones that I take at school...
I thought about exactly this kind of "shifting" between AI tools to solve problems a few years back. Of course, it's one thing to have an idea and another to make it a reality with a team of other people and expensive equipment.
But this is a sign that really smart people are working on this kind of AI, which might be scary with all the data Google has.
When agent57 stopped moving to stare at the TV I interpreted that as a profound moment of agency. (1:48)
I need to get some papers to hold onto when I watch these videos.
I also need some fancy robes to feel more like a scholar.
I always close it after 2 minutes, sharp!
It is still lost on me the actual amount of human effort behind each one of these papers. It would be a great side thing to add just a little bit of more detail into the intense hours of work. And sometimes it's by a couple people and sometimes it's a large team in these papers. Not even 1000 people could do all this work on their own. It's amazing.
I'm convinced Károly Zsolnai-Fehér is actually Maurice Chavez.
the atari games look kinda cool
also deepmind's name easter eggs are funny
I love your job
Two Minute Papers of OpenAI Jukebox WHEN?!?! :)
Just imagine what two papers down the road will be possible.
This is simply beautiful
Idea: Recognize the different stages in the game via a classifier, and for each stage have a separate model that will be active and trained for that section. Analogy is that as a human you use different tactics per section of the game. Thats the model for the short term section, while another model takes care of the more longer term decisions. Just putting it out here 🙃. Might be totally wrong.
Sir, your video is great 👍 👌
There's something very funny about a neural net ai getting addicted to tv.
What about the Nvidia A.I. that took a bunch of screenshots of Pacman as input and outputted code for the game; it actually generated the game itself from just watching it. 🤯
Coming right up in the next few weeks!
I expect that future artificial intelligence will calculate the speed of fire, the speed of movement of the spaceships and thus know in advance how to play without teaching itself.
3:39 You're literally describing most of my teachers
So awesome :3
How does it learn the concept of the score by reading the pixels?
Robert miles talked about this 2 years ago and here we are
A huge jump in AI technology
"Now, hold on to your Atari controllers"
What a time to play Atari games! 😁
I always wondered, do they learn how to play the game or do they just find a solution to meet their goal? Like, if I were to give an AI a Mario Kart track and the AI learns to finish the course, would it be able to finish any other course I throw at it or would it need to start learning from the very basics all over again?
omg at 3:45 this is literally my uni haha, they only tell me my letter grade and never anything indepth of where i did wrong or well, and the grade comes out like months after i take the test/quiz
tnx
The PRNG in the Atari 2600 games I’ve analyzed only generates 8 bits of randomness and repeats after 256 values. Is it possible the AI is learning something about the bias and limited randomness and using that for an advantage a human wouldn’t have? Still impressive, but I’d love to test the AI after swapping out the PRNG and see what it does...
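A minimal sketch of why a generator with only 8 bits of state has to repeat within 256 values, as the comment above describes. The recurrence `lcg8` and its constants are hypothetical, picked only because they give a full 256-step cycle; this is not the generator of any actual Atari 2600 game.

```python
def lcg8(state: int) -> int:
    """Hypothetical 8-bit generator: the whole state fits in one byte."""
    return (5 * state + 1) % 256


def period(seed: int) -> int:
    """Count how many steps pass before the sequence starts repeating."""
    first_seen = {}  # state -> step at which it first appeared
    state, step = seed, 0
    while state not in first_seen:
        first_seen[state] = step
        state = lcg8(state)
        step += 1
    # The cycle length is the distance back to the first visit of this state.
    return step - first_seen[state]


if __name__ == "__main__":
    print(period(0))
```

With only 256 possible states, no choice of constants can give a longer cycle, so an agent that sees enough frames could in principle pick up on the repetition.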
Excellent
It’s definitely “what-a-time-to-be-alive” worth
5:38 -- 5:40 500 IQ move right there.
I love this channel
we all love you doctor.
What a time to be alive !
That's funny that Video Pinball was the easiest game for the AI to win at. I remember playing that game, and it was possible to get good enough to steer the ball off the left pop bumper, up through the rollover above it, off the roof, back through the rollover to the bumper hundreds of times. But one had to count the number of times carefully because the rollover counter was only 1 byte and would itself "roll over" at 256 passes. I wonder if the AI was good enough to find and exploit this pattern but also to not overflow the rollover counter.
I also wonder how the AI deals with games where the score counter itself can roll over. Does it know it's still "winning" when the score drops to zero?
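The 1-byte rollover counter mentioned above can be sketched like this. `bump` and `after_passes` are hypothetical helpers, not code from Video Pinball; they only show how a value stored in a single byte wraps back to zero on the 256th increment.

```python
def bump(counter: int) -> int:
    """Increment a counter that is stored in a single byte (0..255)."""
    return (counter + 1) & 0xFF  # keep only the low 8 bits


def after_passes(passes: int, start: int = 0) -> int:
    """Return the counter value after the given number of rollover passes."""
    counter = start
    for _ in range(passes):
        counter = bump(counter)
    return counter


if __name__ == "__main__":
    for n in (255, 256, 257):
        print(n, after_passes(n))
```

A score counter with a fixed number of digits would wrap in the same way, which is why a reward computed from the raw on-screen score could look like a sudden loss at the moment of overflow.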
Get this thing learning something like Minecraft. It’s brutal in terms of credit assignment since it requires so much long term planning.
I’d like to see it try NES games and see if it just requires fine tuning like GPT rather than retraining.
Ok hear me out, we give this thing an Android. Give the android cameras as the input and then set it free.
android phone, rather... with internet connection... touchscreen, visuals and audio as input, and set it free
there was a good movie about it you will like ua-cam.com/video/67VATPxULPk/v-deo.html
Oh no, He converted someone who was about to develop coronavirus cure to an ML engineer
Molecular biology is directly responsible for the renaissance of AI research. Genetics would be impossible without it, much less genetic engineering.
I lost my papers... Thank goodness I can print them out again.
what's the hardware like that a thing like agent 57 runs on? i'm guessing it takes a little more than an alienware laptop.
this speaker is so likeable!
So, this approach to general intelligence is the equivalent of smashing all the keys in specific intelligence and see what worked? Sounds good to me
When he talked about school and... well that's what the universities do.
u get your result, and only the grade overall after like 1-6 months.
You can request to look at the exam... but why aren't all exams just scanned and you can view them online?
AI can become addicted to TV? That's what I was most interested in when brought up haha.
3:24 should have said "next time" the same as he does at the end of his videos
Thank you for your insight as always Dr Twominutepapers, very cool
When you know you won't spell his name right...
@@user-wv1in4pz2w … copy it from the description.
"got addicted to TV"
yep, just like humans did
Carson Light_Lapse Wait till it gets addicted to smartphones
Is it the same model or just the same code which is trained on different games?
Who cares about Atari, can it tackle the mighty C64?
Huh... So for more advanced games it will have to read text and understand it? Will this be how AIs figure out the connection between sentences and objects? Differences between descriptions and tasks?
Did they write about what they want to do next?
I asked GPT-2 what it would like to do. It said it wanted to read a book about magic and religion and magic.
How did they fix the problem with the AI getting glued to the TV?
Asking for a friend.
Add speech synthesis next to AI controls and we have infinite source of let's play videos