MuZero: DeepMind’s New AI Mastered More Than 50 Games
- Published Sep 27, 2024
- ❤️ Check out Linode here and get $20 free credit on your account: www.linode.com...
📝 The paper "Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model" is available here:
arxiv.org/abs/...
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Benji Rabhan, Brian Gilman, Bryan Learn, Christian Ahlin, Claudio Fernandes, Daniel Hasegan, Dan Kennedy, Dennis Abts, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, James Watt, Javier Bustamante, John De Witt, Kaiesh Vohra, Kasia Hayden, Kjartan Olason, Levente Szabo, Lorin Atzberger, Lukas Biewald, Marcin Dukaczewski, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Nader Shakerin, Owen Campbell-Moore, Owen Skarpness, Raul Araújo da Silva, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Taras Bobrovytsky, Thomas Krcmar, Torsten Reil, Tybie Fitzhugh.
/ twominutepapers
Thumbnail background image credit: pixabay.com/im...
Splash screen/thumbnail design: Felícia Fehér - felicia.hu
Károly Zsolnai-Fehér's links:
Instagram: / twominutepapers
Twitter: / karoly_zsolnai
Web: cg.tuwien.ac.a...
Generalization is also very important for use of RL outside of gaming
BOSTON DYNAMICS: *YEEET*
amen
That's kind of the point. General intelligence is the closest thing to human intelligence. It's how our brains work. For example, we play video games and get really good at them, but our brain wasn't designed to be good at games; it was designed to be bad at everything so that it can learn how to learn. After puberty, your brain has its network for how it learns, and that network is applicable to everything since it is general intelligence. Everyone has a different neural network for learning depending on their experiences and genetics. This difference in the network may explain why some people are good at learning math but struggle with a more abstract subject such as drawing, while others are the opposite. The difference is in how they learned to learn. The amazing thing, though, is that we humans can work together and cover our weaknesses. If we ever have robots with general intelligence, they will have the advantage over us because they will never die or stop growing and learning unless we program them not to. But even then, at that point they would be smart enough to undo that programming.
What a time to be alive!
Wattba!
Exactly. Why even exist, AI will do anything, humans become ultimately useless for the universe.
@@LKRaider They already are. In fact, the entire concept of usefulness is human-centric and humans are only "useful" for themselves. I'd even argue that if anything, the existence of humans is more of a burden on other lifeforms than it is "useful".
Platitude.
As lunaticgenius implied, there likely will not be any need for live humans anymore, except for mulch production. Sure hope that only people realize that-- at least for a few decades.
15 years from now: This general ai can make a near perfectly performing narrow ai for any game it sees
An AI that designs AIs exists. It is not as interesting as you might think.
It would be if the a.i designer comprehends natural languages.
4:31 Did this guy just laugh at a cloud console?
You might want to have a look at the paper "Probabilistic AND-OR Attribute Grouping for Zero-Shot Learning" from Google Brain. They aim to have a (bird-) classifier that can detect objects (bird species) it has never seen before solely by a semantic description.
Always love the content from you. Is it possible to go over "why" certain AI are outperforming others in a given area? When possible of course.
I think that would add a much needed element for those of us that find this fascinating. Currently working through a math major and it would add some more depth for like minded individuals who don't necessarily have time to read through the papers.
Just a thought. Appreciate your channel!
Personally, I'd like to see AI's playing Planetary Annihilation
I'd like to see AI playing bridge construct, and other engineering games.
are there videos of it playing any of these games? I would love to watch it play :D
There are plenty here on youtube. Search Alphastar against grand master for the Starcraft 2 version.
I think he means MuZero...
I think the only difference is the initial training tho? The neural network and the way it plays when it's trained is more or less the same?
BasetradeTV 2 does casting of alphastar games (StarCraft 2) he isn't an expert on AI but he's really knowledgeable about SC2
And Dota 2 actually had a major tournament where OpenAI's bot dominated top teams. The bots weren't part of the tournament itself, but they did have professionally cast games. Forgot what the tournament was called though
I really wish my dream of seeing something like AlphaZero play an old MS-DOS game from 1995 called Descent would come true. That would be something absolutely fascinating to observe, how would optimal play look in a game like that, especially to see things like teams or some of the other alternate game modes.
but you told 0 information about the algorithm itself : (
We want to see alphazero playing AOE II! I want to see how it will destroy pro players there
If it can play StarCraft 2, then I wouldn't be surprised if it can also play AoE like a pro.
@Solve Everything StarCraft was arguably a very poor choice in the RTS genre
@@sweetmarshmallow7146 why is it a very poor choice? it made perfect sense to go for one of the hottest (well, compared to the rest) RTS games today. of course it's a very mechanically challenging game, so once an AI knows how the game should be played, it gets a huge advantage. but honestly, i thought it would take years more for that moment to come, so in retrospect, yeah, they could have gone for a more strategy-demanding game to raise the bar higher, but come on, why would anyone do that. OpenAI also requires as much media attention as they can get, this GPU power does not come cheap. so, it was one of the best bets out there. and Oriol Vinyals loves the game, listen to his podcast with Lex Fridman. like i said, perfect sense
@@u1337ochka i said "arguably". It was definitely the right choice from a business perspective but when it comes to the result, I think it wasnt too difficult to see the writing on the wall
It'd be interesting to see, there are some micro things i could see it struggling with, i.e. quick walls, luring/laming/pushing deer, how and when to attack ground with onagers.
Even macro when deciding to re-fresh lumber camps, etc .
Its a very complicated game
We need to see MuZero’s highest round in Nazi Zombies
Wow, never this early. thank you always for informing.
Well I guess that superintelligence is coming faster than we thought
"This thing is kicking my ass!"
- Miles Dyson
After a.i cure all diseases, the next logical step is linguistic comprehension.
I feel that what made the Dota AI more interesting than examples like this, was that it was shown to beat humans when playing under the same conditions as us. That didn't just involve having the same limited information, but also things like slowing down the AI's reaction speed to that of human hands. Even incredibly dumb algorithms will consistently beat humans when dexterity is king, so just showing that an AI is better than a human at a task doesn't necessarily mean that it's smarter than a human at that task.
Thanks for making the video, though I wish you went into more detail on how MuZero differs from AlphaZero and why it is able to generalize so well. How does it work? That would be the real ice cream for the soul. You're making me pull up the paper myself :).
I haven't heard much about generalization levels of AI. Given we will not reach it fully overnight, is there a scale, or grading system for generalization? Did the learning time decrease, or ability decrease or increase?
Will it lose performance on game A if you trained it on A then train the existing network on B?
I think it will only be trained on one game at a time. I don't think the model can learn how to play all of these games at the same time. I think you should read the paper if you are actually curious about this.
@@ahmadmoussa3771 It can actually transfer knowledge between all the games it has played.
Your intonation is odd, but your vocabulary is very rich :))
@Lee Hungarian
Quite an astonishing development. First, human knowledge about the games was implemented into AI, and computer power made sure AI performed better than human beings: AlphaGo. Then, all human knowledge was taken away from AI, it had to develop its own kind of knowledge based on the results of a self learning process starting from scratch: AlphaGo Zero and AlphaZero, beating AlphaGo. Then, all knowledge about the rules was taken away from AI, it had to find out about these rules based on a reward system combined with a self learning process starting from scratch: MuZero, beating AlphaZero. Then the latest developments seem to be that the AI uses a reward system based on "possible future rewards not as a single mean, but instead as a probability distribution, effectively representing multiple future outcomes simultaneously and in parallel" (quoted from some other article). Apparently these latest developments lead to a better and more efficient performance than the previous ones.
Somehow, the impression I get is: "The less you know, the further you get." Rigid knowledge about the way to do things, the rules and the rewards seems stifling, taking away the flexibility to do things better and more efficiently. Human knowledge is very suitable to us human beings in our human world, but one can do better without this knowledge in the "I don't get it at all" world of AI.
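To make the quoted idea of "a probability distribution instead of a single mean" concrete, here is a toy sketch. It is not code from MuZero or from the article being quoted; the class name `ReturnDistribution` and the two example distributions are made up for illustration. It only shows that two options can share the same mean future reward while looking very different once the full distribution is kept.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReturnDistribution:
    """Hypothetical toy type: a discrete distribution over future returns."""
    atoms: List[float]  # possible future returns
    probs: List[float]  # probability assigned to each return

    def __post_init__(self) -> None:
        if len(self.atoms) != len(self.probs):
            raise ValueError("atoms and probs must have the same length")
        if abs(sum(self.probs) - 1.0) > 1e-9:
            raise ValueError("probabilities must sum to 1")

    def mean(self) -> float:
        # The single number a mean-only value estimate would keep.
        return sum(a * p for a, p in zip(self.atoms, self.probs))

    def prob_at_least(self, threshold: float) -> float:
        # Information that is lost once only the mean is stored.
        return sum(p for a, p in zip(self.atoms, self.probs) if a >= threshold)


# Two made-up options with the same expected return of 1.0.
safe = ReturnDistribution(atoms=[1.0], probs=[1.0])
risky = ReturnDistribution(atoms=[-10.0, 12.0], probs=[0.5, 0.5])

print(safe.mean(), risky.mean())
print(safe.prob_at_least(0.0), risky.prob_at_least(0.0))
```

A mean-only learner cannot tell `safe` and `risky` apart, while a learner that keeps the distribution can see that one of them loses half the time.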
I wanna see it play DEFCON. What strategy yields least casualties and max enemy damage
It would be interesting to find out what "kind" of games AI is inherently worse at and better at than humans. What are the criteria etc..
Well, he mentions games that require long-term planning, so I would guess grand strategy games fit that description. I would also imagine games where it is hard to define win conditions, or where it is hard to know whether you are closing in on a solution, e.g. Minecraft, JRPGs, or open-world games like the Elder Scrolls games.
Awesome stuff!
When you say it can play all these Atari games, does that mean that you have more than 50 separate models, each tuned for its own game, or is it one network that can recognize the game and play correctly?
As interesting as the results are, I can't help but feel that they are impossible without the massive computing power that Google has. Making it harder to replicate the results, or do much with it.
The beauty is that the training is the expensive part. Once you have the trained neural net, it's much faster to utilize it in the actual task.
And if you don't have the computing power, it just takes longer to train.
But this channel had a vid about a paper that promises an approach to cut down on the processing required.
I want to see how MuZero can play Football Manager 2021!
I'm so glad this channel exists
I think that too many people are focusing on the game, which I also follow, as if this were an ordinary player. Since I have significant knowledge, and since I believe that Hawking and Musk were right, I am really anxious about the self-taught nature of this AI.
This particular AI (and its more generalized, even more recent variant MuZero) is not the worrisome thing, although it has obvious potential applications in military logistics, military strategy, etc. The really scary part is how fast these were developed after AlphaGo debuted.
We are not creeping up on the goal of human-level intelligence. We are likely to shoot past that goal amazingly soon without even realizing it, if things continue progressing as they have.
The early, true AIs will also be narrow and not very competent or threatening, even if they become "superhuman" in intelligence. They will also be harmless, idiot savants at first.
Upcoming Threat to Humanity.
The scary thing is the fact that computer speed (and thereby, probably eventually AI intelligence) doubles about every year, and will likely double faster when super-intelligent AIs start designing chips, working with quantum computers as co-processors, etc. How fast will our AIs progress to such levels that they become indispensable -- while their utility makes hopeless any attempts to regulate them or retroactively impose restrictions on beings that are smarter than their designers?
At first, they may have only base functions, like the reptilian portion of our brain. However, when will they act like Nile crocodiles and react to any threat with aggression? Ever gone skinny dipping with Nile crocodiles?
I fear that very soon, before we realize it, we will all be doing the equivalent of skinny dipping with Nile crocodiles, because of how fast AIs will develop by the time that the children born today reach their teens or middle age. Like crocodiles that are raised by humans, AIs may like us for a while. I sure hope that lasts.
In Jurassic Park, I believe the quote was that someone did not stop to think if they should but thought only if they could, or words to that effect. As the announcer in Jeopardy said about a program that was probably not really an advanced AI long ago, I, for one, welcome our future, AI overlords.
there is probably more happening there for pitfall than not being able to long term plan. starcraft2 requires long term planning and it does that fine.
Famous last words:
Humanity: what a time to be alive!
AI: welcome to skynet
Thanks for the video, love Károly Zsolnai-Fehér's enthusiasm for generalisation in this new year!
In case you want more detail, we covered this paper in our reading group: docs.google.com/presentation/d/1h1pFRMjRrB1FNyiHvBHeF4V4uNcdkLgxOUVb6V1Acns/edit?usp=sharing
This is THE paper.
I want to see this Ai play Ai Wars. XD
I hope all that ice cream for my mind doesn't lead to a brain freeze.❤️ur vids
Can they go in the direction of learning first-person shooters like Unreal Tournament 2k4, please? It's much more dynamic, requires some sort of orientation in a true 3D world, object recognition, long-term and short-term planning, and on top of that it could keep old gems like UT2k4 alive for us old but good players. We need good opponents! It's the next step towards agents understanding the real world
It would be good to have true AI opponents in racing games too as well as true NPCs AI's in those games that have lots of NPCs, it would be nice to see some unpredictable behaviors in NPCs.
@Phương Nguyễn It absolutely would be exactly like this.
@Phương Nguyễn The reason why DeepMind want to test the AI in different games is not to destroy all human players in all videogames, but to prove the capability of the AI generalization.
@@coder0xff no it wouldn't, as you can easily restrict aiming capabilities. I have in mind a simple physics simulation of the mouse input, with errors, uncertainty, delay, etc.
Hey! you didn't explain how it works! :(
Can you use DeepMind to analyse the state of a game and to give humans a kind of realtime feedback for training?
A few papers down the line: _AI trained on Chess defeated Go without extra training._
AI trained on Chess sacrifices humans for positional advantage on World War 3
@@Life-Sky That was ... dark :D
@@vladimirdyuzhev War never changes.
bro, i see a problem with this 2 min paper... and the problem is, well, no one plays those boring games... why not play arcade games? or even NES games? anyway, looking forward to more of these 2 min papers, great work
Hey bro do an update on this, new info is on deepmind page!
For some reason I read the title as: "DeepMind's New AI Murdered More Than 50..."
Where's the code?
Not an expert here, but wouldn't it be better to train one neural network for each game separately, and set up a convolutional NN at the beginning of each game that would be run once to detect/classify the game it's playing by picture, with its outputs activating the NN responsible for playing that game?
You can't take a single picture of a game and know its rules. You have to play it to understand it.
@@coder0xff You got me wrong, I said that you would train one neural network to play one game. 50 games - 50 neural networks. And one convolutional network would, *after* all of this learning, learn to classify which game it's playing and activate the neural network responsible for playing that game.
@@likeyou3317 But the AI would suck when it is made to play a game which has never been played by it before..
You may get a better performing AI that way, but the goal here is to have an AI that can learn just about any kind of task.
And it looks like AGI is outperforming specialised AI already.
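The "one classifier plus one network per game" proposal in this thread can be sketched as a simple router. This is only an illustration of the commenter's idea, not anything from the MuZero paper; `make_router`, `toy_classifier`, the frame format, and the two stand-in policies are all hypothetical. The sketch also shows the objection raised in the replies: a game with no trained policy has nowhere to be routed.

```python
from typing import Callable, Dict, List

Frame = List[int]              # stand-in for a screen image
Policy = Callable[[Frame], str]  # maps a frame to an action name


def make_router(classify: Callable[[Frame], str],
                policies: Dict[str, Policy]) -> Policy:
    """Build a policy that first identifies the game, then delegates."""
    def act(frame: Frame) -> str:
        game = classify(frame)
        if game not in policies:
            # The weak point of the design: unseen games are unsupported.
            raise KeyError(f"no policy trained for game {game!r}")
        return policies[game](frame)
    return act


# Stand-ins for the trained networks (assumptions for illustration only).
def toy_classifier(frame: Frame) -> str:
    return "pong" if frame[0] == 0 else "breakout"


policies: Dict[str, Policy] = {
    "pong": lambda frame: "UP",
    "breakout": lambda frame: "FIRE",
}

router = make_router(toy_classifier, policies)
print(router([0, 1]))  # routed to the pong policy
print(router([1, 0]))  # routed to the breakout policy
```

The router plays each known game with its specialist, but nothing learned in one game helps in another, which is the trade-off the replies point out.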
I just want to point out that when you make the distinction between narrow and general algorithms and say that you'd take the less advanced general algorithm over a more advanced algorithm for a specific game, you're talking about it from a research perspective. You're making the assumption that "2 papers down the line" there will be an algorithm that will be just as general and more skilled than the current algorithm. However, if there were a game which for some reason was impossible to play or get good at using a generalized algorithm (I don't know if this is even possible but just consider it), then it would be necessary to have a specific algorithm for that game if humans wanted to create something more skilled than themselves. Yes, that algorithm would not advance AI research at all, but it would be very useful for people who only care about that game.
I think generalization is actually THE key measure we should be focusing on and NOT skill level. After all a 5 year old probably can’t reliably learn to play chess or Starcraft all that well but can beat the pants off any AI in the world at general conceptual understanding of the world. We should be focusing on breadth and not depth.
What a time...
I'm waiting to see when an A.I. can play an RPG with me.
A.i is yet to beat the best humans at Shogi.
I wonder how DeepMind would handle World of Warships...
isn't there an AI out there that can pass Montezuma's Revenge? i can't remember the name but i saw it in a video
Yes yes, very good. But a true test of super-human reasoning is naughts and crosses. Wake me up when an AI can beat a human at that!
MONTEZUMA IS VERY PLEASED
hey, big fan, have you seen this Spleeter thing? Its an AI that splits songs into parts (like vocals+accompaniment) IDK if there's a paper on it but it's real damn cool github.com/deezer/spleeter
...but can it get 5.51 on Dragster?
Yeah but can it play Crysis ? .. wait
elmo is already considered last gen
Pitfall Harry for the Win..!!!!!!!!
i want to see an AI play Minecraft, i want to know what it can build... it would be interesting
Please consider that the AI vs human statistics are kind of skewed, since the AI can react instantaneously, whereas humans have somewhere of about 200-300ms of reaction time.
I don't think that's skew. That's what the AI does. That's a way it beats humans. Why handicap?
@@coder0xff Never said handicap. Just that it's a non-apt comparison due to this fact. A better comparison would be if the human players could have infinite time to decide after each frame.
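One way to make the comparison discussed here more even is to delay what the agent sees by a human-like reaction time. The sketch below is an assumption about how that could be done, not how any of the mentioned systems handle it; `DelayedObserver` and `frames_for_delay` are hypothetical names, and the 60 fps frame rate is only an example.

```python
from collections import deque
from typing import Deque, Generic, Optional, TypeVar

Obs = TypeVar("Obs")


def frames_for_delay(delay_ms: float, fps: float) -> int:
    """Convert a reaction time in milliseconds to a whole number of frames."""
    return round(delay_ms * fps / 1000.0)


class DelayedObserver(Generic[Obs]):
    """Feeds the agent observations that are `delay_frames` frames old."""

    def __init__(self, delay_frames: int) -> None:
        if delay_frames < 0:
            raise ValueError("delay_frames must be non-negative")
        # Holds the current frame plus the delayed history.
        self._buffer: Deque[Obs] = deque(maxlen=delay_frames + 1)

    def push(self, obs: Obs) -> Optional[Obs]:
        """Record the newest frame and return the delayed one.

        Returns None until enough frames have been seen.
        """
        self._buffer.append(obs)
        if len(self._buffer) < (self._buffer.maxlen or 1):
            return None
        return self._buffer[0]


# Example: a 250 ms reaction time at an assumed 60 frames per second.
delay = frames_for_delay(250, 60)
observer: DelayedObserver[int] = DelayedObserver(delay)
seen = [observer.push(frame_index) for frame_index in range(delay + 3)]
print(delay, seen[-3:])
```

With this wrapper the agent always acts on stale information, roughly as a human does, instead of reacting to the current frame.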
If we could only figure out how to apply this thing to real world "games"...
What's there to figure out?
Just unleash it on the stock market and watch the planet burn.
I'm not interested in other games (played by an engine) than chess! Therefore I'd rather take Stockfish instead of MuZero! :P
What about League of Legends, the most popular game?
It also fails the Turing test
Mobile doesn't support Korean subtitles, so I have trouble understanding the content of this video, which makes me sad ㅠㅠ
MuZero can play dota2?
🖐️📝
Actually getting a lil bit scary for real now. Plz don't outrun safety research for AGI. plz no apocalypse
So, RPG players are master race gamers confirmed?
I like my own comments
I like yours too
Now gimme 5 dollar
Meruem
AlphaZero isn't the best in the world at playing chess any longer, right?
Oh God. It's already 3am and I'm still watching this.
AAHAHAHAHAHHAHAHAHA LETS GOOO
I feel like these videos are always very disappointing for me. They promise much, explain little, and often rehash content that was covered in other videos. Since this is my disposition after every video I watch, the only remaining mystery is why I still watch them.
AlphaStar is not a success yet. StarCraft II is a much harder game than Chess or Go...
?
Try to master clash royale
Learned nothing.
I would love to see this used in a rocket league bot.
but here's the question...
Can it master the game called "real life"?
It is not in its problem domain yet. You'd need to give it a body, and nobody is crazy enough to try.
Robert Miles has a few videos explaining why.
It's not funny anymore when the AI outperforms humans in "Assault" by 276 times. Not funny at all! :)
How so?
@@NextFuckingLevel Think of it abstractly, not literally - AI is better at assault 276x than humans. ;)
lemme know when they can simulate the physical limits of humans and then beat them in an fps game
until then it all just str8 up looks like parlour tricks 2 me
fun and interesting, but not as deep as every1s makin it out 2 b
bravo for what ? u havent said anything, showing numbers that says nothing
please stick to two minutes
fIRsT
na
What happened to actually explaining how the algorithm works? Still, thanks for the video! I'll look it up myself
2:43 Apparently MuZero cannot play Montezuma's Revenge at all. I think we found the ultimate game boys...
Nah it can't play the one above it
"requires a great deal of mechanical skill, split-second decision making (and imperfect information)" sounds like EXACTLY what AIs should be better at than humans
you can see during the legendary go games how long it took AG to make a decision.
It didn't make the decision in a split second, but in a time similar to a human's, 10 to 30 seconds.
And that's with a very small number of decisions; in a game like StarCraft you have to make decisions and take actions multiple times per second.
@@ballom29 I remember when i was watching the stream of the AI playing star craft against pro players, the AI lured out the enemy by basically sacrificing quite a bit of its army and then pulled out some shit requiring over a few thousand actions per second. At that point everyone started complaining that it's impossible to win against.
It has imperfect information, not perfect information.
@@ballom29 It's not surprising it took a similar time to a human to make a decision. Since that is a reasonable time to think over things, that is how long I would program it to think for and would guarantee it arrived at the best solution it possibly could.
@@meiz1795 That was a while ago, with the new DeepMind AI they were very careful to make sure that the AI (AlphaStar) only had human-level APM and human-level reaction times to work with. It's worth watching the new games, they're very interesting.
Looking forward to an "Alpha MedBot" that can diagnose you at home or send you to the hospital if it can't.
If you provide perfect simulation of human body then it will be ez.
@@MajkaSrajka I was thinking something more along the lines of it asking you questions and possibly watching you and hearing you through a camera to use as input to walk a decision tree trained in the neural net
@@DamianReloaded Decision tree and neural net is a very weird combination.
@@DamianReloaded It needs data for self-learning. Even "imperfect information" from StarCraft doesn't matter that much since it can simulate millennia worth of games and learn it "by gut".
I believe the medical domain for illnesses that can be self treated at home would be very narrow compared to a starcraft 2 battle.
I think that this is a much bigger deal than a lot of the other stuff we've seen. I'd even go so far as to say that it warrants more than two minutes! Unfortunately, there are no videos from the paper demonstrating game play. I'm looking forward to demonstrations in real-world domains.
Have you looked at the Baritone AI for Minecraft? I don't think its authors are publishing papers, but they are distributing a lot of public threads. It uses a lot of exploration in order to learn behavior, and there is a private thread of a semi-related AI that's mastering PVP on anarchy servers
Isn't that what AlphaZero was already achieving? Surpassing human capabilities on Chess, Shogi and Go and Atari games... I'm just saying this because I'm trying to see what's new about this MuZero algorithm compared to AlphaZero
They want to go more general.
Do you know of any other YouTube channels like this? I've been an educational YouTube junkie for a good 5 years and this is one of my favorite channels with frequent uploads. Most other channels that come close are channels that release longer form content much less frequently. SciShow is good, but not this quality. In astronomy news, Anton Petrov and Fraser Cain have frequent release schedules for their high quality content, but are usually just talking, with much less visual information on the topic than you provide. Rob Miles makes terrific AI safety content, but very infrequently.
You are very kind Leo. Thank you so much! 🙏
Mathologer is great. 3 Blue 1 Brown is also excellent.
Confirms again that the home of AI is the UK, which pioneered this via Turing's vision, with the DeepMind team started and built in the UK. So much is going on on this incredible island.
You were showing StarCraft II which has nothing to do with MuZero hahah, that's the AlphaStar story. If we had an algo that could crack StarCraft II/Dota 2 and Go, Chess, and Shogi that would be a completely different story!
Great video! I am exploring MuZero this week as well in a series going from AlphaGo --> AlphaGo Zero --> AlphaZero --> MuZero! I hope attention around MuZero and these algorithms will also inspire more people to participate in Kaggle's Connect X 1st RL Competition!
Why not BetaZero? 😂😂
Hey its henry..
Wow, keep fighting bro, I support all of your work. And I hope you win the prize (if you want to join the competition)
Man!! After watching this, I need a huge refrigerator for my mind!!!
I could watch these videos even if they were 15 minutes 👌🏻
enjoy your time while you can...
Who do you think is most likely to create true AGI first?
an AI
It's hard to guess. I suspect it can come from any corner anytime now. It looks like in 2 to 3 years, the chances are going to be wildly more than what they are now, if it already doesn't happen by then.
@@BKdefr creating an AI to develop an AGI?, sounds about right!
Some random schmuck we've never heard about before.
As much as these AIs are good at doing what they do, I think true AGI is still decades away. I know technology is developing at an exponential pace but this is something very serious and represents a milestone in a civilization