AI Learns to Play Tag (and breaks the game)
- Published on Jun 4, 2024
- AI vs AI Playing Game of Tag!
brilliant.org/AIWarehouse/
If you want to learn more about AI and deep reinforcement learning (how Albert is trained), there are amazing courses teaching those exact concepts on Brilliant! You can use my link to get a free 30 day trial with 20% off! I've personally gone through the course "Introduction to Neural Networks", and it's one of the best courses on Neural Networks I've ever seen. They're paying us to promote them, but they're genuinely a great service, I've had a Brilliant account for over 5 years and can't recommend it enough :)
In this video two AI Warehouse agents named Albert and Kai learn to play Tag against each other. The AIs were trained using Deep Reinforcement Learning, a method of Machine Learning which involves rewarding the agent for doing something correctly, and punishing it for doing anything incorrectly. Albert and Kai's actions are controlled by Neural Networks that are updated after each attempt in order to try to give Albert and Kai more rewards and fewer punishments over time. Check the pinned comment for more information on how the AI was trained!
Current Subscribers: 347,546 - Entertainment
More information about how Albert and Kai were trained:
Time it took to train:
Room 1: 12h 30m (though I stopped the recording after Albert broke the game)
Room 2: 13h 40m
Room 3: 1d 20h 2m
Final Battle: 6h 48m (this wasn’t shown but was needed since the agents weren’t used to seeing other teammates)
We continue training on top of the previous brains, meaning by the end of the video Albert and Kai both have trained for 3 days and 5 hours
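As a quick check of the total above, the four listed training times can simply be added up. This is only an arithmetic sketch using the figures from this comment; the variable names are made up for illustration.

```python
from datetime import timedelta

# Per-room training times exactly as listed in the comment.
room_times = {
    "Room 1": timedelta(hours=12, minutes=30),
    "Room 2": timedelta(hours=13, minutes=40),
    "Room 3": timedelta(days=1, hours=20, minutes=2),
    "Final Battle": timedelta(hours=6, minutes=48),
}

# Training continues on top of the previous brains, so the times add up.
total = sum(room_times.values(), timedelta())

days = total.days
hours = total.seconds // 3600
minutes = (total.seconds % 3600) // 60
print(f"{days}d {hours}h {minutes}m")  # prints "3d 5h 0m"
```

The sum comes to 77 hours, which matches the "3 days and 5 hours" stated above.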
Thank you so much for watching! These short videos take literally hundreds of hours to make, so if you want to help us make them faster, please consider becoming a channel member! By becoming a member, your name can be in future videos, you can see behind-the-scenes things that don’t fit in the regular videos, you can also use stickers of Albert, Kai and some other characters our team made in comments (more coming) :D
NOTES
When I mention it took x days to train, that’s in game time, and much larger than the displays indicate since there are 200 copies training simultaneously.
This is a very long comment going over more of the details of how Albert and Kai work, issues they’ve had, unexpected results etc.
THE BASICS:
Albert and Kai were trained using reinforcement learning, meaning they were rewarded for doing things correctly and punished for doing them incorrectly (the reward is just increasing their score, and the punishment is decreasing it). After they finish each attempt, the actions they took are analyzed and the weights in their neural networks (brains) are adjusted using an algorithm called MA-POCA to try to prioritize the actions that led to the most reward. The agents start off making essentially random decisions until Kai accidentally tags Albert in the first room and is rewarded. Then, as mentioned above, the weights in his neural network brain are adjusted in order to try to replicate that reward (it wasn’t this simple for this video since we use self-play to train both agents at the same time, more on that later). This leads to Kai learning that tagging Albert is good, and since Albert is punished when he’s tagged, it also leads to Albert learning that getting tagged by Kai isn’t good. This process continues through tens of millions of steps until one of the agents consistently loses, or the agents are able to counter each other well enough that it’s a draw.
REWARD FUNCTION:
Albert and Kai are given two types of rewards, group rewards and individual rewards. When Albert gets tagged he’s punished by getting a -1 group reward and Kai is rewarded by getting a +1 group reward and vice versa, encouraging Kai to tag Albert, and Albert to avoid being tagged by Kai. Additionally, Albert is given an individual reward of 0.001 for each frame he’s alive (0.6 total in a room lasting 10s), and Kai -0.001, to encourage Kai to try to tag Albert as quickly as possible. When we introduce the grabbable cubes we also give Albert an individual reward of +1 the first time he picks up the cube to make sure Albert actually starts using the cube (since without this, the rewards were too infrequent for Albert to learn to use it effectively).
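The reward scheme described above can be sketched roughly as follows. This is not the project's actual code: the function and field names are hypothetical, the 60 frames per second is an assumption inferred from "0.6 total in a room lasting 10s", and whether the survival reward is also paid on the frame of the tag is a guess. The "vice versa" case is left out because the comment doesn't spell out when it triggers.

```python
from dataclasses import dataclass

TAG_GROUP_REWARD = 1.0    # from the comment: +1 for Kai, -1 for Albert on a tag
SURVIVAL_REWARD = 0.001   # from the comment: per frame, + for Albert, - for Kai
CUBE_PICKUP_BONUS = 1.0   # from the comment: once, on Albert's first cube pickup
ASSUMED_FPS = 60          # assumption: 0.6 over 10 s implies 600 frames


@dataclass
class FrameRewards:
    albert_group: float = 0.0
    kai_group: float = 0.0
    albert_individual: float = 0.0
    kai_individual: float = 0.0


def frame_rewards(albert_tagged: bool, first_cube_pickup: bool) -> FrameRewards:
    """Rewards handed out for a single frame (hypothetical helper)."""
    r = FrameRewards()
    if albert_tagged:
        # Group rewards: the tag ends the attempt in this sketch.
        r.albert_group -= TAG_GROUP_REWARD
        r.kai_group += TAG_GROUP_REWARD
        return r
    # Albert is still alive this frame; Kai is nudged to hurry up.
    r.albert_individual += SURVIVAL_REWARD
    r.kai_individual -= SURVIVAL_REWARD
    if first_cube_pickup:
        r.albert_individual += CUBE_PICKUP_BONUS
    return r


# Survival reward Albert collects if he is never tagged in a 10 second room.
survival_total = sum(
    frame_rewards(albert_tagged=False, first_cube_pickup=False).albert_individual
    for _ in range(ASSUMED_FPS * 10)
)
print(round(survival_total, 3))
```

Keeping the per-frame reward this small means the +1/-1 for a tag always outweighs the survival reward, so it only acts as a tiebreaker that pushes Kai to tag quickly.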
BRAIN:
Albert and Kai’s brains are neural networks with 4 layers each (one input layer, 2 hidden layers and one output layer).
The agents collect information about the scene through direct values and raycasts. Every 5 frames they’re fed data about their position in the room, the opponent’s position, velocity, direction etc., and they also collect information through raycasts (a simplified version of eyes). The agent's eyes (raycasts) can differentiate between walls, ground, moveableObjects and Kai/Albert.
The agents' brains (neural networks) are given the data the agents collect from direct values and raycasts and use them to predict 4 numbers for the respective agent which control how that agent moves. An example of an output of one of the neural networks is: [1, 2, 0, 1], this would be interpreted as [1=move forward, 2=turn right, 0=don’t jump, 1=try to grab], so the agent being controlled by this neural network would try to move forward while turning right and grabbing.
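To make the example output concrete, here is a small sketch of how four such numbers could be turned into actions. Only the four meanings given above (1=move forward, 2=turn right, 0=don't jump, 1=try to grab) come from the comment; every other table entry and the function name are assumptions added for illustration.

```python
# Lookup tables for the four output numbers. Entries marked "assumed" are
# guesses; only the values used in the example above are from the comment.
MOVE = {0: "don't move", 1: "move forward", 2: "move backward"}  # 0, 2 assumed
TURN = {0: "don't turn", 1: "turn left", 2: "turn right"}        # 0, 1 assumed
JUMP = {0: "don't jump", 1: "jump"}                              # 1 assumed
GRAB = {0: "don't grab", 1: "try to grab"}                       # 0 assumed


def decode_action(output):
    """Translate one network output like [1, 2, 0, 1] into readable actions."""
    if len(output) != 4:
        raise ValueError("expected exactly 4 numbers")
    move, turn, jump, grab = output
    return [MOVE[move], TURN[turn], JUMP[jump], GRAB[grab]]


print(decode_action([1, 2, 0, 1]))
# ['move forward', 'turn right', "don't jump", 'try to grab']
```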
The fact that we have two agents training simultaneously complicates things a bit. Normally we’re able to just update the agents’ brains every x steps, but if we did that for both brains at the same time they would struggle to develop multiple strategies: reinforcement learning tends to be best at finding a single solution, which would lead to the winner dominating and the loser stuck doing the same strategy over and over. The way we tackle this issue is by using something called self-play. Since we use self-play, we technically only train one agent at a time, and swap which is being trained every 100k steps. When we’re training Albert, we use a recent model of Kai’s brain as his opponent, and to avoid there only being one strategy, we store 10 recent brains to use as opponents, swapping them out every couple thousand steps so that Albert learns to beat all of them and not just one. This results in a much more general AI that’s hard to exploit.
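The self-play schedule described above could look something like the sketch below. It is a toy scheduler, not the real training loop: brains are represented by labels only, "every couple thousand steps" is assumed to mean 2,000, Albert is assumed to train first, and saving a snapshot only when the trainee swaps is a simplification. All names are hypothetical.

```python
import random
from collections import deque

SWAP_TRAINEE_EVERY = 100_000  # from the comment: swap who is trained every 100k steps
OPPONENT_POOL_SIZE = 10       # from the comment: keep 10 recent brains as opponents
SWAP_OPPONENT_EVERY = 2_000   # assumption for "every couple thousand steps"


def self_play_schedule(total_steps, seed=0):
    """Return (step, trainee, opponent_snapshot) for each opponent swap."""
    rng = random.Random(seed)
    agents = ("Albert", "Kai")
    # One pool of recent snapshots per agent; old ones drop off automatically.
    pools = {name: deque([f"{name}@0"], maxlen=OPPONENT_POOL_SIZE) for name in agents}
    trainee_idx = 0  # assumption: Albert is trained first
    events = []
    for step in range(0, total_steps, SWAP_OPPONENT_EVERY):
        if step > 0 and step % SWAP_TRAINEE_EVERY == 0:
            # Save the brain that was just trained, then train the other agent.
            finished = agents[trainee_idx]
            pools[finished].append(f"{finished}@{step}")
            trainee_idx = 1 - trainee_idx
        trainee = agents[trainee_idx]
        opponent_name = agents[1 - trainee_idx]
        # The trainee faces a randomly chosen recent brain of the other agent.
        opponent = rng.choice(list(pools[opponent_name]))
        events.append((step, trainee, opponent))
    return events


events = self_play_schedule(300_000)
for step, trainee, opponent in events[::25]:
    print(step, trainee, "vs", opponent)
```

Picking the opponent from a pool instead of always using the latest brain is what stops the trainee from overfitting to a single strategy.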
UNEXPECTED BEHAVIORS:
In room 1 Albert manages to break out of the room by exploiting a small hole in the hitbox near the top of the room, which was there because I didn’t make the hitboxes on the walls tall enough. Though Albert used it to escape, I’m not convinced he actually would learn to do it consistently. The challenge with this video is that it can be difficult to interpret the agent’s behaviors; Albert could be making certain unexpected moves as a way to exploit Kai’s poorly trained brain to get him to make bad moves, or Albert could just be making these unexpected moves because he hasn't trained enough. Albert was able to find the hole a few times, however he wasn’t able to do it consistently, this could be from either him not training long enough, his observations not making it easy to detect when he can jump out, or Kai quickly learning to counter him getting to the display in time.
In room 2 Albert also manages to glitch out of the room, and he was able to do this consistently. We made sure the cube grabbing functionality was coded as rigorously as possible, even with it automatically detaching the grab if the force exerted is too high, I couldn’t find a single way of exploiting it in testing, but Albert certainly didn’t have issues finding it.
Albert also had a couple moments of throwing the cubes at Kai and spinning with the cube to throw Kai out of the room, we didn’t even consider this being a possibility before training, AI’s able to come up with some really clever solutions to problems.
OTHER
Thank you so much to our amazing team that helped make this video! Jonas helped with setting up the character controls, Tyler helped create the clean grabbing functionality, Catt helped edit and Andrew and Steve helped solve any issues we ran into while making the video. If you want to meet our team and talk to all of us, join our discord server!:) discord.gg/qDRtuFe5gp
YOUR BACK!
I like you ;)
i love you vids
I ain’t reading all that!
first ig?
Kai occasionally obliterating Albert's dead body shows that AI is capable of learning
*gamer rage*
4:32 He's literally teabagging Albert.
"It's Kai, the Blue Cube! He's loveable but he has an attitude on account of those frowny eyebrows! We hope he'll be a welcome addition to the game crew."
(200 sim-hours later:)
"We regret to inform you that the Blue Cube is racist now."
AI learns BM
@@rogerhepton1785 8:31 Albert got revenge
@@rogerhepton1785 bro learnt the backshot technique
"Albert, you can't escape"
Albert: "Okay, I'll force Kai to escape."
In a game where the most aggressive thing you can do is a light prod or moving a foam cube, Albert clobbering Kai into a different frame of existence is pretty Gamer of him.
@@Noxedwin As opposed to Kai who subjects you to fucking Malevolent Shrine the attosecond he touches you
When does this happen
8:40@@forabba5776
@@forabba5776 ~8:39
"Kai, that was aggresive"
Albert like 2 mins later: *throws Kai out of the map*
Albert*
He mastered the legendary technology, *The cube*
Revenge
He became one with *THE CUBE*
Karma
My favorite genre of A.I.:
No art thievery.
No job thievery.
Just A.I. learning how to play tag.
Edit2: Obviously, the first edit didn't help, so I'm just gonna delete that. Sorry y'all. 😓
This one was pretty messed up, bro forced them to fight to the death
exactly
Ai doesn't steal anything, who said drawing is made for humans
@@scoreandspore.5606 it does actually bc to make AI art the AI has to take art that already exists and smash it all together to make a new piece, and it’s been proven time and time again that the creators of these art engines LOVE taking art without consent for their database. Without human art, AI would never be able to make art, and as soon as AI starts pulling art from other AIs it’ll just poison itself.
@@redbassett2462 So it has to see art much like those human artists, to learn how to do it itself?
The 1v5 went crazy you can’t even lie
They say there's strength in numbers, but the winner proved that to be incorrect
So long as one survives, they haven’t lost. This is what people mean when they say that
Boy got launched almost out of the map again, BOUNCES OFF A WALL and proceeds to bamboozle the taggers so much they get mentally crippled
"If the Kais went at me together, I'd definitely have trouble."
"But would you lose?"
"Nah. I'd win."
Albert even learned to *_WALL-JUMP_* to escape the team of 5 Kai’s!
finally, the main villain was introduced
Those athletes from 100 meter dash are Albert's buddies, Kai is his enemy.
Everything makes sense now...
Uhm, well actually he was introduced in shorts about 2 months ago 🤓
Yes, but will he ever appear again?
I'm gonna draw them having s-
@@juanleon3875 Every hero needs a villain. He must be a reappearing character.
I love how you basically reimplemented the hide and seek experiment from OpenAI and ran into the exact same problems as them with the agents abusing the simulator physics
E
To be fair..... People would abuse stuff like that too
@@siliciaveerah9327 exept most of these glitches are probably difficult to reproduce in terms of exact input, but thats not a problem for the ai😂
@@lazydk2654 oh ye of little faith
@@siliciaveerah9327 Lol I loved discovering stuff like this back when I played Roblox. Prop flinging never gets old though, it doesn't matter what game.
4:14 Albert started emoting💀
nahh even the ai is foul
Albert: "while you struggled on foolish pursuits, i studied the cube"
Had to be the 69th like for ya
The tungsten cube is the way.
@@darkwelder9736 the density
@@malthe236 its so beautiful
@@frankaoooooooooooooooooooooooo open the curtains lights on
7:06 "Now Kai's frustrated"
**explodes in frustration**
Gamer rage? More like AI rage
AI RAGEE
Actually relatable
😂
rAIge
Albert clutching at the end was ludicrous, had to show the youth how the oldheads used to play
I love how Albert’s biggest breakthrough was just escaping the tag arena altogether
The only winning move is not to play
@@edgarallenjoe6494 you’ve.. Been..
@@trollguy2616 hit by
@@trollguy2616 TROLLED! You've been trolled, yes you've probably been trolled
literally breakthrough
9:59 nobody's gonna talk about Albert's casual wall jump lol
Kai*
@@ChaosHaver albert literally wall jumped wdym
@@ChaosHaver not kai
@@iplayjtohitsfun oh
@@iplayjtohitsfun oh yeah your right
Albert constantly throwing himself outside has to be the most hysterical strategy of all.
I mean. It works! Who can argue with the results?
This is actually a fairly common thing for learning algorithms to do. There was one I remember reading about which was tasked with finding landing approaches with minimal damage to the plane it was flying. Eventually it started flinging itself at the ground fast enough to overflow the damage calculator, resulting in a massive negative damage number.
@@stargate525 That's actually hilarious!
@@stargate525 Imagine giving it control of a real plane. The "safest" landing method would be a nosedive
@@deltap6967 News: "123432 deaths around the world due to Japan's planes nosediving into the ground at mach 19.23!"
5:47 the way albert just ascended
10:15 I absolutely love that four of them just instantly died cause they could not comprehend what was happening fast enough, and one just instantly kicked into life or death mode and started doing insane strategies on the fly to avoid the army of death following him
6:25 Albert makes a wall, Kai breaks it, and Albert proceeds to send a block back at mach 10 speeds
i mean albert does know how to fling a block to make him escape the room
DODGEBALL!!! ⚾🥎🏀⚽🧶🏀🥎⚾⚽🧶🏐🏉🏈🏀⚽🥎6:25 999M/SEC
Bro they really do get angry I swear
Bro hollow purpled him
What was once a game of tag, now became a murder attempt from Albert sending a cube at light speed
albert perfectly understood that "to confuse your opponent you must first confuse yourself"
😆
Chat is the the super bone player
Lol
Can I quote this
i like how Kai kept emoting on Albert whenever he tagged him, very human
Albert loves throwing himself out of the maps, he really knows how to think outside the box
5:30 speedrunner discovers new gamebreaking glitch, learns how to perform it consistently, smashes competition to bits
And it's literally cube jumping, one of the most famous glitches in Portal that speedrunners use!
@@sheersternfeld1914 You're right lol
"He was treated like a loser but then he discovered a cheat SSS rank skill and become a god!"
underated
And finds another glitch at 8:40
Albert casually doing a 1v5 at the end, true gamer
Albert #3 coming in clutch.
average snd match in cod be like
@@Noxedwin he showed them the power of being the third strongest
He even pulled off a walljump
Albert casually doing clutch and exploit the game
6:11 I love how happy Albert looks here! He loves all the cubes!
"This wil be fair game of tag"
Albert throw a cube to Kai 😂
8:40 Albert is savage. He woke up badass today.
Probably the funniest part of the video
Albert is god
*_iM a SaVoG yUh cLaSsiC gUcCi-_*
Bro really said “so long gay bowser”
@@themarkerchannel3170 FRRR
8:30 even without human intervention, THEY INVENTED TBAGGING LOLLL
You can see the kai go through the actions at 4:32 too 😭
@@flosamuu nah he was cleaning up Albert's dead body (it was messy)
what do you mean you are going into 1v1s we are useful teammates!
also my team: 9:45
That Albert 1v5 clutch at 10:13 is insane, Albert is a pro gamer
please make "AI learns to play capture the flag", it would be so fun watching them battle to get a little flag!
Yup
Give them guns.
Re create tf2
@@roboxplayer0 aye
@@roboxplayer0 CF2. Cube Fortress 2.
8:30 they played enough time to become toxic teabaggers
He's teebagin
I love that you're able to put the sponsor in without disrupting the video. Good stuff.
Honestly, that Brilliant ad integration was... brilliant, lol. Integrating it directly into the video and interweaving it with the content. Still very upfront, but presented in a way where it's very difficult to skip. Nice job!
Oh hey new character, and Albert is back to a square, my favorite version of him
Mine too! He's a happy cube!
Go square Albert!
Nah, Kai first appeared on the boxing battle short.
Its always been square albert~ he was just piloting a mech ;)
not walking Albert sliding Albert
Albert throwing the cube at Kai was funnier to me than it should have been.
Albert got his Limit Break. He was sick of always being on the defensive.
"Me? Run? Hell nah. Take this Kai!" *boink*
i remember there was this old 4 minute or so video about aiis learning tag, it was very interesting, im glad someone made another one
I love how you integrated the sponsor into the video. Super well done, I wish more content creators found a way to do this
Albert constantly T-bagging Kai is wild
Kai doing the same to Albert aswell lol
They've learned the TF2 Humiliation lap.
dang
@@Noxedwin BAHAHAHSHGHGHXK
What
Albert juking kai out was hilarious
You are criminally underrated for how much you put into these projects. The AI, ideas, and everything else are all very impressive! I’m excited to see Albert biking :D
Also the attention to detail is crazy with the square people blinking.
I love these videos. Its funny watching Albert and his many fails but also really cool seeing his successes
Watching the AI cubes dance on each other's dead corpses is probably the funniest thing I've ever seen
Being toxic is a universal 𝒯𝓇𝓊𝓉𝒽
No matter the age, toxicity always hides underneath.
I love how after the final battle ends, Albert goes to the timer and tries to use it to escape once again
When in doubt rely on instinct. The old ways of doing things exist for a reason.
I LOVE this content, it's so goofy, yet also entertaining.
Your format make your content enjoyable even for those who know nothing about AI! Thanks, it was a lot of fun!
4:30 i love how Albert is just celebrating like he already won and then Kai just said “nope”
and then starts teabagging him
4:32 Wow, Kai already learned how to teabag Albert. Truly the best timeline.
AI is capable of gamer rage
I mean, Albert teabagged Kai at 8:32 just after he threw a cube at him
rel
Here i am,enjoying my break at my summer job by watching two cubes chasing each other,loving it
this is my favourite channel on youtube right now
Oh I just love the fun little dynamics you can see here, like Kai exploding from frustration in a corner while Albert hangs out on the ledge, Albert being an absolute MENACE to level design, Kai stomping on Albert's remains after his victory, and Albert in turn throwing cubes at him aggressively 😂 I've wondered of statistical unfairness of a game of tag in a plain environment and environment with obstacles present, there's clearly a strong connection as seen in the first two levels. But soon enough these two brought so much chaos to the scene I forgot to be analysing and just enjoyed the show. We missed you, Albert and friends!
They're so funny
“‘Albert and friends’ THE SERIES”
@@1.-.MULBERRY.-.1 can't wait!
Lol
4:22
Albert jumping up and down after getting the cube stuck is actually so cute😭
Finally, a proper medium for the use of ai, not only to be as a companion, but for entertainment as well. I thank you for providing a great video.
I love those types of Videos so much
This is actually a great example of how the rules to a competitive game will greatly influence how the competitors will play the game, even beyond what the rules intended. Since Albert gets rewarded for not being caught, but doesn’t get punished for leaving the arena, he‘ll just do that, even if it goes against the original spirit of the game. Or like how in many modern martial arts the point system introduced to establish a winner allows for effective strategies that work in the environment of the sport but would not have been feasible in the context of combat that the art was initially developed for.
I love how there's the moment of similarity between this and the other AI hide and seek thing, where the AI do things not intended by the developer
I've seen that one too. AI shutting down paths, glitching out of the stage, and finding all sorts of creative ways to abuse the map.
Whats this thing about another AI?
@@soumickdas9674 OpenAI's hide and seek experiment video
@@soumickdas9674 ua-cam.com/video/Lu56xVlZ40M/v-deo.htmlsi=7DP7xwuaA7cj7qxC
@@soumickdas9674 there's another video where the AI learns to play hide and seek, the runner learns how to glitch out of the map to escape
Wow, it was hard to do! But result was really cool, congratulations, Albert!
I love how they learned to use the force to start chucking boxes
8:35 That wasn't very sportsmanly, Albert. I love it.
And here I thought cubes couldn't uppercut people in a game of tag
YEEEEEEET
The fact I am so attached to this little orange cube just shows how humans will pack bond with anything…
You hear stories of someone accidentally stubbing their toe on their Roomba, and having to coddle and apologise to it.
I think humans just have an affinity for Funny Little Guys.
Yeah. Anything that might to appear to be "alive" even though its completely inanimate or doesn't really have a mind of its own, we still feel the want or need to bond/apologize to these things xD probably the same way you imagined your little plushie to be sorta real
I just love your videos, also considering how much time it takes you to produce a 10-minute video.
That double wall jump on the 1v5 was sick and you can’t lie
Kai: I run around strong to catch Albert
Albert: I CONSISTENTLY GLITCH THE MATRIX
Loved the video, subscribed instantly, the competition between kai and albert at least to me makes this a lot more enjoyable to watch than just albert training (although those videos are really good too). One thing that did bother me though is that, while I like the idea of interweaving it with the video (especially that "on brilliant you learn by doing, just like these two are" line), it stretches the ad too much. From my counting it goes from 6:30 to 9:09 which is like 35% of the entire video and the constant switching between the two made me be able to focus on neither.
Other than that, ive rarely had so much fun watching something as these two blocks repeatedly disintegrate and break the game
this man clearly focuses more on quality than on quantity. love the vids!
Dude I was just wondering this morning how Albert the AI robot has been and you drop this only hours later, what a legend
"I WIIIINNNNN" Albert screamed, falling forevermore in to the endless white abyss
I would like to see another episode learning more about tag like buttons, parkour, levers
(I subed & ur videos are hilariuos and amazing)
Albert: I'm not locked in here with you, you're locked in here with me.
Albert abusing the physics engine reminds me of a hide and seek ai video i watched a few years back
I believe I know exactly the video you’re talking about, I thought about it too!
It reminds me of the Henry stickmin collection!
@@MatthewMorris6148 Please do tell
@@RealCCre oh yeah the Toppat 4 Life ending
Yeah the OpenAi video
Making it so that we can't skip the sponsor without missing a chunk of the video. You clever bastard.
I'm just ignoring it anyway
that app is great tho
I prefer it
Gotta love watching them chuck the cubes at each other
Bro, I’m an OG. I remember when you only had 3k congrats 🎉
8:06 Kai lost so much he just raged at albert
"ALBERT SMASH!!!"
Albert:💥
7:08 I'm so mad I could explode!
Yes
The famous line from lightning
@@sticonox9058 surprised it took people so short to figure out the reference
@@SheckoFromEWOW as soon as I saw that your profile picture looked like book I was like ohhhhhh I know the reference
@@SheckoFromEWOW OHH i got the reference XD
That was the biggest clutch ever at the end.
Albert mastered the ability of throwing cubes and glitching
YOOO THE 1V5 CLUTCH AT THE END WAS INSANE THOUGH
8:35 Albert when he had enough:
Imaginary technique: Stay on the cube.
“You can’t get me I’m on home base!” Breaking the game of tag is the natural way to win as the runner
now we need a combination of all the previous trainings into one.
I like it when AI/Neural Networks are used to do funny wacky stuff like this. Awesome stuff! :]
Ur everywhere lol
Kai is Alberts number one opp the lore is expanding
Albert clutched so hard at the end he deserves an award
the last part of the final battle truly described the person with the crown in Untitled Tag Game
This is a type of channel who post video every 4 months but you know that the video that will come out will be a banger
And this channel is the perfect example 🗿
The final 1V5 was crazy
Fr, Albert is such a clever lil cube for pulling that off! 👏
Crazy that the subtitles knows what’s in the video
the 1v5 at the end was crazyyy..
These are fantastic. Inventive tests, great visuals, brilliant captioning/storytelling. Always a joy when a new one pops up in my subscription feed!
thank you so much!!:D
Why no pin?
7:36 bro the jukes are smooth, and the way he can glitch the cube every time goes to show how AI can calculate what to do, to do it perfectly
I love how Albert learned to watch his opponent then dash off when they get close while looking where he's running (while Kai learns to look at Albert).
00:47 kai also learned to teabag when he wins
Albert learning to leave the boundaries of his world (and launch his enemy out of it) was not the plot twist I was expecting, but a fun one nonetheless!
next he'll escape the simulation
Watching Albert consistently break and exploit the game will definitely not have any real-world implications with AI alignment :D
We release AI's into the world, and a few centuries later they will all have ascended to a higher dimension by exploiting a glitch
Albert causing chaos to disrupt Kai in the second to last round is exactly how I play prop hunt and games like it when camouflage alone doesn't keep the hunters away from me. If you can somehow distract your pursuers to get them off your tail, it buys you precious seconds to find a new safe place to lose them. This can be anything from throwing something to try and draw their attention elsewhere or running right by other hiders to try to spook them out of their hiding place and occupy the hunters with them.
That 1v5 clutch was so intense
That walljump in the final battle was crazy
The way how AI learns to eventually abuse game physics in order to go out of bounds and gain any sort of advantage over the opponent, that is so impressive and what you'd expect out of real experienced players.
Honestly cool to see over-training (idk the actual term) in action
Yay you posted again