AI Learns to beat other AIs (Mario Kart Wii)
- Published 22 Jun 2024
- This video shows an AI learning to play Mario Kart Wii against the game's default medium AI.
For anyone who saw my first Mario Kart video and wondered why this one didn't do as well, it's because this AI trained in about a quarter of the time! My computer needed a break after the last video!
00:00 Explanation
2:19 Racing Starts
4:54 Final Agent
- Games
I wonder what would happen if you implemented a small penalty for starting a drift and not getting a mini turbo, since that seems to be a problem the AI is running into. I think that would also force it to align for turns better.
That’s a good idea, I actually used something similar in a previous video where I just gave it a reward for finishing the turbo, and that seemed to help a lot. Definitely something to add again in the future!
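A minimal sketch of the shaping idea from the two comments above, combining the suggested penalty for a wasted drift with the reward for finishing the turbo. The function name, the flags, and the default values are illustrative assumptions, not the video's actual code.

```python
def drift_reward(drift_ended: bool, got_mini_turbo: bool,
                 turbo_bonus: float = 1.0, wasted_penalty: float = 0.25) -> float:
    """Shaping term added to the base reward for a single frame.

    All names and default values here are hypothetical.
    """
    if not drift_ended:
        # Nothing to judge until the drift is released.
        return 0.0
    # Reward a drift that produced a mini-turbo, penalise one that did not.
    return turbo_bonus if got_mini_turbo else -wasted_penalty
```

Keeping the penalty smaller than the bonus is a deliberate choice in this sketch, so the agent is not discouraged from drifting at all.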
Yeah, I agree it definitely depends on what the intent behind creating the AI was. If it was only to get the best time using any resources necessary, it could simply just start from the TAS run.
Also maybe a bonus from hitting boost pads?
@ghost_ship_supreme boost panel = fast, and if it’s faster to take the boost panel than not, it will probably go for it automatically
Ohhh i think that’s cause it can’t drive right or left without drifting
It's really interesting that the AI's solution to keeping speed up even though drifting reduces speed is to alternate between drifts and wheelies, even though it reduces speed in the long run and takes the turn much wider. I wonder if you could reward low lap times so that it can continue to improve even after it is able to complete laps.
In future mario kart videos this is definitely something I will add in! Or at least some type of reward for getting around the track quickly to encourage faster driving, rather than just being safe
Heck yeah! I've always wanted to see a custom Mario Kart A.I. that, instead of speeding up and getting unfair items (MKDS, I'm looking at you), takes advanced lines and properly uses items. Keep it up!
MKDS AIs were pretty dumb, I remember being confused when I was younger with what they were doing! Will try my best!
Idk if learning to race starting from the same point in a race repeatedly is a good learning basis, it might be getting too used to the information specific to that race and be unable to adapt to other ones
That’s a good point, in reinforcement learning it’s quite common to have many different start locations, both to widen the experiences of the AI and to help the AI explore the environment.
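One way the varied start locations could look in practice is to pick a different save state at the start of each episode. This is only a sketch: the save state file names are made up, and loading them into the emulator is left out.

```python
import random


def pick_start_state(savestates: list[str], rng: random.Random) -> str:
    """Choose the save state to load for the next episode, uniformly at random."""
    return rng.choice(savestates)


# Hypothetical save states taken at different points around the track.
SAVESTATES = ["start_line.sav", "first_corner.sav", "back_straight.sav"]
rng = random.Random(0)
episode_starts = [pick_start_state(SAVESTATES, rng) for _ in range(100)]
```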
I'm surprised the agent didn't go for the boost panels more often after the early epochs. Do you think given more time the algorithm would start to use the boost panels more? Were you running this thru a comp server or local side? Very interesting, appreciate the nice watch!
It tried it a few times, but couldn’t seem to use them without dying so gave up on the idea. It’s possible it would eventually figure it out without dying, but it could take quite a while. This was all running locally, just left my pc running overnight. Glad you enjoyed it!
I wrote out like 7 questions about your architecture before realizing that to properly understand the answers to those questions I need to go read some papers to remind myself about how reinforcement learning even works, Wa-hoo into the PDFs I go. Good job you got me to read
I guess I did a good thing lol! Feel free to ask questions if you're still stuck after reading!
a maybe better reward system to use could be based on the game’s race completion and current checkpoint variables, but this is still pretty cool
Just wait until the next video!
this is sick!! cool idea!
fancy seeing you here, falcon
hi falc!
Hi Captain RedFalcon
!!
hello falc
A good idea would be to give them a reward for completely entering each key checkpoint that is in order to complete the race, that’s how the AI in the game function to an extent, this combined with their already programmed: drive forward, turn, and wheelie maintaining speed would make it easy for them to learn multiple tracks consistently with the goal to make it to the “end” so to speak
This is great stuff. I would love to see more and I’m sure many others would too. Looks like you’re putting out videos at a good pace, keep it up and this channel will grow a lot for sure!
Thanks! Planning on putting out much more content so hopefully I can get a lot of people interested in AI
Great video! I'm wondering in terms of controls, why not have turning be represented by a single float value (if that is possible) that is constrained within maybe -1 and 1 (or whatever works best). Negative values would represent left turns, and positive could be right. Then you could feed this into the game as a joystick input. You could then make drifting be a toggle, and so the agent chooses to drift the corner rather than choosing to turn a lot one way or the other. I figure this reduces outputs while giving the agent a lot more control over its fine movement.
Everyone else's ideas about rewards are great, and I'm excited to see updates!
This would definitely be better in terms of optimal performance, as it would allow far finer control over turning. This is definitely possible and is a great suggestion, however would likely increase the training time slightly, as instead of 4 different actions, it kind of has infinitely many actions (although this has been done many times, and is technically called a continuous action space). Furthermore, this would require a fair bit of coding on my part to get this working, but would be great to see what it could do!
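A rough sketch of the continuous steering idea discussed above: a steering value in [-1, 1] plus a separate drift toggle, mapped to a stick position. The 0 to 255 stick range with a centre of 128 is an assumption for illustration, not a confirmed detail of how the emulator expects input.

```python
from dataclasses import dataclass


@dataclass
class Action:
    steer: float  # -1.0 is full left, 1.0 is full right
    drift: bool   # drift is a separate toggle rather than a turn type


def steer_to_stick(steer: float, centre: int = 128, radius: int = 127) -> int:
    """Map a steering value to a stick position (assumed range and centre)."""
    steer = max(-1.0, min(1.0, steer))  # clamp out-of-range network outputs
    return centre + round(steer * radius)
```

With this layout the network produces one number and one flag instead of choosing between a handful of fixed turn actions, which is what makes the action space continuous.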
I can’t help but think that it isn’t learning how to drive at all. Because it’s racing the same start and same track every time, it could totally ignore any visual input at all and just be memorizing inputs.
Basically it’s just learning how to speedrun Mario circuit blindfolded, and that would also explain why once it manages to get past all the other racers (which would be inserting random variance) it is able to then complete the rest of the track and multiple laps relatively easily. It’s just creating a blind TAS :P
Oh I greatly look forward to seeing this evolve!
I will do my best!
This got me back into making Mario Kart Wii content! Thank you :D
Glad to hear it, thanks for watching!
Very cool video! I don’t know a lot about AI training, but it seems to me that the AI is risk-averse, seeming to prefer hopping and not wheelieing when anywhere near other AIs for fear of being wheelie bumped, which causes it to drive slower but eliminates the risk of losing most of its speed, which happened early in the training when it kept getting wheelie bumped by Peach.
Thanks! In future versions I may prevent wheelie bumping from killing the ai as that seemed to be problematic
Very cool video!
I am super interested in AI learning with games, I hope we can learn from the AI when trying to improve records.
Thanks! I’d love to see AI helping people improve records, would be absolutely amazing!
I might try to make my own AI with this as well. Thanks for the great inspiration!
Go for it! Love to see what others can do
Can't believe I'm jealous of an AI. It took me 20 hours to beat my first lap on mario kart :(
Not sure what image processing you did for each frame, but I wonder if making the image "smaller" would help. And I don't mean resolution of the image, but maybe the color depth. Did you use a full 256 values for grayscale? What if it were 64 or less? What if you ran each frame through a edge detection algorithm and got a binary image? Then maybe you could increase the resolution? I also wonder if you cropped out or simply blacked out the irrelevant data, like the time/lap numbers, and the place in the race. That might help the algorithm focus on what we would expect would help them race better, the track itself. I also wonder what would happen if you put a big black box on top of funky kong, since again, in some ways the look of funky kong isnt relevant to the course. But these are all ideas to try to see if training is quicker, perhaps.
The image processing was just downsized, then greyscaled with 256 values, then converted to a float. Using 64 values could work, but I am unsure if there would be consequences due to the loss of detail as the image is already extremely low quality (less than 100x100 pixels). Other edge detection could speed up the process, but could also harm the detail if done poorly. The architecture of the AI uses a convolutional neural network, which typically learns edge detection and similar things on its own anyway. Cropping is something I'm definitely considering, since it's unlikely the edges of the screen provide useful information. I'm not sure what the result of blacking out regions of the screen would be, as I haven't looked into what regions of the screen the AI is currently focusing on. If it's looking at things such as the time, it may be useful.
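The pipeline described above (downsize, greyscale, convert to float) with the optional reduction to fewer grey levels could be sketched with NumPy as follows. The 84x84 output size, the nearest-neighbour downsizing, and the luminance weights are assumptions; the reply only says the image is under 100x100 pixels.

```python
import numpy as np


def preprocess(frame: np.ndarray, out_h: int = 84, out_w: int = 84,
               levels: int = 256) -> np.ndarray:
    """Turn an HxWx3 uint8 frame into a small float32 greyscale image in [0, 1].

    `levels` should divide 256; 256 keeps full greyscale depth, 64 keeps a quarter.
    """
    h, w, _ = frame.shape
    # Nearest-neighbour downsizing by picking evenly spaced rows and columns.
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    small = frame[rows][:, cols].astype(np.float32)
    # Standard luminance weights for RGB to greyscale (range 0..255).
    grey = small @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    # Snap each pixel down to the nearest of `levels` evenly spaced values.
    step = 256 // levels
    grey = np.floor(grey / step) * step
    return grey / 255.0
```

Calling `preprocess(frame, levels=64)` is one way to try the lower colour depth suggested in the question without changing the resolution.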
Would there be a way of implementing the minimap as a way for the AI to recognise what's track and what's offroad?
So the ai can see the minimap, but I’m unsure if the ai is using it this way. It’s possible I could make some representation of the map which the ai could easily use to distinguish off-road and on the track. Definitely a good idea! My first approach to this problem (the video AI Learns Mario Kart Wii) actually used an approach where I defined the off-road, and used this information to kill the ai when going off course.
Cute how it saw it had to turn left and then made a hard right instead, because that's how it solved every other turn up to that point. 3:23
this is cool, it would be nice to see how this works from a more technical standpoint/see some code and implementation details as well
I plan on making at least one quite technical video soon. Let me know what you would want to see in that video!
This is cool!
Nice video bro
Thanks!
One simple method for determining the fitness could be to just take what place it's in in the race and use that to determine fitness.
Very interesting video. I wonder if you would consider making an explanation video on how you did this? I'd love to try this out, I assume this is running in Dolphin?
At some point I definitely will! This is running in dolphin, however the main challenge of getting this to work was allowing the AI to access dolphin!
I thought it said AL so I also thought that he made an AL that beat other ALs
4:58 when is Nintendo agent kong will be driving until the Smg4 court.
Glup
Btw. NICE CODING THO!
This makes me wonder what an AI playing F-Zero would look like...
It is possible to run on Dolphin (the emulator I'm using), so if people wanna see it I will make it!
This channel is soooo underrated! It's just so cool what AI can do. I tried this on rainbow road...
34 hours and it could do THE ULTRA!
That's very interesting! What method/algorithm were you using?
I somehow doubt that.
3:00 NEFFEX - That’s what it takes
I got an ad, it was just a YouTube video from a channel that I watch
Pro AI
nice video.
Thanks
It would be interesting to see how it would handle the use of items. I noticed that there were no items available during this race
I thought I would keep it simple for now, but it’s definitely on my todo list if people want to see it!
Well, cool. But did it really learn racing, or did it just become overfitted to that particular track?
It likely did overfit to the track, but still learnt a little about racing. If it was put on another track, it would probably struggle but learn a lot faster than learning its first track
What Reinforcement learning algorithm did you use? I am currently working on a project using tabular Q learning
This was using Rainbow DQN! It is DQN combined with six different improvements to boost the efficiency and performance. Paper is here: arxiv.org/abs/1710.02298
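For readers coming from tabular Q-learning, one of the improvements that Rainbow combines is double Q-learning. A minimal sketch of its target for a single transition is below; it is not the code used in the video, and the argument names are illustrative.

```python
def double_dqn_target(reward: float, done: bool, gamma: float,
                      next_q_online: list[float],
                      next_q_target: list[float]) -> float:
    """Double Q-learning target for one transition.

    The online network chooses the next action and the target network scores it,
    which reduces the overestimation seen with a plain max over one network.
    """
    if done:
        # No future value after a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]
```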
Maybe look into reinforcing and rewarding the AI for having the player’s icon being on the road for the map? That might reduce the amount of time needed for your program to complete a lap instead of 16 hrs
Speed works pretty similarly since whenever the AI goes off-road it will get killed. Next time I may try to stop wheelie bump deaths though
I have no understanding of how AI works / trains but could you feed it world record videos and train it from that for a perfect run?
There are a few techniques which utilise previous information, but they are fairly complex. The simplest version I could do is just to play a bit myself and let it learn from that. It's easy to do it that way as I can collect the data in the exact form the AI needs. I imagine it would certainly speed up training and most importantly, exploration.
I want to see a long term form of this project.
Maybe you can change the turn rate into a floating point variable, and omit the drift turning entirely, simply replacing it with the drift button so it manually decides to drift, though that idea, unlike the variable turn rate, is entirely up to you.
This way, it can turn more precisely, and maybe you can implement a shake function as well? Also, add a specific functionality for items, then start training it on grand prixs. Constantly.
Eventually, maybe you can set it to train against real players??? I bet that'll be fun!
That is an interesting idea, giving the ai more control always results in interesting results! Items increase the complexity of the problem a lot hence why I haven’t added them in yet, but they are definitely coming soon if people are interested!
can't wait to see ai in online races
Not sure when that'll be, but if this channel gets big enough I'll see what I can do haha
Very interesting, and also quite a good way to represent AI learning in a simple way. I'd just like to know whether the AI would ever stop making those turbo-less drifts and the hops, because the time loss is minimal and so it doesn't contradict the core goal of the AI. A human would see that the turbo-less drifts waste time, but would the AI ever learn this unless the drifts actually cost a large amount of time?
The AI might learn it, but it's hard to tell. In theory it should eventually try new actions that result in the mini-turbos, get the increased reward and learn that behaviour, but in practice it's unlikely. In future Mario Kart videos I will likely reward the AI for getting the turbo, making it much more significant
Wtf, this is so crazy
Is your AI using the minimap? Does it generalise to other courses? Can you share your code? I'm interested!
It has access to the minimap, but I do not know how much it uses it. If you are interested in generalisation, check out my previous video, "Mario Kart Wii but it's an AI", where I train an AI on 10 different courses at once! A code reveal is coming at some point, but will take a while to clean up. I'm thinking of doing it as a 1000 subscriber special (if I reach that amount!)
@aitango Stacksmashing did a Mario Kart 64 AI video and discovered there that the AI would indeed use the minimap. I don't see any reason why your CNN (I guess?) wouldn't do the same thing. I'll definitely check out your other videos, and I'll be revving up my GPU for when I can train the model myself :).
If the AI just learn how to do one course, how would it learn how to do the other courses? What I think you should have done is ( idk if it is possible) make it so that the AI spawns at a random position in a random map and instead of giving it points for speed, give it points for passing check points. But again, i dont know if it is possible
It would learn other tracks in the same way, but it just takes a lot longer since it needs to learn about the different colors/textures in the other tracks. I’ve done a video on something very close to this though, check out “Mario Kart Wii but it’s an AI”
This is very nice. Do you have any plans to make the code public?
I'm considering doing it as a 1000 subscriber special, along with a video or two explaining the details of how it works!
@aitango Cool. You just got a little closer
What’s the variability of the AI learning? Is it bound to make the same mistakes eventually with the parameters put in place, or every time you restart the learning process different information reshapes the outcome?
These ai have a lot of variability, which is both a blessing and a curse! All AIs will have slightly different tendencies and biases, so you never really know what you’ll get!
I could think of two ways to reward and punish:
Reward for getting 1st place, gradually increases while in first place but will penalize losing that place by shortening the reward.
Reward for any sort of interaction that would place the opponent behind them, and keep them behind
I strive to be a software engineer, so I need to figure out how to do this stuff. Yet my brain just fries while thinking about it. How do you even create ai? I have high interest with the concept, but with no idea to begin, especially with it adapting the layers. The input/output is fine, but the layers remind me how far I have to go.
Interesting ideas for sure! I hope you can work towards becoming a software engineer, programming and AI is something I do both for fun and for work! I recommend the channel Deeplizard, as they were the channel that gave me much of what I know about deep learning, and they also have a playlist on deep reinforcement learning
This one is a no-brainer, but I think you should give the AI rewards for going at top speeds
Maybe also giving negative rewards if their Position on the map (X, Y, and Z) changes less than it should by going at high speed, to encourage going forward as much as possible, while also rewarding to stay around the middle of the track where it's harder to bump into walls and such.
Going at top speeds is an idea I've been considering to encourage things like boost panels and mini-turbos, as it seems to have trouble with these for sure. Mario Kart also contains a "race completion" variable which I may use, as this should be an effective way to make sure it goes fast in the right direction, rather than just going fast.
If the ai was injected instead of emulated, you could've used the ingame invisible checkpoint flags that determine your respawn point as they're really good at tracking where you are on the map relative to the start, and since their ID in the files are really easy to access
Yeah true, I've been debating using the race completion variable vs the checkpoints. Race completion is nice as it gives constant information, but the checkpoints are more accurate
I know devs have time constraints but it would be cool if ai actually went through a process like this giving them actual skill instead of just having rubber banding to make the game more difficult.
If there was a company that wanted to do that, I would be happy to assist! It would be much cooler than current approaches
16 hours of basically just trial and error. And the progress won't even transfer over to another track. However, what might be even more interesting is whether it would "learn" similarly fast without any visual input at all.
Reinforcement learning does require trial and error, however it could not learn without visual input. This AI has no memory at all, so it isn’t just memorising inputs. As other comments have rightly suggested, it may be overfitting, which is where its inputs are based on a very specific visual input, rather than being able to come up with a good action for any input
what happens if we introduce tricking, items and new tracks
i think instead of feeding it an image, you need to feed it an image without a hud and the hud specific things such as minimap info and items
Tricking is actually already in there, there's just nothing to trick off without a mushroom on Luigi Circuit. I considered adding items, but it brought up some problems. As I'm currently using the speed to detect when to reset the AI, I would have to change this, as getting hit by an item would cause the AI to get reset. Definitely possible, it would just mean I would need to change my way of detecting death. For more tracks, check out my video Mario Kart Wii but it's an AI! I could potentially crop the image to only include the most relevant things
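One possible way to keep a speed-based reset while tolerating short slowdowns such as an item hit is to require the speed to stay low for several frames in a row. This is a sketch of that idea only; the class name, the thresholds, and the assumption that speed is read once per frame are all illustrative, and the video's actual reset rule is not shown in the source.

```python
class StallDetector:
    """Signals a reset only after `patience` consecutive slow frames."""

    def __init__(self, min_speed: float = 20.0, patience: int = 30) -> None:
        self.min_speed = min_speed    # hypothetical speed threshold
        self.patience = patience      # hypothetical number of frames to wait
        self.slow_frames = 0

    def update(self, speed: float) -> bool:
        """Feed the current speed; returns True when the episode should reset."""
        if speed < self.min_speed:
            self.slow_frames += 1
        else:
            # Any fast frame clears the count, so a brief slowdown is forgiven.
            self.slow_frames = 0
        return self.slow_frames >= self.patience
```

A longer `patience` forgives more but also lets the agent sit against a wall for longer before the reset, so the value would need tuning.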
So cool 😮
Glad you think so!
wow
Lmao love how it was just spamming drift all the time
How did it not find out how to supergrind tho🤔🤔🤔
That would’ve been so cool if it did
I wouldn't suggest terminating a run when AI hits the wall bcs for some shortcuts or other stuff it may be useful, i'd say let it continue runs but with a negative reward.
Also, to avoid overfitting to that particular track change the track, the location of cpu and of the AI (basically make it so it always lands in random spaces)
Ideally I would like to remove the resetting, but it just slows down the early stages of training far too much. For the first 25% of training, the AI would literally just be driving into walls and collecting non-useful information, and likely wouldn't reach the point of proper driving, at least not for a very long time. Later on in training it is beneficial for the reasons you mentioned, so it's definitely something to consider
so this is how t1 players learn to drive?
So it's sort of like a TAS that writes itself
Pretty cool, but it won't work that well if you don't use save state anymore and start a new race.
NEFFEX music I hear? Hmyes. Perfect.
Gotta love some NEFFEX
@aitango 100%
Ngl your voice sounds like octopaul
If you can let the AI read variables from the game itself, I know from shortcut videos that Mario Kart Wii has a "race progress" variable that tracks as a number how far into the track you are, so maybe you could use that as a reward for the AI? This would not only prevent it from driving in circles or driving backwards, but would also incentivize it to take better racing lines.
I actually used that variable in one of my other videos (AI Learns to Master Mario Kart Wii)! It worked pretty well, however I think a hybrid approach may be best as it seemed to be a little inaccurate at times. For example Ghost Valley 2 had some terrible accuracy issues with that variable. Maybe using both can get the best of both worlds though
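The hybrid mentioned above could be sketched as a reward that pays for the change in race completion every frame and adds a bonus whenever a new checkpoint is reached. This assumes the completion value grows as the kart moves forward and that the checkpoint value is a running count; both are assumptions about how the game's variables would be read, and the scale and bonus are made-up numbers.

```python
def progress_reward(prev_completion: float, completion: float,
                    prev_checkpoint: int, checkpoint: int,
                    scale: float = 10.0, checkpoint_bonus: float = 1.0) -> float:
    """Dense reward from race completion plus a sparse checkpoint bonus."""
    # Positive when moving forward, negative when driving backwards.
    reward = scale * (completion - prev_completion)
    if checkpoint > prev_checkpoint:
        # The checkpoint bonus still pays out where the completion value is noisy.
        reward += checkpoint_bonus
    return reward
```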
ice
Alternative title: old computer players vs a next gen computer player
I like to think this AI would have far more RGBs if it were a computer!
Thanks but now I'm no longer Toby Zilla I'm --why
only the title made me laugh
How do you do that
Quick question? Can you make an AI to beat the bots in tf2? is that possible? Like a bot hunting killer
Definitely with enough training, but may take a very long time! The typical issue with first person shooters is that there are a lot of actions the ai can do (look and move up, down, left, right, shoot, reload etc). But at some point I am looking to try an fps game
@aitango If that can be done you’ll be the community’s hope
@aitango We need our own bot to fight back, you’re our only hope
If it could just turn without drifting that would fix it
I plan on adding this to the next version!
So awesome..! Imagine Ai being able to compete with top players...
Look how bad cpus are btw, being outrunned with 30s laps on louisy circuit 🤣🤣
Yeah Mario Kart Wii AI really weren't the best haha
imagine if joe biden was just one of these AIS
now face off against the bot
Maybe if I do a Mario Kart AI finale that should be the video!
Alternatively, you could try importing a chess bot, that should cut the time down a bit
666th subscriber
this is obviously faked
wanna bet
You should really breathe more when you speak I can barely understand what you're saying through my phone speaker your voice is too raspy on the low end.
Looks like my friend trying to learn Mario Kart Wii