AI Learns to beat other AIs (Mario Kart Wii)

AI Tango

Published 22 Jun 2024
This video shows an AI learning to play Mario Kart Wii against the game's default medium AI.
For anyone who saw my first Mario Kart video and wondered why this one didn't do as well, it's because this AI trained in about a quarter of the time! My computer needed a break after the last video!
00:00 Explanation
2:19 Racing Starts
4:54 Final Agent
Gaming

COMMENTS • 149

@farazmirza6048 1 year ago ⁺⁷³²
I wonder what would happen if you implemented a small penalty for starting a drift and not getting a mini-turbo, since that seems to be a problem the AI is running into. I think that would also force it to align for turns better.
@aitango 1 year ago ⁺²⁵⁰
That’s a good idea. I actually used something similar in a previous video, where I just gave it a reward for finishing the turbo, and that seemed to help a lot. Definitely something to add again in the future!
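A minimal sketch of the reward shaping discussed above, assuming a speed-based base reward and two per-frame flags that the environment would have to supply. The names `StepInfo`, `turbo_bonus` and `wasted_drift_penalty`, and the default values, are hypothetical and not taken from the video.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    speed: float          # current speed, used here as the base reward (assumption)
    drift_ended: bool     # True on the frame a drift is released
    got_mini_turbo: bool  # True if that drift produced a mini-turbo


def shaped_reward(info: StepInfo,
                  turbo_bonus: float = 1.0,
                  wasted_drift_penalty: float = 0.2) -> float:
    """Base reward plus a bonus for a finished mini-turbo, or a small
    penalty for a drift that was released without one."""
    reward = info.speed
    if info.drift_ended:
        if info.got_mini_turbo:
            reward += turbo_bonus
        else:
            reward -= wasted_drift_penalty
    return reward
```

Keeping the penalty small relative to the bonus is a design choice: it discourages wasted drifts without making the agent afraid to start drifting at all.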
@farazmirza6048 1 year ago ⁺¹⁸
Yeah, I agree it definitely depends on what the intent behind creating the AI was. If it was only to get the best time using any resources necessary, it could simply start from the TAS run.
@ghost_ship_supreme 1 year ago ⁺³
Also maybe a bonus for hitting boost pads?
@X4R80 1 year ago ⁺⁴
@@ghost_ship_supreme boost panel = fast, and if it’s faster to take the boost panel than not to, it will probably go for it automatically
@X4R80 1 year ago ⁺³
Ohhh, I think that’s because it can’t drive right or left without drifting
@autumn4442 1 year ago ⁺⁵³
It's really interesting that the AI's solution to keeping its speed up, given that drifting reduces speed, is to alternate between drifts and wheelies, even though that costs speed in the long run and takes the turn much wider. I wonder if you could reward low lap times so that it can continue to improve even after it is able to complete laps.
@aitango 1 year ago ⁺¹³
In future Mario Kart videos this is definitely something I will add! Or at least some type of reward for getting around the track quickly, to encourage faster driving rather than just being safe.
@nintandrewgaming7573 1 year ago ⁺⁷⁴
Heck yeah! I've always wanted to see a custom Mario Kart A.I. that, instead of speeding up and getting unfair items (MKDS, I'm looking at you), takes advanced lines and properly uses items. Keep it up!
@aitango 1 year ago ⁺¹⁹
MKDS AIs were pretty dumb. I remember being confused, when I was younger, about what they were doing! Will try my best!
@deff8487 1 year ago ⁺²⁵⁰
Idk if learning to race by repeatedly starting from the same point in a race is a good learning basis. It might be getting too used to the information specific to that race and be unable to adapt to other ones
@aitango 1 year ago ⁺⁹⁶
That’s a good point. In reinforcement learning it’s quite common to have many different start locations, both to widen the experiences of the AI and to help the AI explore the environment.
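A minimal sketch of varying the start location, assuming each episode begins by loading one of several emulator save states. The file names in `START_STATES` and the helper `pick_start_state` are hypothetical; the thread does not say how the author stores or loads states.

```python
import random

# Hypothetical save-state names, one per start location on the track.
START_STATES = ["start_line.sav", "first_corner.sav", "back_straight.sav"]


def pick_start_state(states, rng):
    """Choose which save state to load at the beginning of an episode."""
    return rng.choice(states)


rng = random.Random(0)  # seeded so a training run is reproducible
episode_starts = [pick_start_state(START_STATES, rng) for _ in range(5)]
```

Sampling uniformly is the simplest choice; weighting towards sections the agent still fails on would be a possible refinement.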
@echo4086 1 year ago ⁺⁵¹
I'm surprised the agent didn't go for the boost panels more often after the early epochs. Do you think that, given more time, the algorithm would start to use the boost panels more? Were you running this through a comp server or locally? Very interesting, appreciate the nice watch!
@aitango 1 year ago ⁺¹⁴
It tried it a few times, but couldn’t seem to use them without dying, so it gave up on the idea. It’s possible it would eventually figure it out without dying, but that could take quite a while. This was all running locally, I just left my PC running overnight. Glad you enjoyed it!
@DallinBackstrom 1 year ago ⁺¹⁵
I wrote out like 7 questions about your architecture before realizing that, to properly understand the answers to those questions, I need to go read some papers to remind myself how reinforcement learning even works. Wa-hoo, into the PDFs I go. Good job, you got me to read
@aitango 1 year ago ⁺³
I guess I did a good thing lol! Feel free to ask questions if you're still stuck after reading!
@eclipse2445 1 year ago ⁺¹²
A possibly better reward system could be based on the game’s race completion and current checkpoint variables, but this is still pretty cool
@aitango 1 year ago ⁺¹
Just wait until the next video!
@RedFalconTV 1 year ago ⁺³⁷
this is sick!! cool idea!
@wclxb 1 year ago ⁺²
fancy seeing you here, falcon
@had0j 1 year ago
hi falc!
@luigimario5114 1 year ago
Hi Captain RedFalcon
@BrianXPlayz 1 year ago
!!
@PianoHypnoshroom 1 year ago
hello falc
@Ajr0616 1 year ago ⁺⁵
A good idea would be to give them a reward for fully entering each key checkpoint, in the order needed to complete the race. That’s how the AI in the game functions, to an extent. Combined with what they are already programmed to do (drive forward, turn, and wheelie to maintain speed), this would make it easy for them to learn multiple tracks consistently, with the goal of making it to the “end”, so to speak
@Newt2799 1 year ago ⁺²
This is great stuff. I would love to see more and I’m sure many others would too. Looks like you’re putting out videos at a good pace, keep it up and this channel will grow a lot for sure!
@aitango 1 year ago ⁺³
Thanks! Planning on putting out much more content so hopefully I can get a lot of people interested in AI
@andrewmccandliss399 1 year ago ⁺¹
Great video! I'm wondering, in terms of controls, why not have turning be represented by a single float value (if that is possible) that is constrained between maybe -1 and 1 (or whatever works best)? Negative values would represent left turns, and positive values right turns. Then you could feed this into the game as a joystick input. You could then make drifting a toggle, so the agent chooses to drift the corner rather than choosing to turn a lot one way or the other. I figure this reduces outputs while giving the agent a lot more control over its fine movement.
Everyone else's ideas about rewards are great, and I'm excited to see updates!
@aitango 1 year ago
This would definitely be better in terms of optimal performance, as it would allow far finer control over turning. This is definitely possible and is a great suggestion; however, it would likely increase the training time slightly, as instead of 4 different actions it effectively has infinitely many (this has been done many times, and is technically called a continuous action space). Furthermore, this would require a fair bit of coding on my part to get working, but it would be great to see what it could do!
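A minimal sketch of the continuous steering idea, assuming the agent outputs a float in [-1, 1] plus a separate drift flag, and that the emulator accepts a stick value from 0 to 255 with 128 as centre. The function names, the dictionary keys and the stick range are illustrative assumptions, not the author's interface.

```python
def to_stick(steer: float) -> int:
    """Map a steering value in [-1, 1] to a hypothetical 8-bit stick axis.

    Values outside the range are clamped, so a network output of 2.5
    behaves like full right rather than producing an invalid input.
    """
    steer = max(-1.0, min(1.0, steer))
    return round(128 + steer * 127)


def to_controller(steer: float, drift: bool) -> dict:
    """Combine the continuous steering axis with a drift toggle."""
    return {"stick_x": to_stick(steer), "drift": drift}
```

With this layout the agent picks how hard to turn and whether to drift independently, instead of choosing from a fixed menu of discrete actions.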
@YOUTY209 1 year ago ⁺²
I can’t help but think that it isn’t learning how to drive at all. Because it’s racing the same start and same track every time, it could totally ignore any visual input at all and just be memorizing inputs.
Basically it’s just learning how to speedrun Mario circuit blindfolded, and that would also explain why once it manages to get past all the other racers (which would be inserting random variance) it is able to then complete the rest of the track and multiple laps relatively easily. It’s just creating a blind TAS :P
@HomeByMidnight 1 year ago
Oh I greatly look forward to seeing this evolve!
@aitango 1 year ago
I will do my best!
@MiniMushie 1 year ago ⁺²
This got me back into making Mario Kart Wii content! Thank you :D
@aitango 1 year ago
Glad to hear it, thanks for watching!
@ryanmccartney244 1 year ago ⁺²
Very cool video! I don’t know a lot about AI training, but it seems to me that the AI is risk-averse. It seems to prefer hopping and not wheelieing when anywhere near other AIs, for fear of being wheelie bumped. That causes it to drive slower but eliminates the risk of losing most of its speed, which happened early in the training when it kept getting wheelie bumped by Peach.
@aitango 1 year ago ⁺¹
Thanks! In future versions I may prevent wheelie bumping from killing the AI, as that seemed to be problematic
@powercell_ 1 year ago ⁺²
Very cool video!
I am super interested in AI learning with games, I hope we can learn from the AI when trying to improve records.
@aitango 1 year ago ⁺¹
Thanks! I’d love to see AI helping people improve records, would be absolutely amazing!
@matthewmalone4301 1 year ago ⁺²
I might try to make my own AI with this as well. Thanks for the great inspiration!
@aitango 1 year ago ⁺¹
Go for it! Love to see what others can do
@sourlessfix9109 1 year ago ⁺²
Can't believe I'm jealous of an AI. It took me 20 hours to beat my first lap on Mario Kart :(
@MrRutrum 1 year ago ⁺²
Not sure what image processing you did for each frame, but I wonder if making the image "smaller" would help. And I don't mean the resolution of the image, but maybe the color depth. Did you use a full 256 values for grayscale? What if it were 64 or fewer? What if you ran each frame through an edge detection algorithm and got a binary image? Then maybe you could increase the resolution? I also wonder if you cropped out or simply blacked out the irrelevant data, like the time/lap numbers and the place in the race. That might help the algorithm focus on what we would expect to help it race better: the track itself. I also wonder what would happen if you put a big black box on top of Funky Kong, since again, in some ways the look of Funky Kong isn't relevant to the course. But these are all ideas to try, to see if training is quicker, perhaps.
@aitango 1 year ago ⁺³
The image processing was simple: each frame was downsized, then greyscaled with 256 values, then converted to a float. Using 64 values could work, but I am unsure if there would be consequences due to the loss of detail, as the image is already extremely low quality (less than 100x100 pixels). Edge detection could speed up the process, but could also harm the detail if done poorly. The architecture of the AI uses a convolutional neural network, which typically learns edge detection and similar things on its own anyway. Cropping is something I'm definitely considering, since it's unlikely the edges of the screen provide useful information. I'm not sure what the result of blacking out regions of the screen would be, as I haven't looked into what regions of the screen the AI is currently focusing on. If it's looking at things such as the time, it may be useful.
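A minimal, dependency-free sketch of the pipeline described above (downsize, greyscale, convert to float), with an optional `levels` argument for the reduced colour depth the commenter asks about. Nearest-neighbour sampling, the integer luma weights and the [0, 1] scaling are assumptions; the thread does not say which resize method or scaling the author used.

```python
def preprocess(frame, out_h, out_w, levels=256):
    """Downsize an RGB frame, convert it to greyscale, and scale to floats.

    frame  -- list of rows, each row a list of (r, g, b) tuples in 0..255
    levels -- number of grey levels to keep (256 keeps full depth)
    """
    in_h, in_w = len(frame), len(frame[0])
    step = 256 // levels  # width of one quantisation bucket
    out = []
    for y in range(out_h):
        src_row = frame[y * in_h // out_h]      # nearest-neighbour row
        row = []
        for x in range(out_w):
            r, g, b = src_row[x * in_w // out_w]
            grey = (299 * r + 587 * g + 114 * b) // 1000  # integer luma
            grey = (grey // step) * step        # no-op when levels == 256
            row.append(grey / 255.0)
        out.append(row)
    return out
```

In practice an array library would do this far faster; the pure-Python version is only meant to make each step explicit.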
@s.tpatrick6544 1 year ago ⁺⁷
Would there be a way of implementing the minimap as a way for the AI to recognise what's track and what's offroad?
@aitango 1 year ago ⁺⁶
So the AI can see the minimap, but I’m unsure if the AI is using it in this way. It’s possible I could make some representation of the map which the AI could easily use to distinguish off-road from the track. Definitely a good idea! My first approach to this problem (the video AI Learns Mario Kart Wii) actually used an approach where I defined the off-road, and used this information to kill the AI when going off course.
@omegahaxors3306 9 months ago
Cute how it saw it had to turn left and then made a hard right instead, because that's how it solved every other turn up to that point. 3:23
@NateM135 1 year ago ⁺¹
this is cool, it would be nice to see how this works from a more technical standpoint/see some code and implementation details as well
@aitango 1 year ago ⁺¹
I plan on making at least one quite technical video soon. Let me know what you would want to see in that video!
@General12th 1 year ago ⁺¹
This is cool!
@Imoouri 1 year ago
Nice video bro
@aitango 1 year ago
Thanks!
@leofisher407 1 year ago ⁺¹
One simple method for determining the fitness could be to just take what place it's in in the race and use that to determine fitness.
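A minimal sketch of a place-based fitness like the one suggested above, assuming the current race position can be read from the game. The function name `place_reward` and the linear scaling are illustrative choices; `num_racers` defaults to 12, the size of a full Mario Kart Wii race.

```python
def place_reward(place: int, num_racers: int = 12) -> float:
    """Map race position to a score in [0, 1]: 1.0 for first, 0.0 for last."""
    if not 1 <= place <= num_racers:
        raise ValueError("place must be between 1 and num_racers")
    return (num_racers - place) / (num_racers - 1)
```

A score like this changes only when the agent overtakes or is overtaken, so on its own it gives sparse feedback; it would likely need to be combined with a denser signal such as speed or progress.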
@kingamezz 1 year ago ⁺²
Very interesting video. I wonder if you would consider making an explanation video on how you did this? I'd love to try this out, I assume this is running in Dolphin?
@aitango 1 year ago ⁺¹
At some point I definitely will! This is running in Dolphin, however the main challenge of getting this to work was allowing the AI to access Dolphin!
@Bozozo-pt2rk 1 year ago ⁺¹
I thought it said AL, so I also thought that he made an AL that beat other ALs
@TeamTonyStudios10 1 year ago
4:58 when is Nintendo agent kong will be driving until the Smg4 court.
Glup
Btw. NICE CODING THO!
@JoeContext 1 year ago ⁺¹
This makes me wonder what an AI playing F-Zero would look like...
@aitango 1 year ago ⁺¹
It is possible to run it on Dolphin (the emulator I'm using), so if people want to see it, I will make it!
@thijmencornelissen8722 1 year ago ⁺³
This channel is soooo underrated! It's just so cool what AI can do. I tried this on Rainbow Road...
34 hours and it could do THE ULTRA!
@aitango 1 year ago
That's very interesting! What method/algorithm were you using?
@Ragemodepigeon 1 year ago ⁺⁴
I somehow doubt that.
@QuinSynchr0 1 year ago
3:00 NEFFEX - That’s what it takes
@jaybrandonmartin 1 year ago ⁺¹
I got an ad, and it was just a YouTube video from a channel that I watch
@BabanDlh 1 year ago ⁺¹
Pro AI
@Z54 1 year ago ⁺¹
nice video.
@aitango 1 year ago
Thanks
@mrmicrowave4574 1 year ago ⁺¹
It would be interesting to see how it would handle the use of items. I noticed that there were no items available during this race
@aitango 1 year ago
I thought I would keep it simple for now, but it’s definitely on my todo list if people want to see it!
@veggiet2009 1 year ago ⁺¹
Well, cool. But did it really learn racing, or did it just become overfitted to that particular track?
@aitango 1 year ago
It likely did overfit to the track, but still learnt a little about racing. If it was put on another track, it would probably struggle but learn a lot faster than when learning its first track
@rock924 1 year ago ⁺¹
What reinforcement learning algorithm did you use? I am currently working on a project using tabular Q-learning
@aitango 1 year ago ⁺²
This was using Rainbow DQN! It is DQN combined with six extensions that improve efficiency and performance. Paper is here: arxiv.org/abs/1710.02298
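One of the extensions Rainbow combines is double Q-learning, which picks the next action with the online network but scores it with the target network. The sketch below shows only that target computation on plain lists of Q-values; it is not the author's code, and the function and parameter names are chosen for illustration.

```python
def double_dqn_target(reward, done, gamma, next_q_online, next_q_target):
    """Bootstrap target for one transition under double Q-learning.

    next_q_online -- Q-values of the next state from the online network
    next_q_target -- Q-values of the next state from the target network
    """
    if done:
        return reward  # no bootstrapping past the end of an episode
    # Select the action with the online network...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ...but evaluate it with the target network to curb overestimation.
    return reward + gamma * next_q_target[best_action]
```

Plain DQN would instead take the maximum of `next_q_target` directly, which tends to overestimate values when the estimates are noisy.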
@donskelz7771 1 year ago
Maybe look into rewarding the AI for keeping the player’s icon on the road on the map? That might reduce the amount of time your program needs to complete a lap, instead of 16 hrs
@aitango 1 year ago
Speed works pretty similarly, since whenever the AI goes off-road it will get killed. Next time I may try to stop wheelie bump deaths though
@OrangeYTT 1 year ago ⁺¹
I have no understanding of how AI works / trains, but could you feed it world record videos and train it from that for a perfect run?
@aitango 1 year ago
There are a few techniques which utilise previous information, but they are fairly complex. The simplest version I could do is just to play a bit myself and let it learn from that. It's easy to do it that way, as I can collect the data in the exact form the AI needs. I imagine it would certainly speed up training and, most importantly, exploration.
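A minimal sketch of the simplest variant mentioned above, assuming human play is recorded as the same transition tuples the agent normally stores, and is loaded into the replay buffer before training starts. The tuple layout and the helper `seed_replay_buffer` are hypothetical; the author does not describe a specific method.

```python
from collections import deque


def seed_replay_buffer(demo_transitions, capacity):
    """Create a replay buffer pre-filled with recorded human transitions.

    Each transition is assumed to be
    (state, action, reward, next_state, done).
    """
    buffer = deque(maxlen=capacity)  # oldest entries drop out when full
    buffer.extend(demo_transitions)
    return buffer


# Placeholder demonstration data; real states would be processed frames.
demos = [
    ("frame0", 1, 0.5, "frame1", False),
    ("frame1", 2, 0.7, "frame2", True),
]
buffer = seed_replay_buffer(demos, capacity=10_000)
```

Because the buffer is bounded, the demonstrations are gradually replaced by the agent's own experience as training continues.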
@Leekodot15 1 year ago
I want to see a long-term form of this project.
Maybe you can change the turn rate into a floating-point variable and omit the drift turning entirely, simply replacing it with the drift button so it manually decides to drift, though that idea, unlike the variable turn rate, is entirely up to you.
This way, it can turn more precisely, and maybe you can implement a shake function as well? Also, add specific functionality for items, then start training it on grand prix. Constantly.
Eventually, maybe you can set it to train against real players??? I bet that'll be fun!
@aitango 1 year ago ⁺¹
That is an interesting idea, giving the AI more control always leads to interesting results! Items increase the complexity of the problem a lot, which is why I haven’t added them yet, but they are definitely coming soon if people are interested!
@namesurname4666 1 year ago
can't wait to see AI in online races
@aitango 1 year ago
Not sure when that'll be, but if this channel gets big enough I'll see what I can do haha
@kjn3350 1 year ago
Very interesting, and also quite a good way to represent AI learning in a simple way. I'd just like to know whether the AI would ever stop making those turbo-less drifts and the hops, because the time loss is minimal and so it doesn't contradict the core goal of the AI. A human would see that the turbo-less drifts waste time, but would the AI ever learn this unless the drifts actually cost a large amount of time?
@aitango 1 year ago
The AI might learn it, but it's hard to tell. In theory it should eventually try new actions that result in mini-turbos, get the increased reward and learn that behaviour, but in practice it's unlikely. In future Mario Kart videos I will likely reward the AI for getting the turbo, making it much more significant
@RealizedTruth20 1 year ago ⁺¹
Wtf, this is so crazy
@Anthestudios 1 year ago ⁺¹
Is your AI using the minimap? Does it generalise to other courses? Can you share your code? I'm interested!
@aitango 1 year ago ⁺²
It has access to the minimap, but I do not know how much it uses it. If you are interested in generalisation, check out my previous video, "Mario Kart Wii but it's an AI", where I train an AI on 10 different courses at once! A code reveal is coming at some point, but will take a while to clean up. I'm thinking of doing it as a 1000 subscriber special (if I reach that amount!)
@Anthestudios 1 year ago ⁺¹
@@aitango Stacksmashing did a Mario Kart 64 AI video and discovered there that the AI would indeed use the minimap. I don't see any reason why your CNN (I guess?) wouldn't do the same thing. I'll definitely check out your other videos, and I'll be revving up my GPU for when I can train the model myself :).
@Blank-yj9ji 1 year ago ⁺¹
If the AI just learns how to do one course, how would it learn how to do the other courses? What I think you should have done (idk if it is possible) is make it so that the AI spawns at a random position in a random map and, instead of giving it points for speed, give it points for passing checkpoints. But again, I don't know if it is possible
@aitango 1 year ago
It would learn other tracks in the same way, but it just takes a lot longer since it needs to learn about the different colours/textures in the other tracks. I’ve done a video on something very close to this though, check out “Mario Kart Wii but it’s an AI”
@ModBros8434 1 year ago
This is very nice. Do you have any plans to make the code public?
@aitango 1 year ago
I'm considering doing it as a 1000 subscriber special, along with a video or two explaining the details of how it works!
@ModBros8434 1 year ago
@@aitango Cool. You just got a little closer
@matthewcrowley1853 1 year ago ⁺¹
What’s the variability of the AI learning? Is it bound to make the same mistakes eventually with the parameters put in place, or does different information reshape the outcome every time you restart the learning process?
@aitango 1 year ago ⁺¹
These AIs have a lot of variability, which is both a blessing and a curse! All AIs will have slightly different tendencies and biases, so you never really know what you’ll get!
@bull1085 1 year ago
I could think of two ways to reward and punish:
Reward for getting 1st place, which gradually increases while in first place, but penalize losing that place by reducing the reward.
Reward for any sort of interaction that would place the opponent behind them, and keep them behind
I strive to be a software engineer, so I need to figure out how to do this stuff. Yet my brain just fries while thinking about it. How do you even create AI? I have high interest in the concept, but no idea where to begin, especially with it adapting the layers. The input/output is fine, but the layers remind me how far I have to go.
@aitango 1 year ago
Interesting ideas for sure! I hope you can work towards becoming a software engineer, programming and AI is something I do both for fun and for work! I recommend the channel Deeplizard, as they were the channel that gave me much of what I know about deep learning, and they also have a playlist on deep reinforcement learning
@Eldoofus 1 year ago
This one is a no-brainer, but I think you should give the AI rewards for going at top speeds
Maybe also give negative rewards if their position on the map (X, Y, and Z) changes less than it should when going at high speed, to encourage going forward as much as possible, while also rewarding staying around the middle of the track, where it's harder to bump into walls and such.
@aitango 1 year ago ⁺¹
Going at top speeds is an idea I've been considering to encourage things like boost panels and mini-turbos, as it seems to have trouble with these for sure. Mario Kart also contains a "race completion" variable which I may use, as this should be an effective way to make sure it goes fast in the right direction, rather than just going fast.
@cocciclaque9084 1 year ago
If the AI was injected instead of emulated, you could've used the in-game invisible checkpoint flags that determine your respawn point, as they're really good at tracking where you are on the map relative to the start, and their IDs in the files are really easy to access
@aitango 1 year ago
Yeah true, I've been debating using the race completion variable vs the checkpoints. Race completion is nice as it gives constant information, but the checkpoints are more accurate
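A minimal sketch of how the two signals discussed above might be combined, assuming the game exposes a continuously increasing race completion value and an integer checkpoint index. The function name, the parameter names and the bonus size are hypothetical, and checkpoint wraparound at the end of a lap is deliberately ignored.

```python
def progress_reward(prev_completion: float, completion: float,
                    prev_checkpoint: int, checkpoint: int,
                    checkpoint_bonus: float = 0.5) -> float:
    """Dense reward from race completion plus a sparse checkpoint bonus.

    The completion delta is negative when the kart drives backwards,
    so reversing is penalised without any special case.
    """
    reward = completion - prev_completion
    if checkpoint > prev_checkpoint:  # lap wraparound not handled here
        reward += checkpoint_bonus
    return reward
```

The completion delta supplies feedback on every frame, while the checkpoint bonus acts as a more trustworthy confirmation that real progress was made.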
@thefancydoge8668 1 year ago
I know devs have time constraints, but it would be cool if AI actually went through a process like this, giving them actual skill instead of just having rubber banding to make the game more difficult.
@aitango 1 year ago ⁺¹
If there was a company that wanted to do that, I would be happy to assist! It would be much cooler than current approaches
@AxidoDE 1 year ago
16 hours of basically just trial and error. And the progress won't even transfer over to another track. However, what might be even more interesting is whether it would "learn" similarly fast without any visual input at all.
@aitango 1 year ago
Reinforcement learning does require trial and error, however it could not learn without visual input. This AI has no memory at all, so it isn’t just memorising inputs. As other comments have rightly suggested, it may be overfitting, which is where its controller inputs are tied to very specific visual inputs, rather than it being able to come up with a good action for any input
@mechaboy95 1 year ago ⁺¹
what happens if we introduce tricking, items and new tracks
i think instead of feeding it an image, you need to feed it an image without a HUD, plus the HUD-specific things such as minimap info and items
@aitango 1 year ago ⁺²
Tricking is actually already in there, there's just nothing to trick off without a mushroom on Luigi Circuit. I considered adding items, but it brought up some problems. As I'm currently using the speed to detect when to reset the AI, I would have to change this, as getting hit by an item would cause the AI to get reset. Definitely possible, it would just mean I would need to change my way of detecting death. For more tracks, check out my video Mario Kart Wii but it's an AI! I could potentially crop the image to only include the most relevant things
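A minimal sketch of speed-based reset detection as described above. The `patience` window, which requires the speed to stay low for several consecutive frames, is an added assumption rather than something the author describes; it might be one way to tolerate brief slowdowns, though a long item-hit animation could still outlast it.

```python
class StallDetector:
    """Signal a reset once speed stays below a threshold for too long."""

    def __init__(self, min_speed: float, patience: int):
        self.min_speed = min_speed  # below this the kart counts as stalled
        self.patience = patience    # consecutive slow frames before a reset
        self.slow_frames = 0

    def update(self, speed: float) -> bool:
        """Feed the current speed; return True when the episode should reset."""
        if speed < self.min_speed:
            self.slow_frames += 1
        else:
            self.slow_frames = 0    # any fast frame clears the counter
        return self.slow_frames >= self.patience
```

For example, with `StallDetector(min_speed=10.0, patience=3)` a single slow frame after a bump does not end the episode, but three in a row do.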
@sushi_guylol2644 1 year ago ⁺¹
So cool 😮
@aitango 1 year ago ⁺¹
Glad you think so!
@wawiwuwewo69420 1 year ago ⁺¹
wow
@X4R80 1 year ago ⁺²
Lmao love how it was just spamming drift all the time
How did it not find out how to supergrind tho🤔🤔🤔
@aitango 1 year ago
That would’ve been so cool if it did
@azura8596 1 year ago
I wouldn't suggest terminating a run when the AI hits the wall, because for some shortcuts or other stuff it may be useful. I'd say let it continue runs but with a negative reward.
Also, to avoid overfitting to that particular track, change the track and the location of the CPUs and of the AI (basically make it so it always lands in random places)
@aitango 1 year ago
Ideally I would like to remove the resetting, but it just slows down the early stages of training far too much. For the first 25% of training, the AI would literally just be driving into walls and collecting non-useful information, and likely wouldn't reach the point of proper driving, at least not for a very long time. Later on in training it is beneficial for the reasons you mentioned, so it's definitely something to consider
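A minimal sketch of one way to reconcile the two positions above: reset on a wall hit during an early phase of training, then switch to a negative reward and let the run continue. This schedule is an illustration only, not something either commenter spelled out; `reset_fraction` borrows the 25% figure from the reply, and the penalty size is arbitrary.

```python
def on_wall_hit(step: int, total_steps: int,
                reset_fraction: float = 0.25,
                penalty: float = -1.0):
    """Decide what a wall hit does at a given point in training.

    Returns (done, reward_adjustment).
    """
    if step < reset_fraction * total_steps:
        return True, 0.0     # early training: end the episode immediately
    return False, penalty    # later: keep driving, but pay a penalty
```

Once resets are switched off, the agent can in principle discover lines that clip a wall on purpose, at the cost of slower episodes.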
@TrevorRS05 1 year ago ⁺¹
so this is how t1 players learn to drive?
@jschmidty2332 1 year ago
So it's sort of like a TAS that writes itself
@lionkingmerlin 1 year ago
Pretty cool, but it won't work that well if you stop using the save state and start a new race.
@QuinSynchr0 1 year ago
NEFFEX music I hear? Hmyes. Perfect.
@aitango 1 year ago ⁺¹
Gotta love some NEFFEX
@QuinSynchr0 1 year ago
@@aitango 100%
@cxdronemaster1593 1 year ago ⁺¹
Ngl your voice sounds like octopaul
@nataly_171 1 year ago
If you can let the AI read variables from the game itself, I know from shortcut videos that Mario Kart Wii has a "race progress" variable that tracks as a number how far into the track you are, so maybe you could use that as a reward for the AI? This would not only prevent it from driving in circles or driving backwards, but would also incentivize it to take better racing lines.
@aitango 1 year ago
I actually used that variable in one of my other videos (AI Learns to Master Mario Kart Wii)! It worked pretty well, however I think a hybrid approach may be best, as it seemed to be a little inaccurate at times. For example, Ghost Valley 2 had some terrible accuracy issues with that variable. Maybe using both can get the best of both worlds though
@swisspissman8455 1 year ago ⁺¹
ice
@tobyzilla 1 year ago
Alternative title: old computer players vs a next-gen computer player
@aitango 1 year ago
I like to think this AI would have far more RGBs if it were a computer!
@tobyzilla 1 year ago
Thanks but now I'm no longer Toby Zilla I'm --why
@tierfreundeclub5404 1 year ago
only the title made me laugh
@alienespinel 1 year ago
How do you do that
@arconisthewolf4430 1 year ago
Quick question: can you make an AI to beat the bots in TF2? Is that possible? Like a bot-hunting killer
@aitango 1 year ago
Definitely with enough training, but it may take a very long time! The typical issue with first-person shooters is that there are a lot of actions the AI can do (look and move up, down, left, right, shoot, reload etc). But at some point I am looking to try an FPS game
@arconisthewolf4430 1 year ago
@@aitango If that can be done you’ll be the community’s hope
@arconisthewolf4430 1 year ago
@@aitango We need our own bot to fight back, you’re our only hope
@arzfan29 1 year ago
If it could just turn without drifting that would fix it
@aitango 1 year ago
I plan on adding this to the next version!
@kimmandu6907 1 year ago ⁺¹
So awesome..! Imagine AI being able to compete with top players...
Look how bad the CPUs are btw, being outrun with 30s laps on Luigi Circuit 🤣🤣
@aitango 1 year ago ⁺¹
Yeah, the Mario Kart Wii AIs really weren't the best haha
@ZacharypjgStudios 1 year ago ⁺³
imagine if joe biden was just one of these AIS
@parrokeet9877 1 year ago
now face off against the bot
@aitango 1 year ago ⁺¹
Maybe if I do a Mario Kart AI finale that should be the video!
@armstronggaming8566 1 year ago
Alternatively, you could try importing a chess bot, that should cut the time down a bit
@MrBanana-le3db 1 year ago
666th subscriber
@xichil_ 1 year ago ⁺¹
this is obviously faked
@aitango 1 year ago
wanna bet
@spirits1595 1 year ago
You should really breathe more when you speak. I can barely understand what you're saying through my phone speaker, your voice is too raspy on the low end.
@CodyTheBlackChickenSubscribe 1 year ago
Looks like my friend trying to learn Mario Kart Wii
