AI Learns to Outrun Police Officers

cozmouz

Published 24 Nov 2024

COMMENTS • 432

@SinnaMon-s9p 11 months ago ⁺²³¹¹
so basically the equivalent of putting a baby in a timeloop and teaching it to steal.... i approve
@stoobidthing 11 months ago ⁺¹¹³
"1 hour here is 7 years on earth"
@rennoc6478 11 months ago
And pumping dopamine into it every time it succeeds
@ilyysm 11 months ago ⁺¹¹
@stoobidthing 😭
@SirNob 10 months ago ⁺⁴
Why is there no comment here
@rennoc6478 10 months ago ⁺²
@SirNob there's 3, now 4
@anador1877 11 months ago ⁺³⁶⁷⁷
I think there should have been a negative reward for jumping off too. It was clearly a preferable strategy to risking touching police officers, especially before discovering that coins give rewards or when the AI thought there was no way to get coins.
@cozmouz 11 months ago ⁺¹⁵³¹
Letting you in on a lil secret. I did code a negative reward for falling off the map, or let's say at least I tried to 🙂. However, after 4 days of numerous repeated training sessions, for the life of me, the implementation wasn't working. I knew things would work just fine without the falling penalty, at the expense of increased training time, so dats the way we went.
@toasterhavingabath6980 11 months ago ⁺²⁸¹
Walls.
@Hunter57588 11 months ago ⁺²
@toasterhavingabath6980
Get this man a job at NASA
@winnerwannabe9868 11 months ago
@cozmouz walls tagged as police that kill on contact?
@skull_lee 11 months ago ⁺⁷⁶
@toasterhavingabath6980 hog rida
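A minimal sketch of the falling penalty discussed in this thread, to make the idea concrete. It is not the video's actual code: the reward values, the `FALL_HEIGHT` threshold, and the `step_reward` function are all assumptions chosen for illustration.

```python
from typing import Tuple

# All constants below are hypothetical; the video does not state the real values.
COIN_REWARD = 1.0      # reward for picking up a coin
CAUGHT_PENALTY = -1.0  # penalty for touching an officer
FALL_PENALTY = -1.0    # penalty for falling off the map
FALL_HEIGHT = -5.0     # assumed y-coordinate below which the agent has fallen


def step_reward(collected_coin: bool, caught: bool, agent_y: float) -> Tuple[float, bool]:
    """Return (reward, episode_done) for a single simulation step."""
    if agent_y < FALL_HEIGHT:
        # Falling ends the episode and is punished like being caught,
        # so jumping off is no longer a "free" escape from the officers.
        return FALL_PENALTY, True
    if caught:
        return CAUGHT_PENALTY, True
    if collected_coin:
        return COIN_REWARD, False
    return 0.0, False
```

With the fall penalty equal to the capture penalty, jumping off the edge stops being cheaper than risking contact with an officer, which is the behavior the original comment objected to.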
@fureyXD 11 months ago ⁺¹³⁰⁶
You should have added a negative reward for getting seen by the police, that way Loki would sneak around them instead of speedrunning through them. Maybe a level where the police couldn't be outrun could have helped
@cozmouz 11 months ago ⁺⁴⁶³
If I added the police officers' activation radius as an input for the AI, it's plausible the AI would've learned to sneak around them by not entering the radius. Thanks for the idea!
@TheEggDev 11 months ago ⁺¹³⁷
A negative reward might’ve been too much, as it could lead to a local maximum where the AI thinks going through police only leads to a lower score, unaware that if it sacrificed a little score it could reach more coins. This could lead the AI to get stuck not going past the first police officer, if not getting seen was impossible
@Mcervera 10 months ago ⁺¹⁰
@cozmouz but what about levels where you have to, like level 2
@cozmouz 10 months ago ⁺³⁵
@Mcervera In machine learning, after a series of experiments, one realizes that there is no "have to". There are many possible inputs I could've added, many different elements I could've implemented. It's impossible to implement everything! Implementing the baseline requirements to get the thing working was the idea behind this video. In level 2, Loki learned to maneuver around the officers regardless of whether the activation radius was an input or not.
@medievalcatguy6776 8 months ago ⁺¹
You should have made jumping into the void a negative reward @cozmouz
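One way the activation radius mentioned above could be fed to the agent is as a single extra observation per officer. This is only a sketch under assumptions: the function name `police_radius_feature`, the 2D ground-plane coordinates, and the normalization are all hypothetical and not taken from the video.

```python
import math


def police_radius_feature(agent_xz, officer_xz, activation_radius):
    """Signed margin to an officer's activation circle, in units of the radius.

    Positive: the agent is outside the circle (unnoticed).
    Zero:     the agent is exactly on the edge.
    Negative: the agent is inside the circle and the officer would react.
    """
    distance = math.dist(agent_xz, officer_xz)
    return (distance - activation_radius) / activation_radius


# Example: officer at (3, 4) with radius 5, agent at the origin is exactly on the edge.
edge_margin = police_radius_feature((0.0, 0.0), (3.0, 4.0), 5.0)
```

A feature like this gives the policy a direct signal for "how close am I to being noticed", which the raycast distances alone do not encode.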
@flamingfox2984 11 months ago ⁺³⁷²
Next Video: “AI Learns Tax Evasion”
@SovietComrade6675 7 months ago ⁺²
yes
@kitkat2849-b3h 11 months ago ⁺¹⁹⁴
i guess you could say he's _lowkey_ a fast learner
@cozmouz 11 months ago ⁺²⁹
nice one!
@DuOpig 8 months ago ⁺⁷
Idk. 3.5 million tries... Wonder how many controllers Loki broke trying to beat these levels
@Incog-0000 3 months ago ⁺¹
Yup.
@Benw8888 11 months ago ⁺²¹¹
The problem with videos like this is that the AI can overfit to a specific map. You need to have some sort of shuffled dataset or randomly generated sequence of maps/coin arrangements for proper training.
@AceTheAro7 11 months ago ⁺²⁴
I remember seeing a video that trained a Trackmania bot to solve a maze, and they tried to fix it by having the car spawn randomly in the maze once that issue became the bottleneck
@o1-preview 11 months ago ⁺¹
@AceTheAro7 sounds expensive
@arcadesmasher 10 months ago ⁺⁵
Yea, this video's coding seems good enough to mostly prevent that though. Notice how the AI doesn’t always take the same path in different levels. I have definitely seen videos where this does happen though.
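The randomization suggested in this thread can be sketched as drawing a fresh spawn point, coin arrangement, and officer placement at the start of every training episode. The layout format, map size, and counts below are assumptions for illustration; nothing here reflects how the video's levels were actually built.

```python
import random


def sample_layout(rng, map_size=20.0, n_coins=10, n_officers=3):
    """Draw one random episode layout on a square map (all sizes are assumed)."""

    def random_point():
        return (rng.uniform(0.0, map_size), rng.uniform(0.0, map_size))

    return {
        "spawn": random_point(),
        "coins": [random_point() for _ in range(n_coins)],
        "officers": [random_point() for _ in range(n_officers)],
    }


# A seeded generator keeps training runs reproducible while still
# giving the agent a different layout every episode.
rng = random.Random(0)
layouts = [sample_layout(rng) for _ in range(3)]
```

Because the agent never sees the same arrangement twice, memorizing one route stops paying off and it has to rely on what its sensors report.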
@grandpretredesalpagas4665 11 months ago ⁺³⁰⁹
if the police start using robot dogs, we will start making robot cat robbers
@cozmouz 11 months ago ⁺⁴²
that would be Purr-fect
@redstocat5455 11 months ago ⁺¹⁶
That would be funny, cats are perfect for this
@thebooknerd5223 10 months ago ⁺⁷
A cat burglar, if you will.
@uncommonusername 10 months ago ⁺³
They're really sneaky, i think it'd work imo. I'm a cat owner so I'd know.
@Strong256 7 months ago
@thebooknerd5223 Nami 😂
@Pasu4 11 months ago ⁺¹⁹⁰
11:17 I think this happens because the AI only learned to effectively collect coins in the one direction, or gets confused by there being no police to dodge.
AI is not that good at changing its perspective, since it has no real correlation between x, y and z. It doesn't know that they are just sides of the same coin, it only knows what outputs will change them individually.
I saw a video of a table tennis AI that worked great for one player, but once they spun it around for the second player, it just fell over, because it only learned to stay upright while looking in one direction. Their solution was to rotate the coordinate system with it (rotating a parent object and using local coordinates probably).
I think something similar may work here too, by changing Loki's sensors to be relative to his orientation, thereby eliminating the need to correlate different axes (unless you are already doing that).
@cozmouz 11 months ago ⁺³²
Amazing explanation!
@silasnebulous4533 11 months ago ⁺⁶
I think it was just hasty getting to the coins above and didn't bother moving a little bit to get the coins leading up to it.
It's shown to be able to turn to pick up coins before, so idk.
@draketurtle4169 11 months ago ⁺⁶
@silasnebulous4533 yeah seemed more like it detected more coins further ahead and therefore decided to ignore ones immediately ahead for a bigger long term payoff (also cause they were away from the bad thing)
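The orientation-relative sensing proposed at the top of this thread amounts to rotating world-frame offsets into the agent's own frame. The sketch below works in 2D on the ground plane; the function name and the angle convention (yaw in radians, counterclockwise from the +x axis) are assumptions, and whether the video already does something like this is not stated.

```python
import math


def to_local_frame(agent_pos, agent_heading, target_pos):
    """Express target_pos relative to the agent as (forward, left) offsets.

    agent_heading is the agent's yaw in radians, measured counterclockwise
    from the world +x axis (an assumed convention).
    """
    dx = target_pos[0] - agent_pos[0]
    dy = target_pos[1] - agent_pos[1]
    cos_h, sin_h = math.cos(agent_heading), math.sin(agent_heading)
    forward = dx * cos_h + dy * sin_h   # projection onto the facing direction
    left = -dx * sin_h + dy * cos_h     # projection onto the agent's left
    return forward, left


# A coin one unit straight ahead looks the same whichever way the agent faces.
facing_x = to_local_frame((0.0, 0.0), 0.0, (1.0, 0.0))
facing_y = to_local_frame((0.0, 0.0), math.pi / 2, (0.0, 1.0))
```

Both calls describe "a coin one unit in front of me", so the policy only has to learn that situation once instead of once per world direction.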
@tach5884 11 months ago ⁺⁸²
"That's her officers! That's the woman who programmed me for evil!" - Bender
@Golden_Projects 11 months ago ⁺⁴⁶
if you increased the reward from coins by dividing it by the amount of time from the last coin (less time more reward) you'd also make it so that he doesn't skip nearby coins too often, but it would also result in more speedrun-ish behavior
@cozmouz 11 months ago ⁺¹³
That's a great recommendation. More complex reward functions are something I will implement in the coming videos. Stay tuned!
@Strong256 7 months ago
wow nice idea i hope i remember this too in the future
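The time-scaled coin reward suggested above can be written in a couple of lines. The function name, the base reward, and the `min_seconds` floor are assumptions; the floor is added here only so that two coins picked up almost simultaneously do not produce an enormous reward.

```python
def timed_coin_reward(base_reward, seconds_since_last_coin, min_seconds=0.5):
    """Coin reward divided by the time since the previous coin (less time, more reward)."""
    # Clamp the elapsed time so the division cannot blow up near zero.
    elapsed = max(seconds_since_last_coin, min_seconds)
    return base_reward / elapsed


slow_pickup = timed_coin_reward(1.0, 2.0)   # coin reached 2 s after the last one
fast_pickup = timed_coin_reward(1.0, 0.1)   # clamped to the 0.5 s floor
```

Under this scheme a nearby coin is worth more than a distant one, which is what discourages skipping coins and also what pushes the agent toward speedrun-like behavior.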
@forcelightningcable9639 11 months ago ⁺¹⁰⁴⁰
I like how Loki figured that it’s better to die than get caught by the pigs
@generaldelasmontanas2699 11 months ago ⁺⁹⁴
he knew he was going to drop the soap
@forcelightningcable9639 11 months ago ⁺¹⁰
@generaldelasmontanas2699 lmaoo
@moodlethenoodle 10 months ago ⁺¹⁴
Are you an anarchist
@moodlethenoodle 10 months ago
@undefinedchannel9916 Why? Without cops we'd have anarchy... so they must be an anarchist?
@cjharrisson7522 10 months ago
@undefinedchannel9916 pigs aren’t also known as cops. Cops are known as pigs.
@redstonewolfx 11 months ago ⁺¹⁸⁶
You might want to add a very small negative reward that accumulates over time, and/or a time limit, so Loki is encouraged to pick up the pace. He might also be less scared of the police, as the penalty for meandering aimlessly will eventually be worse than just running for it.
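A rough sketch of the per-step penalty and time limit described in the comment above. Every constant is an assumption picked to make the arithmetic easy to follow, not a value from the video.

```python
STEP_PENALTY = -0.001   # assumed small negative reward per simulation step
CAUGHT_PENALTY = -1.0   # assumed penalty for touching an officer
MAX_STEPS = 5000        # assumed time limit per episode


def time_pressure(step_index):
    """Return (reward, episode_done) contributed by the clock alone."""
    done = step_index + 1 >= MAX_STEPS
    return STEP_PENALTY, done


def steps_until_idling_costs_more_than_capture():
    """Number of idle steps whose accumulated penalty equals one capture."""
    return round(CAUGHT_PENALTY / STEP_PENALTY)
```

With these numbers, 1000 steps of aimless wandering cost as much as being caught once, so past that point "just running for it" becomes the better gamble, as the comment predicts.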
@kitsunemusicisfire 9 months ago ⁺¹⁵
Loki isn't evil he's just a silly guy
@3emad.305 11 months ago ⁺⁵⁴
Programmers already teach AI how to do crimes. Perfect for our Sci fi apocalyptic fantasy doom.
@Danjor0613 8 months ago ⁺²
In Mass Effect 1 someone created an illegal AI to steal money from gambling machines. When caught it self destructed to try and kill you along with itself rather than be shut down.
@Digby8 11 months ago ⁺²⁰⁵
Maybe we shouldn't be teaching AI to break the law, maybe that's just me.
@beywheelzhater8930 11 months ago ⁺⁵¹
Yes this definitely applies to actual irl crime. I love it when police digitize themselves to charge at rectangles
@karetsin8700 11 months ago ⁺⁵
just maybeeeee
@TurbopropPuppy 11 months ago ⁺⁷
what do you mean, this AI is based?
@joeljude9180 11 months ago
It's just flavor
@Dr3wskee14 10 months ago ⁺¹
@JulieGallows are you stupid? It's in the name of the video 😂
@a.j.outlaster1222 11 months ago ⁺¹²¹
This is cool, but wouldn't the A.I. learn more effectively if the levels scaled more slowly in difficulty and repeated the same sort of scenarios?
Idk, this just seemed to scale at a rate that's fine for players but maybe staggering for an A.I.
@cozmouz 11 months ago ⁺⁶⁰
Sir, you are absolutely right. Gradual scaling in difficulty would've resulted in more thorough learning.
@a.j.outlaster1222 11 months ago ⁺⁸
@cozmouz Btw, what were the inputs?
I mean, were the cops and rewards registered separately from the walls?
Or was there like a separate input that changed based on what it hit?
@cozmouz 11 months ago ⁺¹¹
The 360-degree raycast is the main input source for the AI. It's like lasers being fired in all directions and waiting to hit something. If the AI hits a cop and gets a negative reward, then over time, whenever the raycast beams hit anything tagged "police", the AI will try to avoid that area. If a raycast hits a wall: this is something I can stand on and jump over! That's how it works, basically.
@thomasb6434 11 months ago ⁺²
@cozmouz So, the AI knows in which direction "things" are, but not at which distance?
@cozmouz 11 months ago ⁺³
It knows direction as well as distance.
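To make the raycast description above concrete, here is a hedged sketch of how a ring of rays could be turned into a flat observation carrying direction, tag, and distance. The tag list, ray count, maximum range, and the `encode_rays` function are all assumptions; the video's actual encoding is not shown in the comments.

```python
import math

TAGS = ["wall", "police", "coin"]  # assumed set of tags a ray can report
MAX_RANGE = 10.0                   # assumed maximum ray length


def ray_angles(n_rays):
    """Evenly spaced ray directions covering 360 degrees, in radians."""
    return [2.0 * math.pi * i / n_rays for i in range(n_rays)]


def encode_rays(hits, n_rays=8):
    """Flatten ray results into one observation list.

    hits maps a ray index to (tag, distance); rays that hit nothing are absent.
    Each ray contributes a one-hot tag vector plus a normalized distance,
    and the position in the list tells the network which direction it is.
    """
    observation = []
    for i in range(n_rays):
        tag, distance = hits.get(i, (None, MAX_RANGE))
        one_hot = [1.0 if tag == known else 0.0 for known in TAGS]
        observation.extend(one_hot + [min(distance, MAX_RANGE) / MAX_RANGE])
    return observation


# Two rays: the first sees an officer 5 units away, the second sees nothing.
example = encode_rays({0: ("police", 5.0)}, n_rays=2)
```

Separate one-hot slots per tag are one way to let the network tell cops, coins, and walls apart, which is what the question in this thread was asking about.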
@L-iv6lx 11 months ago ⁺⁵
"started to associate negatives with something tagged as police"
it started using twitter
@corruptedmineral 11 months ago ⁺²⁸
damn i can finally create army of ai thief with ability to escape on its own
@InksAutism 11 months ago ⁺¹¹
He kept getting caught when teasing the cops
@SolomonFinney 10 months ago ⁺⁴
It’s so good that you finally have recognition for this.
@cozmouz 10 months ago
Ayyy I remember ur comment from the basim video, thanks a lot man.
@SolomonFinney 10 months ago ⁺¹
Ye @cozmouz
@adriantcullysover4640 11 months ago ⁺²⁴
Although a 6 year old (i.e. my younger siblings) can finish these levels with wayyyyyy fewer tries, this is still so impressive from something with no consciousness.
@xxxD3FC0N_1xxx 11 months ago ⁺⁸
that’s the point it’s a learning AI it’s not supposed to get it right the first time eventually it would be better than the best human player
@ethantasti2521 11 months ago ⁺¹⁰
@xxxD3FC0N_1xxx actually no it wouldn't be better than a human. Change the map or enemy slightly and the AI would crumble. considering it only went for safe route a human would be faster and would take less time completing this
@eldritchcupcakes3195 11 months ago ⁺¹
@xxxD3FC0N_1xxx actually no! If you changed anything major about say, the map at 8:50 Loki would freak out and take millions of tries to figure it out again. It could get very good at this specific map but nothing else. It can’t figure out how to apply the “knowledge” from this to a changed terrain. It just eventually figured out “these motions get me positive rewards and avoid the negative”.
@Данилтычкрейзи 10 months ago ⁺¹
@eldritchcupcakes3195 that's called overfitting, usually AI is trained on a lot of different data to prevent this
@sirk603 10 months ago
@ethantasti2521 if you trained it on a wide variety of different maps, it would probably become much better than any person
@JDRed117 11 months ago ⁺⁸
1:30 ULTIMATE SPASM GO
@Recodetfort0 10 months ago ⁺¹
I love how just at the end of his journey he doesn't just collect all the reward, he also jumps around, which looks like he really does have consciousness and is happy to see so much reward! It looks really nice and interesting)
@pietrobarbosa2464 5 months ago ⁺¹
Ok so basically training an AI is like beating ur kid if it doesnt bring you beer and giving it candy if it does
@FlaiseSaffron 11 months ago ⁺⁴
In this video: programmer explains criminal psychology without realizing it.
@davidaugustofc2574 11 months ago ⁺⁹
Loki is Low-key one of the AIs of all time
@goobinroblox12 10 months ago ⁺⁵
bros learning ai to evade taxes 💀
@Insanity-m3c 7 months ago
Teaching?
@FriarJoe66 8 months ago ⁺²
I think there should be a negative reward for coming into close proximity of a coin and then leaving proximity without collecting it.
@Commenter_101 11 months ago ⁺⁹
I love how patient you are 😊
@cozmouz 11 months ago ⁺³
Thanks !
@aurnok1237 11 months ago ⁺⁶
When Loki moves randomly he kinda looks like a speedrunner lol
@YuriGen2423 11 months ago ⁺²²
Very good work! also you could include what the AI is receiving as input too
@FireyDeath4 6 months ago ⁺¹
Infantile robber tries to steal drugs in broad daylight while avoiding police officers: visualised
I wonder if you can make it solely focus on the positive rewards of the coins and learn that obstacles are naturally detrimental because of the way they prevent the collection of coins
@matt.stevick 3 months ago
I have experience training an early AI / LLM (me along with many other associates) at a large wealth management firm, starting in 2009 and leaving in 2015. It was not at all a primary focus or task we had to do, but very simply … we did it voluntarily when we had time. This video is a very good explanation to people new to AI on how it works in general, for such a complex area of study.
@FileXocelot 11 months ago ⁺³
There should have been a boing sound when he jumps
@cozmouz 11 months ago
Man I had this exact intrusive thought when I was programming Loki LOL, but it would've made the audio chaotic so I ditched the idea.
@henrycrystal9740 4 months ago ⁺¹
next video: " AI cops learn """pattern recognition""" "
@mutantdog 10 months ago ⁺³
next video: i reprogrammed elons self driving cars to outrun police cars
@kuroyami9757 4 months ago ⁺¹
5:35 So like people
@ninrts 10 months ago ⁺²
i love how at 6:49 it almost looks like he's taunting the cop lmao
@FurryNonsense 9 months ago ⁺¹
The music is too loud
@binguri-e7l 10 months ago ⁺²
next video: AI Learns To Evade Taxes
@PierreLucSex 5 months ago
This is the easy mode. The police just rewards you less
@zellenny1784 11 months ago ⁺⁵
Cool Video! Would love more of an end goal to it though..
@someangrypotato7197 10 months ago ⁺¹
This was really cool to watch! I wonder how it would go if you made a city for Loki to run from police in. It’d be interesting seeing if Loki develops an optimal route to go.
@burridi 5 months ago ⁺¹
Is that the cat ninja music???? I loved that flash game so much as a child
@CrazedKen 11 months ago ⁺²
Well
Well
Well
Nice! I just came back after a day and he’s at 1k, keep it up!👍
@plasmaflare5217 10 months ago ⁺¹
Reinforcement learning is such a cool concept. It just learns things by trial and error, just like people do.
@Somebody0960 11 months ago ⁺²
I’m going to use this knowledge to get away with violent crimes
@kyleyoung2464 10 months ago ⁺¹
Now optimize him
@oliverthesupercoolbully 5 months ago ⁺¹
welcome to our new friend loki! :DDDDDDDD
@Kiwi_Inventor 10 months ago ⁺¹
do part two but loki has to learn how to use legs
@Evaboii11 11 months ago ⁺⁴
200th subscriber here you have earned a sub keep up the work bro 💯💪🙏
@cozmouz 11 months ago
You are a Legend, Thanks a Lot 😎👊
@patchinator6 1 year ago ⁺⁵
Earned my sub! Keep it up!
@cozmouz 11 months ago ⁺¹
Sir, you are a legend. Thanks a ton.
@NunyaBizniz-om6xf 11 months ago ⁺²
I know nothing about AI but i think scaling the reward function of coins depending on the closeness of police could encourage riskier behaviour, as long as contacting the police shortly after would remove those bonuses. or it could be fun to see what the ai does without that safeguard
@aaronpark686 10 months ago ⁺¹
He is so happy at the end lol
@simply_oat755 10 months ago ⁺¹
shoulda added a reward for going closer towards the coins and for going faster (and negative reward for going slower aka jumping around)
@GunSpyEnthusiast 11 months ago ⁺²
this video was recommended to me, most likely by an algorithm.
I am now scared.
@spacer7205 8 months ago
cool project! i think the jumping behaviour observed is a result of the raycasts being centred on the character's body; the AI is initiating a jump because it causes the rays to jump with it, which means they don't hit the chasers, so the AI associates jumping with that positive outcome. might be worth only associating being caught with a negative reward for more distinct emergent behaviour
@monkeysarestinky3106 10 months ago ⁺²
You should teach the cops to catch the robber now
@DarkLynel 8 months ago ⁺¹
What if the police is also ai
@lawden210 11 months ago ⁺²
Lupinranger Vs Patranger lookin different
@themarkerchannel3170 10 months ago ⁺¹
This is so underrated, great job! Also, if possible, can you do a tutorial on how to make these?
@jakub_noj 10 months ago ⁺²
Loki is goated fr
@moodl3d856 11 months ago ⁺⁴
"ai will be used to help people!"
ai:
@davidgeinoz2277 11 months ago ⁺⁵
Really like this one keep going 👍
@add7231 11 months ago ⁺⁴
i feel like i just went through all the stages of parenthood with loki
@elkapalio 11 months ago ⁺³
yooo ur content is incredible!, new sub
@cozmouz 11 months ago
ayyy thanks!
@Nikko_0905 8 months ago
The way the AI seemed to celebrate at the end was cute :)
@cludration 9 months ago ⁺²
bro puts so much effort for his video holy sh*t
@adriantcullysover4640 11 months ago ⁺¹
It seemed soo happy at the end. Lol.
@Burgers21 11 months ago ⁺¹
Next up: AI learns to drive my car to the bank
@orkhanabdullayev-sr5xe 10 months ago ⁺¹
AI LEARNS TAX EVASION! (real1!1!1!!)
@cassandranoice1563 9 months ago ⁺¹
Im glad my kids didnt spam jump when they were infants. Yeeting themselves off the edge of the world tracks though.
@dazley8021 11 months ago ⁺²
Wouldn't it be funny if "loki" comes to the conclusion that stealing is not the glorious purpose he's looking for? 😉
@cozmouz 11 months ago ⁺¹
I am Loki of Asgard, and I am burdened with devious purpose!
@SADmemer. 11 months ago ⁺²
Little bro breaking some ankles
@Likemea 11 months ago ⁺²
was it possible to place invisible barriers?
@cozmouz 11 months ago ⁺²
Yes it was possible.
@PaintedCryptid 9 months ago
The landing full of coins... truly the best ending to this video
@A-Clear_View 10 months ago ⁺²
majestic
@SpeedOfSol 11 months ago
Thanks man my wheelchair bound sister didn’t stand a chance from the tactics displayed here
@guillermomazzari8320 25 days ago
Watching this made me think that it is exactly like evolution. For us it might not seem that long, but for Loki it took countless generations to achieve victory. This can be used as proof that we are indeed in a simulated universe and this is exactly how evolution works, our genes just do reinforcement learning.
@marmaje69 10 months ago ⁺¹
Is Loki like… relearning everything each level? Or do you keep his knowledge for the next level. Cuz I saw more AI’s that do that more effectively.
@Stranger-gl6ie 7 months ago
Loki was probably like: OK WHAT DID I DO WRONG
9:30
@Benw8888 11 months ago ⁺²
This video was incomplete without you explaining what the architecture, input/output structure, and training algorithm were. We don't just want to see cool art, we want to know that the AI is good.
@beywheelzhater8930 11 months ago ⁺²
I dunno, 2.5k people seem to like the video at the time of writing
@Benw8888 11 months ago ⁺¹
@beywheelzhater8930 just because people liked the video doesn't mean they wouldn't like the vid more if it improved
@Benw8888 11 months ago ⁺¹
@beywheelzhater8930 case in point, many other commenters are asking for details on the input structure, reward function, etc.
@rayyannoor129 11 months ago ⁺¹
If AI starts hacking banks, this guy is gonna be held for accusations
@cozmouz 11 months ago
My timbers are shivering
@qwanton8632 11 months ago ⁺¹
Bro's called lowkey
@cozmouz 11 months ago
😂
@OreoDoesStuff 11 months ago
instructions unclear, i got caught stealing orphans and they sent me to the shadow realm irl
@cozmouz 11 months ago
bro
@OreoDoesStuff 11 months ago
i know right, so strange... they used to send me to the 4th dimension but yesterday they sent me there
@kylebarvel 10 months ago
When pillars decide to run away from police pillars
@alussk 11 months ago ⁺²
Really cool!
@MilesKiyaAnny 11 months ago ⁺¹
Soo he's got no eyes to actually see where the cop is to respond like a human
@cozmouz 11 months ago
He got Raycast sensors
@arugula9253 8 months ago
I’ll be taking notes.
@crabbydisk7658 11 months ago ⁺²
If you are gonna make a sequel, you should try procedurally generating the map to make the ai more general purpose.
@Jayce_2624 10 months ago ⁺¹
Finally, a real world use case for AI
@Doggoko 11 months ago ⁺²
wtf did i stumble upon
@Kiwi_Inventor 10 months ago
you should make it so he can fight the police in dire circumstances
@silverbuckett 10 months ago
Loki spent a few million years in prison now
@Fallout3131 8 months ago
Thank you!!
@maxclark7890 9 months ago
AI is gonna learn to shoplift now
@russellvanwagner8864 10 months ago
AI like this could benefit from getting points for surviving for long periods of time. That should, in theory, cause the character to do whatever allows it to prolong the run as much as possible, such as avoiding police.
@cozmouz 10 months ago
Ironically, it does exactly the opposite with that reward mechanism. The AI simply stays away from the officers and keeps jumping around in safe areas to survive till the end of round, getting rewards for doing nothing essentially!
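The failure mode described in the reply above can be illustrated with a toy comparison of episode returns. All the numbers (coin reward, per-step survival bonus, capture penalty, episode lengths) are made up for illustration and are not the video's values.

```python
def episode_return(coins_collected, steps_survived, caught,
                   coin_reward=1.0, survival_reward=0.01, caught_penalty=-1.0):
    """Total reward for one episode under an assumed survival-bonus scheme."""
    total = coins_collected * coin_reward + steps_survived * survival_reward
    if caught:
        total += caught_penalty
    return total


# Hiding in a safe corner for a full 1000-step round, collecting nothing.
idle_return = episode_return(coins_collected=0, steps_survived=1000, caught=False)

# Going for coins, grabbing 5, but getting caught after 300 steps.
risky_return = episode_return(coins_collected=5, steps_survived=300, caught=True)
```

With these assumed values the idle episode scores higher than the coin run, so an optimizer has every reason to stand still, which matches the behavior the creator observed.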
@jermaine459 10 months ago ⁺¹
i wonder what sparked this idea
@Gaminginvrrr 7 months ago
finally i can hop on this dude to escape the feds from all those endangered animal killing, trespassing, animal abuse, and a kill count in the triple digits!
@rgbatom5145 11 months ago ⁺¹
When i was scrolling past i thought the thumbnail was a gen z joke
@nightmaretheoverlord1825 11 months ago ⁺¹
I thought the video name was I train ai to overrun the police
@cozmouz 11 months ago ⁺¹
Maybe in the future 😅
@9yearoldgaming 11 months ago
@cozmouz you didn’t specify. DO IT NOW
