A.I. Learns to play Snake using Deep Q Learning

  • Published 11 Jul 2019
  • Can an AI learn to play the perfect game of Snake?
    Huge thanks to Brilliant.org for supporting this channel, check them out: www.brilliant.org/CodeBullet
    Twitter: / code_bullet
    Patreon: / codebullet
    Discord: / discord
    Art created by @Dachi.art / dachi.art

COMMENTS • 9K

  • @Chocice75
    @Chocice75 4 years ago +3927

    "It will not take three months."
    Four months later...

    • @marekmiarka8550
      @marekmiarka8550 4 years ago +54

      Wow, 21 likes in one day. There must be a lot of people waiting for him to make a new video

    • @lithuanianinbound589
      @lithuanianinbound589 4 years ago +14

      @@marekmiarka8550 and you're one of them.

    • @axmoylotl
      @axmoylotl 4 years ago +12

      @@lithuanianinbound589 we all are one of them on this blessed day

    • @tylerchew3732
      @tylerchew3732 4 years ago +21

      Technically it hasn’t been 3 months (but 4)?

    • @SwBruh
      @SwBruh 4 years ago +7

      Code bullet is pulling a ceeday

  • @cameronschiralli3569
    @cameronschiralli3569 4 years ago +2019

    "It'll not take 3 months"
    Top 10 anime betrayals

  • @pyren4931
    @pyren4931 4 years ago +1652

    Code bullet 3 months ago: "it will not take 3 months"
    Me after 3 months: "lmfao"

    • @redion8575
      @redion8575 4 years ago +1

      Pyren 4 months

    • @owenlok2325
      @owenlok2325 4 years ago

      Sure bud.

    • @chris86simon
      @chris86simon 4 years ago +1

      I need to change my bank.

    • @guicky_
      @guicky_ 4 years ago +2

      i mean... he wasnt wrong, it took 5 months...

  • @dominicward4723
    @dominicward4723 4 years ago +921

    What if he actually died and we’re just sitting here making fun of him for not being able to upload within three months

    • @romanobaumann7158
      @romanobaumann7158 4 years ago +50

      holy fuck dont fukcking scare me

    • @eeeeee8762
      @eeeeee8762 4 years ago +8

      Would be funny

    • @money_fiery444
      @money_fiery444 4 years ago +9

      He was in uni so he couldn’t upload but it’s done now and he’s editing a video

    • @tiffanyjaworski2942
      @tiffanyjaworski2942 4 years ago +3

      Dominic Ward I thought the same thing

    • @elin9217
      @elin9217 4 years ago +4

      It was the fbi this is his clone making this vid

  • @chengaming8120
    @chengaming8120 4 years ago +2908

    "It wont take 3 months, Im thinking a couple of weeks max" - Code bullet before not uploading for 3 months

  • @dylbanyeah
    @dylbanyeah 4 years ago +1419

    This ended up being a snake version of a dvd screensaver hitting the corner

    • @hashirama1507
      @hashirama1507 4 years ago +3

      RavenCrow only reply

    • @crimsonclaw6337
      @crimsonclaw6337 4 years ago +1

      @@hashirama1507 that's not true

    • @hashirama1507
      @hashirama1507 4 years ago

      :((((

    • @danielstein3659
      @danielstein3659 4 years ago

      enjoyed all 5 myself :))))) @Andres Lorenzo

    • @stylis666
      @stylis666 4 years ago

      @@danielstein3659 I didn't, but I'm also not 12 years old anymore so I suppose I'm old and bitter.

  • @thomasbrennan1881
    @thomasbrennan1881 4 years ago +526

    He’s right it didn’t take 3 months. We’re on 4.

    • @Qazsnivy
      @Qazsnivy 4 years ago +5

      Next is 1 year

    • @tastychunks
      @tastychunks 2 years ago +6

      @@Qazsnivy You have no idea how right you were

  • @raiynen
    @raiynen 4 years ago +169

    "What is my purpose"
    You play snake.

  • @bhk6091
    @bhk6091 4 years ago +6348

    I have an idea
    You should put a time limit for the snake to find the apple
    So it can find the most efficient way to find the apple and to prevent it from doing shit over and over again without finding the apple

    • @kyraaa__
      @kyraaa__ 4 years ago +34

      Good idea!

    • @theboman4816
      @theboman4816 4 years ago +137

      Not gonna work, I guess. If the snake is too long this will kill him most of the time lmao

    • @jessenxxx
      @jessenxxx 4 years ago +72

      @@theboman4816 I mean, you can extend the time by how long he is, e.g. each block gives it 5 more seconds or something, but in the end even then the time scaling would need to be adjusted multiple times or it probably won't work

    • @theboman4816
      @theboman4816 4 years ago +7

      @BlazePlayz YT yeah, but in the end there is still a chance it will kill itself :D

    • @skncl9859
      @skncl9859 4 years ago +2

      @@jessenxxx floor(8 + 0.2b)? I can't remember what length the snake starts at, but it should give 8 s to find the apple to begin with
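
The two time-limit proposals in this thread can be compared with a minimal sketch. It assumes that `b` in `floor(8 + 0.2b)` is the number of body blocks (the thread does not say), and the function names are hypothetical. Integer division stands in for `floor(0.2 * b)` so no floating-point rounding is involved.

```python
def budget_per_block(length, seconds_per_block=5):
    """Each body block buys a fixed number of seconds (the '5 more seconds' idea)."""
    return seconds_per_block * length


def budget_floor(length, base=8):
    """floor(8 + 0.2 * b) for a non-negative integer b, written as 8 + b // 5."""
    return base + length // 5


if __name__ == "__main__":
    # Assumed snake lengths, chosen only to show how the two budgets diverge.
    for length in (0, 5, 50, 200):
        print(length, budget_per_block(length), budget_floor(length))
```

The per-block budget grows 25 times faster than the floor formula, which is why the scaling constant would likely need tuning, as the thread suspects.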

  • @danibuchanan3978
    @danibuchanan3978 4 years ago +2405

    Somewhere, Code realizes he could've just made the head green.

    • @thebestnerd4444
      @thebestnerd4444 4 years ago +74

      I already suggested that. like 4 months ago

    • @kevinwells9751
      @kevinwells9751 4 years ago +95

      That was my first idea, and yet here we are with this shitshow of an AI lol

    • @marventhedepressedrobot4656
      @marventhedepressedrobot4656 4 years ago +8

      Four months ago

    • @davidparker7974
      @davidparker7974 4 years ago +53

      Sounds like that would be.............. For pussies

    • @exa4564
      @exa4564 4 years ago +13

      @@kevinwells9751 if you say the thing he did is a shitshow, then make an AI yourself, program the game, and let it learn......
      I am waiting for you to upload such a video, where you make it better than him ;)

  • @FatedHandJonathon
    @FatedHandJonathon 2 years ago +23

    If you rotate the frame you feed the machine so that the snake is always facing upwards, you effectively quadruple your training data by allowing the snake to treat isomorphic game states as identical. It'd take a little work to correctly rotate it, then rotate the output, but the savings in training time would probably be worth it.
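
A minimal sketch of that rotation idea, under stated assumptions: the board is a list of rows with row 0 at the top, directions are numbered clockwise, and the network is imagined to pick an absolute direction in the rotated frame. None of this comes from the video; the helper names are hypothetical.

```python
UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3  # clockwise order (an assumed convention)


def rotate_ccw(grid):
    """Rotate a grid (list of rows) 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def canonicalize(grid, heading):
    """Rotate the grid so the snake's current heading points UP."""
    for _ in range(heading):
        grid = rotate_ccw(grid)
    return grid


def to_world_direction(canonical_direction, heading):
    """Map a direction chosen in the rotated frame back to the real board."""
    return (canonical_direction + heading) % 4


if __name__ == "__main__":
    # 2 = head, 1 = body; the snake is heading RIGHT with its body to the left.
    board = [[0, 0, 0],
             [1, 2, 0],
             [0, 0, 0]]
    print(canonicalize(board, RIGHT))     # body now sits below the head
    print(to_world_direction(UP, RIGHT))  # "straight ahead" maps back to RIGHT
```

With this mapping, four boards that differ only by rotation reach the network as one input, which is where the fourfold saving would come from.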

  • @juneru2
    @juneru2 4 years ago +61

    what would you do if adrian suddenly spelled "HELP" with his tail after 20000 games.

  • @Sanguine-Tenshi
    @Sanguine-Tenshi 4 years ago +451

    "I'm as stubborn as I am lazy." uf, i felt that, i felt that on a spiritual level

  • @matthewmcwane9569
    @matthewmcwane9569 4 years ago +1623

    “I’m as stubborn as I am lazy” ~ Code Bullet
    Relatable

  • @Rotrokas
    @Rotrokas 4 years ago +189

    Why not give him a reward based on the time between eating apples?

    • @nightniro131
      @nightniro131 4 years ago +1

      It's been 4 months... RIP

    • @athousandyearoldfossil5059
      @athousandyearoldfossil5059 4 years ago +14

      Because if you go straight for the apples you could accidentally run into yourself: the snake is trying to get there as fast as possible and won't realize where its body is until it hits it, because it's not focusing on that.

    • @yourmomgayyourdadlesbian9051
      @yourmomgayyourdadlesbian9051 4 years ago +2

      A Thousand Year Old Fossil negative punishment

    • @DawnAfternoon
      @DawnAfternoon 2 years ago

      Far enough into a game, you are encouraged to take the longest time/route possible to get to the apple, that is, to fill up the whole screen with the snake's body before going for the apple
      A time-based reward would be very counterproductive for that

    • @EngineScypex
      @EngineScypex 2 years ago

      @@DawnAfternoon That highly depends.
      I'd say the key is to store some values and modify them constantly.
      In the video the AI doesn't have any memory. But it needs some memory values to adapt its strategies better.
      Example:
      You can count the number of apples the AI got. Just decrease the time reward every few apples and the AI will be more aggressive at the start and more careful later.

  • @Evelaraevia
    @Evelaraevia 4 years ago +100

    "Won't take 3 months"
    9 days before it hits the 4-month mark.

    • @eicikle1809
      @eicikle1809 4 years ago +2

      He never said that it wouldn't take longer than 3 months though, so technically he's still right

  • @someauzzie9139
    @someauzzie9139 4 years ago +250

    "It will take a couple of weeks."
    Another 3 months later.

  • @ffttossenz
    @ffttossenz 4 years ago +817

    2:30
    CB: hey artist can u draw my avatar for coding videos
    Artist: sure what poses do u need?
    CB: punching
    Artist: why would u need that for a programming video
    CB: with both hands

    • @puffel8145
      @puffel8145 4 years ago +37

      @@firestar9650 how could this be a nazi Joke?

    • @P.cookie
      @P.cookie 4 years ago +33

      CB: Oh! And one with a pistol.
      Artist: hoW IS THAT CODING RELATED???

    • @puffel8145
      @puffel8145 4 years ago +15

      @@P.cookie CB: IT'S FOR PUNISHING MY AI!!!!!
      Artist:......o....ok...

    • @tlkfanrwbyfan8716
      @tlkfanrwbyfan8716 4 years ago +6

      Am I going to hit you with my left or my right fist?
      All of I could think of

    • @nickthecherryking682
      @nickthecherryking682 4 years ago +10

      @Firestar how do you even pull goddamn *Nazis* from "draw my guy punching for programming videos"? it's a joke about the artist being concerned why someone needs to punch for programming videos, it's not that hard

  • @corosso8820
    @corosso8820 4 years ago +28

    I can’t believe you’re still SO underrated after all these years

    • @BarioIDL
      @BarioIDL 2 years ago

      *that's what she said*

  • @JambaYCS
    @JambaYCS 4 years ago +58

    "it definitely won't be 3 months"
    Me 3 months later: bruh

  • @KatKitKay
    @KatKitKay 4 years ago +611

    I love how the snake just randomly decides to be a squiggly boi sometimes

  • @oliversmith9966
    @oliversmith9966 4 years ago +632

    well, programming + animations just screams long waits between video releases, so you don't need to apologize, we know what we subscribed to
    edit: also Code Bullet, why don't you make the head of the snake a different color so the bot will know where the head is and you can make it see the whole screen!

    • @weblure
      @weblure 4 years ago +12

      What animations? These are premade images being swapped out and, at most, being manipulated with basic 2D transforms on whatever editor he uses.

    • @klcompany7303
      @klcompany7303 4 years ago +6

      Weblure Joltik still has to draw

    • @edselcervantes6229
      @edselcervantes6229 4 years ago +11

      @@weblure still counts as animating

    • @frick3256
      @frick3256 4 years ago +5

      They’re premade and not by him

    • @weblure
      @weblure 4 years ago +4

      @@klcompany7303 No; he did not draw them.

  • @wrillwastaken
    @wrillwastaken 4 years ago +73

    "I won't die for 3 months"
    Dies for 4 months.

  • @archanamotagi1675
    @archanamotagi1675 4 years ago +19

    "It will definitely not take 3 months"
    -- *Code Bullet 4 months ago*

  • @theaureliasys6362
    @theaureliasys6362 4 years ago +416

    1. Punish for going too long without an apple.
    2. Recurring. Yes. I dared saying that.

    • @potato_hoarder
      @potato_hoarder 4 years ago +23

      like a -1 punishment for every move, + 99 when he gets the apple.

    • @niezbo
      @niezbo 4 years ago +5

      @@potato_hoarder I find it problematic. If the snake, at some point, has to travel more than 100 blocks, wouldn't it lose points and therefore perform worse?

    • @jhonnythejeccer6022
      @jhonnythejeccer6022 4 years ago +7

      Niezbo yeah, maybe calculate how many blocks a snake would need to travel on average to hit an apple
      And if it's over 1.5 or 2 times this average (because AIs are dumb), let it lose points

    • @theaureliasys6362
      @theaureliasys6362 4 years ago +3

      @@niezbo it would still be better than continually wandering around without getting the apple: remember it has no other choice but to move.

    • @mpradeepnair
      @mpradeepnair 4 years ago

      What do you mean by 'Recurring'?
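
The per-move penalty discussed in this thread can be tallied with a small sketch. It assumes the -1 also applies on the move that eats the apple (the thread does not specify), and `episode_return` is a hypothetical helper, not code from the video.

```python
def episode_return(steps_per_apple, step_penalty=-1, apple_reward=99):
    """Sum the reward over an episode.

    steps_per_apple lists how many moves were needed for each apple.
    Every move costs step_penalty; each eaten apple adds apple_reward.
    """
    total = 0
    for steps in steps_per_apple:
        total += steps * step_penalty + apple_reward
    return total


if __name__ == "__main__":
    print(episode_return([10]))       # quick apple: clearly positive
    print(episode_return([150]))      # long detour: net negative for that apple
    print(episode_return([10, 150]))  # still positive over the whole episode
```

Under these numbers an apple that takes more than 99 moves is a net loss on its own, which is the concern raised above; it still scores higher than wandering indefinitely, where the total only keeps falling.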

  • @jaloveast1k
    @jaloveast1k 4 years ago +484

    It feels like the natural selection algorithm should have taken time into account: if the snake doesn't find the next apple in X * 3 + 50 (just an example) steps, then it "dies", where X is the current length of the snake, since the longer the snake, the harder it is to navigate the board.
    In your case, where time is "infinite", you just ended up with a safe snake that doesn't focus on searching, only on avoiding obstacles and occasionally picking up these goddamn apples.
    This would also have improved the learning speed, with all the procrastinating snakes getting killed right away ¯\_(ツ)_/¯
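
A minimal sketch of such a starvation limit, assuming one call per game step and the example constants X * 3 + 50 from the comment. The class name and interface are hypothetical. In a Q-learning setup the "death" could plausibly be treated as an ordinary terminal state with a negative reward, though that is an assumption rather than anything shown in the video.

```python
class StarvationTimer:
    """Counts steps since the last apple and reports when the snake starves."""

    def __init__(self, per_block=3, base=50):
        self.per_block = per_block
        self.base = base
        self.steps_since_apple = 0

    def limit(self, length):
        """Allowed steps without an apple: length * 3 + 50 by default."""
        return length * self.per_block + self.base

    def step(self, length, ate_apple):
        """Advance one game step; return True if the snake should die now."""
        if ate_apple:
            self.steps_since_apple = 0
            return False
        self.steps_since_apple += 1
        return self.steps_since_apple >= self.limit(length)


if __name__ == "__main__":
    timer = StarvationTimer()
    starved_at = None
    for step_number in range(1, 200):
        if timer.step(length=4, ate_apple=False):
            starved_at = step_number
            break
    print(starved_at)  # the step on which a length-4 snake starves
```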

    • @blueraspberrylemonade32
      @blueraspberrylemonade32 4 years ago +16

      Justice and honor glad I'm not the only one

    • @rileyzanatta7664
      @rileyzanatta7664 4 years ago +18

      it's not a natural selection algorithm, it's q learning.

    • @derrilazkia1002
      @derrilazkia1002 4 years ago +7

      When the snake is too long, it needs to kind of "fold" or "sort" itself before eating the apple to avoid colliding with its body, right?
      Maybe the time will need to be extended as the snake grows longer?

    • @rb1471
      @rb1471 4 years ago +19

      Yes, he should add time. You can just add time to the feedback (the reinforcement signal) of Q-learning.
      Example:
      1. The snake gets the apple in X steps.
      2. If X > 10, punish the AI by some function of X.
      3. If it dies, punish it.
      4. If it gets the apple, give it a reward.
      The snake would be incentivized to get the apple in a timely manner, since it gets punished depending on the amount of time wasted. Here 10 is just the number of steps before we consider it to be wasting time. You would need to balance how much punishment it gets for death versus wasted time.
      ---------
      Another way to improve learning is to give the snake a better sense of its surroundings. Both the boxed region and the simplified region cut the snake's vision down significantly.
      How to fix this:
      1. -> One way would be to feed the whole screen to the snake and call it a day. This is slow (as we found out), so not really realistic.
      You can improve this by using a CNN. This would use a "window" and scan it along all the positions. In other words, instead of feeding all w*h inputs, we can scan 7x7 windows along the board with the same "network". Since it's the same network scanning, we just need to build a network for 49 inputs (7x7) rather than w*h inputs. This would allow for a much smaller network and significantly speed up training.
      2. -> Another way, without using complex networks, would be to feed the network only the "interesting" points. All the unused space is really not needed. So we can feed the network a few things: the apple, the borders of the game and finally the snake body.
      You can actually do better than this and get rid of more useless information. For example, we can reduce the border to 4 corners (we don't need all the squares connecting the borders). Even better than this, just provide the height / width of the game (no need to give (0, 0), (h, w), (h, 0), (0, w) when we just need h and w).
      On top of that, we don't need to give the entire snake body to the AI. We just need the head position, the tail position, and the positions of all the "bends" in the snake from previous turns. The rest of the snake can be inferred by the AI.
      The only problem now is that the snake size is dynamic (it can have 3 bends, or 150 bends). The network cannot change the number of inputs without more complex code (such as an RNN). So we can fix this by assuming a max number of "bends", like 50. This way we fill in the bends as inputs to the network and put 0's in the unused slots. We can have the code kill the snake if it surpasses the max number of bends and use that as another reinforcement punishment for the AI, basically teaching it to limit the number of turns it makes.
      Now the number of inputs is 1 apple + 2 for borders + 50 bends + 1 head + 1 tail = 55 inputs.
      We also have feedback on getting the apple, getting the apple in time, death, and the overall efficiency (number of turns made, which is capped at 50).
      Training should go much, much faster, with a lot more improvement given the different feedback and FULL visibility. We can even expand the network to a much larger size to really learn some techniques.
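
The fixed-size "bends" input described above might be built roughly as follows. This is a sketch under assumptions: the snake is a head-first list of (x, y) cells, each position is flattened into two numbers (so 50 bends become 100 inputs rather than 50), and `find_bends` and `encode_state` are hypothetical names. One caveat of zero padding is that an unused slot looks the same as a real bend at cell (0, 0).

```python
def find_bends(body):
    """Return the cells where the snake changes direction (head-first order)."""
    bends = []
    for prev, cur, nxt in zip(body, body[1:], body[2:]):
        d_in = (cur[0] - prev[0], cur[1] - prev[1])
        d_out = (nxt[0] - cur[0], nxt[1] - cur[1])
        if d_in != d_out:
            bends.append(cur)
    return bends


def encode_state(body, apple, width, height, max_bends=50):
    """Flatten apple, board size, head, tail and zero-padded bends into a list.

    Returns None when the snake has more bends than max_bends, which the
    caller could treat as a death, as the comment proposes.
    """
    bends = find_bends(body)
    if len(bends) > max_bends:
        return None
    padded = bends + [(0, 0)] * (max_bends - len(bends))
    features = [*apple, width, height, *body[0], *body[-1]]
    for x, y in padded:
        features.extend((x, y))
    return features


if __name__ == "__main__":
    snake = [(3, 1), (2, 1), (1, 1), (1, 2), (1, 3)]  # one bend at (1, 1)
    print(find_bends(snake))
    print(encode_state(snake, apple=(4, 4), width=10, height=10, max_bends=2))
```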

    • @Cacoiaaa
      @Cacoiaaa 4 years ago

      I'm no coder, but wouldn't that be pretty ineffective, since the apple begins in different places. It would try to go in the direction of the first trial's apple, but it wouldn't be able to find randomized apples

  • @Botpointo
    @Botpointo 4 years ago +148

    A Day in The Life of Code Bullet:
    Wake Up
    Coffee
    Crying
    Doritos
    Coffee
    Working on his number addiction
    Doritos
    More Coffee
    Coding
    Doritos
    Even More Coffee
    Sleep

    • @SquooshyCatboy
      @SquooshyCatboy 3 years ago +2

      Sounds like hes more of a coffee addict honestly

    • @duncannonnn4259
      @duncannonnn4259 3 years ago +2

      The last one is wrong he doesn’t sleep

    • @DoubleE5135
      @DoubleE5135 3 years ago

      @@duncannonnn4259 Yeah he’s a uni student he has no clue what sleep is

    • @jvpro2076
      @jvpro2076 3 years ago +1

      make a game of this

    • @wolfy9011
      @wolfy9011 3 years ago

      K

  • @fredb5626
    @fredb5626 3 years ago

    honestly bro - I've been watching your channel for a while, and it makes me laugh so much - it's a wonderful break for me from my own code

  • @oliverganski3509
    @oliverganski3509 4 years ago +1384

    AI:what is my purpose
    CB: you eat apples
    AI: oh my god

    • @yeetyeet-jb6nc
      @yeetyeet-jb6nc 4 years ago +4

      Ywuhhly ywuhhly ywoh
      Ywoh nottt jwoooh you lithuaniaaaan speeekaeeeerre

    • @oliverganski3509
      @oliverganski3509 4 years ago +11

      yeet yeet can you translate this to English

    • @yeetyeet-jb6nc
      @yeetyeet-jb6nc 4 years ago +13

      I just wrote how your name is pronounced in the international phonetic alphabet

    • @oliverganski3509
      @oliverganski3509 4 years ago +4

      yeet yeet ok, but if I do wanna read it right, flip ur head upside down

    • @oliverganski3509
      @oliverganski3509 4 years ago +7

      yeet yeet also, do you get my Rick and morty reference

  • @psychopiper8224
    @psychopiper8224 4 years ago +681

    things that make you go hmmmmm:
    "won't take three months"

  • @pewdular
    @pewdular 3 years ago +5

    youtube recommended me this exactly a year after it was made

  • @chknnuggies
    @chknnuggies 4 years ago +4

    This guy is really out here taking something mind-bogglingly boring to watch (i.e. coding) and making it entertaining, somewhat educational, and funny. Definitely worth a sub

  • @xavier.ashkar
    @xavier.ashkar 4 years ago +429

    1:00 flips the avatar but name on shirt is backwards and too lazy so just writes cb in arial white font lol

    • @TechSupportDave
      @TechSupportDave 4 years ago +27

      that cracked me up.
      It's the funniest when these "top-quality content" channels do it too.

    • @jujuyee2534
      @jujuyee2534 4 years ago +10

      It is a top quality channel what do you mean?

    • @capncantread2985
      @capncantread2985 4 years ago +1

      69 likes

    • @moex1713
      @moex1713 4 years ago

      Yeh we saw that u don't have to say it again

    • @RAFMnBgaming
      @RAFMnBgaming 4 years ago

      CB: Creative Bibliography.

  • @cerealenjoyer3000
    @cerealenjoyer3000 4 years ago +287

    idea: punish it whenever it takes more than 20 seconds to find the apple

    • @CyseRev
      @CyseRev 4 years ago +1

      @Notabotatall _ 27 Correct

    • @jamesshiervlogs2613
      @jamesshiervlogs2613 4 years ago +1

      That’s the idea I had it would make it have a way better way of searching

    • @ambitsamb
      @ambitsamb 4 years ago +5

      Or just make it so the apple is always in its vision?
      Like if the apple is out of the snake's field of view, make a separate field for the apple?

    • @zeebzeebo
      @zeebzeebo 4 years ago

      @Notabotatall _ 27 WTF is "ceying" ?!

  • @MystearicaClaws
    @MystearicaClaws 2 years ago

    I love binging your stuff because you go from polite and funny text to being the most hilariously hostile creative in a snap XD

  • @mega-nuke1587
    @mega-nuke1587 4 years ago

    I love his pure joy and enthusiasm when making this video it was brilliant

  • @mitchelljulius5875
    @mitchelljulius5875 4 years ago +365

    Me: "Oh he's still alive."
    CB: "Yeah, I'm still alive."
    Lmao

    • @jamesgockel854
      @jamesgockel854 4 years ago +10

      He's going to create an AI that ends up killing him. I just know it, not that I want that to happen.

    • @echelon5162
      @echelon5162 4 years ago +3

      @@jamesgockel854 Shh... don't tell him.

    • @Bobbingtonn
      @Bobbingtonn 4 years ago +1

      I had to double check i wasn't seeing things

    • @cyclus_gaming
      @cyclus_gaming 4 years ago

      James Gockel why are you checked??

    • @Wunderboy08
      @Wunderboy08 4 years ago

      @@cyclus_gaming he's verified I think

  • @owenmclaughlin280
    @owenmclaughlin280 4 years ago +2939

    As a suggestion, maybe punish him if he goes too long without finding food, so he can't get caught in a loop.
    Edit: oh wow, this blew up a bit more than I expected. @skeletalZ @Alex.Doan @Ian and @Jonathan.Yang have my favorite suggestions. I feel like a mix between my suggestion, @skeletalZ 's suggestion, and @Alex.Doan 's suggestion could work really well. But at the same time, a mix of @Ian 's suggestion and @Jonathan.Yang 's should work really well too.

    • @evanjames575
      @evanjames575 4 years ago +81

      Owen McLaughlin issue with that is if the snake is too dumb to find the food, it doesn’t have any other option. So punishing it won’t result in anything.

    • @jeffvader811
      @jeffvader811 4 years ago +79

      That would be against his animal rights >:(

    • @jonathanyang1423
      @jonathanyang1423 4 years ago +18

      This can be done with the use of a discount factor

    • @robertodelier9999
      @robertodelier9999 4 years ago +28

      Hmm
      Maybe a timer based on the snake size

    • @cadmallard
      @cadmallard 4 years ago +24

      Owen McLaughlin and a reward for finding the food in that time limit
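
The discount factor mentioned in this thread already penalizes dawdling without any explicit timer: a reward received t steps in the future is weighted by gamma ** t. A minimal sketch, assuming a reward of 1 for the apple, 0 otherwise, and gamma = 0.95 (all illustrative values, not taken from the video):

```python
def discounted_return(rewards, gamma=0.95):
    """Sum of gamma**t * reward_t over one sequence of rewards."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


if __name__ == "__main__":
    fast = [0] * 4 + [1]    # apple on the 5th move
    slow = [0] * 19 + [1]   # apple on the 20th move
    loop = [0] * 100        # circling forever, no apple
    print(discounted_return(fast))
    print(discounted_return(slow))
    print(discounted_return(loop))
```

The same apple is worth less the longer it takes, and a loop that never reaches food is worth nothing, so the agent has some incentive to head for the apple sooner.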

  • @PsionicNoMad
    @PsionicNoMad 4 years ago

    How have I never seen your stuff yet. This is amazing :D

  • @sontath7102
    @sontath7102 4 years ago +1

    been about 3 months, must mean that mr. code bullet must be starting on his next project. HYPE!

  • @specyboy9054
    @specyboy9054 4 years ago +647

    Next you should try and do:
    AI CREATES REGULAR UPLOADING SCHEDULE!!!

  • @vincentnicholson3946
    @vincentnicholson3946 4 years ago +522

    “There’s no way it could take me 3 months” - Codebullet, 3 months ago.

    • @peracality7648
      @peracality7648 4 years ago +4

      "Theres no way it could take me 3 months" - Codebullet *1 year ago*

    • @shireenakter4409
      @shireenakter4409 4 years ago +2

      I mean, he died in the accident, so it makes sense

    • @pentabitsmusic
      @pentabitsmusic 4 years ago +1

      4 months ago

  • @kimbisdoes115
    @kimbisdoes115 2 years ago +22

    3 months? Try 13

  • @BountyHunterLetsPlaysOnGoogle
    @BountyHunterLetsPlaysOnGoogle 4 years ago +9

    "It definitely wont take three months, you can bank on that."
    My house would have been foreclosed by now then.

  • @Gemini-Lion
    @Gemini-Lion 4 years ago +751

    I was thinking this for the problem of where the head is: make the head a different color than the apples and the rest of the body. Maybe like a blue or something?

    • @philiphunt-bull5817
      @philiphunt-bull5817 4 years ago +69

      Yeah, that was the most obvious solution.

    • @mattsekira5536
      @mattsekira5536 4 years ago +49

      I was thinking of that as well and waiting to see it implemented. It can see the whole screen, so it always knows where the apple is, it knows where its head is, and it does all that with one frame of 1600 pixels.

    • @LemonChieff
      @LemonChieff 4 years ago +32

      It’s actually way less efficient. You’d need to check all the sample until you find the head and only then look at the 4 adjacent pixels to know the direction. With his method you skip the looking for the head part, it’s always in the same place

    • @adammullarkey4996
      @adammullarkey4996 4 years ago +10

      Make it the same colour as the apples. Then you get point every frame!

    • @ZaHandle
      @ZaHandle 4 years ago +3

      Adam Mullarkey color and function is a different thing

  • @Wizzkidwas
    @Wizzkidwas 4 years ago +220

    7:43
    "You'd either call a doctor, or an exorcist" had me dying laughing

    • @Wizzkidwas
      @Wizzkidwas 4 years ago

      oh whoops, fixing that typo

  • @AshtonianGaming
    @AshtonianGaming 4 years ago +19

    could you just not have made the head of the snake a different color? like, purple, or something?

  • @nk361
    @nk361 4 years ago +2

    The video and idea is all great! I just wanted to give a little input that may help. Give the snake the coordinates to the apple and let it figure out the direction to move and such since that's sorta what we determine with our eyes. Also, training the snake wise you might want to train it to play mostly when it has a long tail since that is when the game is hardest and needs a specific behavior of planning not to run into itself. Hope that helps a bit! Oh and for the direction the snake is going, you could maybe cheat and make the head a different color for it then later on change it back to white and have it train more on the existing weights.

  • @archanamotagi1675
    @archanamotagi1675 4 years ago +313

    Theory: Code bullet is taking so long because he's still secretly working on the Enigma decoder

    • @HDTomo
      @HDTomo 3 years ago +1

      Yes

    • @pug8714
      @pug8714 2 years ago +5

      Dancing animations are more important

  • @urban_ghost_5226
    @urban_ghost_5226 4 years ago +3566

    Change the color of the head
    - Everyone in the comment, 2019

    • @creeperlamoureux
      @creeperlamoureux 4 years ago +13

      Lol

    • @AnnihilatedBrainsample
      @AnnihilatedBrainsample 4 years ago +143

      My idea too. Give it a head, and now the AI knows which way the snake is facing. And now it could see the whole screen.

    • @MrRolnicek
      @MrRolnicek 4 years ago +260

      "How can the AI know where the head of the snake is"
      Me: "Tell it"
      CB: "Shit ... my solution is way more complicated"

    • @weblure
      @weblure 4 years ago +50

      The head was only a minor problem. The bigger problem was that the AI had too many inputs, which is why he shrank its vision range. This just happened to solve the problem of knowing where the head was, which was a bonus.

    • @nils2711
      @nils2711 4 years ago +46

      not a different color, give him a cute face :D

  • @kevinsantos5050
    @kevinsantos5050 4 years ago +1

    So I'm new to the channel and I like the idea of using the whole map for the snake to see.
    What I notice is that once the snake is bigger than the "sight" box, it has a high chance of eating itself, because it doesn't remember the position of its body and will box itself in.
    What I would suggest is to use the whole map as you intended, but you could change the color of the head so it knows where the head is (sometimes the simplest solutions are the right ones ;)
    You could also try adding a starvation timer to motivate the snake to go eat.
    I'll look forward to your next video

  • @mrmunch5615
    @mrmunch5615 4 years ago +20

    “I’m still alive!”
    Three months later…
    Bruh 😂

    • @nightniro131
      @nightniro131 4 years ago

      All comments are how he has taken more than 3 months

    • @nightniro131
      @nightniro131 4 years ago

      You have no videos or 10k subs
      I've been duped

  • @jacksonstein809
    @jacksonstein809 4 years ago +376

    You at least upload more often than cgp grey and oversimplified combined!

  • @superspol
    @superspol 4 years ago +243

    As a fellow Adrian speaking, he did better at snake than I ever could.

    • @adrianremo1988
      @adrianremo1988 4 years ago +6

      Fellow Adrian here to humbly agree.

    • @bastion212
      @bastion212 4 years ago +1

      Fellow adrian is calling bs

    • @buckiez
      @buckiez 4 years ago +7

      Not Adrian here to ruin the Adrian reply chain

    • @gregoryderpwrld111
      @gregoryderpwrld111 4 years ago

      MrCinch ok Shepard

    • @adrianremo1988
      @adrianremo1988 4 years ago +4

      @@buckiez Other Adrian to complain about the sudden loss in the Adrian snake chain. How are we gonna succeed in the snake game and appease the Q Gods if you cut the Adrian snake off like this?

  • @glo8516
    @glo8516 4 years ago +1

    Wow thanks for not taking 3 months!
    *IT'S TAKING MORE*

  • @jesseboy7951
    @jesseboy7951 4 years ago

    Man, u r amazing! Both smart and hilariously funny, we don't get a lot of those recently...

  • @NecromancerHD
    @NecromancerHD 4 years ago +788

    *_Who else would legitimately love a Livestream of Adrian's training?_*

    • @rumfordc
      @rumfordc 4 years ago +3

      who is adrian?

    • @rumfordc
      @rumfordc 4 years ago +3

      @@marley7776 oh my b

    • @Eliswap
      @Eliswap 4 years ago

      YES

    • @bastion212
      @bastion212 4 years ago +8

      I'm Adrian

    • @TheVexCortex
      @TheVexCortex 4 years ago +5

      @@bastion212 But can you play 20,000 games of snake in a row with no breaks?

  • @danke8945
    @danke8945 4 years ago +657

    “How long has it been?” “THREE MONTHS?!?”
    Who else thinks this is just going to be his intro for the next video

    • @spino1526
      @spino1526 4 years ago +6

      It has been 3 months -.- so yeh kinda annoying 4 no vids

    • @redion8575
      @redion8575 4 years ago +3

      It's 4 months

    • @modle4108
      @modle4108 4 years ago

      Caden Allison what are you wooooshing?

    • @K4LxMaddog
      @K4LxMaddog 4 years ago

      @@opponentbacon R/YouAreStupid

    • @scoliosis8264
      @scoliosis8264 4 years ago +1

      Well guess who is right...

  • @alexfisher4207
    @alexfisher4207 4 years ago

    I love consistent upload schedules

  • @RedRingOfDead
    @RedRingOfDead 4 years ago +2

    When will you get back? We miss your videos. We love them.

  • @NorseGraphic
    @NorseGraphic 4 years ago +119

    "I'm as stubborn as I'm lazy....."
    -Code Bullet 2019

  • @NathanTheMan
    @NathanTheMan 4 years ago +418

    I named him Adrian
    **2 seconds later**
    ADRIAN WHAT THE F

  • @aleam7203
    @aleam7203 4 years ago

    The fact that I could forget about this channel and then remember about it again months later and still not have another upload says something I think 🤔

  • @fincourtier4666
    @fincourtier4666 3 years ago +4

    This guy is genuinely so funny unintentionally 😂

  • @ItIsJum
    @ItIsJum 4 years ago +243

    YouTube: is that a gun
    YouTube: DEMONETIZED

  • @rufioh
    @rufioh 4 years ago +448

    what about colouring the snake's head FEFEFE and the body FFFFFF so it looks the same to humans, but it can see its own head cos it's a different colour

    • @mmmmm49513
      @mmmmm49513 4 years ago +90

      Literally the first thing I thought he’d do. Talk about overkill

    • @shiinondogewalker2809
      @shiinondogewalker2809 4 years ago +37

      you don't have to color it differently, just give it a different value to the ai and keep the color the same

    • @seanl.5181
      @seanl.5181 4 years ago +15

      @@shiinondogewalker2809 further on this, why not just give the AI a version of the screen where each block is a single value in a 2D array (0-3 could be empty, body, head, food)? This would be incredibly small and really the bare minimum without losing any data

    • @fiendfi7119
      @fiendfi7119 4 years ago +3

      Nerds be watching nerds code AI

    • @janhetjoch
      @janhetjoch 4 years ago +3

      @LintyCarcass that was showing 2 pictures
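A minimal sketch of the 2D-array encoding suggested by @seanl.5181 in this thread, not the representation used in the video. The cell codes follow the 0-3 scheme from the comment; the function names `encode_board` and `one_hot`, the head-first snake list, and the board size are assumptions made for illustration.

```python
# Hypothetical cell codes, following the 0-3 scheme suggested in the comment.
EMPTY, BODY, HEAD, FOOD = 0, 1, 2, 3


def encode_board(width, height, snake, food):
    """Return a height x width grid of cell codes.

    snake is a list of (x, y) cells with the head first; food is one (x, y) cell.
    """
    grid = [[EMPTY] * width for _ in range(height)]
    for x, y in snake[1:]:
        grid[y][x] = BODY
    head_x, head_y = snake[0]
    grid[head_y][head_x] = HEAD
    food_x, food_y = food
    grid[food_y][food_x] = FOOD
    return grid


def one_hot(grid):
    """Flatten the grid into one-hot inputs (4 values per cell) for a network."""
    return [1.0 if cell == code else 0.0
            for row in grid
            for cell in row
            for code in (EMPTY, BODY, HEAD, FOOD)]


board = encode_board(4, 3, snake=[(2, 1), (1, 1), (0, 1)], food=(3, 0))
for row in board:
    print(row)
```

Feeding the raw codes 0-3 straight into a network implies an ordering between cell types (food "greater than" head), which is why the sketch also offers a one-hot variant at four inputs per cell.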

  • @commandblockv1
    @commandblockv1 4 years ago

    For Christmas I want a new video from code bullet

  • @AlexAuHoShun
    @AlexAuHoShun 2 years ago

    Two years and it still brings back memories

  • @beauvoirferril
    @beauvoirferril 4 years ago +273

    Looks like he found a way to clone himself. If he can do that, he can definitely do this.

  • @zidanez21
    @zidanez21 4 years ago +33

    I want "Adrian wtf" merch cause I think that was everyone's reaction to him

  • @brennanchapman2384
    @brennanchapman2384 4 years ago +6

    Code bullet: “it will not take 3 months”
    Me after four months: so that was a f*cking lie

    • @satka9481
      @satka9481 4 years ago

      No it isn't. It didn't take 3 months. It's taking more

    • @Ronald_McAura
      @Ronald_McAura 4 years ago

      *AC Unity flashbacks*

  • @michealwarren6681
    @michealwarren6681 4 years ago

    Right bruv just take your time you do what's necessary in your life you amazing human being
    Love your videos you glorious nut job! Rock on! Code on!

  • @Skeleton-bs7zy
    @Skeleton-bs7zy 4 years ago +328

    Why not make the head a different color and feed blurred pixels from outside the vision

    • @williebrort
      @williebrort 4 years ago +3

      That's a great idea.

    • @archangelgaming2463
      @archangelgaming2463 4 years ago +35

      And does it need to see all the empty squares? If you just give it the information of the Apple, the head, and the body couldn’t it extrapolate all the empty space?

    • @IamCoalfoot
      @IamCoalfoot 4 years ago +7

      Or perhaps keep the nearby vision, but give the snake which quadrant of the game board the apple is in, so it at least knows where to look.

    • @vex3488
      @vex3488 4 years ago +4

      I feel like the head part might be kinda cheating, but I love the idea of blurring the outside so that every 4 pixels are condensed into one, and if there's an apple it's red. Great idea!

    • @finikksu
      @finikksu 4 years ago +5

      I was thinking the same. If the initial problem is that it can't see where the head is, can't the head be a different color and you feed it the whole map at once? Maybe that way it will work and build a strategy based on where the whole body is at the time and where the apple is
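Two ideas from this thread can be sketched in a few lines: condensing every block of cells outside the snake's vision into one averaged value (@Skeleton-bs7zy, @vex3488), and telling the snake which quadrant the apple is in (@IamCoalfoot). This is an illustration under assumptions, not what the video does: `downsample`, `apple_quadrant`, the 2x2 block size, and the 4x4 example board are all hypothetical.

```python
def downsample(grid, block=2):
    """Average each block x block group of cells into one coarse value."""
    height, width = len(grid), len(grid[0])
    pooled = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            cells = [grid[y][x]
                     for y in range(top, min(top + block, height))
                     for x in range(left, min(left + block, width))]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def apple_quadrant(width, height, apple):
    """Return 0-3 for top-left, top-right, bottom-left, bottom-right."""
    x, y = apple
    return (0 if x < width / 2 else 1) + (0 if y < height / 2 else 2)


# 1 marks the apple on a hypothetical 4x4 "apple channel" of the board.
apple_channel = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
coarse = downsample(apple_channel)
quadrant = apple_quadrant(4, 4, apple=(3, 1))
```

With 2x2 blocks the 16 inputs shrink to 4, and the apple still shows up as a non-zero value in the top-right coarse cell. The exact position inside the block is lost, which is the trade-off the blurring idea accepts.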

  • @trazh
    @trazh 4 years ago +143

    Had you made it 4 months we would’ve had a problem

  • @Lskdodod
    @Lskdodod 4 years ago

    Thx for all the training mate - Adrian

  • @Benmf
    @Benmf 4 years ago

    You are such a great youtube coder.
    PLEASE UPLOAD I MISS YOU SO MUCH YOUR FUCKING COMEDY
    Its so fucking good

  • @queijoman999
    @queijoman999 4 years ago +319

    ...
    Why didn't you just, you know, give the snake a head?
    Like
    Slightly different color pixel for example
    Then you can make it see the whole screen

    • @tankofnova9022
      @tankofnova9022 4 years ago +8

      We would know that the different color is the head but the AI wouldn't.

    • @MitchellD249
      @MitchellD249 4 years ago +13

      I still don't understand why he was trying to find the head originally, since he eventually seemed to just have that information and could center the square of visibility around it. But the problem with it seeing the whole screen is that's too many pixels for it to have to scan, so it just takes absolutely forever to train. It doesn't matter how easy it would be to find the head with that scan, you just don't want to scan the whole screen for any reason or it's going to be too slow.

    • @DropkickedBarracuda
      @DropkickedBarracuda 4 years ago +9

      @@MitchellD249 it's not that he couldn't find it, the AI couldn't, because it only saw individual still frames and had no memory. Either end could be the head from that information.

    • @onursahin4387
      @onursahin4387 4 years ago +6

      Why does it have to be pixels tho? Why not just give it bounds, tail, head, apple coordinates and the direction? The rest is void anyway.

    • @MitchellD249
      @MitchellD249 4 years ago +2

      @@DropkickedBarracuda So how did the AI find it in the end?

  • @magnussundorf
    @magnussundorf 4 years ago +682

    Why didn't you just make the head a different color and then tell the AI “That's you”??

    • @neutralghast352
      @neutralghast352 4 years ago +28

      Because the og snake didn't do that, so why should he

    • @linuspauly2380
      @linuspauly2380 4 years ago +30

      Well that's basically what he did by putting it in the middle. He could've increased the view distance to like 30 to achieve an effect that would come closer to what you say, but I guess he thinks that it's too many inputs once again.

    • @magnussundorf
      @magnussundorf 4 years ago +3

      Smug Anime Girl oh yeah that Makes sense

    • @daniel_pinilla
      @daniel_pinilla 4 years ago +2

      That’s what I was thinking!,

    • @dudeawsomeness1
      @dudeawsomeness1 4 years ago

      Maybe he could give the AI a list of the positions of the body parts and maybe a bit of extra information that tells which one is the head.

  • @Robotron56
    @Robotron56 4 years ago +2

    10:50 HOW IS THAT NOT A THUMBS UP

  • @andrewmanchiraju8005
    @andrewmanchiraju8005 2 years ago

    Man, I wish code bullet went with idea 3. That sounds like it would make an amazing video that could instantly become his most popular video ever.

  • @harryvpn1462
    @harryvpn1462 4 years ago +556

    Ok you made a Snake ai, but does it know how to abuse his Up-Tilt?

  • @SuperNova-nl5eb
    @SuperNova-nl5eb 4 years ago +279

    "it will not take 3 months"
    3 months later: nothing

    • @Melinstri
      @Melinstri 4 years ago

      Pschhh its gonna be 3 months ! In 5 days so it was only 2 months when you wrote the comment 😂😂

    • @carlosesparza5225
      @carlosesparza5225 4 years ago +2

      @@Melinstri Ok Ok now its 3 months+

  • @badyoutuberadrian5724
    @badyoutuberadrian5724 4 years ago +6

    When he said the name Adrian I'm like HOW DID HE KNOW MY NAME then he's like I named it that and now every time I hear Adrian I'm like wassup

  • @slooowz8746
    @slooowz8746 3 years ago +1

    "Can a 2000 degree knife cut through my crippling uni debt" I love it, I'm first year and already crying when I see my uni debt.

    • @VulsikDoSurilim
      @VulsikDoSurilim 2 years ago

      Well at least it might be possible for it to cut through a crippled uni *dept* building. 🤷🤷🤷

  • @sfdfsc2483
    @sfdfsc2483 4 years ago +117

    “Milking snake”
    Code bullet - 2019

  • @MRtecno98
    @MRtecno98 4 years ago +271

    Every vaguely science-related channel: **Exists**
    Brilliant: _allow us to introduce ourselves_

  • @ibrahimabdalla1642
    @ibrahimabdalla1642 4 years ago +1

    I really want to see some tutorials from you!

  • @TheYargonaut
    @TheYargonaut 4 years ago

    Q-learning works on Markov Decision Processes. In order to avoid any memory in the neural network, the input feature vector needs to encode all the information about the state. If it doesn't, you have created a Partially Observable Markov Decision Process (POMDP), and need to add memory for your NN to manage, making it some form of RNN.
    There are efficient ways to decrease the number of input features quickly in your neural net without manually throwing out potentially-necessary information: attention-based neural networks, which, happily, outperform RNNs on the types of sequence tasks where you would normally have this problem.
    I am surprised the 2-frames method was too large an input vector, given that the original paper used 4 frames of an Atari game at a time as the input vector.
    Impressive results though.
    Came here from the next video to learn the rules you imposed, so let's hope you found these ideas in the interim :)
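The frame-stacking idea mentioned above (2 frames in the video, 4 in the Atari setup the comment refers to) can be sketched as follows. The point is that a single still frame does not reveal the direction of travel, while the last k frames together do, so the stacked input is closer to a full Markov state. The `FrameStack` class and its method names are hypothetical, and frames are assumed to be flat lists of numbers.

```python
from collections import deque


class FrameStack:
    """Keep the last k frames and expose them as one flat input vector."""

    def __init__(self, k):
        self.frames = deque(maxlen=k)

    def reset(self, frame):
        # Fill the stack with copies of the first frame so the input size is fixed.
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(frame)
        return self.observation()

    def push(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        # Oldest frame first, newest frame last.
        return [value for frame in self.frames for value in frame]


stack = FrameStack(k=2)
first = stack.reset([1, 0, 0])   # head in cell 0
second = stack.push([0, 1, 0])   # head moved to cell 1
third = stack.push([0, 0, 1])    # head moved to cell 2
```

The input size grows linearly with k, which is the cost the comment is weighing against adding recurrent memory.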

  • @XIIchiron78
    @XIIchiron78 4 years ago +409

    What if you add a penalty for wasting time? For going x amount of time without eating for example.

    • @underrated1524
      @underrated1524 4 years ago +20

      This is a reasonable thing to do, and in fact is pretty commonly done with Q-learning algorithms. It might come with some downsides in the case of snake (getting so greedy for food that you corner yourself) but that seems like a fair price to pay.
      Come to think of it, I wonder if Code Bullet simply chose not to discuss that possibility because the video was already tangential and long-winded as is.

    • @seagullskunk
      @seagullskunk 4 years ago +5

      this was my idea too. Because otherwise the only thing the snake has to do is survive long enough so that it has almost unlimited time to grow slowly

    • @connorkruger
      @connorkruger 4 years ago +2

      Yeah you could give it like a hunger value. if (hunger == 0) { snakeHitpoints = 0; }

    • @connorkruger
      @connorkruger 4 years ago +2

      Sorry for the Java example 😂

    • @Adrobiel
      @Adrobiel 4 years ago +1

      This was something I was considering while watching as well. It might help to create a different behavior than its pong like behavior while searching for food.
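The time penalty and the hunger value proposed in this thread could look roughly like the sketch below. Nothing here comes from the video: the reward values, the `HUNGER_LIMIT` of 100 steps, and the `step_reward` function are assumptions picked only to make the idea concrete.

```python
# All reward values and the hunger limit are illustrative assumptions.
APPLE_REWARD = 1.0
DEATH_REWARD = -1.0
STEP_PENALTY = -0.01
HUNGER_LIMIT = 100  # steps allowed without eating


def step_reward(ate_apple, died, steps_since_apple):
    """Return (reward, episode_over, new_steps_since_apple) for one game step."""
    if died:
        return DEATH_REWARD, True, steps_since_apple
    if ate_apple:
        # Eating resets the hunger counter.
        return APPLE_REWARD, False, 0
    steps_since_apple += 1
    if steps_since_apple >= HUNGER_LIMIT:
        # Starvation ends the episode, just like hitting a wall.
        return DEATH_REWARD, True, steps_since_apple
    # A small cost per step discourages wandering without eating.
    return STEP_PENALTY, False, steps_since_apple
```

As @underrated1524 notes, the size of the per-step penalty matters: if it is too large relative to the death penalty, the snake may learn to rush for food and trap itself.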

  • @sporkator4632
    @sporkator4632 4 years ago +53

    Your Snake's Name is Adrian, mine too. You loaded this video up at the 13.07. which is my birthday.
    Thx

  • @Houdm
    @Houdm 3 years ago

    R.I.P. Code Bullet. He was taken from us too early. I hope his family and friends are doing fine. He will be missed

  • @neoleomedia1676
    @neoleomedia1676 4 years ago

    He’s alive! And just uploaded!

  • @lootbox289
    @lootbox289 4 years ago +35

    Code Bullet: going to milk snake for a fourth video
    Adrian: (chuckles) I'm in danger

  • @SC19_ow
    @SC19_ow 4 years ago +78

    So Adrian is basically looking for the apple randomly, but any human player knows where the apple is all the time, so why not just tell him where it is, either with or instead of the 20x20 fov

    • @kasonnara
      @kasonnara 4 years ago +7

      How do you tell it where it is? 2 solutions:
      1) Give it the full grid, with 0 where there is nothing and 1 where the apple is. This is the solution used here, and it is often used because, in practice, AIs perform well with that form, but it requires a lot of inputs.
      2) Give it coordinates, so 2 inputs, one each for the horizontal and vertical axes:
      Fewer inputs, so far so good. The problem is that AIs often struggle to understand this kind of information, because they need to learn a more complicated task: sometimes x=10 means go up and sometimes it means go down, so the network has to learn something like subtraction... Not as easy as it looks for an AI that learns by itself.

    • @robh5246
      @robh5246 4 years ago +11

      @@kasonnara Maybe you could give the coordinates relative to the head. So negative/positive numbers would always mean the same direction and it would become a minimization problem.
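A small sketch of the head-relative coordinates suggested by @robh5246. With absolute coordinates the network has to work out the subtraction itself. With an offset from the head, the sign of each input always points the same way. The function name `apple_offset`, the scaling to [-1, 1], and the 21x21 board are assumptions for illustration.

```python
def apple_offset(head, apple, width, height):
    """Return the apple position relative to the head, scaled to [-1, 1]."""
    dx = (apple[0] - head[0]) / (width - 1)
    dy = (apple[1] - head[1]) / (height - 1)
    return dx, dy


# Same apple, two different head positions on a hypothetical 21x21 board.
apple = (10, 10)
from_left = apple_offset(head=(5, 10), apple=apple, width=21, height=21)
from_right = apple_offset(head=(15, 10), apple=apple, width=21, height=21)
```

In both cases the absolute apple input would be x=10, yet the correct move differs. The relative input is positive when the apple is to the right of the head and negative when it is to the left, so driving both values toward zero is the minimization problem the comment describes.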

  • @the_vine_queen
    @the_vine_queen 2 years ago +1

    you could also just give it the screen position or coordinates of the apple, the head and the body, since you made the game yourself so it'll be easily accessible to the AI. Then it'll be way less inputs (maybe more as the snake gets bigger but once it reaches the halfway point you could just give it all the empty squares instead)

  • @Danielle-fm2tj
    @Danielle-fm2tj 2 years ago

    The banana man is everyone who immediately thought "Just make the head a different colour"

  • @FacterinoCommenterino
    @FacterinoCommenterino 4 years ago +402

    Today's fact: Alligators will give manatees the right of way if they are swimming near each other.

    • @godlyhax4172
      @godlyhax4172 4 years ago

      wut

    • @ZippyMagician
      @ZippyMagician 4 years ago +1

      Facterino Commenterino go back to hearthstone lmao

    • @poke_1879
      @poke_1879 4 years ago +4

      Damn thats weird, do you know why tho?

    • @eat_you_beans5252
      @eat_you_beans5252 4 years ago +1

      Wait really that is weird the alligator can just attack it and win

    • @cheese0261
      @cheese0261 4 years ago

      The right of way right into their mouth

  • @The_RayBlast
    @The_RayBlast 4 years ago +792

    Why not punish the AI for every 30 seconds it doesn't find the apple?

    • @montylemon9445
      @montylemon9445 4 years ago +33

      Big brain mode

    • @FragileJesseLord
      @FragileJesseLord 4 years ago +8

      I said the same thing!

    • @apples7568
      @apples7568 4 years ago +51

      Nooooo you're gonna make Adrian sad, he isn’t the smartest but he is still a good boy

    • @0xBE7A
      @0xBE7A 4 years ago +46

      No, that won't work. The input needs to be as predictable as possible. By introducing a "death-timer", the network has no idea why the fuck it just died and looks for any possible correlation in the input data. This will result in shitty, unexplainable behavior. en.wikipedia.org/wiki/Markov_decision_process

    • @kajus3616
      @kajus3616 4 years ago +2

      Yeh its like beating his as every second sometimes -_-
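The objection from @0xBE7A is about the timer being hidden from the network: if the snake can die for a reason that is not in its inputs, the problem stops being a Markov decision process from its point of view. One common workaround, sketched here as an assumption rather than anything shown in the video, is to feed the timer in as an extra input. `augment_state` and its parameters are hypothetical names.

```python
def augment_state(features, steps_since_apple, hunger_limit):
    """Append the normalised time left before starvation to the input vector."""
    remaining = max(hunger_limit - steps_since_apple, 0) / hunger_limit
    return list(features) + [remaining]


# Two hypothetical inputs (for example an apple offset) plus the hunger timer.
state = augment_state([0.5, -0.25], steps_since_apple=25, hunger_limit=100)
```

With the timer visible, a timeout death is preceded by an input that steadily falls toward zero, so the network has something concrete to correlate the penalty with.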

  • @aubreyterry9453
    @aubreyterry9453 4 years ago +2

    CodeBullet: Good news I’m back!
    Also CodeBullet: Doesn’t post for another 4 months.

  • @KrazyCrafter421
    @KrazyCrafter421 4 years ago +1

    I think you should have the general direction of the apple to prevent the searching problem