Balancing self-driving training data - Python plays GTA p.10

  • Published 20 Dec 2024

COMMENTS • 72

  • @XenotriX
    @XenotriX 7 years ago +55

    1. Drive around for about 30 minutes using the directional keys
    2. Run balance_data.py
    3. Wonder why it doesn't work
    4. Complain in the comments
    5. Try again with WASD
    Btw, your vids are awesome ;D

  • @pwal6773
    @pwal6773 5 years ago +34

    It looks like a recent numpy update changed np.load(path) to use allow_pickle=False by default. To accommodate this, I needed to change the following line in the balance_data.py script from:
    train_data = np.load('training_data.npy')
    to:
    train_data = np.load('training_data.npy', allow_pickle=True)
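
A minimal sketch of why the flag matters, assuming training_data.npy holds an object-dtype array of [frame, keys] rows (an assumption based on how the scripts in this thread index data[0] and data[1]). The file name demo_training_data.npy and the tiny frames are made up for illustration:

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for training_data.npy: rows of [frame, one-hot keys].
rows = np.empty((2, 2), dtype=object)
rows[0, 0], rows[0, 1] = np.zeros((2, 2), dtype=np.uint8), [0, 1, 0]
rows[1, 0], rows[1, 1] = np.ones((2, 2), dtype=np.uint8), [1, 0, 0]

path = os.path.join(tempfile.mkdtemp(), "demo_training_data.npy")
np.save(path, rows)  # object arrays are stored with pickle

try:
    np.load(path)  # recent NumPy refuses pickled data by default
    default_load_failed = False
except ValueError:
    default_load_failed = True

# Opting in explicitly restores the old behaviour.
train_data = np.load(path, allow_pickle=True)
```

Unpickling can run arbitrary code, so allow_pickle=True is best kept for files you recorded yourself or otherwise trust.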

  • @alexnick7119
    @alexnick7119 6 years ago +7

    This series is just so amazing!
    I love that you fail from time to time and your great explanations.
    "every frame is its own snowflake"

  • @Damaged7
    @Damaged7 7 years ago

    I love your videos. I'm not a great programmer at all but seeing someone with the skills you have still mess up and have fun with it makes me feel better about all the mistakes I make.

  • @rohanarora2728
    @rohanarora2728 7 years ago +4

    This rocks!!!
    Easier, faster and more efficient than the last method!!! GREAT WORK!
    Note: we can all share our data on GitHub, and then everyone will have huge datasets to train from!

    • @sentdex
      @sentdex 7 years ago +3

      Anyone who wants to share some training data is welcome to, I will happily host it and validate it. I think I first want to come up with a final concept before I start building anything too large. I may try to further perfect this traffic speeder guy. Also curious about implementing the evading police a bit more. I am also not sure if I want to stay in 3rd person or move to 1st.

    • @rohanarora2728
      @rohanarora2728 7 years ago

      yeah!
      in next video if you can map mouse input it will bring your AI to next level! :)
      All the best! hoping to see next tutorial soon!

  • @bohdankhv
    @bohdankhv 3 years ago

    I'm following this series because I'm waiting for the neural networks from scratch series, and I wanna build an AI for the Android game that I made :) Much love, Sentdex

  • @theknight2510
    @theknight2510 7 years ago +2

    I'm thinking that pre-allocating the memory for lefts, rights and forwards would be a lot faster. I was looking at this as inspiration for my own data (3-second audio files). I have about 700,000 of them, and pre-allocating memory helped make it blazingly fast. I was also using numpy arrays instead of lists though.
    P.S. Still my favourite youtube channel. Sorry, Siraj.

    • @theknight2510
      @theknight2510 7 years ago

      A lot more cumbersome though :(

    • @andrybratun7064
      @andrybratun7064 6 years ago

      pythonprogramming.net/more-interesting-self-driving-python-plays-gta-v/?completed=/testing-self-driving-car-neural-network-python-plays-gta-v/

    • @shivamraisharma1474
      @shivamraisharma1474 5 years ago

      What is pre allocation and how to do it?
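
A rough sketch of what pre-allocation could look like for the lefts/forwards/rights split: count each class first, create containers of the final size, then fill them by index instead of growing them with append. The function name bucket_preallocated, the tuple label encoding and the placeholder frames are all assumptions made for this illustration:

```python
from collections import Counter

# Assumed one-hot encoding, stored as tuples so they can be dict keys.
LEFT, FORWARD, RIGHT = (1, 0, 0), (0, 1, 0), (0, 0, 1)


def bucket_preallocated(samples):
    """Split (frame, label) pairs into per-class lists sized up front."""
    counts = Counter(label for _, label in samples)   # first pass: count
    buckets = {lab: [None] * counts[lab] for lab in (LEFT, FORWARD, RIGHT)}
    filled = dict.fromkeys(buckets, 0)                # next free slot per class
    for frame, label in samples:                      # second pass: fill by index
        buckets[label][filled[label]] = frame
        filled[label] += 1
    return buckets


samples = [("f0", FORWARD), ("f1", LEFT), ("f2", FORWARD), ("f3", RIGHT)]
buckets = bucket_preallocated(samples)
```

With NumPy arrays the same idea would be an np.empty((n, height, width), dtype=np.uint8) buffer per class that is written into by index.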

  • @sethbettwieser
    @sethbettwieser 6 years ago +2

    I laughed when he pasted in rights a third time at 10:43.

  • @cashdogg411
    @cashdogg411 6 years ago +2

    Is Training-data-vid.npy a separate file you trained, or did you add '-vid' to the original file to see what was going on? I'm a little confused on that, thanks!

    • @saint-jiub
      @saint-jiub 2 years ago

      he changes it back ua-cam.com/video/wIxUp-37jVY/v-deo.html

  • @abdengineer6225
    @abdengineer6225 4 years ago

    Hello, can anyone please explain the numbers which appear at 5:59? Do they contain the slopes, and what other information is in them?

  • @tuhinmukherjee8141
    @tuhinmukherjee8141 3 years ago

    Maybe one shouldn't break the temporal/linear consistency of the data. Instead, pack the data into tuples of size 2 or 3, depending on your choice of threshold, following the Markov property, and shuffle those tuples rather than the individual frames. For example:
    1. Break the list into tuples of size, let's say, 3:
    new_data = list(zip(data[::3], data[1::3], data[2::3]))
    2. Shuffle the new data instead:
    shuffle(new_data)
    I might be wrong but maybe this could be a better input to feed to a neural network rather than a single frame at a time.
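
The chunked-shuffle idea above can be sketched as follows. This is only an illustration: shuffle_in_chunks, chunk_size and seed are names invented here, and plain integers stand in for the recorded frames:

```python
import random


def shuffle_in_chunks(data, chunk_size=3, seed=None):
    """Group consecutive samples into chunks, then shuffle the chunks only."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    random.Random(seed).shuffle(chunks)  # order inside each chunk is untouched
    return chunks


frames = list(range(10))  # stand-in for consecutive frames
chunks = shuffle_in_chunks(frames, chunk_size=3, seed=0)
```

Slicing keeps the trailing incomplete chunk, whereas the zip version silently drops the last one or two frames when the length is not a multiple of 3.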

  • @TheDeadking100
    @TheDeadking100 3 years ago

    Hey Sentdex, I love this series. I have one question: how come you didn't divide your image data values by 255, so that they fit between 0 and 1? I thought this was important for the model to work with the data better. Was this step left out intentionally?

  • @WoodyWilliams
    @WoodyWilliams 7 years ago +1

    Freeze-framing at 12:34 the math of your balance slices doesn't add up. Does it matter? What's making it not add up?
    70365 - forwards
    6708 - rights
    +6427 - lefts
    ----------
    83500 total!! Sweet, that's = len(final_data), but...
    Taking the smallest (lefts) and trimming the others to its len() should create 3x6427 = 19281 < 22436. Also 22436 % 3 != 0. Our final_data isn't really balanced??
    What's causing this?

    • @geniousofdarkness
      @geniousofdarkness 7 years ago

      There should be rights = rights[:len(forwards)] instead of rights = rights[:len(rights)].

    • @WoodyWilliams
      @WoodyWilliams 7 years ago +1

      Thanks GeniusOD. I and hopefully everyone else caught that error in their code (if they coded alongside the video). The github code was corrected by Sentdex quickly but I didn't notice he'd actually run the erroneous code in the video. Thanks for pointing that out.
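
For reference, a small sketch of the corrected trimming step discussed in this thread, where every class is cut to the length of the shortest one. The function name balance, the seed parameter and the toy samples are assumptions for illustration, not the code from the video:

```python
import random


def balance(lefts, forwards, rights, seed=None):
    """Trim every class to the size of the smallest one, then shuffle."""
    n = min(len(lefts), len(forwards), len(rights))
    final_data = forwards[:n] + lefts[:n] + rights[:n]
    random.Random(seed).shuffle(final_data)
    return final_data


# Toy samples: 5 lefts, 50 forwards, 7 rights.
lefts = [("L", i) for i in range(5)]
forwards = [("F", i) for i in range(50)]
rights = [("R", i) for i in range(7)]
final_data = balance(lefts, forwards, rights, seed=0)

# The slice rights[:len(rights)] is a no-op, so rights would stay untrimmed:
buggy_total = 5 + 5 + len(rights[:len(rights)])
```

In the toy case the balanced set has 15 samples, while the no-op slice would leave 17, which is not a multiple of 3.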

  • @uobscdarkside732
    @uobscdarkside732 7 years ago +2

    Silly question, but wouldn't setting the lengths of lefts, rights and forwards equal just make each one equally probable, as if you'd pressed them an equal number of times, thus making the training pointless? What am I missing / not understanding?

  • @dennischeung3745
    @dennischeung3745 7 years ago +1

    Can anyone explain more about the purpose of balancing the data? Doesn't that make "left" and "right" more important and "straight" less important in the dataset, causing the model to generate more "left" and "right" signals than it should?

    • @ashwhall
      @ashwhall 7 years ago

      cheung dennis
      It doesn't make the lefts or rights more important, it just shows more samples where the correct action is to turn left or right.
      The problem is actually the reverse. Without balancing the data we're teaching the network that going straight is the correct answer 90% of the time. The easiest solution for the network to learn then is to just always say go straight, unless it's VERY confident that it should turn. Doing so guarantees it a 90%+ success rate - much better than having to actually learn what the correct option is.
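
The argument above can be checked with a few lines of arithmetic. The 90/5/5 split below is a hypothetical dataset chosen to match the "90% of the time" figure, not the real recording:

```python
from collections import Counter

# Hypothetical unbalanced labels: 90% forward, 5% left, 5% right.
labels = ["forward"] * 90 + ["left"] * 5 + ["right"] * 5

# A "model" that ignores the image and always answers with the majority class.
majority_class, majority_count = Counter(labels).most_common(1)[0]
predictions = [majority_class] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# How often does it get a turn right? Look only at frames labelled left/right.
turn_pairs = [(p, y) for p, y in zip(predictions, labels) if y != "forward"]
turn_accuracy = sum(p == y for p, y in turn_pairs) / len(turn_pairs)
```

The always-forward predictor scores 90% overall while never producing a single correct turn, which is the shortcut that balancing is meant to remove.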

  • @jyashi1
    @jyashi1 5 years ago

    At what point was the labeling of the images done?

  • @xR0G3R
    @xR0G3R 7 years ago

    You could mirror your 'right' and 'left' part of the dataset, right? That way you should be able to augment the number of the not-forward data.
    Let me know if this does not work.
    btw, great tutorials :}
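
A sketch of the mirroring idea, assuming each frame is a 2-D grid of pixel values and each label is a one-hot list in the order [left, forward, right]. That order is an assumption, and mirror_sample is a name made up for this example:

```python
def mirror_sample(frame, label):
    """Flip a frame left-to-right and swap the left/right entries of its label."""
    flipped = [row[::-1] for row in frame]
    left, forward, right = label  # assumed order: [left, forward, right]
    return flipped, [right, forward, left]


frame = [[1, 2, 3],
         [4, 5, 6]]
flipped, label = mirror_sample(frame, [1, 0, 0])  # a "left" sample
```

For NumPy frames, np.fliplr(frame) performs the same flip. Mirroring also flips anything asymmetric on screen, such as the minimap or the side of the road the car drives on, so whether it helps here would need testing.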

  • @mugundhanbalaji
    @mugundhanbalaji 7 years ago +2

    can you explain why you did forwd= forwd[:len(left)][:len(right)]???

    • @sentdex
      @sentdex 7 years ago +9

      The goal is to slice that list so it's only as long as the shortest list.
      Let's say forward is 500 long, left is 205 long, and right is 298 long.
      forward = forward[:len(left)] makes forward 205 long.
      Then when we also do [:len(right)] , we're saying we'll slice up to 298, but the length is already 205, so the length is still 205. Hope that clears it up. If not, make some examples for yourself and play with it to see how it works.

    • @mugundhanbalaji
      @mugundhanbalaji 7 years ago +3

      equivalent of forward = forward[: min(len(left),len(right))]????

    • @sentdex
      @sentdex 7 years ago +4

      Yep, that's right.

    • @mugundhanbalaji
      @mugundhanbalaji 7 years ago

      Thanks man
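
The lengths from this exchange (500, 205 and 298) can be plugged in directly to compare the chained slice with the min() form. The list contents are placeholders:

```python
forward = list(range(500))
left = list(range(205))
right = list(range(298))

# Chained slices: the second slice cannot make the list longer again.
chained = forward[:len(left)][:len(right)]

# Single slice up to the shorter of the two lengths.
direct = forward[:min(len(left), len(right))]

# The order of the two slices does not matter either.
swapped = forward[:len(right)][:len(left)]
```

All three results hold the same first 205 items, since slicing past the end of a list is not an error in Python.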

  • @akcricketlive6029
    @akcricketlive6029 6 years ago

    Where is the training data hosted?

  • @dosonleung536
    @dosonleung536 6 years ago +2

    I think LSTM + CNN will play better than a simple CNN, because we should know our speed as short-term memory.

  • @abbasshodroj6805
    @abbasshodroj6805 5 years ago

    Any help? While running the balance script I get this error: AttributeError: 'NoneType' object has no attribute 'fileno'

  • @sminsms
    @sminsms 7 years ago +10

    Please do a neural network that silences the noise from your keyboard

    • @sentdex
      @sentdex 7 years ago +35

      +sminsms wouldn't that cancel out my neural network that amplifies my keyboard noise?

    • @FalloutNewNarwhal
      @FalloutNewNarwhal 7 years ago

      sentdex Shook 😲

    • @emretatbak
      @emretatbak 5 years ago

      Keyboard noise sounds great only for me? :D

  • @junweima
    @junweima 7 years ago

    I feel like training with imitation learning before actual DQN or DDPG is a good idea

  • @h0len
    @h0len 6 years ago

    Small question: I've created about 140 files, each with 500 iterations, but when I load the files I get different counter values. Does anyone have a clue what is happening? I'm wondering if it is just a memory error or something. To clarify, the counter is for all of the files.

  • @Parth.Deshpande
    @Parth.Deshpande 3 years ago

    For those who're not able to get correct values for [W,A,D] / getting [0,1,0] always.
    1. Run the terminal/anaconda prompt as administrator & then run the python file.
    2. Run the game as administrator
    3. Turn on CAPS-LOCK

  • @bchoor
    @bchoor 7 years ago

    really enjoy your videos; would appreciate if you can balance your voice volume with your very loud keyboard. maybe possibly moving your mic, or using a different keyboard would be super awesome! still love your videos!

  • @sandeepganesh7397
    @sandeepganesh7397 4 years ago

    Can anyone please share their 'training_data.npy' ?!

  • @davidwang4461
    @davidwang4461 7 years ago

    Could anyone please explain to me why the number of data after balancing, which is 22436, is not equal to three times the least number of choices?

    • @davidwang4461
      @davidwang4461 7 years ago

      I think the shuffle here has a problem. I tried this code and found that after shuffling, the number of each choice changed, which confused me.

    • @asharkhan6714
      @asharkhan6714 6 years ago

      pass train_data as a list to shuffle. shuffle([train_data])

  • @outroutono4937
    @outroutono4937 1 year ago

    4:10 - 4:24 best part

  • @jfliu730
    @jfliu730 2 years ago

    You imported the wrong shuffle function;
    it should be np.random.shuffle, not random.shuffle.
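
A small experiment that may explain the changing counts reported elsewhere in the comments, assuming the data being shuffled is a 2-D NumPy array rather than a plain Python list. The toy array below is made up, with each row standing in for one [frame, choice] sample:

```python
import random

import numpy as np

rows = np.arange(20).reshape(10, 2)  # ten distinct rows

# random.shuffle swaps x[i] and x[j]; on a 2-D array those are views, so one
# row overwrites the other instead of trading places with it.
broken = rows.copy()
random.Random(0).shuffle(broken)
distinct_after_random = len({tuple(r) for r in broken.tolist()})

# np.random.shuffle permutes along the first axis and keeps every row.
safe = rows.copy()
np.random.seed(0)
np.random.shuffle(safe)
distinct_after_numpy = len({tuple(r) for r in safe.tolist()})
```

Comparing the two distinct-row counts shows whether rows were lost. When the samples live in a plain Python list, random.shuffle is fine, since list elements are swapped as whole objects.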

  • @TechAspiron
    @TechAspiron 6 years ago

    After getting my training data in training_data.npy and running balance_data.py, I get 'None' value for each iteration. Can someone tell me what my error is?

    • @TechAspiron
      @TechAspiron 6 years ago

      SOMEONE PLEASE REPLY

    • @dwightschrute782
      @dwightschrute782 6 years ago

      I’ve had this issue as well, I believe it’s a bug that’s ongoing

    • @gavargas22
      @gavargas22 5 years ago +1

      Do you have a link to your source code? No one is responding probably because it is hard to know what you did wrong because we can't see your code

  • @i_norwe_i
    @i_norwe_i 6 years ago +1

    It's amazing

  • @yashshrivastava1648
    @yashshrivastava1648 6 years ago +1

    Mine always shows [0,1,0]

    • @cangunen2165
      @cangunen2165 5 years ago

      Mine is kind of similar: [1,0,0]. Did you find a solution?

    • @Parth.Deshpande
      @Parth.Deshpande 3 years ago +1

      Caps-Lock

    • @Parth.Deshpande
      @Parth.Deshpande 3 years ago

      I FOUND THE SOLUTION!!!
      1. Run the terminal/anaconda prompt as administrator & then run the python file.
      2. Run the game as administrator
      3. Turn on CAPS-LOCK

  • @rohanshankar4576
    @rohanshankar4576 7 years ago

    Have you trained it using an RNN?
    Should work better I guess.

  • @deknas1407
    @deknas1407 4 years ago

    This works 2020-02-24:
    import numpy as np
    import pandas as pd
    from _collections import _count_elements
    from random import shuffle
    import cv2
    trainin_data = np.load("training_data-vid.npy", allow_pickle=True)
    for data in trainin_data:
        img = data[0]
        choice = data[1]
        cv2.imshow("test", img)
        print(choice)
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break

  • @tuhinmukherjee8141
    @tuhinmukherjee8141 3 years ago

    play GTA-V FOR SCIENCE!!
    right 😂

  • @cerpokas
    @cerpokas 7 years ago

    You could try balancing your data using weighted_cross_entropy_with_logits
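
The function named above belongs to TensorFlow (tf.nn.weighted_cross_entropy_with_logits). The sketch below does not call it; it swaps in a plain-Python class-weighted cross-entropy to show the general idea of weighting the loss instead of discarding samples. The helper names and the example probabilities are invented, and the class counts are the ones quoted earlier in this thread:

```python
import math


def class_weights(counts):
    """Inverse-frequency weights; a perfectly balanced set would get 1.0 each."""
    total = sum(counts.values())
    return {c: total / (len(counts) * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, true_class, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[true_class] * math.log(probs[true_class])


counts = {"left": 6427, "forward": 70365, "right": 6708}
weights = class_weights(counts)

# Hypothetical model output for a frame whose true label is "left".
probs = {"left": 0.2, "forward": 0.7, "right": 0.1}
loss = weighted_cross_entropy(probs, "left", weights)
```

Rare classes get larger weights, so a mistake on a turn costs more than a mistake on a forward frame, while every recorded frame stays in the training set.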

  • @DimulyaPlay
    @DimulyaPlay 3 years ago

    from pandas import 🐼🐼🐼🐼🐼

  • @Ятич
    @Ятич 2 months ago

    help

  • @herp_derpingson
    @herp_derpingson 7 years ago

    Host the data separately, don't bundle it with the code. Thanks.

    • @sentdex
      @sentdex 7 years ago +2

      It's already been hosted, and it's not bundled with the code.

    • @akcricketlive6029
      @akcricketlive6029 6 years ago

      where can I find the training data?