Deep Q Learning for Video Games - The Math of Intelligence #9

  • Published 19 Oct 2024

COMMENTS • 246

  • @jmzg1229
    @jmzg1229 7 years ago +18

    Hey Siraj, I don't think any of us can beat the Dota 2 bot that OpenAI just unveiled. Those guys really deserve a shout-out.

  • @ben_jamin01
    @ben_jamin01 10 months ago

    This is a high quality video and I'm sure a lot of people can tell you put a lot of effort into these.

  • @satindershergilla
    @satindershergilla 7 years ago +1

    Finally a YouTuber I always wanted to watch. Speedy, cool, and great information

  • @herougo
    @herougo 7 years ago +5

    Hi Siraj, could you include pseudocode of the algorithms you talk about? I think it is crucial to be able to implement the algorithms you learn about (i.e. "What I cannot code myself, I do not understand"). Explaining pseudocode is a great way to communicate algorithms in a clear, complete, and unambiguous way.
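
    In the spirit of that request, here is a minimal sketch of the tabular Q-learning update the video builds on. The constants, the table sizes, and the `env` object are illustrative assumptions (a hypothetical Gym-style environment), not the video's exact code:

        import numpy as np

        n_states, n_actions = 16, 4                # illustrative sizes
        alpha, gamma, epsilon = 0.1, 0.9, 0.1      # learning rate, discount, exploration rate
        Q = np.zeros((n_states, n_actions))        # the Q-table: one value per (state, action)

        for episode in range(1000):
            s = env.reset()                        # env: hypothetical Gym-style environment
            done = False
            while not done:
                # epsilon-greedy: explore randomly, otherwise take the best known action
                if np.random.rand() < epsilon:
                    a = np.random.randint(n_actions)
                else:
                    a = int(np.argmax(Q[s]))
                s2, r, done, _ = env.step(a)
                # the core Q-learning (Bellman) update
                Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
                s = s2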

  • @giuseppeguap7250
    @giuseppeguap7250 5 years ago

    Just saw this now, your jokes were KILLING IT back then

  • @ShaneCalderssc
    @ShaneCalderssc 7 years ago

    Thanks Siraj. Can't wait for the Super Mario Bros Bot. I enjoyed your videos in the deep learning ND. Cheers, your effort is appreciated.

  • @basharjaankhan9326
    @basharjaankhan9326 7 years ago +3

    OMG, I googled "Q Learning with Neural Network" a few months back without realising it was this important.

  • @aksa706
    @aksa706 7 years ago +60

    I can't feel my brain anymore

    • @SirajRaval
      @SirajRaval  7 years ago +4

      oh no

    • @aliciasuper7014
      @aliciasuper7014 5 years ago

      Tell me about it :/

    • @randomorange6807
      @randomorange6807 4 years ago +1

      Hey, did you know? The brain can't feel pain; it senses the body's pain and transmits it, but it can't feel the pain itself...
      ...It can still get hurt though, and get brain damage

    • @janmichaelbesinga3867
      @janmichaelbesinga3867 4 years ago

      @@randomorange6807 but what if you punch a brain? does it feel pain?

  • @luckybrandon
    @luckybrandon 7 years ago +2

    One of your best vids Siraj!!

  • @piyushgupta809
    @piyushgupta809 7 years ago

    Great improvement, brother. I'm sorry, but the previous videos were not good. Nice tutorial and intuition, although I do recommend watching DeepMind's reinforcement learning tutorial before jumping into the practical application.

  • @shreyas707
    @shreyas707 7 years ago

    I don't understand 10% of what you say but your videos are just epic! Please keep posting them often :)

  • @lampsonnguyen9425
    @lampsonnguyen9425 4 years ago

    you explained it so well. thank you

  • @Machinima2013
    @Machinima2013 7 years ago +5

    You should do a video comparing this with NEAT, which is popular for this same use case.

  • @rheeaa
    @rheeaa 7 years ago

    Siraj, I'm a huge fan of your YouTube channel and I truly admire the way you taught yourself ML. I'm in my final year of undergrad, and I was thinking of not pursuing a master's degree rn. Any advice on what resources to use to teach myself ML, or how to get some industry-level exposure?
    Thanks in advance 😉

    • @SirajRaval
      @SirajRaval  7 years ago +1

      thanks rhea! see the ML subreddit

  • @insightfulgarbage
    @insightfulgarbage 6 years ago

    Very nice information and rhythm, subscribed!

  • @prashanttarey9902
    @prashanttarey9902 6 years ago

    Awesome and optimized explanations in all the videos! Thanks a lot!!

  • @Machin396
    @Machin396 6 years ago +1

    Your videos are amazing, thanks.

  • @weinansun9321
    @weinansun9321 7 years ago

    thank you Siraj for your awesome content, you really made learning fun and easier!

  • @qwerty11111122
    @qwerty11111122 7 years ago

    Hi Siraj, could you have a video mention the OpenAI bot that beat a pro gamer at Dota 2 a few days ago? It's great that you released this video so close to this current event

  • @boemioofworld
    @boemioofworld 7 years ago

    That was an awesome explanation. Thanks.

  • @HuyNguyen-rt7eb
    @HuyNguyen-rt7eb 7 years ago

    Hey Siraj, great job on the videos. :) What do you think of the Dota 2 AI that beat a pro player?

  • @tylersnard
    @tylersnard 5 years ago

    Smart guy, talented teacher.

  • @nermienkhalifa5997
    @nermienkhalifa5997 6 years ago

    Great, I do really love your way of explaining!! thanks

  • @hanyuliangchina
    @hanyuliangchina 7 years ago +1

    I like this interesting video better than the previous purely theoretical one; more humor is better.
    For me, the most important questions now are:
    1. For machine learning beginners training models, which is better: buying a GPU graphics card, or buying Amazon cloud GPU hours?
    2. Tips for configuring a deep learning environment.
    3. Tips for programming and development skills.

  • @Chris-di3us
    @Chris-di3us 7 years ago

    I love you man, I always wanted to do this myself

  • @nitishravishankar6586
    @nitishravishankar6586 6 years ago

    Thanks a lot Siraj! This video provided great insight into applications of Q-learning and RL. Are there any programming assignments (that include a dataset) for this?

  • @stuartdavid
    @stuartdavid 7 years ago

    Very nice! Do you have a video with more detail on Q learning? Would be interesting to see how the Q matrix evolves over play of a simple game.

  • @altafurrahman9404
    @altafurrahman9404 5 years ago

    Hi Siraj, I'm going to do a path-planning project to navigate a robot with Q-learning. What minimum hardware will be required? Do we need a GPU? Will a Core i5 PC with only a CPU be enough?

  • @jobrown04
    @jobrown04 6 years ago

    Hi Siraj. Have you thought about using Capsules (CapsNet), instead of just not having a MaxPooling layer?

  • @look3248
    @look3248 7 years ago +58

    Hey Siraj could you expand on this topic and explain how Sethbling's MarI/O program works?

    • @xPROxSNIPExMW2xPOWER
      @xPROxSNIPExMW2xPOWER 7 years ago +6

      I believe Siraj already has a video on genetic-evolution decision making, if I'm not mistaken. Doesn't Seth explain it pretty in-depth though? He talks about everything from the math to how he programmed it, with Perl I think.

    • @SirajRaval
      @SirajRaval  7 years ago +14

      genetic algo vid coming this week (similar to what he used)

    • @-justyourfriendlyneighborh5898
      @-justyourfriendlyneighborh5898 7 years ago +1

      Siraj Raval Hey Siraj, in a previous stream you mentioned that learning this kind of thing (neural networks/machine learning) is best done on the internet. I was wondering, for a near-complete beginner (minor experience with Processing.JS), where would you suggest I start? (I'm 15 and want to get into this field as soon as possible)

    • @flyingsquirrel3271
      @flyingsquirrel3271 7 years ago +2

      icancto
      Did you read the NEAT paper? If not, I'd recommend it, because it's actually really smart and comprehensible. NEAT is not just picking the best randomly generated genomes; it uses a crossover mechanism which makes sure that only connections with a similar "purpose" inside the neural net are crossed over. It can intelligently cross over neural networks of different topologies, which are created through mutation, starting from minimal neural networks. That way it improves the weights AND selects the ideal topology of the neural nets.
      Comparing NEAT to back-propagation doesn't make any sense, because its purpose is to be used when you can't use back-propagation. MarI/O is a good example of this. What target data would you use for back-propagation there? ;-)

    • @TheAnig140895
      @TheAnig140895 6 years ago

      He used Lua

  • @user-zu1ix3yq2w
    @user-zu1ix3yq2w 7 years ago +8

    RIP Chester.

  • @koppuravuriravisankar7954
    @koppuravuriravisankar7954 7 years ago

    Hi Siraj, I love your teaching style, and I am a member of Udacity's Deep Learning Foundation program in which you are an instructor. My doubt is: can we use deep Q-learning in situations where there is no image or pixel input? If yes, can you tell how? I have read that instead of a table (state x action) we can use neural networks to build the Q-function. Can you explain this, or if possible do a video about it?

  • @yehorpererva6803
    @yehorpererva6803 7 years ago

    Cool video. Thanks.
    But how do you adjust this for a certain purpose (like collecting all coins / getting the lowest score / speedrunning)?

  • @Rambo9677
    @Rambo9677 6 years ago

    Great video Siraj, thanks.
    But I don't get something: how do you input 4 game screens?
    Do you combine them as one input?
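
    For the record, the standard DQN answer (from the DeepMind Atari paper) is to stack the last 4 preprocessed frames along the channel axis, so the network sees a single (84, 84, 4) tensor. A minimal NumPy sketch, with the frame size as an illustrative assumption:

        import numpy as np
        from collections import deque

        frames = deque(maxlen=4)                 # the 4 most recent 84x84 grayscale frames
        for _ in range(4):
            frames.append(np.zeros((84, 84)))    # bootstrap with blank frames

        def stack_frames(new_frame):
            """Append the newest frame and return one (84, 84, 4) network input."""
            frames.append(new_frame)
            return np.stack(list(frames), axis=-1)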

  • @frankribery3362
    @frankribery3362 5 years ago

    That part where he says "hello world, it's Siraj"... I'm replaying it again and again because it's so funny xD

  • @rolininthemud
    @rolininthemud 7 years ago

    I understand that a convolutional neural network can be used to simplify the state from an array of pixels to a smaller collection of values, but how does the algorithm use a deep network to approximate the Q-function? 8:19
    Thank you!

  • @JakubRohla
    @JakubRohla 7 years ago +4

    I still don't understand how we can store these Qs. Wouldn't they contain quadrillions of states and actions for a pretty simple game? That seems pretty inefficient, so I would love to know where I'm wrong in my understanding of Q-learning. Is there some generalization in place, or what?

    • @poc7158
      @poc7158 7 years ago +8

      You can store all possible actions for all possible states in a matrix for a simple game like tic-tac-toe. However, as you say, that is impossible for more complex games, which is why we use a neural network that replaces this matrix: it takes the pixels of the screen as input (the state) and outputs an action. After training it is supposed to give the optimal action for any state we give as input.

    • @SirajRaval
      @SirajRaval  7 years ago +1

      great answer pierre

    • @JakubRohla
      @JakubRohla 7 years ago

      Thanks for the reply, this clarified it for me. Much thanks ^^

    • @ml-squared
      @ml-squared 5 years ago

      The way this works is by approximating an optimal Q function. A Q function is a function of state and action, so Q(s,a). Q*(s,a) is the optimal Q function. This is great for games with few states, but because of combinatorics, it does not scale to games with hundreds of thousands of states, such as video games. To accommodate this, we approximate Q* by using a parameterized Q function, Q(s,a,Theta), where Theta is a set of parameters that we need to optimize to bring us to approximating Q*. A type of function that's excellent at iteratively approximating functions through parameters is a neural network. So that's where Deep Q learning comes in, optimizing a neural network to approximate Q*.
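
      To make that concrete, here is a minimal sketch of such a parameterized Q-network in Keras. The layer sizes, input shape, and action count are illustrative assumptions, not the exact architecture from the video:

          import numpy as np
          from tensorflow.keras import layers, models

          num_actions = 6                          # illustrative: one output per possible action
          model = models.Sequential([
              layers.Input(shape=(84, 84, 4)),     # 4 stacked grayscale frames as the state s
              layers.Conv2D(32, 8, strides=4, activation="relu"),
              layers.Conv2D(64, 4, strides=2, activation="relu"),
              layers.Flatten(),
              layers.Dense(256, activation="relu"),
              layers.Dense(num_actions),           # Q(s, a; Theta), one value per action
          ])
          model.compile(optimizer="adam", loss="mse")

          state = np.zeros((1, 84, 84, 4))         # dummy state for illustration
          q_values = model.predict(state)          # shape (1, num_actions)
          best_action = int(np.argmax(q_values))   # greedy action under the current Theta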

  • @arafatullahturjoy5380
    @arafatullahturjoy5380 5 years ago

    Can Q-learning be used for solving classification problems? If so, how? Could you explain or make a video on this topic? If you do, it will be very helpful.

  • @thedeliverguy879
    @thedeliverguy879 7 years ago

    Thanks for the great video. I'm still confused how this algorithm can generalize to any game. Is the generalization of the algorithm different from the generalization of a specific AI program? Since the input and labels (or controls/buttons, whatever) are fixed in a game, I don't think you can make an AGI with just this algorithm.

  • @Veptis
    @Veptis 3 years ago

    I am starting a deep learning course at university this semester. And maybe I can do a homework project. There is a mobile game from my childhood: Mirror's Edge mobile which launched on iOS and Windows Phone in like 2011 but is no longer available. If I somehow find a way to emulate the game on a computer and get either frames or game state values and manage to give it one of four different inputs per frame, I might try and teach a network to play the game. I also want to have it beat levels really fast and explore speedrunning this way.

  • @chiranshuadik
    @chiranshuadik 7 years ago +4

    Nice video!
    Are you coming to Pune or Mumbai?

    • @SirajRaval
      @SirajRaval  7 years ago +6

      Mumbai

    • @chiranshuadik
      @chiranshuadik 7 years ago +1

      Siraj Raval When and where?
      Can your fans meet you?

  • @abhinashkhare1933
    @abhinashkhare1933 7 years ago

    Hey Siraj, can you help me with this: in SethBling's video, the bot learned to play a Mario level, but he didn't use the learning on new data or levels. Isn't this overfitting? I mean, the bot just learned that level by trial and error.

  • @hangchen
    @hangchen 5 years ago +2

    7:46 Well, I don't think the pooling layer is used to become insensitive to the locations of objects in an image. The convolutional layer can already do that, since the convolution operation is a pixel window going from location to location until all locations are covered under the set stride. The pooling layer is used to semantically merge similar features into one: as in the max-pooling example used in this video, the image is partitioned into 4 parts and in each part the max number is preserved. The max number can semantically represent a feature in that region. It's more like image compression, but we have preserved the key features of the object in the image. Feeding this pooled image into the neural net can be more efficient.
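
    For a concrete picture of the 2x2 max pooling described above, a tiny NumPy sketch (the array values are made up):

        import numpy as np

        image = np.array([[1, 3, 2, 0],
                          [4, 2, 1, 1],
                          [0, 1, 5, 2],
                          [2, 0, 1, 3]])

        # 2x2 max pooling with stride 2: keep the max of each 2x2 block
        pooled = image.reshape(2, 2, 2, 2).max(axis=(1, 3))
        print(pooled)  # [[4 2]
                       #  [2 5]]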

  • @mankitpong5591
    @mankitpong5591 7 years ago

    The videos of David Silver from DeepMind are worth watching; they might be the best reinforcement learning courses on the web.

  • @johndoe-ug3lo
    @johndoe-ug3lo 7 years ago

    So I am working on an AI for a hidden-information game (for the sake of simplicity, you can think of poker). Optimal play would actually be a Nash equilibrium problem, where each action is taken some percentage of the time. Would the proper way to make an AI for this be to use a random number generator and scale the frequency of each action to its Q-value?
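
    One standard way to get that behavior is softmax (Boltzmann) exploration: sample each action with probability proportional to exp(Q/temperature). A minimal sketch with made-up Q-values:

        import numpy as np

        q_values = np.array([1.0, 2.5, 0.3])    # illustrative Q-values for three actions
        temperature = 1.0                       # lower => closer to always picking the best

        logits = q_values / temperature
        probs = np.exp(logits - logits.max())   # subtract the max for numerical stability
        probs /= probs.sum()
        action = np.random.choice(len(q_values), p=probs)   # frequency scales with Q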

  • @GKS225
    @GKS225 7 years ago +23

    And now OpenAI beats humans in Dota 2 1v1 matchups

    • @SirajRaval
      @SirajRaval  7 years ago +7

      stuff is moving fast

    • @readingsteiner6061
      @readingsteiner6061 7 years ago +1

      Sir, I don't know who you are, but you totally blew me away with your comment. It is very rare to come across an individual who did us (the viewers on the Internet) a huge favor by debunking certain methodologies in machine learning. I would love to see more of your writings. Folks at Vicarious are of a different breed, I believe; maybe it is because of their influence by the Redwood Neuroscience Institute. It would certainly be a privilege if you consider my request. Truly humbled. Thanks Sir. I hop

  • @karljay7473
    @karljay7473 6 years ago

    Can't find the links to the winner and runner-up. Great series of videos!

  • @haziqhamzah3071
    @haziqhamzah3071 5 years ago +1

    Can you give some insights into Deep Q-Learning for mobile networking?

  • @herstar9510
    @herstar9510 2 years ago

    When are you forming a mid 90s boy band with machine learning themed ballads?

  • @neonipun
    @neonipun 5 years ago

    At 8:08, what's the input_shape supposed to be? The challenge code and what you show are different...

  • @larryteslaspacexboringlawr739
    @larryteslaspacexboringlawr739 7 years ago

    Thank you for the deep Q video game video

  • @fayezbayzidify
    @fayezbayzidify 7 years ago

    First! But seriously, nice vid Siraj, you are amazing at what you do!

  • @AliAkhtarcs
    @AliAkhtarcs 6 years ago

    What is the difference between static and dynamic dataset? Can you elaborate more?

  • @voodoocobblestone8320
    @voodoocobblestone8320 7 years ago +47

    I cannot understand your videos. How should I start learning?

    • @vladislavdracula1763
      @vladislavdracula1763 7 years ago +20

      Start by learning basic calculus, statistics, and linear algebra. Once you understand the basics, learning advanced concepts is not that hard.

    • @xPROxSNIPExMW2xPOWER
      @xPROxSNIPExMW2xPOWER 7 years ago +26

      No, TensorFlow and most of the other libraries handle almost all of the higher-level math; all you will need, buddy, is to learn basic object orientation and then move into ML techniques. Don't fret, most of the complex math has been solved; all you will need to do is creatively implement it. Trust me, it gets very easy once you learn the flow. If you are interested in advanced topics where you want to build your own ML algorithm, then learning linear algebra, with an emphasis on higher-dimensional linear algebra, will help greatly.

    • @hammadshaikhha
      @hammadshaikhha 7 years ago +3

      As others have mentioned, having a math and some machine learning background helps you understand these faster-paced videos. Another thing you can do is look in the description, read some of the blogs on the topics under "learning resources", and then come back and watch the video again; it should make more sense.

    • @MachineLearningwithPhil
      @MachineLearningwithPhil 7 years ago +1

      A great place to start is Coursera's class on machine learning. It's free and a solid intro to the core concepts.
      From there, there are plenty of step-by-step tutorials on YouTube. Sentdex has a great channel with lots of content; check him out.

    • @notapplicable7292
      @notapplicable7292 7 years ago +2

      A tip if you're trying to start: don't start with Siraj. Start with someone slower (possibly the Udemy machine learning micro-degree), as Siraj is very fast; awesome for expanding your understanding, but hard to start learning with.

  • @Sohlstyce
    @Sohlstyce 3 years ago

    Hey Siraj, can you make a full tutorial on reinforcement learning? Thanks Siraj

  • @huluvublue112
    @huluvublue112 7 years ago +1

    Question: Why do pooling layers make the Network spatially invariant? Don't they just compress information? I thought convolutional layers do that, which the model does have

    • @viviankeating7327
      @viviankeating7327 7 years ago

      Max pooling compresses information, but it's lossy. On the first pooling operation you lose a pixel or two of position information. On a final pooling operation you might effectively be taking the max across the entire image.

  • @BhagavatVibes
    @BhagavatVibes 7 years ago

    Hey Siraj, fantastic work. I am a Unity developer, so how can I integrate this functionality into games I have already coded? Best wishes for future videos.

  • @OthmanAlikhan
    @OthmanAlikhan 5 years ago

    Thanks for the video =)

  • @moonman239
    @moonman239 7 years ago

    So with a Markov decision process, there will always be some reward function R, because getting the reward depends only on the states and the actions we take. Thus, our AI can learn Q simply by playing?

  • @shahzmalik
    @shahzmalik 6 years ago

    The only thing I am impressed by is his creativity

  • @NaoshikuuAnimations
    @NaoshikuuAnimations 7 years ago +30

    Just a piece of advice, I hope you see this: never speak while showing text! (I remember Vsauce saying this in a video too.)
    But really, either show text and read it, or show images / yourself while talking; displaying text while saying something different is really hard to follow.
    If you want to talk about a part of the text, try to darken everything but the line you're talking about; otherwise we won't know where to stop, or whether to listen to you or read. (At least that's what most "educational" YouTubers I follow do, and it works quite well.)
    Especially when you're talking about such complicated subjects (and at such a pace), I think that's important!
    Hope it'll be useful somehow;
    Thanks for the vid'!

  • @TheLordyyx
    @TheLordyyx 7 years ago

    Hey Siraj, DeepMind is also working on a StarCraft 2 learning environment. I would love to see a video about it :)

  • @UNIVERSALMINDBAND
    @UNIVERSALMINDBAND 6 years ago +1

    And what happens to the reward functions? Are they the same for all these games?

  • @vamsikrishna-qz8rt
    @vamsikrishna-qz8rt 7 years ago

    Hi Siraj, is there any way we can train a machine learning model with a raw text file and properly arranged data from the text file in a .csv file, so that when we input a new text file it automatically converts it into the .csv format with the columns and rows we used as training data? Is this even possible?

  • @srenkoch4597
    @srenkoch4597 6 years ago

    Hey Siraj! Great stuff! It would be really cool if you combined a Recurrent Neural Network and a Deep Q-Network = DRQN in a video! Thanks!

  • @anonco1907
    @anonco1907 6 years ago

    The memes were distracting; I was too busy laughing to learn anything.

  • @sandzz
    @sandzz 7 years ago +25

    Bill Nye of Computer Science
    Kanye of Code
    Beyonce of Neural Networks
    Usain Bolt of Learning
    Chuck Norris of Python
    Jesus Christ of Machine Learning

    • @SirajRaval
      @SirajRaval  7 years ago

      thanks Sandzz

    • @sandzz
      @sandzz 7 years ago

      I copied it from your channel description...I don't deserve that "thanks"

  • @rnparks
    @rnparks 7 years ago

    Can you show the Mario game actually running? It throws an error in my notebook. I'm using Python 3.6, so maybe it's a translation issue?

  • @harshakada3374
    @harshakada3374 7 years ago

    Hey Siraj,
    I have a 4-node Raspberry Pi cluster; can I use it to train this Mario game?

  • @YaduvendraSingh
    @YaduvendraSingh 7 years ago +1

    This is the ultimate!! A game bot!! Thanks a lot Siraj! When are you heading to India for a meet-up?

    • @SirajRaval
      @SirajRaval  7 years ago +1

      Thanks! Sept 1, Delhi, one-way ticket. I'll figure things out from there

  • @egor.okhterov
    @egor.okhterov 7 years ago +7

    Too fast. I need a longer video :(

    • @mferum77
      @mferum77 7 years ago +2

      set the playback speed to 0.5 )

    • @SirajRaval
      @SirajRaval  7 years ago +2

      more to come

  • @MavVRX
    @MavVRX 5 years ago

    How would reinforcement learning work in a game with a town hub? One that requires mouse clicks to go into a dungeon, e.g., Diablo, MMOs.

  • @harshitagarwal5188
    @harshitagarwal5188 7 years ago

    At 5:15 you say that the further in the future a reward is, the more uncertain we are of it? I didn't get it. Can you explain with an example?
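
    The mechanism behind that statement is the discount factor \gamma: a reward k steps in the future is weighted by \gamma^k, so distant (more uncertain) rewards count for less. The discounted return is

        G_t = \sum_{k=0}^{\infty} \gamma^k \, r_{t+k+1}, \qquad 0 \le \gamma < 1

    For example, with \gamma = 0.9 a reward 10 steps away is worth 0.9^{10} \approx 0.35 of the same reward received now.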

  • @chicken6180
    @chicken6180 7 years ago

    Been waiting so long for this! Haven't even watched it, but I know it's going to be great already.
    Edit: confused, but not disappointed :D

  • @claudiowalter3092
    @claudiowalter3092 5 years ago

    How do you get the computer to play the game by itself and read the screen?

  • @MotazSaad
    @MotazSaad 5 years ago +1

    The link to the paper:
    web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf

  • @JazevoAudiosurf
    @JazevoAudiosurf 7 years ago

    So this version is without a NN; at what point do you need a NN?

  • @Belofsky1
    @Belofsky1 7 years ago

    I'm mostly a hardware guy; how do I go about AI or algorithms?

  • @synetic707x
    @synetic707x 7 years ago

    A video about Q-learning for video games on actual games (without OpenAI Gym) would be great

    • @SirajRaval
      @SirajRaval  7 years ago +1

      will consider thanks

    • @synetic707x
      @synetic707x 7 years ago

      Siraj Raval Awesome, thank you.

  • @tomwojcik
    @tomwojcik 7 years ago +1

    Video uploaded Aug 2017 and it's only 9:46 long? Autolike from me :)

  • @_____8632
    @_____8632 5 years ago +1

    Wait, where my brain at?

  • @masoudmasoumimoghaddam3832
    @masoudmasoumimoghaddam3832 7 years ago

    Siraj, all your videos are awesome.
    Could you make a video about temporal-difference learning, which was introduced by Professor Sutton?
    I also ask you to make another one about General Game Players and Monte Carlo Tree Search.
    Thanks

    • @xPROxSNIPExMW2xPOWER
      @xPROxSNIPExMW2xPOWER 7 years ago

      Yes a video on TD Learning would be wonderful

    • @masoudmasoumimoghaddam3832
      @masoudmasoumimoghaddam3832 7 years ago

      Yeah! Especially if its differences from and similarities to reinforcement learning were pointed out.

    • @xPROxSNIPExMW2xPOWER
      @xPROxSNIPExMW2xPOWER 7 years ago

      I think TD learning is just an extension of back-propagation. It's pretty fascinating

  • @donaldhobson8873
    @donaldhobson8873 7 years ago +2

    Wouldn't it work better if you trained a variational autoencoder on the screen data to capture the important patterns, then trained the deep-Q model on the encoded screen? That way the VAE can learn a lot about how the world works even when rewards are scarce. I would use a bottleneck that's about 1/4 the dimensions of the image, with say 3 layers. Leave the shrinking down from convolutional layers to dense layers for the deep-Q.

    • @hammadshaikhha
      @hammadshaikhha 7 years ago

      I don't know anything about this topic yet, but why don't you submit something along these lines for this week's coding challenge?

    • @SirajRaval
      @SirajRaval  7 years ago

      hmm good thought. An autoencoder could work well.
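
      For readers curious what that suggestion might look like, here is a minimal sketch of a plain (non-variational, for brevity) autoencoder in Keras; the ~1/4 bottleneck follows the comment above, and all layer sizes and the input size are illustrative assumptions:

          from tensorflow.keras import layers, models

          input_dim = 84 * 84                      # flattened grayscale screen (assumed size)
          code_dim = input_dim // 4                # ~1/4 bottleneck, as the comment suggests

          encoder = models.Sequential([
              layers.Input(shape=(input_dim,)),
              layers.Dense(2048, activation="relu"),
              layers.Dense(code_dim, activation="relu"),
          ])
          decoder = models.Sequential([
              layers.Input(shape=(code_dim,)),
              layers.Dense(2048, activation="relu"),
              layers.Dense(input_dim, activation="sigmoid"),   # reconstruct pixels in [0, 1]
          ])
          autoencoder = models.Sequential([encoder, decoder])
          autoencoder.compile(optimizer="adam", loss="mse")
          # After training on raw screens, encoder(state) would feed the deep-Q model.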

  • @manojagarwal3441
    @manojagarwal3441 4 years ago

    Hey Siraj, can you please share the link to the code by the winner and runner-up of the LDA challenge? I know I'm pretty late, but I would really appreciate it if you could help

  • @kermitthehermit9373
    @kermitthehermit9373 6 years ago

    we all miss Chester!!😢

  • @anudeep168
    @anudeep168 7 years ago

    Awesome video :) But it reminded me of Jian-Yang's HotDog-NotHotDog app :D

    • @codethings271
      @codethings271 7 years ago

      That was a classifier, SUPERVISED learning

  • @tjhannover3069
    @tjhannover3069 6 years ago

    Is it possible to do that with games like Overwatch?

  • @maxitube30
    @maxitube30 6 years ago

    Where can I find the winner of the stock prediction challenge?

  • @williamcosta6683
    @williamcosta6683 7 years ago

    Could you guys give me any hints on how to approach the Pong game to build a model where I can apply Q-learning? (I have all the information necessary, like ball x and y position, player x and y position, ball speed, etc.) I'm struggling with this :_:

  • @darshansharma_
    @darshansharma_ 7 years ago

    Nice

  • @oliverguggenbuhl4447
    @oliverguggenbuhl4447 5 years ago

    and now OpenAI is beating the Dota 2 world champions in 5v5

  • @MissFashionDesign
    @MissFashionDesign 7 years ago

    Siraj Raval is the neurotransmitter of Generation Z

  • @TheLibertarian97
    @TheLibertarian97 5 years ago

    How do I define when to give a reward to the bot?

  • @eav300M
    @eav300M 7 years ago

    Super Siraj AI. Who do you think is correct regarding the future of AI, Elon or Zuck?

    • @getrasa1
      @getrasa1 7 years ago

      Elon, because he's aware of the danger AI might pose to the human race if we lose control over it.

    • @vijayabhaskar-j
      @vijayabhaskar-j 7 years ago +1

      If you know AI, then you won't think of AI as a danger.

    • @getrasa1
      @getrasa1 7 years ago

      Edgar Vega As soon as its intelligence starts increasing exponentially, we won't be able to keep up with it and understand it. Everything we don't understand is dangerous at some point (I'm referring to AGI and ASI)

    • @SirajRaval
      @SirajRaval  7 years ago +1

      elon. we do need some regulation.

  • @pinkiethesmilingcat2862
    @pinkiethesmilingcat2862 7 years ago +1

    Siraj, you haven't accepted the English subs for MoI #6 :(

  • @akashdeepjassal3746
    @akashdeepjassal3746 7 years ago

    Wow Nice

  • @mattgoralka3941
    @mattgoralka3941 5 years ago

    Hi, can someone please explain to me how the model is predicting in this sequence of code when it hasn't been trained yet? I'd really appreciate it. Thanks!!
    if np.random.rand()

    • @lefos99
      @lefos99 5 years ago

      Hey there, so the epsilon tells us when it is ready to exploit Q-values instead of exploring the map.
      The main idea is:
      1) We specify an exploration rate "epsilon," which we set to 1 in the beginning. This is the rate of steps that we'll do randomly. In the beginning, this rate must be at its highest value, because we don't know anything about the values in the Q-table. This means we need to do a lot of exploration, by randomly choosing our actions.
      2) We generate a random number. If this number > epsilon, then we do "exploitation" (this means we use what we already know to select the best action at each step). Else, we do exploration.
      The idea is that we must have a big epsilon at the beginning of training the Q-function, then reduce it progressively as the agent becomes more confident at estimating Q-values.
      Here is a nice graph of this idea: cdn-media-1.freecodecamp.org/images/1*9StLEbor62FUDSoRwxyJrg.png
      Hope that helped! :D

    • @mattgoralka3941
      @mattgoralka3941 5 years ago

      @@lefos99 Hi, thanks for helping me out! I understand that (at least I think), but I don't understand how the model can predict if it hasn't been trained. At what point is the model learning from the D values and able to "exploit"? I'm from more of a C background, but I don't get how it's learning until the next block of code where it does "Experience Replay".

    • @lefos99
      @lefos99 5 years ago

      @@mattgoralka3941 Oh okay, now I see your question. Well, it depends on the reinforcement learning technique you use.
      For example, if you use simple Q-learning, you just create a matrix (rows for states and columns for actions). There are plenty of concepts involved that I cannot explain in just one YouTube comment.
      A really good and simple tutorial is this: simoninithomas.github.io/Deep_reinforcement_learning_Course/#syllabus
      In this tutorial you will find not only mathematical explanations but also explanations with examples in simple games.
      Check it out! ;)
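
      A minimal sketch of the epsilon-greedy selection with decay described in this thread (all constants are illustrative):

          import numpy as np

          epsilon, eps_min, eps_decay = 1.0, 0.05, 0.995   # start fully exploring, decay slowly
          num_actions = 4                                  # illustrative action count

          def select_action(q_values):
              """Explore with probability epsilon, otherwise exploit the best Q-value."""
              global epsilon
              if np.random.rand() < epsilon:
                  action = np.random.randint(num_actions)  # explore
              else:
                  action = int(np.argmax(q_values))        # exploit current knowledge
              epsilon = max(eps_min, epsilon * eps_decay)  # shift from exploring to exploiting
              return action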

  • @vaibhavacharya9428
    @vaibhavacharya9428 7 years ago

    Your videos are becoming incomprehensible.

    • @SirajRaval
      @SirajRaval  7 years ago

      nearing end of course, this is advanced stuff. going to get easier starting next week.

    • @vaibhavacharya9428
      @vaibhavacharya9428 7 years ago

      Yep! Thx.

  • @bofeng6910
    @bofeng6910 7 years ago

    Do I have to learn calculus to learn deep learning?

  • @divinceocristao3321
    @divinceocristao3321 7 years ago

    Sounds like Q-learning for investments