This is the best video I have watched so far for understanding this topic.
Very good lecture, maybe the best introduction to this topic I've ever seen on YouTube.
Thank you, I spent hours on this algorithm and finally understood it!
This is such a clear explanation!! Ty for this!! I wish I had taken your class while I was in VT!
Thank you very much, very well explained.
Thank you for this brilliant explanation. I wish there were a question with a solution to practice on.
The video provides an intuitive but deep understanding of MDPs.
This is a great explanation video, thanks so much. Your voice is easy to listen to too haha.
Ryan Flynn Thanks! I’m glad it’s helpful. My smooth voice is a huge disadvantage when I teach morning classes and my students all fall asleep.
Very interesting topic. And I think you would make a fortune if you used your voice in advertising. Best regards.
Thank you so much, you explain the subject very well and have helped me to understand.
Really great explanation of machine learning.
Fantastic video! Thanks a lot!
This was fantastic, thank you!
Thanks a lot! I found this very helpful.
Is the reward R(s) actually R(s')? And should that also be multiplied by the transition probability, i.e. max_a sum_{s', r} P(s', r | s, a) [r + gamma * V(s')]? I am trying to relate the equation presented in the video to the standard four-parameter notation.
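One way to check the correspondence numerically is a small sketch like the one below. It assumes the video's form is V(s) = R(s) + gamma * max_a sum_{s'} P(s' | s, a) V(s'), with the reward depending only on the current state; that is my reading of the question, not something confirmed here. The two-state MDP and every name in the code are made up for illustration. When r is always R(s), the four-parameter backup gives the same values, because the probabilities sum to 1 and R(s) factors out of the sum.

```python
import math

# Hypothetical two-state MDP, invented for illustration only.
GAMMA = 0.9
STATES = ["a", "b"]
ACTIONS = ["stay", "move"]
R = {"a": 0.0, "b": 1.0}  # assumption: reward depends only on the current state

# P[(s, action)] maps each next state s' to P(s' | s, action).
P = {
    ("a", "stay"): {"a": 0.9, "b": 0.1},
    ("a", "move"): {"a": 0.2, "b": 0.8},
    ("b", "stay"): {"a": 0.1, "b": 0.9},
    ("b", "move"): {"a": 0.8, "b": 0.2},
}

def backup_state_reward(V, s):
    """R(s) + gamma * max_a sum_{s'} P(s' | s, a) V(s')."""
    return R[s] + GAMMA * max(
        sum(p * V[s2] for s2, p in P[(s, a)].items()) for a in ACTIONS
    )

def four_arg(s, a):
    """Yield (s', r, p(s', r | s, a)); here r is always R(s)."""
    for s2, p in P[(s, a)].items():
        yield s2, R[s], p

def backup_four_arg(V, s):
    """max_a sum_{s', r} p(s', r | s, a) [r + gamma * V(s')]."""
    return max(
        sum(p * (r + GAMMA * V[s2]) for s2, r, p in four_arg(s, a))
        for a in ACTIONS
    )

def value_iteration(backup, sweeps=200):
    """Apply the given backup to every state for a fixed number of sweeps."""
    V = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        V = {s: backup(V, s) for s in STATES}
    return V

V1 = value_iteration(backup_state_reward)
V2 = value_iteration(backup_four_arg)
print(all(math.isclose(V1[s], V2[s], rel_tol=1e-9) for s in STATES))
```

If the reward instead depended on the next state, r would have to stay inside the sum and be weighted by the transition probability, which is what the four-parameter form does.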
Absolutely amazing lecture!!!
Around 34:00 when there are equations on the screen you should have had a pointer or something to point at what you are talking about. It's not clear.
Can anyone explain (32:00) the switch between the two modes (i.e. represented by the green and red arrows)? To me the green one seems like a deterministic rule and the red one seems like a stochastic rule. Can they exist simultaneously?
Very good, easy to understand.
This is so good!!!
Really good video about this topic. Thank you
Great video. Thank you. Could you please make a similar video that considers a two-dimensional Markov chain with more states?
Best video for MDP on youtube
At 17:35, why is gamma in (0, 1] rather than (0, 1)? If gamma = 1, the influence of the actions farther down the road stays the same as all other actions, rather than shrinking, right?
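A quick sketch of what the discount does to a reward sequence, in case it helps. The function name and the reward list are made up for illustration, and this does not claim to be the video's reasoning. With gamma = 1 every step is weighted equally, which is still well defined as long as the horizon is finite.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

rewards = [1.0] * 5  # hypothetical: reward of 1 at each of 5 steps

print(discounted_return(rewards, 0.5))  # 1.9375, later rewards count less
print(discounted_return(rewards, 1.0))  # 5.0, every reward counts equally
```

With a constant reward and an infinite horizon, the gamma = 1 sum would grow without bound, while any gamma < 1 keeps it finite.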
much better than my lecturer
Thanks for the explanation!
I have a question at 8:30: if we take the action to go left, why isn't Pr(c | b, left) 0.00? (We are going toward the other side.)
Thanks for your nice tutorial. Is it possible to upload the slides?
What software are you using to make this?? It looks like you have like an infinite page which gives a really clean look
Nothing too fancy. This was done with Apple Keynote, and I'm faking that scrolling effect with "Magic Move" animations. I'm always looking for better tools to build useful visuals for lectures.
Thank you.
excellent
What is the name of the textbook?
Which textbook? Thank you very much.
God bless
honesty
The best I can find is [4, 1]. I couldn't achieve [4.2, 1.2]. Has anyone achieved [4.2, 1.2]?
YES, I GOT IT
33:27