Markov Decision Processes

Bert Huang

75,682 views

Published 15 Sep 2024

COMMENTS • 41

@hosamfikry2924 5 years ago ⁺¹⁸
That is the best video I have watched so far for understanding this topic.
@hobby_coding 3 years ago ⁺³
Very good lecture, maybe the best introduction to this topic I've ever seen on YouTube.
@Pexers. 3 years ago ⁺³
Thank you, I spent hours on this algorithm and finally understood it!
@syedrumman3920 2 years ago ⁺²
This is such a clear explanation!! Ty for this!! I wish I had taken your class while I was at VT!
@jff711 3 years ago ⁺²
Thank you very much, very well explained.
@Ahmed.r.a 5 months ago
Thank you for this brilliant explanation. I wish there were a question with a solution to practice on.
@coeusmaze9413 5 years ago ⁺²
The video provides an intuitive but deep understanding of MDPs.
@ryanflynn386 5 years ago ⁺⁴
This is a great explanation video, thanks so much. Your voice is easy to listen to too haha.
@berty38 5 years ago ⁺¹⁵
Ryan Flynn Thanks! I’m glad it’s helpful. My smooth voice is a huge disadvantage when I teach morning classes and my students all fall asleep.
@tarik8622 4 years ago
Very interesting topic. And I think that you would make a fortune if you used your voice in advertising. Best regards.
@jub8891 1 year ago
Thank you so much, you explain the subject very well and have helped me to understand.
@richardm5916 4 years ago
Really great explanation of machine learning.
@seanxu6741 1 year ago
Fantastic video! Thanks a lot!
@JustinMasayda 2 years ago
This was fantastic, thank you!
@consolesblow 5 years ago ⁺¹
Thanks a lot! I found this very helpful.
@srujayop 2 years ago ⁺¹
Is the reward R(s) actually R(s')?
And should it also be multiplied by the transition probability, as in
max over a of sum over s', r of P(s', r | s, a) [r + gamma * V(s')]?
I am trying to relate the equation presented in the video to the standard four-parameter notation.
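
A minimal sketch relating the two notations in the question above. It assumes the state-reward form of the Bellman backup is V(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) V(s'), with a deterministic reward that depends only on the current state s; the exact equation on the slide is not reproduced here, so treat that form as an assumption. Under it, the four-argument probability is P(s', r | s, a) = P(s'|s,a) when r = R(s) and 0 otherwise. The reward is then weighted by the transition probability, but because those probabilities sum to 1, R(s) factors out of the sum and both backups give the same values. The toy MDP below (3 states, 2 actions, all numbers) is hypothetical.

```python
# Hypothetical 3-state, 2-action MDP; every number here is made up for illustration.
GAMMA = 0.9
STATES = [0, 1, 2]
ACTIONS = [0, 1]
R = [0.0, 1.0, -1.0]  # R[s]: reward attached to the current state s (assumed form)
# P[s][a][s2]: probability of landing in s2 after taking action a in state s
P = [
    [[0.8, 0.2, 0.0], [0.1, 0.0, 0.9]],
    [[0.0, 0.5, 0.5], [0.7, 0.3, 0.0]],
    [[0.3, 0.3, 0.4], [0.0, 0.0, 1.0]],
]

def backup_state_reward(V, s):
    """State-reward form: V(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) V(s')."""
    return R[s] + GAMMA * max(
        sum(P[s][a][s2] * V[s2] for s2 in STATES) for a in ACTIONS
    )

def backup_four_arg(V, s):
    """Four-argument form: V(s) = max_a sum_{s',r} p(s',r|s,a) [r + gamma V(s')].
    With a deterministic reward r = R(s), the sum over r has a single term."""
    return max(
        sum(P[s][a][s2] * (R[s] + GAMMA * V[s2]) for s2 in STATES)
        for a in ACTIONS
    )

def value_iteration(backup, sweeps=200):
    """Apply the given backup to every state for a fixed number of sweeps."""
    V = [0.0 for _ in STATES]
    for _ in range(sweeps):
        V = [backup(V, s) for s in STATES]
    return V

V_a = value_iteration(backup_state_reward)
V_b = value_iteration(backup_four_arg)
# The two value functions agree up to floating-point rounding.
print(max(abs(x - y) for x, y in zip(V_a, V_b)))
```

If the reward instead depended on the next state, R(s'), it would no longer factor out of the sum, and the bracketed term would become R(s') + gamma * V(s').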
@quantlfc 2 years ago
Absolutely amazing lecture!!!
@Throwingness 3 years ago ⁺¹
Around 34:00 when there are equations on the screen you should have had a pointer or something to point at what you are talking about. It's not clear.
@xruan6582 4 years ago ⁺¹
Can anyone explain (32:00) the switch between the two modes (i.e. represented by the green and red arrows)? To me the green one seems like a deterministic rule and the red one seems like a stochastic rule. Can they exist simultaneously?
@peterkimemiah9669 3 years ago
Very good, easy to understand.
@JebbigerJohn 1 year ago
This is so good!!!
@ismailasmcalskan2552 4 years ago
Really good video about this topic. Thank you
@linfrancis5204 5 years ago ⁺¹
Great video. Thank you. Could you please make a similar video that considers a two-dimensional Markov chain with more states?
@sanskarshrivastava5193 3 years ago
Best video for MDP on YouTube
@treegnome2371 4 years ago
At 17:35, why isn't gamma in (0,1) instead of (0,1]? If gamma = 1, the influence of the actions farther down the road stays the same as all other actions, rather than shrinking... right?
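
A small sketch of the point raised in the question above, assuming the discount enters as a weight gamma**t on the reward received at step t (the usual discounted-return definition, not taken from the slide). With gamma = 1 every step counts equally, which is still well defined for a finite reward sequence; with gamma < 1 later rewards are shrunk. The function name and the reward sequence are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * rewards[t] over a finite reward sequence."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total

rewards = [1.0] * 5  # hypothetical: a reward of 1 at each of 5 steps
undiscounted = discounted_return(rewards, 1.0)  # every step counts equally
discounted = discounted_return(rewards, 0.5)    # later steps count less
print(undiscounted, discounted)
```

For an infinite sequence of constant rewards the gamma = 1 sum grows without bound, which is why gamma = 1 is usually paired with a finite horizon or guaranteed termination.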
@_brenda4975 3 years ago
much better than my lecturer
@sander1426-2 4 years ago
Thanks for the explanation!
@behmandtirgar 4 years ago
I have a question at 8:30: if we take an action to go left, why isn't Pr(c | b, left) 0.00? (We are moving to the other side.)
@rezadarooei248 4 years ago
Thanks for your nice tutorial. Is it possible to upload the slides?
@joshuasegal4161 5 years ago
What software are you using to make this?? It looks like you have an infinite page, which gives a really clean look.
@berty38 5 years ago ⁺³
Nothing too fancy. This was done with Apple Keynote, and I'm faking that scrolling effect with "Magic Move" animations. I'm always looking for better tools to build useful visuals for lectures.
@jaideep_yes 5 years ago
Thank you.
@y-3084 4 years ago
excellent
@EdupugantiAadityaaeb 1 year ago
What is the name of the textbook?
@zenchiassassin283 4 years ago
What textbook? Thank you very much
@dminn 4 years ago
God bless
@suvinaybothra8988 4 years ago
honesty
@ahmet9446 5 years ago
The best I found is [4, 1]. I couldn't achieve [4.2, 1.2]. Has anyone achieved [4.2, 1.2]?
@linfrancis5204 5 years ago
YES, I GOT IT
@abdullahmoiz8151 4 years ago
33:27
@izazkhan1640 5 years ago
jhk
