Markov Decision Processes 2 - Reinforcement Learning | Stanford CS221: AI (Autumn 2019)

  • Published 2 Jun 2024
  • For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: stanford.io/2Zv1JpK
    Topics: Reinforcement learning, Monte Carlo, SARSA, Q-learning, Exploration/exploitation, function approximation
    Percy Liang, Associate Professor & Dorsa Sadigh, Assistant Professor - Stanford University
    onlinehub.stanford.edu/
    Associate Professor Percy Liang
    Associate Professor of Computer Science and Statistics (courtesy)
    profiles.stanford.edu/percy-l...
    Assistant Professor Dorsa Sadigh
    Assistant Professor in the Computer Science Department & Electrical Engineering Department
    profiles.stanford.edu/dorsa-s...
    To follow along with the course schedule and syllabus, visit:
    stanford-cs221.github.io/autu...

COMMENTS • 8

  • @albert2266
    @albert2266 1 month ago

    Just to clarify a concept: I think the claim at 7:29 isn't quite right, because the value function shouldn't simply be equal to the Q-value. The value function is the expected utility over all possible actions at a given state, so it should be an expectation over Q_pi rather than just equal to Q_pi, since Q_pi is the expected utility for a given action at a given state. Please correct me if I'm wrong.
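
    A hedged note on the relationship this comment questions, using the standard definitions of V_pi and Q_pi rather than the lecture's slides: for a fixed deterministic policy, the state value does equal the Q-value of the action the policy actually takes, while for a stochastic policy it is an expectation over the policy's action distribution.

    ```latex
    % Standard definitions (not quoted from the lecture):
    % deterministic policy \pi -- the value equals the Q-value of the chosen action
    V_\pi(s) = Q_\pi\bigl(s, \pi(s)\bigr)
    % stochastic policy -- expectation over the policy's action distribution
    V_\pi(s) = \sum_{a} \pi(a \mid s)\, Q_\pi(s, a)
    ```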

  • @aojing
    @aojing 2 months ago

    A legacy question from the last lecture (MDP-1) is still hovering around 2: what is the transition function for this class? Is it a function of the action?
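
    For reference, in standard MDP notation (not quoted from the lecture) the transition function does take the action as an argument; a minimal statement of the definition:

    ```latex
    % Transition function of an MDP: probability of landing in s' after
    % taking action a in state s -- so yes, it is a function of the action.
    T(s, a, s') = \mathbb{P}\bigl(s_{t+1} = s' \mid s_t = s,\; a_t = a\bigr),
    \qquad \sum_{s'} T(s, a, s') = 1 \quad \text{for every } (s, a).
    ```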

  • @henkjekel4081
    @henkjekel4081 1 year ago

    Yeah, you really need to be having an episode to play this game.

  • @black-sci
    @black-sci 3 months ago

    Somehow the lecture left me confused in the end. Maybe I should rewatch.

  • @JumbyG
    @JumbyG 1 year ago +2

    I think there may be a typo at 28:27: it states that Q_pi is (4+8+16)/3, but I believe it should be (4+8+12)/3? Please correct me if I am wrong.

    • @seaotterlabs1685
      @seaotterlabs1685 1 year ago +2

      I think it should be (4+8+16)/3, as I believe their last run has four 4 values.

    • @endoumamoru3835
      @endoumamoru3835 5 months ago

      He is calculating the sum of all rewards per episode: the first sum was 4 since only one reward was present, the next was 8 with two rewards, and then the next was 16 with four rewards.
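
      A minimal sketch of the Monte Carlo estimate being debated here, assuming the three per-episode returns mentioned in the thread (4, 8, and 16); the values are taken from the replies above, not independently verified against the slide at 28:27.

      ```python
      # Monte Carlo estimate of Q_pi(s, a): average the total return observed
      # over episodes that start by taking action a in state s.
      # The returns below (4, 8, 16) are the numbers discussed in this thread.

      def monte_carlo_q_estimate(returns):
          """Average of observed episode returns from a given (state, action)."""
          return sum(returns) / len(returns)

      episode_returns = [4, 8, 16]  # per-episode sums of rewards along each run
      q_hat = monte_carlo_q_estimate(episode_returns)
      print(q_hat)  # (4 + 8 + 16) / 3 = 9.33...
      ```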

  • @Moriadin
    @Moriadin 15 days ago

    Not as good as the previous lecture; harder to follow.