Actor-Critic Reinforcement for continuous actions!

  • Published 9 Jul 2021
  • Here's a link to the github repository of the actor-critic method I learned from:
    github.com/pytorch/examples/blob/master/reinforcement_learning/actor_critic.py
    patreon.com/thinkstr

COMMENTS • 11

  • @AM-dj4vp  1 year ago  +4

    very underrated video, literally the best explanation on actor/critic that I've seen. Good Job! and Thanks!

    • @Thinkstr  1 year ago  +2

      Hey, thanks for watching! These are fun to make and I learn a lot. I think my understanding has come a long way since I made this video, so I'll have to make another eventually.

  • @PeterIntrovert  2 years ago  +2

    This is rocket science to me, lol, but I get value from your videos anyway. I learn critical thinking from you. I think I understand and like the general idea. I hope you won't invent Skynet or something in the future. :D

    • @Thinkstr  2 years ago  +3

      Haha, thanks! If I ever invent skynet, I hope it's a NICE skynet.

  • @underlecht  1 year ago

    Great video! I was a bit mixed up about actor-critic, confusing which variable should be back-propagated through the loss function and which should not :D You seem to have done it right.

    • @Thinkstr  1 year ago

      Thanks for watching, I'm glad you liked it! I should be making more of these videos soon...

  • @aprameyandesikan3648  11 months ago  +1

    Hey, awesome video!! I had a question regarding how the model chooses the averages and standard deviations. The output is supposed to be continuous, so how does the model choose a continuous value for the two?

    • @Thinkstr  11 months ago

      Thanks for watching! I'm not sure I understand the question, but I think it's actually easier to make a neural network that outputs in a continuous range than in a discrete range (like categorization). After the actor produces the mean "mu" and standard deviation "sigma", it samples "epsilon" from a standard normal distribution and computes mu + sigma * epsilon; this is called the "reparameterization trick." sassafras13.github.io/images/2020-05-25-ReparamTrick-eqn2.png

    • @aprameyandesikan3648  11 months ago  +1

      Thanks! I think that answers my question. So you essentially take the continuous outputs of your network as the action itself, I presume, instead of categorisation, where the option with the highest probability is chosen?

    • @Thinkstr  11 months ago

      @@aprameyandesikan3648 Yes, exactly!

    • @aprameyandesikan3648  11 months ago  +1

      Awesome, thanks for taking your time to answer my questions! Keep up with the videos!
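The reparameterization trick discussed in the thread above can be sketched in a few lines of PyTorch. This is a minimal, hypothetical actor (the network shape, `obs_dim`, and `act_dim` are assumptions, not taken from the video or the linked repo): the network outputs a mean "mu" and a standard deviation "sigma", then samples a continuous action as mu + sigma * epsilon, with epsilon drawn from a standard normal. Because mu and sigma enter the sample arithmetically, gradients flow back through them.

```python
import torch
import torch.nn as nn

class ContinuousActor(nn.Module):
    """Hypothetical actor for a 1-D continuous action space."""

    def __init__(self, obs_dim=4, act_dim=1):
        super().__init__()
        self.body = nn.Linear(obs_dim, 32)
        self.mu_head = nn.Linear(32, act_dim)                 # predicts the mean "mu"
        self.log_sigma = nn.Parameter(torch.zeros(act_dim))   # learned log of "sigma"

    def forward(self, obs):
        h = torch.tanh(self.body(obs))
        mu = self.mu_head(h)
        sigma = self.log_sigma.exp()           # keep sigma positive
        epsilon = torch.randn_like(mu)         # epsilon ~ N(0, 1), no gradient needed
        action = mu + sigma * epsilon          # the reparameterization trick
        return action, mu, sigma

actor = ContinuousActor()
obs = torch.randn(1, 4)                        # a dummy observation
action, mu, sigma = actor(obs)
```

The sampled `action` is differentiable with respect to the network's parameters, so an actor-critic loss built from it can be back-propagated directly; `torch.distributions.Normal(...).rsample()` does the same thing internally.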