CS885 Module 5: Distributional RL

Pascal Poupart

6,121 views

Published 11 Dec 2024

COMMENTS • 4

@beebeekhor346 2 years ago
"The video content is really wonderful, congrat…
@vincevella3905 2 years ago ⁺¹
Thanks for the video, very helpful for getting up to speed on this topic. Question: if one wants to use distributional statistics other than the expectation, is it just a matter of changing the policy from the argmax over the means of the action return distributions to the argmax over some risk-adjusted statistic?
@monkiedeinhau557 1 year ago
What you describe in your question sounds like a dynamic risk. If you are interested in dynamic risk, note that its result is not easily interpretable with respect to the cumulative return. Because it is convenient, people simply apply it by replacing the expectation with the risk statistic.
However, if you are concerned with a risk statistic over the cumulative return (static risk), other measures may not satisfy the "tower rule" or "positive homogeneity", so you may not be able to adapt the policy optimization by naively swapping in the statistic of interest. If a risk statistic is positively homogeneous, convex, and satisfies the tower rule, then it may be used directly. However, I am not aware of any such risk statistics other than "min", "mean", and "max".
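The tower-rule point can be checked on a toy example. The sketch below is only an illustration under stated assumptions: a hypothetical two-step problem with two equally likely branches, each holding two equally likely returns, and CVaR at level 0.5 (the mean of the worst half) standing in for a risk statistic. The names `lower_half_mean`, `static_value`, `nested_value`, and `branches` are made up for this example. "Static" applies the statistic once to the flattened cumulative returns; "nested" applies it per branch and then across branches, as a dynamic risk would.

```python
from statistics import fmean

def lower_half_mean(outcomes):
    """CVaR at level 0.5 for equally likely outcomes: mean of the worst half.

    Assumes an even number of outcomes.
    """
    ordered = sorted(outcomes)
    return fmean(ordered[: len(ordered) // 2])

def static_value(stat, branches):
    """Apply the statistic once to all final returns (static risk)."""
    flat = [x for branch in branches for x in branch]
    return stat(flat)

def nested_value(stat, branches):
    """Apply the statistic within each branch, then across branches (dynamic risk)."""
    return stat([stat(branch) for branch in branches])

# Hypothetical returns: two equally likely branches, two equally likely outcomes each.
branches = [[0.0, 4.0], [10.0, 10.0]]

results = {
    name: (static_value(stat, branches), nested_value(stat, branches))
    for name, stat in [
        ("mean", fmean),
        ("min", min),
        ("max", max),
        ("cvar_0.5", lower_half_mean),
    ]
}

for name, (static, nested) in results.items():
    print(f"{name}: static={static}, nested={nested}")
```

In this example the static and nested values agree for the mean, min, and max, but differ for CVaR (2.0 versus 0.0), so optimizing the nested quantity step by step would not optimize the static CVaR of the cumulative return.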
@articlesvideos-ys9ut 1 year ago
@monkiedeinhau557 Thanks for your reply. My question was more in terms of dynamic risk, for example applying the standard deviation, which can indirectly act as a measure of uncertainty.
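A minimal sketch of the substitution discussed in this thread, assuming the return distribution of each action is available as a list of equally weighted samples: the greedy policy takes the argmax of a mean-minus-standard-deviation score instead of the mean. The action names, sample values, and the `risk_weight` parameter are hypothetical and are not taken from the lecture.

```python
import statistics

def mean_std_score(returns, risk_weight):
    """Mean return minus risk_weight times the population standard deviation."""
    return statistics.fmean(returns) - risk_weight * statistics.pstdev(returns)

def greedy_action(return_samples, risk_weight=0.0):
    """Return the action whose sampled return distribution scores highest.

    With risk_weight=0.0 this reduces to the usual argmax over means.
    """
    return max(
        return_samples,
        key=lambda action: mean_std_score(return_samples[action], risk_weight),
    )

# Hypothetical per-action return samples at a single state.
return_samples = {
    "safe": [1.0, 1.0, 1.0, 1.0],     # mean 1.0, no spread
    "risky": [-2.0, 0.0, 2.0, 6.0],   # mean 1.5, wide spread
}

risk_neutral = greedy_action(return_samples, risk_weight=0.0)
risk_averse = greedy_action(return_samples, risk_weight=1.0)
print(risk_neutral, risk_averse)
```

With these samples the risk-neutral choice is "risky" and the risk-averse choice is "safe". This only changes action selection at one state; whether the resulting policy optimizes any risk measure of the cumulative return is exactly the caveat raised in the reply above.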
