DeepMind x UCL RL Lecture Series - Function Approximation [7/13]
- Published 3 Jun 2024
- Research Scientist Hado van Hasselt explains how to combine deep learning with reinforcement learning for "deep reinforcement learning".
Slides: dpmd.ai/functionapproximation
Full video lecture series: dpmd.ai/DeepMindxUCL21 - Science & Technology
You said at 1:36:05 that there are algorithms inspired by TD which guarantee to converge for non-linear cases. Can you provide the names and papers of these algorithms?
1:02:45 Convergence and Divergence
1:52:41 Deep Reinforcement Learning
Thank you for the lecture. Are there exercises available with these lecture series?
I have a bit of an implementation question here... What is the difference between implementing neural Q-learning (as in DQN, but without the target network) and implementing fitted Q-iteration (FQI) with a neural net as the function approximator? Assuming we retain the weights from the last iteration, that is.
I am not finding many comparisons with FQI anywhere in the material provided here, and I'm wondering whether these are just implementation details, or whether there are some fundamental differences that I'm missing.
Please elaborate, anyone who can.
PS: and yes, I saw some response on this very question on stackoverflow. But it was not exactly a helpful response on there, imho. I'm hoping this audience will fare better. ;-)
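One way to see the structural difference the comment asks about is to write both update loops side by side. The sketch below is illustrative only (it uses a tabular weight matrix as a stand-in "function approximator" and random toy transitions, not anyone's actual DQN/FQI code): in neural Q-learning the bootstrap target is recomputed from the current weights at every step, while in FQI the targets are frozen for a whole regression pass before being recomputed.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, GAMMA, LR = 4, 2, 0.9, 0.1

# Toy batch of transitions (s, a, r, s') -- illustrative data only.
batch = [(int(rng.integers(N_STATES)), int(rng.integers(N_ACTIONS)),
          float(rng.normal()), int(rng.integers(N_STATES)))
         for _ in range(64)]

def q_values(w, s):
    # Tabular weights stand in for a neural net here (assumption for brevity).
    return w[s]

def neural_q_step(w, s, a, r, s2):
    """One online Q-learning update: the target moves with w (semi-gradient)."""
    target = r + GAMMA * np.max(q_values(w, s2))  # recomputed every step
    w[s, a] += LR * (target - w[s, a])

def fqi_iteration(w, batch, fit_steps=50):
    """One FQI iteration: freeze targets, then fit the approximator to them."""
    targets = [r + GAMMA * np.max(q_values(w, s2))  # frozen for this iteration
               for (s, a, r, s2) in batch]
    for _ in range(fit_steps):                      # regress toward fixed targets
        for (s, a, r, s2), y in zip(batch, targets):
            w[s, a] += LR * (y - w[s, a])

w_online = np.zeros((N_STATES, N_ACTIONS))
for s, a, r, s2 in batch:          # one pass, target updated per transition
    neural_q_step(w_online, s, a, r, s2)

w_fqi = np.zeros((N_STATES, N_ACTIONS))
fqi_iteration(w_fqi, batch)        # targets held fixed across the whole fit
```

With weights retained across iterations, the remaining difference is essentially how long the bootstrap targets stay frozen: zero steps (plain neural Q-learning), a fixed number of steps (DQN's target network), or until the regression converges (FQI).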
At 1:51:06 should it be n < t instead of n
Yes.
thank you
I liked this