I've Been Doing This Wrong The Whole Time ... The Right Way to Save Models In PyTorch
- Published 16 Sep 2024
- It turns out I wasn't saving PyTorch models correctly. You really need to save the optimizer state along with the current weights of your deep neural network. This is critical for getting deep Q-learning and actor-critic agents to train on complex environments that may require multiple sessions spread over time.
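Here's a minimal sketch of the idea (the class, layer sizes, and path names are illustrative, not the exact ones from the starter code):

```python
import torch as T
import torch.nn as nn
import torch.optim as optim

class DeepQNetwork(nn.Module):
    def __init__(self, n_inputs=8, n_actions=4, lr=1e-4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_inputs, 256), nn.ReLU(),
            nn.Linear(256, n_actions),
        )
        # Optimizer lives on the network so its state travels with it
        self.optimizer = optim.Adam(self.parameters(), lr=lr)

    def forward(self, state):
        return self.net(state)

def save_checkpoint(model, chkpt_path):
    # Save the optimizer state dict alongside the weights so a later
    # session can resume exactly where this one left off
    T.save({'model_state_dict': model.state_dict(),
            'optimizer_state_dict': model.optimizer.state_dict()},
           chkpt_path)

def load_checkpoint(model, chkpt_path):
    checkpoint = T.load(chkpt_path)
    model.load_state_dict(checkpoint['model_state_dict'])
    model.optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
```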
Starter code for this video is here:
gist.github.co...
Learn how to turn deep reinforcement learning papers into code:
Get instant access to all my courses, including the new Prioritized Experience Replay course, with my subscription service. $29 a month gets you 42 hours of instructional content plus access to future updates, added weekly.
Discounts available for Udemy students (enrolled longer than 30 days). Just send an email to sales@neuralnet.ai
www.neuralnet....
Or, pickup my Udemy courses here:
Deep Q Learning:
www.udemy.com/...
Actor Critic Methods:
www.udemy.com/...
Curiosity Driven Deep Reinforcement Learning:
www.udemy.com/...
Natural Language Processing from First Principles:
www.udemy.com/...
Just getting started in deep reinforcement learning? Check out my intro level course through Manning Publications.
Reinforcement Learning Fundamentals
www.manning.co...
Here are some books / courses I recommend (affiliate links):
Grokking Deep Learning in Motion: bit.ly/3fXHy8W
Grokking Deep Learning: bit.ly/3yJ14gT
Grokking Deep Reinforcement Learning: bit.ly/2VNAXql
Come hang out on Discord here:
/ discord
Website: www.neuralnet.ai
Github: github.com/phi...
Twitter: / mlwithphil
Many thanks for updating us. I really appreciate you providing us with such wonderful courses.
Thanks for the kind words
Thanks for the update. I wonder how you would go about loading the model, once trained, for "testing"? I've tried, for example, loading the model state into q_eval, setting q_eval to ".eval()" mode, using "with torch.no_grad()", then getting the predictions with "model(observation)", but the model/agent doesn't perform the way it did during training (it does really badly in comparison).
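For anyone hitting the same wall: a common culprit in epsilon-greedy DQN agents is acting through the exploration policy at test time instead of greedily. A minimal greedy-evaluation sketch, assuming the classic Gym step API and a q_eval network (these names are assumptions, not taken from the starter code):

```python
import torch as T

def evaluate(q_eval, env, n_games=10):
    q_eval.eval()  # inference behavior for dropout/batch norm, if any
    scores = []
    for _ in range(n_games):
        obs, done, score = env.reset(), False, 0.0
        while not done:
            with T.no_grad():
                state = T.tensor(obs, dtype=T.float32).unsqueeze(0)
                action = q_eval(state).argmax(dim=1).item()  # greedy: epsilon = 0
            obs, reward, done, _ = env.step(action)
            score += reward
        scores.append(score)
    return scores
```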
Hey Phil, how do you save/reload the seed for numpy, env.seed, env.action_space.seed, T.cuda.manual_seed_all, and other related seed settings? Say I need to run a long training job but have to switch over to different machines (very likely to happen in HPC). Could you please provide some advice on this? Thanks in advance.
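One way to approach this, sketched under the assumption that you want to carry the generator states (not just the original seeds) to the new machine: checkpoint the RNG states themselves. Note that results are only bit-reproducible on matching hardware/software, and a Gym env's internal RNG generally has to be re-seeded rather than checkpointed.

```python
import random
import numpy as np
import torch as T

def save_rng_states(path):
    # Capture each generator's current state, not just the seed:
    # mid-training, the state has long since diverged from the seed.
    T.save({'python': random.getstate(),
            'numpy': np.random.get_state(),
            'torch': T.get_rng_state(),
            'cuda': T.cuda.get_rng_state_all()}, path)

def load_rng_states(path):
    states = T.load(path)
    random.setstate(states['python'])
    np.random.set_state(states['numpy'])
    T.set_rng_state(states['torch'])
    T.cuda.set_rng_state_all(states['cuda'])
```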
Dear Phil,
May I ask why q_next[done] = 0? Isn't q_next a single value?
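For context: in the batched update, q_next is not a single value. It holds one max-Q estimate per transition in the minibatch, and done is a boolean mask that zeroes the bootstrap term for terminal states. A toy sketch with made-up numbers:

```python
import torch as T

gamma = 0.99
rewards = T.tensor([1.0, 0.0, 0.5, 1.0])
dones = T.tensor([False, False, True, False])
q_next_all = T.rand(4, 2)  # stand-in for q_next_net(new_states): (batch, n_actions)

q_next = q_next_all.max(dim=1)[0]  # one max-Q per transition, shape (4,)
q_next[dones] = 0.0                # terminal states have no future return
q_target = rewards + gamma * q_next
print(q_target)
```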
this is gold, thank you very much
hey phil good to see you
What are you working on, RL or anything else?
RL predominantly.
Hi, thanks for the video. Can you explain the benefits of saving the state of the optimizer?
There's an answer in the description of the video
Saving the optimizer state allows the user to continue training from where they left off. Most optimizers create internal variables (e.g. the momentum terms in Adam, which depend on training history) that determine how the weights change at each step. Check out Adam's update formula to see what happens if we randomly zero out the momentum mid-training.
The old way Phil was saving models (only the weights of the nets) was meant to allow "train on one machine and run on another" and nothing else.
Of course for those who use SGD this is not as crucial LoL
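To make that concrete, here's a small sketch that peeks at the buffers Adam keeps per parameter (the first/second moment estimates described above); a weights-only checkpoint throws these away and resets the effective step sizes:

```python
import torch as T
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(4, 2)
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# One training step so Adam populates its internal state
optimizer.zero_grad()
loss = model(T.randn(8, 4)).sum()
loss.backward()
optimizer.step()

# Per-parameter step counts and exp_avg / exp_avg_sq moment estimates
for idx, buffers in optimizer.state_dict()['state'].items():
    print(idx, buffers['step'], tuple(buffers['exp_avg'].shape))
```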
@mikhailkhlyzov6205 Understood, thank you very much!
I did horribly at all of this organizational shit for my bachelor thesis; next time I want to be organized from the start. Gonna watch what a bunch of different experienced people are doing.
You're an excellent teacher!
thanks for the tips
What are your PC specs? Training is really fast.
Intel i7 7820X with an RTX Titan
What keyboard do you use?
Some HyperX model. I'm not a huge fan, but it gets the job done.
Thanks for the video. Please check your email.