Nice recap of online / offline RL, connected to an overview of the whitepaper.
Excellent lecture
How big does the dataset of an RL algorithm need to be before it's indistinguishable from brute force?
Brute forcing is different because it's still on-policy. But at some dataset size you'll have samples from every possible policy and could just rejection sample, so good question...
Can I use a dataset or a pandas DataFrame and go ahead with Q-learning?
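In principle yes, at least for small discrete problems: Q-learning is off-policy, so its update can be replayed over logged transitions instead of live environment steps. Below is a minimal sketch of that idea, not anything taken from the lecture. The column names (`state`, `action`, `reward`, `next_state`, `done`), the toy data, and the function `offline_q_learning` are all assumptions made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical logged transitions, one row per environment step.
# The column names and values are illustrative assumptions.
transitions = pd.DataFrame({
    "state":      [0, 1, 2, 0, 1],
    "action":     [1, 1, 1, 0, 0],
    "reward":     [0.0, 0.0, 1.0, 0.0, 0.0],
    "next_state": [1, 2, 3, 0, 0],
    "done":       [False, False, True, False, False],
})

def offline_q_learning(df, n_states, n_actions, alpha=0.5, gamma=0.9, epochs=200):
    """Tabular Q-learning that sweeps repeatedly over a fixed set of transitions."""
    q = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for row in df.itertuples(index=False):
            # Bootstrapped target: reward plus discounted best next-state value,
            # with no bootstrapping past a terminal transition.
            target = row.reward
            if not row.done:
                target += gamma * q[row.next_state].max()
            q[row.state, row.action] += alpha * (target - q[row.state, row.action])
    return q

q_table = offline_q_learning(transitions, n_states=4, n_actions=2)
greedy_policy = q_table.argmax(axis=1)  # best known action per state
```

One caveat: (state, action) pairs that never appear in the DataFrame keep their initial value of 0, so the result is only as good as the dataset's coverage.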