Solved Example on Q-learning in Reinforcement Learning/Q-Learning example

Sriveni

16,826 views


Published 28 Aug 2024
Q-learning Task, Q-learning Algorithm, Solved Example on Q-Learning

COMMENTS • 27

@i_izhar03 1 year ago ⁺²
15:42 Why didn't you choose the path 1->2->4->5->6? The sum there is 380, while for 1->2->3 it is only 190. Why do we choose this path over 1->2->4->5->6?
@sriveni 1 year ago ⁺³
I have given one optimal policy; you can choose any. But in your path you cannot go to 4 without passing through 3, and since 3 is the goal state, we stop there. Moreover, the agent is not aware of the rewards, so it has to proceed on a trial-and-error basis. After a certain number of iterations, the agent might find the most profitable path.
@Solinvencible 5 months ago
Thanks, Sriveni! A very well explained example!
@sriveni 5 months ago ⁺¹
Glad you liked it. Please do check out the other videos and share them with anyone who would benefit.
@starultra2863 7 months ago
At 13:51, Q(1,2) = 90, which is correct. If we had started solving the problem from the move 1 -> 2, then Q(1,2) would have been 0 in the first step, right?
@sriveni 6 months ago
Yes, you are correct. You have to start from some non-zero value; only then will the Q-matrix be updated. Since we are doing it manually, we started with Q(2,3). The same is not the case with the algorithm.
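The effect of the update order described above can be sketched in a few lines. This is only an illustration under assumptions that are not stated in the thread: a 2x3 grid (1 2 3 on top, 4 5 6 below), goal state 3, a reward of 100 for entering the goal, a discount factor of 0.9, and the deterministic rule Q(s,a) = R(s,a) + gamma * max Q(s',a'). These were chosen because they reproduce the values quoted in the comments; the names `NEIGHBORS`, `reward`, `update`, and `fresh_q` are hypothetical.

```python
# Hypothetical setup inferred from the values quoted in the comments:
# states 1..6, goal state 3, reward 100 for entering the goal, discount 0.9.
GAMMA = 0.9
GOAL = 3

# Assumed 2x3 grid (1 2 3 on top, 4 5 6 below); one step up/down/left/right.
NEIGHBORS = {
    1: [2, 4],
    2: [1, 3, 5],
    3: [],          # the goal state is treated as terminal in this sketch
    4: [1, 5],
    5: [2, 4, 6],
    6: [3, 5],
}

def reward(next_state):
    """R stays fixed: 100 for a move into the goal, 0 otherwise."""
    return 100.0 if next_state == GOAL else 0.0

def update(q, state, next_state):
    """Deterministic rule: Q(s,a) = R(s,a) + gamma * max over a' of Q(s',a')."""
    best_next = max((q[(next_state, n)] for n in NEIGHBORS[next_state]), default=0.0)
    q[(state, next_state)] = reward(next_state) + GAMMA * best_next
    return q[(state, next_state)]

def fresh_q():
    """All-zero Q table with one entry per allowed move."""
    return {(s, n): 0.0 for s, ns in NEIGHBORS.items() for n in ns}

# Order A: update Q(1,2) first, while Q(2,3) is still 0.
q_a = fresh_q()
first_a = update(q_a, 1, 2)

# Order B: update Q(2,3) first, then Q(1,2).
q_b = fresh_q()
update(q_b, 2, 3)
first_b = update(q_b, 1, 2)

print(first_a, first_b)
```

Under these assumptions, `first_a` stays at 0 because nothing non-zero has reached state 2 yet, while `first_b` picks up the discounted goal reward. An automated run would simply revisit Q(1,2) in a later episode, which is why the order matters less for the algorithm than for a hand calculation.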
@MS_P007 3 months ago
Thank you for the lecture, ma'am ❤️
@sriveni 3 months ago
Thank you 😊
@ramazandurmaz3012 10 months ago
Why are we taking only next states that have non-negative Q values? Also, there are 6 cells/states that one can end up in, but I can see only 4 actions (up, down, right, left). So why do we have a 6x6 matrix?
@sriveni 10 months ago
As per the problem definition, the reward should be maximized, so take only non-negative Q values. You can take a matrix of any size. This is like a board game where you can move only one step up, down, right, or left.
@ramazandurmaz3012 10 months ago
@sriveni I understand. One more question, though: why don't we include the learning rate alpha? Or at what step do we need it?
@sriveni 10 months ago
@ramazandurmaz3012 You can use it to update the Q value using the formula
Q(s,a)
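The formula in the reply above is cut off in this copy of the thread. For reference only, the standard textbook form of the Q-learning update with a learning rate is shown below; it is not necessarily what the author went on to write. The symbols are assumptions as well: alpha is the learning rate, gamma the discount factor, r the immediate reward, and s' the next state.

```latex
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]
```

With alpha = 1 this reduces to Q(s,a) = r + gamma * max Q(s',a'), which is consistent with values such as Q(1,2) = 90 quoted elsewhere in the comments.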
@user-cu6gf2rj6y 7 months ago
Do we have to update the R matrix as well? At some point the values in the Q matrix get updated, while R remains as it is. And for how long do we have to use the R matrix if it does not get updated?
@starultra2863 7 months ago ⁺¹
No, we don't have to update the R matrix. The R matrix remains constant; only the Q matrix changes. The R matrix serves as guidance for updating the Q matrix with the right values.
@sriveni 6 months ago
Yes, you need not update the R-matrix.
@sundarammuthu5020 10 months ago
Hi ma'am,
the maximum cumulative reward from selecting the path 4, 5, 6 is 81 + 90 + 100 = 271.
@sriveni 10 months ago
Yes, two paths are mentioned in the solution; the first path is the shortest.
@sriveni 10 months ago
The first path is 1, 2, 3 and the second path is 4, 5, 6, and 3. Both are shown in the solution graph.
@rams9256 9 months ago
Ma'am, can you please explain the same Q-learning concept with another example?
@kartikeysokhal301 8 months ago
@sriveni 6 months ago
A little busy; I will do it.
@rahuljacker7171 4 months ago
How will we know when our Q matrix will no longer be updated?
@sriveni 4 months ago ⁺¹
If the values do not change after 2 to 3 iterations of updating, we stop.
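The stopping rule described above can be sketched as a loop that keeps sweeping over the Q table until the values stay unchanged for a few sweeps in a row. This is a minimal illustration, not the procedure from the video. It assumes a 2x3 grid (1 2 3 on top, 4 5 6 below), goal state 3, a reward of 100 for entering the goal, a discount factor of 0.9, and the deterministic update rule; `sweep`, `run_until_stable`, `tolerance`, and `patience` are hypothetical names.

```python
GAMMA = 0.9
GOAL = 3
# Assumed 2x3 grid; each state maps to the states reachable in one step.
NEIGHBORS = {1: [2, 4], 2: [1, 3, 5], 3: [], 4: [1, 5], 5: [2, 4, 6], 6: [3, 5]}

def sweep(q):
    """Update every (state, next_state) pair once; return the largest change."""
    largest_change = 0.0
    for (state, nxt), old in list(q.items()):
        r = 100.0 if nxt == GOAL else 0.0  # fixed reward, never updated
        best_next = max((q[(nxt, n)] for n in NEIGHBORS[nxt]), default=0.0)
        new = r + GAMMA * best_next
        q[(state, nxt)] = new
        largest_change = max(largest_change, abs(new - old))
    return largest_change

def run_until_stable(tolerance=1e-9, patience=3, max_sweeps=1000):
    """Stop once the table has been unchanged for `patience` sweeps in a row."""
    q = {(s, n): 0.0 for s, ns in NEIGHBORS.items() for n in ns}
    stable = 0
    for sweeps in range(1, max_sweeps + 1):
        stable = stable + 1 if sweep(q) < tolerance else 0
        if stable >= patience:
            return q, sweeps
    return q, max_sweeps

q, sweeps = run_until_stable()
print(round(q[(4, 5)]), round(q[(5, 6)]), round(q[(6, 3)]))
```

Under these assumptions the table settles on Q(4,5) = 81, Q(5,6) = 90, and Q(6,3) = 100, the same numbers that appear in the comments. The `patience` of 3 mirrors the "2 to 3 iterations" rule of thumb; in a stochastic setting a small non-zero `tolerance` would be the more usual choice.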
@afshinmonfared187 29 days ago
only you said so or you said ok.
@sriveni 27 days ago
I didn't understand your question.
@skayllacodm 4 months ago
Congratulations
@sriveni 4 months ago
Thank you
