MIT 6.S191 (2022): Reinforcement Learning

  • Published Nov 24, 2024

COMMENTS • 26

  • @liangertyguo100 · 2 years ago · +30

    Best introductory video for DRL. I've read lots of books and reviews, but none of them explained it so clearly. Thank you so much for the excellent presentation.

  • @josephcheung2311 · 2 years ago · +2

    This course is excellent. The two instructors explain the concepts very well.

  • @martinsichibeya3405 · 2 years ago · +5

    My very favorite... honestly, I love this DL course so much. Thanks for your efforts.

  • @midimusicforever · 2 years ago · +3

    Another step towards the singularity.

  • @tatendamuzenda8442 · 1 year ago

    Loving the course; it's so easy to understand.

  • @alexandertimofeev7626 · 2 years ago · +2

    Nice lecture! However, it was hard for me to follow the idea of the loss function at 44:44. So it works if R_t is negative for low rewards and positive for high rewards, right?

    • @tommyholladay · 2 years ago · +1

      We minimize the loss. By minimizing the negative log of the probability multiplied by the reward, we are actually optimizing for higher reward, which in a sense makes it gradient ascent.

    • @uk_with_jatin3512 · 1 year ago · +1

      @tommyholladay Totally agreed, but an easier way to explain this is that we take the negative of the log-likelihood because, for high rewards, we want to move in that direction, so we use the negative sign to reverse the direction of the gradient.
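The loss function this thread is discussing can be sketched in a few lines of NumPy. This is an illustrative, minimal version of the vanilla policy-gradient (REINFORCE) loss described in the lecture, not the course's actual code; the function and variable names are made up for the example:

```python
import numpy as np

def policy_gradient_loss(log_probs, returns):
    # Loss = -sum(log pi(a_t|s_t) * R_t). Minimizing this increases the
    # log-probability of actions that earned high returns, so gradient
    # descent on the loss acts as gradient ascent on expected reward.
    return -np.sum(log_probs * returns)

# Two actions; the first earned a high return (R_t = 10), the second a low one.
returns = np.array([10.0, 1.0])

# A uniform policy vs. one that favors the high-return action.
loss_uniform = policy_gradient_loss(np.log(np.array([0.5, 0.5])), returns)
loss_greedy = policy_gradient_loss(np.log(np.array([0.9, 0.1])), returns)

# The policy that puts more mass on the high-return action has the lower
# loss, so minimizing the loss pushes the policy in that direction.
print(loss_uniform, loss_greedy)
```

Note that R_t only scales the gradient of each log-probability term: if R_t were negative for bad outcomes, the sign flip would actively push probability mass away from those actions, which answers the question about negative vs. positive rewards above.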

  • @jamesgambrah58 · 1 year ago

    This is awesome, but how can those of us watching this recorded video on YouTube get the opportunity to practice with VISTA? Is there any arrangement for us?

    • @AAmini · 1 year ago

      Yes! VISTA is available to the public here: github.com/vista-simulator/vista
      Also check out the VISTA-related Lab 3 in the class's open-source software labs for examples.

    • @jamesgambrah58 · 1 year ago

      @AAmini Thanks, Prof. I will explore it. The data science community will forever appreciate your contribution to the growth of the field.

  • @hassinijalil5533 · 1 year ago

    Hello,
    I have a question: when we do the training, what data is used to train the agent? Is it the environment (CARLA, for example)? And can we transform the environment into images?
    I hope you can reply, sir; I have a university project.
    Thank you.

  • @darshank8748 · 2 years ago · +1

    Great Work !!!

  • @chrcheel · 2 years ago · +1

    Thanks a lot!

  • @khaoticttv6506 · 2 years ago

    Hey, do you use a Mac or a windows machine with Ubuntu installed on it?

  • @harmhoeks5996 · 2 years ago

    Why does Tesla have 1,500 data labelers instead of using reinforcement learning?

    • @RC-bm9mf · 2 years ago

      Because actual accidents are much more costly.

  • @ahmedb2559 · 1 year ago

    Thank you !

  • @nguyenvandien8996 · 2 years ago

    Hello, Amini. Why can't I see the slides for this video on the homepage?

  • @MarkSimithraaratchy · 2 years ago

    Excellent lecture; thank you.

  • @helloansuman · 2 years ago · +1

    Amazing ❤️

  • @andreas.karatzas · 2 years ago · +1

    Now, that's the good stuff!!!

  • @theneumann7 · 1 year ago · +2

    👏

  • @buoyrina9669 · 2 years ago

    Looking forward to it

  • @kellybrower301 · 2 years ago

    Hallucinate? 🤔😭

  • @zhihuiyuze · 2 years ago

    Starcraft 2!!!!