Markov Decision Processes Four - Georgia Tech - Machine Learning

  • Published 4 Feb 2025

COMMENTS • 15

  • @EmilyXieX
    @EmilyXieX 4 years ago +15

    This is super clear. Thanks so much for making this video.

  • @audic2350
    @audic2350 2 years ago +2

    The greatest video I could watch to understand MDP.

  • @vishalkumarpandey5546
    @vishalkumarpandey5546 1 year ago

    Such an insightful, discussion-based explanation. Great 👍

  • @QQ-xx7mo
    @QQ-xx7mo 6 years ago +2

    Awesome videos, thank you.

  • @cigxhang486
    @cigxhang486 11 months ago

    So the policy tells you the next action to take so that you eventually reach the reward?

  • @renskirchner6309
    @renskirchner6309 4 years ago +1

    You're a genius

  • @braineedly7543
    @braineedly7543 2 years ago

    Is the choice of policy based on the model?

  • @enditend2
    @enditend2 9 years ago +8

    No part 5?

  • @lahaale5840
    @lahaale5840 7 years ago

    Is the reward given? If not, where does the reward come from? Is it equivalent to labeled data in supervised learning?

    • @oldcowbb
      @oldcowbb 3 years ago

      I think it is more like the cost function that measures whether the prediction matches the label: a numerical function that indicates what you want the algorithm to optimize, like matching labels in classification or getting closer to the goal in navigation.

    • @braineedly7543
      @braineedly7543 2 years ago

      @oldcowbb So we should store the reward for every state?

    • @oldcowbb
      @oldcowbb 2 years ago

      @braineedly7543 Well, you can't solve an MDP without the reward, so yes.
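
      To make the point in this thread concrete, here is a minimal sketch of an MDP where the reward is stored once per state and the policy is derived from it. Everything in it is an assumption chosen for illustration and is not taken from the video: a hypothetical 4-state chain, deterministic moves, a discount of 0.9, a step reward of -0.04, and a goal reward of +1. It uses plain value iteration, then reads off the greedy policy.

```python
# Hypothetical 4-state chain MDP: states 0..3, state 3 is the terminal goal.
# Deterministic transitions are an assumption made to keep the sketch short.
GAMMA = 0.9
STATES = [0, 1, 2, 3]
ACTIONS = {"left": -1, "right": +1}
REWARD = {0: -0.04, 1: -0.04, 2: -0.04, 3: 1.0}  # one stored reward per state
TERMINAL = {3}


def step(state, action):
    """Move one cell left or right, clipped to the ends of the chain."""
    return min(max(state + ACTIONS[action], 0), len(STATES) - 1)


def value_iteration(tol=1e-9):
    """Repeat the Bellman update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s in TERMINAL:
                new_v = REWARD[s]
            else:
                best_next = max(values[step(s, a)] for a in ACTIONS)
                new_v = REWARD[s] + GAMMA * best_next
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:
            return values


def greedy_policy(values):
    """For each non-terminal state, pick the action leading to the best value."""
    return {
        s: max(ACTIONS, key=lambda a: values[step(s, a)])
        for s in STATES
        if s not in TERMINAL
    }


values = value_iteration()
policy = greedy_policy(values)
print(values)
print(policy)
```

      In this sketch the reward table is the only place the goal is expressed; the policy (the action to take in each state) falls out of the values computed from it, which is why the rewards have to be available for every state.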

  • @joselabaki8290
    @joselabaki8290 2 years ago

    The instructor is excellent. Unfortunately, the explanation is slowed down, and sometimes "blurred", by the non-stop interjections. I believe a single voice is more than enough.