CS 7646: QLearning Trader Project Overview

  • Published Oct 30, 2024

COMMENTS • 7

  • @makeshiftpenny
    @makeshiftpenny 3 years ago +27

    2:15 - [START HERE] code structure and templates
    4:45 - StrategyLearner API
    6:15 - addEvidence() parameters and behavior
    7:10 - testPolicy() parameters and behavior
    9:30 - Evaluation rubric
    12:25 - Implementation of StrategyLearner
    17:00 - how to frame trading as a reinforcement learning (RL) problem
    19:55 - What defines the State (in this problem)?
    21:15 - What are the Actions?
    27:35 - What is the Reward?
    28:30 - Should the reward be delayed (long-term, i.e. cumulative return) or frequent (short-term, i.e. daily return)?
    31:15 - Balch assumes we are not using the Transition Matrix (i.e. no Dyna-Q)
    33:45 - How to represent the State?
    38:15 - StrategyLearner addEvidence() pseudocode
    44:20 - Q: How do we define convergence?
    50:45 - testPolicy() pseudocode
    52:15 - adding missing line in addEvidence() pseudocode
    54:05 - discussion of short-term vs long-term rewards
    58:00 - daily-return rewards depend on the current position (long, short, or none). With no holdings, the reward should be zero
    59:15 - should we use Dyna-Q? Dyna is not recommended, because we want to minimize runtime, but it should reduce the number of trades
    1:00:30 - [END OF LECTURE]
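The reward and update steps in the outline above can be sketched as follows. This is a minimal illustration, not the lecture's code: it assumes the position is encoded as +1 (long), -1 (short), or 0 (none), that the reward is the daily return signed by the position, and that the Q-table is a plain list of lists updated with the standard tabular Q-learning rule (no Dyna). The names `daily_reward` and `q_update` and the values of `alpha` and `gamma` are hypothetical.

```python
def daily_reward(position, price_today, price_yesterday):
    """Daily-return reward signed by the position.

    position: +1 for long, -1 for short, 0 for no holdings (assumed encoding).
    With no holdings the reward is zero, whatever the price did.
    """
    daily_return = price_today / price_yesterday - 1.0
    return position * daily_return


def q_update(q, state, action, reward, next_state, alpha=0.2, gamma=0.9):
    """Standard tabular Q-learning update, applied in place.

    q is a list of rows, one per state, each row holding one value per action.
    Returns the new value of q[state][action].
    """
    best_next = max(q[next_state])
    q[state][action] = (1 - alpha) * q[state][action] + alpha * (
        reward + gamma * best_next
    )
    return q[state][action]


if __name__ == "__main__":
    # Two states, three actions (for example: short, nothing, long).
    q = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    r = daily_reward(position=1, price_today=101.0, price_yesterday=100.0)
    print(q_update(q, state=0, action=2, reward=r, next_state=1))
```

Using the daily return rather than the cumulative return gives the learner a reward signal on every step, which is the short-term option discussed at 28:30 and 54:05.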

  • @japanboy31415
    @japanboy31415 11 months ago +6

    Ml4t gang wya ?

  • @bronsonschnitzel7493
    @bronsonschnitzel7493 7 months ago +8

    Another 7-year-old video courtesy of OMSCS

  • @kuatroka
    @kuatroka 7 years ago +1

    Hi Professor Balch, thanks for the Udacity course ML for Trading!
    I'd like to ask a question. In section 03-06, Q-Learning, Quiz: The Trading Problem: State (at 0:43), you explain that Adjusted Close and SMA are poor choices as state factors because their values are meaningless without a point of comparison, and that the Price/SMA ratio is a better fit. Later you say that BB values are good and could be used. My question: in this context (Q-learning for trading), what is the difference between the BB value and, say, SMA? BB values also differ from stock to stock and seem to be of the same nature as Price or SMA, since the BB value is not a ratio. Maybe I'm missing something and you meant some kind of normalized BB value, for example in percentage points? I'm trying to understand the general idea behind which features make sense for trading and which don't. Thanks

    • @viniciusepheta
      @viniciusepheta 7 years ago +2

      Yes, I think he meant the normalized BB, i.e. a ratio. In fact, the most common approach is to standardize the BB value (a z-score), which gives a value that is comparable across different stocks.
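To make the point about comparability concrete, here is a small sketch of two scale-free indicators. It assumes the normalized Bollinger Band value is defined as (price - SMA) / (2 * rolling standard deviation), which is one common form; the function name `rolling_features` and the use of the population standard deviation are choices made for this example only.

```python
from statistics import mean, pstdev


def rolling_features(prices, window):
    """Return (price/SMA ratio, normalized BB value) for each full window.

    The BB value here is (price - SMA) / (2 * std), an assumed definition.
    Both features are unitless, so they do not depend on the price level.
    """
    features = []
    for i in range(window - 1, len(prices)):
        recent = prices[i - window + 1 : i + 1]
        sma = mean(recent)
        std = pstdev(recent)
        price = prices[i]
        ratio = price / sma
        # A flat window has zero spread; treat the price as sitting on the SMA.
        bb_value = (price - sma) / (2 * std) if std > 0 else 0.0
        features.append((ratio, bb_value))
    return features


if __name__ == "__main__":
    cheap = [10.0, 11.0, 12.0, 11.0, 13.0]
    expensive = [p * 10 for p in cheap]  # same shape, ten times the price
    print(rolling_features(cheap, window=3))
    print(rolling_features(expensive, window=3))
```

Both calls print the same features up to floating-point rounding, whereas the raw price or the raw SMA would differ by a factor of ten between the two series.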

    • @VR-fh4im
      @VR-fh4im 3 years ago +1

      @viniciusepheta He does mean standardize. When we standardize the training features, we keep the mean and standard deviation computed from the training data and reuse them on the test data when we query the Q-Learner.
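The train-then-reuse idea in this reply can be sketched as below. This is an illustration under assumptions, not the course's implementation: the helper names (`fit_standardizer`, `standardize`, `to_bin`) and the bin edges are hypothetical, and the binning step is just one simple way to turn a z-score into a discrete value for a tabular learner.

```python
from statistics import mean, pstdev


def fit_standardizer(train_values):
    """Compute the mean and standard deviation from training data only."""
    return mean(train_values), pstdev(train_values)


def standardize(values, mu, sigma):
    """Apply previously fitted statistics; never refit on test data."""
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def to_bin(z, edges=(-1.0, 0.0, 1.0)):
    """Map a z-score to an integer bin: the count of edges it exceeds."""
    return sum(z > edge for edge in edges)


if __name__ == "__main__":
    train = [1.0, 2.0, 3.0]
    test = [2.0, 4.0]
    mu, sigma = fit_standardizer(train)    # fitted once, on training data
    z_test = standardize(test, mu, sigma)  # reused unchanged at test time
    print([to_bin(z) for z in z_test])
```

Fitting the statistics on the training period and reusing them keeps the state encoding identical between training and testing, and avoids leaking information from the test period into the features.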

  • @japanboy31415
    @japanboy31415 11 months ago +2

    money man