MedAI #41: Efficiently Modeling Long Sequences with Structured State Spaces | Albert Gu

  • Published 10 Sep 2024

COMMENTS • 21

  • @EobardUchihaThawne
    @EobardUchihaThawne 7 months ago +15

    He is one of the lead authors of the new Mamba architecture.

  • @user-lf4tu9fq8j
    @user-lf4tu9fq8j 8 months ago +4

    Excellent presentation. Thank you

  • @MrHampelmann123
    @MrHampelmann123 1 year ago +4

    Amazing talk, and impressive research. Thanks.

  • @ranwang9505
    @ranwang9505 5 months ago

    Impressive presentation. Thank you

  • @yuktikaura
    @yuktikaura 8 months ago +1

    Excellent presentation

  • @mohdil123
    @mohdil123 1 year ago +1

    Awesome!

  • @theskydebreuil
    @theskydebreuil 8 months ago

    Super interesting! Thanks for the presentation.
    I work in game development for now, but it's cool to see how things are going in the ML world 😊

  • @user-kp2el5ib2m
    @user-kp2el5ib2m 6 months ago

    Excellent presentation and impressive research. I only wonder why SSMs are recurrently efficient (video timestamp: 32:27).
    Suppose k is the token length of the input history. A general sequence model (e.g., a Transformer) takes O(k^2) time. On the other hand, SSMs still need to encode the entire stateful history "recurrently". The S4 paper also aims to deal with this issue (multiplying by A k-1 times to build the K-bar kernel, which also ends up being nearly O(k^2)) by diagonalizing the matrix.
    So it seems SSM recurrence isn't "naturally" efficient, but requires some linear algebra techniques (see the sketch after this comment).
    Any suggestion will be appreciated!!
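
A minimal numpy sketch of the point raised above, assuming a diagonal state matrix as in the DSS/S4D follow-ups (illustrative only, not the official S4 algorithm, which additionally handles a low-rank correction and uses FFTs): the recurrence costs O(kN) sequentially, and with a diagonal A the kernel K-bar needs only elementwise powers of the eigenvalues rather than k-1 dense matrix multiplications.

```python
import numpy as np

N, k = 64, 1024                                  # state size, sequence length
lam = np.random.uniform(-0.9, -0.1, N)           # diagonal of A (stable eigenvalues)
B = np.random.randn(N)
C = np.random.randn(N)

# Recurrent view: x_t = A x_{t-1} + B u_t, y_t = C x_t.
# With diagonal A this is O(N) per step, O(k N) total -- linear in k.
def run_recurrent(u):
    x = np.zeros(N)
    y = np.empty(len(u))
    for t, u_t in enumerate(u):
        x = lam * x + B * u_t
        y[t] = C @ x
    return y

# Convolutional view: K_bar[i] = C A^i B. Diagonalization turns the k-1
# matrix powers into elementwise powers of the eigenvalues: O(k N), not O(k^2).
powers = lam[None, :] ** np.arange(k)[:, None]   # shape (k, N)
K_bar = (powers * B) @ C                         # shape (k,)

u = np.random.randn(k)
y_conv = np.convolve(u, K_bar)[:k]               # causal convolution
assert np.allclose(run_recurrent(u), y_conv, atol=1e-6)
```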

  • @temesgenmehari3749
    @temesgenmehari3749 1 year ago

    Why do you need to learn the delta? For example, in the ECG example you already know the sample rate of the data, right?

  • @YUNBOWANG-tx4ju
    @YUNBOWANG-tx4ju 4 months ago

    so good

  • @JamesTJoseph
    @JamesTJoseph 1 year ago

    Will subspace identification help to initialize A, B, C, and D?

  • @salehgholamzadeh3368
    @salehgholamzadeh3368 2 years ago +1

    Thanks for a very nice presentation.
    At 44:17 (Algorithm 1) you mentioned "we've been developing simplifications of the model that allow you to bypass all of this and do things much more simply".
    Has that been done by now?

    • @albertgu4131
      @albertgu4131 2 years ago +4

      There were two follow-ups on simpler diagonal state space models: DSS (arxiv.org/abs/2203.14343) and S4D (arxiv.org/abs/2206.11893). The code for these is also available from the main repository.

  • @p0w3rFloW
    @p0w3rFloW 2 years ago +2

    Thanks for the amazing talk and work! Maybe it's trivial, but I wonder how you actually reconstruct the signal from the hidden state, i.e., what does C look like? (at 23:50)

    • @albertgu4131
      @albertgu4131 2 years ago +2

      Just as A and B have specific formulas, there is a corresponding formula for C (related to evaluations of Legendre polynomials) that can be used for reconstruction (sketched below). Notebooks for reproducing the plots in this talk are available in the official repository.
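
A hedged illustration of that reply (not the official notebook code): HiPPO stores the history as coefficients of shifted, normalized Legendre polynomials, and reconstruction amounts to evaluating that polynomial series, which is the role played by C. A small numpy sketch:

```python
import numpy as np
from numpy.polynomial.legendre import legval

N = 32                                          # number of basis functions / state size
s = np.linspace(0.0, 1.0, 2000)                 # positions within the memory window
f = np.sin(8 * np.pi * s) * np.exp(-2 * s)      # example signal to memorize

def basis(n, s):
    # Normalized Legendre polynomial on [0, 1]: sqrt(2n+1) * P_n(2s - 1).
    c = np.zeros(n + 1)
    c[n] = 1.0
    return np.sqrt(2 * n + 1) * legval(2 * s - 1, c)

# Hidden state: projections <f, g_n> (what the SSM maintains online).
# The grid is uniform on [0, 1], so each integral is approximately the mean.
x = np.array([(f * basis(n, s)).mean() for n in range(N)])

# Reconstruction: f(s) ~ sum_n x_n g_n(s). Evaluating the basis at the
# desired positions is exactly what the formula for C encodes.
f_hat = sum(x[n] * basis(n, s) for n in range(N))
print("max reconstruction error:", np.abs(f - f_hat).max())
```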

  • @salehgholamzadeh3368
    @salehgholamzadeh3368 2 years ago

    Regarding the speech classification example (53:53):
    Theoretically, I am not convinced why the model should work well if it is tested at a different sampling rate than it was trained at.
    As we know, A_bar and B_bar are calculated from delta_t (as well as A and B). So the sample rate affects A_bar and B_bar, and therefore we are training A_bar and B_bar specifically for that sample rate.
    Can you please clarify what I am missing here?
    Thank you in advance

    • @albertgu4131
      @albertgu4131 2 years ago

      Instead of training Abar and Bbar, the parameters that are trained are A, B, and Delta. At test time on a different sampling rate, Delta can simply be multiplied by the relative change in rate (for the given experiment, Delta would be doubled at test time without retraining any parameters; see the sketch below).
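
A minimal sketch of that distinction, using the bilinear discretization from the S4 paper (variable names are illustrative): Abar and Bbar are derived from the trained A, B, and Delta, so a change in sampling rate only rescales Delta.

```python
import numpy as np

def discretize(A, B, delta):
    # Bilinear transform: Abar = (I - d/2 A)^{-1} (I + d/2 A),
    #                     Bbar = (I - d/2 A)^{-1} d B.
    N = A.shape[0]
    M = np.linalg.inv(np.eye(N) - delta / 2 * A)
    return M @ (np.eye(N) + delta / 2 * A), M @ (delta * B)

N = 4
A = -np.eye(N) + 0.1 * np.random.randn(N, N)    # toy stable state matrix
B = np.random.randn(N, 1)
delta = 1e-3                                    # learned step size (training rate)

Abar, Bbar = discretize(A, B, delta)            # used during training

# Test audio sampled at half the rate: samples are twice as far apart in
# time, so simply double Delta -- A and B themselves are left untouched.
Abar_test, Bbar_test = discretize(A, B, 2 * delta)
```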

  • @马辉-r5l
    @马辉-r5l 2 months ago

    Hoping for Chinese subtitles; my English listening isn't great.

    • @PeinQein
      @PeinQein 1 month ago

      Isn't there auto-translation?