He is one of the lead authors of this new Mamba architecture
And S4, and the SSM paper before that lol
Excellent presentation. Thank you!
Amazing talk, and impressive research. Thanks.
Impressive presentation. Thank you!
Excellent presentation
Awesome!
Super interesting! Thanks for the presentation.
I work in game development for now, but cool to see how things are going in the ML world 😊
Excellent presentation and impressive research! I only wonder why SSMs are efficient as recurrences (video timestamp: 32:27).
Suppose k is the token length of the input history. A general sequence model such as a transformer has O(k^2) time complexity. SSMs, on the other hand, still need to encode all of the stateful history recurrently. The S4 paper also addresses this issue (multiplying A up to k-1 times to build the K_bar kernel, which also ends up nearly O(k^2)) by diagonalizing the matrix.
So it seems SSM recurrence isn't "naturally" efficient, but requires some linear algebra technique.
Any suggestion will be appreciated!
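For reference, a sketch of the construction the question refers to (notation as in the S4 paper; L is the sequence length):

```latex
% Discrete SSM recurrence and its convolutional unrolling (as in the S4 paper)
x_k = \bar{A} x_{k-1} + \bar{B} u_k, \qquad y_k = \bar{C} x_k
% Unrolling the recurrence over L steps yields a convolution with the kernel
\bar{K} = \left( \bar{C}\bar{B},\ \bar{C}\bar{A}\bar{B},\ \dots,\ \bar{C}\bar{A}^{L-1}\bar{B} \right),
\qquad y = \bar{K} * u
```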
Why do you need to learn the delta? For example, in the ECG example, you already know the sampling rate of the data, right?
So good!
Would subspace identification help to initialize A, B, C, and D?
Thanks for a very nice presentation.
At 44:17 (Algorithm 1) you mentioned that "we've been developing simplifications of the model that allow you to bypass all of this and do things much more simply."
Has that already been done by now?
There were two follow-ups on simpler diagonal state space models: DSS (arxiv.org/abs/2203.14343) and S4D (arxiv.org/abs/2206.11893). The code for these is also available from the main repository
Thanks for the amazing talk and work! Maybe it's trivial, but I wonder how you actually reconstruct the signal from the hidden state, i.e., what does C look like? (at 23:50)
Just as A and B have specific formulas, there is a corresponding formula for C (related to evaluations of Legendre polynomials) that can be used for reconstruction. Notebooks for reproducing the plots in this talk are available in the official repository
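Roughly, reconstruction evaluates the Legendre basis weighted by the state coefficients. A minimal sketch of the idea, assuming the HiPPO-LegT convention (state coordinate n holds the coefficient of the n-th normalized Legendre polynomial over a lookback window); `reconstruct` and its arguments are illustrative names, not the repository's API:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def reconstruct(x, window, num_points=200):
    """Approximate the recent input from hidden state x (length N).

    Assumes x[n] is the coefficient of the n-th orthonormal Legendre
    basis function over [0, window] (HiPPO-LegT-style convention).
    """
    s = np.linspace(0.0, window, num_points)  # positions inside the window
    u = 2.0 * s / window - 1.0                # map [0, window] -> [-1, 1]
    f_hat = np.zeros_like(s)
    for n, coeff in enumerate(x):
        # Orthonormal basis element: sqrt(2n + 1) * P_n(u)
        f_hat += coeff * np.sqrt(2 * n + 1) * Legendre.basis(n)(u)
    return s, f_hat
```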
Regarding the speech classification example (53:53):
Theoretically, I'm not convinced why the model should work when it is tested at a different sampling rate than it was trained on.
As we know, A_bar and B_bar are calculated from delta_t (as well as from A and B). So the sampling rate affects A_bar and B_bar, and therefore we are effectively training A_bar and B_bar for that specific sampling rate.
Can you please clarify what I am missing here?
Thank you in advance!
Instead of training A_bar and B_bar, the parameters that are trained are A, B, and Delta. At test time on a different sampling rate, Delta can simply be multiplied by the relative change in rate (for the given experiment, Delta would be doubled at test time without retraining any parameters)
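A minimal NumPy sketch of this, assuming the bilinear (Tustin) discretization from the S4 paper; the toy A, B, and `delta_train` values are made up for illustration:

```python
import numpy as np

def discretize(A, B, delta):
    """Bilinear discretization used in S4:
    A_bar = (I - delta/2 * A)^{-1} (I + delta/2 * A)
    B_bar = (I - delta/2 * A)^{-1} * delta * B
    """
    I = np.eye(A.shape[0])
    inv = np.linalg.inv(I - (delta / 2.0) * A)
    return inv @ (I + (delta / 2.0) * A), inv @ (delta * B)

A = -np.eye(4)            # toy stable A, not the actual HiPPO matrix
B = np.ones((4, 1))
delta_train = 1e-3        # hypothetical train-time step size

A_bar, B_bar = discretize(A, B, delta_train)

# Test time at half the sampling rate: samples are twice as far apart
# in continuous time, so only Delta changes; A and B are reused as-is.
A_bar_test, B_bar_test = discretize(A, B, delta_train * 2.0)
```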
I hope there will be Chinese subtitles; my English listening isn't great.
Isn't there auto-translation?