I don't usually comment on YouTube videos, but this one made such a complex subject easy and intuitive. The details in the animations and your clear explanation really helped me a lot!
I was fascinated by the power of state space models in the field of control theory, and now they are finding their way into the new era of AI. I really love these models. Thank you, Mr. Serrano, for the easy and interesting explanations!
Thank you so much! I was trying to understand Mamba, and when I saw you had made a video on it, it was such a relief.
I just discovered your channel. You are like a dream. Thank you so much!
Perfect, but please continue this series and make a video on why we need Mamba as a Transformer replacement.
Great video! You always find an amazingly intuitive way to explain these technical and detailed subjects.
@@MaartenGrootendorst oh thank you! What an honor to hear from you, I love your articles and your recent book! It’s thanks to your article that I learned SSMs.
Columbo does it again!! How many videos have I left halfway through, despondent that I would NEVER understand SSMs with these damned ABCD equations - one trip to Serrano Academy and I'm like... "Nobody does it better, makes me feel sad for the rest, nobody does it half as good as you, Columbo, you're the best!!!" Very special teacher. Always-as-One, we'll be home soon, Mother XXXXX
What an amazing visualization!!
Very intuitive, thank you.
Thank you for the video, very informative! It would be really interesting to see a video explaining the training phase of SSM. What are the trainable parameters and how does the training process work?
I'm not confident at all in this, so take this with a grain of salt, but I'd assume the parameters would be the entries of the three matrices A, B, and C.
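If it helps, here's a tiny NumPy sketch of what I mean: just my rough reading of the ABCD equations from the video (h_t = A h_{t-1} + B x_t, y_t = C h_t), with toy sizes and random values standing in for learned ones, not actual SSM library code.

import numpy as np

# Minimal discrete SSM: h_t = A h_{t-1} + B x_t, y_t = C h_t
rng = np.random.default_rng(0)
N, L = 4, 8  # state size and sequence length (arbitrary toy values)

# The entries of these three matrices would be the trainable parameters
A = rng.normal(size=(N, N)) * 0.3
B = rng.normal(size=(N, 1))
C = rng.normal(size=(1, N))

x = rng.normal(size=L)   # toy input sequence
h = np.zeros((N, 1))     # hidden state starts at zero
y = []
for t in range(L):
    h = A @ h + B * x[t]        # update the hidden state
    y.append((C @ h).item())    # read out the output
print(np.round(y, 3))

During training you'd backpropagate a loss through y into A, B, and C. I believe the real papers also learn a discretization step size, but I'm leaving that out here.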
Thaaaaank you for your video, it made it easy for me to learn Mamba!
Brilliant! Next video on "KAN", please?
Thank you!!!
I wonder where you learned this from? Like, what resources did you use to come up with such a simple example?
Your explanation is excellent, but I have a question. In the video, the final computation is performed using convolution. How can the convolution kernel be constructed quickly? Doesn't this also require many matrix powers?
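In case it helps anyone: my understanding (could be wrong!) is that the kernel K = (CB, CAB, CA^2B, ...) can be built incrementally, so no power is ever recomputed from scratch; each entry costs just one extra multiplication by A. A quick NumPy sketch with toy sizes I made up:

import numpy as np

rng = np.random.default_rng(0)
N, L = 4, 8  # state size and kernel length (toy values)
A = rng.normal(size=(N, N)) * 0.3
B = rng.normal(size=(N, 1))
C = rng.normal(size=(1, N))

# Build K = (CB, CAB, CA^2B, ..., CA^{L-1}B) with one mat-vec per step
K = []
AkB = B.copy()               # holds A^k B, starting at k = 0
for _ in range(L):
    K.append((C @ AkB).item())
    AkB = A @ AkB            # advance to A^{k+1} B; no fresh power needed
print(np.round(K, 3))

The actual S4/Mamba papers exploit extra structure in A to speed this up even further, so treat this as the naive version.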
Hi, please make a video on the Samba model just like this masterpiece. Thanks in advance.
How will it know which word to focus more on? Is there any logic it uses in the backend?
The state h_{t-1} is not easily interpretable by us. This detail was also glossed over in the brief explanation of Mamba's attention-like mechanism, which I believe is similar to the attention mechanisms of Transformer networks. To understand it in more detail, you would need to read the groundbreaking AI paper mentioned at the beginning of the video. 0:12
Maybe I'm not understanding because it's getting pretty late here, but this seems like it's using a neural network to learn the transition functions (represented by the matrices A, B, and C) of a finite state machine, no?
Also, I've heard a lot of people contrasting Mamba and SSMs with Transformers and claiming Mamba will replace Transformers, going so far as to say "we don't need attention after all!" But isn't the matrix A (or at least, the combination of A and B) basically acting similarly to an attention matrix anyway?
Thanks for the explanation.
I was curious to know your thoughts: why is Mamba not already replacing Transformers in mainstream large language models?
@@AravindUkrd thanks, great question! My guess is that implementing it is hard and may be disruptive. They would only do it if the performance is much better, and right now it’s comparable but not a lot better. But lemme find out and if it’s something different I’ll post it here.
@@SerranoAcademy Thought so. Thanks for reply 😊.
One of the advantages of Transformers, and something that helped train very big Transformers on very big datasets, was parallelism (and it was said to be an advantage over RNNs). Isn't that lost with SSMs? Maybe that's the reason why they have not been so widely adopted?
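From what I understand (happy to be corrected), it isn't lost: since A, B, and C don't change per token in a plain SSM, the whole output can also be computed in parallel as a convolution with the kernel K = (CB, CAB, CA^2B, ...), matching the sequential recurrence exactly. A quick NumPy check of that equivalence, with toy sizes I made up:

import numpy as np

rng = np.random.default_rng(0)
N, L = 4, 8
A = rng.normal(size=(N, N)) * 0.3
B = rng.normal(size=(N, 1))
C = rng.normal(size=(1, N))
x = rng.normal(size=L)

# Sequential (RNN-style) pass: h_t = A h_{t-1} + B x_t, y_t = C h_t
h = np.zeros((N, 1))
y_seq = []
for t in range(L):
    h = A @ h + B * x[t]
    y_seq.append((C @ h).item())

# Parallel (convolutional) pass with kernel entries K_k = C A^k B
K = []
AkB = B.copy()
for _ in range(L):
    K.append((C @ AkB).item())
    AkB = A @ AkB
y_conv = np.convolve(x, K)[:L]  # causal convolution, truncated to length L

print(np.allclose(y_seq, y_conv))  # True: same outputs, computable in parallel

(Mamba's selective version makes the matrices input-dependent, which breaks the convolution trick; that's where parallel scans come in, as far as I know.)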
I would find a video about Kalman filters interesting.
"Hello, I am living Dubai and from India and I have a very strong background in advanced mathematics across multiple disciplines. Recently, I started learning Data Science and AI. I came across your channel, and believe me, it has motivated me a lot. I feel like I am learning Algebra with you. You're doing a great job, and I enjoy all your videos. Nice work! May Allah bless you."
please make a video on mamba sir
Thanks for the suggestion! I touch on mamba at the end of this video, but I'm still trying to understand the details... hopefully I'll have one soon! :)
@@SerranoAcademy thanks a lot sir
Hi everyone
Hey there
Hello!!!
Since I can't like this video more than once, I added my likes in the comments 👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍 👏👏👏👏