🐍 Mamba2 8B Hybrid 🚀: NVIDIA stealth-drops their latest Mamba2 model!

  • Published 12 Sep 2024
  • The new Nvidia Mamba2 8B Hybrid LLM is here, and it's shaking things up. This video dives deep into the advancements it brings, potentially offering faster performance than traditional Transformer models. Could this be the future of large language models? We'll explore the rumors, specs, and what this means for the field of NLP. Join us to unravel the mystery of the Nvidia Mamba2 8B Hybrid LLM!
    Tell us what you think in the comments below!
    ------------------------
    Mamba 2 Hybrid 8B Hugging Face Card: huggingface.co...
    Mamba-2 Release: x.com/_albertg...
    Faro-Yi-9B-DPO: x.com/01AI_Yi/...
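
    For anyone who wants to poke at the checkpoint from the Hugging Face card above, here is a minimal, hypothetical loading sketch in Python. The repo id and the plain-transformers loading path are assumptions on our part; NVIDIA's official release may require Megatron-LM instead.

        # Hypothetical sketch: the repo id below is an assumption, and the
        # official checkpoint may need Megatron-LM rather than transformers.
        from transformers import AutoModelForCausalLM, AutoTokenizer

        model_id = "nvidia/mamba2-hybrid-8b-3t-4k"  # assumed repo id
        tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
        model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

        prompt = "State-space models are"
        inputs = tok(prompt, return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=32)
        print(tok.decode(out[0], skip_special_tokens=True))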

COMMENTS • 29

  • @NicolasEmbleton
    @NicolasEmbleton 2 months ago +6

    Yes, more Mamba demos please.

  • @Person-hb3dv
    @Person-hb3dv 2 months ago +4

    Would be interesting to see benchmark numbers for the Mamba models vs. same-size transformer-based models.

    • @aifluxchannel
      @aifluxchannel  2 months ago +1

      We'll have to wait and see. Sometimes these models benchmark wildly differently, especially when you start to look at how long-context-window scaling works out.
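
      For anyone who wants to run that comparison themselves, here is a minimal sketch using EleutherAI's lm-evaluation-harness; the repo ids are illustrative placeholders, not confirmed checkpoint names.

          # Sketch: score a Mamba-style model and a same-size transformer on a
          # standard benchmark. Repo ids below are illustrative assumptions.
          import lm_eval

          for repo in ["nvidia/mamba2-hybrid-8b-3t-4k",   # assumed id
                       "meta-llama/Meta-Llama-3-8B"]:
              results = lm_eval.simple_evaluate(
                  model="hf",
                  model_args=f"pretrained={repo}",
                  tasks=["hellaswag"],
              )
              print(repo, results["results"]["hellaswag"])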

  • @novantha1
    @novantha1 2 months ago +7

    Nvidia will literally train a state-of-the-art AI model to use VRAM more effectively before letting us pay the $100 more to go from 24GB consumer cards to 32GB consumer cards lmfao.

    • @aifluxchannel
      @aifluxchannel  2 months ago

      Quite literally haha. Granted, we all know we'll have to pay more than $100 to go from 24GB to 32GB on RTX cards :(

  • @ryzikx
    @ryzikx 2 months ago +1

    I haven't heard many people talk about Nemotron either. Nvidia really be low-key dropping some insane fking stuff. Thanks for the news!
    Nemotron-type models are going to be the future as we close in on the limit of natural high-quality data.

  • @Nadavot
    @Nadavot 1 month ago +1

    Wasn't AI21's Jamba the first Mamba-Transformer hybrid?

  • @fontenbleau
    @fontenbleau 2 months ago +8

    A horrible scandal with Stable Diffusion 3's new licensing terms, even a ban on use on certain platforms. People see that with such a license, the next buyer of Stable Diffusion will also get sweeping rights over the models and everything made with them. "Open models" was a lie; maybe lawless China will be the only oasis for them.

    • @xlr555usa
      @xlr555usa 2 months ago

      Open-source models can be forked; this happens all the time. The core functionality is there and can be built upon. Don't sit around waiting for China to screw everything up. The CCP controls China and has become a pariah on the world stage.

  • @southcoastinventors6583
    @southcoastinventors6583 2 months ago +1

    Less is more when it comes to LLMs. I hope they used more than the Snake-game implementations to train this model. Also, Megatron is the leader of the Decepticons, while Megaton is a city in the Capital Wasteland. It seems people love their favorite shows or deadly snakes. Please run the model through your normal tests.

  • @jonmichaelgalindo
    @jonmichaelgalindo 2 months ago +1

    Everyone talks about Mamba and no one tests it, because we all know SSMs don't work.
    A giant context window? Oh, let me try needle in the hayst--Oh it can't copy from the prompt to the output. :-|
    Alright, let me just have it convert this data into JSON--No copying from prompt to output! Oh, right.
    Well, then let's just do function calling. Here are the functions that--NO COPYING FROM PROMPT TO OUTPUT!
    Oh... Right. Uhm...

    • @mira_nekosi
      @mira_nekosi 2 months ago

      But Mamba2-Hybrid can; AFAIK it was shown in the paper.
      Also, IMO TOVA could help with limiting the KV cache without affecting such abilities much.

    • @jonmichaelgalindo
      @jonmichaelgalindo 2 months ago

      @@mira_nekosi Well yeah, Mamba2-Hybrid uses attention. That's where transformers get their power.

    • @mira_nekosi
      @mira_nekosi 2 months ago +1

      @@jonmichaelgalindo I know, but it's much faster and uses less memory.
      Also, IMO the performance loss with TOVA will be even smaller than in transformers, especially with finetuning for it.
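
      Since TOVA comes up twice in this thread: it is the cache-eviction rule from the "Transformers are Multi-State RNNs" paper, which keeps the KV cache at a fixed budget by dropping the position that the newest query attends to least. A rough sketch follows; the tensor shapes and names are our own illustrative assumptions.

          # TOVA-style KV-cache eviction sketch; shapes and names are assumptions.
          import torch

          def tova_evict(keys, values, attn, budget):
              # keys/values: [seq, dim]; attn: [heads, seq] attention weights
              # from the latest decoding step. Keep the cache within `budget`.
              if keys.size(0) <= budget:
                  return keys, values
              scores = attn.mean(dim=0)              # average over heads
              drop = scores.argmin()                 # least-attended position
              keep = torch.arange(keys.size(0)) != drop
              return keys[keep], values[keep]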

  • @Wobbothe3rd
    @Wobbothe3rd 2 months ago +3

    Recurrent Neural Networks return!!!!

  • @jmirodg7094
    @jmirodg7094 2 months ago +1

    I'm curious to see how this new Mamba 2 performs against Llama 3.

    • @aifluxchannel
      @aifluxchannel  2 months ago +1

      Right now Mamba is more of an academic/research endeavor. Hopefully we'll see reasonable evals this week, though I think for now, although Mamba uses less compute, Llama 3 is likely still more practically capable.

    • @joech1065
      @joech1065 2 months ago +2

      It's trained on 3.5 trillion tokens, Llama 3 on 15 trillion, so it can't possibly perform as well. I really hope they take a Mamba model and train it adequately to match the level of training that current SOTA transformer models get.

  • @SiCSpiT1
    @SiCSpiT1 2 months ago

    I'm going to make the prediction that your mid-range GPU pick is a 4060 Ti 16GB.

    • @aifluxchannel
      @aifluxchannel  2 months ago +1

      Keep an eye out for our next video ;)