With that, these are the three videos I had planned out. Do check out the previous ones if you missed them!
What kind of videos would you guys like to see next?
Hey, I think vcubingx should explain sparse attention: it makes models handle large inputs more efficiently by only attending to a subset of elements. For long sequences this gives a computational advantage (it requires less computation than full softmax attention).
I'd recommend reading this: 'research.google/blog/rethinking-attention-with-performers/?m=1'
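To make that concrete, here is a rough NumPy sketch of one common flavor of sparse attention (a fixed local window); the function name `local_attention` and the `window` parameter are just illustrative, and for clarity it still builds the full score matrix before masking, whereas a real sparse implementation would only compute the allowed blocks.

```python
# Rough sketch of local (windowed) sparse attention, using only NumPy.
# Each query attends to at most `window` neighbors on either side instead of
# all n positions, so only O(n * window) scores actually matter.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_attention(Q, K, V, window=4):
    """Sparse attention: position i only attends to positions within
    `window` steps of i. Q, K, V have shape (n, d)."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                    # dense here for clarity only
    idx = np.arange(n)
    mask = np.abs(idx[:, None] - idx[None, :]) > window
    scores[mask] = -np.inf                           # block everything outside the window
    weights = softmax(scores, axis=-1)               # rows renormalize over allowed positions
    return weights @ V

# Tiny usage example on random toy data.
rng = np.random.default_rng(0)
n, d = 16, 8
Q, K, V = rng.normal(size=(3, n, d))
out = local_attention(Q, K, V, window=4)
print(out.shape)  # (16, 8)
```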
This is one of the best explanations of attention I have seen so far. Understanding the bottleneck motivation really makes this clear right around 3:15.
you’re doing god’s work brother, thank you for the series
Thanks!
great explanation
I really like how easy you make it to understand the why of things. I think you've accomplished your goal of making it seem like I could come up with this!
Please cover multi headed self attention next! :)
I am worried that this simple approach skips important pieces of the puzzle though. Transformers do have a lot of moving parts it seems. But it seems like you're only getting started!
Thanks for this series :)
Thank you, Vivek. Absolutely love your content. Please also keep adding Math content, though. Maybe create a playlist about different functions, limits etc? Whatever suits you.
Just wow! Subscribed.
Thank you for your work! Your videos were very helpful for understanding the evolution of transformers 👍
Best explanation I have found so far, thank you
What a great video mfv I paid attention the whole time
Nice, time to boost this video in the algorithm by typing out a comment
Truly amazing explanation, thx!
it was really good. thank you
What's the name of the piano tune that appears at the beginning of the video?
Great material and presentation, thanks a lot for your work! I'd like to see a deep dive into how embeddings work. We can get embeddings from decoder-only models like GPTs, Llamas, etc., and they use some form of embeddings for their internal representations, right? But there are also encoder-only models like BERT and others (OpenAI's text-embedding models) which are actually used instead. What is the difference, and why does one work better than the other? Is it just because of compute differences, or are there some inherent differences?
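Since the question above is about how embeddings get pulled out of these models in practice, here is a minimal sketch assuming the Hugging Face `transformers` library; the checkpoint name is a placeholder, and mean pooling over the last hidden state is just one common convention, not the only way to get a sentence embedding.

```python
# Minimal sketch: sentence embedding via mean pooling of the last hidden state.
# Assumes the Hugging Face `transformers` library; the model name is a placeholder
# (an encoder like BERT here, but decoder-only checkpoints expose hidden states too).
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "bert-base-uncased"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden state over non-padding tokens."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)                  # last_hidden_state: (1, seq_len, hidden)
    hidden = outputs.last_hidden_state
    mask = inputs["attention_mask"].unsqueeze(-1)  # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

vec = embed("Attention lets each token look at every other token.")
print(vec.shape)  # torch.Size([1, 768]) for bert-base-uncased
```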
awesome
What about the meaning of the Q, K and V matrices?
nice vid
tyfs
epic
❤❤❤
If you hate others, you're really just hating yourself, because we are all one with god source
Weird, 3b1b has the same series going on now.
He works for 3b1b
Every video on planet earth explains attention with "translation", when every individual on planet earth uses ChatGPT "NOT IN TRANSLATION". We use it to CHAT ... Why use translation to explain? It is so weird ...
Thanks!