How Medusa Works

Oxen

1,071 views


Published 14 Jul 2024
This week we cover "Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads", a method that adds extra decoding heads to an LLM so it can predict several subsequent tokens in parallel, then checks the resulting candidates using a tree-based attention mechanism.
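To make the idea of "predict several tokens, then verify" concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: `base_next_token`, `make_head`, `medusa_step` and `generate` are hypothetical names, the "model" is a trivial counting rule, and greedy exact-match acceptance is a simplification chosen for this sketch. In a real system the verification loop would be a single batched forward pass rather than repeated calls.

```python
from typing import Callable, List, Sequence, Tuple

Token = int
Head = Callable[[Sequence[Token]], Token]


def base_next_token(context: Sequence[Token]) -> Token:
    """Hypothetical stand-in for the base LLM's greedy next-token choice."""
    return (context[-1] + 1) % 10  # toy rule: count upward modulo 10


def make_head(offset: int, wrong_on: int = -1) -> Head:
    """Hypothetical extra head guessing the token `offset` steps past the next one.

    If the last context token equals `wrong_on`, the head guesses wrongly,
    which lets us exercise the rejection path.
    """
    def head(context: Sequence[Token]) -> Token:
        guess = (context[-1] + 1 + offset) % 10
        return (guess + 1) % 10 if context[-1] == wrong_on else guess
    return head


def medusa_step(context: Sequence[Token], heads: List[Head]) -> List[Token]:
    """Propose one candidate continuation and keep its longest verified prefix."""
    # The base model's own prediction is always accepted.
    first = base_next_token(context)
    # The extra heads all read the same context, so they can run in parallel.
    candidate = [first] + [h(context) for h in heads]
    accepted = [first]
    for guess in candidate[1:]:
        # Accept a guess only if the base model would have produced it too.
        if base_next_token(list(context) + accepted) != guess:
            break
        accepted.append(guess)
    return accepted


def generate(prompt: Sequence[Token], heads: List[Head], n_tokens: int) -> Tuple[List[Token], int]:
    """Generate `n_tokens` tokens and report how many decoding steps were needed."""
    out = list(prompt)
    steps = 0
    while len(out) - len(prompt) < n_tokens:
        out.extend(medusa_step(out, heads))
        steps += 1
    return out[: len(prompt) + n_tokens], steps
```

With two accurate heads, `generate([0], [make_head(1), make_head(2)], 6)` needs two steps instead of six, because each step accepts three tokens. When a head guesses wrongly, the step still makes progress with the tokens verified before the mismatch.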
--
Get Oxen 🐂 oxen.ai/
Oxen.ai makes versioning your datasets as easy as versioning your code! Even with millions of unstructured images, we quickly handle any type of data so you can build cutting-edge AI.
--
Medusa 📜 arc.net/l/quote/vvqinsvi
Medusa Notes 📜 www.oxen.ai/blog/arxiv-dives-...
Join Arxiv Dives 🤿 oxen.ai/community
Discord 🗿 / discord
--
Chapters
0:00 Introducing Daniel Varoli from Zapata.ai
2:00 The Problem with LLMs Today
3:45 How We Can Solve These Problems
8:30 Normal vs. Speculative Architecture
14:24 Speculative Decoding Example
15:35 Introducing Medusa
16:53 Medusa’s Decoding Heads
17:32 Generating Tokens With Medusa Heads
22:30 Verifying Candidates With Medusa
24:15 What if We Mess Up?
25:09 Rejection Sampling for Accepting Candidates
29:11 Considering Many Completion Candidates at Once
31:56 Tree Attention Diagrams
40:00 How to Integrate Medusa Into an LLM
48:10 Results
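
As a rough illustration of the tree-based attention mentioned in the description (and not a reproduction of the diagrams in the video), the sketch below builds an attention mask for a tree of candidate tokens. Each candidate token may attend only to itself and its ancestors, so several candidate continuations can share one forward pass without seeing each other. The function name `tree_attention_mask` and the parent-array encoding are assumptions made for this example.

```python
from typing import List


def tree_attention_mask(parents: List[int]) -> List[List[bool]]:
    """Build a boolean mask where mask[i][j] is True if node i may attend to node j.

    `parents[i]` is the index of node i's parent in the candidate tree,
    or -1 for the root. A node attends to itself and to every ancestor.
    """
    n = len(parents)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        j = i
        while j != -1:  # walk up from node i to the root
            mask[i][j] = True
            j = parents[j]
    return mask


# Toy tree: node 0 is the base model's token, nodes 1-2 are two guesses
# from the first extra head, nodes 3-6 are two guesses from the second
# extra head under each of those.
parents = [-1, 0, 0, 1, 1, 2, 2]
mask = tree_attention_mask(parents)
```

Here node 3 attends to nodes 0, 1 and 3 only, so the path 0 → 1 → 3 is scored as if it were an ordinary sequence, while the sibling branch through node 2 stays invisible to it.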
Science & Technology

COMMENTS • 1

@420_gunna 3 days ago
Great job, Daniel! Thanks for linking to that Reddit comment.
