8-bit Methods for Efficient Deep Learning -- Tim Dettmers (University of Washington)

Center for Language & Speech Processing (CLSP), JHU

Views: 1,705

Published May 21, 2024
Title: 8-bit Methods for Efficient Deep Learning
Abstract: Large language models are effective tools for many tasks, but their size makes training and inference difficult. Moving from 32-bit to 16-bit models brought considerable efficiency gains that made training and inference of large models easier. Can we train and run inference in 8-bit to make further gains? In this talk, I will show that 8-bit inference and training can be used without degrading performance while improving efficiency. To make 8-bit methods work, it is essential to understand how quantization precision affects model performance and training stability as we scale the model size. I will talk about how these factors change with scale and how we need to adjust 8-bit methods to make them work. In particular, I will speak about 8-bit optimizers for training and Int8 inference for large language models with up to 175B parameters. These methods make training and inference more efficient and make large models more accessible to researchers.
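
To make the idea of quantization precision concrete, here is a minimal sketch of symmetric absmax quantization to 8-bit integer codes. It is an illustration only, not the method presented in the talk: the function names `quantize_absmax` and `dequantize` and the sample `weights` are hypothetical, and the sketch uses one shared scale for the whole list, which is an assumption made for brevity.

```python
def quantize_absmax(values):
    """Map floats to integer codes in [-127, 127] using one shared scale.

    Illustrative assumption: a single scale for the whole list.
    """
    absmax = max(abs(v) for v in values)
    if absmax == 0.0:
        # All zeros: any scale works, so pick 1.0 to avoid dividing by zero.
        return [0] * len(values), 1.0
    scale = absmax / 127.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights; the largest magnitude (1.27) sets the scale.
weights = [0.02, -0.5, 0.31, 1.27, -0.004]
codes, scale = quantize_absmax(weights)
restored = dequantize(codes, scale)

# The rounding error per value is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

In this sketch a single large value stretches the scale, so small values such as -0.004 collapse to code 0. That loss of resolution is one way that precision can affect model quality.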
