Thanks for this clear and easy explanation of Quantization in NNs.
glad you liked it 😊
Does the quantisation (the change of number format) apply only to the outputs of the activation functions, or also to the individual weights? Where in the NN do we apply this quantisation?
A good, comprehensive video... good work. I liked the related links you added in the description. A tiny recommendation: please increase your speaking speed, as there were many seconds of silence between topics. Thank you, and looking forward to more content.
Thank you so much 😊
I just have a question about quantisation in TensorFlow. For a project of mine I used the QKeras library for QAT, and the weights I got in the end were long numbers with many digits (for example 0.235215266523415e-2). In the quantization config I used int8, and such a number is not representable in int8 format.
Does the training still happen in fp32, with the quantisation treated as noise?
Also, what do I do to get the weights to be representable in int8 format?
How do I test the accuracy of the weight-quantised model?
Hey, I almost always use PyTorch, so I'll try my best to help you with TF.
It's normal for the weights to be long numbers like you said if they are stored in fp32. If a number cannot be quantized into int8, training could collapse, as the number could be rounded off to an extreme like 0. If too many numbers get rounded off like this, the gradients will collapse and training will hit a wall.
To test, you can run inference the same way as you do on your eval or test set. What is stopping you from doing that?
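If it helps, here is a minimal PyTorch sketch (I mostly use PyTorch, so treat the TF translation as an exercise). It assumes you already have a trained `model` and a `test_loader`; both names are placeholders. Dynamic quantization stores the Linear weights as int8, and you then score the quantized model exactly like the fp32 one:

import torch
import torch.nn as nn

def evaluate(model, loader):
    # Standard eval loop: top-1 accuracy on a classification test set.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for inputs, labels in loader:
            preds = model(inputs).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total

# Weights stored as int8; activations are quantized on the fly at inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print("fp32 accuracy:", evaluate(model, test_loader))
print("int8 accuracy:", evaluate(quantized, test_loader))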
Thank you for the informative content. Is it possible to combine pruning and quantization while maintaining accuracy?
Fantastic question. I feel it should be possible, as they do two different things to the weights, though I have never tried both together.
First prune and get rid of the unnecessary params, then quantize what is left :)
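Something like this rough PyTorch sketch of that order (the toy model and the 50% pruning amount are just placeholders, not a tuned recipe):

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# 1) Prune: zero out the 50% smallest-magnitude weights in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the zeros into the weight tensor

# 2) Quantize what is left: store the surviving weights as int8.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)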
Thank you, it was a really clean and clear explanation, and in such a short time too. 👏
oh thanks for the encouraging words. Helps me keep going :)
very well explained.
thank you Vishal! :)
Very good explanation. Please make a video on how to calibrate the data and compute the scaling factor and zero point by analysing the weight distribution of each layer for int8 quantization in TensorFlow/TensorRT, and also on the role of fake quantizers during backpropagation.
Thanks for the nice suggestion. There are full courses on quantization these days, so I wasn't able to cover everything or deep dive into the specifics :)
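The core idea is small enough to sketch, though. During calibration you observe each layer's value range and derive a scale and zero point from it; here is a framework-agnostic numpy illustration using plain min/max (real TF/TensorRT calibration works per layer and often uses smarter range estimation, so treat this as a toy under those assumptions):

import numpy as np

def int8_qparams(x, qmin=-128, qmax=127):
    # Asymmetric quantization: map [xmin, xmax] onto [qmin, qmax].
    xmin, xmax = float(x.min()), float(x.max())
    xmin, xmax = min(xmin, 0.0), max(xmax, 0.0)  # the range must contain 0
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = int(round(qmin - xmin / scale))
    return scale, zero_point

weights = np.random.randn(256, 128).astype(np.float32)  # stand-in for one layer
scale, zp = int8_qparams(weights)
q = np.clip(np.round(weights / scale) + zp, -128, 127).astype(np.int8)
deq = (q.astype(np.float32) - zp) * scale  # dequantize to inspect the error
print("scale:", scale, "zero point:", zp, "max abs error:", np.abs(weights - deq).max())

As for fake quantizers: during QAT the forward pass simulates this round/clip step, while the backward pass typically lets gradients flow straight through (the straight-through estimator), so the network learns weights that survive quantization.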
Just started watching, but... a signed byte is -128..127, not -127..127. Google 2's complement to see why.
Sorry, that's an embarrassing error, and good spot. Thanks a lot! Will keep it in mind for next time.
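For anyone double-checking: in two's complement an n-bit signed integer covers [-2^(n-1), 2^(n-1) - 1], so int8 is indeed -128..127.

import numpy as np
info = np.iinfo(np.int8)  # two's complement signed 8-bit
print(info.min, info.max)  # -128 127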
Thanks!
Thanks so much for the monetary reward! Very encouraging 🙂