VQ-VAE | Everything you need to know about it | Explanation and Implementation
- Published 30 May 2024
- In this video I go over the Vector Quantised Variational Auto-Encoder (VQVAE).
Specifically, I talk about how it differs from the VAE, its theory, and its implementation.
I also train a VQVAE model and show what the generation process looks like once we have a trained VQVAE model.
Timestamps
--------------------
00:00 Intro
00:27 Need and difference from VAE
02:43 VQVAE Components
04:31 Argmin and Straight Through Gradient Estimation
07:11 Codebook and Commitment Loss
08:45 KL Divergence Loss
09:30 Implementation
12:35 Visualization
14:29 Generating Images
16:37 Outro
Paper Link - tinyurl.com/exai-vqvae-paper
Subscribe - tinyurl.com/exai-channel-link
Inspiration for the embedding visualization taken from - • [VQ-VAE] Neural Discre...
Github Repo Link - tinyurl.com/exai-vqvae-repo (Will be updated soon)
Background Track Fruits of Life by Jimena Contreras
Email - explainingai.official@gmail.com
Github Code - github.com/explainingai-code/VQVAE-Pytorch
Thank you, the visuals really helped me understand, especially the backpropagation part!
Happy that the video was some help to you :)
Great explanation, covering theory and implementation. Nice visualisations. Thanks!
Thank You!
Brilliant video, thanks so much!
You're very welcome!
Very clear explanation! Thanks for the implementation + the visualisation of the codebook!
Thank you
Dude your videos are absolutely amazing! Thank you!!!
Thanks a lot 🙂
Thanks for the explanation, this is very helpful!!!
Thank you! I'm glad it ended up helping in any way.
Amazing explanation with easy-to-follow animations.
Thank you
Thank you for providing insightful perspectives on this topic.
I appreciate your unique perspective and the effort you've put into providing valuable information, rather than simply copying from the paper. Keep up the great work!
Thank you for saying that!
How did you code the visualization?
Thank you for the tutorial. This is by far the best on UA-cam. Keep it up please.
Thank you!
The visualization is not something I came up with myself; I saw it in a different video (link in description) and thought it would be much better to explain with that kind of visualization.
This is roughly how I implemented it (see the sketch below):
-> Reduce the latent dimension to 2 and the codebook size to 3.
-> Bound the VQVAE encoder outputs to a range using some activation at the final layer of the encoder, say -1 to 1 or 0 to 1.
-> Map both dimensions to a color's intensity value. So maybe the x axis is the green component (0-1 mapped to 0-255), the y axis is the red component, and blue is always 255. Then color each point as (R, G, B) -> (encoded_dimension_1_value*255, encoded_dimension_2_value*255, 255).
-> Train the VQVAE, then get the codebook vectors of the trained model and the encoder outputs for an image.
-> The points on the plot are the encoder outputs for each cell of the encoder output feature map, and e1, e2, e3 are the codebook vectors.
-> Generate the quantized image using this mapping.
I hope this gives some clarity on the implementation part.
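For reference, a minimal sketch of that plotting step in PyTorch + matplotlib, assuming an (N, 2) tensor of encoder outputs bounded to 0-1 and a (3, 2) codebook tensor; the function name and shapes here are hypothetical, not the exact code from the video.

import torch
import matplotlib.pyplot as plt

def plot_latents_and_codebook(encoder_outputs, codebook):
    # encoder_outputs: (N, 2) tensor, one row per cell of the encoder
    # output feature map, with both dimensions bounded to [0, 1]
    # codebook: (3, 2) tensor holding the codebook vectors e1, e2, e3
    pts = encoder_outputs.detach().cpu().numpy()
    cb = codebook.detach().cpu().numpy()
    # Color each point as (R, G, B) = (dim1, dim2, 1.0), matplotlib's
    # 0-1 equivalent of (dim1 * 255, dim2 * 255, 255)
    colors = [(x, y, 1.0) for x, y in pts]
    plt.scatter(pts[:, 0], pts[:, 1], c=colors, s=15)
    # Mark and label the codebook vectors
    plt.scatter(cb[:, 0], cb[:, 1], c="black", marker="x", s=80)
    for i, (x, y) in enumerate(cb):
        plt.annotate(f"e{i + 1}", (x, y))
    plt.xlabel("latent dim 1")
    plt.ylabel("latent dim 2")
    plt.show()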
Really nice. Thanks for posting. At 9:52, why are you using nn.Embedding instead of nn.Parameter(torch.randn((3, 2)))? I don't understand where the Embedding comes from.
Thank you! Actually they are both the same: nn.Embedding just uses nn.Parameter with normal initialization under the hood.
github.com/pytorch/pytorch/blob/d947b9d50011ebd75db2e90d86644a19c4fe6234/torch/nn/modules/sparse.py#L143
So nn.Embedding just creates a wrapper, in the form of a lookup table for embeddings of a fixed dictionary size, on top of nn.Parameter. Hope it helps.
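A quick toy check (hypothetical, just to illustrate the equivalence):

import torch
import torch.nn as nn

# Both draw their initial values from a standard normal, so with the
# same seed they start out identical.
torch.manual_seed(0)
emb = nn.Embedding(3, 2)             # 3 codebook vectors of dimension 2
torch.manual_seed(0)
raw = nn.Parameter(torch.randn(3, 2))
print(torch.equal(emb.weight, raw))  # True

# The "lookup table" part is just indexing into the parameter
idx = torch.tensor([0, 2])
print(torch.equal(emb(idx), raw[idx]))  # True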
Thanks a lot for your video
Can you explain in more detail the line:
quant_out = quant_input + (quant_out - quant_input).detach()
Why not just
quant_out = quant_out.detach()
Hello @linhnhut2134, what we want is for the gradients at quant_out to be used as if they were also the gradients for quant_input, kind of like copy-pasting gradients. So in the forward pass we want quant_out = quant_out, but in the backward pass we want quant_out = quant_input. The operation "quant_out = quant_input + (quant_out - quant_input).detach()" lets us achieve exactly that distinction between the forward and backward passes. With quant_out.detach() alone, there would be no path back to quant_input, so the encoder would receive no gradient at all.
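A toy example (with made-up values) that shows the difference:

import torch

quant_input = torch.tensor([1.0, 2.0], requires_grad=True)  # encoder output
quant_out = torch.tensor([1.5, 1.5])  # stand-in for the nearest codebook vector

# Forward pass: quant_input cancels out, so the value is exactly quant_out
st = quant_input + (quant_out - quant_input).detach()
print(st)  # tensor([1.5000, 1.5000], grad_fn=<AddBackward0>)

# Backward pass: gradients flow to quant_input as if st were quant_input
st.sum().backward()
print(quant_input.grad)  # tensor([1., 1.])

# With quant_out = quant_out.detach() alone, quant_input.grad would stay None.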
link to code please
Hi @jakula8643, as a result of working on the implementation for the next video, I ended up modifying the VQVAE code and making it a bit messy. I will clean it up and push it here: github.com/explainingai-code/VQVAE-Pytorch in a couple of days' time. Apologies for missing this; I will let you know as soon as I do.
Code is now pushed to the repo mentioned above
This video fuckin' rips.