References:
Taming Transformers for High-Resolution Image Synthesis, Esser et al., 2020
►Project link with paper and results: compvis.github.io/taming-transformers/
►Code: github.com/CompVis/taming-transformers
►Colab demo to start sampling right away: colab.research.google.com/github/CompVis/taming-transformers/blob/master/scripts/taming-transformers.ipynb
Wow, really neat 😁
Thank you for all this work
Thank you so much! It is a pleasure to make these videos! :)
Perfect!
Thank you for these insights!
It is my pleasure!
I prefer to use transforms to make neural nets. The fast Hadamard transform can be viewed as a fixed collection of dot products. Then you can swap around what is adjusted in a net: keep the dot products fixed and make the activation functions adjustable (parametric), e.g. f_i(x) = a_i*x for x < 0 and b_i*x for x >= 0, for i = 0 to n-1.
The advantage is that the number of operations per layer falls to n*log2(n) add/subtracts and n multiplies, and the number of parameters falls to 2n, where n is the width of the net. How can that work? Each dot product is a statistical summary measure and a filter looking back at the entire prior layer of neurons, capable of modulating its response to whatever it sees. And that works fine: fast-transform, fixed-filter-bank neural networks.
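For anyone curious what such a layer might look like, here is a minimal NumPy sketch of the idea. The names fwht and FastTransformLayer are illustrative, and the two-slope activation is my reading of the "2n parameters" claim, so treat this as a sketch under those assumptions rather than the commenter's actual code:

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform of a length-2^k vector.
    Uses n*log2(n) additions/subtractions and no multiplies (before the final scaling)."""
    x = x.copy()
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x / np.sqrt(n)  # orthonormal scaling

class FastTransformLayer:
    """Fixed 'filter bank' of dot products (the rows of the Hadamard matrix),
    followed by per-unit two-slope activations f_i(x) = a_i*x (x < 0) or b_i*x (x >= 0).
    Only the 2n slopes would be trained; the transform itself stays fixed."""
    def __init__(self, n, rng):
        self.a = 1.0 + 0.1 * rng.standard_normal(n)  # slopes for negative inputs
        self.b = 1.0 + 0.1 * rng.standard_normal(n)  # slopes for non-negative inputs

    def forward(self, x):
        t = fwht(x)                                   # fixed dot products over the whole layer
        return np.where(t < 0.0, self.a * t, self.b * t)  # per-unit slope: a_i if negative, b_i otherwise

# Tiny usage example: stack three such layers on a width-16 input
rng = np.random.default_rng(0)
x = rng.standard_normal(16)
for layer in [FastTransformLayer(16, rng) for _ in range(3)]:
    x = layer.forward(x)
print(x.shape)  # (16,)
```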
Great breakdown!
Thank you!
Have you downloaded the dataset?
Bro, I really don't understand what the function of the transformer is in this architecture. Is it only there because it can predict information that is missing from the images, having learned the distribution of the latent space from other images during training?
Exactly! It helps to 'control' the latent space using a "codebook". In fact, since a high-resolution image is too big to be fed directly to the transformer architecture, they need to attack this problem from another angle. In this case, they encode the information using CNNs and save these encoded parameters in a codebook instead of the pixels themselves.
These encodings are then used in the transformer to change the style of the image, mainly to help control the overall image realism, while the CNN encodings ensure the local realism.
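To make that two-stage idea concrete, here is a rough PyTorch sketch. The module names, layer sizes, and shapes (ToyVQEncoder, ToyCodeTransformer, a 16x16 grid of codes) are toy stand-ins of mine, not the actual taming-transformers code: a CNN encoder turns the image into a small grid of discrete codebook indices, and a causal transformer then models the distribution over those indices, which is what lets it predict missing or conditioned parts.

```python
import torch
import torch.nn as nn

class ToyVQEncoder(nn.Module):
    """CNN encoder + nearest-neighbour codebook lookup: image -> grid of code indices."""
    def __init__(self, codebook_size=512, dim=64):
        super().__init__()
        self.cnn = nn.Sequential(                    # 256x256 image -> 16x16 feature grid
            nn.Conv2d(3, dim, kernel_size=4, stride=4), nn.ReLU(),
            nn.Conv2d(dim, dim, kernel_size=4, stride=4),
        )
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, img):
        z = self.cnn(img)                            # (B, dim, 16, 16)
        z = z.flatten(2).transpose(1, 2)             # (B, 256, dim)
        book = self.codebook.weight.unsqueeze(0).expand(z.size(0), -1, -1)
        return torch.cdist(z, book).argmin(-1)       # (B, 256) discrete code indices

class ToyCodeTransformer(nn.Module):
    """Causal transformer over code indices: predicts each code from the previous
    ones, which is what allows sampling or completing missing regions."""
    def __init__(self, codebook_size=512, dim=64):
        super().__init__()
        self.embed = nn.Embedding(codebook_size, dim)
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=2)
        self.head = nn.Linear(dim, codebook_size)

    def forward(self, codes):
        sz = codes.size(1)
        causal = torch.triu(torch.full((sz, sz), float("-inf")), diagonal=1)
        h = self.blocks(self.embed(codes), mask=causal)
        return self.head(h)                          # (B, 256, codebook_size) logits

# Stage 1 compresses pixels into codes, stage 2 models the code distribution.
img = torch.randn(1, 3, 256, 256)
codes = ToyVQEncoder()(img)
logits = ToyCodeTransformer()(codes)
print(codes.shape, logits.shape)                     # (1, 256) (1, 256, 512)
```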
@WhatsAI Can you please make a video covering the code as well?
I mean, the transformer architecture on top of the CNNs, shown in code.