JEPA Architectures - How neural networks learn abstract concepts about images (IJEPA)

  • Published 26 Oct 2024

COMMENTS • 6

  • @ethansmith7608
    @ethansmith7608 1 year ago +4

    There's something quite interesting about predicting embeddings from embeddings: it feels like you give the model an extra degree of freedom in designing its representation space, rather than training it on reconstruction and hoping that a nice representation space emerges indirectly. Both CLIP and the strategy used for the DALLE prior also sort of learn to base their predictions on the positions of other embeddings, and their continued success makes me think this is a promising area of research (see the sketch after this thread).

  • @gyahoo
    @gyahoo 2 months ago +1

    Great explanation ❤

  • @josephsueke
    @josephsueke 7 months ago +2

    nicely explained!

  • @ControllerQuickSwaps
    @ControllerQuickSwaps 1 year ago +2

    If "like human's do" just means 'using latent representation' that's definitely just attention grabbing imo. Neverthless, taking prediction to latent space is definitely the right direction.

    • @avb_fj
      @avb_fj 1 year ago +2

      Yeah, I agree! I do think there's a bit of a marketing slogan involved here... but as a concept it makes a ton of sense as a research initiative.

    • @ControllerQuickSwaps
      @ControllerQuickSwaps 1 year ago

      @avb_fj I'm still trying to figure out what part of the idea is actually novel. I brought it up in a lab meeting today and people said that self-supervised loss is already often done in latent space?
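
A minimal sketch of the "predicting embeddings from embeddings" idea discussed in this thread (PyTorch; the module names, shapes, and the crude pooling are illustrative assumptions, not Meta's I-JEPA code): the loss is a regression in latent space against an EMA target encoder, rather than a pixel-space reconstruction.

import torch
import torch.nn as nn
import torch.nn.functional as F

dim = 256  # embedding width (illustrative)

# Context encoder: embeds the visible (context) patches; trained by backprop.
context_encoder = nn.Sequential(nn.Linear(768, dim), nn.GELU(), nn.Linear(dim, dim))

# Target encoder: an EMA copy of the context encoder; produces the regression targets.
target_encoder = nn.Sequential(nn.Linear(768, dim), nn.GELU(), nn.Linear(dim, dim))
target_encoder.load_state_dict(context_encoder.state_dict())
for p in target_encoder.parameters():
    p.requires_grad_(False)

# Predictor: maps context embeddings to predicted target embeddings.
predictor = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

def jepa_loss(context_patches, target_patches):
    """Regress the embeddings of masked target patches from the context embeddings."""
    ctx = context_encoder(context_patches)            # (B, N_ctx, dim)
    with torch.no_grad():
        tgt = target_encoder(target_patches)          # (B, N_tgt, dim), no gradient
    pred = predictor(ctx.mean(dim=1, keepdim=True))   # (B, 1, dim); crude pooling for brevity
    return F.mse_loss(pred.expand_as(tgt), tgt)       # loss lives in latent space, not pixels

@torch.no_grad()
def ema_update(momentum: float = 0.996):
    """Keep the target encoder an exponential moving average of the context encoder."""
    for p_t, p_c in zip(target_encoder.parameters(), context_encoder.parameters()):
        p_t.mul_(momentum).add_(p_c, alpha=1.0 - momentum)

# Toy usage: 16x16x3 patches flattened to 768-dim vectors.
context_patches = torch.randn(4, 60, 768)  # visible patches
target_patches = torch.randn(4, 4, 768)    # masked patches whose embeddings are predicted
loss = jepa_loss(context_patches, target_patches)
loss.backward()
ema_update()

The actual I-JEPA uses Vision Transformers, block masking, and per-patch prediction targets; the sketch only illustrates that the regression target lives in representation space, which is also what "self-supervised loss in latent space" refers to in the last reply.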