Nice paper about regularization.
Such an elegant solution for disentangling the manifolds in the hidden states. Most of the networks I have seen basically learn only in the last layers, while the backbone just extracts kind of random features.
So many papers in rapid succession. This guy is on fire!
\m/
or I'm just procrastinating on doing the dishes :p
It's 9 months later and based on the rate of new videos I'm starting to worry you'll never get around to those dishes
@@valthorhalldorsson9300 Sooner than he gets to those dishes, a robot arm will be doing them.
If the bottleneck layer makes the data linearly separable, it may as well just be the last hidden layer. In that case this seems to be a technique for making the last hidden representation not just linearly separable but well spaced. And I think it would push the softmax inputs toward a region where the softmax is approximately linear.
This is so coooool. It’s like saying here’s a cat, here’s a dog, here’s a mix of both.
It's exactly that. Basically it's the extension of MixUp data augmentation to the whole NN. Each layer has an input and an output, and each layer individually learns the best representation. Now we treat the latent representations from the previous layer (e.g. cat, dog) as our input and smooth those accordingly.
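To make that concrete, here is a rough sketch of one Manifold Mixup training step (my own toy code, not the paper's implementation; the `model.blocks` / `model.classifier` split is just an assumption): pick a random layer, mix the hidden states of randomly paired examples with a Beta-sampled lambda, and mix the one-hot labels with the same lambda.

```python
import numpy as np
import torch
import torch.nn.functional as F

def manifold_mixup_step(model, x, y, num_classes, alpha=2.0):
    """Compute the Manifold Mixup loss for one batch.

    Assumes a hypothetical `model` exposing `model.blocks` (a list of layers)
    and `model.classifier` (the final linear head).
    """
    k = np.random.randint(len(model.blocks))   # which representation to mix (0 = raw input, i.e. plain MixUp)
    lam = float(np.random.beta(alpha, alpha))  # mixing coefficient
    perm = torch.randperm(x.size(0))           # random pairing of examples within the batch

    h = x
    for i, block in enumerate(model.blocks):
        if i == k:                             # mix the hidden states entering this block
            h = lam * h + (1 - lam) * h[perm]
        h = block(h)

    logits = model.classifier(h)
    y_soft = F.one_hot(y, num_classes).float()
    y_soft = lam * y_soft + (1 - lam) * y_soft[perm]               # mix the labels with the same lambda
    return -(y_soft * F.log_softmax(logits, dim=1)).sum(1).mean()  # soft-label cross-entropy
```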
I agree with your point that not every layer (especially the lower layers) will or should be linearly separable.
However, I think the objective of Manifold Mixup is to act more as a regularization penalty: a given layer should be non-linearly separable only insofar as the benefits (to accuracy) outweigh the mixup penalty. The mixup adds a bias towards linearity, but not a strict requirement.
Like all regularization methods, there will probably have to be a lot more fine-tuning and testing before we know if, when, and how it gives the right bias-variance trade-off.
Thanks! Your paper explanation is really awesome!!!
Great explanation, thanks!
Wow! Super interesting paper and great insights.
Great video, thanks a million!
amazing technique!
I like the video, but it's at 256 likes right now so I can't disturb the balance, sorry!
Now you can push towards 512 😁
As I understand it, this technique is also good for NN pruning.
Can you elaborate?
Thank you! :)
Nice
This video is another great Colab candidate: colab.research.google.com/drive/1qUDe3ENm3fnxND7iibyEF1Ixcw7nu4mK . Thanks again Yannic! Your video inspired me to create a Colab IPython notebook that tests out this architecture. I love the concept! It was a pain to implement using TensorFlow Keras layers, but it does appear to help. I also decided that instead of just comparing it to a vanilla classifier, we could compare it to the "worst" classifier from your other video about "Focusing on the Biggest Losers". Have a great weekend!
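For anyone who wants to try it without opening the notebook, here is a bare-bones Keras-style sketch of the idea (my own toy code, not the linked Colab; the `encoder`/`head` split and the layer sizes are just placeholders): mix the hidden features and the one-hot labels with the same Beta-sampled lambda inside a custom train step.

```python
import numpy as np
import tensorflow as tf

# Hypothetical split of the network: `encoder` up to the layer we mix at, `head` for the rest.
encoder = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
])
head = tf.keras.layers.Dense(10)  # logits for 10 classes

optimizer = tf.keras.optimizers.Adam(1e-3)
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

def train_step(x, y_onehot, alpha=2.0):
    lam = float(np.random.beta(alpha, alpha))               # mixing coefficient
    perm = tf.random.shuffle(tf.range(tf.shape(x)[0]))      # random pairing within the batch
    with tf.GradientTape() as tape:
        h = encoder(x, training=True)
        h_mix = lam * h + (1 - lam) * tf.gather(h, perm)                # mix hidden features
        y_mix = lam * y_onehot + (1 - lam) * tf.gather(y_onehot, perm)  # mix labels the same way
        loss = loss_fn(y_mix, head(h_mix))
    variables = encoder.trainable_variables + head.trainable_variables
    optimizer.apply_gradients(zip(tape.gradient(loss, variables), variables))
    return loss
```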
So many new hyperparameters...
I am not an expert, and I have not read the paper carefully, but this method seems more like a fancy data augmentation method than a regularization method.
Also, there is something to be said about the spiral example: I personally think that batch norm does a very good job there. It only looks "not good enough" because we humans are biased; we "know" the true representation from experience and by guessing the intentions of whoever made the dataset :)
Good point. I would say it's somewhere in between. You sort of create new 'averaged' samples to teach the model to be 'unsure' sometimes, and this way the model converges to a more stable representation.
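As a toy illustration of those 'averaged' samples (made-up numbers; the shapes and lambda are just placeholders): blending a cat and a dog image with lambda = 0.7 gives a mixed input and a soft [0.7, 0.3] target.

```python
import numpy as np

lam = 0.7                                  # toy mixing coefficient
x_cat = np.random.rand(32, 32, 3)          # stand-in "cat" image
x_dog = np.random.rand(32, 32, 3)          # stand-in "dog" image
y_cat = np.array([1.0, 0.0])               # one-hot target: [cat, dog]
y_dog = np.array([0.0, 1.0])

x_mix = lam * x_cat + (1 - lam) * x_dog    # the 'averaged' sample
y_mix = lam * y_cat + (1 - lam) * y_dog    # soft target [0.7, 0.3]: the model is trained to be 'unsure'
```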
@@levikok1810 that analogy reminds me of DINO and CutMix