This is awesome! I was not able to hold on to my papers.
It's interesting that nobody thought of accuracy as a function of both skill and difficulty before.
So, to sum it up: "Better models will struggle less on harder test sets."
I'd call this statement "the difficulty bias". I think this work does not prove that overfitting never occurs on ImageNet. But it does show that the difficulty bias is a stronger effect than the overfitting bias. So if overfitting to the ImageNet test set does occur, it's probably not a particularly strong effect.
I agree this work doesn't *prove* overfitting doesn't happen, but this work plus a few other related works imply adaptive overfitting isn't a *huge* issue in ML.
1. papers.nips.cc/paper/9117-a-meta-analysis-of-overfitting-in-machine-learning
2. papers.nips.cc/paper/9190-model-similarity-mitigates-test-set-overuse
True, I just find it's generally not what anyone would have expected.
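To make the "accuracy as a function of skill and difficulty" framing above concrete, here is a minimal toy sketch (not the paper's fitted model; the skill and difficulty numbers are made up) using a simple probit-style formulation. Under this kind of model, the better model loses fewer raw accuracy points when the test set gets harder, which is the "difficulty bias" reading.

```python
from scipy.stats import norm

def expected_accuracy(skill, difficulty):
    # Probit-style toy model: probability of a correct top-1 prediction
    # rises with model skill and falls with test-set difficulty.
    return norm.cdf(skill - difficulty)

# Hypothetical skill/difficulty values, purely illustrative.
skills = {"weaker model": 0.5, "stronger model": 1.5}
easy, hard = 0.0, 0.4   # original test set vs. a harder new test set

for name, s in skills.items():
    drop = expected_accuracy(s, easy) - expected_accuracy(s, hard)
    print(f"{name}: accuracy drop on the harder set = {drop:.3f}")
# The stronger model shows the smaller drop, i.e. it "struggles less".
```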
Strange plots in Fig. 1 @ 5:00: Why did they not use the same axis scaling for new and original accuracy? The X/Y ranges are so similar that a non-skewed projection would have been no problem at all.
This was simply done for aesthetic reasons. Using the same axis scaling produces a lot of white space.
@Vaishaal Surely, can't have that in a 72-page document!
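For anyone who wants to see the trade-off, here is a quick matplotlib sketch (the accuracy values are placeholders, not numbers from the paper) comparing equal axis limits with independent ones for this kind of original-vs-new accuracy scatter plot.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder accuracies, roughly in the range of such plots; not from the paper.
orig_acc = np.array([63, 68, 72, 76, 79, 83])   # original test accuracy (%)
new_acc  = np.array([51, 56, 61, 65, 69, 74])   # new test set accuracy (%)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Left: shared axis limits -- directly comparable, but leaves white space
# because the new accuracies sit uniformly below the originals.
ax1.scatter(orig_acc, new_acc)
ax1.plot([40, 90], [40, 90], linestyle="--")     # y = x reference line
ax1.set_xlim(40, 90); ax1.set_ylim(40, 90)
ax1.set_xlabel("original accuracy (%)"); ax1.set_ylabel("new accuracy (%)")
ax1.set_title("equal axis limits")

# Right: independent limits, as in the figure -- tighter, but the slope of
# the relationship is visually distorted.
ax2.scatter(orig_acc, new_acc)
ax2.set_xlabel("original accuracy (%)"); ax2.set_ylabel("new accuracy (%)")
ax2.set_title("independent axis limits")

plt.tight_layout()
plt.show()
```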
Wow, thanks for putting it so succinctly, saved me so much time.
Shouldn't test sets v1 and v2 be indistinguishable?
I wonder if a third set produced by 50/50 randomly selecting instances from each set would fall half-way between the 2 linear relations.
Yes, but that is obvious. Split the data points in the top-1 error into two sums (one per dataset) and you see that you are just averaging the two error rates!
Yes this is exactly what would happen.
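To make the averaging argument concrete, a tiny sketch (with hypothetical per-image results and made-up error rates) showing that a 50/50 mix of the two test sets just gives the mean of the two top-1 error rates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-image correctness for one model on each test set
# (True = correct top-1 prediction); the error rates are made up.
v1 = rng.random(10_000) < 0.76   # ~24% top-1 error on the original set
v2 = rng.random(10_000) < 0.64   # ~36% top-1 error on the new set

# Build a mixed set by drawing the same number of images from each set.
n = 5_000
mixed = np.concatenate([v1[:n], v2[:n]])

err = lambda correct: 1.0 - correct.mean()
print(err(v1), err(v2), err(mixed))
# The mixed error equals (err_v1 + err_v2) / 2 up to sampling noise, because
# the top-1 error splits into two equally weighted sums, one per source set.
```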
So can one say that transfer learning is here to stay, or is overfitting to the ImageNet dataset still a possibility?
Probably we're still not overfitting
Hi, loved this content. But at least half of the base architecture resembles Tacotron 2. Could you please make a detailed video on the Tacotron 2 architecture? Thanks in advance.
The super-holdout seems like a good idea, if it isn't too costly. I hope people start doing that.
Great summary. Wouldn't an easy and revealing experiment here be training a binary classifier to discriminate between the old and new test sets?
They are doing this in the paper appendix, they reach about 53% accuracy or so.
@YannicKilcher That's just guessing, at that point.
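For reference, that appendix experiment is essentially a linear probe over the pooled images. Here is a hedged sketch with scikit-learn, assuming precomputed image features; the file names and the use of features from a pretrained network are my assumptions, not the paper's exact setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical precomputed features (e.g. from a pretrained CNN) for images
# from the original test set and the new v2 test set. Paths are placeholders.
feats_v1 = np.load("features_imagenet_v1.npy")   # shape (n1, d)
feats_v2 = np.load("features_imagenet_v2.npy")   # shape (n2, d)

X = np.concatenate([feats_v1, feats_v2])
y = np.concatenate([np.zeros(len(feats_v1)), np.ones(len(feats_v2))])

# Train a linear classifier to tell the two test sets apart; accuracy near
# 50% means they are hard to distinguish (the paper reports roughly 53%).
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5)
print("discriminator accuracy:", scores.mean())
```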
Is there anything stopping cheating researchers from training on the (original) test set itself, to get more clout for models that perform well? I mean, even a new test set like this would not reveal such cheating, if the underlying model is at least decent, because the cheater basically had a bigger dataset to work with, which should lead to better generalization to the V2 test set.
Awesome.. Thank you!!!
They should've "calibrated it" (by throwing away images) on some older models to make sure the scores match FIRST, and only THEN done their comparison! Not an expert, but I can't see why this wasn't done.
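A naive version of the suggested calibration could look like the sketch below. This is only an illustration of the comment's idea, not anything the paper does: the reference model, its per-image results, and the accuracy numbers are all hypothetical.

```python
import numpy as np

def calibrate_by_dropping(correct_v2: np.ndarray, target_acc: float) -> np.ndarray:
    """Throw away images the reference model gets wrong on the new test set
    until its accuracy on the remaining images matches its original-test
    accuracy. Returns a boolean mask of images to keep."""
    keep = np.ones(len(correct_v2), dtype=bool)
    wrong_idx = np.flatnonzero(~correct_v2)
    np.random.shuffle(wrong_idx)
    for i in wrong_idx:
        if correct_v2[keep].mean() >= target_acc:
            break
        keep[i] = False          # drop one image the reference model missed
    return keep

# Hypothetical usage: a reference model with 76% original-test accuracy and
# made-up per-image correctness on the new set (~64% accuracy).
correct_v2 = np.random.default_rng(0).random(10_000) < 0.64
mask = calibrate_by_dropping(correct_v2, target_acc=0.76)
print("kept", mask.sum(), "images; calibrated accuracy:", correct_v2[mask].mean())
```

One obvious caveat with this scheme: it only removes images the chosen reference model fails on, so the calibrated set ends up tilted toward that model's strengths.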