Lecture 7 - Deep Learning Foundations: Neural Tangent Kernels

Soheil Feizi

23,913 views

Published 21 Sep 2020
Course Webpage: www.cs.umd.edu/class/fall2020/...

COMMENTS • 29

@TheAIEpiphany 2 years ago ⁺³
Cool video thanks!
00:00:00 Intro: linear regression
00:23:55 NTKs start here
01:01:33 link between NNs and ODEs (ordinary differential equations)
@debadeepta 3 years ago ⁺¹⁷
Really nice lecture! I was looking to quickly learn NTKs before diving deep into the original papers and this really helped.
@zl7460 2 years ago
+1. Most well-explained DL lecture I've seen in a long time
@StratosFair 2 years ago ⁺²
Incredibly clear lecture; it allowed me to fill the gaps in my understanding of NTK. Thank you, professor!
@dv019 3 years ago ⁺⁶
Great video, thank you! To the student asking about Kernels: the word is overloaded. It is used in linear algebra to mean the set of all vectors mapped to 0 by a linear transformation. Sometimes Green's functions in PDEs are called integral kernels. In general a kernel is "the central or most important part of something". I don't like how overloaded the word is either, but c'est la vie.
@sikun7894 3 years ago ⁺²
Thank you so much for sharing these lectures! Really useful
@user-mm2xj2wj8w 3 years ago ⁺¹
Awesome lesson! Straight and clear!
@itachi7243456 3 years ago ⁺⁴
These are fantastic, thanks!
@joonho0 3 years ago ⁺⁴
Thanks a lot for sharing this lecture!
@yuwu7547 2 years ago
Very useful and easy-to-follow lecture. Thanks a lot!
@weisenjiang9179 3 years ago ⁺²
Great intro to NTK, it helped me a lot.
@AyushSharma-ie7tj 1 year ago
Really nice lecture with a very even pace. Thank you for sharing.
@DarkNinja-24 1 year ago ⁺¹
Beautiful explanation!
@nhl8586 2 years ago
Super useful for understanding NTK in 15 mins!
@mstislavmaslennikov326 2 years ago
The lecturer is imho doing a great job explaining difficult material!
@tanchienhao 2 years ago
Thanks for the awesome lectures!!
@da_lime 2 years ago
Awesome, thanks!
@chenamora1653 3 years ago
So amazing
@sinaasadiyan 1 year ago
Great explanation, just subscribed!
@yuzhema2506 2 years ago
Thanks for the nice lecture! One question: the bias term in the Taylor approximation seems dependent on x, which means for different input x, the bias term varies. This is different from the traditional kernel view where the bias term is the same for different transformed input phi(x). In other words, for NTK, the inputs in the transformed space do not strictly follow the same linear model. How do we interpret such deviation? Thanks
@vi5hnupradeep 2 years ago
Thank you so much!
@MetaOptimizer 2 years ago
41:07 Should we think of the large width (m) in the empirical observations as meaning an extremely large network such as GPT-3? In other words, can I interpret "the width" as "the number of trainable parameters"? Thanks for your valuable lecture :)
@ihany9061 2 years ago
lifesaver!
@sayeedchowdhury11 2 years ago
Thanks for the nice lecture. I have a question: we're evaluating the gradient at w0. Does that mean the kernel is computed from gradients of an untrained NN that has just been initialized? I mean, is f(w, x) a trained NN or just an initialized one?
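As an illustration of the quantity this question refers to, here is a minimal sketch of a kernel built only from gradients taken at the initial weights w0, with no training step anywhere. The two-layer tanh network, the 1/sqrt(m) scaling, and the names `init_params`, `grad_f`, and `empirical_ntk` are assumptions made for the example, not the lecture's setup.

```python
import numpy as np

def init_params(d, m, rng):
    # Hypothetical network: f(w, x) = a^T tanh(W x) / sqrt(m), with w = (W, a).
    W = rng.standard_normal((m, d))
    a = rng.standard_normal(m)
    return W, a

def grad_f(W, a, x):
    # Gradient of the scalar output f(w, x) with respect to all parameters.
    m = a.shape[0]
    h = np.tanh(W @ x)                                          # hidden activations, shape (m,)
    dW = ((a * (1.0 - h**2))[:, None] * x[None, :]) / np.sqrt(m)  # df/dW, shape (m, d)
    da = h / np.sqrt(m)                                         # df/da, shape (m,)
    return np.concatenate([dW.ravel(), da])

def empirical_ntk(W, a, X):
    # K[i, j] = <grad_w f(w, x_i), grad_w f(w, x_j)> at the given weights.
    J = np.stack([grad_f(W, a, x) for x in X])                  # shape (n, num_params)
    return J @ J.T

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))            # 5 toy inputs in 3 dimensions
W0, a0 = init_params(d=3, m=2000, rng=rng)  # freshly initialized, never trained
K0 = empirical_ntk(W0, a0, X)              # kernel evaluated at w0
```

`K0` is a symmetric positive semidefinite Gram matrix, because it is an inner product of gradient feature vectors.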
@meghbhalerao5208 2 years ago
If I understand correctly, the NTK is derived considering only the quadratic (MSE) loss, right? Can it be generalized to other loss functions?
@chongyizheng7758 3 years ago ⁺¹
Question about the first-order Taylor approximation of the neural network: why is the first term f(w_0, x) not included in the kernel function, given that it is nonlinear w.r.t. x?
@ramanasubramanyam1110 3 years ago
The first derivative is included (and called the NTK) because it resembles the operation of a kernel on an input, i.e., a transformation function mapping to a higher dimension
@chongyizheng7758 3 years ago
@@ramanasubramanyam1110 Thanks for your reply, but I don't think that is what I am asking. Let me clarify: my question is about the constant (first) term f(w_0, x) at 41:16, not the derivative (second) term in the equation. f(w_0, x) also seems to depend nonlinearly on x, so why was it excluded from the definition of the NTK?
@hw1451 2 years ago ⁺¹
I think since it's a constant, we can always subtract it from y.
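The reply above can be checked numerically. A minimal sketch, assuming a hypothetical two-layer tanh network (not the lecture's model): the term f(w_0, x) does not depend on w, so fitting the linearized model f(w_0, x) + ∇_w f(w_0, x)^T (w - w_0) to y is the same as fitting the tangent features ∇_w f(w_0, x) to the residual targets y - f(w_0, x). The names `features`, `alpha`, and `delta` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 6, 3, 500                      # samples, input dim, hidden width (toy values)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
W0 = rng.standard_normal((m, d))         # initial weights w_0 = (W0, a0)
a0 = rng.standard_normal(m)

def f(W, a, X):
    # Hypothetical network: f(w, x) = a^T tanh(W x) / sqrt(m), for each row of X.
    return np.tanh(X @ W.T) @ a / np.sqrt(m)

def features(W, a, X):
    # Tangent features phi(x) = grad_w f(w, x), one row per input.
    H = np.tanh(X @ W.T)                                              # (n, m)
    dW = (a * (1.0 - H**2))[:, :, None] * X[:, None, :] / np.sqrt(m)  # (n, m, d)
    da = H / np.sqrt(m)                                               # (n, m)
    return np.concatenate([dW.reshape(len(X), -1), da], axis=1)

f0 = f(W0, a0, X)                        # the "constant" term: fixed once w_0 is fixed
Phi = features(W0, a0, X)                # (n, num_params)
K = Phi @ Phi.T                          # Gram matrix of tangent features at w_0

# Kernel regression on the residual targets y - f0 (minimum-norm solution).
alpha = np.linalg.solve(K, y - f0)
delta = Phi.T @ alpha                    # corresponding weight change w - w_0

# Linearized prediction on the training inputs.
f_lin = f0 + Phi @ delta
```

In this sketch `f_lin` reproduces `y` on the training inputs, so the offset f(w_0, x) is absorbed by shifting the targets rather than by changing the kernel.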
