The Lottery Ticket Hypothesis and pruning in PyTorch

  • Published Jun 2, 2024
  • In this video, we explain how to do pruning in PyTorch. We then use this knowledge to implement the paper "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks". The paper states that feedforward neural networks contain subnetworks (winning tickets) that perform as well as (or even better than) the original network, and it proposes a recipe for finding them.
    Paper: arxiv.org/abs/1803.03635
    Official code: github.com/facebookresearch/o...
    Code from this video: github.com/jankrepl/mildlyove...
    00:00 Intro
    00:50 Paper overview: Hypothesis (diagram) [slides]
    01:33 Paper overview: Hypothesis (formal) [slides]
    02:15 Paper overview: Finding winning tickets [slides]
    03:44 Paper overview: Our setup [slides]
    05:08 Pruning 101 in PyTorch [code]
    10:29 Data - MNIST [code]
    12:18 Multilayer perceptron [code]
    14:05 Pruning: Linear + MLP [code]
    16:45 Randomly initializing: Linear + MLP [code]
    18:24 Weight copying: Linear + MLP [code]
    19:51 Computing statistics [code]
    20:53 Training functions [code]
    24:38 CLI and training preparation [code]
    27:13 Train-prune loop [code]
    30:04 Grid search script [code]
    31:01 Results: Actual vs desired pruning [no code]
    32:47 Results: Winning tickets (parallel coordinate plots) [no code]
    36:02 Results: Winning tickets (standard plots) [no code]
    37:12 Outro
    If you have any video suggestions or just want to chat, feel free to join the Discord server: / discord
    Twitter: / moverfitted
    Logo animation credits:
    Title: Conjungation · Author: Uncle Milk · Source: / unclemilk · License: creativecommons.org/licenses/... · Download (9MB): auboutdufil.com/?id=600
  • Science & Technology
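
The recipe sketched in the paper overview (iterative magnitude pruning: train, prune the smallest surviving weights, reset the survivors to their original initialization, repeat) can be illustrated with a toy, dependency-free sketch. This is not the video's actual code; all names and sizes here are made up for illustration:

```python
import random

def magnitude_prune(weights, mask, frac):
    """Set mask to 0 for the smallest-magnitude fraction of surviving weights."""
    alive = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(alive) * frac)
    # sort surviving indices by |weight| and prune the smallest ones
    for i in sorted(alive, key=lambda j: abs(weights[j]))[:n_prune]:
        mask[i] = 0
    return mask

random.seed(0)
w_init = [random.uniform(-1, 1) for _ in range(8)]  # initial weights theta_0
mask = [1] * 8

for _ in range(2):                                   # two train-prune rounds
    w = [wi * m for wi, m in zip(w_init, mask)]      # reset survivors to theta_0
    # ... training of w would happen here ...
    mask = magnitude_prune(w, mask, 0.5)             # prune 50% of survivors

print(sum(mask))  # 2 of 8 weights survive (8 -> 4 -> 2)
```

The key step that distinguishes the lottery-ticket recipe from plain iterative pruning is the reset line: surviving weights go back to their original initialization before each retraining round, rather than continuing from the trained values.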

COMMENTS • 36

  • @ludwigstumpp · 2 years ago · +6

    Thank you so much for this high-quality content! Please keep it up!

  • @marearts. · 2 years ago · +1

    Thank you for a very, very useful video. Great job 👏!

  • @hangchen · 2 years ago · +1

    Hearing that quick typing sound is like ASMR for me.

    • @mildlyoverfitted · 2 years ago

      Hahah:) That was not necessarily the primary goal but great to hear that:))

  • @beizhou2488 · 2 years ago · +2

    Have been waiting for this video! Nice work, buddy :)

  • @kevinlu8685 · 2 years ago · +2

    This kind of video is so helpful.
    Thank you so much!

  • @deoabhijit5935 · 2 years ago · +1

    This is good :)
    Thank you! I was waiting for this and will watch it today.

  • @lucasfischer8593 · 2 years ago · +1

    I've watched nearly all of your videos and just wanted to give the feedback that your content is exactly the level of technical depth I was looking for.
    From what I've seen, it's the only YouTube channel showing/explaining how to implement ML topics beyond the basic use case of a straightforward neural net for some classification task. I'm very much a fan!
    Cheers mate!
    PS: The webcam addition is a nice touch 👍

    • @mildlyoverfitted · 2 years ago · +1

      Wow, I really appreciate your interest:) Thank you for the feedback!
      Regarding the YouTube ML/DL space, I do agree with you. I would love to see more channels that go into technical implementation details and explain what is happening in the background (rather than just showing how to use third-party packages). But who knows, maybe there are more channels like this one (and probably way better) that just don't have enough publicity yet. Feel free to let me know if you find any:)
      Cheers

    • @lucasfischer8593 · 2 years ago · +1

      @@mildlyoverfitted Off the top of my head I can remember Mark Saroufim (ua-cam.com/users/marksaroufim) and Yannic Kilcher (ua-cam.com/users/YannicKilcher), but your channel shows more of the code implementation of the topics, whereas theirs is more of a "discussion" of the topics (which is also pretty cool, of course).

    • @mildlyoverfitted · 2 years ago · +1

      @@lucasfischer8593 Both of them are great! I agree:)

    • @lihanou · 2 years ago

      I feel the same. I'm a PhD student and it's the exact level of depth I need.

  • @evab.7980 · 2 years ago · +1

    Thank you!

  • @lukasbazalka7582 · 2 years ago · +1

    Thank you!!!

  • @mikewood8175 · 2 years ago · +1

    Hey, great video! I actually tried this myself, saving both models, pruned and unpruned. The size of the pruned model is bigger than the one without pruning. Any suggestions? Also, how did you make those graphs explained at the end?

    • @mildlyoverfitted · 2 years ago

      This is a great question and it is something I wondered about myself. I would refer you to this: discuss.pytorch.org/t/weight-pruning-on-bert/83429/2
      TL;DR: At this point, there are no inference-time speedups or memory savings:( So, more than anything, the `prune` module is meant for research (e.g. LTH).
      Hope that helps:)

    • @mikewood8175 · 2 years ago · +1

      @@mildlyoverfitted I think I get it. It's down to the way torch implements pruning, by keeping extra metadata around. Other pruning methods, like head pruning in transformers and filter pruning in convolutions, do give inference speedups.
      Also, if you get the time, I wonder if you could implement Longformer's windowed attention using tvm scripts. Great work! Keep going!

    • @mildlyoverfitted · 2 years ago

      @@mikewood8175 Thank you for the suggestion:)

  • @mikewood8175 · 2 years ago · +1

    Actually, the size of the pruned model should be larger, since buffers and some pre-hooks are added. So is the only benefit of pruning faster inference?
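
    The extra buffers this comment mentions can be "baked in" after experimenting, which is what making a pruning permanent amounts to: collapse the original weights and the mask into a single weight array and drop the extras. A toy mimicry on plain lists (illustrative values only, not the torch data structures):

```python
weight_orig = [0.5, -1.2, 0.03, 0.8]   # stored original weights
weight_mask = [1, 1, 0, 1]             # pruning mask kept as a buffer

# while the pruning hook is active, the forward pass uses the masked product
weight = [w * m for w, m in zip(weight_orig, weight_mask)]

# making the pruning permanent: keep only the collapsed weight,
# drop the original tensor and the mask
del weight_orig, weight_mask
print(weight)  # [0.5, -1.2, 0.0, 0.8]
```

    After this collapse the model is back to a single dense tensor (with zeros where weights were pruned), so the size overhead disappears, though the zeros still participate in dense matrix multiplies unless a sparse kernel is used.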

  • @jhilam3017 · 1 year ago

    Does this method reduce the total training time?

  • @pritomroy2465 · 7 months ago

    Hi, I am confused about one thing. When you re-parameterize the original weights into the pruned model, are you zeroing out the weights covered by the pruning mask? 19:16

    • @mildlyoverfitted · 6 months ago · +1

      Hey! I am not sure if I follow what you mean, do you think you could create an issue on github.com/jankrepl/mildlyoverfitted/issues and I will address it in detail?

  • @RNGRB_com · 1 year ago

    Your typing speed while coding is so fast!

  • @mdbayazid6837 · 2 years ago · +1

    Can you please explain your system configuration?

    • @mildlyoverfitted · 2 years ago

      Hey there! Not sure what exactly you mean but see below a list of things I use:
      * vim (no plugins other than the gruvbox theme)
      * tmux
      * Python (mostly 3.7 and newer)
      * Python packages (depends on the video, I try to declare all of them in `requirements.txt`)
      * macOS
      * A couple of CPUs
      * If necessary (not for this video), I get a single GPU instance on Google Cloud Platform for a couple of days

    • @mdbayazid6837 · 2 years ago

      @mildlyoverfitted Thanks for the reply. By the way, I asked about the GPU configuration in your DINO and Vision Transformer implementation. Sorry for the confusion.

    • @mildlyoverfitted · 2 years ago

      @@mdbayazid6837 No problem:) When it comes to GPUs I get either K80 or T4 since they are the cheapest. See cloud.google.com/compute/gpus-pricing for more details.

    • @mdbayazid6837 · 2 years ago · +1

      @@mildlyoverfitted Thanks again, you are so kind.

  • @lihanou · 2 years ago · +1

    Great work - but why not train with a GPU?

    • @mildlyoverfitted · 2 years ago

      Thank you! I always prototype the code for my videos locally and only then rent a GPU if necessary. It was not really necessary for this one:)

  • @swk9015 · 1 month ago

    What's the font you use?

    • @mildlyoverfitted · 1 month ago

      Not sure. I am using this vim theme: github.com/morhetz/gruvbox so maybe you can find it somewhere in their repo.