The Lottery Ticket Hypothesis and pruning in PyTorch
- Published 2 Jun 2024
- In this video, we explain how to do pruning in PyTorch. We then use this knowledge to implement the paper "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks". The paper states that feedforward neural networks contain subnetworks (winning tickets) that perform as well as (or even better than) the original network, and it proposes a recipe for finding them.
Paper: arxiv.org/abs/1803.03635
Official code: github.com/facebookresearch/o...
Code from this video: github.com/jankrepl/mildlyove...
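The paper's recipe (iterative magnitude pruning with a rewind to the original initialization) can be sketched in a few lines. The snippet below is a pure-NumPy toy, not the video's PyTorch implementation: `prune_by_magnitude` and the stubbed "training" step (a random perturbation) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def prune_by_magnitude(weights, mask, rate):
    """Extend `mask` by zeroing out the `rate` fraction of the
    smallest-magnitude weights that are still unpruned."""
    alive = np.flatnonzero(mask)
    k = int(rate * alive.size)          # how many more weights to prune
    if k == 0:
        return mask
    order = np.argsort(np.abs(weights[alive]))  # smallest magnitudes first
    new_mask = mask.copy()
    new_mask[alive[order[:k]]] = 0.0
    return new_mask

init_w = rng.normal(size=100)   # theta_0: the original initialization
mask = np.ones_like(init_w)
w = init_w.copy()

for _ in range(3):
    # 1) "train" the masked network (stubbed here as a random perturbation)
    w = (w + 0.1 * rng.normal(size=w.shape)) * mask
    # 2) prune 20% of the remaining weights by magnitude
    mask = prune_by_magnitude(w, mask, 0.2)
    # 3) rewind survivors to their ORIGINAL init values -- the winning ticket
    w = init_w * mask
```

Three rounds at 20% per round leave 100 → 80 → 64 → 52 weights alive, and every surviving weight holds exactly its initial value, which is the paper's key point.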
00:00 Intro
00:50 Paper overview: Hypothesis (diagram) [slides]
01:33 Paper overview: Hypothesis (formal) [slides]
02:15 Paper overview: Finding winning tickets [slides]
03:44 Paper overview: Our setup [slides]
05:08 Pruning 101 in PyTorch [code]
10:29 Data - MNIST [code]
12:18 Multilayer perceptron [code]
14:05 Pruning: Linear + MLP [code]
16:45 Randomly initializing: Linear + MLP [code]
18:24 Weight copying: Linear + MLP [code]
19:51 Computing statistics [code]
20:53 Training functions [code]
24:38 CLI and training preparation [code]
27:13 Train-prune loop [code]
30:04 Grid search script [code]
31:01 Results: Actual vs desired pruning [no code]
32:47 Results: Winning tickets (parallel coordinate plots) [no code]
36:02 Results: Winning tickets (standard plots) [no code]
37:12 Outro
If you have any video suggestions or you just want to chat, feel free to join the Discord server: / discord
Twitter: / moverfitted
Credits logo animation
Title: Conjungation · Author: Uncle Milk · Source: / unclemilk · License: creativecommons.org/licenses/... · Download (9MB): auboutdufil.com/?id=600
Thank you so much for this high-quality content! Please keep it up!
Thank you very much!
Thank you for the very, very useful video. Great job 👏!
Hearing that quick typing sound is like ASMR for me.
Hahah:) That was not necessarily the primary goal but great to hear that:))
Have been waiting for this video! Nice work, buddy :)
Thank you for suggesting the topic:)
This kind of video is so helpful.
Thank you so much!
Glad it was helpful!
this is good :)
Thank you! I was waiting for this, will watch it today.
I've watched nearly all of your videos and just wanted to give the feedback that your content is the exact level of technical depth I was looking for.
From what I've seen, it's the only YouTube channel showing/explaining how to implement ML topics beyond the basic use case of a straightforward neural net for some classification task. I'm very much a fan!
Cheers mate!
PS: The webcam addition is a nice touch 👍
Wow, I really appreciate your interest:) Thank you for the feedback!
Regarding the YouTube ML/DL space, I do agree with you. I would love to see more channels that go into technical implementation details and explain what is happening in the background (rather than just showing how to use 3rd-party packages). But who knows, maybe there are more channels like this one (and probably way better), but they don't have enough publicity yet. Feel free to let me know if you find any:)
Cheers
@@mildlyoverfitted Off the top of my head I can remember Mark Saroufim (ua-cam.com/users/marksaroufim) and Yannic Kilcher (ua-cam.com/users/YannicKilcher), but your channel shows more of the code implementation of the topics, whereas theirs are more of a "discussion" of the topics (which is also pretty cool, of course)
@@lucasfischer8593 Both of them are great! I agree:)
I feel the same - I'm a PhD student and it's the exact level of depth I need.
Thank you!
Thank you!!!
Hey, great video! I actually tried this myself and saved both models, pruned and unpruned. The pruned model's file size is bigger than the unpruned one. Any suggestions? Also, how did you make those graphs explained at the end?
This is a great question and it is something I wondered about myself. I would refer you to this: discuss.pytorch.org/t/weight-pruning-on-bert/83429/2
TLDR: At this point, there are no inference time speedups or memory savings:( So more than anything the `prune` module is supposed to be used for research (e.g. LTH).
Hope that helps:)
@@mildlyoverfitted I think I get it. It's down to the way torch implements pruning, by keeping extra metadata. Also, other pruning methods, like head pruning in transformers and filter pruning in convolutions, do give inference speedups.
Also, I wonder if, when you get time, you could implement the Longformer's window attention using TVM scripts. Great work! Keep going!
@@mikewood8175 Thank you for the suggestion:)
Actually, the pruned model should be larger, since extra buffers and some forward pre-hooks are added. So is the only benefit of pruning faster inference?
See the other comment:)
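To make the thread above concrete: when you prune a parameter, torch keeps the full original tensor as `weight_orig`, registers a same-sized `weight_mask` buffer, and recomputes the effective `weight` as their product via a forward pre-hook, so the saved state dict roughly doubles in size. The toy class below is a plain-NumPy mimic of that mechanism for illustration only, not torch's actual code.

```python
import numpy as np

class PrunedLinearMimic:
    """Toy mimic of torch.nn.utils.prune's reparameterization (illustrative only)."""

    def __init__(self, weight, amount=0.5):
        # torch renames `weight` -> `weight_orig` and adds a `weight_mask` buffer;
        # here we build an L1-unstructured mask zeroing the `amount` smallest weights
        self.weight_orig = weight
        k = int(amount * weight.size)
        threshold = np.sort(np.abs(weight).ravel())[k - 1]
        self.weight_mask = (np.abs(weight) > threshold).astype(weight.dtype)

    @property
    def weight(self):
        # in torch this product is recomputed by a forward pre-hook on every call
        return self.weight_orig * self.weight_mask

    def state_dict(self):
        # BOTH tensors get serialized, hence the larger checkpoint on disk
        return {"weight_orig": self.weight_orig, "weight_mask": self.weight_mask}

dense_w = np.random.default_rng(0).normal(size=(8, 8))
layer = PrunedLinearMimic(dense_w)

dense_params = dense_w.size                                       # 64
pruned_params = sum(v.size for v in layer.state_dict().values())  # 128
```

This is also why the forum answer above says there are no memory savings: the zeros are stored explicitly, alongside the original values and the mask.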
Does this method reduce the total training time?
Hi, I am confused about one thing. When you re-parameterize the pruned model with the original weights, are you zeroing out the weights covered by the pruning mask? 19:16
Hey! I am not sure if I follow what you mean, do you think you could create an issue on github.com/jankrepl/mildlyoverfitted/issues and I will address it in detail?
Your typing speed while coding is so fast!
Can you please explain your system configuration?
Hey there! Not sure what exactly you mean but see below a list of things I use:
* vim (no plugins other than the gruvbox theme)
* tmux
* Python (mostly 3.7 and newer)
* Python packages (depends on the video, I try to declare all of them in `requirements.txt`)
* macOS
* A couple of CPUs
* If necessary (not for this video), I get a single GPU instance on Google Cloud Platform for a couple of days
@mildlyoverfitted Thanks for the reply. By the way, I asked about the GPU configuration in your DINO and Vision Transformer implementation videos. Sorry for the confusion.
@@mdbayazid6837 No problem:) When it comes to GPUs I get either K80 or T4 since they are the cheapest. See cloud.google.com/compute/gpus-pricing for more details.
@@mildlyoverfitted Thanks again, you are so kind.
Great work - but why not train with a GPU?
Thank you! I always prototype the code for my videos locally and only then rent a GPU if necessary. It was not really necessary for this one:)
what's the font you use?
Not sure. I am using this vim theme: github.com/morhetz/gruvbox so maybe you can find it somewhere in their repo.