
What is Low-Rank Adaptation (LoRA) | explained by the inventor

μTransfer: Tuning GPT-3 hyperparameters on one GPU | Explained by the inventor

Are GFlowNets the future of AI?

Should you care about GFlowNets? What are they anyway? Learn about how GFlowNets are aiding drug discovery and reasoning in large language models!
**Like, subscribe, and share if you find this video valuable!**
Tutorial: milayb.notion.site/The-GFlowNet-Tutorial-95434ef0e2d94c24aab90e69b30be9b3
0:00 - Why care about GFlowNets?
0:54 - The problems GFlowNets solve
1:39 - A concrete example: drug discovery
3:53 - What GFlowNet really is
4:46 - Applications: GFlowNet-EM
5:58 - Applications: Better LLM reasoning
6:55 - Conclusion
Papers mentioned:
- GFlowNet for drug discovery (first GFlowNet paper)
arxiv.org/abs/2106.04399
- Jointly training a GFlowNet and an energy-based model
arxiv.org/abs/2202.01361
- GFlowNet-EM
arxiv.org/abs/2302.06576
- GFlowNet for better reasoning in LLMs
arxiv.org/pdf/2310.04363.pdf
Follow me on Twitter:
edwardjhu
🙏 This video would not be possible without my wonderful labmates at Mila and, of course, Yoshua.

Videos

What is Low-Rank Adaptation (LoRA) | explained by the inventor

7:29

28K views · 8 months ago

Low-rank adaptation, or LoRA, is one of the most popular methods for customizing large AI models. Why was it invented? What is it? Why should I consider using it? Find out the answers in this video from the inventor of LoRA. Like, subscribe, and share if you find this video valuable! Paper: arxiv.org/abs/2106.09685 Repo: github.com/microsoft/lora 0:00 - Intro 0:34 - How we came up with LoRA 1:3...

μTransfer: Tuning GPT-3 hyperparameters on one GPU | Explained by the inventor

7:09

3.6K views · 11 months ago

How can one tune the hyperparameters of an enormous neural network like GPT-3 on a single GPU? Like, subscribe, and share if you find this video valuable! Paper: arxiv.org/abs/2203.03466 Repo: github.com/microsoft/mup Jupyter notebook to reproduce μTransfer: github.com/microsoft/mup/blob/main/examples/MLP/demo.ipynb 0:00 - Intro 0:45 - μTransfer in 3 steps 3:00 - Why μP and μTransfer work 5:42 ...

Comments

@DavidGreen-h2i 16 days ago
Larson Parkway
@phdperson 21 days ago
This is amazing and very valuable. Thank you!!!
@Jhonnyzilla 24 days ago
That is such a good explanation, thanks!
@EleaseNiebergall-e7g 26 days ago
Jeanne Station
@shklbor 1 month ago
Awesome explanation and kudos for a great contribution to DL, please make a followup video on QLoRA
@tuhinmailme 1 month ago
These things have existed for a long time in vision research, like fine-tuning only the classifier of a large model on new tasks.
@jimshtepa5423 1 month ago
Thank you for a great presentation. I am new to LLMs and would like to try running the code on GitHub. Can my local machine (MacBook M1) handle it, or is it something for large enterprises with massive compute inventory?
@sorooshsohangir 1 month ago
Great Job!!!
@EsZfW5f 2 months ago
Thanks!
@Krishna1729-z8v 2 months ago
I have worked on a Markov chain Monte Carlo algorithm; it took me 1 hour to map the posterior distribution, and that’s not even close... Looking forward to using GFlowNets.
@shibohao8930 3 months ago
Great video! Looking forward to your video explaining the relation between GFN and Max-Entropy RL
@tonywang7933 3 months ago
3:26 That is the best explanation!!
@todamont 3 months ago
Very cool. I know some of these words.
@redthunder6183 3 months ago
Thank you so much for explaining this clearly. Everything I watch on YouTube is made by ppl who have no idea how the tech works, or don’t even know how to code outside of copy/paste/change inputs, but pretend like they do. Furthermore, there are just so many useless libraries around LLMs that ppl claim are the next big thing, but in reality, they create code bloat, introduce more unknowns, make the code harder to work with since u now gotta learn the library, and don’t work as well as if u just wrote everything urself.
@ph10m 4 months ago
This was a great intuitive explanation of it. I wish more people took the adaptability of LoRA seriously, though: everyone (and their dog) uploads full models after doing small fine-tunes *with* LoRA, instead of just the adapters. Sharing only the adapters would not only help experimentation but save time too, as we have to download unnecessary base models over and over...
@jett_royce 4 months ago
LoRA is such an unlock for resource-constrained creators looking to leverage models for specific domains. Thank you for this amazing work!
@houbenbub 4 months ago
Awesome video, thanks for making it :)
@BruceChar007 5 months ago
Is it possible to continue fine-tuning on top of an already fine-tuned LoRA model, and how well does that work?
@lophyre1380 5 months ago
Very informative video, but please get a better mic
@user-wp8yx 5 months ago
Trying to teach a mistral7b model Sanskrit. It already has Sanskrit characters as tokens and is the best-performing 7B Llama-based model I can find. You seem like a knowledgeable person in this area. Do you have any advice for LoRA? Rank, alpha? How about targeting of q, k, v? Other strategies? I have about 3 GB of datasets that range from translations and corpora to data tables. I wonder if I should use different strategies for different data types?
@KemalCetinkaya-i3q 5 months ago
Waiting for those future videos. Let's generalize OOD.
@bobbyparikh5690 5 months ago
Fantastic video Edward! In case someone wants a quick refresher on low-rank decomposition of matrices, here's a great video: ua-cam.com/video/2ogdwpHD3V8/v-deo.html&ab_channel=ritvikmath
@arnoldpalmer-fv7pf 5 months ago
So much groundbreaking research broken down into an easy to follow 7 minute video, I love it 🙏
@nathangonzales-hess6569 5 months ago
That was great. Thanks for sharing. I really appreciate the simple style, no distracting animated plots or fancy editing. Look forward to more!
@justinpresent 6 months ago
thanks edward for the gentle intro!
@ellielikesmath 6 months ago
i was trying to come up with something like this, in that i wanted to train a generator which would be the inverse of a classifier, and the classifier gave a score to how good a solution was drawn from some range. this looks miles and miles more sophisticated than what i was doing with tf and pytorch, but i definitely understand, at least on that level of abstraction, why such a development is necessary. i look forward to trying this, cheers.
@Bbb78651 6 months ago
Thank you so much for the video, Edward. It's an inspiration seeing you make videos and take off. I'm currently a Master of Science student in data science and I'm always excited about NN architectures and new ML algorithms. Whenever convenient, could you please share 1-2 tips for writing good research papers in ML? I recently started in a lab that does neuroscience-ML, and I really want to make an impact there.
@faizanjaved1443 6 months ago
Hey there! Can we talk about Q*, the AGI developed by Sam Altman? I'm excited to discuss this with you, it's one of the most interesting topics for me after Sora.
@candrewlee14 6 months ago
This was fantastic! Thank you, it’s great to hear from a real expert in this AI mega-hype cycle.
@DB-Barrelmaker 6 months ago
The audio is terrible. It's especially noticeable when you're dealing with a complex subject containing a lot of niche phrases.
@DigitalAlligator 6 months ago
Shit, you invented LoRA 😮? How did you come up with an idea that works so well?!
@Seekerofknowledges 6 months ago
Thank you wholeheartedly.
@faizanjaved1443 6 months ago
Here's a "Below are some new topics for you to discuss: - Grok-1 - Claude Opus - Gemini Ultra - Poe - Figure humanoid powered by OpenAI - Hugging Face - Sam Altman 7+ trillion - Google AI in iPhones - NVIDIA Disney robots Let me know if you need any help to choose the best topics for your content."
@BaKimura03 6 months ago
Love the fact that you’re organic. Didn’t even lead with the big credential.. unlike some of these goofs who have done far less. 👏
@sumitsp01 6 months ago
Sir, please make more videos explaining fundamental concepts as well as the latest breakthroughs in the AI field. Your style of explanation is neat and precise. Everyone will benefit from these videos.
@chitarito100 6 months ago
Hey, I’m glad you’re on YouTube! It’s great to have an expert in the field break down complex topics and intuition and also explain real-world applications, not just theory. I think it would be nice if you could have some summary pointers overlaid in the video or on a summary slide with you overlaid on it. Different people have different learning styles and being able to read helps.
@Jandodev 6 months ago
I found a way to do this at my company with just LLMs! I used LLMs to not only generate the process steps but also make a liminal/expansion map of the ideas themselves that could lead to those individual processing steps! The trick to make it run wasn't to fit to a "reward function" but to use the "reward function" for back propagation as the starting point to then determine what was needed from the user to retrieve and select the correct path!
@automatalearninglab 6 months ago
Nice! Turn up your volume! Loved the topic! :)
@kibrutemesgen1759 6 months ago
Amazing video with a clear explanation of the intuition. It would be great if you gave us one more video on the implementation, or a detailed explanation of how GFlowNets are used to enhance the reasoning of LLMs (i.e., the paper).
@james-cucumber 6 months ago
Thanks for including subtitles with this video. I’m pretty sure (but not certain given you may have access to better models than regular Joes) that it’s been manually edited or written from scratch. If so, extra thank you!
@yiannishadjiyianni7737 6 months ago
This might be dumb, but I wonder how applicable GFlowNets can be in an object detection/instance segmentation context, where the distribution created is based on the millions of images the model will train on and the "pathways" help the model be even more accurate.
@vtrandal 6 months ago
Fantastic
@_rd_kocaman 6 months ago
You’re gonna need a graphic designer to reach a broad audience.
@alinmathuo4018 6 months ago
You don’t need a microphone, just find an AI tool that fixes it🙃
@张朝阳-z2i 6 months ago
great man
@RonLWilson 6 months ago
This looks very promising!
@RonLWilson 6 months ago
BTW, I have been working on a graphical way of building ontology models that I am calling UniML (Universal Modeling Language). It can model the GFlowNet process quite readily, much like OWL or RDF graphs, but even more graphically. Something like this might help develop these nets in that it is a way to better visualize them. I made a number of introductory videos, uploaded to my YouTube channel, that describe the whats and whys in more detail, plus a Padlet virtual corkboard that takes a bit of a closer look at them. UniML can model NNs as well, or just treat them as functions (Chips).
@jayyoung7577 6 months ago
Wow❤ please post more deep dive videos. Thank you for sharing!
@jan3477 6 months ago
Great video, but it was a bit too fast for me to keep up with the complex topic.
@unclecode 6 months ago
Hi Edward, I've been following your channel since you had long hair, and I've noticed two things: 1) You post videos when you have a new paper meaning new discoveries, and 2) Your hair seems to get shorter in each video. If I had to predict, based on these three videos, (overfitting) your next video will have you bald! 😀😀 Nonetheless, you are incredibly talented, and your timing and placement are impeccable.

Edward Hu
