I hope this walk through was helpful! Check out more resources and references in the video description :)
Check out how I fine-tuned the teacher model here: ua-cam.com/video/4QHg8Ix8WWQ/v-deo.html
Great video! I liked how you explained each step clearly and didn’t skip the important validation at the end.
Great video Shaw, I've shared it on my LinkedIn account
Thanks for sharing :)
Really great description of various compression methods, I was literally trying to explain some of this to a colleague the other day - your video is far more eloquent though, so I think I will direct them here!
Glad it was clear! I've been there... having the visual aids helps 😅
Thank you for the very helpful video, Shaw!
Glad it was helpful!
Really helpful video! I'm just wondering how you would compute the distillation loss if you use the teacher model outputs as ground truth?
Good question. This is what I do at 17:04. The distill_loss var shows one way of doing that.
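For anyone following along, here is a rough sketch of a distillation loss where the teacher's softened outputs act as the targets. This is illustrative only, written in plain Python rather than the notebook's code, and every name here (including `distill_loss`) is my own, not necessarily what the video uses:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T produces softer distributions
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    The teacher's soft probabilities serve as the "ground truth" targets.
    The T^2 factor keeps loss magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return (temperature ** 2) * kl
```

When the student matches the teacher exactly, this loss is zero; the more the student's distribution diverges, the larger it grows.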
Hello dude, thank you for preparing this extremely helpful series.
However, are you planning to create a series on building applications with Langchain and RAG? I haven't been able to find any helpful material on it. It would be great if you could prepare something on this.
Thanks.
Glad it's been helpful :)
No specific plans to do RAG with Langchain, but this video might be helpful: ua-cam.com/video/LhnCsygAvzY/v-deo.html
Thanks for sharing.
Thanks a lot Shaw. Quick question: can you please show how the BERT phishing classifier teacher was fine-tuned?
Sure thing, I plan to do another fine-tuning video and this could be a good example.
The example code is also freely available here: github.com/ShawhinT/UA-cam-Blog/blob/main/LLMs/model-compression/0_train-teacher.ipynb
Here's a video walking through how I fine-tuned the teacher model: ua-cam.com/video/4QHg8Ix8WWQ/v-deo.html
Hi, is this the end of the course, or are there more topics to come?
What else do you want to see?
@@ShawhinTalebi A top-class end-to-end project that includes all of these, like a chatbot that can do semantic analysis, summarization, text generation, and code generation.
All in one.
I was hoping to find something new here, but it's just quantization, pruning, and distillation. None of these methods is truly free of sacrifice on the performance side. The losses are often negligible, but only with respect to the tested use cases. Why would anyone offer inference of full-size models if the compressed models performed just as well while being much cheaper and much quicker?
Great question. Here's my intuition. Tasks vary in sophistication. For example, French to English translation is simpler than French to English translation + sentiment analysis.
Most LLMs can do a wide range of tasks. However, if you only want it to do something specific (e.g. French to English translation) then presumably there are many unnecessary components to a base LLM. These can be readily removed without loss of performance on the end task.
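To make that intuition concrete, here is a minimal sketch of unstructured magnitude pruning, which zeroes out the smallest-magnitude weights on the assumption that they contribute least to the task. This is not code from the video; the function name and example values are made up for illustration:

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights (unstructured pruning).

    sparsity is the fraction of weights to remove, e.g. 0.5 prunes half.
    """
    n_prune = int(len(weights) * sparsity)
    # Rank weight indices by absolute value, smallest first
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    # Keep large weights, zero the rest
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]
```

For example, `magnitude_prune([0.1, -2.0, 0.05, 3.0], 0.5)` keeps only the two largest-magnitude weights, giving `[0.0, -2.0, 0.0, 3.0]`. In practice, pruning is applied per layer (or to whole structures like attention heads) and is usually followed by a short fine-tuning pass to recover any lost accuracy on the end task.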
Seeking Guidance on LLM Compression for Multi-Class Classification
Dear Shaw,
I hope this email finds you well. I am reaching out to express my admiration for your work on LLM compression, particularly the video tutorial you shared. Your explanations and demonstrations were incredibly helpful in understanding the concepts.
However, I am facing a challenge while trying to replicate your work for a multi-class classification task. In your tutorial, you used a binary classification task (phishing vs. non-phishing websites), but I need to adapt it for a multi-class problem. Despite my best efforts to modify the code, I keep encountering errors.
As this is an academic project with an upcoming deadline, I was wondering if you could offer some guidance or support. Specifically, I would appreciate any advice on:
- How to modify the code for multi-class classification
- Troubleshooting the errors I'm encountering
- Any additional resources or references that might help
I would be grateful for any assistance you can provide. Please let me know if you're available for a discussion or if you can point me in the right direction.
Thank you for considering my request, and I look forward to hearing from you soon.
Best regards
Thanks for the note! I replied via email :)
less is more!
Yes!