Compressing Large Language Models (LLMs) | w/ Python Code

  • Published Dec 25, 2024

COMMENTS • 26

  • @ShawhinTalebi  3 months ago +5

    I hope this walkthrough was helpful! Check out more resources and references in the video description :)
    Check out how I fine-tuned the teacher model here: ua-cam.com/video/4QHg8Ix8WWQ/v-deo.html

  • @manuelbradovent3562  3 months ago +1

    Great video! I liked how you explained each step clearly and didn’t skip the important validation at the end.

  • @AliEP  3 months ago +2

    Great video, Shaw. I've shared it on my LinkedIn account.

  • @Tenebrisuk  3 months ago

    Really great description of the various compression methods. I was literally trying to explain some of this to a colleague the other day; your video is far more eloquent, so I think I will direct them here!

    • @ShawhinTalebi  3 months ago +1

      Glad it was clear! I've been there... having the visual aids helps 😅

  • @GauravJain-ie2oc  3 months ago

    Thank you for the very helpful video, Shaw!

  • @mudassarz.6762  11 days ago

    Really helpful video! I'm just wondering how you would compute the distillation loss if you use the teacher model outputs as ground truth?

    • @ShawhinTalebi  10 days ago +1

      Good question. This is what I do at 17:04. The distill_loss variable shows one way of doing that.
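For readers following along, here is a minimal sketch of that idea: treat the teacher's softened output distribution as the (soft) ground truth and measure the student's KL divergence from it. Function and variable names below are illustrative, not the notebook's exact code.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation: KL divergence between the teacher's and
    student's temperature-softened output distributions."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures
    return F.kl_div(student_log_probs, soft_targets, reduction="batchmean") * temperature**2

# Toy batch: 4 samples, 2 classes (e.g. phishing vs. not phishing)
teacher_logits = torch.tensor([[2.0, -1.0], [0.5, 0.5], [-1.5, 2.5], [3.0, -2.0]])
student_logits = torch.zeros(4, 2)  # an untrained student is uniform over classes
loss = distillation_loss(student_logits, teacher_logits)
```

Because the teacher's probabilities are the target, no human labels are needed at this step; the loss is zero exactly when the student reproduces the teacher's distribution.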

  • @MuhammadObaidullah-c5q  2 months ago

    Hello dude, thank you for preparing this extremely helpful series.
    However, are you planning to create a series on building applications based on Langchain and RAG? I am not able to find any helpful material on it. It would be very nice if you could prepare some material on it.
    Thanks.

    • @ShawhinTalebi  2 months ago +1

      Glad it's been helpful :)
      No specific plans to do RAG with Langchain, but this video might be helpful: ua-cam.com/video/LhnCsygAvzY/v-deo.html

    • @MuhammadObaidullah-c5q  2 months ago

      Thanks for sharing.

  • @KumR  3 months ago

    Thanks a lot, Shaw. Quick question: can you please show how the BERT phishing classifier teacher was fine-tuned?

    • @ShawhinTalebi  3 months ago

      Sure thing, I plan to do another fine-tuning video and this could be a good example.
      The example code is also freely available here: github.com/ShawhinT/UA-cam-Blog/blob/main/LLMs/model-compression/0_train-teacher.ipynb

    • @ShawhinTalebi  2 months ago

      Here's a video walking through how I fine-tuned the teacher model: ua-cam.com/video/4QHg8Ix8WWQ/v-deo.html

  • @pewpewpew1177  2 months ago

    Hi, is this the end of the course, or are there more topics?

    • @ShawhinTalebi  2 months ago

      What else do you want to see?

    • @pewpewpew1177  2 months ago

      @@ShawhinTalebi A top-class end-to-end project that includes all of these, like a chatbot that can do semantic analysis, summarization, text generation, and code generation, all in one.

  • @l.halawani  3 months ago +1

    I was hoping to find something new here, but it's just quantization, pruning, and distillation. None of these methods is truly free of sacrifice on the side of performance. The losses are often negligible, but only with regard to the tested use cases. Why would anyone offer inference of full-size models if the compressed models performed just as well while being much cheaper and much quicker?

    • @ShawhinTalebi  3 months ago +4

      Great question. Here's my intuition. Tasks vary in sophistication. For example, French to English translation is simpler than French to English translation + sentiment analysis.
      Most LLMs can do a wide range of tasks. However, if you only want one to do something specific (e.g., French to English translation), then presumably a base LLM contains many unnecessary components. These can be readily removed without loss of performance on the end task.
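As a concrete illustration of "removing unnecessary components," here is a generic sketch of unstructured magnitude pruning (not the notebook's code): the smallest-magnitude weights, which contribute least to any task, are zeroed out.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Unstructured magnitude pruning: zero out the given fraction of
    smallest-magnitude weights, leaving the rest unchanged."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight.clone()
    # The k-th smallest absolute value becomes the pruning threshold
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = weight.abs() > threshold
    return weight * mask

torch.manual_seed(0)
w = torch.randn(4, 4)
pruned = magnitude_prune(w, sparsity=0.5)  # roughly half the entries become zero
```

In practice the pruned model is then fine-tuned briefly on the narrow target task, which is where the "no loss on the end task" claim is actually validated.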

  • @kabirubashirbin3622  1 month ago

    Seeking Guidance on LLM Compression for Multi-Class Classification
    Dear Shaw,
    I hope this email finds you well. I am reaching out to express my admiration for your work on LLM compression, particularly the video tutorial you shared. Your explanations and demonstrations were incredibly helpful in understanding the concepts.
    However, I am facing a challenge while trying to replicate your work for a multi-class classification task. In your tutorial, you used a binary classification task (phishing vs. non-phishing websites), but I need to adapt it for a multi-class problem. Despite my best efforts to modify the code, I keep encountering errors.
    As this is an academic project with an upcoming deadline, I was wondering if you could offer some guidance or support. Specifically, I would appreciate any advice on:
    - How to modify the code for multi-class classification
    - Troubleshooting the errors I'm encountering
    - Any additional resources or references that might help
    I would be grateful for any assistance you can provide. Please let me know if you're available for a discussion or if you can point me in the right direction.
    Thank you for considering my request, and I look forward to hearing from you soon.
    Best regards

    • @ShawhinTalebi  1 month ago

      Thanks for the note! I replied via email :)
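For other readers with the same multi-class question: the binary setup generalizes directly, because both cross-entropy and the KL-based soft loss operate over however many classes the logits have. A hedged sketch follows (illustrative names, assuming a PyTorch setup); the student model itself would also need its classification head sized to the class count, e.g. via Hugging Face's `num_labels` argument to `AutoModelForSequenceClassification.from_pretrained`.

```python
import torch
import torch.nn.functional as F

def distill_step(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Combined multi-class distillation loss: hard-label cross-entropy plus
    KL divergence to the teacher's softened distribution. Nothing here is
    binary-specific; it works for any number of classes."""
    hard_loss = F.cross_entropy(student_logits, labels)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2
    return alpha * hard_loss + (1 - alpha) * soft_loss

# 3-class toy example; labels are class indices in [0, num_classes)
num_classes = 3
torch.manual_seed(0)
teacher_logits = torch.randn(8, num_classes)
student_logits = torch.randn(8, num_classes)
labels = torch.randint(0, num_classes, (8,))
loss = distill_step(student_logits, teacher_logits, labels)
```

A common source of errors when adapting binary code is a mismatch between the head's output dimension and the label range, so checking `student_logits.shape[-1] == num_classes` early can save debugging time.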

  • @rezamohajerpoor8092  3 months ago

    less is more!