3 Easy Methods For Improving Your Large Language Model

  • Published 26 Dec 2024

COMMENTS • 22

  • @ppasumarthi · 1 year ago · +4

    Nice explanation!!! A separate session with an explanation of LoRA would probably help

    • @MaartenGrootendorst · 1 year ago · +1

      Thanks, that's a great subject and I'll definitely keep that in mind 😀

  • @windowviews150 · 11 months ago

    This video is pure gold! Please keep creating content! You are absolutely fantastic.

  • @naevan1 · 1 year ago · +1

    Thanks man, your libraries have been super valuable to me over the past 2 years of my master's. I'm now thinking of doing a thesis on fine-tuning LLMs for detecting specific phrase patterns in tweets, and comparing variants (RAG, few-shot examples, fine-tuned LLaMA-2) against a fine-tuned BERT model for that goal. Please make more LLM videos, you are super good at explaining things!

  • @tyrealq · 10 months ago

    Absolutely enjoyed the informative video! Would love to watch a dedicated video on LoRA!
    A side question for clarification: is OpenAI's fine-tuning service essentially LoRA?
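
    For readers wondering what LoRA looks like concretely, here is a minimal sketch using the Hugging Face peft library; the model name and hyperparameter values are illustrative assumptions rather than the video's setup, and nothing here speaks to how OpenAI's hosted fine-tuning works internally:

      from transformers import AutoModelForCausalLM
      from peft import LoraConfig, get_peft_model

      base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

      lora_config = LoraConfig(
          r=16,                                 # rank of the low-rank update matrices
          lora_alpha=32,                        # scaling factor applied to the update
          target_modules=["q_proj", "v_proj"],  # attention projections to adapt
          lora_dropout=0.05,
          task_type="CAUSAL_LM",
      )

      # The base weights stay frozen; only the small adapter matrices are trained.
      model = get_peft_model(base_model, lora_config)
      model.print_trainable_parameters()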

  • @IsmailKonak314 · 1 year ago · +1

    Awesome video sir! Easy and straightforward

  • @Pythonology · 1 year ago · +1

    The guy behind BERTopic and KeyBERT is here, folks! You have a UA-cam channel! Well done!

  • @oleksiipanasenko4987 · 1 year ago · +1

    Thanks for this great video!

  • @nikolayn4022 · 1 year ago

    This is the video I've been looking for for a long time. Thanks! Found you through Jay Alammar. I think it makes sense for you to start with where you work, what kind of experience you have, etc. There are a lot of "GPT specialists" on UA-cam now, and it's not clear at first glance who you are.

  • @streamocu2929 · 1 year ago

    so good 👍

  • @fkollama · 1 year ago · +1

    Cool tutorial

  • @scitechtalktv9742 · 1 year ago · +1

    Very interesting! I would like to apply it to Dutch-language texts using Llama 2. Can you explain how to do that?

    • @MaartenGrootendorst · 1 year ago · +1

      Good question! Unfortunately, there aren't many open-source LLMs out there that work with the Dutch language. The data these models are trained on generally contains less than 1% of Dutch text. There are some cases where fine-tuning with the target language might help. Definitely worthwhile to try out!

    • @scitechtalktv9742 · 1 year ago · +1

      @@MaartenGrootendorst Yesterday I tried doing translation with Llama 2, using a system prompt tailored to the kind of translation I want (e.g. “You are an expert in translating from Dutch to English”) and it performed really well, both for English to Dutch and for Dutch to English! So I could also try to:
      1) translate a Dutch text to English
      2) perform the actions on that English text, with output in English
      3) translate the English output back into Dutch
      Perhaps that could work, but things could go wrong during the two translation steps.
      I will do some experiments to see what works and how well it works!
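
      A rough sketch of the translate, process in English, translate back idea described above, assuming a recent transformers version whose text-generation pipeline accepts chat-style messages; the model name, prompts, and the summarization task are illustrative assumptions, not the setup from the video:

        from transformers import pipeline

        # Chat-tuned Llama 2 model; any instruction-tuned model could be swapped in.
        generator = pipeline("text-generation", model="meta-llama/Llama-2-13b-chat-hf")

        def ask(system_prompt: str, user_text: str) -> str:
            messages = [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_text},
            ]
            out = generator(messages, max_new_tokens=512)
            return out[0]["generated_text"][-1]["content"]  # the assistant's reply

        dutch_text = "..."  # the original Dutch input

        # 1) Dutch -> English
        english = ask("You are an expert in translating from Dutch to English.", dutch_text)
        # 2) perform the actual task on the English text
        result_en = ask("You are a helpful assistant.", f"Summarize the following text:\n{english}")
        # 3) English -> Dutch
        result_nl = ask("You are an expert in translating from English to Dutch.", result_en)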

    • @scitechtalktv9742 · 1 year ago · +1

      Translating Dutch to English (with the Llama 2 13B LLM) works fine, but translating that back to Dutch gives bad results, as I discovered yesterday in an experiment of mine.
      I used the Dutch text of the “Troonrede 2023” in that experiment.
      I will try fine-tuning on the target language as you suggested.

    • @MaartenGrootendorst · 1 year ago · +1

      @@scitechtalktv9742 Thanks for sharing these results! I think that fine-tuning on domain-specific data might already give nice results, especially since it can parse Dutch relatively well. Giving it more examples to work with might make the difference.

    • @scitechtalktv9742 · 1 year ago · +1

      I get a CUDA out-of-memory error when trying the PEFT part of the notebook. Can you recommend a way to still do the PEFT? Perhaps training on less data, or something else?
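
      Without knowing the notebook's exact setup, here is a hedged sketch of the usual knobs for fitting PEFT fine-tuning into limited GPU memory: loading the frozen base model in 4-bit, gradient checkpointing, and a tiny micro-batch with gradient accumulation. Names assume the transformers/peft/bitsandbytes stack, and the model name is illustrative:

        import torch
        from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments

        # Quantize the frozen base weights to 4-bit to shrink their memory footprint.
        bnb_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        )

        model = AutoModelForCausalLM.from_pretrained(
            "meta-llama/Llama-2-7b-hf",
            quantization_config=bnb_config,
            device_map="auto",
        )
        model.gradient_checkpointing_enable()  # trade extra compute for activation memory

        training_args = TrainingArguments(
            output_dir="peft-out",
            per_device_train_batch_size=1,     # smallest possible micro-batch
            gradient_accumulation_steps=8,     # keep the effective batch size up
        )
        # training_args would then be passed to the notebook's Trainer / SFTTrainer as usual.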

  • @diogosantos207 · 1 year ago

    Professor, excellent class. I had the following problem when running it: ImportError: cannot import name 'randn_tensor' from 'diffusers.utils' (/usr/local/lib/python3.10/dist-packages/diffusers/utils/__init__.py)
    Do you know what it could be?
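
    One hedged guess at the cause: newer diffusers releases moved randn_tensor out of diffusers.utils (it now lives in diffusers.utils.torch_utils), so a notebook written against the old location breaks. Pinning an older diffusers version or updating the import usually resolves it; the version bound below is an assumption:

      # Option 1: pin an older diffusers release in the notebook (version bound is a guess)
      #   !pip install "diffusers<0.21"
      # Option 2: import from the newer location instead
      from diffusers.utils.torch_utils import randn_tensor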

  • @malikrumi1206 · 1 year ago

    Maarten, I would like two things from you. First would be a detailed compare-and-contrast between your creations and SBERT. Are there any precautions or techniques for using them together? Second, I need to do domain adaptation. I expect to use a diverse segment of the corpus for the fine-tune, but I can't leave the training set out of the final model. Do you have any advice on how to handle this situation? I recently read about people using synthetic data to train on, but frankly that seems to me like an invitation to hallucinate. How would anyone keep the synthetic data out of the end user's results?
    Thanks.
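
    For context on the first request: KeyBERT (and BERTopic) typically sit on top of a sentence-transformers (SBERT) model rather than competing with it; the SBERT model supplies the embeddings and KeyBERT extracts keywords from them. A small illustrative sketch, with an assumed embedding model name:

      from sentence_transformers import SentenceTransformer
      from keybert import KeyBERT

      # The SBERT model provides the embeddings; KeyBERT extracts keywords on top of them.
      embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
      kw_model = KeyBERT(model=embedding_model)

      doc = "Large language models can be adapted to a new domain with parameter-efficient fine-tuning."
      print(kw_model.extract_keywords(doc, keyphrase_ngram_range=(1, 2), top_n=5))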