Dynamic Quantization with Unsloth: Shrinking a 20GB Model to 5GB Without Accuracy Loss!

  • Published 13 Jan 2025

COMMENTS • 6

  • @suryadivi3905 · 1 month ago +1

    Congratulations brother, hoping to see you in first place.

  • @A_Me_Amy · 1 month ago +1

    Great examination of this, I was wanting to see how this worked. So Llama is not only the real open AI, but they are also seemingly actively trying to make it easy for people to use and modify it. I should probably look into Llama more.

  • @testales · 1 month ago +1

    I hope they can and will implement this in Ollama ASAP. :-)

    • @PromptEngineer48 · 1 month ago

      Hmm

    • @chronicallychill9979 · 1 month ago

      It's at least easy to import any of these models after shrinking them, though; definitely something you can script without much hassle (see the sketch after this thread).

    • @testales · 1 month ago

      @chronicallychill9979 So in the end these are regular GGUF models that Ollama can load?
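
Picking up on the thread above: if the quantized output is a regular GGUF file (which is how Unsloth distributes its dynamic quants), importing it into Ollama can indeed be scripted. A minimal sketch in Python, assuming the Ollama CLI is installed; the GGUF path and model name are hypothetical:

```python
import subprocess
from pathlib import Path

# Hypothetical path to the dynamically quantized GGUF file produced by Unsloth.
gguf_path = Path("model-q4-dynamic.gguf").resolve()
model_name = "my-quantized-model"

# An Ollama Modelfile only needs a FROM line pointing at the local GGUF file.
Path("Modelfile").write_text(f"FROM {gguf_path}\n")

# Register the model with the local Ollama daemon.
subprocess.run(["ollama", "create", model_name, "-f", "Modelfile"], check=True)
```

Once registered, the model can be served like any other local model, e.g. `ollama run my-quantized-model`.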