How to Fine-Tune Llama-3.2 Vision Language Model on a Custom Dataset

  • Published 28 Jan 2025

COMMENTS • 18

  • @sarithamiryala2819
    @sarithamiryala2819 3 months ago +7

    Nice video

  • @ChandanKumar-nr2vm
    @ChandanKumar-nr2vm 3 months ago +5

    Thank you sir, this video helped me understand this model on my very first watch.

  • @georffreyarevalo3067
    @georffreyarevalo3067 3 months ago +4

    Good video. How can I test the model that I pushed to Hugging Face? Could you please share an example?

    • @nextGenAIGuy490
      @nextGenAIGuy490  3 months ago +5

      Thanks. You can use AutoModelForVision2Seq to load your model. You need to pass your model path and use a Hugging Face access token.
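      A minimal sketch of that loading step, assuming the model was pushed to the Hub (the repo id and token below are placeholders, not values from the video):

      ```python
      from transformers import AutoModelForVision2Seq, AutoProcessor

      # Hypothetical Hub repo id and access token -- replace with your own.
      MODEL_ID = "your-username/llama-3.2-vision-finetuned"
      HF_TOKEN = "hf_..."  # Hugging Face access token (placeholder)

      def load_finetuned_model(model_id: str = MODEL_ID, token: str = HF_TOKEN):
          """Download the fine-tuned vision model and its processor from the Hub."""
          model = AutoModelForVision2Seq.from_pretrained(model_id, token=token)
          processor = AutoProcessor.from_pretrained(model_id, token=token)
          return model, processor
      ```

      After loading, you can test the model by running the processor on an image plus a text prompt and calling model.generate(...) on the result.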

  • @budsayalaohapensaeng6869
    @budsayalaohapensaeng6869 20 days ago +1

    I have a question: when we fine-tune the model, we don't train the whole model, right? So, if that's the case, what should I do?

    • @nextGenAIGuy490
      @nextGenAIGuy490  17 days ago +1

      We only train the last few layers (classification head, projection layers) or other task-specific layers. In our case, as I explained in the video, the target modules are q_proj and v_proj (the query and value projections). As for what you should do: I can't tell without more detail. Explain your problem statement and then I can assist you.
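      The setup described above can be sketched with a PEFT LoRA config; only `target_modules` comes from the video, and the rank, alpha, and dropout values below are illustrative assumptions:

      ```python
      from peft import LoraConfig

      # Only the attention query/value projections get trainable LoRA adapters;
      # the rest of the model stays frozen.
      lora_config = LoraConfig(
          r=8,                                  # illustrative rank
          lora_alpha=16,                        # illustrative scaling factor
          lora_dropout=0.05,                    # illustrative dropout
          target_modules=["q_proj", "v_proj"],  # query and value projections (from the video)
          task_type="CAUSAL_LM",
      )
      # The base model is then wrapped with:
      # model = get_peft_model(base_model, lora_config)
      ```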

  • @tamilselvan3525
    @tamilselvan3525 1 month ago +1

    How long will the whole process take?

    • @nextGenAIGuy490
      @nextGenAIGuy490  1 month ago +2

      @@tamilselvan3525 I haven't trained to completion because of GPU limitations, so I can't give a number. I just wanted to show that training is possible and how to do it. Training time depends on the dataset, the hardware (GPU configuration), and the number of epochs you train for.

    • @tamilselvan3525
      @tamilselvan3525 1 month ago +1

      @@nextGenAIGuy490 Okay, thanks.

  • @soulaimanebahi741
    @soulaimanebahi741 2 months ago +1

    Thank you for the demonstration. Do you think we can fine-tune this model on video data?

    • @nextGenAIGuy490
      @nextGenAIGuy490  2 months ago +1

      @@soulaimanebahi741 No, we can't.

    • @babusd
      @babusd 1 month ago

      Absolutely wrong! If you don't know, say "don't know". Don't mislead him; fine-tuning over video is possible.

    • @nextGenAIGuy490
      @nextGenAIGuy490  1 month ago +1

      @@babusd Relax, bro. Do one thing: rather than just saying so, show the proof. During training of the Llama 3.2 vision model they used image and text pairs. Read the model architecture. And if you know better, show me where they have written that we can fine-tune the vision model on videos.

  • @mohammadaqib4275
    @mohammadaqib4275 3 months ago +1

    We are fine-tuning the Llama-3.2 vision model, but the collate function was utilising Qwen2.
    Is it fine to use the Qwen model in the collate function while fine-tuning Llama-3.2?

    • @nextGenAIGuy490
      @nextGenAIGuy490  2 months ago +1

      By customizing the collate_fn, we control how the data is prepared. We use it for batch processing, padding, and bringing the data into the format needed to train the model. It's fine to use it.
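      For illustration, a minimal sketch of such a collate_fn, assuming a Hugging Face processor and examples carrying `image` and `messages` fields (both field names are assumptions, not the exact code from the video):

      ```python
      def collate_fn(examples, processor):
          """Batch image-text pairs: apply the chat template, pad, and mask padding in the labels."""
          texts = [
              processor.apply_chat_template(ex["messages"], tokenize=False)
              for ex in examples
          ]
          images = [ex["image"] for ex in examples]
          batch = processor(text=texts, images=images, return_tensors="pt", padding=True)

          # Labels are the input ids with padding positions set to -100,
          # so the loss ignores them.
          labels = batch["input_ids"].clone()
          labels[labels == processor.tokenizer.pad_token_id] = -100
          batch["labels"] = labels
          return batch
      ```

      The key point is that the processor, not the collate function itself, owns the chat template and image preprocessing, so the processor must match the model being fine-tuned.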

    • @taido4883
      @taido4883 12 days ago

      I highly doubt that this could work. Different models have different chat templates and processing.