Multimodal RAG with Qwen-2 and ColPali: Ask Questions from Images 🔥

  • Published 27 Dec 2024

COMMENTS • 14

  • @mahajanvinod97 (3 months ago, +1)

    I’m encountering an issue where, when I ask a question, the system immediately searches the document for a solution. How can I prevent this? I want the LLM to first fully understand the problem before searching for an answer in the document. Could you please help me with this?

  • @samketola919 (3 months ago, +6)

    How can we extract images along with their figure captions from a PDF?

  • @Jogipraveen (2 months ago)

    I am getting the image along with some other text. How can we get only the exact image?

  • @IsmailIfakir (2 months ago)

    Is there any multimodal LLM that can be fine-tuned for sentiment analysis?

  • @mayukhbanerjee1147 (3 months ago)

    Where can I read about the architecture of RAGs?

  • @gerhardheinzerling9880 (2 months ago)

    Thank you so much for the video. Just great! We have PDFs with vector graphics in them, so we can't simply extract the images from the PDF. Any ideas?

  • @SnehaRoy-xf3zv (3 months ago)

    Interesting project

  • @proudestberozgaar (3 months ago)

    Can't we send multiple images in a single prompt to Qwen?

    • @Innovative_2001 (3 months ago)

      Try it, and let others know too.

    • @proudestberozgaar (3 months ago)

      @Innovative_2001 We can.

    • @mohammadaqib4275 (1 month ago)

      @Innovative_2001 Maybe you can merge multiple images (up to 3 would be fine) and then pass that single merged image.
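
      The merging idea in this reply could be sketched roughly as below. This is only an illustration, not something shown in the video: the function names `side_by_side_layout` and `merge_images` are hypothetical, the images are assumed to be Pillow `Image` objects, and a left-to-right strip is just one possible layout.

      ```python
      from typing import List, Sequence, Tuple

      Size = Tuple[int, int]  # (width, height) in pixels


      def side_by_side_layout(sizes: Sequence[Size], gap: int = 0) -> Tuple[Size, List[Size]]:
          """Return (canvas_size, top-left paste offsets) for a horizontal strip."""
          if not sizes:
              raise ValueError("need at least one image size")
          offsets = []
          x = 0
          for width, _height in sizes:
              offsets.append((x, 0))
              x += width + gap
          canvas_width = x - gap  # no trailing gap after the last image
          canvas_height = max(h for _w, h in sizes)
          return (canvas_width, canvas_height), offsets


      def merge_images(images, gap: int = 0, background="white"):
          """Paste Pillow images left to right onto one canvas (assumes Pillow is installed)."""
          from PIL import Image  # imported lazily so the layout helper works without Pillow

          canvas_size, offsets = side_by_side_layout([im.size for im in images], gap)
          canvas = Image.new("RGB", canvas_size, background)
          for im, offset in zip(images, offsets):
              canvas.paste(im.convert("RGB"), offset)
          return canvas
      ```

      The merged canvas can then be passed as the single image in the prompt. Note that a wide merged image may be downscaled before the model sees it, so small text on each page may become harder to read; that is probably why the reply suggests keeping it to about three images.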

  • @RedCloudServices (2 months ago)

    Can you make a video creating a chatbot with this method?