ColPali: Vision Language Models for Efficient Document Retrieval

  • Published 13 Oct 2024

COMMENTS • 39

  • @fusilad 2 months ago +18

    Yes...an example on this would be helpful

  • @percyscott6257 2 months ago +7

    Great video, a full implementation example would be awesome!

  • @theresalwaysanotherway3996 2 months ago +12

    This looks like it could be huge for enterprise applications where there is a large corpus of unstructured internal information that the model needs to be able to work with. My current method is alright, but multimodal embedding spaces just are not there yet for hundreds of similar-looking graphs. Personally I would be extremely interested in a video going through the implementation of this!

  • @BadBite 2 months ago +7

    Great! RAG with new methods and multi-agents with better reasoning, multi-models are very useful for academia

  • @loganhallucinates 17 days ago +1

    Thanks for the video! So the demo retrieved "pages", if we want the actual paragraph or sentence-level sources we have to do an additional retrieval on the retrieved pages, right? I saw your Gemini PDF video and was wondering how ColPali performs compared to that.

  • @Krassfoor 4 days ago

    Thank you for the video. It was very interesting.
    I would really appreciate a video on implementing this locally, please🙏

  • @henkhbit5748 2 months ago +1

    Seems a very good approach. Yes, it would be nice to do an end-to-end test with a local install and local documents. Thanks for the update👍

  • @IdPreferNot1 2 months ago +2

    Great explanation. Love that you tackle the details and bring what I believe is a little simpler clarity to the picture versus another favorite channel I rely on with a more theoretical bent (Code your own AI). Please do the follow-up video, as this sounds like a promising standard as compute grows. Is there any mention of a better way to feed the retrieved content into the LLM? I wonder whether feeding dense PDF pages of tables etc. into the interpreting LLM distracts from the original similarity patches cited.

  • @aditya_dev30 1 month ago +1

    This was a great video to help understand the complex architecture of ColPali in a simple way. Thanks. At the end of the video I had a question: is there a way to see which parts of the retrieved page ColPali is focusing on, like they showed in the paper? If there is, a video on that would be very helpful as a next part.

    • @engineerprompt 1 month ago

      That's a good point. I haven't looked into it, but I think there will be an implementation somewhere. Will explore it.

    • @aditya_dev30 1 month ago

      @engineerprompt Thanks so much.

  • @kunalsaurabh7968 2 months ago +2

    An example of usage on our own data would be greatly appreciated.

  • @jial.5245 2 months ago +1

    Thank you so much for sharing! Would love to see an example!

  • @dcmumby 2 months ago +3

    looking forward to more on this

  • @stavroskyriakidis4839 2 months ago +1

    Would love to see more about this

  • @bradlegassick9327 2 months ago +1

    Yes please, more examples👍

  • @tenvone 2 months ago

    Thanks for your videos! Would love to see a guide to run this locally.

  • @mogliff3414 2 months ago +3

    +1, an implementation would be helpful

  • @drpchankh 2 months ago

    This work is impressive! Thanks for sharing.

  • @ahmadzaimhilmi 2 months ago +1

    I hope you can dive deeper into this

  • @jdallain 2 months ago +1

    Very interested!

  • @CryptoMaN_Rahul 2 months ago

    Hi bhaiya!!
    I'm working on my final-year project. Basically, it has 2 ideas:
    1) An AI-powered previous-year paper analysis system and sample paper generation from current trends
    2) AI-powered notes generation from textbook content.
    There are 7 engineering departments in my college.
    I'm a little bit confused about what to use where: agentic RAG, fine-tuning, or something else?
    Please help me clear up my confusion.
    Thanks!!

  • @lionsinescanor405 7 days ago

    Does the speed of indexing depend on the GPU? Is there any way to speed up indexing by parallelizing it?

  • @venkateshratnaparkhe9328 1 month ago +1

    ColPali is better with fewer documents. I tried it on 500 documents and the results were not good. Maybe that is because I am using a vector DB instead of their custom evaluator. Can you suggest any vector DB that supports late interaction or multi-vector embeddings?
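
For readers unfamiliar with the term in the comment above: "late interaction" generally means that a query and a page are each kept as a set of vectors, and the score is the sum, over query vectors, of the best-matching page vector (often called MaxSim). The sketch below is a minimal pure-Python illustration of that scoring rule on toy 2-D vectors. The function names (`maxsim_score`, `rank_pages`) and the data are hypothetical; this is not the ColPali library's API or any particular vector DB's implementation.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """For each query vector, take its best dot product against all
    page vectors, then sum those maxima (late-interaction scoring)."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(
    query_vecs: Sequence[Vector], pages: Dict[str, Sequence[Vector]]
) -> List[Tuple[str, float]]:
    """Score every page against the query and sort best-first."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (assumed for illustration): 2 query vectors, pages with 3 and 2 vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]],
    "page_b": [[0.1, 0.1], [0.3, 0.2]],
}
ranking = rank_pages(query, pages)
print(ranking[0][0])  # name of the best-scoring page
```

A store with "multi-vector support" would need to keep all vectors for each page and apply a rule like this at query time, rather than comparing one pooled vector per page.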

  • @near_. 2 months ago

    Please make it. We are interested in it.

  • @jimmyjustintime3030 2 months ago

    LlamaIndex did a session on this and has a notebook you can improve on if you do make a follow-up!!

    • @engineerprompt 2 months ago

      Nice, please share that. Would love to make a follow-up.

  • @jarki714 2 months ago

  • @micbab-vg2mu 2 months ago

    interesting:)

  • @SundarRajendiran 2 months ago

    In the demo, I could see that it fetched images according to the query. But if we want to get an actual response using gen AI, how can we do that?

    • @engineerprompt 2 months ago

      You can feed the images into a multimodal model like GPT-4o or Gemini to generate the final response.

    • @SundarRajendiran 2 months ago

      @engineerprompt Thanks. Can you do a video of an implementation using a local dataset? It would be more helpful.
      Also, it is taking too much time (nearly 4 minutes) to process a 10-page PDF file, as the device I am using is torch.device("cpu"). Is there a way to make it faster for local use?

  • @criticalnodecapital 2 months ago

    Can I do a video with you? I made it work and can show you how I followed your instructions; it helped with my legal corpus, which is 120 GB.

  • @stressrelaxationmusicchann4638 2 months ago +1

    Please implement??