Multi-Modal RAG: Chat with Text and Images in Documents

COMMENTS • 25

  • @engineerprompt
    @engineerprompt  A month ago

    If you want to learn RAG Beyond Basics, check out this course: prompt-s-site.thinkific.com/courses/rag

  • @wtcbretburstjk3726
    @wtcbretburstjk3726 A month ago +2

    Thank you, keep it coming, chief. Great work!

  • @aa-xn5hc
    @aa-xn5hc A month ago +2

    These RAG videos are super interesting

  • @IdPreferNot1
    @IdPreferNot1 A month ago

    Such great code explanation and layout... so many Gist-able functions...thanks!!

  • @stressrelaxationmusicchann4638
    @stressrelaxationmusicchann4638 A month ago +2

    Hey, this is amazing. I kindly request that you upload some videos on how to work with PDF document extraction (text, tables, images, graphs, etc.) for RAG applications.

  • @alpcan3777
    @alpcan3777 20 days ago

    Thanks for the great video. Is it possible to take both an image and text as input from the user and query with them? For example, the user uploads an image of their car and asks about similar cars with the lowest price based on the uploaded image. The system then retrieves the related car images and text from the database.

  • @roip429
    @roip429 A month ago

    Excellent tutorial!
    Can you share the .ipynb, please?

  • @pratheekbabu272
    @pratheekbabu272 A day ago

    Hey, will this code not run on Windows, only in Colab?

  • @zoranProCode
    @zoranProCode A month ago

    Why is it exactly 10x better?! Maybe it's just better?

  • @AEismann-d6c
    @AEismann-d6c A month ago

    I wonder how long it will be before we can run this locally, and what a good model would be then. So far in my testing, nothing could compare to GPT-4... Thanks for the video

    • @free_thinker4958
      @free_thinker4958 A month ago

      Claude 3.5 Sonnet is far more performant than any other model right now

    • @engineerprompt
      @engineerprompt  A month ago +1

      Local vision models still have a long way to go. But hopefully we will have something "good enough" soon.

  • @Know_Ur_World
    @Know_Ur_World 25 days ago

    Can you use a PDF containing images instead of this separate text data and image data?

  • @VidishArvind
    @VidishArvind A month ago

    Can you make the same thing using free API models, because the GPT API isn't free? A guide to hosting it on a cloud would also be great: an end-to-end app deployed on the cloud.

  • @amanharis1845
    @amanharis1845 A month ago

    Hi, I had a small question. Don't LangChain's document loaders extract images from the document?

    • @engineerprompt
      @engineerprompt  A month ago +1

      No, by default it does not. You can use something like Unstructured.io, which can extract images and tables. Will create a video on it soon.

    • @amanharis1845
      @amanharis1845 A month ago +1

      @engineerprompt I have actually built a RAG chatbot using LangChain for my organisation. The PDFs that we load usually contain lots of tables and a few images. So far it is giving good responses from those PDFs. But yes, if there is a method to extract this non-text data more efficiently, I'll definitely want to integrate it with my chatbot.

    • @aadarshunniwilson8517
      @aadarshunniwilson8517 A month ago

      @engineerprompt Any updates on this?

  • @TheAstralftw
    @TheAstralftw A month ago +1

    This is a nice demo, but really useless in real-world scenarios, because you can maybe extract those images from a wiki but not from a specific PDF file. It is not very useful in real-world projects where you need to build a specific app, but it is still a good thing for someone who wants to learn.

  • @kishorethota9959
    @kishorethota9959 24 days ago

    Can we get the code?