Multimodal RAG!? - Pushing the Boundaries of AI

  • Published 26 Dec 2024

COMMENTS • 23

  • @wtcbd01
    @wtcbd01 6 months ago +3

    Thanks so much for sharing your knowledge with budding AI students and explaining everything step by step, particularly the Python functions, etc. You have a new subscriber.

  • @VastIllumination
    @VastIllumination 6 months ago +1

    Amazing research, Colab code, and overview of all the aspects. Really impactful new technology; you've got a new subscriber. Looking forward to more of these great videos with Colab code in the future.

  • @IdPreferNot1
    @IdPreferNot1 6 months ago

    Great video, good to see the expanded functionality for Chroma, thx.

  • @DDubyah17
    @DDubyah17 6 months ago +1

    Fantastic demo - I have so many things to try

  • @asithakoralage628
    @asithakoralage628 6 months ago +2

    Great content, and I learned a lot. Thanks, and subscribed too. Keep up the good work!

  • @nmstoker
    @nmstoker 6 months ago +1

    Would it be fairly easy to flip the RAG step round so that you prompt it with an image and the LLM is fed the associated text as part of the context?
    So extending the example here, a person could submit a picture of what they were wearing and it would give style tips for that?

    • @awakenwithoutcoffee
      @awakenwithoutcoffee 6 months ago +1

      Yes, this is possible. I would look into the "ActiveLoop" course on RAG, where they go over a similar use case. Good luck and have fun :)

  • @iconicallyinfamous
    @iconicallyinfamous 6 months ago +1

    It seems that something is returned from the image search regardless of the search term. I downloaded a Kaggle dataset of vehicles (trucks, cars, motorbikes and buses), put in the search term "Unicorn", and it still came back with images of cars. Is there any way to prevent this from happening?

    • @AdamLucek
      @AdamLucek 6 months ago

      Yes, that's still one of the nuances with it: a result will always be returned. What you can do is add a further filtering step after retrieval, either a distance cutoff or metadata (which can be attached through an additional kwarg with Chroma) that you can filter on as well.
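
A minimal sketch of the distance cutoff suggested in the reply above, assuming the result dictionary has the nested-list shape that Chroma's `collection.query` returns (`ids` and `distances`, with one inner list per query). The helper name `filter_by_distance`, the sample values, and the `0.8` threshold are all hypothetical; a sensible cutoff depends on the embedding model and distance metric and would need tuning on your own data.

```python
from typing import Any


def filter_by_distance(results: dict[str, Any], max_distance: float) -> list[list[str]]:
    """Keep only the ids whose distance is at or below max_distance, per query."""
    kept = []
    for ids, distances in zip(results["ids"], results["distances"]):
        kept.append([i for i, d in zip(ids, distances) if d <= max_distance])
    return kept


# Hypothetical query result (made-up ids and distances) for two text queries.
# The first query has close matches; the second has none that are close.
sample_results = {
    "ids": [["car_01", "bus_07"], ["car_01", "truck_03"]],
    "distances": [[0.35, 0.62], [1.41, 1.48]],
}

# The 0.8 threshold is an assumption for illustration only.
matches = filter_by_distance(sample_results, max_distance=0.8)
print(matches)  # [['car_01', 'bus_07'], []]
```

An empty inner list means nothing was close enough to the query, which the application can report as "no match" instead of showing unrelated images.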

    • @iconicallyinfamous
      @iconicallyinfamous 6 months ago +1

      @AdamLucek Thanks for that, Adam. I've also put the same question to the HELP section on their Discord server. Maybe someone there can suggest something too.

  • @RolandoLopezNieto
    @RolandoLopezNieto 6 months ago +1

    Great video, subscribed.

  • @sw-ln1hh
    @sw-ln1hh 5 months ago

    very awesome thank you for your content

  • @julianpicon243
    @julianpicon243 5 months ago

    Amazing! Thanks

  • @Pure_Science_and_Technology
    @Pure_Science_and_Technology 6 months ago

    You could loop through a folder of photos and have a vision model provide the context for each image as you're vectorizing them. Then you can search your images using natural language.

    • @AdamLucek
      @AdamLucek 6 months ago +1

      We're searching over the photos using natural language here without the need for a context generation step, using zero-shot image classification models. What you said is possible, but it's an unneeded step with this method! It could still be useful for generating metadata for further retrieval filtering, though.

  • @TheUKROPpp
    @TheUKROPpp 6 months ago

    great video!

  • @jjen9595
    @jjen9595 6 months ago

    You are insane, because you could have just used OpenCLIP to make the project, but you decided to train it with a clothing dataset to make it better. Crazy, hahaha.

  • @thesimplicitylifestyle
    @thesimplicitylifestyle 6 months ago

    😎🤖

  • @alexandrtortik
    @alexandrtortik 6 months ago

    ♥🎯♦

  • @БаллРабот
    @БаллРабот 6 months ago

    A dataset with 27 classes. Wouldn't it be easier to give a homeless person a laptop and have him label the data for a couple of bottles of vodka?

    • @БаллРабот
      @БаллРабот 6 months ago

      And then he'd drink away the 500-ruble laptop.