Intro to RAG for AI (Retrieval Augmented Generation)

  • Published Jul 2, 2024
  • This is an intro video to retrieval-augmented generation (RAG). RAG is great for giving AI long-term memory and external knowledge, reducing costs, and much more.
    Be sure to check out Pinecone for all your Vector DB needs: www.pinecone.io/
    Join My Newsletter for Regular AI Updates 👇🏼
    www.matthewberman.com
    Need AI Consulting? 📈
    forwardfuture.ai/
    My Links 🔗
    👉🏻 Subscribe: / @matthew_berman
    👉🏻 Twitter: / matthewberman
    👉🏻 Discord: / discord
    👉🏻 Patreon: / matthewberman
    👉🏻 Instagram: / matthewberman_ai
    👉🏻 Threads: www.threads.net/@matthewberma...
    👉🏻 LinkedIn: / forward-future-ai
    Media/Sponsorship Inquiries ✅
    bit.ly/44TC45V
  • Science & Technology

COMMENTS • 416

  • @matthew_berman 12 days ago +19

    What's your favorite use case for RAG?

    • @HanzDavid96 12 days ago +8

      Giving the LLM/Agents a mind for long term planning and remembering stuff associatively. The memory is the half agi within the generative multiagentic system where the LLM is the context processor.

    • @FunwithBlender 12 days ago +25

      I specialize in Retrieval-Augmented Generation (RAG). Your introduction is good, but it lacks technical depth. You glossed over chunking and how to use it correctly based on the data. Pinecone is good, but it's not necessarily better than vector databases built in Rust or Go, like Qdrant and Weaviate (which are free and open source). It's also important to explain in-memory vector database solutions using tools like FAISS or on-disk solutions like Qdrant and Pinecone, and to discuss the pros and cons of each.
      A significant omission is not addressing implicit behavior or implicit data versus explicit data, and their relationship with graph databases. Rerankers might be too advanced a concept; often, you can achieve better results by optimizing chunking, similar to how tokenization is used for semantic understanding. Often, agents are unnecessary, and having a chain-of-thought agent before sending to the LLM can be a waste. Additionally, discussing the similarities between the internals of a transformer and a vector database is intriguing. Overall, the video feels like a Pinecone sponsorship.
      Regarding fine-tuning, it's about improving the understanding or behavior of an LLM in a specific domain at the cost of losing understanding in other areas. You should only fine-tune if the model does not seem to understand. Use RAG when the model lacks knowledge or when you want to reduce hallucinations, but relying solely on vector databases is a missed opportunity. One micro aspect you did not touch on is tokenization. The two biggest things people often overlook are chunking and tokenization, and there are massive gains to be made when these are properly understood.
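
The chunking point in the comment above can be illustrated with a minimal sketch. It assumes the simplest strategy, fixed-size word windows with overlap; the name `chunk_words` and the default sizes are hypothetical, and real pipelines often split on sentences, headings, or model tokens instead, depending on the data.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_words(sample, size=4, overlap=1):
        print(chunk)
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of storing some words twice.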

    • @Spudster3 12 days ago +3

      Using my local scanned (searchable) PDF documents in RAG.

    • @FunwithBlender 12 days ago +2

      One good use is e-commerce products for conversational shopping, creating new experiences. I built a few prototypes of this as MVPs for pitches. It's a night-and-day experience.

    • @dakotaep1 11 days ago +2

      @FunwithBlender Great comment!
      What is your go-to open-source RAG pipeline? I am beginning to learn and discover all these tools. It is pretty amazing.

  • @ICProfessional 12 days ago +171

    A full tutorial on RAG would be great.

    • @paelnever 11 days ago +16

      Yeah, and one with open-source tools would be great, not an advertorial for a closed-source company.

    • @flying-higher 11 days ago +2

      @paelnever GPT4All has a new vector tech I'm playing with.

    • @ripstar2 11 days ago

      I would love to see this. I do process automation for companies with a combination of AI tools and Zapier. RAG opens up a ton of new opportunities for my clients.

    • @gligoran 10 days ago

      I would love a full RAG tutorial as well, but maybe first without Pinecone. The missing piece for me is how to embed large documents. Do you have to split them into sections or how does that work?

    • @expchrist 10 days ago +1

      Please do a tutorial on RAG using Pinecone!

  • @dombayo 12 days ago +101

    A vector database tutorial would be great! Excellent content.

    • @gabrielsandstedt 11 days ago +6

      You can ask Claude 3.5 to create a locally run vector database. It will manage it in a day, and you will avoid having to pay for another cloud service. I did it and it worked.

    • @fabrizio-6172 7 days ago

      Great @gabrielsandstedt

  • @Dant110 12 days ago +45

    I would like a deeper dive into RAG and an end-to-end Pinecone tutorial! Thanks for the great video!

    • @gabrielsandstedt 11 days ago

      You could use Pinecone, but Claude 3.5 can build you a custom vector search algorithm that will work, and you can store the data locally using SQLite.
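
As a rough illustration of the local SQLite idea in this thread, here is a minimal sketch. It assumes the vectors come from some embedding model that is out of scope here; the class name `SqliteVectorStore`, the table layout, and the JSON encoding are hypothetical choices, and the search is a brute-force scan, so it only suits small collections.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SqliteVectorStore:
    """Stores text chunks and their vectors in a single SQLite table."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS chunks ("
            "id INTEGER PRIMARY KEY, text TEXT NOT NULL, vector TEXT NOT NULL)"
        )

    def add(self, text, vector):
        # Vectors are serialized as JSON text; a binary format would be more compact.
        self.db.execute(
            "INSERT INTO chunks (text, vector) VALUES (?, ?)",
            (text, json.dumps(list(vector))),
        )
        self.db.commit()

    def search(self, query_vector, k=3):
        """Return the k most similar chunks as (score, text), best first."""
        rows = self.db.execute("SELECT text, vector FROM chunks").fetchall()
        scored = [(cosine(query_vector, json.loads(vec)), text) for text, vec in rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    store = SqliteVectorStore()
    store.add("cats", [1.0, 0.0])
    store.add("dogs", [0.0, 1.0])
    print(store.search([1.0, 0.0], k=1))
```

Passing a file path instead of `":memory:"` keeps the index on disk between runs. A dedicated vector database adds approximate nearest-neighbor indexes, which is what makes search fast at larger scale.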

  • @positivevibe142 12 days ago +105

    That's great! PLeaaaaaaaaaaaaaaaaaase, build a LOCAL PRIVATE version that uses open source models, not API or any cloud thing!

    • @JustinsOffGridAdventures 12 days ago +3

      Look at Matt's older videos. He shows you how to use local models like Llama 3 as well as RAG tools without an API key. Before I got out here in the wilderness I had set myself up with a pretty good AI testing laboratory. I had to switch gears from building race cars and AI testing platforms to chopping down trees.

    • @lucidzfl 12 days ago +3

      We run Weaviate. It's phenomenal locally.

    • @positivevibe142 12 days ago +1

      @JustinsOffGridAdventures
      Wilderness, chopping down trees, nature, greenery, fresh air, away from technology.... 🤔!!!!! Sounds to me like you did the right thing and are truly living this life!
      Normally people spend their entire lives at jobs, waiting to retire and then move out to enjoy their lives, while you took the shortcut.
      Good for you, Justin.

    • @positivevibe142 12 days ago

      @lucidzfl
      I've tried many available options, but not this one! I'll give it a try. Thanks. If you don't mind me asking, I had some problems with the other options I used like: inaccurate information retrieval, frequent "no info found" messages, significantly smaller answer sizes compared to my input text, and difficulty handling large files (around 40K words each). Should I expect better results from Weaviate compared to the other options I've tried?

    • @lucidzfl 11 days ago

      @positivevibe142 So I do a boatload of RAG and there are many ways to do it.
      When it comes to Weaviate I leave the blobs fairly short (

  • @JustinsOffGridAdventures 12 days ago +12

    Great video! I've been following you for a while and have set up some edge LLMs using your tutorials. RAG is the future for any business wanting to truly utilize its data to the fullest. I think that a lot of companies aren't even sure how they can put their data to work for the greater good of the business while saving money at the same time. Videos like this help clarify the subject. Please do a video on Pinecone. I'm sure there are a lot of us who would like to see its capabilities. Keep up the great work.

  • @forifand 11 days ago +8

    A full tutorial would be great - thanks so much 👍

  • @ErickJohnson-qx8tb 12 days ago +11

    YESSS DO ITT PLEASE 🙏

  • @JulioCesarjcfalcone 12 days ago +11

    I would love to see a tutorial on how to use RAG! I was just thinking about how to solve part of this knowledge problem on a small project I'm working on.

  • @ytrew9717 12 days ago +11

    Very well explained: short and clear with good examples, thanks!

  • @User-actSpacing 11 days ago +3

    What a great commercial

  • @nareshtaneja7038 11 days ago +4

    Thank you for making this video. I am a non-techie trying to find an easy-to-understand method of querying my documents using RAG with open-source LLMs. I eagerly await your full tutorial on this topic.

  • @mcarrusa 11 days ago +4

    PLEASE do the how-to on setting this up. It is a key piece to the puzzle, for sure. Thank you for all the great content!

  • @shuntera 12 days ago +8

    I'd be interested to see best practices for keeping the RAG database up to date. For example, if a new PDF is dropped into a watched folder, the PDF gets submitted to the embedding model automatically. Likewise, PDFs that are out of date and removed should then be dropped from the vector database.

    • @antaishizuku 11 days ago

      You could add a usage count, entered date, last accessed date, etc. and have a background thread check for old info. Say, 2-3 years, unless it's something your LLM wouldn't know.
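
The watched-folder idea in this thread can be sketched as a comparison between the files on disk and a manifest of what has already been embedded. This is only one possible approach: `plan_sync`, `file_digest`, and the manifest format (file name mapped to a SHA-256 digest) are hypothetical, and the actual embedding and deletion calls depend on the vector database in use.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, used to detect changed files."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_sync(folder: Path, indexed: dict[str, str], pattern: str = "*.pdf"):
    """Compare a folder with the manifest of already-embedded files.

    `indexed` maps file name -> digest recorded when the file was embedded.
    Returns (to_embed, to_delete): names that are new or changed on disk,
    and names that are in the manifest but no longer on disk.
    """
    on_disk = {p.name: file_digest(p) for p in folder.glob(pattern) if p.is_file()}
    to_embed = sorted(name for name, digest in on_disk.items()
                      if indexed.get(name) != digest)
    to_delete = sorted(name for name in indexed if name not in on_disk)
    return to_embed, to_delete


if __name__ == "__main__":
    to_embed, to_delete = plan_sync(Path("."), indexed={})
    print("embed:", to_embed)
    print("delete:", to_delete)
```

A scheduled job or a file-system watcher could call `plan_sync` and then update the manifest once the embedding and deletion steps succeed. The usage-count and last-accessed fields suggested above would be extra columns alongside the digest.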

  • @AbdulMajeed-lf5sq 11 days ago +2

    This is one of the best videos I watched from you as a junior AI engineer 👌🏼 BEAUTIFUL

  • @dennis383838 12 days ago +8

    RAG tutorial please, especially for the use case of a local open-source LLM. Thanks!

    • @dennis383838 11 days ago

      With long term memory implementation, as well. All open source, please.

  • @dcmumby 11 days ago +2

    RAG requires a knowledge graph DB as well in order to find information not directly mentioned, which is a limitation of RAG. A tutorial incorporating both would be amazing.

  • @bitcloud2304 6 days ago

    Just discovered this channel and it quickly leapfrogged others as one of my favorite AI channels. I'm a Data Scientist starting to work in the LLM arena and these videos are super helpful. I'd love a full tutorial on RAG!

  • @Idea-LabAi 10 days ago +1

    I would also like more tutorials on RAG and techniques to improve chatbots. Thanks Matthew for this content. I like your posts on news but tutorials are also useful and appreciated given your ability to communicate such concepts.

  • @afonsolfm 8 days ago +1

    Great videos, man! Listening to them every day now.

  • @jack.splash2334 11 days ago +2

    A tutorial would be amazing! It’s exactly what I need for something I wanted to experiment with

  • @BrankoPetrovic-f2z 11 days ago +1

    I've heard about RAG before, but this video helped me understand it much better. Thank you for sharing your knowledge! I would greatly appreciate it if you could make another video demonstrating how to use it with a real-life example

  • @samtabby3373 12 days ago +1

    I like your style of explaining things. Thank you for your videos as I've learned a lot from you.

  • @paultoensing3126 10 days ago

    Yes! Please set up a full tutorial for us. This is powerful. I have a Custom GPT business and I’ve always known I need to incorporate RAG in the most pragmatic way possible to advance my capabilities. So it sounds like Pinecone is the way to go. Thanks so much for your help.

  • @youdaloser1 3 days ago

    100% on board with seeing a full tutorial. Also highly interested in seeing a fully open-sourced setup.

  • @dieyoung 10 days ago

    This is exactly what I've been looking for! Thanks so much for this

  • @middleman-theory 9 days ago

    Yes, we need a full tutorial please. This is great knowledge and a very simple-to-understand video! I actually have a Pinecone account, and started using it when I first started playing around with Auto-GPT, but I haven't used it since. I'm interested in developing some new projects soon, and RAG sounds like something I need to be thinking about.

  • @bobwarfieldoz 9 days ago

    Yes please, more information about Pinecone and RAG! Great content, thanks!

  • @davidlavin4774 11 days ago +1

    Slight pet peeve of mine - I think presenting it this way makes it sound like you must use an embedding model/vector db to do RAG. The basic version of RAG is just that idea of passing additional, retrieved info with the prompt to the LLM. Yes, the embedding model w/ vector db is a very efficient way of doing that - especially with large amounts of data. But it is not the only way to accomplish it, and may not even be the best way to do it, depending on the use case.
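
To make the point in this comment concrete, here is a minimal sketch of RAG with no embedding model or vector database at all. Retrieval is a simple word-overlap ranking, and the "augmentation" is just pasting the retrieved text into the prompt. The function names and the prompt wording are hypothetical, and the call to an actual LLM is left out.

```python
import re


def words(text: str) -> set[str]:
    """Lowercased set of word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many distinct question words they contain."""
    q_words = words(question)
    scored = [(len(q_words & words(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]  # drop documents with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(question: str, documents: list[str], k: int = 2) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    docs = [
        "The return policy allows refunds within 30 days.",
        "Shipping takes five business days.",
    ]
    print(build_prompt("What is the return policy?", docs))
```

Swapping `retrieve` for a vector search, a SQL query, or a web search leaves `build_prompt` unchanged, which is the sense in which the vector database is one retrieval option among several.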

  • @thecobrasnakes 11 days ago

    Yes, we want a tutorial! Amazing content, thank you!

  • @tchadcarby8439 9 days ago

    Thank you for your hard work, Matthew! Please do videos on all the suggestions that you made in this video.

  • @jprak123asd 11 days ago

    Brilliant!! Yes, a deeper dive will help

  • @sahilverma9330 11 days ago

    Finally, an explanation without complex terminology. Thank you, Matthew. Let's do one with RAG + Agents.

  • @williamross4062 8 days ago

    A full tutorial is NEEDED

  • @studiophantomanimation 11 days ago

    Claude's new Projects feature is like a simple RAG. I've given it all the knowledge about a novel I'm working on and it has been surprisingly good at understanding all the nuances. Way better than a normal conversation.

  • @lydiayuna9155 10 days ago

    This is by far the best AI educational video!!
    Please share more RAG solutions. This will be very, very useful for your audience!!

  • @michaeldolmos 10 days ago

    Love to see a full tutorial!

  • @KonradTamas 12 days ago +3

    YeYe, do the Tutorial

  • @youcandosomethingaboutit 12 days ago +2

    00:02 An intro to RAG and its misunderstood nature
    01:51 RAG is efficient for continually providing new knowledge to large language models
    03:42 RAG enables adding external knowledge to AI models
    05:29 RAG allows AI to access and incorporate new information into its responses.
    07:25 Utilizing embedding models to enhance AI understanding
    09:12 RAG enhances AI by providing external knowledge sources
    11:10 Utilizing external knowledge for AI searches
    12:57 RAG simplifies retrieval augmented generation process

  • @stuffaboutthings8679 11 days ago

    Yes! To all of the walkthroughs on setting up local RAG LLMs and mixed agents.

  • @piparsforever 12 days ago +2

    Yes, please show an advanced RAG solution including ranking and SQL usage.

    • @Sven_Dongle 11 days ago

      Come up with an index, store the data as a BLOB, then use SQL to retrieve it and add it to the prompt.

  • @andredinizwolf7076 12 days ago +4

    Great knowledge!! Please create a new video about Pinecone.

  • @TheAstralftw 11 days ago

    Great stuff. Thanks

  • @JeffParkerTexas 9 days ago

    Yes, please do a step-by-step guide!!!
    Thank you!

  • @garic4 11 days ago

    On YouTube, there are hundreds of channels throwing around baffling buzzwords and lame tutorials about these concepts without putting real effort into creating meaningful videos. This channel is not one of those.
    I appreciate your videos, Matt. Thank you for the great content.

    • @garic4 11 days ago

      Oh, and please publish both tutorials, Pinecone and more RAG applications. Those are the future, and using agents with them is golden for the near future for all of us.

  • @TheLegomom2 6 days ago

    Yes, definitely need to expand on RAG, vector databases, and Pinecone. Full end-to-end process for incorporating specific business data sets to generate highly customized content. Creative/marketing use case if possible.

  • @rahuljauhari3240 11 days ago

    amazing explanation of RAG thank you!!

  • @brianWreaves 11 days ago

    🏆 Very helpful, with just the main points... love it! As with others, looking forward to more details.

  • @Larimuss 1 day ago

    Would love a full RAG tutorial. Thanks for the great video.

  • @FullEvent5678 10 days ago

    I'd be very happy to see the whole process presented in a video ♥

  • @Rw223x 11 days ago

    Thanks!

  • @Maltesse1015 6 days ago

    Looking forward to the tutorial 🎉!!

  • @luizcamillo9933 11 days ago

    This is a great and very easy to understand explanation. Please make a full tutorial!

  • @levicarr8345 11 days ago

    I would really appreciate more videos following this rabbit hole (RAG, Pinecone, knowledge graphs, LangChain).

  • @patrickbowen8408 10 days ago +1

    Yes, a full tutorial on RAG and Pinecone. Provide details on keeping private data private.

  • @plantbasedman 11 days ago

    definitely want a deeper dive

  • @fasteddiegarcia1 4 days ago

    Yes please create a tutorial video showcasing step by step instructions around practical techniques for RAG, local open source vector databases, and automations

  • @fourlokouva 11 days ago

    Great explanation of RAG and how it differs from fine-tuning and prompt engineering

  • @bitsie_studio 11 days ago

    Would absolutely love to see a tutorial on this. Thanks for doing something more technical like this, Love it!

  • @BenoitStPierre 11 days ago

    The OpenAI Dev Days from last year had a great session on optimizing LLMs. Their progression was to try few-shot, then RAG, then fine-tuning - and their description of fine-tuning was that it was a good way to provide "intuition" to the model, but not knowledge.

  • @BigBadBurrow 11 days ago

    Thanks, Matt, interesting concept. A video tutorial would be great!

  • @antaishizuku 11 days ago

    I have been working on a ChromaDB vector database, so this is awesome! Thanks!

  • @Copa20777 12 days ago

    This topic is the kind of knowledge everyone thinks they have and brushes over. Thanks, Matthew.

  • @basedbuz 11 days ago

    I have said that it's less about compute power and now about organization of data and mimicking the brain.
    This is one way to do it

  • @BeTheFeatureNotTheBug 9 days ago

    Yeah deeper dive!

  • @jk-2033 11 days ago

    This was very interesting and a full step by step video would be very helpful!

  • @shonnspencer1162 11 days ago

    Please continue to educate us and show us the RAG vectoring tutorial. Great video!

  • @DrFukuro 11 days ago +2

    Do it, but without Pinecone, with open-source, locally working tools only.

  • @gustavdreadcam80 11 days ago

    I'm definitely interested in doing RAG, but more so in doing it locally. Especially with important information, I can't trust a service to store it. If there is a local way of doing it, I'd be very interested in building a RAG pipeline. Great video for explaining the basics of it.

  • @ianvecmanis5642 12 days ago

    I'd like you to expand on this, Matt! Thanks!

  • @stonibeauchamp4588 9 days ago

    Full tutorial would be fantastic!

  • @ignaciopincheira23 3 days ago

    It is essential to preprocess the documents thoroughly before entering them into the RAG pipeline. This involves extracting the text, tables, and images, and processing the images through a vision module. Additionally, it is crucial to maintain content coherence by ensuring that references to tables and images are correctly preserved in the text. Only after this processing should the documents be passed to an LLM.

  • @PersianMate 10 days ago

    yes please! I’d like to see a full tutorial on how to do the whole process

  • @attilazimler1614 10 days ago

    Hi, thanks for the video, a deeper dive would be interesting :) thanks :)

  • @alanmorgan2536 12 days ago +1

    I've been dreaming about using RAG to compile the summary of key references I use in my profession (Geophysical interpretation). Obviously, professionals may not utilize every key learning from published materials and some information may be conflicting with other published materials in the same field. What would be immensely useful is a method of adding weights to information you utilize on a daily basis and to identify where an AI finds conflicts in logic. If a conflict is found, a model can be taught which path to follow.

  • @laurenceturpin1409 11 days ago

    An excellent tutorial. I would really like you to do a deeper dive into RAG and show how you would set it up.

  • @IamiAGorynT 11 days ago

    Great video. A step-by-step video on RAG and Pinecone would be great! 👍

  • @Pwelican 11 days ago

    Yes, please set up a full tutorial.

  • @lasithchandrasekara5200 11 days ago

    Great video. Please do a deeper dive into RAG and, later, a DSPy video as well.

  • @ProxyBalls 11 days ago

    YES!!! Tutorial please

  • @dizzident 11 days ago

    I would kill for a full RAG tutorial...

  • @jr21294 11 days ago

    For search, there are two ways to do it: lexical or semantic search. RAG can also be used with lexical search
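
As a sketch of the lexical side of this comment, the block below scores documents against a query with BM25, a common keyword-ranking formula. The implementation is a compact illustration written for this example, not taken from any library; `bm25_scores`, `tokenize`, and the default `k1` and `b` values are conventional but assumed here.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercased word tokens."""
    return re.findall(r"\w+", text.lower())


def bm25_scores(query: str, documents: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Return one BM25 score per document; higher means a better keyword match."""
    docs = [tokenize(d) for d in documents]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


if __name__ == "__main__":
    corpus = [
        "vector databases store embeddings",
        "lexical search matches exact keywords",
    ]
    print(bm25_scores("lexical keywords", corpus))
```

The top-scoring documents can be passed to the LLM exactly as vector-search results would be. Lexical scoring tends to do well on exact names and codes, while semantic search handles paraphrases, which is why some systems combine the two.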

  • @gsmorgan 11 days ago

    A deeper dive on how to set up RAG with Pinecone and an embedding model would be great!

  • @KiLVaiDeN 9 days ago

    A clever way to make an ad, here for Pinecone, by delivering knowledge. It's much more acceptable this way. Well done, and thanks for the intro to RAG :) The people at Pinecone must be proud of this video.
    I just have to say that it's more about giving AI an optimized context than truly giving it a "memory". The title feels a bit misleading. A real memory would be a workable space where the AI itself stores the required data for later retrieval, and which becomes part of its infrastructure. This is not it.

  • @svetoslavlyubenov8521 9 days ago

    It would be great to do a full tutorial. If you add multimodal RAG and agent functionality, it will be even better.

  • @businessresearch520 12 days ago

    Wowza, I think I've been sorta doing this without realizing it lol.

  • @TrevorMatthews 11 days ago

    OK, that was awesome. Of course I'd like to know more! I've had a hard time understanding RAG until now for some odd reason. Would also love a tutorial on Pinecone and embedding.

  • @id10tothe9 8 days ago

    Yes, please give us the tutorial!

  • @bradstudio 10 days ago

    PLEASE DO A FULL RAG SETUP TUTORIAL!! 🔥

  • @PureMoss 10 days ago

    Would love to see both the tutorial and deeper dive using RAG

  • @dimadavidoff 11 days ago

    I would love to see Pinecone setup!

  • @user-gh3di2rc3o 11 days ago

    Berman seems happy today, but watch out when he is on the RAG.

  • @ricktapf.4474 10 days ago

    Tutorial - yes please!!

  • @naetuir 9 days ago

    I would love to see a full tutorial using Pinecone.

  • @jsirius3783 12 days ago

    This is incredible. You've inspired me to learn Python. I want to work with these frameworks.

  • @RikHeijmen 11 days ago

    Yes! Do the tutorial pls

  • @HIIIBEAR 12 days ago

    Thanks for all you do! AGI is coming!

    • @Sven_Dongle 11 days ago

      Lol, it's not even breathing hard.

    • @HIIIBEAR 11 days ago

      @Sven_Dongle I didn't say when, so no one asked what you think.

    • @Sven_Dongle 11 days ago

      @HIIIBEAR Tough nutz, doosher.

    • @Sven_Dongle 11 days ago

      @HIIIBEAR lol, bonewad

  • @strazzi2 10 days ago

    A deeper dive into RAG and embeddings would be a great help for developers like me. I work in C# with GPT-4o and I use REST rather than Python, but then OK, you can't always get what you want 🙂

  • @dawiesnyman3939 9 days ago

    Would love a tutorial please. Love your content

  • @corytimm142 7 days ago

    I would love to see a video on how to do all of this with open source software that I can run locally. A project combining RAG with Ollama models would be awesome