What is RAG? (Retrieval Augmented Generation)

  • Published 17 Jan 2024
  • How do you create an LLM that uses your own internal content?
    You can imagine a patient visiting your website and asking a chatbot: “How do I prepare for my knee surgery?”
    And instead of getting a generic answer from just ChatGPT, the patient receives an answer that retrieves information from your own internal documents.
    The way you can do this is with a Retrieval Augmented Generation (RAG) architecture.
    It’s not as complex as it sounds and I’m breaking down how this very popular solution works in today’s edition of #CodetoCare, my video series on AI & ML.
    My next video will be on a use case of AI in healthcare - what do you want to hear about from me?
    #AI #artificialintelligence #LLM #genai
    Check out my LinkedIn: / donwoodlock
    ---
    ABOUT INTERSYSTEMS
    Established in 1978, InterSystems Corporation is the leading provider of data technology for extremely critical data in healthcare, finance, and logistics. Its cloud-first data platforms solve interoperability, speed, and scalability problems for large organizations around the globe.
    InterSystems Corporation is ranked by Gartner, KLAS, Forrester and other industry analysts as the global leader in Data Access and Interoperability. InterSystems is the global market leader in Healthcare and Financial Services.
    Website: www.intersystems.com/
    YouTube: / @intersystemscorp
    LinkedIn: / intersystems
    Twitter: / intersystems
  • Science & Technology

COMMENTS • 134

  • @dwoodlock
    @dwoodlock  27 days ago +4

    Since this video turned out to be so successful and several people requested for me to do a deep dive / demo, here it is! Looking forward to reading your comments and hope you enjoy this one too. ua-cam.com/video/P8tOjiYEFqU/v-deo.html

  • @vaidyanathtdakshinamurthy8732
    @vaidyanathtdakshinamurthy8732 4 days ago

    Hello Don Sir, thanks for this explanation. You're a blessed master craftsman. Simple and precise description and to the point.

  • @gt6808a
    @gt6808a 2 days ago

    This has been the most helpful video I've found to help me understand how RAG works. Thank you so much for your wonderful explanation!

  • @hussamcheema
    @hussamcheema 2 months ago +35

    One of the best explanations of RAG on YouTube. Thanks Don.

    • @NicolaiDufva
      @NicolaiDufva 1 month ago +2

      I agree. Most other explanations are either way too detailed with live coding that muddles the information or way too high-level talking about how the LLM retrieves the additional data (which it doesn't! it is given to it via the prompt!)

  • @CodeVeda
    @CodeVeda 2 months ago +25

    Finally someone is explaining with a real-world example. Otherwise everyone else takes an example of fruits (apples, oranges etc.) or movie names etc.

  • @califfa2419
    @califfa2419 3 days ago

    what a great explanation of RAG! Thank you

  • @christopherhunt-walker6294
    @christopherhunt-walker6294 19 days ago +5

    Wow he has explained this really clearly. This is the missing link for me between LLMs and making them actually useful for my projects. Thank you!

  • @eahmedshendy
    @eahmedshendy 3 months ago +22

    Not confusing at all, just a simple, get-to-the-point explanation, thank you.

  • @MateoGarcia-rt7xt
    @MateoGarcia-rt7xt 9 days ago

    Thanks for this great explanation, Don!

  • @govindarajram8553
    @govindarajram8553 14 days ago

    I just watched this from YouTube's suggestions, and you gave me a good explanation of retrieval augmented generation that is close to my use case. Thank you

  • @dharmakelleherauthor
    @dharmakelleherauthor 1 day ago

    Thank you so much! This is a great, easy-to-follow explanation. Coincidentally, I'm having knee surgery tomorrow. LOL.

  • @longship44
    @longship44 1 month ago +1

    This is one of the best explanations of large language Models and the value of utilizing RAG I have seen. Don, you are an outstanding communicator. Thank you for taking the time to put this together.

  • @MuthukumaranPanchalingapuramKo
    @MuthukumaranPanchalingapuramKo 1 month ago +2

    Best content on RAG!! Thank you!

  • @bhaskarmazumdar9478
    @bhaskarmazumdar9478 26 days ago +2

    This is an excellent explanation of the concept. Thank you Don

  • @mzimmerman1988
    @mzimmerman1988 7 days ago +1

    well done! thanks.

  • @BAZ82
    @BAZ82 2 months ago +5

    I found your video to be the most accessible and informative introduction to RAG, especially for those new to this topic.

  • @user-ts2sj2dg8t
    @user-ts2sj2dg8t 3 months ago +6

    Thank you. You are the first to explain RAG well. I have heard about it a lot without understanding what it means.

  • @aryankushwaha9306
    @aryankushwaha9306 1 month ago +1

    One of the best explanations I have ever found. Now I finally understand what RAG is, and thank you so much Mr. Don

  • @mattius459
    @mattius459 26 days ago +1

    Great, thank you.

  • @MrNewAmerican
    @MrNewAmerican 3 months ago +6

    This is probably the best tutorial I have watched. Period. What an amazing teacher!

  • @latentspaced
    @latentspaced 1 month ago

    Appreciate you and your content. I'm glad I found you again

  • @easybachha
    @easybachha 2 months ago +1

    Excellent explanation. Exactly what I was looking for! Thank you, Don!

  • @m.abdullahfiaz9635
    @m.abdullahfiaz9635 1 month ago +1

    Thanks, Prof. Don Woodlock. You have explained exactly what I need to understand for my current project; every concept maps to the practical part of the project. Please share more of your knowledge about advanced and complex topics.👍

  • @johnny1966m
    @johnny1966m 3 months ago +3

    Thank you very much for this video. Now I understand what my colleagues do at work with system documentation handling using an LLM. :)

  • @arjbaid2024
    @arjbaid2024 3 months ago +3

    Wonderful explanation of this topic. Thank you!

  • @joeytribbiani735
    @joeytribbiani735 1 month ago +1

    The best explanation of RAG that I've found, thanks a lot

  • @vinayakminde8317
    @vinayakminde8317 3 months ago +2

    By far this is the simplest explanation of RAG I have come across. Amazing.
    Looking forward to next videos in series.

  • @herculesgixxer
    @herculesgixxer 2 months ago +1

    loved your explanation, thank you

  • @nadellaella6416
    @nadellaella6416 25 days ago +2

    Bestt explanation! Thank youu Mr.Don!

  • @Ak_Seeker
    @Ak_Seeker 2 months ago +3

    Awesome, thanks for the wonderful explanation in simple language

  • @JamesNguyen-lt5bc
    @JamesNguyen-lt5bc 10 days ago

    awesome explanation. Thank you

  • @DavidBennell
    @DavidBennell 22 days ago +1

    Great explanation, I have seen a lot of these and people normally go into far too much detail and muddy the water, or are far too abstract, fast and loose, or just get it wrong. I think this is a great level to cover this topic at.

  • @achen94
    @achen94 2 months ago +1

    Amazing video. Thanks for the great explanation!

  • @rahulkunal
    @rahulkunal 1 month ago

    Thanks for such a simple explanation of the RAG Architecture Concepts.

  • @travelchimps6637
    @travelchimps6637 1 month ago +1

    9:20 not at all confusing, makes perfect sense the way u explained it thank you!!!

  • @MichaelRuddock
    @MichaelRuddock 2 months ago +2

    Thank you for sharing your knowledge with us, great explanation.

  • @bryanbimantaka
    @bryanbimantaka 2 months ago +1

    WOW! The simplest yet the best explanation! It's easy to understand for a beginner like me.
    THANK YOU!

  • @dannysuarez6265
    @dannysuarez6265 2 months ago +1

    Thank you for your great explanation sir!

  • @MrFrubez
    @MrFrubez 2 months ago +1

    Such a great explanation of RAG. It really helped me grasp the power of it.

  • @rafa_lopes
    @rafa_lopes 2 months ago +1

    It was one of the most didactic explanations about RAG. Thank you, @Don Woodlock.

  • @coopernelson6947
    @coopernelson6947 28 days ago +2

    Great video. I feel like this is the first time I'm learning stuff that is at the cutting edge. This video was posted 2 months ago, very exciting times

    • @gtarptv_
      @gtarptv_ 14 days ago

      Same here. I had no idea that RAG was a big deal. I've been reading stuff on Reddit where people talk about RAG this and that

  • @ciropaiva1519
    @ciropaiva1519 2 months ago +1

    Incredible Video! Thank you very much!

  • @chesaku
    @chesaku 1 month ago +1

    Wow.. Job well done. Great and simplistic explanation for such complex topic.

  • @Themojii
    @Themojii 16 days ago

    Great explanation of RAG. I subscribed to your channel after watching this. Thank you Don for the great content.

  • @AshisRaj
    @AshisRaj 2 months ago +1

    Excellent explanation Mr. Author

  • @stephenlii1744
    @stephenlii1744 2 months ago

    it’s a pretty good explanation, thanks Don

  • @abhilpnYT
    @abhilpnYT 2 months ago +1

    Well explained ThankYou ❤

  • @jasonkey7063
    @jasonkey7063 2 months ago +1

    Great explanation. I believe this has a big market for developers in small towns. Such an easy product to create and sell.

  • @EstevaoFloripa
    @EstevaoFloripa 1 month ago

    Thanks a lot! Great and simple explanation!

  • @789juggernaut
    @789juggernaut 2 months ago +1

    Excellent video, really appreciate it.

  • @itsAlabi
    @itsAlabi 2 months ago

    This is really clear, this will customize the output based on the environment of the user not just on open source data.

  • @bigplumppenguin
    @bigplumppenguin 2 months ago +1

    Very good introduction!!!

  • @Arunkumar-234.
    @Arunkumar-234. 1 month ago

    Great explanation! Thank you very much :)

  • @EGlobalKnowledge
    @EGlobalKnowledge 3 months ago +2

    Nice explanation, Thank you

  • @narendraparmar1631
    @narendraparmar1631 1 month ago

    Thanks Don , that's informative

  • @zandanshah
    @zandanshah 1 month ago +1

    Good content, please share more.

  • @PR03
    @PR03 2 months ago +1

    great session dear Don. It was very complete, to the point and simply more advanced than other popular videos but of course in simple words. Thank you so much sir. ❤❤

  • @kingofartsofficial4431
    @kingofartsofficial4431 3 months ago +3

    Very Good Explanation Sir

  • @steffenmuller2888
    @steffenmuller2888 2 months ago

    I was looking for a general explanation of the RAG topic and you provide it very well! Now, I understand that the quality of RAG systems strongly depends on the information retrieval from the vector database. I will try to implement a RAG system on my own to learn something about it. Thank you very much!

  • @shamimibneshahid706
    @shamimibneshahid706 1 month ago

    Clearly explained!

  • @anoopaji1469
    @anoopaji1469 3 months ago +2

    Very informative and simple

  • @fire17102
    @fire17102 2 months ago +4

    Would love it if you could showcase a working rag example with live changing data. For example item price change, or policy update. Does it require to manually manage chunks and embedding references or are there better existing solutions? I think this really differentiates between fun-todo and actual production systems and applications.
    Thanks and all the best! Awesome video ❤

    • @_alphahowler
      @_alphahowler 1 month ago

      I would second that request with a real world example where information changes, i.e. some information is outdated and some new information is added without compromising the quality of the system.

  • @IhorVasutyn
    @IhorVasutyn 2 months ago

    Very intuitive

  • @CollaborationSimplified
    @CollaborationSimplified 2 months ago +1

    This was great, thank you! I believe this process is what Copilot for Microsoft 365 uses and it is referred to as ‘grounding’. Very helpful 👍

  • @itayregev4691
    @itayregev4691 2 months ago +1

    Thank you Don

  • @TournamentPoker
    @TournamentPoker 3 months ago +2

    Great tutorial!

  • @FirstNameLastName-fv4eu
    @FirstNameLastName-fv4eu 23 days ago +1

    God save that patient on his/ her Knee Surgery !!

  • @screenwatcher6224
    @screenwatcher6224 2 months ago

    This is SOOOO GOOD

  • @Deep185
    @Deep185 3 months ago +1

    Thank you!

  • @ClayBellBrews
    @ClayBellBrews 2 months ago +1

    Great work; would really love to see you dig in on tokens and how they work as well.

  • @tatuldanielyan9943
    @tatuldanielyan9943 5 days ago

    Thank you

  • @MagusArtStudios
    @MagusArtStudios 2 months ago

    I've been doing RAG and not even knowing the definition. Was glad to see I wasn't doing it wrong by injecting it into the end of the prompt.

  • @peterbedford2610
    @peterbedford2610 2 months ago +1

    Sounds like it is optimizing or creating a more efficient prompt session? I guess "augmentation" is a fairly good description.
    Thank you. I enjoy your teaching style.

  • @inaccessiblecardinal9352
    @inaccessiblecardinal9352 3 months ago +2

    Doing RAG stuff right now for work. Just scratching the surface, but very interesting stuff so far. We have a few clients on the horizon who really just need text classification, and the vanilla results from the vector DB might actually be good enough for them. Interesting territory coming fast.

    • @dwoodlock
      @dwoodlock  3 months ago +1

      yes - I have found that pretty small LLMs (like BERT) do just fine for text classification.

  • @speedycareer
    @speedycareer 2 months ago

    Great knowledge obviously sir.
    Would u please tell, can we integrate this data or these things in an app

  • @joannaw3842
    @joannaw3842 3 months ago

    Thank you very much, finally someone has explained it in an accessible way. My question, as a tester, are there any weaknesses in such a solution that need to be taken into account when working with such systems?

    • @dwoodlock
      @dwoodlock  2 months ago +1

      Good question. There are two key points of failure that you want to think about from a testing point of view. Part 1 is whether the system is pulling the right documents to use as context. And Part 2 is whether the LLM, given the right documents, is giving a good answer to the question. Maybe teasing those two apart and testing them separately would be a good strategy.
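One way to tease the two parts apart is sketched below. This is a rough illustration, not the setup used in the video: `fake_retrieve` and `fake_answer` are hypothetical stand-ins for a real retriever and a real LLM call, and the document ids and test cases are invented.

```python
from typing import Callable, Dict, List


def retrieval_hit_rate(
    retrieve: Callable[[str, int], List[str]],
    cases: Dict[str, str],
    k: int = 5,
) -> float:
    """Part 1: fraction of questions whose expected document id is in the top k."""
    hits = sum(1 for question, doc_id in cases.items() if doc_id in retrieve(question, k))
    return hits / len(cases)


def answer_pass_rate(
    answer: Callable[[str, str], str],
    cases: List[dict],
) -> float:
    """Part 2: give the model the correct context and check the reply for required facts."""
    passed = 0
    for case in cases:
        reply = answer(case["question"], case["context"]).lower()
        if all(fact.lower() in reply for fact in case["must_mention"]):
            passed += 1
    return passed / len(cases)


# Hypothetical stand-ins so the sketch runs without a vector database or an LLM.
def fake_retrieve(question: str, k: int) -> List[str]:
    index = {"knee": ["knee-prep", "knee-rehab"], "parking": ["parking-info"]}
    for keyword, doc_ids in index.items():
        if keyword in question.lower():
            return doc_ids[:k]
    return []


def fake_answer(question: str, context: str) -> str:
    return f"According to our documents: {context}"


RETRIEVAL_CASES = {
    "How do I prepare for my knee surgery?": "knee-prep",
    "Do you have parking?": "parking-info",
    "What are the visiting hours?": "visiting-hours",  # the fake index has no such document
}
ANSWER_CASES = [
    {
        "question": "How do I prepare for my knee surgery?",
        "context": "Stop eating eight hours before surgery.",
        "must_mention": ["eight hours"],
    }
]

print("retrieval hit rate:", retrieval_hit_rate(fake_retrieve, RETRIEVAL_CASES))
print("answer pass rate:", answer_pass_rate(fake_answer, ANSWER_CASES))
```

Scoring the two parts separately shows whether a bad answer came from the wrong documents being pulled or from the model mishandling the right ones. The substring check on the reply is deliberately crude; a real evaluation would likely need human review or a stronger grader.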

  • @cerberus1321
    @cerberus1321 1 month ago

    Great video, thanks. I'm tasked with prototyping a product utilizing these methodologies for a client this quarter. I've not done it before so this is very helpful. Is langchain a tool that can handle this entire process? How much context can you provide an LLM without restricting it? Also, how do we actually bottle the raw database query results for summarization, assuming not all questions will relate to qualitative data?

    • @dwoodlock
      @dwoodlock  18 days ago

      Yes - langchain can be a great help. I mostly didn't use it for this video because I wanted to explain the underlying concepts.

  • @oryxchannel
    @oryxchannel 3 months ago

    Pinecone vector DB has done some revolutionizing of its website- driving costs down with a new tech. It may affect how info is retrieved.

    • @dwoodlock
      @dwoodlock  3 months ago

      Yes - pinecone is a leader in this area.

  • @MadHolms
    @MadHolms 2 months ago

    great explanation, thx, but is only theory, can you please show a sample system where all of the above happens? thx!

  • @geoiowa
    @geoiowa 1 month ago

    Great explanation of RAG! What tool do you use for the drawing? Thanks

    • @dwoodlock
      @dwoodlock  18 days ago

      It's a Revolution Lightboard.

  • @mtb_carolina
    @mtb_carolina 11 days ago

    Let me ask...do you have any methodologies to keep the chunks from the internal database private for the rag recall with the LLM if those internal databases that the RAG system is pulling from are confidential? I've been grappling with this...any insights will be much appreciated. Thank you!

  • @imranideas
    @imranideas 1 month ago

    Hi, I have passed on the content of a pdf file to the llm and it does come up with a relevant response however the response is still generic in nature from the content i provided, what i need is a crisp to the point response like steps required to activate a sim card.. can you help me achieve the same

  • @hebol
    @hebol 2 months ago

    Just found you and your great content. Have to ask do you mirror-write…or thinking about it do you just mirror the video .-)

    • @dwoodlock
      @dwoodlock  2 months ago +1

      No I don't mirror-write - that would be way too complicated! I'm speaking behind a glass and writing on it naturally and it reverses everything. That's why I don't wear my Nirvana T-shirt when filming.

    • @hebol
      @hebol 2 months ago

      @dwoodlock it wouldn’t be impossible. I had a lecturer who wrote with both hands interchangeably. You absolutely have the uniqueness (if that is a word…I’m Swedish)
      But you mirror the video so that we can see the text, or? You are right-handed, aren’t you…it looks like you are left-handed….but regardless..I love your content. The best of lecturers, theoretical and practical!

  • @morespinach9832
    @morespinach9832 2 months ago +1

    The revelation for me is that “our own data” is in fact added as a prompt before the prompt. And not after the LLM has responded.
    Is this correct?
    Secondly any vector database recommendations for storing our own very unstructured PDF documents? Do we need specialized stuff like Pinecone (which sadly is only hosted saas) or would Neo4J type stuff work too… or elastic search?

  • @theindubitable
    @theindubitable 2 months ago

    I have a problem with the model not changing context. It fills the the token cap and then when I ask another question it wont update the chunks everytime. How can I solve this? Maybe prompting.

  • @letseat3553
    @letseat3553 1 month ago

    The top 5 documents sounds very much like a TF-IDF / cosine similarity based query with a 'limit 5' added to get the top 5 matches on the query - the kind of result you can get from a simple MariaDB search on a FTS index these days.
    No need to over complicate it and involve an LLM at that stage.
    I do like how you describe it as 'the prompt before the prompt' - which is just the top 5 results......

    • @dwoodlock
      @dwoodlock  18 days ago

      Yes - though a language model will have a better sense of vocabulary, meaning it'll know that 'tired' and 'fatigued' are similar - a TF-IDF won't unless you feed it Word2Vec vectors. And it can morph the word vectors based on the rest of the sentence and its word order. But I agree with your point, you shouldn't overcomplicate things if simpler approaches work for your use case.
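The 'tired' vs. 'fatigued' point can be seen in a small standard-library sketch of TF-IDF with cosine similarity. The documents, the query, and the smoothed IDF formula are illustrative assumptions, not taken from the video or from any particular search engine.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(texts: List[str]) -> List[Dict[str, float]]:
    """One sparse TF-IDF vector per text (smoothed IDF, assumed for this sketch)."""
    tokenized = [tokenize(t) for t in texts]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(texts)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def rank(query: str, docs: List[str], k: int = 5) -> List[Tuple[float, str]]:
    """Return the k highest-scoring (score, document) pairs for the query."""
    vectors = tfidf_vectors(docs + [query])  # the query is vectorized alongside the documents
    query_vec = vectors[-1]
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(vectors, docs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


DOCS = [
    "Patients often feel fatigued for a week.",
    "Parking is free for visitors.",
    "Feeling tired is normal after anesthesia.",
]

for score, doc in rank("Why am I so tired?", DOCS):
    print(f"{score:.2f}  {doc}")
```

Only the document that literally contains "tired" gets a non-zero score. The "fatigued" document shares no terms with the query, so it scores 0.0, exactly like the parking document. An embedding model would be expected to place "tired" and "fatigued" close together.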

  • @seva723
    @seva723 3 months ago +1

    soooooo lit

  • @DhavalPatel12
    @DhavalPatel12 28 days ago

    Thanks for explaining in detail and relevant example. Why not just train LLM with your data in the first place ? That would simplify the architecture.

    • @dwoodlock
      @dwoodlock  18 days ago

      Yes - it would. But you may not have enough text to teach it all the intricacies of the English language, and the original training is very, very expensive and computationally intensive. So it's better to start with a model that somebody has trained first and then fine-tune it or feed in context like I described.

  • @lxn7404
    @lxn7404 16 days ago

    I wonder what is the role of LLM in creating a vector, it looks like simple indexation

  • @didyouknowtriviayt
    @didyouknowtriviayt 2 months ago

    I built a system like this last year with openai, pinecone and python

  • @JohnTurner313
    @JohnTurner313 3 months ago

    Very clear and helpful. The question I have: if I create a RAG using my own content (eg: contents of my cloud drive), how do I prevent that data content, which may include PII, HIPAA and other protected information, from being used by the AI provider like OpenAI? Anything I send to a 3P AI LLM will be used by them for training their own model, which in turn leads to high risk leakages to other people who aren't me. It seems like the only way to do this is to have a restricted, private LLM running locally on my laptop or home network.

    • @dwoodlock
      @dwoodlock  2 months ago +1

      You need an agreement with these cloud services providers that enables you to send PHI to them. Some offer this as one of their services. If you don't have that agreement, you cannot do it (in most countries). It is also an option to run a model on-prem and there are decent open source LLMs that you can use for certain use cases.

    • @morespinach9832
      @morespinach9832 2 months ago

      @dwoodlock which ones can be self hosted - BERT, RoBERTa? Which ones are good I mean.
      Also - do we have to keep these models up to date on Prem by downloading them again in the future as new versions of them emerge?

  • @pritampatil6056
    @pritampatil6056 3 months ago +1

    Father of AI for a reason!!

    • @djl3009
      @djl3009 2 months ago

      I guess it doesn't hurt to have a likeness to Geoffrey Hinton if you are an AI practitioner :)

    • @dwoodlock
      @dwoodlock  18 days ago

      Ha.

  • @michaelcharlesthearchangel
    @michaelcharlesthearchangel 1 month ago +1

    RAG // RSI from the Matrix

  • @labsanta
    @labsanta 3 months ago

    - [00:00 - 02:20](ua-cam.com/video/u47GtXwePms/v-deo.html) 🧠 Introduction to Retrieval Augmented Generation (RAG)
    - Explanation of what Retrieval Augmented Generation (RAG) is.
    - Common use of RAG to improve users' experience with language models.
    - Applications of RAG in question answering and content generation.
    - [02:20 - 06:11](ua-cam.com/video/u47GtXwePms/v-deo.html) 🤖 Components of RAG
    - Details on the structure of a request to a language model (prompt).
    - The importance of the instructions placed in the request before the prompt.
    - Process of selecting relevant content from the database and including it in the request.
    - [06:11 - 10:58](ua-cam.com/video/u47GtXwePms/v-deo.html) 🔄 Vectorization and Retrieval in RAG
    - Explanation of how content is vectorized to make it numeric and comparable.
    - Process of searching for and selecting relevant documents in the database.
    - How RAG improves answer generation based on the retrieved content.
    I hope this breakdown into sections helps in better understanding the concept of Retrieval Augmented Generation (RAG).
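The three parts of the outline above (prompt structure, vectorization, retrieval) can be put together in one small sketch. Everything here is an assumption made for illustration: `embed` is a toy bag-of-words counter standing in for a real embedding model, the chunks are invented, and the instruction wording is only one possibility. The final prompt would then be sent to whichever LLM the system uses.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks whose vectors are closest to the question's vector."""
    question_vec = embed(question)
    return sorted(chunks, key=lambda c: cosine(question_vec, embed(c)), reverse=True)[:k]


def build_prompt(question: str, chunks: List[str], k: int = 2) -> str:
    """Instructions first, then the retrieved content, then the user's question."""
    context = "\n".join(f"- {chunk}" for chunk in retrieve(question, chunks, k))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


CHUNKS = [
    "Before knee surgery, stop eating eight hours ahead and arrange a ride home.",
    "The parking garage is open 24 hours and costs five dollars.",
    "After knee surgery, physical therapy usually starts within a few days.",
]
QUESTION = "How do I prepare for my knee surgery?"

print(build_prompt(QUESTION, CHUNKS))
```

In this toy run the two knee-surgery chunks are selected and the parking chunk is left out. With word counts alone the rehab chunk happens to score slightly higher than the preparation chunk, which hints at why real systems use learned embeddings and not raw word overlap.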

  • @joemiller9856
    @joemiller9856 1 month ago

    Don , Can RAG be used to protect private company sensitive (trade secrets, etc. ) data by essentially translating this private data to numerical information (vectors) while using publicly available LLM? I suppose the responses from these prompts may also potentially expose sensitive information as well??

    • @dwoodlock
      @dwoodlock  18 days ago +1

      Even though the RAG approach turns words, sentences, and paragraphs into numbers, that doesn't mean that they are private; in other words, you can call the tokenizer in reverse. So you need to treat these in the same way when considering whether to send them to a cloud LLM service.

  • @worldof_AG
    @worldof_AG 2 months ago

    Please create a video of an entire project to create a chatbot using RAG

    • @dwoodlock
      @dwoodlock  18 days ago

      I have one now: ua-cam.com/video/P8tOjiYEFqU/v-deo.html

    • @worldof_AG
      @worldof_AG 18 days ago

      @dwoodlock thanks

  • @serdalaslantas
    @serdalaslantas 1 month ago

    Hi, I have a problem with my chatbot. It doesn’t have a memory and doesn’t remember my previous conversations! How do I solve this issue? Does RAG system solve this problem? Thanks for your answer.

    • @KrungThaiBank
      @KrungThaiBank 23 days ago

      Can save question and answer for next question

    • @dwoodlock
      @dwoodlock  18 days ago

      yes - I didn't demonstrate this but RAG systems also summarize the prior conversation (or the last few questions) and send that as context. So if you ask "Do you have parking?" and then "How much does it cost?", it knows that you are referring to parking vs. your knee surgery for example.
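A minimal sketch of the idea, with assumptions stated up front: it passes the last few turns verbatim instead of summarizing them (summarizing would need an extra LLM call), and the function name, prompt wording, and example messages are all hypothetical.

```python
from typing import List, Tuple


def build_prompt_with_history(
    history: List[Tuple[str, str]],
    retrieved_chunks: List[str],
    question: str,
    max_turns: int = 3,
) -> str:
    """Put the most recent (user, assistant) turns ahead of the context and the new question."""
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = ["Use the conversation so far and the context to answer.", "", "Conversation so far:"]
    for user_msg, assistant_msg in recent:
        lines.append(f"User: {user_msg}")
        lines.append(f"Assistant: {assistant_msg}")
    lines += ["", "Context:"] + [f"- {chunk}" for chunk in retrieved_chunks]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


HISTORY = [
    ("Hi", "Hello! How can I help?"),
    ("Do you have parking?", "Yes, there is a garage next to the main entrance."),
]
CONTEXT = ["The parking garage costs five dollars per day."]

print(build_prompt_with_history(HISTORY, CONTEXT, "How much does it cost?", max_turns=1))
```

With the parking exchange included in the prompt, the model has what it needs to read "it" in the follow-up question as the parking garage. Capping the number of turns keeps the prompt from growing without bound.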

  • @JamesSpellos
    @JamesSpellos 2 months ago

    So is a GPT that people can easily build essentially an architecture that uses a RAG approach, and if so, does it create a vector database from the documents the user uploads?

    • @dwoodlock
      @dwoodlock  18 days ago

      Yes basically. And it also allows the user to put in custom instructions for their GPT.

    • @JamesSpellos
      @JamesSpellos 18 days ago

      @dwoodlock Thank you for the confirmation & clarification. Appreciate it.

  • @ramsescoraspe
    @ramsescoraspe 3 months ago

    What about if the "content" is a relational database?

    • @dwoodlock
      @dwoodlock  2 months ago

      Relational data is a fine source for this as well. Generally speaking you would 'textify' that data from the relational database to make it part of the prompt. For example when we use this approach with patient medical records, we create a text version of the medical record (which contains structured fields in a relational database) and use that text as part of the RAG model. You don't have to turn everything into complete sentences, but some textual form gives the LLMs the best chance of understanding structured data.
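As a rough illustration of 'textifying' a row, the sketch below turns a record into a short labeled text snippet. The table name, column names, and values are invented for the example and do not reflect any real schema.

```python
from typing import Any, Dict


def textify_row(table: str, row: Dict[str, Any]) -> str:
    """Render one database row as 'label: value' text, skipping empty columns."""
    parts = []
    for column, value in row.items():
        if value is None:
            continue  # leave out NULL columns so they do not clutter the prompt
        label = column.replace("_", " ")
        parts.append(f"{label}: {value}")
    return f"{table} record. " + "; ".join(parts) + "."


# Hypothetical row, as it might come back from a database cursor as a dict.
ROW = {
    "patient_id": 1042,
    "procedure": "knee replacement",
    "scheduled_date": "2024-03-12",
    "allergies": None,
}

print(textify_row("Appointment", ROW))
# prints: Appointment record. patient id: 1042; procedure: knee replacement; scheduled date: 2024-03-12.
```

The output is not a full sentence, in line with the point that some textual form is enough. The resulting string can be chunked and embedded like any other document.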

  • @NISHANTSUTAR-fi9xe
    @NISHANTSUTAR-fi9xe 21 days ago

    If I don’t have any relevant data in my documents, how do I avoid responding to that question instead of giving a random, unrelated answer?

    • @dwoodlock
      @dwoodlock  18 days ago

      Just add that to the prompt. "If you are not finding any relevant information in the context that I provided, please don't answer the question". That sort of thing.
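A sketch of what that could look like when the prompt is assembled in code. The instruction wording, the fallback phrase, and the function name are assumptions for illustration, and a model may not always follow the instruction, so the behavior is worth testing.

```python
from typing import List

# Assumed wording; tune it for the model and use case.
NO_ANSWER_INSTRUCTION = (
    "If the context below does not contain information relevant to the question, "
    "reply exactly with: I don't have information about that."
)


def build_guarded_prompt(question: str, chunks: List[str]) -> str:
    """Prepend the refusal instruction to the retrieved context and the question."""
    if chunks:
        context = "\n".join(f"- {chunk}" for chunk in chunks)
    else:
        context = "(no documents retrieved)"
    return f"{NO_ANSWER_INSTRUCTION}\n\nContext:\n{context}\n\nQuestion: {question}"


print(build_guarded_prompt("What is the cafeteria menu today?", []))
```

Asking for one fixed fallback phrase makes it easy for the calling application to detect a non-answer and, for example, route the user to a human.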