Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)

  • Published Nov 21, 2024

COMMENTS • 122

  • @nawarajbhujel8266
    @nawarajbhujel8266 8 months ago +13

    This is what a teacher with deep knowledge of what he is teaching can do. Thank you very much.

  • @tryit-wv8ui
    @tryit-wv8ui 11 months ago +16

    Wow! I finally understood everything. I am a student in ML. I have already watched half of your videos. Thank you so much for sharing. Greetings from Jerusalem

  • @DeepakTopwal-sl6bw
    @DeepakTopwal-sl6bw 7 months ago +2

    Learning becomes more interesting and fun when you have a teacher like Umar, who explains everything related to the topic so well that everyone feels they know the complete algorithms.
    A big fan of your teaching methods, Umar. Thanks for making all these informative videos.

  • @wilsvenleong96
    @wilsvenleong96 11 months ago +23

    Man, your content is awesome. Please do not stop making these videos as well as code walkthroughs.

  • @AqgvP07r-hq3vu
    @AqgvP07r-hq3vu 1 month ago +1

    Among all the videos about HNSW, this is the best. Others don't understand; they just pretend. This one, on the other hand, is honest and thorough.

  • @abisheksatnur1770
    @abisheksatnur1770 3 months ago +2

    The 🐐. I wish I had you as my teacher in real life, not just through a screen

  • @ramsivarmakrishnan1399
    @ramsivarmakrishnan1399 6 months ago +3

    You are the best teacher of ML that I have experienced. Thanks for sharing the knowledge.

  • @faiyazahmad2869
    @faiyazahmad2869 4 months ago +3

    This is one of the best explanations I have ever seen on YouTube. Thank you.

  • @Rockermiriam
    @Rockermiriam 7 months ago +4

    Amazing teacher! 50 minutes flew by :)

  • @sarimhashmi9753
    @sarimhashmi9753 6 months ago +4

    Wow, thanks a lot. This is the best explanation of RAG I have found on YouTube

  • @mohittewari6796
    @mohittewari6796 4 months ago +1

    The way you've explained all these concepts has blown my mind. I won't be surprised to see your number of subscribers skyrocket. Channel Subscribed !!

  • @bevandenizclgn9282
    @bevandenizclgn9282 5 months ago +3

    Best explanation I found on YouTube, thank you!

  • @nithinma8697
    @nithinma8697 1 month ago +1

    This is the type of content we really want

  • @yuliantonaserudin7630
    @yuliantonaserudin7630 9 months ago +6

    The best explanation of RAG

  • @redfield126
    @redfield126 11 months ago +5

    Waited for such content for a while. You made my day. I think I got almost everything. So educational. Thank you Umar

  • @suman14san
    @suman14san 8 months ago +3

    What an exceptional explanation of HNSW algo ❤

  • @venkateshdesai3150
    @venkateshdesai3150 5 months ago +2

    Amazing !! I finally understood everything. Good Job, all your videos have in-depth understanding

  • @alexsguha
    @alexsguha 9 months ago +1

    Impressively intuitive, something most explanations are not. Great video!

  • @1tahirrauf
    @1tahirrauf 1 year ago +3

    Thanks Umar. I look forward to your videos, as you explain each topic in an easy-to-understand way. I would like to request a "BERT implementation from scratch" video.

  • @NikhilSharma-o2g
    @NikhilSharma-o2g 1 year ago +1

    One of the best channels to learn and grow

  • @kiranshenvi2626
    @kiranshenvi2626 9 months ago +1

    Awesome content, sir. It is the best explanation I have found so far!

  • @JRB463
    @JRB463 9 months ago +2

    This was fantastic (as usual). Thanks for putting it together. It has helped my understanding no end.

  • @vasoyarutvik2897
    @vasoyarutvik2897 10 months ago +1

    Hello sir, I just want to say thanks for creating such good content for us. Love from India :)

  • @IndianGamingMaharaja
    @IndianGamingMaharaja 4 months ago +1

    A video worth all 48 minutes

  • @alexandredamiao1365
    @alexandredamiao1365 9 months ago +3

    This was fantastic and I have learned a lot from this! Thanks a lot for putting this lesson together!

  • @mturja
    @mturja 8 months ago +1

    The explanation of HNSW is excellent!

  • @goelnikhils
    @goelnikhils 11 months ago +1

    Amazing content and such a clear explanation. Please make more videos. Keep going and this channel will grow like anything.

  • @trungquang1581
    @trungquang1581 7 months ago

    Thank you so much for sharing. Looking for more content about NLP and LLMs

  • @Jc-jv3wj
    @Jc-jv3wj 6 months ago

    Thank you very much for the detailed explanation of RAG with a vector database. I have one question: can you please explain how we design the skip list with embeddings? Basically, how do we decide which embedding goes to which level?

  • @myfolder4561
    @myfolder4561 8 months ago +1

    Thank you so much - this is a great video. Great balance of details and explanation. I have learned a ton and have saved it down for future reference

  • @melikanobakhtian6018
    @melikanobakhtian6018 11 months ago +2

    Wow! You explained everything great! Please make more videos like this

  • @amazing-graceolutomilayo5041
    @amazing-graceolutomilayo5041 9 months ago

    This was a wonderful explanation! I understood everything and I didn't have to watch the Transformers or BERT video (I actually know nothing about them but I have dabbled with Vector DBs).
    I have subbed and I will definitely watch the transformer and BERT video. Thank you!❤❤
    Made a little donation too. This is my first ever saying $Thanks$ on YouTube haha

  • @tysun2739
    @tysun2739 1 month ago +1

    Very nice explanation. Many thanks!

  • @jeremyregamey495
    @jeremyregamey495 1 year ago +1

    Just love your videos. So much detail, but extremely well put together

  • @akramsalim9706
    @akramsalim9706 1 year ago +1

    Awesome paper. Please keep posting more videos like this.

  • @NeoMekhar
    @NeoMekhar 11 months ago +1

    This video is really good, subscribed! You explained the topic super well. Thanks!

  • @alirahmanian5127
    @alirahmanian5127 2 days ago +1

    Cannot thank enough! Awesome content!!

  • @meetvardoriya2550
    @meetvardoriya2550 11 months ago +1

    Really amazing content! Looking forward to more such content, Umar :)

  • @DatabaseAdministration
    @DatabaseAdministration 1 year ago +1

    Glad I've subscribed to your channel. Please do these more.

  • @sethcoast
    @sethcoast 7 months ago +1

    This was such a great explanation. Thank you!

  • @satviknaren9681
    @satviknaren9681 6 months ago

    Please bring some more content !

  • @李洋-i4j
    @李洋-i4j 1 year ago +1

    Wow, I saw the Chinese knotting on your wall ~

  • @ShreyasSreedhar2
    @ShreyasSreedhar2 9 months ago +1

    This was super insightful, thank you very much!

  • @andybhat5988
    @andybhat5988 4 months ago +1

    Super explanation. Thank you

  • @12.851
    @12.851 10 months ago +1

    Great video!! Shouldn't 5 come after 3 in skip list?

  • @mdmusaddique_cse7458
    @mdmusaddique_cse7458 9 months ago +1

    Amazing explanation!

  • @amblessedcoding
    @amblessedcoding 1 year ago +2

    Wooo you are the best I have ever seen

  • @manyams5207
    @manyams5207 9 months ago +1

    wow wonderful explanation thanks

  • @mustafacaml8833
    @mustafacaml8833 9 months ago +1

    Great explanation! Thank you so much

  • @emptygirl296
    @emptygirl296 1 year ago +1

    Hola, coming back with a great content as usual

  • @LorenzoMontù
    @LorenzoMontù 9 months ago +1

    amazing work very clear explanation ty!

  • @tomargentin5198
    @tomargentin5198 7 months ago

    Hey, big thanks for this awesome and super informative video!
    I'm really intrigued by the Siamese architecture and its connection to RAG. Could someone explain that a bit more?
    Am I right in saying it's used for top-K retrievals ? Meaning, we create the database with the output embeddings, and then use a trained Siamese architecture to find the top-K most relevant chunks computing similarities ?
    Is it necessary to use this approach in every framework, or can sometimes just computing similarity through the embeddings work effectively?

    • @jeromeeusebius
      @jeromeeusebius 5 months ago

      The Siamese network he talked about just provides details of the Sentence-BERT model that is used for encoding. The connection to RAG is that the Sentence-BERT model encodes both the query and the document chunks fed into the DB. In this case, Umar is providing some additional information about how the Sentence-BERT model was developed and why it is better than plain BERT. I think it's important to understand the distinction.
      The top-K retrieval is done by the vector search. Using HNSW, for example, the query is compared with a random entry point, and then you proceed through the neighbors of each vector until you reach a local minimum. You save this point. You do this a few times (> k) and retrieve the top-K results sorted by their similarities. So the embeddings from S-BERT are used, but not directly. The retrieval of the top-K embeddings is done at the vector DB search level. By doing this multiple times (each time from a different entry point into the HNSW graph), you get different results, and you then retrieve the top-K from those. I hope this is clear.
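
A minimal sketch of the procedure described in the reply above, assuming a single flat neighbor graph rather than the layered structure real HNSW uses. The names `greedy_search` and `top_k_with_restarts` are hypothetical and do not come from the video or from any library. The reply speaks of a local minimum (of distance); the sketch uses the equivalent local maximum of cosine similarity.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def greedy_search(query, vectors, neighbors, entry):
    """Walk the graph from `entry`, moving to the most similar neighbor
    until no neighbor is more similar to the query than the current node."""
    current = entry
    current_sim = cosine_similarity(query, vectors[current])
    while True:
        best, best_sim = current, current_sim
        for n in neighbors[current]:
            sim = cosine_similarity(query, vectors[n])
            if sim > best_sim:
                best, best_sim = n, sim
        if best == current:  # local optimum reached
            return current, current_sim
        current, current_sim = best, best_sim


def top_k_with_restarts(query, vectors, neighbors, k, restarts, seed=0):
    """Run the greedy walk from several random entry points (an assumption
    mirroring the reply) and keep the k best distinct local optima."""
    rng = random.Random(seed)
    found = {}
    for _ in range(restarts):
        entry = rng.randrange(len(vectors))
        node, sim = greedy_search(query, vectors, neighbors, entry)
        found[node] = sim
    ranked = sorted(found.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]
```

Different entry points can end in different local optima, which is why several walks are merged before the top K are taken. Production indexes typically keep a candidate list during the walk instead of restarting, so treat this only as an illustration of the idea.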

  • @qicao7769
    @qicao7769 9 months ago +1

    Cool video about RAG! You could also upload it to Bilibili; as you live in China, you should know about it. :D

  • @bhanujinaidu
    @bhanujinaidu 7 months ago +1

    Good explanation, thanks

  • @LiuCarl
    @LiuCarl 9 months ago +1

    simply impressive

  • @parapadirapa
    @parapadirapa 10 months ago

    Amazing presentation! I have a couple of questions though... What size of chunks should be used when using Ada-002? Is that dependent on the Embedding model? Or is it to optimize the granularity of 'queriable' embedded vectors? And another thing: am I correct to assume that, in order to capture the most contexts possible, I should embed a 'tree structure' object (like a complex object in C#, with multiple nested object properties of other types) sectioned from more granular all the way up to the full object (as in, first the children, then the parents, then the grand-parents)?

  • @ashishgoyal4958
    @ashishgoyal4958 1 year ago +1

    Thanks for making these videos🎉

  • @nancyyou7548
    @nancyyou7548 10 months ago +1

    Thank you for the excellent content!

  • @ahmedoumar3741
    @ahmedoumar3741 9 months ago +1

    Nice lecture, Thank you!

  • @SanthoshKumar-dk8vs
    @SanthoshKumar-dk8vs 10 months ago +1

    Thanks for sharing, really a great content 👏

  • @soyedafaria4672
    @soyedafaria4672 9 months ago

    Thank you so much. Such a nice explanation. 😀

  • @hassanjaved4730
    @hassanjaved4730 8 months ago

    Awesome, I completely understand RAG now, just because of you. I am here with a question. I am using the Llama 2 model, and my main concern is this: I give it a PDF as context so that the user can ask questions about it,
    but this approach takes a long time during inference. After watching your video, what I understand is that with a RAG pipeline it is possible to store the uploaded PDF in a vector DB and then query it from there.
    Am I thinking about this correctly, and is it possible?
    Thanks,

  • @maximbobrin7074
    @maximbobrin7074 1 year ago +2

    Man, keep it up! Love your content

  • @ravindarmadishetty736
    @ravindarmadishetty736 3 months ago

    This is a really awesome session. Of course it is lengthy, but nice. There seems to be a problem with the Git repository: I am unable to access the Python and PDF files.

  • @fernandofariajunior
    @fernandofariajunior 9 months ago +1

    Thanks for making this video!

  • @Zayed.R
    @Zayed.R 8 months ago +1

    Very informative, thanks

  • @Tiger-Tippu
    @Tiger-Tippu 1 year ago +1

    Hi Umar, does RAG also have a context window limitation, like prompt engineering techniques do?

  • @RomuloBrito-b2z
    @RomuloBrito-b2z 4 months ago

    When the algorithm runs to store the k best scores, does it use a pop operation on the list to remove the nodes that have already been visited?

  • @prashantharipirala7652
    @prashantharipirala7652 29 days ago

    Can you explain how the hierarchical structure supporting HNSW is created?

  • @dantedt3931
    @dantedt3931 10 months ago +1

    One of the best videos

  • @_seeker423
    @_seeker423 9 months ago +1

    Excellent content!

  • @rvons2
    @rvons2 1 year ago +1

    Are we storing the sentence embeddings together with the original sentences they were created from? If not, how do we map them back (from the top-k most similar stored vectors) to the text they originated from, given that the sentence embedding lost some information when pooling was done?

    • @umarjamilai
      @umarjamilai 1 year ago +2

      Yes, the vector database stores the embedding and the original text. Sometimes, they do not store the original text but a reference to it (for example instead of storing the text of a tweet, you may store the ID of the tweet) and then retrieve the original content using the reference.
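
A minimal sketch of the two storage options mentioned in the reply above: keeping the original text next to its embedding, or keeping only a reference to it. The `Record` and `TinyVectorStore` classes, the `resolve` helper, and the `lookup` dictionary standing in for an external content store are all hypothetical, and the brute-force scan is used only to keep the example short.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


@dataclass
class Record:
    embedding: List[float]
    text: Optional[str] = None       # original text stored inline, or
    reference: Optional[str] = None  # an ID pointing to it elsewhere


class TinyVectorStore:
    """Toy in-memory store; a real vector DB would use an index like HNSW."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def add(self, embedding, text=None, reference=None) -> None:
        self.records.append(Record(embedding, text, reference))

    def search(self, query, k) -> List[Tuple[float, Record]]:
        # Score every record against the query and return the k best.
        scored = [(cosine_similarity(query, r.embedding), r) for r in self.records]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


def resolve(record: Record, lookup: Dict[str, str]) -> str:
    """Return the original text, following the reference when needed."""
    if record.text is not None:
        return record.text
    return lookup[record.reference]
```

Because the text (or its ID) travels with the vector, nothing has to be reconstructed from the embedding itself, so the information lost during pooling does not matter for retrieval.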

  • @faiqkhan7545
    @faiqkhan7545 11 months ago

    Let's say I want to create an online semantic search tool that uses a vector DB and RAG, just like the Bing tool.
    Will it follow the same procedure, and what new things will I need to add to integrate it with the Internet?
    Plus, nicely put video, Umar.
    Can you do a coding session for this one, like you do for all the others, such as building something with real-time output using RAG, or anything similar? It will be a pleasure to watch.

  • @sounishnath513
    @sounishnath513 11 months ago +1

    I am so glad I am subscribed to you!

  • @hientq3824
    @hientq3824 1 year ago +2

    awesome as usual! ty

  • @oliz1148
    @oliz1148 9 months ago +1

    so helpful! thx for sharing

  • @RomanLi-y9c
    @RomanLi-y9c 1 year ago +1

    Thank you, awesome video!

  • @DanielJimenez-yy8xk
    @DanielJimenez-yy8xk 7 months ago +1

    awesome content

  • @oliva8282
    @oliva8282 6 months ago +1

    Best video ever!

  • @RudraPratapDhara
    @RudraPratapDhara 1 year ago +2

    You are legend

  • @koiRitwikHai
    @koiRitwikHai 10 months ago +1

    at 44:00 , the order of linked list is incorrect... isn't it? because it should be 1 3 5 9

    • @TheGreatMind55
      @TheGreatMind55 10 months ago

      Even I have the same doubt. It should have been sorted as per the definition

  • @rajyadav2330
    @rajyadav2330 11 months ago +1

    Great content , keep doing it .

  • @rkbshiva
    @rkbshiva 11 months ago

    Umar, great content! Around 25:00, you say that we have a target cosine similarity. How is that target cosine similarity calculated? There is no mathematical way to calculate the cosine similarity between two sentences; all we can do is take a subjective guess. Can you please explain in detail how this works?

    • @umarjamilai
      @umarjamilai 11 months ago

      When you train the model, you have a dataset that maps two sentences to a score (chosen by a human being based on a scale from 1 to 10, for example). This score can be used as the target for the cosine similarity. If you look at papers in this field, you'll see there are many sophisticated methods, but the training data is always labeled by a human being.
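
A minimal sketch of how a human score can serve as the target for the cosine similarity, as described in the reply above. The 1 to 10 scale comes from the reply's example; rescaling it linearly to [0, 1] and using a squared-error loss are assumptions made here for illustration, and `target_from_label` and `pair_loss` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def target_from_label(score, lowest=1.0, highest=10.0):
    """Rescale a human score on [lowest, highest] to a target in [0, 1]."""
    return (score - lowest) / (highest - lowest)


def pair_loss(embedding_a, embedding_b, human_score):
    """Squared error between the model's cosine similarity for a sentence
    pair and the target derived from the human label."""
    predicted = cosine_similarity(embedding_a, embedding_b)
    target = target_from_label(human_score)
    return (predicted - target) ** 2
```

During training, the two embeddings would come from the same encoder applied to each sentence, and the gradient of this loss would update the encoder so that pairs labeled as similar end up with a higher cosine similarity.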

    • @rkbshiva
      @rkbshiva 11 months ago

      @@umarjamilai Understood! Thanks very much for the prompt response.
      It would be great if we can identify a bias free way to do this as the numbering between 1 - 10, especially when done by multiple people and at scale, could get biased.

    • @jeromeeusebius
      @jeromeeusebius 5 months ago

      @@rkbshiva The numbering is not done by random people. Usually, specialists (e.g., language specialists) are employed to build this dataset, which reduces the noise in the labels (you would still get some bias, but it should be small). Google does this for search quality. They have a standard search quality evaluation document that is provided to the evaluators, who use it as a guide for how to score the different documents returned for a given query.

  • @christopherhornle4513
    @christopherhornle4513 1 year ago

    Great video, keep up the good work! :) Around 19:25 you're saying that the embedding for "capital" is updated during backprop. Isn't that wrong for the shown example / training run where "capital" is masked? I always thought only the embedding associated with non-masked tokens can be updated.

    • @umarjamilai
      @umarjamilai 1 year ago +2

      You're right!
      First of all, ALL embedding vectors of the 14 tokens are updated (including the embedding associated with the MASK token).
      What happens actually is that the model updates the embedding of all the surrounding words in such a way that it can rebuild the missing word next time. Plus, the model is forced to use (mostly) the embedding of the context words to predict the masked token, since any word may be masked, so there's not so much useful information in the embedding of the MASK token itself.
      It's easy to get confused when you make long videos like mine 😬😬
      Thanks for pointing out!

    • @christopherhornle4513
      @christopherhornle4513 1 year ago

      I see, didn't know that the mask token is also updated! Thank you for the quick response. You really are a remarkable person. Keep going!

  • @mohamed_akram1
    @mohamed_akram1 9 months ago +1

    Thanks

  • @ChashiMahiulIslam-qh6ks
    @ChashiMahiulIslam-qh6ks 9 months ago +1

    You are the BEST!

  • @adatalearner8683
    @adatalearner8683 7 months ago

    why is the context window size limited? Is it because these models are based on transformers and for a given transformer architecture, long distance semantic relationship detection will be bounded by the number of words/context length ?

  • @SureshKumarMaddala
    @SureshKumarMaddala 11 months ago

    Excellent video! 👏👏👏

  • @HichamElKaissi-g4s
    @HichamElKaissi-g4s 1 year ago +1

    Thank you so much man..

  • @Vignesh-ho2dn
    @Vignesh-ho2dn 7 months ago

    How would you find number 3 at 44:01 ? The algorithm you said will go to 5 and then since 5 is greater than 3, it won't go further. Am I right?

    • @jeromeeusebius
      @jeromeeusebius 5 months ago

      I think he is mostly explaining how the skip-list data structure works. In general, with HNSW, you are not looking for a particular value (those values are cosine similarity scores); rather, you traverse the graph toward neighbors that are closer to the query until you reach a local minimum, and that is the node that is returned. You then repeat the process from another entry point.
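
A minimal sketch of search in a textbook skip list, for the question above about finding 3 among the keys 1, 3, 5, 9. It assumes the standard rule of checking the next key before moving right, and it sets node heights by hand for reproducibility, whereas real skip lists choose them at random. The layout (only 5 promoted to the upper level) is an assumption and may not match the slide at 44:01.

```python
class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height  # next[i] is the successor on level i


def build(keys_with_heights):
    """Build a skip list from (key, height) pairs already sorted by key."""
    max_height = max(height for _, height in keys_with_heights)
    head = Node(None, max_height)  # sentinel placed before the first key
    last = [head] * max_height     # last node linked so far on each level
    for key, height in keys_with_heights:
        node = Node(key, height)
        for level in range(height):
            last[level].next[level] = node
            last[level] = node
    return head


def search(head, target):
    """Return (found, visited_keys). Start on the top level, move right
    while the next key is <= target, otherwise drop down one level."""
    node = head
    visited = []
    for level in reversed(range(len(head.next))):
        while node.next[level] is not None and node.next[level].key <= target:
            node = node.next[level]
            visited.append(node.key)
    found = node is not head and node.key == target
    return found, visited
```

With `build([(1, 1), (3, 1), (5, 2), (9, 1)])`, a search for 3 never steps onto 5: on the upper level the next key (5) is already greater than 3, so the search drops to the bottom level from the head and walks through 1 to 3.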

  • @chhabiacharya307
    @chhabiacharya307 10 months ago +1

    Thank YOU :)

  • @jrgenolsen3290
    @jrgenolsen3290 9 months ago +1

    💪👍 good introduction

  • @UncleDavid
    @UncleDavid 11 months ago

    Salam Mr Jamil, I was wondering whether it would be possible to use the BERT model provided by Apple in Core ML for sentiment analysis when talking to Siri, and then have a small GPT-2 model fine-tuned for conversation give a response that Siri then reads out

  • @ltbd78
    @ltbd78 10 months ago +1

    Legend

  • @YouHaveToLoveMe
    @YouHaveToLoveMe 2 months ago +1

    Masterpiece :)

  • @adatalearner8683
    @adatalearner8683 6 months ago

    How do we get the target cosine similarity in the first place?

    • @jeromeeusebius
      @jeromeeusebius 5 months ago +1

      There is an annotated dataset of sentence pairs scored by experts. This is what is used to compute the loss.

  • @MihailLivitski
    @MihailLivitski 11 months ago

    Thanks!

  • @tempdeltavalue
    @tempdeltavalue 10 months ago

    So how does the LLM convert a vector to text?

  • @tempdeltavalue
    @tempdeltavalue 10 months ago

    So how does the LLM convert a vector to text?

  • @amblessedcoding
    @amblessedcoding 1 year ago

    Thanks bro