Anthropic's new improved RAG: Explained (for all LLM)

  • Published 27 Dec 2024

COMMENTS • 25

  • @DaveRetchless
    @DaveRetchless 2 months ago +2

    So much to learn... every day! Thanks for providing great content! You are one of my daily learning resources. Keep kicking AI!

    • @code4AI
      @code4AI  2 months ago

      Thanks. Smile.

  • @DanielBowne
    @DanielBowne 2 months ago +1

    I am super stoked about this. Soooo many AI content channels just regurgitate Anthropic's article with no original content. Would love someone to take this, actually build an example, and show a comparison of how classic RAG worked for them vs. this new method.

  • @1DusDB
    @1DusDB 2 months ago +2

    12:33 Give the whole document?! But at the beginning, at 0:40, they say that if the knowledge base is under 200K tokens it's better to send it within the prompt (so no RAG is used). 🤔
    So what happens if the document is bigger than 200K tokens in that "situate_context" code?

    • @code4AI
      @code4AI  2 months ago

      It is so easy to find the answer. Just upload a 500-page document to Claude and see what happens. You can experience AI yourself! Trust yourself.
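
For readers puzzling over the situate_context question above: the idea is to pass the whole document plus one chunk per call and ask the model for a short sentence that situates that chunk. A minimal sketch using the Anthropic Python SDK; the prompt wording and helper name are illustrative, not Anthropic's exact cookbook code, and it assumes the document fits in the context window.

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

def situate_context(document: str, chunk: str) -> str:
    """Give the model the WHOLE document plus ONE chunk; get back a short situating context."""
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=200,
        messages=[{
            "role": "user",
            "content": (
                f"<document>\n{document}\n</document>\n\n"
                f"Here is a chunk from that document:\n<chunk>\n{chunk}\n</chunk>\n\n"
                "Write a short context that situates this chunk within the overall "
                "document, to improve search retrieval of the chunk. "
                "Answer with the context only."
            ),
        }],
    )
    return response.content[0].text

# The contextualised chunk is what gets embedded / BM25-indexed:
# contextualised = situate_context(doc, chunk) + "\n\n" + chunk
```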

  • @ramitube21
    @ramitube21 2 months ago +3

    What about building a RAG system with contextual retrieval and open-source models like Llama?
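
On the open-source question: the same contextualisation step can be run against a local model. A rough sketch assuming a Llama model served behind an OpenAI-compatible endpoint (e.g. Ollama or vLLM); the URL and model name are placeholders.

```python
from openai import OpenAI

# Assumption: a local Llama model behind an OpenAI-compatible API
# (e.g. `ollama serve` or vLLM). Base URL and model name are illustrative.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")

def situate_chunk_locally(document: str, chunk: str) -> str:
    """Ask the local model for a short context sentence for one chunk."""
    resp = client.chat.completions.create(
        model="llama3.1:8b",
        max_tokens=150,
        messages=[{
            "role": "user",
            "content": (
                f"<document>\n{document}\n</document>\n\n"
                f"<chunk>\n{chunk}\n</chunk>\n\n"
                "In one or two sentences, explain how this chunk fits into the "
                "document, so the chunk is easier to retrieve by search."
            ),
        }],
    )
    return resp.choices[0].message.content
```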

  • @PedroPereira-i6b
    @PedroPereira-i6b 2 months ago

    I have a question: if you need to load the entire document into the prompt, does that mean that Contextual RAG doesn't work for situations where the document has more than 200k tokens?
    In a way, it seems that this solution undermines the main principle of RAG, which is to fragment the content so it 'fits' into the prompt.

  • @pedrogondim2740
    @pedrogondim2740 2 months ago

    How does this compare to Jina's Late Chunking approach for contextual understanding?
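
For context on the comparison: late chunking embeds the whole document once with a long-context embedding model and then pools the token embeddings per chunk, instead of prepending generated context. A rough sketch with mean pooling; the model name and token-span boundaries are placeholders, and this is not Jina's reference implementation.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder long-context encoder; any 8K-context embedding model would do for the sketch.
MODEL = "jinaai/jina-embeddings-v2-base-en"
tok = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL, trust_remote_code=True)

def late_chunk(document: str, chunk_token_spans: list[tuple[int, int]]) -> list[torch.Tensor]:
    """Encode the WHOLE document once, then mean-pool token embeddings per chunk span."""
    inputs = tok(document, return_tensors="pt", truncation=True, max_length=8192)
    with torch.no_grad():
        token_embs = model(**inputs).last_hidden_state[0]  # (seq_len, dim)
    return [token_embs[start:end].mean(dim=0) for start, end in chunk_token_spans]
```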

  • @johnnybloem1
    @johnnybloem1 2 months ago

    I like your sense of humour. I *giggled* when you spoke about chickens and eagles, categorising both as birds. Chickens have not yet formally been added to the bird category… maybe due to their limited flying capacity 😂

  • @mulderbm
    @mulderbm 2 months ago

    Love it, keep your humor in these videos, it's beautiful 😂

  • @xsrothebeginner8658
    @xsrothebeginner8658 2 months ago

    How about storing a hierarchical order of the chunks (e.g. by paragraph), which you attach to the retrieved embedding vectors? In addition, you can ask the LLM for the most important words in the prompt (entities like names), search for those words in the chunk texts, and again use the hierarchical order of the chunks to obtain the contextual chunks.
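
A rough sketch of the idea in the comment above, assuming each chunk already carries its document/section/paragraph position; all class and field names are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    doc_id: str
    section: int    # hierarchical position: section index ...
    paragraph: int  # ... and paragraph index within that section

def expand_with_neighbors(hit: Chunk, all_chunks: list[Chunk], window: int = 1) -> list[Chunk]:
    """After vector retrieval, pull in hierarchically adjacent paragraphs as extra context."""
    return [
        c for c in all_chunks
        if c.doc_id == hit.doc_id
        and c.section == hit.section
        and abs(c.paragraph - hit.paragraph) <= window
    ]

def entity_filter(chunks: list[Chunk], entities: list[str]) -> list[Chunk]:
    """Keyword pass: keep chunks mentioning the entities the LLM extracted from the query."""
    return [c for c in chunks if any(e.lower() in c.text.lower() for e in entities)]
```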

  • @remusomega
    @remusomega 2 months ago

    Late Chunking has solved the Chunking Context problem.

    • @code4AI
      @code4AI  2 months ago

      Please indicate each of your jokes with a clear label, like joke::

  • @JonCollins-eq8jm
    @JonCollins-eq8jm 2 months ago

    I would also like to see and/or work on an open-source implementation. If anyone has a resource, or @Discover AI would like to work on it, it would be appreciated. For the process described, why are the whole document and each individual chunk fed to the LLM every time - couldn't you just feed the document once together with, for example, a batch of 100 chunks (assuming it fits in the context window)? Then the LLM could produce a batch of contextualised chunks, rather than calling it so many times.
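
On the batching question above: nothing in principle prevents sending the document once together with many chunks and asking for one context per chunk. A rough sketch with the Anthropic SDK; the prompt and the assumption that the model returns a clean JSON array are illustrative and would need validation in practice.

```python
import json
import anthropic

client = anthropic.Anthropic()

def situate_batch(document: str, chunks: list[str]) -> list[str]:
    """One call: the whole document plus a batch of chunks -> one context string per chunk."""
    numbered = "\n".join(f"<chunk id='{i}'>\n{c}\n</chunk>" for i, c in enumerate(chunks))
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": (
                f"<document>\n{document}\n</document>\n\n{numbered}\n\n"
                "For each chunk, write a short context situating it within the document. "
                "Return a JSON array of strings, one per chunk, in order."
            ),
        }],
    )
    return json.loads(response.content[0].text)

# e.g. contexts = situate_batch(doc, chunks[:100])
# Trade-off: one call instead of 100, but long outputs may hit max_tokens and need chunking anyway.
```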

  • @ChristophBackhaus
    @ChristophBackhaus 2 months ago

    Is this not also very useful if you use a very long system prompt?
    I have a system prompt that is a couple of pages long, telling the AI what our coding conventions are. The idea is that instead of giving the model a bunch of code and having it try to guess what the rules behind the code are, we tell it. At least in my small amount of testing this has worked quite well.

  • @__mbCrypto
    @__mbCrypto 2 months ago

    So... we've been doing this for MONTHS. Store the chunk without implicit references to reduce false in-context generations, and rewrite it using recursive summarisation until the full document fits in the context window.
    Didn't know that could justify a published paper... 😅 To us, it's just common sense and some implementation details.
    Anyone agree, or are we secretly geniuses? 😂
    Great video BTW
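
A sketch of the recursive-summarisation variant described above, assuming a generic summarize(text) LLM call (not shown) and a crude token estimate; all names and the token budget are illustrative.

```python
def fits(text: str, budget_tokens: int = 150_000) -> bool:
    # Crude token estimate: roughly 4 characters per token.
    return len(text) / 4 <= budget_tokens

def recursive_summary(document: str, summarize, budget_tokens: int = 150_000) -> str:
    """Split, summarise the halves, and recurse until the condensed text fits the context window."""
    if fits(document, budget_tokens):
        return document
    mid = len(document) // 2
    condensed = "\n".join(summarize(part) for part in (document[:mid], document[mid:]))
    return recursive_summary(condensed, summarize, budget_tokens)

# The fitted summary then stands in for the full document when rewriting each chunk
# to remove implicit references ("the company" -> "ACME Corp", "it" -> "the Q2 report", etc.).
```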

  • @ChristophBackhaus
    @ChristophBackhaus 2 months ago

    So. Did you see that Meta claims to have solved prompt injection?

  • @SirajFlorida
    @SirajFlorida 2 months ago +1

    That's why I can't turn away.

  • @skinclub-cosmeticdoctors6247
    @skinclub-cosmeticdoctors6247 2 months ago +4

    So in essence it's a load of BS. Why would we want to triple our embedding requirements? It's not sustainable.

    • @DanielBowne
      @DanielBowne 2 months ago +1

      Embedding storage is the cheap part. Anthropic is stating a 47% accuracy improvement. If you use context caching, this will be fairly cheap to build out. Even cheaper if you use a local embedding model.

    • @dragoon347
      @dragoon347 2 months ago

      They are giving a master class on how to frontload and save on cost. If you have a large knowledge base, it's best to do this once; then you don't have to redo all the contextualisation again.
      For example, all previous years' sales numbers for a company. A nice static database.
      Something like a chatbot memory, on the other hand, would need the extra compute, as new information is constantly being ingested.
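
On the cost point in this thread: with Anthropic's prompt caching, the static knowledge base is paid for roughly once, and only the small chunk prompt varies per call. A minimal sketch with the Anthropic SDK; older SDK/API versions may additionally require a prompt-caching beta header.

```python
import anthropic

client = anthropic.Anthropic()

def situate_with_cache(document: str, chunk: str) -> str:
    """The document block is cached across calls; only the short chunk prompt changes."""
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=200,
        system=[{
            "type": "text",
            "text": f"<document>\n{document}\n</document>",
            "cache_control": {"type": "ephemeral"},  # reused cheaply on subsequent calls
        }],
        messages=[{
            "role": "user",
            "content": (
                f"<chunk>\n{chunk}\n</chunk>\n"
                "Briefly situate this chunk within the document above."
            ),
        }],
    )
    return response.content[0].text
```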

  • @DoktorUde
    @DoktorUde 2 months ago

    This could have been explained in 10 minutes instead of 34.

    • @code4AI
      @code4AI  2 months ago +4

      Glad to hear the concept of contextual retrieval clicked for you so quickly! If you're ready to explain it in 10 minutes now, I'd say the 34 minutes were well spent. Thanks for the feedback!