Contextual Retrieval with Any LLM: A Step-by-Step Guide

  • Published 25 Dec 2024

COMMENTS • 32

  • @DanielBowne 2 months ago +1

    Would have loved to see this done with Anthropic, mostly because if you wanted to do this on larger documents, context caching from Anthropic would be ideal.

  • @lolwhatsmeta 2 months ago

    I love this kind of explanation of the code from the LLM provider 🥰 Thank you so much.

  • @out_and_about08 2 months ago +7

    Thanks! Can you please create a video on hybrid RAG: vector + graph-based retrieval?

  • @pprvitaly 2 months ago

    14:44 - it would be nice to have a practical video about late chunking.

  • @buanadaruokta8766 2 months ago +1

    Thank you for creating this content. It's very useful for completing the bachelor's thesis I'm currently working on. I'd like to ask a question: when the chat history reaches thousands of entries, and this chatbot is potentially used in a mobile app, is a vector database needed (for storing the data)? If so, should each data query (session ID, query, and answers) be stored? Or is there something else to consider? In this case, I want the vector database to address the limitation of LLMs: the context window constraint.

  • @gramnegrod 2 months ago

    Thanks for explaining how to generalize the Anthropic trick! It's very germane. And thanks for all the different RAG approaches to consider. Could you do a video on how to evaluate these different methods with metric-driven analysis? Right now I just eyeball the results, which is time-consuming, and I'm not that good at distinguishing small improvements between RAG models.

  • @Sabeer-k-s 2 months ago

    Thank You❤

  • @NLPprompter 2 months ago

    Thank You.

  • @alx8439 2 months ago +1

    How costly will it be to send the same big document to some paid API many, many times, asking it to locate the next small snippet and add some context? It would be ridiculously expensive.

    • @tomaszzielinski4521 2 months ago +1

      Yeah, sounds very inefficient, with quadratic complexity as a function of document size.
      That gave me an idea, though: what if we first generate a PDF summary and then use it as the context, instead of the full document?

    • @engineerprompt 2 months ago +1

      Anthropic, Gemini, and OpenAI all support prompt/context caching. This can substantially reduce the cost by caching your document.
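The caching approach this reply describes can be made concrete. Below is a minimal Python sketch (no network call is made; the model name, prompt wording, and helper name are illustrative assumptions, not from the video) of how a per-chunk contextualization request could be laid out for Anthropic-style prompt caching: the full document sits in a system block marked with `cache_control`, so repeated per-chunk calls can reuse the cached prefix instead of re-billing the whole document each time.

```python
def build_contextualize_request(document: str, chunk: str) -> dict:
    """Build one request payload for contextualizing a single chunk.

    The document block is identical across all chunk requests, which is
    what makes it a candidate for the provider's prompt cache.
    """
    return {
        "model": "claude-3-5-sonnet-20241022",  # assumed model name
        "max_tokens": 200,
        "system": [
            {
                "type": "text",
                "text": f"<document>\n{document}\n</document>",
                # The cacheable part: unchanged from call to call.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [
            {
                "role": "user",
                "content": (
                    "Here is a chunk from the document above:\n"
                    f"<chunk>\n{chunk}\n</chunk>\n"
                    "Give a short context situating this chunk "
                    "within the overall document."
                ),
            }
        ],
    }
```

Only the small per-chunk user message varies between calls, so the cost per chunk is dominated by the chunk prompt, not the document.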

  • @msondkar 2 months ago

    Will this work if I have JSON data instead of text documents? How do I work out contextual embeddings for JSON chunks?

    • @engineerprompt 2 months ago

      What is in the JSON? Can you create flat descriptions from your JSON and add a reference to the actual JSON in the metadata?
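The flat-description idea in this reply can be sketched as follows. These are hypothetical helpers (the function names and chunk layout are assumptions, not from the video), showing one way to flatten nested JSON into embeddable "path: value" text while keeping the raw JSON in the chunk's metadata for lossless retrieval.

```python
import json


def flatten_json(obj, prefix=""):
    """Turn nested JSON into flat 'path: value' description lines,
    which embed well as plain text."""
    lines = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            lines.extend(flatten_json(value, path))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            lines.extend(flatten_json(value, f"{prefix}[{i}]"))
    else:
        lines.append(f"{prefix}: {obj}")
    return lines


def json_chunk(record: dict) -> dict:
    """A chunk ready for embedding: flat text for the embedder,
    plus the original JSON preserved in metadata."""
    return {
        "text": "\n".join(flatten_json(record)),
        "metadata": {"source_json": json.dumps(record)},
    }
```

At query time you embed and search over `text`, then hand the application the exact JSON stored in `metadata`.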

  • @astronosmage3722 2 months ago +12

    The whole point of RAG, at least for me, was not having to feed the LLM the whole document. Now this needs to be done for every chunk? Doesn't seem very efficient to me.

    • @q0x 2 months ago

      Prompt caching helps in this case, but I am also not a great fan of putting whole documents into the LLM, especially since they may still blow up the context size and processing may take a long time.

    • @moin_uddin 2 months ago +1

      I'm facing a similar situation: if I could enter the whole document, why would I need RAG?

    • @loganyang 2 months ago +2

      Not to mention the additional time it needs for indexing; that is the dealbreaker for me.

    • @rikhendrix261 2 months ago

      @moin_uddin You can fit one document or some part of the entire document, but RAG was made for scanning through thousands of documents, right? I do agree that sending the extra text to add context is a tough problem. I am wondering whether the contextualizing wouldn't already work by adding two chunks before and two chunks after, or one chunk before and one chunk after?

    • @moin_uddin 2 months ago

      @rikhendrix261 Like, we could just add the first few pages of a document; even that can help in adding context.
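The neighbor-window idea raised in this thread, adding a chunk or two on each side instead of calling an LLM at all, can be sketched like this (a hypothetical helper; the name and defaults are assumptions, not something shown in the video):

```python
def window_contextualize(chunks, before=1, after=1, sep="\n...\n"):
    """Cheap, LLM-free contextualization: embed each chunk together
    with its neighboring chunks, but return only the original chunk
    to the application at query time."""
    out = []
    for i, chunk in enumerate(chunks):
        lo = max(0, i - before)            # clamp at document start
        hi = min(len(chunks), i + after + 1)  # clamp at document end
        out.append({
            "embed_text": sep.join(chunks[lo:hi]),
            "original": chunk,
        })
    return out
```

This captures only local context, so it is weaker than document-level contextualization when a chunk's meaning depends on distant sections, but it costs nothing per chunk.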

  • @konstantinlozev2272 2 months ago

    Shouldn't you join a summary (or summaries) to the prompt, instead of the whole document/section? Wouldn't that be even better at providing the essence of the context?

  • @micbab-vg2mu 2 months ago

    Thanks :)

  • @JNET_Reloaded 2 months ago

    Have you got a GitHub link to the code used?

  • @dgoodall6468 2 months ago

    Place into instructions and thank me later
    ---
    You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside `` tags, and then provide your final response inside `` tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside `` tags. Self-reflection is mandatory in every reply unless specifically stated by the user.