2 Methods For Improving Retrieval in RAG

  • Published Dec 20, 2024

COMMENTS • 22

  • @sapdalf
    @sapdalf 17 hours ago +1

    A very good video showing that following the main trends isn't always profitable. Thanks.

  • @AA-rd6nm
    @AA-rd6nm 1 day ago +1

    Excellent findings. Keep up the good work!

  • @limjuroy7078
    @limjuroy7078 19 hours ago

    A tutorial for this real-world use case is absolutely necessary. It’s highly relevant and applicable to many real-world problems.

  • @PierreRibardière-u7x
    @PierreRibardière-u7x 1 day ago +1

    Awesome video! Thank you for sharing your techniques. Extracting use-case-specific info and storing it in metadata before indexing is a very interesting approach. It might actually be better than regular contextualization of chunks, where you add the info to the content of the chunk instead of the metadata. Will definitely try that out. Thanks!
    Would love to see you talk about agent frameworks in the future! Especially how you could try to make something as good as the Composer Agent from Cursor.

    • @johannesjolkkonen
      @johannesjolkkonen  1 day ago

      Thanks! Yeah, this kind of metadata-enriching with LLMs can definitely be applied in all kinds of ways. Very versatile - there's a rough sketch of the idea below.
      Might make a video about LangGraph at some point - it's honestly the only agent framework I've found that could be useful, since I tend to look for deterministic workflows as much as possible. Don't know about making something that could touch Composer though, the engineering team at Cursor is pretty nuts😄
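
A minimal sketch of the metadata-enrichment idea from this thread, assuming the OpenAI Python SDK; the prompt, field names, and model choice are illustrative rather than the exact pipeline from the video:

```python
# Sketch: enrich each chunk with LLM-extracted metadata before indexing,
# so fields like service/city live alongside the chunk rather than in it.
# Prompt and field names are illustrative, not the video's actual pipeline.
import json
from openai import OpenAI

client = OpenAI()

EXTRACTION_PROMPT = (
    "Extract the services and cities mentioned in the text below. "
    "Respond with JSON containing 'services' and 'cities', each a list of "
    "strings in their base (nominative) form.\n\nText:\n{text}"
)

def enrich_chunk(text: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",  # the thread below settled on a larger model for extraction
        messages=[{"role": "user", "content": EXTRACTION_PROMPT.format(text=text)}],
        response_format={"type": "json_object"},
    )
    metadata = json.loads(response.choices[0].message.content)
    # Store the fields next to the chunk, not inside its content, so they
    # can drive filtering at query time without changing what gets embedded.
    return {"content": text, "metadata": metadata}
```

Keeping the extracted fields in metadata rather than in the chunk text means they can act as hard filters at query time, which is the contrast with contextualized chunks the comment raises.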

  •  1 hour ago

    Interesting topic, of course. I would start by using an LLM for the UX part (NLP etc.) and then generate SQL against a database - the content is structured anyway. The result could then be polished with a fine-tuned model. Complete or partial results could be cached too, since we are inside a specific domain. Outliers could be caught and managed. This would be the benchmark to beat in that case, running cost included...
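
A rough sketch of the text-to-SQL flow this comment describes, assuming a hypothetical SQLite schema and the OpenAI SDK; the table, database file, prompt, and model are placeholders:

```python
# Sketch of the commenter's alternative: the LLM handles the natural-
# language UX and generates SQL against the structured data directly.
# The schema and database file are hypothetical.
import sqlite3
from openai import OpenAI

client = OpenAI()
SCHEMA = "CREATE TABLE providers (name TEXT, service TEXT, city TEXT);"

def answer_with_sql(question: str) -> list:
    prompt = (
        f"Given this SQLite schema:\n{SCHEMA}\n"
        f"Write a single SELECT statement that answers: {question}\n"
        "Return only raw SQL, with no code fences or explanation."
    )
    sql = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content.strip()
    with sqlite3.connect("providers.db") as conn:
        # In real use, validate the generated SQL before executing it.
        return conn.execute(sql).fetchall()
```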

  • @ZabavTubus
    @ZabavTubus 1 day ago +1

    That's really interesting. So, the LLM got rid of the conjugation problem by mapping all the forms to specific services? Did you also run any tests to find out how large the LLM needs to be for that functionality?

    • @johannesjolkkonen
      @johannesjolkkonen  1 day ago

      Yep, that's correct. Both in extracting the services, and in structuring the filters, the LLM is instructed to return the un-conjugated / nominative form of the services and cities. So we got 2 birds with 1 stone, both getting the filtering as well as eliminating the conjugation-issues (:
      We tested a couple of OpenAI models of varying sizes. They all did pretty well, but the smaller ones occasionally missed some services in the extraction. So we ended up going with a larger model (4o), which performed very well.
      But for the query rewriting/structuring, I'm pretty sure we concluded that 4o-mini was good enough for that. So as far as conjugation goes, I think smaller models should be able to do it. It was more the service extraction where we saw issues. This was of course specific to Finnish, so your mileage may vary (:
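
A minimal sketch of the query rewriting/structuring step described in this thread, assuming the OpenAI Python SDK; the prompt wording and the Finnish example are illustrative:

```python
# Sketch: rewrite a free-form, possibly conjugated query into structured
# filters, normalizing services and cities to their nominative form.
import json
from openai import OpenAI

client = OpenAI()

def structure_query(user_query: str) -> dict:
    prompt = (
        "Rewrite the user's query as JSON with keys 'services' and 'cities'. "
        "Normalize every value to its uninflected, nominative form. "
        f"Query: {user_query}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # per the thread above, enough for rewriting
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

# A Finnish query like "putkimiehiä Helsingissä" might come back as
# {"services": ["putkimies"], "cities": ["Helsinki"]}.
```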

  • @micbab-vg2mu
    @micbab-vg2mu 5 hours ago +1

    thanks:)

  • @theseedship6147
    @theseedship6147 1 day ago +1

    I second your approach. I lean a bit heavily on agentic orchestration myself - it might not fit your use case, but it still has plenty of other happy endings ;)

    • @johannesjolkkonen
      @johannesjolkkonen  1 day ago

      Yeah, definitely. There's just quite a lot of over-enthusiasm about taking an agentic approach wherever possible, so I want to push back on that (:

  • @mtprovasti
    @mtprovasti 12 hours ago

    Is the LLM BERT?

    • @johannesjolkkonen
      @johannesjolkkonen  2 hours ago

      Nah, GPT-4o and -mini

    • @mtprovasti
      @mtprovasti 2 hours ago

      @johannesjolkkonen Trying to figure out, now that ModernBERT is out, at what stage of RAG it gets applied.

    • @johannesjolkkonen
      @johannesjolkkonen  56 minutes ago

      @@mtprovasti Ah, right. BERT is an encoder, meaning it would be used to create the embeddings for vector search. Here we used OpenAI's ada-002 encoder for the same purpose - until we gave up on vector search, that is.
      BERT is a popular choice when you want a fine-tuned embedding model though, to better capture the semantic similarities/dissimilarities in your specific content (and thus get better retrieval results)
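
A small sketch of where a BERT-style encoder fits in a RAG pipeline, using the sentence-transformers library; the model name is a common default, not the encoder discussed here:

```python
# Sketch: a BERT-family encoder embeds chunks and queries as vectors,
# which is the vector-search stage the reply above describes.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # BERT-family encoder

chunks = ["Plumbing services in Helsinki", "Electrician work in Tampere"]
chunk_vectors = model.encode(chunks, convert_to_tensor=True)

query_vector = model.encode("plumber helsinki", convert_to_tensor=True)
scores = util.cos_sim(query_vector, chunk_vectors)  # cosine similarities
print(chunks[int(scores.argmax())])  # best-matching chunk
```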

  • @timfitzgerald8283
    @timfitzgerald8283 21 hours ago +1

    After a day, why only 22 likes???

  • @themax2go
    @themax2go 19 hours ago

    why not just a knowledge graph with triples?

    • @johannesjolkkonen
      @johannesjolkkonen  2 hours ago

      Not sure what the benefit would be. What do you think?

  • @KevinKreger
    @KevinKreger 6 hours ago

    Disagree about agentic RAG. It's becoming a common feature. It's not just some grad paper. I don't know why you would say this after presenting a use case.

    • @johannesjolkkonen
      @johannesjolkkonen  1 hour ago

      Sure, not saying it doesn't have its place. Just that in my experience, people are too quick to jump on flashy solutions instead of simpler ones that get the job done fine, and in a more robust way.
      Could you achieve similar results with an agentic approach? Maybe, but they typically come with serious trade-offs in latency, cost and unpredictability.
      Appreciate the comment though. One reason why I often speak against agents is also just how poorly the term is defined and over-used. Fine for marketing, but imo not useful when it comes to actually understanding how all this stuff works.