A very good video showing that following the main trends isn't always profitable. Thanks.
Excellent findings. Keep up the good work!
Thank you (:
A tutorial for this real-world use case is absolutely necessary. It’s highly relevant and applicable to many real-world problems.
Awesome video! Thank you for sharing your techniques. Extracting use case-specific info and storing it in metadata before indexing is a very interesting approach. This might actually be better than regular contextualization of chunks, where you add the info to the content of your chunk instead of the metadata. Will definitely try that out. Thanks!
Would love to see you talk about agent frameworks in the future! Especially how you could try to make something as good as the Composer Agent from Cursor.
Thanks! Yeah, this kind of metadata-enriching with LLMs can definitely be applied in all kinds of ways. Very versatile.
Might make a video about LangGraph at some point, it's honestly the only agent framework I've found that could be useful - I tend to look for deterministic workflows as much as possible. Don't know about making something that could touch Composer though, the engineering team at Cursor is pretty nuts😄
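If anyone wants to try the metadata-enrichment idea, the step could look roughly like this (a minimal sketch using the openai client; the 'services'/'cities' fields and the prompt are placeholders, not the exact ones from the video):

```python
import json
from openai import OpenAI

client = OpenAI()

def enrich_chunk(chunk_text: str) -> dict:
    """Ask the model for use case-specific metadata for one chunk."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": (
                "Extract metadata from the document as JSON with two keys: "
                "'services' (list of strings) and 'cities' (list of strings). "
                "Return valid JSON only."
            )},
            {"role": "user", "content": chunk_text},
        ],
    )
    return json.loads(resp.choices[0].message.content)

# At indexing time the result goes into the chunk's metadata rather
# than its text, e.g. {"text": chunk, "metadata": enrich_chunk(chunk)}
```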
Interesting topic, of course. I would start by using an LLM for the UX part (NLP etc.) and then generate SQL against a database; the content is structured anyway. The result could then be polished with a fine-tuned model. Complete or partial results could be cached too, since we are inside a specific domain. Outliers could be caught and managed. That would be the benchmark to beat in that case, running cost included...
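Roughly something like this (just a sketch; the table schema, prompt and model choice are all assumptions):

```python
import sqlite3
from functools import lru_cache
from openai import OpenAI

client = OpenAI()
db = sqlite3.connect("services.db")  # hypothetical structured content

@lru_cache(maxsize=1024)  # cache complete results per question
def answer(question: str) -> list:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": (
                "Translate the question into one SQLite SELECT statement "
                "over: services(name TEXT, city TEXT, price REAL). "
                "Return bare SQL, no markdown."
            )},
            {"role": "user", "content": question},
        ],
    )
    sql = resp.choices[0].message.content.strip()
    if not sql.lower().startswith("select"):  # crude outlier handling
        raise ValueError(f"Unexpected model output: {sql!r}")
    return db.execute(sql).fetchall()
```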
That's really interesting. So, the LLM got rid of the conjugation problem by mapping all the forms to specific services? Did you also run any tests to find out how large the LLM needs to be for that functionality?
Yep, that's correct. Both in extracting the services and in structuring the filters, the LLM is instructed to return the unconjugated/nominative form of the services and cities. So we got two birds with one stone, getting the filtering as well as eliminating the conjugation issues (:
We tested a couple of OpenAI models of varying sizes. They all did pretty good, but the smaller ones occasionally missed some services in the extraction. So we ended up going with a larger model (4o), which performed very well.
But for the query rewriting/structuring, I'm pretty sure we concluded that 4o-mini was good enough for that. So as far as conjugation goes, I think smaller models should be able to do it. It was more the service extraction where we saw issues. This was of course specific to Finnish, so your mileage may vary (:
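If it helps, the structuring step could look roughly like this (a minimal sketch, not our exact prompt or schema; the example query is only for illustration):

```python
import json
from openai import OpenAI

client = OpenAI()

def structure_query(user_query: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # was enough for the rewriting/structuring
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": (
                "From the user's Finnish query, return JSON with keys "
                "'services' and 'cities'. Give every value in its "
                "uninflected nominative form, e.g. 'Helsingissä' -> 'Helsinki'."
            )},
            {"role": "user", "content": user_query},
        ],
    )
    return json.loads(resp.choices[0].message.content)

# structure_query("Etsin hierontaa Helsingissä")
# -> {"services": ["hieronta"], "cities": ["Helsinki"]}
```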
thanks:)
I second your approach, though you're a bit hard on agentic orchestration. It might not fit your use case, but it still has plenty of other happy endings ;)
Yeah, definitely. There's just quite a lot of over-enthusiasm about taking an agentic approach wherever possible, so I want to push back on that (:
Is the LLM BERT?
Nah, GPT-4o and -mini
@johannesjolkkonen Trying to figure out, now that ModernBERT is out, at what stage of RAG it gets applied.
@mtprovasti Ah, right. BERT is an encoder, meaning it would be used to create the embeddings for vector search. Here we used OpenAI's ada-002 encoder for the same purpose. Until we gave up on vector search, that is.
BERT is a popular choice when you want a fine-tuned embedding model though, to better capture the semantic similarities/dissimilarities in your specific content (and thus get better retrieval results)
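To make the "BERT as an encoder" point concrete, here's roughly where it slots into retrieval (a sketch with the sentence-transformers library; the model name is just a common default, not something from the video):

```python
from sentence_transformers import SentenceTransformer

# Any BERT-family encoder works here; swap in a fine-tuned one
# to better capture your domain's semantics.
model = SentenceTransformer("all-MiniLM-L6-v2")

docs = ["Massage services in Helsinki", "Dental care in Tampere"]
doc_vecs = model.encode(docs, normalize_embeddings=True)

query_vec = model.encode("hieronta Helsinki", normalize_embeddings=True)
scores = doc_vecs @ query_vec  # cosine similarity, since vectors are normalized
print(scores)  # higher score = semantically closer document
```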
After a day, why only 22 likes???
Why not just a KG with triples?
Not sure what the benefit would be. What do you think?
Disagree about agentic RAG. It's becoming a common feature. It's not just some grad paper. I don't know why you would say this after presenting a use case.
Sure, not saying it doesn't have its place. Just that in my experience, people are too quick to jump on flashy solutions instead of simpler ones that get the job done fine, and in a more robust way.
Could you achieve similar results with an agentic approach? Maybe, but they typically come with serious trade-offs in latency, cost and unpredictability.
Appreciate the comment though. One reason why I often speak against agents is also just how poorly the term is defined and over-used. Fine for marketing, but imo not useful when it comes to actually understanding how all this stuff works.