Get your copy of "Building LLMs for Production": amzn.to/4bqYU9b
RAG is just 'full text indexing' on the local data with the ranked results fed into the context window and sent to the LLM along with the question.
As a database guy for the last 30 years, every time I see it described, all I see are new words describing long-solved problems.
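For illustration, here is a minimal sketch of the pattern that comment describes: rank local chunks against the question, keep the top few, and build the prompt that gets sent to the LLM. The keyword-overlap ranking is a stand-in; a real system would use BM25 (as Elasticsearch does) or vector embeddings.

```python
# Minimal sketch of "ranked retrieval -> context window -> LLM".
# Ranking here is naive keyword overlap, purely for illustration.

def score(chunk: str, query: str) -> int:
    # Rough relevance score: how many query words appear in the chunk.
    return sum(word.lower() in chunk.lower() for word in query.split())

def build_prompt(query: str, chunks: list[str], top_k: int = 3) -> str:
    # Keep the best-matching chunks and stuff them into the prompt.
    ranked = sorted(chunks, key=lambda c: score(c, query), reverse=True)
    context = "\n\n".join(ranked[:top_k])
    return (
        "Use the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# The returned prompt is then sent to whichever LLM client you use.
```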
You mean like how Elasticsearch does indexing?
Well, new cars have wheels, a technology that has existed for thousands of years. That doesn't mean new cars are 'obsolete'; using old tech to improve a new one is a great way of doing engineering!
What happens to the information retrieved by RAG if the original request already occupies the entire context window?
It depends on how the system is implemented! Most will put a check in place to detect it and then summarize or extract key points to make the retrieved content shorter.
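A rough sketch of one such guard, assuming a tokenizer like tiktoken; the window size and the crude key-point extractor below are placeholders (a real system would often call the LLM itself to summarize).

```python
# Sketch: if the request plus retrieved passages would overflow the context
# window, compress the passages before building the prompt.
import tiktoken

MAX_CONTEXT_TOKENS = 8192  # assumed window size; depends on the model
enc = tiktoken.get_encoding("cl100k_base")

def key_points(chunk: str, max_sentences: int = 2) -> str:
    # Stand-in summarizer: keep only the first sentences of each chunk.
    return ". ".join(chunk.split(". ")[:max_sentences])

def fit_context(request: str, retrieved: list[str]) -> str:
    # Token budget left over after the user's request.
    budget = MAX_CONTEXT_TOKENS - len(enc.encode(request))
    context = "\n\n".join(retrieved)
    if len(enc.encode(context)) > budget:
        # Too long: fall back to summaries / key points instead of raw text.
        context = "\n\n".join(key_points(chunk) for chunk in retrieved)
    return context
```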
Great video. Would you make a video on the different types of RAG? Or on how to prepare data for RAG, for example when your document has tables, math formulas, or references to images? I haven't seen much content about how to handle diverse data inside a document with RAG.
Cheers
Great idea, thank you! Will definitely look into multimodal RAG! :)
I think this is the best video I have seen on this topic. Wanted to ask if we can use RAG offline, maybe with a Mistral model?
Of course you can host everything locally if you have the capacity! :)
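As a hedged sketch of what a fully local setup could look like: local embeddings via sentence-transformers plus a local Mistral model served by Ollama. The model name, endpoint, and document list below are assumptions (it presumes `ollama pull mistral` has been run and the default server is up at localhost:11434).

```python
# Fully offline RAG loop: local embeddings + a local Mistral model via Ollama.
import requests
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # runs locally, no API key

docs = ["...your local documents, split into chunks..."]  # placeholder data
doc_emb = embedder.encode(docs, convert_to_tensor=True)

def local_rag(question: str, top_k: int = 3) -> str:
    # Embed the question and find the most similar local chunks.
    q_emb = embedder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, doc_emb, top_k=top_k)[0]
    context = "\n\n".join(docs[h["corpus_id"]] for h in hits)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    # Send the prompt to the local Ollama server (assumed default endpoint).
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "mistral", "prompt": prompt, "stream": False},
    )
    return resp.json()["response"]
```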
By any chance, do you know which RAG system/framework gives the best performance?
In our work we like to use LlamaIndex for many parts and adapt our own code for more personalized settings!
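For reference, a minimal LlamaIndex example looks roughly like this; the import paths are for recent releases and change between versions, and it assumes an LLM is configured (by default an OpenAI key in the environment).

```python
# Minimal LlamaIndex sketch: load local files, index them, then query.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()   # load files from ./data
index = VectorStoreIndex.from_documents(documents)      # embed + index them
query_engine = index.as_query_engine()                  # retrieval + LLM answer
print(query_engine.query("What does the document say about X?"))
```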
Very Informative and useful!! Thanks
Thanks but what's the point of sound effects?
Truly excellent video!
Wow. Thanks a lot for this amazing explanation
Thanks , very clear excellent explanation
Thank you! :)
Now I understand what RAG (Retrieval Augmented Generation) is. Very informative video, liked it 👍
Great video, straight to the point. Thanks again
Thank you Sabri! :)
How do you protect a company's information with this technology?
You'd only provide placeholders for the company's information within these prompts and make sure they follow a specific format.
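A small sketch of that placeholder idea, assuming simple regex patterns for the sensitive fields; the patterns and the internal ID format are hypothetical, and a real setup would use a proper PII/secret-detection step.

```python
# Swap sensitive values for placeholders before the text reaches the LLM,
# then map them back into the answer afterwards.
import re

PATTERNS = {
    "EMAIL": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "ACCOUNT_ID": r"\bACC-\d{6}\b",  # hypothetical internal ID format
}

def redact(text: str) -> tuple[str, dict]:
    # Replace each detected value with a labeled placeholder and remember it.
    mapping = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(re.findall(pattern, text)):
            placeholder = f"[{label}_{i}]"
            mapping[placeholder] = match
            text = text.replace(match, placeholder)
    return text, mapping

def restore(text: str, mapping: dict) -> str:
    # Put the original values back into the model's answer.
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text
```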
Really clear and precise, thank you.
Subbed
Et cetera, of course, my friend.
thx. i enjoyed this video
Glad to hear so my friend! 😊
AI algorithms facilitate better decision-making in business by providing actionable insights from data analysis. This enhances strategic planning and operational efficiency.
After integrating with RAG, latency increased...
That is for sure! There are some downsides, but the added latency is very small.
The accent of the speaker is pretty heavy.
Hope it’s still easy to understand!
Google launched Gemini Advanced 1.5, a RAG killer 💀
A database can be much larger than the context window and much more efficient, I believe. It's unclear how good these models are vs GPT-4 yet. Plus, sending millions of tokens with every prompt would be extremely expensive per request, haha! It's good for some use cases, like sending a full repo once and asking questions about it, but not for working with customers and handling many requests, I believe.
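A back-of-the-envelope calculation of that cost point, with an assumed input-token price purely for illustration (not an actual quote for any provider):

```python
# Why stuffing a huge context into every request adds up versus RAG.
PRICE_PER_1K_INPUT_TOKENS = 0.01  # assumed price in USD, for illustration only

def cost_per_request(context_tokens: int) -> float:
    return context_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS

print(cost_per_request(1_000_000))  # ~$10 per request when sending ~1M tokens
print(cost_per_request(4_000))      # ~$0.04 per request with a few retrieved chunks
```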