Contextual Retrieval: colab.research.google.com/drive/1KGVxiwc2zoY9v6f3IQfs8qJIZeGeMKAq?usp=sharing
Event Slides: www.canva.com/design/DAGTv5ofV8g/-wFTZpoCu8yYzseTb_kx2g/view?DAGTv5ofV8g&
The Ragas part of the code in the notebook is not working. Could you fix it?
Great information here! Thanks for making it public. I think you're going to get a sizeable community around you because of these live streams.
Q: where in the code is prompt caching invoked?
Caching is offered by Anthropic's endpoint by default - and is being taken advantage of under the hood here.
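For anyone looking for the hook itself: in Anthropic's Messages API a prompt block can be explicitly marked as cacheable with `cache_control`. A minimal sketch, with the model name, document text, and prompt wording as placeholders rather than the notebook's actual values:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

full_document = "..."  # the whole source document, reused across every chunk call
chunk = "..."          # the chunk we want a situating context for

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # placeholder model name
    max_tokens=300,
    system=[{
        "type": "text",
        "text": f"<document>\n{full_document}\n</document>",
        # Marks this block as a cache breakpoint, so repeated calls that share
        # the same document prefix can reuse the cache instead of re-processing it.
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{
        "role": "user",
        "content": f"<chunk>\n{chunk}\n</chunk>\n"
                   "Give a short context that situates this chunk within the document.",
    }],
)
print(response.content[0].text)
```

Because caching is keyed on the shared prompt prefix, putting the full document first and the per-chunk instruction after it is what lets every chunk call hit the same cache entry.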
Great video and demo as always! I learn a lot from your content. The contextual retrieval paper said that if your corpus is less than 200k tokens, you should just skip RAG and dump the entire corpus into the prompt for every question; they will cache it (but only for a short time) and you just use long-context Q&A. I didn't see them publish any metrics comparing long context to RAG, so I take it with a grain of salt. They do want customers to spend as many tokens as possible... But I'm very intrigued at the same time. Maybe you could do a video comparing the two methods? That would be amazing research.
Great insights and instincts @Sean! We'll keep the content recommendation in mind for sure! This is the farthest we've gotten on Long-Context and Evaluation for the big-window LLMs: ua-cam.com/users/liveBrwhbjh3boU?si=V24z6pagQ0EQ8Ms1
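To make the "skip RAG under ~200k tokens" idea from the comment above concrete, here is a rough token-budget check. tiktoken is used purely as an approximate counter (Anthropic's tokenizer differs), and the threshold is the one cited in the comment, not a measured value:

```python
import tiktoken

LONG_CONTEXT_TOKEN_BUDGET = 200_000  # threshold cited in the contextual retrieval post

def choose_strategy(corpus: str) -> str:
    # Approximate token count; Anthropic's own tokenizer will differ somewhat.
    enc = tiktoken.get_encoding("cl100k_base")
    n_tokens = len(enc.encode(corpus))
    # Small corpora: put the whole thing in the (cached) prompt and do long-context Q&A.
    # Larger corpora: fall back to (contextual) retrieval.
    return "long-context" if n_tokens <= LONG_CONTEXT_TOKEN_BUDGET else "rag"
```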
Great video! I’m excited to dive into contextual retrieval next week.
When it comes to productionizing hybrid retrieval with BM25, I'm considering using Elasticsearch; are there any other recommendations?
My main concern with hybrid retrieval is the added complexity it brings to production.
Elasticsearch is a great tool for this!
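If you want to prototype the hybrid part before standing up Elasticsearch, here is a minimal LangChain sketch; FAISS, OpenAI embeddings, and the example chunks are stand-ins for whatever the production stack ends up being:

```python
# Hybrid retrieval sketch: BM25 (sparse) + embeddings (dense), fused with
# LangChain's EnsembleRetriever. In production the dense index could live in
# Elasticsearch instead of FAISS.
# Requires: rank_bm25, faiss-cpu, langchain, langchain-community, langchain-openai.
from langchain_core.documents import Document
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain.retrievers import EnsembleRetriever

docs = [Document(page_content=text) for text in ["chunk one ...", "chunk two ..."]]

bm25 = BM25Retriever.from_documents(docs)  # sparse, keyword-based
bm25.k = 4

dense = FAISS.from_documents(docs, OpenAIEmbeddings()).as_retriever(
    search_kwargs={"k": 4}                 # dense, embedding-based
)

# EnsembleRetriever fuses the two ranked lists with reciprocal rank fusion;
# the weights control how much each retriever influences the final ordering.
hybrid = EnsembleRetriever(retrievers=[bm25, dense], weights=[0.5, 0.5])

results = hybrid.invoke("What does contextual retrieval add to each chunk?")
```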
RAG-ception 0:55 - Context of the contextually generated chunks. Got it... got it... got it... OK wait, what? Need to watch the whole thing.
Hi Greg and Wiz. Great tutorial. I am actually applying it to my own application. I was wondering what you would suggest doing if the whole document is large, more than 700 pages. It can't be passed to the contextual chunking function, and if I take the 500 pages around the chunk, the caching won't work. Can you please advise? Thanks, Saurabh
I would build some metadata in this case, like a summary/outline, and use that to generate the contextual augmentations.
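Here is a hedged sketch of that idea with the Anthropic SDK: generate a compact outline once, then use the outline (rather than the full 700-page document) as the context when situating each chunk. The function names, prompts, and model name are illustrative, not from the notebook:

```python
import anthropic

client = anthropic.Anthropic()        # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-3-5-sonnet-20241022"  # placeholder model name

def summarize_document(full_text: str) -> str:
    """One-time call that produces a compact outline of the whole document.
    If the document exceeds the context window, summarize section by section
    and concatenate the section outlines instead."""
    resp = client.messages.create(
        model=MODEL,
        max_tokens=1000,
        messages=[{
            "role": "user",
            "content": f"Write a concise outline of this document:\n\n{full_text}",
        }],
    )
    return resp.content[0].text

def situate_chunk(outline: str, chunk: str) -> str:
    """Per-chunk call that uses the small outline instead of the full document,
    keeping the shared prefix well inside the context window (and cacheable,
    if it is long enough to meet the caching minimum)."""
    resp = client.messages.create(
        model=MODEL,
        max_tokens=150,
        system=f"<outline>\n{outline}\n</outline>",
        messages=[{
            "role": "user",
            "content": f"<chunk>\n{chunk}\n</chunk>\n"
                       "Briefly situate this chunk within the document described by the outline.",
        }],
    )
    return resp.content[0].text
```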
thanks :)
Great video, but using text that the model was already trained on is a bad test case
Agreed! We typically stick with easy-to-consume toy examples, however!