This was an excellent talk. Thanks so much for sharing your experience and this RAG framework. If there could be a follow-up sometime with a sample notebook that uses these techniques, and a code walkthrough video, I’m sure many people would greatly benefit from it.
Sisil's presentation was exemplary, addressing key pain points with innovation. Congratulations on your work! We as a company are also trying to solve all the pain points you mentioned in the PDF docs area. Thanks to Jerry for spotlighting talent like Sisil. Excited for more content!
Amazing!!! Sisil really explained the difference between benchmarks and real-world open-ended questions! Btw, could you include the names of the datasets Sisil talked about regarding "retrieval" and "reranking" in evaluation?
Very helpful, guys! Thanks!
Super comprehensive. Thanks for this.
a showcase sample notebook would be deeply appreciated!
this was fantastic, can you provide a tutorial notebook?
Would have loved if you shared the slides in the description. :)
Is there a sample code you can post a link to, specifically for indexing the subdocs, then the chunks and retrieving them?
How does lexical indexing work for subdocs?
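Not from the talk, so treat this as a guess rather than Sisil's method: one common way to do it is to build a keyword (BM25-style) index over the subdocs, pick the best-matching subdocs, and then rank only the chunks inside them. A minimal stdlib-only sketch, where `LexicalIndex`, `search`, and the sample `subdocs` data are all made up for illustration:

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class LexicalIndex:
    """Tiny BM25 index over a list of texts (hypothetical helper, stdlib only)."""

    def __init__(self, texts, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [Counter(tokenize(t)) for t in texts]
        self.lengths = [sum(d.values()) for d in self.docs]
        self.avg_len = sum(self.lengths) / max(len(self.docs), 1)
        # Document frequency: in how many texts each term appears.
        self.df = Counter(term for d in self.docs for term in d)

    def score(self, query, i):
        doc, n = self.docs[i], len(self.docs)
        total = 0.0
        for term in tokenize(query):
            tf = doc.get(term, 0)
            if tf == 0:
                continue
            idf = math.log(1 + (n - self.df[term] + 0.5) / (self.df[term] + 0.5))
            norm = tf + self.k1 * (1 - self.b + self.b * self.lengths[i] / self.avg_len)
            total += idf * tf * (self.k1 + 1) / norm
        return total

    def top(self, query, k):
        """Return up to k text positions with a non-zero score, best first."""
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [i for s, i in scored[:k] if s > 0]


def search(query, subdocs, top_subdocs=1, top_chunks=1):
    """Stage 1: rank subdocs lexically. Stage 2: rank chunks inside the winners."""
    names = list(subdocs)
    subdoc_index = LexicalIndex([" ".join(subdocs[n]) for n in names])
    hits = []
    for s in subdoc_index.top(query, top_subdocs):
        chunks = subdocs[names[s]]
        chunk_index = LexicalIndex(chunks)
        hits += [(names[s], chunks[c]) for c in chunk_index.top(query, top_chunks)]
    return hits


# Made-up sample data: each subdoc is a named group of chunks.
subdocs = {
    "install": [
        "Run the installer and accept the license.",
        "Restart the machine after the installer finishes.",
    ],
    "billing": [
        "Invoices are emailed on the first day of each month.",
        "Refunds are processed within five business days.",
    ],
}

result = search("when are refunds processed", subdocs)
print(result)
```

The sketch rebuilds the chunk index on every query to stay short; a real setup would build both indexes once and would likely combine the lexical scores with embedding-based retrieval.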
I wonder what PDF parser Sisil's team is using?
Adobe PDF Extract API or something. I haven't tested it, but it works pretty well from what I've heard.
There is an open-source alternative as well (works on Linux distros) called unstructured. You just need to run pip install "unstructured[all-docs]".
I didn't catch which PDF parser was used. Can you name it, please?
I think it was the Adobe API.