World's Most Accurate RAG? LangChain/Pinecone, LlamaIndex or EyeLevel

  • Published 29 Sep 2024
  • Not all retrieval-augmented generation platforms are the same. We put three popular RAG approaches -- LangChain/Pinecone, LlamaIndex and EyeLevel.ai -- in a head-to-head test for accuracy on 1,000 pages of complex documents.
    The winner emerged with 98% accuracy, including on difficult PDFs with tables and diagrams.
    A must-see for engineers trying to build hallucination-free RAG on their documents.
    The full report is here.
    www.eyelevel.a...
    The source files are here if you want to replicate the test.
    drive.google.c...

COMMENTS • 17

  • @cmcocktails
    @cmcocktails 5 months ago +6

    While this sounds like fantastic news, LlamaIndex "out of the box" would give poor results for a lot of solutions if you don't build smarter retrievers / pipelines to create a more robust and usable RAG solution.
    I'd love to see the code testbed for all three of these tests, otherwise I've got no actual evidence that you used a high-performing solution in both LangChain and LlamaIndex _and_ beat it. I mean, if I said "my Ford Focus is faster than your Audi RS6" and showed you them both off the line... but failed to show you the engine rebuild in the Ford Focus, then I'm sorta misrepresenting the results.

    • @EyeLevelAI
      @EyeLevelAI 5 months ago

      Hey @cmcocktails! We've released all the data used for this test and our source code for GroundX, LangChain/Pinecone and LlamaIndex:
      drive.google.com/drive/u/0/folders/1l45ljrGfOKsiNFh8QPji2eBAd2hOB51c
      Take EyeLevel's GroundX APIs for a spin, create an account at www.groundx.ai and let us know what you find!

    • @cmcocktails
      @cmcocktails 5 months ago

      @EyeLevelAI OK, thanks. Scanned it, but from what I can tell you're just using LlamaIndex "as is" with all the common defaults, out of the box: basically a naive RAG solution without any additional pipelining for reranking, and not set up the way most people would use it in production.
      For instance, I'm working on LlamaIndex for a more complicated scenario than just "load document, ask question", and that's not something you do with 5 lines of code; it's a bit more configuration with a hybrid search approach, etc.
      So are you asserting that the GroundX solution is better than the others if you just pick the basic "hello world" out-of-the-box RAG solution, vs. what the frameworks are actually capable of doing?

    • @benjaminfletcher4217
      @benjaminfletcher4217 5 months ago

      Thanks @cmcocktails. As we mentioned, the point was to compare the 3 solutions "out of the box". We acknowledged that there are more advanced implementations for both LangChain and LlamaIndex. They are likely to improve performance. We are unsure by how much and will likely test it in a future comparison. But implementing advanced RAG with the other frameworks would not make GroundX worse. Basically, you get "advanced RAG" "out of the box" with GroundX.

    • @cmcocktails
      @cmcocktails 5 months ago

      @benjaminfletcher4217 OK, I'd be interested in a real "bake-off" between the 3 products, as "out of the box" tech doesn't really interest me. Primarily because even I could take LlamaIndex, hide it in the cloud somewhere as "proprietary", and use their YouTube advanced RAG solutions to make it better and beat the LlamaIndex "out of the box" solution.
      Things that concern me after reading through docs and signing up on your site (that I couldn't find/understand):
      1. I don't seem to find any pricing anywhere.
      2. I'm curious if it's possible to run this without being tied to your SaaS, more like a framework, as the other solutions currently offer.
      3. SOC2 compliance or other certifications that help us understand the ramifications of uploading our personal documents to your infrastructure and what risks we're incurring to the safety of our documents.
      4. Any throttles/limits on utilizing the service (e.g. if someone puts this into production, what type of risks are they incurring on a high-traffic site?)

    • @bramjanssen8865
      @bramjanssen8865 1 month ago

      What would be an example of a LlamaIndex / LangChain pipeline that you would compare against GroundX?

  • @kiiikoooPT
    @kiiikoooPT 1 month ago +1

    I already stopped the video at 00:37. So you work at EyeLevel, and you are going to tell us which RAG is the most accurate out of LangChain/Pinecone, LlamaIndex or EyeLevel. Let me guess... They are all the same? Or are you going to say EyeLevel is the best, like everyone says about the project they work on? 🤣🤣🤣

    • @kiiikoooPT
      @kiiikoooPT 1 month ago

      I'm going to watch the video anyway, don't get me wrong. I just find these introductions a bit nonsensical, because of course you are not going to say bad things about your own project.
      But I like to learn new stuff, so let's go, and thanks for the content. Don't see me as a hater please, that is not my point. Like I said, it's just that it throws people off straight away when you say things like that, or when you describe your video as "who is the best" when you work with one of the systems you are testing.
      Just title it something like "EyeLevel RAG system showcase".

    • @EyeLevelAI
      @EyeLevelAI 1 month ago +1

      All good. Thanks for the feedback. Did you try the test for yourself?

    • @kiiikoooPT
      @kiiikoooPT 1 month ago

      @EyeLevelAI Having problems with my laptop, so I didn't try it yet, but I will, especially now that I've got a free host and can mess around with more stuff.

  • @attilavass6935
    @attilavass6935 5 months ago +2

    You are comparing apples with pears.
    GroundX looks to be a sophisticated RAG application, while you use the default / very basic components and settings for both the LangChain and LlamaIndex RAG.
    The comparison is misleading this way...

    • @benjaminfletcher4217
      @benjaminfletcher4217 5 months ago

      Thanks for the comment. As we mentioned, the point was to compare the 3 solutions with the same level of effort, to show the performance a developer gets "out of the box". We acknowledged that there are more advanced implementations for both LangChain and LlamaIndex. They are likely to improve performance. We are unsure by how much and will likely test it in a future comparison. But implementing advanced RAG with the other frameworks would not make GroundX worse. Our point was that with GroundX you basically get "advanced RAG" without any effort.

  • @adammobile7149
    @adammobile7149 3 months ago +1

    Is this a GroundX advertisement? In reality, everything depends on configuration. 😅
    Anyway, good video. 👍

  • @andrew.derevo
    @andrew.derevo 3 months ago

    No source code. That makes this video meaningless.