You have the latest, high-quality content; I always wait for your videos. Thanks ❤❤
You are welcome. Glad that the videos are helpful.
Please do a video on Unstructured.io, it's much needed. Please do it with local Ollama.
Also, if we use this method, will LlamaCloud have access to our private documents, since we are parsing those documents with LlamaParse?
Sure, I will take that into account. Well, you are sending data to an API, so yes, it is stored somewhere in the cloud. If you have sensitive information, it's better to ask them how it is handled.
Your channel is a gold mine and your videos are gems!
Thank you for the great work!
BTW, what do you use to highlight web pages, like on the LlamaParse page?
Keep up the great work!
You are welcome, glad the videos are helpful. The highlighter I am using is Weava Highlighter.
Can I replace the llama2 embedding with nomic-embed-text and the Ollama model with mistral? Will it work? I actually tried it and it didn't; am I missing something?
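In case it helps, here is a minimal sketch of that swap in LlamaIndex, assuming both models have already been pulled with `ollama pull`. The usual catch is that both the LLM and the embedding model need to be switched; if only one is set, LlamaIndex quietly falls back to its OpenAI default for the other:

```python
# Minimal sketch: local Ollama LLM (mistral) + nomic-embed-text embeddings.
# Assumes `ollama pull mistral` and `ollama pull nomic-embed-text` were run first.
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

# Swap BOTH the LLM and the embedding model, or LlamaIndex
# falls back to OpenAI for whichever one is missing.
Settings.llm = Ollama(model="mistral", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
print(index.as_query_engine().query("What is this document about?"))
```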
I would really like to see a great local RAG.
Right now I think privateGPT + Ollama (Mixtral) + reranker + Unstructured.io + OCR + Qdrant is a good combination. As you said, garbage in, garbage out!! So preprocessing the PDF files, especially complex PDFs that have tables, pictures, diagrams, and all sorts of other stuff, is the key to getting correct answers from a RAG system.
Can you please build the most accurate local RAG platform for complex PDF files as of 2024? I think we all need a video like this; please make it.
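For anyone who wants to experiment with part of that stack before a full video exists, here is a minimal local sketch. It assumes a Qdrant instance is running locally and substitutes plain LlamaIndex glue for privateGPT; the reranker and OCR stages from the suggested combination are left out for brevity:

```python
# Minimal local RAG sketch: Ollama (mixtral) + nomic embeddings + Qdrant.
# Assumes a local Qdrant (e.g. docker run -p 6333:6333 qdrant/qdrant)
# and that both Ollama models have already been pulled.
import qdrant_client
from llama_index.core import (
    Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex,
)
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore

Settings.llm = Ollama(model="mixtral", request_timeout=300.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

# Store the vectors in a local Qdrant collection instead of the in-memory default.
client = qdrant_client.QdrantClient(host="localhost", port=6333)
vector_store = QdrantVectorStore(client=client, collection_name="pdf_rag")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("pdfs").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
print(index.as_query_engine().query("Summarise the tables in the report."))
```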
That's a good combination of tools. But again, running locally (as the models need to be quantized), I still see some hiccups. But let's see, it might change.
@saeednsp1486 Look at AutoRAG. It's the only LlamaIndex-based "all-in-one" app I've seen that can do all inference locally (via an Ollama backend). pip install and it's up and running. Its author actually meant it for evaluating RAG pipelines, so it's still a bit programmer-heavy; it takes some Python to get working. But a by-product of that goal is that AutoRAG has everything together based on local LlamaIndex. The author's material is around on YT, Medium, and Reddit. I don't think it has LlamaParse, so maybe a feature request/PR?
Dear Sir,
Could you please make a video on integrating multiple cutting-edge technologies into a single system? Specifically, I am interested in combining the following components:
- RAG (Retrieval-Augmented Generation)
- Nomic Embedding Model
- Ollama language model
- Groq hardware accelerator
- Chainlit
Additionally, please specify which language models should be used as the base for the system. Two potential options could be:
#model_name='llama2-70b-4096'
#model_name='mixtral-8x7b-32768'
etc.
Thank you for your time and consideration. I greatly enjoyed your recent video and anticipate future content.👍
Hello, you should have checked the video posted before this one 😎
Crazy FAST RAG | Ollama | Nomic Embedding Model | Groq API
ua-cam.com/video/TMaQt8rN5bE/v-deo.html
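For reference, a minimal sketch of wiring those pieces together (minus the Chainlit UI layer), assuming a GROQ_API_KEY is exported and nomic-embed-text has been pulled in Ollama:

```python
# Minimal sketch: Groq-hosted LLM + local Nomic embeddings in LlamaIndex.
# Assumes GROQ_API_KEY is set and `ollama pull nomic-embed-text` was run;
# the Chainlit UI is omitted for brevity.
import os
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.groq import Groq
from llama_index.embeddings.ollama import OllamaEmbedding

# Either of the model names from the comment above works here.
Settings.llm = Groq(model="mixtral-8x7b-32768", api_key=os.environ["GROQ_API_KEY"])
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
print(index.as_query_engine().query("Give a one-paragraph summary."))
```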
I had a naive doubt: beneath the query engine there's an associated LLM working, right? Otherwise, how are we getting responses without using an LLM?
If yes, then where is the model specified, i.e., which LLM are we using?
If no, how is such a well-framed answer produced without an LLM? Because as far as I know, it is the LLM that actually takes the relevant pieces of context and stitches them together into an answer in natural language.
Yes, LlamaParse of course uses something behind the scenes that is not revealed, as it is their service 🙂 It now supports the GPT-4o model for this, which is more expensive but better.
@datasciencebasics Yes, I too studied the documentation and found out that unless a specific model is given, LlamaIndex uses the OpenAI GPT-3.5 Turbo model by default (a quick sketch of overriding that default follows below).
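That matches the documented behaviour. A minimal sketch of pinning the LLM explicitly, assuming an OpenAI key is set:

```python
# Minimal sketch: set the LLM used by LlamaIndex query engines explicitly,
# instead of relying on the GPT-3.5 Turbo default. Assumes OPENAI_API_KEY is set.
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI

# The query engine now synthesizes answers with this model.
Settings.llm = OpenAI(model="gpt-4o")

documents = SimpleDirectoryReader("data").load_data()
query_engine = VectorStoreIndex.from_documents(documents).as_query_engine()
print(query_engine.query("What does the document say about pricing?"))
```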
Just another quick question: are there any downsides to LlamaParse? For me it works well at parsing and extracting data from both text and tables in a pretty satisfactory manner.
Why, then, are people using pypdf, Apache PDF extraction tools, or even OCR engines like PaddleOCR for text extraction, and not simply this library?
Additionally, LlamaParse can be integrated with LangChain chains as well (as sketched below), which means it is not restricted to LlamaIndex only, so why the other frameworks?
Please clarify this doubt; I am new to this field.
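To illustrate that framework-independence point, here is a minimal sketch of feeding LlamaParse output into a LangChain pipeline, assuming LLAMA_CLOUD_API_KEY is set; the file name and splitter sizes are arbitrary placeholders:

```python
# Minimal sketch: LlamaParse as a standalone parser feeding LangChain.
# Assumes LLAMA_CLOUD_API_KEY is set and "sample.pdf" exists.
from llama_parse import LlamaParse
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Parse the PDF to markdown via the LlamaParse API (no LlamaIndex pipeline needed).
parsed = LlamaParse(result_type="markdown").load_data("sample.pdf")

# Convert to LangChain Documents and chunk them for any LangChain retriever/chain.
docs = [Document(page_content=d.text, metadata=d.metadata) for d in parsed]
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)
print(f"{len(chunks)} chunks ready for a LangChain vector store")
```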
Outstanding! But how to get the metadata? Thanks!
great job!!
Thanks !!
Using LlamaParse means our data is exposed to an external API, right?
Yes, it is. Before sending any sensitive information, I suggest you contact them to ask how it is handled.
Thanks, and waiting for your valuable videos. I'd like to process .docx files and get text and page-number details. I think no proper library is available to get page-number details from .docx files...
You are welcome. I hope LlamaParse will handle that soon. If I find something useful, I will make a video on it.
Very nice video, thanks. One thing I noticed is that there are additional steps after the Markdown output; could you help me understand what the following does post-markdown?
"node_parser = MarkdownElementNodeParser(
    llm=OpenAI(model="gpt-3.5-turbo-0125"), num_workers=8
)
nodes = node_parser.get_nodes_from_documents(documents)"
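In case it helps: MarkdownElementNodeParser walks the parsed markdown and separates plain text from embedded elements such as tables, using the given LLM (with num_workers parallel calls) to summarize each table so it can be retrieved by its summary and resolved back to the raw table. A minimal sketch of the steps that usually follow, assuming an OpenAI key is set:

```python
# Continuing from the snippet above: split the parsed nodes into plain-text
# nodes and table "objects", then index both so a query can retrieve a table
# summary and recursively resolve it back to the underlying table.
from llama_index.core import VectorStoreIndex

base_nodes, objects = node_parser.get_nodes_and_objects(nodes)
index = VectorStoreIndex(nodes=base_nodes + objects)
query_engine = index.as_query_engine(similarity_top_k=5)
print(query_engine.query("What values appear in the largest table?"))
```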
Hi, thanks for the video and the great explanation. I tried your code and I am getting this error: "Retrying llama_index.embeddings.openai.base.get_embeddings in 0.97 seconds as it raised APIConnectionError: Connection error..". Do you know how to resolve it? The error happens on the line "index = VectorStoreIndex.from_documents(documents)".
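That APIConnectionError usually means the default OpenAI embedding endpoint cannot be reached, most often because OPENAI_API_KEY is missing or a proxy/firewall is blocking the call. A minimal sanity check you could run before building the index; the specific embedding model name here is just an illustration:

```python
# Minimal sanity check: VectorStoreIndex.from_documents embeds with OpenAI
# by default, so the key must be set and the API reachable.
import os
from openai import OpenAI

assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"

# One cheap round-trip to confirm connectivity before embedding a whole corpus.
client = OpenAI()
resp = client.embeddings.create(model="text-embedding-ada-002", input="ping")
print("OpenAI reachable, embedding length:", len(resp.data[0].embedding))
```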