Hi there. Quick question. I know the documentation says to open and modify "cfg.py", but I noticed it was replaced with app_config.yml and the other configuration is handled in "load_config.py". I am receiving an error with the values I supplied, so I was wondering what format and values the following settings expect. I am using OpenAI, not Azure. Thanks!
openai.api_type = os.getenv("OPENAI_API_TYPE")
openai.api_base = os.getenv("OPENAI_API_BASE")
openai.api_version = os.getenv("OPENAI_API_VERSION")
Hi Mike. You are right, the correct configuration file is app_config.yml. Whenever you make changes to this YAML file, the updates should seamlessly propagate throughout the project via the load_config.py script, which handles the distribution of configuration values.
To add a new configuration parameter, you'll need to follow these steps:
1. Introduce the new parameter in the app_config.yml file.
2. Update load_config.py to ensure that the new parameter is loaded correctly.
3. Access the new parameter in your project's modules by creating an instance of the configuration loader, like so:
```
APPCFG = LoadConfig()
your_new_parameter = APPCFG.new_argument
```
Regarding the OpenAI credentials, it's crucial to handle them securely. I use environment variables to store these credentials, which are not included in the GitHub repository, to maintain security. You should create a .env file within your project directory and populate it with your OpenAI credentials. Here's an example of what that might look like:
OPENAI_API_KEY=
Since you're not utilizing Azure, you may not require all four of the credential arguments I use. Just include the necessary ones provided by OpenAI. I suggest checking these links and making a simple API call to OpenAI to ensure you understand the process:
* www.datacamp.com/tutorial/using-gpt-models-via-the-openai-api-in-python
* platform.openai.com/docs/api-reference/streaming?lang=python
Alternatively, if you prefer not to use a .env file, you can directly insert your OpenAI credentials into the load_config.py module, although this is less secure and not recommended for sensitive information.
I hope this clarifies the configuration process for you.
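If it helps, here is a minimal sketch of what steps 2 and 3 can look like together. It is illustrative only: the class name matches the usage above, but the YAML path, the section name, and the new_argument key are placeholders rather than the project's actual code.
```
import os
import yaml
from dotenv import load_dotenv


class LoadConfig:
    def __init__(self, config_path: str = "configs/app_config.yml"):
        # Pull the OPENAI_* credentials from the .env file into the environment
        load_dotenv()
        with open(config_path) as f:
            cfg = yaml.safe_load(f)
        # Expose a YAML entry as an attribute so other modules can read APPCFG.new_argument
        self.new_argument = cfg["llm_config"]["new_argument"]
        self.openai_api_key = os.getenv("OPENAI_API_KEY")
```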
@airoundtable Thank you for your reply. I ended up creating a .env file but am still getting the following error. I even left the API endpoint out, since you mentioned it was only needed for Azure.
APIConnectionError: Error communicating with OpenAI: Invalid URL 'None/engines/gpt-35-turbo-16k/chat/completions': No scheme supplied. Perhaps you meant None/engines/gpt-35-turbo-16k/chat/completions?
@@mikew2883 No problem! Well, that means the API call itself is now ok, but the code that generates the response from the GPT model is not compatible with OpenAI. The issue is that I am using something like:
```
openai.ChatCompletion.create(
    engine=gpt_model,
    messages=[
        {"role": "system", "content": llm_system_role},
        {"role": "user", "content": prompt}
    ],
    temperature=temperature,
)
```
This is the code that works with Azure OpenAI, while for those who use OpenAI directly, the code is something like this:
```
client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": llm_system_role},
        {"role": "user", "content": prompt}
    ],
)
```
So, in order to fix the problem:
1. Check the project schema and find where the GPT models generate a response.
2. Find that code in the project and change it to the OpenAI format.
I suggest you first generate a response from a GPT model using your API key and make sure that the code you are using works as expected. Check this link for more info on how to generate the response with OpenAI:
platform.openai.com/docs/api-reference/streaming?lang=python
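For reference, here is a minimal, self-contained version of that second pattern with the openai>=1.x Python client; the model name and the two message strings are placeholders so the snippet runs on its own.
```
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

llm_system_role = "You are a helpful assistant."  # placeholder system role
prompt = "Say hello."                             # placeholder user prompt

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": llm_system_role},
        {"role": "user", "content": prompt},
    ],
)
print(response.choices[0].message.content)
```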
This is the one video to watch if you want to learn and build an advanced RAG project. The other videos are equally great, and I love how organized you are in your videos; your code quality is just WOW.
This is an incredibly informative and well-structured video! The detailed breakdown of the RAG-GPT chatbot, along with the time-stamped sections, makes it easy to navigate and understand. The inclusion of real-time document uploads and summarization requests showcases the versatility of this chatbot. The GitHub links and references to the main libraries used are very helpful for those who want to delve deeper. Keep up the great work! Looking forward to more content like this. 👏👏👏
@@usamaahmed8075 Hi. What do you mean by the uploaded db? If it said the vectordb does not exist, it is because you have to run this module first: upload_data_manually.py
I explained it completely in the video. If you are planning to upload a document and then start chatting with it, you need to first upload it using the upload button and make sure that you see the confirmation message on the screen.
Very well done, @Farzad. Great explanation. This is exactly the concept I was looking to understand and implement. You are simply 100x amazing. I am highly excited to listen to your other videos as well. Thanks for keeping this channel so informative. One suggestion from my side: next time, please use local LLMs like Ollama Llama 3.1 so those who cannot afford it will benefit.
Thanks! I appreciate the kind words and I am glad that the content was helpful. Thanks for the suggestion. I have almost the identical project using open source LLMs. Please check out: ua-cam.com/video/6dyz2M_UWLw/v-deo.htmlsi=u-QWc-Mz5oOA17LS
What are your suggestions on cleaning company docs before chunking? Some of the challenges are how to handle the index pages across multiple PDFs, as well as the headers and footers. You should definitely make a video on cleaning a PDF before chunking; it is much needed.
Well, handling company documents for integration into a RAG system is indeed a complex task. It's often so detailed and requires such a hands-on approach that I would strongly suggest treating the document preparation as a separate project from the RAG chatbot development. Even that project by itself can be divided into two main flows:
1. Cleaning and preparing existing documents
2. Establishing a standard format for all the new documents for easier future integration
Since the RAG system is going to perform a vector search across the entire document set, I suggest removing the unnecessary or duplicate content (for instance, I cannot think of any possible way that a separate index would add value to the conventional RAG strategies and vector search techniques, unless you design a complex RAG system that incorporates hierarchical graph methodologies).
Finally, if your documents contain domain-specific abbreviations that general language models may not recognize, you can think of implementing an advanced RAG system with a fine-tuned LLM on your specific domain data (there is another video in the channel that explains how to fine-tune an LLM on company documents, which might give you some good ideas).
And thanks for the suggestion! I'll consider creating a tutorial video to address this issue.
I just discovered your video today in my feed. This is an excellent project with great attention to detail. Very well done. I cloned it and saw a project in your bullet list called "Open Source LLMs" along with the note that it is coming soon. Do you have any idea when that might be? This is important for those of us wanting to run LLMs with RAG locally on our machines. Very much looking forward to seeing this. Thanks for your work.
Thank you very much for the positive feedback @doctorbill37. I am glad to hear you liked the video. For the open-source RAG project, I have good news: I have already started recording the video. It will be uploaded in the next couple of days.
You did a great job, but the videos are so small that I have to constantly expand them to read the text. It would be nice to be able to read the text without going full screen all the time.
Thanks @RetiredVet! You are right. I have to find a way to increase the size of the contents for an easier read. That is actually why, in the langchain vs llama-index video, I omitted the PowerPoint and showed everything on screen, including each command that I was executing. However, I am constantly looking for ways to improve the quality, as I just recently started to upload videos on UA-cam.
@@airoundtable I enjoyed your video and think your code is great. The code and explanations are the important part; you can learn the video production side much more easily. I've looked at a lot of langchain videos and your explanations are very clear.
Unfortunately, I am an intermediate Python programmer and I had no idea that requirements files were so different between Windows and Linux. I cannot use your requirements files, and when I try installing langchain with pip these days, it never works. If the UA-cam video is a week old, the requirements have changed. I try to downgrade to the recommended versions, but then langchain installs packages that don't work. I am learning a lot more about package management than I ever wanted to.
Langchain is a very interesting project, but it is moving so fast that it is difficult for me to keep up.
Keep up the good work.
Thanks for the great tutorial! Just out of interest, would it also be possible to use streamlit as a user interface or are there any technical issues? Thanks again.
Thanks for the feedback! I am glad that you liked the video. Sure, you can use Streamlit as well; in my opinion, using Streamlit is a bit easier than Gradio. One of the goals of this series was to show how to use Streamlit, Gradio, and Chainlit, and I used each of them in a separate video. If you check the channel you will see a chatbot that I designed with Streamlit:
"Connect a GPT agent to duckduckgo search engine".
Feel free to reach out if you have any other questions.
@@airoundtable Thanks! You also mentioned the issue of data flow management. Let's assume that I upload ten documents in advance to the database, and then there is another one that I upload while using the chatbot. Will the chatbot use all eleven documents to answer my question? Thanks again for your help!
That is a good question @@tobiasbuchmann6972. No, this chatbot treats the documents that were prepared in advance differently from the ones that you upload while using the chatbot. So, in your example, it creates an index for those 10 documents that you preprocessed earlier, and it creates a separate index for the single document that you passed to it while using it. Also, let's say that while using the UI you upload documents in multiple steps: every time you upload a new set of documents, it creates a new index for them and points the chatbot to the most recent index. Finally, whenever you run the UI, it makes sure that all the indexes created for uploaded documents during the previous user's session are removed, cleans up the disk, and gives you a fresh start.
But this is just one way of doing it. At the end of the day, all these functionalities can be adjusted based on your needs.
Great video. I'm testing out the project, but it seems that the chatbot also takes information from the web, as it accesses websites when there are no uploaded docs. I would like to have it only interact with the PDFs/uploaded docs... Any fixes?
Thanks Thomas! The chatbot does not have access to the internet, but it does have access to the pre-trained knowledge of the GPT model. Overall, it works in 3 different ways:
1. If you have already preprocessed some documents and start using the chatbot, it will give you answers based on those documents (this is the chat with pre-processed docs feature).
2. If you select the chat with upload docs feature and upload documents, it will start giving you answers based on the uploaded documents (until you change the setting back to pre-processed docs again).
3. In case the user's question is not related to any documents, the chatbot will use its own knowledge, but in a limited way, to just act as a friendly chatbot.
If you would like to restrict it even more, you can change the llm_system_role in the config folder (configs/app_config.yml: llm_system_role argument). That is where I instruct it and explain how it should behave. I explain it in the video at: 00:40:45 LLM system role
I hope this helps you solve the problem.
Great video on RAG. One quick question: can we add documents from the UI for preprocessing and chat with those, rather than adding documents to the data folder from the backend? I mean, add functionality in the UI that allows me to add documents to the data folder and preprocess them so that I can chat. Thank you
Thanks! Yes, the chatbot has that capability. Follow these steps:
1. In the "Rag with" dropdown, choose chat with upload docs.
2. Use the "Upload doc" button and select your documents.
3. Wait a few seconds until the chatbot tells you that the documents are processed and you can start asking questions.
If you mean whether you can add/remove/modify the files of the vectorstore that you created, the answer is yes, you can. You can easily find information on it; I just did a quick search and saw this tutorial: www.datacamp.com/tutorial/chromadb-tutorial-step-by-step-guide
But I'd search the ChromaDB documentation to find all the details.
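For a rough idea of what editing a Chroma collection looks like, here is a small sketch with the chromadb client; the path, collection name, ids, and documents are made up for illustration, and the project's own vector store may be organized differently.
```
import chromadb

# Open (or create) a persistent store on disk
client = chromadb.PersistentClient(path="data/vectordb")
collection = client.get_or_create_collection("docs")

# Add a new chunk to the existing collection
collection.add(
    ids=["doc-42-chunk-0"],
    documents=["Text of the new chunk goes here."],
    metadatas=[{"source": "new_document.pdf"}],
)

# Remove a chunk that is no longer needed
collection.delete(ids=["doc-42-chunk-0"])
```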
Hi, yes. You have to pay for it to be able to make the API calls. If you go to the OpenAI website you will see how to get the API key from there. Also keep in mind that the project is currently using Azure OpenAI; in case you want to use OpenAI directly, a couple of modifications are required. I have pinned a comment here (you can see it at the top), where I explained all the steps in detail.
Hello there, I am running into the following error: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.
Hello. The project is using Azure OpenAI. Therefore, it needs access to the API key, endpoint, and deployment names of the GPT model and embedding model on Azure, and for that you need to deploy them in Azure OpenAI Studio first. This error arises in two scenarios:
1. The model has not been deployed yet.
2. The model is deployed, but the deployment name that was passed to the chatbot is not the same as the deployment name that was used in Azure OpenAI Studio.
In case you want to use OpenAI directly, instead of Azure OpenAI, please read the pinned comment for a step-by-step description of the changes that are required for that.
@@tonysingh9426 Create a file in the project directory and name it .env. In there, create these arguments:
OPENAI_API_TYPE=azure
OPENAI_API_VERSION=
OPENAI_API_KEY=
OPENAI_API_BASE=
gpt_deployment_name=
embed_deployment_name=
Also, in the config folder, in llm_config there is the name of the GPT model in the engine argument; change that as well. Then run the chatbot. The project automatically loads the .env file and extracts all of this information from it.
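To make the role of each credential concrete, here is a rough sketch of how an Azure-style call wires them together with the legacy openai<1.0 SDK that the project's snippets follow; the values are placeholders you would replace with your own.
```
import os
import openai
from dotenv import load_dotenv

load_dotenv()
openai.api_type = os.getenv("OPENAI_API_TYPE")        # "azure"
openai.api_base = os.getenv("OPENAI_API_BASE")        # your Azure OpenAI endpoint URL
openai.api_version = os.getenv("OPENAI_API_VERSION")  # the API version shown in Azure OpenAI Studio
openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.ChatCompletion.create(
    engine=os.getenv("gpt_deployment_name"),  # the deployment name, not the base model name
    messages=[{"role": "user", "content": "Hello"}],
)
print(response["choices"][0]["message"]["content"])
```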
Hi, I am using OpenAI entirely and not Azure. I changed the chat completion function as per your solution in the comments, but I am getting the error: TypeError: 'ChatCompletion' object is not subscriptable in response["choices"][0]["message"]["content"]. Please suggest a resolution for the same. Also, the application stops fetching answers after the first question. Please help. @AI RoundTable
Hi @taylorfans1000, this problem is not hard to debug.
1. Based on OpenAI's website, even when using their models directly you should be able to extract the response from response["choices"][0]["message"]["content"]. Here is the reference: platform.openai.com/docs/guides/text-generation/chat-completions-api
2. To make sure that your GPT model is working as expected, test it separately in a notebook. Use your API key and the chat completions client from openai and make a successful API call with the GPT model that you are using.
3. Don't use response["choices"][0]["message"]["content"] yet; directly print(response) itself and make sure that you are getting the whole JSON response from OpenAI.
4. Once you've got the API call working, then you can try to get the specific message content by using response["choices"][0]["message"]["content"] and make sure that you can extract the response content from OpenAI's JSON response.
5. After you have gone through these steps successfully, apply the result to the following files and lines in your project:
src/utils/chatbot.py, lines 68 to 78
src/utils/summarizer.py, lines 110 to 118
My guess is that the fetching problem will be solved as well after you fix this. I hope this helps; feel free to let me know in case you have more questions.
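One additional note, in case you are on the newer openai>=1.x SDK: that client returns a typed object rather than a dict, which is exactly what produces the "object is not subscriptable" error with bracket-style access. A minimal sketch of both access styles (placeholder model and prompt):
```
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
)

# openai>=1.x returns an object, so use attribute access:
print(response.choices[0].message.content)

# The dict-style response["choices"][0]["message"]["content"] only works with the
# legacy openai<1.0 SDK (or after converting the new object with response.model_dump()).
```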
That is the million dollar question. There has been a lot of effort around evaluating RAG systems, but the challenging part is that there is no single metric that can tell you how accurate your system's response is. Instead, to evaluate RAG systems, we usually use the help of LLMs themselves. But before that, we need to understand what the challenges in a RAG system are. Here is a brief summary of some of the key steps:
1. In the data preparation pipeline: data quality + chunking strategy + embedding quality
2. On the retrieval side: user's query quality + search quality + relevance of the contents of the retrieved documents to the query
3. On the synthesis side: context overflow + LLM hallucination + answer relevance
These are the components that need to be adjusted and evaluated in a RAG system. For the evaluation pipeline itself, you can either use the frameworks that are being developed for this purpose (e.g. TruLens, LangSmith, Galileo; my recommendation: LangSmith), or you can design a custom pipeline depending on your goal and use case. I have a video on the channel called "Langchain vs Llama-index" where I design an end-to-end pipeline and evaluate the performance of 5 different RAG techniques; there I go into much more detail about this topic. Overall, this task requires a good amount of testing and iteration, especially if the requirements are very specific and complex.
Thanks. Yes, you need to convert the OpenAI call sections and use another framework, like Ollama for example. I have received multiple messages from the community saying they successfully used Ollama with the projects on the channel.
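For a rough idea of what swapping the OpenAI call for a local model can look like, here is a minimal sketch with the ollama Python package. It assumes a locally running Ollama server with the model already pulled, and it is not the project's code; the model name is just a placeholder.
```
import ollama  # assumes `ollama serve` is running and the model has been pulled

response = ollama.chat(
    model="llama3.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ],
)
print(response["message"]["content"])
```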
I ran this project on Windows, but it does not matter; the project is designed in a way that it can be executed on any OS without needing any modification. If you are getting this error while installing requirements.txt, just remove the library that is causing the issue and try again.
This project is only for .txt and PDF files, and it is easily extendable to docx files (you just need to add the extension to the list of acceptable files in the code; see the sketch below). If your documents contain images and tables, that will not hurt the performance on text, but for implementing RAG with images and tables this project needs to be upgraded.
RAG on images requires image embeddings and vector search on those embeddings. There is still no solid approach that can handle various types of images (e.g. technical drawings); there are a few preliminary solutions out there that work on generic images, so the industrial application is still very limited.
Tables, on the other hand, require specific approaches that are able to extract the contents of a table properly. The "Unstructured" library has been working on this aspect and Langchain has adapted it within its framework. So, since I used Langchain in this project, you can easily modify it and add that approach. But the problem is that handling tables with that approach takes a very long time (in my experience), which makes it impractical for industrial purposes. That leaves the door open to custom solutions that suit a specific business need, which can vary widely.
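As a sketch of that docx extension, the dispatch could look roughly like this; the helper function and its structure are hypothetical, and only the Langchain loader classes are real.
```
from langchain_community.document_loaders import Docx2txtLoader, PyPDFLoader, TextLoader


# Hypothetical helper: pick a Langchain loader based on the file extension
def load_document(file_path: str):
    if file_path.lower().endswith(".pdf"):
        loader = PyPDFLoader(file_path)
    elif file_path.lower().endswith(".txt"):
        loader = TextLoader(file_path)
    elif file_path.lower().endswith(".docx"):
        loader = Docx2txtLoader(file_path)  # the new branch for Word documents
    else:
        raise ValueError(f"Unsupported file type: {file_path}")
    return loader.load()
```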
@@shahnaz9026 Technically you can. You need to do a lot of refactoring of the code, but it is doable if you want to run it on Jupyter. Keep in mind that this code is currently using Azure OpenAI; in case you want to use OpenAI directly, the chat completion functions need to be modified as well.
Sure, you can use other LLMs, but you have to modify the code for that. The code currently uses the OpenAI GPT 3.5 model (with API calls) for inference. If you want to change the LLM and run the code:
1. Use a powerful LLM for a good performance (consider the context length, chunk sizes, and the instructions that you want to give to the model, along with the available computational power that you have at hand).
2. You need to change the code wherever it is getting the response from the GPT model, which happens in two locations:
a. src/utils/chatbot.py - response function - lines 68 to 72
b. src/utils/summarizer.py - get_llm_response function
3. Depending on the model that you are using, you may need to process the response in a different way as well (e.g. llama2's output will contain both the query and the response along with some special characters that need to be processed for a neat user experience).
Very detailed explanation, and thank you for making it open source. Are there any plans to advance this application? For example:
1. An advanced RAG pipeline which can extract text, table data, or images based on the user's question
2. Creating a vector DB based on text, image, and table data
3. Providing a login and admin panel to track information like the number of tokens used by different users
4. Using React/Node for a better app experience
5. A complete deployment process
Thanks! I am glad that you liked the video. These are all great points. For some of them, yes, there will be a video soon, and for some I still have no plan. I am looking into solutions for taking into account tables and unstructured documents along with images. There are already some solutions out there (the Unstructured library and image vector databases) but none of them are practical yet in my opinion. For instance, the available approach that Langchain and Unstructured proposed for processing the tables in documents is super slow and technically not practical. So, I will make a video as soon as I see an approach that can be applied in real-world scenarios.
The next two videos should be interesting for you, I guess. The next one is a multimodal chatbot that uses 5 different models in the background and is able to answer questions asked about the context of an image as well. And the one after is an advanced RAG chatbot that uses a knowledge graph and takes into account more detailed relationships between the content of a document and related chunks.
Items 3, 4, and 5 have crossed my mind, but I still have not planned a video for them. I will keep them in mind and think about it after the next two videos. Thanks for the suggestions @kunalsatpute8379!
@@airoundtable Thank you for replying; excited for your videos. One question: will these videos be an extension or enhancement of this application, or will they be entirely separate videos?
@@kunalsatpute8379 That would be an expansion. Besides LLM applications, one of the main ideas behind this series was to walk through all the necessary steps required for an advanced multimodal chatbot. I started by explaining function calling and vector search, and using them I designed multiple projects. The next video would be:
1. A combination of all the chatbots that I have designed and uploaded on the channel so far (RAG-GPT + connecting the GPT to the search engine + chatting with and summarizing websites)
2. We will use the concept that I showed in the open-source RAG video for creating a web server for serving models
3. We will add more abilities: the user can interact with the chatbot by sending voice, text, and images, and the chatbot will respond in voice and text + we can ask questions about a specific image that we uploaded and the chatbot will be able to answer questions about the image's content + we can ask the chatbot to generate images for us as well.
So it would be an any-to-any chatbot:
input: voice, text, image
output: voice, text, image
functionalities: normal AI chatbot + RAG with documents + RAG with websites + search the web using a search engine + summarize documents + summarize websites + understand images both for answering questions and for generating them
So, the RAG-GPT project would be one arm of that chatbot, and I am thinking of giving the user the ability to work with around 9 or 10 different Gen AI models (all open source except GPT). In that video I will just briefly touch on RAG-GPT and the other parts that I have already covered in the previous videos, and the focus will be on explaining the multimodal side of it and how the whole chatbot was designed. That is a huge project.
Hi, I am really not sure how the models would perform on Arabic. You can give it a try, or search Arabic forums and see what models they suggest for Arabic. Which key are you referring to?
It is hard to tell without seeing the code, and it depends on how you are calling the model and generating the response. In the code that I put in my GitHub repository, the model does not stream the response. But in case you are using different code with Langchain, check these links:
- python.langchain.com/docs/modules/model_io/chat/streaming
- python.langchain.com/docs/modules/model_io/llms/streaming_llm
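If the goal is simply to stream tokens as they arrive when calling OpenAI directly (outside the repository's code), here is a minimal sketch with the openai>=1.x client; the model name and prompt are placeholders.
```
from openai import OpenAI

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write one sentence about RAG."}],
    stream=True,  # ask the API to send the answer chunk by chunk
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```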
This is a great video that clearly explains the whole RAG development process. One quick question, I created the .env file to store the OPENAI_API_KEY. But it still could not find it. where should I put the OPENAI_API_KEY?
Thanks, I am glad that you liked the video! To make that work: create a raw file, name it ".env", put it in the parent folder of the project (the RAG-GPT folder), and add your arguments like this:
OPENAI_API_TYPE=azure
OPENAI_API_VERSION=
OPENAI_API_KEY=
OPENAI_API_BASE=
Then, to test whether it is working properly, open a notebook or a raw .py module and run these commands:
import os
from dotenv import load_dotenv, find_dotenv
# This line automatically finds the .env file in your environment
_ = load_dotenv(find_dotenv())
openai_api_type = os.getenv("OPENAI_API_TYPE")
openai_api_base = os.getenv("OPENAI_API_BASE")
openai_api_version = os.getenv("OPENAI_API_VERSION")
openai_api_key = os.getenv("OPENAI_API_KEY")
Then you can print the values and make sure it got them right:
print(openai_api_type)
If you see the values by printing them, then you are good to go.
Hi, I got an error saying there are no files inside data/docs:
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'data/docs'
But I didn't change anything in your repo; I just cloned it and ran it. Can you give some guidance on this issue?
Hi, it shouldn't be the case. This directory is part of the repository, and I am using the 'pyprojroot' library to manage the directories in the project automatically. Without the full traceback of the error, I cannot understand why it is happening. In case you still get the error, feel free to share the full traceback and I will let you know the source of the problem.
Not with this version. For performing Q&A and RAG with CSV files, please have a look at my LLM agent videos. For images you would need more complex approaches, and the ones presented by Langchain and Unstructured are not ready for production; they take a very long time to process the images. Although in my next video, I am aiming to show how to fine-tune multimodal LLMs on custom image datasets, and those models can be used to perform RAG on images. Here is the link to my video describing how to chat with SQL and tabular databases: ua-cam.com/video/ZtltjSjFPDg/v-deo.htmlsi=bh9xdkJqufFBMrBI
Hi, I love the whole project, but I would be happy if you went more in-depth on the following statement in the repo: "It is strongly recommended to design a more robust and secure document handling process for any production deployment." Do you mean improving the security of the documents and restricting access to the app, and implementing such steps, or something else?
Hi, thanks. No, I would suggest handling the access level on the chatbot side. In general, for RAG projects, I suggest separating the document processing pipeline from the chatbot itself. And for the data pipeline there are many factors that need to be taken into account. If I assume the company is mid-size or bigger:
1. Document cleaning and transformation (for instance, if you are dealing with .txt, .docx, and .pdf, after preprocessing you can convert everything to PDF documents for a manageable workflow).
2. Content validity checks. This step can be managed on a division level or on multiple levels, depending on the size of the company and the complexity of the documents (the verification teams should verify the contents and be responsible for what is in there).
3. Avoid duplicates (sometimes different divisions have very similar documents but for different use cases and purposes; this can cause confusion in a RAG project).
4. Address specific cases (e.g. if a PDF file was created using a scanner, you would not be able to perform RAG on it).
5. Manage security (who can add/remove a document from the pipeline itself, and also the access level).
6. Implement a CI/CD pipeline for automating the workflow and managing scale (e.g. in case a document is added to a database, your pipeline should be able to add only that document to the vectorDB rather than recreating the vectorDB from scratch).
7. Secure the pipeline.
8. Test and improve.
These are some of the key aspects of the data pipeline for RAG projects.
Dude!!!! INSANE!! Such a good tutorial, you rock. My one question is about credits: will the vector function save credits? E.g. I want to build a legal document reader & Q&A, and some docs are 100 pages long. Won't each doc cost hundreds in API credits? Or is that what vectorisation & DBs are for?
Thanks! I am glad you liked the video. Storing the vectorized documents in a vectorDB can definitely save costs, and it would not be efficient to build such a system without the use of vector databases (especially since you can use many of them for free). About the pricing itself, vectorization is not a very costly task unless you are dealing with thousands of documents. According to the OpenAI documentation, text-embedding-3-large is priced at $0.00013 / 1k tokens. Keep in mind that text-embedding-3-large is their newest and most expensive embedding model (also more expensive than the one that I used in the video). Source: openai.com/blog/new-embedding-models-and-api-updates
So, imagine a document with 100 pages that you turn into, say, 500 chunks: you will make 500 API calls to this model and it will roughly cost you something around $0.03 (approximate). I considered each chunk to be around 1500 characters, which makes it roughly 400-500 tokens. Also, keep in mind that you can manage the vectorDBs by editing them (adding/removing documents) rather than re-creating them on every small change.
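To spell out the arithmetic behind that estimate (same assumptions as above: 500 chunks of roughly 400-500 tokens each, at the text-embedding-3-large price quoted):
```
chunks = 500
tokens_per_chunk = 450            # ~1500 characters per chunk, rough token estimate
price_per_1k_tokens = 0.00013     # USD, text-embedding-3-large (per the pricing cited above)

total_tokens = chunks * tokens_per_chunk            # 225,000 tokens
cost = total_tokens / 1000 * price_per_1k_tokens    # roughly $0.03
print(f"Estimated embedding cost: ${cost:.4f}")
```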
Great project, but in the end we are using the OpenAI API, which does the "real" AI magic here. The question is: if we already plan to build a customized bot and train it on our data, wouldn't we also want to train our own open-source LLM model such as Llama 3 and remove the dependency on online API calls (which make the project strongly bound to public access to our private documents, right)?
It very much depends on the requirements of your project. If using closed-source models like GPT is not an option, then using a powerful open-source model like Llama is the way to go. You can definitely do that. I have another video in which I implemented this same project using open-source models. You can check it out here: ua-cam.com/video/6dyz2M_UWLw/v-deo.htmlsi=Lfn9h2Y9pl5zRUd9
Hi, I am using a Windows machine and getting an error while running the upload_data_manually.py file. It's giving me: RuntimeError: your system has an unsupported version of sqlite3. Chroma requires sqlite3 >= 3.35.0. Then I checked the sqlite version using sqlite3 -version and it shows 3.41.2, which is greater than 3.35 🤔
Hi, please try to replicate the project using the exact library versions that I included in requirements.txt and let me know if the problem appears again.
OpenAI GPT and embedding models need an API key. If I got your point correctly, you want to use open-source LLMs. You absolutely can. Please check the Open Source RAG video on my channel; that is almost the identical project with open-source models. ua-cam.com/video/6dyz2M_UWLw/v-deo.htmlsi=W9dW4JNbC2KH_tHs
I have been using Python 3.11 and never faced any issue with RAG projects. I also included a requirements.txt file in the project root directory where you can check the versions of all the libraries that I used for this chatbot.
Absolutely, please check out my latest video called: Open Source RAG Chatbot with Gemma and Langchain | (Deploy LLM on-prem). I took RAG-GPT and replaced the GPT models with Google Gemma 7B. I also replaced OpenAI's embedding model with an open-source model. I would never suggest using a 7B LLM for deployment, but my main goal was to show how you can have the same pipeline (RAG-GPT) with an open-source model on-prem.
My URL is generated for the UI, but nothing is displayed. I checked that the vector DB is also created under the chroma folder for the documents already stored under the docs folder. I am using Azure OpenAI credentials. What could be the reason?
It is hard to tell without seeing the traceback of the problem. Whatever is happening, you can see it from the terminal. In case the problem is not solved yet, feel free to open an issue on the GitHub repository and post the traceback there.
To add .epub files, modify the "prepare_vectordb.py" module and add a condition for .epub, then use the following links to prepare the Langchain loader and pass the loaded files on for chunking:
js.langchain.com/docs/integrations/document_loaders/file_loaders/epub
python.langchain.com/docs/integrations/document_loaders/epub
api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.epub.UnstructuredEPubLoader.html
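Based on the loader from the last link, the new branch could look roughly like this; the file path and the surrounding condition are hypothetical and only show where the .epub case would slot in before the existing chunking step.
```
from langchain_community.document_loaders import UnstructuredEPubLoader

file_path = "data/docs/sample_book.epub"  # placeholder path

# Hypothetical extension branch: load the .epub the same way .pdf/.txt files are loaded,
# then hand the resulting documents to the existing chunking step.
if file_path.lower().endswith(".epub"):
    docs = UnstructuredEPubLoader(file_path).load()
    print(f"Loaded {len(docs)} document section(s) from the epub")
```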
This is the most comprehensive RAG tutorial video I have seen on UA-cam. What a great effort and command over the subject, sir! I am from a low-code business analyst background, so I heavily depend upon Copilot to guide me on Python script functionality. Still, I was able to set up the system as explained by you on my local PC; however, I am getting the following error on executing python src\raggpt_app.py:
import pwd
ModuleNotFoundError: No module named 'pwd'
Can you guide me on what I am missing? Many thanks
Thanks! I am happy to hear that you liked the video! It is a bit difficult for me to debug that code without the whole traceback. What operating system are you using, and can you send me the full error?
@@airoundtable I have same error, I use Windows 10 ...
File "D:\Install\PyThon\lib\site-packages\langchain\document_loaders\__init__.py", line 18, in
from langchain_community.document_loaders.acreom import AcreomLoader
File "D:\Install\PyThon\lib\site-packages\langchain_community\document_loaders\__init__.py", line 163, in
from langchain_community.document_loaders.pebblo import PebbloSafeLoader
File "D:\Install\PyThon\lib\site-packages\langchain_community\document_loaders\pebblo.py", line 5, in
import pwd
ModuleNotFoundError: No module named 'pwd'
Thanks for your kind reply. I am using Windows 10. In VS Code I have created the virtual env, installed all libraries, and followed all instructions as given in the readme (for the RAG-GPT application).
When I initialize python serve.py it's ok:
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT> cd src
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src> python serve.py
Serving at port 8000
But when I initialize python raggpt_app.py (to launch Gradio), I get the following error (truncated):
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT> cd src
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src> python raggpt_app.py
Traceback (most recent call last):
File "F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src\raggpt_app.py", line 25, in
from utils.upload_file import UploadFile
File "F:\Users\XYZ\miniconda3\lib\site-packages\langchain_community\document_loaders\pebblo.py", line 5, in
import pwd
ModuleNotFoundError: No module named 'pwd'
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src>
Hi, thanks! Sure, I will update the repository in about an hour and push that file to it. I saw you opened an issue on GitHub as well, so I will inform you there when the file is added.
This is a fantastic tutorial. Nice work! Question: sporadically, the chat history and retrieved content are appended to the chatbot output along with the question/response. Not sure if it's related, but it seems to only happen when I adjust (increase the text detail of) the prompt instructions. Any idea why? Thanks.
Thanks! I am glad that you liked the video. I am not sure if I understood your point correctly. The chat history is appended to the model's input on every query (along with the retrieved content and the user's query itself). Therefore, the GPT model should always have access to the chat history. But in case the performance is not consistent, the source can be:
1. The context length that the GPT model receives. GPT 3.5 has a 4096-token limit, and based on my experience, if it receives something over 3000 tokens there is a good chance that you will see a degradation in performance. In case you are interested, have a look at minute 8 of the Open Source RAG Chatbot video for a more technical explanation of this aspect.
2. The format of the input: in case the chunk sizes are so small that the input does not create an understandable context, the GPT model can get confused as well.
3. The model system role: the system role should clearly guide the GPT model to understand each part of the input context. A vague system role can confuse the GPT model and therefore affect its usage of the chat history.
Please let me know if this answers your question.
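If you want to check how close your combined input (chat history + retrieved chunks + user query) gets to that 4096-token limit, you can count tokens with tiktoken. A small sketch with a placeholder string:
```
import tiktoken

combined_input = "chat history + retrieved chunks + user query ..."  # placeholder text

# Use the tokenizer that corresponds to the GPT-3.5 family
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
num_tokens = len(encoding.encode(combined_input))
print(f"Input length: {num_tokens} tokens (GPT-3.5 context limit is 4096)")
```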
@@airoundtable Sorry for the confusion. After more testing, I don't think adjusting the prompt is the issue. Sporadically, the chat history, retrieved content (similar to the references sidebar), source text, and the input query are displayed in the output of the Gradio chatbot. Is this normal functionality? It doesn't happen on all questions, though; some just return the response. Thanks again for the interaction.
@@Preds23 No worries. Now I think I got your point, which is a good point indeed. On some occasions the GPT model also provides the source with the response, and in some cases you don't see it unless you click on references. Here is why: that behavior is due to the context length that I mentioned in the previous message. If the context that the GPT model receives is not too long (in a way that overwhelms the model), the model is able to pick up that piece of the instruction and show the source in the output. But if the context that the model receives is long (chat history + user prompt + retrieved content), GPT 3.5 is not powerful enough to follow all the details, so it will only focus on the main body of the instruction and miss the part where it should also provide the source. However, as you noticed, you can always see the reference and the chunks in the references bar. In case you want a more consistent behavior from GPT on that matter:
1. Use GPT 4 instead of GPT 3.5. With the current config, GPT 4 would almost always return that source at the end. Or:
2. Reduce the size of the chunks and the amount of chat history injected into the model. To do that, you have to make sure that this change is not so drastic that it degrades the model's behavior, but to some extent you can make the model's input a bit shorter so the model can pick up all the instruction's details.
To put it more simply, think of it this way: GPT 3.5 has a 4096-token limit. If you pass it an input with 3500 tokens, the model will focus more on the beginning and end of the input and start to forget (or ignore) what was said in the middle section of the input. If you pass an input with 2000 tokens, the model can understand and follow all the instructions nicely without any issue. This is an intrinsic characteristic of all LLMs. I hope this helps you understand the problem.
Very informative video; I understood all the components involved. Could you please let me know where I should define "OPEN_API_KEY"? Is it in moderation.py, or should this, along with other parameters, be defined in environment variables? Could you please help me with this?
Thanks! To keep important information like passwords or API keys safe, we put them in a special file called .env in our project. This is a common way to handle private settings. Here's a really easy guide to help you set it up:
1. Make a new file in the main folder of your project and call it .env. Inside this .env file, you can write your private information like this:
```
OPEN_AI_API_KEY=yourapikeyhere
ANOTHER_SECRET_KEY=yoursecretkeyhere
```
2. To use these settings in your code, you'll need to add a couple of lines to tell your program to read the .env file. Add these lines at the beginning of your Python script:
```
import os
from dotenv import load_dotenv

load_dotenv()  # This tells Python to read your .env file
```
3. Then, whenever you need to use the information from the .env file, you can get it like this:
```
openai_api_key = os.getenv("OPEN_AI_API_KEY")
another_secret_key = os.getenv("ANOTHER_SECRET_KEY")
```
Replace OPEN_AI_API_KEY and ANOTHER_SECRET_KEY with the actual names you used in your .env file. Now your code can use your private settings safely!
How do I register an Azure deployment? We already have some params in load_openai_cfg(self), like API_KEY, API_BASE, API_VERSION, and API_TYPE. I tried to add a parameter API_DEPLOYMENT, but it still errors: No deployment found
I hope you have already found the answer to your question. But in case you haven't: first check whether you can call your models from a notebook separate from the project. After a successful call, then try to add the models to the project. This error was raised because the project could not properly communicate with the model; that is either due to the model name or the credentials that you took from Azure OpenAI.
@@airoundtable Yes, already solved, but I ended up using OpenAI instead of Azure. Hi, I want to ask again: how can we make the chatbot understand the format of documents? For example, I have a document with the format: title, dates, content. Then I upload a new document and check whether my new document has the same format as my preprocessed document. I also altered the system prompt in app_config.yml to add the capability for the chatbot to detect typos, but it doesn't work. How can I edit the project? Thanks in advance
@@coffeepod1 Glad to hear it. Regarding your new question, I've never done it, but if I wanted to solve it, I would use a hybrid approach: combine Python libraries that are able to extract document info based on the document structure (headings, subheadings, etc.), and from there either hard-code the check to make sure the structure is correct or use an LLM to make the judgment for me. But just handing a document to an LLM would not be an effective way of achieving your goal.
About the typos: since LLMs work with tokens and not words, it is sometimes hard for them to detect typos (especially when the context length is long). The benefit is that when we interact with them, they don't care if we have typos in our queries. But on the downside, when it comes to fixing these typos, there is a chance that they miss the errors. They perform better with smaller context lengths, which is not usually the case for RAG systems.
@@airoundtable I see; we are not OpenAI (or a big AI company), so we are limited on resources indeed. But what about "prompt engineering" with a few shots about the doc structure? Or how about an NER approach, since this is similar to information extraction? And about the typos, I didn't really think about the obstacles you explained; I just assumed the GPT model (I use GPT-4o) has the capability to detect typos. Poor me, working on a hard project
@@coffeepod1 I am not sure about those suggestions; I haven't worked with them. But at this point, I would say the best strategy is to quickly test the different approaches and see which one is more effective than the others, and then spend time improving that approach.
Hi, could you give me the .env settings you use: openai_api_type, openai_api_base, and openai_api_version? I am struggling to make it work. Thanks. Great video, by the way.
Thank you for your positive feedback @agep13. Regarding the .env settings, it's important to note that these often contain sensitive information, such as API keys, which should be kept private and not shared openly to ensure security. However, I can certainly guide you on setting up your own .env file. For the openai_api_type, openai_api_base, and openai_api_version, you'll need to consult the official documentation provided by OpenAI or Azure to determine the correct values. These details are typically available in the API or developer section of your account dashboard. Here's a basic template for what your .env file might include:
OPENAI_API_KEY=your_unique_api_key
OPENAI_API_TYPE=the_type_of_api_you_are_using
OPENAI_API_BASE=azure_open_ai_endpoint_url
OPENAI_API_VERSION=version_number
Please ensure you replace the placeholders with your actual API key and the appropriate values for your use case. The OPENAI_API_TYPE will depend on the API service you've subscribed to (e.g., GPT in the video), while the OPENAI_API_BASE and OPENAI_API_VERSION are generally standard URLs used for accessing OpenAI's API. Remember to keep your .env file secure and avoid uploading it to public repositories to prevent any unauthorized use of your API keys.
@dtable Hello, great video. I know the OPENAI_API_KEY is in your OpenAI account and the openai_api_type would be something like GPT-3.5 or 4, but for the rest I'm still confused about where to get the base and version. How do I check which URLs to use?
Hi @@ShadowScales, thanks! Your confusion is on point, because to use GPT models from OpenAI directly you won't need to insert the endpoint and the api_version. You probably missed that I am using the GPT model from Microsoft Azure; that is why I have 4 credentials for it. In order to understand how to adjust the code to the OpenAI API directly, please read the comments under @mikew2883 down below (it is the comment with the 7 replies). There we had a full discussion on how to properly modify the project. I hope this helps. In case you have any questions along the way, please let me know.
Hello, thanks for the video, but when I try to run the app I get this error: "openai.error.InvalidRequestError: Invalid URL (POST /v1/engines/gpt-35-turbo/chat/completions)". This is because I don't have access to the Azure OpenAI API yet; I'm using the OpenAI API. Would you be able to help me with this? Thank you
Hi Arian. Thanks. I just pinned a message at the top of the comment section where I discussed this issue with @mikew2883. Please read that discussion; I provided all the necessary guidance for changing the OpenAI API call from Azure to OpenAI itself. Let me know if you have any other questions.
@@airoundtable Thank you for your reply. I made a few additional adjustments, and now it works. Really appreciate your awesome work and the effort you put in. Damet Garm 👊
Great work! Would love to see this with LiteLLM as an option and some sort of basic user login system…along the lines of open webui
Thanks! That is indeed a great combo. I haven't looked at it closely yet, but I will definitely check it down the road.
You did a great job, but the videos are so small, I have to constantly expand them to read it. It would be nice if you could read the text without going full screen all the time.
Thanks @RetiredVet! You are right. I have to find a way to increase the size of the content for an easier read. That is actually why in the Langchain vs Llama-index video I omitted the PowerPoint and showed everything on screen, including each command that I was executing. However, I am constantly looking for ways to improve the quality as I only recently started to upload videos on UA-cam.
@@airoundtable I enjoyed your video and think your code is great. The code and explanations are the important part; you can learn the video stuff much more easily. I've looked at a lot of langchain videos and your explanations are very clear.
Unfortunately, I am an intermediate Python programmer and I had no idea that requirements files were so different between Windows and Linux. I cannot use your requirements files, and when I try installing langchain with pip these days, it never works. If the UA-cam video is a week old, the requirements have changed. I try to downgrade to the recommended versions, but then langchain installs packages that don't work. I am learning a lot more about package management than I ever wanted to.
Langchain is a very interesting project, but it is moving so fast, it is difficult for me to keep up.
Keep up the good work.
Thanks for the great tutorial! Just out of interest, would it also be possible to use streamlit as a user interface or are there any technical issues? Thanks again.
Thanks for the feedback! I am glad that you liked the video. Sure, you can use streamlit as well; in my opinion, using streamlit is a bit easier than Gradio. That was also one of the goals of this series: to show how to use streamlit, gradio, and chainlit. I used each one of them in a separate video. If you check the channel you will see a chatbot that I designed with streamlit:
"Connect a GPT agent to duckduckgo search engine".
Feel free to reach out if you have any other questions.
@@airoundtable Thanks! You mentioned also the issue of data flow management. Let's assume that I upload ten documents in advance to the database and then upload another one while using the chatbot. Will the chatbot use all eleven documents to answer my question? Thanks again for your help!
That is a good question @@tobiasbuchmann6972. No, this chatbot treats the documents that were prepared in advance differently from the ones that you upload while using the chatbot. So, for your example, it creates an index for those 10 documents that you preprocessed earlier, and it creates a separate index for that single document that you passed to it while using the UI. Also, let's say that while using the UI you upload documents in multiple steps: every time you upload a new set of documents, it creates a new index for them and points the chatbot to the most recent index. Finally, whenever you run the UI, it makes sure that all the indexes created for uploaded documents during the previous user's session are removed, cleans up the disk, and gives you a fresh start.
But this is just one way of doing it. At the end of the day, all these functionalities can be adjusted based on your needs.
Great video. I'm testing out the project but it seems that the chatbot also takes information from the web, as it accesses websites when there is no uploaded docs. I would like to have it only interact with the PDFs/uploaded docs... Any fixes?
Thanks Thomas! The chatbot does not have access to the internet, but it does have access to the pre-trained knowledge of the GPT model. Overall, it works in 3 different ways:
1. If you have already preprocessed some documents and start using the chatbot, it will give you answers based on those documents (this is the chat with pre-processed docs feature).
2. If you select the chat with upload docs feature and upload documents, it will start giving you answers based on the uploaded documents (until you switch the setting back to pre-processed docs).
3. In case the user's question is not related to any documents, the chatbot will use its own knowledge, but in a limited way, to just act as a friendly chatbot. If you would like to restrict it even more, you can change the LLM system role in the config folder (configs/app_config.yml: llm_system_role argument). That is where I instruct it and explain how it should behave (a rough sketch of a stricter role is below this reply). I explain it in the video at:
00:40:45 LLM system role
I hope this helps you solve the problem.
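For illustration, here is a rough sketch of what a stricter system role could look like (this is my own example wording, not the project's actual llm_system_role text); in practice you would paste a string like this as the llm_system_role value in configs/app_config.yml:
```
# Hypothetical example of a stricter system role (not the original text from app_config.yml).
llm_system_role = (
    "You are a document assistant. Answer ONLY using the retrieved document "
    "chunks provided in the prompt. If the answer cannot be found in that "
    "content, reply exactly with: 'I could not find this in the provided documents.' "
    "Do not answer from your own general knowledge."
)
```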
Great video on RAG. One quick question. Can we add documents from UI for preprocessing and chat with that rather than adding documents to data folder from backend? I mean..add a functionality in UI that will allow me to add documents in data folder and preprocess it so that i can chat? Thank you
Thanks! Yes, the chatbot has that capability. Follow these steps:
1. In the "RAG with" dropdown, choose chat with upload docs.
2. Use the "Upload doc" button and select your documents.
3. Wait a few seconds until the chatbot tells you that the documents are processed, and then you can start asking questions.
Is there a similar ready-made solution on the site "poe"? I am a beginner and want to work with such a model rather than build it myself.
Check out my video below:
ua-cam.com/video/8iMIGVWMPPQ/v-deo.htmlsi=ryvfD6m65Jyro205
This is a free RAG app that you can use
Can we have the option to upload files to the vectore store to update the assistant? like upload files to the vector store of openai?
If you mean whether you can add/remove/modify the files of the vectorstore that you created, the answer is yes, you can. You can easily find info on it; I just did a quick search and saw this tutorial:
www.datacamp.com/tutorial/chromadb-tutorial-step-by-step-guide
But I'd search the ChromaDB documentation to find all the details.
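As a rough illustration of editing a Chroma collection (the path, ids, and file names below are made up, and the exact API can vary slightly between chromadb versions):
```
import chromadb

# Assumed path; point this at the directory where your vector database was persisted.
client = chromadb.PersistentClient(path="data/vectordb")
collection = client.get_or_create_collection("docs")

# Add new chunks (ids must be unique strings).
# Note: in a real pipeline you should pass an `embeddings=` argument computed with the
# same embedding model used for the rest of the collection; otherwise Chroma falls back
# to its default embedding function.
collection.add(
    ids=["new_file-chunk-0", "new_file-chunk-1"],
    documents=["First chunk of the new file...", "Second chunk of the new file..."],
    metadatas=[{"source": "new_file.pdf"}, {"source": "new_file.pdf"}],
)

# Remove all chunks belonging to a document you no longer want indexed.
collection.delete(where={"source": "old_file.pdf"})
```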
Hello sir, is the OpenAI API key paid? Do we have to pay for it in order to access and use it?
Hi, yes. You have to pay for it to be able to make the API calls. If you go to the OpenAI website you will see how to get the API key. Also keep in mind that the project is currently using Azure OpenAI; in case you want to use OpenAI directly, a couple of modifications are required. I have pinned a comment here (you can see it on top) where I explained all the steps in detail.
Hello there, I am running into the following error: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.
Hello. The project is using Azure OpenAI. Therefore, it needs access to the API key, endpoint, and deployment name of the GPT model and the embedding model in Azure. And for that you need to deploy them in Azure OpenAI Studio first. This error arises in two scenarios:
1. The model has not been deployed yet
2. The model is deployed but the deployment name that was passed to the chatbot is not the same as the deployment name that was used in the OpenAI studio.
And in case you want to use OpenAI directly, instead of Azure OpenAI, please read the pinned comment for a step by step description of the changes that are required for that.
@@airoundtable thank you, how do i pass the deployment name to the chat bot?
@@tonysingh9426 create a file in the project directory and name it: .env
in there create these arguments:
OPENAI_API_TYPE=azure
OPENAI_API_VERSION=
OPENAI_API_KEY=
OPENAI_API_BASE=
gpt_deployment_name=
embed_deployment_name=
Also, in the config folder, under llm_config the name of the GPT deployment is set in the engine argument; change that as well.
Then run the chatbot. The project automatically loads the .env file and extracts all this information from it.
Thank you, sir
@@airoundtable do i need a second deployment for the embedding model? and if so should it be the same engine as the gpt_deployment?
Hi, i am using Openai entirely and not azure, i change the chat completion function as per your solution in the comments, but i am getting the error : TypeError: 'ChatCompletion' object is not subscriptable in response["choices"][0]["message"]["content"]. Please suggest some resolution for the same. Also, the application stops fetching answers after the first question. Please help. @AI RoundTable
Hi @taylorfans1000, this problem is not hard to debug.
1. Based on OpenAI's website, even by using their model directly you should be able to extract the response from response["choices"][0]["message"]["content"]. Here is the reference: platform.openai.com/docs/guides/text-generation/chat-completions-api
2. But to make sure that your GPT model is working as expected, test it separately in a notebook. Use your API key and the client.chat.completions.create method from openai and make a successful API call with the GPT model that you are using.
3. Don't use response["choices"][0]["message"]["content"]; print(response) itself directly and make sure that you are getting the whole JSON response from OpenAI.
4. Once you've got the API call working, try to extract the specific message content using response["choices"][0]["message"]["content"] and make sure that you can pull the response content out of OpenAI's JSON response.
5. After you have gone through these steps successfully, apply it to the following files and lines in your project:
src/utils/chatbot.py, lines 68 to 78
src/utils/summarizer.py, lines 110 to 118
My guess is the fetching problem will be solved as well after you fix this. I hope this helps; feel free to let me know in case you have more questions.
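For reference, here is a minimal sketch of steps 2 and 3 with the newer openai>=1.x Python SDK (the model name is just an example). One thing worth noting: this newer client returns an object, so the content is read with attribute access (response.choices[0].message.content); the dict-style response["choices"][0]["message"]["content"] only works with the older 0.x SDK, which is the most likely cause of the "not subscriptable" error.
```
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # example model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello."},
    ],
    temperature=0,
)

print(response)                             # inspect the full response object
print(response.choices[0].message.content)  # extract just the answer text
```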
Hey! Were you able to debug this issue? ....i'm facing the same! Thanks in advance!
How can we evaluate the responses generated by the RAG system?
That is the million dollar question. There has been a lot of effort around evaluating RAG systems, but the challenging part is that there is no single metric that can tell you how accurate your system's response is. Instead, to evaluate RAG systems, we usually use the help of LLMs themselves. But before that, we need to understand what the challenges in a RAG system are. Here is a brief summary of some of the key areas:
1. In the data preparation pipeline: data quality + chunking strategy + embedding quality
2. On the retrieval side: user's query quality + search quality + relevance of the retrieved documents' content to the query
3. On the synthesis side: context overflow + LLM hallucination + answer relevance
These are the components that need to be adjusted and evaluated in a RAG system. For the evaluation pipeline itself, you can either use one of the frameworks being developed for this purpose, e.g. TruLens, Langsmith, or Galileo (my recommendation: Langsmith),
or you can design a custom pipeline depending on your goal and use case. I have a video on the channel called "Langchain vs Llama-index" where I design an end-to-end pipeline and evaluate the performance of 5 different RAG techniques. There I go into much more detail about this topic.
Overall, this task requires a good amount of testing and iteration, especially if the requirements are very specific and complex.
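As a toy illustration of the LLM-as-a-judge idea (the prompt wording and model name are placeholders, not taken from any of the frameworks above), a custom relevance check could look roughly like this:
```
from openai import OpenAI

client = OpenAI()

def judge_answer(question: str, context: str, answer: str) -> str:
    """Ask an LLM to grade how well the answer is supported by the retrieved context."""
    prompt = (
        "You are grading a RAG system.\n"
        f"Question: {question}\n"
        f"Retrieved context: {context}\n"
        f"Answer: {answer}\n"
        "On a scale of 1 to 5, how well is the answer supported by the context? "
        "Reply with the score and one sentence of justification."
    )
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content
```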
Great tutorial. I have a question: Can I learn and run all your projects without openai api key?
Thanks. Yes, you need to convert the OpenAI call sections and use another framework like Ollama, for example. I have received multiple messages from the community saying they successfully used Ollama with the projects on the channel.
module error pwd. Are you running this on a unix-like system? What modifications must I make for Windows pls?
I ran this project on Windows, but it does not matter; the project is designed so it can be executed on any OS without modification. If you are getting the error while installing requirements.txt, just remove the library that is causing the issue and try again.
I have a doubt... Is this project only for text PDFs? Or can it be used for PDFs which contain images and tables as well?
This project is only for .txt and PDF files and easily extendable for docx files (you just need to add it to the list of acceptable files in the code). If your documents contain images and tables, that would not hurt the performance on text. But for implementing RAG with images and tables this project needs to be upgraded.
RAG on images requires image embeddings and vector search on those embeddings. There is still no solid approach that can handle various types of images (e.g. technical drawings), but there are a few preliminary solutions out there that work on generic images. So, the industrial application is still very limited.
Tables, on the other hand, require specific approaches that can extract the contents of a table properly. The "Unstructured" library has been working on this aspect and Langchain has adopted it within its framework. So, since I used Langchain in this project, you can easily modify it and add that approach. The problem is that handling tables this way takes a very long time (in my experience), which makes it impractical for industrial purposes. That leaves the door open to custom solutions that suit a specific business need, which can vary widely.
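As a rough sketch of that Langchain/Unstructured route (the file name is made up and the exact options depend on your installed versions), loading a PDF in "elements" mode lets you separate table elements from ordinary text before chunking:
```
from langchain_community.document_loaders import UnstructuredPDFLoader

# "elements" mode returns one Document per detected element (text, title, table, ...)
loader = UnstructuredPDFLoader("example_report.pdf", mode="elements")
docs = loader.load()

tables = [d for d in docs if d.metadata.get("category") == "Table"]
other = [d for d in docs if d.metadata.get("category") != "Table"]
print(f"Found {len(tables)} table elements and {len(other)} other elements")
```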
@@airoundtable thank you so much for replying
One more doubt: can I write all this code in Jupyter notebooks and run it?
@@shahnaz9026 Technically you can. You need to do a lot of refactoring to the code, but it is doable if you want to run it on Jupyter. Keep in mind that this code is currently using Azure OpenAI; in case you want to use OpenAI directly, the chat completion functions need to be modified as well.
Can I use any other model? Or point me to the section where I can use a gguf file.
Sure, you can use other LLMs, but you have to modify the code for that. The code is currently using the OpenAI GPT 3.5 model (via API calls) for inference. If you want to change the LLM and run the code:
1. Use a powerful LLM for a good performance (consider the context length, chunk sizes, and the instructions that you want to give to the model along with the available computational power that you have at hand)
2. You need to change the code wherever it gets the response from the GPT model, which happens in two locations:
a. src/utils/chatbot.py - response function - lines 68 to 72
b. src/utils/summarizer.py - get_llm_response function
3. Depending on the model that you are using, you may need to process the response in a different way as well (e.g. llama2's output will contain both the query and the response along with some special characters that need to be processed for a neat user experience).
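If by gguf you mean running a local model file, one option (just a sketch; the file path and parameters are placeholders and this is not part of the current project code) is Langchain's LlamaCpp wrapper:
```
from langchain_community.llms import LlamaCpp

# Placeholder path to a local gguf model file (requires llama-cpp-python installed)
llm = LlamaCpp(
    model_path="models/llama-2-7b-chat.Q4_K_M.gguf",
    n_ctx=4096,       # context window; match it to your chunk sizes and history length
    temperature=0.0,
    verbose=False,
)

answer = llm.invoke("Answer based on the retrieved context: ...")
print(answer)
```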
@@airoundtable excellent, will give it a try. Thank you 💯💯💯
Very detailed explanation and thank you for making it open source.
Is there any plan for advancing this application? For example:
1. An advanced RAG pipeline which can extract text, table data, or images based on the user's question
2. Creating a vector DB based on text, image, and table data
3. Providing a login and admin panel to track information like the number of tokens used by different users, etc.
4. Using React/Node for a better app experience
5. A complete deployment process
Thanks! I am glad that you liked the video.
These are all great points. For some of them, yes, there will be a video soon, and for some I still have no plan. I am looking into solutions for taking tables and unstructured documents into account along with images. There are already some solutions out there (the unstructured library and image vector databases), but none of them are practical yet in my opinion. For instance, the available approach that "langchain" and "unstructured" proposed for processing tables in documents is super slow and technically impractical. So, I will make a video as soon as I see an approach that can be applied in real-world scenarios.
The next two videos should be interesting for you, I guess. The next one is a multimodal chatbot that uses 5 different models in the background and is able to answer questions about the content of an image as well. And the one after that is an advanced RAG chatbot that uses a knowledge graph and takes into account more detailed relationships between the content of a document and related chunks.
3, 4, and 5 have crossed my mind, but I still have not planned a video for them. I will keep them in mind and think about it after the next two videos.
Thanks for the suggestions @kunalsatpute8379!
@@airoundtable Thank you for replying, and excited for your videos. One question: will these videos be an extension or enhancement of this application, or will they be entirely separate videos?
@@kunalsatpute8379 That would be an expansion. Besides LLM applications, one of the main ideas behind this series was to walk through all the necessary steps required for an advanced multimodal chatbot. I started by explaining function calling and vector search, and using them I designed multiple projects. The next video would be:
1. A combination of all the chatbots that I have designed and uploaded on the channel so far (RAG-GPT + connecting the GPT to the search engine + chatting and summarizing websites)
2. We will use the concept that I showed on open-source RAG for creating a web server for serving models
3. We will add more abilities: the user can interact with the chatbot by sending voice, text, and images, and the chatbot will respond in voice and text + we can ask questions about a specific image that we uploaded and the chatbot will be able to answer questions about the image content + we can ask the chatbot to generate images for us as well.
So it would be an any-to-any chatbot
input: voice, text, image
output: voice, text, image
functionalities: Normal AI chatbot + RAG with documents + RAG with websites + search the web using a search engine + summarize documents + summarize websites + understand image both for answering questions and for generating them
So, the RAG-GPT project would be one arm of that chatbot, and I am thinking of giving the user the ability to work with around 9 or 10 different Gen AI models (all open source except GPT). So, in that video I will just briefly touch on RAG-GPT and the other parts that I have already covered in previous videos, and the focus will be on explaining the multimodal side of it and how the whole chatbot was designed. That is a huge project.
Facing an issue: when I run my chatbot it does not give an answer and shows an error; when I process a file it also shows an error.
As I mentioned, please check the repository issues and open a new one if needed
Hi, I have to ask some beginner questions:
does it support Arabic documents as well?
and is the key free to use?
Hi,
I am really not sure how the models would perform on Arabic. You can give it a try or search in Arabic forums and see what models they suggest for Arabic.
Which key are you referring to?
How do I stop the streamed response from the LLM in Langchain? We are using X-Accel-Buffering.
It is hard to tell without seeing the code, and it depends on how you are calling the model and generating the response. In the code in my GitHub repository, the model does not stream the response. But in case you are using different code with langchain, check these links:
- python.langchain.com/docs/modules/model_io/chat/streaming
- python.langchain.com/docs/modules/model_io/llms/streaming_llm
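As a rough sketch (the model name is a placeholder), streaming in Langchain is controlled by how you call the model: llm.stream(...) yields chunks incrementally, while a plain llm.invoke(...) returns the whole response at once, so switching to invoke effectively turns streaming off:
```
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)  # placeholder model name

# Streaming: tokens arrive incrementally
for chunk in llm.stream("Tell me a short joke"):
    print(chunk.content, end="", flush=True)

# Non-streaming: a single complete response
response = llm.invoke("Tell me a short joke")
print(response.content)
```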
This is a great video that clearly explains the whole RAG development process.
One quick question: I created the .env file to store the OPENAI_API_KEY, but it still could not find it. Where should I put the OPENAI_API_KEY?
Thanks, I am glad that you liked the video!
To make that work:
Create a plain text file named ".env", put it in the parent folder of the project (the RAG-GPT folder), and add your arguments like this:
OPENAI_API_TYPE=azure
OPENAI_API_VERSION=
OPENAI_API_KEY=
OPENAI_API_BASE=
Then to test if it is working properly, open a notebook or a raw .py module and run this command:
import os
from dotenv import load_dotenv, find_dotenv
# This line automatically finds the .env file in your environment
_ = load_dotenv(find_dotenv())
openai_api_type = os.getenv("OPENAI_API_TYPE")
openai_api_base = os.getenv("OPENAI_API_BASE")
openai_api_version = os.getenv("OPENAI_API_VERSION")
openai_api_key = os.getenv("OPENAI_API_KEY")
Then you can print and make sure that it got it right:
print(openai_api_type)
If you see the values by printing them, then you are good to go.
Hi,
I got an error saying there are no files inside data/docs:
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'data/docs'
but I didn't change anything in your repo; I just cloned it and ran it. Can you give me some guidance on this issue?
Hi,
It shouldn't be the case; this directory is part of the repository, and I am using the 'pyprojroot' library to manage the directories in the project automatically. Without the full traceback of the error, I cannot tell why it is happening. In case you still get the error, feel free to share the full traceback and I will let you know the source of the problem.
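For reference, this is roughly how pyprojroot resolves directories like data/docs relative to the repository root (and a quick way to check whether the path actually exists on your machine):
```
from pyprojroot import here

# here() walks up from the current working directory until it finds the project root,
# so relative paths like data/docs resolve correctly from anywhere inside the repo.
docs_dir = here("data/docs")
print(docs_dir, docs_dir.exists())
```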
Would it also be possible to add .csv files or images?
Not in this version. For performing Q&A and RAG with CSV files, please have a look at my LLM agent videos. For images you would need more complex approaches, and the ones presented by Langchain and Unstructured are not ready for production; they take a very long time to process the images. In my next video, though, I am aiming to show how to fine-tune multimodal LLMs on custom image datasets, and those models can be used to perform RAG on images. Here is the link to my video describing how to chat with SQL and tabular databases:
ua-cam.com/video/ZtltjSjFPDg/v-deo.htmlsi=bh9xdkJqufFBMrBI
Hi, I love the whole project, but I would be happy if you go more in-depth on the following statement in the repo: "It is strongly recommended to design a more robust and secure document handling process for any production deployment."
Do you mean improving the security of the documents and restricting access to the app, and implementing such steps, or something else?
Hi, thanks.
No, I would suggest handling the access level on the chatbot side. In general, for RAG projects, I suggest separating the document processing pipeline from the chatbot itself. And for the data pipeline there are many factors that need to be taken into account. If I assume the company is mid-size or bigger:
1. Document cleaning and transformation (for instance, if you are dealing with .txt, .docx, and .pdf, after preprocessing you can convert everything to PDF for a manageable workflow).
2. Content validity check. This step can be managed at a division level or on multiple levels, depending on the size of the company and the complexity of the documents. (The verification teams should verify the contents and be responsible for what is in there.)
3. Avoid duplicates. (Sometimes different divisions have very similar documents but for different use cases and purposes. This can cause confusion in a RAG project.)
4. Address specific cases (e.g. if a PDF file was created using a scanner, you would not be able to perform RAG on it).
5. Manage security (who can add/remove a document from the pipeline itself and also the access level)
6. Implement a CI/CD pipeline for automating the workflow and managing scale (e.g. when a document is added to a database, your pipeline should only add that document to the vectorDB rather than recreating the vectorDB from scratch)
7. Secure the pipeline
8. Test and improve
These are some of the key aspects of the data pipeline for RAG projects.
dude!!!! INSANE!! such a good tutorial. you rock.
My one question is: credits. Will the vector approach save credits? E.g. I want to build a legal document reader & Q&A, and some docs are 100 pages long. Won't each doc cost hundreds in API credits? Or is that what vectorization & DBs are for?
Thanks! I am glad you liked the video. Storing the vectorized documents in a vector DB definitely saves costs, and it would not be efficient to build such a system without vector databases (especially since you can use many of them for free). About the pricing itself, vectorization is not a very costly task unless you are dealing with thousands of documents. According to OpenAI's documentation, text-embedding-3-large is priced at $0.00013 / 1k tokens. Keep in mind that text-embedding-3-large is their newest and most expensive embedding model (also more expensive than the one that I used in the video).
Source: openai.com/blog/new-embedding-models-and-api-updates
So, imagine a document with 100 pages: if you turn it into, let's say, 500 chunks, you will make 500 API calls to this model and it will cost you roughly around US$0.03 in total (approximate).
I considered each chunk to be around 1500 characters which makes it around 400 tokens.
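Here is the back-of-the-envelope math behind that estimate (the chunk count and token count are just assumptions):
```
chunks = 500                    # assumed number of chunks for a 100-page document
tokens_per_chunk = 400          # ~1500 characters per chunk
price_per_1k_tokens = 0.00013   # text-embedding-3-large, per OpenAI's pricing page

total_tokens = chunks * tokens_per_chunk             # 200,000 tokens
cost = total_tokens / 1000 * price_per_1k_tokens     # ~0.026 USD
print(f"Estimated embedding cost: ${cost:.4f}")
```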
Also, keep in mind that you can manage the vector DBs by editing them (adding/removing documents) rather than re-creating them on every small change.
Great project, but ultimately we are using the OpenAI APIs, which do the "real" AI magic here. The question is: if we already plan to build a customized bot and train it on our data, wouldn't we also want to train our own open-source LLM such as Llama-3 and remove the dependency on online API calls (which make the project strongly bound to public access of our private documents, right)?
It very much depends on the requirements of your project. If using closed-source models like GPT is not an option, then using a powerful open-source model like Llama is the way to go. You can definitely do that. I have another video where I implemented this same project using open-source models. You can check it out here:
ua-cam.com/video/6dyz2M_UWLw/v-deo.htmlsi=Lfn9h2Y9pl5zRUd9
Hi,
I am using a Windows machine and getting an error while running the upload_data_manually.py file.
It's giving me RuntimeError: your system has an unsupported version of sqlite3. Chroma requires sqlite3 >= 3.35.0.
Then I checked the sqlite version using
sqlite3 -version, and it shows me 3.41.2, which is greater than 3.35 🤔
Hi, please try to replicate the project using the exact library versions that I included in requirements.txt and let me know if the problem appears again.
@@airoundtable I downloaded the sqlite3 DLL files and copied them to the Python installation directory. This resolved the issue.
Is OpenAI commercial; does it need an API key? And is it possible to use an LLM other than third-party AI services?
OpenAI's GPT and embedding models need an API key. If I got your point correctly, you want to use open-source LLMs. You absolutely can. Please check the Open Source RAG video on my channel; that is almost the identical project with open-source models.
ua-cam.com/video/6dyz2M_UWLw/v-deo.htmlsi=W9dW4JNbC2KH_tHs
QQ: What version of Python is best here, as there are a lot of packages and hence implicit dependencies?
I have been using Python 3.11 and never faced any issue with RAG projects. I also included a requirements.txt file in the project root directory where you can check the versions of all the libraries that I used for this chatbot.
Thanks. Apologies if you had already clarified that. I will give it a go with 3.11. @@airoundtable
Thank you very much for this. Very concise in your explanations. Any chance this can be done without OpenAI, but instead use a local LLM like ollama?
Absolutely, please check out my latest video called:
Open Source RAG Chatbot with Gemma and Langchain | (Deploy LLM on-prem)
I took RAG-GPT and replaced the GPT models with Google's Gemma 7B. I also replaced OpenAI's embedding model with an open-source model. I would never suggest using a 7B LLM for deployment, but my main goal was to show how you can have the same pipeline (RAG-GPT) with an open-source model on prem.
My URL is generated for the UI, but nothing is getting displayed. I checked that the vector DB is also created under the chroma folder for the documents already stored under the docs folder. I am using Azure OpenAI credentials; what could be the reason?
It is hard to tell without seeing the traceback of the problem. Whatever is happening, you can see it from the terminal. In case the problem is not solved yet, feel free to open an issue on the GitHub repository and post the traceback there.
How can this be modified to include .epub files?
To add .epub files, modify the "prepare_vectordb.py" module, add the condition for accepting .epub, and then use the following links to prepare the langchain loader and pass the loaded files for chunking (a quick sketch follows the links):
js.langchain.com/docs/integrations/document_loaders/file_loaders/epub
python.langchain.com/docs/integrations/document_loaders/epub
api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.epub.UnstructuredEPubLoader.html
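A minimal sketch of what that could look like (the file name and chunking parameters are placeholders; UnstructuredEPubLoader needs the unstructured package with its epub extras installed):
```
from langchain_community.document_loaders import UnstructuredEPubLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = UnstructuredEPubLoader("example_book.epub")
documents = loader.load()

# Chunk the loaded documents before adding them to the vector database
splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=150)
chunks = splitter.split_documents(documents)
print(f"Loaded {len(documents)} document(s), split into {len(chunks)} chunks")
```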
Thank u Sir🙏
This is the most comprehensive RAG tutorial video I have seen on UA-cam. What a great effort and command over the subject sir!
I am from a low-code business analyst background so I heavily depend upon co-pilot to guide me on python script functionality
Still I was able to set up the system as explained by you on my local PC; however, I am getting the following error on executing python src\raggpt_app.py:
"
import pwd
ModuleNotFoundError: No module named 'pwd'
"
Can you guide me on what I am missing
Many thanks
Thanks! I am happy to hear that you liked the video! It is a bit difficult for me to debug that code without the whole traceback. What operating system are you using? and can you send me the full error?
@@airoundtable
I have same error, I use Windows 10
...
File "D:\Install\PyThon\lib\site-packages\langchain\document_loaders\__init__.py", line 18, in
from langchain_community.document_loaders.acreom import AcreomLoader
File "D:\Install\PyThon\lib\site-packages\langchain_community\document_loaders\__init__.py", line 163, in
from langchain_community.document_loaders.pebblo import PebbloSafeLoader
File "D:\Install\PyThon\lib\site-packages\langchain_community\document_loaders\pebblo.py", line 5, in
import pwd
ModuleNotFoundError: No module named 'pwd'
Thanks for your kind reply. I am using Windows 10.
On VS Code I have created the virtual env, installed all libraries, and followed all instructions as given in the readme (for the RAG-GPT application).
When I initialize python serve.py it's OK:
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT> cd src
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src> python serve.py
Serving at port 8000
But when I initialize python raggpt_app.py (to launch Gradio), I get the following error (truncated):
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT> cd src
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src> python raggpt_app.py
Traceback (most recent call last):
File "F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src
aggpt_app.py", line 25, in
from utils.upload_file import UploadFile
File "F:\Users\XYZ\miniconda3\lib\site-packages\langchain_community\document_loaders\pebblo.py", line 5, in
import pwd
ModuleNotFoundError: No module named 'pwd'
PS F:\Users\XYZ\Desktop\projects\LLM-Zero-to-Hundred\RAG-GPT\src>
Excellent video. Thanks a lot. Could you please help me get the terminal_q_and_a.py file?
Hi, Thanks! Sure, I will update the repository in about an hour and push that file in it. I saw you opened an issue on Github as well, so I will inform you there when the file is added.
@@airoundtable I got the file. Thanks a lot for your quick response and excellent video.
@@TechPuzzle_Haven You're welcome! Thanks!
This is a fantastic tutorial. Nice Work!
**Question: sporadically the chat history and retrieved content are appended to the chatbot output along with the question/response. Not sure if it's related, but it seems to only happen when I adjust (increase the text details of) the prompt instructions. Any idea why? Thanks.
Thanks! I am glad that you liked the video.
I am not sure if I understood your point correctly. The chat history is appended to the model's input on every query (along with the retrieved content and the user's query itself). Therefore, the GPT model should always have access to the chat history. But in case the performance is not consistent, the source can be:
1. The context length that the GPT model received. GPT 3.5 has a 4096-token limit, and based on my experience, if it receives something over 3000 tokens, there is a great chance that you see degradation in performance. In case you are interested, have a look at minute 8 of the Open Source RAG Chatbot video for a more technical explanation of this aspect.
2. The format of the input: in case the chunk sizes are so small that the input does not form an understandable context, the GPT model can get confused as well.
3. The model system role: the system role should clearly guide the GPT model to understand each part of the input context. A vague system role can confuse the GPT model and therefore affect its usage of the chat history.
Please let me know if this could answer your question.
@@airoundtable Sorry for the confusion. After more testing, I don't think adjusting the prompt is the issue. Sporadically, the chat history, retrieved content (similar to the references sidebar), source text, and the input query are displayed in the output of the Gradio chatbot. Is this normal functionality? It doesn't happen on all questions though; some just return the response. Thanks again for the interaction.
@@Preds23 No worries. Now I think I got your point, which is a good one indeed. On some occasions the GPT model also provides the source with the response, and in other cases you don't see it unless you click on references. Here is why:
That behavior is due to the context length that I mentioned in the previous message. If the context that the GPT model receives is not too long (in a way that overwhelms the model), the model is able to pick up that piece of the instruction and show the source in the output. But if the context that the model receives is long (chat history + user prompt + retrieved content), GPT 3.5 is not powerful enough to follow all the details, so it will only focus on the main body of the instruction and miss the part where it should also provide the source. However, as you noticed, you can always see the reference and the chunks in the reference bar. But in case you want a more consistent behavior from GPT on that matter:
1. Use GPT 4 instead of GPT 3.5. With the current config, GPT 4 would almost always return that source at the end.
or
2. Reduce the size of the chunks and the number of chat history turns injected into the model. To do that, you have to make sure the change is not so drastic that it degrades the model's behavior, but to some extent you can make the model's input a bit shorter so the model can pick up all the instruction's details.
To put it more simply, think of it this way: GPT 3.5 has a 4096-token limit. If you pass it an input with 3500 tokens, the model will focus more on the beginning and end of the input and start to forget (or ignore) what was said in the middle section. If you pass it an input with 2000 tokens, the model can understand and follow all the instructions nicely without any issue. This is an intrinsic characteristic of all LLMs.
Hope this helps you understand the problem.
Could you please open up the test file? So that I can follow the video tutorial step by step and understand the use of each method, thx!
Which test file?
@@airoundtable Your note files; like the other projects have an explore file, I need the explore file.
Very informative video; I understood all the components involved. Could you please let me know where I should define "OPEN_API_KEY"? Is it in moderation.py, or should this, along with the other parameters, be defined in environment variables? Could you please help me with this?
Thanks! To keep important information like passwords or API keys safe, we put them in a special file called .env in our project. This is a common way to handle private settings. Here's a really easy guide to help you set it up:
1. Make a new file in the main folder of your project and call it .env. Inside this .env file, you can write your private information like this:
```
OPEN_AI_API_KEY=yourapikeyhere
ANOTHER_SECRET_KEY=yoursecretkeyhere
```
2. To use these settings in your code, you'll need to add a couple of lines to tell your program to read the .env file. Here's how you do it:
First, you need to add these lines at the beginning of your Python script:
```
import os
from dotenv import load_dotenv
load_dotenv() # This tells Python to read your .env file
```
3. Then, whenever you need to use the information from the .env file, you can get it like this:
```
openai_api_key = os.getenv("OPEN_AI_API_KEY")
another_secret_key = os.getenv("ANOTHER_SECRET_KEY")
```
Replace OPEN_AI_API_KEY and ANOTHER_SECRET_KEY with the actual names you used in your .env file. Now your code can use your private settings safely!
Thanks a lot, I will try the above method.
How do I register an Azure deployment?
We already have some params in load_openai_cfg(self), like API_KEY, API_BASE, API_VERSION, API_TYPE. I tried to add the parameter API_DEPLOYMENT, but it still errors: No deployment found.
I hope you have already found the answer to your question. But in case you haven't: first check whether you can call your models from a separate notebook, outside the project. After a successful call, then try to add the models to the project. This error was raised because the project could not properly communicate with the model, which is either due to the model name or the credentials that you took from Azure OpenAI.
@@airoundtable Yes already solved. But I ended up using OpenAI instead of Azure. Hi, I want to ask again:
How can we make the chatbot understand the format of documents? For example, I have a document with the format: title, dates, content. Then I upload a new document and want to check whether my new document has the same format as my preprocessed document. I also altered the system prompt in app_config.yml to add the capability for the chatbot to detect typos, but it doesn't work. How can I edit the project? Thanks in advance.
@@coffeepod1 Glad to hear it.
Regarding your new question, I've never done it, but if I wanted to solve it, I would use a hybrid approach: combine Python libraries that can extract document info based on the document structure (headings, subheadings, etc.), and from there either hard-code the checks to make sure the structure is correct or use an LLM to make the judgment for me. Just handing a whole document to an LLM would not be an effective way to achieve your goal.
About the typos: since LLMs work with tokens and not words, it is sometimes hard for them to detect typos (especially when the context length is long). The benefit is that when we interact with them, they don't care if we have typos in our queries; but on the downside, when it comes to fixing typos, there is a chance that they miss the errors. They perform better with smaller context lengths, which is not usually the case for RAG systems.
@@airoundtable I see, we are not OpenAI (or a big AI company); limited on resources indeed. But what about "prompt engineering" with a few shots about the doc structure? Or how about an NER approach, since it is similar to information extraction? About the typos, I didn't really think about the obstacles you explained; I just assumed the GPT model (I use GPT-4o) has the capability to detect typos. Poor me, working on a hard project.
@@coffeepod1 I am not sure about those suggestions; I haven't worked with them. But at this point, I would say the best strategy is to test different approaches quickly and see which one is more effective than the others, then spend time improving that approach.
Bro, it was giving an error about modifying pip.
Open a ticket on the repo if you still couldn't solve the problem. I need to see the error
Hi, could you give me the .env settings you use: openai_api_type, openai_api_base, and openai_api_version? I am struggling to make it work. Thanks. Great video, by the way.
Thank you for your positive feedback @agep13.
Regarding the .env settings, it's important to note that these often contain sensitive information, such as API keys, which should be kept private and not shared openly to ensure security. However, I can certainly guide you on setting up your own .env file.
For the openai_api_type, openai_api_base, and openai_api_version, you'll need to consult the official documentation provided by OpenAI or Azure to determine the correct values. These details are typically available in the API or developer section of your account dashboard.
Here's a basic template for what your .env file might include:
OPENAI_API_KEY=your_unique_api_key
OPENAI_API_TYPE=the_type_of_api_you_are_using
OPENAI_API_BASE=azure_open_ai_endpoint_url
OPENAI_API_VERSION=version_number
Please ensure you replace the placeholders with your actual API key and the appropriate values for your use case. The OPENAI_API_TYPE depends on the API service you are using (e.g. "azure" for the setup in the video), while OPENAI_API_BASE is the endpoint URL of your Azure OpenAI resource and OPENAI_API_VERSION is the API version string you are targeting.
Remember to keep your .env file secure and avoid uploading it to public repositories to prevent any unauthorized use of your API keys.
@dtable Hello, great video. I know the OPENAI_API_KEY is in your OpenAI account, and I thought openai_api_type would be something like gpt-3.5 or 4, but for the rest I'm still confused about where to get the base and version; how do I check which URLs to use?
Hi @@ShadowScales, thanks! Your confusion is understandable because to use GPT models from OpenAI directly, you won't need to insert the endpoint and the api_version. You probably missed that I am using the GPT model from Microsoft Azure; that is why I have 4 credentials for it. In order to understand how to adjust the code to the OpenAI API directly, please read the comments under @mikew2883 down below (it is the comment with the 7 replies); there we had a full discussion on how to properly modify the project. I hope this helps. In case you have any questions along the way, please let me know.
Is the code no longer available?
It is available:
github.com/Farzad-R/LLM-Zero-to-Hundred/tree/master/RAG-GPT
🎉❤
Hello, thanks for the video, but when I try to run the app I get this error:
"openai.error.InvalidRequestError: Invalid URL (POST /v1/engines/gpt-35-turbo/chat/completions)" because I don't have access to the Azure OpenAI API yet; I'm using the OpenAI API. Would you be able to help me with this? Thank you
Hi Arian. Thanks. I just pinned a message at the top of the comment section where I discussed this issue with @mikew2883. Please read that discussion; I provided all the necessary guidance for changing the OpenAI API call from Azure to OpenAI itself. Let me know if you have any other questions.
@@airoundtable Thank you for your reply. I made a few additional adjustments, and now it works, really appreciate your awesome work and the effort you put in. Damet Garm 👊
@@arian2168 happy to hear it. Thanks Arian!