💬 Join the Discord Help Server: link.alejandro-ao.com/981ypA
❤ Buy me a coffee (thanks): link.alejandro-ao.com/YR8Fkw
✉ Join the mail list: link.alejandro-ao.com/o6TJUl
Mind blowing, immediately tested on real world data from work . Thanks!
awesome!
@alejandro_ao great video. Thanks. Probably you should extend this to Part 2 to show how we can perform analytics (Predictive etc) on csvs.
Agree about the agent error. Something must have changed under the hood, as neither the video nor the GitHub code works now. No fault of yours, since this stuff changes frequently right now. Hopefully there will be a fix / update. Thanks and keep up the great work.
Hi, thanks for the tutorial, but I have a question: I got an error on create_csv_agent. It expects a string or string[], but I have the result of st.file_uploader(), which does not seem to be a string. Did I miss anything? Thanks.
yeah same error i got, if u managed to figure it out pls share the solution
raise ValueError(f"Expected str or list, got {type(path)}")
ValueError: Expected str or list, got
Help, how to fix this error?
got any solution ?
Thanks for this Amazing tutorial!
I have a question, I've created a pandas agent and I'm trying to add memory, but to no avail.
how can I create memory in the pandas agent?
Are there absolutely no videos that uses huggingface instead of OpenAI?
Would you please educate us on the update of this project? The langchain tools has been removed
I'm not very familiar with the models of langchain. Could you create a video introducing these models?
It says you exceeded your current quota, please check your plan and billing details
❤❤❤
Hi bro How can I train it further through code if I need to
it was good, in some cases we need intermediate results like Observation and thoughts to be printed, can we get that
hey there, absolutely. you can always set the 'verbose' argument to True. this will print all the reasoning in the console. just set `verbose=True` when creating your agent
Yes thanks, sorted out, for that we need to use return_intermediate_steps = True, then we can print
Hello, I am wondering about something: when we use a csv agent, we don't need to use embeddings, a vector database or memory? I am currently confused
does anyone know how to add memory to the csv agent?
If we upload a massive file with 20000+ rows will this work correctly or will we get size error?
tldr: add "always limit results to less than 300" to the first prompt so you never get too many rows from your sql query.
so there are 2 main steps in this process. i see only a problem with the second step:
step 1. send the schema of the db to the llm alongside the natural language query in order to get a sql query. no problem here since we are sending only the info about the columns. it might be a problem if you had thousands of columns, but rows are irrelevant here.
step 2. execute the query and send results to LLM for interpretation. here it might be a problem since you might get thousands of rows from your query. so i would just add an additional instruction to the previous prompt, something like "always limit results to less than 300" or something like that.
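The row-limit idea in the comment above can also be enforced in code rather than only in the prompt. This is a minimal sketch using the standard-library `sqlite3` module; the helper name `enforce_row_limit`, the `customers` table and the 300-row cap are assumptions made for illustration, not part of the tutorial's code.

```python
import sqlite3


def enforce_row_limit(sql: str, max_rows: int = 300) -> str:
    """Wrap a generated SELECT so it cannot return more than max_rows rows.

    Hypothetical helper: it assumes `sql` is a single SELECT statement.
    """
    inner = sql.strip().rstrip(";").strip()
    return f"SELECT * FROM ({inner}) AS limited LIMIT {int(max_rows)}"


# Demo database with more rows than we want to send back to the LLM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, balance REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1000)],
)

# Pretend this string came back from the language model in step 1.
generated_sql = "SELECT id, balance FROM customers ORDER BY balance DESC;"

# Step 2 then only ever sees a capped result set.
rows = conn.execute(enforce_row_limit(generated_sql)).fetchall()
```

Capping in code is a safety net: the model may ignore a prompt instruction, but it cannot bypass the wrapper.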
is it possible that if i want to translate my csv file into another language can you guide ?
For some reason, when I tried running the code in your video I got an error about the create_csv_agent function expecting its path parameter to be a string or list but instead receiving an UploadedFile object from Streamlit. I solved it by creating a temporary file and writing the contents of the uploaded file into it with the tempfile library. Just in case anyone else runs into the same error.
I've got the same error. Thanks for the help... It worked! 🙂
im having this same problem, but when i try to pass the tempfile i created to my create_csv_agent function i get an error saying it is expecting a str or list.
@@caassimbah2485 I have this issue too. please update if you happen to solve it.
@@chandrakalagowda3129 I fixed it by creating a tempfile, writing the contents of the csv file to that tempfile and getting the absolute path of that tempfile, which I then pass to the create_csv_agent function. So just make sure you are passing an absolute path to the csv agent or it won’t know where to look. Hope this helps
@@caassimbah2485 Thanks for this info. I will check.
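The tempfile workaround described in this thread could look roughly like the sketch below. The helper name `save_upload_to_temp` is hypothetical; in the Streamlit app the bytes would come from the uploaded file (for example `user_csv.getvalue()`, as another comment in this thread suggests), and here a small byte string stands in for them so the snippet runs on its own.

```python
import os
import tempfile


def save_upload_to_temp(data: bytes, suffix: str = ".csv") -> str:
    """Write uploaded bytes to a temporary file and return its absolute path.

    delete=False keeps the file on disk after the `with` block closes it,
    so the caller is responsible for removing it when done.
    """
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as temp_file:
        temp_file.write(data)
    return os.path.abspath(temp_file.name)


# Stand-in for the bytes of an uploaded CSV file.
csv_bytes = b"name,balance\nada,10\nbob,20\n"
temp_path = save_upload_to_temp(csv_bytes)

# temp_path is a plain string, which is what the error message asks for.
```

The returned string can then be passed wherever a file path is expected; removing the file afterwards with `os.remove(temp_path)` avoids leaving temporary files behind.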
can you do it using huggingface?
or any other open source ?
yes, just initialise another language model in the variable llm
@@alejandro_ao eagerly waiting for pdf QA with huggingface
@@alejandro_ao I did, I used GPT4ALL llm but getting following error: Could not parse LLM output: `Answer to Question 1 in Python Repl AST format.
Great tutorial. 🫡
Just a question, if I use huggingface will the efficiency of the answers differ a lot or will it be the same?
I use langchain (SQLChain / SQLSequential chain) to generate sql queries with the help of ChatGPT API. Though the issue I am facing is 4K token limit. I have a very large database with around 150+ tables. Each table has more than 100 columns. Whenever I request "give me the top 10 customers having the highest balance", it throws a token limit error. All the online videos/articles etc show it's working on a small size of the database but maybe it's not designed for a large database? or Am I missing something?
I had a plan to SQLchain but I saw there is no guarantee of DML so i didn't use it. Any issue you face while working on that? I mean with delete or update etc question?
Yes Kevin, you are missing this: Imagine you have a 1billion records but you just need 10. You ask GPT or any model to write the query. Then you pass the Query to your engine, and then you receive the answer. CHAT GPT doesn't need to see the DB. Just to know the column names in order to write the query from your natural language question
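To make the point above concrete, here is a minimal sketch of building a prompt from the schema only, with no rows. It uses `sqlite3` from the standard library; `describe_schema`, `build_prompt` and the `customers` table are illustrative assumptions and not how LangChain's SQL chains are implemented.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> str:
    """Return one line per table, listing only column names and types."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        cols = ", ".join(f"{col[1]} {col[2]}" for col in columns)
        lines.append(f"{table}({cols})")
    return "\n".join(lines)


def build_prompt(schema: str, question: str) -> str:
    """Combine the schema and the user's question; no table rows are included."""
    return (
        "Given this database schema:\n"
        f"{schema}\n"
        f"Write a SQL query that answers: {question}"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, balance REAL)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 100.0)")

schema = describe_schema(conn)
prompt = build_prompt(schema, "Who are the top 10 customers by balance?")
```

With 150+ tables of 100+ columns each, even the schema alone may exceed a small context window, so in that case only the tables relevant to the question would need to be described.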
Use the gpt3.5 turbo 16k model
btw, you can join my email list to keep in touch and make sure you don't miss any of these tutorials 👇
bit.ly/42QofVK
Hey great video, can you do this for a hugging face model to read a csv file?
hey, thank you for this video. Did you let your API_KEY visible on purpose in the .env file ?
Worth receiving, thanks for all you are doing.
Your pitch and delivery are so cool and calm, and they bind the listeners' vibes. It removes all the stress in the listeners and helps them learn the way you have learnt things. The concepts you explain and the terms you use are the actual questions that come to listeners' minds, and you hit them right, short and crisp. Just great delivery!!!! ...
Sir, I am a beginner and having an issue, please help
When i call agent = create_csv_agent(OpenAI(temperature=0), "/content/drive/MyDrive/data.csv", verbose=True)
It gives me the error detailed below
ValueError: Prompt missing required variables: {'tool_names', 'tools'}
Please Help ....
The LangChain CSV agent code isn't working. I searched extensively for a solution but couldn't find one. What should I do to fix it?
Keeps giving me errors:
ValueError: Expected str or list, got
got any solution?
@@gaganmehta7810 create_csv_agent(OpenAI(temperature=0), csv_file.name, verbose=True)
I am new to the LLM and python both. I am getting an exception of API key is not set. What do I do. I think I am missing to set it somewhere. I created secret key and pasted but I think we should declare the variable which isn't talked here
It is showing the error
ValueError: Expected str or list, got
Hey can you make a similar app using crewAI?
facing problems with the requirement.txt file unable to install some library from it
errors with the requirement.txt file unable to install some library from it
Hola Alejandro, your video style is spot on, so calm and concise, thank you very much!
thank you man! i really appreciate it
really helpful !!! could you please tell me how can I load and chat with pdf that has tabular data (text-tables)
Hi
Can you do a tutorial about the pdf QA with huggingface open source models
or can you recommend any tutorial or article please
That is what I want.
@@donghyeoklee3615 hope he answers
if the world doesn’t end before, it should be coming out next week!
@@alejandro_ao bro you are the best ever
But please do not forget to do the same system that you take the input from the user like the streamlit app you have done before
awesome video Alejandro!
I like how you simplify things. 🤓🤓
thank you! you’re so cool
Great video, I am following along. I get the following error when streamlit app is launched:
ModuleNotFoundError: No module named 'langchain'
Traceback:
File "c:\Users\aidab\llmpy\.venv\Lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 535, in _run_script
exec(code, module.__dict__)
File "C:\Users\aidab\llmpy\main.py", line 1, in <module>
from langchain.agents import create_csv_agent
hey there. you probably forgot to install langchain before importing it. just go to the terminal and run `pip install langchain`
Normally the document which needs to be processed is composed with text and table, then how this document can be used?
@Alejandro - You are awesome and made my day with this. Is it possible to do a video for Ask CSV with memory.
hey there, i am making an update on this video very soon and it will include memory! stay tuned!!
Can the same be done by uploading "multiple csv" files at once, such that when a question is asked, all the input documents are analyzed before returning an answer?
sure thing, we would have to update the agent a little bit though. i will be making a video on custom agents very soon!
temperature is not creativity. it says in the docs that this is not the correct way to think about it. great video still
Can you share the link to the docs that you mention, please?
"Creativity" is probably not the perfect word, but it makes it easier for beginners to understand. According to OpenAI:
"For temperature, higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic."
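One common way to picture what the quoted sentence means is softmax with temperature: the model's scores are divided by the temperature before being turned into probabilities. The sketch below illustrates that general idea only; the scores are made up, and how OpenAI applies temperature internally is not stated in this thread.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # A temperature of 0 is usually treated as "always pick the top token";
        # this sketch simply rejects it to avoid dividing by zero.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens.
logits = [2.0, 1.0, 0.5]

focused = softmax_with_temperature(logits, 0.2)      # low temperature
more_random = softmax_with_temperature(logits, 0.8)  # higher temperature
```

At the lower temperature almost all of the probability goes to the top-scoring token, while at the higher temperature the other tokens keep a meaningful share, which is the "more random" behaviour the quote describes.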
"One question: If you have a CSV file with 1000 rows and 10 columns, and it contains more words, does that mean OpenAI will consume more tokens?
What is the cost of one request? I see that your CSV file contains more than 500 rows and 20 columns. How much would OpenAI charge for processing this?"
not necessarily, the language model is only used to “think” like “i have to do this”, etc. the whole data manipulation is done in your computer (or your server if your app is deployed).
but beware because manipulating bigger csv files can be really resource intensive for your server or your pc.
Do we need to upload documents every time we run program?
Thanks for liking my comment but it was also a question :-) Do we need to upload all the documents everytime ? How can I store documents once and keep asking questions from them ?
I need help building a machine model my brothers.
you’re in the right place 😎
How to scale up a solution like this , containing hundreds of csvs? User might want to query, whose answer will be located across multiple CSVs , then the call to LLM will get an exception regarding too much token use, beyond permissible limit.
how to solve this scaling up problem ? Using a RAG maybe ?
I don't know how you managed to run this, because I'm stuck in a dependency loop. When i tried installing streamlit, there was an error stating "Failed to build wheels for pyarrow", then i checked that pyarrow is compatible with python 3.8 at most, so i changed the python version and tried it again. and now there's a new error:
ValidationError: 1 validation error for PythonAstREPLTool __root__ This tool relies on Python 3.9 or higher (as it uses new functionality in the `ast` module, you have Python version: 3.8.2.
So I'm kinda lost here as to what to do now?
And I'm not running 32 bit python. I already checked that. I'm using VSCODE 1.83.1 in windows 11
I wanna know if this approach will work if my requirements is as follows:
1. give the model idea about my CSV files.
2. Input a description to the model, and ask to extract a particular information from the description.
3. use that extracted information and find its answer from the CSV files.
can it do that?
This is totally possible. You will need to create your own agent to make this happen. Maybe I will make a video about that if people are interested in that!
@@alejandro_ao thank you so much for your reply. I think in the real scenario, people for sure will have interest in it. Please do make a video in it. I am working on it but I am having difficulties. Thank you so much
I am not sure why my openAI_API_KEY is not set when I have already declared the key in the environment variable though. And because of this when i run, nothing appears on the localhost. Help?
hey there. have you installed `python-dotenv` and run `load_dotenv()` before using openai models?
@@alejandro_ao Yes I did - but it is still showing up as empty on the local host. Not sure where I'm going wrong..
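When the key still shows up as empty, it can help to fail loudly at startup instead of letting the page render blank. This is a minimal sketch using only `os.environ`; the helper name `require_env` and the demo variable `DEMO_API_KEY` are assumptions for illustration. In the app, the call to `load_dotenv()` mentioned above would come before this check, and the name to check would be `OPENAI_API_KEY`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set or is empty. Check that the .env file is in the "
            "folder you launch the app from and that it was loaded first."
        )
    return value


# Demo with a placeholder variable so the snippet runs without a real key.
os.environ["DEMO_API_KEY"] = "placeholder-value"
key = require_env("DEMO_API_KEY")
```

A frequent cause of an "empty" key is launching the app from a different working directory than the one containing the `.env` file, which this check surfaces immediately.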
Hey Alejandro, I'm not able generate the graphs when passed the question as "Plot the bar graph between column A and column B", there is no graph reflecting. What to do?
Hi Alejandro! Thank you so much for your video, and making the tutorial very easy to follow! I just had one question, if our excel sheet has merged cells, how do you think we can workaround or deal with it ? Thanks again!
I am getting an error:
ValueError: Expected str or list, got
what should I do😥
I really hate the example of "who is Leonardo Dicaprio's girlfriend", can't you think of something more useful that this baseline approach that is seen everywhere?
I'm watching all your videos, but I haven't seen one that teaches step-by-step fine-tuning of a model like GPT2 or T5 that doesn't need an official API. Could you please teach that? I want to develop a project, but it seems very difficult and I'm getting poor results. If you can, I would be grateful.
Hey Alejandro, i know this is an old video but i have a question regarding this.
I asked the agent to create a visualization based on the data, e.g. a histogram. And the histogram showed up as part of the thought process of the agent.
But the final output only told me the relation between the columns in natural language.
My question is: Is there a way to extract the graphs it creates during its thought process? Or maybe extract some specific info from the thought process of the agent?
If anyone can help me it would be great.
hey there mate. sure there is. but i guess i would need more info. what model are you using to do this?
@alejandro_ao hey, I am using groq llama 70b model for it.
I was hoping to create a natural language to csv graph visualization agent. Something similar to the sql chatbot you created
@@delgrave4786 that sounds great. is your code on gh?
This was very helpful to me. However, when I try to load csv files with "weird characters", I got a "UnicodeDecodeError". I think It's because It's not using utf-8 encoding, but I tried to manually change It to utf-8 while saving the file and It continues to give me an error. I will appreciate any help, thank you guys :)
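A `UnicodeDecodeError` like this usually means the file's bytes are not in the encoding the reader assumes. The sketch below tries a short list of encodings in order; the helper name `read_text_with_fallback` and the chosen list are assumptions, and the last entry (`latin-1`) never fails, so it can silently mis-decode characters. pandas' `read_csv` accepts an `encoding` argument; whether the CSV agent in a given LangChain version forwards it is something to check in its documentation.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8-sig", "cp1252", "latin-1")):
    """Decode a file with the first encoding that works; return (text, encoding)."""
    with open(path, "rb") as fh:
        raw = fh.read()
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError(f"none of {encodings} could decode {path}")


# Demo: a CSV saved in Windows-1252, which is not valid UTF-8.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write("name,city\nJosé,Málaga\n".encode("cp1252"))
demo_path = tmp.name

text, used_encoding = read_text_with_fallback(demo_path)
os.remove(demo_path)
```

Once the working encoding is known, the file can be re-saved as UTF-8 or the encoding can be passed explicitly to whatever loads the CSV.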
Hello Alejandro, thanks so much for your effort. I am in a project by deploying GPT API , accessing file (like .csv or relational database) / files, chat with the data. Even though not having finished your whole video yet, but would like to ask you a question. I see with streamlit there is the drag and drop file, are we uploading the file to streamlit server ? I am wondering about the data privacy since our project would be done in the company. Thanks so much. Have a nice day !
Hi Alejandro. Thanks for this video. I have a quick question related to this CSV agent.
Using current implementation that you showed we can get pretty good analysis from CSV.
What if I want to manipulate my CSV based on given prompts.
What if I say, "Drop First 1000 rows" or "Make new column named as TEST_COLUMN having concatenated data from Col1 and Col2"
And then I say save this dataframe or csv
etc etc etc (Any sort of changing and manipulation of data)
How exactly can we do it?
I have explored langchain docs and issues on github but couldn't really found solution to this.
NEED YOU HELP
would be cool to extend this app to multiple csv files
How do we keep asking questions? Continuing the chat down the page with followup questions.
Is it possible to have the same application read both PDFs and CSV’s?
Or do they need to be separate?
Thank you so much for this video. Can you also say how to show all the previous conversational history as well on top of it?
Thanks for the video, but can you please make similar video with open source model, Thanks
would love to see this but using local LLM.
coming up!
can’t wait dude
Thanks for your insight. I searched for a long time until I saw your hands-on clip.
what is the limit of this application, maximum how many rows of data can it analyze ?
Amazing tutorial, very helpful for me as a beginner in this field! Very clear, well-organized, and easy to follow!
this is the first time i ever commented on a video and i just want to say that this has helped me so much. thank you!
how can work with multiple csv file as we work on text or pdf data
How can i run the file on localhost?
i'll make an updated version of this one very soon. keep the notifications on!
Thanks Brother! I love the way you explain in a way anyone can understand easily. cheers. Long way to go.
thank you brother! it means a lot :)
Can you show the structure of your CSV file?
If I upload a bigger file (say 10,000 rows and 20 columns) it charges me more because of all the data taking it as an implicit prompt, do you know if there's a way around so we could work with big files at an acceptable cost? If I upload a file with 100,000 rows x 30 columns the cost per query will be huge.
Thanks again, great video
I have same question? If my understanding is correct LLM responds with instructions to agent on how to proceed and what tools and commands to use. In this case we could just send only headers and few rows to llm. why we need to send whole csv file. big csv file may contain 500k rows and will fail.
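The "send only headers and a few rows" idea from this comment can be sketched with the standard-library `csv` module. The helper name `csv_preview` and the five-row default are assumptions; this only illustrates how small such a preview stays regardless of file size and is not a description of what the LangChain agent sends.

```python
import csv
import io
import itertools


def csv_preview(csv_text: str, n_rows: int = 5) -> str:
    """Return the header plus the first n_rows data rows as CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    # n_rows + 1 because the first row read is the header.
    first_rows = list(itertools.islice(reader, n_rows + 1))
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(first_rows)
    return out.getvalue()


# Demo: a "big" CSV with 10,000 data rows.
big_csv = "id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(10_000))

preview = csv_preview(big_csv)
```

The preview is a few dozen characters long while the full file is tens of thousands, so a prompt built from the preview costs the same whether the file has 500 rows or 500k.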
Can you share how to add memory to csv agent?
My customer asked me. How can evaluate LLMs answer?
I think it probably like classic machine learning system. LLMs must have validation dataset to evaluate accuracy.
Do you have any suggestion?
Great question. LLMs are usually trained in an unsupervised way, which means that there is no validation set (like K-neighbors, for example). We can evaluate and improve its text completions during the fine-tuning stage (where humans rate the answers that the model gives) in order to make the model better respond to our prompts.
But here we are neither training the model nor fine-tuning it. We are just using an already trained one (OpenAI's models in this case) to make it complete the "thoughts" and "actions" of our agent. I am not sure that we have a clear way of systematically evaluating LangChain agents, though!
I hope this helps!
Thanks for the great video! Many people may be expecting your next video "Chat with a database" :)
i know!! i will use an open source model on that one. it is scheduled for next week 🔥
Can we load multiple files into the model at once ?
Thanks for the video! I wanted to build a medical chatbot for a project, so is it possible to use langchain to train a model on the existing medical data to do so?
absolutely! i’ll do some content about that soon :)
Awesome tutorial 👍 Thank you very much 🙏
thank you man
thank _you_!
it would be cool to add plots and memory
Hey @alejandro : This is really cool video man. I liked it very much. Is there anyway to build chatbot without OpenAI ?
hey there! sure thing! When you are using LangChain, all you need to do is initialize your `llm` variable with a different language model provider. You can use any of the ones listed here: python.langchain.com/docs/integrations/chat/
Thank you so much @@alejandro_ao Do you have any video demoing chatbot with any of this chat models ?
Thanks a lot bro. Shouldn't you be hiding your secret key though?
no worries, i deleted it before uploading the video. thanks for the heads up tho!
I think there is a problem with the llama streaming blocking the entire server. I think it can be further improved through multiprocessing.
bro, ty for the way you break this down. very easy to understand and follow. great job
your channel is pure gold !!
Thank you so much to share this content with all of us, cheers.
You made my day. Thank you! I'm glad you find it useful :)
Hey Alejandro, what's up man! Can you please tell me what software you use there to show your diagrams?
sup mate :) i used canva this time but i didn’t really like it. the arrows are weird, i’ll try something different next time, i think
@@alejandro_ao MIRO
had to install a ton of dependencies (classic python) but it eventually worked, thanks!
How did you do? Cant manage to install dotenv..
hey,does it work for complex queries on local computer with 32gb ram?
Thank you
Thank you! ❤
Can't wait to see a txt version !!
Another great vid. Any chance for a video but with google docs (or word documents)? Thanks and keep it up! You da best teacher.
thanks! that's a good idea, might do something related soon :)
@@alejandro_ao Thanks. Loving these vids.
👍👍👍
Grazie. Saludos
prego
to compile for me, i had to use the file name:
if csv_file is not None:
    file_name = csv_file.name
    # llm = ChatOpenAI(temperature=0, model="gpt-4")
    llm = OpenAI(temperature=0)
    agent = create_csv_agent(
        llm,
        file_name,
        verbose=True,
        agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    )
Anyone who had the "Expected str or list, got " error, try adding this after the first if statement:
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as temp_file:
    temp_file.write(user_csv.getvalue())
    temp_file_path = temp_file.name
or just:
create_csv_agent(OpenAI(temperature=0), csv_file.name, verbose=True)
Nicely explained. It would be nice to also show how to prompt multiple questions instead of just one, using only one input
sure thing, streamlit just added a new chat component that makes this super easy
@@alejandro_ao could you maybe do an updated tutorial on it? Also is the chat memory added in this code? this one is showing error I'm stuck. Thank you for your great content though ❣️ Loved it. Keep it up 💪😃
Amazing tutorial mate. Thank you!