You are one of the best explainers ever, out of 50 years of listening to thousands of people trying to explain thousands of things. Also, it's raining and thundering outside and I'm creating this monster; I feel like Dr. Frankenstein.
50 years of listening and learning, I'm sure you have great knowledge
Best comment ever 👌 😅
You've been studying AI for 50 years?!?
Agreed, I've been watching a lot of Tim's videos 😂
If you keep getting timeout errors and happen to be using a somewhat lackluster computer like mine, changing `request_timeout` in these lines
llm = Ollama(model="mistral", request_timeout=3600.0)
...
code_llm = Ollama(model="codellama", request_timeout=3600.0)
to a larger number helped me out (3600.0 is 1 hour in seconds, though for me it usually takes only 10 minutes). Thanks for the tutorial!
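For reference, here is the complete snippet with the import, in case anyone wants to paste it directly. This assumes the llama_index Ollama integration used in the video; `request_timeout` is in seconds:

```python
# request_timeout is in seconds, so 3600.0 = 1 hour
from llama_index.llms.ollama import Ollama

llm = Ollama(model="mistral", request_timeout=3600.0)
code_llm = Ollama(model="codellama", request_timeout=3600.0)
```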
thanks mate!
thanks @alexkraken
Thank you for your comment! This really helps me! I've been stuck for a few hours! Thanks!!!
thank you so much.
What if you adjusted request_timeout to 3600 and it still gets a timeout error? I have also adjusted it to 6000 and still get the ReadTimeout error.
I wanted to express my gratitude for the Python Advanced AI Agent Tutorial - LlamaIndex, Ollama and Multi-LLM! This tutorial has been incredibly helpful in my journey to learn and apply advanced AI techniques in my projects. The clear explanations and step-by-step examples have made it easy for me to understand and implement these powerful tools. Thank you for sharing your knowledge and expertise!
This is clearly a bot-written comment, but why? What's their endgame? So many bots with puzzling intentions
You are by far my favorite tech educator on this platform. Feels like you fill in every gap left by my curriculum and inspire me to go further with my own projects. Thanks for everything!
Thanks to you, now I can create an agent with Ollama and LlamaIndex. I have been working on this topic for a month; it was a real headache. Now it is solved. Thank you very much.
I was really looking forward to learning this. Thanks for the video.
I like Tim. Tim explains the important things concisely without diving into rabbit holes. Tim gets straight to the point, using code, without loud obnoxious music. Tim is an expert. Be like Tim.
Just used your code with Llama 3, made the code generator a function tool, and it was f*cking awesome. Thanks for sharing👍🏻
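In case it helps anyone, here's a rough sketch of what wrapping the code generator as a function tool might look like. This assumes the llama_index FunctionTool API; `generate_code`, the tool name, and the prompt wording are my own hypothetical choices, not from the video:

```python
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama

# the LLM that will do the code generation (use whatever model you pulled with ollama)
code_llm = Ollama(model="llama3", request_timeout=3600.0)

def generate_code(description: str) -> str:
    """Generate code from a plain-English description."""
    return str(code_llm.complete(f"Write only code for this task: {description}"))

# expose the function to the agent as a tool it can decide to call
code_tool = FunctionTool.from_defaults(
    fn=generate_code,
    name="code_generator",
    description="Generates code from a natural-language description",
)
```

You would then pass `code_tool` into the agent's tool list alongside the other tools.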
Some helpful things when going through this:
- Your Python version needs to be below 3.12 (several of the pinned dependencies don't support 3.12 yet; 3.11.x works).
For me, once I installed Xcode, rerunning the package install got the llama_cpp_python wheel to build. Thanks for this note, it helped make sense of the error message.
Yup, I encountered this on Windows. In VS Code, Ctrl+Shift+P opens the command palette; I searched for "interpreter" and was able to access previous versions of Python in different environments. I selected a Conda environment and opened a new terminal. I checked python --version and the selected Python version was active.
I have never found anyone that explains code and concepts as well as you. Thank you for everything you do, it really means a lot♥♥
Excellent demo! I liked seeing it built in VS Code with loops, unlike many demos that are in Jupyter notebooks and can't run this way.
Regarding more demos like this... yes!! Most definitely could learn a lot from more, and more advanced, LlamaIndex agent demos. Would be great to see a demo that uses their chat agent and maintains chat state for follow-up questions. Even more advanced and awesome would be an example where the agent asks a follow-up question if it needs more information to complete a task.
Great video. Would really like to see methods that don't involve reaching out to the cloud and keep everything local.
Wow, this is absolutely mind-blowing, thanks Tim.
Amazing as always, Tim. Thanks for spending the time to walk through this great set of tools. I'm looking forward to trying this out with data tables and PDF articles on parsing these particular data sets to see what comes out the other side. If you want to take this in a different direction, I'd love to see how you would take PDFs on how different parts of a system work and their troubleshooting methodology and then throw functional data at the LLM with errors you might see. I suspect (like other paid LLMs) it could draw some solid conclusions. Cheers!
Great work Tim, you hit the nail on the head. What puts people off is the downloading; putting it all into a requirements file is a great idea.
Thank you for this very informative video. I really like the capabilities of LlamaIndex with PDFs.
I used it to process several of my own medium-size PDFs and it was very quick and correct.
It would be great to have another vid on how to save and reuse the VectorStore for queries
against PDFs already processed. To me this is even more important than the code generation.
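Until there's a video on it, a minimal sketch of persisting and reloading the index might look like this. The paths ./data and ./storage are placeholders, and the same embedding model has to be configured on both runs or the stored vectors won't match:

```python
from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

# first run: parse the PDFs, build the index, and persist it to disk
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir="./storage")

# later runs: reload the saved index instead of re-processing the PDFs
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

print(index.as_query_engine().query("Summarize the documents."))
```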
This is very clear and very instructive, so much valuable information! Thanks for your work
Error 404 not found - localhost - api - chat [FIX]
If anyone else gets an error like that when trying to run the codellama agent, just run the codellama model in the terminal to download it, as it did not download automatically for me, at least, as he says around 29:11.
So, similar to what he showed at the start with Mistral:
ollama run mistral
You can run this in a new terminal to download codellama:
ollama run codellama
thanks a lot !!!!
i love this community ... thanks a lot
@TechWithTim This should be pinned :D
You are my hero, bro. This problem was so f*cking disgusting. Thank you, my honey.
some heroes don't wear capes
This was fascinating, I'm definitely going to be giving it a whirl! I'd love to learn how something like this could be adapted to write articles using information from our own files.
I'm 16 and this is the best video tutorial on LLM agents!
You just want people to say... "oouuu, he is only 16, wow, he has potential"... shit.
No idea what’s going on but I love falling asleep to these videos 😊
I really loved the video, please keep making videos like this.
Bro your videos are gold.
Thank you for this video... it was really informative.
This was brilliant, thank you.
Awesome video, man thx a big bunch!
The way you explain is really good and I understood it. You code line by line; others just copy-paste and don't explain what the code is doing, but you explained everything. Really good content.
Also, can you bring more tutorials using multi-agent CrewAI with this multi local LLM setup? The OpenAI key is very expensive, and all the other channels use that; none do it with a local LLM.
11:20
If your ollama command doesn't work, like mine didn't, you can try reinstalling and then restarting. If not, then try manually adding it to your PATH.
Great vid. The only issue is the fact that the parsing is done externally; for RAGs ingesting sensitive data this would be a major issue.
Yeah, that's probably why it's a free service. They take your clients' sensitive info and train their own AI. Not good.
"If I fix these up." My god, Tim. You know that won't scale.
Awesome 💯
This is awesome.
Awesome 👍
Thanks for this tutorial and your way of explaining, I've been looking for this.
Can you also make a vid on how to build enterprise-grade generative AI with NVIDIA NeMo? That would be so interesting, thanks again.
just awesome!
Great❣
What if I don't want my data to be processed in the cloud? Is there an alternative to LlamaParse that can be run locally?
Your explanation is quite effective. Could you let me know when the next video on a similar topic is scheduled for release?
I keep getting errors when trying to install the dependencies from requirements.txt
Make sure you have the correct version of Python.
Or better, as I prefer, pip install them manually.
It is probably because of your Python version. I had the same errors; it requires a Python version below 3.12. I prefer using 3.11.9.
Yes man... this is what I want to do, and more...
Could you also do a video on infinite world generation using chunks for RPG-type pygame games?
The guys at llmware have some fine-tuned models for RAG and some for function calling (outputting structured data). Could be interesting to try out with this.
Dang, seems I'm stuck with a 404 message @ 31:57.
Anyone else have that issue? Or have a fix for it possibly? Maybe the dependencies need an update already?
Hi Tim, great tutorial. I wanted to ask: I have been having trouble when I use .query. I have been using models like mistral-instruct, mixtral, even llama3.2-instruct, but I am getting an error that they are not conversational models.
You are truly amazing at explaining concepts. It is like you have fully understood it yourself; that is why you can explain it really well. I am trying to get VS Code autocomplete to work on Mac but nothing works. Which extension are you using?
Thanks for this!! Unfortunately I can't run it on my laptop; it takes forever and the AI seems confused. I guess it needs a powerful machine...
I can't install the llama-index packages on my Windows system. Also, the 'guidance' package is showing an error.
Did you find the fix?
Please create a video about production-ready AI agents!
Tim - thanks for the wonderful video. Very well done sir!! Is there an alternative to LlamaParse to keep the parsing local?
pymupdf
Great video tutorial! Thanks 🙌
(liked and subscribed, lol)
A bit of a "noob" developer here, so vids like this really help.
I know it's a lot to ask, but....
I was wondering if you might consider showing us how to build a more modular app, where we have separate `.py` files to ingest and embed our docs, then another to create and/or add embeddings to a vector DB (like Chroma), then another for querying the DB. Would this be possible?
It would be nice to know how to have data from one Python file feed data to another, while also minimizing redundancy (e.g., IF `chroma_db` already exists, the `query.py` file will know to load the db and query with LlamaIndex accordingly)
Even better if you can show us how to make our `query_engine` remember users' prior prompts (during a single session).
Super BONUS POINTS if you can show us how to then feed the `query.py` data into a front-end interface for an interactive chat with a nice UI.
Phew! That was a lot 😂
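For the load-or-create part, a rough sketch of how it could work with Chroma (this assumes the llama-index-vector-stores-chroma integration; the paths and collection name are made up for illustration):

```python
import chromadb
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

db = chromadb.PersistentClient(path="./chroma_db")
collection = db.get_or_create_collection("docs")
vector_store = ChromaVectorStore(chroma_collection=collection)

if collection.count() > 0:
    # the DB already has embeddings, so just attach to it
    index = VectorStoreIndex.from_vector_store(vector_store)
else:
    # first run: ingest, embed, and store the documents
    documents = SimpleDirectoryReader("./data").load_data()
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

print(index.as_query_engine().query("What is in my documents?"))
```

The same embedding model has to be configured on every run, or the query embeddings won't match the stored ones.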
Neat! But why not a multi-agent dev team that evaluates (QA) and iterates on code that fails QA?
Great video tutorial/walk-through. It would be nice to determine the minimum configuration required to run it. I tried the example on a 4-core Xeon Ubuntu laptop, 16GB, with an NVIDIA Quadro M2000M (GM107GLM) / Mesa Intel HD graphics. Sometimes it gave a bunch of errors and I had to do a cold restart. Also, the only difference between an Ollama and a non-Ollama version should be the instantiation of the LLM and embedding model. Am I right?
Nice ❤
You obviously went to the Matthew Berman School of I'll revoke this API Key before publishing this video!
It seems multiple entries in requirements.txt require different versions of Python and other libraries. Could you clarify which versions of what are needed for this to work?
Please show the final product so I can decide whether I have to watch the video or not!
Instead of using a local Ollama model, can we use Gemini with an API key? 😅
Why did I need to downgrade Python 3.12 to 3.11 to be able to install requirements.txt (some dependencies require a version below 3.12), when I see you using Python 3 with no errors?
Guessing you are using Windows? Sometimes you need a different library / adapted library for Windows. It's easier to follow someone who's a Windows developer, but once you get used to the nuances, it's pretty simple.
Or you can just use a WSL2 Ubuntu project, or Docker, and it all works fine.
New subscriber here!!!
Every time I try to install from requirements.txt, it only downloads some of the packages and then I get this error message: Requires-Python >=3.8.1. I'm running this on a Mac with Python 3.12.3 and I can't seem to get the older version of Python installed.
How do we know which requirements.txt dependencies are required (it is a large list)?
Will Mistral Large be available? I'm wondering whether the LLM availability stays up to date or whether there's another step to do.
What's the latency of models running locally?
Thanks for the tutorial. Is there any alternative to LlamaParse that allows me to run the application completely locally?
Good video, but do you have a complete AI agent with your own data, without the code formatting? This is the closest tutorial I've found for an on-premises AI agent implementation that I can understand. Thanks!
I am using Python 3.11.6 on Windows, and I installed the C++ developer tools option, but I'm getting this error:
"Building wheels for collected packages: guidance, llama-cpp-python
Building wheel for guidance (pyproject.toml) ... error
error: subprocess-exited-with-error"
Shall I proceed despite this?
So Ollama runs locally on your machine? Can I make it cloud-based by plugging it into my backend?
Well, I can't get it to work. It gives 404 on /api/chat
I am getting the same error
You are probably getting this error because you are missing the codellama model. Run `ollama pull codellama` and it should fix it.
Can I use this to make an AI agent that can call customers, interact with them, and take notes on what happens? Thanks!
If you use LlamaCloud and an API key, don't you need internet? So it isn't fully local?
I liked this. Out of curiosity, why venv rather than Conda?
I can't move past this error "No module named 'llama_index.llms.ollama'". I have tried to uninstall and install llama_index and I have also downgraded python version. Did anyone else run into this?
Anyone here to help??? I am also stuck at the same point.
@@anishkoirala8532 You may need to `pip install llama-index-llms-ollama`, if you don't have it.
When I'm using llama3.1, my LLM response gets stuck in a loop of action, observation, action, observation. What to do?
This is a limitation of llama3.1 and all the Llama models.
Once we pass in tools, it's always ready to call tools, even when the query doesn't need them.
Can anyone tell me what prerequisites to learn before this?
LlamaParse does that non-locally, which means it can't be used for enterprise. Is there any way to do this fully locally?
Hey, can you share the system configuration needed to run this application?
What am I doing wrong? Because when I run it, it does not work no matter what I try.
What are your MacBook Pro specs? I'm looking for a new computer to run llm locally.
Buy a workstation with a very good NVIDIA GPU so you can use CUDA. If you still want to go for a MacBook Pro, get the M2 with 32GB or 64GB of RAM. I'm using a 16" MacBook M1 with 16GB of RAM and I can only run 7B-13B LLMs without crashing it.
I have an M2 Max
Have you ever thought about using Colab as a remote web server with a local LLM such as Llama 3, and calling it from your PC to get predictions? I have the same problem and was thinking about solving it like this.
My MacBook Pro M1 8GB hangs while running the LLM locally. Any alternatives we can learn to build with, without killing my MacBook?
Is there much difference between result_type = "Markdown" and result_type = "text"?
Nice one
Guys, if there is some kind of problem installing the modules because of the pinned versions, you can remove the specified versions and then try installing again....
This is for slackers like me:
create a Python program to delete the version text after the == (ask ChatGPT if you can't).
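Something like this little script should do it (it writes a new file rather than overwriting the original, just to be safe; the output filename is my own choice):

```python
# strip version pins (everything from "==" onward) out of requirements.txt
with open("requirements.txt") as f:
    names = [line.split("==")[0].strip() for line in f if line.strip()]

with open("requirements-unpinned.txt", "w") as f:
    f.write("\n".join(names) + "\n")
```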
❤❤❤❤❤❤
I'm getting the following when I run the prompt:
Error occured, retry #1: timed out
Error occured, retry #2: timed out
Error occured, retry #3: timed out
Unable to process request, try again...
What is this timing out on?
Your Agent is unable to reach your Ollama server. It's repeatedly trying to query your Ollama server's API on localhost, then those requests are timing out. Check if your Ollama LLM is initializing correctly. Also make sure your Agent constructor contains the correct LLM argument.
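A quick way to sanity-check this before building the agent (a sketch assuming Ollama's default port, 11434, and the llama_index Ollama integration from the video):

```python
import requests
from llama_index.llms.ollama import Ollama

# the bare root endpoint should answer "Ollama is running" if the server is up
resp = requests.get("http://localhost:11434")
print(resp.status_code, resp.text)

# then a one-off completion to confirm the model itself responds
llm = Ollama(model="mistral", request_timeout=3600.0)
print(llm.complete("Say hello"))
```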
Do you have a VPN like NordVPN running? Sometimes that can mess up local servers.
change the request_timeout to a bigger value, like 3600.0
Another great tutorial... thank you! How do I get in touch with you, Tim, for consulting?
Send an email to the address listed on my About page on YouTube.
Cool!
what keyboard are you using? 😊
nice
Can anyone tell me what basic things one should know before going into this video?
Is it possible to create an agent using other languages?
I did one using Llama2.
are you sharing it somewhere?
Hi Tim!
GREAT JOB on pretty much everything!
BUT, I have a problem.
I'm running on Windows with PyCharm and it shows me an error when installing the requirements.
Because it's PyCharm, I have two options for installing the requirements: one from within PyCharm and one from the terminal.
FIRST ERROR (when I install through PyCharm):
In both options I'm seeing an error (similar, but not exactly the same).
Can you please help me with it?
You can check which Python version you have installed.
@@diegoromo4819 hey, thank you for your response! Which version should I have? I can't find it in the video.
@@ofeksh 3.11
@@neilpayne8244 shit, that's my version...
Problems?
# make sure the LLM is listening
`pip install llama-index qdrant_client torch transformers` `pip install llama-index-llms-ollama`
# didn't download codellama
`ollama pull codellama`
# timeout error
set request_timeout to 500.
How to do the same thing for .csv data? Someone please help.
Can I do the same using LangChain?
What's the minimum laptop needed to run this model? Thanks!
You need a good GPU to run literally any LLM.
How to handle multiple PDFs at a time, where the PDFs contain drawings?
But this is not completely "local" since you need an API key, no?
These APIs are used within the same environment or system, enabling different software components or applications to communicate with each other locally without the need to go through a network.
This is common in software libraries, operating systems, or applications where different modules or plugins need to interact.
Local APIs are accessed directly by the program without the latency or the overhead associated with network communications.
Can you please do a video about making a GUI in Python?
It's great, but how can we manage chat history?
Can you make a series?