Been away for a while. It’s nice to be back on your channel and to watch this informative tutorial. Many thanks.
Awesome video! I actually just came to the realization today that I need to name my tools and write their descriptions pretty much exactly as you described. Lots of great info, thank you!
Thank you for your wonderful configurations for learning tools. What I'm building is a set of local tools for automation, with a laptop and cellphone interface, plus local LLM memory for my conversations and code building.
It would be nice if you could demo a custom tool using all the important aspects that you mentioned, and show what the results of applying or not applying them would be.
What is your favorite framework? I'm seeing that "agency swarm" is getting some good traction. Before that, MS Autogen was very promising, but with its limited toolset it was quite dumb. They released a new version recently, which is worth checking. Langflow and N8N also look nice.
Excellent. This is all the agent verification I've been thinking of but haven't had the time to write.
LLMs tend to generate answers even when they don't have the right information. When retrieving data from RAG, if the question is unrelated, they sometimes hallucinate responses. I've seen people build tools to filter and fact-check, but with multiple agents running, they quickly eat up my credits like there's no tomorrow.
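One cheap way to address the off-topic RAG case without spending credits on extra fact-checking agents is to gate the LLM call on a relevance check. The sketch below is only an illustration under loud assumptions: `overlap_score` is a crude word-overlap stand-in for a real similarity measure, the 0.5 threshold is arbitrary, and `generate` is a hypothetical callable that wraps whatever model you use.

```python
import re

def token_set(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_score(question: str, passage: str) -> float:
    """Fraction of question tokens that also appear in the passage."""
    q = token_set(question)
    if not q:
        return 0.0
    return len(q & token_set(passage)) / len(q)

def gate_context(question: str, passages: list[str], min_score: float = 0.5) -> list[str]:
    """Keep only passages that clear the (assumed) relevance threshold."""
    return [p for p in passages if overlap_score(question, p) >= min_score]

def answer_or_refuse(question: str, passages: list[str], generate) -> str:
    """Call the model only when at least one retrieved passage looks relevant."""
    relevant = gate_context(question, passages)
    if not relevant:
        # No LLM call is made here, so an off-topic question costs no tokens.
        return "I don't have information on that."
    return generate(question, relevant)
```

In practice you would likely swap the overlap score for the similarity score your retriever already returns. The point is only that the refusal happens before any tokens are spent.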
So writing some functions is good when programming. Thanks!
How does the agent fare when you give it access to lots of tools? I would assume that it increases the number of errors/hallucinations, but how quickly does performance drop off? Essentially, I'm asking whether you've given it access to all 48 of your tools at the same time.
OpenAI Assistants can hold up to 128 function-calling tools, but I hear they can get confused when they have many tools, though I assume that depends on a lot of things. I was thinking of just storing tools as Python files locally in a database instead of defining them with the actual agent. I really don't see much of a difference, and you can easily add a lot more metadata for choosing the correct tool. Or maybe a combination, with a main tool repo that agents can pull the tools they need from. Lots to figure out. Oh, and my other option: every tool is an agent.
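To make the "tool repo with metadata" idea concrete, here is a minimal sketch of a registry that stores tools alongside tags and hands an agent only the few that match a task. Everything in it (`ToolRegistry`, `ToolEntry`, the tag-overlap scoring, the example tools) is hypothetical and not taken from any framework. A real version might persist entries in a database and rank by embedding similarity instead.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ToolEntry:
    name: str
    description: str
    func: Callable[..., object]
    tags: set[str] = field(default_factory=set)

class ToolRegistry:
    """Central store of tools; agents pull a small, relevant subset."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolEntry] = {}

    def register(self, name: str, description: str, tags=()):
        """Decorator that records a function together with its metadata."""
        def decorator(func):
            self._tools[name] = ToolEntry(name, description, func, set(tags))
            return func
        return decorator

    def select(self, wanted_tags, limit: int = 5) -> list[ToolEntry]:
        """Return up to `limit` tools, ranked by number of matching tags."""
        wanted = set(wanted_tags)
        scored = [(len(t.tags & wanted), t) for t in self._tools.values()]
        matches = [pair for pair in scored if pair[0] > 0]
        matches.sort(key=lambda pair: (-pair[0], pair[1].name))
        return [tool for _, tool in matches[:limit]]

registry = ToolRegistry()

@registry.register("get_weather", "Look up the weather for a city.", tags={"weather", "city"})
def get_weather(city: str) -> str:
    return f"(stub) weather for {city}"

@registry.register("send_email", "Send an email to a recipient.", tags={"email"})
def send_email(to: str, body: str) -> str:
    return f"(stub) sent to {to}"
```

With this shape, `registry.select({"weather"})` hands the agent one tool instead of all of them, which is one way to sidestep the many-tools confusion the comment mentions.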
Thanks for the video. I see you using LangGraph a lot. Do you recommend it for building production-ready agents?
16:27 Yes, indeed! Tools are very important. I'm just not convinced that the framework approach is the best (and only) one.
I am not using LangChain for anything major in production. The framework approach is useful for showing people things rather than writing them from scratch, and I don't want to go giving out prod code at this stage. The tool concepts, though, are pretty much the same no matter what you choose to use.
@samwitteveenai, good to know, and good that you mention it. Over time, I got the impression that these frameworks are the be-all and end-all.
Even a non-framework approach is a framework :)
@AI_Escaped I wrote "these frameworks". And now, do you feel better?
Are you writing evals (evaluation code) for each of those tools separately? Also, you mentioned that you are using LangChain in some of the tools. Do you find it still useful?
I find myself struggling more and more with these abstractions, and most of the time I'm just using simple things like Instructor and building "the framework" around it myself (things like retries on bad outputs, graph-based routing, etc.).
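For readers wondering what "retries on bad outputs" looks like without a framework, here is a rough standard-library sketch. It does not use Instructor's API. `llm` is a hypothetical callable that takes a prompt string and returns text, and the name/age schema in `parse_person` is invented purely for illustration.

```python
import json

def parse_person(raw: str) -> dict:
    """Validate that the reply is a JSON object with a string name and int age."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("'name' must be a string")
    if not isinstance(data.get("age"), int):
        raise ValueError("'age' must be an integer")
    return data

def call_with_retries(llm, prompt: str, max_attempts: int = 3) -> dict:
    """Re-ask the model, feeding back the validation error, until output parses."""
    last_error = None
    for _ in range(max_attempts):
        if last_error is None:
            full_prompt = prompt
        else:
            full_prompt = (
                f"{prompt}\n\nYour previous reply was rejected: {last_error}. "
                "Reply with valid JSON only."
            )
        raw = llm(full_prompt)
        try:
            return parse_person(raw)
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")
```

Feeding the validation error back into the next prompt is the main design choice here. A plain blind retry would also work but gives the model nothing new to correct.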
I can hear you! I feel the same! How much terminology do I need to study before I'm able to do this or that? Believe it or not, I can do lots with Python, and I don't know what to call it. 😂
Yes, this is a totally valid point, and I largely agree. I tend to use LangGraph for prototyping and then streamline anything I want to put into production. Instructor is a cool lib, and I have used it for a few things.
Hey Sam, just curious in regards to **kwargs, why didn't you
1) include *args
2) just use *args instead of **kwargs
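This is not Sam's answer, but one plausible reason is that function-calling models generally hand back tool arguments as a JSON object of named values, which maps directly onto keyword arguments. The small sketch below, with made-up tool names and arguments, shows what each form preserves.

```python
def tool_with_kwargs(**kwargs):
    """Collects named arguments into a dict keyed by parameter name."""
    return kwargs

def tool_with_args(*args):
    """Collects positional arguments into a tuple; the names are lost."""
    return args

# Assumed shape of a tool call: arguments arrive as a JSON-style object.
call_arguments = {"city": "Paris", "units": "metric"}

named = tool_with_kwargs(**call_arguments)
positional = tool_with_args(*call_arguments.values())

print(named)       # {'city': 'Paris', 'units': 'metric'}
print(positional)  # ('Paris', 'metric')
```

With `*args` alone the tool would have to rely on argument order, which a model has no reliable way to know, while `**kwargs` keeps the names attached to the values.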
I have a sense that LangChain and "agents" are simply an anthropomorphized solution in search of a problem. You argue that a custom tool isn't just an API call, but "agents" are merely sequenced, conditional, or looping LLM prompts, possibly with some function calling, so they are essentially just API calls. With Cursor IDE, you can write functions to call new API endpoints in 1-2 requests, and then you have OpenAI's o1 model, which has built-in chain-of-thought and planning capabilities. This raises the question of why you need an agentic approach or LangChain at all.
Agents are basically just smart programs
thanks :)
can you recommend some planning tools?
Thanks, Sam. Any pointers on having agents generate tools?
The thing I would say is that you generally don't want an agent to write its own tools on the fly. You can certainly use things like Cursor and various code generation tools to create the tools and then use them in your agent, but I'm really reluctant to let the agent do that in real time. Agents tend to be too unrestricted and just end up wasting lots of tokens and going into loops of repeating themselves.
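On the loops and wasted tokens, a small guard around the agent loop can at least cap the damage. This is a generic sketch, not something from the video: `LoopGuard` and its limits are hypothetical, and it assumes tool arguments are JSON-serializable.

```python
import json

class LoopGuard:
    """Stops an agent run after too many steps or too many identical calls."""

    def __init__(self, max_steps: int = 10, max_repeats: int = 2) -> None:
        self.max_steps = max_steps
        self.max_repeats = max_repeats
        self.steps = 0
        self.seen: dict[tuple[str, str], int] = {}

    def check(self, tool_name: str, arguments: dict):
        """Return a stop reason, or None if the call may proceed."""
        self.steps += 1
        if self.steps > self.max_steps:
            return "step budget exhausted"
        # Serialize with sorted keys so argument order does not matter.
        key = (tool_name, json.dumps(arguments, sort_keys=True))
        self.seen[key] = self.seen.get(key, 0) + 1
        if self.seen[key] > self.max_repeats:
            return f"repeated call to {tool_name}"
        return None
```

You would call `guard.check(name, args)` before executing each tool call and abort the run, or ask the model to summarize and stop, as soon as it returns a reason.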
@samwitteveenai thank you for your comment. I wasn't considering doing it on the fly. I was thinking of something like a CrewAI crew that is completely focused on building CrewAI tools, following the process of [research, design, build, test, improve], with a huge emphasis on utilizing existing tools (read/write files, web search, etc.) and very strict, small task and agent role definitions.
And yes, I'm considering writing the whole thing with Cursor. Do you think this is doable? Any suggestions to assist the creation of good tests?
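On the testing question, one low-tech starting point is a table of input/expected pairs per tool plus a tiny runner, so that a generated tool can be checked without involving an LLM at all. The sketch below is an assumption-laden illustration: `word_count` is a made-up tool and `run_evals` is a hypothetical helper, not part of CrewAI.

```python
def word_count(text: str) -> int:
    """Hypothetical tool under test: counts whitespace-separated words."""
    return len(text.split())

# Each case is (keyword arguments for the tool, expected return value).
EVAL_CASES = [
    ({"text": "hello world"}, 2),
    ({"text": ""}, 0),
    ({"text": "  spaced   out  words "}, 3),
]

def run_evals(tool, cases):
    """Run every case; return (number passed, list of failure details)."""
    failures = []
    for kwargs, expected in cases:
        try:
            actual = tool(**kwargs)
        except Exception as exc:
            # A crash counts as a failure but does not stop the run.
            failures.append((kwargs, expected, repr(exc)))
            continue
        if actual != expected:
            failures.append((kwargs, expected, actual))
    return len(cases) - len(failures), failures

passed, failures = run_evals(word_count, EVAL_CASES)
print(f"{passed}/{len(EVAL_CASES)} cases passed")
```

Keeping the cases as plain data means a "test" agent could propose new cases while a human reviews them, and the runner itself stays deterministic.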
@samwitteveenai Very true, but in some instances they can work OK. For example, I have a psql agent that can manage databases using psql commands and SQLAlchemy with an InMemoryExecution tool, and it actually works pretty well after some initial training. It is true that they can get confused on occasion and get stuck in loops if they try something they've never tried before, but overall, for general tasks, it's not bad. It all depends on the use case. I can just say, for example, "create a new relational database for this or that with all these tables and fields and fill it with sample data", and it can pull it off no problem. Otherwise, I would have to make a tool for every possible action on a database. In production, though, I would have no choice but to make a tool for each task.
I cannot hear it anymore… Framework here, framework there… What about tool usage without any framework? What about alternative approaches? What about an orchestrator that simply writes a Python script from given functions to return the first layer, calling direct_reply() or whatever tool/workflow I want the LLM to use to generate whatever? Sam, I'm so frustrated with all these so-called frameworks. Mistral-Nemo is able to write instructed code and to avoid out-of-scope code.
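For what it's worth, tool usage without a framework can be quite small. The sketch below is a simpler variant of the idea in the comment above: instead of executing an LLM-written script, it dispatches a single named call. The JSON shape, the `TOOLS` table, and the `add` tool are assumptions for illustration; only the name `direct_reply` comes from the comment.

```python
import json

def direct_reply(text: str) -> str:
    """Fallback 'tool': pass the model's text straight back to the user."""
    return text

def add(a: float, b: float) -> float:
    """Hypothetical example tool."""
    return a + b

# Only functions listed here can ever be called by the model.
TOOLS = {"direct_reply": direct_reply, "add": add}

def dispatch(model_output: str):
    """Run the tool named in a reply like {"tool": ..., "arguments": {...}}."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return direct_reply(model_output)
    name = call.get("tool") if isinstance(call, dict) else None
    if not isinstance(name, str) or name not in TOOLS:
        # Anything that is not a recognized tool call is treated as plain text.
        return direct_reply(model_output)
    return TOOLS[name](**call.get("arguments", {}))
```

Because the model can only name functions in `TOOLS`, this keeps it "in scope" by construction rather than by trusting generated code.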
I am not aware of any agent system that is reliable enough to trust.