Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)

  • Published 29 Sep 2024
  • Colab: colab.research...
    In this video I look at how you can make your own custom tools to do a variety of tasks and how you can use the ChatGPT (gpt-3.5-turbo) API to do it.
    My Links:
    Twitter - / sam_witteveen
    Linkedin - / samwitteveen
    Github:
    github.com/sam...
    github.com/sam...

COMMENTS • 90

  • @jjklin
    @jjklin 1 year ago +1

    Thank you, Sam. This video addressed the issues I encountered with your precise, in-depth knowledge. I practiced with your guidance (turbo_llm and prompt), and it works perfectly. Using a system prompt message to enforce the rules in gpt-3.5-turbo seems to be challenging for many people (including me). Maybe it's worth making a dedicated video on this topic if other viewers have a similar issue. Thanks again, great mentor 🙏

    • @samwitteveenai
      @samwitteveenai  1 year ago

      I do want to make a longer vid about prompting in general at some point. I feel there is a lot of false hype out there about the 'one' or 'best' prompt. I know from people at Google and OpenAI that when they want to get the model to do something, they often have mini competitions internally to work out the best prompts, etc. A lot of it is trial and error.

  • @bingolio
    @bingolio 1 year ago +22

    Excellent, but would love to see FOSS end-to-end using OpenAssistant, GPT4All, etc. models

  • @ruffinator
    @ruffinator 11 months ago

    Hi. I've followed this guide, but I'm instead using a local Llama 2 model. I keep getting an error in the observation:
    "> Entering new AgentExecutor chain...
    Search for the winner of the 2023 Monaco GP.
    Action: search, 2023 Monaco GP winner
    Observation: Invalid Format: Missing 'Action Input:' after 'Action:'
    Thought: I'm sorry, I didn't understand your question. Please provide an input for the action.
    Final Answer: I don't know who won the 2023 Monaco GP."
    Any ideas?
    Edit: This is with langchain 0.0.313, the structure differs slightly
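
    For reference, the parser that raises "Invalid Format: Missing 'Action Input:' after 'Action:'" expects the action name and its input on separate lines, roughly like this (tool name taken from the example above):

    Action: search
    Action Input: 2023 Monaco GP winner

    Smaller local models often merge these onto one line; passing handle_parsing_errors=True to the agent executor lets the format error be fed back to the model so it can retry.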

  • @chandrashekhargoka5321
    @chandrashekhargoka5321 7 months ago +1

    Hi Sam, your videos are great. I have a question: we have tools = [search, random_tool, life_tool] like this, right? Do you know whether all 3 tools get executed in the same order we listed them? If not, how can we make the execution happen in a fixed order?

  • @MarkTarsis
    @MarkTarsis 1 year ago +10

    This is very useful. I threw the agent into a Gradio Chatbot and it works pretty well interactively. I also played with the Jira toolkit on a test project, and while it works it's very buggy. I'd be fairly curious if perhaps base LLaMA or Alpaca might work well for agents since they're not very chatty. I think WizardLM may be interesting to play with as well in an agent role.
    Also, I feel like we sorely need a "prompt database" that allows you to search prompts by model and use case. Figuring out the prompting quirks of every LLM has been my biggest pain.

    • @samwitteveenai
      @samwitteveenai  1 year ago +9

      Currently working on a WizardLM video. Totally agree about the prompt DB idea.

  • @micbab-vg2mu
    @micbab-vg2mu 1 year ago +7

    Great video! Please continue discussing the topic of custom tools; it looks very promising.

  • @vrc5674
    @vrc5674 1 year ago +2

    What's never clear to me in all the explanations of LangChain and tools I've seen (including this video) is exactly how the LLM/LangChain decides what tool to use. It's obfuscated/abstracted away by LangChain (and the ReAct framework, I guess). What I suspect is that somehow the LLM is told on initialization that there are these things called tools, and it's instructed to go through its list of "tools" each time the user sends a new prompt to the LLM. When the user asks something, it determines if a tool matches the prompt (or some portion of the prompt) and constructs a response that is formatted in a way that's consumable by the tool. I'm curious what prompt is given to the LLM during tool setup/initialization and what hidden conversation (if any) is going on between the LangChain Python code and the LLM to determine the appropriate tool.

    • @samwitteveenai
      @samwitteveenai  1 year ago +1

      I show the prompt sent to the LLM in the video, and that shows all of this. LangChain just parses those responses coming back from the LLM. The initialization is all in the prompt; the LLM has no state from call to call.
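
      To see it yourself, you can print the assembled prompt; the attribute path below is the same object that gets overridden with fixed_prompt later in the notebook:

      print(conversational_agent.agent.llm_chain.prompt.template)
      # The template lists each tool's name and description, plus format instructions
      # telling the model to reply with an "action"/"action_input" block when it
      # wants to use a tool. There is no hidden conversation beyond this prompt.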

  • @spoonikle
    @spoonikle 1 year ago +2

    OK, so here is an idea: an agent that identifies what is being talked about and selects the appropriate agent to query. It takes in a prompt and determines its scope of knowledge, then outputs information like engineering:mechanical:machining.
    Then a tool redirects the prompt to each agent in the scope.
    These second agents identify "factual statements" and format domain-specific vector database queries to find the correct context, then output the queries.
    Then the script tools query the database and include the context with the user's prompt for the next agent.
    Finally, a new context-rich and fact-checked prompt is sent to agents to draft the final response.
    Then a second agent pass identifies factual statements in the response and drafts queries to the database.
    A script tool queries the database to check the facts and sends the data to the fact-checking agent, which corrects any errors in the response.
    You may do a second round of checks or not, depending on your budget and requirements for accuracy.
    Then there is a final check, where an agent is asked whether the response meets the requirements of the original prompt, and the final response is delivered.
    For example, we could have a game-facts agent that formats search queries to the wiki for any given game, formats a response to the query, and uses the "facts" in the wiki to rewrite the response with "correct" info.
    This could then be adapted to a corporate compliance AI, where the database it queries is the SOP of the business and the safety and compliance information relevant to a job, for example the electrical code book. You could then talk with this expert agent, which is specifically designed to output factual statements about the latest electrical code or your corporation's new policy without ambiguity, with the context of your current task.
    As you build better specialized workflows, the first agent pass that identifies the scope can start utilizing more and more specialized workflows. The hard part is tuning the individual chains for each task, but once that's done you will have in-scope experts for anything that can be hooked up to a computer.

    • @samwitteveenai
      @samwitteveenai  1 year ago +1

      Yes, things like this are done for commercial applications when there is budget, etc. There are lots of tricks when you can call the LLM many times and have many tools.

  • @shishanliu5851
    @shishanliu5851 2 months ago

    This video really helps me a lot, thanks!
    I have a question. If we bind an LLM with tools, is that a kind of agent?
    I am confused because I am not sure whether a real agent is created through initialize_agent() or create_tool_calling_agent(), or simply by binding an LLM with tools.
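
    For what it's worth, a minimal sketch of the distinction using the newer LangChain APIs (variable names are illustrative; the prompt is the standard tool-calling layout):

    from langchain_openai import ChatOpenAI
    from langchain_core.prompts import ChatPromptTemplate
    from langchain.agents import create_tool_calling_agent, AgentExecutor

    llm = ChatOpenAI(model="gpt-3.5-turbo")

    # Binding only attaches the tool schemas to each request: the model can
    # propose a tool call, but nothing executes it and there is no loop.
    llm_with_tools = llm.bind_tools(tools)

    # An agent adds the loop: call the model, run the chosen tool, feed the
    # result back, and repeat until the model produces a final answer.
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant."),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ])
    agent = create_tool_calling_agent(llm, tools, prompt)
    agent_executor = AgentExecutor(agent=agent, tools=tools)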

  • @kenfink9997
    @kenfink9997 1 year ago +2

    Consistently great videos. Thank you!!! Can you do a video on how to create OpenAI-compatible endpoints so these can work with a user interface like Chatbot-UI (or alternatives you'd recommend?). Thanks!!

  • @vishnuajoshy1588
    @vishnuajoshy1588 6 months ago

    Thanks, it really helps... But for some questions I could see in the log that "{tool_name} is not a valid tool". I used a tool wrapping the Google Serper API. When I tried the same question again after some time, it gave me a good answer. What could be the reason?

  • @gustavopublio3451
    @gustavopublio3451 1 year ago +1

    Thanks for this great introductory video! I have a question: my Jupyter notebook is not always correctly interpreting the OpenAI API response, and I frequently get an "OutputParserException: Could not parse LLM output:", although the output looks good to me. Do you know what I can do to fix that? Is this a LangChain config issue or a Jupyter limitation? Thanks in advance!
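
    This usually comes from LangChain's output parser rather than Jupyter: the model's reply did not exactly match the expected action format. A minimal sketch, assuming the same initialize_agent setup as in the video (tool and LLM names taken from the notebook):

    conversational_agent = initialize_agent(
        agent='chat-conversational-react-description',
        tools=tools,
        llm=turbo_llm,
        memory=memory,
        verbose=True,
        handle_parsing_errors=True,  # send the parsing error back to the model so it can retry
    )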

  • @Ryan-yj4sd
    @Ryan-yj4sd 1 year ago +1

    At 13:34, why did you need to do that twice? It’s already done in initialize_agent, right?

    • @samwitteveenai
      @samwitteveenai  1 year ago

      I did that as we had made a new tool and I wanted to reset the memory and init with everything fresh etc.

  • @stanTrX
    @stanTrX 5 months ago

    You have to define many tools, such as for calculation, getting the current date, etc. These should all be built-in functionalities, IMO.

  • @rajivmehtapy
    @rajivmehtapy 1 year ago +2

    This video makes the concept clear. What is your feedback on using Dolly/Open Assistant/H2OGPT instead of the big corporate models?

    • @samwitteveenai
      @samwitteveenai  1 year ago +1

      The challenge with these is that they usually need to be fine-tuned for doing ReAct, etc.

    • @zknox71
      @zknox71 1 year ago

      Could you let us know what the best option available now is with open-source models?

    • @vicentejavieraguilerayeven14
      @vicentejavieraguilerayeven14 1 year ago

      Interested too. I've been trying for a while with open-source models, but it seems they are not working for MRKL tasks yet.

  • @lucianopacheco2008
    @lucianopacheco2008 1 year ago +1

    Your videos are awesome. I was able to build a Flutter app that works with a Python backend running in Replit, using FastAPI to serve API endpoints. In my app, I can upload a PDF file and chat with it using an agent with memory. It works fine. However, I need to allow multiple users, each one with its own agent and its own memory. I have no idea how to accomplish this. What path would you recommend?

    • @samwitteveenai
      @samwitteveenai  1 year ago +1

      When we do things like this we usually use Firebase; it can handle the authentication, storage and DB.
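
      On the "one agent and memory per user" part, a minimal in-process sketch (persistence via something like Firebase is left out; tool and LLM names assumed from the video's notebook):

      from langchain.memory import ConversationBufferWindowMemory
      from langchain.agents import initialize_agent

      memories = {}  # user_id -> that user's conversation memory

      def get_agent_for(user_id):
          if user_id not in memories:
              memories[user_id] = ConversationBufferWindowMemory(
                  memory_key="chat_history", k=3, return_messages=True
              )
          return initialize_agent(
              agent='chat-conversational-react-description',
              tools=tools,
              llm=turbo_llm,
              memory=memories[user_id],
              verbose=False,
          )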

  • @MuiOmniKing
    @MuiOmniKing 1 year ago +1

    Excellent walkthrough! To the point and easy to understand. When messing with the example tools and the GPT turbo model, I immediately understood what you meant when you said it can be very "chatty". Using that information, I decided to try a parameter within the tool's description: another simple way to improve the output of tools (for example, the meaning-of-life example) is by stating in the description that the model should only respond with the function's return value and no further information or context. So depending on the specific tool being used (as some tools would need context on how the system should act on it), you can make it act on the tool's return value more accurately without needing to provide more context about what the output should look like, which can eat up more token usage.
    Then we end up with an action chain like this:
    > Entering new AgentExecutor chain...
    {
    "action": "Meaning of Life",
    "action_input": "42"
    }
    Observation: The meaning of life is 42 if rounded but is actually 42.17658
    Thought:{
    "action": "Final Answer",
    "action_input": "The meaning of life is 42 if rounded but is actually 42.17658"
    }
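
    A rough sketch of that kind of tool definition (the wording of the description is just one way to express the idea above):

    from langchain.agents import Tool

    def meaning_of_life(_input: str) -> str:
        return "The meaning of life is 42 if rounded but is actually 42.17658"

    life_tool = Tool(
        name="Meaning of Life",
        func=meaning_of_life,
        description=(
            "Useful for when you need to answer questions about the meaning of life. "
            "Respond only with the tool's return value, with no extra commentary."
        ),
    )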

    • @MuiOmniKing
      @MuiOmniKing 1 year ago

      This also ensures we don't have to modify the actual system prompt, allowing us to make it more accurate without changing its "core prompt parameters", which could completely change how the system handles tasks, etc.

  • @ravi4519
    @ravi4519 1 year ago

    Make a custom tool to launch an EC2 instance 😊

  • @hiawoood
    @hiawoood 1 year ago +1

    You are awesome

  • @homejf520
    @homejf520 1 year ago +1

    Very nice video and explanation :)
    But I wonder if there are some hidden prompts going on behind the scenes? E.g. your system prompt doesn't specify anything about an output format, but the LLM responded in JSON format regardless. Is there a way to see the whole prompt-reply chain to and from the LLM?
    For instance, how many tokens does the final prompt have, how many (Thought-Action-Observation) iterations get done, and how many tokens did the whole agent run cost in the end?

    • @samwitteveenai
      @samwitteveenai  1 year ago +1

      You can go through all the prompts in the chain and look at them. You can also use callbacks to count tokens if you are concerned about that.
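
      A minimal sketch of the token-counting callback (the question string is just an example):

      from langchain.callbacks import get_openai_callback

      with get_openai_callback() as cb:
          result = conversational_agent("What is the meaning of life?")

      # Totals across every LLM call the agent made during that run
      print(cb.prompt_tokens, cb.completion_tokens, cb.total_tokens, cb.total_cost)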

  • @israeabdelbar8994
    @israeabdelbar8994 7 months ago

    Very helpful video, thank you very much for sharing it with us.
    However, I'm wondering whether I can apply your approach using internal APIs, for POST/GET requests?
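
    That pattern generally works, since a tool is just a Python function and can call an internal endpoint. A rough sketch (the URL and response handling are placeholders; in practice you would parse and trim the response before returning it):

    import requests
    from langchain.agents import Tool

    def lookup_order(order_id: str) -> str:
        resp = requests.get(f"https://internal.example.com/orders/{order_id.strip()}", timeout=10)
        return resp.text  # whatever is returned becomes the agent's observation

    order_tool = Tool(
        name="Order lookup",
        func=lookup_order,
        description="Look up an internal order by its ID and return its status.",
    )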

  • @shaunpx1
    @shaunpx1 1 year ago

    Awesome video! Say I had a template where I am passing input_variables that I want to add to the fixed_prompt. Is there a way to define a prompt like
    prompt = PromptTemplate(
        input_variables=["utterance", "topic", "question"],
        template=DEFAULT_TEMPLATE_TO_USE_WITH_AGENT,
    )
    so I can use it with the agent?
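
    One workable approach (a sketch, not the only way): format the multi-variable template first, then hand the resulting string to the agent as its single input.

    from langchain.prompts import PromptTemplate

    prompt = PromptTemplate(
        input_variables=["utterance", "topic", "question"],
        template=DEFAULT_TEMPLATE_TO_USE_WITH_AGENT,  # your template string
    )

    filled = prompt.format(utterance=utterance, topic=topic, question=question)
    conversational_agent.run(filled)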

  • @faustoalbers6314
    @faustoalbers6314 1 year ago +1

    Thanks for another great tutorial! Question: How to use a vector database as a tool? So that it only accesses that tool when certain requirements are fulfilled (e.g., a predefined set of information that needs to be provided by the user first through chat interaction).
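
    A rough sketch of exposing a vector store as a tool (building the vector store is omitted; docs_db and turbo_llm are placeholder names). The "only use it when requirements are met" part mostly has to be encoded in the tool description and left to the agent's judgment:

    from langchain.chains import RetrievalQA
    from langchain.agents import Tool

    qa_chain = RetrievalQA.from_chain_type(llm=turbo_llm, retriever=docs_db.as_retriever())

    kb_tool = Tool(
        name="Knowledge base",
        func=qa_chain.run,
        description=(
            "Look up answers in the internal knowledge base. Only use this after the "
            "user has provided their account ID and product name; input should be a "
            "fully formed question."
        ),
    )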

  • @aliwaleed7139
    @aliwaleed7139 11 months ago

    Hello Mr. Sam, thank you for your video, it was very useful. However, when I use an open-source model like GPT-Neo 2.7B, somehow my model mishandles the tasks and fails at acting as an agent, so I wanted to know which of the open-source models would work as an agent, similar to GPT.

  • @MisterZike
    @MisterZike 1 year ago +1

    You're a legend

  • @CintraAI
    @CintraAI 10 months ago

    For those wondering about custom agent prompts, I think there have been some updates since this video.
    This seems to do the trick
    agent_kwargs = {
        'prefix': PREFIX,
        'format_instructions': FORMAT_INSTRUCTIONS,
        'suffix': SUFFIX,
    }
    Look at the source files for what's in each of the fields as default. I only changed the prefix field.
    Also, if anyone knows how to tackle either of these problems, please let me know:
    - Using a different LLM for tool selection vs. tool execution (e.g., summarizing final answer)
    - Creating custom tools with multiple parameters (e.g., not just giving a tool the string as input, but also having other things in the constructor like a variable doc store based on the user using the tool)
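
    On the multi-parameter tool question, a sketch using StructuredTool (this works with the structured/tool-calling agent types rather than the plain string-input ReAct agent; the names and fields are made up):

    from pydantic import BaseModel, Field
    from langchain.tools import StructuredTool

    class SearchDocsInput(BaseModel):
        query: str = Field(description="What to look for")
        doc_store: str = Field(description="Which document store to search")

    def search_docs(query: str, doc_store: str) -> str:
        return f"Searched '{doc_store}' for '{query}'"  # placeholder body

    search_docs_tool = StructuredTool.from_function(
        func=search_docs,
        name="search_docs",
        args_schema=SearchDocsInput,
        description="Search a specific document store.",
    )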

  • @JNET_Reloaded
    @JNET_Reloaded 6 months ago

    How do I use a local Ollama model?
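
    A minimal sketch of swapping in a local model served by Ollama (assumes the Ollama server is running and the model has been pulled, e.g. ollama pull llama3; note that small local models often struggle with the ReAct format):

    from langchain_community.llms import Ollama
    from langchain.agents import initialize_agent

    local_llm = Ollama(model="llama3")

    agent = initialize_agent(
        agent="conversational-react-description",
        tools=tools,
        llm=local_llm,
        memory=memory,
        verbose=True,
        handle_parsing_errors=True,
    )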

  • @picklenickil
    @picklenickil 1 year ago

    How do you get a custom agent to ask questions? Say I want to make an interviewer agent that wants to learn everything about someone. It comes with a predetermined set of questions as a reference, but decides the actual questions depending on how the user responds.

  • @klammer75
    @klammer75 1 year ago +1

    Fucking love this! Tku, you da man!!!🥳🦾

  • @bolivianprince7326
    @bolivianprince7326 1 year ago

    How to set Scrapy as a tool instead of the default SerpApi?

    • @samwitteveenai
      @samwitteveenai  1 year ago

      You will need to write a tool that wraps Scrapy and takes in a URL, etc.

  • @stanTrX
    @stanTrX 5 months ago

    Thanks Sam. For instance, in AutoGen Studio there is no tool description; you just add your tool to the agent and you have to describe the tool in the agent prompt.

  • @Cobryis
    @Cobryis 1 year ago

    How are you not hitting an exception on the LLM output when you get a thought that's not purely JSON? The output parser was failing on that by default before.

  • @li-pingho1441
    @li-pingho1441 8 months ago

    This tutorial is amazing, much better than the official LangChain tutorial.

  • @EngineerFormidable
    @EngineerFormidable 8 months ago

    Excellent video! Thanks for the code explanation, it puts a lot of things in perspective!

  • @shacharlahav
    @shacharlahav 1 year ago

    Hi, thanks for these videos. Is there a way to intercept the agent/tool's flow in order to let the User make a decision about the sources of the information that make up the context? For example, I'd like the Search tool to search the internet, but I want to let the user select from the search results the webpages that should be used for answering the question. I am currently writing my own 'agents' to do this, but I am wondering if I am just reinventing the wheel.

  • @pmshadow
    @pmshadow 1 year ago

    Very, very useful! Thanks a lot for the content!!

  • @cmthimmaiah
    @cmthimmaiah 1 year ago

    Nicely presented, thank you

  • @rajivraghu9857
    @rajivraghu9857 1 year ago

    Excellent.. Very nicely explained the purpose of tools

  • @Ryan-yj4sd
    @Ryan-yj4sd 1 year ago +1

    What does MOL mean?

    • @samwitteveenai
      @samwitteveenai  1 year ago

      I just made it up to represent Meaning Of Life. For that tool we actually didn't need a text input.

  • @augustus6660
    @augustus6660 1 year ago

    Quick question, if anyone knows: around the 12:57 mark, when the conversational_agent is being redefined, there's a parameter in cell 74, line 11 for the system message that receives "fixed_prompt". But then in the cell under that one, prompt.template is manually overridden with "fixed_prompt" as well. Are those two different prompts, or was that maybe just 2 different ways of doing the same thing and, in this case, cell 75 is redundant?

    • @themfu
      @themfu 1 year ago

      Good catch, it looks like cell 75 is redundant.

  • @goforit5
    @goforit5 1 year ago

    Great video - very helpful. I would love to see a video explaining how to make a custom agent that grabs data from an API. I’m looking to connect to our Accounting software API and our HRIS API and have it return business data. New developer here 😊 Thanks for your videos

  • @pareshraut7600
    @pareshraut7600 1 year ago

    Great video Sam! I wanted some help on creating a tool for risk assessments using Bayesian methods. Is there any way I can reach out?

  • @MichaelBomett
    @MichaelBomett 1 year ago

    Thanks for the great video. How would you go about using another agent, like the ZeroShotReact one, as a tool for a conversational agent?

    • @samwitteveenai
      @samwitteveenai  1 year ago

      You should be able to set it up as a separate agent and just call it through a Python tool, etc.
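
      A rough sketch of that wiring (variable names like zero_shot_agent are assumed; it would be built separately with its own tools):

      from langchain.agents import Tool, initialize_agent

      research_tool = Tool(
          name="Research agent",
          func=zero_shot_agent.run,
          description="Useful for multi-step research questions; input is a plain question.",
      )

      conversational_agent = initialize_agent(
          agent='chat-conversational-react-description',
          tools=[research_tool],
          llm=turbo_llm,
          memory=memory,
          verbose=True,
      )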

  • @mohitsarpal9380
    @mohitsarpal9380 1 year ago

    Thanks for this video. Any idea which open-source model is best, and how to fine-tune it on custom data?

    • @samwitteveenai
      @samwitteveenai  1 year ago

      I have a number of vids on doing fine-tuning. As for choosing the base model, they are changing so quickly. Hopefully soon we will have an open-source LLaMA that works well.

  • @samuelkantor8242
    @samuelkantor8242 1 year ago

    This is fantastic, thank you for sharing!

  • @sysadmin9396
    @sysadmin9396 1 year ago

    Can a question to the GPT-3.5 LLM be a tool itself? I'd like my app to use the basic LLM, and if it can't answer, then use the tools.

    • @samwitteveenai
      @samwitteveenai  1 year ago

      You should be able to do this; the challenge will be in how you decide when the first model can't answer. It will often just give a sub-par answer.

    • @sysadmin9396
      @sysadmin9396 1 year ago

      @@samwitteveenai In my mind I see it this way: you feed it a prompt, and if the regular ChatGPT can't answer correctly or take an action, then it feeds the prompt to a tool. The initial prompt can have ChatGPT label whether it can answer or not, maybe with a number; then you use that number to determine whether or not you have to use a tool.

  • @TomanswerAi
    @TomanswerAi 1 year ago

    Excellent video as always, thank you! Would be great to see an example of creating an agent to call a specific API where nothing has been prebuilt already. Or perhaps someone could point me to this being done?

    • @samwitteveenai
      @samwitteveenai  1 year ago

      Any thoughts on what API? I am looking for some examples to show.

    • @TomanswerAi
      @TomanswerAi 1 year ago

      @@samwitteveenai Oh nice, cool that you are 🙂 Wasn't sure if it was a dumb request. Shopify is of particular interest for me personally, and e-commerce platforms in general. Keep up the great work 👍

  • @jacobjoseph4106
    @jacobjoseph4106 1 year ago

    Suppose you want to return just the best tool from the list of tools created. How do you stop the agent from executing after it has produced the "action"?

    • @samwitteveenai
      @samwitteveenai  1 year ago

      It can always choose to not use any of the tools and just use LLM output.

    • @jacobjoseph4106
      @jacobjoseph4106 1 year ago

      @@samwitteveenai Let's take this scenario. There is one source which contains the documentation and another source which is blog articles. Now you wouldn't want to mix answers from both. So based on the question, you would select the best tool.

  • @gavinaren8534
    @gavinaren8534 10 months ago

    great video!

  • @guilhemvalentin6687
    @guilhemvalentin6687 1 year ago

    Great video, thanks for your work!
    Do you know how to define a tool that needs arguments, and that will ask the user for those arguments if they are not provided?
    E.g. the user asks "What is the weather like?", then the assistant should ask "Where?", the user would specify their location, and the assistant would finally give the weather forecast for that location.
    Do you know how to achieve this?

    • @samwitteveenai
      @samwitteveenai  1 year ago

      You could do this in the tool. E.g. if no location came in, its response could be "please tell me which location". Another trick I used for this in the past, in a mobile app, was to pass in the GPS data as metadata and default to that location.

    • @guilhemvalentin6687
      @guilhemvalentin6687 1 year ago

      @@samwitteveenai When I return "please tell me which location" and then give the answer, the agent does not select the "weather" action, so I don't get back into the tool with all the arguments it needs. I could not figure out how to do it.
      I also tried to play with the Human tool for this, but I was not able to make it work the way I want.

    • @samwitteveenai
      @samwitteveenai  1 year ago

      So you could try to trigger another chain inside the tool. Are you using memory? The location thing would need memory to work, not a zero-shot agent.
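
      A sketch of the "ask for the missing argument" idea (the weather lookup itself is a placeholder, and this relies on the agent having conversational memory, as mentioned above, so the follow-up turn can route back to the tool):

      from langchain.agents import Tool

      def get_weather(location: str) -> str:
          location = location.strip()
          if not location or location.lower() in {"none", "unknown"}:
              return "I need a location. Ask the user which city they are in."
          return f"Weather for {location}: sunny, 22°C"  # placeholder result

      weather_tool = Tool(
          name="Weather",
          func=get_weather,
          description=(
              "Get the current weather. Input must be a city name; pass an empty "
              "string if the user has not given one yet."
          ),
      )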

  • @sebastiansanchez4866
    @sebastiansanchez4866 1 year ago

    How would we apply the agents in a friendly UI on websites?

    • @samwitteveenai
      @samwitteveenai  1 year ago +1

      You just have your front end call an API built with FastAPI etc. Currently I am using Google Cloud Functions for this, and it is good for most but not all things.

    • @redfoothedude
      @redfoothedude 1 year ago

      @@samwitteveenai Is there a way to run Python code in a Google Cloud Function, or just JavaScript? I make endpoints on Cloud Functions too; how are you doing it with LangChain? Are you using the JS LangChain?
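
      (Google Cloud Functions does have Python runtimes as well as Node.js.) A minimal FastAPI sketch for serving an agent to a front end, with CORS, auth and agent construction left out:

      from fastapi import FastAPI
      from pydantic import BaseModel

      app = FastAPI()

      class ChatRequest(BaseModel):
          message: str

      @app.post("/chat")
      def chat(req: ChatRequest):
          answer = conversational_agent.run(req.message)  # agent built at startup
          return {"answer": answer}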

  • @koningsbruggen
    @koningsbruggen 1 year ago

    Very convenient

  • @sysadmin9396
    @sysadmin9396 1 year ago

    Is there any benefit to using tools versus plug-ins?

    • @samwitteveenai
      @samwitteveenai  1 year ago +1

      Tools are for LangChain. Do you mean OpenAI Plugins or their plugin format?

    • @sysadmin9396
      @sysadmin9396 1 year ago

      @@samwitteveenai Ah yes, my bad. Reading through the LangChain documentation, I can see that you can use OpenAI plugins, but you can also use tools for similar results. Is there a benefit to using one over the other, or can I use both at the same time?

    • @sysadmin9396
      @sysadmin9396 1 year ago

      @@samwitteveenai Because I know you can use plugins with LangChain agents. So for example, there is a plug-in for Google search, but there's also a tool for it.

  • @toddnedd2138
    @toddnedd2138 1 year ago

    Thank you for the very helpful video and colab. I'd like to suggest an improvement: cleaning the search result and removing any clutter.
    import requests
    from bs4 import BeautifulSoup

    url = ''
    page = requests.get(url=url)
    parsed = BeautifulSoup(page.content, 'html.parser')
    paragraphs = parsed.find_all('p')
    titles = parsed.find_all('title')
    parsedText = ''
    for i in range(len(paragraphs)):
        parsedText += paragraphs[i].get_text(strip=True) + '\n'
        if i < len(titles):
            parsedText += titles[i].get_text(strip=True) + '\n'
    with open('parsedText.txt', 'w', encoding='utf-8') as file:
        file.write(parsedText)
    The next step could be to either split the text with some overlap, or pass it into an LLM directly to create a summary or extract relevant information.
    Do you have any recommendation on which model from the Hugging Face Hub I can use for that?
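
    A sketch of the "split then summarize" step with LangChain (a hosted chat model is shown; a Hugging Face Hub model could be swapped in through its LangChain wrapper instead):

    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.chains.summarize import load_summarize_chain

    splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
    docs = splitter.create_documents([parsedText])

    chain = load_summarize_chain(turbo_llm, chain_type="map_reduce")
    summary = chain.run(docs)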

  • @moussatouhami7567
    @moussatouhami7567 1 year ago +1

    First viewer