Probably the best video explaining FewShotPromptTemplate and the other prompt templates. Thanks
I am a ML dev in the LLM space and really like your videos. Keep up the great work🙌
You are really dropping in some interesting graphics to the videos to make it more than just you talking to the camera. Nice work!
Amazing video, as always. A great content niche mirroring this that no-one hit on yet is how to test if the output from context is being biased by the underlying model. For example, I'm adding context from Carl Jung's books to the Davinci Model to try and understand his writing better, but many of his ideas are not politically correct and I can't tell when the model is 'self-censoring.' I think this will be a growing problem with pretrained models and many of the more interesting use cases.
Good presentation, James. Thank you. It seems the LLM companies should create a graduated-feed prompt system where you could submit portions of your background prompt, context, examples and question into stages. A session would be started with the initial background prompt, identified as such to the model, and broken into portions so as to not exceed the token limit, and submitting these portions until the entire initial background prompt is presented. Then follow with the context prompt(s), examples and the question. The LLM would remember each stage, so that more, or different context, could be presented anywhere in the session, and the model would interpret it against the initial background prompt. Same with more examples against the context, etc.
Yeah, that would be an interesting idea. If I understand your idea correctly, a problem would be that at the moment LLMs are in essence stateless, so they cannot remember what you fed in a step or two before - everything you want considered must be fed into a single prompt
@@jamesbriggs I assumed some memory was in use, based on my experience with responses. Using the playground with ChatGPT, maybe it is resubmitting the entire conversation each time rather than just my latest comment? Which would work until you reached the maximum token limit?
@@jamesbriggs Probably not enough volume anyway, but here is what OpenAI quotes: "While ChatGPT is able to remember what the user has said earlier in the conversation, there is a limit to how much information it can retain. The model is able to reference up to approximately 3000 words (or 4000 tokens) from the current conversation - any information beyond that is not stored." And at this time, ChatGPT gave me this answer about the token limit: "ChatGPT does not have a maximum token limit for a prompt. The only limit is the amount of memory and compute resources available for the model."
@@georgeallen77 @James Briggs, maybe the context is summarized and then resubmitted. I have noticed in longer conversations with ChatGPT that some parts of the conversation are 'forgotten', maybe because those parts were dropped during the summary.
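The mechanic discussed in this thread - resubmitting the whole conversation each turn and dropping the oldest messages once a token budget is exceeded - can be sketched in plain Python. The 4-characters-per-token estimate and the 4000-token budget are illustrative assumptions, not OpenAI's actual tokenizer or limits:

```python
def estimate_tokens(text: str) -> int:
    # rough heuristic: ~4 characters per token (an assumption, not a real tokenizer)
    return max(1, len(text) // 4)

def build_prompt(history: list[str], new_message: str, token_budget: int = 4000) -> str:
    """Resubmit the conversation each turn, dropping the oldest
    messages once the estimated token count exceeds the budget."""
    messages = history + [new_message]
    # drop from the front until the conversation fits the budget
    while len(messages) > 1 and sum(estimate_tokens(m) for m in messages) > token_budget:
        messages.pop(0)
    return "\n".join(messages)

# 50 long turns, then a new question - the earliest turns get dropped
history = [f"turn {i}: " + "x" * 400 for i in range(50)]
prompt = build_prompt(history, "user: what did I say first?")
```

This would explain the 'forgotten' behaviour: once the budget is hit, the earliest turns are simply no longer in the prompt.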
Thank you James, helped me a lot!
mindblowing stuff. I wish I saw this 2 months ago when I was looking for it.
Insightful as always. Loving the Langchain series. Thanks! 🙏🏻
I've been following your channel for a while, really good work!
that's really cool, thanks :)
Great walkthrough! Thanks! Would love to see more on langchain.
more coming :)
Really helpful
Thanks for reference of this langchan.
you're welcome
For dealing with the context window size limit, can one go hierarchical? One template to index. Other templates for the branches. A common use case might be to load a company's FAQ page, and see if the LLM can handle the Q&A.
I think what you want in this case is retrieval augmentation ua-cam.com/video/rrAChpbwygE/v-deo.html
to be pedantic... prompt is not the only parameter provided to the language model... temperature is quite important too...
Thank you very much, very useful. I wonder how to select huggingface models to try with these?
many thanks for this, super helpful.
I'm not sure why we would supply examples of questions and answers. I mean - LLMs inherently respond to our questions. They don't need to be told to do that. I can understand we might want them to respond in a certain format at times, or with a certain amount of verbiage, but the examples in the video and elsewhere I have seen were not addressing these specific requirements.
Great resource james
thanks :)
Do you think these new types of skills will be in your Udemy course?
Just bought your course btw. Very thrilled to go through it
Like your langchain series very much, James 👍 Question: if you want to add domain-specific content, for example 20k or more words, how do you do this with langchain? Using document loaders? I suppose adding context in the prompt is for giving additional info when needed.
Btw: can u give an example with Bloom?😉
yeah, working on getting something out with Bloom! I haven't covered how to do it in langchain yet, but langchain has "data augmentation" (which everywhere else is called 'long-term memory' or 'retrieval augmentation') - you can use that. I use it in this video (just not with langchain): ua-cam.com/video/rrAChpbwygE/v-deo.html
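For the 20k-word use case above, the retrieval-augmentation idea can be sketched without any embedding model or langchain at all: split the corpus into chunks, score each chunk against the query (plain word overlap stands in for vector similarity here), and stuff the best chunks into the prompt. All function names and the example corpus are illustrative, not langchain's API:

```python
def chunk(text: str, size: int = 50) -> list[str]:
    # naive fixed-size word chunking
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(query: str, passage: str) -> int:
    # word-overlap stand-in for cosine similarity over embeddings
    return len(set(query.lower().split()) & set(passage.lower().split()))

def retrieve(query: str, corpus: str, k: int = 2) -> list[str]:
    chunks = chunk(corpus)
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(query: str, corpus: str) -> str:
    context = "\n---\n".join(retrieve(query, corpus))
    return (f"Answer using the context below.\n\nContext:\n{context}"
            f"\n\nQuestion: {query}\nAnswer:")

corpus = ("the refund policy allows returns within 30 days " * 10 +
          "shipping is free for orders over 50 dollars " * 10)
prompt = build_prompt("what is the refund policy", corpus)
```

In practice you would swap the overlap score for embeddings and a vector database, but the prompt-building step stays the same.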
Your explanation was fantastic and I have a question!
I'm trying to build a chatbot that can extract information from a pandas dataframe. It needs to create filters and operations, which the `create_pandas_dataframe_agent` agent can already do! Nonetheless, the user is not a data expert and may ask a question that is not DIRECTLY a data science task. The question can be formatted using `FewShotPromptTemplate` before being passed to the agent, which lets us create a context and set examples. However, the agent still gets confused and makes mistakes.
I would like to know how I can use a `FewShotPromptTemplate` inside the `create_pandas_dataframe_agent` agent, where I can create a context and, most importantly, pass some code examples. Is it possible?
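What FewShotPromptTemplate produces is ultimately just string assembly (prefix + formatted examples + suffix), so one workaround is to build that string yourself and pass it to the agent as its prefix/instructions. A minimal sketch of the mechanics - the example questions, pandas snippets, and template wording are all made up for illustration, not the agent's actual API:

```python
# hypothetical few-shot examples pairing user questions with pandas code
examples = [
    {"question": "which clients spent the most?",
     "code": "df.groupby('client')['amount'].sum().nlargest(5)"},
    {"question": "average order value per month?",
     "code": "df.resample('M', on='date')['amount'].mean()"},
]

prefix = ("You are a data assistant working with a pandas dataframe `df`.\n"
          "Translate the user's question into pandas code. Examples:\n")
example_template = "Question: {question}\nCode: {code}\n"
suffix = "Question: {question}\nCode:"

def few_shot_prompt(question: str) -> str:
    # prefix + formatted examples + suffix: the same assembly
    # FewShotPromptTemplate performs internally
    body = "\n".join(example_template.format(**ex) for ex in examples)
    return prefix + body + "\n" + suffix.format(question=question)

prompt = few_shot_prompt("who are our top spenders?")
```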
Could you come up with a way to compress the prompts into fewer tokens? Then, as part of the original prompt, tell it how to decode and encode using the mapping. That way you could send far fewer tokens
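The compression idea above can be sketched as a simple substitution map prepended to the prompt as a legend. Two caveats: the legend itself costs tokens, so savings only materialize for long phrases repeated many times, and there's no guarantee the model decodes the abbreviations reliably. The mapping below is purely illustrative:

```python
# illustrative abbreviation map; savings only materialize when the
# abbreviated phrases are long and appear many times in the prompt
MAPPING = {
    "CP": "customer purchase history",
    "RP": "refund policy section",
}

def compress(prompt: str) -> str:
    # prepend a decoding legend, then substitute each phrase
    legend = "; ".join(f"{k} = {v}" for k, v in MAPPING.items())
    body = prompt
    for abbr, phrase in MAPPING.items():
        body = body.replace(phrase, abbr)
    return f"Abbreviations: {legend}.\n{body}"

long_prompt = ("Summarise the customer purchase history and check the "
               "refund policy section. " * 20)
compressed = compress(long_prompt)
```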
A much needed video , thank you so much ❤️
You’re welcome!
Good video 👍
Similarity ExampleSelector, as mentioned last in the video seems very powerful. Has anyone have had succes with this? Would love to hear it!
fantastic video!
Love your tutorials. It seems that your fine-tuning isn't working for me, nor is the example provided in the docs. I had to add ", not truthful and factual" after saying the assistant is always sarcastic and witty in order to get similar results as yours
Thanks, James for another timely video.
In the prompt template is the context used for the follow-up question?
yeah, it's like an additional source of information for the model
Thanks, I'm trying to figure out what component of the prompt to store as metadata in Pinecone. My thought is instructions, few shot questions, and the output indicator. I think I'll need a separate solution for context/ conversation History.
Prompt engineering is the new art! 😅
haha it does feel that way
Great explanation, thank you for this amazing video. But with this template, does OpenAI count it as completion tokens or not?
Because I built something with a few examples using OpenAI directly in my project to get the chat to answer in a certain format. It worked fine, but the problem is it uses a lot of tokens -
they charge me for the example completions as well as the new response from the chat
Thanks for the nice video. Can you please share the source or paper for the idea you mentioned at 2:15?
Very useful info mate.
What do you suggest for beginners with no experience in Python?
I mean, what is the best place to learn Python for LLMs?
it's a very new space - I'd probably recommend learning as much as you can with langchain. Initially prioritize learning by application (rather than theory); it's much more fun and helps you figure out what is actually important when you're building things
@@jamesbriggs Thanks man! I just started the basics of python.. Syntax etc ☺ I will see how I can combine this with langchain.
I am expecting my LLM to return consistent JSON output matching my examples, but I am getting an error. Please reply with how to do that
How to check the accuracy of the model in this case ?
Very interesting thank you !
welcome!
Hi, I have a question. I also want to try this FewShotPromptTemplate, but how do I use it if the model_name is 'turbo'? Many thanks!
the colab notebook always gives an error when I try to run the `print(openai...` part of the code
any solutions?
But we can create this template directly - why do we need langchain? Can anyone explain?
Hey James, thank you for your content, it's been really educating.
I have a question for you.
can a fine-tuned model (for example, an OpenAI one) work with embeddings?
if so, can you say how it can be used? I'm looking into it and I'm not sure if that's the solution to what I'm looking for.
The OpenAI API works with embeddings. Say you have a query and a corpus: you embed both, find the most similar passages, then pass them on to the LLM to get the answer
Hey James, I still don't get what the use of this is... could you share a practical example?
It’s more useful when used with retrieval augmentation or agents - I’ll cover both soon. The idea is that you chain together something like query -> retrieval -> prompt template -> LLM, the result being a single object that consumes a query and then performs all of these steps. Having the prompt template means you have a simple component that fits natively into the chain of components
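That query -> retrieval -> prompt template -> LLM chain can be sketched as plain composed functions. Everything here is illustrative: the `retrieve` lookup stands in for a vector-store query and `fake_llm` stands in for a real model call:

```python
def retrieve(query: str) -> str:
    # stand-in for a vector-store lookup
    docs = {"token limit": "GPT-3 models accept roughly 4k tokens per request."}
    return next((v for k, v in docs.items() if k in query.lower()), "")

def to_prompt(query: str, context: str) -> str:
    # the "prompt template" step
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

def fake_llm(prompt: str) -> str:
    # stand-in for an actual LLM call
    return "Roughly 4k tokens." if "4k tokens" in prompt else "I don't know."

def chain(query: str) -> str:
    # query -> retrieval -> prompt template -> LLM, exposed as one callable
    context = retrieve(query)
    return fake_llm(to_prompt(query, context))

answer = chain("what is the token limit?")
```

The point of the template as a component is visible here: `to_prompt` is the only piece you edit to change how retrieved context and the question are presented to the model.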
There's a constant beep in audio background, otherwise good info!
Yeah I noticed, couldn’t manage to completely remove it - not sure where it came from 🤦♂️
good .
thanks!!!!
I didn't quite get the purpose of this either; I can do the same thing faster by editing the text in Notepad and pasting it into the LLM prompt input box