Thank you so so much James. It seems like every time I have a new question along my journey you post a video about that exact topic.
Great content. What local model would be a good replacement for OpenAI's text-embedding-ada-002 embeddings?
This used about $3 of my $5 of free credit for OpenAI.
If you have a half-decent graphics card, I'd recommend using a BERT-based model for the embeddings instead. My 1060 crunched the data in ~15 minutes. It required some slight modification of the code, but nothing crazy.
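For reference, here is a rough sketch of that kind of modification using the sentence-transformers library (the model name, batch size, and sample texts are my own illustrative choices, not the code from the video):

# Local BERT-style embeddings on the GPU instead of OpenAI's API
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2", device="cuda")  # any sentence-transformers model works

texts = ["first chunk of text", "second chunk of text"]
embeddings = model.encode(texts, batch_size=64, show_progress_bar=True)
print(embeddings.shape)  # (2, 384) -- the index dimension must match this, not 1536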
Fantastic videos, expertly put together and very informative. It's a pity that things change so fast and break if one tries to replicate the code more than a few months later. Some backward compatibility should become standard practice for library maintainers... Even though all the installs completed successfully, the code bombs out on the first statement, 'load_dataset(...'
Your videos are fantastic! I love the way you break down these topics into easy-to-understand pieces and make them accessible to people from all walks of life.
I was wondering if you've ever explored the capabilities of large language models when it comes to handling tabular data? It would be really interesting to see a video from you exploring how this can be used in that context.
I've been working on this a bit recently, they're pretty good at it - could be an interesting video, thanks for the idea!
Extracting tabular data works well. I have been working with tabular data, and OpenAI amazes me.
Hello, can you make a video on how we can use HuggingFace instruct embeddings for Information Retrieval and Q&A?
Thanks for helping us understand. Can I use a PDF document, and is this cost-effective when put into production for the public?
I have learned so much from you, James, especially with regard to LangChain. Thanks for your time indeed.
Thank you for these videos! Just learning Pinecone and having a little trouble creating a domain-specific DB where the dimensions are predefined. Any chance of a video on that process? Thanks again for making everything so easy to follow and providing the colabs.
You're welcome, I'm glad to hear it helps! I think a Pinecone 101 could be a good video, I'll see if I can put it together soon :)
Great shirts in each video! And thanks for the great series!
Amazing content James, what Python IDE did you use to run this tutorial? I am new to Python development.
I'd like to see a tool from LangChain that highlights the relevant parts of a cited URL, or (less conveniently) quotes the relevant section of the citation. Too many times, I read through the cited source and find the LLM has misconstrued the source material (often confirming a statement where the source material negated it). Showing the passage the LLM thinks is important would make it much easier to sanity check.
I wonder whether LangChain includes all of the chunks related to the query (in your example, 3 items) when prompting OpenAI, and how it does this internally.
I've been testing Bard and GPT-4 (ChatGPT). Both are pretty good at riffing at the "general idea" level. But when it comes to specifics, such as URLs, citations, or even specific facts, the failure rate goes up.
(Bing and Bard are both tied to the internet. Bing is best with citations, but its responses are a bit literal, sticking close to the document, so it's no longer as appealing. Bard is more conversational, but it makes mistakes that, I think, shouldn't happen given it has access to the internet.)
The problem with LLMs (by themselves) is that they are not architecturally geared to be precise when it comes to specific facts. They are based on statistical associations between language elements. What allows them to synthesize a coherent natural-language response on the fly is also the basis for "hallucination". Hence, one needs an external source as a more reliable reference. (The same can be said about human beings. What saves us is fact checking, the scientific method, pushback from others, our senses, etc.)
Thank you James. This video is impressive. Just a question that might be the source of a new video. Let's say I want to make a chatbot that has multiple data sources. Say one source gives the résumés of all employees in the organisation and another returns all quality procedures. How would you do it? Put everything in a single vector store and let the retriever find the best content? Or, alternatively, could you have a chain of specialized retrievers with descriptions such as "Use this retriever whenever you are searching for an employee that has specific skills" or "Use this retriever when you have questions about quality procedures"?
How would you manage that?
You can use an agent with access to multiple vector DBs, or with access to multiple tools that route to the same vector DB but apply different metadata filters. I have a video explaining how to do this coming in ~3 hours, and you can find the code already here: github.com/pinecone-io/examples/blob/master/generation/langchain/handbook/08-langchain-retrieval-agent.ipynb :)
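For anyone who wants the gist before watching, a rough sketch of the multiple-tools approach (not the exact notebook code; resumes_store and procedures_store stand in for vector stores you have already built):

from langchain.agents import Tool, initialize_agent
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

# resumes_store and procedures_store are placeholders for existing vector stores
resume_qa = RetrievalQA.from_chain_type(llm=llm, retriever=resumes_store.as_retriever())
procedure_qa = RetrievalQA.from_chain_type(llm=llm, retriever=procedures_store.as_retriever())

tools = [
    Tool(name="Employee Resumes", func=resume_qa.run,
         description="Use this when searching for an employee with specific skills"),
    Tool(name="Quality Procedures", func=procedure_qa.run,
         description="Use this for questions about quality procedures"),
]

agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("Which employees have experience with ISO 9001 audits?")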
Awesome video thank you 🎉❤
What is the tool you are using to build your guides? It looks awesome too 😅😮
What do you think about adding a zero-shot answer from the LLM to the query to improve the quality of the embeddings retrieved? This would follow the HyDE pattern described by Luyu Gao et al. but I'm not sure if it's just a waste of tokens when dealing with OpenAI's models.
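In case it helps picture the idea, a minimal sketch of HyDE-style retrieval (using the pre-1.0 openai Python client with an API key already configured; the prompt wording and function name are just illustrations). The trade-off is exactly the one mentioned: one extra completion call per query.

import openai

def hyde_query_embedding(question: str) -> list[float]:
    # 1) have the LLM write a short hypothetical answer (zero-shot)
    draft = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Write a short passage answering: {question}"}],
    )["choices"][0]["message"]["content"]
    # 2) embed the hypothetical answer instead of the raw question
    res = openai.Embedding.create(input=[draft], model="text-embedding-ada-002")
    return res["data"][0]["embedding"]

# query_vector = hyde_query_embedding("Who founded the city of Kyiv?")
# ...then pass query_vector to the vector index query as usual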
Is it possible to do this with Claude by Anthropic? There's general support for Anthropic, but I do not see any Anthropic embeddings in the documentation
Your videos are awesome. I was able to build a Flutter app that works with a Python backend running in Replit, using FastAPI to serve the API endpoints. In my app, I can upload a PDF file and chat with it using an agent with memory. It works fine. However, I need to allow multiple users, each one with its own agent and its own memory. I have no idea how to accomplish this. What path would you recommend?
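Not the only way to do it, but one sketch of the idea: keep one chain/memory per user, keyed by a user or session id (the in-process dict here is illustrative; in production you would persist the history, e.g. in Redis or a database, so it survives restarts and multiple workers):

from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0.0)
sessions: dict[str, ConversationChain] = {}

def get_chain(user_id: str) -> ConversationChain:
    # one independent conversation memory per user
    if user_id not in sessions:
        sessions[user_id] = ConversationChain(llm=llm, memory=ConversationBufferMemory())
    return sessions[user_id]

# inside a FastAPI endpoint, roughly:
# answer = get_chain(current_user_id).predict(input=user_message)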
Hi, I am also working on question answering with PDFs. What is the ideal chunk size for long text in a PDF? I am a bit confused about it.
Hi James, there are so many different embeddings in LangChain. Could you elaborate on how to choose the right embeddings and the right model? Especially when we don't want to use OpenAI the whole time.
Thanks and best regards
Love your videos! I was asking ChatGPT about source/parametric knowledge and it said: source knowledge is all the training data, parametric knowledge is the resulting weights/biases of the model based on that training data, and prompts are neither, rather just inputs to the model used to generate a response. Not sure whether this or your explanation is correct, just thought I'd share.
Great lectures. 🎉
Your content is amazing. Looking forward to more, hopefully.
As I'm going through LangChain I'm wondering: what are the upsides/downsides of using a vector database for structured data (DBs, CSVs mostly) as opposed to a more direct retrieval method like SQLChain or a straight-up custom tool?
thanks a lot for this informative video!!!
I'm trying to follow the same method you showed in this video, but sometimes my model answers outside of the given text. Do you have any idea how I can solve this? I tried playing with the prompt and even the prompt template, but it didn't help much, and it still generates hallucinations.
Is there any way to guarantee it never gives an answer from outside the given text?
hello, did you find an answer?
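Not a complete fix, but one common mitigation (a sketch with my own prompt wording, so treat it as an assumption rather than the video's method): instruct the model to answer only from the retrieved context and to say "I don't know" otherwise, e.g.:

from langchain.prompts import PromptTemplate

qa_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, reply exactly: I don't know.\n\n"
        "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
)
# this can be passed to RetrievalQA via
# RetrievalQA.from_chain_type(..., chain_type_kwargs={"prompt": qa_prompt})

Setting temperature to 0 also tends to help, but nothing fully guarantees the model stays inside the context.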
Thanks for all the information; it is really great that you take the time and effort to make all this knowledge accessible to everybody.
I am trying to follow your explanations: I click on the "open in colab" link in the notebook and receive the error "AttributeError: module 'threading' has no attribute '_Condition'" when I execute the second cell, "from datasets import load_dataset".
Can anybody tell me what I am doing wrong? Thank you in advance for your extra help and for the awesome content.
Your content is gold, thanks!!
thanks!
My man.. Amazing content!
@5:15, why is it said that LLM performance degrades as the input size increases? Counter-thought: by providing more of the embedding-selected chunk data, the model may be able to respond correctly.
Also, I really like your videos.
Great job, James, thanks a lot for your awesome videos! I went through the LangChain Python documentation (which is also very good) and everything is crystal clear thanks to your videos. Make it a great day!
glad to hear!
But how do you know when to call the model directly and when to call it with the sentences that compose the answer (meaning when to query the DB first)?
Amazing content James! Love it. Any recommended range for chunk size and overlap based on your experience?
Typically 400:20 for chunk size to overlap, i.e. a 20:1 ratio. Note that the smaller the input, the smaller the chunk size and overlap should be, but keeping roughly that ratio typically yields good results.
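As a concrete illustration of those numbers with LangChain's splitter (the token-counting helper is written out here so the snippet is self-contained; the exact values are still something to tune per dataset):

import tiktoken
from langchain.text_splitter import RecursiveCharacterTextSplitter

encoding = tiktoken.get_encoding("cl100k_base")

def tiktoken_len(text: str) -> int:
    return len(encoding.encode(text))

splitter = RecursiveCharacterTextSplitter(
    chunk_size=400,     # max tokens per chunk
    chunk_overlap=20,   # tokens shared between consecutive chunks
    length_function=tiktoken_len,
    separators=["\n\n", "\n", " ", ""],
)
chunks = splitter.split_text("some long document text ...")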
Interesting and well explained👍
Btw: next time use Faiss instead of Pinecone. You did multiple videos on Faiss about a year ago 😉
Where did the WithSourcesChain version of RetrievalQA get the sources it output? The sources shouldn't be Wikipedia, but the chunks you fed into your prompt. Is it making them up based on its training data? Also, does your approach ensure that the response is based solely on the chunks provided in the prompt? You wouldn't want it using the LLM's training data in this case, right?
Is there a particular reason you used cosine as opposed to dot product here, James?
No particular reason. In the demos I'll usually stick with cosine in case others are using a different embedding model to me. For text-embedding-ada-002 you can use either, but a lot of embedding models need cosine rather than dot product.
Cosine is equal to dot product when the vectors are normalised, i.e. magnitude = 1. Congrats @james, amazing content.
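A quick numeric check of that point (illustrative vectors):

import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

a_unit = a / np.linalg.norm(a)   # normalise to magnitude 1
b_unit = b / np.linalg.norm(b)
dot_of_normalised = a_unit @ b_unit

print(cosine, dot_of_normalised)  # both ~0.9839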
Good, but could you make the same using open-source LLMs?
Hi, can you do LangChain with an Azure Cognitive Search index?
How do you get rid of the hallucination problem?
tiktoken_len should use 'text-embedding-ada-002' as the encoding model, not 'gpt-35-turbo', since you are using these chunks for embedding.
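A sketch of the suggested change (as far as I know both model names map to the same cl100k_base encoding in tiktoken, so the counts shouldn't actually differ, but keying the helper to the embedding model makes the intent explicit):

import tiktoken

encoding = tiktoken.encoding_for_model("text-embedding-ada-002")

def tiktoken_len(text: str) -> int:
    return len(encoding.encode(text))

print(tiktoken_len("hello world"))  # 2 tokens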
I tried to run your notebook (in Colab) and get the following error on the first code cell: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-api-core 2.11.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.
google-cloud-bigquery 3.9.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.
I'm getting the same error
same error.
I got this error; the code seems to be outdated:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
in ()
----> 1 from datasets import load_dataset
2
3 data = load_dataset("wikipedia", "20220301.simple", split='train[:10000]')
4 data
10 frames
/usr/local/lib/python3.10/dist-packages/multiprocess/dummy/__init__.py in
85 #
86
---> 87 class Condition(threading._Condition):
88 # XXX
89 if sys.version_info < (3, 0):
AttributeError: module 'threading' has no attribute '_Condition'
Great. Great. Great
Hi James, can you please do a question-answering chatbot with your own CSV file and Pinecone?
Nice, a lot easier to use with the intelligent iterations console though!
Hi, the OpenAI API is paid; can you suggest an alternative model for embeddings?
Yes, try this video for some open-source options: ua-cam.com/video/LzRpTNV74Ck/v-deo.html
Why get rid of hallucinations if we can get virtual machines out of hallucinations?