Thank you so so much James. It seems like every time I have a new question along my journey you post a video about that exact topic.
Great content. What local model would be a good replacement for OpenAI's text-embedding-ada-002 embeddings?
This used about $3 of my $5 of free credit for OpenAI.
If you have a half-decent graphics card, I'd recommend using a BERT-based model for the embeddings instead. My 1060 crunched the data in ~15 minutes. It required some slight modification of the code, but nothing crazy.
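For reference, here is a rough sketch of that kind of modification using the sentence-transformers library (the model name, batch size, and sample texts are my own illustrative choices, not the code from the video):

# Local BERT-style embeddings on the GPU instead of OpenAI's API
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2", device="cuda")  # any sentence-transformers model works

texts = ["first chunk of text", "second chunk of text"]
embeddings = model.encode(texts, batch_size=64, show_progress_bar=True)
print(embeddings.shape)  # (2, 384) -- the index dimension must match this, not 1536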
Fantastic videos, expertly put together and very informative. It's a pity that things change so fast and break if one tries to replicate the code more than a few months later. Some backward compatibility should become standard practice for library maintainers... Even though all the installs completed successfully, the code bombs out on the first statement, 'load_dataset(...'
Your videos are fantastic! I love the way you break down these topics into easy-to-understand pieces and make them accessible to people from all walks of life.
I was wondering if you've ever explored the capabilities of large language models when it comes to handling tabular data? It would be really interesting to see a video from you exploring how this can be used in that context.
I've been working on this a bit recently, they're pretty good at it - could be an interesting video, thanks for the idea!
Extracting tabular data works well. I have been working with tabular data, and OpenAI amazes me.
Hello, can you make a video on how we can use HuggingFace instruct embeddings for Information Retrieval and Q&A?
Thanks for helping us understand. Can I use a PDF document, and is this cost-effective when put into production for the public?
I have learned so much from you, James, especially with regard to LangChain. Thanks for your time indeed.
Thank you for these videos! Just learning Pinecone and having a little trouble creating a domain-specific DB where the dimensions are predefined. Any chance of a video on that process? Thanks again for making everything so easy to follow and providing the colabs.
You're welcome, I'm glad to hear it helps! I think a Pinecone 101 could be a good video, I'll see if I can put it together soon :)
Great shirts in each video! And thanks for the great series!
Amazing content James, what Python IDE did you use to run this tutorial? I am new to Python development.
I'd like to see a tool from LangChain that highlights the relevant parts of a cited URL, or (less conveniently) quotes the relevant section of the citation. Too many times, I read through the cited source and find the LLM has misconstrued the source material (often confirming a statement where the source material negated it). Showing the passage the LLM thinks is important would make it much easier to sanity check.
I wonder whether LangChain includes all of the chunks related to the query (in your example, 3 items) when prompting OpenAI, and how it does this internally.
I've been testing Bard and GPT-4 (ChatGPT). Both are pretty good at riffing at the "general idea" level. But when it comes to specifics, such as URLs, citations, or even specific facts, the failure rate goes up.
(Bing and Bard are both tied to the internet. Bing is best with citations, but its responses are a bit literal, sticking close to the document, so it's no longer as appealing. Bard is more conversational, but it makes mistakes that, I think, shouldn't happen given it has access to the internet.)
The problem with LLMs (by themselves) is that they are not architecturally geared to be precise when it comes to specific facts. They are based on statistical associations between language elements. What allows them to synthesize a coherent natural-language response on the fly is also the basis for "hallucination". Hence, one needs an external source as a more reliable reference. (The same can be said about human beings. What saves us is fact checking, the scientific method, pushback from others, our senses, etc.)
Thank you James. This video is impressive. Just a question that might be the source of a new video. Let's say I want to make a chatbot that has multiple data sources. Say one source gives the résumés of all employees in the organisation and another returns all quality procedures. How would you do it? Put everything in a single vector store and let the retriever find the best content? Or, alternatively, could you have a chain of specialized retrievers with descriptions such as "Use this retriever whenever you are searching for an employee that has specific skills" or "Use this retriever when you have questions about quality procedures"?
How would you manage that?
You can use an agent with access to multiple vector DBs, or with access to multiple tools that route to the same vector DB but apply different metadata filters. I have a video explaining how to do this coming in ~3 hours, and you can find the code already here: github.com/pinecone-io/examples/blob/master/generation/langchain/handbook/08-langchain-retrieval-agent.ipynb :)
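For anyone who wants the gist before watching, a rough sketch of the multiple-tools approach (not the exact notebook code; resumes_store and procedures_store stand in for vector stores you have already built):

from langchain.agents import Tool, initialize_agent
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0)

# resumes_store and procedures_store are placeholders for existing vector stores
resume_qa = RetrievalQA.from_chain_type(llm=llm, retriever=resumes_store.as_retriever())
procedure_qa = RetrievalQA.from_chain_type(llm=llm, retriever=procedures_store.as_retriever())

tools = [
    Tool(name="Employee Resumes", func=resume_qa.run,
         description="Use this when searching for an employee with specific skills"),
    Tool(name="Quality Procedures", func=procedure_qa.run,
         description="Use this for questions about quality procedures"),
]

agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("Which employees have experience with ISO 9001 audits?")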
Awesome video thank you 🎉❤
What is the tool you are using to build your guides? It looks awesome too 😅😮
What do you think about adding a zero-shot answer from the LLM to the query to improve the quality of the embeddings retrieved? This would follow the HyDE pattern described by Luyu Gao et al. but I'm not sure if it's just a waste of tokens when dealing with OpenAI's models.
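In case it helps picture the idea, a minimal sketch of HyDE-style retrieval (using the pre-1.0 openai Python client with an API key already configured; the prompt wording and function name are just illustrations). The trade-off is exactly the one mentioned: one extra completion call per query.

import openai

def hyde_query_embedding(question: str) -> list[float]:
    # 1) have the LLM write a short hypothetical answer (zero-shot)
    draft = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Write a short passage answering: {question}"}],
    )["choices"][0]["message"]["content"]
    # 2) embed the hypothetical answer instead of the raw question
    res = openai.Embedding.create(input=[draft], model="text-embedding-ada-002")
    return res["data"][0]["embedding"]

# query_vector = hyde_query_embedding("Who founded the city of Kyiv?")
# ...then pass query_vector to the vector index query as usual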
Is it possible to do this with Claude by Anthropic? There's general support for Anthropic, but I do not see any Anthropic embeddings in the documentation
Your videos are awesome. I was able to build a Flutter app that works with a Python backend running in Replit, using FastAPI to serve the API endpoints. In my app, I can upload a PDF file and chat with it using an agent with memory. It works fine. However, I need to allow multiple users, each one with its own agent and its own memory. I have no idea how to accomplish this. What path would you recommend?
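Not the only way to do it, but one sketch of the idea: keep one chain/memory per user, keyed by a user or session id (the in-process dict here is illustrative; in production you would persist the history, e.g. in Redis or a database, so it survives restarts and multiple workers):

from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0.0)
sessions: dict[str, ConversationChain] = {}

def get_chain(user_id: str) -> ConversationChain:
    # one independent conversation memory per user
    if user_id not in sessions:
        sessions[user_id] = ConversationChain(llm=llm, memory=ConversationBufferMemory())
    return sessions[user_id]

# inside a FastAPI endpoint, roughly:
# answer = get_chain(current_user_id).predict(input=user_message)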
Hi, I am also working on question answering with PDFs. What is the ideal chunk size for long text in a PDF? I am a bit confused about it.
Hi James, there are so many different embeddings in LangChain. Could you elaborate on how to choose the right embeddings and the right model? Especially when we don't want to use OpenAI the whole time.
Thanks and best regards
Love your videos! I was asking ChatGPT about source/parametric knowledge and it said: source knowledge is all the training data, parametric knowledge is the resulting weights/biases of the model based on that training data, and prompts are neither, rather just inputs to the model used to generate a response. Not sure whether this or your explanation is correct, just thought I'd share.
Great lectures. 🎉
Your content is amazing. Looking forward to more, hopefully.
As I'm going through LangChain I'm wondering: what are the upsides/downsides of using a vector database for structured data (DBs, CSVs mostly) as opposed to a more direct retrieval method like SQLChain or a straight-up custom tool?
thanks a lot for this informative video!!!
I'm trying to follow the same method you showed in this video, but sometimes my model answers outside of the given text. Do you have any idea how I can solve this? I tried playing with the prompt and even the prompt template, but it didn't help much, and it still generates hallucinations.
Is there any way to guarantee it never gives an answer from outside the given text?
hello, did you find an answer?
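Not a complete fix, but one common mitigation (a sketch with my own prompt wording, so treat it as an assumption rather than the video's method): instruct the model to answer only from the retrieved context and to say "I don't know" otherwise, e.g.:

from langchain.prompts import PromptTemplate

qa_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, reply exactly: I don't know.\n\n"
        "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
)
# this can be passed to RetrievalQA via
# RetrievalQA.from_chain_type(..., chain_type_kwargs={"prompt": qa_prompt})

Setting temperature to 0 also tends to help, but nothing fully guarantees the model stays inside the context.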
Thanks for all the information; it is really great that you take the time and effort to make all this knowledge accessible to everybody.
I am trying to follow your explanations: I click on the "open in colab" link in the notebook and receive the error "AttributeError: module 'threading' has no attribute '_Condition'" when I execute the second cell, "from datasets import load_dataset".
Can anybody tell me what I am doing wrong? Thank you in advance for your extra help and for the awesome content.
Your content is gold, thanks!!
thanks!
My man.. Amazing content!
@5:15, why is it said that LLM performance degrades as the input size increases? Counter-thought: by providing more of the embedding-selected chunk data, the model may be able to respond correctly.
Also, I really like your videos.
Great job, James, thanks a lot for your awesome videos! I went through the LangChain Python documentation (which is also very good) and everything is crystal clear thanks to your videos. Make it a great day!
glad to hear!
But how do you know when to call the model directly and when to call it with the sentences that compose the answer (meaning when to query the DB first)?
Amazing content James! Love it. Any recommended range for chunk size and overlap based on your experience?
Typically 400:20 for chunk size to overlap, i.e. a 20:1 ratio. Note that the smaller the input, the smaller the chunk size and overlap should be, but keeping roughly that ratio typically yields good results.
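As a concrete illustration of those numbers with LangChain's splitter (the token-counting helper is written out here so the snippet is self-contained; the exact values are still something to tune per dataset):

import tiktoken
from langchain.text_splitter import RecursiveCharacterTextSplitter

encoding = tiktoken.get_encoding("cl100k_base")

def tiktoken_len(text: str) -> int:
    return len(encoding.encode(text))

splitter = RecursiveCharacterTextSplitter(
    chunk_size=400,     # max tokens per chunk
    chunk_overlap=20,   # tokens shared between consecutive chunks
    length_function=tiktoken_len,
    separators=["\n\n", "\n", " ", ""],
)
chunks = splitter.split_text("some long document text ...")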
Interesting and well explained👍
Btw: next time use Faiss instead of Pinecone. You did multiple videos on Faiss about a year ago 😉
Where did the WithSourcesChain version of RetrievalQA get the sources it output? The sources shouldn't be Wikipedia, but the chunks you fed into your prompt. Is it making them up based on its training data? Also, does your approach ensure that the response is based solely on the chunks provided in the prompt? You wouldn't want it using the LLM's training data in this case, right?
Is there a particular reason you used cosine as opposed to dot product here, James?
No particular reason. In the demos I'll usually stick with cosine in case others are using a different embedding model to me. For text-embedding-ada-002 you can use either, but a lot of embedding models need cosine rather than dot product.
Cosine is equal to dot product when the vectors are normalised, i.e. magnitude = 1. Congrats @james, amazing content.
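A quick numeric check of that point (illustrative vectors):

import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

a_unit = a / np.linalg.norm(a)   # normalise to magnitude 1
b_unit = b / np.linalg.norm(b)
dot_of_normalised = a_unit @ b_unit

print(cosine, dot_of_normalised)  # both ~0.9839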
Good, but could you make the same using open-source LLMs?
Hi, can you do LangChain with an Azure Cognitive Search index?
How do you get rid of the hallucination problem?
tiktoken_len should use 'text-embedding-ada-002' as the encoding model, not 'gpt-35-turbo', since you are using these chunks for embedding.
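A sketch of the suggested change (as far as I know both model names map to the same cl100k_base encoding in tiktoken, so the counts shouldn't actually differ, but keying the helper to the embedding model makes the intent explicit):

import tiktoken

encoding = tiktoken.encoding_for_model("text-embedding-ada-002")

def tiktoken_len(text: str) -> int:
    return len(encoding.encode(text))

print(tiktoken_len("hello world"))  # 2 tokens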
I tried to run your notebook (in Colab) and get the following error on the first code cell: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-api-core 2.11.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.
google-cloud-bigquery 3.9.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.
I'm getting the same error
same error.
I got this error; the code seems to be outdated:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
in ()
----> 1 from datasets import load_dataset
2
3 data = load_dataset("wikipedia", "20220301.simple", split='train[:10000]')
4 data
10 frames
/usr/local/lib/python3.10/dist-packages/multiprocess/dummy/__init__.py in
85 #
86
---> 87 class Condition(threading._Condition):
88 # XXX
89 if sys.version_info < (3, 0):
AttributeError: module 'threading' has no attribute '_Condition'
Great. Great. Great
Hi James, can you please do a question-answering chatbot with your own CSV file and Pinecone?
Nice, a lot easier to use with the intelligent iterations console though!
Hi, the OpenAI API is paid; can you suggest an alternative model for embeddings?
Yes, try this video for some open-source options: ua-cam.com/video/LzRpTNV74Ck/v-deo.html
Why get rid of hallucinations if we can get virtual machines out of hallucinations?