Dear GPT, I really appreciated his video. But I'm a lazy human so can you read thru this transcript and come up with a nice short compliment for me?
Hey, I just watched your video on using LangChain with the Flan 20B model, and I'm impressed with how you demonstrated its conversational capabilities. It's great to see how easy it is to compare different models using this method, all for free! Keep up the fantastic work, and I'm looking forward to more insightful content from you.
As an OpenAI language model I don't have feelings, but this made me laugh :D
Great job again, thanks for sharing. One question: have you ever seen a comparison of models just for the embedding search step? What I mean is using a free model with a vector database search to keep, for example, the 5 most relevant chunks, and then using a model like OpenAI's afterwards to summarize those chunks. It would be good if the choice of chunks were more or less identical, no?
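If it helps make the idea concrete, here is a minimal sketch of that pipeline, assuming the classic LangChain API with a local sentence-transformers embedding model and FAISS; the chunk texts, query, and model choices are illustrative, not from the video:

    # Free, local embedding model + FAISS keeps the most relevant chunks;
    # only those chunks are then sent to an OpenAI model for the summary.
    # Assumes: pip install langchain faiss-cpu sentence-transformers openai
    # and that OPENAI_API_KEY is set in the environment.
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import FAISS
    from langchain.llms import OpenAI

    chunks = [
        "LangChain lets you chain LLM calls together.",
        "Flan-UL2 is a 20B parameter encoder-decoder model.",
        "Vector stores index text chunks by their embeddings.",
        # ... your real pre-split document chunks go here
    ]

    embeddings = HuggingFaceEmbeddings()        # free, runs locally
    db = FAISS.from_texts(chunks, embeddings)

    # k=5 with real documents; 3 here only because the toy list has 3 chunks
    docs = db.similarity_search("What is Flan-UL2?", k=3)

    llm = OpenAI(temperature=0)
    summary = llm("Summarize the following passages:\n\n"
                  + "\n\n".join(d.page_content for d in docs))
    print(summary)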
I appreciate your efforts, really awesome video!!!!!!!
Once again, very good content, but a quick question. In this line, "formatted_prompt = conversation.prompt.format(input='the next input', history=memory.buffer)", we are already providing the memory to the prompt. So why do we also pass "memory=ConversationBufferMemory()" to ConversationChain?
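For anyone else wondering about this, here is a rough sketch of how those two pieces relate in the (older) LangChain API, purely as an illustration and not the video's exact code: the .format() call only renders the prompt template so you can inspect it, while the memory passed to ConversationChain is what the chain actually reads from and writes to on every predict() call.

    # Rough sketch, not the video's exact code. Any LLM works here; OpenAI is a stand-in.
    from langchain.llms import OpenAI
    from langchain.chains import ConversationChain
    from langchain.memory import ConversationBufferMemory

    memory = ConversationBufferMemory()
    conversation = ConversationChain(llm=OpenAI(temperature=0), memory=memory)

    # This line only renders the prompt template so you can look at it;
    # it does not call the model and it does not change the memory.
    formatted_prompt = conversation.prompt.format(input='the next input',
                                                  history=memory.buffer)
    print(formatted_prompt)

    # The chain is what actually uses the memory: on each predict() it reads the
    # buffer to fill the {history} slot and then writes the new turn back into it.
    conversation.predict(input="Hi there!")
    print(memory.buffer)   # now contains the Human/AI exchange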
Thank you, Sam!
Question: when using the summary library, aren't we paying for additional tokens just to ask for (and receive) the summary on every step?
not sure what you mean by the summary library?
Great videos, Sam. I started to go through all of your videos after seeing one of them. Unfortunately, I am unable to access any of your Colab notebooks.
Why can't you access the Colabs? I am planning to put these all up on GitHub later this week too, if that helps.
Thanks for the video! Is the free Google Colab tier enough to run the model?
No, this model is unfortunately too big for the free Colab tier.
Thank you for the content!! Can you help me understand the input("Human ") line in the chat_to_llm() function please? What does it do?
Well, he built this for a chat model. To make it chat like a normal bot, he wrote a while loop, and as a secondary advantage of that loop he can also track the token count.
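As a rough sketch of the kind of loop being described (illustrative, not the video's exact code): input("Human: ") simply reads the next user message from the terminal, the while loop keeps the chat going until the user types "exit", and a crude token count is tracked on the side.

    # Rough sketch of such a chat loop; pass in any ConversationChain.
    def chat_to_llm(conversation):
        while True:
            user_input = input("Human: ")              # read the next user message
            if user_input.lower() in ("exit", "quit"):
                break
            reply = conversation.predict(input=user_input)
            print("AI:", reply)
            # very rough token tracking: count words currently held in memory
            print("(history is roughly", len(conversation.memory.buffer.split()), "tokens)")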
Awesome video, helped a lot. One thing I am confused about: the Flan models are seq2seq, and these models are heavier than decoder-only ones, e.g. GPT. So why not just use the decoder-type models?
Good question. The encoder-decoder models are closer to the original Transformer, and traditionally they were used more for text-to-text tasks (like translation) rather than the simple next-word prediction of decoder-only models. These lines are more and more blurred these days, though Google has still done a lot of research on encoder-decoder models compared with OpenAI's focus on GPT-style decoder-only models. When you have a very deep decoder model, you could argue that the lower layers are doing the job of an encoder. Hope this helps.
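As a purely illustrative side-by-side, this is what the two families look like when loaded with Hugging Face transformers (small checkpoints chosen only so it runs quickly; these are not the models from the video):

    # Illustrative only; small checkpoints so it runs on CPU.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, AutoModelForCausalLM

    # Encoder-decoder (seq2seq), e.g. Flan-T5: the input is encoded once,
    # then a separate decoder generates the output text.
    t5_tok = AutoTokenizer.from_pretrained("google/flan-t5-small")
    t5 = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
    ids = t5_tok("Translate to German: How are you?", return_tensors="pt")
    print(t5_tok.decode(t5.generate(**ids, max_new_tokens=20)[0], skip_special_tokens=True))

    # Decoder-only (causal LM), e.g. GPT-2: everything is next-token prediction
    # over a single running sequence.
    gpt_tok = AutoTokenizer.from_pretrained("gpt2")
    gpt = AutoModelForCausalLM.from_pretrained("gpt2")
    ids = gpt_tok("The weather today is", return_tensors="pt")
    print(gpt_tok.decode(gpt.generate(**ids, max_new_tokens=20)[0], skip_special_tokens=True))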
Thanks, and interesting that a new Flan model has been released. How about using embeddings to query, for example, a PDF with LangChain? We have many examples with ChatGPT.
Btw, it would be nice if you could make a video about voice cloning in combination with chat, for example using Microsoft's VALL-E model, Whisper, or...
Hello Sam, wonderful! Thanks for the good work. I am trying to build a chatbot with one of these models [without using GPT-3.5]. In your opinion, which of these models comes closest to GPT-3.5?
Honestly, none of these are as good as 3.5-turbo etc., but with each new model coming out they seem to get a lot closer. The key is to look for a model that has been pretrained on at least 1T tokens.
@samwitteveenai Thanks for the reply. I am assuming that when a model pretrained on 1T tokens comes out, you will surely cover it. 😀
you bet!! :D
Great
Thanks for the video!!! I think the LLMs from the Google team like T5, mT5, and Flan-T5 are amazing. However, compared to GPT-3 / GPT-3.5 Turbo (without any fine-tuning), they are not able to handle or understand the given NLP task... 🤔
For example, I have tried the same prompt on GPT vs Flan-T5: GPT gives very accurate results on NER, while Flan-T5 doesn't know what to do. 🤔
Do you have any suggestions on using the Flan models, like working on the prompt or using few-shot examples?
The GPT models are doing a lot better for two reasons: 1. they are much bigger, and 2. they are fine-tuned on non-academic datasets. The T5 models etc. really need to be fine-tuned for your task. GPT-3/3.5 etc. are much more akin to the LaMDA models at Google, which currently aren't available publicly.
Why is it not returning anything on my Colab instance? It keeps scrolling but to no avail.
Make sure the VM has a GPU that can run it.
How do I use LangChain with a locally downloaded model? I'm interested in Vicuna or the new 7B king.
Basically the same as in this video, just changing out the models.
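As a minimal sketch of what "changing out the models" could look like with a locally downloaded checkpoint (the model path and settings below are placeholders and assumptions, not from the video):

    # Placeholder path and settings; adjust for your own download.
    from transformers import pipeline
    from langchain.llms import HuggingFacePipeline
    from langchain.chains import ConversationChain

    pipe = pipeline(
        "text-generation",
        model="path/to/your/local/vicuna-checkpoint",   # hypothetical local path or hub id
        max_new_tokens=256,
    )
    llm = HuggingFacePipeline(pipeline=pipe)

    # Everything else from the notebook stays the same; only the llm is swapped out.
    conversation = ConversationChain(llm=llm)
    print(conversation.predict(input="Hi there!"))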
@@samwitteveenai Kindly make that tutorial for newbies. Thank you.
As I was working my way through the notebook, I got down to the first conversation.predict(), and when I executed it, it showed the prompt and the current conversation with the Human: input but then just hung and timed out. Has anyone run into this issue? Do I need any additional API key other than the HF API key to make the notebook work?
You shouldn't need any other API key. It could be that HF is no longer supporting the Inference API for this model. You can still load it locally if you have access to a good GPU.
Did you get any solution? I am facing the same problem. I don't have a powerful GPU so I can't load it locally either.
@@sahilmandavkar8230 I did not have a chance to try locally. I think I'll try using Colab Pro.
Hi Sam, at the 1:3 mark, in cell 9, conversation.predict(input="Hi there! I am Sam"), upon execution I am getting an error: "Error raised by inference API: Input validation error: `temperature` must be strictly positive." Any way to resolve this issue?
This is a change in the HF API, I think. Change the temperature from 0 to 0.1 and it should work.
@@samwitteveenai Thank you, that did it, appreciated.
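For anyone hitting the same error, the fix looks roughly like this (the repo id is an assumption; point it at whichever Flan model the notebook is using, and make sure HUGGINGFACEHUB_API_TOKEN is set):

    # Only the temperature change matters here; 0 now fails validation, any value > 0 works.
    from langchain.llms import HuggingFaceHub

    llm = HuggingFaceHub(
        repo_id="google/flan-ul2",              # the 20B Flan model, as an example
        model_kwargs={"temperature": 0.1},
    )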
Can you do the same with LLaMA?
I have done it with LLaMA but have been reluctant to release it because of the Meta issues. Anything in particular you want me to cover about LLaMA?
Just use Ollama to serve local models.
Ollama is cool; I have made a few vids about it. This video was made long before Ollama existed.