Learn how to build a full-stack app in this tutorial: ua-cam.com/video/AMc2A5Abj3M/v-deo.html
Irina, amazing tutorial on integrating OpenAI API with a custom knowledge base! Really excited about the potential of GPTIndex and Langchain. I'd love to see a deep dive comparing AI Agents in Langchain, especially when they're long-running and autonomous. Keep up the fantastic work! 🌟
Using ChatGPT to generate sample user interview data: genius 💥
Thank you for the tutorial... also, to refresh the files details there is a "Refresh" button located just above the Files detail section. It's an icon of a folder with a circular arrow. This will refresh the section without needing to refresh the page.
Thank you for the tip!
So happy to have found you this morning! We need more designers that can code and explore AI possibilities like you do!
You don't need to refresh the whole Colab page to update the view of the files/folders; just use the refresh button above the directory structure in the left pane.
Why are you smiling all the time))) So cute)))
Great video and thanks for answering many of the questions! Looking forward to your future YT on integrating into a website.
Thank you, I'm glad you liked it
@@irina_nik thank you for this video. Do you have a tutorial showing how to integrate it into a website or WhatsApp? Thanks
Miss Irina, thank you. After a few days of playing around, you got me to the point. Merci!
I didn't know that we can do something like this with OpenAI, thanks for the video Irina.
Thank you so much, Irina! I copied your source code to a Jupyter notebook and created a chatbot in a few minutes! To my surprise, it works! Please give some thumbs-up to this amazing lady. She has spent the time to make this solution easy to use for everyone!
You are the best at explaining things, Irina!! Thank you for taking the time to put this together.
That was a great tutorial. And I like your approach to explaining why one should not be using only one long prompt etc.
Thanks for this info - it's easier to setup a chatbot than I realized!
I'm glad it's useful for you.
Great explanation. Very explicit and clear instructions. Thank you very much for this.
Nice, already done it, but I don't know everything, so I had to watch this!
It's really nice; I got insights into how we can use a custom knowledge base.
This is exactly what I was looking for, thank you!
Great work!! Really nice step by step explanation! By the way you can click the refresh button in the file explorer panel (2nd icon) to refresh the files so that they appear.
Excellent tutorial, well presented and very clear. Thank you …. It works perfectly, unlike many so-called tutorials on YT about AI 😊
What could be happening here? I asked how many people were interviewed, and the reply was "One person was interviewed". I asked how many times "It was fun to talk about cooking." appeared, and it said none (interview4 ends with this quote). Thank you, great video!
Thanks for the amazing tutorial, simple but impactful.
Great video Irina !!
I was looking for this exact solution and it was the first video of your channel that I followed exactly step by step and it works perfectly end to end
It was very clear and well explained.
Nice job !!
Please continue making these kinds of useful videos
It was extremely useful for me and extremely detailed.
Keep going!
Very informative...Thanks 😀
Wishing you lots of love and strength.
Thank you so much for this information. This is exactly the kind of thing I've been looking for: step-by-step tutorials for fine-tuning your own AI. This is perfect.
Thank you!!!! This is a great video, Irina, keep up the good work!
I love your teaching ❤
Amazing video, very friendly to beginners. Thank you.
I am getting an error when I ask a question using the code. Can someone help me?
Thanks for the amazing tutorial. BTW, is there any method to increase the output length? I can only get an answer of approximately 160 words (~250 tokens) right now.
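For anyone else hitting the short-answer cap: in the llama_index version used in the video, the answer length is limited in two places, and they have to agree. A minimal sketch, assuming the 0.4-era API from the tutorial (the 1024 value is just an example, not a recommendation):

from langchain import OpenAI
from llama_index import LLMPredictor, PromptHelper

num_outputs = 1024  # raise this to allow longer answers

# max_tokens caps how many tokens the model may generate per answer
llm_predictor = LLMPredictor(
    llm=OpenAI(temperature=0.5, model_name="text-davinci-003", max_tokens=num_outputs))
# PromptHelper must reserve the same budget so the prompt leaves room for the answer
prompt_helper = PromptHelper(4096, num_outputs, 20)  # max_input_size, num_output, max_chunk_overlap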
Even I understood almost everything! Well delivered and interesting content!
Thanks ☺️
This is an incredible video. You did an amazing job. Subscribed
Great work, it's quite clear. It seems LlamaIndex has had many updates and I can't recreate your work; would you please make an updated version? Thanks a lot~
Thanks! Now I can fire all my employees and save lots of money!
😎
🤣🤣.... 😅😅.... 😄😄.... 🙂.... 🤔🤔🤔... 😐😐
Seriously 😂😂
@@BwahBwah don't tell me you are going to fire your employees too😂
@@unitedstarsutopia I'll go one better. I won't have to employ anyone now 😀
This just spits out the text related to the query. If I want to augment GPT's capabilities with my own data set, what is the best way to do it? For example, using the same example of interview transcription, I should be able to ask GPT to summarize how the candidate did, or whether the interviewee's answer was correct for a particular question. Any idea how to go about that? I understand fine-tuning is a possibility, but if I have 10,000 interview scripts I want to augment GPT's capabilities with, I am not sure how to go about it. Any help?
Is there a software package that can make an entire OpenAI chatbot (GPT-4 or even 3.5) just by pointing it at a folder of PDFs? We would pay thousands for this right now. The application has to run locally on a PC.
You can use LangChain for that. I'll make more tutorials on that topic.
Great !! Keep Going. All the very best !!👍😄
Thank you!!! Your words inspire me to make more videos)
Hi Irina! It is such a great tutorial, and it would be useful for a case I'm currently working on. I tried this with my own research data and it turned out really well, with relevant and decent answers. But I am wondering: is it possible to use the GPT-3 model without using its training data or knowledge? So the information/answers produced would rely only on the custom data we added to the knowledge base. Your answer would be very appreciated, thanks!
An excellent tutorial. Thank you.
I converted this code to a Flask app and it works fine on my local server. However, when I deploy to Google App Engine it fails to return responses; the error is a 500 Internal Server Error. Any idea or advice is much appreciated.
A few points not mentioned in the videos:
Essentially, it is fine-tuning. However, the module for fine-tuning has been pre-written for you.
Fine-tuning can only be done with models below GPT-3. Currently, fine-tuning is not available for ChatGPT, GPT-3.5, or GPT-4.
For GPT-2 to be effective, you need at least 300M training data. Models with more parameters than GPT-3 require even more data to achieve the desired effect.
She used about 12 kB of text in this demo. LlamaIndex built a 559 kB index from it, and it did the job on text-davinci-003. I'm genuinely interested in the reasoning behind your claim about training data size.
Hello! Thank you for the video! Also, your secret API key is visible in the first few frames before you blur it! You should delete that API key completely!
That smile, that damned smile 😊 And thanks for the nice tutorial btw.
Amazing tutorial! Thanks! If you're looking for future tutorial ideas, I'd love to know how to expand on this to create my own API endpoints so my trained chatbot can be made publicly available from my website. I'm not very familiar with Google Colab (or Python, for that matter; I'm a PHP/JS web developer), so I'll try to do some of my own research on how this might be possible, but I really enjoyed and easily absorbed the info in this video. Well done. :)
Hi Mikey! Thank you for the suggestion, I definitely need to make a video about that. I think, I'll be able to post it in 3-4 weeks. Though I'll be using NextJS/Typescript because this is what I'm familiar with.
this response made me subscribe... That would be awesome!
@@irina_nik you are smart. Can't wait to see you share the TypeScript/Node.js version.
Good question Mikey! I have the same question and subscribed to find out from her next video! Thank you!
Irina, thank you for your help! When I ask it questions irrelevant to the article, it often provides answers when it shouldn't. Any way to ensure it only focuses on my uploaded article?
Thanks, it’s really helpful. Could you please let me know if I can use complex data having 100s of parameters (text & numbers)? If yes, in what format should it be uploaded?
Great information Irina❤🎉
Hi
Thank you for that content!
I am just curious about the file size limit and the importance of the file format in your approach. I have seen that you are using .txt files. I am using PDFs to feed the knowledge base of custom GPTs, but I am observing low accuracy in the answers. It seems that the GPT is not looking at the whole knowledge base (6 merged PDFs, approx. 7000 pages in total). Do you have any advice?
How is this GPTIndex different from OpenAI's Text Embedding ADA model? Or is it just a wrapper around that model?
Great question. Would love an answer to this; I can't figure out the difference. They appear to be doing the same thing.
You can achieve the same result without any external library; GPTIndex just makes it easier for non-professional coders like me 😉. It uses the found chunks as context for the prompt, not as the answer itself.
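For anyone curious what "chunks as context" looks like without GPTIndex, a bare-bones sketch using the pre-1.0 openai client from the video (the chunk texts and prompt wording here are made up for illustration):

import numpy as np
import openai

openai.api_key = "YOUR-KEY"

def embed(text):
    # one embedding vector per text, via the pre-1.0 openai API
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

chunks = ["interview 1 text ...", "interview 2 text ..."]  # your document chunks
chunk_vecs = [embed(c) for c in chunks]

def ask(question):
    q = embed(question)
    # cosine similarity picks the most relevant chunk
    sims = [float(q @ v) / (np.linalg.norm(q) * np.linalg.norm(v)) for v in chunk_vecs]
    context = chunks[int(np.argmax(sims))]
    # the retrieved chunk becomes context in the prompt, not the answer itself
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"
    out = openai.Completion.create(model="text-davinci-003", prompt=prompt, max_tokens=256)
    return out["choices"][0]["text"].strip()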
For embeddings you also need a vector database. I wonder if this index solution also performs in a production environment?!
@@josehoyos Yeah, I have seen Pinecone proposed a lot for this. I did a little test with Pinecone and, while it was unfamiliar to me, it ended up being dead simple.
Great job! The new code shows llama_index; I went with it because I figured you updated it. But when I construct the index I get an error on line 58, in red: super().__init__(, and it fails. Any help with this?
I got the same error as well
I did as well and not sure how to get past it
Changing:
index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper)
to:
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
and importing:
from llama_index import ServiceContext
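For anyone who wants the whole picture, here is that fix folded into the notebook's construct_index, roughly (this assumes the llama_index 0.5-era API from the video and typical tutorial parameter values; a sketch, not gospel):

from langchain import OpenAI
from llama_index import (GPTSimpleVectorIndex, LLMPredictor, PromptHelper,
                         ServiceContext, SimpleDirectoryReader)

def construct_index(directory_path):
    # the LLM that will answer questions; max_tokens caps the answer length
    llm_predictor = LLMPredictor(
        llm=OpenAI(temperature=0.5, model_name="text-davinci-003", max_tokens=256))
    prompt_helper = PromptHelper(4096, 256, 20)  # max_input_size, num_output, max_chunk_overlap
    # newer llama_index versions bundle these into a ServiceContext
    service_context = ServiceContext.from_defaults(
        llm_predictor=llm_predictor, prompt_helper=prompt_helper)
    documents = SimpleDirectoryReader(directory_path).load_data()
    index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
    index.save_to_disk("index.json")
    return index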
Hi! Thank you for mentioning it. I've updated the code and it should work now 😀
Absolutely great video, Irina. The Colab seems to have llama_index errors. Was anything changed in the Colab? Would love to connect to discuss more. Great tutorials!!
While you're at it, you may want to change the following in the "Define the functions" section of the Colab.
Change this --> from langchain import OpenAI
To this --> from langchain.chat_models import ChatOpenAI
Apparently "from langchain import OpenAI" is old and being deprecated.
@@jlaroche0 Thanks for the feedback, Jacques. I tried both recommendations; they seemed to install fine, but I'm still getting an error with the following line: -> from llama_index import SimpleDirectoryReader, GPTListIndex, readers, GPTSimpleVectorIndex, LLMPredictor, PromptHelper, ServiceContext
@@jlaroche0 It works now. Thank you so much Jacques. Really appreciate your help. These educational videos are super helpful! Looking forward to more videos!
Hi, when I load and then run your Colab notebook, I get an error - TypeError: __init__() got an unexpected keyword argument 'llm_predictor' when I run the construct_index("context_data/data") code. Any clues on what I'm doing wrong?
Did you get the solution?
@@sandipshaw3397 not yet. Have you got the same problem?
Changing:
index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper)
to:
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
and importing:
from llama_index import ServiceContext
Great explanation, thank you. May I ask: is there a limit on how much custom data you can use, and would a large custom knowledge base slow down the chat?
Amazing tutorial.
Is there a way to make the model answer questions faster? It takes nearly 30 seconds to answer.
Great video, thanks!!!
Hey Irina! Thank you for this tutorial, it's a game changer. This is built on GPT-3; how would you go about running it on GPT-4? Thanks!
Can the indexing and query code be run locally (interfacing with GPT-3 over the internet of course)?
What IDE?
Thanks worked for me 😇
Very interesting. I noticed you said that you can't share the real interviews in the video because there can be private information, which is understandable. However, how do you ensure that OpenAI doesn't receive this information? I find the biggest problem is how to avoid OpenAI getting either user or customer information.
Hi Irina, I am using this a year later, and the Davinci model has been deprecated. Are there any notes on how to handle the code when resources are no longer supported? The Gemini help still results in quite a few errors.
What about keeping data in a file with real-time embeddings vs. storing embeddings in a DB, for a chatbot for an application (one that provides information about the application)?
Actually, knowledge bases are different from prompts. It is better. It is closer to quantum computing, because it can search the vector space for that document without having to parse the raw files.
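Setting the quantum computing aside, the "search the vector space" part is plain nearest-neighbour lookup over embeddings; a toy illustration with made-up 2-D vectors:

import numpy as np

# pretend these are embeddings of three document chunks
chunks = {
    "pricing": np.array([0.9, 0.1]),
    "refunds": np.array([0.1, 0.9]),
    "shipping": np.array([0.7, 0.7]),
}

def nearest(query_vec):
    # cosine similarity: the chunk pointing in the closest direction wins
    return max(chunks, key=lambda k: float(query_vec @ chunks[k]) /
               (np.linalg.norm(query_vec) * np.linalg.norm(chunks[k])))

print(nearest(np.array([0.8, 0.2])))  # -> pricing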
Big thanks! A deep bow! Respect and props)))
Hello dear, could you please explain how to do this on Pipedream?
Very nice and easy way, thank you!!! I have a question regarding the custom knowledge base… can I use a Prolog knowledge base there, or does it have to be plain text? Prolog is a requirement in my school project. I hope you answer, and thanks a lot ❤
Thank you for the video! Really cool. I have a question: here you are working in Google Colaboratory, but how would you bring this to a website? Is it possible? Is it easy? Greetings from Spain :)
Unable to use any other file. Even a custom text file gives the error: 'Rate limit reached'.
Great tutorial! Thank you so much for going through this in such detail. Can you suggest a resource that explains how to take the chatbot we created and integrate it into a website or web app with a prettier interface?
Great tutorial. When we send data to OpenAI, is it used for public training, or does it remain private to me?
Really good tutorial. I wonder how well this scales with more documents than just a couple. Do you have any experience with the performance at 1k or 10k documents?
I'm getting a traceback error when running construct_index. What am I missing?
Irina,
Many use cases. Excellent information. Thank you.
Are you able to provide a similar method for creating a generative AI for a closed system that ensures secret or confidential company or government data cannot be leaked?
Thanks for posting this video. The whole demo is great. The only thing I am not clear about is how to pick those input and output sizes, and, if some are based on the particular model, how you obtain them from OpenAI's (e.g., Davinci) page. Just in more detail, and with a screen split so that you don't have to toggle around.
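For what it's worth, those sizes map onto PromptHelper in the llama_index version from the video; a sketch with tutorial-style values (assumed, not prescribed). The input size comes from the model's context window on OpenAI's model page (4097 tokens for text-davinci-003, commonly rounded down to 4096):

from llama_index import PromptHelper

max_input_size = 4096    # the model's context window, from OpenAI's model page
num_outputs = 256        # tokens reserved for the answer
max_chunk_overlap = 20   # overlap between consecutive chunks
chunk_size_limit = 600   # cap on tokens per indexed chunk

prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap,
                             chunk_size_limit=chunk_size_limit)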
Where did the answers come from: ChatGPT, or the data fed in? When I checked the data, I could only find questions.
How do I get past the word limit for an answer? Sometimes the answer feels cut off halfway. How can I modify it? Thank you.
How to extend this as a service to use in a web/mobile app?
This will take much more time than just playing with the API :) I'll make another video about that when I have time.
Hey, any tips on how to fine-tune a model based on a very large PDF document without the "\n" separator to split prompt/resolution? I thought maybe have a script break it down at every question mark? Or is there some other way?
Congrats Irina, very clear and nicely explained 👍 Which file formats does it support for indexing? Is it only .txt?
Thanks! You can connect other file types with LlamaHub gpt-index.readthedocs.io/en/latest/how_to/data_connectors.html
Thank you very much, very useful tutorial.
Wondering why you did not use gpt-3.5-turbo, as it is much cheaper and probably almost as good?
3:18 to skip the unreasonably long intro
Hey! Great video! Now that the ChatGPT API is out, do you know if these libraries will work with it? Or is this still only a GPT-3.5 method?
Hi! This library is not available with ChatGPT yet, but you can keep an eye for updates here gpt-index.readthedocs.io/en/latest/how_to/custom_llms.html
Can I use this AI custom knowledge base in ChatGPT or in the OpenAI Playground?
I got this error after ask_ai(): "RetryError[]". How can I fix it?
Thank you Irina
One question: whenever we ask a question, does it go through the entire index every time? And does that cost us a lot of tokens for each question? Because if that's the case, we would run out of credit if we deployed an app like that for users online.
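From what I understand, the whole index is never sent: the question is embedded, only the top-k most similar chunks ride along as context, and you pay for the question + those chunks + the answer. A sketch, assuming the old query API from the video:

from llama_index import GPTSimpleVectorIndex

index = GPTSimpleVectorIndex.load_from_disk("index.json")  # the notebook's saved index
response = index.query(
    "How many people were interviewed?",
    similarity_top_k=2,        # only the 2 most similar chunks travel with the question
    response_mode="compact",   # pack the context into as few LLM calls as possible
)
print(response)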
You nailed it... I'll follow you on Twitter.
How would you adapt this to derive context from dynamic data being generated in a website?
Hi @Irina Nik, can you check the code again? I think some of the libraries are already outdated, e.g. GPTIndex.
Thanks for the great tutorial! For multiple documents, can you please advise on how I can retrieve the file name the contextual information is retrieved from?
Hello! Thank you for the helpful tutorial! What would happen if I ask a question in another language? Would this chatbot switch to the language as ChatGPT does? Thanks a lot.
Could you add this to someone’s website? If so, could you point me to a video you already have on the topic?
This is great, thank you! When asking questions to the AI, I didn't notice any custom instructions in use. How can you be sure it was answering only using the data given to it in the index?
Can you also make more videos for using custom data from other sources, such as databases? How about the ability to categorize?
One minor thing: When pronouncing the word "answer", the "w" is actually silent. (My wife is ESL and always asks me to correct her pronunciation, and I ask the same of her when I speak her native tongue.)
The code for 'Construct an index' no longer works. I get the following error msg: You tried to access openai.Embedding, but this is no longer supported in openai>=1.0.0
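That error is the openai 1.0 breaking change. Two ways out, sketched below (the second also needs a llama_index version that supports the 1.x client):

# option 1: pin the old client the notebook was written against
#   pip install "openai<1.0"

# option 2: move to the 1.x client style, which replaced openai.Embedding
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.embeddings.create(model="text-embedding-ada-002",
                                input="some text to embed")
vector = resp.data[0].embedding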
Can you set it up to read Python code bases?
Hi Irina, thanks for the video. I want to ask how you limit the model to answer only about your information. I.e., what would happen if a person asks a question out of context (like: "Can I go to Miami for holiday?")? Will it reply?
Thanks
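One trick that seemed to work in the old llama_index versions: pass a custom QA template that tells the model to refuse when the context has no answer. A sketch, assuming the 0.4/0.5-era QuestionAnswerPrompt API; the refusal wording is my own:

from llama_index import GPTSimpleVectorIndex, QuestionAnswerPrompt

QA_TEMPLATE = QuestionAnswerPrompt(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Answer the question using ONLY the context above. "
    "If the answer is not in the context, reply exactly: I don't know.\n"
    "Question: {query_str}\n")

index = GPTSimpleVectorIndex.load_from_disk("index.json")
# an out-of-context question should now ideally come back as "I don't know"
print(index.query("Can I go to Miami for holiday?", text_qa_template=QA_TEMPLATE))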
Irina, fantastic tutorial. It is very hard to find material this simple that explains how to train the AI. Thank you very much for sharing your knowledge.
Hello, nice video. Please let me know how this ensures our data remains our data. Won't OpenAI have access to it now?
Excellent
You feed text data files to provide the data to the model; what if I have an Excel file or a tabular data file?
And is the OpenAI API key free, or is it paid?
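One workable route for tabular files, since the reader in the video expects text: flatten each row into a sentence first (the file and column names below are made up):

import pandas as pd  # pip install pandas openpyxl

df = pd.read_excel("products.xlsx")  # hypothetical spreadsheet
lines = [f"{row['name']} costs {row['price']} and belongs to {row['category']}."
         for _, row in df.iterrows()]

# write it where the notebook's SimpleDirectoryReader will pick it up
with open("context_data/data/products.txt", "w") as f:
    f.write("\n".join(lines))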
I am a researcher and I followed this up until the 'installing the dependencies' part. I have never used GitHub, but it would be great if this could be covered too. Also, Colab didn't let me upload my txt files from GitHub; I received an error saying syntax error. Need a simpler method.
Can I implement this locally?
Would my knowledge base be private?
Can I ingest HTML data? I want to upload technical documentation as a knowledge base so that I can use the prompt to translate the code into a human-readable form or be able to make more sense of the code to make improvements. Because it's proprietary syntax, I want to use the documentation. Thank you for this video!
How can I use this as the backend of my website?