Fixed the issue and re-uploaded the video.
I love your videos. Before starting the setup, could you make sure that your code is future-proof by sharing the Python/Conda version you are using?
Preferably, start with the `pyenv install` command. Could you also commit the `requirements.txt` file with the version number you used? Thank you 🙏
Hi sir, I am getting the error "cannot import name 'VectorStoreIndex' from 'llama_index' (unknown location)". Can you help me with this?
Same issue for me too @@AtharvaWeginwar
Can you also use an OCR model to read images in PDFs?
Hey Krish, I've been following a lot of your videos lately, especially the Road to Gen AI repo. I wanted to ask: is this similar to the Chat with PDF app which we built with the Gemini model? If so, what makes this RAG and not that, or is it the same?
This session is fantastic! It would be great if you could also demonstrate how to change the default embedding, specify which embedding the model is using, and explain how to switch between different LLMs such as GPT. Additionally, it would be helpful to cover how to utilize this dataset to answer specific questions.
You are amazing and your videos taught me more than any of my graduate professors could. Thank you
I really love the way you teach these hard concepts with so much enthusiasm that it sounds so easy. Thank you so so much.
Much-awaited series. Would be nice if we had even more complex RAG applications.
Please use open-source LLMs.
As a student, it's difficult to come up with the budget for an OpenAI API key.
BTW, just wanted to thank you for everything you're doing!!
Hey, for the above task there is no need for the OpenAI API.
But how?
The source nodes are not simply other curated answers. Instead, they are the most similar chunks (nodes) retrieved from the vector store based on the query. These retrieved nodes serve as the primary source for constructing the final answer. In essence, the vector store identifies and retrieves the most relevant information from similar contexts or data points, which is then used to generate the final response.
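For illustration, here is a minimal sketch of inspecting those source nodes, assuming an index built the way the video does it and llama-index >= 0.10:

query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("What is attention?")
print(response)  # the synthesized final answer
for node_with_score in response.source_nodes:
    # each retrieved node carries its similarity score and the original chunk text
    print(node_with_score.score, node_with_score.node.get_content()[:100])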
Thanks for the video, I have been constantly learning from your videos
eagerly waiting for a video to include databases.
Thanks Sir.
May I know where we used OpenAI here? Can we use any open-source model like Llama-2?
Thank you - this was a great tutorial. Liked and subscribed.
Great channel Krish! Is it possible to create a RAG/LLM model to interact with a database to ask statistical-type questions: what is the max, min, median, mean? Basically, to create a chatbot for non-technical users to interact with spreadsheets.
Hi Krish sir, thank you very much for this video❤
I love how verbose this is. Thank you!
Amazing energy Krish. I am your student in the Master GenAI class. I am trying this project but I am getting an ImportError while loading VectorStoreIndex and SimpleDirectoryReader from llama_index. I have tried loading only one, but status quo. Could you please guide me to fix it?
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
Thank you very much, Sir. In your LlamaIndex playlist, it says five videos so far, but 2 unavailable videos are hidden. Do I have to pay and become a member to be able to see the full playlist? Thanks again for the amazing videos!
Since we are using OpenAI, does it mean we are using one of the GPT models? There were no parameters in the code to choose which LLM to use. How do we select a particular OpenAI model?
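In case it helps, a sketch of pinning an explicit OpenAI model, assuming llama-index >= 0.10 with the llama-index-llms-openai package installed:

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI  # pip install llama-index-llms-openai

# Explicitly choose the model; otherwise LlamaIndex falls back to its default OpenAI model
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
# Any query engine created afterwards will use this model for response synthesis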
So is it more powerful than Azure AI Search? Or does it do the same thing as AI Search (Azure Cognitive Search)?
*SPOILER ALERT*
it's not a 27-min project!
I think that holds for every one of sir's project videos :)
I am using the Mistral open-source model, and I want to store the relevant documents that are retrieved. How do I do that?
We can do the same thing in LangChain, so what is the difference?
Hi Krish, I have a doubt regarding the project I am doing. The project is that from a PDF file I need to create an Excel file which has 5 columns, and the info in the Excel file is to be filled from the PDF. Can I get an approach to solving the problem using an LLM? I am looking forward to hearing from you.
Hello,
Thank you so much for this video.
I have a question related to summarization questions over LLM documents.
For example, with a vector database holding thousands of documents with a date property, can I ask the model how many documents I received in the last week?
Hello Krish, thanks for this informative video on RAG and LlamaIndex. I have one doubt: when you query "what is attention is all you need", the source with a 0.78 similarity score is chosen for the Final Response instead of the source with a 0.81 similarity score. Why?
Are you going to cover how to do the LangChain integration that was mentioned in the first video of the series and is included in the diagram pulled up at 25:09 (same as the first video)?
Thanks! Just a quick question:
For indexing the documents, does it call the OpenAI API internally? I understand that for retrieval it calls the OpenAI API to formulate the final answer. But I am unclear whether it calls the API for indexing. I need to index a 10,000-page document, so I have to account for cost if it calls the OpenAI API.
As per my knowledge, for Indexing it doesn't use OpenAI. For retrieval it does.
Correct me if I am wrong
@@vivekshindeVivekShinde We are using the API during indexing because by default we are utilizing OpenAI's text embedding. Indexing involves embedding the text chunks and storing them. However, you have the freedom to change the embedding method from OpenAI to any other open-source option available for this purpose.
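A minimal sketch of swapping the default OpenAI embedding for a local one, assuming llama-index >= 0.10 with the llama-index-embeddings-huggingface package installed:

from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding  # pip install llama-index-embeddings-huggingface

# Embed locally, so indexing makes no OpenAI API calls
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
index = VectorStoreIndex.from_documents(documents)  # `documents` loaded as in the video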
@@ajg3951 no
Thank you sir for making such videos, these are amazing videos🤩🤩
Wow sir, I was waiting for this video ❤
In your playlists, you have been using the OpenAI LLM with some API key, but where are you using it and linking it with LlamaIndex? Something is missing here....
In your playlists, you have been using the LLM with some API key, but where is the RAG here?
In .env, how can we store our API keys? I mean, in quotes like "this format", or not??
Hello Krish, first of all, I'd like to thank you for all your guidance. Your videos are my main source of study. Now, my query related to this video: the code has changed from what you are showing. Most of it remains the same, with the addition of `core` to the library path. But I couldn't find the equivalent for VectorIndexAutoRetriever, mainly which keyword arguments to use inside it. Currently it's asking for a vector_store_info apart from the index and similarity_top_k.
Excellent Tutorial. Thanks
My requirements.txt is not installing... it's throwing an error.
Sir, you haven't provided the Attention PDF and the YOLO PDF.
Sir, as you know, libraries like llama-index are still undergoing various changes, so please try to mention the exact version of the library in the requirements.txt files.
Sir, in companies do people work in Google Colab or Jupyter Notebook?
Instead of using an LLM to generate embeddings of the input data, we are using LlamaIndex here to embed and index it?
We are getting an error that says:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
c:\Users\wwwdo\Desktop\LLAMA_INDEX\Llamindex-Projects\Basic Rag\test.ipynb Cell 1 line 4
1 ## Retrieval augmented generation
3 import os
----> 4 from dotenv import load_dotenv
5 load_dotenv()
ModuleNotFoundError: No module named 'dotenv', even though I tried adding python-dotenv to the requirements.txt
Same here
Did you find a workaround?
Create a file with the name .env
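For example, something like this (the value shown is a placeholder, not a real key):

# .env
OPENAI_API_KEY=sk-your-key-here

and then in the notebook:

import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env into the process environment
api_key = os.getenv("OPENAI_API_KEY")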
Can't see the PDF; where have you uploaded the PDF?
Sir, I need your help. I am using LlamaIndex and saving the embeddings in Pinecone using a sentence transformer, but I am not able to connect with Pinecone.
Thank you so much Krish!
Can you please create a video on how we can summarize a long PDF with Mistral or Llama-2 to get a very efficient output? With OpenAI we have a great amount of context length, but with these open-source LLM models we are restricted when summarizing a large PDF.
Hi Krish, great video. Do you plan to use open-source LLMs? The reason being that private data is the key in all the industries.
Is it fetching data from the internet if it didn't find the answer in the PPT indexes?
What about doing all this but using only open source models from HF?
Sir, congrats on your lessons! I'm from Brazil. I tried other PDFs in Portuguese. At the end of the response, the text came in English: "These are the enteral nutritional requirements for preterm infants weighing less than 1500g."
Is it possible to get everything in Portuguese?
Tks a lot
Did you try to ask LLM to translate the response to Portuguese?
Can anyone link the LlamaIndex playlist that Krish Naik has started? I can't seem to find it somehow.
Out of curiosity, why are you using python 3.10 instead of the current stable version 3.12?
Llama-index installation is giving errors. Any suggestions?
Hi Krish, if there is an option to store the index on the hard disk, then why do we need a vector store like Chroma DB?
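For what it's worth, the built-in on-disk persistence looks like this (a sketch, assuming llama-index >= 0.10); a dedicated vector store like Chroma mainly adds scale, metadata filtering, and concurrent access on top of it:

from llama_index.core import StorageContext, load_index_from_storage

# Save the default in-memory index to disk
index.storage_context.persist(persist_dir="./storage")

# Reload it later without re-embedding anything
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)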
Hi sir, really excellent content. I am following all your playlists to learn GenAI. I have one query in this video: why are we not using any embedding model here? It's mentioned in the title that this is a RAG application using LlamaIndex and OpenAI, but I didn't find any call to OpenAI here. Please correct me if anything is wrong in my understanding.
Krish Ji, hi. We are in the stock market and use ML, which we only use for LSTM and Weka, and to some extent KNIME and RapidMiner, for building simple models involving moderate-sized datasets (4000 to 7000 instances, which may go up to 10000 instances, with 8-10 features), hence not very big models as may be termed in actual serious ML. We saw in one of your videos you building an LSTM on TF, on your GTX 1650 laptop we guess. We had been training our models on CPU only till now, and it consumed a lot of time. However, we have recently started working on a sentiment library and wish to implement it into our models to make some auto-trading bots. Could you please guide us on our laptop purchase? I mean, will a 1650 be good enough, or do we need to invest heavily? We have shortlisted some budget gaming laptops in the 70-80k range with an RTX 4050 or 3050. Your valuable suggestions will be of great help. We don't want to waste our money, and we think you are quite well versed in the subject. Your suggestions please........
Where can I find those PDFs used in the project?
Can someone help me? Where can I find the PDF files used in this video?
Sir, can you please use an open-source model in the next video, such as Google PaLM? I tried using the PaLM model, but VectorStoreIndex is constantly demanding an OpenAI API key. I even took help from the docs, but I am only able to get a response without chaining the PDF.
Thank you Krish.
Important Notes:
llama_index doesn't support Python 3.12.
If you decide to use Python 3.11, then while importing you will need to use `from llama_index.core ...`.
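For example, a pinned requirements.txt might look like this (versions are illustrative, not the ones from the video; freeze whatever you actually tested with using pip freeze):

llama-index==0.10.*  # illustrative pin; the llama_index.core namespace arrived in 0.10
python-dotenv        # pin this too once you know your working version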
Which location is to be added in method_location?
What is the need for OpenAI in this video?
For response synthesis. Once the relevant nodes are retrieved, they are passed as context to the LLM (OpenAI) model, and then the LLM provides the answer to the user's query in a much better way.
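Roughly, the flow described above as a sketch, assuming an index built as in the video:

# 1. Retrieval: find the top-k most similar nodes in the vector store
retriever = index.as_retriever(similarity_top_k=2)
nodes = retriever.retrieve("what is attention is all you need")

# 2. Response synthesis: the query engine stuffs those nodes into the LLM's
#    context window and asks it to answer the user's query from them
query_engine = index.as_query_engine(similarity_top_k=2)
print(query_engine.query("what is attention is all you need"))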
Hi Krish, can you also make a roadmap video on data engineering?
Getting an error when importing CohereRerank:
ImportError: cannot import name 'CohereRerank' from 'llama_index.core.postprocessor'
Also getting the error while importing SimilarityPostprocessor:
from llama_index.core.indices.postprocessor import SimilarityPostprocessor
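If it helps, these import paths worked for me on llama-index >= 0.10 (CohereRerank moved into its own integration package, so treat this as a best-guess fix):

# pip install llama-index-postprocessor-cohere-rerank
from llama_index.postprocessor.cohere_rerank import CohereRerank
# note: core.postprocessor, not core.indices.postprocessor
from llama_index.core.postprocessor import SimilarityPostprocessor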
from llama_index import VectorStoreIndex,SimpleDirectoryReader
documents=SimpleDirectoryReader("data").load_data()
ImportError Traceback (most recent call last)
Cell In[20], line 1
----> 1 from llama_index import VectorStoreIndex,SimpleDirectoryReader
2 documents=SimpleDirectoryReader("data").load_data()
ImportError: cannot import name 'VectorStoreIndex' from 'llama_index' (unknown location)
How can I fix this issue? Any suggestions, please!
Yes, I got it.
The correct code should be like this:
from llama_index.core import VectorStoreIndex,SimpleDirectoryReader
documents=SimpleDirectoryReader("data").load_data()
@@Decoder_Sami That little tidbit took me about 3 hours to figure out. Thanks for posting!
Can LlamaIndex be used with Llama? Why did OpenAI name it after Meta's Llama?
The .env file... how did you make it? Like, do you keep the original key value in it, or the name we gave, 'open_ai_api_key'?
It should contain something like this: OPENAI_API_KEY='insert your OpenAI key'
Thanks.. but in this video, there was no OpenAI used. Please correct me if I am wrong.
There is; it's in the .env file mentioned at 2:56. It should contain something like this: OPENAI_API_KEY='insert your OpenAI key'
@@ardensarmiento8417 But where is it used?
Very basic question: is LlamaIndex using the OpenAI API key you initialized in the OS environment?
Also, where exactly did you use OpenAI? I am not able to understand it.
No. I think LlamaIndex is not using the OpenAI API key. Also, he didn't use it anywhere in the project. Like he said, in future we will create more complex conversational bots; maybe at that time he'll use it. He just added that OpenAI part for the sake of maintaining the future flow. I might be wrong. Feel free to correct me.
@@vivekshindeVivekShinde 'VectorStoreIndex' is using OpenAI internally for generating embeddings.
@@subhamjyoti4189 I don't think VectorStoreIndex uses OpenAI's embeddings.
Thank you for the awesome video. Can you please tell the best approach for a multiple-PDF chatbot? The PDFs can have text, images, and tables. The answer should contain text, images, and tables from the PDFs themselves (or get answers from them) wherever required.
Facing a similar issue. Let me know if you find something. It'll be helpful.
@@vivekshindeVivekShinde Sure.... did you find any solution?
Can we use Gemini with LlamaIndex?
Can you create an application that indexes images and creates a prompt with similarity search for given image content?
Can we use this for the Urdu language as well?
Sir, I don't have the paid OpenAI key, so while running the code I am getting the error (RateLimitError: You have exceeded your current quota) at the line index=VectorStoreIndex.from_documents(documents, show_progress=True).
Please tell me how to solve this.
Either create a new account and get free but limited access for 30 days, or use Gemini Pro.
Did you find any alternative? I am stuck here as well.
@@shravaninevagi5729 In the source file .venv\Lib\site-packages\llama_index\core\embeddings\utils.py, change the below to use GooglePaLMEmbedding, which is working in my case.
Install llama-index-embeddings-google with the command "pip install llama-index-embeddings-google":
"""Embedding utils for LlamaIndex."""
import os
from typing import TYPE_CHECKING, List, Optional, Union
if TYPE_CHECKING:
from llama_index.core.bridge.langchain import Embeddings as LCEmbeddings
from llama_index.core.base.embeddings.base import BaseEmbedding
from llama_index.core.callbacks import CallbackManager
from llama_index.core.embeddings.mock_embed_model import MockEmbedding
from llama_index.core.utils import get_cache_dir
from llama_index.embeddings.google import GooglePaLMEmbedding
EmbedType = Union[BaseEmbedding, "LCEmbeddings", str]
def save_embedding(embedding: List[float], file_path: str) -> None:
"""Save embedding to file."""
with open(file_path, "w") as f:
f.write(",".join([str(x) for x in embedding]))
def load_embedding(file_path: str) -> List[float]:
"""Load embedding from file. Will only return first embedding in file."""
with open(file_path) as f:
for line in f:
embedding = [float(x) for x in line.strip().split(",")]
break
return embedding
def resolve_embed_model(
embed_model: Optional[EmbedType] = None,
callback_manager: Optional[CallbackManager] = None,
) -> BaseEmbedding:
"""Resolve embed model."""
from llama_index.core.settings import Settings
try:
from llama_index.core.bridge.langchain import Embeddings as LCEmbeddings
except ImportError:
LCEmbeddings = None
# Check if embed_model is 'default' or not specified
if embed_model == "default" or embed_model is None:
# Initialize Google PaLM embedding
google_palm_embedding = GooglePaLMEmbedding()
embed_model = google_palm_embedding
return embed_model
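A less invasive alternative to editing site-packages (which gets overwritten on every upgrade) might be to override the default embedding in your own code instead; a sketch, assuming llama-index >= 0.10 and a PaLM API key in the environment:

import os
from llama_index.core import Settings
from llama_index.embeddings.google import GooglePaLMEmbedding  # pip install llama-index-embeddings-google

# Override the default (OpenAI) embedding globally instead of patching utils.py
Settings.embed_model = GooglePaLMEmbedding(api_key=os.getenv("GOOGLE_API_KEY"))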
Hi there Krish, amazing tutorial once again, but I'm running into the issue that the "maximum context length is 8192 tokens". How can we best chunk per PDF page/chapter if the PDF size > 8k tokens?
*EDIT*: Our use-case is the following: we want to retrieve 100% accurate text from a page or chapter. Is this possible, or does the AI only know how to summarize?
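One option (a sketch, assuming llama-index >= 0.10) is to use smaller chunks at indexing time so each retrieved context stays well under the limit; the retrieved chunks in response.source_nodes are verbatim text from the PDF, so exact text is recoverable even though the final answer is generated:

from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter

# Smaller chunks keep the retrieved context well under the 8192-token limit
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=64)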
Sir can you please make an implementation video on TableGPT?
How to add chat memory?
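One way, as a sketch assuming llama-index >= 0.10: use a chat engine instead of a query engine, since it keeps the conversation history for you:

chat_engine = index.as_chat_engine(chat_mode="condense_question")
print(chat_engine.chat("What is attention?"))
print(chat_engine.chat("Who introduced it?"))  # follow-up is resolved against the chat history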
Do we have to purchase the OpenAI API?
great tutorial! Thanks for sharing!
How is this different from the embedding technique??
As far as I understand, RAG is based on embeddings for similarity search. LlamaIndex is just a framework to build applications on top of that.
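Right, and the whole pipeline from the video fits in a few lines, which is the framework part; a sketch, assuming PDFs in a data/ folder and llama-index >= 0.10:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()  # parse the PDFs
index = VectorStoreIndex.from_documents(documents)  # embed + index (the "embedding technique")
print(index.as_query_engine().query("What is attention?"))  # retrieve + synthesize (RAG)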
How can we start this in Colab?
Sir, can you please upload a video for multi-doc RAG using LlamaIndex and an agent other than OpenAI.
Teach about Direct Preference Optimization
How to create an OpenAI API key? Please tell me, please help me with this doubt.
Can we have some content where we fine-tune as well as use feedback or advanced RAG for Q&A ❤❤ Or the triplets way for RAG?
Don't see the PDFs.
Hi sir,
Previously I tried with Gemini Pro.
In that project, while extracting text from a 32-page PDF, it's not extracting all the text...
That's why I am not able to get perfect answers..
What should I do, sir...
Please help me solve this.
Use this technique, it will work.
@@krishnaik06
Sure sir
@@krishnaik06
Thank you so much sir for helping a lot of students....
You are Amazing😍
Waiting for more projects.
And also one request from my side sir... Please share some project ideas with us as assignments.
It will help us to do them on our own.
Please sir... please share some application ideas.
@@chinnibngrm272 omg stop begging.
Can you make a hands-on video on how to do RAFT with RAG?
Why don't you use VS Code for code-writing purposes? Why are you using PyCharm inside VS Code?
ERROR: Failed building wheel for greenlet
Failed to build greenlet
ERROR: Could not build wheels for greenlet, which is required to install pyproject.toml-based projects
Getting this error while installing the frameworks from requirements.txt.
Hey Krish, can you do an end-to-end project on model fine-tuning?
You should make the same video using open-source LLMs. If we can make the project for free, why should we pay?...... And also make an end-to-end Streamlit app.
I'm getting a rate limit error, sir.
Then use Gemini, my brother.
Please, can you build a project based on a document summarisation app using RAG and an LLM, without locally downloading the LLM and without using a GPT model? It would be very helpful. Or if anyone among the viewers can guide me through this, it would be very helpful.
ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'c:\\programdata\\anaconda3\\lib\\site-packages\\__pycache__\\typing_extensions.cpython-39.pyc'
Consider using the `--user` option or check the permissions.
WARNING: Ignoring invalid distribution -umpy (c:\programdata\anaconda3\lib\site-packages)
WARNING: Ignoring invalid distribution - (c:\programdata\anaconda3\lib\site-packages)
Getting these errors and warnings when I am trying to install the packages from the requirements.txt file.
Kindly help.
Gr8 Video
🔥🔥🔥
Why do you keep saying "open API"? Rather, it is the OpenAI API key.
You should really look into Poetry instead of using conda. Also, why are you assigning an environment variable to the same environment variable? If you can `getenv` it, it's already in `os.environ`.
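For context, the pattern being criticized presumably looks like this (my reconstruction, not the exact code from the video):

import os
from dotenv import load_dotenv

load_dotenv()
# Redundant: load_dotenv() already put the key into os.environ,
# so reading it back and assigning it again is a no-op
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")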
Sad that openAI api is no longer free
Thank you!
With all due respect, the speed with which you are posting videos makes it very difficult to keep up with the learning pace.
😂
Hello Krish,
I have not seen any video related to how to evaluate LLMs.
Can you please upload videos on how to evaluate an LLM model and which evaluation metrics can be used for a specific use case... like Q&A, summarization, etc.
As I am getting this question in every interview and am not able to answer it.