Langchain + ChatGPT + Pinecone: A Question Answering Streamlit App
- Published 17 Jun 2023
- In this exciting video tutorial, I walk you through creating a Streamlit application that lets you search and query PDF documents effortlessly. Using cutting-edge technologies such as Pinecone and an LLM (OpenAI's ChatGPT), I guide you step by step in harnessing the potential of these tools.
By leveraging Pinecone as a vector database and search engine, we enable lightning-fast search across PDF documents. Additionally, we employ an LLM to enhance the search functionality with question-answering capabilities, making your app even more versatile and intelligent.
To ensure smooth data preprocessing, chains, and other essential tasks, we utilize the incredible Langchain framework. With its powerful features, Langchain simplifies and streamlines the development process, enabling you to focus on building an exceptional PDF query search app.
Whether you're a beginner or an experienced developer, this tutorial provides a comprehensive guide to building your own Streamlit app with Pinecone, an LLM, and Langchain. Join me as we dive deep into natural language processing and create a game-changing application together!
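As a rough illustration of the preprocessing step mentioned above (splitting PDFs into chunks before embedding them), here is a minimal character-based splitter in plain Python. The chunk size and overlap are hypothetical defaults, not values taken from the video:

```python
def split_into_chunks(text, chunk_size=1000, overlap=200):
    """Split text into overlapping fixed-size chunks, mimicking
    a character-based text splitter used before embedding."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc_text = "x" * 2500
chunks = split_into_chunks(doc_text)
# 2500 chars with a step of 800 -> chunks starting at 0, 800, 1600, 2400
```

Each chunk would then be embedded and upserted into the vector database.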
Don't forget to like, share, and subscribe to stay updated on the latest advancements in AI/ML.
GitHub Repo: github.com/AIAnytime/QA-in-PD...
OpenAI API: platform.openai.com/account/a...
Langchain Doc: python.langchain.com/docs/get...
Pinecone Vector DB: www.pinecone.io/
Streamlit Chat Repo: github.com/AI-Yash/st-chat
LLM Playlist: • Large Language Models
#ai #python #coding
Your Queries:-
pinecone ai tutorial
pinecone ai memory
embeddings from language models
langchain
langchain tutorial
langchain agent
langchain chatbot
langchain tutorial python
chatgpt
chatgpt explained
chat gpt
chatgpt how to use
chatgpt tutorial
question answering in artificial intelligence
question answering nlp
question answering app
streamlit tutorial
streamlit python
streamlit web app
thanks so much for posting this - it's been very helpful!
Just wanted to ask about the doc_preprocessing function: I sometimes get "ValueError: zero-size array to reduction operation maximum which has no identity" when trying to run Streamlit.
I first got the error when I downloaded a Google Sheet (containing text) as a PDF. So I deleted that file and retried with a Google Docs file downloaded as a PDF, and Streamlit loaded and worked fine.
But if I have both of the above-mentioned files, the error recurs. I'm assuming it must have something to do with the data type of the Google Sheet-based PDF messing with the DirectoryLoader module. But it's interesting that it ends up being a zero-size array.
Just wondering if you had any insights into the issue?
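One plausible cause, consistent with the description above: the sheet-derived PDF yields no extractable text, and an empty document later produces a zero-size array somewhere in the pipeline. A hedged workaround is to filter out empty documents after loading, before embedding. This sketch uses plain dicts to stand in for loaded document objects:

```python
def filter_empty_docs(docs):
    """Drop documents whose extracted text is empty; an empty document
    can surface downstream as 'zero-size array to reduction operation
    maximum which has no identity'."""
    kept, skipped = [], []
    for doc in docs:
        if doc.get("page_content", "").strip():
            kept.append(doc)
        else:
            skipped.append(doc)
    return kept, skipped
```

Logging the `skipped` list would confirm whether the sheet-based PDF is the one coming back empty.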
Yo bro, great video! However, I got an error: 'batch size exceeds maximum'. Does that mean I used too many documents? And how can I fix that?
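For context on this error: vector databases like Pinecone reject upserts whose batch exceeds a per-request limit, so the usual fix is to upsert in smaller batches rather than sending all vectors at once. A minimal sketch (the commented `index.upsert` call and the batch size of 100 are assumptions; check Pinecone's documented limits):

```python
def batched(items, batch_size=100):
    """Yield successive slices so each upsert request stays under
    the service's maximum batch size."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# vectors = [...]  # (id, embedding) tuples prepared for upsert
# for batch in batched(vectors, batch_size=100):
#     index.upsert(vectors=batch)  # hypothetical Pinecone index handle
```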
In the scenario of conversational bots, how do you limit the token consumption of the entire conversation?
For example, once consumption reaches 1,000 tokens, it would prompt that the tokens for this conversation have been used up.
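One simple way to implement the limit described above is to accumulate token counts per conversation (taken from the API response's usage field, or estimated with a tokenizer) and refuse further turns once the budget is spent. A minimal sketch with hypothetical names:

```python
class TokenBudget:
    """Track cumulative token usage across a conversation
    and refuse turns once a fixed limit is reached."""
    def __init__(self, limit=1000):
        self.limit = limit
        self.used = 0

    def consume(self, tokens):
        """Return True and record usage if within budget,
        False if this turn would exceed the limit."""
        if self.used + tokens > self.limit:
            return False  # budget exhausted; tell the user
        self.used += tokens
        return True
```

Before each turn, call `consume()` with that turn's token count and show a "tokens used up" message when it returns False.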
Great, thanks for the video. Do you know how I can make it show the sources it consulted? For example, showing the links the information was extracted from (in the case of web scraping)?
Please look at my other videos in the LLM playlist. I have shown source citation, etc. For web scraping examples, look at the OpenAI functions and Langchain agent videos.
Thank you!!! Great resource. Pinecone has moved to a serverless model, and apparently there has been quite a bit of movement in the Langchain packages. Would it be possible for us to have an updated script as of Apr 2024? Otherwise, I would be very interested in a private meeting to discuss this. Would greatly appreciate it!
Great video, thanks! Would it work if we replaced OpenAI with LaMini-LM, in order to run it on a CPU?
What timing, @jorgerios4091... I just finished recording the exact same video: Langchain + Sentence Transformers + Chroma DB + LaMini-LM. The video will be available tomorrow, end to end with a workflow explanation. Please subscribe to the channel after tomorrow's video if you like it.
@@AIAnytime I'm in Sir! Big thanks!
Here we go: ua-cam.com/video/rIV1EseKwU4/v-deo.html
This one is good, but I have one question: if a PDF has information in table format, will it still be able to retrieve data from it?
Hi Akshay, it should retrieve from tables too. You can also check out TAPAS, which works well on tables and is open source. Find it here: huggingface.co/docs/transformers/model_doc/tapas
@@AIAnytime What about text documents like .txt or .docx files?
So we only use OpenAI for generating embeddings, and Pinecone for storing the embeddings and querying results?
Hi Abdul, in this case yes! OpenAI's text-embedding-ada-002 for embeddings, through the Langchain integration. Pinecone acts as a vector DB with many features: it stores information in lower-dimensional spaces, has built-in semantic search capabilities, implements similarity algorithms like cosine, etc. It is also faster when you compare it to traditional mechanisms.
And the LLM is used to produce human-like output once the relevant information is retrieved via the embeddings.
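For reference, the cosine similarity mentioned above is just the normalized dot product between two embedding vectors; the vector database computes it at scale over millions of vectors, but the formula itself is small:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors:
    dot(a, b) / (|a| * |b|), ranging from -1 to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Identical directions score 1.0; orthogonal vectors score 0, which is why semantically similar chunks rank highest at query time.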
@@AIAnytime Can you build a generative chatbot without using an OpenAI API key, storing the embeddings somewhere free?
thanks heaps for this tutorial. Are you able to add from langchain.prompts import (
ChatPromptTemplate,
MessagesPlaceholder,
SystemMessagePromptTemplate,
HumanMessagePromptTemplate
)
and
from langchain.chains import ConversationChain
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferWindowMemory
import streamlit as st
from streamlit_chat import message
and get the app you build in this tutorial to have buffer memory and retrieve answers only from the /data folder (corpus of PDFs), or is this not possible? I can't find a video that explains how to QA my own corpus and use buffer memory.
Thank you for this video. Would this work to generate Q&A from a PDF?
Watch my Question Answer Generator video. In the LLM playlist.
@@AIAnytime Thank you. If I want to store vectors on Supabase, is it the same process as with Pinecone?
Hi, I am getting this error in your code. Can you please check this?
, in partition
elements = partition_pdf(
NameError: name 'partition_pdf' is not defined. Did you mean: 'partition_xml'?
A few steps if you are getting the partition_pdf not found error:
1. Check your Unstructured version. You need to install: pip install unstructured==0.7.12
2. If the above doesn't work, do: pip install langchain==0.0.251
Can you make a video on Langchain streaming responses using RetrievalQA and Pinecone?
Very good idea. I am working on it. Will post soon.... Thanks
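Until that video is out, the general shape of streaming is: the LLM emits tokens one at a time, and a callback pushes each token into the UI as it arrives instead of waiting for the full answer. A library-agnostic sketch of that pattern (the function and callback names here are illustrative, not LangChain's actual API):

```python
def stream_response(tokens, on_token):
    """Deliver tokens one at a time via a callback, the pattern a
    streaming callback handler uses to push partial output into a UI,
    then return the assembled full response."""
    parts = []
    for tok in tokens:
        on_token(tok)   # e.g. append to a Streamlit placeholder
        parts.append(tok)
    return "".join(parts)
```

In LangChain this role is played by a callback handler passed to the LLM; the retrieval step runs first, then the generated tokens stream through the handler.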