Langchain + ChatGPT + Pinecone: A Question Answering Streamlit App

  • Published 17 Jun 2023
  • In this video tutorial, I walk you through creating a Streamlit application that lets you search and query PDF documents effortlessly. Using Pinecone and an LLM (OpenAI's ChatGPT), I guide you step by step in harnessing the potential of these tools.
    By leveraging Pinecone as a vector database and search engine, we enable fast search over PDF documents. We then use the LLM to add question-answering on top of the search results, making the app more versatile and intelligent.
    For data preprocessing, chains, and the other plumbing, we rely on the Langchain framework, which simplifies and streamlines development so you can focus on building the PDF query app itself.
    Whether you're a beginner or an experienced developer, this tutorial is a complete guide to building your own Streamlit app with Pinecone, an LLM, and Langchain. Join me as we dive into natural language processing and build the application together! (A minimal code sketch of the pipeline is included below, after the links.)
    Don't forget to like, share, and subscribe to stay updated on the latest advancements in AI/ML.
    GitHub Repo: github.com/AIAnytime/QA-in-PD...
    OpenAI API: platform.openai.com/account/a...
    Langchain Doc: python.langchain.com/docs/get...
    Pinecone Vector DB: www.pinecone.io/
    Streamlit Chat Repo: github.com/AI-Yash/st-chat
    LLM Playlist: • Large Language Models
    #ai #python #coding
    Your Queries:-
    pinecone ai tutorial
    pinecone ai memory
    embeddings from language models
    langchain
    langchain tutorial
    langchain agent
    langchain chatbot
    langchain tutorial python
    chatgpt
    chatgpt explained
    chat gpt
    chatgpt how to use
    chatgpt tutorial
    question answering in artificial intelligence
    question answering nlp
    question answering app
    streamlit tutorial
    streamlit python
    streamlit web app
    Langchain + ChatGPT + Pinecone: A Question Answering Streamlit App
  • Science & Technology
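A minimal sketch of the pipeline described above, assuming the pre-0.1 Langchain API and the legacy pinecone-client in use at the time of the video (the index name, folder, and model names are placeholders, not taken from the repo):

import pinecone
from langchain.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# 1. Load and chunk the PDFs from a local folder.
docs = DirectoryLoader("data/", glob="**/*.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# 2. Embed the chunks with OpenAI and store them in a Pinecone index.
pinecone.init(api_key="PINECONE_API_KEY", environment="PINECONE_ENV")
embeddings = OpenAIEmbeddings()
vectorstore = Pinecone.from_documents(chunks, embeddings, index_name="pdf-qa")

# 3. Answer questions with ChatGPT over the retrieved chunks.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
)
print(qa.run("What is this document about?"))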

COMMENTS • 27

  • @michael43420
    @michael43420 10 months ago

    Thanks so much for posting this - it's been very helpful!
    Just wanted to ask about the doc_preprocessing function - I sometimes get "ValueError: zero-size array to reduction operation maximum which has no identity" when trying to run Streamlit.
    I first got the error when I downloaded a Google Sheets file (containing text) as a PDF. So I deleted that file, retried with a Google Docs file downloaded as a PDF, and Streamlit loaded and worked fine.
    But if I have both of the above-mentioned files, the error recurs. I'm assuming it has something to do with the data type of the Google Sheets-based PDF messing with the DirectoryLoader module. But it's interesting how it ends up being a zero-size array.
    Just wondering if you had any insights into the issue?
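    A possible workaround, sketched under the assumption that the error comes from unstructured failing on that one PDF: load the files individually and skip any that fail to parse, instead of letting a single bad PDF break the DirectoryLoader call. This doc_preprocessing variant is hypothetical, not the repo's implementation:

    from pathlib import Path
    from langchain.document_loaders import UnstructuredPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    def doc_preprocessing(data_dir="data/"):
        # Load PDFs one by one so a single unparseable file can be skipped.
        docs = []
        for pdf_path in Path(data_dir).glob("**/*.pdf"):
            try:
                docs.extend(UnstructuredPDFLoader(str(pdf_path)).load())
            except ValueError as err:  # e.g. "zero-size array to reduction operation ..."
                print(f"Skipping {pdf_path}: {err}")
        splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
        return splitter.split_documents(docs)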

  • @thomashietkamp9859
    @thomashietkamp9859 9 months ago

    Yo bro, great video! However, I got an error 'batch size exceeds maximum'... Does that mean I'm using too many documents? And how can I fix that?
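    If the error comes from the Pinecone upsert, it usually means too many vectors were sent in a single request rather than too many documents overall (an assumption based on the message, not confirmed here). A hedged workaround sketch, adding the chunks in smaller batches:

    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.vectorstores import Pinecone

    def add_in_batches(chunks, index_name="pdf-qa", batch_size=100):
        # Upsert the document chunks a few at a time so no single request exceeds
        # the service's batch limit (batch_size is a guess, not a documented value).
        # Assumes pinecone.init(...) has already been called, as in the pipeline sketch above.
        store = Pinecone.from_existing_index(index_name, OpenAIEmbeddings())
        for i in range(0, len(chunks), batch_size):
            store.add_documents(chunks[i:i + batch_size])
        return store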

  • @FCrobot
    @FCrobot 10 months ago

    In a conversational-bot scenario, how can I limit the token consumption of the entire conversation?
    For example, once consumption reaches 1,000 tokens, the app should tell the user that the tokens for this conversation have been used up.
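    One way to do this (a sketch, not from the video) is to measure each call's consumption with Langchain's OpenAI callback and refuse further turns once a budget is spent:

    from langchain.callbacks import get_openai_callback

    TOKEN_BUDGET = 1000
    tokens_used = 0

    def ask(qa_chain, query):
        # Run the chain only while the conversation's token budget lasts.
        global tokens_used
        if tokens_used >= TOKEN_BUDGET:
            return "The tokens for this conversation have been used up."
        with get_openai_callback() as cb:   # records prompt + completion tokens for the call
            answer = qa_chain.run(query)
        tokens_used += cb.total_tokens
        return answer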

  • @hernandocastroarana6206
    @hernandocastroarana6206 10 months ago

    Great. Thanks for the video. Do you know how I can make it show the sources it consulted? For example, showing the links the information was extracted from (in the case of web scraping)?

    • @AIAnytime
      @AIAnytime  10 months ago

      Please look at my other videos in the LLM playlist; I have shown source citation, etc. For web scraping examples, look at the OpenAI functions and Langchain agent videos.
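      For reference, a minimal source-citation sketch with RetrievalQA (same 0.0.x API as the video; vectorstore is the Pinecone store built earlier): set return_source_documents=True and read the metadata of the returned chunks - for scraped pages the source metadata is typically the URL.

      from langchain.chains import RetrievalQA
      from langchain.chat_models import ChatOpenAI

      qa = RetrievalQA.from_chain_type(
          llm=ChatOpenAI(temperature=0),
          chain_type="stuff",
          retriever=vectorstore.as_retriever(),
          return_source_documents=True,   # ask the chain to hand back the chunks it used
      )
      result = qa({"query": "What does the document say about pricing?"})
      print(result["result"])
      for doc in result["source_documents"]:
          print("source:", doc.metadata.get("source"))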

  • @GiangNguyen-ef6br
    @GiangNguyen-ef6br 2 months ago

    Thank you!!! Great resource. Pinecone has moved to a serverless model, and apparently there has been quite a bit of movement in the Langchain packages. Would it be possible for us to have an updated script as of Apr 2024? Otherwise, I would be very interested in a private meeting to discuss this. Would greatly appreciate it!

  • @jorgerios4091
    @jorgerios4091 1 year ago

    Great video, thanks! Would it work if we replaced OpenAI with LaMini-LLM, in order to run it on a CPU?

    • @AIAnytime
      @AIAnytime  1 year ago +1

      What timing, @jorgerios4091 - I just finished recording exactly that video: Langchain + Sentence Transformers + Chroma DB + LaMini LM. The video will be available tomorrow, end to end with a workflow explanation. Please subscribe to the channel after tomorrow's video if you like it.

    • @jorgerios4091
      @jorgerios4091 1 year ago

      @@AIAnytime I'm in Sir! Big thanks!

    • @AIAnytime
      @AIAnytime  1 year ago

      Here we go: ua-cam.com/video/rIV1EseKwU4/v-deo.html
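      For readers who land here first, a rough sketch of that CPU-only stack (Sentence Transformers embeddings + a local Chroma store + LaMini through a HuggingFace pipeline; the model checkpoints are assumptions, not taken from the video):

      from langchain.embeddings import HuggingFaceEmbeddings
      from langchain.vectorstores import Chroma
      from langchain.llms import HuggingFacePipeline
      from langchain.chains import RetrievalQA
      from transformers import pipeline

      # `chunks` = the split PDF documents, exactly as in the OpenAI/Pinecone version.
      embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
      store = Chroma.from_documents(chunks, embeddings, persist_directory="db/")

      # Small instruction-tuned model that runs on CPU; swap in another checkpoint if needed.
      llm = HuggingFacePipeline(pipeline=pipeline(
          "text2text-generation", model="MBZUAI/LaMini-T5-738M"))

      qa = RetrievalQA.from_chain_type(
          llm=llm, chain_type="stuff", retriever=store.as_retriever())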

  • @akshay_raut
    @akshay_raut 1 year ago +1

    This one is good, but I have one question: if a PDF has information in table format, will it still be able to retrieve data from it?

    • @AIAnytime
      @AIAnytime  1 year ago +1

      Hi Akshay, it should retrieve from tables as well. You can also check out TAPAS, which works well on tables and is open source. Find it here: huggingface.co/docs/transformers/model_doc/tapas
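      A small sketch of TAPAS via the transformers pipeline, in case it helps (the WTQ-finetuned checkpoint and the toy table are just examples):

      import pandas as pd
      from transformers import pipeline

      # TAPAS expects the table cells as strings.
      table = pd.DataFrame({"Product": ["A", "B"], "Revenue": ["120", "340"]})
      tqa = pipeline("table-question-answering", model="google/tapas-base-finetuned-wtq")
      print(tqa(table=table, query="Which product has the highest revenue?")["answer"])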

    • @abduljaweed2886
      @abduljaweed2886 1 year ago +1

      @@AIAnytime What about text documents like .txt or .docx files?

  • @abduljaweed2886
    @abduljaweed2886 1 year ago +1

    So we only use OpenAI for generating the embeddings, and Pinecone for storing the embeddings and querying the results?

    • @AIAnytime
      @AIAnytime  1 year ago +1

      Hi Abdul, in this case yes! OpenAI's ada-002 embeddings for embedding, through the Langchain integration. Pinecone acts as a vector DB with many features: it stores the information in lower-dimensional vector space, has built-in semantic search with similarity metrics like cosine, and so on. It is also faster than traditional search mechanisms.

    • @AIAnytime
      @AIAnytime  1 year ago +1

      And the LLM is used to produce human-like output once the relevant information has been retrieved from the embeddings.
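      To make that division of labour concrete, a short sketch against the same setup (vectorstore is the Pinecone store; the query is a placeholder): Pinecone does the cosine-similarity retrieval over the stored embeddings, and the LLM only phrases an answer from the retrieved chunks.

      from langchain.chat_models import ChatOpenAI

      # Retrieval happens inside Pinecone, via the vector store wrapper.
      relevant = vectorstore.similarity_search("What is the refund policy?", k=3)
      context = "\n\n".join(doc.page_content for doc in relevant)

      # The LLM only turns the retrieved context into a human-readable answer.
      answer = ChatOpenAI(temperature=0).predict(
          f"Answer using only this context:\n{context}\n\nQuestion: What is the refund policy?")
      print(answer)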

    • @abduljaweed2886
      @abduljaweed2886 1 year ago +1

      @@AIAnytime Can you build a generative chatbot without using an OpenAI API key, and also store the embeddings somewhere free?

  • @bongomango1
    @bongomango1 11 months ago

    Thanks heaps for this tutorial. Are you able to add
    from langchain.prompts import (
        ChatPromptTemplate,
        MessagesPlaceholder,
        SystemMessagePromptTemplate,
        HumanMessagePromptTemplate
    )
    and
    from langchain.chains import ConversationChain
    from langchain.chains import RetrievalQA
    from langchain.chat_models import ChatOpenAI
    from langchain.memory import ConversationBufferWindowMemory
    import streamlit as st
    from streamlit_chat import message
    so that the app you build in this tutorial has buffer memory and retrieves answers only from the /data folder (the corpus of PDFs), or is this not possible? I can't find a video that explains how to QA my own corpus and use buffer memory.
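    A hedged sketch of how those pieces could fit together - buffer-window memory plus retrieval restricted to whatever was indexed from /data - assuming the same langchain 0.0.x API and swapping RetrievalQA for ConversationalRetrievalChain (vectorstore is the Pinecone store built from the /data PDFs):

    from langchain.chains import ConversationalRetrievalChain
    from langchain.chat_models import ChatOpenAI
    from langchain.memory import ConversationBufferWindowMemory

    # Keep only the last few turns of the conversation in memory.
    memory = ConversationBufferWindowMemory(
        k=3, memory_key="chat_history", return_messages=True)

    chat_qa = ConversationalRetrievalChain.from_llm(
        llm=ChatOpenAI(temperature=0),
        retriever=vectorstore.as_retriever(),  # only the /data chunks are retrievable
        memory=memory,
    )
    answer = chat_qa({"question": "Summarise the second PDF"})["answer"]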

  • @NISRAL
    @NISRAL 10 months ago

    Thank you for this video. Would this work to generate Q&A pairs from a PDF?

    • @AIAnytime
      @AIAnytime  10 months ago

      Watch my Question Answer Generator video in the LLM playlist.

    • @NISRAL
      @NISRAL 10 months ago

      @@AIAnytime Thank you. If I want to store the vectors on Supabase, is it the same process as with Pinecone?

  • @binitkunal4627
    @binitkunal4627 11 months ago

    Hi, I am getting this error in your code - can you please check it?
    , in partition
    elements = partition_pdf(
    NameError: name 'partition_pdf' is not defined. Did you mean: 'partition_xml'?

    • @AIAnytime
      @AIAnytime  11 months ago

      A few steps if you are getting the partition_pdf not found error:
      1. Check your unstructured version. You need to run: pip install unstructured==0.7.12
      2. If the above doesn't work, run: pip install langchain==0.0.251

  • @ShivamKumar-iv4rk
    @ShivamKumar-iv4rk 11 months ago +1

    Can you make a video on streaming Langchain responses using RetrievalQA and Pinecone?

    • @AIAnytime
      @AIAnytime  11 months ago

      Very good idea. I am working on it. Will post soon.... Thanks
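      In the meantime, a rough sketch of what streaming looks like with the pre-0.1 API (for Streamlit you would replace the stdout handler with a custom callback that writes tokens to a placeholder; vectorstore is the Pinecone store from the pipeline sketch above):

      from langchain.chat_models import ChatOpenAI
      from langchain.chains import RetrievalQA
      from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

      streaming_llm = ChatOpenAI(
          temperature=0,
          streaming=True,                                # emit tokens as they are generated
          callbacks=[StreamingStdOutCallbackHandler()],  # swap for a Streamlit-aware handler
      )
      qa = RetrievalQA.from_chain_type(
          llm=streaming_llm, chain_type="stuff", retriever=vectorstore.as_retriever())
      qa.run("Give me a one-paragraph summary.")          # tokens print as they arrive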