Supercharge your Python App with RAG and Ollama in Minutes

  • Published 8 Jun 2024
  • This video shows how easy it is to build a RAG app with just Ollama and ChromaDB, using Python and nothing else. You can be up and running in five minutes or so.
    You can find the code for this video on my github repo at github.com/technovangelist/vi...
    Be sure to sign up to my monthly newsletter at technovangelist.com/newsletter
    And if interested in supporting me, sign up for my patreon at / technovangelist
  • Science & Technology

COMMENTS • 116

  • @G0rd0nG3ck0
    @G0rd0nG3ck0 2 months ago +16

    Hey Matt - I know I'm just an internet stranger in an endless ocean of internet noise, but I just wanted to drop you a comment to let you know I've really enjoyed your videos. You have a casual approach to production, or maybe it could better be described as a thorough planning and preparation process that results in a casual vibe for the viewer, and I dig the nuggets of wisdom I've gleaned over the past few months I've been watching your content. I work in tech (professionally) myself, and I mostly recreate in tech as well. I have been all over the place with my inspiration and dabbling over the past year and change. Image and video generation, LLMs, open source architectures and paid platforms, basically whatever I stumble across that looks nifty. I've recently been seeking some new inspiration, and your videos have been a breath of fresh air to watch and get the gears turning for me as I consider what I could dig into for a next project. I'm not a "developer" either but I've had great fun with Python and Ollama, and you explain how to use these tools in a manner that is very approachable. Keep up the great work.

  • @BetterThanTV888
    @BetterThanTV888 2 months ago +6

    The most amazing video I have watched in over a year. Your style of educating users is amazing. Great videography and editing. My only wish is that your channel shoots the moon 🚀and you get rewarded from the YT algorithm and compensated.

    • @technovangelist
      @technovangelist 2 months ago

      It’s growing faster than I expected so all good. Thanks so much for the comment

  • @juliprabhu
    @juliprabhu 1 month ago

    Tashi Dalek! I figured it out from the picture hanging on the side wall in your background. Great video. I have recently been using Ollama at work and I am loving it.

    • @technovangelist
      @technovangelist 1 month ago

      Ahh. I picked that up on one of my trips to Nepal. My sister used to run a free healthcare clinic in a small town called Jiri east of Kathmandu.

  • @markdkberry
    @markdkberry 1 month ago +2

    best quick explainer I found to date

  • @JohnMitchellCalif
    @JohnMitchellCalif 2 months ago +1

    super useful and clear! Subscribed.

  • @jz-xq4vx
    @jz-xq4vx 1 month ago

    enjoying your videos and Ollama - looking forward to the TS version of this one!

  • @brian2590
    @brian2590 2 months ago +6

    PDFs are a nightmare. I started building a semantic search platform for PDFs back in 2017. It needed to process thousands of PDFs a day, all within a very short time frame of half an hour. Then I added some early AI to recognize entities and sentiment. Now I am tasked with adding LLMs to the mix. The result is a massive project with many moving parts. It's polyglot: Java, Python, Rust, JS. It uses old and new NLP and AI technologies. I hate it, though there is no other way to pull off what I need to do without all of these moving parts. It's also too expensive to run all of this in a cloud. I feel for anyone tasked with a large RAG project and cringe a bit when I see people trying to throw a bunch of PDFs into an LLM. Don't do it! Ollama has been very helpful as this system progresses. Thank you!

    • @BelgranoK
      @BelgranoK 2 months ago

      I work on a project to build a RAG from PDFs using PHP. We tested some tools separately and they work.

    • @mrrohitjadhav470
      @mrrohitjadhav470 2 months ago

      @@BelgranoK May I ask for the list of tools?

  • @rayzorr
    @rayzorr 1 month ago

    Hey Matt. Love your work. This one took me a while to get up and running, but with just the right amount of cursing I was able to get it up. When I looked at the data in the database, there were a lot of blank lines in between the text. I assume it would be preferable to strip those out before chunking?
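
A minimal sketch of that cleanup step, assuming the fetched document is already a plain string. The helper name `collapse_blank_lines` is hypothetical and not part of the video's repo; it keeps a single blank line between paragraphs so paragraph boundaries survive, which is a design choice rather than a requirement.

```python
import re


def collapse_blank_lines(text: str) -> str:
    """Trim trailing whitespace and squeeze runs of blank lines down to one."""
    # Whitespace-only lines become truly empty so the regex below can match them.
    lines = [line.rstrip() for line in text.splitlines()]
    cleaned = "\n".join(lines)
    # Three or more newlines in a row means two or more blank lines: keep just one.
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
    return cleaned.strip()
```

Calling this on the text before it is passed to the chunking function should keep the stored chunks free of long empty stretches.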

  • @khalifarmili1256
    @khalifarmili1256 1 month ago

    that awkward silence in the end 😅, Thanks a lot for the insights 🎉🎉❤

  • @PoGGiE06
    @PoGGiE06 2 months ago +1

    I really like your clear, measured, logical presentation style. This is a great, informative, video that will help get anyone up and running with RAG and chroma db, quickly, without getting bogged down in langchain, which does not seem necessary for this task, and yet is often lazily used, along with openAI.
    My questions would be:
    (i). Why not use e.g. LlamaIndex? Is that not a more powerful approach, especially if one is smart/targeted in the way one constructs indices for particular subsets of the data?
    (ii). Should one finetune the embeddings model for the use case? E.g. specifically tuned embedding models for extracting particular information from annual reports: one model to retrieve corporate compensation/options data, another for segmental/divisional data, and another for analysing notes to accounts/accounting standards etc.
    (iii). Pre-processing data, e.g. using pdfplumber to extract all tables from an annual report, locating and extracting each relevant section of the report (management report, notes etc.) and then querying the relevant section for the information sought.
    (iv). Agentic use of models, possibly also finetuned to the specific data retrieval tasks above. In particular, using several prompts asking the same question differently and passing the e.g. 3 responses back to another model for summarisation, to 'iterate' to the best response.
    (v). Optimal use of different LLMs for different tasks. E.g. could one finetune tinydolphin and use that for targeted, faster information retrieval tasks, and then use e.g. mistral for the final combination of responses into a summary?
    (vi). Basic reasoning applied to data sets. For example, I have my own custom, in-house financial data set: say I want to compare the leverage between different competitors in the same subindustry, what model might be best to do that? Should I finetune the model with examples of my particular analyses and the conclusions that I would like to see? Or even, using multiple considerations, e.g. company valuation, 'quality' metrics, growth, competitive positioning, and scenario analysis, it should be possible to construct a simple, reasoned investment thesis.
    Re: (i), I think that you recently did a vid on this. But I have seen a number of seemingly knowledgeable people saying the best approach is to just finetune bert as it uses encoding and is the best starting point. Apologies if that sounds confused: it probably is, I am new to this area.

  • @joeburkeson8946
    @joeburkeson8946 1 month ago

    Wow, I can only imagine what things will be like 5 years in the future, thanks for all you do.

    • @technovangelist
      @technovangelist 1 month ago

      Yeah, I wonder if it will be a big change or incremental. A lot of the science behind what's going on started being researched 60-70 years ago. And the core concepts of language models and how they work are from the early 1990s. There was the transformers paper about 8 years ago, which was evolutionary rather than revolutionary, but is the big change that got us here. So I could see it going both ways. Maybe, as things are moving fast, the mythical AGI is only a decade away, or maybe it's much further. Who knows. Exciting time to be making videos.

  • @nicholascomjean
    @nicholascomjean 2 months ago +11

    Please do a STT (Speech to Text) / TTS (Text to Speech) integration!

  • @brucoder
    @brucoder 2 months ago

    Thank you, Matt.

  • @Aristocle
    @Aristocle 2 months ago +3

    One idea for a next video could be a guide on how to create a chatbot+RAG with Knowledge Graphs(Neo4J).

  • @dbwstein
    @dbwstein 2 months ago +9

    Great stuff. Here’s the issue: Most of the data I want to review like contracts, zoning laws…etc are in PDFs. So, the RAG apps I want to build will be for getting data out of PDFs. So, anything you can do on that front would be great.

    • @technovangelist
      @technovangelist 2 months ago +3

      your best bet is to find the source documents with the full text. PDFs are never the source. The amount of cleanup required to get good info will take longer. In some cases you may get lucky.

    • @adrianbool4568
      @adrianbool4568 2 months ago

      If you're on a Mac with Homebrew, try installing the "ghostscript" package (brew install ghostscript). It is similar on Linux, using whatever package manager is appropriate. Ghostscript provides the "ps2ascii" tool: just call it with the input PDF filename and an output (text) filename as arguments and it will perform the translation. If your PDF is mostly just text, the output is usually pretty good. If there are lots of "design elements" within the PDF, not so much. For your type of content, it may do pretty well. You can script this with zsh/bash to convert whole folders of PDF files to text quickly. Good luck.
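
The folder conversion described above could be scripted roughly as follows. The commenter suggests zsh/bash; this sketch uses Python instead, and it assumes `ps2ascii` is on the PATH and accepts an input PDF and an output text file as its two arguments, as the comment states. The function name `pdf_folder_to_text` and the injectable `run` parameter are hypothetical, added only so the loop can be exercised without Ghostscript installed.

```python
import subprocess
from pathlib import Path


def pdf_folder_to_text(folder, run=subprocess.run):
    """Call `ps2ascii <in.pdf> <out.txt>` for every PDF directly inside `folder`."""
    outputs = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        txt = pdf.with_suffix(".txt")  # report.pdf -> report.txt, same folder
        # check=True raises CalledProcessError if the tool exits with an error.
        run(["ps2ascii", str(pdf), str(txt)], check=True)
        outputs.append(txt)
    return outputs
```

For example, `pdf_folder_to_text("contracts")` would write one `.txt` file next to each PDF in a `contracts` folder. As the comment notes, the quality of the result depends heavily on how text-heavy the PDFs are.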

    • @technovangelist
      @technovangelist 2 months ago +1

      It is unfortunate that you need to go through hoops like that. I hope to find a better way that works that doesn’t require a horrible approach like that.

    • @chrisBruner
      @chrisBruner 2 months ago

      I have a shell script called summerize_pdf, which is:
          pdf2text "$1" | ollama run mistral "summarize this in 2000 words or less"
      pdf2text is a Python program, which is:
          #!/usr/bin/env python3
          import sys
          from PyPDF2 import PdfReader  # Use `PdfReader` instead of `PdfFileReader` for more recent versions

          def extract_text(pdf_path):
              """Return (page count, concatenated text) for the PDF at pdf_path."""
              with open(pdf_path, 'rb') as f:
                  reader = PdfReader(f)
                  num_pages = len(reader.pages)  # Get the total number of pages in the PDF document
                  text = ""
                  for page in reader.pages:  # Iterate over each page
                      # Guard against pages that yield no text (e.g. scanned images)
                      text += (page.extract_text() or "") + "\n"  # Append this page's text, followed by a newline
              return num_pages, text

          def main():
              if len(sys.argv) != 2:
                  print("Usage: pdf_text_extractor.py <pdf_file>")
                  return
              filename = sys.argv[1]  # Get the PDF filename from the command line arguments
              num_pages, text = extract_text(filename)  # Extract text from the file
              print("Total pages: ", num_pages)
              print("Extracted Text:\n", text)  # Print out the extracted text

          if __name__ == "__main__":
              main()
      Not elegant but it gets the job done.

    • @technovangelist
      @technovangelist 2 months ago +3

      Unfortunately it uses pypdf which does a terrible job for most PDFs. Sometimes it works ok, but way too often the text is jumbled up. Since many PDFs are simply images, an OCR step is often needed. I think most who think pypdf works don't actually look at the resulting text.

  • @RedShipsofSpainAgain
    @RedShipsofSpainAgain 2 months ago +6

    Great video, Matt. This is so cool.
    One small suggestion: at 6:00, could you please use syntax highlighting in your code? The all-white font makes it hard to follow which functions you're importing from 3rd party libraries vs UDFs. I think a color scheme similar to what VS Code uses in its default theme would help readability.
    Thanks again for the excellent videos.

  • @farexBaby-ur8ns
    @farexBaby-ur8ns 1 month ago

    Matt, you have very good content.
    I first saw the webui vid and then came to this vid. Question: can what you do here be done via openui > Documents? Correct?

  • @csepartha
    @csepartha 2 months ago

    Good explanation

  • @uskola
    @uskola 2 months ago

    Thanks for this video

  • @thesilentcitadel
    @thesilentcitadel 1 month ago

    Hi Matt, how would you approach dealing with code as the data you want to put into the vector store? I am thinking that sentence chunks might be function chunks?
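
One way to read "function chunks" for Python source is to let the standard library's `ast` module find the top-level definitions and store each one as its own chunk. This is only a sketch of the idea raised in the comment, not something shown in the video; `chunk_python_source` is a hypothetical helper, and other languages would need their own parser.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                # Exact source text of the definition (decorator lines are not included).
                "text": ast.get_source_segment(source, node),
            })
    return chunks
```

Each chunk's `text` could then be embedded and stored the same way sentence chunks are, with `name` kept as metadata so related definitions can be looked up later.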

  • @alibahrami6810
    @alibahrami6810 2 months ago

    Awesome content ❤

  •  1 month ago

    Hi Matt, thanks for the video! I've encountered some setbacks while trying this under Windows 11, but managed to solve them easily. Here I mention the problems and solutions:
    - ImportError: failed to find libmagic. Solution: pip install python-magic-bin
    - ModuleNotFoundError: No module named 'nltk'. Solution: pip install nltk
    - FileNotFoundError: [Errno 2] No such file or directory: 'content/some_docs. Solution: Create the "content" folder because it doesn't exist nor gets created in windows
    - Resource punkt not found. Solution: Follow the steps given by the module:
    - import nltk
    - nltk.download('punkt')

    • @technovangelist
      @technovangelist 1 month ago

      This is one of the annoyances with Python. Their requirements process is garbage. It’s python_magic for some platforms and not others. Nltk is the module on some platforms. Punkt isn’t required on some platforms. Just a mess.

    • @saakshihs9401
      @saakshihs9401 1 month ago

      hey, where do you create the contents folder?
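
The path in the error above ('content/some_docs…) is relative, so the folder is presumably looked up in whatever directory the script is run from. That is an assumption based on the error text, not something confirmed in the thread. Under that assumption, a small guard like this hypothetical `ensure_content_dir` helper would create the folder before the import runs:

```python
from pathlib import Path


def ensure_content_dir(base_dir=".", name="content"):
    """Create `<base_dir>/<name>` if it is missing and return its path."""
    path = Path(base_dir) / name
    # exist_ok=True makes this safe to call on every run.
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Running `ensure_content_dir()` from the project folder, before calling the import script, should have the same effect as creating the folder by hand.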

  • @TauvicRitter
    @TauvicRitter 1 month ago

    Hello Matt, would chunking knowledge into logical units like paragraphs or chapters not be better than chopping after so many sentences? You could use an LLM and instruct it to do the chopping more intelligently, or use NLP software for that. Did you consider this?
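
A rough sketch of the paragraph-based alternative suggested here, assuming paragraphs are separated by blank lines. The function name `chunk_by_paragraphs` and the `max_chars` budget are hypothetical; the video's code chunks by sentence count instead.

```python
import re


def chunk_by_paragraphs(text, max_chars=1000):
    """Group whole paragraphs into chunks of at most roughly `max_chars` characters."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow the budget: close the chunk.
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A paragraph is never split, so a single paragraph longer than `max_chars` still becomes one oversized chunk. Whether this retrieves better than fixed sentence counts would need to be tested on the actual documents.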

  • @darthcryod1562
    @darthcryod1562 22 days ago

    Great video! Any suggestions on which embeddings to use if my RAG app has to consume PDFs and other documents in Spanish? I have tried nomic-text, fastembeddings and all-minilm for sentence transformers, but all of them fail to retrieve a good answer from Chroma using search, similarity_search or similarity_search_with_relevance_scores. I have tried using only English-language PDFs and it works fairly OK.

  • @SashaBraus321
    @SashaBraus321 2 months ago

    This dude is so underrated

  • @ursamajor3275
    @ursamajor3275 1 month ago +1

    Hi @Matt, can you update your repo? So that we can have a full working one? Some steps are missing.
    Thanks.

  • @ChristianCOLOT-rb4li
    @ChristianCOLOT-rb4li 1 month ago

    Hello Matt, thank you for this great video. I tried to implement your solution but I am getting "Connection refused" errors when using the ollama library. Are the embedding model and LLM downloaded dynamically by your code, or should we download them ourselves before using it?

    • @technovangelist
      @technovangelist 1 month ago

      If you are getting some sort of error when running ollama pull look into your network connection

  •  2 months ago

    Thanks - very good explanations! Would there be any advantages using Ollama RAG Application using Langchain or LlamaIndex?

    • @technovangelist
      @technovangelist 2 months ago

      Not for this. A much more complicated app might benefit but I haven’t seen it.

  • @thesilentcitadel
    @thesilentcitadel 1 month ago

    Hi Matt, further to the idea of chunks and the use case being code as input to the RAG, how would you think about context of related functions.. thinking that the retrieval could miss the important interdependence of functions..

    • @technovangelist
      @technovangelist 1 month ago

      Yes that is interesting. I was purely looking at English. I’m not sure how to look at code for this

  • @IdPreferNot1
    @IdPreferNot1 15 hours ago

    Hi Matt. Could you consider a video where you take this local RAG script you've made here and redo as using Langchain to demonstrate the process and if you think the abstraction approach is efficient or helpful for 1)new coders and/or 2) experienced coders?

    • @technovangelist
      @technovangelist 14 hours ago +1

      Lang chain only complicates things, especially in such a simple app. I don’t want to create videos about the wrong way to do something.

  • @FetaleKetones
    @FetaleKetones 2 months ago

    You’re breathtaking 😘

  • @DC-xt1ry
    @DC-xt1ry 1 month ago

    After playing around with RAG I have several questions:
    * Which vector DB is the best option?
    * Multi-agent? CrewAI?
    * Which orchestrator is the best? LangChain, LlamaIndex?
    * Which open source model is the best?
    * What is the ideal workflow?
    Goal = reliable answers and fewer hallucinations

    • @technovangelist
      @technovangelist 1 month ago

      Well, keep watching. For RAG, orchestrators add complexity without benefit. Which model is best depends on what your needs are, and only you can decide. Workflow, again, is all about you.

  • @neil12345
    @neil12345 1 month ago +1

    I launch ChromaDB in a separate terminal within VS Code. Then I run the import.py script in a different terminal. When I run the script, I receive an Errno 61; however, when I look at the logs of the ChromaDB server on localhost port 8000, I see multiple 200 API responses. Is there any way to troubleshoot why it would generate 200 responses while still erroring in the "for index, chunk in enumerate(chunks):" loop?

    • @highwayman696
      @highwayman696 1 month ago

      I am facing the same issue as of now. Have you managed to find a solution?
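
On macOS, Errno 61 is "connection refused". Since the ChromaDB logs show 200 responses, one plausible (unconfirmed) explanation is that the other service the loop talks to, the Ollama server, is not running. A quick check of both ports can narrow this down. The sketch below assumes ChromaDB on localhost:8000, as in the comment, and Ollama on its default port 11434; the helper names are hypothetical.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # includes ConnectionRefusedError and timeouts
        return False


def check_services(services=None):
    """Map each service name to whether its port is reachable."""
    services = services or {
        "chromadb": ("localhost", 8000),   # port taken from the comment above
        "ollama": ("localhost", 11434),    # Ollama's default port
    }
    return {name: is_listening(host, port) for name, (host, port) in services.items()}
```

If `check_services()` reports `False` for `ollama`, starting the Ollama server before re-running the import script would be the first thing to try.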

  • @ovots
    @ovots 2 months ago

    Hi Matt, I've cloned the project repo for this video, and I'm trying to play along, but I'm running my Ollama service on a separate machine, and I can't figure out where/how I'd specify that in either the config file or the individual ollama.embeddings() and ollama.generate() invocations. Sorry if I've missed something obvious. I have zero experience with Python.

    • @ovots
      @ovots 2 months ago

      Solved: I needed to create a "custom client". I should have RTFM for the Python SDK more carefully. Guess I glossed over that the first time.
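
For readers with the same setup, a "custom client" in the Ollama Python library looks roughly like this. The hostname is a placeholder, and the two small wrapper functions are hypothetical, there only to keep the example tidy; `Client(host=...)` itself is part of the library.

```python
def ollama_base_url(hostname, port=11434):
    """Build the base URL for an Ollama server (11434 is Ollama's default port)."""
    return f"http://{hostname}:{port}"


def make_remote_client(hostname, port=11434):
    """Return an ollama.Client that talks to a server on another machine."""
    from ollama import Client  # imported here so the URL helper works without the package
    return Client(host=ollama_base_url(hostname, port))


# Usage sketch (hostname and model names are assumptions, not taken from the repo):
#   client = make_remote_client("192.168.1.50")
#   client.embeddings(model="nomic-embed-text", prompt="some chunk of text")
#   client.generate(model="mistral", prompt="some question")
```

The calls on `client` mirror the module-level `ollama.embeddings()` and `ollama.generate()` calls mentioned in the question, so swapping them in should be a small change.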

  • @Sri_Harsha_Electronics_Guthik
    @Sri_Harsha_Electronics_Guthik 1 month ago

    What about HTML files? Strip the tags and use the text as it is?
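
One possible reading of "strip the tags", sketched with only the standard library: collect the visible text and skip the contents of script and style elements. This is an illustration rather than the approach used in the video, and `html_to_text` is a hypothetical helper.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML document as one space-joined string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

Navigation menus, footers and similar page furniture still end up in the output, so real pages usually need extra filtering before chunking.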

  • @GeorgAubele
    @GeorgAubele 1 month ago +1

    Thanks for your video, I think I understand the process of embedding.
    Is there a way to use the embedded docs with an API call? I want to write a WinForms app in C#, so an API call would come in handy.

    • @technovangelist
      @technovangelist 1 month ago

      The full API is documented in the docs: https://github.com/ollama/ollama

    • @GeorgAubele
      @GeorgAubele 1 month ago

      @@technovangelist Yes, I know, but there is only a small chapter for generating embeddings, but not on how to use them with the API.

    • @technovangelist
      @technovangelist 1 month ago

      you wouldn't use them directly. You can generate the embedding but then you need to put it somewhere. That is often a vector db. the model can't do anything with the embedding itself. you use the embedding to do a similarity search and then use the source in the model

    • @GeorgAubele
      @GeorgAubele 1 month ago

      @@technovangelist OK, my question was misleading: I now have your scripts running, and I have the documents in ChromaDB.
      Is there a way to use the Ollama API to talk to my documents in the DB instead of using a Python script to do so?
      I wrote a small WinForms app in C# to talk to my models via the Ollama API, but I don't see a way to use the API to support these conversations with my documents in the DB.
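
The flow described in the reply above (embed, run a similarity search, then hand the matching source text to the model) can be illustrated with toy vectors. In the real app ChromaDB performs the search; the cosine-similarity code here only stands in for it. The final function builds the JSON body for Ollama's `/api/generate` endpoint, which is the part any HTTP client, including a C# one, would send. All function names and the prompt wording are hypothetical, and the model name is an assumption.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=2):
    """Return the texts of the k stored items most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item["embedding"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:k]]


def build_generate_payload(question, context_chunks, model="mistral"):
    """JSON body for POST /api/generate with the retrieved text placed in the prompt."""
    prompt = (
        f"{question}\n\nAnswer using only the following text:\n\n"
        + "\n\n".join(context_chunks)
    )
    return json.dumps({"model": model, "prompt": prompt, "stream": False})
```

The Ollama API does not query the database on its own, so the client application has to do both steps itself: ask ChromaDB for the closest chunks, then send a payload like the one above to Ollama.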

  • @UnchartedWorlds
    @UnchartedWorlds 2 months ago +2

    Typescript 👌 please also pdf tutorial also 👌

    • @technovangelist
      @technovangelist 2 months ago +4

      the typescript version will be published on Monday. And then will look at pdf in the next few weeks.

    • @rhenriquea
      @rhenriquea 2 months ago

      @@technovangelist I really appreciate what you're doing man, I'm acting as an advocate to use JS/TS with AI and your videos help me a lot. Success!

    • @chrisBruner
      @chrisBruner 2 months ago

      @@technovangelist I put some Python code in the comments, for a quick and dirty system.

  • @user-tw3fo8re5v
    @user-tw3fo8re5v 2 months ago

    Great work and explanation, Sir. Thanks for sparing your valuable time and for the code, but could you please add folders for documents and the DB in the code, where we can add our own files? Sorry, I am not a SW guy; I just copy open source code and try to run it. Thanks.

  • @marcinlesniak3642
    @marcinlesniak3642 2 months ago

    Say, we have a database like this, which includes medical criteria for different conditions, examples of cases etc. and we want to use it as a context for LLM. Now we provide a description of a new case and we prompt the model to compare the provided information with the database and suggest a proper diagnosis. Is RAG a good choice in this scenario? RAG + prompt engineering? No-RAG solution? What would be your suggestion?

    • @technovangelist
      @technovangelist 2 months ago

      I don’t know. Best way to find out is to try it.

  • @shuntera
    @shuntera 1 month ago

    I think I watched you say you now use Obsidian. How about a video where you write some Python to ingest your Obsidian Vault to RAG for Ollama LLM access to the content?

  • @ShikharDadhich
    @ShikharDadhich 2 months ago

    I have a few questions here:
    1. The model always responds to a question, which means that if I ask something outside the vector database, the LLM will respond using the knowledge on which it has been trained. Is there any way to handle this?
    2. How do I identify a model suitable for RAG? I have tried multiple models: some are extremely slow, some are fast with low-quality output. I am unable to find the right model for a large enterprise application.
    3. Is RAG also good for document summarisation?

    • @technovangelist
      @technovangelist 2 months ago +2

      If you don't want the model to respond when nothing was found in the DB, then don't ask the model if there are no results. Easy. Most models can respond well, but it's easy to get the chunk size wrong. Too big or too small will result in bad output. Document summarization isn't really something RAG can help with.
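
The "don't ask the model if there are no results" advice could look something like the sketch below. It assumes `results` has the nested-list shape that a Chroma `collection.query()` call returns (`documents` and `distances`, one inner list per query). The distance cutoff of 0.8, the fallback message, the prompt wording and the function names are all placeholders to tune, not values from the video.

```python
def retrieved_context(results, max_distance=0.8):
    """Keep only the documents whose distance to the query is small enough."""
    docs = (results.get("documents") or [[]])[0]
    dists = (results.get("distances") or [[]])[0]
    return [doc for doc, dist in zip(docs, dists) if dist <= max_distance]


def answer(question, results, generate, max_distance=0.8):
    """Call `generate(prompt)` only when relevant context was retrieved."""
    context = retrieved_context(results, max_distance)
    if not context:
        # Nothing relevant in the vector DB: skip the model entirely.
        return "I could not find anything about that in the documents."
    prompt = f"{question}\n\nAnswer using only this text:\n\n" + "\n\n".join(context)
    return generate(prompt)
```

Here `generate` is any callable that sends a prompt to the model, for example a thin wrapper around `ollama.generate()`. A vector search always returns its nearest neighbours, however far away they are, so the distance filter is what actually decides whether "nothing was found".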

  • @gokudomatic
    @gokudomatic 2 months ago

    I suppose that you prefer EPUB to PDF for printable file format. Right?

    • @technovangelist
      @technovangelist 2 months ago

      Well ideally txt or md. Even a docx which is a zipped xml is better.

    • @gokudomatic
      @gokudomatic 2 months ago

      @@technovangelist Oh, you mean the source document file. I thought you meant a processed file ready to print. I see what you mean. It can also be latex or asciidoc.

    • @technovangelist
      @technovangelist 2 months ago

      Well any format where the text is accessible as is. PDF obfuscates it

  • @primeq
    @primeq 1 month ago

    🌟

  • @ErnestOak
    @ErnestOak 2 months ago

    Waiting for the typescript and pdf videos

  • @ursamajor3275
    @ursamajor3275 1 month ago

    Hi Matt,
    i have the chromadb running on 1 terminal, and on another terminal, I run:
    python3 import.py
    however ...
    Exception: {"error":"ValueError('Collection buildragwithpython does not exist.')"}

    • @ursamajor3275
      @ursamajor3275 1 month ago

      I added
      """
      try:
          chroma.delete_collection("buildragwithpython")
      except Exception as e:
          print("An error occurred:", e)
      """
      in import.py
      and now I am seeing:
      """
      /import.py", line 23, in
      chunks = chunk_text_by_sentences(source_text=text, sentences_per_chunk=7, overlap=0 )
      Resource punkt not found.
      Please use the NLTK Downloader to obtain the resource:
      >>> import nltk
      >>> nltk.download('punkt')
      """

    • @ursamajor3275
      @ursamajor3275 1 month ago

      # for nltk
      /Applications/Python\ 3.12/Install\ Certificates.command

    • @technovangelist
      @technovangelist 1 month ago

      Doh, you can't delete something that doesn't exist, and it won't exist till you run the app, which you can't run till the thing exists, which it won't until you run it...
      Fixed. Thanks for pointing that out.

    • @technovangelist
      @technovangelist 1 month ago

      the certificates thing is weird. definitely didn't need that. I wonder if that’s a windows thing
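
The chicken-and-egg problem in this thread (deleting a collection that does not exist yet) can be sidestepped with a pattern like the one below. It is a sketch, not necessarily the fix that was committed to the repo: `fresh_collection` is a hypothetical helper, while `delete_collection` and `get_or_create_collection` are Chroma client methods.

```python
def fresh_collection(client, name):
    """Drop the collection if it exists, then return an empty one with that name."""
    try:
        client.delete_collection(name)
    except Exception:
        # First run: nothing to delete yet. The exact exception type depends on
        # the Chroma version and client, hence the broad catch.
        pass
    return client.get_or_create_collection(name=name)


# Usage sketch, assuming the Chroma server from the thread on localhost:8000:
#   import chromadb
#   chroma = chromadb.HttpClient(host="localhost", port=8000)
#   collection = fresh_collection(chroma, "buildragwithpython")
```

Starting from an empty collection on every import avoids duplicate chunks when the script is re-run, at the cost of re-embedding everything each time.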

  • @Vera150607
    @Vera150607 1 month ago

    So now the old .chm format of older digital books is getting its revenge.

  • @patricktang3377
    @patricktang3377 1 month ago

    Where is your Discord link, pls?

    • @technovangelist
      @technovangelist 1 month ago

      I don't have a Discord, but the Discord for Ollama is Discord.gg/Ollama

    • @patricktang3377
      @patricktang3377 1 month ago

      @@technovangelist Thanks. How do I update the list in the sourcedocs.txt file, pls? I tried to just add a URL and save it, but received a 500 Server Error when I ran import.py. Do you know how to fix this?

    • @technovangelist
      @technovangelist 1 month ago +1

      If it’s a 500 it’s probably not a valid url to a real server

    • @technovangelist
      @technovangelist 1 month ago +1

      Plus it’s meant as a code sample so you can start building your own

  • @johngoad
    @johngoad 2 months ago

    I am just happy you know that PDFs suck...

  • @sethjchandler
    @sethjchandler 1 month ago

    Maybe if enough people rightly trash PDF, it will stop being a dominant format for document distribution? I can dream, can’t I?

  • @GeandersonLenz
    @GeandersonLenz 2 months ago

    Where to host Ollama? Without spending millions of dollars, hahaha

    • @technovangelist
      @technovangelist 2 months ago

      On your machine is ideal. But I have another video that shows one option called Brev.dev. See Unlocking The Power Of GPUs For Ollama Made Simple!
      ua-cam.com/video/QRot1WtivqI/v-deo.html

  • @ClayShoaf
    @ClayShoaf 2 months ago

    "useless tools like ... PyMuPDF"
    Hard disagree.

  • @mbarsot
    @mbarsot 1 month ago

    Hi, great video. I will try to use it for a project here at the Metropole de Nice. The source documents are .docx.
    A separate question: can you explain how to get copilot-like behaviour?
    Meaning:
    I ask Ollama for a "summary of the three top news from cnn and corriere.it, in italian and in a radio-newscast style".
    It performs a Google search on the two websites and puts everything in the prompt (or maybe builds embeddings? Not clear), then gives me the answer.

    • @technovangelist
      @technovangelist 1 month ago

      Docx is a bit better than pdf. Change the extension to zip and unzip it and you have a bunch of xml files and they are much better to pull text out of. Not sure what you mean by copilot behavior. I have used that in vscode but I am not a windows user.
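
The unzip-and-read-the-XML route described above can be done in a few lines of standard-library Python without renaming the file. This sketch assumes the usual .docx layout, where the body lives in `word/document.xml` and text sits in `w:t` elements inside `w:p` paragraphs; `docx_to_text` is a hypothetical helper.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w: prefix in document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path_or_file):
    """Return the paragraph text of a .docx file, one paragraph per line."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{W_NS}p"):
        # A paragraph's text is split across runs; join the pieces back together.
        text = "".join(node.text or "" for node in para.iter(f"{W_NS}t"))
        if text.strip():
            paragraphs.append(text)
    return "\n".join(paragraphs)
```

Headers, footers and footnotes live in other XML files inside the archive and are ignored here, and text inside nested structures such as text boxes may come out more than once.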