How to chat with your PDFs using local Large Language Models [Ollama RAG]

  • Published Dec 25, 2024

COMMENTS • 482

  • @tonykipkemboi · 1 month ago +1

    🚨 NEW UPDATED TUTORIAL 🚨
    I've created a V2 tutorial of this video here ua-cam.com/video/SXjfAIwbkZY/v-deo.htmlsi=hQugknx01XYuemqJ

  • @levi4328 · 8 months ago +111

    I'm a medical researcher and, surprisingly, my life is all about PDFs I don't have time to read, let alone learn the basics of code. And I think there are a lot of people in the same boat as me. Unfortunately, it's very hard to actually find an AI tool that's even barely reliable. Most of YouTube is swamped with sponsors for AI magnates trying to sell their rebranded and redundant, worthless AI thingy for a monthly subscription, or an unjustifiably costly API that follows the same premise. The fact that you, the only one that came close to what I actually need - and a very legitimate need - is a channel with

    • @tonykipkemboi · 8 months ago +18

      Thank you so much for sharing the pain points you're experiencing and the solution you're seeking. I'd like to be more helpful to you and many others like you. I have an idea of creating a UI using Streamlit for the code in this tutorial, with a step-by-step explanation of how to get it running on your system. You would essentially clone the repository, install Ollama and pull any models you like, install the dependencies, then run Streamlit. You'd then be able to upload PDFs in the Streamlit app and chat with them in a chatbot-like interface. Let me know if this would be helpful. Thanks again for your feedback.

    • @ilyassemssaad9012 · 8 months ago +2

      Hey, hit me up and I'll give you my RAG that supports multiple PDFs and lets you choose the LLM you want to use.

    • @Aberger789 · 8 months ago +3

      I'm in the space as well, and am trying to find the best way to parse PDFs. I've set up GROBID in Docker and tried that out. My work laptop is a bit garbage, and being in the world's largest bureaucracy, procuring hardware is a pain in the ass. Anyway, great video.

    • @kumarmanchoju1129 · 8 months ago +3

      Use NVIDIA's Chat with RTX for PDF summarizing and querying. Purchase a cheap RTX card with a minimum of 8 GB of VRAM.

    • @SocratesWasRight · 8 months ago +1

      @tonykipkemboi I think most people's pain right now is just this part: "upload PDFs to service X". That is what they want/have to avoid. Anyhow, nice video you made here.

  • @claussa · 8 months ago +8

    Welcome to my special list of channels I subscribe to. Looking forward to you making me smarter 😊

    • @tonykipkemboi · 8 months ago +1

      Thank you for that honor! I'm glad to be on your list and will do my best to deliver more awesome content! 🙏

  • @ten2the6 · 4 months ago +3

    You sir are awesome! It is easy to make things hard, yet hard to make them simple. Thanks for working so hard to make this simple. Excellent presentation. I will be coming back for more!!

    • @tonykipkemboi · 4 months ago +1

      Thank you @ten2the6, I'm glad you found it useful! 🫡

  • @daixtr · 2 months ago +2

    I don't subscribe easily, even at gunpoint. But I like your way, and now I'm a subscriber. Great technical clarity is what the modern world needs.

    • @tonykipkemboi · 2 months ago

      @daixtr I'm glad you enjoyed the content, and thanks for the sub!

  • @maly9903 · 4 months ago +3

    I am a layman and have been trying to figure this out for a week. I've watched a lot of videos, but yours has been by far the best. Your cadence is good, and you are direct while making it accessible at a high level to follow along. No obfuscation or assumptions, etc. Great video, thank you; have a comment, like, and subscribe.

    • @tonykipkemboi · 4 months ago

      Thank you, @maly9903, I am glad you found it useful! 🫡

  • @ISK_VAGR · 8 months ago +5

    Congrats man. Really useful content. Well explained and effective.

  • @Lumix-o1j · 2 months ago +2

    Great, man. You really support your viewers and try to solve their errors. What a great personality!

    • @tonykipkemboi · 2 months ago

      @@Lumix-o1j I appreciate the support. I owe that to my viewers tbh.

  • @nacksters3987 · 3 months ago +2

    Such an amazing video, I didn't get a single video like this to get me started so easily. You are truly amazing!

  • @Adinasa2 · 5 months ago +5

    one of the best videos about RAG and LLM i have come across!! thanks a lot!!

    • @tonykipkemboi · 5 months ago

      Glad you found it helpful!

    • @Adinasa2 · 2 months ago +2

      @@tonykipkemboi code is not working

    • @tonykipkemboi · 2 months ago +1

      @@Adinasa2 what exactly is not working? can you share the error message?

    • @Adinasa2 · 2 months ago

      @tonykipkemboi It's not working; it gives this error:
      ConnectError: [Errno 61] Connection refused
      Traceback:
      File ".../streamlit/runtime/scriptrunner/exec_code.py", line 75, in exec_func_with_error_handling
      result = func()
      File ".../ollama_pdf_rag/streamlit_app.py", line 200, in main
      models_info = ollama.list()
      File ".../ollama/_client.py", line 464, in list
      return self._request('GET', '/api/tags').json()
      (... httpx client/transport frames elided ...)
      File ".../httpx/_transports/default.py", line 86, in map_httpcore_exceptions
      raise mapped_exc(message) from exc
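
    For anyone else hitting this: [Errno 61] Connection refused means nothing is listening on Ollama's default port (11434), i.e. the Ollama server isn't running. A quick stdlib-only sanity check you can run before launching the Streamlit app (the function name is just illustrative):

    ```python
    import socket

    def ollama_reachable(host: str = "127.0.0.1", port: int = 11434, timeout: float = 1.0) -> bool:
        """Return True if something is listening on Ollama's default port."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:  # ConnectionRefusedError shows up as [Errno 61] on macOS
            return False

    if not ollama_reachable():
        print("Ollama isn't running - start it with `ollama serve` (or open the app) and retry.")
    ```

    If the warning prints, run `ollama serve` (or launch the Ollama desktop app) and rerun the Streamlit command.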

  • @DynamicMolecules · 6 months ago +2

    Thanks for this amazing tutorial on building a local LLM. I applied it to my research paper PDFs, and the results are impressive.

    • @tonykipkemboi · 6 months ago +1

      Awesome 🤩 Love to hear that! Did you experiment without using the MultiQueryRetriever in the tutorial to see the difference?

    • @DynamicMolecules · 6 months ago +2

      @tonykipkemboi That's an interesting question. I tried it and found that MultiQueryRetriever works well in general when the LLM needs to connect indirect information from the document, but it fails to provide relevant information when the information is directly present in the document. This observation could differ from case to case.
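
      For readers wondering what MultiQueryRetriever actually does: it asks the LLM for several rephrasings of the question, retrieves documents for each variant, and unions the results. A dependency-free sketch of the idea - the `rewrite` and `retrieve` callbacks are hypothetical stand-ins for the LLM call and the vector-store lookup that LangChain wires up for you:

      ```python
      def multi_query_retrieve(question, rewrite, retrieve):
          """Toy version of the multi-query idea: generate rephrasings of the
          question, run retrieval for each variant, then union the hits while
          preserving first-seen order."""
          queries = [question] + list(rewrite(question))
          seen, merged = set(), []
          for q in queries:
              for doc in retrieve(q):
                  if doc not in seen:
                      seen.add(doc)
                      merged.append(doc)
          return merged
      ```

      This also explains the observation above: the rephrasings help surface indirectly related chunks, but for information stated verbatim in the document a plain single-query retriever already finds the right chunk.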

  • @davidtindell950 · 6 months ago +3

    Thank You. I have done several similar projects and I learn something new about 'local RAG' with each one !

  • @soudaminipanda · 2 months ago +2

    very high quality tutorial. You explain things so well!!

  • @johnpark3138 · 5 months ago +1

    It is by far the easiest and best tutorial for learning RAG + PDF. I'd love to see more topics that are a bit more advanced in the future. Thank you very much!

    • @tonykipkemboi · 5 months ago

      Thanks! More to come for sure. What topics would you like me to cover?

  • @Reddington27 · 8 months ago +5

    That's a pretty clean explanation.
    Looking forward to more videos.

    • @tonykipkemboi · 8 months ago +1

      Thank you! Glad you like the delivery. I got some more cooking 🧑‍🍳

  • @Joy_jester · 8 months ago +8

    Can you make one video of RAG using Agents? Great video btw. Thanks

    • @tonykipkemboi · 8 months ago +5

      Sure thing. I actually have this in my list of upcoming videos. Agentic RAG is pretty cool right now and will play with it and share a video tutorial. Thanks again for your feedback.

    • @metaphyzxx · 8 months ago

      I was planning on doing this as a project. If you beat me to it, I can compare notes

  • @thiagobutignonclaramunt410 · 7 months ago +4

    You are an awesome teacher; thank you so much for explaining this in a clean and objective way :)

  • @JanosTech · 2 months ago +2

    Wow, what a legend! Subscribed!

  • @stoicflow254 · 2 months ago +1

    I just learned a new and better workflow. Great tutorial, buddy.

  • @VairalKE · 8 months ago +7

    Good to see fellow Kenyans on AI. Perhaps the Ollama WebUI approach would be easier for beginners as one can attach a document, even several documents to the prompt and chat.

    • @tonykipkemboi · 8 months ago +3

      🙏 Yes, actually working on a Streamlit UI for this

  • @aloveofsurf · 8 months ago +2

    This is a fun and potent project. This provides access to a powerful space. Peace be on you.

  • @deldridg · 7 months ago +2

    Thank you for this excellent intro. You are a natural teacher of complex knowledge and this has certainly fast-tracked my understanding. I'm sure you will go far and now you have a new subscriber in Australia. Cheers and thank you - David

    • @tonykipkemboi · 7 months ago +1

      Glad to hear you found the content useful and thank you 🙏 😊

  • @Marduk477 · 8 months ago +4

    Really useful content and well explained. It would be interesting to see a video with different types of files, not only PDFs - for example Markdown, PDF, and CSV all at once.

    • @tonykipkemboi · 8 months ago +1

      Thank you! I have this in my content pipeline.

  • @tradertube · 28 days ago

    Thanks for sharing. Great video and explanation.

  • @rockefeller7853 · 8 months ago +6

    Thanks for the share. Quite enlightening. I will definitely build upon that. Here is the problem I have: let's say I have two documents and I want to chat with both at the same time (for instance, to extract conflicting points between the two). What would you advise here?

    • @tonykipkemboi · 8 months ago +5

      Thank you! That's an interesting use case for sure. My instinct, before looking up solutions, is to create 2 separate collections, one for each file, then retrieve from them separately and chat with them for comparison. My suggestion might not be efficient at all; I will do some digging and share any info I find.

    • @paulhammer9058 · 5 months ago +1

      Maybe a bit late, but instead of the PDF loader you can use the DirectoryLoader; with it you can load whole folders, not just single PDFs.
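
      If you'd rather keep the single-file loader from the video, a small helper can walk a folder and feed each PDF through it. A sketch - the commented LangChain lines mirror the loader used in the tutorial, and the folder name is hypothetical:

      ```python
      from pathlib import Path

      def collect_pdfs(folder: str) -> list[str]:
          """Recursively gather every PDF path under `folder`."""
          return sorted(str(p) for p in Path(folder).rglob("*.pdf"))

      # Each path can then go through the same loader used in the video, e.g.:
      # from langchain_community.document_loaders import UnstructuredPDFLoader
      # docs = [d for p in collect_pdfs("papers/") for d in UnstructuredPDFLoader(p).load()]
      ```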

  • @Reichvarg · 4 months ago +2

    This is a really good video. Thanks a lot for making it, I found it very helpful!

  • @wesleymogaka · 5 months ago +1

    Ahsante sana (thank you very much), Kip. Working on a bank/fintech chatbot and will use this info to build it.

  • @KinoInsight · 1 month ago +2

    Thank you. Good tutorial!

  • @leolovetech · 1 month ago +1

    very clear, thanks tony

  • @DaveJ6515 · 8 months ago +2

    Very good! Easy to understand, easy to try, expandable ....

    • @tonykipkemboi · 8 months ago +1

      Awesome! Great to hear.

    • @DaveJ6515 · 8 months ago +2

      @tonykipkemboi You deserve it. Too many LLM YouTubers are more concerned with showing a lot of things than with making them easy to understand and reproduce. Keep up the great work!

  • @gptOdyssey · 8 months ago +4

    Clear instruction, excellent tutorial. Thank you Tony!

    • @tonykipkemboi · 8 months ago

      Thank you for the feedback and glad you liked it! 😊

    • @Oiseaux_rebelle · 6 months ago

      You're welcome Ezekiel!

  • @nationbuilding5319 · 5 months ago +1

    Excellent bro! You just gained a new sub!

  • @web3namesai · 1 month ago

    Great RAG tutorial! With Web3NS, imagine pairing YourName.Web3 with local AI pipelines like this to power secure, decentralized knowledge management in Web3. 🚀

  • @n0madc0re · 8 months ago +2

    this was super clear, extremely informative, and was spot on with the exact answers I was looking for. Thank you so much.

    • @tonykipkemboi · 8 months ago

      Glad you found it useful and thank you for the feedback!

  • @donwilliams3848 · 5 days ago +1

    I am coming from the visual model and tensor world... You have my subscription. About to head over to the second video and try this out; I have some huge PDFs I am hoping to work with.

  • @johnlunsford5868 · 8 months ago +2

    Top-tier information here. Thank you!

  • @chrisogonas · 8 months ago +1

    Simple and well illustrated, Arap Kemboi 👍🏾👍🏾👍🏾

  • @HR31.1.1 · 8 months ago +2

    Dope video man! Keep them coming

  • @pupscub · 8 months ago +3

    Great job

  • @thealwayssmileguy9060 · 8 months ago +2

    Would love it if you could make the Streamlit app! I am still struggling to make a Streamlit app based on open-source LLMs.

    • @tonykipkemboi · 8 months ago

      Thank you! Yes, I'm working on a Streamlit RAG app.
      I have released a video on Ollama + Streamlit UI that you can start with in the meantime.

    • @thealwayssmileguy9060 · 8 months ago

      @@tonykipkemboi thanks bro! I will defo watch👌

  • @igorcastilhos · 1 month ago +6

    Instead of using a single file as the PDF, can I point to a folder with many PDFs? Like 100 PDFs, and use that as the context for my model?

    • @fawad_khan · 1 month ago

      Yes, you can. Make a directory and put all the PDFs inside it.

  • @Jary3166 · 2 months ago +1

    Thank you so much for this video! I am learning so much from it! I am trying to process hundreds of short PDFs (10-20 pages) and extract the same information from each PDF to generate a database, so in my case I don't necessarily need a live two-way chat or the MultiQueryRetriever. Would you have a recommendation for which retriever would best fit my goal, please?

  • @garthcase1829 · 8 months ago +3

    Great job. Does the file you chat with have to be a PDF or can it be a CSV or other structured file type?

    • @tonykipkemboi · 8 months ago +2

      🙏 thank you. I'm actually working on a video for RAG over CSV. The demo in this tutorial will not work for CSV or structured data; we need a better loader for structured data.
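
      Until that video lands, one simple interim approach is to flatten each CSV row into a short text chunk and embed it like any other document. A stdlib-only sketch - a stand-in for a proper structured-data loader such as LangChain's CSVLoader:

      ```python
      import csv

      def csv_to_docs(path: str) -> list[str]:
          """Turn each CSV row into a small text 'document' of key: value
          lines, so rows can be embedded like any other text chunk."""
          with open(path, newline="") as f:
              return [
                  "\n".join(f"{k}: {v}" for k, v in row.items())
                  for row in csv.DictReader(f)
              ]
      ```

      Keep in mind the caveat from the reply above: this makes rows retrievable, but the LLM still won't do reliable arithmetic over them.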

  • @ArmandoSilvaVelázquez · 3 months ago

    It would be nice to see multiple PDFs loaded, to see if it can handle different topics at once.

  • @BillVoisine · 3 months ago +1

    This is excellent!! Thank you!!

  • @Mind6 · 8 months ago +1

    Very helpful! Great video! 👍

  • @teddyperera8531 · 7 months ago +2

    This is a great tutorial. Thank you

  • @Nyx-bm5be · 6 months ago +2

    Wonderful tutorial, man! Let me ask you: what other kinds of prompts can we use? Also, is it normal for the RAG to answer questions about things not in the PDF that was loaded? For example, I tested with the prompt "what is a dog" and got an answer back. Is it because of the RAG and Ollama? Thanks a bunch.

  • @bramarambikaambati6352 · 23 days ago +1

    Hi Tony,
    thanks for the video.
    Can you please make a video on how to use the ColPali VLM?

    • @tonykipkemboi · 23 days ago

      @@bramarambikaambati6352 yes, I got this coming.

  • @vineethnj8744 · 6 months ago +1

    Good one, Good luck🤞

  • @liamfinch2503 · 5 months ago +1

    Really good video thank you

  • @nagireddygajjela5430 · 6 months ago +1

    Thank you for sharing good content

  • @kulumbapaul3065 · 2 months ago +1

    Thank you!

  • @Marques2025 · 7 months ago +9

    Useful tip: use proper Wi-Fi; don't use a mobile hotspot while pulling the model from Ollama. I had an error with that. Hope it helps someone 😊

    • @bigsmoke4568 · 2 months ago +2

      Lol this is common sense 😂

  • @MarahTal · 17 days ago

    Thank you so much for this great tutorial! It was really helpful and insightful. I have a few questions:
    Could you please share what operating system you are using for this setup?
    Which Python version worked for you?
    If possible, could you share the specific versions of the libraries you installed? I’ve checked the requirements file on GitHub, but having the exact versions would be super helpful to avoid compatibility issues.

  • @scrollsofvipin · 8 months ago +3

    What GPU do you use? I have Ollama running on an Intel i5 with integrated graphics, so I'm unable to use any 3B+ models. TinyLlama and TinyDolphin work, but the accuracy is way off.

    • @tonykipkemboi · 8 months ago +3

      I have an Apple M2 with 16GB of memory. I noticed that larger models slow down my system and sometimes force a shutdown of everything. One way around it is deleting other models you're not using.

  • @SimpleInformationINC · 7 months ago

    Nice job, thanks Tony!

  • @stanTrX · 8 months ago +2

    Thanks. Can you please explain it step by step and slowly, especially the RAG part?

    • @tonykipkemboi · 8 months ago +1

      Thanks for asking. Which part of the RAG pipeline?

  • @dudulascasas4509 · 2 months ago +1

    First of all, thank you for this video. I understand that running models locally is good for dealing with private data, but you are using Chroma as a vector database. Is Chroma reliable? How do I know how they use our data?

    • @tonykipkemboi · 2 months ago

      @dudulascasas4509 Thanks! Good question. The Chroma instance can run locally as well if you prefer, or you can pick another vector DB like Milvus or pgvector and spin up a localhost instance to connect to. Making it totally air-gapped is important, as you mentioned.

  • @Alice8000 · 5 months ago +1

    Thanks Useful. Very Man.

  • @supriyakulkarni4063 · 2 months ago

    Hello, at 8:03 I am getting an error: OSError: No such file or directory: '/home/supriya/nltk_data/tokenizers/punkt/PY3_tab'
    What does it mean, and how do I fix it? Please help!

  • @ayushmishra5861 · 8 months ago +2

    I've been given a story, the Trojan War, which is a 6-page PDF (or I can even use the story as text), along with 5 pre-decided questions to ask based on the story. I want to evaluate different models' answers, but I am failing to evaluate even one. Kindly help; please guide me thoroughly.

    • @ayushmishra5861 · 8 months ago

      Can you please reply, would really appreciate that.

    • @tonykipkemboi · 8 months ago +1

      This sounds interesting! If you're doing this locally, you can follow the tutorial to create embeddings of the PDF and store them in a vector DB, then use the 5 questions to generate output from the models. You can switch the model in between each response, and you'll probably have to save each response separately so you can compare them afterwards.

    • @ayushmishra5861 · 8 months ago +2

      @tonykipkemboi What amount of storage will the model take?
      I don't have the greatest hardware.

    • @tonykipkemboi · 8 months ago +2

      Yes, there are smaller quantized models on Ollama you can use, but most of them require a sizeable amount of RAM. Check out these instructions from Ollama on the size you need for each model. You can also do one at a time, then delete the model after use to create space for the next one you pull. I hope that helps.
      github.com/ollama/ollama?tab=readme-ov-file#model-library
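
    The evaluation workflow described in this thread - the same 5 questions, one model at a time, each response saved for later comparison - can be sketched as a small harness. The `ask` callback is hypothetical; it would wrap whatever client call you use (e.g. `ollama.chat`):

    ```python
    def compare_models(models, questions, ask):
        """Run the same fixed questions through several models, one model at
        a time, and collect the answers keyed by model and question so they
        can be compared side by side afterwards."""
        return {m: {q: ask(m, q) for q in questions} for m in models}

    # Hypothetical usage with the Ollama Python client:
    # import ollama
    # ask = lambda m, q: ollama.chat(model=m, messages=[{"role": "user", "content": q}])["message"]["content"]
    # results = compare_models(["llama2", "mistral"], five_questions, ask)
    ```

    Running one model at a time (and deleting it afterwards, as suggested above) keeps the RAM and disk footprint down on modest hardware.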

  • @mehmetyusufbircan1224 · 2 months ago +1

    Do you think I should use llama 3.1 8b or mistral7b for rag?

    • @tonykipkemboi · 2 months ago +1

      Good question. I'd even say try the new models like Llama3.2 and see how they perform

  • @vasanthnagkv5654 · 2 months ago

    Hey there (from a new subscriber) :)
    Thank you for this amazing video. I have a few questions, please.
    1. You are talking to the MultiQueryRetriever in a way that assumes it understands your English sentences (when you instruct it to create 5 questions). Is the MultiQueryRetriever an AI itself that understands English and has some common sense, like ChatGPT does?
    2. Similarly, you create a prompt saying "Answer the question based on the context ONLY" and supply this prompt to the chain - meaning, to the local_model. So the local model also has some common sense to understand your instruction in the prompt, right?
    A video with details like these would be super helpful for beginners and aspirants like me; I find no videos online that explain it at this level.
    Thanks for your work!

  • @tharindulakshan4782 · 3 months ago +1

    Thank you for your explanation. However, I am encountering an issue: after OllamaEmbeddings reaches 100%, the Jupyter notebook restarts automatically. Why does this happen? As a result, I have to run the app again.

    • @tonykipkemboi · 3 months ago

      @@tharindulakshan4782 what do you mean by automatic restart?

    • @tharindulakshan4782 · 3 months ago

      @tonykipkemboi "The kernel appears to have died. It will restart automatically." This message pops up in the Jupyter notebook after OllamaEmbeddings reaches 100%.

    • @ADITYARAJPANDA-h7m · 3 months ago +1

      Yeah, facing the same problem:
      "The kernel crashed while executing code in the current cell or a previous cell.
      Please review the code in the cell(s) to identify a possible cause of the failure."
      How do I solve this?

    • @tharindulakshan4782 · 2 months ago

      @@ADITYARAJPANDA-h7m I solved my issue by using the Kaggle platform and getting a GPU to run Ollama

  • @essiebx · 7 months ago +1

    thanks for this tony

  • @jonvu-p9x · 4 months ago +1

    What do you use to record your screen capture??

  • @momdad5244 · 2 months ago

    Great video, thanks, but it's essentially a Ctrl+F over a vector database, right? I thought we would train an LLM with data and then it would generate a result from a given question.

  • @guanjwcn · 8 months ago +2

    Thanks. Btw, how did you make your YouTube profile photo? It looks very nice.

    • @tonykipkemboi · 8 months ago +2

      Thank you! 😊
      I used some AI avatar generator website that I forgot but I will find it and let you know.

    • @guanjwcn · 8 months ago +1

      Thank you

  • @felipetesta · 3 months ago +1

    Great video! Is there any way I can let an LLM read a folder on my PC and answer me using files (PDFs, .md, .doc, sheets, etc.) from that source?

    • @tonykipkemboi · 3 months ago +1

      Yes, you can. You can use the DirectoryLoader from LangChain, but you'll have to adjust the loading to accommodate the different file types.

  • @g-grizzle · 8 months ago +1

    thanks man this is extremely helpful!

  • @tonykipkemboi · 2 months ago +2

    Hi y'all! I know a lot of you reported some errors in getting the current code to run.
    Good news, I have updated the code and will be pushing it out today. Should I make a quick video to highlight the changes?

    • @thomashoddinott4537 · 2 months ago +1

      I managed to get the code working. Are there any tricks to speed up retrieval? I'm using a fairly modest business laptop. I have to wait 5 minutes per response.

    • @tonykipkemboi · 2 months ago +1

      @@thomashoddinott4537 oof yeah that's slow. TBH that's due to the bloat introduced by using LangChain. I'll try coming up with a solution and record an updated short supplementary video

    • @user-kq4ue8sn2h · 2 months ago +1

      yes, please

  • @Stellasogks · 6 months ago +1

    Are the libraries you used (LangChain, ChromaDB, ...) open source? And can we use any Ollama model?

  • @wah866sky7 · 6 months ago +1

    Thanks a lot! If we have a mix of multiple PDFs, Words or Excel files, how can we change the RAG to support retrieval of them?

    • @tonykipkemboi · 6 months ago

      Glad you found it helpful. For different file types, you would consider the loading/parsing and chunking strategies that fit those data types. I'm working on the next video, in which I will go over CSV & Excel RAG.

  • @kiranshashiny · 6 months ago +1

    Nice video, and very informative.
    My question: I have downloaded the LLMs like gemma, llama2, llama3 and so on on my MacOS. But due to some technical issue, I deleted these LLMs. ( e.g: $ ollama rm llama2)
    Now I want them again, and noticed that if I run "$ ollama run llama3", this **downloads the entire 4.7GB from the internet** over again.
    Is it possible to keep them downloaded at some place and when I want it - just run $ ollama run and use it and later delete it when not needed ?
    Again Thanks in advance and would appreciate a response.

    • @tonykipkemboi · 6 months ago

      Thank you. What you did earlier is the standard way of downloading, serving, and deleting the Ollama models.
      You can also download more quantized options for each, with less memory. I usually add and then delete whenever I don't need it or when I need to download another model.

  • @enochfoss8993 · 7 months ago +1

    Great video! Thanks for sharing. I ran into an issue with a Chroma dependency on SQLite3 (i.e. RuntimeError: Your system has an unsupported version of sqlite3. Chroma requires sqlite3 >= 3.35.0). The suggested solutions are not working. Is it possible to use another DB in place of Chroma?

    • @tonykipkemboi · 7 months ago +1

      Thank you! Yes, you can swap it with any other open-source vector database. You might also try using a more recent version of Python, which should come with a newer version of SQLite. Do you know what version you are using now?
      You can also try installing the binary version in the notebook like so: `!pip install pysqlite3-binary`
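
      The pysqlite3-binary workaround mentioned above works by aliasing the bundled module as `sqlite3` before chromadb is imported. A hedged sketch (it falls back to the system sqlite3 if the wheel isn't installed):

      ```python
      # Run this at the very top of the notebook, BEFORE importing chromadb.
      import sys

      try:
          import pysqlite3  # from `pip install pysqlite3-binary`
          sys.modules["sqlite3"] = pysqlite3  # chromadb will now pick this up
      except ImportError:
          pass  # binary wheel not installed; keep the system sqlite3

      import sqlite3
      print("SQLite runtime version:", sqlite3.sqlite_version)
      ```

      Chroma requires SQLite >= 3.35.0, so check the printed version; if it is still too old, the newer-Python route suggested above is the cleaner fix.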

  • @sulayamar8538 · 5 months ago +1

    Which Ollama models were used? I don't want to install unnecessary models on my machine since it has only limited space.

    • @tonykipkemboi · 5 months ago

      @@sulayamar8538 did you watch the video?

  • @abAbhi105 · 2 months ago

    Thanks for the content. I am stuck and unable to find out how to add context/chat history back into the model.

  • @rmperine · 7 months ago +1

    Great delivery of material. How about fine-tuning for llama3 using your own curated dataset as a video? There are some out there, but your teaching style is very good.

    • @tonykipkemboi · 7 months ago

      Thank you and that's a great suggestion!
      I'll add that to my list.

  • @DataScienceandAI-doanngoccuong · 8 months ago +4

    Can this model query tabular data or image data, or can't it?

    • @tonykipkemboi · 8 months ago +3

      I assume you're talking about Llama2? Or are you referring to the Nomic text embedding model? If it's Llama2, it's possible to use it to interact with tabular data by passing the data to it (RAG or just pasting data to the prompt) but cannot vouch for its accuracy though. Most LLMs are not great at advanced math but they're getting better for sure.

    • @sasikumartist · 3 months ago +1

      @tonykipkemboi Does this model work on image data?

    • @tonykipkemboi · 3 months ago

      @sasikumartist Not Llama2; you'd have to use an image or multimodal model for that. Check out the LLaVA model.

  • @AnkitSingh-xc8em · 6 months ago +2

    Appreciate your work. I wanted to know: can I use it for a confidential PDF? Is there any chance of a data leak?

    • @tonykipkemboi · 6 months ago

      Thank you for the kind words. Yes, if you use Ollama models like we did in the video, your content will stay private and not be sent to any online service. To be sure, I'd recommend turning off your Wi-Fi or any connection once you've loaded all the dependencies and imports. You can then run the cells to load your PDF into a vector DB and chat with it. After you're done, you can delete the collection where you saved the vectors of your PDF before turning your connection back on. This is an extra measure to give you peace of mind.

  • @farexBaby-ur8ns
    @farexBaby-ur8ns 8 months ago +1

    Good one. OK, you touched on security: what you have here doesn't let anything flow out to the internet. I saw a bunch of videos about tapping data from DBs using SQL agents, but none said anything specific about security. So the question: does using SQL agents violate data security?

    • @tonykipkemboi
      @tonykipkemboi  8 months ago

      You bring up a critical point and question. Yes, I believe most agentic workflows, especially tutorials, currently lack proper security and access moderation. This is a growing and evolving part of agentic frameworks + observability, IMO. I like to compare it to how people need special access to databases at work, with someone managing roles and the scope of access. Agents will need some form of that management as well.

  • @aaaguado
    @aaaguado 7 months ago +1

    Hello friend, thank you very much for your content. I have a question: how can I make it listen to my server within Google Colab so I don't have to use Jupyter, since my resources are a bit limited?

  • @SiddharthMishra-pg1os
    @SiddharthMishra-pg1os 6 months ago +1

    Hello! Nice tutorial. Unfortunately I was stuck on the first part, as I get the error:
    "Unable to get page count Is poppler installed and in PATH".
    Do you have any idea how to solve this?
    I have already installed poppler using brew.

    • @tonykipkemboi
      @tonykipkemboi  6 months ago

      Thank you. Have you tried using ChatGPT to troubleshoot?

  • @krakan4383
    @krakan4383 5 months ago +1

    Thanks for the nice tutorial. I work in IT for an automobile dealer. Can I use this approach to connect millions of separate invoices to Llama 3? Thanks in advance.

    • @tonykipkemboi
      @tonykipkemboi  5 months ago

      @@krakan4383 That is possible, but since your documents have structured data within them, you have to test that it parses the numbers appropriately. Unstructured has more functions for parsing structured data that you can implement in the loading stage. Another thing to keep in mind is that the models are currently not great at math, so they might not return accurate calculations. You might consider adding an agent that does the calculation in a sandbox using something like Pandas. Look into e2b.dev or LangChain's pandas agent.
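
      The sandboxed-calculation idea can be sketched with plain pandas (the invoice columns here are hypothetical, and the agent wiring via e2b or LangChain is omitted):

      ```python
      import pandas as pd

      # Hypothetical invoice lines; in practice these would come out of the
      # parsing/loading stage rather than being typed in by hand.
      invoices = pd.DataFrame(
          {
              "invoice_id": ["A-1", "A-2", "B-1"],
              "dealer": ["North", "North", "South"],
              "amount": [1200.50, 340.00, 999.99],
          }
      )

      # Let code, not the LLM, do the arithmetic the models are unreliable at;
      # the model can then be handed this result as context.
      totals = invoices.groupby("dealer")["amount"].sum()
      print(totals.to_dict())  # → {'North': 1540.5, 'South': 999.99}
      ```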

  • @RedCloudServices
    @RedCloudServices 4 months ago +1

    Does this PDF library encode embedded tables in the PDF document?

    • @tonykipkemboi
      @tonykipkemboi  4 months ago

      I didn't cover that piece in this tutorial but my guess would be no.

  • @gancezhu
    @gancezhu 3 months ago

    Thanks for the video tutorial. It clearly guided us through all the key elements of a RAG system and was very helpful!!
    When trying your code, I got the following error when submitting a question. What could the root cause of this issue be? Thanks!
    - ERROR - Error processing prompt: no such table: embeddings

  • @TheShreyas10
    @TheShreyas10 6 months ago +1

    Quite interesting, thanks for sharing. Can you let me know if this would run on a Core i7 with 32 GB of RAM, considering you are using the Mistral model?

    • @tonykipkemboi
      @tonykipkemboi  6 months ago +1

      Thank you. Yes that should be sufficient to run the program.

  • @everybodyguitar5271
    @everybodyguitar5271 2 months ago

    Is there any restriction on the size of the PDF? Is it possible to load multiple PDF files? Will the contents of the PDF be passed to the LLM, and will that use tokens?

  • @pneumati8537
    @pneumati8537 1 month ago +1

    Hi Tony. Thanks for this work. I get "ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'"

    • @pneumati8537
      @pneumati8537 1 month ago +1

      Never mind, I think I got it. Thanks again!

  • @HughMcBrideDonegalFlyer
    @HughMcBrideDonegalFlyer 1 month ago +1

    nice one

  • @angadbandal3844
    @angadbandal3844 7 months ago +1

    Very detailed explanation, thanks. Can you please make the same project give responses in multiple languages and with voice output?

    • @tonykipkemboi
      @tonykipkemboi  7 months ago

      Thank you. Yes, that would be cool. I can see the challenge being finding an open-source model that is good at multiple languages; the ones I used are not great at all. For voice, it'd probably be easy to use an open-source TTS, or go more granular and use 11labs for better quality, despite it not being local.

  • @sivakumar7679
    @sivakumar7679 5 months ago +1

    Is it compulsory to pull the Mistral model (around 4 GB) from Ollama to run the project?

    • @tonykipkemboi
      @tonykipkemboi  5 months ago +1

      @@sivakumar7679 You can pick any other model.

  • @ninadbaruah1304
    @ninadbaruah1304 8 months ago +1

    Good video 👍👍👍

  • @xrlearn
    @xrlearn 8 months ago +4

    Thanks for sharing this, very helpful. Also, what are you using for screen recording and editing this video? I see that it records the section where your mouse cursor is! Nice video work as well. My only suggestion is to increase the gain on your audio.

    • @tonykipkemboi
      @tonykipkemboi  8 months ago +4

      I'm glad you found it very helpful. I'm using Screen Studio (screen.studio) for recording; it's awesome!
      Thank you so much for the feedback as well. I actually reduced the gain during editing, thinking it was too loud, haha. I will make sure to readjust next time.

    • @xrlearn
      @xrlearn 8 months ago +2

      @@tonykipkemboi Btw, can you see those 5 questions that it generated before summarizing the document?

    • @tonykipkemboi
      @tonykipkemboi  8 months ago +2

      @@xrlearn, I'm sure I can. I will try printing them out and share them here with you tomorrow.

    • @tonykipkemboi
      @tonykipkemboi  8 months ago +3

      Hi @xrlearn - Found a way to print the 5 questions using `logging`. Here's the code you can use to print them out:
      ```
      import logging

      # Turn on INFO-level logs for the multi-query retriever so the
      # generated questions are printed to the console.
      logging.basicConfig()
      logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

      unique_docs = retriever.get_relevant_documents(query=question)
      len(unique_docs)
      ```
      Here are more detailed docs from LangChain that will help.
      python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever/

  • @dounia-o7i
    @dounia-o7i 4 months ago

    That was a really helpful video, thanks a lot. My one problem is that it takes very long to respond, around 30 minutes. I'm using a Weaviate image in Docker as the vector DB, Nomic embeddings, and Ollama's Phi-3 as my pretrained LLM, which shouldn't take that much time. Could you please suggest something to make it work?

  • @mslashm
    @mslashm 5 months ago +1

    Thank you for such an impressive video. Just one point: when running the loading steps, I'm receiving an SSL certificate verification error. I'm not sure why, or which certificate it's referring to.

    • @tonykipkemboi
      @tonykipkemboi  5 months ago

      @@mslashm Can you share the full error log?

    • @mslashm
      @mslashm 5 months ago

      @@tonykipkemboi Sure, it's happening while loading the PDF file:
      URLError:

  • @bhagavanprasad
    @bhagavanprasad 5 months ago +1

    @tonykipkemboi, thank you very much for the valuable video; it helped me a lot.
    I was struggling to find the right LLM that can run locally.
    I have a question: how do I create a persistent RAG so that query results can be faster?

    • @tonykipkemboi
      @tonykipkemboi  5 months ago

      @@bhagavanprasad Glad you found it useful. For this example, the speed depends on several factors, a major one being your system configuration; if you have a GPU, it will be much faster. An intermediate step would be to remove the MultiQueryRetriever, since it generates more questions from your prompt and then retrieves context for all of them from the vector DB, which takes time and introduces latency. You can use a generic single-question query and optimize retrieval another way, for example with a reranking model, but that is a bit beyond what we covered in this tutorial. There's definitely a trade-off where you sacrifice accuracy for speed and vice versa.
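
      On the persistence part of the question, a minimal sketch using chromadb's `PersistentClient` (an assumption; the LangChain `Chroma` wrapper takes a `persist_directory` argument to similar effect), so the embeddings survive restarts and don't need to be recomputed:

      ```python
      import chromadb

      # Store vectors on disk instead of in memory; reopening the same path
      # later reuses the already-computed embeddings.
      client = chromadb.PersistentClient(path="./chroma_db")

      collection = client.get_or_create_collection("local-rag")
      if collection.count() == 0:
          # Only embed and add documents on the first run. Placeholder
          # embeddings keep this sketch runnable without an embedding model.
          collection.add(
              ids=["chunk-0"],
              embeddings=[[0.1, 0.2, 0.3]],
              documents=["first chunk of the PDF"],
          )

      print(collection.count())  # stays at the stored count across restarts
      ```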

  • @ramonjales9941
    @ramonjales9941 4 months ago +1

    very good!

  • @theDaddyBouldering
    @theDaddyBouldering 6 months ago +1

    Thanks for the tutorial! How can I make the model give answers in a different language?

    • @tonykipkemboi
      @tonykipkemboi  6 months ago

      It would largely depend on the given model's ability to translate from English to the target language. You can try by adding the target language to the prompt: tell it to return the results in X language.
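
      One way to sketch that prompt instruction with plain string formatting (the wording and the `language` field are illustrative, not taken from the tutorial's prompt):

      ```python
      # A RAG-style prompt with an extra instruction fixing the output language.
      TEMPLATE = (
          "Answer the question based ONLY on the following context:\n"
          "{context}\n\n"
          "Question: {question}\n"
          "Respond in {language}."
      )

      prompt = TEMPLATE.format(
          context="Ollama runs large language models locally.",
          question="Where do the models run?",
          language="French",  # swap in any language the model handles well
      )
      print(prompt.splitlines()[-1])  # → Respond in French.
      ```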