Python RAG Tutorial (with Local LLMs): AI For Your PDFs

  • Published 24 Nov 2024

COMMENTS • 550

  • @frederichominh3152
    @frederichominh3152 7 місяців тому +49

    Best tutorial I've seen in a long time, maybe ever. Timing, sequence, content, logic, context... everything is right in your video. Thank YOU and congrats, you are smart as hell.

    • @heesongkoh
      @heesongkoh 6 місяців тому +1

      agreed.

    • @pixegami
      @pixegami  6 місяців тому

      Wow, thanks for your comment. I really appreciate it, and I'm glad you liked the video.

  • @vdabhade
    @vdabhade 5 місяців тому +41

    It's hard to find such high-quality videos that are to the point and simplified in every aspect. Great work!!!

  • @jonuldrick
    @jonuldrick 3 місяці тому +10

    I just wanted to say that this video inspired me to setup my own RAG. I got some help from a friend with some parts, but I've been working on adding more functionality. My current iteration has a menu that has database management and chatbot options. Database management lets me create, update, and delete databases. The chatbot option lets me choose which databases to use before loading the LLM. I also have added graceful interrupt handling. Thanks for the tutorial that provided me with a jumping off point.

    • @jonuldrick
      @jonuldrick 3 місяці тому +4

      And everything is being run locally using HuggingFace and Ollama.

    • @mohamedjasim8247
      @mohamedjasim8247 2 місяці тому +1

      @@jonuldrick Yes, correct. I am also trying the same as you.

  • @fabsync
    @fabsync 6 місяців тому +27

    Oh man.. by far the best tutorial on the subject.. finally someone using pdf and explaining the entire process! You should do a more in-depth series on this...

    • @pixegami
      @pixegami  6 місяців тому +5

      Thank you for the feedback :) Looks like with the interest this topic has received, I'm definitely keen to dive into it a bit deeper.

    • @fabsync
      @fabsync 6 місяців тому +2

      One of the questions I was asking myself with PDFs: do you clean the PDF before doing the embeddings, or is this something you can resolve by customizing the prompt?
      What would be a good way to do semantic search after using pgvector? I am still struggling with those answers.

    • @pixegami
      @pixegami  6 місяців тому +2

      @@fabsync Yeah, I've had a lot of people ask about cleaning the PDFs too. I think if you have PDFs with certain structural challenges, I'd recommend finding a way to clean/augment them for your workflow.
      An LLM prompt can only go so far, and cleaning noise from the data will always help.

    • @houstonfirefox
      @houstonfirefox Місяць тому +2

      @@fabsync Yep, do your best to make sure the incoming data is as clean as possible. In my projects, some of the PDFs were OCR'd many years ago with inferior tools. I re-OCR them and compare the original text with the newly-OCR'd text and see if there is an improvement with the known word count. If so, then I replace the text. I then do 'smudged-lens' image recognition on the document (ultra-low resolution of the document) and K-means clustering to determine an unsupervised classification of that document, similar to how you see a phone bill or electric bill from 10 feet away. You can't make out the individual letters but the overall image in your head tells you how to classify the document (phone bill vs electric bill) 😀

    • @sc5879
      @sc5879 6 днів тому

      @@houstonfirefox Do you have a github, video, or anything that goes into how you are cleaning up the pdfs? I need to do some cleaning up of my pdfs and am very interested in the details of your process. Which programs/python libraries/etc did you use to do this?

  • @tinghaowang-ei7kv
    @tinghaowang-ei7kv 7 місяців тому +26

    It's hard to find such high quality videos on China's Beep, but you've done it, thank you so much for your selflessness. Great talk, looking forward to the next video. Thanks again, you did a great job!

    • @pixegami
      @pixegami  7 місяців тому +1

      Thank you! Glad you enjoyed it!

  • @NW8187
    @NW8187 6 місяців тому +16

    Simplifying a complex topic for a diverse set of users requires an amazing level of clarity of thought, knowledge and communication skills, which you have demonstrated in this video. Congratulations! Here are some items on my wish list for when you can get to them. 1. The ability for users to pick among a selected list of open-source LLMs - a list that users can keep updated. 2. Build a local RAG application for getting insights from personal tabular data stored in multiple formats, e.g. Excel/Google Sheets, PDF tables.

    • @pixegami
      @pixegami  6 місяців тому +1

      Thanks for your comment, I'm really glad to hear it was helpful. I appreciate you sharing the feedback and suggestions as well, I've added these items to my list of ideas for future videos :)

  • @jakemgrim
    @jakemgrim 2 місяці тому +6

    One of the best tutorials I’ve watched on YouTube in a while. The non-local RAG video was also great! Well done and thank you for the information!

    • @pixegami
      @pixegami  Місяць тому

      Thanks! Really glad you found both videos helpful.

  • @humanetiger
    @humanetiger Місяць тому +2

    **Edit:** Just got to 10:25 and realized I was writing an answer to this 😀 Yeah, it's just an outstanding tutorial in all aspects.
    Love the content of this video a lot! It's so well prepared - fantastic job. Just by watching it, I might have spotted a weak point. No biggy, but something that can be improved. The unique chunk IDs do not represent their actual content, but rather the position of their content. When the content in an already existing position changes, the chunk ID won't recognise this change. What you can do is add a checksum of the content to your chunk ID. Something like data/monopoly.pdf:1:1:{chunk-content-checksum}. You could even leave out the index and decide only based on the checksum whether you keep / remove / add chunks for a page. In that line of thought, you could even remove all the other ID parts, ending up with something like data:{chunk-content-checksum}. The cleanup process for outdated chunks is a little more complicated that way, but doable nonetheless.
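
    A minimal sketch of that checksum idea (illustrative only, not from the video):

    import hashlib

    def chunk_id_with_checksum(source: str, page: int, content: str) -> str:
        # Hash the chunk text so the ID changes whenever the content changes,
        # e.g. "data/monopoly.pdf:1:a3f9c2e1b0d4".
        checksum = hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
        return f"{source}:{page}:{checksum}"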

    • @houstonfirefox
      @houstonfirefox Місяць тому +1

      Sounds good but I think I'd keep the chunk ID just to be on the safe side. There is the infinitely small chance that the checksums of two different PDFs could potentially compute to the same value. Admittedly it's an extremely rare occurrence but just the fact that it Could happen is enough to make me want to leave the Chunk ID in place 😉

    • @pixegami
      @pixegami  Місяць тому +1

      Hey, thanks so much for the kind words! Really glad you found the tutorial helpful. 😊
      Using a content-based checksum is a clever idea that could definitely improve change detection. But then without the page (position) of the content, I do think it will be challenging to figure out whether a "new checksum" means you need to add a chunk vs replace a chunk.
      That's why I feel you'd probably need both a positional index (ID) and a checksum. Great thoughts and ideas. Thank you for sharing them :)

    • @humanetiger
      @humanetiger Місяць тому

      @@pixegami Meanwhile I adapted your tutorial, built a local RAG on my machine + a web-based UI. It all works pretty well. For the document update part (and prob. some other things in the future) I added a command concept. When I type "--refresh" the database gets cleared and all documents are loaded into a fresh database. For me this works well, and it saves all the efforts for implementing more sophisticated updates.

    • @humanetiger
      @humanetiger Місяць тому +1

      @@pixegami Suggestion for a future tutorial, if you like: With my little python skills I found it impossible to query the collection asynchronously - found some hints how to do this, but it's all too advanced. Would be really interested in learning & understanding how this can be done.

  • @musiitwaedmond1426
    @musiitwaedmond1426 7 місяців тому +21

    this is the best RAG tutorial I have come across on youtube, thank you so much man💪

    • @pixegami
      @pixegami  6 місяців тому +1

      Thank you! I appreciate it!

  • @nickmills8476
    @nickmills8476 6 місяців тому +7

    To update the chromadb data for PDF chunks whose content has changed, store a hash of the PDF document contents in the metadata field. In addition to adding IDs that don't already exist, select records whose metadata.hash has changed and update those records using collection.update().
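
    A rough sketch of that approach with the native chromadb client (function and field names here are illustrative):

    import hashlib

    def sync_chunk(collection, chunk_id: str, text: str, metadata: dict):
        # Store a content hash in the metadata so later runs can detect changes.
        metadata = {**metadata, "hash": hashlib.sha256(text.encode("utf-8")).hexdigest()}
        existing = collection.get(ids=[chunk_id])
        if not existing["ids"]:
            collection.add(ids=[chunk_id], documents=[text], metadatas=[metadata])
        elif existing["metadatas"][0].get("hash") != metadata["hash"]:
            # Same ID but different content: overwrite the stored record.
            collection.update(ids=[chunk_id], documents=[text], metadatas=[metadata])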

  • @denijane89
    @denijane89 7 місяців тому +10

    That was the most useful video I've seen on the topic (and I watched quite a lot). I didn't realise that the quality of the embedding is so important. I have one working code for local pdf ai, but I wasn't very impressed by the results. That explains why. Thank you for the great content. I'd love to see other uses of local LLMs.

    • @pixegami
      @pixegami  6 місяців тому +3

      Glad you liked it! Thanks for commenting and for sharing your experience.
      And absolutely - when building apps with LLM (or any kind of ML/AI technology), the quality of the data and the index is really non-negotiable if you want to have high-quality results.

    • @chair_guy
      @chair_guy 4 місяці тому +2

      which is the best freely available embedder? any help would be really appreciated

  • @JaqUkto
    @JaqUkto 6 місяців тому +4

    Thank you very much! I've started my RAG using your vids. Of course, much of your code needed to be updated, but it was simple even given my zero knowledge of Python.

    • @sergiovasquez7686
      @sergiovasquez7686 6 місяців тому +4

      Maybe you could share the updates with us 😅

    • @pixegami
      @pixegami  6 місяців тому +1

      Nice work, glad you got it working!

  • @agustinfilippo5451
    @agustinfilippo5451 4 місяці тому +3

    I've watched a few of your videos and didn't know which one to comment on first - and to congratulate you. Great content and even better style.

    • @pixegami
      @pixegami  Місяць тому

      Thanks so much! I'm really glad you're enjoying the content and style. It means a lot to hear that!

  • @davidtindell950
    @davidtindell950 4 місяці тому +15

    BTW (by the way): I used the OpenAI Embeddings model="text-embedding-3-large" and obtained very similar results to your demo query about Monopoly. I first used Ollama 'llama3', but then retested with Ollama 'mistral:latest'. Surprisingly, the 'mistral' results were better than the 'llama3'!?! All I can say now is "G'Day Mate" and thank you again!

    • @vslabs-za
      @vslabs-za 4 місяці тому +1

      llama3 vs mistral? That's a weighty comment there mate...

    • @habibbayo3327
      @habibbayo3327 3 місяці тому

      @@vslabs-za 🤣🤣🤣

  • @chabilihicham7136
    @chabilihicham7136 Місяць тому +1

    This is genuinely a crazy good tutorial, kudos to you man

    • @pixegami
      @pixegami  Місяць тому

      Thank you! Really glad you found it helpful.

  • @KrishnaKotabhattara
    @KrishnaKotabhattara 7 місяців тому +3

    For evaluation, use RAGAs and Langsmith.
    There is also an SDK for azure which does same things as RAGAs and Langsmith.

    • @pixegami
      @pixegami  7 місяців тому +1

      Oh, thanks for the recommendation. I'll have to take a look into that.

  • @mpesakapoeta
    @mpesakapoeta 3 місяці тому

    The best RAG tutorial so far - you've introduced concepts I haven't seen in other similar tutorials.

  • @nachoeigu
    @nachoeigu 7 місяців тому +4

    Your content is amazing! Keep it going. I would like to see a continuation of this video covering how to upload and automate the workflow in the AWS cloud, and how to integrate the chat interface with a Telegram bot.

    • @pixegami
      @pixegami  7 місяців тому +2

      Glad you liked it, and thanks for the suggestions. My next video will be focused on how to deploy this to the cloud - but I hadn't thought about the Telegram bot idea before, I will look up how to do that.

  • @maikoke6768
    @maikoke6768 6 місяців тому +1

    The issue I have with RAG is that when I ask about something that I know doesn't exist in a document, the AI still provides a response, even though I would prefer it not to.

  • @mrrohitjadhav470
    @mrrohitjadhav470 7 місяців тому +11

    After searching 100s of videos, the journey ends here. 😍 Please, would you make a tutorial on building a knowledge graph using Ollama?

    • @pixegami
      @pixegami  7 місяців тому +4

      Thanks, glad your journey came to an end :) Thanks for the suggestion - I've added the idea to my list :)

    • @mrrohitjadhav470
      @mrrohitjadhav470 7 місяців тому

      @@pixegami Aweeeeeeeesome! Just one small tweak: a knowledge graph based on PDF/TXT (my own data). Sorry for not elaborating, but too much of my own data makes it difficult to find connections between many sources.

    • @mrrohitjadhav470
      @mrrohitjadhav470 5 місяців тому

      ??/

  • @royli2009
    @royli2009 3 місяці тому

    Pure gold! Thanks for putting all the bits and pieces together so well bro!

  • @jial.5245
    @jial.5245 7 місяців тому +4

    Thank you so much for the content👍🏼 very well explained! Would be great to see a use case of using autogen multi-agent approach to enhance RAG response.

    • @pixegami
      @pixegami  7 місяців тому +1

      Glad you liked it, thank you! And thanks for the suggestion and project idea :)

  • @michaelmaloy6378
    @michaelmaloy6378 4 місяці тому +1

    My laptop is woefully underpowered, and I had to update a couple of the dependencies, but I was able to get mistral to tell me about gears. Hoping I can get these "6 simple machines" pdfs to accomplish similar.
    Thank you so much for this project. :)

    • @pixegami
      @pixegami  Місяць тому +1

      Yeah, that's awesome you got it working! Laptops can definitely struggle with local LLMs, but it's great you found a workaround. Those "6 simple machines" PDFs sound perfect for this kind of project. Have fun exploring that - I'd be curious to hear how it goes! And thanks, glad you're enjoying the project :)

  • @Mykyta-Korniienko-CS
    @Mykyta-Korniienko-CS 5 місяців тому +1

    Deploying the model on the cloud would definitely be interesting! thank you for the video :D

  • @kam3580
    @kam3580 5 місяців тому +1

    Thanks, great video! To avoid duplicate documents and to look for updated documents I use a document hash to check.

    • @pixegami
      @pixegami  5 місяців тому

      Yup! Great idea :)

  • @nhatminhtran2270
    @nhatminhtran2270 Місяць тому

    To run this project we need to install Ollama, then pull nomic-embed-text and mistral with ollama pull.
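
    A minimal sketch of wiring those models up with LangChain (assumes Ollama is running, both models have been pulled, and the langchain-community package is installed):

    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.llms import Ollama

    def get_embedding_function():
        return OllamaEmbeddings(model="nomic-embed-text")  # local embedding model

    llm = Ollama(model="mistral")  # local chat/completion model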

  • @joxxen
    @joxxen 7 місяців тому +5

    Very nice, I wish I had this guide a few weeks ago - had to learn it the hard way xD

    • @pixegami
      @pixegami  7 місяців тому +1

      You got there in the end :)

  • @muhannadobeidat
    @muhannadobeidat 7 місяців тому +1

    Great video and nicely scripted. Thanks for the excellent effort.
    I find that nomic 1.5 is pretty good for embedding and lightweight as well. I did not do an actual performance-metric-based analysis, but recall and precision testing is pretty impressive with only 768 dimensions.

    • @pixegami
      @pixegami  7 місяців тому

      Thank you! Glad nomic text worked well for your use case :)

  • @JohnBoen
    @JohnBoen 6 місяців тому

    He he he... tests are easy. I was wondering how to do those.
    Prompt:
    State several facts about the data and construct a question that asks for each fact.
    Create tests that look for the wrong answer...
    Give me 50 of each...
    Give me some examples of boundary conditions...
    Formatting...
    In an hour I will have a fat stack of tests that would normally take a day to create.
    This is awesome :)

  • @carlosalberto-mo1wj
    @carlosalberto-mo1wj 5 місяців тому +1

    I simply love the whole video!
    For the next RAG tutorial, can you do a deployment on Azure or any other cloud, just to see in depth how this works?
    Thanks so much for the content man!

    • @pixegami
      @pixegami  5 місяців тому +2

      My upcoming video is actually about how to deploy a RAG app like this to the AWS cloud :) Stay tuned!

  • @NBPmusic9831
    @NBPmusic9831 5 місяців тому +2

    Thanks for sharing - now I have a grasp of the concept. If possible, it would be deeply appreciated if you could show how to do it in the cloud. Thanks.

    • @pixegami
      @pixegami  5 місяців тому

      Stay tuned for my next video! ua-cam.com/video/ldFONBo2CR0/v-deo.html

  • @iainhmunro
    @iainhmunro 7 місяців тому +2

    This is pretty good. I was wondering how I could integrate this with my current python scripts for my AI Calling Agent, so if someone wanted to call the number, they could chat with the PDF.

    • @pixegami
      @pixegami  7 місяців тому

      I think that certainly should be possible, but it's quite complicated (I haven't done anything like that before myself).
      You'd probably need something to hook up a phone number/service to an app that can transcribe the speech in real time (like what Alexa or Siri does), then have an agent figure out what to do with that interaction. And eventually hook it up to the RAG app.
      After that, you'll need to seriously think about guard-rails for the agent, otherwise you could end up with it getting your business into trouble. An example of this is when Air Canada's chatbot promised a customer a discount that wasn't available: www.bbc.com/travel/article/20240222-air-canada-chatbot-misinformation-what-travellers-should-know

  • @ManuelJimenez1
    @ManuelJimenez1 5 місяців тому

    Thanks for the whole tutorial. I would suggest covering how to speed up queries of the vector database using PostgreSQL with the pgvector / pg_embeddings plugins.

  • @davidgortega3734
    @davidgortega3734 5 місяців тому +1

    For the unit tests you can use tools or grammars to constrain the output, and that way you can fix some of the issues you are showing.

    • @pixegami
      @pixegami  5 місяців тому

      Good idea. I haven't actually explored testing LLM output in detail yet, and I think it will be a fascinating topic.

  • @nascentnaga
    @nascentnaga 7 місяців тому +4

    Suuuuuper helpful. I need to test this for a work idea. thank you!

    • @pixegami
      @pixegami  7 місяців тому

      You're welcome!

  • @davidtindell950
    @davidtindell950 4 місяці тому +1

    Thank You for: "Taking it up a notch ..." or 2 or 3 'notches' !

  • @කැලණිකුප්පි
    @කැලණිකුප්පි 7 місяців тому +3

    Recently discovered your channel 🎉 , subscribed 😊 keep up the awesome content

    • @pixegami
      @pixegami  7 місяців тому +1

      Thank you! Welcome to the channel!

  • @trueindian03
    @trueindian03 4 місяці тому

    This is the best RAG tutorial on youtube, Thanks for the Video, you got a new Subscriber 🎉

    • @pixegami
      @pixegami  Місяць тому

      Thanks so much! Really glad you found it helpful. Welcome aboard! 😊

  • @ravikiranbasuthkar2818
    @ravikiranbasuthkar2818 4 місяці тому

    This is the best practical tutorial I've come across on LLMs, RAG and LangChain. Also, can you make one about agents and their uses?

    • @pixegami
      @pixegami  Місяць тому

      Thanks! Really glad you found it helpful. Agents are definitely an interesting topic - I'll keep that in mind for a future video.

  • @ramanamachireddy
    @ramanamachireddy 4 місяці тому

    Thanks for your good work. It was such a crisp and clear video, straight to the point. I loved it. Keep doing the good work. I would also like to see how we can deploy such models in production. If you can do one such video next time, it would be really informative. Thanks - Ramana

  • @mo3x
    @mo3x 7 місяців тому +66

    So it is just an advanced ctrl+f ?

    • @pixegami
      @pixegami  7 місяців тому +18

      Yes, that's one way to think about it. Still, incredibly powerful.

    • @Larimuss
      @Larimuss 4 місяці тому +18

      When a company has 10,000 documents, trust me, this shit is useful and will be the future. Microsoft will probably sell it as an Azure extension option.

    • @yadhapdahal758
      @yadhapdahal758 4 місяці тому +2

      Good use case - an alternative to FAQ/help pages for websites/applications. Will give it a shot.

    • @kashishkumar
      @kashishkumar 4 місяці тому

      Semantic now

    • @beauforda.stenberg1280
      @beauforda.stenberg1280 2 місяці тому

      😂

  • @kofiadom7779
    @kofiadom7779 3 місяці тому

    your tutorials are very simple-to-understand. could you please do a tutorial on reinforcement learning from human feedback?

  • @krnl1304
    @krnl1304 9 днів тому

    Excellent one, man! Thanks so much!!!

  • @paulham.2447
    @paulham.2447 7 місяців тому +4

    Very, very useful and so well explained! Thanks.

  • @zhubarb
    @zhubarb 7 місяців тому +3

    Crystal clear. Great video.

    • @pixegami
      @pixegami  6 місяців тому

      Thank you! Glad to hear that :)

  • @VenuraPussella
    @VenuraPussella 3 місяці тому

    This is really good and informative, make a video on deploying application like these to a cloud.

  • @oanna.m3125
    @oanna.m3125 2 місяці тому

    Excellent tutorial. Thanks for uploading such great content!

  • @Ex0dus111
    @Ex0dus111 Місяць тому +1

    Great video, love the stack you're using also.
    I wonder if it's possible to do a 2-pass on the LLM: once to help the user formulate a better question for the RAG vector lookup, and then running that through the actual RAG and LLM as usual.
    You would need 2 different system prompts, and the first one would need an overview of the contents of the RAG db, so the LLM can read the user's question and then add keywords that give the RAG system a better chance of finding the right embeddings.
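
    A rough sketch of that two-pass idea (the prompt wording is illustrative, and query_rag stands in for the project's existing query function):

    from langchain_community.llms import Ollama

    llm = Ollama(model="mistral")

    REWRITE_PROMPT = (
        "The knowledge base contains board-game rule books. "
        "Rewrite this question with extra keywords so a vector search finds the right chunks:\n{question}"
    )

    def two_pass_query(question: str, query_rag):
        # Pass 1: let the LLM expand / clarify the question.
        expanded = llm.invoke(REWRITE_PROMPT.format(question=question))
        # Pass 2: run the normal retrieval + answer pipeline on the expanded question.
        return query_rag(expanded)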

    • @pixegami
      @pixegami  Місяць тому +1

      Yeah, that's a really interesting idea. A two-pass approach could potentially improve the result.
      I haven't implemented this specific approach myself, but it sounds similar to query expansion techniques used in information retrieval. It could be particularly useful for handling ambiguous or brief user queries.
      One consideration would be the additional latency from the extra LLM call. But for use cases where accuracy is critical, it might be worth the trade-off.

  • @careyatou
    @careyatou 6 місяців тому +1

    I got this to work with my own data. This was so cool. Thanks!

    • @pixegami
      @pixegami  6 місяців тому

      Awesome! Glad to hear it worked for you :)

    • @pmgear
      @pmgear 4 місяці тому

      I could not get it to work, it won't build a database, neither using bedrock nor ollama, bummer.

    • @willnorden2268
      @willnorden2268 3 місяці тому

      @@pmgear Yeah, it keeps saying that "from langchain.vectorstores.chroma import Chroma" is deprecated
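
      One possible fix for that warning (assumes the standalone langchain-chroma package is installed):

      # pip install langchain-chroma
      from langchain_chroma import Chroma  # instead of: from langchain.vectorstores.chroma import Chroma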

    • @davinchocamaron646
      @davinchocamaron646 Місяць тому

      did you use ollama?

  • @rongronghae
    @rongronghae Місяць тому +1

    Thanks for the interesting content!

    • @pixegami
      @pixegami  Місяць тому

      Yes, thank you! I'm glad it was helpful :)

  • @ayoubfr8660
    @ayoubfr8660 7 місяців тому +4

    Great stuff as usual! Could we have a video about how to turn this RAG app into a nice and proper desktop app with a graphic interface? Cheers mate.

    • @pixegami
      @pixegami  7 місяців тому +1

      Good idea, thanks! I'll note it down as a video idea :)

    • @ayoubfr8660
      @ayoubfr8660 7 місяців тому +1

      @@pixegami Thank you for the reply and reactivity! Have a nice day!

    • @J3R3MI6
      @J3R3MI6 6 місяців тому +1

      @@pixegami I subbed for the advanced RAG content

  • @michaelwindeyer6278
    @michaelwindeyer6278 4 місяці тому +3

    Thank you for this tutorial! It went into more detail than most. I have questions: in all the tutorials I have watched, there is always a small dataset used (a few games' instructions in yours). How big can the dataset be? What if I have 1000s of PDFs? Will RAG give less accurate answers in that case, and are there other things to consider when dealing with larger datasets?

  • @basselkordy8223
    @basselkordy8223 7 місяців тому +4

    High quality stuff. Thanks

    • @pixegami
      @pixegami  7 місяців тому

      Glad you liked it!

  • @AlexandreBarbosaIT
    @AlexandreBarbosaIT 7 місяців тому +4

    Smashed the Subscribe button! Awesome content! Looking forward to the next ones.

    • @pixegami
      @pixegami  7 місяців тому

      Thank you! Glad you enjoyed it, and welcome!

  • @node547
    @node547 3 дні тому

    Cool. Thank you for the good work.

  • @ishadhiwar7636
    @ishadhiwar7636 6 місяців тому +1

    Thank you for the fantastic tutorial! It was incredibly helpful and well-explained. I was wondering if you have any plans to release a video on fine-tuning this project using techniques like RLHF? It would be great to see your insights on that aspect as well.

    • @pixegami
      @pixegami  6 місяців тому

      Thank you! Glad you enjoyed the video. I've noted the suggestion about fine-tuning. I hadn't considered it yet, but thanks for sharing that idea with me.

  • @maxflokinho
    @maxflokinho 5 місяців тому +2

    I would like it not only to be able to read PDFs, but also, if the final answer is 'weak' or missing information, to do an internet search on the topic provided in the query and complete the final answer with the collected data. Do you think this is feasible? I thought about using agents for this, like crewai. I looked but couldn't find any tutorial that used both methods.

  • @DCW09
    @DCW09 7 місяців тому +2

    Cursory glance says: add a hashing function to the chunk metadata. This way the chunk has a unique identifier (MD5, SHA, etc.), and if anything changes, the hash will also change. Then it's just simple logic to validate the current chunk's page.index against the existing one's hash. If it's different, overwrite; if it's not, don't waste the cycles.
    In practice, I am not 100% sure that this would be the approach, but at least the theory here should be pretty on point for identifying changes with few compute cycles.

    • @pixegami
      @pixegami  7 місяців тому +1

      Yup! I think that's probably the way I'd do it too. If there are too many documents and you need to scale it, then I guess you can hash the entire document first as well, to narrow the search space each time.

    • @MichaelTanOfficialChannel
      @MichaelTanOfficialChannel 6 місяців тому +2

      @@pixegami I just want to add that I would apply the hash to the entire page and not the chunk. A page can be edited in a way where the content is shorter than the previous version, causing the number of chunks to be less than it previously was. And I would also remove all chunks belonging to the said page before adding new chunks, so as not to leave an orphaned chunk from the previously lengthier page.
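
      A rough sketch of that page-level approach with the native chromadb client (all names are illustrative):

      import hashlib

      def refresh_page(collection, source: str, page: int, page_text: str, new_chunks: list[str]):
          # Hash the whole page; if it changed, drop every chunk from that page
          # and re-add the new ones (avoids orphaned chunks from a longer old page).
          page_hash = hashlib.sha256(page_text.encode("utf-8")).hexdigest()
          existing = collection.get(where={"$and": [{"source": {"$eq": source}}, {"page": {"$eq": page}}]})
          if existing["ids"] and all(m.get("page_hash") == page_hash for m in existing["metadatas"]):
              return  # nothing changed on this page
          if existing["ids"]:
              collection.delete(ids=existing["ids"])
          ids = [f"{source}:{page}:{i}" for i in range(len(new_chunks))]
          metadatas = [{"source": source, "page": page, "page_hash": page_hash} for _ in new_chunks]
          collection.add(ids=ids, documents=new_chunks, metadatas=metadatas)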

  • @Jorxdandres22
    @Jorxdandres22 4 місяці тому

    Thanks friend, you're a genius. It helped me a lot with my project - I was able to implement it for a chat with Django, and I also managed to add images and have them taken into account.

  • @FurkanÇalışkan
    @FurkanÇalışkan День тому

    perfect explanation

  • @mehmetkaya4330
    @mehmetkaya4330 7 місяців тому +1

    Great tutorial! And could you please do a tutorial on when/how to know that data within the documents (PDF or CSV etc.) has changed?

    • @pixegami
      @pixegami  7 місяців тому

      Thanks for the suggestion :) That's a good idea, I think I'll have to plan it...

  • @Karthik-ln7eg
    @Karthik-ln7eg 4 місяці тому

    Great video! Love the way you simplified the concept. Are you thinking of making videos on the topics fine-tuning, function-calling, Agents? If so, that would be a great series of videos. I am sure all your subscribers including me would greatly benefit from them. Meanwhile, can you share any resources on these topics? (fine-tuning, function-calling, Agents)

    • @pixegami
      @pixegami  Місяць тому

      Yeah, thanks! Glad you found it helpful. I haven't dived deep into fine-tuning or Agents yet, but function calling is definitely on my radar. For now, here are some solid resources:
      - Function calling: OpenAI's guide (platform.openai.com/docs/guides/gpt/function-calling)
      - Agents: Langchain's overview: python.langchain.com/docs/how_to/#agents
      I'll keep your suggestions in mind for future videos. Appreciate the input!

  • @RasNot
    @RasNot 7 місяців тому +2

    Great content, thanks for making it!

    • @pixegami
      @pixegami  7 місяців тому

      Glad you enjoyed it!

  • @sardorshorahimov9486
    @sardorshorahimov9486 4 місяці тому

    Hi, thank you for the video and info. This is one of the best videos about AI, ML and RAG. Your video was so helpful :) thank you again :)

    • @pixegami
      @pixegami  Місяць тому

      Thanks! I'm really glad you found the video helpful and informative!

  • @nickmills8476
    @nickmills8476 6 місяців тому +1

    Using a local embedding model, mxbai-embed-large, got me similar results to your Monopoly answer.

    • @pixegami
      @pixegami  6 місяців тому

      Thanks for sharing! I hadn't tried that one yet.

  • @durand101
    @durand101 5 місяців тому +1

    Such a helpful tutorial, thank you!

    • @pixegami
      @pixegami  5 місяців тому

      Glad you enjoyed it!

  • @philc787
    @philc787 2 місяці тому

    Good content. Suggestion for the future: zoom in VS Code to make the text slightly bigger when running full screen.

    • @pixegami
      @pixegami  Місяць тому

      Thanks for the feedback! That's a great suggestion - I'll definitely keep that in mind for future videos. Always looking to improve the viewing experience.

  • @jasiriwa-kyendo8043
    @jasiriwa-kyendo8043 3 місяці тому

    Yes I would love to know how this can be pushed to the web, as using Ollama would completely change everything.

  • @TonySchiffbauer
    @TonySchiffbauer 2 місяці тому +2

    Do I need to use BedrockEmbeddings? I can't run this because it says I have the wrong AWS credentials. Maybe I'm missing something.

  • @shankarkarande4175
    @shankarkarande4175 4 місяці тому

    Thank you so much, best tutorial I've ever seen!!

    • @pixegami
      @pixegami  Місяць тому

      Thanks so much! Really glad you found it helpful! 😊

  • @elvistolotti45
    @elvistolotti45 6 місяців тому +2

    great tutorial

  • @gustavojuantorena
    @gustavojuantorena 7 місяців тому +2

    Great content as always!

    • @pixegami
      @pixegami  7 місяців тому

      Thanks for watching!

  • @felixkindawoke
    @felixkindawoke 6 місяців тому

    Thank you! Could you do a tutorial on how to talk to the data? So based on this create a voice chat with it.

  • @adriansasdrich2853
    @adriansasdrich2853 3 місяці тому

    Thx, very helpful and well explained ❤

  • @bartmeeus9033
    @bartmeeus9033 4 місяці тому

    Great tutorial, well explained!!!

    • @pixegami
      @pixegami  Місяць тому

      Thanks! Glad you found it helpful and easy to follow!

  • @eldino
    @eldino 3 місяці тому

    Thank you for the tutorial and the code!
    Two questions:
    1. How can we improve the chunking part?
    2. How can we create a derivative model of llama3 or similar that includes our embeddings? And how do we export and import in multiple machines running ollama?
    Thanks!

  • @maxi-g
    @maxi-g 6 місяців тому +2

    Hey, I have a question. I tried to load a fairly large PDF (100 pages) into the database (approx. 400 documents). However, the add_to_chroma function seems to be excruciatingly slow. The output from Ollama shows that the embeddings only get requested once every two seconds or so. There is also no CPU or GPU load on my system when this process is running. Is there any way to improve this? Thanks already.

    • @pixegami
      @pixegami  6 місяців тому +2

      This is most definitely because of the time it takes to embed each page (since you mentioned embeddings get requested once every two seconds). Your Ollama model might not be able to fully leverage your hardware, which is potentially why you don't see your CPU/GPU load rise.
      You could experiment by switching this to use an online embedding API (like OpenAI or AWS Bedrock) and see if it's faster. Or you could double check to see if Ollama is using your GPU correctly (github.com/ollama/ollama/blob/main/docs/gpu.md)

  • @chabilihicham7136
    @chabilihicham7136 Місяць тому +1

    Thanks!

    • @pixegami
      @pixegami  Місяць тому

      You're welcome! And thank you SO much for the Super Thanks 🤩

  • @derekpunaro2422
    @derekpunaro2422 7 місяців тому +3

    Hi Pixe! I was wondering how you would write the get_embedding_function for ChatGPT / OpenAI?

    • @pixegami
      @pixegami  7 місяців тому

      My first RAG project actually uses OpenAI embeddings: ua-cam.com/video/tcqEUSNCn8I/v-deo.html
      Here is the documentation and code examples from Langchain: python.langchain.com/docs/integrations/text_embedding/openai/
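
      For reference, a minimal sketch of an OpenAI version (assumes the langchain-openai package and an OPENAI_API_KEY environment variable; the model name is just an example):

      from langchain_openai import OpenAIEmbeddings

      def get_embedding_function():
          return OpenAIEmbeddings(model="text-embedding-3-small")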

  • @techietoons
    @techietoons 4 місяці тому

    Very thorough. Best video.

    • @pixegami
      @pixegami  Місяць тому

      Thanks! Really glad you found it thorough and helpful!

  • @ziadbensaada
    @ziadbensaada 6 місяців тому +3

    Hi, it gives me this problem when I run python populate_database.py:
    Could not load credentials to authenticate with AWS client. Please check that credentials in the specified profile name are valid. Bedrock error: The config profile (default) could not be found (type=value_error)

    • @1nd3v
      @1nd3v 3 місяці тому +1

      same problem, did you solve it and if you did, how?

    • @ziadbensaada
      @ziadbensaada 3 місяці тому

      @@1nd3v This model requires payment - you need to pay for it!

  • @devmit2071
    @devmit2071 7 місяців тому +3

    How do you do this NOT running it locally? i.e. using the AWS cloud for pretty much everything (PDF in vector database, Langchain, Bedrock etc...)

    • @pixegami
      @pixegami  6 місяців тому +1

      You'd have to change all the LLM functions to be cloud based (e.g. AWS Bedrock or OpenAI), wrap the app in an API (like FastAPI) and Docker, and deploy it to the cloud (probably as a Lambda function).
      I'm working on a video about that now, so stay tuned :)

    • @devmit2071
      @devmit2071 6 місяців тому

      @@pixegami Thanks. I'll drop you an email with some ideas

  • @AiWithAnshul
    @AiWithAnshul 7 місяців тому +2

    This is an impressive setup! I'm currently using Weaviate as my Vector DB along with Open AI Models, and it's working really well for handling PDFs, Docs, PPTs, and even Outlook email files. However, I've been struggling to integrate Excel and CSV files into my Knowledge Base. For small Excel files, the vector approach seems fine, but it's challenging for larger ones. I'd love to get your input on how to build a system that incorporates Excel files along with the other formats. I've considered using something like PandasGPT for handling the Excel and CSV files and the traditional RAG approach for the remaining file types (PDFs, Docs, etc.). Perhaps adding an agent as the first layer to determine where to direct the query (to the RAG model or PandasGPT) would be a good idea? What are your thoughts on this?

    • @pixegami
      @pixegami  6 місяців тому

      Thanks for your comment and for sharing your challenges and ideas. I think if you are mixing free-form text (like documents) and something more traditionally queryable (like a DB), it does make sense to engineer some more modality into your app (like what you suggested).
      I haven't explored that far myself so I can't share anything useful yet. But I'll be sure to keep it in mind for future videos. Good luck with your project!

  • @wailfulcrab
    @wailfulcrab 6 місяців тому +1

    On model param size: 7B models are enough. Not related to this video, but I'm using Llama3 8B with OpenWebUI's RAG and it works, but it sometimes has problems referring to the correct document while giving the correct answer (it will hallucinate the document name) - that's just how its RAG implementation is.

    • @pixegami
      @pixegami  6 місяців тому

      Interesting, I haven't tried this with the 7B models yet. Thanks for sharing!

  • @bharanij6130
    @bharanij6130 4 місяці тому

    Hello! Mighty pleased and thank you very much!

    • @pixegami
      @pixegami  Місяць тому

      Hey there! So glad you enjoyed it! Thanks for the kind words 😊

  • @HimSecOps
    @HimSecOps 7 місяців тому +1

    Amazing bro! Thank you. I request you to show how to connect the prompt part to a UI.

    • @pixegami
      @pixegami  6 місяців тому +1

      Thanks for the suggestion :) This is on my list to work on as well, stay tuned!

  • @danielcomeon
    @danielcomeon 7 місяців тому +1

    Thanks a lot. Great video!!! I want to know how to add new data to the existing database with new unique IDs.

    • @pixegami
      @pixegami  6 місяців тому

      Thanks! Glad you liked it. If you just want to **add** new data, the chapter on updating the database should already cover this. You just need to add new files into the folder, and run the `populate_database` command again. Any pages/docs not already in the database will be added.
      But if you meant updating existing pages/segments in the existing data, then yes I'll have to make a video/tutorial about that :)

  • @joaquinestay8097
    @joaquinestay8097 4 місяці тому

    To regenerate a modified PDF, just add the modified date to the metadata and add it to the workflow.

    • @pixegami
      @pixegami  Місяць тому

      Yeah, that's a smart approach! Adding a modified date to the metadata could definitely help track changes. You could then use that date to trigger re-embedding for only the updated chunks.

  • @theodorostrochatos7247
    @theodorostrochatos7247 5 місяців тому

    Great work! Thank you! Maybe a naive question, what should we change to make it work with LM studio instead?

  • @wawj-rf1ul
    @wawj-rf1ul 4 місяці тому

    Excellent video! Thanks. Is it free to use the embedding functions you recommended in the video, like AWS Bedrock or Ollama embeddings?

  • @Ammarsays
    @Ammarsays 6 місяців тому +2

    I am just a layman and I want to know if text splitters count the characters, words or sentences for a given chunk size? And if the text splitters can identify sentences or paragraphs in text?

    • @pixegami
      @pixegami  6 місяців тому

      Yup, the splitter should attempt to do that. Here's the documentation: python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/recursive_text_splitter/
      "It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough. The default list is ["\n\n", "\n", " ", ""]."

  • @TrevorDBEYDAG
    @TrevorDBEYDAG 6 місяців тому +1

    Thank you for the tutorial. I guess there should be a better way to create chunks, not only by character count, because it cuts paragraphs in a disruptive way. Maybe another library that splits into paragraphs, or at least at the end of a sentence?

    • @pixegami
      @pixegami  6 місяців тому +1

      The Recursive Text Splitter actually attempts to do what you suggest: python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/recursive_text_splitter/
      "It tries to split on them in order until the chunks are small enough. The default list is ["\n\n", "\n", " ", ""]."

    • @TrevorDBEYDAG
      @TrevorDBEYDAG 6 місяців тому

      @@pixegami That's cool.
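
      A quick sketch of what that splitter setup can look like (chunk sizes are counted in characters; the numbers here are illustrative):

      from langchain_text_splitters import RecursiveCharacterTextSplitter

      splitter = RecursiveCharacterTextSplitter(
          chunk_size=800,
          chunk_overlap=80,
          separators=["\n\n", "\n", " ", ""],  # paragraph, line, word, then character
      )
      chunks = splitter.split_documents(documents)  # `documents` = the loaded PDF pages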

  • @60pluscrazy
    @60pluscrazy 7 місяців тому +1

    Excellent 🎉🎉🎉

    • @pixegami
      @pixegami  7 місяців тому

      Thank you! Cheers!

  • @sergiovasquez7686
    @sergiovasquez7686 6 місяців тому +1

    I just subscribed to your channel… very high-quality vids on YouTube

    • @pixegami
      @pixegami  6 місяців тому

      Thank you! Welcome.

  • @pradeepvenkat4557
    @pradeepvenkat4557 3 місяці тому

    Nice video.. can you please share the update logic too, or make another video if possible?

  • @tzeroveca
    @tzeroveca 3 дні тому

    Great video! Thank you! I would be interested in how this concept can be deployed to the cloud and also scaled.

  • @abdshomad
    @abdshomad Місяць тому

    10:36 updating the content: in kotaemon, there's an option to rebuild the index of a specific file. This is one good option, I think.

    • @pixegami
      @pixegami  22 дні тому +1

      Oh, that sounds like a great feature.

  • @mingilin1317
    @mingilin1317 7 місяців тому +1

    Great video! Successfully implemented RAG for the first time, so touching. Subscribed to the channel already!
    In the video, you mentioned handling document updates. Do you have plans to cover this topic in the future? I'm really interested in it!
    Also, are "ticket_to_ride" and "monopoly" sharing the same database in the example code? What if I don't want them to share? Is there a way to handle that?

    • @pixegami
      @pixegami  6 місяців тому

      Awesome! Glad to hear about your successful RAG project, well done!
      I've had a lot of folks ask about vector database updates, so it's something I definitely want to cover.
      If you want to store different pieces of data in different databases, then I recommend putting another layer of logic on top of the document loading (and querying). Have each folder use a different database (named after each folder), then add another LLM layer to interpret the question and map it to the database it should query.
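
      A rough sketch of that routing idea (paths, prompt and model names are all illustrative):

      from langchain_chroma import Chroma
      from langchain_community.embeddings import OllamaEmbeddings
      from langchain_community.llms import Ollama

      embeddings = OllamaEmbeddings(model="nomic-embed-text")
      llm = Ollama(model="mistral")
      DATABASES = {
          name: Chroma(persist_directory=f"chroma/{name}", embedding_function=embeddings)
          for name in ("monopoly", "ticket_to_ride")
      }

      def route_and_query(question: str):
          # First pass: ask the LLM which database the question belongs to.
          choice = llm.invoke(
              "Which topic does this question belong to: "
              + ", ".join(DATABASES)
              + f"? Answer with one word.\n\nQuestion: {question}"
          ).strip()
          db = DATABASES.get(choice, DATABASES["monopoly"])  # fall back to a default
          return db.similarity_search(question, k=5)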

  • @JorgeGil-qf6zy
    @JorgeGil-qf6zy 6 місяців тому +1

    Thank you for the video pixegami. Question: how would you implement follow-up questions based on the last answer?

    • @pixegami
      @pixegami  6 місяців тому +1

      You'd probably need a way to store/manage memory and use it as part of the next prompt. I haven't explored this much myself, but it's a topic I'm interested to look into as well, so thanks for the comment :)

  • @pampaniyavijay007
    @pampaniyavijay007 6 місяців тому +1

    Superb bro 🤩