Thank you for this video. I've been going through all these n8n tutorials on UA-cam, and this one nailed the exact 'yeah, but' gaps I had when trying to put everything together. The memory storage, the data garbage collection-nobody else seems to go over these details, probably because they're trying to sell services to "figure this out" for you. Really appreciate you putting this info out there for everyone. You just earned a sub!
🎯 Key points for quick navigation: 00:00:00 *🚀 Introduction to RAG AI Agent* - Discusses the limitations of existing RAG tutorials and introduces a more robust solution using n8n and Supabase. - Highlights the ease of combining n8n and Supabase for a production-ready, cost-effective RAG AI agent. - Promises a step-by-step guide to set up the system in under 15 minutes. 00:01:25 *🛠️ Demonstration of RAG AI Agent* - Demonstrates the initial setup of the RAG AI agent with an empty knowledge base. - Shows the process of adding a document to the knowledge base and retrieving information using the agent. - Highlights the integration of Google Docs and Supabase for document management. 00:04:03 *📚 Setting Up Supabase* - Provides instructions for setting up Supabase for chat memory and vector database. - Explains the use of Supabase's free tier and necessary credentials for integration. - Details the steps to configure Supabase for use with n8n. 00:06:33 *💬 Workflow Execution in n8n* - Describes the workflow execution process in n8n, including chat message triggers. - Explains the use of GPT for chat models and the setup of chat memory using PostgreSQL. - Discusses the use of Supabase as a vector store for RAG. 00:09:55 *🔄 Managing Document Updates* - Details the process of managing document updates and avoiding duplicates in the vector database. - Explains the importance of deleting old vectors before inserting new ones. - Shows the steps to download, extract, and insert document content into Supabase. 00:14:35 *🎯 Finalizing the RAG AI Agent* - Summarizes the complete workflow for a production-ready RAG AI agent. - Suggests potential enhancements for better semantic search and keyword search. - Encourages feedback and future content on n8n and Supabase integration. Made with H
This integration of RAG AI Agents with platforms like n8n and Supabase is indeed a game changer! I love how it streamlines complex processes into manageable workflows.
I'm glad you found this useful! I agree it's a gamechanger - I've spent way too much time in the past building relatively complex AI Agents that I now know I wouldn't even need to code! haha
Perfect video my brother. I'm part of a large community here in Brazil and I will recommend it to colleagues. The video is so good that with my intermediate English, I was able to understand it perfectly. Thank you for this content!
I was sceptic first, as like you say: many people just talk and do not actually teach. Blessings for you and this channel. Ff you keep that up you will be successful. This was a good and clear explanation.
Amazing video Cole. This video is in a different league to the vast majority of other so called content creators who are more about getting clicks for headlines. They nearly ALWAYS show workflows that are missing the small details needed to create a real-world production ready agent. Subbed and liked. Looking forward to future content from you.
@@pumpituphomeboy Thank you very much, that means a ton! That's exactly what I'm aiming for - hitting the small details that make all the difference :)
15:02 Thank you for the great tutorial! Definitely keep it up. In case it helps you as an educator, the Document Loader Options (15:02) were vital for the success of the tutorial, but you went past them so fast I didn't realize they were there! In any case, I learned a ton from investigating that myself, and from the tutorial in general. Again, thank you!
Thank you very much and I appreciate you calling that out a ton! It's important for me to not miss anything big and I agree I should have covered the document loader options.
Man i just discovered your content. You are a gem :) Thanks. I've work and test some things in n8n few years ago, but it's great to see what it can be done RN.
Another thing I learned from your video is this: N8N has its own chatbot building system. I used Flowise for that. For me it would be interesting to see a video explaining the differences (and similarities) between N8N and Flowise in regards to AI agent/chatbot functionalities. Thanks again for the top-notch content!
You are welcome and thank you for the suggestion! I know a lot of individuals who love using Flowise with n8n, so I do have this on my list of content for the future!
This was a really good tutorial. You delivered everything you promised, it was easy to follow and beats the hell out of the RAG agents I've built in the past. Thank you.
How would you set up the initial vector database with recursive uploads of files from Google Drive Folders to be inserted into the vector database? This would give you access to your existing files and then updates as everything is updated over time. Would be super useful for things like Standard Operating Procedures.
This is a great question! You could create a separate n8n workflow that you run once that would go through all the folders in the Google Drive you want and index them into your vector DB. That's definitely doable with n8n. Then going forward the workflow I show in the video would handle new files or updated files. The other option is you could create a new folder for RAG that you use in the workflow, and then just copy your SOPs or other files from wherever else you store them into your new folder for RAG (then potentially delete the original copies after). Not an ideal solution if you don't want to shuffle things around but that would be the quickest!
Would love to see this setup with the change being local host. Like port forwarding the thing (if needed) etc. so that the agent (although running locally) can be access on the internet or integrated via api or embed
Great walkthrough. For next topic, it will be cool to show a custom react application where chatbot is hosted and connected to n8n behind the scene. The file uploads, processing all that smooth connection that can be shown.
This is a really thorough walk through for documents. How would this be altered if you were wanting to embed database data from a table? This is where I keep running into problems... I really love the enthusiasm and clarity you share with each video! Thank you for all the value!
Thanks Cameron, I appreciate it a lot! Great question! There are a lot of ways to embed a table. The easiest way would probably be to turn the data into a CSV and split the CSV into chunks (making sure to not split in the middle of a row) and ingest that into the vector database. Another option is to ingest each record or a set of records as raw text into the vector database. Hopefully that helps! What problems specifically have you run into?
@@ColeMedin Thank you so much for the reply. This might seem very long and like i am taking advantage of your generosity of time. I need two tables. One table that essentially would house my company wiki database (each row is it's own article, persay and thankfully this has been able to be sent one row at a time via webhook), and then another that would hold the extracted data from supplementary pdfs and documents. The idea would be that if the wiki table doesn't have the answer, the bot could then search the supplementary table. The problem I'm running into is that when I follow the steps of your video, the table is saved as documents and will then only ever try to search a table called "documents" (even if it is renamed). If I add another table, no matter what I set the vector store as, it always looks for "documents". I was able to get some SQL code that created more tables and gave me some new functions, but no matter what, N8N will just try to look in "documents" and will say my new functions failed. I'm considering moving to pinecone or something else, but the idea of using postgres was very appealing. I fully support the work you are doing and am certain the best is yet to come!
Of course and no worries, I'm glad to help! Your use case sounds awesome and it's definitely something you could accomplish with n8n and Supabase. It's hard to be concise in explaining this, but essentially you would replace the RAG tool I have in the video for the agent and replace it with two n8n workflow tools that would each search a different table with RAG given a query. So you would instruct the agent to use the workflow that queries the wiki table first, and if it fails to find the answer then call the tool to query the supplementary table. Does that make sense?
So it turns out, your context window is the same with or without rag according to Claude, which means all rag can do is help you choose your context segments that will be handed to the llm in text format for content larger than the context. I don't really need rag, I just need a method to isolate segments from my total context related to a particular query and then use the reduced context for it, this is a little slower, but it's going to better than rag at choosing segments. Don't pay for databases, you have a specialist at your disposal, use them.
Yes that is true that RAG is "just" a way to get specific context into the LLM from a knowledgebase and it doesn't actually extend the context window of the LLM! RAG is one of many methods to isolate segments from total context as you mentioned, and is generally considered the best/easiest to implement option. I'm curious to hear more about what exactly you are thinking of implementing. Sounds interesting!
@@ColeMedin I don't need an LLM to be a knowledge base, databases or even text files are much better at being accurate. Where LLMs are useful are being an interface to a database, my path is to create tools that can be 100% accurate rather than super fast, because of 2 main reasons, 1: agents will destroy themselves and do so far too easily because they don't really know where they messed up, even if they only do it 1% of the time, agents feeding off themselves will accumulate errors more than a quantum computer. 2: inference speeds will continue to become faster, so agents should spend a few more turns on verifying their every move when they interact with deterministic tools like a database, then you can leave them to complete a large task and not have it produce unusable garbage. Well, that's the plan anyway.
@@saxtant Yeah I see where you are coming from! How would you handle a large amount of unstructured data though? If you have a bunch of let's say meeting notes or standard operating procedures, it would be hard to get an LLM to query for those in a SQL database or text files without having RAG to do a similarity search and pick out what matters for the user question. What are your thoughts there? I'm really curious to know!
@@ColeMedin I would prefer not to let it get that far, I mean it's true my house is a complete mess, but I would prefer not to have a large pile of unstructured data and actually I don't, I may not have the same requirements as others, but my current method is all about having an LLM actually use a scratchpad for unstructured data, only for it to be sanitized in the background and removed from the scratchpad to keep the context small.
I'm working very hard to do exactly this. I made a script I call context manager which serves to create the memory of the LLM for a specific task. I'm working with large modules and applications. The context manager basically parses my code and then I select which part of the code I want to send to the LLM to reduce hallucinations and increase coherence and increase the accuracy of the response. My goal is to have this context determined by an AI that will read the script and select which part of the code are relevant to include in the coding LLM 'a context. The goal is to separate task on different calls as the more specific the question is and the more focused the context is better is the response. So far I'm doing the context creation manually which is of course a pain. I think. I'm getting close to getting it to work. The biggest issue I have is that LLM are incredibly inconsistent even with ultra clear instructions. Claude which is by far the best I've tried for coding does not follow instructions well. I get more structured responses from GPT4o latest but it is not on the same level as Claude for coding tasks. I'm also working on an auto merging software that takes the code output and parses it for commands. Which then the program uses to accurately merge the code snippet into the original. I use LibCST for the operation. I'm Getting close to getting it work correctly but still lots of little kinks to fix. Anyone working on this who wants to collaborate I'd be interested. Btw I'm not a programmer I'm an engineer with a knack for creation and understanding complex things. I rely on AI to write code. I am managing it and figuring out the problems and explaining it how things should work. So a talented coder could really help advance the project.
This is a very insightful video I am very curious why you rushed through the "Default Data Loader" which I though was the most important step to defining the "File_ID" mapping.
Thank you very much! I certainly should have spent more time on the Default Data Loader, that is something I would have done if I were to redo the video!
This is a great resource. Thanks for posting! One suggestion I would make is to add descriptive chapters. I find I have to seek around a bit to get to the parts I need.
Thank you and I appreciate the suggestion! I've been doing it for my more recent videos, but I honestly should go back and do it for this video. I'll be sure to take care of that!
Awesome video! I’m curious if you ever do consulting for businesses. I’m currently working on a startup and I plan on using this technique but I’d love to pay a consultant who actually knows what they’re doing to make sure it’s set up properly from the get go. If you don’t take on new clients I’d also be interested in anyone you might recommend!
We need to focus on Localization of RAG and creating STACKS to host/use locally for the utmost security. Let's not forget you're training theses companies data for them by sending all your content into cloud services.
Great point! I definitely want to work towards making more content around localizing the entire system. Local setup is a bit more involved for things like the database which is why the full stack isn't local at this point, but I agree fully local is ideal!
@@redneq I haven't fully attempted or seen this built out before but yes this would be possible! And you certainly aren't loony haha, this sounds great and I know it's what a lot of people are looking for. Lot I'm working on behind the scenes related to local RAG!
@@redneq I do this every day, every nigt when i have time besides my fulltime job.. and what you will expirience is. That it is not building the prncibles and showing a bit nonsense, it is the companys data. You must show up with a solution for the real problem, its the data! This advice here shows just that he has just no idea of what hes doing.
great guide! One issue I am having is my docs keep coming out in unicode on my rag upload. My extract text method keeps yieleding unicode... [Problem in node ‘Insert into Supabase Vectorstore‘ Error inserting: unsupported Unicode escape sequence 400 Bad Request] I can see my data from the extraction is all corrupt, tried a few documents, any ideas? Fixed my first issue with txt file. How do I make sure PDFs and DOCX get parsed correctly? Seems to be an encoding issue. Also any suggestions to deal with a large volume of docs needed to add to the vector store? If I try to drop a few files into drive it breaks obviously. How can I look through these items and process them in batch?
Thank you! For handling different file types like PDFs, you'll need to use the different "extract text" nodes in n8n made specifically for those file types. I actually have a video up going over how to do this! ua-cam.com/video/T1ZKEmDN8AA/v-deo.html The Google Drive trigger can handle multiple files at once, it just sends them all in a single workflow execution in n8n so you have to add a "loop" node to loop over each of the files uploaded and insert each one of them individually.
Question: Why don't you make the id in the table documents in supabase be the Google sheet id just change the id to be uuid instead of int8 that way you can remove that step of deleting and inserting the document ... supabase trigger will get called on update too. If you match the IDs this way you can reduce the step and avoid dupes and making sure that when the user is chatting if the document was deleted they fall into that inconsistent behavior as the document might not be there as it got updated and in your current flow you have to delete and insert in between these operations there can be hiccups. btw this is awesome, Most videos about n8n aren't well thought out and you cover a lot. I think this is the best way to get non tech and tech people into RAG! Plus automation, agents all the way.
This is a FANTASTIC question, I appreciate you asking it! And thank you for the kind words as well! The reason I don't do what you're describing is simply because of a limitation with n8n. When you insert documents into Supabase for RAG with the "Supabase Vectorstore" node, there isn't a way to customize the ID of each record to make it correspond to the Google sheet ID. At least not that I have found. So this approach with the metadata is the way to work around that in n8n. I also really wanted to demonstrate a use case for metadata since it's a really important topic for RAG so it worked out well that this workaround was necessary. If you coded this solution yourself, then what you are proposing would be a very good approach.
Perhaps I'm not understanding the suggestions, but with this suggestion, for longer documents which get chunked, wouldn't a single Google document ID be associated with multiple chunks? And wouldn't that be a problems because the Supabase ID needs to be unique for each chunk. Perhaps you could append the Google Doc ID to make it unique (e.g. {googleDocID}-{uniqueNumber}; then you could have access to the Google Doc ID. But this all depends on being able to manage the Supabase ID.
Hey Cole, great work! But do you know why the LangChain vector stuff (SQLEditor code) isn't installing properly on Supabase self-hosting? I entered the code and ran, and it seemed to work but no table was created. Thanks
Thank you! That's a bummer - you aren't seeing any error message at all? My guess is you're looking at the incorrect database schema. Supabase typically uses the "public" schema but maybe it's looking in a different place by default in the self hosted Supabase?
My main question here is, for instance, let’s say i have a bunch of PDFs that needs to be RAGed. How can i add them manually to that database instead of using n8n. And if a PDF gets updated, how do i replace it ( i guess i would have to remove from database and re add ? )
Since you have to vectorize the PDFs, there isn't a super good way to do it manually that I know of. You could have AI help you create a script to parse through your PDFs though and add them for RAG! To your last question - yes, remove from the DB and readd!
Thanks a lot for the video, which is very perfectly useful. I have planned to do a chatbot on Discord, if you have any plan to do some videos explaining how to connect to the Discord bot that would be very helpful. Thanks again and cannot wait for new videos.
Excellent video. Great job, you have spotted a perfect example and the realization is very well executed.👏 Why do you need to connect the RAG AI Agent for the Postgres chat memory using the Postgres while you use the API connection for the document insertion ? How can I better understand the steps and choices you took for the chunking part ? Could you briefly explain when your method is relevant ?
Thank you - I appreciate the kind words! The Postgres chat memory is separate from the document insertion (which is for the knowledge retrieval). The chat memory is there so the agent can remember previous messages in the conversation. The document insertion and the Supabase documents table is there for the agent to be able to search across your documents to answer a question. The reason the chat memory and document insertion use different credentials is mostly because the Postgres chat memory in n8n can use any Postgres database - it doesn't have to be Supabase. So those credentials are more "generic" to a Postgres database, while the credentials for RAG are specifically for Supabase. I hope that makes sense! For chunking, a lot of it just comes down to playing around with the chunk and overlap sizes to figure out what works best for your use case! There aren't too many rules to follow there. I just 1000 for my chunk size just because that's a default used in a lot of applications.
This is a great and a detailed tutorial on using n8n for RAG. I noticed that although I add files to the google drive directory the app is monitoring, it still won't fetch those docs for the RAG part. Any idea why this might be happening
Thank you man, I appreciate it! That's strange n8n isn't picking up on new files in your Google Drive... is your workflow switched to active? You'll have to toggle it to active in the top right of the workflow view to make the Google Drive triggers work! I'd also double check and make sure the triggers are set to use the same folder you are adding files to.
@@ColeMedin Thank you. It seems like it is not triggering on pdf files, just google doc files, I ended up adding a new text extraction for the pdf but that didn't work either. Maybe you can look into this in the future, unless if I am the only one having this issue. Thanks again for the great content
Thank you for demonstrating a no-code RAG implementation. It's quite impressive! However as some of the comments mentioned local RAG is a realistic requirement due to security concerns. However for a true real world rollout, there's going to be a need for a guardrail framework tied to role-permissions, and a test framework for validating expected / unexpected outcomes and I believe that will inevitably lead to a code-based implementation. For now, the n8n platform is great for prototyping different backends / engines. This was inspiring nonetheless!
Thank you Howard! You make great points! For a lot of applications, there truly is a lot that goes into making them production ready with all the security/testing requirements. n8n is suitable for some applications (such as many website chatbots), but others you are right in saying the requirements will often lead to the need for a coded solution.
My workflow appears to only be triggered by the most recent file that was created. That is, if 2 files are uploaded between trigger events, Only one of those files will be added to Supabase. So, if your trigger runs every minute, and you exceed a file-upload rate of 1 file/minute, files will not be added to Supabase. Have you tested this scenario and ruled it out as a flaw in the workflow? I'm unable to overcome it with this current setup, as I understand it, at least.
Okay I did some more testing with this and right now the workflow does only handle one file being uploaded/updated at once within a minute. If you do more than one within a minute, it will trigger the workflow only once but there will be multiple items there - so you just have to loop over them to process them and index them.
@@ColeMedin would make for a good video in of itself, not many people seem to want to explain how embeddings are created, how you can manage them or what the implications they can have for your outputs.
Yeah definitely - thank you for the suggestion! I have it in my list to create some content around more advanced RAG techniques. Embeddings are necessarily advanced but it does fall under the category of going into something specific to RAG in more detail!
Your tutorials are incredible compared to others. Thank you so much for your amazing work. I have a question regarding the approach to inserting PDF's with tables into vector stores. I am having a lot of trouble and still no good results. I tried your best practices (thank you so much for that) but you don't address this issue, how to insert PDF's with tables without loosing semantics. Would you address this in one of your videos, please? I see there are some solutions with coding but that is a no no for me... and from what I understand it requires pre-preparing the data, is that correct? Can it ne done in n8n? Thank you and keep up the amazing work!!
Thank you for the kind words! Great question but super hard to answer over text - I am going to be making a video soon though (within the next couple of months) on how to do RAG better with CSVs!
Great video! Thanks!!! Is it possible to make this product usable as a SaaS for searching for 10-20 different clients, with different databases and different accounts?
Thank you! You bet! Yes you certainly can - I would do some research on RAG metadata - you can essentially segment the knowledgebase based on the tenant (each of your clients), so when you insert new knowledge you specify the tenant it is for, and then when you query knowledge for a single tenant you use the metadata filtering to limit the knowledge it can pull from to just that tenant.
Comrade, you really are fantastic with N8N! I really appreciate you sharing your knowledge here on YT. Another "follower". Congratulations. I am from BRAZIL
Hi Cole, I got this working with your JSON file in minutes. Thank you for the detailed step-by-step instructions on getting APIs and SQL codes setup! Q: How can I make this just search one or a few documents from the database?
You are so welcome, nice job!! Could you clarify your question? For RAG it will only retrieve the documents you ingest into the vector DB. You could use metadata filtering to filter down on the documents you want to search. A lot of how that would work would depend on your use case and setup though!
Great Content! Can you make a tutorial about how to make agent that can be database manager ? Gives info to client, update record, and more for business?
Thank you!! And I actually am planning on making a video like this already! Are you thinking this agent would create custom queries to manage the database, or more just call tools that already have queries defined to perform certain actions? I am thinking of doing both but curious what you had in mind!
Why supabase? what is benefit of supabase vs postgres? In this video not using any advance auth mechanism to validate user and I believe same can still be done with Postgres, right?
Supabase is running Postgres under the hood! It's just super convenient compared to hosting Postgres yourself and it has features like authentication and row level security for expanding this solution.
Great question! When you have n8n hosted locally, you'll have to set up a domain and SSL certificate to be able to use Google since Google won't work with localhost unfortunately. I would suggest hosting n8n on a VPS using a service like Digital Ocean! n8n actually has great documentation on hosting in DigitalOcean: docs.n8n.io/hosting/installation/server-setups/digital-ocean/
Hi Great video. For some reason supabase does not create the documents table, and i get an error message from the Delete Old Doc Rows node saying Bad request - please check your parameters column n8n_chat_histories.metadata does not exist. What could be the issue? The rest of the nodes are seems to be working.
Thank you! Sorry you are running into this issue though! The documents table isn't created by itself, you have to follow these instructions: python.langchain.com/docs/integrations/vectorstores/supabase/ For the delete old docs node make sure the table is documents and not n8n_chat_histories!
Great tutorial! Ironically, I've been stuck for a couple of days on the google drive authentication. Not able to connect n8n with google for some reason :/
@@ColeMedin I believe the error was with the app being on Testing mode, it didn't allow any OAuth to proceed (contact developer - kinda funny when you are the 'developer' haha). I was able to troubleshoot by adding the email manually to google console, I guess whitelisting it for Oauth login while is Testing. I tried switching the app to Live mode but it required google review and approval, etc.
How can we setup so you can add pdfs, excel files, different types of files and have it extract them into text to be able to be embedded? I tried using a switch but it won't accept any schema from the binary files. Any ideas?
Great question Jack! This would involve a bit more of an in depth flow where you would add branching to your workflow based on the file type. I've done this before with n8n so I know it's pretty easy to set up. Basically you would add an "if" (router) node to your n8n workflow. If the file type from the Google Drive trigger (you could use the mimeType property) is a PDF, then you would route to a "Extract from PDF" node, if it's an Excel file, you would route to a "Extract from XLSX" file, etc. If you click on the "Extract from File" node in n8n you'll see a list of options that includes these. Then you have all of those separate "extract from file" nodes route back to the rest of the workflow that handles the extracted text. Hopefully that all makes sense!
Curious on the google drive node, is there a way to monitor subfolders? The google drive nodes both have this call out, "Changes within subfolders won't trigger this node" In other news, great video and thanks for sharing.
Thank you Don and great question! That is correct that the Google Drive trigger node doesn't watch subfolders. If you want to monitor subfolders, you could set up triggers for those specific folders as well. Obviously that's only realistic if you don't have dozens of subfolders. The best way to handle this without creating a trigger for each folder would be to use the Google Drive "Changes" API. You can basically tell Google Drive to alert you when a file is created/updated within a folder or your entire Drive by sending a request to a webhook which could be an n8n workflow (with a webhook trigger). This method does handle subfolders! So if you're really curious about extending this I would take a look at the Changes API!
@@ColeMedin Thank you, appreciate the help. I was able to work with the Google Drive search node, triggering the path every 15 minutes and searching for any files that have been modified in the last 15 minutes with this query: modifiedTime > '{{DateTime.now().minus({ minutes: 15 }).toUTC().toFormat("yyyy-MM-dd'T'HH:mm:ss'Z'") }}'
Hello, excellent video! I'm having trouble extracting text from the file. I don't know what encoding it uses because the output is unrecognizable characters, so it's not possible to store the vectors. Any idea?
@@ColeMedin Hi mate, thanks for your quick response. I have already identified the problem. The issue is that this workflow only recognizes files created within Google Drive using Google Docs, as it converts Google files to text format using Drive Download. If you try to upload files externally to the folder, Drive Download cannot correctly convert the binary file to text. The solution is to remove the Google document conversion and add a switch at the start that routes the workflow to "Extract from file" nodes based on the file type. For .doc and .docx files, I routed them to a web service running Apache Tika, and through an HTTP Request node with a PUT request, I send the file and receive the text in XML format. Now, the idea is to figure out how to filter the characters inside the text chain or use another solution that allows me to convert the .doc or .docx file into another format like .txt. That said, your video has been very inspiring, and I appreciate you taking the time to share this information. New sub here!
I've watche a lot of videos about n8n agents - most of them bs that has no real application for business. But your video and blueprint is totally gem! tysm
Thank you!! I haven't tested this with PDFs specifically myself but others have and this should work already for PDF documents! Otherwise, there is a specific "Extract from PDF" node in n8n you could use. So you could add a condition to the n8n workflow that routes to the regular text extractor when the file is not a PDF file, and route to "Extract from PDF" when it is.
@@ColeMedin Thanks for your reply and, again, for the great tutorial. I'm going to get this installed today or tomorrow and will try pdf's and let you know.
For anyone else wondering about this, here's how I was able to implement it: I set up an IF node after the download a file node with this condition {{ $binary.data.fileExtension }} = pdf - If True, it'd go into the extract pdf text node followed by a Set node, that takes the extracted value and saves it to data. It connects to the "insert into supabase" node. - If False, it'd flow normally, into the extract document text node
Supabase can be run locally like Qdrant! The main reason I have Qdrant in the local AI starter kit (the other video you referenced) is because that's just what was included in the package initially. But I actually do prefer using Supabase with PGVector for RAG! And I might be changing up that package to use Supabase instead of Qdrant and the vanilla Postgres it comes with.
How would you recurse through a single Google Drive directory? I'm doing a nightly sync with my local Obsidian vault and I'd love my AI Agent to get really smart on my years of notes
Great question! You can create a separate n8n workflow to scrape through an entire Google Drive directory pretty easily! I might be making a video on that in the future, but essentially in n8n you can set up a workflow to list all files in a directory and then in a loop go through each one and add it to the knowledgebase similar to how I do it in the video!
Hi Cole, do you know if it is possible to output a summary of the chat interaction? I wonder, for example, if I could add another AI assistant in this workflow to do that at the end of the chat iteration, but I don't know how to do that without screwing up the current workflow.
Probably would be better if I simple extract the chat history and sent it out of the workflow and then create another one to do that. Just don't know how to extract it and send it out. Any idea?
I have implemented this Workflow. I have my own "xmas plans.txt" file that is my document that in uploaded. I can see it in the Supabase DB. However, when I start a Chat (with a Q&A Agent, attached to a Supabase Retrieval node) and ask "what the the Christmas plans" or "what are the xmas plans", it doesn't provide an answer related to the RAG document. In the Chat UI, in the right-hand pane, I see the expected "output" with my DB row. It is just not passed on through the workflow (seemingly). What could I have done wrong?
Sorry you're running into this Dan! Which model are you using? I've had this happen with smaller models where they just seem to ignore the output from the RAG nodes.
Sorry, somehow it missed off the names of the Embeddings models I have tried (both, as above). mxbai-embed-large:latest (669 MB) and nomic-embed-text:latest (274 MB)
do you have something that is more code focused as an option less for document referencing but for storing conversations and things the agent learns so it can reference answers it doesn't know in a general sense.
Hey Cole - quick question... any chance you could describe the schema for the n8n_chat_history table? I could guess, but I'd rather not if I can help it. I was able to get the schema for the documents table by pausing the video and zooming in, but the n8n_chat_history wasn't shown. Thanks!
Great question! So N8N creates the n8n_chat_history table by itself which is why I didn't cover the schema. So I'd run the workflow to create the table and go into Supabase and take a look - it'll be set up for you automatically!
Hey Cole, I have a question regrading the aplication of this to a multi-tenant database where each tenat should have its own rag for its docuemnts. Is this posible? BTW great video!
Thank you very much! And great question! It is hard to get into this in great detail in a UA-cam comment, but you can easily do multi-tenant RAG using metadata filters within a vector DB. With metadata filtering you don't even need a separate index per tenant, though you can do that too. So basically the tenant ID (or company/customer ID, whatever you call it) will be a part of all requests into these workflows. Any inserts into the vector DB will have the tenant ID included in the metadata. Then any retrievals from the vector DB for this tenant can simply filter on the tenant ID in the metadata to guarantee that it is only retrieving information for that tenant. Let me know if this makes sense!
HI Cole! Thank you very much for the video and the detailed explanation, I was able to implement it here in minutes! But I have a question, and when the file inside the folder is deleted, is this action considered in the "Updated file" node?
You are so welcome, I'm glad you have it implemented! A deleted file won't trigger the file updated trigger, unfortunately. That's a limitation of N8N I hope they address soon since you have to create a custom webhook using the Google Drive API to actually watch for file deletions.
Hey my friend: how can I resolve this node problem in Supabase: Problem in node ‘Insert into Supabase Vectorstore‘ Error inserting: unsupported Unicode escape sequence 400 Bad Request
Hey, Cole! Amazing video. I've been working on my version of it. Quick question: I'm doing your workflow, but the original data for vector database is a CSV doc (an employees database)... So, I did a vector database where the batch size is 1:1 (each row became a row), to avoid breaking employees in pieces (lol) However, thus far, the vectors that are showing up in the file retrieval are presenting sub-par quality, where they are related to my query, but not enough. So.. 1) How would you set this up (in matters of parameters like batch size, files number etc.) 2) why is your retrieval files number = 4?. In my case, I've noticed the model is omitting/ignoring important stuff when I leave any value under 24... which is sad in a token expenditure perspective lol Anyways, thanks in advance. good day
Thank you very much and great questions! So the ideal setup depends a lot here on what kind of queries you want to make. RAG is really good at looking up specific employee records (example: "What is John Doe's salary?") but it is not good at answering questions that would require it to have the entire CSV in its context (example: "What is the average salary of all employees?"). This is because RAG will only have part of the CSV in its context unless you set it up to retrieve the entire document. If the CSV is small enough (rough estimate < 10k characters) you could just not chunk it at all when putting it in the knowledge base. That way it'll pull the entire document to answer questions. Otherwise your idea of one employee per record could work or you could do something like 10 employees per row. My retrieval is 4 because that is pretty standard when your chunk size is something like 1000-2000 characters and you want the RAG solution to lookup very specific information. But this is one of those parameters that you just have to play with a lot! That and the chunk size. The bigger the chunk size, the more information will be available with a smaller retrieval number. So maybe it has to be larger than 24 for you because your individual records are so small (since it's one per employee)?
@@ColeMedin thank you. really complete answer, right here. That makes sense. I’ll test that number higher. My automation has been working good so fine, with 70+ employees
@@leoplaysnotmuch under the document loader you have the “character splitter node”. I’ve set it to as high as my row can get (you can change from characters to tokens). Just make sure your rows aren’t too huge (mine with 1k tokens average are doing well)
I think RAG AI is really good with a small volume of data, like a 10-page PDF. However, when moving to something more serious, like over 50 pages and about 20 tables in a file, it doesn't respond as well. I'm referring to a file with more than 2,000 lines
Yes you certainly aren't wrong! There are a lot of factors that determine how well a RAG system performs, like the embedding model, the LLM model to handle retrieved chunks, your chunk size, how you split up your documents (especially for things like tabular data), your use of metadata filters, etc. A lot of advanced RAG techniques can be used too like reranking, hybrid search, knowledge graphs, etc. All of this becomes a lot more important once you have a lot of files or very large files like you are saying!
Thanks for pointing that out! PGVector is actually enabled as a part of the SQL script that I show how to run within the Supabase platform. But I certainly could have called that out more clearly!
hi @Cole thanks for this workflow I'm wondering is it possible to receive some sort of notification if there is no answer in the documents so I can see this request and later improve docs
Yeah you could! Basically you could tell the LLM to output something specific when it doesn't get any documents, and then have a part of the N8N workflow send a notification when that happens.
thanks for the video. I would like to analyze PDF studies of several hundred pages and make summaries to extract insights. The problem is that I can't copy/paste the pdf into GPT because it goes beyond the context window. Can I use RAG to do this use case? The RAG seems to be designed more for answering specific questions from a knowledge base than for synthesizing documents.
You bet! You are right that RAG is meant more for answering specific questions. To summarize very large PDFs like what you are trying to do, I would suggest having the LLM summarize something like 5-10 pages at a time, and then have a final prompt where you combine all the summaries together and ask it to make a final summary.
Thank you!! And n8n doesn't support using local vector DBs, although you can use a hosted Qdrant vector DB with n8n! If you wanted to use a local vector DB in the workflow, you could host it on the same machine that your n8n is self hosted on and then create a custom code step to work with the local vector DB.
For some reason, the agent prioritizes the memory and does not use the documents with the tool. When I remove the memory, it uses the documents tool perfectly. I don’t understand the logic behind this. This is the system message: You are a personal assistant responsible for answering questions using a corpus of documents. Before stating that you do not know the answer, you must use the 'documents' tool to search for relevant information in the vector store. This search should be your primary action every time you receive a question, unless it is absolutely clear that there is no useful information available. Always respond in Spanish. This version emphasizes the necessity of using the specified tools to ensure thorough document searches.
Interesting... I didn't run into this issue myself for this setup but I have had this happen with RAG agents before. This is especially common with models that aren't as powerful, so the easiest thing to try is to use a more powerful model if you can. Like try GPT-4o instead of GPT-4o-mini if you're using that. Also I'm curious - what kind of conversation did you have with the agent where the memory and knowledge base would have conflicting information? Is it because you added a document to the knowledge base half way through the conversation? Sometimes you have to restart the conversation when there is new information in the knowledge base, because the LLM doesn't necessarily understand that new info is available which is why it can resort to what it said earlier in the conversation.
i struggle with the exact same problem, have you found a solution for this? I'm using mistral-large-latest model. When i clean memory, it call the tool just fine, but the second question it doesn't call the tool
For those that have used this system and are considered new, maybe this will help you too. Vectorization Error: I got an error with SupaBase at the bottom part of the workflow. I first could get the connection to work, there was simple naming issue -I didnt know but used chat GPT to help me out. When I got it working it had another issue, something about the 'embedding size' so looked it up and followed the same sequence. troubleshooting. Basically it said to alter the table from (1536) to (3072). I'm not sure if others using this has come across this yet.
Great suggestion, I appreciate it! I'll be making more SQL AI agent videos in the future and this would be a great addition to what I've been thinking!
hi, just downloaded this workflow and replaced openai with anthropic plugin and it doesnt work. Is there some specific setup to use different model tha openAI?
great video, subscribed. i'm always looking for interesting n8n tutorials. Pity I'm using Groq at the moment so no idea how to do an Embedding tool with that. Maybe it's in the works
Thank you Martin! Groq is a fantastic product for LLMs! For embeddings, it is too bad you can't use Groq for that - but you can use Ollama or HuggingFace in n8n if you want to stay open source for the embeddings!
have you tried larger files? like a PDF that is 500kb or larger? My setup seems to choke on that in the embedding ollama part. I can have multiple small .doc files no problem.
this is a great video, but im struggeling with what happens when i delete a file from the drive folder. When i delete a file from my drive folder, it should also delete the vectors from the database, right? Somehow this doesnt happen? Any idea how to solve that?
Thank you! And yes unfortunately this is a limitation of n8n where there isn't a trigger for when files are deleted. So they aren't automatically removed from the vector DB. You'll have to either manually remove them through a workflow you set up and trigger yourself (not ideal), or integrate with the Google Drive API to set up a webhook for when files are deleted to trigger a workflow that removes that file from the knowledgebase.
I did not understand the purpose of n8n in this whole picture. Can I do it without n8n? Is there an alternative to n8n? Just want to see clearly where n8n fit in this picture
Great question! So n8n is what allows you to create this entire setup without having to code anything. The alternative to n8n would be to create this AI agent using Python and a library like LangChain. I do have a lot of content on that kind of thing as well! Or if you want other no code workflow automation alternatives to n8n, you could use Zapier or Make.com. But those are super expensive so I'd recommend n8n for sure!
Hey Cole, watched most of your vlogs but this is the first that I've had a crack at, after setting up n8n locally with docker following your suggestion. I tweaked the extract node to pdf as that all I'm using for the knowledge base, and ran OCR on the files first. My question and apologies if you've already answered it (I did have a quick scan of the comments) is it appears to only load the first pdf file in the folder, each time I run the workflow, how do i get it to fetch multiple files, so i can then summarise like you demonstrate in you 10 n8n tips?
Good question! So when you have multiple PDFs uploaded at once to Google Drive, it'll only trigger the n8n workflow once but all of the files will be available. So you can add a "loop" node into the workflow to process each of them exactly how I do it with just a single file.
Supabase changed the way how they store the connection details and I cant setup the connection. I used the direct connection parameters but I cant connect.
Hey Cole, Great video! I just have one question, it looks like the atribute "file_id" isn't getting uploaded on my supabase table on the column metadata. I get only "loc" > "lines" > "to": "from": "source": and "blobType": No file_id, and it makes the expression to delete duplicates not working
Thank you! Sorry you are running into that! Make sure you are including the file_id in the document splitter node! I would download the workflow JSON I link in the description and check that out. It's the node below "Supabase inserter" node.
I think the file from Google Drive was used only once to create the vector embeddings that were stored in the vector index. As you make a question, the AI agent would parse your prompt, and search for a vector embedding that has a similar value of what you asked for in your prompt. I'm also new to RAG, so I can't be 100% sure of what I just said. It seems that's what happens in there for me. If I'm wrong, please let me know.
I am certainly going to in the near future! n8n actually uses LangChain under the hood for their AI Agents, so this already is a LangChain + Supabase + n8n integration! I'm assuming you mean with code instead of n8n though?
Hello, is there a way to make the Google Drive File Updated Node to get all file updates rather than just the most recent one. I made changes to two of my files, but only the most recent one is recorded (and the others are missed). Again, the same for file created, if I put 5 new folders into my folder (that i am watching with the trigger) it will only pick up one of then. Thanks for the great videos
Great question! So when multiple files are updated within the same minute the workflow actually triggers once with multiple files as inputs. So you have to change up the workflow to loop through all the files passed in! You can set up a "loop" node at the beginning of the workflow and the rest can be essentially the same.
This is a great video, but I am facing an issue. I want to upload docx and pdfs. I already can do pdf uploads by using Extract from Pdf node, but for wordx it has been a hassle trying to figure this out. IF you can help with this that would be great. How can I extract text from docx.
Thank you! Sorry you're running into that issue though. It seems n8n doesn't support docx by default unfortunately, so you would have to convert it to a Google doc or text format (something like that) first.
Simply Amazing Brother, Thank you sooo much! And of course, it will be great to see more content related to RAG implementations using this setup.
Thank you, I appreciate it a lot! Yes, more RAG content coming soon!
Thank you for this video. I've been going through all these n8n tutorials on UA-cam, and this one nailed the exact 'yeah, but' gaps I had when trying to put everything together. The memory storage, the data garbage collection-nobody else seems to go over these details, probably because they're trying to sell services to "figure this out" for you. Really appreciate you putting this info out there for everyone. You just earned a sub!
I'm glad - thank you so much!!
Your tutorial are definitely better compared to others I've seen especially with supabase vector table maintenance in general. Thanks
Thank you very much! :D
🎯 Key points for quick navigation:
00:00:00 *🚀 Introduction to RAG AI Agent*
- Discusses the limitations of existing RAG tutorials and introduces a more robust solution using n8n and Supabase.
- Highlights the ease of combining n8n and Supabase for a production-ready, cost-effective RAG AI agent.
- Promises a step-by-step guide to set up the system in under 15 minutes.
00:01:25 *🛠️ Demonstration of RAG AI Agent*
- Demonstrates the initial setup of the RAG AI agent with an empty knowledge base.
- Shows the process of adding a document to the knowledge base and retrieving information using the agent.
- Highlights the integration of Google Docs and Supabase for document management.
00:04:03 *📚 Setting Up Supabase*
- Provides instructions for setting up Supabase for chat memory and vector database.
- Explains the use of Supabase's free tier and necessary credentials for integration.
- Details the steps to configure Supabase for use with n8n.
00:06:33 *💬 Workflow Execution in n8n*
- Describes the workflow execution process in n8n, including chat message triggers.
- Explains the use of GPT for chat models and the setup of chat memory using PostgreSQL.
- Discusses the use of Supabase as a vector store for RAG.
00:09:55 *🔄 Managing Document Updates*
- Details the process of managing document updates and avoiding duplicates in the vector database.
- Explains the importance of deleting old vectors before inserting new ones.
- Shows the steps to download, extract, and insert document content into Supabase.
00:14:35 *🎯 Finalizing the RAG AI Agent*
- Summarizes the complete workflow for a production-ready RAG AI agent.
- Suggests potential enhancements for better semantic search and keyword search.
- Encourages feedback and future content on n8n and Supabase integration.
Made with H
Duplicate of your other comment? Regardless I appreciate it a lot!
This integration of RAG AI Agents with platforms like n8n and Supabase is indeed a game changer! I love how it streamlines complex processes into manageable workflows.
I'm glad you found this useful! I agree it's a gamechanger - I've spent way too much time in the past building relatively complex AI Agents that I now know I wouldn't even need to code! haha
Perfect video my brother. I'm part of a large community here in Brazil and I will recommend it to colleagues. The video is so good that with my intermediate English, I was able to understand it perfectly. Thank you for this content!
Thanks for sharing! I appreciate it a ton man!
Blimey, this is a great tutorial. I’ve seen a couple of others on the same topic but I like the simplicity of yours. I’ll definitely give it a go!
Totally agree on channels not talking about keeping their databases duplicate free. Thanks for showing.
Of course Michael, I appreciate you calling that out!
Light RAG or graph database handles duplications for free
I was sceptic first, as like you say: many people just talk and do not actually teach. Blessings for you and this channel. Ff you keep that up you will be successful. This was a good and clear explanation.
Thank you very much - that seriously means a lot! I'm glad everything was clear as well! That's actually my primary goal :)
Amazing video Cole. This video is in a different league to the vast majority of other so called content creators who are more about getting clicks for headlines. They nearly ALWAYS show workflows that are missing the small details needed to create a real-world production ready agent. Subbed and liked. Looking forward to future content from you.
@@pumpituphomeboy Thank you very much, that means a ton! That's exactly what I'm aiming for - hitting the small details that make all the difference :)
nice! one pumpit... xD
15:02 Thank you for the great tutorial! Definitely keep it up. In case it helps you as an educator, the Document Loader Options (15:02) were vital for the success of the tutorial, but you went past them so fast I didn't realize they were there! In any case, I learned a ton from investigating that myself, and from the tutorial in general. Again, thank you!
Thank you very much and I appreciate you calling that out a ton! It's important for me to not miss anything big and I agree I should have covered the document loader options.
oh my days dude you have really REALLY sorted me out with this vid. So grateful.
Haha I'm glad to help man! Thank you!
Man i just discovered your content. You are a gem :) Thanks. I've work and test some things in n8n few years ago, but it's great to see what it can be done RN.
Thanks dude, that means a lot to me! That's awesome you were testing things out with N8N even a few years ago. I didn't even know about it back then!
Super high quality content. Unlike the rest, this really helped. Thanks for sharing.
Thank you, that means a lot!! My pleasure :)
Another thing I learned from your video is this: N8N has its own chatbot building system. I used Flowise for that. For me it would be interesting to see a video explaining the differences (and similarities) between N8N and Flowise in regards to AI agent/chatbot functionalities.
Thanks again for the top-notch content!
You are welcome and thank you for the suggestion! I know a lot of individuals who love using Flowise with n8n, so I do have this on my list of content for the future!
This was a really good tutorial. You delivered everything you promised, it was easy to follow and beats the hell out of the RAG agents I've built in the past. Thank you.
How would you set up the initial vector database with recursive uploads of files from Google Drive Folders to be inserted into the vector database? This would give you access to your existing files and then updates as everything is updated over time. Would be super useful for things like Standard Operating Procedures.
Thank you Michael - that means a lot!
This is a great question! You could create a separate n8n workflow that you run once that would go through all the folders in the Google Drive you want and index them into your vector DB. That's definitely doable with n8n. Then going forward the workflow I show in the video would handle new files or updated files.
The other option is you could create a new folder for RAG that you use in the workflow, and then just copy your SOPs or other files from wherever else you store them into your new folder for RAG (then potentially delete the original copies after). Not an ideal solution if you don't want to shuffle things around but that would be the quickest!
man, thank you! new subscriber here. Keep doing great job. You have a special gift and your videos are pleasure to watch.
Wow thank you so much for the very kind words! You bet man!
Would love to see this setup with the change being local host. Like port forwarding the thing (if needed) etc. so that the agent (although running locally) can be access on the internet or integrated via api or embed
Great walkthrough. For next topic, it will be cool to show a custom react application where chatbot is hosted and connected to n8n behind the scene. The file uploads, processing all that smooth connection that can be shown.
Thank you and I appreciate the suggestion! I will certainly be extending this with a frontend in the future, probably with React!
I second this. A video for a react front end would be amazing
brother, you are the best of all UA-camr about this n8n
Thanks man - that means a lot! :D
This is a really thorough walk through for documents. How would this be altered if you were wanting to embed database data from a table? This is where I keep running into problems...
I really love the enthusiasm and clarity you share with each video! Thank you for all the value!
Thanks Cameron, I appreciate it a lot!
Great question! There are a lot of ways to embed a table. The easiest way would probably be to turn the data into a CSV and split the CSV into chunks (making sure to not split in the middle of a row) and ingest that into the vector database. Another option is to ingest each record or a set of records as raw text into the vector database.
Hopefully that helps! What problems specifically have you run into?
@@ColeMedin Thank you so much for the reply.
This might seem very long and like i am taking advantage of your generosity of time.
I need two tables. One table that essentially would house my company wiki database (each row is it's own article, persay and thankfully this has been able to be sent one row at a time via webhook), and then another that would hold the extracted data from supplementary pdfs and documents. The idea would be that if the wiki table doesn't have the answer, the bot could then search the supplementary table.
The problem I'm running into is that when I follow the steps of your video, the table is saved as documents and will then only ever try to search a table called "documents" (even if it is renamed). If I add another table, no matter what I set the vector store as, it always looks for "documents". I was able to get some SQL code that created more tables and gave me some new functions, but no matter what, N8N will just try to look in "documents" and will say my new functions failed. I'm considering moving to pinecone or something else, but the idea of using postgres was very appealing.
I fully support the work you are doing and am certain the best is yet to come!
Of course and no worries, I'm glad to help!
Your use case sounds awesome and it's definitely something you could accomplish with n8n and Supabase.
It's hard to be concise in explaining this, but essentially you would replace the RAG tool I have in the video for the agent and replace it with two n8n workflow tools that would each search a different table with RAG given a query. So you would instruct the agent to use the workflow that queries the wiki table first, and if it fails to find the answer then call the tool to query the supplementary table. Does that make sense?
Thank you for all the awesome content!
You are most welcome!
So it turns out, your context window is the same with or without rag according to Claude, which means all rag can do is help you choose your context segments that will be handed to the llm in text format for content larger than the context. I don't really need rag, I just need a method to isolate segments from my total context related to a particular query and then use the reduced context for it, this is a little slower, but it's going to better than rag at choosing segments. Don't pay for databases, you have a specialist at your disposal, use them.
Yes that is true that RAG is "just" a way to get specific context into the LLM from a knowledgebase and it doesn't actually extend the context window of the LLM!
RAG is one of many methods to isolate segments from total context as you mentioned, and is generally considered the best/easiest to implement option. I'm curious to hear more about what exactly you are thinking of implementing. Sounds interesting!
@@ColeMedin I don't need an LLM to be a knowledge base, databases or even text files are much better at being accurate. Where LLMs are useful are being an interface to a database, my path is to create tools that can be 100% accurate rather than super fast, because of 2 main reasons, 1: agents will destroy themselves and do so far too easily because they don't really know where they messed up, even if they only do it 1% of the time, agents feeding off themselves will accumulate errors more than a quantum computer. 2: inference speeds will continue to become faster, so agents should spend a few more turns on verifying their every move when they interact with deterministic tools like a database, then you can leave them to complete a large task and not have it produce unusable garbage. Well, that's the plan anyway.
@@saxtant Yeah I see where you are coming from! How would you handle a large amount of unstructured data though? If you have a bunch of let's say meeting notes or standard operating procedures, it would be hard to get an LLM to query for those in a SQL database or text files without having RAG to do a similarity search and pick out what matters for the user question. What are your thoughts there? I'm really curious to know!
@@ColeMedin I would prefer not to let it get that far, I mean it's true my house is a complete mess, but I would prefer not to have a large pile of unstructured data and actually I don't, I may not have the same requirements as others, but my current method is all about having an LLM actually use a scratchpad for unstructured data, only for it to be sanitized in the background and removed from the scratchpad to keep the context small.
I'm working very hard to do exactly this. I made a script I call context manager which serves to create the memory of the LLM for a specific task. I'm working with large modules and applications. The context manager basically parses my code and then I select which part of the code I want to send to the LLM to reduce hallucinations and increase coherence and increase the accuracy of the response. My goal is to have this context determined by an AI that will read the script and select which part of the code are relevant to include in the coding LLM 'a context. The goal is to separate task on different calls as the more specific the question is and the more focused the context is better is the response.
So far I'm doing the context creation manually which is of course a pain. I think. I'm getting close to getting it to work.
The biggest issue I have is that LLM are incredibly inconsistent even with ultra clear instructions. Claude which is by far the best I've tried for coding does not follow instructions well. I get more structured responses from GPT4o latest but it is not on the same level as Claude for coding tasks.
I'm also working on an auto merging software that takes the code output and parses it for commands. Which then the program uses to accurately merge the code snippet into the original. I use LibCST for the operation. I'm Getting close to getting it work correctly but still lots of little kinks to fix. Anyone working on this who wants to collaborate I'd be interested.
Btw I'm not a programmer I'm an engineer with a knack for creation and understanding complex things. I rely on AI to write code. I am managing it and figuring out the problems and explaining it how things should work. So a talented coder could really help advance the project.
This is a very insightful video I am very curious why you rushed through the "Default Data Loader" which I though was the most important step to defining the "File_ID" mapping.
Thank you very much! I certainly should have spent more time on the Default Data Loader, that is something I would have done if I were to redo the video!
This is a great resource. Thanks for posting! One suggestion I would make is to add descriptive chapters. I find I have to seek around a bit to get to the parts I need.
Thank you and I appreciate the suggestion! I've been doing it for my more recent videos, but I honestly should go back and do it for this video. I'll be sure to take care of that!
@@ColeMedin Thanks! I just came back to the video to double-check something and having the chapters was super helpful.
Very good video, great for when you think about moving into production. Thanks for sharing.
Thank you, Abraham - my pleasure!! :)
You're my HERO!!! Thanks for this Masterclass!!!
Of course, I'm glad you found it useful!! :)
Awesome video! I’m curious if you ever do consulting for businesses. I’m currently working on a startup and I plan on using this technique but I’d love to pay a consultant who actually knows what they’re doing to make sure it’s set up properly from the get go. If you don’t take on new clients I’d also be interested in anyone you might recommend!
We need to focus on Localization of RAG and creating STACKS to host/use locally for the utmost security. Let's not forget you're training theses companies data for them by sending all your content into cloud services.
Great point! I definitely want to work towards making more content around localizing the entire system. Local setup is a bit more involved for things like the database which is why the full stack isn't local at this point, but I agree fully local is ideal!
@@ColeMedin While I'm a bit looney, using Redis, Ollama, Postgresql, n8n in a compose file setup properly could achieve exactly this though right?
@@redneq I haven't fully attempted or seen this built out before but yes this would be possible! And you certainly aren't loony haha, this sounds great and I know it's what a lot of people are looking for. Lot I'm working on behind the scenes related to local RAG!
@@redneq I do this every day, every nigt when i have time besides my fulltime job.. and what you will expirience is. That it is not building the prncibles and showing a bit nonsense, it is the companys data. You must show up with a solution for the real problem, its the data! This advice here shows just that he has just no idea of what hes doing.
Awesome observation
amazing, thanks, just what I was looking for
Glad to hear it! You bet man!
Awesome video. New to n8n. Can you share your video where you talk about setting up supabase and postgres. Getting an error here. Thanks!
What we need is a knowledge evaluator agents before RAG agents imo.
great guide! One issue I am having is my docs keep coming out in unicode on my rag upload. My extract text method keeps yieleding unicode... [Problem in node ‘Insert into Supabase Vectorstore‘
Error inserting: unsupported Unicode escape sequence 400 Bad Request] I can see my data from the extraction is all corrupt, tried a few documents, any ideas?
Fixed my first issue with txt file. How do I make sure PDFs and DOCX get parsed correctly? Seems to be an encoding issue.
Also any suggestions to deal with a large volume of docs needed to add to the vector store? If I try to drop a few files into drive it breaks obviously. How can I look through these items and process them in batch?
Thank you! For handling different file types like PDFs, you'll need to use the different "extract text" nodes in n8n made specifically for those file types. I actually have a video up going over how to do this!
ua-cam.com/video/T1ZKEmDN8AA/v-deo.html
The Google Drive trigger can handle multiple files at once, it just sends them all in a single workflow execution in n8n so you have to add a "loop" node to loop over each of the files uploaded and insert each one of them individually.
Question: Why don't you make the id in the table documents in supabase be the Google sheet id just change the id to be uuid instead of int8 that way you can remove that step of deleting and inserting the document ... supabase trigger will get called on update too. If you match the IDs this way you can reduce the step and avoid dupes and making sure that when the user is chatting if the document was deleted they fall into that inconsistent behavior as the document might not be there as it got updated and in your current flow you have to delete and insert in between these operations there can be hiccups.
btw this is awesome, Most videos about n8n aren't well thought out and you cover a lot. I think this is the best way to get non tech and tech people into RAG! Plus automation, agents all the way.
having trouble changing the int8 to uuid in supabase, saying "cannot cast type bigint to uuid".. Got a fix?
This is a FANTASTIC question, I appreciate you asking it! And thank you for the kind words as well!
The reason I don't do what you're describing is simply because of a limitation with n8n. When you insert documents into Supabase for RAG with the "Supabase Vectorstore" node, there isn't a way to customize the ID of each record to make it correspond to the Google sheet ID. At least not that I have found.
So this approach with the metadata is the way to work around that in n8n. I also really wanted to demonstrate a use case for metadata since it's a really important topic for RAG so it worked out well that this workaround was necessary.
If you coded this solution yourself, then what you are proposing would be a very good approach.
Perhaps I'm not understanding the suggestions, but with this suggestion, for longer documents which get chunked, wouldn't a single Google document ID be associated with multiple chunks? And wouldn't that be a problems because the Supabase ID needs to be unique for each chunk. Perhaps you could append the Google Doc ID to make it unique (e.g. {googleDocID}-{uniqueNumber}; then you could have access to the Google Doc ID. But this all depends on being able to manage the Supabase ID.
Hey Cole, great work! But do you know why the LangChain vector stuff (SQLEditor code) isn't installing properly on Supabase self-hosting? I entered the code and ran, and it seemed to work but no table was created. Thanks
Thank you!
That's a bummer - you aren't seeing any error message at all? My guess is you're looking at the incorrect database schema. Supabase typically uses the "public" schema but maybe it's looking in a different place by default in the self hosted Supabase?
Great work, Cole! Thank you for sharing.
Thank you Alex, my pleasure!!
i want somethign locally ran and or hosted. look forward to that.
My main question here is, for instance, let’s say i have a bunch of PDFs that needs to be RAGed.
How can i add them manually to that database instead of using n8n.
And if a PDF gets updated, how do i replace it ( i guess i would have to remove from database and re add ? )
Since you have to vectorize the PDFs, there isn't a super good way to do it manually that I know of. You could have AI help you create a script to parse through your PDFs though and add them for RAG!
To your last question - yes, remove from the DB and readd!
Bro doing God's work out here
Thank you Gabriel!! haha
🙏🙏🙏🙌🙌🙌
Thanks a lot for the video, which is very perfectly useful. I have planned to do a chatbot on Discord, if you have any plan to do some videos explaining how to connect to the Discord bot that would be very helpful. Thanks again and cannot wait for new videos.
You bet! Discord is a great platform so yeah I am considering that!
Excellent video. Great job, you have spotted a perfect example and the realization is very well executed.👏
Why do you need to connect the RAG AI Agent for the Postgres chat memory using the Postgres while you use the API connection for the document insertion ?
How can I better understand the steps and choices you took for the chunking part ? Could you briefly explain when your method is relevant ?
Thank you - I appreciate the kind words!
The Postgres chat memory is separate from the document insertion (which is for the knowledge retrieval). The chat memory is there so the agent can remember previous messages in the conversation. The document insertion and the Supabase documents table is there for the agent to be able to search across your documents to answer a question.
The reason the chat memory and document insertion use different credentials is mostly because the Postgres chat memory in n8n can use any Postgres database - it doesn't have to be Supabase. So those credentials are more "generic" to a Postgres database, while the credentials for RAG are specifically for Supabase. I hope that makes sense!
For chunking, a lot of it just comes down to playing around with the chunk and overlap sizes to figure out what works best for your use case! There aren't too many rules to follow there. I just 1000 for my chunk size just because that's a default used in a lot of applications.
@@ColeMedin thank you for the precise answers. It seems easy when YOU say it !!
Haha of course!! Let me know if you have any more questions!
This is a great and a detailed tutorial on using n8n for RAG. I noticed that although I add files to the google drive directory the app is monitoring, it still won't fetch those docs for the RAG part. Any idea why this might be happening
Thank you man, I appreciate it!
That's strange n8n isn't picking up on new files in your Google Drive... is your workflow switched to active? You'll have to toggle it to active in the top right of the workflow view to make the Google Drive triggers work! I'd also double check and make sure the triggers are set to use the same folder you are adding files to.
@@ColeMedin Thank you. It seems like it is not triggering on pdf files, just google doc files, I ended up adding a new text extraction for the pdf but that didn't work either. Maybe you can look into this in the future, unless if I am the only one having this issue. Thanks again for the great content
Interesting... the file type shouldn't change how well the trigger works! I'll have to look into it and test it out myself. And my pleasure :)
Thank you for demonstrating a no-code RAG implementation. It's quite impressive! However as some of the comments mentioned local RAG is a realistic requirement due to security concerns. However for a true real world rollout, there's going to be a need for a guardrail framework tied to role-permissions, and a test framework for validating expected / unexpected outcomes and I believe that will inevitably lead to a code-based implementation. For now, the n8n platform is great for prototyping different backends / engines. This was inspiring nonetheless!
Thank you Howard!
You make great points! For a lot of applications, there truly is a lot that goes into making them production ready with all the security/testing requirements. n8n is suitable for some applications (such as many website chatbots), but others you are right in saying the requirements will often lead to the need for a coded solution.
Thanks!
Thank you so much for your support!!
My workflow appears to only be triggered by the most recent file that was created. That is, if 2 files are uploaded between trigger events, Only one of those files will be added to Supabase. So, if your trigger runs every minute, and you exceed a file-upload rate of 1 file/minute, files will not be added to Supabase. Have you tested this scenario and ruled it out as a flaw in the workflow? I'm unable to overcome it with this current setup, as I understand it, at least.
Okay I did some more testing with this and right now the workflow does only handle one file being uploaded/updated at once within a minute. If you do more than one within a minute, it will trigger the workflow only once but there will be multiple items there - so you just have to loop over them to process them and index them.
Please create a video that we use all open source and setup locally
I do actually have a video out already where I have a similar setup that is fully local!
ua-cam.com/video/V_0dNE-H2gw/v-deo.html
your embeddings are everything, if your embeddings are chaotic and of poor quality this will greatly impact outputs.
Yes very true! I honestly should have focused on the embeddings even more here!
@@ColeMedin would make for a good video in of itself, not many people seem to want to explain how embeddings are created, how you can manage them or what the implications they can have for your outputs.
Yeah definitely - thank you for the suggestion! I have it in my list to create some content around more advanced RAG techniques. Embeddings are necessarily advanced but it does fall under the category of going into something specific to RAG in more detail!
Your tutorials are incredible compared to others. Thank you so much for your amazing work. I have a question regarding the approach to inserting PDF's with tables into vector stores. I am having a lot of trouble and still no good results. I tried your best practices (thank you so much for that) but you don't address this issue, how to insert PDF's with tables without loosing semantics. Would you address this in one of your videos, please? I see there are some solutions with coding but that is a no no for me... and from what I understand it requires pre-preparing the data, is that correct?
Can it ne done in n8n? Thank you and keep up the amazing work!!
Thank you for the kind words! Great question but super hard to answer over text - I am going to be making a video soon though (within the next couple of months) on how to do RAG better with CSVs!
Great video! Thanks!!!
Is it possible to make this product usable as a SaaS for searching for 10-20 different clients, with different databases and different accounts?
Thank you! You bet!
Yes you certainly can - I would do some research on RAG metadata - you can essentially segment the knowledgebase based on the tenant (each of your clients), so when you insert new knowledge you specify the tenant it is for, and then when you query knowledge for a single tenant you use the metadata filtering to limit the knowledge it can pull from to just that tenant.
Comrade, you really are fantastic with N8N! I really appreciate you sharing your knowledge here on YT. Another "follower". Congratulations. I am from BRAZIL
Thank you very much, that means a lot to me! 😃
I'm jealous you're in Brazil! Where I'm from (Midwest in the US) it's starting to get cold...
Great video. Just subscribed
Awesome, thank you very much! Glad you enjoyed it :)
Hi Cole, I got this working with your JSON file in minutes. Thank you for the detailed step-by-step instructions on getting APIs and SQL codes setup!
Q: How can I make this just search one or a few documents from the database?
You are so welcome, nice job!!
Could you clarify your question? For RAG it will only retrieve the documents you ingest into the vector DB. You could use metadata filtering to filter down on the documents you want to search. A lot of how that would work would depend on your use case and setup though!
tried, it works! thank you
Awesome you bet!
Interested in advanced RAG retrieval
Yeah more content coming soon for advanced RAG topics!
love these videos thank you for the value!
You are so welcome!!
Great Content! Can you make a tutorial about how to make agent that can be database manager ? Gives info to client, update record, and more for business?
Thank you!! And I actually am planning on making a video like this already! Are you thinking this agent would create custom queries to manage the database, or more just call tools that already have queries defined to perform certain actions? I am thinking of doing both but curious what you had in mind!
Are you self hosting n8n or using their cloud?
Self hosting!
Why supabase? what is benefit of supabase vs postgres? In this video not using any advance auth mechanism to validate user and I believe same can still be done with Postgres, right?
Supabase is running Postgres under the hood! It's just super convenient compared to hosting Postgres yourself and it has features like authentication and row level security for expanding this solution.
what to do when n8n is locally hosted what should be the app domain for google cloud?
Great question! When you have n8n hosted locally, you'll have to set up a domain and SSL certificate to be able to use Google since Google won't work with localhost unfortunately. I would suggest hosting n8n on a VPS using a service like Digital Ocean! n8n actually has great documentation on hosting in DigitalOcean:
docs.n8n.io/hosting/installation/server-setups/digital-ocean/
Hi Great video. For some reason supabase does not create the documents table, and i get an error message from the Delete Old Doc Rows node saying Bad request - please check your parameters
column n8n_chat_histories.metadata does not exist. What could be the issue? The rest of the nodes are seems to be working.
Thank you! Sorry you are running into this issue though!
The documents table isn't created by itself, you have to follow these instructions:
python.langchain.com/docs/integrations/vectorstores/supabase/
For the delete old docs node make sure the table is documents and not n8n_chat_histories!
@@ColeMedin Legend, thank you for helping out!!!
You bet!
Greate video, thank you for share.
Regarding, vector data base, do you suggest some self hosted solution?
Thank you, you are welcome!
For self hosting a vector DB I'd recommend either Qdrant or self-hosting Supabase and using PGVector.
Great tutorial! Ironically, I've been stuck for a couple of days on the google drive authentication. Not able to connect n8n with google for some reason :/
Thank you! I'm sorry you're having issues with the Google Drive authentication? What is the specific error message you are getting?
@@ColeMedin I believe the error was with the app being on Testing mode, it didn't allow any OAuth to proceed (contact developer - kinda funny when you are the 'developer' haha). I was able to troubleshoot by adding the email manually to google console, I guess whitelisting it for Oauth login while is Testing. I tried switching the app to Live mode but it required google review and approval, etc.
How can we setup so you can add pdfs, excel files, different types of files and have it extract them into text to be able to be embedded? I tried using a switch but it won't accept any schema from the binary files. Any ideas?
Great question Jack!
This would involve a bit more of an in depth flow where you would add branching to your workflow based on the file type. I've done this before with n8n so I know it's pretty easy to set up.
Basically you would add an "if" (router) node to your n8n workflow. If the file type from the Google Drive trigger (you could use the mimeType property) is a PDF, then you would route to a "Extract from PDF" node, if it's an Excel file, you would route to a "Extract from XLSX" file, etc. If you click on the "Extract from File" node in n8n you'll see a list of options that includes these.
Then you have all of those separate "extract from file" nodes route back to the rest of the workflow that handles the extracted text. Hopefully that all makes sense!
Curious on the google drive node, is there a way to monitor subfolders? The google drive nodes both have this call out, "Changes within subfolders won't trigger this node"
In other news, great video and thanks for sharing.
Thank you Don and great question!
That is correct that the Google Drive trigger node doesn't watch subfolders. If you want to monitor subfolders, you could set up triggers for those specific folders as well. Obviously that's only realistic if you don't have dozens of subfolders.
The best way to handle this without creating a trigger for each folder would be to use the Google Drive "Changes" API. You can basically tell Google Drive to alert you when a file is created/updated within a folder or your entire Drive by sending a request to a webhook which could be an n8n workflow (with a webhook trigger). This method does handle subfolders! So if you're really curious about extending this I would take a look at the Changes API!
@@ColeMedin Thank you, appreciate the help. I was able to work with the Google Drive search node, triggering the path every 15 minutes and searching for any files that have been modified in the last 15 minutes with this query:
modifiedTime > '{{DateTime.now().minus({ minutes: 15 }).toUTC().toFormat("yyyy-MM-dd'T'HH:mm:ss'Z'") }}'
Hello, excellent video! I'm having trouble extracting text from the file. I don't know what encoding it uses because the output is unrecognizable characters, so it's not possible to store the vectors. Any idea?
Thank you! What file type are you trying to extract from?
@@ColeMedin Hi mate, thanks for your quick response.
I have already identified the problem. The issue is that this workflow only recognizes files created within Google Drive using Google Docs, as it converts Google files to text format using Drive Download. If you try to upload files externally to the folder, Drive Download cannot correctly convert the binary file to text. The solution is to remove the Google document conversion and add a switch at the start that routes the workflow to "Extract from file" nodes based on the file type. For .doc and .docx files, I routed them to a web service running Apache Tika, and through an HTTP Request node with a PUT request, I send the file and receive the text in XML format. Now, the idea is to figure out how to filter the characters inside the text chain or use another solution that allows me to convert the .doc or .docx file into another format like .txt.
That said, your video has been very inspiring, and I appreciate you taking the time to share this information. New sub here!
I've watche a lot of videos about n8n agents - most of them bs that has no real application for business. But your video and blueprint is totally gem! tysm
Thank you, that means a lot to me!! I'm glad what I have here has real business application for you 😄
GREAT tutorial. How can this be modified to work with pdf's?
Thank you!! I haven't tested this with PDFs specifically myself but others have and this should work already for PDF documents!
Otherwise, there is a specific "Extract from PDF" node in n8n you could use. So you could add a condition to the n8n workflow that routes to the regular text extractor when the file is not a PDF file, and route to "Extract from PDF" when it is.
@@ColeMedin Thanks for your reply and, again, for the great tutorial. I'm going to get this installed today or tomorrow and will try pdf's and let you know.
@@moses5407 Of course and that sounds great!
For anyone else wondering about this, here's how I was able to implement it:
I set up an IF node after the download a file node with this condition
{{ $binary.data.fileExtension }} = pdf
- If True, it'd go into the extract pdf text node followed by a Set node, that takes the extracted value and saves it to data. It connects to the "insert into supabase" node.
- If False, it'd flow normally, into the extract document text node
Thank you, great video Cole! What is the difference between Supabase and Qdrant that you showed in another videos? Is Supabase not on-prem? Thank you,
Supabase can be run locally like Qdrant! The main reason I have Qdrant in the local AI starter kit (the other video you referenced) is because that's just what was included in the package initially. But I actually do prefer using Supabase with PGVector for RAG! And I might be changing up that package to use Supabase instead of Qdrant and the vanilla Postgres it comes with.
How would you recurse through a single Google Drive directory? I'm doing a nightly sync with my local Obsidian vault and I'd love my AI Agent to get really smart on my years of notes
Great question! You can create a separate n8n workflow to scrape through an entire Google Drive directory pretty easily! I might be making a video on that in the future, but essentially in n8n you can set up a workflow to list all files in a directory and then in a loop go through each one and add it to the knowledgebase similar to how I do it in the video!
Hi Cole, do you know if it is possible to output a summary of the chat interaction? I wonder, for example, if I could add another AI assistant in this workflow to do that at the end of the chat iteration, but I don't know how to do that without screwing up the current workflow.
Probably would be better if I simple extract the chat history and sent it out of the workflow and then create another one to do that. Just don't know how to extract it and send it out. Any idea?
Great question! So the chat memory is stored in the Supabase database so you can extract all the messages out based on the current session ID!
I have implemented this Workflow. I have my own "xmas plans.txt" file that is my document that in uploaded. I can see it in the Supabase DB. However, when I start a Chat (with a Q&A Agent, attached to a Supabase Retrieval node) and ask "what the the Christmas plans" or "what are the xmas plans", it doesn't provide an answer related to the RAG document. In the Chat UI, in the right-hand pane, I see the expected "output" with my DB row. It is just not passed on through the workflow (seemingly). What could I have done wrong?
Sorry you're running into this Dan!
Which model are you using? I've had this happen with smaller models where they just seem to ignore the output from the RAG nodes.
@@ColeMedin
mxbai-embed-large:latest (669 MB)
nomic-embed-text:latest. (274MB)
tried both
Sorry, somehow it missed off the names of the Embeddings models I have tried (both, as above). mxbai-embed-large:latest (669 MB) and nomic-embed-text:latest (274 MB)
do you have something that is more code focused as an option less for document referencing but for storing conversations and things the agent learns so it can reference answers it doesn't know in a general sense.
I don't have this yet, but I'm certainly going to be making content on this in the future
Thank you, very interesting. Question: are we obliged to "Vectorize/Embedd" documents which are below 20 pages? Or can we just use the Text Extractor?
Hey Cole - quick question... any chance you could describe the schema for the n8n_chat_history table? I could guess, but I'd rather not if I can help it. I was able to get the schema for the documents table by pausing the video and zooming in, but the n8n_chat_history wasn't shown. Thanks!
Great question! So N8N creates the n8n_chat_history table by itself which is why I didn't cover the schema. So I'd run the workflow to create the table and go into Supabase and take a look - it'll be set up for you automatically!
@@ColeMedin I wondered about that but I must have screwed something up. I appreciate the response.
You bet!
Hey Cole,
I have a question regrading the aplication of this to a multi-tenant database where each tenat should have its own rag for its docuemnts. Is this posible?
BTW great video!
Thank you very much! And great question!
It is hard to get into this in great detail in a UA-cam comment, but you can easily do multi-tenant RAG using metadata filters within a vector DB. With metadata filtering you don't even need a separate index per tenant, though you can do that too. So basically the tenant ID (or company/customer ID, whatever you call it) will be a part of all requests into these workflows. Any inserts into the vector DB will have the tenant ID included in the metadata. Then any retrievals from the vector DB for this tenant can simply filter on the tenant ID in the metadata to guarantee that it is only retrieving information for that tenant. Let me know if this makes sense!
HI Cole! Thank you very much for the video and the detailed explanation, I was able to implement it here in minutes! But I have a question, and when the file inside the folder is deleted, is this action considered in the "Updated file" node?
You are so welcome, I'm glad you have it implemented! A deleted file won't trigger the file updated trigger, unfortunately. That's a limitation of N8N I hope they address soon since you have to create a custom webhook using the Google Drive API to actually watch for file deletions.
Hey my friend: how can I resolve this node problem in Supabase:
Problem in node ‘Insert into Supabase Vectorstore‘
Error inserting: unsupported Unicode escape sequence 400 Bad Request
Seems like there might be an issue with your data? I would search around and see if someone else has seen this, I have not personally!
thank you so much for the value!! Just a question : why the file ID doesn't appear in my vector database in the column 'metadata' like you ?
Amazing video
Hey, Cole! Amazing video. I've been working on my version of it. Quick question:
I'm doing your workflow, but the original data for vector database is a CSV doc (an employees database)... So, I did a vector database where the batch size is 1:1 (each row became a row), to avoid breaking employees in pieces (lol)
However, thus far, the vectors that are showing up in the file retrieval are presenting sub-par quality, where they are related to my query, but not enough. So..
1) How would you set this up (in matters of parameters like batch size, files number etc.)
2) why is your retrieval files number = 4?. In my case, I've noticed the model is omitting/ignoring important stuff when I leave any value under 24... which is sad in a token expenditure perspective lol
Anyways, thanks in advance. good day
Thank you very much and great questions!
So the ideal setup depends a lot here on what kind of queries you want to make. RAG is really good at looking up specific employee records (example: "What is John Doe's salary?") but it is not good at answering questions that would require it to have the entire CSV in its context (example: "What is the average salary of all employees?"). This is because RAG will only have part of the CSV in its context unless you set it up to retrieve the entire document.
If the CSV is small enough (rough estimate < 10k characters) you could just not chunk it at all when putting it in the knowledge base. That way it'll pull the entire document to answer questions. Otherwise your idea of one employee per record could work or you could do something like 10 employees per row.
My retrieval is 4 because that is pretty standard when your chunk size is something like 1000-2000 characters and you want the RAG solution to lookup very specific information. But this is one of those parameters that you just have to play with a lot! That and the chunk size.
The bigger the chunk size, the more information will be available with a smaller retrieval number. So maybe it has to be larger than 24 for you because your individual records are so small (since it's one per employee)?
@@ColeMedin thank you. really complete answer, right here. That makes sense. I’ll test that number higher. My automation has been working good so fine, with 70+ employees
Glad it makes sense! And that's awesome!!
How did you do this? I am trying to change the chunk to match my row, but I don't know how.
@@leoplaysnotmuch under the document loader you have the “character splitter node”. I’ve set it to as high as my row can get (you can change from characters to tokens). Just make sure your rows aren’t too huge (mine with 1k tokens average are doing well)
I think RAG AI is really good with a small volume of data, like a 10-page PDF. However, when moving to something more serious, like over 50 pages and about 20 tables in a file, it doesn't respond as well. I'm referring to a file with more than 2,000 lines
Yes you certainly aren't wrong! There are a lot of factors that determine how well a RAG system performs, like the embedding model, the LLM model to handle retrieved chunks, your chunk size, how you split up your documents (especially for things like tabular data), your use of metadata filters, etc. A lot of advanced RAG techniques can be used too like reranking, hybrid search, knowledge graphs, etc. All of this becomes a lot more important once you have a lot of files or very large files like you are saying!
Hey Cole, you forgot to mention that pgvector extension needs to be enabled to work with embedding vectors in Supabase. Thanks,
Thanks for pointing that out! PGVector is actually enabled as a part of the SQL script that I show how to run within the Supabase platform. But I certainly could have called that out more clearly!
hi @Cole thanks for this workflow
I'm wondering is it possible to receive some sort of notification
if there is no answer in the documents
so I can see this request and later improve docs
Yeah you could! Basically you could tell the LLM to output something specific when it doesn't get any documents, and then have a part of the N8N workflow send a notification when that happens.
thanks for the video.
I would like to analyze PDF studies of several hundred pages and make summaries to extract insights.
The problem is that I can't copy/paste the pdf into GPT because it goes beyond the context window.
Can I use RAG to do this use case?
The RAG seems to be designed more for answering specific questions from a knowledge base than for synthesizing documents.
You bet! You are right that RAG is meant more for answering specific questions. To summarize very large PDFs like what you are trying to do, I would suggest having the LLM summarize something like 5-10 pages at a time, and then have a final prompt where you combine all the summaries together and ask it to make a final summary.
Thanks for the great video.
can Qdrant or Chroma be used locally instead of hosted Supabase?
Thank you!! And n8n doesn't support using local vector DBs, although you can use a hosted Qdrant vector DB with n8n! If you wanted to use a local vector DB in the workflow, you could host it on the same machine that your n8n is self hosted on and then create a custom code step to work with the local vector DB.
For some reason, the agent prioritizes the memory and does not use the documents with the tool. When I remove the memory, it uses the documents tool perfectly. I don’t understand the logic behind this. This is the system message: You are a personal assistant responsible for answering questions using a corpus of documents. Before stating that you do not know the answer, you must use the 'documents' tool to search for relevant information in the vector store. This search should be your primary action every time you receive a question, unless it is absolutely clear that there is no useful information available. Always respond in Spanish. This version emphasizes the necessity of using the specified tools to ensure thorough document searches.
Interesting... I didn't run into this issue myself for this setup but I have had this happen with RAG agents before.
This is especially common with models that aren't as powerful, so the easiest thing to try is to use a more powerful model if you can. Like try GPT-4o instead of GPT-4o-mini if you're using that.
Also I'm curious - what kind of conversation did you have with the agent where the memory and knowledge base would have conflicting information? Is it because you added a document to the knowledge base half way through the conversation? Sometimes you have to restart the conversation when there is new information in the knowledge base, because the LLM doesn't necessarily understand that new info is available which is why it can resort to what it said earlier in the conversation.
i struggle with the exact same problem, have you found a solution for this? I'm using mistral-large-latest model. When i clean memory, it call the tool just fine, but the second question it doesn't call the tool
For those that have used this system and are considered new, maybe this will help you too.
Vectorization Error:
I got an error with SupaBase at the bottom part of the workflow.
I first could get the connection to work, there was simple naming issue -I didnt know but used chat GPT to help me out.
When I got it working it had another issue, something about the 'embedding size' so looked it up and followed the same sequence. troubleshooting.
Basically it said to alter the table from (1536) to (3072).
I'm not sure if others using this has come across this yet.
The embedding size you need depends on the embedding model you are using! It's 1536 if you are using the small embedding model from OpenAI.
Can you make a video how to talk to a postgressql db?
Great suggestion, I appreciate it! I'll be making more SQL AI agent videos in the future and this would be a great addition to what I've been thinking!
@@ColeMedin Can't wait mate! Great videos.
Thanks man!
hi, just downloaded this workflow and replaced openai with anthropic plugin and it doesnt work. Is there some specific setup to use different model tha openAI?
No there isn't, it should work right after the switch! What is the error you are seeing?
great video, subscribed. i'm always looking for interesting n8n tutorials. Pity I'm using Groq at the moment so no idea how to do an Embedding tool with that. Maybe it's in the works
Thank you Martin!
Groq is a fantastic product for LLMs! For embeddings, it is too bad you can't use Groq for that - but you can use Ollama or HuggingFace in n8n if you want to stay open source for the embeddings!
have you tried larger files? like a PDF that is 500kb or larger? My setup seems to choke on that in the embedding ollama part. I can have multiple small .doc files no problem.
I have tried and have experienced the same thing before! Typically that means you need more memory on whatever instance you are hosting N8N with.
this is a great video, but im struggeling with what happens when i delete a file from the drive folder. When i delete a file from my drive folder, it should also delete the vectors from the database, right? Somehow this doesnt happen? Any idea how to solve that?
Thank you! And yes unfortunately this is a limitation of n8n where there isn't a trigger for when files are deleted. So they aren't automatically removed from the vector DB. You'll have to either manually remove them through a workflow you set up and trigger yourself (not ideal), or integrate with the Google Drive API to set up a webhook for when files are deleted to trigger a workflow that removes that file from the knowledgebase.
I did not understand the purpose of n8n in this whole picture. Can I do it without n8n? Is there an alternative to n8n?
Just want to see clearly where n8n fit in this picture
Great question! So n8n is what allows you to create this entire setup without having to code anything. The alternative to n8n would be to create this AI agent using Python and a library like LangChain. I do have a lot of content on that kind of thing as well!
Or if you want other no code workflow automation alternatives to n8n, you could use Zapier or Make.com. But those are super expensive so I'd recommend n8n for sure!
Hey Cole, watched most of your vlogs but this is the first that I've had a crack at, after setting up n8n locally with docker following your suggestion. I tweaked the extract node to pdf as that all I'm using for the knowledge base, and ran OCR on the files first. My question and apologies if you've already answered it (I did have a quick scan of the comments) is it appears to only load the first pdf file in the folder, each time I run the workflow, how do i get it to fetch multiple files, so i can then summarise like you demonstrate in you 10 n8n tips?
Good question! So when you have multiple PDFs uploaded at once to Google Drive, it'll only trigger the n8n workflow once but all of the files will be available. So you can add a "loop" node into the workflow to process each of them exactly how I do it with just a single file.
Supabase changed the way how they store the connection details and I cant setup the connection. I used the direct connection parameters but I cant connect.
Hey Cole, Great video! I just have one question, it looks like the atribute "file_id" isn't getting uploaded on my supabase table on the column metadata. I get only "loc" > "lines" > "to": "from": "source": and "blobType": No file_id, and it makes the expression to delete duplicates not working
Thank you! Sorry you are running into that! Make sure you are including the file_id in the document splitter node! I would download the workflow JSON I link in the description and check that out. It's the node below "Supabase inserter" node.
Basic question, but the chatbot is consulting the database from Supabase Vector Store or from the file in Google Drive?
I think the file from Google Drive was used only once to create the vector embeddings that were stored in the vector index. As you make a question, the AI agent would parse your prompt, and search for a vector embedding that has a similar value of what you asked for in your prompt.
I'm also new to RAG, so I can't be 100% sure of what I just said. It seems that's what happens in there for me. If I'm wrong, please let me know.
@treefreezoner is totally right! Thank you for the response and good question @regisaabh!
Can you make a video on Langchain, Supabase and n8n integration
I am certainly going to in the near future! n8n actually uses LangChain under the hood for their AI Agents, so this already is a LangChain + Supabase + n8n integration! I'm assuming you mean with code instead of n8n though?
+1 on this. So n8n can be categorized as “orchestration tool” like langchain or LlamaIndex?
Hello, is there a way to make the Google Drive File Updated Node to get all file updates rather than just the most recent one. I made changes to two of my files, but only the most recent one is recorded (and the others are missed). Again, the same for file created, if I put 5 new folders into my folder (that i am watching with the trigger) it will only pick up one of then. Thanks for the great videos
Great question! So when multiple files are updated within the same minute the workflow actually triggers once with multiple files as inputs. So you have to change up the workflow to loop through all the files passed in! You can set up a "loop" node at the beginning of the workflow and the rest can be essentially the same.
This is a great video, but I am facing an issue. I want to upload docx and pdfs. I already can do pdf uploads by using Extract from Pdf node, but for wordx it has been a hassle trying to figure this out. IF you can help with this that would be great. How can I extract text from docx.
Thank you! Sorry you're running into that issue though.
It seems n8n doesn't support docx by default unfortunately, so you would have to convert it to a Google doc or text format (something like that) first.