I was skeptical at first because, like you say, many people just talk and do not actually teach. Blessings to you and this channel. If you keep that up you will be successful. This was a good and clear explanation.
Man, I just discovered your content. You are a gem :) Thanks. I worked on and tested some things in n8n a few years ago, but it's great to see what can be done with it right now.
This integration of RAG AI Agents with platforms like n8n and Supabase is indeed a game changer! I love how it streamlines complex processes into manageable workflows.
I'm glad you found this useful! I agree it's a game changer - I've spent way too much time in the past building relatively complex AI Agents that I now know I wouldn't even need to code! haha
Another thing I learned from your video is this: N8N has its own chatbot building system. I used Flowise for that. For me it would be interesting to see a video explaining the differences (and similarities) between N8N and Flowise in regards to AI agent/chatbot functionalities. Thanks again for the top-notch content!
You are welcome and thank you for the suggestion! I know a lot of individuals who love using Flowise with n8n, so I do have this on my list of content for the future!
Amazing video Cole. This video is in a different league from the vast majority of other so-called content creators, who are more about getting clicks for headlines. They nearly ALWAYS show workflows that are missing the small details needed to create a real-world, production-ready agent. Subbed and liked. Looking forward to future content from you.
@@pumpituphomeboy Thank you very much, that means a ton! That's exactly what I'm aiming for - hitting the small details that make all the difference :)
15:02 Thank you for the great tutorial! Definitely keep it up. In case it helps you as an educator, the Document Loader Options (15:02) were vital for the success of the tutorial, but you went past them so fast I didn't realize they were there! In any case, I learned a ton from investigating that myself, and from the tutorial in general. Again, thank you!
Thank you very much and I appreciate you calling that out a ton! It's important for me to not miss anything big and I agree I should have covered the document loader options.
This was a really good tutorial. You delivered everything you promised, it was easy to follow and beats the hell out of the RAG agents I've built in the past. Thank you.
How would you set up the initial vector database with recursive uploads of files from Google Drive Folders to be inserted into the vector database? This would give you access to your existing files and then updates as everything is updated over time. Would be super useful for things like Standard Operating Procedures.
This is a great question! You could create a separate n8n workflow that you run once that would go through all the folders in the Google Drive you want and index them into your vector DB. That's definitely doable with n8n. Then going forward the workflow I show in the video would handle new files or updated files. The other option is you could create a new folder for RAG that you use in the workflow, and then just copy your SOPs or other files from wherever else you store them into your new folder for RAG (then potentially delete the original copies after). Not an ideal solution if you don't want to shuffle things around but that would be the quickest!
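In case anyone wants to see the shape of that one-time backfill outside of n8n, here's a rough Python sketch - the folder ID, credentials file, and the ingest_file helper are all placeholders for your own setup, not something from the video:

```python
# One-time backfill: list every file in a Drive folder and hand each one
# to whatever ingestion step you already use (chunk -> embed -> insert).
from google.oauth2.service_account import Credentials
from googleapiclient.discovery import build

FOLDER_ID = "your-google-drive-folder-id"  # placeholder

creds = Credentials.from_service_account_file(
    "service_account.json",  # placeholder credentials file
    scopes=["https://www.googleapis.com/auth/drive.readonly"],
)
drive = build("drive", "v3", credentials=creds)

page_token = None
while True:
    resp = drive.files().list(
        q=f"'{FOLDER_ID}' in parents and trashed = false",
        fields="nextPageToken, files(id, name, mimeType)",
        pageToken=page_token,
    ).execute()
    for f in resp.get("files", []):
        print("Indexing", f["name"])
        # ingest_file(f["id"], f["name"])  # placeholder for your chunk/embed/insert step
    page_token = resp.get("nextPageToken")
    if not page_token:
        break
```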
This is a great resource. Thanks for posting! One suggestion I would make is to add descriptive chapters. I find I have to seek around a bit to get to the parts I need.
Thank you and I appreciate the suggestion! I've been doing it for my more recent videos, but I honestly should go back and do it for this video. I'll be sure to take care of that!
Great walkthrough. For a next topic, it would be cool to show a custom React application where the chatbot is hosted and connected to n8n behind the scenes. The file uploads, processing - all that smooth connection could be shown.
This is a really thorough walk through for documents. How would this be altered if you were wanting to embed database data from a table? This is where I keep running into problems... I really love the enthusiasm and clarity you share with each video! Thank you for all the value!
Thanks Cameron, I appreciate it a lot! Great question! There are a lot of ways to embed a table. The easiest way would probably be to turn the data into a CSV and split the CSV into chunks (making sure to not split in the middle of a row) and ingest that into the vector database. Another option is to ingest each record or a set of records as raw text into the vector database. Hopefully that helps! What problems specifically have you run into?
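To make the "don't split mid-row" idea concrete, here's a small Python sketch - the batch size and file name are just examples, and each returned chunk repeats the header so it stands on its own:

```python
import csv

def chunk_csv(path: str, rows_per_chunk: int = 20) -> list[str]:
    """Split a CSV into text chunks made of whole rows, repeating the header."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]
    chunks = []
    for i in range(0, len(body), rows_per_chunk):
        batch = [header] + body[i : i + rows_per_chunk]
        chunks.append("\n".join(",".join(row) for row in batch))
    return chunks

# Each chunk can then be embedded and inserted into the vector DB as-is.
chunks = chunk_csv("employees.csv")  # placeholder file name
```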
@@ColeMedin Thank you so much for the reply. This might seem very long and like I am taking advantage of your generosity of time. I need two tables. One table would essentially house my company wiki database (each row is its own article, per se, and thankfully this has been able to be sent one row at a time via webhook), and another would hold the extracted data from supplementary PDFs and documents. The idea would be that if the wiki table doesn't have the answer, the bot could then search the supplementary table. The problem I'm running into is that when I follow the steps of your video, the table is saved as "documents" and the workflow will then only ever try to search a table called "documents" (even if it is renamed). If I add another table, no matter what I set the vector store as, it always looks for "documents". I was able to get some SQL code that created more tables and gave me some new functions, but no matter what, n8n will just try to look in "documents" and will say my new functions failed. I'm considering moving to Pinecone or something else, but the idea of using Postgres was very appealing. I fully support the work you are doing and am certain the best is yet to come!
Of course and no worries, I'm glad to help! Your use case sounds awesome and it's definitely something you could accomplish with n8n and Supabase. It's hard to be concise in explaining this, but essentially you would replace the RAG tool the agent has in the video with two n8n workflow tools that would each search a different table with RAG given a query. So you would instruct the agent to use the workflow that queries the wiki table first, and if it fails to find the answer, then call the tool that queries the supplementary table. Does that make sense?
We need to focus on localization of RAG and creating stacks to host/use locally for the utmost security. Let's not forget you're handing these companies training data by sending all your content into cloud services.
Great point! I definitely want to work towards making more content around localizing the entire system. Local setup is a bit more involved for things like the database which is why the full stack isn't local at this point, but I agree fully local is ideal!
@@redneq I haven't fully attempted or seen this built out before, but yes, this would be possible! And you certainly aren't loony haha, this sounds great and I know it's what a lot of people are looking for. There's a lot I'm working on behind the scenes related to local RAG!
@@redneq I do this every day, every night when I have time besides my full-time job, and what you will experience is this: the hard part is not building the principles and showing a bit of nonsense, it is the company's data. You must show up with a solution for the real problem, and that's the data! This advice here just shows that he has no idea what he's doing.
I've watched a lot of videos about n8n agents - most of them BS with no real application for business. But your video and blueprint are a total gem! tysm
Comrade, you really are fantastic with N8N! I really appreciate you sharing your knowledge here on YT. Another "follower". Congratulations. I am from BRAZIL
So it turns out your context window is the same with or without RAG, according to Claude, which means all RAG can do is help you choose the context segments that will be handed to the LLM in text format when the content is larger than the context. I don't really need RAG, I just need a method to isolate segments from my total context related to a particular query and then use the reduced context for it. This is a little slower, but it's going to be better than RAG at choosing segments. Don't pay for databases, you have a specialist at your disposal, use them.
Yes that is true that RAG is "just" a way to get specific context into the LLM from a knowledgebase and it doesn't actually extend the context window of the LLM! RAG is one of many methods to isolate segments from total context as you mentioned, and is generally considered the best/easiest to implement option. I'm curious to hear more about what exactly you are thinking of implementing. Sounds interesting!
@@ColeMedin I don't need an LLM to be a knowledge base; databases or even text files are much better at being accurate. Where LLMs are useful is as an interface to a database. My path is to create tools that can be 100% accurate rather than super fast, for 2 main reasons. 1: agents will destroy themselves, and do so far too easily, because they don't really know where they messed up. Even if they only do it 1% of the time, agents feeding off themselves will accumulate errors more than a quantum computer. 2: inference speeds will continue to get faster, so agents should spend a few more turns verifying their every move when they interact with deterministic tools like a database. Then you can leave them to complete a large task and not have them produce unusable garbage. Well, that's the plan anyway.
@@saxtant Yeah I see where you are coming from! How would you handle a large amount of unstructured data though? If you have a bunch of let's say meeting notes or standard operating procedures, it would be hard to get an LLM to query for those in a SQL database or text files without having RAG to do a similarity search and pick out what matters for the user question. What are your thoughts there? I'm really curious to know!
@@ColeMedin I would prefer not to let it get that far. I mean, it's true my house is a complete mess, but I would prefer not to have a large pile of unstructured data, and actually I don't. I may not have the same requirements as others, but my current method is all about having an LLM actually use a scratchpad for unstructured data, only for it to be sanitized in the background and removed from the scratchpad to keep the context small.
I'm working very hard to do exactly this. I made a script I call a context manager, which serves to create the memory of the LLM for a specific task. I'm working with large modules and applications. The context manager basically parses my code and then I select which parts of the code I want to send to the LLM to reduce hallucinations and increase coherence and the accuracy of the response. My goal is to have this context determined by an AI that will read the script and select which parts of the code are relevant to include in the coding LLM's context. The goal is to separate tasks across different calls, since the more specific the question and the more focused the context, the better the response. So far I'm doing the context creation manually, which is of course a pain, though I think I'm getting close to getting it to work. The biggest issue I have is that LLMs are incredibly inconsistent even with ultra-clear instructions. Claude, which is by far the best I've tried for coding, does not follow instructions well. I get more structured responses from the latest GPT-4o, but it is not on the same level as Claude for coding tasks. I'm also working on auto-merging software that takes the code output and parses it for commands, which the program then uses to accurately merge the code snippet into the original. I use LibCST for the operation. I'm getting close to getting it to work correctly, but there are still lots of little kinks to fix. Anyone working on this who wants to collaborate, I'd be interested. Btw, I'm not a programmer, I'm an engineer with a knack for creating and understanding complex things. I rely on AI to write the code; I manage it, figure out the problems, and explain how things should work. So a talented coder could really help advance the project.
Hi Cole, I got this working with your JSON file in minutes. Thank you for the detailed step-by-step instructions on getting APIs and SQL codes setup! Q: How can I make this just search one or a few documents from the database?
You are so welcome, nice job!! Could you clarify your question? For RAG it will only retrieve the documents you ingest into the vector DB. You could use metadata filtering to filter down on the documents you want to search. A lot of how that would work would depend on your use case and setup though!
Good content but the only way to say whether this is a good RAG system is to evaluate it via a metric like RAGAS. Any RAG system will spit out content. At this point as an AI engineer I expect all RAG videos to include an evaluation so that we can see how good it is at retrieving content and how reliable the content retrieved is. good luck !
Thank you and that's a very fair point! I appreciate you pointing out something that's missing in a respectful way. Honestly I would love to have some sort of evaluation in the video but I wanted to keep it concise. I will definitely be thinking of how I can incorporate that in future videos though without making it too much longer!
Thank you for demonstrating a no-code RAG implementation. It's quite impressive! However, as some of the comments mentioned, local RAG is a realistic requirement due to security concerns. And for a true real-world rollout, there's going to be a need for a guardrail framework tied to role permissions, plus a test framework for validating expected/unexpected outcomes, and I believe that will inevitably lead to a code-based implementation. For now, the n8n platform is great for prototyping different backends/engines. This was inspiring nonetheless!
Thank you Howard! You make great points! For a lot of applications, there truly is a lot that goes into making them production ready with all the security/testing requirements. n8n is suitable for some applications (such as many website chatbots), but others you are right in saying the requirements will often lead to the need for a coded solution.
@@ColeMedin would make for a good video in and of itself - not many people seem to want to explain how embeddings are created, how you can manage them, or what implications they can have for your outputs.
Yeah definitely - thank you for the suggestion! I have it on my list to create some content around more advanced RAG techniques. Embeddings aren't necessarily advanced, but the topic does fall under the category of going into something specific to RAG in more detail!
Question: Why don't you make the id in the documents table in Supabase be the Google Sheet ID? Just change the id to be a uuid instead of int8 - that way you can remove the step of deleting and inserting the document, and the Supabase trigger will get called on update too. If you match the IDs this way you reduce a step, avoid dupes, and make sure that if the user is chatting while a document is being updated they don't hit inconsistent behavior where the document might not be there because it's mid-update - in your current flow you have to delete and then insert, and in between those operations there can be hiccups. Btw this is awesome. Most videos about n8n aren't well thought out and you cover a lot. I think this is the best way to get non-tech and tech people into RAG! Plus automation, agents, all the way.
This is a FANTASTIC question, I appreciate you asking it! And thank you for the kind words as well! The reason I don't do what you're describing is simply because of a limitation with n8n. When you insert documents into Supabase for RAG with the "Supabase Vectorstore" node, there isn't a way to customize the ID of each record to make it correspond to the Google sheet ID. At least not that I have found. So this approach with the metadata is the way to work around that in n8n. I also really wanted to demonstrate a use case for metadata since it's a really important topic for RAG so it worked out well that this workaround was necessary. If you coded this solution yourself, then what you are proposing would be a very good approach.
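For anyone curious what that metadata workaround looks like as code, here's a hedged Python sketch of the same delete-then-insert pattern using the Supabase client - the table name matches the video, but the metadata key, credentials, and the way you produce chunks/embeddings are assumptions:

```python
# "Upsert" workaround: delete any existing chunks whose metadata file_id
# matches the updated file, then insert the fresh chunks.
from supabase import create_client

supabase = create_client("https://your-project.supabase.co", "service-role-key")  # placeholders

def upsert_document(file_id: str, chunks: list[str], embeddings: list[list[float]]) -> None:
    # 1. Remove the old chunks for this file (filter on the metadata JSONB column).
    supabase.table("documents").delete().filter(
        "metadata->>file_id", "eq", file_id
    ).execute()
    # 2. Insert the new chunks, keeping the file_id in metadata for next time.
    rows = [
        {"content": c, "embedding": e, "metadata": {"file_id": file_id}}
        for c, e in zip(chunks, embeddings)
    ]
    supabase.table("documents").insert(rows).execute()
```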
Perhaps I'm not understanding the suggestion, but with it, for longer documents that get chunked, wouldn't a single Google document ID be associated with multiple chunks? And wouldn't that be a problem, since the Supabase ID needs to be unique for each chunk? Perhaps you could append to the Google Doc ID to make it unique (e.g. {googleDocID}-{uniqueNumber}); then you would still have access to the Google Doc ID. But this all depends on being able to manage the Supabase ID.
Excellent video. Great job - you have spotted a perfect example and the realization is very well executed.👏 Why do you connect the Postgres chat memory for the RAG AI Agent using the Postgres credentials while you use the API connection for the document insertion? How can I better understand the steps and choices you took for the chunking part? Could you briefly explain when your method is relevant?
Thank you - I appreciate the kind words! The Postgres chat memory is separate from the document insertion (which is for the knowledge retrieval). The chat memory is there so the agent can remember previous messages in the conversation. The document insertion and the Supabase documents table are there so the agent can search across your documents to answer a question. The reason the chat memory and document insertion use different credentials is mostly because the Postgres chat memory in n8n can use any Postgres database - it doesn't have to be Supabase. So those credentials are more "generic" to a Postgres database, while the credentials for RAG are specifically for Supabase. I hope that makes sense! For chunking, a lot of it just comes down to playing around with the chunk and overlap sizes to figure out what works best for your use case! There aren't too many rules to follow there. I just used 1000 for my chunk size because that's a default used in a lot of applications.
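If it helps to see those two knobs in code, this is roughly what the chunking step looks like with a LangChain text splitter - the 1000/200 values are just common starting points to experiment from, and the file name is a placeholder:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

document_text = open("meeting_notes.txt").read()  # placeholder document

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # max characters per chunk
    chunk_overlap=200,  # characters shared between neighbouring chunks
)
chunks = splitter.split_text(document_text)
print(len(chunks), "chunks")
```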
Hey Cole, I have a question regarding the application of this to a multi-tenant database where each tenant should have its own RAG for its documents. Is this possible? BTW great video!
Thank you very much! And great question! It is hard to get into this in great detail in a UA-cam comment, but you can easily do multi-tenant RAG using metadata filters within a vector DB. With metadata filtering you don't even need a separate index per tenant, though you can do that too. So basically the tenant ID (or company/customer ID, whatever you call it) will be a part of all requests into these workflows. Any inserts into the vector DB will have the tenant ID included in the metadata. Then any retrievals from the vector DB for this tenant can simply filter on the tenant ID in the metadata to guarantee that it is only retrieving information for that tenant. Let me know if this makes sense!
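Here's a rough Python sketch of what that looks like at retrieval time, calling the match_documents function from the video's SQL setup - the tenant_id metadata key, embedding model, and credentials are assumptions for illustration:

```python
# Multi-tenant retrieval: pass the tenant ID in the metadata filter so
# match_documents only returns that tenant's chunks.
from openai import OpenAI
from supabase import create_client

openai_client = OpenAI()
supabase = create_client("https://your-project.supabase.co", "service-role-key")  # placeholders

def retrieve_for_tenant(question: str, tenant_id: str, k: int = 4) -> list[dict]:
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding
    result = supabase.rpc(
        "match_documents",
        {
            "query_embedding": embedding,
            "match_count": k,
            "filter": {"tenant_id": tenant_id},  # metadata filter per tenant
        },
    ).execute()
    return result.data
```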
Hey, Cole! Amazing video. I've been working on my version of it. Quick question: I'm doing your workflow, but the original data for the vector database is a CSV doc (an employee database)... So I built the vector database with a 1:1 batch size (each CSV row became its own record), to avoid breaking employees into pieces (lol). However, thus far, the vectors showing up in the file retrieval are sub-par quality - they are related to my query, but not enough. So... 1) How would you set this up (in terms of parameters like batch size, number of files retrieved, etc.)? 2) Why is your retrieval file number = 4? In my case, I've noticed the model is omitting/ignoring important stuff when I leave any value under 24... which is sad from a token expenditure perspective lol. Anyways, thanks in advance. Good day
Thank you very much and great questions! So the ideal setup depends a lot here on what kind of queries you want to make. RAG is really good at looking up specific employee records (example: "What is John Doe's salary?") but it is not good at answering questions that would require it to have the entire CSV in its context (example: "What is the average salary of all employees?"). This is because RAG will only have part of the CSV in its context unless you set it up to retrieve the entire document. If the CSV is small enough (rough estimate < 10k characters) you could just not chunk it at all when putting it in the knowledge base. That way it'll pull the entire document to answer questions. Otherwise your idea of one employee per record could work or you could do something like 10 employees per row. My retrieval is 4 because that is pretty standard when your chunk size is something like 1000-2000 characters and you want the RAG solution to lookup very specific information. But this is one of those parameters that you just have to play with a lot! That and the chunk size. The bigger the chunk size, the more information will be available with a smaller retrieval number. So maybe it has to be larger than 24 for you because your individual records are so small (since it's one per employee)?
@@ColeMedin Thank you, really complete answer right here. That makes sense. I'll test that number higher. My automation has been working well so far, with 70+ employees.
@@leoplaysnotmuch under the document loader you have the “character splitter node”. I’ve set it to as high as my row can get (you can change from characters to tokens). Just make sure your rows aren’t too huge (mine with 1k tokens average are doing well)
Hi Cole, do you know if it is possible to output a summary of the chat interaction? I wonder, for example, if I could add another AI assistant in this workflow to do that at the end of the chat iteration, but I don't know how to do that without screwing up the current workflow.
It would probably be better if I simply extract the chat history, send it out of the workflow, and then create another workflow to do that. I just don't know how to extract it and send it out. Any idea?
Great video, subscribed. I'm always looking for interesting n8n tutorials. Pity I'm using Groq at the moment, so I have no idea how to do an embedding tool with that. Maybe it's in the works.
Thank you Martin! Groq is a fantastic product for LLMs! For embeddings, it is too bad you can't use Groq for that - but you can use Ollama or HuggingFace in n8n if you want to stay open source for the embeddings!
This is a great and detailed tutorial on using n8n for RAG. I noticed that although I add files to the Google Drive directory the app is monitoring, it still won't fetch those docs for the RAG part. Any idea why this might be happening?
Thank you man, I appreciate it! That's strange n8n isn't picking up on new files in your Google Drive... is your workflow switched to active? You'll have to toggle it to active in the top right of the workflow view to make the Google Drive triggers work! I'd also double check and make sure the triggers are set to use the same folder you are adding files to.
@@ColeMedin Thank you. It seems like it is not triggering on pdf files, just google doc files, I ended up adding a new text extraction for the pdf but that didn't work either. Maybe you can look into this in the future, unless if I am the only one having this issue. Thanks again for the great content
Thanks for the video. For the Supabase retriever node in n8n, is there way to specify additional filters (like a tag) to get documents within that tag or is that something that needs to be done in the match function within Supabase?
My pleasure and good question! You can filter on a tag or any kind of metadata within the Vector Store node you attach to the retrieval tool. It's the only additional option in the vector store node (at least for Supabase) when you click on "Add Option" in the bottom middle.
Curious on the google drive node, is there a way to monitor subfolders? The google drive nodes both have this call out, "Changes within subfolders won't trigger this node" In other news, great video and thanks for sharing.
Thank you Don and great question! That is correct that the Google Drive trigger node doesn't watch subfolders. If you want to monitor subfolders, you could set up triggers for those specific folders as well. Obviously that's only realistic if you don't have dozens of subfolders. The best way to handle this without creating a trigger for each folder would be to use the Google Drive "Changes" API. You can basically tell Google Drive to alert you when a file is created/updated within a folder or your entire Drive by sending a request to a webhook which could be an n8n workflow (with a webhook trigger). This method does handle subfolders! So if you're really curious about extending this I would take a look at the Changes API!
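For anyone who wants to go that route, here's a hedged sketch of registering a Changes watch channel with the Drive v3 API that points at an n8n webhook - the webhook URL and channel ID are placeholders, and when the notification arrives you'd still call changes().list() with the saved page token to see what actually changed:

```python
# Ask Google Drive to POST to an n8n webhook whenever anything in the Drive
# changes (subfolders included). Assumes Drive v3 credentials are set up.
import uuid
from google.oauth2.service_account import Credentials
from googleapiclient.discovery import build

creds = Credentials.from_service_account_file(
    "service_account.json",  # placeholder credentials file
    scopes=["https://www.googleapis.com/auth/drive.readonly"],
)
drive = build("drive", "v3", credentials=creds)

start = drive.changes().getStartPageToken().execute()["startPageToken"]
drive.changes().watch(
    pageToken=start,
    body={
        "id": str(uuid.uuid4()),                      # channel id
        "type": "web_hook",
        "address": "https://your-n8n/webhook/drive",  # n8n webhook trigger URL (placeholder)
    },
).execute()
```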
@@ColeMedin Thank you, appreciate the help. I was able to work with the Google Drive search node, triggering the path every 15 minutes and searching for any files that have been modified in the last 15 minutes with this query: modifiedTime > '{{DateTime.now().minus({ minutes: 15 }).toUTC().toFormat("yyyy-MM-dd'T'HH:mm:ss'Z'") }}'
Hey mate, nice vid! I was waiting for something that will show me how to use n8n for my use cases. Do you think Supabase is better than Qdrant? From your video it seems you like it because you can keep track of the conversation history without needing something extra like MongoDB?
Thanks Henry! Yes, that is one of the big reasons I picked Supabase over Pinecone or Qdrant. It's nice to not have to have one platform for the conversation history and another for RAG! On top of that, Supabase makes it really easy to look at my knowledgebase in a tabular view which I can't do with non pgvector implementations.
@@ColeMedin Yeah, noticed that view as well. Super cool. Do you have the ability to see the vectors within each document upload? I think the dream would be to have a solution where you could just have everything within one platform. That would be money for sure.
@@HenrykAutomation Yes you can see the vectors for each document chunk in the Supabase table as well! I agree that's the dream, and it seems like Supabase is the right direction.
Thank you!! And n8n doesn't support using local vector DBs, although you can use a hosted Qdrant vector DB with n8n! If you wanted to use a local vector DB in the workflow, you could host it on the same machine that your n8n is self hosted on and then create a custom code step to work with the local vector DB.
Super stuff man! I am using self-hosted n8n and don't have the Postgres chat memory option. I think I'm on the latest version, 1.47.0 - is this something you need to install? Also there is no retrieve documents option off the RAG agent. When I copied your file these 2 parts were broken. Are you using the cloud version?
@@jackmermigas9465 Thank you! That is really strange. I'm also self hosted and did not have to install anything extra for postgres chat memory or document retrieval. It honestly seems like something is off with your installation... Is it self hosted with Docker or did you use NPM? I'd love to try to help you get it working!
Great content! Can you make a tutorial about how to make an agent that can be a database manager? One that gives info to clients, updates records, and more for a business?
Thank you!! And I actually am planning on making a video like this already! Are you thinking this agent would create custom queries to manage the database, or more just call tools that already have queries defined to perform certain actions? I am thinking of doing both but curious what you had in mind!
How would you recurse through a single Google Drive directory? I'm doing a nightly sync with my local Obsidian vault and I'd love my AI Agent to get really smart on my years of notes
Great question! You can create a separate n8n workflow to scrape through an entire Google Drive directory pretty easily! I might be making a video on that in the future, but essentially in n8n you can set up a workflow to list all files in a directory and then in a loop go through each one and add it to the knowledgebase similar to how I do it in the video!
Why Supabase? What is the benefit of Supabase vs. Postgres? This video doesn't use any advanced auth mechanism to validate users, and I believe the same can still be done with Postgres, right?
Supabase is running Postgres under the hood! It's just super convenient compared to hosting Postgres yourself and it has features like authentication and row level security for expanding this solution.
Great tutorial! Ironically, I've been stuck for a couple of days on the google drive authentication. Not able to connect n8n with google for some reason :/
@@ColeMedin I believe the error was with the app being in Testing mode - it didn't allow any OAuth to proceed (contact the developer - kinda funny when you are the 'developer' haha). I was able to troubleshoot by adding the email manually in the Google console, I guess whitelisting it for OAuth login while the app is in Testing. I tried switching the app to Live mode but it required Google review and approval, etc.
Thank you for the video. What wasn’t clear to me, and I would like to know before starting with this tutorial, is whether any of the documents I would upload are analyzed in the cloud or if everything is processed locally, including the database or even when the AI responds. Is everything done locally? I’d like to apply it at work, but there are certain privacy requirements that demand everything to be on-premise.
My pleasure and I appreciate your concern! This is not running entirely locally. The LLM I use is GPT, and I use the cloud (managed) version of Supabase. However, you could easily make this entirely local if you want! You could run the LLM yourself with Ollama (something like Llama 3.2 11b/90b) and then host Supabase yourself for the DB. And then n8n is self-hosted in this video so that is already local. I have a video on my channel where I show a similar setup that is entirely local! ua-cam.com/video/V_0dNE-H2gw/v-deo.html
Thank you!! Yes, this is only for files that are created/updated once the workflow is set up. But you can create another workflow in n8n that will get all existing files from a Google Drive folder and put those into the knowledge base! Basically you would create this workflow, run it once to add all existing files, and then not need it again.
Thanks for the video! I believe your workflow has a problem, but you did not run into it because your document is split into only one chunk. Indeed, if the doc is split into multiple chunks, the delete node runs multiple times and the download node runs multiple times as well (and so will the rest of the workflow, leading to duplicates in the DB). Checking the option "Execute once" is not an option if we want to maintain the capacity to handle multiple files at once. Would be happy to have a workaround for that 😁
My pleasure, and thanks for raising this! I think you might be right, but I'll have to test it! Regardless, the workflow triggers per document that is created/updated even if they are created within the same polling minute (there would just be multiple executions at once) so you actually should be able to check the "Execute once" option.
@@ColeMedin Thank you for your quick feedback ! You're making a good point. In my case though, I want to maintain the possibility to deal with multiple docs at once (as I want to trigger manually not auto). I will keep investigating around that 😀
Ah okay that makes sense! I believe there is a way in n8n to merge multiple output items into one. That way you can take all the records that are outputted from the Supabase delete node and combine them so you aren't running the rest of the workflow multiple times.
Thank you!! I haven't tested this with PDFs specifically myself but others have and this should work already for PDF documents! Otherwise, there is a specific "Extract from PDF" node in n8n you could use. So you could add a condition to the n8n workflow that routes to the regular text extractor when the file is not a PDF file, and route to "Extract from PDF" when it is.
@@ColeMedin Thanks for your reply and, again, for the great tutorial. I'm going to get this installed today or tomorrow and will try pdf's and let you know.
For anyone else wondering about this, here's how I was able to implement it: I set up an IF node after the download a file node with this condition {{ $binary.data.fileExtension }} = pdf - If True, it'd go into the extract pdf text node followed by a Set node, that takes the extracted value and saves it to data. It connects to the "insert into supabase" node. - If False, it'd flow normally, into the extract document text node
Thanks for pointing that out! PGVector is actually enabled as a part of the SQL script that I show how to run within the Supabase platform. But I certainly could have called that out more clearly!
@@jubinroy4987 I appreciate the suggestion! I haven't thought too much about making content around AI devops besides this but I really like the idea so I'll seriously consider it!
Hi, great video. For some reason Supabase does not create the documents table, and I get an error message from the Delete Old Doc Rows node saying "Bad request - please check your parameters: column n8n_chat_histories.metadata does not exist". What could be the issue? The rest of the nodes seem to be working.
Thank you! Sorry you are running into this issue though! The documents table isn't created by itself, you have to follow these instructions: python.langchain.com/docs/integrations/vectorstores/supabase/ For the delete old docs node make sure the table is documents and not n8n_chat_histories!
Hi Cole, I have another question: if the meeting note becomes multiple vector chunks, I've noticed that in n8n they are not passing the metadata back to the model, so the issue is that the AI model won't know which meeting date the chunk is coming from since that's stored under metadata. The alternative in my mind right now is to just store the meeting summary as a vector rather than the whole transcript to optimize tokens, include the meeting date and other info in the summary content, and then store the summary as just 1 chunk (1 chunk per meeting summary). Keen to know your thoughts. Thanks!
Yes you are right here, and I love your thoughts for an alternative! This is one of the common pitfalls with RAG in general where, if a chunk of text is retrieved from the middle of a document, the LLM doesn't necessarily know which document it is from since only the first chunk would have the title. So I like your approach to keep it all in one chunk per meeting note! Another option is to prefix every chunk with the document title. That would require a more custom implementation though, since you can't do that by default with the vector document inserter node in n8n.
I did not understand the purpose of n8n in this whole picture. Can I do it without n8n? Is there an alternative to n8n? I just want to see clearly where n8n fits in this picture.
Great question! So n8n is what allows you to create this entire setup without having to code anything. The alternative to n8n would be to create this AI agent using Python and a library like LangChain. I do have a lot of content on that kind of thing as well! Or if you want other no code workflow automation alternatives to n8n, you could use Zapier or Make.com. But those are super expensive so I'd recommend n8n for sure!
Thanks for the video. I would like to analyze PDF studies of several hundred pages and make summaries to extract insights. The problem is that I can't copy/paste the PDF into GPT because it goes beyond the context window. Can I use RAG for this use case? RAG seems to be designed more for answering specific questions from a knowledge base than for synthesizing documents.
You bet! You are right that RAG is meant more for answering specific questions. To summarize very large PDFs like what you are trying to do, I would suggest having the LLM summarize something like 5-10 pages at a time, and then have a final prompt where you combine all the summaries together and ask it to make a final summary.
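As a rough illustration of that chunk-then-combine idea in Python (the 10-page batch size, model name, and pypdf extraction are just assumptions):

```python
# Summarize a PDF that's too big for the context window: summarize ~10 pages
# at a time, then summarize the partial summaries.
from openai import OpenAI
from pypdf import PdfReader

client = OpenAI()

def summarize(text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": f"Summarize the key insights:\n\n{text}"}],
    )
    return resp.choices[0].message.content

reader = PdfReader("study.pdf")  # placeholder file
pages = [page.extract_text() or "" for page in reader.pages]

partials = [
    summarize("\n".join(pages[i : i + 10]))  # ~10 pages per call
    for i in range(0, len(pages), 10)
]
final_summary = summarize("\n\n".join(partials))  # combine the partial summaries
print(final_summary)
```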
How can we set this up so you can add PDFs, Excel files, and other file types and have it extract them into text so they can be embedded? I tried using a switch but it won't accept any schema from the binary files. Any ideas?
Great question Jack! This would involve a bit more of an in-depth flow where you would add branching to your workflow based on the file type. I've done this before with n8n so I know it's pretty easy to set up. Basically you would add an "if" (router) node to your n8n workflow. If the file type from the Google Drive trigger (you could use the mimeType property) is a PDF, then you would route to an "Extract from PDF" node; if it's an Excel file, you would route to an "Extract from XLSX" node, etc. If you click on the "Extract from File" node in n8n you'll see a list of options that includes these. Then you have all of those separate "extract from file" nodes route back to the rest of the workflow that handles the extracted text. Hopefully that all makes sense!
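If you were doing the same branching in code instead of n8n nodes, it would look roughly like this - the mime types and the two extraction libraries are assumptions, not what n8n uses internally:

```python
# Route text extraction based on mime type, mirroring the "if" node idea.
from openpyxl import load_workbook
from pypdf import PdfReader

XLSX = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"

def extract_text(path: str, mime_type: str) -> str:
    if mime_type == "application/pdf":
        reader = PdfReader(path)
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if mime_type == XLSX:
        wb = load_workbook(path, read_only=True)
        lines = []
        for sheet in wb:
            for row in sheet.iter_rows(values_only=True):
                lines.append(",".join("" if c is None else str(c) for c in row))
        return "\n".join(lines)
    # Fall back to treating the file as plain text.
    with open(path, encoding="utf-8", errors="ignore") as f:
        return f.read()
```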
Have you tried larger files? Like a PDF that is 500 KB or larger? My setup seems to choke on that in the Ollama embedding part. I can handle multiple small .doc files no problem.
This is a great video, but I am facing an issue. I want to upload docx files and PDFs. I can already do PDF uploads by using the Extract from PDF node, but for docx it has been a hassle trying to figure this out. If you can help with this, that would be great. How can I extract text from docx?
Thank you! Sorry you're running into that issue though. It seems n8n doesn't support docx by default unfortunately, so you would have to convert it to a Google doc or text format (something like that) first.
The documents are automatically ingested into Supabase when they are created or updated in the specific Google Drive folder that my n8n workflow is watching! So nothing has to be done with the chatbot itself. That is done in the bottom part of the workflow where the two workflow triggers are "File Created" and "File Updated".
Isn't this workflow always overwriting the database (every one minute), since the File Created trigger is always downloading the LAST file created? Great work by the way. Thank you.
Thank you very much! And no, it only polls for new files that are created every minute, meaning it won't run every minute unless a file was created in the last minute (and it can handle multiple files uploaded within a minute too). I hope that makes sense!
My workflow appears to only be triggered by the most recent file that was created. That is, if 2 files are uploaded between trigger events, only one of those files will be added to Supabase. So, if your trigger runs every minute and you exceed a file-upload rate of 1 file/minute, files will not be added to Supabase. Have you tested this scenario and ruled it out as a flaw in the workflow? I'm unable to overcome it with this current setup, as I understand it, at least.
Okay I did some more testing with this and right now the workflow does only handle one file being uploaded/updated at once within a minute. If you do more than one within a minute, it will trigger the workflow only once but there will be multiple items there - so you just have to loop over them to process them and index them.
Great video! Thank you so much. I have some questions though :) Does this also work with PDFs? From the "download files" step I get the impression it is just for Google Drive file types. Exactly at this step I get the following error when putting a load of PDFs in the Google Drive folder: "Multiple matching items for expression [item 0] An expression here won't work because it uses .item and n8n can't figure out the matching item. (There are multiple possible matches) Try using .first(), .last() or .all()[index] instead of .item or reference a different node." Any idea what this is? So far my Supabase database is completely empty - no change no matter if I put one PDF or 30 in the folder. When I just let it sit there, it ran again 4 minutes later with only one item from Drive and an error in "Insert into Supabase". Unfortunately for some reason it will not open the node to tell me the error... If I can find it, I will send it as another comment. I am running a Docker installation on a Synology NAS. THANK YOU SO MUCH!
Alright - when I create a Google Drive text file and execute the steps manually, it works. I now have my first element in the Supabase database. I guess it is just not made for PDF files... So I need to find a way to convert PDFs to Google Drive Docs. You don't happen to have a workflow for that, do you? :D
Thank you Josef! This workflow doesn't inherently work with PDFs unfortunately. However, you can make it work with PDFs pretty easily! There is a specific "Extract Document Text" node in n8n for extracting PDFs. So if you add the "Extract Document Text" node, there will be a list of file types and PDF is one of them. You'll just have to route the workflow to that node when the file type is PDF, and the other extract node when it isn't. I hope that makes sense! You could also convert the PDFs as you mentioned in your other comment, but it's probably a lot easier to not have to!
@@ColeMedin hello there :) Indeed I created a separate workflow just for converting PDF to txt. I now have a folder in my Google Drive called "PDF Converter" which contains an input and an output folder, so this solution is fine. I had some problems with documents that contained non-Unicode characters... and for some reason the workflow does not start for me... so if I put something in the Google Drive folder I linked in your workflow, it just does not start - I have to start it manually. I start it, it processes one document, I stop it, start it again, etc. until all documents are done.
@@JosefMaxHajda Sounds like a good solution! I'm sorry the trigger isn't working for you though... Is your workflow toggled to "active" in the top right of the builder?
Hi Cole. Thanks for another great video. I have a quick question about duplicates in the vector database, as you touched on at ua-cam.com/video/PEI_ePNNfJQ/v-deo.htmlsi=ac0SBGP3onyuT_I1&t=766 Is this functionality a limitation of n8n? Is it the same (not doing upserts) with any vector store (such as Qdrant)? Or is it a side effect of our table structure? (I mean, maybe we could instead have a table where the vector's ID is the file ID from Google Drive rather than a separate, self-generated ID.) Maybe this is a limitation of pgvector?!
Thank you and great question! It is more of a limitation of n8n since yes my workaround is really just making an upsert possible since inserting into the vector DB with n8n by default is a pure insert not upsert. Certainly not a limitation of PGVector!
At 9:29, you didn’t actually explain how to fill in the information for the Postgres chat memory node. At an earlier point, you entered Supabase and accessed the database and API sections and said we would need that later. Later, you accessed the Postgres chat memory and said that you had already shown how to configure it when you showed Supabase! Basically, you said: we will need this later, and when the time came, you said you had already explained how to set up the node earlier! And I couldn’t fill in the information for the node because there wasn’t actually that explanation in the tutorial, especially regarding the password.
Shoot I'm sorry! The password will be your Supabase DB password you get in the database tab in the Supabase settings. You set the database password when you create it in Supabase.
HELP ME COLE!!! I'm struggling with my RAG agent because it seems like he doesn't like to work that much... Despite the retrieval process bringing excellent information to the agent, most of the time he answers "I don't know". There's a "default" system prompt for the Retrieve Documents LLM that says: "...if you don't know the answer, just say that you don't know" and I think he is taking that tip really seriously. How can I face this problem???
Interesting... so you're seeing that the right document chunks are retrieved from Supabase but the LLM still says it doesn't know the answer? I'm curious which LLM you are using. I've seen it a lot with less powerful LLMs that it will have the right context and still think it doesn't know the answer. I'd try gpt-4o or Claude 3.5 Sonnet and see if that helps. Otherwise, sometimes starting a new conversation can help because the LLM gets into a weird loop where it constantly thinks it doesn't know anything.
Hello, is there a way to make the Google Drive File Updated node get all file updates rather than just the most recent one? I made changes to two of my files, but only the most recent one is recorded (and the others are missed). The same goes for File Created: if I put 5 new files into the folder I am watching with the trigger, it will only pick up one of them. Thanks for the great videos
Great question! So when multiple files are updated within the same minute the workflow actually triggers once with multiple files as inputs. So you have to change up the workflow to loop through all the files passed in! You can set up a "loop" node at the beginning of the workflow and the rest can be essentially the same.
I think the file from Google Drive was used only once, to create the vector embeddings that were stored in the vector index. When you ask a question, the AI agent parses your prompt and searches for a vector embedding with a similar value to what you asked for in your prompt. I'm also new to RAG, so I can't be 100% sure of what I just said - that's just what seems to happen there to me. If I'm wrong, please let me know.
Hi, thank you for your great tutorials first of all. However, I've tried to import your workflow JSON file and never had success - neither using self-hosted n8n nor n8n cloud. Any suggestion for fixing the issue? Thank you.
@@ColeMedin Thanks for your reply. I've tried to import your "Supabase_RAG_AI_Agent.json" and "n8n_Workflow_RAG_AI_Agent.json" and both showed "Could not import file: The file does not contain valid JSON data."
In the intro he said the most golden lines and hit the most frustrating point in my mind. You won't believe it - I gave this 5 days but I didn't find a proper way to implement it in real life, as most videos are carrying paid promotions and use those promoted tools in the project, which frustrates me, and then some are just poor quality, my gosh!!!!! And if you do find a way, it works only for personal use; when it comes to a production-ready build, everything fails.
Great question! When you have n8n hosted locally, you'll have to set up a domain and SSL certificate to be able to use Google since Google won't work with localhost unfortunately. I would suggest hosting n8n on a VPS using a service like Digital Ocean! n8n actually has great documentation on hosting in DigitalOcean: docs.n8n.io/hosting/installation/server-setups/digital-ocean/
I think RAG AI is really good with a small volume of data, like a 10-page PDF. However, when moving to something more serious, like over 50 pages and about 20 tables in a file, it doesn't respond as well. I'm referring to a file with more than 2,000 lines
Yes you certainly aren't wrong! There are a lot of factors that determine how well a RAG system performs, like the embedding model, the LLM model to handle retrieved chunks, your chunk size, how you split up your documents (especially for things like tabular data), your use of metadata filters, etc. A lot of advanced RAG techniques can be used too like reranking, hybrid search, knowledge graphs, etc. All of this becomes a lot more important once you have a lot of files or very large files like you are saying!
Great question! Yes - you can set up Ollama with n8n really easily since it's one of their chat model options! All you have to do is select the chat model on the AI Agent node, choose Ollama from the list of models, and then give the endpoint URL of your self-hosted LLM. docs.n8n.io/integrations/builtin/cluster-nodes/sub-nodes/n8n-nodes-langchain.lmchatollama
What's the best way to allow PDFs to be uploaded and have them work correctly? I switched the extract document text node to PDF, which is fine, but for some reason it won't upload to Supabase. The text splitter node doesn't work.
Interesting... what you are describing sounds correct to me! So I'd have to take a closer look to see your setup. What is the error you are getting with the text splitter node?
@@ColeMedin Managed to figure this part out - everything is uploading fine, but sometimes the model is not checking the vector store for answers and just uses internal knowledge.
Glad you figured it out! And this new issue comes up a lot, but there is an easy solution! If you click into the "Tools Agent" node in n8n, you'll see a place to edit the System Message. There you can provide more instructions to the LLM to tell it something like "Always check your knowledgebase for the answer, don't rely on your internal knowledge".
I was trying to recreate this RAG but ran into a problem: there is no description of the database table structure. Can someone give me a hint? We are storing the document embeddings in the Couchbase DB "documents" table - what are the column names and types?
Thank you for the video! I followed all the steps but get an error on the Supabase Vector Store output: "Error searching for documents: PGRST202 Could not find the function public.match_documents(filter, match_count, query_embedding) in the schema cache. Searched for the function public.match_documents with parameters filter, match_count, query_embedding or with a single unnamed json/jsonb parameter, but no matches were found in the schema cache." Can't work around it. Any ideas?
Of course! Thanks for walking through it and I'm sorry you're getting an error! Did you run the SQL commands outlined here in the document I showed in the video? supabase.com/docs/guides/ai/langchain?database-method=sql That third command that starts with "create function match_documents (" is what creates the function your error message says is missing. I would also make sure the "public" namespace for Supabase is your default namespace. You can determine that by going to the table editor in Supabase and making sure "public" is the schema chosen by default in the top left.
Sorry I am just seeing this now! Yes - there is an option to extract from PDF in n8n! If you select the "Extract Text" node there is a "Extract from PDF" option.
I am certainly going to in the near future! n8n actually uses LangChain under the hood for their AI Agents, so this already is a LangChain + Supabase + n8n integration! I'm assuming you mean with code instead of n8n though?
@ Sorry, I thought Supabase and Postgres were competitors, and you made one starter video with Postgres and the other with Supabase. Maybe I'm asking the wrong question. Mainly I want to know which chatbox plugin I can use in WP to interface with n8n.
Hmmm... and your Supabase connection is working for other nodes like inserting the vectors? What's the exact error you are getting? I haven't seen this one before!
Yeah I totally agree! Managed Supabase is cheaper to get started because the free tier is very generous, but yes, once you get to production and need to scale, a local Supabase instance is generally the way to go!
For some reason, the agent prioritizes the memory and does not use the documents with the tool. When I remove the memory, it uses the documents tool perfectly. I don’t understand the logic behind this. This is the system message: You are a personal assistant responsible for answering questions using a corpus of documents. Before stating that you do not know the answer, you must use the 'documents' tool to search for relevant information in the vector store. This search should be your primary action every time you receive a question, unless it is absolutely clear that there is no useful information available. Always respond in Spanish. This version emphasizes the necessity of using the specified tools to ensure thorough document searches.
Interesting... I didn't run into this issue myself for this setup but I have had this happen with RAG agents before. This is especially common with models that aren't as powerful, so the easiest thing to try is to use a more powerful model if you can. Like try GPT-4o instead of GPT-4o-mini if you're using that. Also I'm curious - what kind of conversation did you have with the agent where the memory and knowledge base would have conflicting information? Is it because you added a document to the knowledge base half way through the conversation? Sometimes you have to restart the conversation when there is new information in the knowledge base, because the LLM doesn't necessarily understand that new info is available which is why it can resort to what it said earlier in the conversation.
Great suggestion, I appreciate it! I'll be making more SQL AI agent videos in the future and this would be a great addition to what I've been thinking!
Thank you and yes for a lot of businesses that is certainly the case! But I do make videos on local solutions as well for this! Similar example to this video but local: ua-cam.com/video/V_0dNE-H2gw/v-deo.html
It's going to be better than a lot of the other RAG assistants on UA-cam for sure! I'm not guaranteeing it's the absolute best, which is why I didn't say "all" in the video (just "most"), but I have specifically seen a lot of mistakes repeated in other tutorials, which I called out. Specifically, I have seen n8n RAG tutorials with Pinecone that duplicate vectors whenever a document is reinserted into Pinecone after it is updated. The Pinecone insert function in n8n is not an "upsert" (update or insert)! So it won't remove/update the older vectors for a document, it will create entirely new ones no matter what. That means both the old and new versions of the document are still in the knowledgebase.
@@ColeMedin I have a business and for the last few weeks I've been waiting to pull the trigger on getting one of these set up, either paid or by myself (likely won't be me lol). I appreciate you citing an actual functional issue with previous setups. I've seen a lot of apps where they don't have "update row" on Google Sheets, which is the number one way to update a system based on a Google Sheet. I'm sure there are a few more nuggets to watch out for when setting these up. Duplicate knowledge bases are a big no-no.
You can actually connect to a local Ollama LLM inside the n8n workflow! It's one of the supported options when you select the chat model for the n8n Tools Agent. All you have to supply is the local URL (localhost + port) that the LLM is running on.
Good on you for working on creating a frontend for your chatbot! Making a frontend for AI agents certainly isn't a straightforward task! Have you tried using something like v0 to help? I would give that a shot - it has been a game changer for me. I will also be putting out more content in the near future around creating frontends for AI agents!
@@ColeMedin I have actually already built a frontend exactly with v0, game changer for me too. I just have trouble connecting the n8n chatbot to my React frontend made with v0. Anyways yes put out more content for creating frontends for agents, I got really good inspo from your videos lately :)
Glad you're getting good inspiration, I appreciate it! Could you clarify where you are running into issues connecting your n8n chatbot into React? One suggestion I have is to use a webhook trigger for the n8n agent. That way you can make a fetch request to call into your n8n workflow from the React application and basically turn the agent in an API endpoint. You can add authentication/authorization easily as well!
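Under the hood that webhook call is just a plain HTTP request - here's the shape of it in Python (the URL, the chatInput/sessionId fields, and the auth header are assumptions; in React it would be the equivalent fetch call):

```python
# Minimal sketch of calling an n8n webhook-triggered agent like an API endpoint.
import requests

resp = requests.post(
    "https://your-n8n/webhook/rag-agent",            # placeholder webhook URL
    json={"chatInput": "What are our onboarding SOPs?", "sessionId": "user-123"},
    headers={"Authorization": "Bearer your-token"},  # only if you add auth to the webhook
    timeout=60,
)
print(resp.json())
```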
@@ColeMedin yea using a webhook trigger is smart, I managed to pass the query from the webhook trigger to RAG AI Agent node using chatInput but for some reason it still doesn't start as it shows the error - The value "" is not supported and I've tried with many different queries. Maybe you could make a video where you create a quick UI in v0 then connect the n8n chatbot to that UI via a webhook trigger. Btw my project is already working locally without the n8n integration, I added you on LinkedIn if you want to see what it does, it's pretty interesting but I currently have no business use case maybe you can get inspiration yourself.
Hmm... that's a strange error... I am definitely planning on making a video in the future connecting an n8n agent with a webhook trigger to a frontend I build with v0! So I really hope that can help you out! Thanks for connecting on LinkedIn, I'll head over there now.
Very valid question! I chose Supabase for three main reasons for this workflow:
- It's easier to use
- It's cheaper (Pinecone has a free tier as well, but I've had relatively frequent outages with it)
- Pinecone is a better vector DB overall, but really only once you have tens of thousands of vectors, and it isn't better by too much from my experience
I will definitely be making content with Pinecone in the future, but wanted to start with Supabase for these reasons!
@@littledaddi3 I honestly do try to give the best technical advice I can! I am not sponsored by Supabase in any way so this is all honest advice with no tracking links :)
Friend, tell me how to do the same but with an online chat for my site - so that you can install an online chat on your site and have vector storage and ChatGPT behind it.
n8n actually allows you to embed an AI Agent chatbot like the one I created onto any website as a widget! I'd take a look at this resource to see how to use it: www.npmjs.com/package/@n8n/chat Let me know if this helps!
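If it helps, the embed is roughly the snippet below - this is from memory of the @n8n/chat README, so double-check the package docs for the exact options, and the webhook URL here is just a placeholder:

```typescript
// Rough sketch of embedding the n8n chat widget on a site with @n8n/chat.
// The webhookUrl comes from the Chat Trigger node of your active n8n workflow.
import "@n8n/chat/style.css";
import { createChat } from "@n8n/chat";

createChat({
  // Placeholder URL - copy the one shown on your Chat Trigger node
  webhookUrl: "https://your-n8n-instance.com/webhook/xxxx/chat",
});
```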
Great question Bruno! The Google Drive nodes can indeed handle PDF documents. The workflow does need to be updated to work with PDFs though because the "Extract Text" node I use in the video doesn't work with PDFs. There is a version of that node that is specifically for PDFs, however, so you can use that one for PDF documents!
Yeah it does! Once you add in support for PDF files, you can just dump them into your Google Drive folder and then ask questions based on their contents to the LLM once the PDFs are ingested into the knowledgebase!
@@ColeMedin Even with the whole Supabase setup in place? Will Supabase be able to fill the database by extracting info from my PDFs and have the workflow run properly?
Great question! All you would need to do is these two steps:
1. Change the workflow trigger from "Chat message" to Slack's "On New Message Posted to Channel" trigger. Here you can specify the specific Slack channel and any other filters that would trigger the n8n workflow.
2. Add a step at the end of the workflow to send the response back to the Slack channel with Slack's "Send a message" action.
For both the trigger and action I mentioned, you can search for Slack in n8n and you'll find both of those in the list that pops up once you select Slack!
I was sceptical at first because, like you say, many people just talk and do not actually teach. Blessings for you and this channel. If you keep that up you will be successful. This was a good and clear explanation.
Thank you very much - that seriously means a lot! I'm glad everything was clear as well! That's actually my primary goal :)
Man, I just discovered your content. You are a gem :) Thanks. I worked on and tested some things in n8n a few years ago, but it's great to see what can be done with it right now.
Thanks dude, that means a lot to me! That's awesome you were testing things out with N8N even a few years ago. I didn't even know about it back then!
This integration of RAG AI Agents with platforms like n8n and Supabase is indeed a game changer! I love how it streamlines complex processes into manageable workflows.
I'm glad you found this useful! I agree it's a gamechanger - I've spent way too much time in the past building relatively complex AI Agents that I now know I wouldn't even need to code! haha
oh my days dude you have really REALLY sorted me out with this vid. So grateful.
Haha I'm glad to help man! Thank you!
Another thing I learned from your video is this: N8N has its own chatbot building system. I used Flowise for that. For me it would be interesting to see a video explaining the differences (and similarities) between N8N and Flowise in regards to AI agent/chatbot functionalities.
Thanks again for the top-notch content!
You are welcome and thank you for the suggestion! I know a lot of individuals who love using Flowise with n8n, so I do have this on my list of content for the future!
Amazing video Cole. This video is in a different league to the vast majority of other so called content creators who are more about getting clicks for headlines. They nearly ALWAYS show workflows that are missing the small details needed to create a real-world production ready agent. Subbed and liked. Looking forward to future content from you.
@@pumpituphomeboy Thank you very much, that means a ton! That's exactly what I'm aiming for - hitting the small details that make all the difference :)
nice! one pumpit... xD
15:02 Thank you for the great tutorial! Definitely keep it up. In case it helps you as an educator, the Document Loader Options (15:02) were vital for the success of the tutorial, but you went past them so fast I didn't realize they were there! In any case, I learned a ton from investigating that myself, and from the tutorial in general. Again, thank you!
Thank you very much and I appreciate you calling that out a ton! It's important for me to not miss anything big and I agree I should have covered the document loader options.
Totally agree on channels not talking about keeping their databases duplicate free. Thanks for showing.
Of course Michael, I appreciate you calling that out!
man, thank you! new subscriber here. Keep doing great job. You have a special gift and your videos are pleasure to watch.
Wow thank you so much for the very kind words! You bet man!
Super high quality content. Unlike the rest, this really helped. Thanks for sharing.
Thank you, that means a lot!! My pleasure :)
This was a really good tutorial. You delivered everything you promised, it was easy to follow and beats the hell out of the RAG agents I've built in the past. Thank you.
How would you set up the initial vector database with recursive uploads of files from Google Drive Folders to be inserted into the vector database? This would give you access to your existing files and then updates as everything is updated over time. Would be super useful for things like Standard Operating Procedures.
Thank you Michael - that means a lot!
This is a great question! You could create a separate n8n workflow that you run once that would go through all the folders in the Google Drive you want and index them into your vector DB. That's definitely doable with n8n. Then going forward the workflow I show in the video would handle new files or updated files.
The other option is you could create a new folder for RAG that you use in the workflow, and then just copy your SOPs or other files from wherever else you store them into your new folder for RAG (then potentially delete the original copies after). Not an ideal solution if you don't want to shuffle things around but that would be the quickest!
This is a great resource. Thanks for posting! One suggestion I would make is to add descriptive chapters. I find I have to seek around a bit to get to the parts I need.
Thank you and I appreciate the suggestion! I've been doing it for my more recent videos, but I honestly should go back and do it for this video. I'll be sure to take care of that!
@@ColeMedin Thanks! I just came back to the video to double-check something and having the chapters was super helpful.
Great walkthrough. For a next topic, it would be cool to show a custom React application where the chatbot is hosted and connected to n8n behind the scenes - the file uploads, processing, and all of that smooth connection could be shown.
Thank you and I appreciate the suggestion! I will certainly be extending this with a frontend in the future, probably with React!
I second this. A video for a react front end would be amazing
This is a really thorough walk through for documents. How would this be altered if you were wanting to embed database data from a table? This is where I keep running into problems...
I really love the enthusiasm and clarity you share with each video! Thank you for all the value!
Thanks Cameron, I appreciate it a lot!
Great question! There are a lot of ways to embed a table. The easiest way would probably be to turn the data into a CSV and split the CSV into chunks (making sure to not split in the middle of a row) and ingest that into the vector database. Another option is to ingest each record or a set of records as raw text into the vector database.
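Here's a rough sketch of that row-aligned splitting if anyone wants to do it outside of n8n's default splitter - the chunk size is arbitrary and the CSV parsing is deliberately naive, so treat it as an illustration only:

```typescript
// Sketch: split a CSV into chunks without breaking rows, repeating the header
// in every chunk so each chunk stands on its own when embedded.
function chunkCsvByRows(csv: string, maxChars = 1000): string[] {
  const [header, ...rows] = csv.trim().split("\n"); // naive split; real CSVs can contain embedded newlines
  const chunks: string[] = [];
  let current: string[] = [];
  let length = header.length;

  for (const row of rows) {
    if (length + row.length > maxChars && current.length > 0) {
      chunks.push([header, ...current].join("\n"));
      current = [];
      length = header.length;
    }
    current.push(row);
    length += row.length + 1;
  }
  if (current.length > 0) chunks.push([header, ...current].join("\n"));
  return chunks;
}
```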
Hopefully that helps! What problems specifically have you run into?
@@ColeMedin Thank you so much for the reply.
This might seem very long and like i am taking advantage of your generosity of time.
I need two tables. One table that would essentially house my company wiki database (each row is its own article, per se, and thankfully this has been able to be sent one row at a time via webhook), and then another that would hold the extracted data from supplementary PDFs and documents. The idea would be that if the wiki table doesn't have the answer, the bot could then search the supplementary table.
The problem I'm running into is that when I follow the steps of your video, the table is saved as documents and will then only ever try to search a table called "documents" (even if it is renamed). If I add another table, no matter what I set the vector store as, it always looks for "documents". I was able to get some SQL code that created more tables and gave me some new functions, but no matter what, N8N will just try to look in "documents" and will say my new functions failed. I'm considering moving to pinecone or something else, but the idea of using postgres was very appealing.
I fully support the work you are doing and am certain the best is yet to come!
Of course and no worries, I'm glad to help!
Your use case sounds awesome and it's definitely something you could accomplish with n8n and Supabase.
It's hard to be concise in explaining this, but essentially you would replace the RAG tool the agent has in the video with two n8n workflow tools that would each search a different table with RAG given a query. So you would instruct the agent to use the workflow that queries the wiki table first, and if it fails to find the answer, then call the tool that queries the supplementary table. Does that make sense?
You're my HERO!!! Thanks for this Masterclass!!!
Of course, I'm glad you found it useful!! :)
We need to focus on localization of RAG and creating stacks to host/use locally for the utmost security. Let's not forget you're training these companies' models with your data for them by sending all your content into cloud services.
Great point! I definitely want to work towards making more content around localizing the entire system. Local setup is a bit more involved for things like the database which is why the full stack isn't local at this point, but I agree fully local is ideal!
@@ColeMedin While I'm a bit looney, using Redis, Ollama, Postgresql, n8n in a compose file setup properly could achieve exactly this though right?
@@redneq I haven't fully attempted or seen this built out before, but yes, this would be possible! And you certainly aren't loony haha, this sounds great and I know it's what a lot of people are looking for. There's a lot I'm working on behind the scenes related to local RAG!
@@redneq I do this every day, every night when I have time alongside my full-time job... and what you will experience is that it is not about building the principles and showing a bit of a demo - it is the company's data. You must show up with a solution for the real problem, and that's the data! This advice here just shows that he has no idea what he's doing.
Awesome observation
Very good video, great for when you think about moving into production. Thanks for sharing.
Thank you, Abraham - my pleasure!! :)
I've watched a lot of videos about n8n agents - most of them BS with no real application for business. But your video and blueprint are a total gem! tysm
Thank you, that means a lot to me!! I'm glad what I have here has real business application for you 😄
Comrade, you really are fantastic with N8N! I really appreciate you sharing your knowledge here on YT. Another "follower". Congratulations. I am from BRAZIL
Thank you very much, that means a lot to me! 😃
I'm jealous you're in Brazil! Where I'm from (Midwest in the US) it's starting to get cold...
So it turns out your context window is the same with or without RAG, according to Claude, which means all RAG can do is help you choose the context segments that will be handed to the LLM in text format when your content is larger than the context. I don't really need RAG, I just need a method to isolate segments from my total context related to a particular query and then use the reduced context for it. This is a little slower, but it's going to be better than RAG at choosing segments. Don't pay for databases, you have a specialist at your disposal, use them.
Yes that is true that RAG is "just" a way to get specific context into the LLM from a knowledgebase and it doesn't actually extend the context window of the LLM!
RAG is one of many methods to isolate segments from total context as you mentioned, and is generally considered the best/easiest to implement option. I'm curious to hear more about what exactly you are thinking of implementing. Sounds interesting!
@@ColeMedin I don't need an LLM to be a knowledge base - databases or even text files are much better at being accurate. Where LLMs are useful is as an interface to a database. My path is to create tools that can be 100% accurate rather than super fast, for two main reasons. 1: agents will destroy themselves, and do so far too easily, because they don't really know where they messed up; even if they only do it 1% of the time, agents feeding off themselves will accumulate errors more than a quantum computer. 2: inference speeds will continue to become faster, so agents should spend a few more turns verifying their every move when they interact with deterministic tools like a database. Then you can leave them to complete a large task and not have them produce unusable garbage. Well, that's the plan anyway.
@@saxtant Yeah I see where you are coming from! How would you handle a large amount of unstructured data though? If you have a bunch of let's say meeting notes or standard operating procedures, it would be hard to get an LLM to query for those in a SQL database or text files without having RAG to do a similarity search and pick out what matters for the user question. What are your thoughts there? I'm really curious to know!
@@ColeMedin I would prefer not to let it get that far, I mean it's true my house is a complete mess, but I would prefer not to have a large pile of unstructured data and actually I don't, I may not have the same requirements as others, but my current method is all about having an LLM actually use a scratchpad for unstructured data, only for it to be sanitized in the background and removed from the scratchpad to keep the context small.
I'm working very hard to do exactly this. I made a script I call a context manager, which serves to create the memory of the LLM for a specific task. I'm working with large modules and applications. The context manager basically parses my code, and then I select which parts of the code I want to send to the LLM to reduce hallucinations, increase coherence, and increase the accuracy of the response. My goal is to have this context determined by an AI that will read the script and select which parts of the code are relevant to include in the coding LLM's context. The goal is to separate tasks into different calls, as the more specific the question and the more focused the context, the better the response.
So far I'm doing the context creation manually, which is of course a pain. I think I'm getting close to getting it to work.
The biggest issue I have is that LLMs are incredibly inconsistent, even with ultra clear instructions. Claude, which is by far the best I've tried for coding, does not follow instructions well. I get more structured responses from GPT-4o latest, but it is not on the same level as Claude for coding tasks.
I'm also working on auto-merging software that takes the code output and parses it for commands, which the program then uses to accurately merge the code snippet into the original. I use LibCST for the operation. I'm getting close to getting it to work correctly, but there are still lots of little kinks to fix. Anyone working on this who wants to collaborate, I'd be interested.
Btw I'm not a programmer, I'm an engineer with a knack for creating and understanding complex things. I rely on AI to write code. I manage it, figure out the problems, and explain how things should work. So a talented coder could really help advance the project.
Great video, thank you for sharing.
Regarding the vector database, do you suggest a self-hosted solution?
Thank you, you are welcome!
For self hosting a vector DB I'd recommend either Qdrant or self-hosting Supabase and using PGVector.
Hi Cole, I got this working with your JSON file in minutes. Thank you for the detailed step-by-step instructions on getting APIs and SQL codes setup!
Q: How can I make this just search one or a few documents from the database?
You are so welcome, nice job!!
Could you clarify your question? For RAG it will only retrieve the documents you ingest into the vector DB. You could use metadata filtering to filter down on the documents you want to search. A lot of how that would work would depend on your use case and setup though!
Good content but the only way to say whether this is a good RAG system is to evaluate it via a metric like RAGAS. Any RAG system will spit out content. At this point as an AI engineer I expect all RAG videos to include an evaluation so that we can see how good it is at retrieving content and how reliable the content retrieved is. good luck !
Thank you and that's a very fair point! I appreciate you pointing out something that's missing in a respectful way. Honestly I would love to have some sort of evaluation in the video but I wanted to keep it concise. I will definitely be thinking of how I can incorporate that in future videos though without making it too much longer!
I believe that's entirely based on the embedding model and LLM- which you can easily swap out.
love these videos thank you for the value!
You are so welcome!!
Thank you for demonstrating a no-code RAG implementation. It's quite impressive! However, as some of the comments mentioned, local RAG is a realistic requirement due to security concerns. And for a true real-world rollout, there's going to be a need for a guardrail framework tied to role permissions, and a test framework for validating expected/unexpected outcomes, and I believe that will inevitably lead to a code-based implementation. For now, the n8n platform is great for prototyping different backends/engines. This was inspiring nonetheless!
Thank you Howard!
You make great points! For a lot of applications, there truly is a lot that goes into making them production ready with all the security/testing requirements. n8n is suitable for some applications (such as many website chatbots), but others you are right in saying the requirements will often lead to the need for a coded solution.
Your embeddings are everything - if your embeddings are chaotic and of poor quality, this will greatly impact outputs.
Yes very true! I honestly should have focused on the embeddings even more here!
@@ColeMedin It would make for a good video in and of itself - not many people seem to want to explain how embeddings are created, how you can manage them, or what implications they can have for your outputs.
Yeah definitely - thank you for the suggestion! I have it on my list to create some content around more advanced RAG techniques. Embeddings aren't necessarily advanced, but they do fall under the category of going into something specific to RAG in more detail!
Great work, Cole! Thank you for sharing.
Thank you Alex, my pleasure!!
Question: Why don't you make the id in the documents table in Supabase be the Google Sheet id? Just change the id to be a uuid instead of int8. That way you can remove the step of deleting and inserting the document - the Supabase trigger will get called on update too. If you match the IDs this way you can skip a step, avoid dupes, and make sure that while the user is chatting they don't hit that inconsistent behavior where the document might not be there because it got updated - in your current flow you have to delete and insert, and in between those operations there can be hiccups.
Btw this is awesome. Most videos about n8n aren't well thought out and you cover a lot. I think this is the best way to get non-tech and tech people into RAG! Plus automation, agents, all the way.
having trouble changing the int8 to uuid in supabase, saying "cannot cast type bigint to uuid".. Got a fix?
This is a FANTASTIC question, I appreciate you asking it! And thank you for the kind words as well!
The reason I don't do what you're describing is simply because of a limitation with n8n. When you insert documents into Supabase for RAG with the "Supabase Vectorstore" node, there isn't a way to customize the ID of each record to make it correspond to the Google sheet ID. At least not that I have found.
So this approach with the metadata is the way to work around that in n8n. I also really wanted to demonstrate a use case for metadata since it's a really important topic for RAG so it worked out well that this workaround was necessary.
If you coded this solution yourself, then what you are proposing would be a very good approach.
Perhaps I'm not understanding the suggestion, but with it, for longer documents which get chunked, wouldn't a single Google document ID be associated with multiple chunks? And wouldn't that be a problem because the Supabase ID needs to be unique for each chunk? Perhaps you could append to the Google Doc ID to make it unique (e.g. {googleDocID}-{uniqueNumber}); then you would still have access to the Google Doc ID. But this all depends on being able to manage the Supabase ID.
Bro doing God's work out here
Thank you Gabriel!! haha
🙏🙏🙏🙌🙌🙌
Excellent video. Great job, you have spotted a perfect example and the realization is very well executed.👏
Why do you connect the RAG AI Agent's Postgres chat memory using the Postgres credentials, while you use the API connection for the document insertion?
How can I better understand the steps and choices you took for the chunking part? Could you briefly explain when your method is relevant?
Thank you - I appreciate the kind words!
The Postgres chat memory is separate from the document insertion (which is for the knowledge retrieval). The chat memory is there so the agent can remember previous messages in the conversation. The document insertion and the Supabase documents table is there for the agent to be able to search across your documents to answer a question.
The reason the chat memory and document insertion use different credentials is mostly because the Postgres chat memory in n8n can use any Postgres database - it doesn't have to be Supabase. So those credentials are more "generic" to a Postgres database, while the credentials for RAG are specifically for Supabase. I hope that makes sense!
For chunking, a lot of it just comes down to playing around with the chunk and overlap sizes to figure out what works best for your use case! There aren't too many rules to follow there. I just used 1000 for my chunk size because that's a default used in a lot of applications.
@@ColeMedin thank you for the precise answers. It seems easy when YOU say it !!
Haha of course!! Let me know if you have any more questions!
Hey Cole,
I have a question regarding the application of this to a multi-tenant database where each tenant should have its own RAG for its documents. Is this possible?
BTW great video!
Thank you very much! And great question!
It is hard to get into this in great detail in a UA-cam comment, but you can easily do multi-tenant RAG using metadata filters within a vector DB. With metadata filtering you don't even need a separate index per tenant, though you can do that too. So basically the tenant ID (or company/customer ID, whatever you call it) will be a part of all requests into these workflows. Any inserts into the vector DB will have the tenant ID included in the metadata. Then any retrievals from the vector DB for this tenant can simply filter on the tenant ID in the metadata to guarantee that it is only retrieving information for that tenant. Let me know if this makes sense!
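To give a rough idea of what the retrieval side could look like with a tenant filter - this is a sketch against the match_documents function from the Supabase/LangChain setup used here, and the tenant_id metadata key is just an assumption (use whatever key you actually store):

```typescript
// Sketch: multi-tenant retrieval with a metadata filter on match_documents.
// Assumes the Supabase/LangChain documents table + match_documents function
// from this setup; "tenant_id" in the metadata is a hypothetical key.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

async function searchForTenant(queryEmbedding: number[], tenantId: string) {
  const { data, error } = await supabase.rpc("match_documents", {
    query_embedding: queryEmbedding,
    match_count: 4,
    // JSONB containment filter: only return chunks whose metadata includes this tenant_id
    filter: { tenant_id: tenantId },
  });
  if (error) throw error;
  return data;
}
```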
Hey, Cole! Amazing video. I've been working on my version of it. Quick question:
I'm doing your workflow, but the original data for the vector database is a CSV doc (an employees database)... So, I built the vector database with a 1:1 batch size (each CSV row became its own row/chunk), to avoid breaking employees into pieces (lol)
However, thus far, the vectors showing up in the file retrieval are of sub-par quality - they are related to my query, but not enough. So..
1) How would you set this up (in terms of parameters like batch size, number of files, etc.)?
2) Why is your retrieved files number = 4? In my case, I've noticed the model is omitting/ignoring important stuff when I leave any value under 24... which is sad from a token expenditure perspective lol
Anyways, thanks in advance. good day
Thank you very much and great questions!
So the ideal setup depends a lot here on what kind of queries you want to make. RAG is really good at looking up specific employee records (example: "What is John Doe's salary?") but it is not good at answering questions that would require it to have the entire CSV in its context (example: "What is the average salary of all employees?"). This is because RAG will only have part of the CSV in its context unless you set it up to retrieve the entire document.
If the CSV is small enough (rough estimate < 10k characters) you could just not chunk it at all when putting it in the knowledge base. That way it'll pull the entire document to answer questions. Otherwise your idea of one employee per record could work or you could do something like 10 employees per row.
My retrieval is 4 because that is pretty standard when your chunk size is something like 1000-2000 characters and you want the RAG solution to lookup very specific information. But this is one of those parameters that you just have to play with a lot! That and the chunk size.
The bigger the chunk size, the more information will be available with a smaller retrieval number. So maybe it has to be larger than 24 for you because your individual records are so small (since it's one per employee)?
@@ColeMedin Thank you, really complete answer right here. That makes sense. I'll test that number higher. My automation has been working well so far, with 70+ employees.
Glad it makes sense! And that's awesome!!
How did you do this? I am trying to change the chunk to match my row, but I don't know how.
@@leoplaysnotmuch under the document loader you have the “character splitter node”. I’ve set it to as high as my row can get (you can change from characters to tokens). Just make sure your rows aren’t too huge (mine with 1k tokens average are doing well)
Hi Cole, do you know if it is possible to output a summary of the chat interaction? I wonder, for example, if I could add another AI assistant in this workflow to do that at the end of the chat interaction, but I don't know how to do that without screwing up the current workflow.
It would probably be better if I simply extracted the chat history, sent it out of the workflow, and then created another workflow to do the summary. I just don't know how to extract it and send it out. Any idea?
Great question! So the chat memory is stored in the Supabase database so you can extract all the messages out based on the current session ID!
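If you want a starting point for pulling that out, here's a rough sketch with supabase-js. The table name n8n_chat_histories is the one this setup uses, but the session_id/message column names are assumptions from n8n's default Postgres memory schema - verify them against your own table:

```typescript
// Sketch: pull the full conversation for one session out of the n8n Postgres
// chat memory table so a separate workflow or script can summarize it.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

async function getChatHistory(sessionId: string) {
  const { data, error } = await supabase
    .from("n8n_chat_histories")   // table created by the Postgres chat memory node
    .select("message")            // assumed column holding each stored message
    .eq("session_id", sessionId)  // assumed column holding the conversation/session ID
    .order("id", { ascending: true });
  if (error) throw error;
  return data; // feed these messages into an LLM prompt to generate the summary
}
```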
Great video. Just subscribed
Awesome, thank you very much! Glad you enjoyed it :)
great video, subscribed. i'm always looking for interesting n8n tutorials. Pity I'm using Groq at the moment so no idea how to do an Embedding tool with that. Maybe it's in the works
Thank you Martin!
Groq is a fantastic product for LLMs! For embeddings, it is too bad you can't use Groq for that - but you can use Ollama or HuggingFace in n8n if you want to stay open source for the embeddings!
This is a great and a detailed tutorial on using n8n for RAG. I noticed that although I add files to the google drive directory the app is monitoring, it still won't fetch those docs for the RAG part. Any idea why this might be happening
Thank you man, I appreciate it!
That's strange n8n isn't picking up on new files in your Google Drive... is your workflow switched to active? You'll have to toggle it to active in the top right of the workflow view to make the Google Drive triggers work! I'd also double check and make sure the triggers are set to use the same folder you are adding files to.
@@ColeMedin Thank you. It seems like it is not triggering on PDF files, just Google Doc files. I ended up adding a new text extraction for the PDFs but that didn't work either. Maybe you can look into this in the future, unless I am the only one having this issue. Thanks again for the great content
Interesting... the file type shouldn't change how well the trigger works! I'll have to look into it and test it out myself. And my pleasure :)
Thanks for the video. For the Supabase retriever node in n8n, is there way to specify additional filters (like a tag) to get documents within that tag or is that something that needs to be done in the match function within Supabase?
My pleasure and good question! You can filter on a tag or any kind of metadata within the Vector Store node you attach to the retrieval tool. It's the only additional option in the vector store node (at least for Supabase) when you click on "Add Option" in the bottom middle.
Curious on the google drive node, is there a way to monitor subfolders? The google drive nodes both have this call out, "Changes within subfolders won't trigger this node"
In other news, great video and thanks for sharing.
Thank you Don and great question!
That is correct that the Google Drive trigger node doesn't watch subfolders. If you want to monitor subfolders, you could set up triggers for those specific folders as well. Obviously that's only realistic if you don't have dozens of subfolders.
The best way to handle this without creating a trigger for each folder would be to use the Google Drive "Changes" API. You can basically tell Google Drive to alert you when a file is created/updated within a folder or your entire Drive by sending a request to a webhook which could be an n8n workflow (with a webhook trigger). This method does handle subfolders! So if you're really curious about extending this I would take a look at the Changes API!
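For anyone exploring that route, registering the watch channel looks roughly like the sketch below with the googleapis Node client. I'm writing this from memory, so treat the field names as assumptions and check the Drive "Changes: watch" docs before relying on it; the n8n webhook URL is a placeholder:

```typescript
// Rough sketch (unverified): register a Google Drive Changes watch channel that
// pushes notifications to an n8n Webhook trigger.
import { google } from "googleapis";

async function watchDriveChanges(auth: any, n8nWebhookUrl: string) {
  const drive = google.drive({ version: "v3", auth });

  // Get a starting page token so you can later list exactly what changed
  const { data: start } = await drive.changes.getStartPageToken({});

  // Ask Drive to POST change notifications to the n8n webhook (placeholder URL)
  await drive.changes.watch({
    pageToken: start.startPageToken!,
    requestBody: {
      id: `n8n-rag-${Date.now()}`, // your own channel id
      type: "web_hook",
      address: n8nWebhookUrl,
    },
  });
}
```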
@@ColeMedin Thank you, appreciate the help. I was able to work with the Google Drive search node, triggering the path every 15 minutes and searching for any files that have been modified in the last 15 minutes with this query:
modifiedTime > '{{DateTime.now().minus({ minutes: 15 }).toUTC().toFormat("yyyy-MM-dd'T'HH:mm:ss'Z'") }}'
You are the man!
Thank you Marco :D
Hey mate, nice vid! I was waiting for something that will show me how to use n8n for my use cases.
Do you think Supabase is better than Qdrant? From your video it seems you like it because you can keep track of the conversation history without needing something extra like MongoDB?
Thanks Henry! Yes, that is one of the big reasons I picked Supabase over Pinecone or Qdrant. It's nice to not have to have one platform for the conversation history and another for RAG! On top of that, Supabase makes it really easy to look at my knowledgebase in a tabular view which I can't do with non pgvector implementations.
@@ColeMedin Yeah noticed that view as well. Super cool. Do you have the possibility to see the vectors within each document upload?
I think the dream would be to have a solution where you could just have everything within one platform. That would be money for sure.
@@HenrykAutomation Yes you can see the vectors for each document chunk in the Supabase table as well!
I agree that's the dream, and it seems like Supabase is the right direction.
Thanks for the great video.
can Qdrant or Chroma be used locally instead of hosted Supabase?
Thank you!! And n8n doesn't support using local vector DBs, although you can use a hosted Qdrant vector DB with n8n! If you wanted to use a local vector DB in the workflow, you could host it on the same machine that your n8n is self hosted on and then create a custom code step to work with the local vector DB.
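For reference, that custom code step could be as simple as hitting the local vector DB's REST API. Here's a rough sketch against a locally hosted Qdrant - the port, collection name, and payload shape are assumptions, and it presumes your runtime has global fetch available:

```typescript
// Sketch: query a locally hosted Qdrant instance from a custom code step.
// Assumes Qdrant's default port (6333) and a hypothetical "documents" collection;
// the query embedding would come from your embedding model earlier in the flow.
async function searchLocalQdrant(queryEmbedding: number[], limit = 4) {
  const response = await fetch("http://localhost:6333/collections/documents/points/search", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      vector: queryEmbedding,
      limit,
      with_payload: true, // return the stored text/metadata alongside the scores
    }),
  });
  const { result } = await response.json();
  return result; // [{ id, score, payload }, ...]
}
```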
Super stuff man! I am using self-hosted n8n and don't have the Postgres chat memory option. I think I'm on the latest version, 1.47.0 - is this something you need to install? Also, there is no retrieve documents option off the RAG agent. When I copied your file these 2 parts were broken. Are you using the cloud version?
@@jackmermigas9465 Thank you!
That is really strange. I'm also self hosted and did not have to install anything extra for postgres chat memory or document retrieval. It honestly seems like something is off with your installation... Is it self hosted with Docker or did you use NPM? I'd love to try to help you get it working!
@@ColeMedin Thanks for the response, I'm on a Render + Supabase installation. Is the latest version up to 1.58 now? Is that your version?
I figured it out - I was on an old version (1.47) and didn't have the Docker image set to latest! Now on 1.58 and everything is showing correctly, thank you!
Okay sweet, yep!!
Great content! Can you make a tutorial about how to make an agent that can be a database manager? One that gives info to clients, updates records, and more for a business?
Thank you!! And I actually am planning on making a video like this already! Are you thinking this agent would create custom queries to manage the database, or more just call tools that already have queries defined to perform certain actions? I am thinking of doing both but curious what you had in mind!
How would you recurse through a single Google Drive directory? I'm doing a nightly sync with my local Obsidian vault and I'd love my AI Agent to get really smart on my years of notes
Great question! You can create a separate n8n workflow to scrape through an entire Google Drive directory pretty easily! I might be making a video on that in the future, but essentially in n8n you can set up a workflow to list all files in a directory and then in a loop go through each one and add it to the knowledgebase similar to how I do it in the video!
Why Supabase? What is the benefit of Supabase vs. Postgres? This video doesn't use any advanced auth mechanism to validate the user, and I believe the same can still be done with Postgres, right?
Supabase is running Postgres under the hood! It's just super convenient compared to hosting Postgres yourself and it has features like authentication and row level security for expanding this solution.
Great tutorial! Ironically, I've been stuck for a couple of days on the google drive authentication. Not able to connect n8n with google for some reason :/
Thank you! I'm sorry you're having issues with the Google Drive authentication! What is the specific error message you are getting?
@@ColeMedin I believe the error was with the app being in Testing mode - it didn't allow any OAuth to proceed ("contact developer" - kinda funny when you are the 'developer' haha). I was able to troubleshoot by adding the email manually in the Google console, I guess whitelisting it for OAuth login while the app is in Testing. I tried switching the app to Live mode but it required Google review and approval, etc.
Thank you for the video. What wasn’t clear to me, and I would like to know before starting with this tutorial, is whether any of the documents I would upload are analyzed in the cloud or if everything is processed locally, including the database or even when the AI responds. Is everything done locally? I’d like to apply it at work, but there are certain privacy requirements that demand everything to be on-premise.
My pleasure and I appreciate your concern!
This is not running entirely locally. The LLM I use is GPT, and I use the cloud (managed) version of Supabase.
However, you could easily make this entirely local if you want! You could run the LLM yourself with Ollama (something like Llama 3.2 11b/90b) and then host Supabase yourself for the DB. And then n8n is self-hosted in this video so that is already local.
I have a video on my channel where I show a similar setup that is entirely local!
ua-cam.com/video/V_0dNE-H2gw/v-deo.html
@@ColeMedin Thank you so much!
Of course!
Hi =) 🎉 Good content!
But it seems to work for updated files only, is that right? Could it be done for all of the existing files? 😮
Thank you!!
Yes, this is only for files that are created/updated once the workflow is set up. But you can create another workflow in n8n that will get all existing files from a Google Drive folder and put those into the knowledge base! Basically you would create this workflow, run it once to add all existing files, and then not need it again.
Thanks for the video !
I believe your workflow has a problem, but you did not run into because your document is split in only one chunk. Indeed, if the doc is split in multiple chunks, the delete node runs multiple times and the download node runs multiple times as well (and so will the rest of the workflow, leading to duplicates in the DB). Checking the option "Execute once" is not an option if we want to maintain the capacity to handle multiple files at once.
Would be happy to have a workaround that 😁
My pleasure, and thanks for raising this!
I think you might be right, but I'll have to test it! Regardless, the workflow triggers per document that is created/updated even if they are created within the same polling minute (there would just be multiple executions at once) so you actually should be able to check the "Execute once" option.
@@ColeMedin Thank you for your quick feedback ! You're making a good point. In my case though, I want to maintain the possibility to deal with multiple docs at once (as I want to trigger manually not auto). I will keep investigating around that 😀
Ah okay that makes sense!
I believe there is a way in n8n to merge multiple output items into one. That way you can take all the records that are outputted from the Supabase delete node and combine them so you aren't running the rest of the workflow multiple times.
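One way to do that merge is a Code node set to "Run Once for All Items". Here's a rough sketch of what its contents could look like - the field names (like id) are assumptions, so keep whatever your downstream nodes actually need:

```typescript
// Sketch for an n8n Code node ("Run Once for All Items"): collapse all incoming
// items (e.g. the rows returned by the Supabase delete node) into a single item
// so the rest of the workflow only executes once.
const items = $input.all(); // $input is an n8n Code node global

return [
  {
    json: {
      deletedCount: items.length,
      // keep anything you still need downstream; "id" is an assumed field name
      deletedIds: items.map((item) => item.json.id),
    },
  },
];
```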
GREAT tutorial. How can this be modified to work with PDFs?
Thank you!! I haven't tested this with PDFs specifically myself but others have and this should work already for PDF documents!
Otherwise, there is a specific "Extract from PDF" node in n8n you could use. So you could add a condition to the n8n workflow that routes to the regular text extractor when the file is not a PDF file, and route to "Extract from PDF" when it is.
@@ColeMedin Thanks for your reply and, again, for the great tutorial. I'm going to get this installed today or tomorrow and will try pdf's and let you know.
@@moses5407 Of course and that sounds great!
For anyone else wondering about this, here's how I was able to implement it:
I set up an IF node after the download a file node with this condition
{{ $binary.data.fileExtension }} = pdf
- If True, it'd go into the extract pdf text node followed by a Set node, that takes the extracted value and saves it to data. It connects to the "insert into supabase" node.
- If False, it'd flow normally, into the extract document text node
Hey Cole, you forgot to mention that pgvector extension needs to be enabled to work with embedding vectors in Supabase. Thanks,
Thanks for pointing that out! PGVector is actually enabled as a part of the SQL script that I show how to run within the Supabase platform. But I certainly could have called that out more clearly!
Love it!
Thank you very much!!
Can we have a bunch of videos teaching maybe Devops or docker kind of stuff after which we can put together stacks like this 😊
@@jubinroy4987 I appreciate the suggestion! I haven't thought too much about making content around AI devops besides this but I really like the idea so I'll seriously consider it!
Hi, great video. For some reason Supabase does not create the documents table, and I get an error message from the Delete Old Doc Rows node saying "Bad request - please check your parameters:
column n8n_chat_histories.metadata does not exist". What could be the issue? The rest of the nodes seem to be working.
Thank you! Sorry you are running into this issue though!
The documents table isn't created by itself, you have to follow these instructions:
python.langchain.com/docs/integrations/vectorstores/supabase/
For the delete old docs node make sure the table is documents and not n8n_chat_histories!
@@ColeMedin Legend, thank you for helping out!!!
You bet!
Hi Cole, I have another question. If a meeting note becomes multiple chunks/vectors, I've noticed that in n8n the metadata isn't passed back to the model, so the AI model won't know which meeting date a chunk is coming from, since that's stored in the metadata.
The alternative in my mind right now is to just store the meeting summary as a vector instead of the whole transcript to optimize tokens, include the meeting date and other info in the summary content, and then store the summary as just 1 chunk (1 chunk per meeting summary).
Keen to know your thoughts.
Thanks!
Yes you are right here, and I love your thoughts for an alternative! This is one of the common pitfalls with RAG in general where if a chunk of text is retrieved from in the middle of a document, the LLM doesn't necessarily know which document it is from since only the first chunk would have the title. So I like your approach to keep it all in one chunk per meeting note!
Another option is to prefix every chunk with the document title. That would require a more custom implementation though, since you can't do that by default with the vector document inserter node in n8n.
@@ColeMedin Thanks Cole, I just tried both of your suggestions, both worked
Awesome, I'm so glad to hear!
I did not understand the purpose of n8n in this whole picture. Can I do it without n8n? Is there an alternative to n8n?
I just want to see clearly where n8n fits into this picture.
Great question! So n8n is what allows you to create this entire setup without having to code anything. The alternative to n8n would be to create this AI agent using Python and a library like LangChain. I do have a lot of content on that kind of thing as well!
Or if you want other no code workflow automation alternatives to n8n, you could use Zapier or Make.com. But those are super expensive so I'd recommend n8n for sure!
well done. subscribed
Thank you very much!!
thanks for the video.
I would like to analyze PDF studies of several hundred pages and make summaries to extract insights.
The problem is that I can't copy/paste the pdf into GPT because it goes beyond the context window.
Can I use RAG to do this use case?
The RAG seems to be designed more for answering specific questions from a knowledge base than for synthesizing documents.
You bet! You are right that RAG is meant more for answering specific questions. To summarize very large PDFs like what you are trying to do, I would suggest having the LLM summarize something like 5-10 pages at a time, and then have a final prompt where you combine all the summaries together and ask it to make a final summary.
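If you end up scripting that instead of doing it in n8n, the pattern is basically map-reduce summarization. Here's a rough sketch with the OpenAI SDK - the model name and chunk size are placeholders, and you'd plug in your own PDF text extraction first:

```typescript
// Sketch: summarize a very large document by summarizing slices first ("map"),
// then summarizing the partial summaries ("reduce").
import OpenAI from "openai";

const openai = new OpenAI();

async function summarize(text: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model
    messages: [{ role: "user", content: `Summarize the following text:\n\n${text}` }],
  });
  return completion.choices[0].message.content ?? "";
}

export async function summarizeLargeDocument(fullText: string, chunkChars = 20_000) {
  const chunks: string[] = [];
  for (let i = 0; i < fullText.length; i += chunkChars) {
    chunks.push(fullText.slice(i, i + chunkChars));
  }

  // "Map": summarize each slice independently
  const partials: string[] = [];
  for (const chunk of chunks) partials.push(await summarize(chunk));

  // "Reduce": combine the partial summaries into one final summary
  return summarize(`Combine these partial summaries into one summary:\n\n${partials.join("\n\n")}`);
}
```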
How can we set this up so you can add PDFs, Excel files, and other types of files and have it extract them into text to be embedded? I tried using a switch but it won't accept any schema from the binary files. Any ideas?
Great question Jack!
This would involve a bit more of an in depth flow where you would add branching to your workflow based on the file type. I've done this before with n8n so I know it's pretty easy to set up.
Basically you would add an "if" (router) node to your n8n workflow. If the file type from the Google Drive trigger (you could use the mimeType property) is a PDF, then you would route to an "Extract from PDF" node, if it's an Excel file, you would route to an "Extract from XLSX" node, etc. If you click on the "Extract from File" node in n8n you'll see a list of options that includes these.
Then you have all of those separate "extract from file" nodes route back to the rest of the workflow that handles the extracted text. Hopefully that all makes sense!
World is crazy. 15 mins to search through all my PDFs.
Indeed! 😃
have you tried larger files? like a PDF that is 500kb or larger? My setup seems to choke on that in the embedding ollama part. I can have multiple small .doc files no problem.
I have tried and have experienced the same thing before! Typically that means you need more memory on whatever instance you are hosting N8N with.
This is a great video, but I am facing an issue. I want to upload .docx files and PDFs. I can already do PDF uploads by using the Extract from PDF node, but for .docx it has been a hassle trying to figure this out. If you can help with this that would be great. How can I extract text from .docx?
Thank you! Sorry you're running into that issue though.
It seems n8n doesn't support docx by default unfortunately, so you would have to convert it to a Google doc or text format (something like that) first.
Do you insert the doc file into Supabase manually, or did you do it via the chatbot? I did not see that step.
The documents are automatically ingested into Supabase when they are created or updated in the specific Google Drive folder that my n8n workflow is watching! So nothing has to be done with the chatbot itself. That is done in the bottom part of the workflow where the two workflow triggers are "File Created" and "File Updated".
@@ColeMedin It works now by executing the given SQL code, thank you!
@@nmana9759 awesome, of course!
Isn't this workflow always overwriting the database (every one minute), since the File Created tool is always downloading the LAST file created?
Great work by the way. Thank you.
Thank you very much! And no, it only polls for new files that are created every minute, meaning it won't run every minute unless a file was created in the last minute (and it can handle multiple files uploaded within a minute too). I hope that makes sense!
Nice! Thats exactly what I wanted to hear :) thank you!
My workflow appears to only be triggered by the most recent file that was created. That is, if 2 files are uploaded between trigger events, only one of those files will be added to Supabase. So, if your trigger runs every minute and you exceed a file-upload rate of 1 file/minute, files will not be added to Supabase. Have you tested this scenario and ruled it out as a flaw in the workflow? I'm unable to overcome it with the current setup, as I understand it, at least.
Okay I did some more testing with this and right now the workflow does only handle one file being uploaded/updated at once within a minute. If you do more than one within a minute, it will trigger the workflow only once but there will be multiple items there - so you just have to loop over them to process them and index them.
great video! Thank you so much. I have some questions though :) Does this also work with PDF? From the Step "download files" I get the impression it is just for google drive filetypes.
Exactly in this step I get the following error when putting a load of PDFs in the google drive folder:
Multiple matching items for expression [item 0]
An expression here won't work because it uses .item and n8n can't figure out the matching item. (There are multiple possible matches)
Try using .first(), .last() or .all()[index] instead of .item or reference a different node.
Any idea what this is? So far my Supabase database is completely empty - no change no matter if I put one PDF or 30 in the folder.
When I just let it sit there, it ran again 4 minutes later with only one item from Drive, with an error in "insert into supabase". Unfortunately, for some reason it will not open the node to tell me the error... If I can find it, I will send it as another comment.
I am running a docker installation on a Synology NAS
THANK YOU SO MUCH!
well it just says internal error...
Alright, when I create a Google Drive text file and execute the steps manually it works. I now have my first element in the Supabase database. I guess it is just not made for PDF files... So I need to find a way to convert PDFs to Google Drive Docs. You don't happen to have a workflow for that, do you? :D
Thank you Josef!
This workflow doesn't inherently work with PDFs unfortunately. However, you can make it work with PDFs pretty easily! There is a specific "Extract Document Text" node in n8n for extracting PDFs. So if you add the "Extract Document Text" node, there will be a list of file types and PDF is one of them. You'll just have to route the workflow to that node when the file type is PDF, and the other extract node when it isn't. I hope that makes sense!
You could also convert the PDFs as you mentioned in your other comment, but it's probably a lot easier to not have to!
@@ColeMedin hello there :) Indeed I created a separate WF for just converting PDF to txt. I now have a folder in my google drive which is called "PDF Converter" and contains an input and an output folder. so this solution is fine. Had some problems with some documents which contained non unicode signs... and for some reason the WF does not start for me.... so if I put something in the google drive folder I linked in your workflow it just does not start... I have to start it manually. I start, it processes one document, I stop it, start it again etc. until all documents are done.
@@JosefMaxHajda Sounds like a good solution! I'm sorry the trigger isn't working for you though... Is your workflow toggled to "active" in the top right of the builder?
Great! Thanks a lot!!!
Thank you! My pleasure 😀
Hi Cole. Thanks for another great video. I have a quick question about duplicates in the vector database, as you touched on at ua-cam.com/video/PEI_ePNNfJQ/v-deo.htmlsi=ac0SBGP3onyuT_I1&t=766
Is this functionality a limitation of n8n? Is it the same (not doing upserts) with any vector store (such as Qdrant)? Or is it a side effect of our table structure? (I mean, maybe we could instead have a table where the ID is the file ID from Google Drive rather than a separately generated one.) Maybe this is a limitation of pgvector?!
Thank you and great question! It is more of a limitation of n8n since yes my workaround is really just making an upsert possible since inserting into the vector DB with n8n by default is a pure insert not upsert. Certainly not a limitation of PGVector!
Please create a video where we use all open source tools and set everything up locally.
I do actually have a video out already where I have a similar setup that is fully local!
ua-cam.com/video/V_0dNE-H2gw/v-deo.html
At 9:29, you didn’t actually explain how to fill in the information for the Postgres chat memory node. At an earlier point, you entered Supabase and accessed the database and API sections and said we would need that later.
Later, you accessed the Postgres chat memory and said that you had already shown how to configure it when you showed Supabase! Basically, you said: we will need this later, and when the time came, you said you had already explained how to set up the node earlier!
And I couldn’t fill in the information for the node because there wasn’t actually that explanation in the tutorial, especially regarding the password.
Shoot I'm sorry! The password will be your Supabase DB password you get in the database tab in the Supabase settings. You set the database password when you create it in Supabase.
HELP ME COLE!!! I'm struggling with my RAG Agent because it seems like it doesn't like to work that much... Even though the retrieval process brings excellent information to the agent, most of the time it answers "I don't know". There's a "default" system prompt for the Retrieve Documents LLM that says: "...if you don't know the answer, just say that you don't know" and I think it is taking that tip really seriously. How can I face this problem???
Interesting... so you're seeing that the right document chunks are retrieved from Supabase but the LLM still says it doesn't know the answer?
I'm curious which LLM you are using. I've seen it a lot with less powerful LLMs that it will have the right context and still think it doesn't know the answer. I'd try gpt-4o or Claude 3.5 Sonnet and see if that helps.
Otherwise, sometimes starting a new conversation can help because the LLM gets into a weird loop where it constantly thinks it doesn't know anything.
Hello, is there a way to make the Google Drive File Updated node get all file updates rather than just the most recent one? I made changes to two of my files, but only the most recent one was recorded (and the others were missed). Again, the same goes for File Created: if I put 5 new folders into the folder I am watching with the trigger, it will only pick up one of them. Thanks for the great videos.
Great question! So when multiple files are updated within the same minute the workflow actually triggers once with multiple files as inputs. So you have to change up the workflow to loop through all the files passed in! You can set up a "loop" node at the beginning of the workflow and the rest can be essentially the same.
Basic question, but is the chatbot consulting the Supabase Vector Store database or the file in Google Drive?
I think the file from Google Drive was used only once, to create the vector embeddings that were stored in the vector index. When you ask a question, the AI agent parses your prompt and searches for a vector embedding with a value similar to what you asked for in your prompt.
I'm also new to RAG, so I can't be 100% sure of what I just said - that's just what seems to happen there to me. If I'm wrong, please let me know.
@treefreezoner is totally right! Thank you for the response and good question @regisaabh!
Hi, thank you for your great tutorials first of all.
However, I've tried to import your workflow JSON file but never with success - neither using self-hosted n8n nor n8n cloud.
Any suggestions for fixing the issue? Thank you.
My pleasure! Sorry you are having trouble importing the workflow though! What is the error you are getting?
@@ColeMedin Thanks for your reply. I've tried to import your "Supabase_RAG_AI_Agent.json" and "n8n_Workflow_RAG_AI_Agent.json" and both showed "Could not import file: The file does not contain valid JSON data."
That's really weird... I tested it myself and it is working. Are you self-hosting n8n or using the cloud n8n?
In the intro he said the most golden lines and hit my most frustrating point. You won't believe it - I spent 5 days and still didn't find a proper way to implement this in real life, since most tutorials are carrying paid promotions and use those promoted tools in the project, which frustrates me, and some are just poor quality, my gosh!!!!! And if you do find a way, it only works for personal use; when it comes to a production-ready build, they all fail.
What do you do when n8n is locally hosted - what should the app domain be for Google Cloud?
Great question! When you have n8n hosted locally, you'll have to set up a domain and SSL certificate to be able to use Google since Google won't work with localhost unfortunately. I would suggest hosting n8n on a VPS using a service like Digital Ocean! n8n actually has great documentation on hosting in DigitalOcean:
docs.n8n.io/hosting/installation/server-setups/digital-ocean/
Thanks!
Thank you so much for your support!!
I think RAG AI is really good with a small volume of data, like a 10-page PDF. However, when moving to something more serious, like over 50 pages and about 20 tables in a file, it doesn't respond as well. I'm referring to a file with more than 2,000 lines
Yes you certainly aren't wrong! There are a lot of factors that determine how well a RAG system performs, like the embedding model, the LLM model to handle retrieved chunks, your chunk size, how you split up your documents (especially for things like tabular data), your use of metadata filters, etc. A lot of advanced RAG techniques can be used too like reranking, hybrid search, knowledge graphs, etc. All of this becomes a lot more important once you have a lot of files or very large files like you are saying!
Question: instead of using the OpenAI API, is it possible to use Ollama?
Great question! Yes - you can set up Ollama with n8n really easily since it's one of their chat model options!
All you have to do is select the chat model on the AI Agent node, choose Ollama from the list of models, and then give the endpoint URL of your self-hosted LLM.
docs.n8n.io/integrations/builtin/cluster-nodes/sub-nodes/n8n-nodes-langchain.lmchatollama
What's the best way to allow PDFs to be uploaded and have them work correctly? I switched the extract document text node to PDF, which is fine, but for some reason it won't upload to Supabase. The text splitter node doesn't work.
Interesting... what you are describing sounds correct to me! So I'd have to take a closer look to see your setup. What is the error you are getting with the text splitter node?
@@ColeMedin Managed to figure this part out; everything is uploading fine, but sometimes the model doesn't check the vector store for answers and just uses its internal knowledge.
Glad you figured it out! And this new issue comes up a lot, but there is an easy solution! If you click into the "Tools Agent" node in n8n, you'll see a place to edit the System Message. There you can provide more instructions to the LLM to tell it something like "Always check your knowledgebase for the answer, don't rely on your internal knowledge".
I was trying to recreate this RAG setup but ran into a problem. There isn't any description of the database table structure. Can someone give me a hint? We are storing the document embeddings in the Couchbase DB "documents" table; what are the column names and types?
13:18
@e11e7en gave a good timestamp! Thank you for that!
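For anyone who doesn't want to scrub the video: the table that the Supabase + LangChain setup creates looks roughly like the sketch below. This is just a TypeScript description of the row shape, not the authoritative definition (that's the SQL in the Supabase guide), and the 1536 dimensions assume OpenAI's embedding models.

```typescript
// Sketch of the row shape for the "documents" table that the Supabase + LangChain
// setup creates (the SQL in the Supabase guide is the authoritative definition).
// vector(1536) assumes OpenAI's embedding models - adjust if you use another.
export interface DocumentRow {
  id: number;                        // bigserial primary key
  content: string;                   // the text of one chunk
  metadata: Record<string, unknown>; // jsonb, e.g. { file_id, file_title } from the document loader
  embedding: number[];               // pgvector column, vector(1536)
}
```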
Thank you for the video! I followed all the steps but have an error on the Supabase Vector Store output: Error searching for documents: PGRST202 Could not find the function public.match_documents(filter, match_count, query_embedding) in the schema cache Searched for the function public.match_documents with parameters filter, match_count, query_embedding or with a single unnamed json/jsonb parameter, but no matches were found in the schema cache.
Can't work around it. Any ideas?
Of course! Thanks for walking through it and I'm sorry you're getting an error!
Did you run the SQL commands outlined here in the document I showed in the video?
supabase.com/docs/guides/ai/langchain?database-method=sql
That third command that starts with "create function match_documents (" is what creates the function your error message says is missing. I would also make sure the "public" namespace for Supabase is your default namespace. You can determine that by going to the table editor in Supabase and making sure "public" is the schema chosen by default in the top left.
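Once you've run the SQL, you can sanity check that the function exists by calling it directly with supabase-js. A minimal sketch, with a placeholder URL/key and a dummy embedding; the parameter names match the ones in your error message:

```typescript
import { createClient } from "@supabase/supabase-js";

// Placeholders - use your own project URL and service role key
const supabase = createClient(
  "https://YOUR-PROJECT.supabase.co",
  "YOUR_SERVICE_ROLE_KEY"
);

// Calls the match_documents function created by that third SQL command
async function testMatchDocuments(queryEmbedding: number[]) {
  const { data, error } = await supabase.rpc("match_documents", {
    query_embedding: queryEmbedding, // must have the same dimension as your embedding column
    match_count: 5,                  // how many chunks to return
    filter: {},                      // optional jsonb metadata filter
  });
  if (error) throw error;
  return data; // matched chunks (exact columns depend on the SQL you ran)
}
```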
@@ColeMedin It worked! Now I want to upload PDF files. Is there a way to extract PDF document text?
Sorry I am just seeing this now! Yes - there is an option to extract from PDF in n8n! If you select the "Extract Text" node there is an "Extract from PDF" option.
Can you make a video on LangChain, Supabase, and n8n integration?
I am certainly going to in the near future! n8n actually uses LangChain under the hood for their AI Agents, so this already is a LangChain + Supabase + n8n integration! I'm assuming you mean with code instead of n8n though?
+1 on this. So can n8n be categorized as an “orchestration tool” like LangChain or LlamaIndex?
What's the difference between Supabase in this video and your Postgres video? Which one is better?
Could you clarify what you are asking? Do you mean Supabase versus Qdrant for RAG?
Sorry, I thought Supabase and Postgres were competitors, and you made one starter video with Postgres and another with Supabase. Maybe I'm asking the wrong question. Mainly I want to know which chat plugin I can use in WordPress to interface with n8n.
Supabase is actually using Postgres under the hood, so they are not competitors! And could you clarify what you mean by your second question?
Hi, I am using the same code, but when deleting the existing embeddings I'm getting a timeout from Supabase. I wonder what the fix is? Thanks
Hmmm... and your Supabase connection is working for other nodes like inserting the vectors? What's the exact error you are getting? I haven't seen this one before!
@@ColeMedin No worries, I found that the reason I got the error is that I left something out during the Supabase authentication. Thank you :)
Okay great, I'm glad you figured it out!
Even cheaper is to install Supabase on your own cloud server :) Free stuff!
Yeah I totally agree!
Managed Supabase is cheaper to get started because the free tier is very generous, but yes, once you get to production and need to scale, a local Supabase instance is generally the way to go!
Followed all the steps but I don't have the file ID in my metadata... any fix?
I would have to see your workflow to know what exactly is going on. Do you have the metadata option added to the document loader in the n8n workflow?
For the host part, it changed position. Where can I get it for Supabase? That's the only part I'm stuck on.
Sorry I'm not quite understanding your question, could you please clarify?
For some reason, the agent prioritizes the memory and does not use the documents tool. When I remove the memory, it uses the documents tool perfectly. I don't understand the logic behind this. This is the system message: You are a personal assistant responsible for answering questions using a corpus of documents. Before stating that you do not know the answer, you must use the 'documents' tool to search for relevant information in the vector store. This search should be your primary action every time you receive a question, unless it is absolutely clear that there is no useful information available. Always respond in Spanish. This version emphasizes the necessity of using the specified tools to ensure thorough document searches.
Interesting... I didn't run into this issue myself for this setup but I have had this happen with RAG agents before.
This is especially common with models that aren't as powerful, so the easiest thing to try is to use a more powerful model if you can. Like try GPT-4o instead of GPT-4o-mini if you're using that.
Also I'm curious - what kind of conversation did you have with the agent where the memory and knowledge base would have conflicting information? Is it because you added a document to the knowledge base halfway through the conversation? Sometimes you have to restart the conversation when there is new information in the knowledge base, because the LLM doesn't necessarily understand that new info is available, which is why it can resort to what it said earlier in the conversation.
Can you make a video on how to talk to a PostgreSQL DB?
Great suggestion, I appreciate it! I'll be making more SQL AI agent videos in the future and this would be a great addition to what I've been planning!
@@ColeMedin Can't wait mate! Great videos.
Thanks man!
n8n says your JSON file contains no valid JSON data when I try to import it.
Huh, that's weird... are you using n8n self-hosted or in the cloud?
Looks good, but uploading sensitive company data into the cloud won't be the best choice ;)
Thank you, and yes, for a lot of businesses that is certainly the case! But I do make videos on local solutions for this as well! Here's a similar example to this video but local:
ua-cam.com/video/V_0dNE-H2gw/v-deo.html
@@ColeMedin thanks, got it :)
Awesome - you bet!
Are you saying this is better at knowledge base assistance than the Pinecone + GPT-4o setups I've seen?
It's going to be better than a lot of the other RAG assistants on UA-cam for sure! I'm not guaranteeing it's the absolute best, which is why I didn't say "all" in the video (just "most"), but I have specifically seen a lot of mistakes repeated in other tutorials, which I called out.
Specifically, I have seen n8n RAG tutorials with Pinecone before that duplicate vectors whenever a document is reinserted into Pinecone when it is updated. The Pinecone insert function in n8n is not an "upsert" (update or insert)! So it won't remove/update the older vectors for a document, it will create entirely new ones no matter what. That means both the old and new version of the document are in the knowledgebase still.
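If it helps to see the idea in code: the workflow in the video does this with n8n nodes rather than code, but here's a minimal supabase-js sketch of the same delete-then-insert pattern, assuming each chunk's metadata carries the Google Drive file_id like in the video (URL, key, and table/column names are placeholders to match your own setup):

```typescript
import { createClient } from "@supabase/supabase-js";

// Placeholders - use your own project URL and service role key
const supabase = createClient(
  "https://YOUR-PROJECT.supabase.co",
  "YOUR_SERVICE_ROLE_KEY"
);

// Replace all stored chunks for one Google Drive file with freshly embedded ones,
// so the old version of the document never lingers in the knowledgebase.
async function replaceFileVectors(
  fileId: string,
  newChunks: { content: string; metadata: Record<string, unknown>; embedding: number[] }[]
) {
  // 1. Delete existing chunks whose metadata.file_id matches (PostgREST JSON path syntax)
  const { error: deleteError } = await supabase
    .from("documents")
    .delete()
    .eq("metadata->>file_id", fileId);
  if (deleteError) throw deleteError;

  // 2. Insert the newly embedded chunks for the updated file
  const { error: insertError } = await supabase.from("documents").insert(newChunks);
  if (insertError) throw insertError;
}
```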
@@ColeMedin I have a business and for the last few weeks I've been waiting to pull the trigger on getting one of these set up, either paid or by myself (likely won't be me lol). I appreciate you citing an actual functional issue with previous setups. I've seen a lot of apps that don't use "update row" on Google Sheets, which is the number one way to update a system in Sheets. I'm sure there are a few more nuggets to watch out for when setting these up. Duplicate knowledge bases are a big no-no.
This only needs to be able to connect to a local LLM like LM Studio or Ollama.
You can actually connect to a local Ollama LLM inside the n8n workflow! It's one of the supported options when you select the chat model for the n8n Tools Agent. All you have to supply is the local URL (localhost + port) that the LLM is running on.
I successfully created the chatbot, but integrating it with a UI I built with React is actually very hard.
Good on you for working on creating a frontend for your chatbot!
Making a frontend for AI agents certainly isn't a straightforward task! Have you tried using something like v0 to help? I would give that a shot - it has been a game changer for me.
I will also be putting out more content in the near future around creating frontends for AI agents!
@@ColeMedin I have actually already built a frontend exactly with v0, game changer for me too.
I just have trouble connecting the n8n chatbot to my React frontend made with v0.
Anyways, yes, please put out more content on creating frontends for agents. I've gotten really good inspo from your videos lately :)
Glad you're getting good inspiration, I appreciate it!
Could you clarify where you are running into issues connecting your n8n chatbot to React? One suggestion I have is to use a webhook trigger for the n8n agent. That way you can make a fetch request to call into your n8n workflow from the React application and basically turn the agent into an API endpoint. You can add authentication/authorization easily as well!
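Something like this rough sketch - the webhook path, payload fields, and response field here are assumptions you'd match to your own Webhook trigger and response node settings:

```typescript
// Rough sketch of calling an n8n webhook-triggered agent from a React app.
// The webhook path, payload fields, and response field are assumptions -
// match them to your Webhook trigger and "Respond to Webhook" node settings.
export async function askAgent(message: string, sessionId: string): Promise<string> {
  const res = await fetch("https://your-n8n-host/webhook/rag-agent", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ chatInput: message, sessionId }),
  });
  if (!res.ok) throw new Error(`n8n webhook returned ${res.status}`);
  const data = await res.json();
  return data.output; // field name depends on how your workflow formats its response
}
```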
@@ColeMedin Yeah, using a webhook trigger is smart. I managed to pass the query from the webhook trigger to the RAG AI Agent node using chatInput, but for some reason it still doesn't start; it shows the error "The value "" is not supported" and I've tried many different queries.
Maybe you could make a video where you create a quick UI in v0 and then connect the n8n chatbot to that UI via a webhook trigger.
Btw, my project is already working locally without the n8n integration. I added you on LinkedIn if you want to see what it does; it's pretty interesting, but I currently have no business use case, so maybe you can get some inspiration from it yourself.
Hmm... that's a strange error... I am definitely planning on making a video in the future connecting an n8n agent with a webhook trigger to a frontend I build with v0! So I really hope that can help you out!
Thanks for connecting on LinkedIn, I'll head over there now.
Why not Pinecone instead of Supabase? Pinecone seems more suited (than Supabase) for unstructured text formats/files, no?
Because UA-camrs aren't paid to give you the best technical advice, they're paid by the tracking links when you go paid on Supabase
Very valid question! I chose Supabase for three main reasons for this workflow:
- It's easier to use
- It's cheaper (Pinecone has a free tier as well but I've had relatively frequent outages with it)
- Pinecone is a better vector DB overall but really only once you have tens of thousands of vectors, and it isn't better by too much from my experience
I will definitely be making content with Pinecone in the future, but wanted to start with Supabase for these reasons!
@@littledaddi3 I honestly do try to give the best technical advice I can! I am not sponsored by Supabase in any way so this is all honest advice with no tracking links :)
@@ColeMedin Makes sense. Thanks!
@@Apokalupsis88 Of course!
How would you add better semantic search please?
Could you please elaborate on what you are looking for?
Friend, tell me how to do the same thing but with an online chat for my site, so that you can install an online chat widget on your site with vector storage and ChatGPT behind it.
n8n actually allows you to embed an AI Agent chatbot like the one I created onto any website as a widget! I'd take a look at this resource to see how to use it:
www.npmjs.com/package/@n8n/chat
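The embed itself is only a few lines. Here's a rough sketch based on that package; the webhook URL is a placeholder you'd copy from your n8n Chat Trigger node:

```typescript
import "@n8n/chat/style.css";
import { createChat } from "@n8n/chat";

// Placeholder URL - copy the production chat webhook URL from your n8n Chat Trigger node
createChat({
  webhookUrl: "https://your-n8n-host/webhook/YOUR-WEBHOOK-ID/chat",
});
```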
Let me know if this helps!
Can the Google Drive nodes read PDF files? Can the whole process be based on reading PDF files?
Great question Bruno!
The Google Drive nodes can indeed handle PDF documents. The workflow does need to be updated to work with PDFs though because the "Extract Text" node I use in the video doesn't work with PDFs. There is a version of that node that is specifically for PDFs, however, so you can use that one for PDF documents!
@@ColeMedin Thanks!
@@ColeMedin What about making the AI chat with my own PDF files? Is this workflow able to do that?
Yeah it does! Once you add in support for PDF files, you can just dump them into your Google Drive folder and then ask questions based on their contents to the LLM once the PDFs are ingested into the knowledgebase!
@@ColeMedin Even with the whole Supabase setup in place? Will Supabase be able to fill the database by extracting info from my PDFs so the workflow runs properly?
How do I bring this into Slack? If it's not 12th-grade easy, is there any way you could point me toward how to ask someone for help? Thanks
Great question! All you would need to do is these two steps:
1. Change the workflow trigger from "Chat message" to Slack's "On New Message Posted to Channel" trigger. Here you can specify the specific Slack channel and any other filters that would trigger the n8n workflow.
2. Add a step at the end of the workflow to send the response back to the Slack channel with Slack's "Send a message" action.
For both the trigger and action I mentioned, you can search for Slack in n8n and you'll find both of those in the list that pops up once you select Slack!