PostgreSQL as VectorDB - Beginner Tutorial
- Published 20 Dec 2023
- Want to get started with freelancing? Let me help: www.datalumina.com/data-freel...
Need help with a project? Work with me: www.datalumina.com/consulting
🔗 Links in this video
github.com/daveebbelaar/langc...
github.com/pgvector/pgvector
dev.to/confidentai/why-we-rep...
👤 Connect with me on LinkedIn
/ daveebbelaar
👋🏻 About Me
Hey there, my name is @daveebbelaar and I work as a freelance Data Scientist / AI Engineer and run a company called Datalumina. You've stumbled upon my YouTube channel, where I give away all my secrets when it comes to working with data. If you want to learn more about what I do, then head over to www.datalumina.com/
in the video you are creating the data from text files, but it seems like a main advantage of having it on your postgres db is being able to use / query the data in your tables.
i'd love to see how to build a full text search or something from data stored in regular postgres tables!
A new fan here! It will be great to see a video where you use streamlit or something else to create a search with pgvector (full text search)
Thanks for this! I was leaning towards pgvector and your video convinced me so!
One of the things I learned in the past few months working with RAG-based LLMs is that it's definitely not one size fits all. The quality of inference depends on the embedding algorithm as well as the indexing and retrieval mechanism of the vector database.
This was a great video!
Great Video! Helped me in my work! Thanks :)
Clean solution. This is helpful, thank you for this.
Hi Dave, this is a great video, thanks for sharing the knowledge. I really liked the idea of using PostgreSQL. Can you please make a video on setting up Postgres on Azure?
Thanks for the video. I'll be trying PGVector! Do you know of any good alternative to OpenAI embeddings that can be run locally?
How would this work if you were using more structured data that needed to be stored in columns and rows?
This was super interesting. Do you have a video that explains your PGVector setup (do you install the database locally or do you have a cloud account)? I'd love to have a setup where I can view my document collections and embeddings in my editor like that. I use VSCode right now, so not sure ... good stuff!
I talk about this near the end of the video
What about an open source vector store like qdrant?
Hi Dave, I’m also using pgvector but the outputs are not really that good. Could you make a video on improving the performance of a RAG pipeline in LangChain and pgvector? Thanks.
thank you💖💖
The thing that bothers me about using Postgres for RAG is that the vector search works fine, but its full text search capabilities are severely handicapped. It doesn't support partial or fuzzy matching, so you can't really do a nice reciprocal rank fusion between resources retrieved by multiple channels (vector + full text). I'm going to try ElasticSearch next, as I've previously worked with it and it's really good at full text search (TF/IDF, fuzzy search, partial search, stemming...), and the newer versions also support vector search. The downside is having to sync Elastic with your main db all the time...
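For readers who have not met reciprocal rank fusion, here is a minimal Python sketch of the idea mentioned in the comment above. It is not from the video or its repository. The function name, the document ids, and the constant k=60 (a commonly used default) are assumptions for illustration only. Each document gets a score of 1 / (k + rank) from every ranked list it appears in, and the scores are summed.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two retrieval channels.
vector_hits = ["doc_a", "doc_b", "doc_c"]
fulltext_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
print(fused)
```

Because only ranks are used, the two channels do not need comparable scores, which is why the technique suits mixing vector similarity with full text relevance.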
Bro, did you do any indexing?
I have a follow-up question: if, let's say, one chapter of a book has a total word count of 3k, will it be able to store all 3k words?
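One common way to handle long text like this, offered as an assumption rather than as what the video does, is to split it into overlapping chunks and embed each chunk separately, so that no single piece exceeds the embedding model's input limit. The sketch below is plain Python. The function name `chunk_words` and the sizes (500 words per chunk, 50 words of overlap) are hypothetical choices, not values from the video.

```python
def chunk_words(text, chunk_size=500, overlap=50):
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words so that sentences cut at a
    boundary still appear whole in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


# A stand-in for a 3,000-word chapter.
chapter = " ".join(f"word{i}" for i in range(3000))
chunks = chunk_words(chapter)
print(len(chunks), "chunks")
```

Each chunk would then be embedded and stored as its own row, so the full chapter is kept even though no single vector covers all of it.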
LOL Dave, I was googling if I can use Postgres somehow instead of Pinecone and your video popped up 🤣🤣👍🏽👍🏽👍🏽 Love it!
Haha you're becoming a true engineer Jenny. Those are some pretty serious Google searches haha. Let me know if you need further help!
@@daveebbelaar for sure dude! 🤌🏽 trying to get in that coder level 😂😂😂
I didn't fully understand it from the video, but are you comparing times between using Pinecone on a remote host vs Postgres run locally?
Not only processing time (because I know that's not a truly fair comparison), but also ease of use and data management.
@@daveebbelaar I get that, but in a production environment it makes a big difference, especially when you think of use cases. I would be curious to see a comparison between a cloud-hosted Postgres and Pinecone, or between a locally hosted Postgres and something like Chroma.
Great!! I enjoy watching your videos. I tried running the code from your GitHub, but I am facing an error: ModuleNotFoundError: No module named 'pgvector_service'. Then I tried to pip install pgvector_service, but this occurred: ERROR: Could not find a version that satisfies the requirement pgvector_service (from versions: none)
ERROR: No matching distribution found for pgvector_service
Do you have any ideas how to overcome this?
Thanks for showing pgvector. Weaviate is also free and can be run locally using Docker. I agree, I am for open source.
How do you update the vectorstore (e.g. replace outdated data)?
Just update the outdated data like you would in any db.
Could you put the vectors inside Firebase? That’d be epic
Nope, Firebase has a limit, tried it.
@@3wcdev878 dang that’s unfortunate
But you tested it with a small dataset, most relational databases go slower as they grow.
Pinecone is managed, isn't it? There are more reasons why enterprises would use and pay for it. For simple side projects, then yeah, pgvector locally makes sense.
pgvector is the WORST performing vector db according to all comparison charts.
you need to tell people if you're sponsored by supabase, otherwise this is not ethical.
Can you share some more insights on this? And no, I am not sponsored or affiliated with Supabase.