18 Months of Pgvector Learnings in 47 Minutes (Tutorial)

  • Published Dec 24, 2024

COMMENTS •

  • @NatColley-t4z · 1 month ago +6

    Excellent, excellent, excellent. It does even more than I had merely hoped for. Forgive me, postgres, I should never have doubted you.

  • @theointechs · 1 month ago +2

    Massively underrated video! So much valuable info, thank you so much!

    • @TimescaleDB · 15 days ago

      Thanks! Glad you found it helpful.

  • @gauthamvijayan · 10 days ago

    With this single video I was able to understand what I need to become an AI engineer by leveraging PostgreSQL extensions and vector databases, and then consume them in my React/React Native applications.
    Thanks a ton for making these videos.
    The instructor deserves a raise for putting everything together so well.

    • @TimescaleDB · 8 days ago

      That's awesome - thanks for sharing! Glad we could help.

  • @dbanswan · 2 months ago +3

    Amazing video, learnt a lot. Will make time to read the Timescale blog regularly.

    • @TimescaleDB · 2 months ago +1

      Thanks! Much appreciated.

  • @BruntPixels1234 · 2 months ago +9

    You should do more tutorials like these

    • @TimescaleDB · 2 months ago

      What additional topics would you like to see? Let us know and we can make it happen.

  • @renobodyrenobody · 1 month ago +1

    Well, after trying the whole thing, I think the caveat here is that using pgai as shown depends on OpenAI: it is not local, you have to pay for the tokens, your data leaves your machine, and it is a black box. So I found another way: coding some functions locally in Postgres to call Ollama with local models. No privacy or data leak, no token cost. This is what I understood, but I am a rookie.

    • @TimescaleDB · 15 days ago +1

      pgai supports Ollama, so you can use local models for greater privacy and lower cost; check out the GitHub repo for more. The example used in the video is with OpenAI, but pgai also supports Ollama, Cohere, and Anthropic models.
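
      A minimal sketch of what that local setup might look like, assuming pgai's `ai.ollama_embed` function and an Ollama server running on its default port with the `all-minilm` model pulled (the input text here is illustrative):

      ```sql
      -- Enable pgai (CASCADE also pulls in pgvector as a dependency).
      CREATE EXTENSION IF NOT EXISTS ai CASCADE;

      -- Embed text with a local Ollama model instead of OpenAI:
      -- no data leaves the machine and no API tokens are spent.
      SELECT ai.ollama_embed(
          'all-minilm',                      -- local model name
          'the quick brown fox',             -- text to embed
          host => 'http://localhost:11434'   -- default Ollama endpoint
      );
      ```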

  • @renobodyrenobody · 1 month ago +1

    Mmm... old-school engineer here, more than 30 years spent with DB systems. And now I understand: I don't want a black-box RAG system, I want to implement AI stuff with PG! One little thing you could do better: add some examples of retrieving data without and with the vectors, especially when there is a WHERE clause. Other than that, your video is a big source of inspiration. Thanks a lot.
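
    To illustrate the request above, a hedged sketch of both retrieval styles; the `documents` table, its columns, and the `:query_embedding` parameter are hypothetical names, not from the video:

    ```sql
    -- Hypothetical table: pgvector's vector type next to ordinary columns.
    CREATE TABLE documents (
        id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        category  text,
        contents  text,
        embedding vector(1536)
    );

    -- Without vectors: plain relational retrieval.
    SELECT id, contents
    FROM documents
    WHERE category = 'manuals'
      AND contents ILIKE '%backup%';

    -- With vectors: nearest neighbours by cosine distance (<=>),
    -- combined with the same WHERE filter. :query_embedding is a
    -- parameter holding the embedded search text.
    SELECT id, contents
    FROM documents
    WHERE category = 'manuals'
    ORDER BY embedding <=> :query_embedding
    LIMIT 5;
    ```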

  • @awakenwithoutcoffee · 2 months ago +1

    Lovely as usual, there is indeed a lot to learn, but we're getting closer :) Bless you.
    PS. Regarding storing structured and unstructured data in the same table: are you using the technique of storing complete structured tables inside a JSONB?
    We thought about this approach but dropped it in favor of separating structured from unstructured data, to prevent mismatching and allow for better isolation/scaling.
    Still experimenting, but currently our setup creates one table per structured document and infers the schema dynamically on upload, plus the embedding. Then the agent decides at runtime which tables to query. Unstructured documents can be bundled together more easily, but can placing all document types together give false-positive search results?

  • @ram8849 · 1 month ago

    Hi, your presentation gives me a clear idea of vector DBs (I am new to them). May I ask a question about the example at 18:03?
    If I understand what's going on correctly, you are encoding every row's columns (or their combinations) to the vector data type, and doing the same to the verbose text query using the text-embedding-3-small model, so you can compare them with cosine similarity and output the top result; then we can get the data in those columns and send it WITH the original verbose text query to the LLM as usual.
    1. Is this the idea of what RAG does?
    2. If so, since what is stored in the rows/columns is raw data (string/int/whatever), for example a date could be an expiry date, a member-since date, a birthday, etc., should we embed the original data directly, or turn it verbose first BEFORE encoding to a vector, in order to get a better result? Or does it depend on the situation? For example:
    sex, age, verbose description, embedded(verbose description), embedded([sex, age])
    m, 18, "this is a man in age 18", [0.01, .......], [.........val in vector]
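
    The flow this question describes (embed the row's verbose text, embed the query with the same model, rank by cosine similarity, send the top row back to the LLM) can be sketched with pgvector as follows; the `people` table and the `:question_embedding` parameter are illustrative names, not from the video:

    ```sql
    -- Illustrative table mixing raw columns with an embedding of
    -- their verbose rendering.
    CREATE TABLE people (
        sex         text,
        age         int,
        description text,          -- verbose text, e.g. 'this is a man in age 18'
        desc_vec    vector(1536)   -- embedding of the verbose description
    );

    -- At query time: embed the question with the same model used for
    -- the rows, then rank by cosine distance (<=>, smallest is most
    -- similar) and feed the top match back to the LLM with the query.
    SELECT sex, age, description
    FROM people
    ORDER BY desc_vec <=> :question_embedding
    LIMIT 1;
    ```

    Whether to embed the raw column values or a verbose rendering of them first is exactly the open question in the comment; this sketch assumes the verbose rendering.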

  • @afaha2214 · 1 month ago +1

    What is the Postgres SQL client being used? It looks like Supabase.

    • @jroy3427 · 1 month ago +1

      PopSQL, it was acquired by Timescale a few months ago

    • @TimescaleDB · 15 days ago

      It's PopSQL

  • @renobodyrenobody · 1 month ago

    Also, where does StreamingDiskANN come from? It seems only IVFFlat and HNSW are available, and diskann gives SQL Error [42704]: ERROR: access method "diskann" does not exist! Ha, got it: pgvectorscale!

    • @TimescaleDB · 15 days ago

      Correct, install pgvectorscale and you can access the StreamingDiskANN index.
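
      For completeness, a minimal sketch of enabling that index, assuming pgvectorscale is installed on the server; the table and column names are hypothetical:

      ```sql
      -- pgvectorscale ships the StreamingDiskANN access method;
      -- without it, USING diskann fails with SQL Error [42704].
      CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;

      CREATE INDEX documents_embedding_idx
      ON documents
      USING diskann (embedding vector_cosine_ops);
      ```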

  • @orenmizr · 1 month ago

    give me more videos like this please : )

  • @SageRap · 2 months ago +1

    Appreciate the video. Just FYI, you're pronouncing the word "build" like "bulled" throughout the video, but most native speakers pronounce it like "billed"