OpenAI Embeddings and Vector Databases Crash Course

  • Published 22 Dec 2024

COMMENTS • 205

  • @photorealm
    @photorealm 8 months ago +25

    That was the first video that actually gave me an understanding of how vector DBs kind of work. Thank you for sharing.

    • @goldenant9450
      @goldenant9450 2 months ago

      Key word being, "kind of" 😂😂

  • @SuperITPRO
    @SuperITPRO 1 year ago +59

    My ADHD normally overrides my concentration. Your tutorial pace, live coding, and narrative made me complete my 1st Open AI coded app - thank you!

    • @zzej
      @zzej 1 year ago +1

      Same

    • @nicholastroyandersen9505
      @nicholastroyandersen9505 1 year ago +4

      Don't use ADHD as an excuse, it ain't no sickness, just personality. Take it and make it your best quality.

    • @ScherrerMadness
      @ScherrerMadness 1 year ago

      @nicholastroyandersen9505 it’s… not a personality, lmfao. It’s a very clear set of learning disabilities centered around working memory, executive function, and tuning out.

    • @davidabellangarrido2056
      @davidabellangarrido2056 1 year ago

      Same, and without knowing English.

    • @pauls064
      @pauls064 10 months ago

      @nicholastroyandersen9505 it’s literally a neurological condition that can be seen on scans and measured… ignorant comment

  • @aiadvantage
    @aiadvantage 1 year ago +21

    Super high quality video right here. Good job Adrian

    • @AdrianTwarog
      @AdrianTwarog 1 year ago +3

      Hey I've seen your stuff too, it's great, thanks for the nice words!

  • @MohamadBahri-h3k
    @MohamadBahri-h3k 1 year ago +22

    I have seen multiple tutorials, this is by far the best and most concise, great work man

  • @nickkondoori7550
    @nickkondoori7550 1 year ago +18

    Incredible teaching skills. First time ever, I loved someone who can teach "ME" the way I always wanted. Thousand thumbs up Adrian!!

  • @nickfleming3719
    @nickfleming3719 1 year ago +127

    That isn't a vector database. It's a relational database with vectors stored in a text column. In practice, you will have thousands of embeddings, and performance will tank with this setup.

    • @trevorbaier7072
      @trevorbaier7072 1 year ago +5

      What's a more ideal solution for storing vectors?

    • @brookster7772
      @brookster7772 1 year ago +10

      From my investigation, Redis is an excellent vector store for both development and production, especially when it’s a local Dockerized instance.

    • @SussyBacca
      @SussyBacca 1 year ago +9

      MongoDB Atlas is awesome for vectors. They have a new vector feature called knnBeta.

    • @ParthSaneHD
      @ParthSaneHD 1 year ago +4

      Pinecone works too!

    • @amdenis
      @amdenis 1 year ago +1

      You are correct, but you know that! Its indexing is not fast enough for many serious AI projects, and its single-threaded architecture does not scale. Under the hood there are many other non-vector legacy issues.

  • @adamduvick
    @adamduvick 1 year ago +11

    Let me see if I understand what’s going on here:
    1) you have data you want to search semantically
    2) you create a vector database capable of storing data and answering semantic search queries
    3) you use OpenAI to process your data and convert it to vectors, which can be stored in your database
    4) you store the data along with the OpenAI-generated vectors
    5) now you can search the data
    Is that all it is? I thought you were then going to leverage this database to give ChatGPT “long-term memory” ( 0:20 ). What you’ve shown seems nice, but I don’t really see the point, since most people/companies who have enough data to need this kind of querying would not be able to give it away to OpenAI to process.

    • @goldenant9450
      @goldenant9450 2 months ago

      What do you mean by "give it away to OpenAI"? Is everything shared with OpenAI accessible by the internal team or something? I'm pretty sure you can opt out of having your data used to train their AI... at least that's the case with the chatbots.

    • @FieldMarshalFeels
      @FieldMarshalFeels 9 days ago

      You just need to code in chat logging that chunks the logs after they exceed the AI's short-term memory.
      You can also dynamically compress the logs to achieve higher efficiency.
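
The five steps listed in @adamduvick's comment can be sketched in a few lines of Python. This is a toy illustration only: `toy_embed` is a hypothetical bag-of-words stand-in for an embeddings API call (it is not how OpenAI embeddings work), and `TinyVectorStore` is an invented in-memory class that scans every row, which is the same linear-cost pattern @nickfleming3719 warns about for larger data sets.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embeddings API: unit-length word counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(value * b.get(word, 0.0) for word, value in a.items())


class TinyVectorStore:
    """In-memory store: keeps each text next to its vector (steps 2 to 4)."""

    def __init__(self):
        self.rows = []

    def add(self, text):
        self.rows.append((text, toy_embed(text)))

    def search(self, query, k=3):
        """Step 5: score every row against the query (brute force, O(n))."""
        q = toy_embed(query)
        scored = [(dot(q, vec), text) for text, vec in self.rows]
        return sorted(scored, reverse=True)[:k]


store = TinyVectorStore()
store.add("Hello World")
store.add("OpenAI vectors and embeddings are easy")
for score, text in store.search("hello earth"):
    print(f"{score:.2f}  {text}")
```

A real setup would replace `toy_embed` with an actual embedding model and the full scan with an indexed vector search; the shape of the pipeline stays the same.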

  • @tech.bharat18
    @tech.bharat18 4 months ago +1

    This is by far the easiest and most concise explanation. Thanks for creating this video.

  • @krisograbek
    @krisograbek 1 year ago +8

    Adrian, your channel is a gem! I love the way you explain complex topics and the pace of your videos! Greetings from Poland!

  • @codinginflow
    @codinginflow 1 year ago +1

    This was a great overview Adrian!

  • @rasmuspiirtola4397
    @rasmuspiirtola4397 1 year ago +1

    Rarely comment, but damn, you did a perfect job - I am at 8:01, haven't watched the video but had to pause and comment - until 8:01, everything was perfect; how you explain concepts and utilize tools ensures that we understand the concept in practice with ease! Great job, continue making videos; you should do consulting if you don't already do so. It's easy money with little hours with your skills and knowledge!

  • @brookster7772
    @brookster7772 1 year ago +1

    Bare metal, removing all higher-level obstructions and going right down to the core. I love it. The best explanation of what embeddings are that I have seen. Great job.

  • @AbhinavKumar-jt8kx
    @AbhinavKumar-jt8kx 5 months ago

    This is awesome, perfect video for non-beginner developers to quickly grasp.

  • @LeoCB
    @LeoCB 1 year ago +2

    I just bought 2 Udemy courses, and after 5 hours, none of them talk so well about this. I appreciate it, and I will buy your book. Thanks for your content.

  • @anrk97
    @anrk97 1 year ago +3

    Love your thumbnails. Keeps getting better with each video 👍

    • @AdrianTwarog
      @AdrianTwarog 1 year ago +2

      Thanks, I try to make them as clear about what the video represents as possible!

  • @MaverickCoder-mz6hp
    @MaverickCoder-mz6hp 5 months ago

    Nice high quality video with clear explanation of concepts. This video is engaging for learners. I would say one of the best videos out there on vector embeddings. Good Job Adrian

  • @LindsayHiebert
    @LindsayHiebert 1 year ago +15

    Excellent overview! Very concise, clear and relevant! Great job! Thank you Adrian! 😊

  • @cryarchy
    @cryarchy 10 days ago

    Thank you, @AdrianTwarog. I wanted to learn how to store and retrieve embeddings in a vector database. This video helped me with that. The missing bit is how to use the retrieved embeddings for retrieval-augmented generation.

  • @CodexCommunity
    @CodexCommunity 1 year ago

    This is the best video on openai embeddings I have ever seen, I am also a bit biased!

  • @RajShekarsdreamzzz
    @RajShekarsdreamzzz 1 year ago +1

    Very good session, Adrian... your way of teaching keeps people glued... Keep it up

  • @Danimsalinas
    @Danimsalinas 3 months ago +1

    Omg, thanks for this video, very straightforward and easy to understand. Thanks!

  • @dipayanroy964
    @dipayanroy964 8 months ago

    I wish everyone could present like you. Simply super. Looking forward to more in a similar style.

  • @andrey20111988
    @andrey20111988 6 months ago +1

    Also, you can use "Test" in Postman, which can help you create a script that builds a string from the requested input and response data. Automate it (if you need to)!

  • @karthikpillai420
    @karthikpillai420 20 days ago

    Good video on the basics of creating embeddings & vector DB

  • @coinexponent1884
    @coinexponent1884 1 year ago +6

    Learn vector embeddings using first principles. Always engaging, and very rewarding for the learner. Thank you!

  • @Art-kz6zf
    @Art-kz6zf 1 year ago +1

    How efficient is the vector search if you need to go through all of the records every time you search? Shouldn't there be some dedicated field type for embeddings other than blob?

  • @araujoao
    @araujoao 2 months ago

    Thanks for sharing. This was a great video that clearly illustrates vector DBs, embeddings, and searching.

  • @Glow0110
    @Glow0110 1 year ago +6

    Would be great to see a follow up video of practical applications using this.

    • @atursams5501
      @atursams5501 1 year ago

      The practical applications are varied:
      sentiment analysis
      term search
      Classification

  • @chrismalingshu
    @chrismalingshu 1 year ago +3

    [Question] With the input "hello earth", "Hello World" scored 0.89, while "OpenAI Vectors and Embeedings are Easy!" scored 0.74, which is quite close to the top-ranked text. But the first and second returned texts are very different. I would have expected the second text to score 0.5 or below.
    Could you please share your thoughts on this Adrian?
    Thank you!

    • @daffertube
      @daffertube 1 year ago

      You would need to ask someone who built the transformers at openai.
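
One commonly offered explanation for the high 0.74 score is that embeddings from a single model can share a large common component, so even unrelated texts point in broadly similar directions. Whether that is what happens inside OpenAI's model is not something this thread can confirm; the sketch below only shows the arithmetic with made-up four-dimensional vectors (`shared`, `topic_a`, and `topic_b` are invented for illustration).

```python
import math


def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented example: a component every text shares, plus two unrelated topics.
shared = [1.0, 1.0, 1.0, 1.0]
topic_a = [0.5, -0.5, 0.0, 0.0]
topic_b = [0.0, 0.0, 0.5, -0.5]

vec_a = [s + t for s, t in zip(shared, topic_a)]
vec_b = [s + t for s, t in zip(shared, topic_b)]

print(cosine(topic_a, topic_b))  # the topic parts alone are orthogonal
print(cosine(vec_a, vec_b))      # with the shared part the score is high
```

Under that assumption, absolute scores matter less than the ranking: 0.89 versus 0.74 can still be a meaningful gap even though both numbers look high.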

  • @ravindrasingh2411
    @ravindrasingh2411 10 months ago

    Adrian, this is beautifully explained. Absolutely loved it :)

  • @daygo619ca
    @daygo619ca 1 year ago +1

    This tutorial was incredible - completely glued to it

  • @karthikg752
    @karthikg752 1 year ago

    The voice recording and explanation are really clear. It's surprising how big a role tone and voice play in understanding. I was watching another video which was equally good, but somehow the slang and recording made it a bit difficult to understand. Thanks

  • @meirgoldenberg5638
    @meirgoldenberg5638 1 year ago +2

    How in the world did it get a 0.74 score (which is pretty high on a scale from 0 to 1!) for the similarity of "Hello Earth" and "OpenAI vectors and embeddings are easy"? Is there anything in common between the two?

  • @pajisounds
    @pajisounds 1 year ago +6

    Nice video, it would have been nice with a demonstration at the end or intro, keep up the good work.

    • @AdrianTwarog
      @AdrianTwarog 1 year ago

      Oh good suggestion, I’ll do that next time!!

  • @munishtyro
    @munishtyro 6 months ago

    Simple, concise, and has everything in it. Thank You

  • @saik6730
    @saik6730 1 year ago

    Best AI video ever . Made it easy to understand with 2 simple concepts . Thanks man!

  • @AmanBansil
    @AmanBansil 10 months ago

    Absolutely LOVE this. you're so clear and concise.

  • @oscargalvez7
    @oscargalvez7 1 year ago +3

    Amazing tutorial! The way you explain is so easy and understandable!

  • @JeremyArtero
    @JeremyArtero 1 year ago

    This course is gold! Thanks! I have done similar steps on Astra db and it was smooth

  • @phil97n
    @phil97n 1 year ago +3

    Awesome thanks.
    Been studying calculus and linear algebra before I dive deep into AI. I will definitely be dealing with vector databases very soon and looking forward to it.

  • @abijithpradeep7478
    @abijithpradeep7478 1 year ago +3

    For those who already had an OpenAI account and are facing an error while posting the HTTP request: it's because your free credit has expired. You will have to add a payment method or create a new account to get free credits again, and then everything will work fine according to this tutorial.

  • @GenZManhood
    @GenZManhood 1 year ago +2

    I get this message when I run the API. Do you need to pay OpenAI for it to work? Thanks! "error": {"message": "You exceeded your current quota, please check your plan and billing details.",

  • @MDMUNIFHASAN-sr2jk
    @MDMUNIFHASAN-sr2jk 1 month ago

    Nice tutorial. I have a question: which extension do you use for code completion?

  • @curtisblake261
    @curtisblake261 1 year ago +1

    I like this video and I don't mind all the upselling. My only complaint is that if I pause the video for too long, it automatically sends me to another video in the series, which makes it hard to get back to where I was. You might assume it is user error, but it isn't. The automatic transferal and loss of context happens constantly with this YouTube video, and I've never had the problem with any other YouTube tutorial. I'm fine with the monetizing and upselling since it helps reward the content creator, I just wish it wouldn't keep making me lose my place in the tutorial.

  • @MRGCProductions20996
    @MRGCProductions20996 10 months ago

    Isn't calculating the modulus of the difference of the vectors a more accurate way to find similarities?
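
For vectors of length 1, the two measures in this question carry the same information: the squared distance satisfies |a - b|^2 = 2 - 2(a · b), so sorting by smallest distance and sorting by largest dot product give the same ranking. OpenAI documents its embeddings as normalized to length 1; the vectors below are made up and normalized by hand, so treat this as a sketch of the identity rather than a statement about any particular database.

```python
import math


def normalize(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    """Modulus (length) of the difference vector a - b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Invented example data.
query = normalize([1.0, 2.0, 3.0])
docs = {
    "a": normalize([1.0, 2.0, 2.5]),
    "b": normalize([3.0, -1.0, 0.5]),
    "c": normalize([-1.0, 0.0, 1.0]),
}

by_dot = sorted(docs, key=lambda name: dot(query, docs[name]), reverse=True)
by_distance = sorted(docs, key=lambda name: euclidean(query, docs[name]))
print(by_dot, by_distance)  # both orderings are identical
```

If the vectors were not normalized, the two rankings could differ, and which one is "more accurate" would depend on whether vector length means anything in the data.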

  • @kfliden
    @kfliden 8 months ago

    Wow, thanks I'm finally starting to get embeddings!

  • @ZaidKhanPathan
    @ZaidKhanPathan 1 year ago

    Wow! Easy, clear and to the point.

  • @satish1012
    @satish1012 4 months ago

    Great.
    So basically, if I have to create an LLM for my company, which has multiple documents and content, I need to:
    1. Pass all the documents and get embeddings from OpenAI
    2. Store all the embeddings in a DB
    3. Create an app to search the vector DB
    But my question is how it can think and reason. The above approach is great for search capability, but how does it do things like summarization, comprehension, etc.?
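
The usual answer to this question is that the vector search only finds relevant text; the reasoning or summarizing is done by a separate chat model that receives the retrieved text in its prompt. A minimal sketch of that hand-off follows. `build_prompt` and its `max_chars` limit are hypothetical names, the prompt wording is only an example, and the call to the chat model itself is left out.

```python
def build_prompt(question, chunks, max_chars=2000):
    """Combine retrieved text chunks and the user's question into one prompt.

    chunks is assumed to be ordered most relevant first, as returned by a
    vector search. Chunks are added until the character budget is used up.
    """
    context_parts = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        entry = f"[{i}] {chunk.strip()}"
        if used + len(entry) > max_chars:
            break
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below. "
        "If the context is not enough, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


retrieved = [
    "Refunds are accepted within 30 days of purchase.",
    "Support is available on weekdays.",
]
print(build_prompt("What is the refund window?", retrieved))
```

The returned string would then be sent to whichever chat model is in use, which is where the summarization or comprehension step happens.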

  • @FahadKiani1
    @FahadKiani1 1 year ago +4

    Will you create a second part of this video where PDF's are uploaded and then analyzed?

  • @BryanChance
    @BryanChance 1 year ago

    Does the chunk size have an effect on the quality or accuracy of the search result? Let's say I split a document into words AND into 200-word chunks. The vector results are stored in a vector DB.
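
To make the chunking question concrete, here is a minimal sketch of splitting a document into fixed-size word chunks with a small overlap, so each chunk can be embedded separately. The function name `chunk_words` and the defaults (200 words, 20 words of overlap) are assumptions for illustration, not settings from the video.

```python
def chunk_words(text, size=200, overlap=20):
    """Split text into chunks of `size` words; neighbours share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


document = " ".join(f"w{i}" for i in range(450))
pieces = chunk_words(document)
print(len(pieces), [len(p.split()) for p in pieces])
```

Single-word chunks carry almost no context, while very large chunks blur several topics into one vector, so chunk size is generally something to tune against the kinds of queries expected.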

  • @rkjellbe
    @rkjellbe 1 year ago

    Finally, found a video with the appropriate detail. For me! 😊 Thank you!

  • @ismailm123
    @ismailm123 3 months ago

    Brilliant super simple and very easy to understand.

  • @fkxfkx
    @fkxfkx 1 year ago +1

    Bought the book. It ended on page 54; is there anything after that, on pages 54 to 58?
    The last example was OpenAI fine-tuning.
    It leaves the fine-tune up on the OpenAI site.
    How long will it be available there?
    Can it be brought down locally and used in the future as a local model in combination with a cloud model?

    • @AdrianTwarog
      @AdrianTwarog 1 year ago

      I’ll double check, and any updates will automatically be enabled on Gumroad!

    • @Ricocase
      @Ricocase 1 year ago

      @AdrianTwarog how do you automate text importation with SQL? Must one enter each text blob manually?

  • @ewhite_dipi
    @ewhite_dipi 1 year ago

    What are the prerequisites to understand the content in this video? And where can I learn them?

  • @chrislannon
    @chrislannon 10 months ago

    Nice work! Thanks so much for this awesome demo.

  • @MikevanDam-j9g
    @MikevanDam-j9g 6 months ago

    This tutorial is well explained. Thanks for that. But could you explain how to do this at scale? Is it possible to have a no-code tool that companies can use to store their data in a vector database and retrieve it later?
    It seems that there must be easier solutions for this, right? (While also keeping it safe to use.)

  • @cmdrls212
    @cmdrls212 7 months ago

    This is great. I had to learn this in a crunch and I grok it now.

  • @mohammadbarzegari8737
    @mohammadbarzegari8737 1 year ago

    Perfect learning ❤🎉 master of learning ❤❤❤❤

  • @noubgaemer1044
    @noubgaemer1044 11 months ago

    Thanks for the tutorial. Can we use our own LLM, like PrivateGPT or Text-generation Web UI, instead of OpenAI?

  • @atursams5501
    @atursams5501 1 year ago

    Great work! How do you make these nice presentations with the fancy arrows?

  • @gman2036
    @gman2036 10 months ago

    Loved this tutorial, Adrian. Very straightforward, and it worked the first time, unlike some others I've tried. Now for my question. I'm seeing this in February 2024. I did not know ChatGPT, Bard and those other AI apps until they hit the common pool that I must swim in. I take it that vectoring documents has been going on for a while, outside of the math world. I knew of vectors back from college in linear algebra. If this is the case, what I'm trying to do will not be new. I'm trying to vectorize my documents in order to practice doing this kind of work. So, are there IT companies out there doing this type of work already, and can you name a few? How far have they gotten? Has someone already done the Library of Congress, for instance?

  • @matickovac
    @matickovac 8 months ago

    Great work presenting this!
    Do you happen to know how similar or different this is from what Elasticsearch does when performing full-text search?

  • @adavis912
    @adavis912 11 months ago

    Great tutorial!!! I will be buying your book.

  • @xspydazx
    @xspydazx 8 months ago

    Yes, but how do you save a vector store? I.e., export it to JSON for upload or fine-tuning into the main LM?

  • @robertcormia7970
    @robertcormia7970 1 year ago

    Well done, succinct, and excellent explanations of complex topics.

  • @BikashKumar-pz8hc
    @BikashKumar-pz8hc 4 months ago

    How do you interact with non-text content, such as images in a document?

  • @Joshua.Medellin
    @Joshua.Medellin 1 year ago

    I'm a little confused. If I created embeddings, which I'm assuming is essentially training the OpenAI model on a specific topic for my company, would it be able to answer questions only on the specific topic it was trained for?

  • @grantomohundro3298
    @grantomohundro3298 2 months ago

    Great tutorial, man! Thank you!

  • @coding-master-shayan
    @coding-master-shayan 1 year ago +2

    How can I train my own AI using TensorFlow to generate images and text?

  • @pablochacon7641
    @pablochacon7641 1 year ago

    Very interesting video, but what are the prerequisites to understand and actually implement this?

  • @SimonCicero-g8n
    @SimonCicero-g8n 1 year ago

    Perfect explanation!

  • @contactbhasker7483
    @contactbhasker7483 1 year ago

    Is dot_product a function offered by this database for vector searching, ranking, etc.?

  • @satanrasool1802
    @satanrasool1802 1 year ago

    Love it.. it was far simpler than I thought..

  • @joostschuur
    @joostschuur 1 year ago

    How would I go about weighing the results by other meta data? Say I have a bunch of videos, and I'm searching the title/description, but want to give some amount of preference to newer videos too.
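
One simple way to approach this is to blend the similarity score with a recency score before sorting. The sketch below assumes the similarity scores have already come back from the vector search; `blended_score`, the 180-day half-life, and the 0.2 weight are made-up illustration values, not recommendations.

```python
def blended_score(similarity, age_days, half_life_days=180.0, recency_weight=0.2):
    """Mix semantic similarity with a recency bonus that halves every half-life."""
    recency = 0.5 ** (age_days / half_life_days)  # 1.0 for brand new, toward 0 when old
    return (1 - recency_weight) * similarity + recency_weight * recency


# (title, similarity from the vector search, age in days): invented example rows.
videos = [
    ("Older video", 0.82, 720),
    ("Newer video", 0.80, 10),
]

ranked = sorted(videos, key=lambda v: blended_score(v[1], v[2]), reverse=True)
for title, similarity, age in ranked:
    print(f"{blended_score(similarity, age):.3f}  {title}")
```

With `recency_weight=0` the ranking falls back to pure similarity, so the weight acts as a dial for how much preference newer videos receive.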

  • @Dydent10
    @Dydent10 5 months ago

    Brilliant stuff!

  • @EffectiveMuscle
    @EffectiveMuscle 6 months ago

    2:30 so he's saying they interviewed the other 2 criminals but havent decided to charge them... So only the homeowner is being charged. Am I hearing him right?

  • @e-Course.
    @e-Course. 2 months ago

    Very interesting video , thank you

  • @pazhani008
    @pazhani008 9 months ago

    How does SingleStore know about the embeddings returned from OpenAI and search them correctly in its vector DB?

  • @nadershalabi6241
    @nadershalabi6241 7 months ago

    Thank you! Great walk through

  • @karsonkalt7607
    @karsonkalt7607 1 year ago

    Fantastic tutorial and explanation!!

  • @psyduck4763
    @psyduck4763 1 year ago

    Hey man, what are those fonts you've used in this video?

  • @mokiloke
    @mokiloke 1 year ago +1

    Mine seemed to only come up with a paid Postman version. Maybe it's based on location?

  • @sany2k8
    @sany2k8 1 year ago

    Great content 👍👍👍, waiting for more OpenAI, AI related content

  • @sivakumarkalaiselvan6831
    @sivakumarkalaiselvan6831 1 year ago

    Hi Bro,
    What is the extension you used in VS Code for the code suggestions?

  • @sunnysk43
    @sunnysk43 1 year ago

    Absolutely amazing! Thank you so much for your work!

  • @Ricocase
    @Ricocase 1 year ago

    Cool course. How does one connect it to a basic website?

  • @EFilizli
    @EFilizli 1 year ago

    Did this become obsolete with GPT builder?

  • @DrAIScience
    @DrAIScience 11 months ago

    Is there any way to obtain embeddings of gpts from images?

  • @RiazSyed-n1x
    @RiazSyed-n1x 1 year ago

    great explanation ! thanks !!

  • @alexsalgado
    @alexsalgado 1 year ago

    Excellent content, what changes for audio search?

  • @oraculox
    @oraculox 1 year ago

    What is the quickest way to feed recognition or pattern-breaking data into the system? Or just lower the AI's endorphin levels, hahaha.

  • @Aayush-k3d
    @Aayush-k3d 5 months ago

    Very well explained

  • @zibitappert
    @zibitappert 1 year ago

    Would it be possible to use this for an AI NPC for training purposes in an XR space, for example?

  • @pranavkm4513
    @pranavkm4513 1 year ago +1

    Wow, great video, sir. Helped a lot. May I know what extension is being used at 16:40?

  • @m67esteban
    @m67esteban 2 months ago

    thank you very much! super useful!

  • @bryanbai2017
    @bryanbai2017 1 year ago +1

    Any link to the digital book?

  • @demetriusmds
    @demetriusmds 1 year ago

    Excellent. Thank you. Helped a lot.

  • @MannyBernabe
    @MannyBernabe 10 months ago +1

    excellent. thx!

  • @omangramoswaane2211
    @omangramoswaane2211 1 year ago

    Nice video. I love your work.

  • @akshatkant1423
    @akshatkant1423 8 months ago

    I am looking to generate a pretty lengthy JSON, about 25k tokens. None of the LLM models currently support that many output tokens. Do you think it would be possible if I somehow got embeddings in the response, which I could later convert to JSON? Would that let me generate 25k tokens, since embeddings would take up fewer tokens?