From Pints to Insights: Unveiling Semantic Search Power with Word Embeddings and Vector Databases

  • Published 7 Jun 2024
  • 🍺🔍 "Revolutionizing Search: How Semantic AI Transforms Beer Selection & Beyond | Lucidate Explains"
    Join us on a fascinating journey from hops to high-tech with Lucidate's latest innovation in semantic search. In this video, Richard Walker dives into how AI not only helps you pick the perfect beer but also reshapes how we search for information in finance, investment banking, and more.
    What You'll Discover:
    [0:00] Introduction to Semantic Search
    [0:45] The Hoptimus Prime Scenario: AI in Beer Selection
    [1:23] Keyword vs. Semantic Search: What’s the Difference?
    [3:07] "Vectorizing" text
    [4:28] Building Word Embeddings
    [7:42] The mathematics of semantics
    [7:42] Hoptimus Prime and S.A.S.S.
    [10:45] Join Lucidate’s YouTube Membership for Exclusive Content
    Whether you're a tech enthusiast, a professional in finance, or just curious about AI's impact on everyday decisions, there's something here for everyone.
    📚 Resources & Links:
    Neural Networks and Backpropagation Playlist: bit.ly/3QTCy8t
    Neural Networks in 60 Seconds Playlist: bit.ly/3FSjUqZ
    Learn More about Lucidate’s AI Solutions: bit.ly/3SzqxWN
    Website: www.lucidate.co.uk
    YouTube home: www.youtube.com/@lucidateAI/f...
    🚀 Stay Ahead with Lucidate:
    Want to see how semantic search can transform your business? Reach out to schedule a consultation and explore our innovative solutions.
    📩 Contact: info@lucidate.co.uk
    👍 If you found this video insightful, don't forget to like, subscribe, and share it with your network. Your support helps us bring more AI-driven content to you!
    #SemanticSearch #AIInnovation #LucidateTech #datascience
    Alternative titles? Which do you prefer? Let me know in the comments!
    "AI Transforms How We Search: Semantic Search Revolution in Action"
    "Beyond Keywords: Semantic AI's Game-Changing Role in Information Retrieval"
    "Navigating the Data Deluge: Semantic Search Solutions by Lucidate"
    "The Future of Search: Lucidate's Deep Dive into AI-Driven Semantic Analysis"
    "Choosing Beer with AI? How Semantic Search is Redefining Decision-Making"
    "From Hops to AI: Revolutionizing Beer Choice with Semantic Search Tech"
    "Semantic Search Unveiled: Lucidate's AI Breakthrough in Understanding Data"
    "AI's New Frontier: Semantic Search's Impact from Brews to Banking"
    "Unlocking AI's Potential: Semantic Search for Precise, Contextual Results"
    "Revolution in Search: How Lucidate's AI Mastery Is Changing the Game"
  • Science & Technology

COMMENTS • 8

  • @neurojitsu · 3 months ago

    Your explanations make me feel so much smarter! Then the memory fades, and I'm scratching my head again...

    • @lucidateAI · 3 months ago +1

      The beer might do that…!

  • @zugbob · 7 months ago +1

    So when a search is made on a sentence, is the entire sentence embedded and searched on?
    Would it make sense to also search each keyword semantically and combine the results?

    • @lucidateAI · 7 months ago +1

      @zugbob The sequence is really what matters. Sure, the individual words and their meanings matter, of course they do. But what matters far more is those particular words appearing in a specific order. To illustrate this, take a look at ua-cam.com/video/DINUVMojNwU/v-deo.htmlsi=PwNubG_rgoa_wKS_ at 7:45. You’ll see exactly the same words in each sentence, but the order and position of the words differ in each. Does each sentence have the same meaning? Absolutely not! The sequence is crucial.
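
      If you want to see this effect in code, here is a minimal sketch (an illustration, not the code from the video) that embeds two sentences built from exactly the same words in different orders and compares them. It assumes the sentence-transformers package and the all-MiniLM-L6-v2 checkpoint, both chosen purely for illustration; any sentence encoder would show the same effect.

      ```python
      # Minimal sketch: same words, different order -> different embeddings.
      # Assumes the sentence-transformers package and the 'all-MiniLM-L6-v2'
      # checkpoint (illustrative choices, not from the video).
      from sentence_transformers import SentenceTransformer, util

      model = SentenceTransformer("all-MiniLM-L6-v2")

      sentences = [
          "the dog bit the man",  # identical vocabulary...
          "the man bit the dog",  # ...but a very different meaning
      ]
      embeddings = model.encode(sentences)  # one vector per sentence

      # Similarity is high (shared words) but clearly below 1.0, because the
      # encoder captures word order, not just the bag of words.
      score = util.cos_sim(embeddings[0], embeddings[1]).item()
      print(f"cosine similarity: {score:.3f}")
      ```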

  • @PavelSTL · 6 months ago +1

    The word embeddings are pretty clear, although the explanation implies they are static after training and could simply be looked up in some file. Is that really the case? I thought embedding models might not give you the *exact* same embedding numbers every time you run the same word through them; that should be easy to test. What I'm still struggling to understand, though, is how sentence or chunk-text embeddings work. You can vectorize a chunk of up to 8k tokens with OpenAI's ada-002, and the resulting embedding will be the same size as an individual word's (1536). So how does semantic search work for chunks? Clearly the embedding model cannot be trained on all possible combinations of up to 8k tokens, at least not the way it is for words. Is the chunk broken into individual words (tokens), with the average of all the individual word embeddings taken to represent the entire chunk as one 'mean' embedding?

    • @lucidateAI · 6 months ago +1

      In embedding schemes like word2vec or GloVe, the vocabulary is fixed before training, and once the model is trained the word embeddings are static: they can simply be looked up from the pre-trained model. If you want to see this in action and play around with these embeddings to get a deeper understanding, take a look at: github.com/spro/practical-pytorch/blob/master/glove-word-vectors/glove-word-vectors.ipynb. In transformer models like those used by OpenAI, however, embeddings are dynamic: the vector for a token depends on its context. For a more detailed explanation, please take a look at: 1 ua-cam.com/video/6XLJ7TZXSPg/v-deo.html, 2 ua-cam.com/video/DINUVMojNwU/v-deo.html, 3 ua-cam.com/video/sznZ78HquPc/v-deo.html, and 4 ua-cam.com/video/6tzn5-XlhwU/v-deo.html. These cover 1) word-embedding generation and semantics, 2) positional encoding, 3) the attention mechanism, and 4) pulling these three things together to train an encoder/decoder transformer.

      Semantic search usually measures cosine similarity between tensors, occasionally Euclidean distance, and seldom some other distance metric. The tensors in this case are often rank 2, representing a sequence of word embeddings. That way you don't (and never would!) take a mean of the vectors: you compare one rank-2 tensor (a vector of vectors) with another for semantic similarity. If this doesn't make sense after watching the videos, drop me a line.
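
      To make the lookup-then-compare idea concrete, here is a minimal sketch in the spirit of the linked notebook. It assumes torchtext with the pre-trained '6B', 100-dimensional GloVe vectors (downloaded on first use); the word pairs are illustrative only.

      ```python
      # Minimal sketch: static GloVe embeddings are a pure table lookup, and
      # semantic similarity is the cosine of the angle between two vectors.
      # Assumes torchtext and the '6B', 100-dim GloVe vectors (downloaded on
      # first use); all choices here are illustrative.
      import torch.nn.functional as F
      from torchtext.vocab import GloVe

      glove = GloVe(name="6B", dim=100)  # static after training: just a lookup

      def similarity(w1: str, w2: str) -> float:
          """Cosine similarity between two pre-trained word vectors."""
          return F.cosine_similarity(glove[w1], glove[w2], dim=0).item()

      # Related words land close together in the embedding space.
      print(similarity("beer", "ale"))     # relatively high
      print(similarity("beer", "banker"))  # relatively low
      ```

      For sequences, you would stack the per-word vectors into a rank-2 tensor and compare those tensors instead of averaging them, as described above.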

    • @PavelSTL · 6 months ago +1

      @@lucidateAI Thanks so much!

    • @lucidateAI · 6 months ago

      @@PavelSTL You are most welcome. I hope the supplementary videos made sense. Richard