Vector and Hybrid Search with Elasticsearch

  • Published 24 Nov 2023
  • This talk is by Carly Richmond, Developer Advocate at Elastic.
    Search is no longer just traditional TF-IDF: the current wave of machine learning models has opened another dimension for search. ChatGPT continues to dominate news headlines and marks a turning point in how we build search applications and find data. Yet concerns remain about adapting our search solutions to perform semantic search against our own private data.
    This talk serves as an introduction to the difference between vector and keyword search, and how the two can be combined. We’ll cover:
    - "Classic" search and its limitations
    - What a model is and how you can use it
    - How to use vector search or hybrid search in Elasticsearch
    - Where OpenAI's ChatGPT or similar LLMs come into play with Elastic
    This session was part of the 2023 Oktoberfest Data Science Festival at CodeNode in London.
    The Data Science Festival is the place for data-driven people to come together, share cutting-edge ideas and solve real-world problems. We run monthly events, meetups and the biggest free-to-attend data festivals in the UK. Join the community at datasciencefestival.com/
    #vectorsearch
    #machinelearning

COMMENTS • 3

  • @eyemazed 6 months ago

    How would you get around this problem: you have 2 sets of results from 2 different search engines, for example one vector and one full-text. For this particular query (not always), it just so happens that the vector search results are super good but the full-text search results are really crappy. Now you apply the Reciprocal Rank Fusion algorithm, and it blends crap and quality together instead of keeping more quality and discarding more crap. I wish there were a way to address this problem other than Elastic's custom "script_score", which is basically a static function and assumes that the same scoring algorithm will be applied regardless of the input (results).
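
    The blending described in this comment can be reproduced with a minimal sketch of Reciprocal Rank Fusion in plain Python. This is not Elasticsearch's implementation. It assumes the commonly cited formula, in which each document scores the sum of 1 / (k + rank) over the lists it appears in, with k = 60. The function name and the document IDs are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. Only rank positions are used, so the raw
    relevance scores of the individual retrievers never enter the result.
    Returns (fused_ids, scores).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    fused_ids = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    return fused_ids, dict(scores)


# Hypothetical result lists: assume the vector hits are relevant
# and the full-text hits are not, with no overlap between them.
vector_hits = ["good_1", "good_2", "good_3"]
keyword_hits = ["bad_1", "bad_2", "bad_3"]

fused, scores = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
print(scores["good_1"], scores["bad_1"])
```

    Because the fusion sees only rank positions, the top full-text hit receives exactly the same score as the top vector hit, so the two lists interleave no matter how relevant either list actually is.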