BigQuery vector search and embedding generation

  • Published 26 Aug 2024
  • Discover the power of semantic search! With BigQuery's vector search capabilities, you can analyze unstructured data like text, images, and videos based on their underlying meaning. Explore how machine learning transforms your data into numerical representations called embeddings, making it possible to find connections that traditional keyword searches often miss.
    In this video, you'll learn how BigQuery seamlessly generates embeddings from unstructured objects and enables semantic search using familiar SQL functions. See a real-world example as we use these techniques to search a non-labeled product image catalog with text.
    Vector search resources:
    Learn more in the vector search documentation → goo.gle/bq-vec...
    Read the vector search blog here → goo.gle/bq-vec...
    Embedding generation resources:
    Learn more in the embedding generation with BigQuery documentation → goo.gle/bqml-g...
    Subscribe to Google Cloud Tech → goo.gle/Google...

COMMENTS • 18

  • @googlecloudtech 4 months ago +4

    Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech

  • @newsverse-ir6rc 4 months ago +5

    With its seamless integration with the broader Google Cloud ecosystem, BigQuery empowers organizations to effortlessly ingest, store, and analyze massive amounts of data from a variety of sources.

    • @googlecloudtech 3 months ago

      We definitely agree that BigQuery empowers organizations of all kinds to make informed data-driven decisions. We're glad you enjoyed the content. 😎

  • @batumanagadze2920 2 months ago +1

    how did we get product_names based on that query?

    • @tmoanryk 2 months ago

      same question

    • @jeffnelson9889 2 months ago +1

      The field 'product_name' was defined in the table 'merch_store_embeddings' around 6:00 in the video. We then access the field around 8:30 in the video.
      When we defined the 'product_name' field, it was blank. The video doesn't show it, but I ran an UPDATE statement in the background to populate some sample product names based on the sku_id field. The code looks something like:
      UPDATE `cymbal-product-analytics.cymbal_retail.merch_store_embeddings`
      SET product_name =
        CASE
          WHEN sku_id = 'HL4C2MYZ' THEN 'Sprinkle of Sunshine Thick Knit'
          WHEN sku_id = '3MTHOVTU' THEN 'Bold and Beautiful Loose Fit'
          WHEN sku_id = 'T9NYYE6N' THEN 'Mix & Match Magic Sweater'
          WHEN sku_id = 'QQNYZ5F2' THEN 'The Bold Harvest Sweater'
          WHEN sku_id = '45QE9RWO' THEN 'Oversized Embrace Sweater'
        END  -- no ELSE branch, so any other sku_id gets NULL
      WHERE 1=1;  -- BigQuery requires a WHERE clause on UPDATE; 1=1 matches every row

    • @jeffnelson9889 2 months ago +1

      @tmoanryk Answered in the comment above.

  • @ammarfasih3866 2 months ago

    Can someone confirm whether these embeddings are created from some sort of metadata (in text format) or from the images themselves, the way Gemini Pro Vision analyzes images?

  • @ammarfasih3866 1 month ago

    Do BigQuery embeddings and vector search support negation? For example, say I give the following statement:
    "looking for boys t-shirts and not in yellow"
    Here I'm looking for boys' t-shirts but specifically don't want to include the color yellow. From what I've observed so far, it can't handle the negation and returns results in yellow.
    Is there a way to handle this?
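
One possible workaround, sketched under assumptions: embeddings capture overall meaning and may not reliably encode "not", so the excluded attribute can be applied as an ordinary SQL filter instead of being written into the search text. The `color` column and the model name `embedding_model` below are hypothetical (neither appears in this thread), and the query text has to be embedded with the same model that produced the stored embeddings.

```sql
-- Search on the positive part of the request only, then exclude yellow items
-- with a regular filter. `color` is a hypothetical column on the base table.
SELECT
  base.sku_id,
  base.product_name,
  distance
FROM
  VECTOR_SEARCH(
    TABLE `cymbal-product-analytics.cymbal_retail.merch_store_embeddings`,
    'ml_generate_embedding_result',
    (
      SELECT ml_generate_embedding_result
      FROM ML.GENERATE_EMBEDDING(
        MODEL `cymbal-product-analytics.cymbal_retail.embedding_model`,  -- hypothetical model name
        (SELECT 'boys t-shirts' AS content)
      )
    ),
    top_k => 20
  )
WHERE base.color != 'yellow'
ORDER BY distance;
```

Because the filter runs after the search, fewer than top_k rows may come back; raising top_k is one way to compensate.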

  • @KEVINCABALLERO-nb2uv 2 months ago +1

    Has anyone faced this issue? Column 'ml_generate_embedding_result' must have the same array length, while the minimum length is 0 and the maximum length is 768.

    • @jeffnelson9889 2 months ago +1

      Run a query like the following to make sure that all of your embeddings (the column 'ml_generate_embedding_result') have the same length before creating your vector index:
      SELECT ARRAY_LENGTH(ml_generate_embedding_result), count(*)
      FROM `cymbal-product-analytics.cymbal_retail.merch_store_embeddings`
      GROUP BY 1;
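
If that check returns more than one length, a possible follow-up (a sketch, not something shown in the video) is to list the affected rows and then build the vector index on a copy that keeps only full-length vectors. The target table name `merch_store_embeddings_clean` is hypothetical, the value 768 is taken from the error message above, and the `ml_generate_embedding_status` column is assumed to have been kept when the embeddings table was created.

```sql
-- Rows whose embedding came back empty, with the status message if it was retained.
SELECT sku_id, ml_generate_embedding_status
FROM `cymbal-product-analytics.cymbal_retail.merch_store_embeddings`
WHERE ARRAY_LENGTH(ml_generate_embedding_result) = 0;

-- Copy only the rows with a full-length embedding into a new table
-- (hypothetical name), then create the vector index on that table.
CREATE OR REPLACE TABLE `cymbal-product-analytics.cymbal_retail.merch_store_embeddings_clean` AS
SELECT *
FROM `cymbal-product-analytics.cymbal_retail.merch_store_embeddings`
WHERE ARRAY_LENGTH(ml_generate_embedding_result) = 768;
```

Regenerating the embeddings for the empty rows is an alternative to dropping them, if those products still need to be searchable.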

  • @abubakrabdalla9430 4 months ago +1

    I'm facing this error:
    Invalid table-valued function ML.GENERATE_EMBEDDING ML.GENERATE_EMBEDDING expects the 2nd argument to contain a column named content of type STRING. at [3:8]

    • @adammudrick6417 3 months ago +1

      You need to rename the columns and make sure they are in the right order (columns 1 to n): content, entity_id, *, ...

    • @ammarfasih3866 2 months ago

      You need to include a column aliased as content (this is the column used for the embedding).
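
A minimal sketch of what these replies describe, assuming a text embedding model: the query passed as the second argument exposes the text to embed under the alias `content`. The model name, the source table `products`, and the column `product_description` are hypothetical placeholders, not names from the video.

```sql
-- The inner SELECT must return a STRING column named `content`;
-- other columns (here sku_id) are passed through to the output.
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `my_project.my_dataset.embedding_model`,   -- hypothetical model
  (
    SELECT
      product_description AS content,              -- hypothetical source column
      sku_id
    FROM `my_project.my_dataset.products`          -- hypothetical source table
  )
);
```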

  • @0269_m 4 months ago +2

    I love Google Cloud, man. I run an MC server there.

  • @iAmNotBobHope 6 days ago

    The command at 5:19 "CREATE OR REPLACE MODEL" doesn't work and gives an error of "is not a supported object type". Must use "CREATE MODEL IF NOT EXISTS". Took me way too long to figure this out.
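
For reference, the variant this comment describes might look like the following for a remote embedding model. This is only a sketch: the model name, the connection name, and the endpoint are assumptions, not values confirmed by the video, and should be replaced with your own.

```sql
-- Creates the remote model only if it does not already exist.
CREATE MODEL IF NOT EXISTS `my_project.my_dataset.embedding_model`  -- hypothetical model name
REMOTE WITH CONNECTION `my_project.us.my_connection`                -- hypothetical connection
OPTIONS (ENDPOINT = 'multimodalembedding@001');                     -- assumed endpoint
```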

  • @anupaminsight 4 months ago +1

    🇮🇳