Delta Lake Deep Dive: Liquid Clustering

  • Published Jan 4, 2025

COMMENTS • 8

  • @arunsundar3739
    @arunsundar3739 1 month ago

    Very insightful. Nice explanation highlighting the limitations of Hive-style partitioning and Z-order optimization, and how liquid clustering provides a neat solution, though its internals look quite complex to understand. Thank you for sharing :)

  • @chrisstephenson9890
    @chrisstephenson9890 11 months ago

    Thanks for sharing this talk. Would you be so kind as to share a link to the slide deck presented by Vitor?

  • @alexischicoine2072
    @alexischicoine2072 8 months ago

    Very interesting. For Z-ordering, you can store the columns in table properties at table creation and then retrieve them when optimizing; it's not that much code.
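
To illustrate the idea in the comment above, here is a minimal sketch that only builds the SQL strings. The property key `zorder.columns` and both helper names are hypothetical (a custom key chosen for this example, not a built-in Delta Lake property); the `TBLPROPERTIES` and `OPTIMIZE ... ZORDER BY` clauses are standard Delta Lake SQL.

```python
# Sketch only: "zorder.columns" is a made-up custom table property key,
# not something Delta Lake defines or interprets on its own.
ZORDER_KEY = "zorder.columns"


def create_table_sql(table: str, schema: str, zorder_cols: list[str]) -> str:
    """Build a CREATE TABLE statement that records the Z-order columns."""
    cols = ",".join(zorder_cols)
    return (
        f"CREATE TABLE {table} ({schema}) USING DELTA "
        f"TBLPROPERTIES ('{ZORDER_KEY}' = '{cols}')"
    )


def optimize_sql(table: str, properties: dict[str, str]) -> str:
    """Build an OPTIMIZE statement from the stored property, if present."""
    raw = properties.get(ZORDER_KEY, "")
    cols = [c.strip() for c in raw.split(",") if c.strip()]
    if not cols:
        # No stored columns: fall back to plain compaction.
        return f"OPTIMIZE {table}"
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(cols)})"
```

With a Spark session, the `properties` dictionary could presumably be filled from the key/value rows returned by `SHOW TBLPROPERTIES <table>`, and the resulting strings passed to `spark.sql(...)`.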

  • @alexischicoine2072
    @alexischicoine2072 8 months ago

    It's a great combo with deletion vectors, as you don't have to rewrite the data. Without deletion vectors, it could make deletes more expensive, as the data would be spread and mixed across files.

  • @luisriveros1119
    @luisriveros1119 1 year ago

    Hi! I have a question: is it possible to implement liquid clustering for DataFrames saved directly to Delta files (df.write.format("delta").save("path")), rather than the conventional approach involving table creation?

  • @raviv5109
    @raviv5109 10 months ago

    One question: is it a wise decision to apply partitioning to a liquid clustering table?

    • @paulfunigga
      @paulfunigga 10 months ago +1

      Partitioning is not compatible with liquid clustering.
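
As a rough illustration of that constraint, the sketch below builds a `CREATE TABLE` statement and refuses to combine the two layouts. The helper name and its parameters are hypothetical; `CLUSTER BY` and `PARTITIONED BY` are the Delta Lake SQL clauses for liquid clustering and partitioning respectively.

```python
from typing import Optional, Sequence


def create_delta_table_sql(
    table: str,
    schema: str,
    cluster_by: Optional[Sequence[str]] = None,
    partition_by: Optional[Sequence[str]] = None,
) -> str:
    """Build a CREATE TABLE statement with at most one layout strategy."""
    if cluster_by and partition_by:
        # Liquid clustering and partitioning cannot be used on the same table.
        raise ValueError("choose either CLUSTER BY or PARTITIONED BY, not both")
    sql = f"CREATE TABLE {table} ({schema}) USING DELTA"
    if cluster_by:
        sql += f" CLUSTER BY ({', '.join(cluster_by)})"
    elif partition_by:
        sql += f" PARTITIONED BY ({', '.join(partition_by)})"
    return sql
```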

  • @k.saibhargav8072
    @k.saibhargav8072 9 months ago

    What is the difference between bucketBy and liquid clustering?