Very insightful. Nice explanation of the limitations of Hive-style partitioning and Z-order optimization, and of how liquid clustering provides a neat solution, though its internals look quite complex to understand. Thank you for sharing :)
Very interesting. For Z-ordering, you can store the columns in table properties at table creation and then retrieve them when optimizing. It's not that much code.
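A minimal sketch of that idea, in case it helps: the helpers below only build the SQL strings, so they run without a Spark session. The property key `zorder.columns` and both function names are made up for illustration and are not part of Delta Lake; the only real syntax assumed is `ALTER TABLE ... SET TBLPROPERTIES` and `OPTIMIZE ... ZORDER BY`.

```python
# Hypothetical custom property key used to remember the Z-order columns.
ZORDER_PROP = "zorder.columns"


def set_zorder_property_sql(table, columns):
    """Build the statement that records the Z-order columns on the table."""
    cols = ",".join(columns)
    return f"ALTER TABLE {table} SET TBLPROPERTIES ('{ZORDER_PROP}' = '{cols}')"


def optimize_sql(table, properties):
    """Build the OPTIMIZE statement from a dict of table properties.

    With Spark, the dict could be filled from SHOW TBLPROPERTIES, e.g.
    {r["key"]: r["value"] for r in spark.sql(f"SHOW TBLPROPERTIES {table}").collect()}
    """
    raw = properties.get(ZORDER_PROP, "")
    cols = [c.strip() for c in raw.split(",") if c.strip()]
    if not cols:
        # No columns recorded: fall back to plain compaction.
        return f"OPTIMIZE {table}"
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(cols)})"
```

Each returned string would then be passed to `spark.sql(...)`. Keeping the columns on the table itself means the optimize job does not need a separate config file per table.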
It's a great combo with deletion vectors, as you don't have to rewrite the data. Without deletion vectors, it could make deletes more expensive, as the data would be spread and mixed across files.
Hi! I have a question: is it possible to implement liquid clustering for DataFrames saved directly to Delta files (df.write.format("delta").save("path"))? The conventional approach involves table creation.
Thanks for sharing this talk. Would you be so kind as to share a link to the slide deck presented by Vitor?
One question: is it a wise decision to apply partitioning to a liquid clustering table?
Partitioning is not compatible with liquid clustering.
What is the difference between bucketBy and liquid clustering?