Advancing Spark - Delta Deletion Vectors

  • Published 29 Jan 2025

COMMENTS • 8

  • @XiaoyunZhang-h7r, 4 months ago

    Is a "tombstoned" row (soft-deleted state) recoverable before the new file is rewritten?

  • @riteshsharma344, a year ago

    Thanks for the great video, as always 🙂

  • @alexischicoine2072, 8 months ago

    Deletion vectors are amazing. They also improve concurrency, which is detailed on the page about isolation and serialization. They're great if you need to delete customer data for compliance. And if you need to replicate your data to another region, you won't be creating as many extra files that have to be transferred and stored, so you can get good savings from that as well. Imagine you have big, gigabyte-sized Parquet files in a huge table and you need to delete a record here and there: it makes a massive difference.
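
    To make the "delete a record here and there" point concrete, here is a rough sketch of the two delete strategies. It is a toy in-memory model, not the Delta Lake API: `DataFile`, `delete_copy_on_write`, and `delete_with_vector` are hypothetical names invented for illustration, and a plain Python set stands in for the real deletion vector format.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """Toy stand-in for an immutable Parquet data file (hypothetical)."""
    rows: list
    # Positions of soft-deleted rows; plays the role of a deletion vector.
    deleted_positions: set = field(default_factory=set)

    def live_rows(self):
        """Rows a reader sees after filtering out the marked positions."""
        return [r for i, r in enumerate(self.rows)
                if i not in self.deleted_positions]


def delete_copy_on_write(data_file, predicate):
    """Rewrite the whole file without the matching rows.

    Returns the new file and the number of rows that had to be rewritten.
    """
    kept = [r for r in data_file.live_rows() if not predicate(r)]
    return DataFile(rows=kept), len(kept)


def delete_with_vector(data_file, predicate):
    """Mark matching row positions as deleted; the row data is not rewritten.

    Returns the same file and the number of positions newly marked.
    """
    hits = {i for i, r in enumerate(data_file.rows)
            if i not in data_file.deleted_positions and predicate(r)}
    data_file.deleted_positions |= hits
    return data_file, len(hits)


def is_target(row):
    """Select the single customer record to delete."""
    return row["customer_id"] == 42


rows = [{"customer_id": i} for i in range(100_000)]

cow_file, rows_rewritten = delete_copy_on_write(DataFile(list(rows)), is_target)
dv_file, positions_marked = delete_with_vector(DataFile(list(rows)), is_target)

print(rows_rewritten)    # rows copied into a brand-new file
print(positions_marked)  # positions recorded in the vector
```

    In this model both approaches leave readers with the same live rows, but the copy-on-write path rewrites every surviving row, while the vector path only records which positions to skip. The actual savings in a real table depend on file sizes and on when the marked rows are eventually compacted away.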

  • @SladeFlash, a year ago

    Hi, can we set this property on a streaming table?

  • @2307Leito, a year ago

    Awesome! Love your videos! Nice feature. Quick question: what would be the best way to implement upserts in Delta? Let's say you have a fact table by day, and each daily run loads the 3 days closest to getdate() (it reloads some data and inserts new data, i.e. an upsert).

  • @jeanchindeko5477, a year ago

    Thanks for this great video. Is this similar to merge-on-read in Iceberg and Hudi?

  • @malebeauty, 10 months ago

    You're so cool

  • @NeumsFor9, a year ago

    Pretty soon we will be at the old SSAS .deleted store, and all those .store files 😂😂😂....