Spark way of creating iceberg tables:
```
CREATE TABLE db1.table1 (
  order_id int,
  ...
  order_ts TIMESTAMP
)
USING iceberg
PARTITIONED BY (hour(order_ts))
```
The USING clause indicates the type of table to create; here it is iceberg.
The PARTITIONED BY clause uses Iceberg's hidden partitioning feature. It uses an existing data column to create hourly partitions, so any query that filters on order_ts can benefit from the partitioning. Also note that we haven't created a new column named hour.
Here, hour is a transform function in Iceberg. There are many others, such as month, day, truncate, etc.
Creating a table with the above query creates the first version of the metadata file (v1.metadata.json). Note that the metadata file is a JSON file.
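A minimal sketch of how that plays out at query time, assuming the table has only the two columns shown above and using made-up rows and timestamps: a plain filter on order_ts is enough for Iceberg to prune hourly partitions, without the query ever referencing an hour column.
```
-- Hypothetical rows; Iceberg derives hour(order_ts) for partitioning internally.
INSERT INTO db1.table1 VALUES
  (1, TIMESTAMP '2023-05-01 10:15:00'),
  (2, TIMESTAMP '2023-05-01 14:40:00');

-- Filtering on the raw timestamp column lets Iceberg skip files
-- belonging to non-matching hourly partitions.
SELECT order_id
FROM db1.table1
WHERE order_ts >= TIMESTAMP '2023-05-01 10:00:00'
  AND order_ts <  TIMESTAMP '2023-05-01 11:00:00';
```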
@Dremio What happens when there are 100s or 1000s of versions of a table, and thereby 100s or 1000s of copies of data, since we are not deleting previous versions' data files? Won't the storage required to keep all those data files explode? How do you handle it practically in real-life scenarios? Do we just maintain the last 10 versions or something like that?
It depends on your needs; you should have clear data retention policies so you can expire snapshots predictably. Dremio can automate this expiration with our data lakehouse management features.
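For reference, open-source Iceberg also exposes snapshot expiration as a Spark stored procedure; a hedged sketch, assuming a catalog named my_catalog and an arbitrary cutoff:
```
-- Expire snapshots older than the cutoff while keeping at least the last 10;
-- data and manifest files no longer referenced by any snapshot get deleted.
CALL my_catalog.system.expire_snapshots(
  table => 'db1.table1',
  older_than => TIMESTAMP '2023-05-01 00:00:00',
  retain_last => 10
);
```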
If each newly created metadata file contains the entire snapshot history, then why create a new metadata file on every operation? Why not keep the same file and just read the latest snapshot information within it? Before writing, we could back the file up under a different name and then overwrite the metadata file in place. That way there would be only two metadata files, one current and one backup, differing by a single snapshot. We could avoid creating a new metadata file on every data write, and only one metadata file would be created at table creation. I am definitely missing a core design decision here, otherwise the creators of Iceberg would already have thought of this.
The spec assumes file immutability, so files are never updated, only created; this allows consistent assumptions about what's in the files. When using a Nessie catalog this is important because, if I roll back the catalog, it will then refer to the older metadata.json files based on the rollback timestamp.
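Related, though distinct from the catalog-level Nessie rollback mentioned above: plain Iceberg also lets you roll a single table back to an older snapshot with a Spark procedure, which simply makes an older snapshot current again in newly written metadata. A sketch with an assumed catalog name and a made-up snapshot id:
```
-- Roll db1.table1 back to an earlier snapshot; the id here is hypothetical
-- and would normally be taken from the table's snapshot history.
CALL my_catalog.system.rollback_to_snapshot('db1.table1', 8744736658442914487);
```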
@Dremio Thank you. Immutability is what I was missing. So everything is file based. And I think Iceberg is hard-coded to look into the latest snapshot of any given metadata.json file even though the file contains older snapshots. So moving through snapshots amounts to moving through multiple metadata.json files rather than moving through multiple snapshots within a single file.
@electronicsacademy2D2E29 It always looks into the most current metadata file (the one the catalog pointer references) to get you any older snapshot. The old metadata files are essentially redundant.
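To make that concrete, here is a hedged sketch of how this looks from Spark SQL with a recent Iceberg/Spark combination (the snapshot id and timestamp are made up): the snapshot history is read from the table's current metadata via a metadata table, and time travel is expressed against the table itself.
```
-- List the snapshots recorded in the table's current metadata.
SELECT committed_at, snapshot_id, operation
FROM db1.table1.snapshots;

-- Time travel: read the table as of an older snapshot id or timestamp.
SELECT * FROM db1.table1 VERSION AS OF 8744736658442914487;
SELECT * FROM db1.table1 TIMESTAMP AS OF '2023-05-01 00:00:00';
```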
Excellent explanation, thank you!
Glad it was helpful!