IcebergCRUDSpec:
A dataset we CRUD
- should create the appropriate Iceberg files
  + Given data
      Datum(0,label_0,0,2025-03-28,2025-03-28 13:59:34.403)
      Datum(1,label_1,1,2025-03-27,2025-03-28 13:59:34.603)
      Datum(2,label_2,2,2025-03-26,2025-03-28 13:59:34.803)
      ...
  + When writing to table 'polaris.my_namespace.IcebergCRUDSpec'
  + Then reading the table back yields the same data
  + And there is no mention of the table in the metastore
- should support updates with 'update polaris.my_namespace.IcebergCRUDSpec set label='ipse locum''
  + Given SQL
      update polaris.my_namespace.IcebergCRUDSpec set label = 'ipse locum'
  + When we execute it
  + Then all rows are updated
  + And look like:
      Datum(10,ipse locum,0,2025-03-18,2025-03-28 13:59:36.403)
      Datum(11,ipse locum,1,2025-03-17,2025-03-28 13:59:36.603)
      Datum(12,ipse locum,2,2025-03-16,2025-03-28 13:59:36.803)
      ...
- should be able to have its schema updated
  + Given SQL
      ALTER TABLE polaris.my_namespace.IcebergCRUDSpec ADD COLUMNS (new_string string comment 'new_string docs')
  + When we execute it
  + Then all rows are updated
  + And look like:
      [10,ipse locum,0,2025-03-18,2025-03-28 13:59:36.403,null]
      [11,ipse locum,1,2025-03-17,2025-03-28 13:59:36.603,null]
      [12,ipse locum,2,2025-03-16,2025-03-28 13:59:36.803,null]
      ...
- should be able to have rows removed
  + Given SQL
      DELETE FROM polaris.my_namespace.IcebergCRUDSpec where id < 10
  + And the parquet files:
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00003-790-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00001-788-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00000-793-44952364-c6a4-4165-91a6-6ef5780498cc-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00000-787-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00000-787-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00000-793-44952364-c6a4-4165-91a6-6ef5780498cc-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00002-789-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00002-789-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00001-788-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00003-790-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet
  + When we execute it
  + Then there are no longer any rows with ID < 10
  + And no files are deleted but there are the following new parquet files:
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00000-799-184ae093-44d7-4ed8-931f-6be32d4ee2a5-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00000-799-184ae093-44d7-4ed8-931f-6be32d4ee2a5-0-00001.parquet
  + And those new files contain just the data with id >= 10
- should be able to have its history queried
  + Given a table that has seen changes
  + When we execute:
      select * from polaris.my_namespace.IcebergCRUDSpec.history
  + Then we see entries in the history table thus:
      +-----------------------+-------------------+-------------------+-------------------+
      |made_current_at        |snapshot_id        |parent_id          |is_current_ancestor|
      +-----------------------+-------------------+-------------------+-------------------+
      |2025-03-28 13:59:47.959|4387101536688183765|NULL               |true               |
      |2025-03-28 13:59:48.184|2064186678789202043|4387101536688183765|true               |
      |2025-03-28 13:59:48.662|6645494646071078980|2064186678789202043|true               |
      +-----------------------+-------------------+-------------------+-------------------+
  + And we can view a snapshot image with SQL:
      select * from polaris.my_namespace.IcebergCRUDSpec VERSION AS OF 4387101536688183765
      +---+--------+------------+----------+-----------------------+
      |id |label   |partitionKey|date      |timestamp              |
      +---+--------+------------+----------+-----------------------+
      |0  |label_0 |0           |2025-03-28|2025-03-28 13:59:34.403|
      |1  |label_1 |1           |2025-03-27|2025-03-28 13:59:34.603|
      |2  |label_2 |2           |2025-03-26|2025-03-28 13:59:34.803|
      |3  |label_3 |3           |2025-03-25|2025-03-28 13:59:35.003|
      |4  |label_4 |4           |2025-03-24|2025-03-28 13:59:35.203|
      |5  |label_5 |0           |2025-03-23|2025-03-28 13:59:35.403|
      |6  |label_6 |1           |2025-03-22|2025-03-28 13:59:35.603|
      |7  |label_7 |2           |2025-03-21|2025-03-28 13:59:35.803|
      |8  |label_8 |3           |2025-03-20|2025-03-28 13:59:36.003|
      |9  |label_9 |4           |2025-03-19|2025-03-28 13:59:36.203|
      |10 |label_10|0           |2025-03-18|2025-03-28 13:59:36.403|
      |11 |label_11|1           |2025-03-17|2025-03-28 13:59:36.603|
      |12 |label_12|2           |2025-03-16|2025-03-28 13:59:36.803|
      |13 |label_13|3           |2025-03-15|2025-03-28 13:59:37.003|
      |14 |label_14|4           |2025-03-14|2025-03-28 13:59:37.203|
      |15 |label_15|0           |2025-03-13|2025-03-28 13:59:37.403|
      |16 |label_16|1           |2025-03-12|2025-03-28 13:59:37.603|
      |17 |label_17|2           |2025-03-11|2025-03-28 13:59:37.803|
      |18 |label_18|3           |2025-03-10|2025-03-28 13:59:38.003|
      |19 |label_19|4           |2025-03-09|2025-03-28 13:59:38.203|
      +---+--------+------------+----------+-----------------------+
- should, when vacuumed, have old files removed
  + Given the 12 files are:
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00003-790-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00000-793-44952364-c6a4-4165-91a6-6ef5780498cc-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00002-789-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00000-787-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00001-788-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00000-799-184ae093-44d7-4ed8-931f-6be32d4ee2a5-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00000-787-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00003-790-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00000-793-44952364-c6a4-4165-91a6-6ef5780498cc-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00001-788-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00000-799-184ae093-44d7-4ed8-931f-6be32d4ee2a5-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00002-789-f2848f65-038f-4ea2-94a8-b84ba52b8227-0-00001.parquet.crc
  + When we call expireSnapshot
  + Then the original 12 files now look like:
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00000-799-184ae093-44d7-4ed8-931f-6be32d4ee2a5-0-00001.parquet
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00000-799-184ae093-44d7-4ed8-931f-6be32d4ee2a5-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/.00000-793-44952364-c6a4-4165-91a6-6ef5780498cc-0-00001.parquet.crc
      /tmp/polaris/my_namespace/IcebergCRUDSpec/data/00000-793-44952364-c6a4-4165-91a6-6ef5780498cc-0-00001.parquet
  + And there are 8 new files
- should delete all files when dropped
  + Given polaris.my_namespace.IcebergCRUDSpec has 2
  + When we execute:
      DROP TABLE polaris.my_namespace.IcebergCRUDSpec PURGE
  + Then there are no files
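The vacuum scenario's file listings can be compared mechanically. The following is a minimal Python sketch, not part of the spec itself, that rebuilds the 12 file names listed before expireSnapshot and the 4 listed afterwards, then counts what disappeared. The helper names (`names`, `write_id`, `removed_by_write`) are hypothetical, and reading the UUID-shaped field of each file name as a per-write identifier is an assumption drawn only from the names in the log.

```python
import re
from collections import Counter

# UUID-shaped fields copied from the file names in the log above.
# Assumption: each one identifies a single write to the table.
W1 = "f2848f65-038f-4ea2-94a8-b84ba52b8227"
W2 = "44952364-c6a4-4165-91a6-6ef5780498cc"
W3 = "184ae093-44d7-4ed8-931f-6be32d4ee2a5"


def names(part: str, task: str, uuid: str) -> set:
    """Return a data file's base name plus its hidden .crc checksum sibling."""
    base = f"{part}-{task}-{uuid}-0-00001.parquet"
    return {base, f".{base}.crc"}


# The 12 files listed before expireSnapshot was called.
BEFORE = (
    names("00000", "787", W1) | names("00001", "788", W1)
    | names("00002", "789", W1) | names("00003", "790", W1)
    | names("00000", "793", W2) | names("00000", "799", W3)
)

# The 4 files listed afterwards.
AFTER = names("00000", "793", W2) | names("00000", "799", W3)

UUID_RE = re.compile(
    r"^\.?\d{5}-\d+-([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})-"
)


def write_id(name: str) -> str:
    """Extract the UUID-shaped field from a data or checksum file name."""
    match = UUID_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return match.group(1)


def removed_by_write(before: set, after: set) -> Counter:
    """Count files present before but absent after, keyed by write UUID."""
    return Counter(write_id(name) for name in before - after)


if __name__ == "__main__":
    print(len(BEFORE - AFTER), "files removed")
    print(dict(removed_by_write(BEFORE, AFTER)))
```

On these listings, every file that disappears carries the same UUID field (f2848f65-...), and the files carrying the other two UUIDs are the ones that remain.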