ConcurrentWriteSpec:
+ https://iceberg.apache.org/docs/latest/reliability/
Concurrent writes
- should cause one transaction to fail
+ Given two transactions trying to write data
Datum(0,label_0,0,2025-05-13,2025-05-13 16:37:59.688)
Datum(1,label_1,1,2025-05-12,2025-05-13 16:37:59.888)
Datum(2,label_2,2,2025-05-11,2025-05-13 16:38:00.088)
...
+ When both run at the same time
+ Then one fails with exception org.apache.iceberg.exceptions.AlreadyExistsException: Requirement failed: table already exists
+ And one succeeds
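The scenario above can be sketched as two writers racing to create the same table. This is a minimal, hypothetical sketch: the table name `polaris.my_namespace.ConcurrentWriteSpec` and the `Datum` shape come from the spec output, while the field names, the fixture size of 20, and the surrounding harness are illustrative assumptions.

```scala
import java.sql.{Date, Timestamp}
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

import org.apache.spark.sql.SparkSession

// Field names are assumptions inferred from the printed rows above.
case class Datum(id: Int, label: String, partitionKey: Int, date: Date, ts: Timestamp)

object ConcurrentWriteSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().getOrCreate()
    import spark.implicits._

    val tableName = "polaris.my_namespace.ConcurrentWriteSpec"
    val now       = System.currentTimeMillis()
    val data = (0 until 20).map { i =>
      Datum(i, s"label_$i", i, new Date(now - i * 86400000L), new Timestamp(now + i * 200L))
    }

    // Both transactions try to create-and-write the same table.
    def write(): Unit = data.toDF().writeTo(tableName).create()

    val attempts = Seq(Future(Try(write())), Future(Try(write())))
    val results  = Await.result(Future.sequence(attempts), 2.minutes)

    // Expect exactly one success; the loser surfaces an Iceberg exception
    // (AlreadyExistsException in the run logged above).
    results.foreach {
      case Success(_) => println("write succeeded")
      case Failure(e) => println(s"write failed: ${e.getClass.getName}")
    }
  }
}
```

Iceberg's optimistic concurrency means both writers prepare their files independently; only the metadata commit is serialized, so the loser fails at commit time rather than blocking.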
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Data integrity
- should be intact after a failure
+ Given the table 'polaris.my_namespace.ConcurrentWriteSpec' has had a failed write
+ When we call spark.read.table("polaris.my_namespace.ConcurrentWriteSpec")
+ Then the table still contains 20 records
+ And failed files are left behind
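The integrity check above amounts to a reader-side count: Iceberg readers only see committed snapshots, so the failed transaction's files are invisible to queries even though they remain on disk. A sketch, assuming an active `SparkSession` wired to the same `polaris` catalog:

```scala
// Read back the table after the failed commit. The name is taken from the
// spec output; the expected count of 20 matches the test's fixture size.
val df = spark.read.table("polaris.my_namespace.ConcurrentWriteSpec")

// The failed writer's data files are not referenced by any committed
// snapshot, so they do not inflate the query result.
assert(df.count() == 20L)
```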
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Orphans
- should be deleted
+ Given there are 40 raw rows in the directory when there should be 20
+ When we use the Java API to call deleteOrphanFiles on anything older than now
+ Then old files are deleted and the raw row count is now 20
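The cleanup step uses Iceberg's actions API. A sketch of the call, assuming a `SparkSession` with the Iceberg runtime on the classpath; `Spark3Util.loadIcebergTable` is one way to obtain the underlying `Table` handle, and `olderThan(now)` matches the spec's "anything older than now" (the action's default is to keep files newer than 3 days, so the threshold must be overridden for a test like this):

```scala
import scala.jdk.CollectionConverters._

import org.apache.iceberg.spark.Spark3Util
import org.apache.iceberg.spark.actions.SparkActions

// Load the Iceberg Table behind the Spark catalog identifier.
val table = Spark3Util.loadIcebergTable(spark, "polaris.my_namespace.ConcurrentWriteSpec")

// Delete files in the table location that no snapshot references.
val result = SparkActions
  .get(spark)
  .deleteOrphanFiles(table)
  .olderThan(System.currentTimeMillis()) // override the 3-day default
  .execute()

println(s"deleted ${result.orphanFileLocations().asScala.size} orphan files")
```

After this runs, re-reading the data files directly (the "raw rows" in the spec) drops from 40 to 20, while the table's query results are unchanged, since orphans were never part of a committed snapshot.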
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +