fix: allow missing sequence-number in v2 snapshots for v1-upgraded tables #2127
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions. |
10eb184 to 734dc8f
|
@CTTY could I get a rereview on this please? |
|
cc @blackmwk to take a look |
|
@jembishop How did you get a table into this state? Is it abnormal? If you upgraded the table from V1 to V2, wouldn't the query engine or operation replace the metadata.json with a V2-format metadata.json, which should have set the sequence numbers to 0 anyway? Can you share which query engine you used to perform the table format upgrade, and how? I found this in the spec (https://iceberg.apache.org/spec/#writer-requirements): "Readers may be more strict for metadata JSON files because the JSON files are not reused and will always match the table version. Required fields that were not present in or were optional in prior versions may be handled as required fields. For example, a v2 table that is missing last-sequence-number can throw an exception." (Apologies if I'm missing something, I'm pretty new to the Iceberg space at the moment.) |
Which issue does this PR close?
I didn't open an issue, sorry. This is a very small change.
What changes are included in this PR?
After upgrading a table to v2, I got errors that the sequence-number field does not exist in its snapshots, so I added a default for the sequence number on this struct.
I think this should be OK, since a missing sequence number is treated as 0 for Iceberg v1 in other v2-compatibility contexts, but I would like some confirmation.
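For reviewers, here is a minimal sketch of the defaulting behavior described above. It is not the actual diff: `RawSnapshotV2`, `Snapshot`, and `INITIAL_SEQUENCE_NUMBER` are hypothetical names used only for illustration, and the sketch assumes that falling back to 0 is the intended behavior for snapshots carried over from v1. With serde, the same effect would presumably come from marking the field with `#[serde(default)]`.

```rust
/// Sequence number assumed for snapshots that predate the v1 -> v2 upgrade.
/// (Assumption: 0, matching how v1 data is treated elsewhere.)
const INITIAL_SEQUENCE_NUMBER: i64 = 0;

/// Hypothetical raw form of a v2 snapshot as parsed from metadata JSON.
/// `sequence_number` is optional because snapshots written under v1
/// never recorded it.
#[derive(Debug, Clone, PartialEq)]
struct RawSnapshotV2 {
    snapshot_id: i64,
    sequence_number: Option<i64>,
}

/// Hypothetical in-memory snapshot, where the sequence number is always set.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    snapshot_id: i64,
    sequence_number: i64,
}

impl From<RawSnapshotV2> for Snapshot {
    fn from(raw: RawSnapshotV2) -> Self {
        Snapshot {
            snapshot_id: raw.snapshot_id,
            // Fall back to the initial sequence number instead of failing
            // when the field is absent.
            sequence_number: raw.sequence_number.unwrap_or(INITIAL_SEQUENCE_NUMBER),
        }
    }
}
```

Under these assumptions, a snapshot parsed without a sequence number converts to one with sequence number 0, while a snapshot that does carry the field keeps its value unchanged.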
Are these changes tested?
Tested that it fixes my problem, yes.