Arrow: Fix NPE reading a constant/null column in vectorized reader#16871
thswlsqls wants to merge 1 commit into
When a column is added after data files were written, the Arrow vectorized reader produces a constant holder with a null vector. getPlainVectorAccessor then called vector.getClass() and threw a confusing NullPointerException. Guard the null vector so the message is "Unsupported vector: null", matching the null-safe pattern already used by the default branch in the same class. Revives the stale fix from apache#10284 (closed by the stale bot, not by design) and adds the reproduction test the reviewer requested.
Generated-by: Claude Code
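For readers who want to see the shape of the guard described above, here is a minimal, self-contained sketch. It is not the Iceberg code: NullVectorGuardSketch, IntVectorStub, describeUnguarded, and describeGuarded are hypothetical stand-ins, and the real getPlainVectorAccessor dispatches over Arrow vector types that are not modelled here. The sketch only illustrates why an unconditional vector.getClass() fails on a null vector and how a null check yields the "Unsupported vector: null" message instead.

```java
// Hypothetical illustration only; none of these names exist in Iceberg or Arrow.
final class NullVectorGuardSketch {

  /** Stand-in for a vector type the accessor factory knows how to read. */
  static final class IntVectorStub {}

  /** Before: the fallback dereferences the vector, so a null vector raises an NPE. */
  static String describeUnguarded(Object vector) {
    if (vector instanceof IntVectorStub) {
      return "int accessor";
    }
    // vector.getClass() throws NullPointerException when vector is null.
    throw new UnsupportedOperationException("Unsupported vector: " + vector.getClass());
  }

  /** After: the fallback checks for null before touching the vector. */
  static String describeGuarded(Object vector) {
    if (vector instanceof IntVectorStub) {
      return "int accessor";
    }
    String description = vector == null ? "null" : vector.getClass().toString();
    throw new UnsupportedOperationException("Unsupported vector: " + description);
  }

  public static void main(String[] args) {
    try {
      describeGuarded(null);
    } catch (UnsupportedOperationException e) {
      // The read still fails, but with a message that names the problem.
      System.out.println(e.getMessage());
    }
  }
}
```

As in the PR, the guarded variant does not make the read succeed; it only changes which exception surfaces and what the message says.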
singhpk234 left a comment:
So is it just throwing an UnsupportedOperationException instead of an NPE? We should update the PR description accordingly, because the fix does not make the reads succeed; it only surfaces a different exception. Is the workaround then to disable vectorized reads?
You're right: this doesn't make the read succeed; it replaces the confusing NPE (…
Closes #10275
Summary
- When a column is added with ALTER TABLE ... ADD COLUMN after data files were written, the Arrow vectorized reader produces a constant VectorHolder whose vector is null.
- GenericArrowVectorAccessorFactory.getPlainVectorAccessor called vector.getClass() on that null vector, throwing a confusing NullPointerException.
- Guard the null vector so the exception message is "Unsupported vector: null", matching the null-safe pattern already used by the default branch in the same class (line 178).
Testing done
- TestArrowReader#testReadAddedColumnFailsWithClearMessage: writes one row, adds a column via updateSchema().addColumn, scans with VectorizedTableScanIterable, and asserts UnsupportedOperationException with message "Unsupported vector: null".
- Fails on main (NPE on vector.getClass()) and passes with the fix.
- ./gradlew :iceberg-arrow:check: passed (30 tests, JDK 21).
AI Disclosure