Fix RecordBatch::normalize() null bitmap bug and add StructArray::flatten() #9733

Open
sqd wants to merge 3 commits into apache:main from sqd:oss_normalize_bug
sqd commented Apr 15, 2026

RecordBatch::normalize() currently has a bug: the top-level struct's null bitmap is not propagated into the null bitmaps of the resulting normalized arrays. As a result, a child element can appear non-null even though the parent struct is null at that index. See the test in this change for a reproduction.

This change fixes that behavior. It also adds StructArray::flatten(), which mirrors arrow-cpp's semantics and propagates the parent struct's null bitmap into each child. The fixed RecordBatch::normalize() now uses StructArray::flatten() under the hood.
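
To illustrate the null propagation described above, here is a minimal sketch in plain Rust. It is a simplified model, not the arrow-rs API: `Column`, `StructColumn`, `flatten_ignoring_parent`, `flatten`, and `example` are hypothetical names, and validity is stored as `Vec<bool>` instead of Arrow's bit-packed null buffers. The idea it shows is that a flattened child slot should be valid only when both the child and the parent struct are valid at that index.

```rust
/// A nullable i32 column: values plus a validity bitmap (true = non-null).
/// Hypothetical stand-in for an Arrow array; not an arrow-rs type.
#[derive(Debug, Clone, PartialEq)]
struct Column {
    name: String,
    values: Vec<i32>,
    validity: Vec<bool>,
}

/// A struct column: child columns plus the struct's own validity bitmap.
/// Hypothetical stand-in for a struct array; not an arrow-rs type.
struct StructColumn {
    children: Vec<Column>,
    validity: Vec<bool>,
}

/// Models the bug: children are returned unchanged, so the parent's
/// nulls are lost and a child slot may look non-null.
fn flatten_ignoring_parent(s: &StructColumn) -> Vec<Column> {
    s.children.clone()
}

/// Models the fix: each child's validity is ANDed with the parent's,
/// so a slot is valid only if both the child and the parent are valid.
fn flatten(s: &StructColumn) -> Vec<Column> {
    s.children
        .iter()
        .map(|child| Column {
            name: child.name.clone(),
            values: child.values.clone(),
            validity: child
                .validity
                .iter()
                .zip(&s.validity)
                .map(|(c, p)| *c && *p)
                .collect(),
        })
        .collect()
}

/// Sample data: the parent struct is null at index 1, the child at index 2.
fn example() -> StructColumn {
    StructColumn {
        children: vec![Column {
            name: "a".to_string(),
            values: vec![1, 2, 3],
            validity: vec![true, true, false],
        }],
        validity: vec![true, false, true],
    }
}
```

With the sample data, `flatten_ignoring_parent(&example())` reports index 1 of child `a` as valid even though the parent is null there, while `flatten(&example())` marks both index 1 and index 2 as null. Values are left untouched in both cases; only the validity changes.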

Which issue does this PR close?

Are these changes tested?

Yes

Are there any user-facing changes?

No

github-actions bot added the arrow (Changes to the arrow crate) label Apr 15, 2026
sqd (Author) commented Apr 15, 2026

alamb (Contributor) commented Apr 15, 2026

@negli-me, since you added the normalize function, could you help review this PR?

It was added in

sqd (Author) commented Apr 18, 2026

I think this is the correct GitHub handle, without the first "e": @ngli-me

ngli-me (Contributor) left a comment

Appreciate the ping. I switched companies recently and got stuck in approval purgatory for outside work. Code looks good, and the tests had no issues for me. Thanks for the fix.

sqd (Author) commented Apr 19, 2026

Thank you, @ngli-me! This is ready for another look, @alamb.

Labels

arrow Changes to the arrow crate

Development

Successfully merging this pull request may close these issues.

RecordBatch::normalize() does not propagate top level null bitmap into the results