Fix bitwise_unary_op#6940
Conversation
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
|
Ah, that's why you can't just change this. I made this change in #6880 which means that binary op needs adjustment as well. I don't see how the mut op would be faster though |
Merging this PR will degrade performance by 12.27%
Performance Changes
Comparing Footnotes
|
|
align_to is not the problem here though? It's that the padding is wrong so offset and length are wrong |
|
I thought that align to caused the padding to be wrong in this case. Anyways I don't see much reason to not just delegate to the _mut version, the regression on codspeed is likely because it was just incorrect before. |
|
align_to just splits the buffer (it's a const function). But the padding is clearly not preserved for any buffer in 17-32 range potentially. Anyway I agree with your fix. We should make sure to fix the binary op as well since we're here |
|
what do you mean by fix the binary op? |
robert3005
left a comment
There was a problem hiding this comment.
I can follow up with the fix to binary op
|
Ok, binary op is not a problem since it's only ever used for aligned arrays (i.e. 8 byte aligned) but we really only need 1 byte alignment |
| let result = Buffer::<u64>::from_trusted_len_iter(iter).into_byte_buffer(); | ||
|
|
||
| BitBuffer::new_with_offset(result, buffer.len(), buffer.offset()) | ||
| let mut buf = buffer.clone().into_mut(); |
There was a problem hiding this comment.
@robert3005 we should kill the into_mut() functions. If we cannot into_mut, then we do a memcopy for no reason since we're about to overwrite the data anyway.
We should just have try_into_mut and the caller can either do in-place, or allocate a new buffer and compute directly into that.
Summary
Closes: #6895 (which might seem completely unrelated but is relevant because of an incorrect mask)
Fixes a bug in
bitwise_unary_opwhere the it used Arrow'sUnalignedBitChunkiterator, which for buffers larger than 16 bytes (128 bits) usesalign_to::<u64>()and introduces lead padding zeros when the byte pointer isn't u64-aligned. After applying the operation (e.g. NOT), those padding bits were written into a fresh, aligned output buffer at real data positions, corrupting the first padding bits of the resultThis change just delegates to the correct
mutimplementation which is likely also faster.Testing
Adds a simple regression test.