feat[gpu]: arrow device array stream support #8483
Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
This reverts commit 52952d2. Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
@claude review in depth
Claude finished @0ax1's task in 3m 22s.

Claude Code Review: Arrow Device Array Stream support
Overall this is a careful, well-documented implementation of the Arrow C Device stream interface on top of the existing single-array device export. The unsafe FFI surface is handled with discipline: panics are caught at every callback boundary and double-release is guarded.
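The two FFI properties named in the review can be illustrated with a minimal sketch. This is not the PR's code and not the real `ArrowDeviceArrayStream` layout: `DemoStream`, `Private`, `new_demo_stream`, `demo_get_next`, `demo_release`, and the error constants are hypothetical names chosen for illustration, and the stream yields plain `i64` values instead of device arrays. It assumes the default `panic = "unwind"` strategy, since `catch_unwind` cannot intercept aborting panics.

```rust
use std::os::raw::{c_int, c_void};
use std::panic::{catch_unwind, AssertUnwindSafe};

// Hypothetical errno-style codes returned across the C boundary.
pub const EIO: c_int = 5;
pub const EINVAL: c_int = 22;

/// Hypothetical, simplified stand-in for a C stream struct with callbacks.
#[repr(C)]
pub struct DemoStream {
    pub get_next: Option<unsafe extern "C" fn(*mut DemoStream, *mut i64) -> c_int>,
    pub release: Option<unsafe extern "C" fn(*mut DemoStream)>,
    pub private_data: *mut c_void,
}

/// Producer-side state hidden behind `private_data`.
struct Private {
    producer: Box<dyn FnMut() -> i64>,
}

pub fn new_demo_stream(producer: Box<dyn FnMut() -> i64>) -> DemoStream {
    DemoStream {
        get_next: Some(demo_get_next),
        release: Some(demo_release),
        private_data: Box::into_raw(Box::new(Private { producer })) as *mut c_void,
    }
}

pub unsafe extern "C" fn demo_get_next(stream: *mut DemoStream, out: *mut i64) -> c_int {
    // Catch any panic so it never unwinds into the foreign caller.
    let result = catch_unwind(AssertUnwindSafe(|| unsafe {
        if stream.is_null() || out.is_null() || (*stream).private_data.is_null() {
            return EINVAL;
        }
        let private = &mut *((*stream).private_data as *mut Private);
        *out = (private.producer)();
        0
    }));
    // A caught panic is reported as an error code instead.
    result.unwrap_or(EIO)
}

pub unsafe extern "C" fn demo_release(stream: *mut DemoStream) {
    let _ = catch_unwind(AssertUnwindSafe(|| unsafe {
        if stream.is_null() {
            return;
        }
        let s = &mut *stream;
        // Double-release guard: a released struct has `release == None`.
        if s.release.is_none() {
            return;
        }
        // Reclaim the boxed state exactly once, then mark the struct released.
        drop(Box::from_raw(s.private_data as *mut Private));
        s.private_data = std::ptr::null_mut();
        s.get_next = None;
        s.release = None;
    }));
}
```

In this sketch a panicking producer surfaces to the caller as `EIO`, and calling `demo_release` a second time returns early because the callback slot was already cleared.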
Merging this PR will improve performance by 15.69%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 177.7 µs | 213.9 µs | -16.94% |
| ⚡ | Simulation | take_10k_random | 255.8 µs | 197.8 µs | +29.27% |
| ⚡ | Simulation | take_10k_contiguous | 276.3 µs | 218.5 µs | +26.46% |
| ⚡ | Simulation | patched_take_10k_contiguous_patches | 291 µs | 232.3 µs | +25.26% |
| ⚡ | Simulation | patched_take_10k_random | 303 µs | 244.2 µs | +24.07% |
| ⚡ | WallTime | cuda/bitpacked_u8/unpack/3bw[100M] | 352 µs | 299.4 µs | +17.58% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] | 215.3 ns | 186.1 ns | +15.67% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[1024] | 275.6 ns | 246.4 ns | +11.84% |
Tip: Investigate this regression by commenting `@codspeedbot fix this regression` on this PR, or use the CodSpeed MCP directly with your agent.
Comparing ad/arrow-device-array-stream (b2b0cbd) with develop (85aad72)
Footnotes
1. 11 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.