feat: add CUDA cuDF convenience API by 0ax1 · Pull Request #8624 · vortex-data/vortex

0ax1 · 2026-06-29T16:29:26Z

See the README.md added as part of this PR how to use pyVortex CUDA.

codspeed-hq · 2026-06-29T16:49:28Z

Merging this PR will not alter performance

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 2 improved benchmarks
❌ 3 regressed benchmarks
✅ 1590 untouched benchmarks
⏩ 4 skipped benchmarks¹

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
❌	Simulation	`chunked_varbinview_opt_canonical_into[(1000, 10)]`	169.6 µs	206.2 µs	-17.75%
❌	Simulation	`chunked_varbinview_into_canonical[(1000, 10)]`	169.3 µs	205.6 µs	-17.67%
❌	Simulation	`chunked_varbinview_opt_into_canonical[(1000, 10)]`	183.4 µs	219.7 µs	-16.52%
⚡	Simulation	`chunked_varbinview_into_canonical[(100, 100)]`	306.4 µs	272.1 µs	+12.61%
⚡	Simulation	`bitwise_not_vortex_buffer_mut[128]`	273.6 ns	244.4 ns	+11.93%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

_{Comparing ad/pycudf4 (bbfb83a) with develop (0a45777)}

4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

github-actions · 2026-07-01T15:06:10Z

Claude finished @0ax1's task in 3m 24s —— View job

Claude PR Review — CUDA cuDF convenience API

I reviewed the full diff (c913efa) against develop. Overall this is a careful, well-tested change: the validity-padding rework is conservative (it proves a tail is zeroed before reusing a device bitmap, and copies/repacks otherwise), the FFI capsule lifecycle is memory-safe, and the Python surface is small and documented. No blocking issues found. Notes below.

Correctness & safety — looks good

Validity tail-padding reuse (export_arrow_validity_bitmap / has_zeroed_tail_padding) is sound. The reuse path only returns the existing bitmap when zeroed_tail_start <= offset + logical_len and offset + padded_len <= allocation_len, so the whole [logical_end, offset+padded_len) window is provably inside the zeroed region. When the tail isn't tracked (zeroed_tail_start == None) or offsets differ, it falls back to copy_arrow_validity_buffer / repack_arrow_validity_buffer. That conservative default is the right call.
zeroed_tail_start is allocation-relative and is correctly carried through slice() / with_alignment() / cuda_backing_allocation unchanged, which matches offset also being allocation-relative. Consistent.
FFI capsule release is idempotent. release_schema / release_device_array clear the release callback, and release_*_capsule destructors (via Box::from_raw + capsule_pointer_with_name_or_used) run exactly once and see a cleared callback after _debug_consume_..., so no double-free of device buffers. The "used_" capsule renaming + fallback lookup is a nice touch.
setattr(vortex.Array, ...) works because Array is a PyO3 heap type (#[pyclass(..., subclass, frozen)]); frozen only affects Rust-side interior mutability, not class-level attribute assignment. The install is idempotent on re-import.

Observations / suggestions

Padding constant bumped 4 → 64 bytes affects all copy_to_device allocations, not just validity. padded_device_allocation_len now rounds every host→device copy up to a multiple of CUDF_VALIDITY_BUFFER_PADDING. For large buffers this is negligible, but workloads with many tiny buffers now pay up to 16× the previous minimum tail. This looks intentional (uniform cuDF-safe padding avoids special-casing validity), but the constant name CUDF_VALIDITY_BUFFER_PADDING undersells that it's now the general device-copy padding. Consider a more general name or a comment on padded_device_allocation_len noting the trade-off for small allocations.
to_cudf(..., fallback=...) is speculative generality. fallback only accepts "error" and otherwise raises NotImplementedError; it's a reserved-for-future no-op parameter. It's documented and harmless, but it's untested behavior surface (YAGNI). Fine to keep given the README documents it explicitly.
Struct column renaming assumes positional name alignment. In to_cudf, dataframe.columns = dtype.names() relies on pylibcudf.Table.from_arrow preserving field order 1:1 with StructDType.names(). That holds for Arrow, but it silently overwrites whatever names cuDF produced — if a future export ever reordered/flattened fields this would mislabel columns. A brief comment noting the order invariant would help the next reader.
Global side effect on import. Importing vortex_cuda mutates the third-party vortex.Array class. This is the documented design and is idempotent, but it's worth being aware that any import of vortex_cuda (even transitively) installs to_cudf/__arrow_c_device_array__ process-wide.

Test coverage — strong

New Rust tests cover matching-offset padding, padded-bitmap reuse, repack-without-tail-padding, and empty-array omission, and assert the backing allocation size + zeroed tail. Good edge coverage.
Python tests exercise the pure dispatch logic with fakes (no GPU needed) plus a real-cuDF smoke test gated on cuda_available(), including a lifetime/gc.collect() check that a derived Series outlives its parent DataFrame — nice validation of the zero-copy ownership claim.
The nullable-struct rejection and CUDA-unavailable paths are verified to not import cuDF, which keeps the fast-fail contract honest.

Nits

stream.rs: the new module-level comment block (lines ~26–30) sits detached above struct VortexCudaStream with a blank line between; reads slightly orphaned from what it documents.
test_native_bridge.py: a few assertion lines exceed the repo's usual width; confirm ruff/formatting passes (uv run ruff check on the changed test files).

I did not run builds/tests (no CUDA device in this environment and the change is GPU-centric); please ensure cargo clippy --all-targets --all-features -p vortex-cuda, cargo +nightly fmt --check, and the vortex-python-cuda pytest suite were run locally per CLAUDE.md.
· ad/pycudf4

Add vortex_cuda.to_cudf and install optional Vortex Array helpers for cuDF conversion and Arrow C Device export. Keep the conversion path CUDA-only by rejecting unsupported fallback policies and routing cuDF ingestion through fresh Arrow C Device capsules. Expand CUDA Python tests for the convenience API, installed Array methods, Arrow Device export smoke coverage, and capsule ownership paths. Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>

0ax1 added the changelog/chore A trivial change label Jun 29, 2026

0ax1 force-pushed the ad/pycudf4 branch 2 times, most recently from 93c8619 to 8ebaae5 Compare June 29, 2026 16:33

vortex-data deleted a comment from github-actions Bot Jun 29, 2026

0ax1 force-pushed the ad/pycudf4 branch 3 times, most recently from 71f8a1d to b7a5b32 Compare June 30, 2026 12:39

0ax1 changed the title ~~chore: test CUDA Arrow Device capsule exports~~ feat: add CUDA cuDF convenience API Jun 30, 2026

0ax1 force-pushed the ad/pycudf4 branch 2 times, most recently from 5e24a18 to e55b447 Compare June 30, 2026 15:28

0ax1 marked this pull request as ready for review June 30, 2026 15:30

0ax1 requested review from a team and robert3005 June 30, 2026 15:30

0ax1 enabled auto-merge (squash) June 30, 2026 15:39

0ax1 requested a review from AdamGS June 30, 2026 15:53

0ax1 marked this pull request as draft July 1, 2026 09:57

auto-merge was automatically disabled July 1, 2026 09:57
Pull request was converted to draft

0ax1 force-pushed the ad/pycudf4 branch from e55b447 to 2944fe1 Compare July 1, 2026 14:24

vortex-data deleted a comment from github-actions Bot Jul 1, 2026

0ax1 force-pushed the ad/pycudf4 branch 2 times, most recently from da401e4 to c913efa Compare July 1, 2026 14:57

vortex-data deleted a comment from github-actions Bot Jul 1, 2026

0ax1 force-pushed the ad/pycudf4 branch from c913efa to bbfb83a Compare July 1, 2026 15:20

0ax1 marked this pull request as ready for review July 1, 2026 15:20

0ax1 enabled auto-merge (squash) July 1, 2026 15:20

AdamGS approved these changes Jul 1, 2026

View reviewed changes

0ax1 merged commit 890704f into develop Jul 1, 2026
72 of 73 checks passed

0ax1 deleted the ad/pycudf4 branch July 1, 2026 16:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add CUDA cuDF convenience API#8624

feat: add CUDA cuDF convenience API#8624
0ax1 merged 1 commit into
developfrom
ad/pycudf4

0ax1 commented Jun 29, 2026 •

edited

Loading

Uh oh!

codspeed-hq Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jul 1, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

0ax1 commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codspeed-hq Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Performance Changes

Footnotes

Uh oh!

github-actions Bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude PR Review — CUDA cuDF convenience API

Correctness & safety — looks good

Observations / suggestions

Test coverage — strong

Nits

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

0ax1 commented Jun 29, 2026 •

edited

Loading

codspeed-hq Bot commented Jun 29, 2026 •

edited

Loading

github-actions Bot commented Jul 1, 2026 •

edited

Loading