* [STF] Add C bindings for the places layer
Extends the experimental STF C API to mirror the C++ places layer:
- green_context_helper (create/destroy/count/device id) and green-context
exec_place / data_place factories (CUDA 12.4+).
- exec_place scope enter/exit (RAII context activation), affine data_place
accessor, and grid sub-place accessor (get_place).
- data_place stream-ordered allocate/deallocate and an
allocation_is_stream_ordered query, plus machine_init.
- task grid accessors: get_grid_dims and get_custream_at_index.
Adds coverage in test_places.cpp. Extracted from the python-bindings PR
to keep that change reviewable.
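
The scope enter/exit pair above can be pictured as a C handle wrapping a C++ RAII guard. The following is a minimal sketch, not the actual STF API: `example_scope`, `example_scope_enter`, `example_scope_exit`, and the depth counter are hypothetical names standing in for the real context activation.

```cpp
#include <cstddef>
#include <new>

// Hypothetical activation counter standing in for "the context is active".
static int g_example_depth = 0;

// Hypothetical RAII guard: activates on construction, restores on destruction.
struct example_scope
{
  example_scope() { ++g_example_depth; }
  ~example_scope() { --g_example_depth; }
  example_scope(const example_scope&)            = delete;
  example_scope& operator=(const example_scope&) = delete;
};

extern "C" {

// Returns an opaque handle, or NULL if the index is out of range
// or the guard could not be allocated.
example_scope* example_scope_enter(std::size_t index, std::size_t place_count)
{
  if (index >= place_count)
  {
    return nullptr;
  }
  return new (std::nothrow) example_scope();
}

// Destroys the guard, which runs its destructor; NULL is accepted and ignored.
void example_scope_exit(example_scope* scope)
{
  delete scope;
}

// Accessor used only to observe the guard's effect in this sketch.
int example_active_depth(void)
{
  return g_example_depth;
}

} // extern "C"
```

The C caller owns the handle between enter and exit, so the lifetime that C++ would tie to a block scope is made explicit at the C boundary.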
* [STF] Harden places C API at the C/FFI boundary
Address CodeRabbit review feedback:
- stf_exec_place_scope_enter now rejects out-of-range indices with NULL,
matching the contract of the neighboring index-based accessors.
- stf_data_place_deallocate catches and maps C++ exceptions instead of
letting them escape the extern "C" entry point.
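
As an illustration of the exception-mapping pattern described above, here is a minimal sketch. The names `example_deallocate_impl` and `example_deallocate`, and the specific return codes, are assumptions for illustration and are not taken from the STF sources.

```cpp
#include <cstddef>
#include <exception>
#include <new>
#include <stdexcept>

// Hypothetical C++ routine that may throw.
static void example_deallocate_impl(void* ptr, std::size_t size)
{
  if (ptr == nullptr && size != 0)
  {
    throw std::invalid_argument("null pointer with non-zero size");
  }
  ::operator delete(ptr); // deleting nullptr is a no-op
}

// Hypothetical C entry point: 0 on success, non-zero on failure.
// No exception is allowed to leave this function.
extern "C" int example_deallocate(void* ptr, std::size_t size) noexcept
{
  try
  {
    example_deallocate_impl(ptr, size);
    return 0;
  }
  catch (const std::exception&)
  {
    return 1; // known C++ failure mapped to an error code
  }
  catch (...)
  {
    return 2; // anything else is still contained
  }
}
```

The trailing `catch (...)` matters: without it, a non-standard exception type would still unwind into the C caller.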
* [STF] Use range-based for loops in test_places (clang-tidy)
Fix modernize-loop-convert clang-tidy errors by iterating the places
array with range-based for loops instead of index-based loops.
* [STF] Add unified_task grid introspection used by places C API
The places C bindings (stf_task_get_grid_dims / stf_task_get_custream_at_index)
call get_grid_dims(dim4*) and get_stream(size_t) on context::unified_task<>,
but those overloads were never declared on unified_task in this branch, so
stf.cu failed to compile. Add both methods, dispatching the per-place stream
to stream_task<Deps...> and returning nullptr/false for graph tasks or
non-grid exec places.
* [STF] Bounds-check unified_task::get_stream(place_index)
stream_task::get_stream(size_t) indexes the stream grid without a bounds
check, so stf_task_get_custream_at_index could read past the grid for an
out-of-range index (UB). It also returned success for non-grid exec places,
contradicting the documented contract (non-zero on "not a grid" / index out
of range). Guard the linear index in the unified_task<> wrapper: return
nullptr for graph tasks, non-grid exec places, and out-of-range indices.
Add a regression check to the grid test for the out-of-range index case.
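
The guard described above might look roughly like the sketch below. `example_task`, `stream_handle`, and `example_get_stream_at_index` are hypothetical stand-ins; the real code operates on unified_task<> and CUDA stream types.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CUDA stream handle.
using stream_handle = void*;

// Hypothetical task model: one stream per place in the grid.
struct example_task
{
  bool is_graph_task = false;
  bool has_grid      = true;
  std::vector<stream_handle> streams;

  // Returns nullptr for graph tasks, non-grid places, and out-of-range indices.
  stream_handle get_stream(std::size_t place_index) const
  {
    if (is_graph_task || !has_grid || place_index >= streams.size())
    {
      return nullptr;
    }
    return streams[place_index];
  }
};

// Hypothetical C-style accessor: 0 on success, non-zero otherwise.
int example_get_stream_at_index(const example_task* task, std::size_t index, stream_handle* out)
{
  if (task == nullptr || out == nullptr)
  {
    return 1;
  }
  stream_handle s = task->get_stream(index);
  if (s == nullptr)
  {
    return 1; // "not a grid" or index out of range
  }
  *out = s;
  return 0;
}
```

Checking the index in the wrapper keeps the unchecked inner accessor unchanged while making the C entry point honor its documented contract.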
* [STF] Test green-context places C API
Add direct C API coverage for green-context helper and green-context
exec/data place factories so the extracted places bindings are
self-contained.
* [STF] Guard stf_machine_init at the C boundary
machine::instance() does real work on first call (P2P/mempool/topology
setup) and can throw. Wrap it in try/catch so a C++ exception never
unwinds across the extern "C" boundary into a C caller (UB / terminate),
matching the error-reporting convention used by stf_try_allocate.
* [STF] Document allocate/deallocate size signedness rationale
stf_data_place_allocate takes a signed ptrdiff_t while stf_data_place_deallocate
takes an unsigned size_t. This mirrors the C++ allocator interface, where the
requested size is passed by reference and negated to signal allocation failure;
deallocation has no such error to signal. Document the asymmetry on both entry
points so the C surface explains why the types differ.
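
To make the asymmetry concrete, here is a minimal sketch of the "negate the size to signal failure" convention. `sketch_allocate`, `sketch_deallocate`, and the `budget` parameter are hypothetical and only model the convention described above, not the real allocator interface.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator: the requested size is passed by reference.
// On failure, the size is negated and nullptr is returned, which is why
// the type must be signed.
void* sketch_allocate(std::ptrdiff_t& size, std::ptrdiff_t budget)
{
  if (size <= 0 || size > budget)
  {
    if (size > 0)
    {
      size = -size; // signal failure through the in/out size
    }
    return nullptr;
  }
  return std::malloc(static_cast<std::size_t>(size));
}

// Deallocation has no failure to report through the size, so an unsigned
// size_t is sufficient.
void sketch_deallocate(void* ptr, std::size_t /*size*/)
{
  std::free(ptr);
}
```

A caller can therefore tell success from failure by inspecting the sign of the size after the call, in addition to checking the returned pointer.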
---------
Co-authored-by: Andrei Alexandrescu <andrei@erdani.com>