Add green context support by leofang · Pull Request #1976 · NVIDIA/cuda-python

leofang · 2026-04-25T03:56:04Z

Close #1563. Close #112.

Summary

Add green context support to cuda.core v1.0 — the push-model API for querying device resources, splitting SMs, and creating/using green contexts.

Design

See the companion design doc for full rationale. Key decisions:

Unified Context type — no user-visible GreenContext subclass. A single Context wraps either a primary CUcontext or a CUgreenCtx + derived CUcontext. ctx.is_green distinguishes them. Inspired by the CUDA runtime's execution-context (EC) abstraction.
dev.resources namespace — DeviceResources groups hardware resource queries (dev.resources.sm, dev.resources.workqueue). Follows the existing "plural = namespace" pattern (dev.properties, kernel.attributes).
SMResourceOptions with SoA broadcasting — single dataclass for SMResource.split(). Scalar fields broadcast; count drives the group count. count=None means discovery mode (translated to smCount=0 internally).
Merged workqueue types — WorkqueueResource merges CU_DEV_RESOURCE_TYPE_WORKQUEUE_CONFIG and CU_DEV_RESOURCE_TYPE_WORKQUEUE under one user-facing class. Strings for option values (e.g. sharing_scope="green_ctx_balanced").
ContextOptions(resources=[...]) → dev.create_context() — resource descriptor generation and cuGreenCtxCreate are internal. The user passes pre-split resource objects.
ctx.close() does not manage the context stack — the user must swap out via dev.set_current(prev) before closing. Closing a current context raises RuntimeError.
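
The SoA broadcasting rule described for SMResourceOptions can be illustrated with a small pure-Python model. This is a sketch only, not the cuda.core implementation: `SplitOptionsModel` and `expand_groups` are hypothetical names, and it assumes that a scalar `count` (or `count=None`) requests exactly one group and that a sequence field must match the group count in length.

```python
from collections.abc import Sequence
from dataclasses import dataclass
from typing import Union

IntOrSeq = Union[int, Sequence[int], None]


@dataclass
class SplitOptionsModel:
    """Toy stand-in for SMResourceOptions (hypothetical, not the real class)."""

    count: IntOrSeq = None
    coscheduled_sm_count: IntOrSeq = None


def expand_groups(opts: SplitOptionsModel) -> list[dict]:
    """Expand struct-of-arrays options into one dict per requested group."""
    # `count` drives the number of groups; a scalar or None means one group
    # (an assumption of this model).
    if opts.count is None or isinstance(opts.count, int):
        counts = [opts.count]
    else:
        counts = list(opts.count)
    n = len(counts)

    def broadcast(name, value):
        # Scalars (and None) are repeated for every group.
        if value is None or isinstance(value, int):
            return [value] * n
        values = list(value)
        if len(values) != n:
            raise ValueError(f"{name} has {len(values)} entries, expected {n}")
        return values

    cosched = broadcast("coscheduled_sm_count", opts.coscheduled_sm_count)
    groups = []
    for c, cs in zip(counts, cosched):
        if c is not None and c < 0:
            raise ValueError("count must be non-negative")
        # count=None is discovery mode, which the PR maps to smCount=0.
        groups.append({"sm_count": 0 if c is None else c,
                       "coscheduled_sm_count": cs})
    return groups
```

In this model, `SplitOptionsModel(count=[8, 16], coscheduled_sm_count=2)` expands to two groups that share the same coscheduling value, while a scalar `count` paired with a two-element sequence is rejected as a scalar/Sequence mismatch.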

New public API

Device.resources → DeviceResources (namespace: .sm, .workqueue)
SMResource — properties: sm_count, min_partition_size, coscheduled_alignment, flags, handle; method: split(options, *, dry_run=False)
SMResourceOptions — count, coscheduled_sm_count, preferred_coscheduled_sm_count
WorkqueueResource — method: configure(options)
WorkqueueResourceOptions — sharing_scope
ContextOptions.resources — accepts Sequence[SMResource | WorkqueueResource]
Context.is_green — bool property

Implementation details

C++ handle layer (resource_handles.hpp/cpp):

GreenCtxHandle (shared_ptr<const CUgreenCtx>) — owning handle; destructor calls cuGreenCtxDestroy.
ContextBox gains a GreenCtxHandle field so the derived CUcontext keeps the green ctx alive. get_context_green_ctx() provides reverse lookup.
create_green_ctx_handle() combines cuDevResourceGenerateDesc + cuGreenCtxCreate in one call — the descriptor is transient (no DevResourceDescHandle needed since CUDA has no explicit destroy for it).
context_registry / stream_registry (HandleRegistry) deduplicate handles by raw CUDA pointer, enabling identity-preserving set_current swaps.
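
The deduplication idea behind the registries above can be sketched in pure Python. The real HandleRegistry lives in the C++ handle layer; the model below is an illustration under stated assumptions: `Handle` and `HandleRegistryModel` are hypothetical names, raw pointers are represented as integers, and the use of weak references (so the registry never extends a handle's lifetime) is a guess rather than something the PR states.

```python
import weakref


class Handle:
    """Toy owning wrapper around a raw pointer value (an int here)."""

    def __init__(self, raw: int):
        self.raw = raw


class HandleRegistryModel:
    """Maps raw pointer values to live handles so lookups preserve identity."""

    def __init__(self):
        # Weak values: the registry never keeps a handle alive on its own
        # (an assumption of this sketch).
        self._by_ptr = weakref.WeakValueDictionary()

    def get_or_create(self, raw: int) -> Handle:
        handle = self._by_ptr.get(raw)
        if handle is None:
            # First time this pointer is seen: wrap it and remember it.
            handle = Handle(raw)
            self._by_ptr[raw] = handle
        return handle
```

Looking up the same raw pointer twice returns the same Python object, which is the property that makes a set_current swap identity-preserving: the context returned after the swap can be compared with `is` against the one the user already holds.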

Bug fix — stream context tracking:

StreamBox now carries a ContextHandle dependency, populated at creation time.
get_stream_context() returns it without a driver call.
Stream._from_handle and Stream_ensure_ctx prefer the registry-backed handle before falling back to cuStreamGetCtx. This fixes a latent issue where streams created in a green context would lose their context association after a set_current swap.

Version guards:

Compile-time: IF CUDA_CORE_BUILD_MAJOR >= 13 gates cuDevSmResourceSplit (the general/structured form).
Runtime: cy_driver_version() >= (12, 4, 0) for all green ctx APIs; >= (13, 1, 0) for structured splits.
CUDA 12.x fallback: cuDevSmResourceSplitByCount for basic (homogeneous) splits. Per-group coscheduled_sm_count and heterogeneous counts require 13.1+ and raise NotImplementedError on 12.x.
Green ctx function pointers loaded via _get_optional_driver_fn — graceful NULL when bindings lack the symbol.
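
The runtime half of these version guards amounts to a small decision table, sketched below in plain Python. `select_split_path` is a hypothetical helper, not a function from the PR; it ignores the compile-time gate, and the `RuntimeError` raised for drivers older than 12.4 is a placeholder because the PR does not say which exception is used there.

```python
def select_split_path(driver_version, counts, coscheduled_sm_count=None):
    """Return the name of the driver entry point a split would use.

    driver_version is a (major, minor, patch) tuple; counts is the list of
    per-group SM counts requested by the caller.
    """
    if driver_version < (12, 4, 0):
        # Placeholder exception type; green contexts need 12.4+ per the PR.
        raise RuntimeError("green contexts require driver 12.4 or newer")
    if driver_version >= (13, 1, 0):
        # Structured splits: heterogeneous counts and per-group options.
        return "cuDevSmResourceSplit"
    # Fallback path: only homogeneous splits without per-group
    # coscheduling options are expressible.
    heterogeneous = len(set(counts)) > 1
    if heterogeneous or coscheduled_sm_count is not None:
        raise NotImplementedError(
            "heterogeneous counts and coscheduled_sm_count require 13.1+"
        )
    return "cuDevSmResourceSplitByCount"
```

Comparing version tuples directly keeps the guard readable and orders (12, 10, 0) after (12, 4, 0) correctly, which a string comparison would not.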

Test coverage

27 tests in test_green_context.py, organized with proper pytest fixtures and classes:

Fixtures: sm_resource, wq_resource, green_ctx (with CUDAError → skip), green_ctx_active (push/pop with try/finally), fill_kernel
_use_green_ctx context manager for safe push/pop in all tests — prevents context stack leaks on failure
TestSMResourceQuery — properties, arch constraints (pre-Hopper vs Hopper+)
TestWorkqueueResource — query, configure valid/invalid
TestSMResourceSplitValidation — scalar/Sequence mismatch, negative count, dry-run blocked
TestSMResourceSplit — single/two-group splits with arch-aligned counts, discovery mode, alignment, dry-run parity
TestGreenContextLifecycle — is_green, identity-preserving swap, stream/event context tracking, close-while-current guard
TestGreenContextKernelLaunch — compile + launch + host-verify in green ctx, two independent green contexts with different fill values, SM + workqueue combined
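
The push/pop pattern used by the tests can be sketched as a context manager. This is not the `_use_green_ctx` helper from the PR: `use_context` and `FakeDevice` are hypothetical names, and the sketch assumes that `set_current(ctx)` returns the previously current context so that it can be restored afterwards.

```python
from contextlib import contextmanager


class FakeDevice:
    """Minimal stand-in for a device that tracks one current context."""

    def __init__(self, current):
        self.current = current

    def set_current(self, ctx):
        # Swap in the new context and hand back the one it replaced.
        prev, self.current = self.current, ctx
        return prev


@contextmanager
def use_context(dev, ctx):
    """Make ctx current for the duration of the with-block, then restore."""
    prev = dev.set_current(ctx)
    try:
        yield ctx
    finally:
        # Runs on success and on failure, so a failing test body cannot
        # leave the wrong context current.
        dev.set_current(prev)
```

Because the restore happens in `finally`, the previous context is current again by the time any later `close()` on the green context runs, which matches the rule that closing a current context raises RuntimeError.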

Validation

CUDA_HOME=... pip install -e . --no-build-isolation
python -m pytest tests/test_green_context.py -v             # 26 passed, 1 skipped (arch)
python -m pytest tests/test_device.py tests/test_stream.py tests/test_event.py tests/test_context.py -v  # no regressions

-- Leo's bot

copy-pr-bot · 2026-04-25T03:56:07Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Restructure tests into fixtures + classes with full resource cleanup:

- Fixtures: sm_resource, wq_resource, green_ctx (with CUDAError skip), green_ctx_active (with try/finally restore), fill_kernel
- _use_green_ctx context manager for safe push/pop in all tests
- TestSMResourceQuery: properties, arch constraints per CC
- TestSMResourceSplit: single/two-group splits, discovery, alignment, dry-run vs real parity
- TestGreenContextKernelLaunch: compile + launch + verify in green ctx, two independent green contexts, SM + workqueue combined

All set_current calls are paired with restore in finally blocks to prevent context stack leaks on test failure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

leofang · 2026-04-25T04:44:20Z

/ok to test ac5c0fc

github-actions · 2026-04-25T05:02:45Z

Doc Preview CI
🚀 View preview at https://nvidia.github.io/cuda-python/pr-preview/pr-1976/
https://nvidia.github.io/cuda-python/pr-preview/pr-1976/cuda-core/
https://nvidia.github.io/cuda-python/pr-preview/pr-1976/cuda-bindings/
https://nvidia.github.io/cuda-python/pr-preview/pr-1976/cuda-pathfinder/
Preview will be ready when the GitHub Pages deployment is complete.

leofang added 5 commits April 24, 2026 23:16

Implement green context v1 API

353a382

Refine green context split compatibility

b33c381

Encode green context handle dependencies

faf0d17

Simplify green context view handles

7c35ef3

Simplify green context descriptor handling

da58b7d

github-actions Bot added the cuda.core Everything related to the cuda.core module label Apr 25, 2026

leofang changed the title from "Add cuda.core green context v1 API" to "Add green context support" Apr 25, 2026

leofang added P0 High priority - Must do! feature New feature or request labels Apr 25, 2026

leofang self-assigned this Apr 25, 2026

leofang added this to the cuda.core v1.0.0 milestone Apr 25, 2026

leofang mentioned this pull request Apr 25, 2026

Implement Context.__init__() in cuda.core.experimental._context #189

Closed

leofang wants to merge 6 commits into NVIDIA:main from leofang:leof/green-ctx-v1
