Commit 641a089

Document the two-level handle and object registry design
Add REGISTRY_DESIGN.md explaining how the C++ HandleRegistry (Level 1) and Cython _node_registry (Level 2) work together to preserve Python object identity through driver round-trips. Add cross-references at each registry instantiation site.

Made-with: Cursor
1 parent 347693f commit 641a089

3 files changed: +55 -0 lines changed

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
# Handle and Object Registries

When Python-managed objects round-trip through the CUDA driver (e.g.,
querying a graph's nodes and getting back raw `CUgraphNode` pointers),
we need to recover the original Python object rather than creating a
duplicate.

This document describes the approach used to achieve this. The pattern
is driven mainly by needs arising in the context of CUDA graphs, but
it is general and can be extended to other object types as needs arise.

This solves the same problem as pybind11's `registered_instances` map
and is sometimes called the Identity Map pattern. Two registries work
together to map a raw driver handle all the way back to the original
Python object. Both use weak references so they
do not prevent cleanup. Entries are removed either explicitly (via
`destroy()` or a Box destructor) or implicitly when the weak reference
expires.
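
As a minimal, standalone illustration of the weak-reference behavior described above (not the project's code; `Node`, `registry`, and `lookup_or_create` are hypothetical names), the sketch below shows an identity map whose entries disappear either explicitly or once the last strong reference is dropped. It assumes CPython, where reference counting frees the object promptly; `gc.collect()` is only a safeguard.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a Python-managed wrapper object."""

    def __init__(self, handle):
        self.handle = handle


# Identity map: raw handle value -> live wrapper (held weakly).
registry = weakref.WeakValueDictionary()


def lookup_or_create(handle):
    """Return the existing wrapper for `handle`, or register a new one."""
    node = registry.get(handle)
    if node is None:
        node = Node(handle)
        registry[handle] = node
    return node


first = lookup_or_create(0x1000)
second = lookup_or_create(0x1000)
same_object = first is second      # identity is preserved across lookups

# Implicit removal: dropping every strong reference expires the entry.
del first, second
gc.collect()
expired = 0x1000 not in registry

# Explicit removal: delete the entry while the object is still alive.
third = lookup_or_create(0x2000)
registry.pop(0x2000, None)
removed = 0x2000 not in registry
```

Because the map holds its values weakly, it never extends an object's lifetime; it only answers "is there already a live object for this handle?".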

## Level 1: Driver Handle -> Resource Handle (C++)

`HandleRegistry` in `resource_handles.cpp` maps a raw CUDA handle
(e.g., `CUevent`, `CUkernel`, `CUgraphNode`) to a `weak_ptr` that
refers to the Box owning it. When a `_ref` constructor receives a raw
handle, it checks the registry first. If a live entry is found, it
returns a `shared_ptr` to the existing Box, preserving the Box and its
metadata (e.g., `EventBox` carries timing/IPC flags, `KernelBox`
carries the library dependency).

Without this level, a round-tripped handle would produce a new Box
with default metadata, losing information that was set at creation.

Instances: `event_registry`, `kernel_registry`, `graph_node_registry`.
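
To make the lookup concrete, here is a rough Python model of the Level 1 behavior. It is not the C++ implementation: `EventBox`, `create_event`, and `event_ref` below are hypothetical stand-ins, `weakref.ref` plays the role of `weak_ptr`, and a plain integer plays the role of a `CUevent`. How the real `_ref` constructors treat an unregistered handle is an assumption here.

```python
import weakref
from dataclasses import dataclass


@dataclass
class EventBox:
    """Hypothetical Box: a raw handle plus metadata set at creation."""
    raw: int
    timing_enabled: bool = False


# Raw handle -> weak reference to the Box (models the weak_ptr entries).
event_registry = {}


def create_event(raw, timing_enabled):
    """Owning constructor: builds the Box and registers it."""
    box = EventBox(raw, timing_enabled)
    event_registry[raw] = weakref.ref(box)
    return box


def event_ref(raw):
    """Non-owning constructor: reuse the registered Box if it is still alive."""
    ref = event_registry.get(raw)
    box = ref() if ref is not None else None   # models weak_ptr::lock()
    if box is not None:
        return box
    return EventBox(raw)                        # unknown handle: default metadata


original = create_event(0xABC, timing_enabled=True)
round_tripped = event_ref(0xABC)    # e.g. the driver handed this handle back
unknown = event_ref(0xDEF)          # never registered
```

In this model the round-tripped handle resolves to the very same Box, so the timing flag set at creation survives, while the unregistered handle only gets defaults.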

## Level 2: Resource Handle -> Python Object (Cython)

`_node_registry` in `_graph_node.pyx` is a `WeakValueDictionary`
mapping a resource address (`shared_ptr::get()`) to a Python
`GraphNode` object. When `GraphNode._create` receives a handle from
Level 1, it checks this registry. If found, it returns the existing
Python object.

Without this level, each driver round-trip would produce a distinct
Python object for the same logical node, resulting in surprising
behavior:

```python
a = g.empty()
b = g.empty()
a.succ = {b}
b2, = a.succ   # queries driver, gets back CUgraphNode for b
assert b2 is b # fails without Level 2 registry
```

cuda_core/cuda/core/_cpp/resource_handles.cpp

Lines changed: 3 additions & 0 deletions
@@ -388,6 +388,7 @@ ContextHandle get_event_context(const EventHandle& h) noexcept {
     return h ? get_box(h)->h_context : ContextHandle{};
 }

+// See REGISTRY_DESIGN.md (Level 1: Driver Handle -> Resource Handle)
 static HandleRegistry<CUevent, EventHandle> event_registry;

 EventHandle create_event_handle(const ContextHandle& h_ctx, unsigned int flags,
@@ -894,6 +895,7 @@ static const KernelBox* get_box(const KernelHandle& h) {
     );
 }

+// See REGISTRY_DESIGN.md (Level 1: Driver Handle -> Resource Handle)
 static HandleRegistry<CUkernel, KernelHandle> kernel_registry;

 KernelHandle create_kernel_handle(const LibraryHandle& h_library, const char* name) {
@@ -964,6 +966,7 @@ static const GraphNodeBox* get_box(const GraphNodeHandle& h) {
     );
 }

+// See REGISTRY_DESIGN.md (Level 1: Driver Handle -> Resource Handle)
 static HandleRegistry<CUgraphNode, GraphNodeHandle> graph_node_registry;

 GraphNodeHandle create_graph_node_handle(CUgraphNode node, const GraphHandle& h_graph) {

cuda_core/cuda/core/_graph/_graph_def/_graph_node.pyx

Lines changed: 1 addition & 0 deletions
@@ -63,6 +63,7 @@ from cuda.core import Device
 from cuda.core._graph._graph_def._adjacency_set_proxy import AdjacencySetProxy
 from cuda.core._utils.cuda_utils import driver, handle_return

+# See _cpp/REGISTRY_DESIGN.md (Level 2: Resource Handle -> Python Object)
 _node_registry = weakref.WeakValueDictionary()
