GB10 / DGX Spark (SM 12.1, AArch64): candidate tests for requires unified_shared_memory #911

Description

@parallelArchitect

Platform: NVIDIA GB10 (SM 12.1), AArch64

The platform on which these tests are intended to run reports:
cudaDevAttrPageableMemoryAccessUsesHostPageTables = 1

Hardware measurements collected across three independent GB10 systems:

Fault COLD/WARM p50 ratio: 1.00x
Atomic SYS/GPU scope ratio: 1.00x
NVLink-C2C coherence overhead: 0.0 ns

Source: https://github.com/parallelArchitect/nvidia-uma-fault-probe

These measurements are consistent with a hardware-coherent memory model
and motivate evaluation of OpenMP unified shared memory behavior on
this platform.

OpenMP map-clause behavior under #pragma omp requires unified_shared_memory
has not yet been characterized on GB10. Two candidate tests and platform
notes are available at:
https://github.com/parallelArchitect/OpenMP_VV/tree/gb10-uma-aware

test_gb10_requires_unified_shared_memory_nomap.c
Evaluates heap allocation access and interleaved CPU/GPU updates under
requires unified_shared_memory without explicit map clauses.
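A minimal sketch of the pattern this test describes, not the code from the linked branch. The function name `interleaved_updates` and the parameters `n` and `rounds` are hypothetical. The heap buffer comes from plain `malloc`, and the target region dereferences the pointer with no `map` clause.

```c
#include <stdlib.h>

/* Must appear at file scope, before any target construct in this file. */
#pragma omp requires unified_shared_memory

/* Hypothetical helper: alternates device and host increments on one heap
   buffer and returns the number of elements with an unexpected final value
   (0 means every update was visible to both sides, -1 means malloc failed). */
int interleaved_updates(int n, int rounds) {
    int *data = malloc((size_t)n * sizeof *data);
    if (data == NULL) return -1;
    for (int i = 0; i < n; i++) data[i] = 0;

    for (int r = 0; r < rounds; r++) {
        /* Device update: no map clause, the host pointer is used directly. */
        #pragma omp target teams distribute parallel for
        for (int i = 0; i < n; i++) data[i] += 1;

        /* Host update on the same buffer, after the target region completes. */
        for (int i = 0; i < n; i++) data[i] += 1;
    }

    int errors = 0;
    for (int i = 0; i < n; i++) {
        if (data[i] != 2 * rounds) errors++;
    }
    free(data);
    return errors;
}
```

If OpenMP is not enabled at compile time the pragmas are ignored and every loop runs on the host, so a zero return is only meaningful when the build actually offloads.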

test_gb10_uma_pointer_validity.c
Evaluates host stack and heap pointer accessibility inside an OpenMP
target region when unified shared memory is active and neither map
nor is_device_ptr clauses are provided.
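The check described above could look roughly like the following sketch. It is an illustration under assumptions, not the test from the linked branch: the function name `check_host_pointers` and the bitmask return convention are hypothetical. The stack variable is reached through a pointer on purpose, because a scalar referenced directly in a target region would be treated as firstprivate and the host copy would never be touched.

```c
#include <stdlib.h>

/* Must appear at file scope, before any target construct in this file. */
#pragma omp requires unified_shared_memory

/* Hypothetical helper: returns 0 when both host pointers were usable inside
   the target region. Bit 0 set = stack pointer failed, bit 1 set = heap
   pointer failed, -1 = malloc failed. */
int check_host_pointers(void) {
    int stack_value = 10;
    int *stack_ptr = &stack_value;          /* points into the host stack */
    int *heap_ptr = malloc(sizeof *heap_ptr);
    if (heap_ptr == NULL) return -1;
    *heap_ptr = 20;

    /* No map and no is_device_ptr: both pointers keep their host values. */
    #pragma omp target
    {
        *stack_ptr += 1;
        *heap_ptr += 1;
    }

    int failures = 0;
    if (stack_value != 11) failures |= 1;
    if (*heap_ptr != 21) failures |= 2;
    free(heap_ptr);
    return failures;
}
```

On a system where the device cannot access pageable host memory, the dereferences inside the target region would be expected to fault instead of returning a failure code, which is one reason to gate such a test at run time.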

Both tests include a runtime platform gate on
cudaDevAttrPageableMemoryAccessUsesHostPageTables.
Systems reporting 0: SKIP. Systems reporting 1: RUN.
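One way such a gate could be written is sketched below. This assumes the gate is keyed on `cudaDevAttrPageableMemoryAccessUsesHostPageTables`, queried through the CUDA runtime call `cudaDeviceGetAttribute`. The names `gate_from_attribute` and `platform_gate`, and the `USE_CUDA_RUNTIME` build macro, are hypothetical and do not come from the linked branch.

```c
#ifdef USE_CUDA_RUNTIME              /* hypothetical build macro */
#include <cuda_runtime_api.h>
#endif

enum gate_result { GATE_SKIP = 0, GATE_RUN = 1 };

/* Maps the reported attribute value to a decision: 1 runs, anything else skips. */
enum gate_result gate_from_attribute(int uses_host_page_tables) {
    return uses_host_page_tables == 1 ? GATE_RUN : GATE_SKIP;
}

/* Queries the given device. Falls back to GATE_SKIP when the CUDA runtime
   is not compiled in or the query fails. */
enum gate_result platform_gate(int device) {
    int value = 0;
#ifdef USE_CUDA_RUNTIME
    if (cudaDeviceGetAttribute(&value,
                               cudaDevAttrPageableMemoryAccessUsesHostPageTables,
                               device) != cudaSuccess) {
        value = 0;
    }
#else
    (void)device;                    /* no runtime available: treat as 0 */
#endif
    return gate_from_attribute(value);
}
```

Defaulting to SKIP on any query failure keeps the gate conservative: a test only runs when the platform positively reports 1.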

Neither test has been executed on GB10 hardware at the time of writing.
Execution would provide conformance data for OpenMP unified shared memory
constructs on a platform reporting:
cudaDevAttrPageableMemoryAccessUsesHostPageTables = 1
