WIP: pathfinder_compatibility_guard_rails#1977
Conversation
Introduce CompatibilityGuardRails plus related errors and tests so callers can opt into CTK and driver compatibility checks while reusing the existing pathfinder lookup APIs. Made-with: Cursor
Expose process_wide_compatibility_guard_rails at import time so follow-up changes can route the default cuda.pathfinder APIs through a stable public instance. Document the singleton and pin its public availability with a small regression test. Made-with: Cursor
Make the process-wide CompatibilityGuardRails instance the default path for the public load/find/locate APIs so top-level calls share compatibility state. Factor the routing/fallback/cache-reset glue into a dedicated internal module to keep `cuda.pathfinder.__init__` focused on the public surface, and fall back to the existing raw resolvers when v1 guard rails only have insufficient metadata. Made-with: Cursor
Allow CUDA_PATHFINDER_COMPATIBILITY_GUARD_RAILS to select strict, best_effort, or off behavior so we can experiment with stricter compatibility checks without changing the public API shape. Made-with: Cursor
Treat driver-packaged libraries as compatibility-neutral so strict mode can load NVML and other driver libs without a raw fallback, while CTK-backed artifacts remain the only items that establish and enforce the process-wide CTK anchor. Made-with: Cursor
Infer the CUDA Toolkit line from both wildcard-pinned and range-based cuda-toolkit requirements so strict process-wide guard rails keep working for editable wheel installs used by nvrtc and nvJitLink. Made-with: Cursor
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test |
|
|
Analysis of CI failures for workflow run Cursor GPT-5.4 Extra High Fast Findings
Why
Proper fix
|
Introduce a small toolkit-info utility that reads the CUDA_VERSION macro from cuda.h so follow-up guard-rails changes can infer CTK major.minor from toolkit headers without depending on version.json. Made-with: Cursor
Centralize encoded CUDA version parsing and validation so toolkit and driver version helpers stay aligned and cuda.h parsing gets consistent string conversion and error reporting. Made-with: Cursor
Replace version.json-based CTK root metadata with cuda.h parsing so compatibility checks use a simpler, more universal toolkit source while preserving wheel-based metadata inference. Made-with: Cursor
|
/ok to test |
|
At commit c6c38e3, the CI has a single failure in
That failure does not look like a
Spot-checking sibling logs shows that the underlying
So the most important takeaway from the logs is: the single red test is a combination of two conditions happening in the same job:
That explains why this shows up as only one visible failure even though the broader Issues to look into next:
|
Resolves #1038
Continuation of #1936
WIP — CI testing