Support typed NVVM IR for pre-Blackwell libNVVM targets #101
Open
mohamedsamirx wants to merge 1 commit into
Signed-off-by: Mohamed Samir <mohamedsamirx@users.noreply.github.com>
Summary
Fixes #98.
This PR makes NVVM IR export target-aware so pre-Blackwell targets use the typed-pointer NVVM IR dialect expected by older libNVVM paths, while Blackwell and newer targets keep the existing opaque-pointer dialect.
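The target-aware choice described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `PointerDialect` enum and the `arch_number` and `dialect_for_arch` helpers are hypothetical names, and the threshold of 100 is an assumption taken from the `sm_100`/`compute_100` cutoff mentioned in this description.

```rust
/// Pointer dialect used when exporting NVVM IR (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PointerDialect {
    /// Typed pointers such as `i8*`, for pre-Blackwell targets.
    Typed,
    /// Opaque `ptr`, for Blackwell and newer targets.
    Opaque,
}

/// Extract the numeric part of an arch string such as "sm_75" or "compute_100".
fn arch_number(arch: &str) -> Option<u32> {
    let rest = arch
        .strip_prefix("sm_")
        .or_else(|| arch.strip_prefix("compute_"))?;
    // Keep only the leading digits so a suffix letter (e.g. "sm_90a") is ignored.
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    digits.parse().ok()
}

/// Assumed rule: targets numbered 100 and above keep opaque pointers,
/// everything older uses typed pointers.
fn dialect_for_arch(arch: &str) -> Option<PointerDialect> {
    let n = arch_number(arch)?;
    Some(if n >= 100 {
        PointerDialect::Opaque
    } else {
        PointerDialect::Typed
    })
}

fn main() {
    for arch in ["sm_75", "sm_90a", "sm_100", "compute_100"] {
        println!("{arch}: {:?}", dialect_for_arch(arch));
    }
}
```

Returning `Option` keeps an unrecognized arch string distinct from a valid one, so a caller can decide whether to fall back to a default or report an error.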
Changes include:
- `CUDA_OXIDE_TARGET`/`--arch`.
- `sm_75`.
- `sm_100`/`compute_100` and newer.
- `!nvvmir.version = {2, 0, 3, 1}` for typed-pointer mode.
- `!nvvm.annotations` and `@llvm.used`.
- `i8*`.
- `fneg` and saturating integer intrinsics into supported LLVM IR operations.
- `libnvvm-sys` binding for CUDA 13+ `nvvmLLVMVersion`.
- `cargo-oxide`, `cuda-host`, and `libnvvm-sys` docs.

Motivation
On CUDA 12.4 with an `sm_75` NVIDIA T1200 GPU, examples that enter NVVM IR mode through libdevice calls fail in libNVVM.

Clean `origin/main` at `1f38440` reproduces this for:
- `cargo oxide run addressof_sharedarray_repro --emit-nvvm-ir --arch sm_75`
- `cargo oxide run manual_launch_libdevice --emit-nvvm-ir --arch sm_75`
- `cargo oxide run primitive_stress --emit-nvvm-ir --arch sm_75`

The root problem is that pre-Blackwell libNVVM expects typed-pointer NVVM IR, while the current exporter emits modern opaque-pointer IR.
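To make the mismatch concrete, the sketch below renders the same pointer type in both spellings. It is an illustration of the general LLVM IR syntax difference only; `pointer_type` is a hypothetical helper and is not taken from the exporter in this PR.

```rust
/// Render an LLVM IR pointer type in either typed or opaque spelling.
/// `pointee` is only used in the typed form; opaque pointers carry no pointee type.
fn pointer_type(typed: bool, pointee: &str, addrspace: u32) -> String {
    match (typed, addrspace) {
        // Typed pointers name their pointee: `i8*`, `float addrspace(3)*`.
        (true, 0) => format!("{pointee}*"),
        (true, n) => format!("{pointee} addrspace({n})*"),
        // Opaque pointers are always spelled `ptr`, optionally with an address space.
        (false, 0) => "ptr".to_string(),
        (false, n) => format!("ptr addrspace({n})"),
    }
}

fn main() {
    for (pointee, addrspace) in [("i8", 0), ("float", 3)] {
        println!(
            "typed: {:<22} opaque: {}",
            pointer_type(true, pointee, addrspace),
            pointer_type(false, pointee, addrspace)
        );
    }
}
```

A parser that only understands the typed form has no rule for the bare `ptr` keyword, which is consistent with the parse failure reported here.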
Testing
Results on this machine:
- `cargo oxide doctor`: Pass

The three issue-relevant examples pass on this branch and fail on clean `origin/main` with `nvvmCompileProgram ... parse expected type`.

The remaining 12 smoketest failures also reproduce on clean `main` and are unrelated baseline failures on this `sm_75` laptop: mostly advanced barrier/TMA/tcgen05 paths, one local nvJitLink header/constants issue, and two runtime illegal-memory-access examples.