Commit 2f4fd1d
fix(cuda_std): pack warp shuffle/match return into i64 for LLVM 19 verifier

LLVM 19's verifier rejects the `align N` return-attribute that rustc's
C ABI lowering attaches to calls returning small aggregates like
{ i32, i8 } (align is only valid on pointer returns). Three intrinsic
wrappers in libintrinsics.ll triggered this:

- __nvvm_warp_shuffle
- __nvvm_warp_match_all_32
- __nvvm_warp_match_all_64

Switch their return type from { i32, i8 } to a packed i64 (low 32 bits
= value, bit 32 = predicate). Primitive integer return ⇒ no struct ABI
⇒ no spurious return-attribute. Uses only LLVM 1.0-era IR primitives
(zext/shl/or), so it's safe under both LLVM 7 (CUDA 12.x libnvvm) and
LLVM 19 (CUDA 13.x libnvvm). Removes the now-redundant
WarpShuffleResult struct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1 parent ab52ac7 commit 2f4fd1d
2 files changed
Lines changed: 53 additions & 44 deletions
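
The packed layout described in the commit message (low 32 bits = value, bit 32 = predicate) can be illustrated with a minimal Rust sketch. The function names `pack_shuffle_result` and `unpack_shuffle_result` are hypothetical and are not taken from this commit's diff; the sketch only mirrors the zext/shl/or arithmetic the message describes, not the actual cuda_std code.

```rust
/// Hypothetical helper (not from the commit): pack a 32-bit value and a
/// predicate into one i64, with the value in the low 32 bits and the
/// predicate in bit 32. Mirrors the zext/shl/or sequence on the IR side.
pub fn pack_shuffle_result(value: u32, pred: bool) -> i64 {
    let low = value as i64; // u32 -> i64 zero-extends
    let high = (pred as i64) << 32; // predicate lands in bit 32
    low | high
}

/// Hypothetical helper (not from the commit): split the packed i64 back
/// into the value and the predicate.
pub fn unpack_shuffle_result(packed: i64) -> (u32, bool) {
    let value = packed as u32; // truncation keeps the low 32 bits
    let pred = (packed >> 32) & 1 != 0; // test bit 32
    (value, pred)
}
```

Because the return type is a primitive integer, the caller recovers both fields with a truncation and a shift rather than a struct field access, which is what lets the aggregate return type disappear.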