Promote scalar `float32` to `float64` in AOT signature by voltjia · Pull Request #153 · InfiniTensor/ninetoothed

voltjia · 2026-05-09T04:50:39Z

Summary

`_build_variant` now promotes a scalar fp32 argument to fp64 in the
Triton AOT signature. Without this, a scalar `Tensor(0, dtype=ninetoothed.float32)`
can silently arrive as `0.0f` inside the kernel on Triton 3.4.0+:
`triton.tools.compile` declares the wrapper parameter as a C `double`,
but the cubin still reads 4 bytes as a `float`, so the kernel receives
the low half of the 8-byte `double`. That half holds only low-order
mantissa bits, which are all zero for values such as 0.125 that need
few mantissa bits.
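The misread described above can be reproduced on the host with nothing but the standard library. This is a minimal sketch, not the code path Triton takes: it assumes a little-endian layout (so the first 4 bytes of the argument slot are the low half of the `double`), and `read_double_as_float` is a hypothetical helper name introduced only for illustration.

```python
import struct


def read_double_as_float(value: float) -> float:
    """Hypothetical helper: pack `value` as an 8-byte little-endian double,
    then reinterpret only its first 4 bytes as a float32."""
    packed = struct.pack("<d", value)  # what the wrapper passes (C double)
    (misread,) = struct.unpack("<f", packed[:4])  # what a 4-byte read sees
    return misread


# 0.125 is 2**-3, so every mantissa bit of the double is zero and the
# low 4 bytes are all zero as well.
print(read_double_as_float(0.125))  # 0.0

# A value with a long mantissa, such as 0.1, does not come out as zero,
# but it does not come out as 0.1 either.
print(read_double_as_float(0.1))
```

Values whose low 32 bits happen to be zero collapse to exactly `0.0`, which matches the symptom in the summary; other values would instead be read as unrelated bit patterns.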

Triton ABI sweep

Compiled kernel(scale, out_ptr, n: tl.constexpr) with signature
'fp32, *fp32:16, 16' on every tested version:

| Triton | Wrapper signature for scalar `fp32` | Matches cubin |
| --- | --- | --- |
| 3.1.0 / 3.2.0 / 3.3.0 | `float scale` | ✓ |
| 3.4.0 / 3.5.0 / 3.6.0 / 3.7.0 | `double scale` | ✗ (cubin still reads 4 bytes) |

Re-running the sweep with signature `'fp64, ...'` gives `double scale`
on every version, so the wrapper, `args[]`, and cubin agree on 8 bytes
across 3.1.0 – 3.7.0 (latest). The promotion is therefore harmless on
3.1 – 3.3 (no precision is lost, since the underlying Python value is
already 64-bit) and fixes the silent corruption on 3.4+.
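As a rough illustration of the promotion, the sketch below rewrites a signature string of the form used in the sweep. It is not the actual `_build_variant` code, which this description does not show; `promote_scalar_fp32` is a hypothetical helper, and the assumption is that scalar entries appear as a bare `fp32` while pointer entries start with `*`.

```python
def promote_scalar_fp32(signature: str) -> str:
    """Hypothetical helper: widen scalar fp32 entries to fp64.

    Pointer entries (e.g. "*fp32:16") and constexpr values (e.g. "16")
    are left untouched, since only by-value scalars are affected.
    """
    promoted = []
    for entry in signature.split(","):
        entry = entry.strip()
        # Only a bare "fp32" is a by-value scalar; "*fp32:16" is a pointer.
        promoted.append("fp64" if entry == "fp32" else entry)
    return ", ".join(promoted)


print(promote_scalar_fp32("fp32, *fp32:16, 16"))  # fp64, *fp32:16, 16
```

Matching on the whole entry rather than a substring keeps the element type of pointer arguments unchanged, so only the scalar's width in the wrapper and cubin changes.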

Testing

pytest output:

============================= test session starts ==============================
platform linux -- Python 3.10.16, pytest-9.0.2, pluggy-1.6.0
rootdir: /home/huangjiacheng/ninetoothed
configfile: pyproject.toml
plugins: anyio-4.12.1, xdist-3.8.0, cov-7.0.0, typeguard-4.4.4
collected 214 items

tests/test_add.py .                                                      [  0%]
tests/test_addmm.py ..                                                   [  1%]
tests/test_aot.py ..........                                             [  6%]
tests/test_aot_auto_tuning.py ....                                       [  7%]
tests/test_attention.py ........                                         [ 11%]
tests/test_auto_tuner.py ....                                            [ 13%]
tests/test_clone.py ....                                                 [ 15%]
tests/test_conv2d.py ....                                                [ 17%]
tests/test_data_ptr.py .                                                 [ 17%]
tests/test_debugging.py .                                                [ 18%]
tests/test_dropout.py .                                                  [ 18%]
tests/test_eval.py ........                                              [ 22%]
tests/test_expand.py .                                                   [ 22%]
tests/test_generation.py ............................................... [ 44%]
.............................                                            [ 58%]
tests/test_getitem.py ..........                                         [ 63%]
tests/test_ipynb.py .                                                    [ 63%]
tests/test_jagged.py ................                                    [ 71%]
tests/test_matmul.py ..                                                  [ 71%]
tests/test_max_pool2d.py ..                                              [ 72%]
tests/test_naming.py .......                                             [ 76%]
tests/test_pad.py ................................................       [ 98%]
tests/test_pow.py .                                                      [ 99%]
tests/test_softmax.py .                                                  [ 99%]
tests/test_unsqueeze.py .                                                [100%]

======================= 214 passed in 3311.29s (0:55:11) =======================

voltjia added 2 commits May 9, 2026 10:36

Promote scalar fp32 to fp64 in AOT signature

8cc72ce

Add a test case for scalar fp32 arguments in test_aot.py

3146730

voltjia merged commit dc51d41 into master May 9, 2026
8 checks passed

voltjia deleted the promote-fp32-to-fp64-in-aot-signature branch May 9, 2026 06:04
