[CUDA][TIR] Standard math intrinsics should not lower to fast-math __*f by default #19546

@LeiWang1999

Currently, some standard TIR math intrinsics on CUDA lower to CUDA fast-math device functions by default.

For example, tir.exp / tirx.exp on float32 lowers to __expf(x) instead of the precise CUDA math function expf(x). This happens even when --use_fast_math is not passed to NVCC.

Why this is a problem

__expf, __logf, __sinf, etc. are CUDA fast-math intrinsics. They trade accuracy for performance and can introduce visible precision loss in numerically sensitive kernels.

Users generally expect standard math intrinsics such as T.exp, T.log, T.sin, and T.cos to preserve normal CUDA math semantics unless fast math is explicitly requested.

Fast-math behavior should ideally be opt-in, for example through a target option, compiler flag, or an explicit fast-math intrinsic.

Standard TIR math intrinsics should lower to precise CUDA math functions by default:

| TIR op     | Expected CUDA |
|------------|---------------|
| tirx.exp   | expf          |
| tirx.exp10 | exp10f        |
| tirx.log   | logf          |
| tirx.log2  | log2f         |
| tirx.log10 | log10f        |
| tirx.sin   | sinf          |
| tirx.cos   | cosf          |
| tirx.tan   | tanf          |

Fast-math variants such as __expf, __logf, __sinf, and __cosf should only be emitted when fast math is explicitly enabled.
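One way to check the rule above is to scan the generated CUDA source for fast-math calls. The following is a minimal sketch, not part of TVM: the helper name `find_fast_math_calls` and the list of names are assumptions made for illustration, and the input is assumed to be the CUDA source text obtained from a build by whatever means you already use.

```python
import re
from typing import List, Tuple

# Fast-math device functions corresponding to the ops discussed in this issue.
# This list is an assumption for illustration; it is not read from TVM.
FAST_MATH_NAMES = (
    "__expf", "__exp10f", "__logf", "__log2f",
    "__log10f", "__sinf", "__cosf", "__tanf",
)

# Match a call such as `__expf(`. The lookbehind rejects names that are only
# the tail of a longer identifier (e.g. `my__expf`).
_FAST_CALL = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(re.escape(name) for name in FAST_MATH_NAMES)
    + r")\s*\("
)


def find_fast_math_calls(cuda_source: str) -> List[Tuple[int, str]]:
    """Return (line number, function name) for every fast-math call found."""
    hits = []
    for lineno, line in enumerate(cuda_source.splitlines(), start=1):
        for match in _FAST_CALL.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = "float a = __expf(x);\nfloat b = expf(x) + __sinf(y);\n"
    for lineno, name in find_fast_math_calls(sample):
        print(f"line {lineno}: {name}")
```

This is a plain text scan, so it would also flag a fast-math name that appears inside a comment or string literal; for a regression test on generated kernels that is usually acceptable.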

Suggested fix: use CUDAMath instead of CUDAFastMath for standard CUDA math intrinsic lowering:

TVM_REGISTER_OP("tirx.exp")
.set_attr<FLowerIntrinsic>("cuda.FLowerIntrinsic", DispatchPureExtern<CUDAMath>);

If fast-math lowering is desired, it would be better to gate it behind an explicit fast-math option rather than making it the default behavior for standard math intrinsics.
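The gating described above can be modeled in a few lines. This is only a sketch of the intended decision logic for float32, not TVM's lowering code: the function `cuda_extern_name` and its `fast_math` flag are hypothetical names chosen for illustration, and the mapping is copied from the expected-lowering table in this issue.

```python
# Precise float32 CUDA functions, taken from the table in this issue.
PRECISE_CUDA_NAME = {
    "tirx.exp": "expf",
    "tirx.exp10": "exp10f",
    "tirx.log": "logf",
    "tirx.log2": "log2f",
    "tirx.log10": "log10f",
    "tirx.sin": "sinf",
    "tirx.cos": "cosf",
    "tirx.tan": "tanf",
}


def cuda_extern_name(op: str, fast_math: bool = False) -> str:
    """Pick the CUDA function for a float32 op (hypothetical helper).

    The precise function is the default. The `__`-prefixed fast-math
    variant is returned only when the caller explicitly opts in.
    """
    precise = PRECISE_CUDA_NAME[op]
    return "__" + precise if fast_math else precise


if __name__ == "__main__":
    print(cuda_extern_name("tirx.exp"))                  # default: precise
    print(cuda_extern_name("tirx.exp", fast_math=True))  # explicit opt-in
```

The point of the design is that the default argument encodes the policy: a caller who never mentions fast math gets the precise function, and the fast variant requires an explicit choice at the call site.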

cc @Hzfengsy @junrushao @quic-sanirudh @shingjan
