[AArch64] Add 9.7 CVT data processing intrinsics by MartinWehking · Pull Request #186807 · llvm/llvm-project

MartinWehking · 2026-03-16T14:13:24Z

Add Clang intrinsics
svcvtt_f16_s8, _f32_s16, _f64_s32, _f16_u8, _f32_u16, _f64_u32
svcvtb_f16_s8, _f32_s16, _f64_s32, _f16_u8, _f32_u16, _f64_u32

lowering to AARCH64 Instrs. SCVTF, SCVTFLT, UCVTF, UCVTFLT

and Clang instrinsics:
svcvtzn_s8[_f16_x2], _s32[_f64_x2], _u8[_f16_x2], _u16[_f32_x2], _u32[_f64_x2]

lowering to AARCH64 Instrs. FCVTZSN, FCVTZUN

The Clang intrinsics are guarded by the sve2.3 and sme2.3 feature flags.

ACLE Patch:
ARM-software/acle#428

The patch reuses IsReductionQV for resolving the overload of intrinsics.
This naming is misleading and needs changed

llvmbot · 2026-03-16T14:14:01Z

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-aarch64

Author: Martin Wehking (MartinWehking)

Changes

Add Clang/LLVM intrinsics for svcvt, scvtflt, ucvtf, ucvtflt and fcvtzsn, fcvtzun.
The Clang intrinsics are guarded by the sve2.3 and sme2.3 feature flags.

ACLE Patch:
ARM-software/acle#428

Patch is 39.30 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/186807.diff

8 Files Affected:

(modified) clang/include/clang/Basic/arm_sve.td (+27)
(added) clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_fp_int_cvtn_x2.c (+105)
(added) clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_int_fp_cvt.c (+189)
(modified) llvm/include/llvm/IR/IntrinsicsAArch64.td (+33)
(modified) llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td (+6-6)
(modified) llvm/lib/Target/AArch64/SVEInstrFormats.td (+13-2)
(added) llvm/test/CodeGen/AArch64/sve2p3-intrinsics-fp-converts.ll (+255)
(added) llvm/test/CodeGen/AArch64/sve2p3-intrinsics-fp-converts_x2.ll (+157)

diff --git a/clang/include/clang/Basic/arm_sve.td b/clang/include/clang/Basic/arm_sve.td
index be3cd8a76503b..852cc60c6e0b3 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -997,6 +997,33 @@ def SVCVTLT_Z_F32_F16  : SInst<"svcvtlt_f32[_f16]", "dPh", "f", MergeZeroExp, "a
 def SVCVTLT_Z_F64_F32  : SInst<"svcvtlt_f64[_f32]", "dPh", "d", MergeZeroExp, "aarch64_sve_fcvtlt_f64f32",  [IsOverloadNone, VerifyRuntimeMode]>;
 
 }
+
+let SVETargetGuard = "sve2p3|sme2p3", SMETargetGuard = "sve2p3|sme2p3" in {
+def SVCVT_S8_F16  : SInst<"svcvt_s8[_f16_x2]",  "d2.O", "c", MergeNone, "aarch64_sve_fcvtzsn", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+def SVCVT_S16_F32  : SInst<"svcvt_s16[_f32_x2]",  "d2.M", "s", MergeNone, "aarch64_sve_fcvtzsn", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+def SVCVT_S32_F64  : SInst<"svcvt_s32[_f64_x2]",  "d2.N", "i", MergeNone, "aarch64_sve_fcvtzsn", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+
+def SVCVT_U8_F16  : SInst<"svcvt_u8[_f16_x2]",  "d2.O", "Uc", MergeNone, "aarch64_sve_fcvtzun", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+def SVCVT_U16_F32  : SInst<"svcvt_u16[_f32_x2]",  "d2.M", "Us", MergeNone, "aarch64_sve_fcvtzun", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+def SVCVT_U32_F64  : SInst<"svcvt_u32[_f64_x2]",  "d2.N", "Ui", MergeNone, "aarch64_sve_fcvtzun", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;
+
+def SVCVTT_F16_S8  : SInst<"svcvtt_f16[_s8]",  "Od", "c", MergeNone, "aarch64_sve_scvtflt_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F32_S16  : SInst<"svcvtt_f32[_s16]",  "Md", "s", MergeNone, "aarch64_sve_scvtflt_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F64_S32  : SInst<"svcvtt_f64[_s32]",  "Nd", "i", MergeNone, "aarch64_sve_scvtflt_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+
+def SVCVTT_F16_U8  : SInst<"svcvtt_f16[_u8]",  "Od", "Uc", MergeNone, "aarch64_sve_ucvtflt_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F32_U16  : SInst<"svcvtt_f32[_u16]",  "Md", "Us", MergeNone, "aarch64_sve_ucvtflt_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F64_U32  : SInst<"svcvtt_f64[_u32]",  "Nd", "Ui", MergeNone, "aarch64_sve_ucvtflt_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+
+def SVCVTB_F16_S8  : SInst<"svcvtb_f16[_s8]",  "Od", "c", MergeNone, "aarch64_sve_scvtfb_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F32_S16  : SInst<"svcvtb_f32[_s16]",  "Md", "s", MergeNone, "aarch64_sve_scvtfb_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F64_S32  : SInst<"svcvtb_f64[_s32]",  "Nd", "i", MergeNone, "aarch64_sve_scvtfb_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+
+def SVCVTB_F16_U8  : SInst<"svcvtb_f16[_u8]",  "Od", "Uc", MergeNone, "aarch64_sve_ucvtfb_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F32_U16  : SInst<"svcvtb_f32[_u16]",  "Md", "Us", MergeNone, "aarch64_sve_ucvtfb_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F64_U32  : SInst<"svcvtb_f64[_u32]",  "Nd", "Ui", MergeNone, "aarch64_sve_ucvtfb_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+}
+
 ////////////////////////////////////////////////////////////////////////////////
 // Permutations and selection
 
diff --git a/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_fp_int_cvtn_x2.c b/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_fp_int_cvtn_x2.c
new file mode 100644
index 0000000000000..a4a7c58e1ced9
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_fp_int_cvtn_x2.c
@@ -0,0 +1,105 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sve -target-feature +sve2p3 -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sve -target-feature +sve2p3 -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sme -target-feature +sme2p3 -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sme -target-feature +sme2p3 -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +sve2p3\
+// RUN:   -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2p3\
+// RUN:   -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+//
+// REQUIRES: aarch64-registered-target
+
+#include <arm_sve.h>
+
+#if defined __ARM_FEATURE_SME
+#define MODE_ATTR __arm_streaming
+#else
+#define MODE_ATTR
+#endif
+
+// CHECK-LABEL: @test_svcvt_s8_f16_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.fcvtzsn.nxv16i8.nxv8f16(<vscale x 8 x half> [[ZN_COERCE0:%.*]], <vscale x 8 x half> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z20test_svcvt_s8_f16_x213svfloat16x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.fcvtzsn.nxv16i8.nxv8f16(<vscale x 8 x half> [[ZN_COERCE0:%.*]], <vscale x 8 x half> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+svint8_t test_svcvt_s8_f16_x2(svfloat16x2_t zn) MODE_ATTR {
+  return svcvt_s8_f16_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_s16_f32_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.fcvtzsn.nxv8i16.nxv4f32(<vscale x 4 x float> [[ZN_COERCE0:%.*]], <vscale x 4 x float> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z21test_svcvt_s16_f32_x213svfloat32x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.fcvtzsn.nxv8i16.nxv4f32(<vscale x 4 x float> [[ZN_COERCE0:%.*]], <vscale x 4 x float> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP0]]
+//
+svint16_t test_svcvt_s16_f32_x2(svfloat32x2_t zn) MODE_ATTR {
+  return svcvt_s16_f32_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_s32_f64_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.fcvtzsn.nxv4i32.nxv2f64(<vscale x 2 x double> [[ZN_COERCE0:%.*]], <vscale x 2 x double> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x i32> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z21test_svcvt_s32_f64_x213svfloat64x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.fcvtzsn.nxv4i32.nxv2f64(<vscale x 2 x double> [[ZN_COERCE0:%.*]], <vscale x 2 x double> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x i32> [[TMP0]]
+//
+svint32_t test_svcvt_s32_f64_x2(svfloat64x2_t zn) MODE_ATTR {
+  return svcvt_s32_f64_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_u8_f16_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.fcvtzun.nxv16i8.nxv8f16(<vscale x 8 x half> [[ZN_COERCE0:%.*]], <vscale x 8 x half> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z20test_svcvt_u8_f16_x213svfloat16x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.fcvtzun.nxv16i8.nxv8f16(<vscale x 8 x half> [[ZN_COERCE0:%.*]], <vscale x 8 x half> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+svuint8_t test_svcvt_u8_f16_x2(svfloat16x2_t zn) MODE_ATTR {
+  return svcvt_u8_f16_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_u16_f32_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.fcvtzun.nxv8i16.nxv4f32(<vscale x 4 x float> [[ZN_COERCE0:%.*]], <vscale x 4 x float> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z21test_svcvt_u16_f32_x213svfloat32x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.fcvtzun.nxv8i16.nxv4f32(<vscale x 4 x float> [[ZN_COERCE0:%.*]], <vscale x 4 x float> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP0]]
+//
+svuint16_t test_svcvt_u16_f32_x2(svfloat32x2_t zn) MODE_ATTR {
+  return svcvt_u16_f32_x2(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_u32_f64_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.fcvtzun.nxv4i32.nxv2f64(<vscale x 2 x double> [[ZN_COERCE0:%.*]], <vscale x 2 x double> [[ZN_COERCE1:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x i32> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z21test_svcvt_u32_f64_x213svfloat64x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.fcvtzun.nxv4i32.nxv2f64(<vscale x 2 x double> [[ZN_COERCE0:%.*]], <vscale x 2 x double> [[ZN_COERCE1:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x i32> [[TMP0]]
+//
+svuint32_t test_svcvt_u32_f64_x2(svfloat64x2_t zn) MODE_ATTR {
+  return svcvt_u32_f64_x2(zn);
+}
diff --git a/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_int_fp_cvt.c b/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_int_fp_cvt.c
new file mode 100644
index 0000000000000..6b7252e045e33
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_int_fp_cvt.c
@@ -0,0 +1,189 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sve -target-feature +sve2p3 -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sve -target-feature +sve2p3 -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sme -target-feature +sme2p3 -O1 -Werror -Wall -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64 -target-feature +sme -target-feature +sme2p3 -O1 -Werror -Wall -emit-llvm -o - -x c++ %s | FileCheck %s -check-prefix=CPP-CHECK
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +sve2p3\
+// RUN:   -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme -target-feature +sme2p3\
+// RUN:   -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+//
+// REQUIRES: aarch64-registered-target
+
+#include <arm_sve.h>
+
+#if defined __ARM_FEATURE_SME
+#define MODE_ATTR __arm_streaming
+#else
+#define MODE_ATTR
+#endif
+
+// CHECK-LABEL: @test_svcvtb_f16_s8(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.scvtfb.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvtb_f16_s8u10__SVInt8_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.scvtfb.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+svfloat16_t test_svcvtb_f16_s8(svint8_t zn) MODE_ATTR {
+  return svcvtb_f16_s8(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f32_s16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.scvtfb.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z19test_svcvtb_f32_s16u11__SVInt16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.scvtfb.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+svfloat32_t test_svcvtb_f32_s16(svint16_t zn) MODE_ATTR {
+  return svcvtb_f32_s16(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f64_s32(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z19test_svcvtb_f64_s32u11__SVInt32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+svfloat64_t test_svcvtb_f64_s32(svint32_t zn) MODE_ATTR {
+  return svcvtb_f64_s32(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f16_u8(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.ucvtfb.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvtb_f16_u8u11__SVUint8_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.ucvtfb.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+svfloat16_t test_svcvtb_f16_u8(svuint8_t zn) MODE_ATTR {
+  return svcvtb_f16_u8(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f32_u16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.ucvtfb.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z19test_svcvtb_f32_u16u12__SVUint16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.ucvtfb.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+svfloat32_t test_svcvtb_f32_u16(svuint16_t zn) MODE_ATTR {
+  return svcvtb_f32_u16(zn);
+}
+
+// CHECK-LABEL: @test_svcvtb_f64_u32(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.ucvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z19test_svcvtb_f64_u32u12__SVUint32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.ucvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+svfloat64_t test_svcvtb_f64_u32(svuint32_t zn) MODE_ATTR {
+  return svcvtb_f64_u32(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f16_s8(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.scvtflt.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z17test_svcvt_f16_s8u10__SVInt8_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.scvtflt.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+svfloat16_t test_svcvt_f16_s8(svint8_t zn) MODE_ATTR {
+  return svcvtt_f16_s8(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f32_s16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.scvtflt.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvt_f32_s16u11__SVInt16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.scvtflt.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+svfloat32_t test_svcvt_f32_s16(svint16_t zn) MODE_ATTR {
+  return svcvtt_f32_s16(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f64_s32(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtflt.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvt_f64_s32u11__SVInt32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtflt.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+svfloat64_t test_svcvt_f64_s32(svint32_t zn) MODE_ATTR {
+  return svcvtt_f64_s32(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f16_u8(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.ucvtflt.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z17test_svcvt_f16_u8u11__SVUint8_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.ucvtflt.f16i8(<vscale x 16 x i8> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 8 x half> [[TMP0]]
+//
+svfloat16_t test_svcvt_f16_u8(svuint8_t zn) MODE_ATTR {
+  return svcvtt_f16_u8(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f32_u16(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.ucvtflt.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvt_f32_u16u12__SVUint16_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.ucvtflt.f32i16(<vscale x 8 x i16> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 4 x float> [[TMP0]]
+//
+svfloat32_t test_svcvt_f32_u16(svuint16_t zn) MODE_ATTR {
+  return svcvtt_f32_u16(zn);
+}
+
+// CHECK-LABEL: @test_svcvt_f64_u32(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.ucvtflt.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+// CPP-CHECK-LABEL: @_Z18test_svcvt_f64_u32u12__SVUint32_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.ucvtflt.f64i32(<vscale x 4 x i32> [[ZN:%.*]])
+// CPP-CHECK-NEXT:    ret <vscale x 2 x double> [[TMP0]]
+//
+svfloat64_t test_svcvt_f64_u32(svuint32_t zn) MODE_ATTR {
+  return svcvtt_f64_u32(zn);
+}
diff --git a/llvm/include/llvm/IR/IntrinsicsAArch64.td b/llvm/include/llvm/IR/IntrinsicsAArch64.td
index 75929cbc222ad..d9f7314740953 100644
--- a/llvm/include/llvm/IR/IntrinsicsAArch64.td
+++ b/llvm/include/llvm/IR/IntrinsicsAArch64.td
@@ -1051,6 +1051,7 @@ def llvm_nxv4i1_ty  : LLVMType<nxv4i1>;
 def llvm_nxv8i1_ty  : LLVMType<nxv8i1>;
 def llvm_nxv16i1_ty : LLVMType<nxv16i1>;
 def llvm_nxv16i8_ty : LLVMType<nxv16i8>;
+def llvm_nxv8i16_ty : LLVMType<nxv8i16>;
 def llvm_nxv4i32_ty : LLVMType<nxv4i32>;
 def llvm_nxv2i64_ty : LLVMType<nxv2i64>;
 def llvm_nxv8f16_ty : LLVMType<nxv8f16>;
@@ -2610,6 +2611,29 @@ def int_aarch64_sve_fmlslb_lane   : SVE2_3VectorArgIndexed_Long_Intrinsic;
 def int_aarch64_sve_fmlslt        : SVE2_3VectorArg_Long_Intrinsic;
 def int_aarch64_sve_fmlslt_lane   : SVE2_3VectorArgIndexed_Long_Intrinsic;
 
+//
+// SVE2 - Multi-vector narrowing convert to floating point
+//
+
+class Builtin_SVCVT_UNPRED<LLVMType OUT, LLVMType IN>
+    : DefaultAttrsIntrinsic<[OUT], [IN], [IntrNoMem]>;
+
+def int_aarch64_sve_scvtfb_f16i8: Builtin_SVCVT_UNPRED<llvm_nxv8f16_ty, llvm_nxv16i8_ty>;
+def int_aarch64_sve_scvtfb_f32i16: Builtin_SVCVT_UNPRED<llvm_nxv4f32_ty, llvm_nxv8i16_ty>;
+def int_aarch64_sve_scvtfb_f64i32: Builtin_SVCVT_UNPRED<llvm_nxv2f64_ty, llvm_nxv4i32_ty>;
+
+def int_aarch64_sve_scvtflt_f16i8: Builtin_SVCVT_UNPRED<llvm_nxv8f16_ty, llvm_nxv16i8_ty>;
+def int_aarch64_sve_scvtflt_f32i16: Builtin_SVCVT_UNPRED<llvm_nxv4f32_ty, llvm_nxv8i16_ty>;
+def int_aarch64_sve_scvtflt_f64i32: Builtin_SVCVT_UNPRED<llvm_nxv2f64_ty, llvm_nxv4i32_ty>;
+
+def int_aarch64_sve_ucvtfb_f16i8: Builtin_SVCVT_UNPRED<llvm_nxv8f16_ty, llvm_nxv16i8_ty>;
+def int_aarch64_sve_ucvtfb_f32i16: Builtin_SVCVT_UNPRED<llvm_nxv4f32_ty, llvm_nxv8i16_ty>;
+def int_aarch64_sve_ucvtfb_f64i32: Builtin_SVCVT_UNPRED<llvm_nxv2f64_ty, llvm_nxv4i32_ty>;
+
+def int_aarch64_sve_ucvtflt_f16i8: Builtin_SVCVT_UNPRED<llvm_nxv8f16_ty, llvm_nxv16i8_ty>;
+def int_aarch64_sve_ucvtflt_f32i16: Builtin_SVCVT_UNPRED<llvm_nxv4f32_ty, llvm_nxv8i16_ty>;
+def int_aarch64_sve_ucvtflt_f64i32: Builtin_SVCVT_UNPRED<llvm_nxv2f64_ty, llvm_nxv4i32_ty>;
+
 //
 // SVE2 - Floating-point integer binary logarithm
 //
@@ -3526,6 +3550,10 @@ let TargetPrefix = "aarch64" in {
                 [LLVMSubdivide2VectorType<0>, LLVMSubdivi...
[truncated]

jthackray · 2026-03-16T14:44:46Z

                [LLVMSubdivide2VectorType<0>, LLVMSubdivide2VectorType<0>],
                [IntrNoMem]>;

+  class SVE2_CVT_VG2_Single_Intrinsic


What's the difference between this and SVE2_CVT_VG2_SINGLE_Intrinsic on line 3418? Could that be re-used?

Oh, I already thought that I saw an intrinsic with the same name somewhere else.

Unfortunately not, I was trying to use the subdivide by 2 vector type, but the problem it throws compilation errors when the type changes (fp -> int)

I did rename the intrinsic to "SVE2_CVT_VG2_Narrowing_Intrinsic". Please let me know if that naming is okay and if the typing makes sense

jthackray

Are some negative tests required in clang/test/Sema/? e.g. passing svint16_t to svcvtb_f16_s8(). Also, some tests just with -target-feature +sve without sve2p3 or sme2p3 to check they're not allowed to be called?

amilendra · 2026-03-16T15:04:57Z

You should re-generate the Sema tests and commit them when adding new SVE/SME clang builtins.
Command to generate the tests
python3 llvm-project/clang/utils/aarch64_builtins_test_generator.py --gen-streaming-guard-tests build/tools/clang/include/clang/Basic/arm_sve_builtins.json --out-dir llvm-project/clang/test/Sema/AArch64/

amilendra · 2026-03-16T15:16:48Z

+#else
+#define MODE_ATTR
+#endif
+


You probably should add some overloading tests here as well. See 542d2a5 for something similar.

#ifdef SVE_OVERLOADED_FORMS // A simple used,unused... macro, long enough to represent any SVE builtin. #define SVE_ACLE_FUNC(A1,A2_UNUSED,A3,A4_UNUSED) A1##A3 #else #define SVE_ACLE_FUNC(A1,A2,A3,A4) A1##A2##A3##A4 #endif

Thanks, I noticed that I actually should have been more careful with how I set my overloading.
This looks wrong for example:

svcvtt_f16[_s8]

and

svcvtt_f16[_u8]

Please let me know if this looks okay after the update

Actually, I think it is fine to add an overload like that, because we could call svcvtt_f16 with a signed and unsigned argument in that case. I think I would just have to drop the "isOverloadNone" flag

Edit:
I have changed it back in the latest commit. As far as I understand, the "isOverloadNone" flag does not matter in this case. The overload introduced by the square brackets should be resolved as a valid shortened form.
It is fine to have this implicitly introduced overload like here for example:
svcvtb_f64(svint32_t_val);
svcvtb_f64(svuint32_t_val);

amilendra · 2026-03-16T15:22:02Z

Change the subject of the commit message to [AArch64] Add 9.7 CVT data processing intrinsics? There are other types of data processing (Arithmetic data processing and Absolute data processing) intrinsics to be added by separate patches.

CarolineConcatto

Can you update the commit message and add which prototypes are you implementing.

github-actions · 2026-03-18T10:37:44Z

🐧 Linux x64 Test Results

204434 tests passed
6504 tests skipped

✅ The build succeeded and all tests passed.

github-actions · 2026-03-18T10:37:44Z

🪟 Windows x64 Test Results

136520 tests passed
4633 tests skipped

✅ The build succeeded and all tests passed.

MartinWehking · 2026-03-18T10:39:57Z

Are some negative tests required in clang/test/Sema/? e.g. passing svint16_t to svcvtb_f16_s8(). Also, some tests just with -target-feature +sve without sve2p3 or sme2p3 to check they're not allowed to be called?

I autogenerated the Sema test with the command that @amilendra suggested. Do you think there should be more testing added beyond the autogenerated lines?

MartinWehking · 2026-03-18T12:04:28Z

Can you update the commit message and add which prototypes are you implementing.

I've extended my comment. I hope that's ok

CarolineConcatto · 2026-03-31T10:35:08Z

+
+// CHECK-LABEL: @test_svcvtb_f64_s32(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.scvtfb.f64i32(<vscale x 4 x i32> [[ZN:%.*]])


I am puzzle to why this one has f64i32, like it is scalar inputs and not nxv8i16.nxv4f32 as expected for scalable vectors.

I saw now, this is because the intrinsics are like this in Intrinsics.td

Do you think we can leave the naming like it is?

CarolineConcatto

LGTM

Lukacma · 2026-04-23T12:31:48Z

+def SVCVTT_F16_S8  : SInst<"svcvtt_f16[_s8]",  "Od", "c", MergeNone, "aarch64_sve_scvtflt_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F32_S16  : SInst<"svcvtt_f32[_s16]",  "Md", "s", MergeNone, "aarch64_sve_scvtflt_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F64_S32  : SInst<"svcvtt_f64[_s32]",  "Nd", "i", MergeNone, "aarch64_sve_scvtflt_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+
+def SVCVTT_F16_U8  : SInst<"svcvtt_f16[_u8]",  "Od", "Uc", MergeNone, "aarch64_sve_ucvtflt_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F32_U16  : SInst<"svcvtt_f32[_u16]",  "Md", "Us", MergeNone, "aarch64_sve_ucvtflt_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTT_F64_U32  : SInst<"svcvtt_f64[_u32]",  "Nd", "Ui", MergeNone, "aarch64_sve_ucvtflt_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+
+def SVCVTB_F16_S8  : SInst<"svcvtb_f16[_s8]",  "Od", "c", MergeNone, "aarch64_sve_scvtfb_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F32_S16  : SInst<"svcvtb_f32[_s16]",  "Md", "s", MergeNone, "aarch64_sve_scvtfb_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F64_S32  : SInst<"svcvtb_f64[_s32]",  "Nd", "i", MergeNone, "aarch64_sve_scvtfb_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;
+
+def SVCVTB_F16_U8  : SInst<"svcvtb_f16[_u8]",  "Od", "Uc", MergeNone, "aarch64_sve_ucvtfb_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F32_U16  : SInst<"svcvtb_f32[_u16]",  "Md", "Us", MergeNone, "aarch64_sve_ucvtfb_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;
+def SVCVTB_F64_U32  : SInst<"svcvtb_f64[_u32]",  "Nd", "Ui", MergeNone, "aarch64_sve_ucvtfb_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;


Suggested change

def SVCVTT_F16_S8 : SInst<"svcvtt_f16[_s8]", "Od", "c", MergeNone, "aarch64_sve_scvtflt_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTT_F32_S16 : SInst<"svcvtt_f32[_s16]", "Md", "s", MergeNone, "aarch64_sve_scvtflt_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTT_F64_S32 : SInst<"svcvtt_f64[_s32]", "Nd", "i", MergeNone, "aarch64_sve_scvtflt_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTT_F16_U8 : SInst<"svcvtt_f16[_u8]", "Od", "Uc", MergeNone, "aarch64_sve_ucvtflt_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTT_F32_U16 : SInst<"svcvtt_f32[_u16]", "Md", "Us", MergeNone, "aarch64_sve_ucvtflt_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTT_F64_U32 : SInst<"svcvtt_f64[_u32]", "Nd", "Ui", MergeNone, "aarch64_sve_ucvtflt_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTB_F16_S8 : SInst<"svcvtb_f16[_s8]", "Od", "c", MergeNone, "aarch64_sve_scvtfb_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTB_F32_S16 : SInst<"svcvtb_f32[_s16]", "Md", "s", MergeNone, "aarch64_sve_scvtfb_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTB_F64_S32 : SInst<"svcvtb_f64[_s32]", "Nd", "i", MergeNone, "aarch64_sve_scvtfb_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTB_F16_U8 : SInst<"svcvtb_f16[_u8]", "Od", "Uc", MergeNone, "aarch64_sve_ucvtfb_f16i8", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTB_F32_U16 : SInst<"svcvtb_f32[_u16]", "Md", "Us", MergeNone, "aarch64_sve_ucvtfb_f32i16", [IsOverloadNone, VerifyRuntimeMode]>;

def SVCVTB_F64_U32 : SInst<"svcvtb_f64[_u32]", "Nd", "Ui", MergeNone, "aarch64_sve_ucvtfb_f64i32", [IsOverloadNone, VerifyRuntimeMode]>;

foreach suffix = ["b", "t" ] in {

def SVCVT # !toupper(suffix) # _S: SInst<"svcvt" # suffix # "[_{d}_{1}]", "d^", "hfd", MergeNone, "aarch64_sve_scvtf" # suffix, [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;

def SVCVT # !toupper(suffix) # _U: SInst<"svcvt" # suffix # "[_{d}_{1}]", "de", "hfd", MergeNone, "aarch64_sve_ucvtf" # suffix, [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;

}

Okay so as you can see this can be expressed more concisely (assuming it compiles, but it should). This will require adding new option to Prototype modifiers whose behaviour will be very similiar to 'e' just with signed integer instead. I named it "^" for now, but you should use different letter. I would personally drop toupper as is just a name, but I am fine either way. Also I am not sure why you defined so many LLVM intrinsics instead of overloading? Additionally I think _f* might not need to be part of the short name. The only way it would be needed is if they planned 4-way widening for this. This might be worth discussing in ACLE.

We already have aarch64_sve_scvtf and int_aarch64_sve_ucvtf in the compiler, so keeping them it is following the pattern in the IR naming scheme. For instance there are:
int_aarch64_sve_ucvtf and int_aarch64_sve_ucvtf_f16i32
int_aarch64_sve_scvtf and int_aarch64_sve_scvtf_f16i32.
If we keep the pattern in LLVM, I dont think it is worth discussing in the ACLE if there is any plan to add 4-way widening?

I am not sure I quite understand what you are trying to say here. Could you please elaborate bit more ?

Also I am not sure why you defined so many LLVM intrinsics instead of overloading?

I tried your suggestions locally, but I'm running into errors:

Call parameter type does not match function signature! %0 = load <vscale x 16 x i8>, ptr %zn.addr, align 16

in clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2_int_fp_cvt.c

Is IsOverloadWhileOrMultiVecCvt a suitable type flag?
It seems to expecting a second source operand

The code I provided is just a suggestion. I didn't try compiling it so it might need modification to properly compile. Also feel free to let me know if you think my idea has no chance of working. It is possible I might have missed smth.

I applied your suggestions and added a new flag, so that this is actually possible.
You can see those changes in the last commit.

Note that I squashed the commits from ealier.

Lukacma · 2026-04-23T13:09:09Z

Suggested change

def int_aarch64_sve_scvtfb: DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [llvm_anyint_ty], [IntrNoMem]>;

def int_aarch64_sve_scvtft: DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [llvm_anyint_ty], [IntrNoMem]>;

Unfortunately we dont have precise IITType for this so we will need to add less constraint overload here. That should be fine though.

Lukacma · 2026-04-23T13:31:38Z

 }
+
+let SVETargetGuard = "sve2p3|sme2p3", SMETargetGuard = "sve2p3|sme2p3" in {
+def SVCVTZN_S8_F16  : SInst<"svcvtzn_s8[_f16_x2]",  "d2.O", "c", MergeNone, "aarch64_sve_fcvtzsn_x2", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;


This can be made more concise similarly to my lower comment. Also not sure why _s8 and so on are mandatory. Here I cannot imagine any 4-way narrowing as it wouldn't fill whole vector.

Not sure I follow, could you give me an example please?

Edit: after simplifying it like this:

def SVCVTZN_S : SInst<"svcvtzn_{0}[_{1}_x2]", "y2.d", "hfd", MergeNone, "aarch64_sve_fcvtzsn_x2", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>; def SVCVTZN_U : SInst<"svcvtzn_{0}[_{1}_x2]", "e2.d", "hfd", MergeNone, "aarch64_sve_fcvtzun_x2", [IsOverloadWhileOrMultiVecCvt, VerifyRuntimeMode]>;

I'm receiving errors similar to the other suggestion:

Call parameter type does not match function signature!

Yes, that kind of simplification is what I had in mind. I’m not sure yet whether the error is due to an implementation issue or a problem with the approach itself. If you find that the simplification has a fundamental issue, feel free to ignore the suggestion. I’d be interested to understand the issue, though, so a brief explanation would be appreciated

No problem, so the problem is with the IsOverloadWhileOrMultiVecCvt flag handling here:
It tries to lower to the intrinsic with {DefaultType, Ops[1]->getType()};
For float it sees these types:

ResultType: <vscale x 8 x i16> DefaultType: <vscale x 4 x float> Ops[0]: <vscale x 4 x float>

It would be correct if it was {ResultType, Ops[1]->getType()};.
Not sure if it's a good idea to add a new flag as part of this scope.

I think we can reuse IsReductionQV if we drop the guards from ARM.cpp, which I think are unnecessary:

if (TypeFlags.isReductionQV() && !ResultType->isScalableTy() && ResultType->isVectorTy()) return {ResultType, Ops[1]->getType()};

But honestly I think we need to rework how we specify the overloading. The naming of the flag is pretty bad as they use intrinsics types instead in names, instead of saying what each flag uses for overloading. I also think it would make more sense to more flexible system for specifying the overloads, instead of relying on flags. But that is definitely beyond the scope of this patch.

I would leave it like that now in that case and simplify this once the flags have been updated if that's alright with you

I don't see a reason for waiting? Updating the logic for IsReductionQV is 1 line change and will make the patch significantly simpler.

Sure.
Not sure if I understood you here, but the naming of the flag needs to be changed?
If yes, any suggestions?

No just leave the name as it is. Its fine for now. As mentioned properly renaming/improving overloading functionality would be separate PR.

Add Clang/LLVM intrinsics for svcvt, scvtflt, ucvtf, ucvtflt and fcvtzsn, fcvtzun. The Clang intrinsics are guarded by the sve2.3 and sme2.3 feature flags. ACLE Patch: ARM-software/acle#428 Fix overload and address comments Fix intrinsic name and simplify CHECK lines Reintroduce overloaded short forms for intrinsics Adapt the test cases accordingly. Rename ACLE clang intrinsic A clang intrinsic was renamed in the ACLE patch. Change the name accordingly. Use existing pattern template Apply suggestions Apply suggestions

Introduce a new flag for overload resolution and combine some front end intrinsics

MartinWehking · 2026-05-19T15:39:17Z

  def SVDOT_LANE_X2_SH : SInst<"svdot_lane[_{d}_{2}]", "ddhhi", "s",  MergeNone, "aarch64_sve_sdot_lane_x2", [VerifyRuntimeMode], [ImmCheck<3, ImmCheck0_7>]>;
  def SVDOT_LANE_X2_UH : SInst<"svdot_lane[_{d}_{2}]", "ddhhi", "Us", MergeNone, "aarch64_sve_udot_lane_x2", [VerifyRuntimeMode], [ImmCheck<3, ImmCheck0_7>]>;
-}
+}


ToDo change this

llvmbot added backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" llvm:ir labels Mar 16, 2026

jthackray requested review from CarolineConcatto, Lukacma, jthackray and kmclaughlin-arm March 16, 2026 14:22

jthackray reviewed Mar 16, 2026

View reviewed changes

MartinWehking commented Mar 16, 2026

View reviewed changes

Comment thread llvm/include/llvm/IR/IntrinsicsAArch64.td Outdated

amilendra reviewed Mar 16, 2026

View reviewed changes

kmclaughlin-arm reviewed Mar 16, 2026

View reviewed changes

Comment thread llvm/test/CodeGen/AArch64/sve2p3-intrinsics-fp-converts.ll Outdated

Comment thread llvm/test/CodeGen/AArch64/sve2p3-intrinsics-fp-converts.ll Outdated

CarolineConcatto reviewed Mar 17, 2026

View reviewed changes

MartinWehking changed the title ~~[AArch64] Add 9.7 data processing intrinsics~~ [AArch64] Add 9.7 CVT data processing intrinsics Mar 18, 2026

MartinWehking mentioned this pull request Mar 18, 2026

Add alpha support for 9.7 data processing intrinsics ARM-software/acle#428

Open

8 tasks

CarolineConcatto reviewed Mar 31, 2026

View reviewed changes

CarolineConcatto approved these changes Apr 9, 2026

View reviewed changes

Lukacma requested changes Apr 23, 2026

View reviewed changes

llvmorg-github-actions Bot added the clang:codegen IR generation bugs: mangling, exceptions, etc. label May 19, 2026

MartinWehking added 2 commits May 19, 2026 14:02

Apply suggestions

ad13945

Introduce a new flag for overload resolution and combine some front end intrinsics

MartinWehking force-pushed the data-intrinsics branch from 83d09bd to ad13945 Compare May 19, 2026 15:02

MartinWehking commented May 19, 2026

View reviewed changes


	def int_aarch64_sve_scvtfb: DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [llvm_anyint_ty], [IntrNoMem]>;
	def int_aarch64_sve_scvtft: DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [llvm_anyint_ty], [IntrNoMem]>;

Conversation

MartinWehking commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Mar 16, 2026 • edited by llvmorg-github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jthackray left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

amilendra commented Mar 16, 2026

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MartinWehking Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MartinWehking Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MartinWehking Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

amilendra commented Mar 16, 2026

Uh oh!

Uh oh!

Uh oh!

CarolineConcatto left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🐧 Linux x64 Test Results

Uh oh!

github-actions Bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🪟 Windows x64 Test Results

Uh oh!

MartinWehking commented Mar 18, 2026

Uh oh!

MartinWehking commented Mar 18, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

CarolineConcatto left a comment

Choose a reason for hiding this comment

Uh oh!

Lukacma Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MartinWehking commented Mar 16, 2026 •

edited

Loading

llvmbot commented Mar 16, 2026 •

edited by llvmorg-github-actions Bot

Loading

jthackray left a comment •

edited

Loading

MartinWehking Mar 18, 2026 •

edited

Loading

MartinWehking Mar 18, 2026 •

edited

Loading

MartinWehking Mar 18, 2026 •

edited

Loading

github-actions Bot commented Mar 18, 2026 •

edited

Loading

github-actions Bot commented Mar 18, 2026 •

edited

Loading

Lukacma Apr 23, 2026 •

edited

Loading

MartinWehking May 18, 2026 •

edited

Loading

MartinWehking May 19, 2026 •

edited

Loading

MartinWehking May 19, 2026 •

edited

Loading