[AArch64] Implement the atomic store with hint intrinsic#198316
[AArch64] Implement the atomic store with hint intrinsic#198316kmclaughlin-arm wants to merge 2 commits into
Conversation
Adds the following ACLE intrinsic as described in [1]:
void __arm_atomic_store_with_hint(type *ptr, type data,
int memory_order, int hint);
A regular atomic store instruction is emitted in Clang for this builtin
with additional metadata (`!aarch64.atomic.hint`), which ensures the
instruction is recognised as atomic by passes in LLVM.
When an atomic store has this metadata, this lowers to the ATOMIC_STORE_HINT
pseudo which is later expanded by AArch64ExpandPseudoInsts into an STSHH
instruction plus an atomic store.
The hint value is represented using MOTargetFlag3 & MOTargetFlag4 flags,
which will need to be extended when new hints are added in future.
[1] ARM-software/acle#432
|
@llvm/pr-subscribers-llvm-support @llvm/pr-subscribers-backend-aarch64 Author: Kerry McLaughlin (kmclaughlin-arm) ChangesAdds the following ACLE intrinsic as described in [1]: An atomic store instruction is emitted in Clang for this builtin with additional metadata The hint value is represented using MOTargetFlag3 & MOTargetFlag4 flags, Patch is 49.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/198316.diff 20 Files Affected:
diff --git a/clang/include/clang/Basic/BuiltinsAArch64.td b/clang/include/clang/Basic/BuiltinsAArch64.td
index ba30e344911aa..7b8f19f2893b3 100644
--- a/clang/include/clang/Basic/BuiltinsAArch64.td
+++ b/clang/include/clang/Basic/BuiltinsAArch64.td
@@ -171,6 +171,10 @@ let Attributes = [NoThrow], Features = "ls64" in {
def st64bv0 : AArch64TargetBuiltin<"uint64_t (void *, uint64_t const *)">;
}
+let Attributes = [NoThrow, CustomTypeChecking] in {
+ def atomic_store_with_hint : AArch64Builtin<"void(...)">;
+}
+
// Armv9.3-A Guarded Control Stack
let Attributes = [NoThrow], Features = "gcs" in {
def gcspopm : AArch64TargetBuiltin<"uint64_t (uint64_t)">;
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index c73c116cdc451..241aef979728a 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -9666,6 +9666,12 @@ def err_atomic_op_needs_atomic_int_or_fp : Error<
def err_atomic_op_needs_atomic_int : Error<
"address argument to atomic operation must be a pointer to "
"%select{|atomic }0integer (%1 invalid)">;
+def err_atomic_op_hint_data_size : Error<
+ "address argument to atomic store with hint must be of size 8, 16, 32 or 64 bits">;
+def err_atomic_hint_has_invalid_memory_order : Error<
+ "invalid memory order argument to atomic hint operation (%0 invalid)">;
+def err_atomic_hint_has_invalid_hint_type : Error<
+ "invalid hint type argument to atomic hint operation (%0 invalid)">;
def warn_atomic_op_has_invalid_memory_order : Warning<
"%select{|success |failure }0memory order argument to atomic operation is invalid">,
InGroup<DiagGroup<"atomic-memory-ordering">>;
diff --git a/clang/include/clang/Sema/SemaARM.h b/clang/include/clang/Sema/SemaARM.h
index af8e0e9047171..b0a01c40ffece 100644
--- a/clang/include/clang/Sema/SemaARM.h
+++ b/clang/include/clang/Sema/SemaARM.h
@@ -70,6 +70,7 @@ class SemaARM : public SemaBase {
bool BuiltinARMSpecialReg(unsigned BuiltinID, CallExpr *TheCall, int ArgNum,
unsigned ExpectedFieldNum, bool AllowName);
bool BuiltinARMMemoryTaggingCall(unsigned BuiltinID, CallExpr *TheCall);
+ bool BuiltinARMAtomicStoreHintCall(unsigned BuiltinID, CallExpr *TheCall);
bool MveAliasValid(unsigned BuiltinID, llvm::StringRef AliasName);
bool CdeAliasValid(unsigned BuiltinID, llvm::StringRef AliasName);
diff --git a/clang/lib/CodeGen/TargetBuiltins/ARM.cpp b/clang/lib/CodeGen/TargetBuiltins/ARM.cpp
index 95168e94106cc..1162d0a31e975 100644
--- a/clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/ARM.cpp
@@ -20,6 +20,7 @@
#include "llvm/IR/IntrinsicsAArch64.h"
#include "llvm/IR/IntrinsicsARM.h"
#include "llvm/IR/IntrinsicsBPF.h"
+#include "llvm/Support/AArch64AtomicHints.h"
#include "llvm/TargetParser/AArch64TargetParser.h"
#include <numeric>
@@ -2129,6 +2130,56 @@ static Value *EmitRangePrefetchBuiltin(CodeGenFunction &CGF, unsigned BuiltinID,
Ops);
}
+static Value *EmitAtomicStoreWithHintBuiltin(CodeGenFunction &CGF,
+ unsigned BuiltinID,
+ const CallExpr *E) {
+ CodeGen::CGBuilderTy &Builder = CGF.Builder;
+ CodeGen::CodeGenModule &CGM = CGF.CGM;
+ Expr::EvalResult Result;
+ if (!E->getArg(2)->EvaluateAsInt(Result, CGM.getContext()))
+ llvm_unreachable(
+ "Expected integer policy argument to atomic store with hint.");
+
+ StoreInst *Store =
+ Builder.CreateStore(CGF.EmitScalarExpr(E->getArg(1)), // Value
+ CGF.EmitPointerWithAlignment(E->getArg(0))); // Ptr;
+
+ AtomicOrdering Ordering;
+ unsigned OrderingArg = Result.Val.getInt().getExtValue();
+ assert(isValidAtomicOrderingCABI(OrderingArg) && "Invalid atomic ordering");
+
+ switch (static_cast<AtomicOrderingCABI>(OrderingArg)) {
+ default:
+ llvm_unreachable("Unsupported atomic ordering found.");
+ case AtomicOrderingCABI::relaxed:
+ Ordering = AtomicOrdering::Monotonic;
+ break;
+ case AtomicOrderingCABI::release:
+ Ordering = AtomicOrdering::Release;
+ break;
+ case AtomicOrderingCABI::seq_cst:
+ Ordering = AtomicOrdering::SequentiallyConsistent;
+ break;
+ }
+ Store->setAtomic(Ordering);
+
+ if (!E->getArg(3)->EvaluateAsInt(Result, CGM.getContext()))
+ llvm_unreachable(
+ "Expected integer hint argument to atomic store with hint.");
+ unsigned HintArg = Result.Val.getInt().getExtValue();
+ assert((getAtomicStoreHintFromMD(HintArg) !=
+ AArch64AtomicStoreHint::HINT_NONE) &&
+ "Invalid hint type");
+
+ MDNode *HintMDVal =
+ MDNode::get(CGM.getLLVMContext(),
+ llvm::ConstantAsMetadata::get(Builder.getInt32(HintArg)));
+ Store->setMetadata(CGM.getModule().getMDKindID("aarch64.atomic.hint"),
+ HintMDVal);
+
+ return Store;
+}
+
/// Return true if BuiltinID is an overloaded Neon intrinsic with an extra
/// argument that specifies the vector type. The additional argument is meant
/// for Sema checking (see `CheckNeonBuiltinFunctionCall`) and this function
@@ -4893,6 +4944,9 @@ Value *CodeGenFunction::EmitAArch64BuiltinExpr(unsigned BuiltinID,
BuiltinID == AArch64::BI__builtin_arm_range_prefetch_x)
return EmitRangePrefetchBuiltin(*this, BuiltinID, E);
+ if (BuiltinID == AArch64::BI__builtin_arm_atomic_store_with_hint)
+ return EmitAtomicStoreWithHintBuiltin(*this, BuiltinID, E);
+
// Memory Tagging Extensions (MTE) Intrinsics
Intrinsic::ID MTEIntrinsicID = Intrinsic::not_intrinsic;
switch (BuiltinID) {
diff --git a/clang/lib/Headers/arm_acle.h b/clang/lib/Headers/arm_acle.h
index 9a6b6a837fa5a..bd99527dc5fa8 100644
--- a/clang/lib/Headers/arm_acle.h
+++ b/clang/lib/Headers/arm_acle.h
@@ -741,6 +741,12 @@ __arm_st64bv0(void *__addr, data512_t __value) {
}
#endif
+/* Atomic store with hints */
+#if defined(__ARM_64BIT_STATE) && __ARM_64BIT_STATE
+#define __arm_atomic_store_with_hint(ptr, data, memory_order, hint) \
+ __builtin_arm_atomic_store_with_hint(ptr, data, memory_order, hint)
+#endif
+
/* 11.1 Special register intrinsics */
#define __arm_rsr(sysreg) __builtin_arm_rsr(sysreg)
#define __arm_rsr64(sysreg) __builtin_arm_rsr64(sysreg)
diff --git a/clang/lib/Sema/SemaARM.cpp b/clang/lib/Sema/SemaARM.cpp
index f57c9c8b87cd5..ef5c46de1e174 100644
--- a/clang/lib/Sema/SemaARM.cpp
+++ b/clang/lib/Sema/SemaARM.cpp
@@ -17,6 +17,7 @@
#include "clang/Sema/Initialization.h"
#include "clang/Sema/ParsedAttr.h"
#include "clang/Sema/Sema.h"
+#include "llvm/Support/AArch64AtomicHints.h"
namespace clang {
@@ -320,6 +321,94 @@ bool SemaARM::BuiltinARMSpecialReg(unsigned BuiltinID, CallExpr *TheCall,
return false;
}
+bool SemaARM::BuiltinARMAtomicStoreHintCall(unsigned BuiltinID,
+ CallExpr *TheCall) {
+ if (SemaRef.checkArgCount(TheCall, 4))
+ return true;
+
+ // Arg 0 should be the pointer type. The pointee type must be a
+ // scalar integral or floating-point type of 8, 16, 32 or 64 bits.
+ ASTContext &Context = getASTContext();
+ Expr *PtrArg = TheCall->getArg(0);
+ auto PtrArgRes = SemaRef.DefaultFunctionArrayLvalueConversion(PtrArg);
+ if (PtrArgRes.isInvalid())
+ return true;
+ auto *PtrTy = PtrArg->getType()->getAs<PointerType>();
+ if (!PtrTy)
+ return Diag(TheCall->getBeginLoc(),
+ diag::err_atomic_builtin_must_be_pointer)
+ << PtrArg->getType() << 0 << PtrArg->getSourceRange();
+ QualType PtrQT = PtrTy->getPointeeType();
+
+ // TODO: Allow MFloat8 types when supported by atomic store
+ if (!PtrQT->isIntegralType(getASTContext()) && !PtrQT->isFloatingType())
+ return Diag(TheCall->getBeginLoc(),
+ diag::err_atomic_op_needs_atomic_int_or_fp)
+ << 0 << PtrQT << PtrArg->getSourceRange();
+
+ unsigned TySize =
+ Context.getTypeSize(Context.getCanonicalType(PtrQT).getUnqualifiedType());
+ if (TySize != 8 && TySize != 16 && TySize != 32 && TySize != 64)
+ return Diag(TheCall->getBeginLoc(), diag::err_atomic_op_hint_data_size)
+ << PtrArg->getSourceRange();
+
+ // Arg 1 is the data to be stored. The type must match the pointee
+ // type found above.
+ auto DataArgRes =
+ SemaRef.DefaultFunctionArrayLvalueConversion(TheCall->getArg(1));
+ if (DataArgRes.isInvalid())
+ return true;
+ QualType DataQT = DataArgRes.get()->getType();
+
+ if (PtrQT != DataQT)
+ return Diag(TheCall->getBeginLoc(),
+ diag::err_typecheck_call_different_arg_types)
+ << PtrQT << DataQT;
+
+ // Arg 2 is the memory order, which must be relaxed, release or seq_cst
+ auto MemOrdArg =
+ SemaRef.DefaultFunctionArrayLvalueConversion(TheCall->getArg(2)).get();
+ std::optional<llvm::APSInt> MemOrdAP =
+ MemOrdArg->getIntegerConstantExpr(Context);
+ if (!MemOrdAP)
+ return Diag(TheCall->getBeginLoc(),
+ diag::err_atomic_hint_has_invalid_memory_order)
+ << MemOrdArg->getType() << MemOrdArg->getSourceRange();
+
+ unsigned Ordering = MemOrdAP->getZExtValue();
+ if (!llvm::isValidAtomicOrderingCABI(Ordering))
+ return Diag(TheCall->getBeginLoc(),
+ diag::err_atomic_hint_has_invalid_memory_order)
+ << *MemOrdAP << MemOrdArg->getSourceRange();
+
+ auto AtomicOrdering = static_cast<llvm::AtomicOrderingCABI>(Ordering);
+ if (AtomicOrdering != llvm::AtomicOrderingCABI::relaxed &&
+ AtomicOrdering != llvm::AtomicOrderingCABI::release &&
+ AtomicOrdering != llvm::AtomicOrderingCABI::seq_cst)
+ return Diag(TheCall->getBeginLoc(),
+ diag::err_atomic_hint_has_invalid_memory_order)
+ << *MemOrdAP << MemOrdArg->getSourceRange();
+
+ // Arg 3 is the hint type. Only values represented by AArch64AtomicStoreHint
+ // are valid.
+ auto HintArg =
+ SemaRef.DefaultFunctionArrayLvalueConversion(TheCall->getArg(3)).get();
+ std::optional<llvm::APSInt> HintAP = HintArg->getIntegerConstantExpr(Context);
+ if (!HintAP)
+ return Diag(TheCall->getBeginLoc(),
+ diag::err_atomic_hint_has_invalid_hint_type)
+ << HintArg->getType() << HintArg->getSourceRange();
+
+ unsigned Hint = HintAP->getZExtValue();
+ if (llvm::getAtomicStoreHintFromMD(Hint) ==
+ llvm::AArch64AtomicStoreHint::HINT_NONE)
+ return Diag(TheCall->getBeginLoc(),
+ diag::err_atomic_hint_has_invalid_hint_type)
+ << *HintAP << HintArg->getSourceRange();
+
+ return false;
+}
+
/// getNeonEltType - Return the QualType corresponding to the elements of
/// the vector type specified by the NeonTypeFlags. This is used to check
/// the pointer arguments for Neon load/store intrinsics.
@@ -1164,6 +1253,9 @@ bool SemaARM::CheckAArch64BuiltinFunctionCall(const TargetInfo &TI,
BuiltinID == AArch64::BI__builtin_arm_wsrp)
return BuiltinARMSpecialReg(BuiltinID, TheCall, 0, 5, true);
+ if (BuiltinID == AArch64::BI__builtin_arm_atomic_store_with_hint)
+ return BuiltinARMAtomicStoreHintCall(BuiltinID, TheCall);
+
// Only check the valid encoding range. Any constant in this range would be
// converted to a register of the form S2_2_C3_C4_5. Let the hardware throw
// an exception for incorrect registers. This matches MSVC behavior.
diff --git a/clang/test/CodeGen/arm_acle.c b/clang/test/CodeGen/arm_acle.c
index cd18fa63bfdbd..a8aa0916a8a4c 100644
--- a/clang/test/CodeGen/arm_acle.c
+++ b/clang/test/CodeGen/arm_acle.c
@@ -1821,3 +1821,81 @@ int test_rndrrs(uint64_t *__addr) {
return __rndrrs(__addr);
}
#endif
+
+#if defined(__ARM_64BIT_STATE)
+
+// AArch64-LABEL: @test_atomic_store_hint_char(
+// AArch64-NEXT: entry:
+// AArch64-NEXT: store atomic i8 [[DATA:%.*]], ptr [[PTR:%.*]] monotonic, align 1, !aarch64.atomic.hint [[META3:![0-9]+]]
+// AArch64-NEXT: ret void
+//
+void test_atomic_store_hint_char(char *ptr, char data) {
+ __arm_atomic_store_with_hint(ptr, data, __ATOMIC_RELAXED, 0);
+}
+
+// AArch64-LABEL: @test_atomic_store_hint_bfloat(
+// AArch64-NEXT: entry:
+// AArch64-NEXT: store atomic bfloat [[DATA:%.*]], ptr [[PTR:%.*]] release, align 2, !aarch64.atomic.hint [[META4:![0-9]+]]
+// AArch64-NEXT: ret void
+//
+void test_atomic_store_hint_bfloat(__bf16 *ptr, __bf16 data) {
+ __arm_atomic_store_with_hint(ptr, data, __ATOMIC_RELEASE, 1);
+}
+
+// AArch64-LABEL: @test_atomic_store_hint_short(
+// AArch64-NEXT: entry:
+// AArch64-NEXT: store atomic i16 [[DATA:%.*]], ptr [[PTR:%.*]] release, align 2, !aarch64.atomic.hint [[META3]]
+// AArch64-NEXT: ret void
+//
+void test_atomic_store_hint_short(short *ptr, short data) {
+ __arm_atomic_store_with_hint(ptr, data, __ATOMIC_RELEASE, 0);
+}
+
+// AArch64-LABEL: @test_atomic_store_hint_u32(
+// AArch64-NEXT: entry:
+// AArch64-NEXT: store atomic i32 [[DATA:%.*]], ptr [[PTR:%.*]] seq_cst, align 4, !aarch64.atomic.hint [[META3]]
+// AArch64-NEXT: ret void
+//
+void test_atomic_store_hint_u32(uint32_t *ptr, uint32_t data) {
+ __arm_atomic_store_with_hint(ptr, data, __ATOMIC_SEQ_CST, 0);
+}
+
+// AArch64-LABEL: @test_atomic_store_hint_float(
+// AArch64-NEXT: entry:
+// AArch64-NEXT: store atomic float [[DATA:%.*]], ptr [[PTR:%.*]] seq_cst, align 4, !aarch64.atomic.hint [[META3]]
+// AArch64-NEXT: ret void
+//
+void test_atomic_store_hint_float(float *ptr, float data) {
+ __arm_atomic_store_with_hint(ptr, data, __ATOMIC_SEQ_CST, 0);
+}
+
+// AArch64-LABEL: @test_atomic_store_hint_s64(
+// AArch64-NEXT: entry:
+// AArch64-NEXT: store atomic i64 [[DATA:%.*]], ptr [[PTR:%.*]] monotonic, align 8, !aarch64.atomic.hint [[META4]]
+// AArch64-NEXT: ret void
+//
+void test_atomic_store_hint_s64(int64_t *ptr, int64_t data) {
+ __arm_atomic_store_with_hint(ptr, data, __ATOMIC_RELAXED, 1);
+}
+
+// AArch64-LABEL: @test_atomic_store_hint_long_long_int(
+// AArch64-NEXT: entry:
+// AArch64-NEXT: store atomic i64 [[DATA:%.*]], ptr [[PTR:%.*]] release, align 8, !aarch64.atomic.hint [[META3]]
+// AArch64-NEXT: ret void
+//
+void test_atomic_store_hint_long_long_int(long long int *ptr, long long int data) {
+ __arm_atomic_store_with_hint(ptr, data, __ATOMIC_RELEASE, 0);
+}
+
+// AArch64-LABEL: @test_atomic_store_hint_double(
+// AArch64-NEXT: entry:
+// AArch64-NEXT: store atomic double [[DATA:%.*]], ptr [[PTR:%.*]] monotonic, align 8, !aarch64.atomic.hint [[META4]]
+// AArch64-NEXT: ret void
+//
+void test_atomic_store_hint_double(double *ptr, double data) {
+ __arm_atomic_store_with_hint(ptr, data, __ATOMIC_RELAXED, 1);
+}
+
+// AArch64: [[META3]] = !{i32 0}
+// AArch64-NEXT: [[META4]] = !{i32 1}
+#endif
diff --git a/clang/test/CodeGen/builtins-arm64.c b/clang/test/CodeGen/builtins-arm64.c
index 3d054c79f1777..ad9ba7feca671 100644
--- a/clang/test/CodeGen/builtins-arm64.c
+++ b/clang/test/CodeGen/builtins-arm64.c
@@ -216,4 +216,17 @@ void trap() {
__builtin_arm_trap(42);
}
+void atomic_store_with_hint(int64_t *a, int64_t b) {
+ __builtin_arm_atomic_store_with_hint(a, b, __ATOMIC_RELAXED, 0); // HINT_STSHH_KEEP
+ // CHECK: store atomic i64 {{.*}}, ptr {{.*}} monotonic, align 8, !aarch64.atomic.hint ![[M1:[0-9]]]
+
+ __builtin_arm_atomic_store_with_hint(a, b, __ATOMIC_SEQ_CST, 0);
+ // CHECK: store atomic i64 {{.*}}, ptr {{.*}} seq_cst, align 8, !aarch64.atomic.hint ![[M1]]
+
+ __builtin_arm_atomic_store_with_hint(a, b, __ATOMIC_RELEASE, 1); // HINT_STSHH_STRM
+ // CHECK: store atomic i64 {{.*}}, ptr {{.*}} release, align 8, !aarch64.atomic.hint ![[M2:[0-9]]]
+}
+
// CHECK: ![[M0]] = !{!"1:2:3:4:5"}
+// CHECK: ![[M1]] = !{i32 0}
+// CHECK: ![[M2]] = !{i32 1}
diff --git a/clang/test/Sema/builtins-arm64.c b/clang/test/Sema/builtins-arm64.c
index 41cffd7ebb1a0..fb4718a1bd1f4 100644
--- a/clang/test/Sema/builtins-arm64.c
+++ b/clang/test/Sema/builtins-arm64.c
@@ -51,3 +51,20 @@ void test_trap(short s, unsigned short us) {
__builtin_arm_trap(s); // expected-error {{argument to '__builtin_arm_trap' must be a constant integer}}
__builtin_arm_trap(us); // expected-error {{argument to '__builtin_arm_trap' must be a constant integer}}
}
+
+void test_atomic_store_hint(char *c_ptr, __int128 *inv_ptr, float *f_ptr,
+ char c_data, __int128 inv_data, float f_data,
+ int inv_int) {
+ __builtin_arm_atomic_store_with_hint(c_ptr, c_data, 0); // expected-error {{too few arguments to function call, expected 4, have 3}}
+ __builtin_arm_atomic_store_with_hint(c_ptr, c_data, 0, 0, 0); // expected-error {{too many arguments to function call, expected 4, have 5}}
+
+ __builtin_arm_atomic_store_with_hint(0, c_data, 0, 0); // expected-error {{address argument to atomic builtin must be a pointer ('int' invalid)}}
+ __builtin_arm_atomic_store_with_hint(c_ptr, f_data, 0, 0); // expected-error {{arguments are of different types ('char' vs 'float')}}
+ __builtin_arm_atomic_store_with_hint(inv_ptr, inv_data, 0, 0); // expected-error {{address argument to atomic store with hint must be of size 8, 16, 32 or 64 bits}}
+
+ __builtin_arm_atomic_store_with_hint(c_ptr, c_data, inv_int, 0); // expected-error {{invalid memory order argument to atomic hint operation ('int' invalid)}}
+ __builtin_arm_atomic_store_with_hint(c_ptr, c_data, 2, 0); // expected-error {{invalid memory order argument to atomic hint operation (2 invalid)}}
+
+ __builtin_arm_atomic_store_with_hint(c_ptr, c_data, 0, inv_int); // expected-error {{invalid hint type argument to atomic hint operation ('int' invalid)}}
+ __builtin_arm_atomic_store_with_hint(c_ptr, c_data, 0, 3); // expected-error {{invalid hint type argument to atomic hint operation (3 invalid)}}
+}
diff --git a/llvm/include/llvm/Support/AArch64AtomicHints.h b/llvm/include/llvm/Support/AArch64AtomicHints.h
new file mode 100644
index 0000000000000..8118f3e2df3ad
--- /dev/null
+++ b/llvm/include/llvm/Support/AArch64AtomicHints.h
@@ -0,0 +1,36 @@
+//===-- AArch64AtomicHints.h - AArch64 Atomic Hint Attributes ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_AARCH64ATOMICHINTS_H
+#define LLVM_SUPPORT_AARCH64ATOMICHINTS_H
+
+namespace llvm {
+enum class AArch64AtomicStoreHint {
+ HINT_NONE = 0,
+ HINT_STSHH_KEEP = 1,
+ HINT_STSHH_STRM = 2,
+};
+
+template <typename Int> inline bool isValidAArch64AtomicHintValue(Int I) {
+ return (Int)AArch64AtomicStoreHint::HINT_STSHH_KEEP <= I &&
+ I <= (Int)AArch64AtomicStoreHint::HINT_STSHH_STRM;
+}
+
+template <typename Int>
+inline AArch64AtomicStoreHint getAtomicStoreHintFromMD(Int I) {
+ switch (I) {
+ case 0:
+ return AArch64AtomicStoreHint::HINT_STSHH_KEEP;
+ case 1:
+ return AArch64AtomicStoreHint::HINT_STSHH_STRM;
+ default:
+ return AArch64AtomicStoreHint::HINT_NONE;
+ }
+}
+} // namespace llvm
+#endif // LLVM_SUPPORT_AARCH64ATOMICHINTS_H
diff --git a/llvm/lib/CodeGen/AtomicExpandPass.cpp b/llvm/lib/CodeGen/AtomicExpandPass.cpp
index 7327290f62970..3c12a29eb9556 100644
--- a/llvm/lib/CodeGen/AtomicExpandPass.cpp
+++ b/llvm/lib/CodeGen/AtomicExpandPass.cpp
@@ -254,6 +254,8 @@ static void copyMetadataForAtomic(Instruction &Dest,
Dest.setMetadata(ID, N);
else if (ID == Ctx.getMDKindID("amdgpu.no.fine.grained.memory"))
Dest.setMetadata(ID, N);
+ else if (ID == Ctx.getMDKindID("aarch64.atomic.hint"))
+ Dest.setMetadata(ID, N);
// Losing amdgpu.ignore.denormal.mode, but it doesn't matter for current
// uses.
@@ -707,6 +709,7 @@ StoreInst *AtomicExpandImpl::convertAtomicStoreToIntegerType(StoreInst *SI) {
NewSI->setAlignment(SI->getAlign());
NewSI->setVolatile(SI->isVolatile());
NewSI->setAtomic(SI->getOrdering(), SI->getSyncScopeID());
+ copyMetadataForAtomic(*NewSI, *SI);
LLVM_DEBUG(dbgs() << "Replaced " << *SI << " with " << *NewSI << "\n");
SI->eraseFromParent();
return NewSI;
diff --git a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
index 5fa93da1544fc..81fb5619f57b0 100644
--- a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/AArch64/...
[truncated]
|
efriedma-quic
left a comment
There was a problem hiding this comment.
Is this switching to metadata from the intrinsic just because you couldn't get the aliasing to work right? Or are there other reasons to prefer the metadata?
| return true; | ||
|
|
||
| // Arg 0 should be the pointer type. The pointee type must be a | ||
| // scalar integral or floating-point type of 8, 16, 32 or 64 bits. |
There was a problem hiding this comment.
Do we want to allow vectors?
There was a problem hiding this comment.
Hi @efriedma-quic,
The ACLE specification, that proposed these builtins does not mention vector types:
This intrinsic is
type generic and supports scalar integral and floating-point types of 8, 16, 32, and 64 bits.
| #if defined(__ARM_64BIT_STATE) && __ARM_64BIT_STATE | ||
| #define __arm_atomic_store_with_hint(ptr, data, memory_order, hint) \ | ||
| __builtin_arm_atomic_store_with_hint(ptr, data, memory_order, hint) | ||
| #endif |
There was a problem hiding this comment.
Is there any plan to allow specifying a hint for atomicrmw instructions, in addition to plain stores? Or is it not useful for that?
There was a problem hiding this comment.
Hi @efriedma-quic ,
This PR only covers the ACLE atomic-store-with-hint intrinsic.
There is separate ACLE PR#430 for FEAT_CMH that proposes atomic fetch operations with hints, such as fetch_add/fetch_sub/fetch_and/fetch_xor/fetch_or
I think those could map to atomicrmw, but this will be in a follow-up work, not as part of this store-only patch.
| ; CHECK-NEXT: // kill: def $h0 killed $h0 def $s0 | ||
| ; CHECK-NEXT: fmov w8, s0 | ||
| ; CHECK-NEXT: stshh keep | ||
| ; CHECK-NEXT: stlrh w8, [x0] |
There was a problem hiding this comment.
Do we want to generate a different sequence with rcpc3?
There was a problem hiding this comment.
I think we can use the atomic stores available with rcprc3 here, yes. I will investigate this in a follow-up PR.
|
Thank you for reviewing this, @efriedma-quic
Since the intrinsic was not recognised as an atomic instruction, passes (such as MergedLoadStoreMotion) were able to reorder instructions in such a way that memory ordering was not preserved correctly. The reason for choosing atomic stores with metadata was to prevent this without needing to update each of the affected passes. |
|
@nikic is looking into those alias analysis issues (#196965 etc.). I think we could get it working if we think there are other reasons to prefer the intrinsic. I'm a little concerned that we could end up dropping the metadata in some cases; that wouldn't be a big deal in most cases, probably, but it would be confusing for users. |
Adds the following ACLE intrinsic as described in [1]:
An atomic store instruction is emitted in Clang for this builtin with additional metadata
(
!aarch64.atomic.hint), which ensures the instruction is recognised as atomic by laterpasses in LLVM.
When an atomic store is found with this metadata, it is lowered to the ATOMIC_STORE_HINT
pseudo which is later expanded by AArch64ExpandPseudoInsts into an STSHH instruction
plus an atomic store.
The hint value is represented using MOTargetFlag3 & MOTargetFlag4 flags,
which will need to be extended when new hints are added in future.
[1] ARM-software/acle#432