Commit f81160c

aarch64: Fix splat(ireduce(iconst(...))) (#12902)
This commit fixes a lowering rule in the aarch64 Cranelift backend. Specifically, a `splat(ireduce(_))` combination would pass an immediate to the `splat_const` helper with its higher bits still set, since the `ireduce` wasn't const-propagated. The fix applied here is to delete the `ireduce`-related rule and rely on mid-end optimizations to fold the `ireduce(iconst(...))` appropriately. This ensures that the `u64` value passed into the `splat_const` helper is indeed the exact value that's being splatted.
1 parent 2f7dbd6 commit f81160c

3 files changed: +14 -7 lines changed

cranelift/codegen/src/isa/aarch64/lower.isle

Lines changed: 0 additions & 3 deletions
@@ -2313,9 +2313,6 @@
 (rule (lower (has_type ty (splat _ (iconst _ (u64_from_imm64 n)))))
       (splat_const n (vector_size ty)))
 
-(rule (lower (has_type ty (splat _ (ireduce _ (iconst _ (u64_from_imm64 n))))))
-      (splat_const n (vector_size ty)))
-
 (rule (lower (has_type ty (splat _ x @ (load _ flags _ _))))
       (if-let mem_op (is_sinkable_inst x))
       (let ((addr Reg (sink_load_into_addr (lane_type ty) mem_op)))
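
To illustrate why the removed rule was a problem, here is a minimal Rust sketch. It is not Cranelift code: the `Expr` type and the functions `removed_rule_imm` and `fold_ireduce` are hypothetical stand-ins. The first mirrors the shape of the deleted rule, which looked through `ireduce` and handed the inner constant on unchanged. The second stands in for a mid-end fold that masks the constant to the narrower width first.

```rust
// Hypothetical, simplified expression tree; not Cranelift's real IR types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Iconst(u64),
    // `to_bits` is the bit width of the narrower result type.
    Ireduce { to_bits: u32, arg: Box<Expr> },
}

/// Mirrors the shape of the removed rule: look through `ireduce` and
/// return the inner constant unchanged, high bits included.
fn removed_rule_imm(e: &Expr) -> Option<u64> {
    match e {
        Expr::Ireduce { arg, .. } => match &**arg {
            Expr::Iconst(n) => Some(*n),
            _ => None,
        },
        _ => None,
    }
}

/// Stand-in for the mid-end fold: `ireduce(iconst n)` becomes an `iconst`
/// whose value is masked to the narrower width.
fn fold_ireduce(e: &Expr) -> Expr {
    match e {
        Expr::Ireduce { to_bits, arg } => match &**arg {
            Expr::Iconst(n) => {
                let mask = if *to_bits >= 64 {
                    u64::MAX
                } else {
                    (1u64 << *to_bits) - 1
                };
                Expr::Iconst(*n & mask)
            }
            _ => e.clone(),
        },
        _ => e.clone(),
    }
}

fn main() {
    // Same constant as the new runtest in this commit.
    let lane = Expr::Ireduce {
        to_bits: 32,
        arg: Box::new(Expr::Iconst(0x0000_00FF_00FF_FF00)),
    };
    // The removed rule's shape yields the full 64-bit immediate.
    println!("{:x?}", removed_rule_imm(&lane));
    // Folding first yields only the low 32 bits.
    println!("{:x?}", fold_ireduce(&lane));
}
```

With the rule gone, only the plain `splat(iconst(...))` rule remains for constants, so under this model the helper sees a constant only after it has been folded to the lane width.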

cranelift/filetests/filetests/isa/aarch64/simd.clif

Lines changed: 4 additions & 4 deletions
@@ -33,14 +33,14 @@ block0:
 
 ; VCode:
 ; block0:
-; movz x0, #42679
-; dup v0.8h, w0
+; movz w1, #42679
+; dup v0.8h, w1
 ; ret
 ;
 ; Disassembled:
 ; block0: ; offset 0x0
-; mov x0, #0xa6b7
-; dup v0.8h, w0
+; mov w1, #0xa6b7
+; dup v0.8h, w1
 ; ret
 
 function %f4(i32, i8x16, i8x16) -> i8x16 {
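
As a reading aid for the expected output above, the following Rust sketch checks that the VCode immediate `#42679` and the disassembled `#0xa6b7` are the same value, and models what `dup v0.8h, w1` does with it. The function `dup_8h` is a hypothetical model written for this note, not part of Cranelift or any Arm library.

```rust
/// Models `dup vD.8h, wN`: copy the low 16 bits of a 32-bit general
/// register into all eight halfword lanes of the vector register.
fn dup_8h(w: u32) -> [u16; 8] {
    [w as u16; 8]
}

fn main() {
    // Decimal form from the VCode listing, hex form from the disassembly.
    let imm: u32 = 42679;
    assert_eq!(imm, 0xa6b7);

    let lanes = dup_8h(imm);
    println!("{:x?}", lanes);
}
```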

cranelift/filetests/filetests/runtests/simd-splat.clif

Lines changed: 10 additions & 0 deletions
@@ -212,3 +212,13 @@ block0(v0: f64):
 ; run: %load_splat_f64x2(0x0.0) == [0x0.0 0x0.0]
 ; run: %load_splat_f64x2(0x2.0) == [0x2.0 0x2.0]
 ; run: %load_splat_f64x2(NaN) == [NaN NaN]
+
+function %splat_ireduce() -> i32 {
+block0:
+    v0 = iconst.i64 0x000000FF_00FFFF00
+    v1 = ireduce.i32 v0
+    v2 = splat.i32x4 v1
+    v3 = extractlane v2, 1
+    return v3
+}
+; run: %splat_ireduce() == 0x00ffff00
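
The new runtest can be followed step by step with a small Rust model. This is only a sketch of the CLIF semantics. The helper names `ireduce_i32`, `splat_i32x4`, and `extractlane` are hypothetical Rust functions named after the instructions they imitate. Note that the constant has non-zero bits above bit 31 (`0xFF`), which `ireduce.i32` is expected to drop.

```rust
/// `ireduce.i32`: keep only the low 32 bits of a 64-bit value.
fn ireduce_i32(v: u64) -> u32 {
    v as u32
}

/// `splat.i32x4`: replicate one 32-bit value into all four lanes.
fn splat_i32x4(lane: u32) -> [u32; 4] {
    [lane; 4]
}

/// `extractlane`: read a single lane back out of the vector.
fn extractlane(v: [u32; 4], idx: usize) -> u32 {
    v[idx]
}

/// Rust model of the `%splat_ireduce` test function.
fn splat_ireduce() -> u32 {
    let v0: u64 = 0x0000_00FF_00FF_FF00;
    let v1 = ireduce_i32(v0);
    let v2 = splat_i32x4(v1);
    extractlane(v2, 1)
}

fn main() {
    println!("{:#010x}", splat_ireduce());
}
```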
