Commit 5de59c6

Author: Jeff Law
[RISC-V][PR rtl-optimization/56096] Improve equality comparisons of logical AND expressions
This BZ shows that we can improve certain comparisons for RISC-V. In particular, if we are testing the result of a logical AND for equality against zero and one operand of the AND is a constant that requires synthesis, we may be able to do better if we right shift away any trailing zeros from the constant and shift the other input as well. This wins when the shifted constant does not require synthesis.

That may in turn allow improvement of a select of 0 and 2^n based on the zero/nonzero status of a logical AND. Essentially we can rewrite the sequence to remove a data dependency.

Concretely:

    unsigned f1 (unsigned x, unsigned m)
    {
        x >>= ((m & 0x008080) ? 8 : 0);
        return x;
    }

Compiles into:

    li      a5,32768
    addi    a5,a5,128
    and     a1,a1,a5
    snez    a1,a1
    slliw   a1,a1,3
    srlw    a0,a0,a1
    ret

But after this patch we generate this instead:

    srai    a5,a1,7
    andi    a5,a5,257
    li      a4,8
    czero.eqz       a1,a4,a5
    srlw    a0,a0,a1
    ret

It's just one fewer instruction, but the li can issue whenever the uarch wants before the srlw as it has no incoming dependency. So the encoding is slightly denser and the code is slightly more efficient as well.

Much like 57650, I'm focused on the low level RISC-V codegen issues, not the broader issues that are raised in the PR.

This has been in my tree for a while, so it's been tested on riscv32-elf, riscv64-elf and bootstrapped on the BPI which has support for czero. Waiting on pre-commit CI before moving forward.

PR rtl-optimization/56096

gcc/
    * config/riscv/riscv.md: Add new patterns to optimize certain cases with a logical AND feeding an equality test against zero.

gcc/testsuite/
    * gcc.target/riscv/pr56096.c: New test.
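The core claim of the message, that shifting both AND operands right by the constant's trailing-zero count preserves the zero/nonzero result, can be checked with a minimal C sketch. The helper names here are hypothetical and not part of GCC. The sketch uses logical shifts on unsigned values, whereas the generated code uses srai; for the 0x8080 example both give the same answer because the shifted mask only covers low bits.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for a count-trailing-zeros helper; c must be nonzero. */
static unsigned trailing_zeros(uint32_t c)
{
    unsigned n = 0;
    while ((c & 1u) == 0) {
        c >>= 1;
        n++;
    }
    return n;
}

/* Original form: AND against the full constant, then test for nonzero. */
static int and_nonzero(uint32_t m, uint32_t c)
{
    return (m & c) != 0;
}

/* Shifted form: drop the constant's trailing zeros from both operands. */
static int shifted_and_nonzero(uint32_t m, uint32_t c)
{
    unsigned tz = trailing_zeros(c);
    return ((m >> tz) & (c >> tz)) != 0;
}

/* Compare both forms over a range of inputs; aborts on any mismatch. */
static void check_equivalence(uint32_t c)
{
    for (uint32_t m = 0; m < (1u << 17); m++)
        assert(and_nonzero(m, c) == shifted_and_nonzero(m, c));
}
```

For the constant 0x8080 the trailing-zero count is 7 and the shifted mask is 257, which matches the `srai a5,a1,7` and `andi a5,a5,257` pair in the new sequence.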
1 parent 7828030 commit 5de59c6

2 files changed: 75 additions & 0 deletions

gcc/config/riscv/riscv.md

Lines changed: 56 additions & 0 deletions
@@ -3201,6 +3201,62 @@
   [(set_attr "type" "shift")
    (set_attr "mode" "DI")])
 
+;; Handle logical AND feeding an equality test against zero where an operand
+;; to the AND is a constant requiring synthesis. Because we only care about
+;; zero/nonzero state after the AND, we may be able to shift both operands
+;; of the AND to the right and eliminate the need for constant synthesis.
+;;
+;; Once mvconst_internal goes away, this likely turns into a simple splitter.
+(define_insn_and_split ""
+  [(set (match_operand:X 0 "register_operand" "=r")
+        (any_eq:X (and:X (match_operand:X 1 "register_operand" "r")
+                         (match_operand 2 "shifted_const_arith_operand"))
+                  (const_int 0)))
+   (clobber (match_scratch:X 3 "=&r"))]
+  "!SMALL_OPERAND (INTVAL (operands[2]))"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 3) (ashiftrt:X (match_dup 1) (match_dup 4)))
+   (set (match_dup 3) (and:X (match_dup 3) (match_dup 2)))
+   (set (match_dup 0) (any_eq:X (match_dup 3) (const_int 0)))]
+{
+  HOST_WIDE_INT shift = ctz_hwi (INTVAL (operands[2]));
+  operands[4] = gen_int_mode (shift, QImode);
+  operands[2] = gen_int_mode (INTVAL (operands[2]) >> shift, word_mode);
+}
+  [(set_attr "type" "shift")])
+
+;; The pattern above is a bridge to this pattern. Essentially a select
+;; between 0 and 2^n based on the zero/nonzero status of the AND.
+;;
+;; It's no fewer instructions, but the resulting code has fewer data
+;; dependencies and may compress better depending on 2^n.
+(define_insn_and_split ""
+  [(set (match_operand:X 0 "register_operand" "=r")
+        (ashift:X (any_eq:X
+                    (and:X (match_operand:X 1 "register_operand" "r")
+                           (match_operand 2 "shifted_const_arith_operand"))
+                    (const_int 0))
+                  (match_operand 3 "const_int_operand")))
+   (clobber (match_scratch:X 4 "=&r"))
+   (clobber (match_scratch:X 5 "=&r"))]
+  "TARGET_ZICOND && TARGET_ZBS"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 4) (ashiftrt:X (match_dup 1) (match_dup 6)))
+   (set (match_dup 4) (and:X (match_dup 4) (match_dup 2)))
+   (set (match_dup 5) (match_dup 3))
+   (set (match_dup 0) (if_then_else:X (any_eq:X (match_dup 4) (const_int 0))
+                                      (match_dup 5)
+                                      (const_int 0)))]
+{
+  HOST_WIDE_INT shift = ctz_hwi (INTVAL (operands[2]));
+  operands[3] = gen_int_mode (HOST_WIDE_INT_1U << INTVAL (operands[3]), word_mode);
+  operands[6] = gen_int_mode (shift, QImode);
+  operands[2] = gen_int_mode (INTVAL (operands[2]) >> shift, word_mode);
+}
+  [(set_attr "type" "shift")])
+
+
 ;;
 ;; ....................
 ;;
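The preparation statements in the two splitters can be modelled in plain C. The following is a rough sketch, not GCC code: `fits_simm12`, `split_mask`, `czero_eqz` and the two select functions are hypothetical names, the constant is assumed to be positive and nonzero, and `fits_simm12` only approximates what `SMALL_OPERAND` checks (a value usable as a 12-bit signed immediate). The `czero_eqz` helper follows the Zicond definition: the result is zero when the condition register is zero, otherwise the source value.

```c
#include <assert.h>
#include <stdint.h>

/* Approximation of SMALL_OPERAND: value fits a 12-bit signed immediate. */
static int fits_simm12(int64_t v)
{
    return v >= -2048 && v <= 2047;
}

/* Shift count and shifted constant, as the splitters compute them. */
struct shifted_mask {
    unsigned shift;   /* trailing zeros of the original constant */
    uint64_t mask;    /* constant with those zeros shifted out   */
};

/* Assumes c is nonzero; mirrors ctz_hwi plus the right shift. */
static struct shifted_mask split_mask(uint64_t c)
{
    struct shifted_mask r = { 0, c };
    while ((r.mask & 1u) == 0) {
        r.mask >>= 1;
        r.shift++;
    }
    return r;
}

/* Model of czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1. */
static uint64_t czero_eqz(uint64_t rs1, uint64_t rs2)
{
    return rs2 == 0 ? 0 : rs1;
}

/* Dependent form: AND, snez, then shift the 0/1 result left by n. */
static uint64_t select_reference(uint64_t x, uint64_t c, unsigned n)
{
    return (uint64_t)((x & c) != 0) << n;
}

/* Split form: shift, AND with the small mask, load 2^n, conditional zero. */
static uint64_t select_split(uint64_t x, uint64_t c, unsigned n)
{
    struct shifted_mask sm = split_mask(c);
    uint64_t t = (x >> sm.shift) & sm.mask;   /* srai + andi          */
    uint64_t k = (uint64_t)1 << n;            /* li, no input dependency */
    return czero_eqz(k, t);
}
```

In this model the load of 2^n does not depend on `x`, which is the data dependency the second pattern's comment says it removes.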
gcc/testsuite/gcc.target/riscv/pr56096.c

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcb_zicond -mabi=lp64d" { target rv64 } } */
+/* { dg-additional-options "-march=rv32gcb_zicond -mabi=ilp32" { target rv32 } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+unsigned f1 (unsigned x, unsigned m)
+{
+  x >>= ((m & 0x008080) ? 8 : 0);
+  return x;
+}
+
+/* { dg-final { scan-assembler-not "addi\t" } } */
+/* { dg-final { scan-assembler-not "and\t" } } */
+/* { dg-final { scan-assembler-not "snez\t" } } */
+/* { dg-final { scan-assembler-not "slli\t" } } */
+/* { dg-final { scan-assembler-not "slliw\t" } } */
+/* { dg-final { scan-assembler-times "srai\t" 1 } } */
+/* { dg-final { scan-assembler-times "andi\t" 1 } } */
+/* { dg-final { scan-assembler-times "czero" 1 } } */
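The committed test is compile-only and checks the instruction mix, not the computed value. As a hedged illustration of the behaviour any generated sequence must preserve, a runtime check might look like the sketch below. `check_f1` is hypothetical and is not part of the commit.

```c
#include <assert.h>

/* Function under test, copied from pr56096.c. */
unsigned f1(unsigned x, unsigned m)
{
    x >>= ((m & 0x008080) ? 8 : 0);
    return x;
}

/* Hypothetical runtime check; not part of the committed test. */
static void check_f1(void)
{
    assert(f1(0x1234u, 0x0000u) == 0x1234u);  /* no mask bit set: unchanged */
    assert(f1(0x1234u, 0x0080u) == 0x12u);    /* bit 7 set: shift by 8      */
    assert(f1(0x1234u, 0x8000u) == 0x12u);    /* bit 15 set: shift by 8     */
    assert(f1(0x1234u, 0x7f7fu) == 0x1234u);  /* only unmasked bits set     */
}
```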
