Commit 5385c95

Author: Jeff Law (authored and committed)
[RISC-V][PR target/123904] Improve bit masking of shifted values
If we are masking off bits in the upper and lower parts of a register on
RISC-V, then depending on the precise mask it may be best implemented as a
shift triplet, i.e. shift left to clear the upper bits, shift right to clear
the lower bits, then shift left again to put the bits into their proper
position.

If the input value is already left shifted and the shift count corresponds to
the low mask bits, then we can get away with just two shifts.  We shift left
to clear the relevant high bits, then shift right to put the bits into their
proper position.

This likely came from SPEC or CoreMark given it was reported to me by the RAU
team a while back, but the testcase didn't include enough breadcrumbs to know
for sure.

This has been repeatedly bootstrapped and regression tested on the Pioneer
and BPI systems, as well as regularly regression tested on the riscv32-elf
and riscv64-elf embedded targets.  I'll wait for pre-commit CI to spin before
pushing to the trunk.

	PR target/123904

gcc/
	* config/riscv/riscv.md (masking shifted value): New splitter to
	optimize certain masking operations on shifted values.

gcc/testsuite/
	* gcc.target/riscv/pr123904.c: New test.
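
For illustration only (the mask and shift amounts below are invented for this
sketch, not taken from the PR): when a left-shifted value is ANDed with a
consecutive-bits mask whose trailing-zero count equals the shift amount, the
AND can be folded away, leaving just two shift instructions.

#include <assert.h>
#include <stdint.h>

/* Hypothetical rv32-flavoured example: mask 0x7ffffffc clears the top bit
   and the two low bits; the two low bits are already zero because the
   input has been shifted left by 2.  */
static uint32_t shift_and_mask (uint32_t x)
{
  return (x << 2) & 0x7ffffffcu;   /* left shift + consecutive-bits AND */
}

static uint32_t two_shifts (uint32_t x)
{
  /* Shift left by clz (mask) + 2 == 3 to clear the high bit, then shift
     right logically by 1 to drop the field back into place.  */
  return (x << 3) >> 1;
}

int main (void)
{
  uint32_t tests[] = { 0, 1, 0x12345678u, 0x87654321u, 0xffffffffu };
  for (unsigned i = 0; i < sizeof tests / sizeof tests[0]; i++)
    assert (shift_and_mask (tests[i]) == two_shifts (tests[i]));
  return 0;
}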
1 parent 36f0b74 commit 5385c95

2 files changed

Lines changed: 38 additions & 0 deletions

File tree

gcc/config/riscv/riscv.md
gcc/testsuite/gcc.target/riscv/pr123904.c


gcc/config/riscv/riscv.md

Lines changed: 30 additions & 0 deletions
@@ -5037,6 +5037,36 @@
   operands[4] = gen_lowpart (QImode, operands[3]);
 })
 
+;; This is similar to using a shift triplet to implement a logical AND when
+;; the mask is a consecutive_bits_operand.
+;;
+;; The difference is we have a left shift in the input RTL and we verify
+;; that it clears the appropriate low bits, so we can get away with just
+;; two shifts.
+(define_split
+  [(set (match_operand:X 0 "register_operand")
+	(and:X (ashift:X (match_operand:X 1 "register_operand")
+			 (match_operand 2 "const_int_operand"))
+	       (match_operand 3 "consecutive_bits_operand")))
+   (clobber (match_operand:X 4 "register_operand"))]
+  "(ctz_hwi (INTVAL (operands[3]) & GET_MODE_MASK (word_mode))
+    == INTVAL (operands[2]))"
+  [(set (match_dup 4) (ashift:X (match_dup 1) (match_dup 5)))
+   (set (match_dup 0) (lshiftrt:X (match_dup 4) (match_dup 6)))]
+  "{
+    /* We want to left shift by the number of leading zeros in the mask,
+       plus the number of bits shifted left by the pattern.  Remember that
+       a HOST_WIDE_INT may be 64 bits, so clz on that value can count bits
+       we don't care about for rv32.  */
+    HOST_WIDE_INT lshift
+      = clz_hwi (UINTVAL (operands[3])) % BITS_PER_WORD + INTVAL (operands[2]);
+    operands[5] = gen_int_mode (lshift, QImode);
+
+    /* And then we right shift things back into position.  */
+    HOST_WIDE_INT rshift = lshift - INTVAL (operands[2]);
+    operands[6] = gen_int_mode (rshift, QImode);
+  }")
+
 ;; Standard extensions and pattern for optimization
 (include "bitmanip.md")
 (include "crypto.md")
gcc/testsuite/gcc.target/riscv/pr123904.c

Lines changed: 8 additions & 0 deletions

@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+unsigned foo19(unsigned a, unsigned b) { b = (b << 2) >> 2; return a + (b << 1); }
+
+/* { dg-final { scan-assembler-times "slli" 1 } } */
+/* { dg-final { scan-assembler-times "srli" 1 } } */
+/* { dg-final { scan-assembler-times "add" 1 } } */
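
The test expects the two statements in foo19 to be combined into a single
masked shift and then split into exactly one slli, one srli, and one add.  A
sketch of the value equivalences it relies on (my reading of the
transformation, not something stated in the commit):

#include <assert.h>
#include <stdint.h>

/* foo19's computation of the b term, written out...  */
static uint32_t original (uint32_t b) { return ((b << 2) >> 2) << 1; }

/* ...is the same as a left shift masked by a consecutive-bits constant,
   which is the shape the new splitter matches (ctz (0x7ffffffe) == 1)...  */
static uint32_t masked_shift (uint32_t b) { return (b << 1) & 0x7ffffffeu; }

/* ...and that is just two shifts, hence the single slli/srli the test
   expects (plus one add for 'a + ...').  */
static uint32_t two_shifts (uint32_t b) { return (b << 2) >> 1; }

int main (void)
{
  uint32_t tests[] = { 0, 1, 0x12345678u, 0xdeadbeefu, 0xffffffffu };
  for (unsigned i = 0; i < sizeof tests / sizeof tests[0]; i++)
    {
      assert (original (tests[i]) == masked_shift (tests[i]));
      assert (masked_shift (tests[i]) == two_shifts (tests[i]));
    }
  return 0;
}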
