Commit c7ef218

zhongjuzhe authored and Incarnation-p-lee committed
Middle-end: Do not model address cost for SELECT_VL style vectorization
Following Richard's suggestion, we should not model address cost in the loop vectorizer for SELECT_VL or decrement IV style, since the other vectorization styles don't do that. This makes the cost model comparison apples to apples.

This patch sets the cost from 2 to 1, which turns out to give better codegen in various cases for RVV.

Ok for trunk?

	PR target/111153

gcc/ChangeLog:

	* tree-vect-loop.cc (vect_estimate_min_profitable_iters): Remove address cost for select_vl/decrement IV.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/riscv/rvv/pr111153.c: Moved to...
	* gcc.dg/vect/costmodel/riscv/rvv/pr11153-2.c: ...here.
	* gcc.dg/vect/costmodel/riscv/rvv/pr111153-1.c: New test.
1 parent f998335 commit c7ef218

3 files changed

Lines changed: 24 additions & 8 deletions

gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr111153-1.c

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -mtune=generic-ooo -ffast-math" } */
+
+#define DEF_REDUC_PLUS(TYPE) \
+TYPE __attribute__ ((noinline, noclone)) \
+reduc_plus_##TYPE (TYPE *__restrict a, int n) \
+{ \
+  TYPE r = 0; \
+  for (int i = 0; i < n; ++i) \
+    r += a[i]; \
+  return r; \
+}
+
+#define TEST_PLUS(T) T (int) T (float)
+
+TEST_PLUS (DEF_REDUC_PLUS)
+
+/* { dg-final { scan-assembler-not {vsetivli\s+zero,\s*4} } } */

gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr111153.c renamed to gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr11153-2.c

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -mtune=generic-ooo" } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -ffast-math" } */
 
 #define DEF_REDUC_PLUS(TYPE) \
 TYPE __attribute__ ((noinline, noclone)) \
@@ -11,7 +11,7 @@
   return r; \
 }
 
-#define TEST_PLUS(T) T (int)
+#define TEST_PLUS(T) T (int) T (float)
 
 TEST_PLUS (DEF_REDUC_PLUS)
 

gcc/tree-vect-loop.cc

Lines changed: 4 additions & 6 deletions
@@ -4872,12 +4872,10 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
 
   unsigned int length_update_cost = 0;
   if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
-    /* For decrement IV style, we use a single SELECT_VL since
-       beginning to calculate the number of elements need to be
-       processed in current iteration, and a SHIFT operation to
-       compute the next memory address instead of adding vectorization
-       factor. */
-    length_update_cost = 2;
+    /* For decrement IV style, Each only need a single SELECT_VL
+       or MIN since beginning to calculate the number of elements
+       need to be processed in current iteration. */
+    length_update_cost = 1;
   else
     /* For increment IV stype, Each may need two MINs and one MINUS to
        update lengths in body for next iteration. */
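
For context, the loop shape that the new comment describes can be sketched in plain C. This is a scalar emulation and not what the vectorizer emits: the helper name `select_vl_model`, the function `reduc_plus_int_decr_iv`, and the explicit `vf` parameter are assumptions made for illustration, and SELECT_VL is modelled here as a plain MIN. The point is that each iteration needs exactly one length computation, which is what `length_update_cost = 1` counts. The pointer bump is address arithmetic, which this patch stops charging to the length update.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SELECT_VL, modelled as MIN (remaining, vf).
   A real target may be allowed to return a different legal length.  */
static size_t
select_vl_model (size_t remaining, size_t vf)
{
  assert (vf > 0); /* Guarantees the loop below makes progress.  */
  return remaining < vf ? remaining : vf;
}

/* Scalar emulation of a decrementing-IV reduction loop.  */
int
reduc_plus_int_decr_iv (const int *a, int n, size_t vf)
{
  int r = 0;
  size_t remaining = n > 0 ? (size_t) n : 0;

  while (remaining > 0)
    {
      /* The single length update per iteration.  */
      size_t vl = select_vl_model (remaining, vf);

      /* Stands in for one vector load plus reduction over VL elements.  */
      for (size_t j = 0; j < vl; ++j)
        r += a[j];

      a += vl;         /* Address update, not part of the length cost.  */
      remaining -= vl; /* The decrementing IV itself.  */
    }
  return r;
}
```

Calling `reduc_plus_int_decr_iv (a, n, 4)` gives the same sum as the scalar `reduc_plus_int` loop in the tests above for any `n`, including lengths that are not a multiple of 4.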