Commit b782e12
Consolidate asymmetric nounroll schedule into parameterized asymmetric schedule
The no-unroll path needs a different kernel interleaving strategy than
the unrolled path: 2-group interleaving (shared A loads interleaved
with MMA) with B loads and G2S prefetches in a separate third cluster,
rather than 4-group interleaving that folds B loads and G2S directly
into the two MMA clusters. The 4-group pattern was designed for the
unrolled kernel where the larger loop body can absorb the extra live
values; with unroll_factor=1 the tighter loop needs the third cluster
to keep VGPR pressure in check.

1 parent 809d2cb commit b782e12
4 files changed
Lines changed: 124 additions & 434 deletions
File tree
- examples/python
- tests/kernel/wave/asm
- wave_lang/kernel/wave/schedules
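
The two interleaving strategies described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in wave_lang: `build_clusters`, `interleave`, `split_half`, and the op names are hypothetical, and the "4 groups" are assumed here to mean A loads, B loads, G2S prefetches, and MMAs.

```python
from itertools import zip_longest


def interleave(*groups):
    """Round-robin merge of several op lists, keeping order within each list."""
    merged = []
    for step in zip_longest(*groups):
        merged.extend(op for op in step if op is not None)
    return merged


def split_half(ops):
    """Split an op list into a first and second half (first half gets the extra op)."""
    mid = (len(ops) + 1) // 2
    return ops[:mid], ops[mid:]


def build_clusters(unroll_factor, a_loads, b_loads, g2s, mmas):
    """Hypothetical parameterized schedule: one entry point, two cluster layouts.

    Returns a list of clusters, each cluster being an ordered list of op names.
    """
    mma_lo, mma_hi = split_half(mmas)
    a_lo, a_hi = split_half(a_loads)
    if unroll_factor == 1:
        # No-unroll path: 2-group interleaving (A loads with MMA), while
        # B loads and G2S prefetches go into a separate third cluster.
        return [
            interleave(a_lo, mma_lo),
            interleave(a_hi, mma_hi),
            b_loads + g2s,
        ]
    # Unrolled path: 4-group interleaving that folds B loads and G2S
    # prefetches directly into the two MMA clusters.
    b_lo, b_hi = split_half(b_loads)
    g_lo, g_hi = split_half(g2s)
    return [
        interleave(a_lo, b_lo, g_lo, mma_lo),
        interleave(a_hi, b_hi, g_hi, mma_hi),
    ]


ops = dict(
    a_loads=["load_a0", "load_a1"],
    b_loads=["load_b0", "load_b1"],
    g2s=["g2s0", "g2s1"],
    mmas=["mma0", "mma1", "mma2", "mma3"],
)
for cluster in build_clusters(1, **ops):
    print(cluster)
```

In this toy model the no-unroll layout produces three clusters and the unrolled layout produces two, with the same set of ops in both. The actual VGPR effect depends on the real kernel's live ranges, which this sketch does not model.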