Commit cbfb3e2
[rocm-libraries] ROCm/rocm-libraries#6611 (commit 5375c0f)
[CK_TILE] Preserve input strides in EightWaves async-load
descriptor (#6611)
`MakeAsyncLoadADramWindow` in
`GemmPipelineAgBgCrCompAsyncEightWavesPolicy` was rebuilding the 6D view
descriptor with `make_naive_tensor_descriptor_packed`, which synthesizes
strides from lengths and assumes a dense layout. When the input view's
leading-dim stride is larger than its inner length (non-packed memory
layout), the resulting tile window stepped through memory at the wrong
stride.
Compose the unmerge transforms on top of the input view's existing
descriptor instead, so the actual runtime strides are preserved and the
correct `element_space_size` is inherited for bounds checking.
## Test Plan
Added an unit test showing the problem.
## Test Result
The new test fails before fixes and passes after.
## Submission Checklist
- [ ] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.1 parent 9d34174 commit cbfb3e2
4 files changed
Lines changed: 53 additions & 6 deletions
File tree
- include/ck_tile/ops/gemm/pipeline
- test/ck_tile/gemm_block_scale
Lines changed: 9 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
176 | 176 | | |
177 | 177 | | |
178 | 178 | | |
179 | | - | |
180 | | - | |
181 | | - | |
182 | | - | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
183 | 188 | | |
184 | 189 | | |
185 | 190 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
86 | 86 | | |
87 | 87 | | |
88 | 88 | | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
89 | 94 | | |
90 | 95 | | |
91 | 96 | | |
| |||
281 | 286 | | |
282 | 287 | | |
283 | 288 | | |
| 289 | + | |
284 | 290 | | |
285 | 291 | | |
286 | 292 | | |
| |||
Lines changed: 31 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
Lines changed: 7 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1038 | 1038 | | |
1039 | 1039 | | |
1040 | 1040 | | |
1041 | | - | |
| 1041 | + | |
| 1042 | + | |
1042 | 1043 | | |
1043 | 1044 | | |
1044 | 1045 | | |
| 1046 | + | |
| 1047 | + | |
| 1048 | + | |
| 1049 | + | |
1045 | 1050 | | |
1046 | | - | |
| 1051 | + | |
1047 | 1052 | | |
1048 | 1053 | | |
1049 | 1054 | | |
| |||
0 commit comments