Commit 547cf4c
[NPU A3] Fix benchmark issues for fused_linear_jsd and dyt. (#1231)
## Summary
Fix benchmark issues for fused_linear_jsd and dyt.
1. `dyt` throws errors under `torch.compile` on NPU. The benchmark now skips the
`torch.compile` baseline on NPU devices.
2. `fused_linear_jsd` hits an out-of-limit grid error on NPU (grid size exceeding
65536). The cause is using the number of rows as the grid size. The fix replaces
it with `min(num_cores, n_rows)`.
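
The two fixes above can be sketched in plain Python. This is a minimal illustration and not the code from this commit: the function names, the provider labels, and the row-striding scheme are assumptions. The only part taken from the summary is capping the grid at `min(num_cores, n_rows)` and skipping the `torch.compile` baseline on NPU. One common way to keep every row covered after capping the grid is to let each program stride over the rows.

```python
# Illustrative sketch only. All names here are hypothetical, not from the commit.


def launch_grid(n_rows: int, num_cores: int) -> int:
    """Number of programs to launch: capped by both core count and row count."""
    return min(num_cores, n_rows)


def rows_for_program(pid: int, grid: int, n_rows: int) -> range:
    """Rows handled by program `pid` when programs step through rows by `grid`."""
    return range(pid, n_rows, grid)


def benchmark_providers(device: str) -> list[str]:
    """Baselines to benchmark; the torch.compile baseline is skipped on NPU."""
    providers = ["liger", "torch"]  # assumed provider labels
    if device != "npu":
        providers.append("torch_compile")
    return providers


if __name__ == "__main__":
    n_rows, num_cores = 100_000, 48  # hypothetical sizes
    grid = launch_grid(n_rows, num_cores)
    # Each program sees roughly n_rows / grid rows instead of exactly one.
    print(grid, len(rows_for_program(0, grid, n_rows)))
    print(benchmark_providers("npu"))
```

With a capped grid the launch size no longer grows with the input, so large row counts stay within the device limit at the cost of a loop inside each program.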
## Testing Done
dyt:
<img width="1699" height="480" alt="image"
src="https://github.com/user-attachments/assets/a0a44250-fc8d-45d5-9b5a-1c4529a1db2b"
/>
fused_linear_jsd:
<img width="1676" height="499" alt="image"
src="https://github.com/user-attachments/assets/c5a91b9f-5b74-4065-a6b7-74118820b43f"
/>
- Hardware Type: Atlas 800T-A3 x86
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence

1 parent decb1b7 commit 547cf4c
2 files changed
Lines changed: 36 additions & 4 deletions
File tree
- benchmark/scripts
- src/liger_kernel/ops/backends/_ascend/ops
Lines changed: 8 additions & 1 deletion

| Original file lines | Diff lines | Change |
|---|---|---|
| 20-22 | 20-22 | context |
| | 23-29 | 7 lines added |
| 23-25 | 30-32 | context |
| 85-87 | 92-94 | context |
| 88 | 95 | 1 line replaced |
| 89-91 | 96-98 | context |

Lines changed: 28 additions & 3 deletions
| Original file lines | Diff lines | Change |
|---|---|---|
| 2-4 | 2-4 | context |
| | 5 | 1 line added |
| 5-8 | 6-9 | context |
| 9 | | 1 line removed |
| 10-14 | 10-14 | context |
| | 15-36 | 22 lines added |
| 15-17 | 37-39 | context |
| 131-133 | 153-155 | context |
| | 156 | 1 line added |
| 134 | 157 | context |
| 135 | 158 | 1 line replaced |
| 136-138 | 159-161 | context |
| | 162 | 1 line added |
| 139-141 | 163-165 | context |
| 145-147 | 169-171 | context |
| 148 | 172 | 1 line replaced |
| 149-151 | 173-175 | context |
| | 176 | 1 line added |
| 152-154 | 177-179 | context |