Commit 02faae5
[ET-VK] Address coopmat dispatch review feedback
Three correctness fixes flagged on PR #19009.
1. The linear_coopmat / matmul_coopmat dispatch gate previously only checked `M >= 64`. We now tighten the gates in `Linear.cpp` and `Matmul.cpp` to require `M % TILE_M == 0 && N % TILE_N == 0 && K % TILE_K == 0`; misaligned shapes correctly fall back to the tiled shader.
2. The bias path in `linear_coopmat.glsl` previously read the just-written output buffer back, added bias, and wrote it again. We now fold bias into the fp32 accumulator before `coopMatStore`. Because the shader no longer reads the output back, the output binding becomes `w` instead of `rw`.
3. We now use `packFloat2x16` directly to avoid an fp16 -> fp32 -> fp16 round trip.
1 parent b26728a commit 02faae5
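The alignment gate described in item 1 can be sketched roughly as follows. This is only an illustration: the tile sizes are placeholder values, the helper name `can_use_coopmat` is hypothetical, and the positivity check is an added assumption; the actual code in `Linear.cpp` and `Matmul.cpp` may differ, and the commit message does not say whether the earlier `M >= 64` threshold is kept alongside the new condition.

```cpp
#include <cstdint>

// Placeholder tile sizes (assumed for illustration; the commit message
// does not state the real values of TILE_M, TILE_N, TILE_K).
constexpr int64_t TILE_M = 64;
constexpr int64_t TILE_N = 64;
constexpr int64_t TILE_K = 32;

// Hypothetical helper: returns true only when every dimension is a whole
// number of tiles, so the cooperative-matrix shader never touches a
// partial tile. Callers would fall back to the tiled shader otherwise.
bool can_use_coopmat(int64_t M, int64_t N, int64_t K) {
  // Positivity check is an added assumption, not taken from the commit.
  if (M <= 0 || N <= 0 || K <= 0) {
    return false;
  }
  return M % TILE_M == 0 && N % TILE_N == 0 && K % TILE_K == 0;
}
```

With these placeholder tiles, a 128 x 64 x 32 problem would take the coopmat path, while 100 x 64 x 32 would fall back to the tiled shader because `M` is not a multiple of `TILE_M`.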
6 files changed
Lines changed: 281 additions & 140 deletions
File tree
- backends/vulkan
  - runtime/graph/ops
    - glsl
    - impl
  - test/custom_ops
Lines changed: 41 additions & 31 deletions
Lines changed: 4 additions & 4 deletions