Commit 3c3ad79
ssjia
[ETVK][experimental] Route general conv2d through im2col + GEMM
Routes the `SlidingWindow` branch of `add_conv2d_node` (in `Convolution.cpp`) through the new `conv2d_gemm_impl` orchestrator from the previous diff, replacing the legacy direct `conv2d` shader for the general non-pointwise / non-depthwise / non-transposed case. Pointwise still uses `conv2d_pw_impl`, depthwise still uses `conv2d_dw_impl`, transposed still uses the legacy path (im2col doesn't support transposed yet).
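The routing described above can be sketched as a small dispatch function. This is an illustration only, not the code in `Convolution.cpp`: the `Conv2dMethod` enum and the `route_conv2d` helper are hypothetical names, and the function returns the name of the implementation as a string instead of building a graph node.

```cpp
#include <string>

// Hypothetical stand-in for the method tag that add_conv2d_node branches on.
enum class Conv2dMethod { Pointwise, Depthwise, SlidingWindow, Transposed };

// Hypothetical helper: names the implementation each branch is routed to
// after this commit, as described in the commit message.
std::string route_conv2d(Conv2dMethod method) {
  switch (method) {
    case Conv2dMethod::Pointwise:
      return "conv2d_pw_impl";    // unchanged
    case Conv2dMethod::Depthwise:
      return "conv2d_dw_impl";    // unchanged
    case Conv2dMethod::SlidingWindow:
      return "conv2d_gemm_impl";  // new: im2col + GEMM
    case Conv2dMethod::Transposed:
      return "legacy conv2d";     // im2col does not support transposed yet
  }
  return "unknown";
}
```

Under these assumptions, `route_conv2d(Conv2dMethod::SlidingWindow)` yields `"conv2d_gemm_impl"`, and the other three branches behave as they did before the commit.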
On `UNTRAINED_TinyCNNDepthEstimatorRealTime_Vulkan.pte` on a Pixel 9 Pro XL (Mali GPU, which takes the buffer im2col path), total convolution time drops from 84.3 ms to 59.8 ms, a **29% reduction**. The previously dominant `conv2d_float` (78.5 ms, ~93% of conv time) is replaced by `conv2d_im2col_buffer_float` (10.8 ms) + `conv2d_gemm_buffer_float` (42.5 ms), 53.3 ms combined. The pointwise and depthwise dispatches are not touched by this diff.
```
kernel                                         before (us)    after (us)
------------------------------------------------------------------------
conv2d_float                                       78518.3           0.0
conv2d_gemm_buffer_float                               0.0       42501.3
conv2d_im2col_buffer_float                             0.0       10806.6
conv2d_pw_tiled_float                               5525.0        6163.5
conv2d_dw_output_tile_3x3_b1x1_float                 101.0         124.4
conv2d_dw_sned_output_tile_5x5_float                 171.5         206.1
------------------------------------------------------------------------
TOTAL conv time                                    84315.8       59801.9
```
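For readers unfamiliar with the two-step scheme behind `conv2d_im2col_buffer_float` and `conv2d_gemm_buffer_float`, here is a minimal CPU reference sketch of im2col followed by a matrix multiply. It is not the shader code from this diff. It assumes a single image in CHW layout, weights flattened as K x (C*R*S), equal stride and zero padding in both dimensions, dilation 1, and no bias. The `Conv2dParams` struct and both function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical parameter bundle; the names are illustrative, not from the diff.
struct Conv2dParams {
  int C, H, W;      // input channels, height, width
  int K;            // output channels
  int R, S;         // kernel height, width
  int stride, pad;  // applied to both spatial dimensions; dilation is 1
  int out_h() const { return (H + 2 * pad - R) / stride + 1; }
  int out_w() const { return (W + 2 * pad - S) / stride + 1; }
};

// Step 1 (im2col): unfold every receptive field into one column of a
// (C*R*S) x (OH*OW) matrix. Taps that fall in the padding stay zero.
std::vector<float> im2col(const std::vector<float>& in, const Conv2dParams& p) {
  const int OH = p.out_h(), OW = p.out_w();
  std::vector<float> cols(static_cast<std::size_t>(p.C) * p.R * p.S * OH * OW, 0.0f);
  for (int c = 0; c < p.C; ++c)
    for (int r = 0; r < p.R; ++r)
      for (int s = 0; s < p.S; ++s) {
        const int row = (c * p.R + r) * p.S + s;
        for (int oh = 0; oh < OH; ++oh)
          for (int ow = 0; ow < OW; ++ow) {
            const int ih = oh * p.stride - p.pad + r;
            const int iw = ow * p.stride - p.pad + s;
            if (ih < 0 || ih >= p.H || iw < 0 || iw >= p.W) continue;
            cols[(static_cast<std::size_t>(row) * OH + oh) * OW + ow] =
                in[(static_cast<std::size_t>(c) * p.H + ih) * p.W + iw];
          }
      }
  return cols;
}

// Step 2 (GEMM): out[K x N] = weight[K x CRS] * cols[CRS x N], with N = OH*OW.
std::vector<float> conv2d_im2col_gemm(const std::vector<float>& in,
                                      const std::vector<float>& weight,
                                      const Conv2dParams& p) {
  const int CRS = p.C * p.R * p.S;
  const int N = p.out_h() * p.out_w();
  const std::vector<float> cols = im2col(in, p);
  std::vector<float> out(static_cast<std::size_t>(p.K) * N, 0.0f);
  for (int k = 0; k < p.K; ++k)
    for (int i = 0; i < CRS; ++i) {
      const float w = weight[static_cast<std::size_t>(k) * CRS + i];
      for (int n = 0; n < N; ++n)
        out[static_cast<std::size_t>(k) * N + n] +=
            w * cols[static_cast<std::size_t>(i) * N + n];
    }
  return out;
}
```

The sketch trades extra memory traffic (the unfolded matrix) for a regular matrix multiply. The same split shows up in the table, where the im2col kernel accounts for roughly a fifth of the combined im2col + GEMM time.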
This is intentionally a thin, easily-revertible diff sitting on top of the im2col + GEMM prototype, marked experimental so we can land it as a kill-switch-able change while we validate other models.
Differential Revision: [D105120965](https://our.internmc.facebook.com/intern/diff/D105120965/)
[ghstack-poisoned]
1 parent 0186ccb commit 3c3ad79
1 file changed
Lines changed: 22 additions & 0 deletions
Diff hunks (line contents not captured): 1 line added at new line 14; 21 lines added at new lines 477-497.