Commit af5aaa5
ssjia
Update on "[ET-VK][conv1d] Implement height-packed depthwise conv1d operator"
Implement a depthwise conv1d operator using height-packed layout where channels
are the packed dimension (WHCN dim 1). Depthwise conv applies a separate filter
to each channel independently (groups=C), so 4 channels can be processed in
parallel using element-wise vec4 FMA over kernel positions.
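For reference, the math the operator is expected to reproduce can be sketched in plain Python. This is a scalar reference under assumptions, not the shader: the function name, the nested-list tensor representation, and the output-length formula (the usual conv1d one) are illustrative choices that do not come from this commit.

```python
def conv1d_dw_reference(x, w, bias=None, stride=1, padding=0, dilation=1):
    """Scalar depthwise conv1d reference (hypothetical helper).

    x:    [N][C][L_in] nested lists
    w:    [C][K], i.e. the [C,1,K] weight with the singleton dim dropped
    bias: [C] or None
    Returns [N][C][L_out].
    """
    n_batch, n_chan, l_in = len(x), len(x[0]), len(x[0][0])
    k_size = len(w[0])
    # Standard conv1d output length (assumed to match the operator).
    l_out = (l_in + 2 * padding - dilation * (k_size - 1) - 1) // stride + 1
    out = []
    for n in range(n_batch):
        out_n = []
        for c in range(n_chan):
            row = []
            for l in range(l_out):
                acc = bias[c] if bias is not None else 0.0
                for k in range(k_size):
                    pos = l * stride - padding + k * dilation
                    # Positions outside the input are zero padding.
                    if 0 <= pos < l_in:
                        # groups=C: channel c only ever sees filter c.
                        acc += x[n][c][pos] * w[c][k]
                row.append(acc)
            out_n.append(row)
        out.append(out_n)
    return out
```

Because each channel uses only its own filter, four adjacent channels can share the same loop over `k`, which is what makes the vec4 formulation possible.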
Thread mapping: X=C/4, Y=L_out, Z=N. Each thread computes one output texel
(4 channels at one spatial position). Inner loop iterates over kernel positions
K with bounds-checked input access for padding.
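The thread mapping above can be emulated on the CPU as a rough sketch. The names `div_up`, `global_wg_size`, and `thread_output_texel` are hypothetical, and rounding C/4 up for channel counts that are not a multiple of 4 is an assumption; the commit only states X=C/4.

```python
def div_up(a, b):
    """Integer ceiling division."""
    return (a + b - 1) // b


def global_wg_size(n_batch, n_chan, l_out):
    # X = channel texels, Y = output positions, Z = batch.
    # Rounding up for C % 4 != 0 is an assumption.
    return (div_up(n_chan, 4), l_out, n_batch)


def thread_output_texel(x, w, tid, stride=1, padding=0, dilation=1):
    """Emulate one thread: 4 channel outputs at one output position.

    x: [N][C][L_in], w: [C][K], tid: (c4, l, n). Bias is omitted here.
    """
    c4, l, n = tid
    n_chan, l_in = len(x[0]), len(x[0][0])
    k_size = len(w[0])
    acc = [0.0, 0.0, 0.0, 0.0]
    for k in range(k_size):
        pos = l * stride - padding + k * dilation
        if pos < 0 or pos >= l_in:
            continue  # padding region contributes zero
        # Element-wise multiply-add over the 4 packed channels.
        for lane in range(4):
            c = 4 * c4 + lane
            if c < n_chan:
                acc[lane] += x[n][c][pos] * w[c][k]
    return acc
```

Under these assumptions, `global_wg_size(1, 128, 4096)` gives a (32, 4096, 1) dispatch for the first benchmark shape, taking L_out equal to 4096.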
Weight [C,1,K] is prepacked as channels-packed so each vec4 load gives 4
channels' weights at one kernel position. Supports both buffer and texture3d
storage, fp32/fp16, optional bias, and arbitrary stride/padding/dilation.
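One way to picture the channels-packed weight is the following sketch. The `[k][c4]` texel ordering, the zero fill for a partial last texel, and the helper name `prepack_dw_weight` are assumptions for illustration; the actual prepacked layout is not shown in this commit message.

```python
def prepack_dw_weight(w):
    """Repack a depthwise weight into channel texels (hypothetical helper).

    w: [C][K], i.e. the [C,1,K] weight with the singleton dim dropped.
    Returns packed[k][c4], a list of 4 values holding the weights of
    channels 4*c4 .. 4*c4+3 at kernel position k.
    """
    n_chan, k_size = len(w), len(w[0])
    c4_count = (n_chan + 3) // 4
    packed = []
    for k in range(k_size):
        row = []
        for c4 in range(c4_count):
            texel = []
            for lane in range(4):
                c = 4 * c4 + lane
                # Channels past C are zero-filled (assumption).
                texel.append(w[c][k] if c < n_chan else 0.0)
            row.append(texel)
        packed.append(row)
    return packed
```

With this layout, a single texel load at `(k, c4)` supplies the weights for all four channels a thread is accumulating.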
Registered as et_vk.conv1d_dw.default (standalone custom op).
Performance on Adreno 750 (S24):
- [1,128,4096] K=31 buffer f16: 231 GFLOP/s
- [1,128,4096] K=31 buffer f32: 155 GFLOP/s
- [1,512,2048] K=5 buffer f32: 66 GFLOP/s
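To relate these throughput figures to work per dispatch, a back-of-the-envelope sketch follows. It assumes 2 FLOPs per multiply-add and that the listed length is the output length; the commit does not state how FLOPs were counted, so the helper names and the resulting numbers are illustrative only.

```python
def conv1d_dw_flops(n_batch, n_chan, l_out, k_size):
    """FLOPs for depthwise conv1d, counting each multiply-add as 2 FLOPs."""
    return 2 * n_batch * n_chan * l_out * k_size


def implied_latency_ms(flops, gflops_per_s):
    """Latency in milliseconds implied by a throughput figure."""
    return flops / (gflops_per_s * 1e9) * 1e3


# First benchmark shape, assuming L_out = 4096.
flops = conv1d_dw_flops(1, 128, 4096, 31)
latency = implied_latency_ms(flops, 231.0)
```

Under these assumptions the first case is about 32.5 MFLOP per dispatch, which at 231 GFLOP/s corresponds to roughly 0.14 ms.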
Differential Revision: [D97344091](https://our.internmc.facebook.com/intern/diff/D97344091/)
[ghstack-poisoned]
4 files changed
Lines changed: 86 additions & 14 deletions
File tree
- backends/vulkan/runtime/graph/ops
  - glsl
  - impl