Commit 5db6f22

ssjia authored and manuelcandales committed
[ET-VK][qconv] Use ivec4 reads in pack_q8_conv2d_weights to fix Adreno 740 bug
On Adreno 740 (Quest 3S), scalar int[] SSBO reads from host-visible staging buffers in the pack_q8_conv2d_weights shader returned incorrect data: specifically, t_int8_weight[N] returned the value of t_int8_weight[N-1] for N>0. This caused corrupted conv2d weights for all kernels with kx>0, resulting in 18/45 head-case failures on the SceneX V9 model.

Switching the input buffer declaration from scalar int[] to ivec4[] changes the GPU load instruction from scalar loads to vec4 loads, which sidesteps the driver bug. The indexing is updated accordingly: t_int8_weight[word_idx] becomes t_int8_weight[word_idx >> 2][word_idx & 3].

Authored with Claude.

Differential Revision: [D96036513](https://our.internmc.facebook.com/intern/diff/D96036513/)

ghstack-source-id: 350221316
Pull Request resolved: #18077
1 parent 48690b1 commit 5db6f22
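The index remapping described in the commit message can be checked on the CPU. The following is a minimal sketch in C, not shader code: `ivec4_model`, `read_scalar`, `read_vec4`, and `mappings_agree` are hypothetical names introduced for illustration, and the struct only assumes that an `ivec4` is four tightly packed 32-bit lanes (as in std430). It copies the same bytes into both views and confirms that `words[word_idx]` and `vecs[word_idx >> 2][word_idx & 3]` address the same word.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* CPU-side stand-in for GLSL's ivec4: four 32-bit lanes, 16 bytes total. */
typedef struct { int32_t lane[4]; } ivec4_model;

/* Old addressing: the buffer is viewed as a flat int[] and indexed directly. */
static int32_t read_scalar(const int32_t *words, int word_idx) {
    return words[word_idx];
}

/* New addressing: the same bytes viewed as ivec4[]; select the vector, then the lane. */
static int32_t read_vec4(const ivec4_model *vecs, int word_idx) {
    return vecs[word_idx >> 2].lane[word_idx & 3];
}

/* Returns 1 if both views yield the same word for every index in [0, 4 * n_vecs). */
static int mappings_agree(int n_vecs) {
    enum { MAX_VECS = 16 };
    assert(n_vecs >= 0 && n_vecs <= MAX_VECS);

    int32_t flat[MAX_VECS * 4];
    ivec4_model vecs[MAX_VECS];

    /* Fill the flat view with distinct values, then copy the raw bytes
       into the vec4 view so both arrays hold identical memory contents. */
    for (int i = 0; i < n_vecs * 4; ++i) {
        flat[i] = i * 0x01010101 + 7;
    }
    memcpy(vecs, flat, (size_t)n_vecs * sizeof(ivec4_model));

    for (int i = 0; i < n_vecs * 4; ++i) {
        if (read_scalar(flat, i) != read_vec4(vecs, i)) {
            return 0;
        }
    }
    return 1;
}
```

Because the two expressions select the same word, the change only alters how the load is issued, not which data the shader is meant to read.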

1 file changed

Lines changed: 3 additions & 2 deletions

backends/vulkan/runtime/graph/ops/glsl/pack_q8_conv2d_weights.glsl

@@ -17,7 +17,7 @@ ${define_active_storage_type(STORAGE)}
 layout(std430) buffer;
 
 ${layout_declare_tensor(B, "w", "t_packed_int8_weight", "int", STORAGE, is_scalar_array=False)}
-${layout_declare_tensor(B, "r", "t_int8_weight", "int", "buffer")}
+${layout_declare_tensor(B, "r", "t_int8_weight", "int", "buffer", is_scalar_array=False)}
 
 layout(push_constant) uniform restrict Block {
   ivec4 qmat2_sizes;
@@ -65,7 +65,8 @@ void main() {
       if (ic + col < orig_sizes.w) {
         const int byte_idx = buf_idx + col;
         const int byte_pos = byte_idx & 3;
-        weight_vals[col] = (t_int8_weight[byte_idx >> 2] >> (byte_pos * 8)) & 0xFF;
+        const int word_idx = byte_idx >> 2;
+        weight_vals[col] = (t_int8_weight[word_idx >> 2][word_idx & 3] >> (byte_pos * 8)) & 0xFF;
       }
     }
     packed_block[row] = pack_into_int32(weight_vals);
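The second hunk's byte extraction can be traced with a small CPU-side model in C, offered as a sketch under stated assumptions rather than as the shader's actual code. `load_weight_byte` mirrors the new indexing expression from the diff. `pack_four_bytes` is a hypothetical stand-in for `pack_into_int32`, whose definition is not shown in this commit; it assumes element 0 lands in the lowest 8 bits, matching the `>> (byte_pos * 8)` extraction order.

```c
#include <assert.h>
#include <stdint.h>

/* Reads byte `byte_idx` from int8 data packed four bytes per 32-bit word,
   with words grouped four per vector (modelling an ivec4[] buffer). */
static int32_t load_weight_byte(const int32_t (*vecs)[4], int byte_idx) {
    assert(byte_idx >= 0);
    const int byte_pos = byte_idx & 3;   /* position of the byte inside its word */
    const int word_idx = byte_idx >> 2;  /* index of the 32-bit word */
    /* Shift as unsigned so the result does not depend on sign extension. */
    const uint32_t word = (uint32_t)vecs[word_idx >> 2][word_idx & 3];
    return (int32_t)((word >> (byte_pos * 8)) & 0xFFu);
}

/* Hypothetical repacking helper (assumed byte order: vals[0] in bits 0..7). */
static uint32_t pack_four_bytes(const int32_t vals[4]) {
    return  ((uint32_t)vals[0] & 0xFFu)
         | (((uint32_t)vals[1] & 0xFFu) << 8)
         | (((uint32_t)vals[2] & 0xFFu) << 16)
         | (((uint32_t)vals[3] & 0xFFu) << 24);
}
```

For example, with a first vector of `{0x04030201, 0x08070605, ...}`, byte index 5 resolves to word 1, byte position 1, and the function returns `0x06`.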
