Commit b508845

LC to overlap with following kernels
1 parent 770a38f commit b508845

1 file changed

Lines changed: 2 additions & 0 deletions

ggml/src/ggml-cuda/mmvq.cu

@@ -731,6 +731,8 @@ static __global__ void mul_mat_vec_q_moe(
         }
     }
 
+    ggml_cuda_pdl_lc();
+
     // Warp-level reduction only - no shared memory needed
 #pragma unroll
     for (int i = 0; i < c_rows_per_block; ++i) {
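
Note on the change: the body of `ggml_cuda_pdl_lc()` is not part of this diff. Going by the name and the commit title, it presumably signals CUDA Programmatic Dependent Launch (PDL) launch completion, which lets the next kernel in the stream begin launching while this one is still running its warp-level reduction. The sketch below illustrates that general pattern only, under that assumption. The helpers `pdl_launch_complete` and `pdl_grid_sync`, the two toy kernels, and `run_demo` are hypothetical names for illustration and are not the ggml implementation.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Hypothetical stand-ins, NOT the ggml helpers. They are assumed to map onto
// the CUDA PDL device intrinsics, which exist only for compute capability 9.0+.
static __device__ __forceinline__ void pdl_launch_complete() {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
    cudaTriggerProgrammaticLaunchCompletion();
#endif
}

static __device__ __forceinline__ void pdl_grid_sync() {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
    cudaGridDependencySynchronize();
#endif
}

// Primary kernel: signals launch completion early, then finishes its work.
__global__ void producer(float * x, int n) {
    const int i = blockIdx.x*blockDim.x + threadIdx.x;
    const float v = i < n ? x[i] : 0.0f;
    pdl_launch_complete(); // the dependent kernel may start launching from here
    if (i < n) {
        x[i] = 2.0f*v;
    }
}

// Dependent kernel: must wait for the primary grid before reading its output.
__global__ void consumer(const float * x, float * y, int n) {
    pdl_grid_sync();
    const int i = blockIdx.x*blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = x[i] + 1.0f;
    }
}

// Returns 0 if every output element equals 2*1 + 1 = 3.
int run_demo() {
    const int n = 1024;
    std::vector<float> h(n, 1.0f);
    float * x = nullptr;
    float * y = nullptr;
    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) return 1;
    if (cudaMalloc(&x, n*sizeof(float)) != cudaSuccess) return 1;
    if (cudaMalloc(&y, n*sizeof(float)) != cudaSuccess) return 1;
    cudaMemcpy(x, h.data(), n*sizeof(float), cudaMemcpyHostToDevice);

    // Opt in to PDL only if the kernel was compiled for a 9.0+ target, so the
    // device-side calls above are actually present; otherwise launch normally.
    cudaFuncAttributes fa;
    if (cudaFuncGetAttributes(&fa, consumer) != cudaSuccess) return 1;
    cudaLaunchAttribute attr;
    attr.id = cudaLaunchAttributeProgrammaticStreamSerialization;
    attr.val.programmaticStreamSerializationAllowed = 1;

    cudaLaunchConfig_t cfg = {};
    cfg.gridDim  = dim3(n/256);
    cfg.blockDim = dim3(256);
    cfg.stream   = stream;
    cfg.attrs    = &attr;
    cfg.numAttrs = fa.ptxVersion >= 90 ? 1 : 0;

    producer<<<n/256, 256, 0, stream>>>(x, n);
    if (cudaLaunchKernelEx(&cfg, consumer, (const float *) x, y, n) != cudaSuccess) return 1;
    if (cudaStreamSynchronize(stream) != cudaSuccess) return 1;
    cudaMemcpy(h.data(), y, n*sizeof(float), cudaMemcpyDeviceToHost);

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        bad += h[i] != 3.0f;
    }
    cudaFree(x);
    cudaFree(y);
    cudaStreamDestroy(stream);
    return bad;
}

int main() {
    const int bad = run_demo();
    printf("mismatches: %d\n", bad);
    return bad != 0;
}
```

In this pattern the early trigger only allows the dependent grid to be scheduled sooner. Correctness still relies on the dependent kernel calling the grid-dependency wait before it touches the primary kernel's output, which is why the sketch pairs the two calls. On targets below compute capability 9.0 both helpers compile to nothing and the launch falls back to ordinary stream ordering.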