Commit c32eefd

Fix incorrect leading dimension check for SME SGEMM direct kernel path
For row-major matrices, the tight-packing condition should be k==lda (A is m×k), n==ldb (B is k×n), and n==ldc (C is m×n). The old check used m==lda and k==ldc, so for tightly packed inputs the SME/direct kernel was invoked only when m==k==n (square matrices). Fixes #5794.
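The difference between the two conditions can be sketched in isolation. The helper names below are hypothetical and are not part of OpenBLAS; they only restate the old and new comparisons from this commit as standalone predicates, assuming plain `int` dimensions.

```c
#include <stdbool.h>

/* Hypothetical helpers for illustration only; not OpenBLAS API. */

/* Corrected check: in row-major storage the leading dimension is the row
   stride, so a tightly packed matrix has a leading dimension equal to its
   column count: k for A (m x k), n for B (k x n), n for C (m x n). */
static bool packed_row_major_fixed(int m, int n, int k,
                                   int lda, int ldb, int ldc)
{
    (void)m; /* the row count of A and C does not affect row-major strides */
    return k == lda && n == ldb && n == ldc;
}

/* Old check, as written in the removed lines: compares lda with m and
   ldc with k, i.e. with the wrong dimensions for A and C. */
static bool packed_row_major_old(int m, int n, int k,
                                 int lda, int ldb, int ldc)
{
    return m == lda && n == ldb && k == ldc;
}
```

For a tightly packed non-square problem such as m=4, n=6, k=8 (lda=8, ldb=6, ldc=6), the corrected predicate holds while the old one does not; for a tightly packed square problem both hold.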
1 parent 8e57c86 commit c32eefd

1 file changed

Lines changed: 2 additions & 2 deletions

interface/gemm.c
@@ -562,12 +562,12 @@ if (strcmp(gotoblas_corename(), "armv9sme") == 0
 )
 // if (support_sme1())
 #endif
-if (order == CblasRowMajor && m==lda && n ==ldb && k==ldc && beta == 0 && alpha == 1.0 && TransA == CblasNoTrans && TransB == CblasNoTrans&& SGEMM_DIRECT_PERFORMANT(m,n,k)) {
+if (order == CblasRowMajor && k==lda && n==ldb && n==ldc && beta == 0 && alpha == 1.0 && TransA == CblasNoTrans && TransB == CblasNoTrans && SGEMM_DIRECT_PERFORMANT(m,n,k)) {
 SGEMM_DIRECT(m, n, k, a, lda, b, ldb, c, ldc);
 return;
 }
 else
-if (order == CblasRowMajor && m==lda && n==ldb && k==ldc && TransA == CblasNoTrans && TransB == CblasNoTrans&& SGEMM_DIRECT_PERFORMANT(m,n,k)) {
+if (order == CblasRowMajor && k==lda && n==ldb && n==ldc && TransA == CblasNoTrans && TransB == CblasNoTrans && SGEMM_DIRECT_PERFORMANT(m,n,k)) {
 SGEMM_DIRECT_ALPHA_BETA(m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
 return;
 }
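The two branches in this hunk amount to a small dispatch decision, which the following sketch models as a pure function. Everything here is an assumption made for illustration: `sgemm_path`, `choose_sgemm_path`, and the boolean parameters are hypothetical names, the result of `SGEMM_DIRECT_PERFORMANT(m,n,k)` is passed in as a plain flag, and the surrounding preprocessor and core-name conditions from `interface/gemm.c` are ignored.

```c
#include <stdbool.h>

/* Hypothetical model of the patched conditions; not OpenBLAS code. */
typedef enum {
    PATH_DIRECT,            /* corresponds to the SGEMM_DIRECT branch */
    PATH_DIRECT_ALPHA_BETA, /* corresponds to the SGEMM_DIRECT_ALPHA_BETA branch */
    PATH_FALLTHROUGH        /* neither branch taken; later code is not shown in the diff */
} sgemm_path;

static sgemm_path choose_sgemm_path(bool row_major,
                                    bool no_trans_a, bool no_trans_b,
                                    int m, int n, int k,
                                    int lda, int ldb, int ldc,
                                    float alpha, float beta,
                                    bool performant /* stands in for SGEMM_DIRECT_PERFORMANT */)
{
    (void)m; /* m only feeds the performance heuristic, modelled by `performant` */

    /* Conditions shared by both patched branches. */
    bool eligible = row_major && no_trans_a && no_trans_b
                 && k == lda && n == ldb && n == ldc
                 && performant;
    if (!eligible)
        return PATH_FALLTHROUGH;

    /* The first branch additionally requires alpha == 1 and beta == 0. */
    if (beta == 0.0f && alpha == 1.0f)
        return PATH_DIRECT;

    return PATH_DIRECT_ALPHA_BETA;
}
```

Under this model, a tightly packed 4×8 by 8×6 product with alpha=1 and beta=0 selects the direct path, and the same shapes with any other alpha or beta select the alpha/beta variant.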
