Fix cmake building with cblas_bgemm#5397

Merged: martin-frbg merged 1 commit into OpenMathLib:develop from omegacoleman:fix-cblas-bgemm on Jul 24, 2025

@omegacoleman
Contributor

Hello, I'm using CMake build with the following options:

-DONLY_CBLAS=ON
-DUSE_OPENMP=OFF
-DUSE_THREAD=OFF
-DUSE_LOCKING=ON
-DBUILD_BFLOAT16=ON

And I need the shiny new cblas_bgemm. However, these are the problems I encounter:

undefined reference to bgemm_nn, and more

I believe that to create this function, GenerateNamedObjects needs to be called one more time with an additional BGEMM macro; otherwise, it will only produce sbgemm functions.

And after that,

undefined reference to sbstobf16_ and sbf16tos_

These are not built if FBLAS is disabled; yet, cblas interface functions call into them (and they are called by cblas_bgemm), so it's necessary to build them even if NO_FBLAS is defined.
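
For context, bfloat16 is the upper 16 bits of an IEEE-754 single-precision value, so routines like sbstobf16 and sbf16tos boil down to a bit-level narrowing and widening. A minimal, self-contained sketch of that conversion follows. The names bf16_t, float_to_bf16 and bf16_to_float are hypothetical, and this is not OpenBLAS's implementation; the rounding and NaN handling shown here are assumptions and may differ from what the library does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration only; not part of the OpenBLAS API. */
typedef uint16_t bf16_t;

/* float -> bfloat16, rounding to nearest with ties to even (an assumption). */
static bf16_t float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);           /* reinterpret without aliasing UB */

    /* NaN: keep the sign and exponent, force a quiet-NaN mantissa bit so the
     * truncated value cannot collapse into infinity. */
    if ((bits & 0x7fffffffu) > 0x7f800000u)
        return (bf16_t)((bits >> 16) | 0x0040u);

    /* Add 0x7fff plus the lowest kept bit: rounds to nearest, ties to even. */
    bits += 0x7fffu + ((bits >> 16) & 1u);
    return (bf16_t)(bits >> 16);
}

/* bfloat16 -> float is exact: the 16 bits become the high half of the float. */
static float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Widening is lossless, while narrowing drops 16 mantissa bits, which is why a round trip through bfloat16 only preserves values that already fit in 8 bits of significand.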

@Mousius
Contributor

Mousius commented Jul 23, 2025

I assume this works without ONLY_CBLAS=ON, and is only an issue if you disable the BLAS interface.

Interestingly, if you do that, you'll miss out on GEMV forwarding. Depending on your use case, it might be quite a performance boost:

OpenBLAS/cmake/system.cmake, lines 425 to 430 in f4caa61:

if (GEMM_GEMV_FORWARD AND NOT ONLY_CBLAS)
  set(CCOMMON_OPT "${CCOMMON_OPT} -DGEMM_GEMV_FORWARD")
endif ()
if (SBGEMM_GEMV_FORWARD AND NOT ONLY_CBLAS)
  set(CCOMMON_OPT "${CCOMMON_OPT} -DSBGEMM_GEMV_FORWARD")
endif ()
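
To illustrate what GEMV forwarding means in general: when one of the GEMM operands has a single column, the product is really a matrix-vector product, so the GEMM entry point can hand the call to a GEMV routine. The sketch below shows the idea with naive, column-major, non-transposed float kernels. The names naive_gemm and naive_gemv are hypothetical, and this is not how interface/gemm.c is actually written; it only covers the N == 1 case as an assumption about the simplest form of the technique.

```c
#include <stddef.h>

/* Hypothetical reference kernel: y = alpha * A * x + beta * y.
 * A is m x n, column-major, with leading dimension lda. */
static void naive_gemv(size_t m, size_t n, float alpha,
                       const float *a, size_t lda,
                       const float *x, float beta, float *y)
{
    for (size_t i = 0; i < m; i++) {
        float acc = 0.0f;
        for (size_t j = 0; j < n; j++)
            acc += a[i + j * lda] * x[j];
        y[i] = alpha * acc + beta * y[i];
    }
}

/* Hypothetical reference kernel: C = alpha * A * B + beta * C.
 * A is m x k, B is k x n, C is m x n; all column-major, no transposes. */
static void naive_gemm(size_t m, size_t n, size_t k, float alpha,
                       const float *a, size_t lda,
                       const float *b, size_t ldb,
                       float beta, float *c, size_t ldc)
{
    if (n == 1) {
        /* B and C each have one column, so they are plain vectors:
         * forward to the matrix-vector routine instead of running GEMM. */
        naive_gemv(m, k, alpha, a, lda, b, beta, c);
        return;
    }
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < m; i++) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++)
                acc += a[i + p * lda] * b[p + j * ldb];
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}
```

In a tuned library the benefit comes from the GEMV kernel avoiding the packing and blocking overhead that a GEMM kernel pays even for a one-column operand.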

@martin-frbg
Collaborator

@Mousius it does look as if the restrictive AND NOT ONLY_CBLAS may have been added (by you, apparently?) due to the "inexplicably" missing sbstobf16; at least, neither the equivalent Makefile.system nor the actual interface/gemm.c has such a restriction.

@martin-frbg martin-frbg added this to the 0.3.31 milestone Jul 23, 2025
@martin-frbg martin-frbg merged commit a4f4662 into OpenMathLib:develop Jul 24, 2025
87 of 88 checks passed
@Mousius
Contributor

Mousius commented Jul 24, 2025

@martin-frbg yip, I added the limiter. It was actually because it calls gemv behind the scenes and that didn't exist in ONLY_CBLAS - only cblas_gemv.

I was just giving a friendly heads up, using the new bgemm is likely someone experimenting with ML workloads and a lot of the transformer models actually end up using the gemv proxy.

@martin-frbg
Collaborator

Oh right, I need to make sure that GEMV is hooked up correctly now

@omegacoleman
Contributor Author

> Interestingly, if you do that, you'll miss out on GEMV forwarding. Depending on your use case, it might be quite a performance boost:

I'll try poking around with these settings. Thanks for the info. The matrix sizes are predetermined, so direct calls to cblas_*gemv are also applicable.

> using the new bgemm is likely someone experimenting with ML workloads and a lot of the transformer models actually end up using the gemv proxy

Yeah, I'm testing it in an LLM inference engine.

I have successfully wired up the attention layer in full bfloat16, but the performance is ~60x worse than float32 right now, as my CPU lacks AVX512 BF16 extensions.

So this is a proof-of-concept experiment that may become useful years from now, when most CPUs being sold have BF16 extensions.

@martin-frbg
Collaborator

@Mousius all forwarding should be pointing to GEMV i.e. the driver code, not gemv the interface function, so I think the only unresolved references to sgemv et al., if any, should be coming from LAPACK and perhaps the tests. (What's a bit annoying is that currently the CMake build will attempt to build the utest even with ONLY_CBLAS, which means the build will fail before reaching ctest. And the few bfloat16-related tests we have are all in test, relying on the BLAS interface. Also we seem to have the same problem regarding the lack of sbstobf16 with ONLY_CBLAS in the Makefile build.)
