@@ -13,7 +13,9 @@ This page documents those non-standard APIs.
 | ?omatcopy | s,d,c,z | out-of-place transposition/copying |
 | ?geadd | s,d,c,z | ATLAS-like matrix add `B = α*A+β*B` |
 | ?gemmt | s,d,c,z | `gemm` but only a triangular part updated |
-
+| cblas_?gemm_batch | s,d,c,z,b | `gemm` with several groups of input data
+|
+| cblas_?gemm_batch_strided | s,d,c,z,b | `gemm` with groups of data stored at fixed offsets in the input arrays
 
 ## bfloat16 functionality
 
@@ -26,6 +28,15 @@ BLAS-like and conversion functions for `bfloat16` (available when OpenBLAS was c
 * `float cblas_sbdot` computes the dot product of two bfloat16 arrays
 * `void cblas_sbgemv` performs the matrix-vector operations of GEMV with the input matrix and X vector as bfloat16
 * `void cblas_sbgemm` performs the matrix-matrix operations of GEMM with both input arrays containing bfloat16
+* `void cblas_bgemv` performs the matrix-vector operations of GEMV with the input matrix, X vector and result as bfloat16
+* `void cblas_bgemm` performs the matrix-matrix operations of GEMM with both input arrays containing bfloat16 and the output being bfloat16 as well
+
+## half-precision float or fp16 functionality
+
+BLAS-like and conversion functions for `hfloat16` (available when OpenBLAS was compiled with `BUILD_HFLOAT16=1`):
+
+* `void cblas_shgemm` performs the matrix-matrix operations of GEMM with both input arrays containing hfloat16
+
 
 ## Utility functions
 
@@ -36,4 +47,5 @@ BLAS-like and conversion functions for `bfloat16` (available when OpenBLAS was c
 * `char * openblas_get_config()` returns the options OpenBLAS was built with, something like `NO_LAPACKE DYNAMIC_ARCH NO_AFFINITY Haswell`
 * `int openblas_set_affinity(int thread_index, size_t cpusetsize, cpu_set_t *cpuset)` sets the CPU affinity mask of the given thread
   to the provided cpuset. Only available on Linux, with semantics identical to `pthread_setaffinity_np`.
+* `openblas_set_thread_callback_function` overrides the default multithreading backend with the provided argument
 
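
To illustrate what "groups of data stored at fixed offsets in the input arrays" means for the `cblas_?gemm_batch_strided` row added in the first hunk, here is a minimal plain-C sketch of the computation. The helper `ref_sgemm_batch_strided`, its parameter names, and the row-major, no-transpose layout are assumptions made for illustration only. It is not the OpenBLAS signature, which should be taken from `cblas.h`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper, NOT an OpenBLAS function.
 * For every i in [0, batch) it computes
 *     C_i = alpha * A_i * B_i + beta * C_i
 * where matrix i of each operand starts i * stride elements into its array.
 * Assumed layout: row-major storage, no transposition.
 * A_i is m x k, B_i is k x n, C_i is m x n. */
static void ref_sgemm_batch_strided(size_t m, size_t n, size_t k, float alpha,
                                    const float *a, size_t lda, size_t stride_a,
                                    const float *b, size_t ldb, size_t stride_b,
                                    float beta,
                                    float *c, size_t ldc, size_t stride_c,
                                    size_t batch)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 0; i < batch; i++) {
        /* Locate the i-th matrix of each operand by its fixed offset. */
        const float *ai = a + i * stride_a;
        const float *bi = b + i * stride_b;
        float *ci = c + i * stride_c;

        for (size_t row = 0; row < m; row++) {
            for (size_t col = 0; col < n; col++) {
                /* Dot product of row `row` of A_i with column `col` of B_i. */
                float acc = 0.0f;
                for (size_t p = 0; p < k; p++)
                    acc += ai[row * lda + p] * bi[p * ldb + col];
                ci[row * ldc + col] = alpha * acc + beta * ci[row * ldc + col];
            }
        }
    }
}
```

With two 2x2 matrices packed back to back in each array, a call would use `lda = ldb = ldc = 2`, a stride of 4 for all three operands, and `batch = 2`. The point of the strided form is that a single base pointer plus a stride per operand replaces the arrays of per-matrix pointers that a pointer-based batch interface needs.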