diff --git a/Changelog.txt b/Changelog.txt
index 20c76ff522..ee7701de1d 100644
--- a/Changelog.txt
+++ b/Changelog.txt
@@ -1,4 +1,54 @@
 OpenBLAS ChangeLog
+====================================================================
+Version 0.3.33
+23-Apr-2026
+
+general:
+  - fixed an incorrect cast in the SBGEMM test case that could lead to spurious test failures
+  - fixed an invalid memory access in the converted C version of the CBLAS tests
+  - made the BIGNUMA setting automatic when the number of cores exceeds 256
+  - Imported recent updates from Reference-LAPACK to realign with its upcoming 3.13.0 release:
+    - Implement ?LARF1F and ?ORM2R (Reference-LAPACK PRs 1019,1020,1196,1257)
+    - Change loop order in ?GETC2 to improve performance (Reference-LAPACK PR 1023)
+    - Change WORK array dimension in ?GELQS/?GEQRS (Reference-LAPACK PR 1094)
+    - Add NaN checks for input matrix A in ?GEEV (Reference-LAPACK PR 1136)
+    - Fix support for jobu/v in LAPACKE_?GESVDQ_WORK (Reference-LAPACK PRs 1146,1221)
+    - Fix display of version number in LAPACK testsuite (Reference-LAPACK PR 1149)
+    - Fix DGGES test seed to avoid bad matrix cases (Reference-LAPACK PR 1187)
+    - Fix truncation of large WORK array sizes in ZHE (Reference-LAPACK PR 1195)
+    - Fix overwriting of LDSWORK parameter in ?TRSYL3 (Reference-LAPACK PR 1206)
+    - Fix overwriting of error states in some EIG tests (Reference-LAPACK PR 1207)
+    - Remove unused parameter in DORBDB3/ZUNBDB3 (Reference-LAPACK PR 1209)
+    - Re-enable testing of ?BB and ?GG driver functions (Reference-LAPACK PR 1211)
+    - Fix workspace size calculation in ?TGSEN (Reference-LAPACK PR 774)
+    - Fix typos in the EIG DMD tests and initialized the cutoff variable (PR 1212,1228)
+    - Optimized looping in ?LACPY/?LASCL/?LANTR with fat matrix and UPLO=L (PR 1251)
+
+arm64:
+  - worked around a serious miscompilation of the DDOT kernel by GCC15, affecting
+    most non-SVE targets, and SVE targets in the case of non-unit array stride)
+  - fixed an accuracy issue in the GEMV kernel for Neoverse V1 and other SVE targets
+  - fixed broken STRMM and SSYMM in DYNAMIC_ARCH builds when running on non-SME hardware
+  - added an optimized SHGEMM kernel for Neoverse N2
+  - fixed DYNAMIC_ARCH builds under Windows on Arm
+  - Added autodetection of Cortex A75/A76 in DYNAMIC_ARCH builds
+  - Added autodetection of Neoverse V3, currently supported through V2 kernels
+  - Re-added support for the "VORTEX" target in DYNAMIC_ARCH builds with DYNAMIC_LIST
+  - Fixed CMake-based builds that use the "Ninja" generator
+
+loongarch64:
+  - fixed a build failure due to missing support for the new half-precision float type
+  - fixed a long-standing bug in asserting 64bit capability in the c_check helper script
+
+x86_64:
+  - added a workaround for miscompilation of the AVX512 GEMM kernels by LLVM on Windows
+  - fixed a build failure in the LAED3 code when compiling with MinGW on Windows
+  - fixed CMake-based compilation with the NVIDIA HPC compiler
+  - Fixed CMake-based builds that use the "Ninja" generator
+
+wasm:
+  - added optimized kernels for STRSM and DTRSM
+
 ====================================================================
 Version 0.3.32
 23-Mar-2026