OpenBLAS 0.3.1 version
common:
- rewritten thread initialization code with significantly reduced overhead
- added CBLAS interfaces to the IxAMIN BLAS extension functions
- fixed the lapack-test target
- CMake builds now create an OpenBLASConfig.cmake file
- ZAXPY now uses a single thread for small input sizes
- the LAPACK code was updated from Reference-LAPACK/lapack#253
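
As an illustration of what the IxAMIN extension computes, here is a minimal pure-Python sketch for the real-valued case. It does not call OpenBLAS; the function name `iamin` is hypothetical. The 0-based result (the convention of the CBLAS index functions) and the handling of degenerate input are assumptions rather than statements taken from these notes.

```python
def iamin(n, x, incx=1):
    """Return the 0-based index of the first element with the smallest |x[i]|.

    Sketch of the real-valued IxAMIN semantics; `iamin` is a hypothetical
    helper, not an OpenBLAS function.
    """
    if n <= 0 or incx <= 0:
        return 0  # assumption: degenerate input maps to index 0
    best_index = 0
    best_value = abs(x[0])
    for i in range(1, n):
        value = abs(x[i * incx])  # element i of the strided vector
        if value < best_value:    # strict '<' keeps the first minimum
            best_index, best_value = i, value
    return best_index


print(iamin(4, [3.0, -1.0, 2.0, 1.0]))  # prints 1
```

The strict comparison means that ties resolve to the earliest index, which mirrors how the reference IxAMAX routines treat equal magnitudes.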
POWER:
- corrected CROT and ZROT behaviour with zero INC_X
ARMV7:
- corrected xDOT behaviour with zero INC_X or INC_Y
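
The zero-increment fixes are easier to reason about with the reference loop in view. The sketch below models the Reference BLAS indexing for a dot product in pure Python. `ref_dot` is a hypothetical name, and it is an assumption that the corrected kernels are meant to agree with this loop. With an increment of zero the index never advances, so the same element is used in every term.

```python
def ref_dot(n, x, incx, y, incy):
    """Dot product following the Reference BLAS index arithmetic (0-based)."""
    if n <= 0:
        return 0.0
    # Negative increments start from the far end, as in the reference code.
    ix = (1 - n) * incx if incx < 0 else 0
    iy = (1 - n) * incy if incy < 0 else 0
    total = 0.0
    for _ in range(n):
        total += x[ix] * y[iy]
        ix += incx  # stays put when incx == 0
        iy += incy  # stays put when incy == 0
    return total


# With incx == 0, x[0] multiplies every element of y: 2 * (1 + 2 + 3)
print(ref_dot(3, [2.0], 0, [1.0, 2.0, 3.0], 1))  # prints 12.0
```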
x86_64:
- retired some older targets of DYNAMIC_ARCH builds to a new option DYNAMIC_OLDER;
this affects PENRYN, DUNNINGTON, OPTERON, OPTERON_SSE3, BOBCAT, ATOM and NANO
(which will still be supported via the slower PRESCOTT kernels when this option is not set)
- added an option DYNAMIC_LIST that (used in conjunction with DYNAMIC_ARCH) allows
specifying the list of x86_64 targets to include. Any target not on the list will be supported by
the Sandybridge or Nehalem kernels if available, or by Prescott.
- improved SWITCH_RATIO on Haswell for increased GEMM throughput
- added initial support for Intel Skylake X, including an AVX512 SGEMM kernel
- added autodetection of Intel Cannon Lake series as Skylake X
- added a default L2 cache size for hypervisors that return zero here (Chromebook)
- fixed a name clash with recent Windows10 headers that broke the build with (at least)
recent mingw from MSYS2
- fixed a link error in mixed clang/gfortran builds with OpenMP
- updated the OSX deployment target to 10.8
- switched on parallel make for builds on MS Windows by default
x86:
- fixed SSWAP and DSWAP behaviour with zero INC_X and INC_Y
md5sum
f6180b15f9b29252e718afc6f7c55c5f OpenBLAS-0.3.1.zip
b35f77d7ea723684333145b9b10656ec OpenBLAS-0.3.1.tar.gz
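
One way to check a downloaded archive against the sums above is sketched below, using only Python's standard `hashlib`. The helper names `md5_of_file` and `matches_expected` are hypothetical, and the sketch assumes the archive has been saved locally under its original file name.

```python
import hashlib

# Checksums copied from the list above.
EXPECTED_MD5 = {
    "OpenBLAS-0.3.1.zip": "f6180b15f9b29252e718afc6f7c55c5f",
    "OpenBLAS-0.3.1.tar.gz": "b35f77d7ea723684333145b9b10656ec",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, name):
    """Compare the file at `path` with the published sum for archive `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `matches_expected("OpenBLAS-0.3.1.tar.gz", "OpenBLAS-0.3.1.tar.gz")` should return `True` for an intact download. MD5 detects accidental corruption, but it is not a defence against deliberate tampering.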