<li><a href="#multithreaded-openblas-runs-no-faster-or-is-even-slower-than-singlethreaded-on-my-armv7-board">Multithreaded OpenBLAS runs no faster or is even slower than singlethreaded on my ARMV7 board</a></li>
<li><a href="#speed-varies-wildly-between-individual-runs-on-a-typical-armv8-smartphone-processor">Speed varies wildly between individual runs on a typical ARMV8 smartphone processor</a></li>
<li><a href="#i-cannot-get-openblas-to-use-more-than-a-small-subset-of-available-cores-on-a-big-system">I cannot get OpenBLAS to use more than a small subset of available cores on a big system</a></li>
-
<li><a href="#getting-elf-load-command-addressoffset-not-properly-aligned-when-loading-libopenblasso">Getting "ELF load command address/offset not properly aligned" when loading libopenblas.so</a><ul>
+
<li><a href="#getting-elf-load-command-addressoffset-not-properly-aligned-when-loading-libopenblasso">Getting "ELF load command address/offset not properly aligned" when loading libopenblas.so</a></li>
+
<li><a href="#the-tests-work-fine-but-calling-any-complex-function-from-my-code-produces-wrong-or-no-results">The tests work fine, but calling any complex function from my code produces wrong or no results</a></li>
<li><a href="#using-openblas-with-openmp">Using OpenBLAS with OpenMP</a></li>
</ul>
</li>
</ul>
-
</li>
-
</ul>
</div>
<h2 id="general-questions">General questions</h2>
<h3 id="what-is-blas-why-is-it-important"><a name="whatblas"></a>What is BLAS? Why is it important?</h3>
<h3 id="getting-elf-load-command-addressoffset-not-properly-aligned-when-loading-libopenblasso"><a name="ELFoffset"></a>Getting "ELF load command address/offset not properly aligned" when loading libopenblas.so</h3>
<p>If you get the message "error while loading shared libraries: libopenblas.so.0: ELF load command address/offset not properly aligned" when starting a program that is (dynamically) linked to OpenBLAS, this is very likely due to a bug in the GNU linker (ld) that is part of the
GNU binutils package. This error was specifically observed on older versions of Ubuntu Linux updated with the (at the time) most recent binutils version 2.38, but an internet search turned up sporadic reports involving various other libraries dating back several years. The binutils developers have since created a bugfix, which should be available in later versions of binutils. (See issue 3708 for details.)</p>
-
<h4 id="using-openblas-with-openmp"><a name="OpenMP"></a>Using OpenBLAS with OpenMP</h4>
+
<h3 id="the-tests-work-fine-but-calling-any-complex-function-from-my-code-produces-wrong-or-no-results"><a name="CallingConvention"></a>The tests work fine, but calling any complex function from my code produces wrong or no results</h3>
+
<p>This is almost certainly a problem with the calling convention used, in particular with the way the computed result is transported back to the caller. By default, OpenBLAS follows the GNU Fortran (gfortran) convention of returning a complex result as the function's return value, rather than the F2C convention of storing it through an extra first argument to the function. So if your code has a prototype like "void cdotu ( complex *res, int n,...)", change it to "complex cdotu (int n,...)". Better yet,
+
use the CBLAS interface rather than the Fortran one.</p>
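<p>The difference between the two prototype styles can be sketched in plain C. The functions below are hypothetical stand-ins (the names <code>demo_cdotu_by_value</code> and <code>demo_cdotu_result_arg</code> are assumptions made for illustration and are not part of OpenBLAS), and they take simplified arguments rather than the full Fortran-style argument list. The second style mirrors how the CBLAS function <code>cblas_cdotu_sub</code> hands back its result through an explicit pointer, which is why the CBLAS interface avoids the ambiguity.</p>

```c
#include <complex.h>

/* Hypothetical stand-in: complex dot product returned BY VALUE,
   matching a prototype of the form "complex cdotu(int n, ...)". */
float complex demo_cdotu_by_value(int n, const float complex *x,
                                  const float complex *y)
{
    float complex sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];          /* unconjugated product, as in cdotu */
    return sum;
}

/* Hypothetical stand-in: the same result delivered through an explicit
   result pointer, matching "void cdotu(complex *res, int n, ...)". */
void demo_cdotu_result_arg(float complex *res, int n,
                           const float complex *x, const float complex *y)
{
    *res = demo_cdotu_by_value(n, x, y);
}
```

<p>A caller has to declare whichever form the library was actually built with. Declaring the pointer form against a library that returns by value would leave <code>*res</code> unwritten, which is consistent with the "wrong or no results" symptom.</p>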
+
<h3 id="using-openblas-with-openmp"><a name="OpenMP"></a>Using OpenBLAS with OpenMP</h3>
<p>OpenMP provides its own locking mechanisms, so when your code makes BLAS/LAPACK calls from inside OpenMP parallel regions, it is imperative
that you use an OpenBLAS that is built with USE_OPENMP=1, as otherwise deadlocks might occur. Furthermore, OpenBLAS will automatically restrict itself to using only a single thread when called from an OpenMP parallel region. When it is certain that calls will only occur
from the main thread of your program (i.e. outside of omp parallel constructs), a standard pthreads build of OpenBLAS can be used as well. In that case it may be useful to tune the linger behaviour of idle threads in both your OpenMP program (e.g. by setting OMP_WAIT_POLICY=passive) and OpenBLAS (by redefining the THREAD_TIMEOUT variable at build time, or by setting the environment variable OPENBLAS_THREAD_TIMEOUT to a value smaller than the default 26) so that the two alternating thread pools do not unnecessarily hog the CPU during the handover.</p>
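<p>One way to apply the two environment settings is a small launcher that sets them and then executes the real program. This is only a sketch under assumptions: the helper names are hypothetical, the timeout value passed in (for example "20") is an arbitrary illustration of "smaller than the default 26", and it assumes a POSIX system and that both runtimes read these variables at startup, which is why they are set before the program is executed rather than inside it.</p>

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: export the two idle-thread settings named in the
   text. Returns 0 on success, -1 if either variable could not be set. */
int set_idle_thread_policy(const char *openblas_timeout)
{
    assert(openblas_timeout != NULL);
    if (setenv("OMP_WAIT_POLICY", "passive", 1) != 0)
        return -1;
    if (setenv("OPENBLAS_THREAD_TIMEOUT", openblas_timeout, 1) != 0)
        return -1;
    return 0;
}

/* Hypothetical launcher: set the variables, then replace this process
   with the target program so it starts with them already in place. */
int launch_with_idle_thread_policy(char *const argv[],
                                   const char *openblas_timeout)
{
    if (set_idle_thread_policy(openblas_timeout) != 0)
        return -1;
    execvp(argv[0], argv);
    return -1;  /* only reached if execvp failed */
}
```

<p>Setting the same two variables in the shell or job script before starting the program has the same effect; the launcher form is mainly useful when the environment cannot be controlled from outside.</p>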
@@ -1955,7 +1967,7 @@ <h4 id="using-openblas-with-openmp"><a name="OpenMP"></a>Using OpenBLAS with Ope