
Commit 82e8f2e

trg-rgb and Vaibhav805 committed
docs: clarify RISC-V RVV target selection and GCC 14+ requirement for ZVL128B/ZVL256B
Add two notes to the RISC-V section before the per-target entries:

1. RISCV64_GENERIC is intentionally scalar: Makefile.riscv64 appends a scalar march override that takes precedence over any user-supplied -march=rv64gcv. Correct targets for RVV 1.0 are RISCV64_ZVL128B and RISCV64_ZVL256B (see #3808 for design rationale).
2. GCC 14+ required for _rvv.c kernels on current OpenBLAS. GCC 13 builds complete and produce a library but routines using segmented load/store intrinsics (__riscv_vsseg*) fall back to scalar silently. Functional tests pass; only disassembly detects this.

Verified on OpenBLAS 0.3.33: GCC 13 (scalar fallback), GCC 14 (~12,691 RVV opcodes), GCC 15 (~14,355 RVV opcodes).

Co-authored-by: Vaibhav805 <Vaibhav805@users.noreply.github.com>
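The GCC 14+ requirement described in the commit message could be checked before a build starts. The following is a minimal sketch and not part of the commit: the function name `gcc_major_at_least` is hypothetical, the cross-compiler name in the usage comment is an assumption, and the check assumes the `-dumpversion` output begins with the major version number.

```sh
#!/bin/sh
# gcc_major_at_least VERSION MIN  (hypothetical helper, not part of OpenBLAS)
# Succeeds (exit 0) when the major component of VERSION is >= MIN.
# VERSION is a string such as "14" or "13.2.0".
gcc_major_at_least() {
    major=${1%%.*}                 # keep only the text before the first dot
    case $major in
        ''|*[!0-9]*) return 2 ;;   # not a plain number: cannot decide
    esac
    [ "$major" -ge "$2" ]
}

# Example (assumes a cross compiler named riscv64-unknown-linux-gnu-gcc):
#   ver=$(riscv64-unknown-linux-gnu-gcc -dumpversion)
#   gcc_major_at_least "$ver" 14 || echo "GCC >= 14 needed for ZVL targets" >&2
```

Because a GCC 13 build reportedly completes without error, a gate like this only guards the known version boundary; it does not replace checking the disassembly.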
1 parent 804a77c commit 82e8f2e

1 file changed

Lines changed: 4 additions & 0 deletions

README.md

@@ -226,6 +226,10 @@ Please read `GotoBLAS_01Readme.txt` for older CPU models already supported by th
make HOSTCC=gcc TARGET=x280 NUM_THREADS=8 CC=riscv64-unknown-linux-gnu-clang FC=riscv64-unknown-linux-gnu-gfortran
```

**Note on RVV 1.0 target selection:** `RISCV64_GENERIC` is intentionally scalar: its kernel file maps all BLAS operations to plain-C implementations, and `Makefile.riscv64` appends a scalar `-march` override for this target that takes precedence over any user-supplied `-march=rv64gcv` flag. For RVV 1.0 vectorized builds, use `TARGET=RISCV64_ZVL128B` (VLEN ≥ 128 bits) or `TARGET=RISCV64_ZVL256B` (VLEN ≥ 256 bits). These targets route all three BLAS levels, including DGEMM, to the `_rvv.c` kernel set introduced in 2022; see [issue #3808](https://github.com/OpenMathLib/OpenBLAS/issues/3808) for the design rationale.

**Compiler requirement for ZVL targets:** GCC 14 or later is required on current OpenBLAS releases. GCC 13 does not implement the segmented load/store intrinsics (`__riscv_vsseg*`) used by the `_rvv.c` kernels; the build still completes and produces a library, but the affected routines fall back to scalar code paths. Functional tests will pass on the resulting library; only disassembly-level verification detects the regression. For a correct `RISCV64_ZVL128B` build on OpenBLAS 0.3.33, `riscv64-linux-gnu-objdump -d libopenblas*.a | grep -c 'vle64\|vfmacc\|vsetvli\|vlse64\|vfmul\|vfadd\|vfredosum'` returns approximately 12,000–14,000 (GCC 14: ~12,691; GCC 15: ~14,355).
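
The disassembly check in the note above can be wrapped in a small reusable function. This is a sketch only: the name `count_rvv_lines` is hypothetical, and the mnemonic list is copied from the `grep` pattern in the note rather than being a complete list of RVV instructions.

```sh
#!/bin/sh
# count_rvv_lines: read disassembly text on stdin and print how many lines
# mention one of the RVV mnemonics listed in the note above.
# (Hypothetical helper; not shipped with OpenBLAS.)
count_rvv_lines() {
    # grep -c exits with status 1 when the count is 0; "|| true" keeps that
    # from aborting callers that run under "set -e".
    grep -c -E 'vle64|vfmacc|vsetvli|vlse64|vfmul|vfadd|vfredosum' || true
}

# Example, using the objdump command from the note:
#   riscv64-linux-gnu-objdump -d libopenblas*.a | count_rvv_lines
```

A result of zero, or one far below the figures quoted above, would suggest the scalar fallback; where exactly to draw that line is left to the reader.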
- **ZVL???B**: Level-3 BLAS and Level-1,2 including vectorised kernels targeting generic RISCV cores with vector support with registers of at least the corresponding width; ZVL128B and ZVL256B are available.
e.g.:
```sh
