kernel/riscv64:Optimized the implementation of axpby on TARGET=RISCV64_ZVL256B.#5288

Merged
martin-frbg merged 2 commits into OpenMathLib:develop from guoyuanplct:develop on May 29, 2025

Conversation

@guoyuanplct
Contributor

The specific improvements are shown in the figure below.
[Two images attached to the pull request are not reproduced here.]

@guoyuanplct changed the title from "Optimized the implementation of axpby on TARGET=RISCV64_ZVL256B." to "kernel/riscv64:Optimized the implementation of axpby on TARGET=RISCV64_ZVL256B." on May 29, 2025
@martin-frbg added this to the 0.3.30 milestone on May 29, 2025
@martin-frbg merged commit 02267d8 into OpenMathLib:develop on May 29, 2025
84 of 86 checks passed
@abhishek-iitmadras
Contributor

abhishek-iitmadras commented Aug 23, 2025

Just out of curiosity and for my own learning, I have the following question:

What are the key practical scenarios or algorithms where a dedicated AXPBY kernel from OpenBLAS provides a significant performance advantage, given that we already have a highly optimized AXPY?

Thanks

@martin-frbg @guoyuanplct

@martin-frbg
Collaborator

We may not have a highly optimized AXPY on all architectures, and the current default for AXPBY is a naive C loop instead of combining calls to SCAL and AXPY in the interface. (The git log suggests that axpby was added a decade ago for compatibility with MKL - #285 - and nobody looked at it - or its performance - ever since)
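
For readers unfamiliar with the operation, the two approaches contrasted above can be sketched in plain C. AXPBY computes y := alpha*x + beta*y. This is only an illustration under stated assumptions: the function names `axpby_naive`, `axpby_composed`, `scal_ref`, and `axpy_ref` are hypothetical, the code handles unit-stride double-precision vectors only, and none of it is taken from the OpenBLAS sources.

```c
#include <stddef.h>

/* Single-pass loop: y[i] = alpha * x[i] + beta * y[i].
 * Hypothetical name; unit stride only. */
void axpby_naive(size_t n, double alpha, const double *x,
                 double beta, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = alpha * x[i] + beta * y[i];
}

/* Hypothetical stand-in for a SCAL routine: y := beta * y. */
static void scal_ref(size_t n, double beta, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] *= beta;
}

/* Hypothetical stand-in for an AXPY routine: y := alpha * x + y. */
static void axpy_ref(size_t n, double alpha, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += alpha * x[i];
}

/* The alternative mentioned above: AXPBY expressed as SCAL followed
 * by AXPY. In this sketch y is traversed twice instead of once. */
void axpby_composed(size_t n, double alpha, const double *x,
                    double beta, double *y)
{
    scal_ref(n, beta, y);
    axpy_ref(n, alpha, x, y);
}
```

In this sketch both functions give the same result for ordinary finite inputs. The composed form can reuse whatever tuned SCAL and AXPY an architecture already has, at the cost of a second pass over `y`; a dedicated kernel keeps the single pass.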

@martin-frbg
Collaborator

(small correction - the Loongson crew did add optimized kernels for their hardware in late 2023, so this is not entirely without precedent. There are no callers in Reference-LAPACK, and the only user in OpenBLAS itself seems to be the generic GEADD, so this may have gone mostly unnoticed)
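
To illustrate how a GEADD-style routine can sit on top of AXPBY, here is a minimal sketch. It assumes GEADD computes C := alpha*A + beta*C on column-major matrices, and it applies an AXPBY to each column in turn. The names `axpby_col` and `geadd_sketch` are hypothetical, and this is not the actual OpenBLAS generic GEADD code, which is not shown in this thread.

```c
#include <stddef.h>

/* Hypothetical unit-stride AXPBY: y[i] = alpha * x[i] + beta * y[i]. */
static void axpby_col(size_t n, double alpha, const double *x,
                      double beta, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = alpha * x[i] + beta * y[i];
}

/* Assumed GEADD semantics: C := alpha * A + beta * C.
 * A and C are column-major with leading dimensions lda and ldc,
 * so column j of A starts at a + j * lda. */
void geadd_sketch(size_t rows, size_t cols,
                  double alpha, const double *a, size_t lda,
                  double beta, double *c, size_t ldc)
{
    for (size_t j = 0; j < cols; j++)
        axpby_col(rows, alpha, a + j * lda, beta, c + j * ldc);
}
```

Structured this way, any speedup in the per-column AXPBY carries over directly to the matrix routine, since each column is an independent unit-stride AXPBY.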


3 participants