Optimized BGEMV for NEOVERSEV1 target #5394

Merged
martin-frbg merged 1 commit into OpenMathLib:develop from Mousius:optimize-bgemv
Jul 23, 2025

Conversation

@Mousius Mousius commented Jul 23, 2025

  • Adds a bgemv T kernel based on the sbgemv T kernel
  • Adds a bgemv N kernel, slightly altered so that it does not use Y as an accumulator: the output is bf16, so accumulating partial sums in Y would lose precision
  • Enables BGEMM_GEMV_FORWARD to proxy BGEMM to BGEMV with the new kernels
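
The precision point in the bgemv N item can be illustrated with a scalar sketch. This is not the SVE kernel from this PR; the function names, the `bf16_t` typedef, and the caller-provided `acc` scratch buffer are hypothetical and exist only for illustration. The first routine rounds the partial sums to bf16 in Y after every column, and the second keeps them in fp32 and rounds once at the end.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical storage type: bf16 is the upper 16 bits of an IEEE fp32. */
typedef uint16_t bf16_t;

float bf16_to_f32(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Round to nearest even. NaN handling is omitted to keep the sketch short. */
bf16_t f32_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return (bf16_t)(bits >> 16);
}

/* y := A*x for a column-major m x n matrix A.
 * Partial sums live in y, so they are rounded to bf16 after every column. */
void bgemv_n_accumulate_in_y(int m, int n, const bf16_t *a, int lda,
                             const bf16_t *x, bf16_t *y)
{
    for (int i = 0; i < m; i++)
        y[i] = f32_to_bf16(0.0f);
    for (int j = 0; j < n; j++) {
        float xj = bf16_to_f32(x[j]);
        for (int i = 0; i < m; i++) {
            float sum = bf16_to_f32(y[i])
                      + bf16_to_f32(a[i + (size_t)j * lda]) * xj;
            y[i] = f32_to_bf16(sum);
        }
    }
}

/* Same product, but partial sums stay in the fp32 scratch buffer acc
 * (length m) and are rounded to bf16 only once, when y is written. */
void bgemv_n_accumulate_in_f32(int m, int n, const bf16_t *a, int lda,
                               const bf16_t *x, float *acc, bf16_t *y)
{
    for (int i = 0; i < m; i++)
        acc[i] = 0.0f;
    for (int j = 0; j < n; j++) {
        float xj = bf16_to_f32(x[j]);
        for (int i = 0; i < m; i++)
            acc[i] += bf16_to_f32(a[i + (size_t)j * lda]) * xj;
    }
    for (int i = 0; i < m; i++)
        y[i] = f32_to_bf16(acc[i]);
}
```

For a 1 x 512 matrix of ones multiplied by a vector of ones, the first routine stalls at 256, because bf16 carries 8 significant bits and 256 + 1 rounds back to 256. The second routine returns 512, which bf16 represents exactly.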

@Mousius Mousius changed the title from "Optimized BGEMV for NEOVERSEN2, NEOVERSEV1 and NEOVERSEV2 targets" to "Optimized BGEMV for NEOVERSEV1 target" Jul 23, 2025
@martin-frbg martin-frbg added this to the 0.3.31 milestone Jul 23, 2025
@martin-frbg martin-frbg merged commit 392d381 into OpenMathLib:develop Jul 23, 2025
81 of 88 checks passed