
Accumulate results in output register explicitly #5294

Merged
martin-frbg merged 1 commit into OpenMathLib:develop from arnej27959:arnej/fix-arm64-register on Jun 10, 2025

Conversation

@arnej27959
Contributor

This should fix #5293
using the same pattern as the other KERNEL_F_FINALIZE versions.
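
For readers unfamiliar with the idea in the title, here is a minimal sketch of what "accumulate results in the output register explicitly" can mean for a dot product that keeps several partial sums. This is not the arm64 kernel and not the content of this PR's diff; `dot_lanes` and `LANES` are hypothetical names chosen for illustration, and the lane count is an assumption.

```python
# Hypothetical illustration only, not the OpenBLAS arm64 kernel.
# A dot product that keeps several partial sums (like SIMD lanes) and
# folds every one of them into a single output value at the end.
LANES = 4  # assumed lane count, picked only for this example


def dot_lanes(x, y):
    acc = [0.0] * LANES                  # per-lane partial sums
    main = len(x) - len(x) % LANES       # largest multiple of LANES <= len(x)
    for i in range(0, main, LANES):
        for lane in range(LANES):
            acc[lane] += x[i + lane] * y[i + lane]

    out = 0.0                            # stands in for the "output register"
    for lane in range(LANES):            # finalize: add each lane into out explicitly
        out += acc[lane]
    for i in range(main, len(x)):        # leftover tail elements
        out += x[i] * y[i]
    return out
```

The point of the sketch is that the final value is written into `out` by the finalize step itself, rather than assuming that one of the partial accumulators already happens to be the output.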

@codspeed-hq

codspeed-hq Bot commented Jun 9, 2025

CodSpeed Performance Report

Merging #5294 will improve performance by 10.53%

Comparing arnej27959:arnej/fix-arm64-register (5442aff) with develop (02267d8)

Summary

⚡ 1 improvement
✅ 61 untouched benchmarks

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| `test_dgemv[1000-s]` | 7.7 ms | 7 ms | +10.53% |

@martin-frbg martin-frbg added this to the 0.3.30 milestone Jun 10, 2025
@martin-frbg martin-frbg merged commit bbdc265 into OpenMathLib:develop Jun 10, 2025
86 checks passed

Development

Successfully merging this pull request may close these issues.

wrong result from cblas_sdot on arm64

2 participants