Optimized RVV_ZVL256B Implementation of zgemv_n #5245

Merged
martin-frbg merged 2 commits into OpenMathLib:develop from guoyuanplct:develop
May 1, 2025

Conversation

@guoyuanplct
Contributor

The RVV_ZVL256B implementation of zgemv_n has been optimized. It achieves a 1.5x performance improvement over the previous implementation and a 2.4x speedup over the version without RVV.

| Matrix Size | After optimization | Before optimization | Without RVV |
|-------------|--------------------|---------------------|-------------|
| 100x100 | 0.000052 | 0.000051 | 0.000099 |
| 110x110 | 0.000038 | 0.000067 | 0.000092 |
| 120x120 | 0.000037 | 0.000259 | 0.000095 |
| 130x130 | 0.00006 | 0.000056 | 0.000134 |
| 140x140 | 0.000097 | 0.000104 | 0.00011 |
| 150x150 | 0.000055 | 0.000125 | 0.00016 |
| 160x160 | 0.000065 | 0.000095 | 0.000182 |
| 170x170 | 0.000093 | 0.000133 | 0.000195 |
| 180x180 | 0.000077 | 0.000135 | 0.000219 |
| 190x190 | 0.000085 | 0.000111 | 0.000249 |
| 200x200 | 0.000114 | 0.000183 | 0.000273 |
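
For readers unfamiliar with the routine, zgemv_n is the non-transposed complex double-precision matrix-vector product, y += alpha * A * x. The following is a minimal scalar sketch of that operation, assuming column-major storage and unit strides. The function name `zgemv_n_ref` and its signature are hypothetical and are not the OpenBLAS kernel interface or the code from this PR; a vectorized kernel would typically process the inner loop over `i` in vector-length chunks.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical scalar reference (not the OpenBLAS kernel signature):
 * computes y += alpha * A * x for a column-major m x n matrix A with
 * leading dimension lda, assuming unit strides for x and y. */
static void zgemv_n_ref(size_t m, size_t n, double complex alpha,
                        const double complex *a, size_t lda,
                        const double complex *x, double complex *y)
{
    assert(lda >= m); /* each column must hold at least m elements */

    for (size_t j = 0; j < n; j++) {
        /* Scale x[j] once, then add that multiple of column j to y. */
        double complex t = alpha * x[j];
        for (size_t i = 0; i < m; i++) {
            y[i] += t * a[i + j * lda];
        }
    }
}
```

Walking the matrix column by column keeps the inner loop on contiguous memory, which is the access pattern a non-transposed GEMV kernel usually tries to exploit.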

@martin-frbg martin-frbg added this to the 0.3.30 milestone Apr 24, 2025
@guoyuanplct
Contributor Author

The C910V check has timed out. I don't know why this happened; perhaps it should be restarted.

@martin-frbg martin-frbg merged commit cba32d0 into OpenMathLib:develop May 1, 2025
84 of 88 checks passed
@martin-frbg
Collaborator

Unfortunately, this appears to have broken C910V (openblas_utest_ext is failing in c/zgemv and some c/zgbmv tests), which was masked by the CI job timing out.
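
A regression of this kind is usually caught by comparing an optimized kernel's output against a trusted result within a tolerance. The sketch below shows one way such a comparison could look. It is not the openblas_utest_ext code; the helper names `max_component_diff` and `vectors_match` and the choice of a component-wise absolute tolerance are assumptions made for illustration.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper: largest absolute difference over the real and
 * imaginary components of two complex vectors of length n. */
static double max_component_diff(size_t n, const double complex *got,
                                 const double complex *want)
{
    assert(got != NULL && want != NULL);

    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double dre = creal(got[i]) - creal(want[i]);
        double dim = cimag(got[i]) - cimag(want[i]);
        if (dre < 0.0) dre = -dre;
        if (dim < 0.0) dim = -dim;
        if (dre > worst) worst = dre;
        if (dim > worst) worst = dim;
    }
    return worst;
}

/* Returns 1 when every component agrees within tol, 0 otherwise. */
static int vectors_match(size_t n, const double complex *got,
                         const double complex *want, double tol)
{
    return max_component_diff(n, got, want) <= tol;
}
```

In practice the expected vector would come from a reference computation, and the tolerance would be scaled to the problem size and data magnitude rather than fixed.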

@guoyuanplct
Contributor Author

I'm really sorry to hear this. I'm looking into the cause of the error and will do my best to resolve it as soon as possible.
