POWER10: Reduce sgemm loop unrolling #5592

Merged
martin-frbg merged 1 commit into OpenMathLib:develop from RajalakshmiSR:sgemm-p10-unroll on Jan 5, 2026

@RajalakshmiSR

With GCC 14, the compiler emits unnecessary move and lxvp instructions when the sgemm inner loop is unrolled for larger sizes. Reducing the loop unroll factor restores performance to the level seen with GCC 11.
Co-authored-by: Jose Moreira
Tested by: Amrita H S amritahs@linux.vnet.ibm.com
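
For readers unfamiliar with the term, the sketch below shows what reducing a loop unroll factor means in plain C. It is only an illustration under stated assumptions: it uses a scalar dot product instead of the actual POWER10 MMA sgemm kernel, and the function names `dot_unroll4` and `dot_unroll2` are hypothetical and do not appear in OpenBLAS. Both functions compute the same result; the second simply does less work per loop iteration, which gives the compiler fewer simultaneously live values to schedule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration, not the OpenBLAS POWER10 kernel:
 * a dot-product inner loop with the body unrolled four times. */
float dot_unroll4(const float *a, const float *b, size_t k)
{
    assert(k % 4 == 0); /* sketch assumes k is a multiple of the unroll factor */
    float acc = 0.0f;
    for (size_t i = 0; i < k; i += 4) {
        acc += a[i]     * b[i];
        acc += a[i + 1] * b[i + 1];
        acc += a[i + 2] * b[i + 2];
        acc += a[i + 3] * b[i + 3];
    }
    return acc;
}

/* Same computation with the unroll factor reduced to two:
 * twice as many iterations, half as much work per iteration. */
float dot_unroll2(const float *a, const float *b, size_t k)
{
    assert(k % 2 == 0); /* sketch assumes k is a multiple of the unroll factor */
    float acc = 0.0f;
    for (size_t i = 0; i < k; i += 2) {
        acc += a[i]     * b[i];
        acc += a[i + 1] * b[i + 1];
    }
    return acc;
}
```

Which unroll factor is faster depends on the compiler version and target, as the description above notes for GCC 11 versus GCC 14; this sketch makes no performance claim of its own.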

@martin-frbg martin-frbg added this to the 0.3.31 milestone Jan 5, 2026
@martin-frbg martin-frbg merged commit badf4c0 into OpenMathLib:develop Jan 5, 2026
98 of 102 checks passed
