Commit cd276c2

only save the required registers for arm/sgemm
According to ARM AAPCS (Procedure Call Standard) 5.1.2.1, only registers s16-s31 must be preserved across subroutine calls; registers s0-s15 do not need to be preserved.
1 parent d7aeae8 commit cd276c2

1 file changed

Lines changed: 2 additions & 2 deletions

kernel/arm/sgemm_kernel_4x4_vfpv3.S

@@ -871,7 +871,7 @@ USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     vstr    OLD_ALPHA, ALPHA

     sub     r3, fp, #128
-    vstm    r3, { s8 - s31}         // store floating point registers
+    vstm    r3, { s16 - s31 }       // store floating point registers

     movs    r4, #0
     str     r4, FP_ZERO
@@ -1446,7 +1446,7 @@ USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 .Lsgemm_kernel_L999:

     sub     r3, fp, #128
-    vldm    r3, { s8 - s31}         // restore floating point registers
+    vldm    r3, { s16 - s31 }       // restore floating point registers

     movs    r0, #0                  // set return value
     sub     sp, fp, #24
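
The effect of narrowing the register list can be checked with a little arithmetic. The sketch below is not part of the commit, and the helper names are illustrative only. It assumes just two facts: each single-precision VFP register is 4 bytes wide, and the AAPCS rule quoted in the commit message (s16-s31 are callee-saved, s0-s15 are not). It computes how many bytes each `vstm`/`vldm` transfers before and after the change, and which of the previously saved registers never needed saving.

```python
# Illustrative sketch only: the helper names are hypothetical. The register
# ranges come from the diff, and the callee-saved rule from the commit message.
BYTES_PER_S_REG = 4  # each single-precision VFP register (s0..s31) is 32 bits


def save_area_bytes(first: int, last: int) -> int:
    """Bytes transferred by a vstm/vldm over the list { s<first> - s<last> }."""
    if not (0 <= first <= last <= 31):
        raise ValueError("expected 0 <= first <= last <= 31")
    return (last - first + 1) * BYTES_PER_S_REG


def is_callee_saved(reg: int) -> bool:
    """AAPCS rule cited in the commit: only s16-s31 must be preserved."""
    return 16 <= reg <= 31


old_bytes = save_area_bytes(8, 31)   # { s8 - s31 }, before the commit
new_bytes = save_area_bytes(16, 31)  # { s16 - s31 }, after the commit

# Registers the old code saved although the callee is free to clobber them.
unneeded = [r for r in range(8, 32) if not is_callee_saved(r)]

print(old_bytes, new_bytes, old_bytes - new_bytes)
print(["s%d" % r for r in unneeded])
```

Under these assumptions the store and the matching load each shrink from 96 bytes to 64 bytes, and the eight registers dropped from the list are s8-s15. Because the list still starts at the same base address (`fp - 128`), the restore at `.Lsgemm_kernel_L999` reads back exactly what the prologue wrote.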
