ARM64EC: Optimize GPR and MM state setting #5410
Sonicadvance1 wants to merge 1 commit into FEX-Emu:main
I think the code improvement speaks for itself here.
Does this present a measurable improvement in any particular benchmark? This code path is super cold and only happens on a signal, and at that point this code doesn't remotely dominate. I don't think it makes sense to complicate this with inline asm.
It showed up enough when running games that I noticed it anyway. If llvm-mingw were smart enough it would optimize this itself, but it just... doesn't.
Right, but any numbers to back it up? I very much don't see how this isn't negligible vs signal overhead.
I'll need to double check. The codegen was bad enough that I didn't even check again.
So Elden Ring was the game I saw this on; it's spamming thread SIGUSR1 constantly or something, which triggers this code path. I must have gotten the game into a weird state where it was happening even more frequently, because the single-digit CPU usage percentages weren't showing back up. But I did measure in the "regular" state and saw a measurable 25%-33% performance uplift on the code, which of course didn't correlate to an FPS increase in this particular case. So still worth it.