PPU/SPU LLVM: Use native ARM shuffles in recompilers instead of emulating x86 pshufb #18056
Whatcookie wants to merge 2 commits into
|
Crashes in some games with a message along the lines of: LLVM Emergency Exit Invoked: 'Error while trying to spill X8 from class GPR64: Cannot scavenge register without an emergency spill slot!' This seems to be an LLVM bug. I will check with newer LLVM versions and, if it's not fixed there, try to open an issue with reproducible code on the LLVM repo.
|
The only issues I've seen online regarding this were fixed by adding AliasAnalysis.
121d06a to 7e54a0b
|
Segfaults when building the SPU cache on macOS ARM. Edit: It also hangs with the "Thread too sleepy" error in Puppeteer, though that is not related to this PR.
…6 pshufb
- SHUFB from 9 instructions down to 5
- Though it should be 4 if LLVM would just emit BCAX...
- Some SPU programs inexplicably fail to compile when TBL2/TBX2 are used.
- As an insane workaround, first try to compile with TBL2/TBX2; if LLVM crashes while compiling, try to compile the same program without TBL2/TBX2.
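The retry described in that commit message can be sketched roughly as follows. This is only an illustration of the "try with TBL2/TBX2, then fall back" pattern, not the PR's code: `CompileOptions`, `CompileFn` and `compile_with_fallback` are hypothetical names, and the sketch assumes the compile failure surfaces as a C++ exception. How the real recompiler intercepts an LLVM crash is not shown in this conversation.

```cpp
#include <functional>
#include <initializer_list>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical options: whether the backend may emit two-register TBL/TBX.
struct CompileOptions
{
    bool use_tbl2_tbx2 = true;
};

// Hypothetical compile callback: returns compiled output or throws on failure.
using CompileFn = std::function<std::string(const std::string&, const CompileOptions&)>;

// Try the fast path first; on failure, recompile the same program without TBL2/TBX2.
inline std::optional<std::string> compile_with_fallback(const CompileFn& compile,
                                                        const std::string& program)
{
    for (const bool use_tbl2 : {true, false})
    {
        try
        {
            CompileOptions opts;
            opts.use_tbl2_tbx2 = use_tbl2;
            return compile(program, opts);
        }
        catch (const std::exception&)
        {
            // Assumption: a failed compile is reported as an exception.
            // Fall through and retry with the next, more conservative setting.
        }
    }
    return std::nullopt; // Both attempts failed.
}
```

The cost of this pattern is that a failing program is compiled twice, which is presumably acceptable when only a few programs hit the problem.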
Thanks for testing. I pushed a new build; could you test it too?
The new build fixes the segfaults. It gets in-game after a couple of tries, just like the main branch.
Finally properly emulates the PS3's most iconic instruction (according to me) efficiently on ARM machines too!
Brings SHUFB from 9 instructions down to 5, though it should be 4 if LLVM would just emit BCAX...
This should result in a nice speedup on ARM machines. In another pull request I will tackle the ROTQBY family of instructions.
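For readers unfamiliar with the instruction, here is a scalar reference model of what SHUFB computes, following the SPU ISA definition, together with the bitwise operation that ARM's BCAX performs. It is a sketch for checking semantics only: it is not the instruction sequence the recompiler emits, the names `Vec16`, `shufb_reference` and `bcax` are made up for this example, and bytes are indexed in architectural order (byte 0 is the leftmost), which may differ from the emulator's internal register layout.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec16 = std::array<std::uint8_t, 16>;

// Scalar model of SPU SHUFB: each control byte either selects one byte from
// the 32-byte concatenation a||b or produces a constant.
inline Vec16 shufb_reference(const Vec16& a, const Vec16& b, const Vec16& c)
{
    Vec16 out{};
    for (std::size_t i = 0; i < 16; ++i)
    {
        const std::uint8_t sel = c[i];
        if ((sel & 0xC0) == 0x80)
            out[i] = 0x00; // 10xxxxxx -> constant 0x00
        else if ((sel & 0xE0) == 0xC0)
            out[i] = 0xFF; // 110xxxxx -> constant 0xFF
        else if ((sel & 0xE0) == 0xE0)
            out[i] = 0x80; // 111xxxxx -> constant 0x80
        else
        {
            const std::uint8_t idx = sel & 0x1F; // low 5 bits index a||b
            out[i] = idx < 16 ? a[idx] : b[idx - 16];
        }
    }
    return out;
}

// Per-byte model of ARM BCAX (bit clear and XOR): n ^ (m & ~a).
inline std::uint8_t bcax(std::uint8_t n, std::uint8_t m, std::uint8_t a)
{
    return static_cast<std::uint8_t>(n ^ (m & static_cast<std::uint8_t>(~a)));
}
```

A model like this is handy as an oracle when comparing the output of a vectorized lowering against the expected result for every control byte value.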