Commit 2d4ff1c
PPC64: clampDoubleToUint8 — P9 branchless via xsmaxjdp + isel
The previous clampDoubleToUint8 had two mispredictable branches (≤0/NaN
and ≥255), each with a small body that jumped to a shared exit. The
function is hot on Uint8ClampedArray store paths.
POWER9 added xsmaxjdp/xsminjdp which use Java/JS semantics (ISA v3.0B
§7.6.1.7): any NaN is treated as "less than any number that is not a
NaN". So xsmaxjdp(input, 0) collapses {NaN, -Inf, ≤ 0} all to 0 in a
single instruction — the entire "≤ 0 or NaN → 0" branch dance
disappears.
After the max, fctid converts to int64 using the FPSCR default rounding
mode (round-to-nearest-even, which matches the round-half-to-even that
ECMAScript specifies for Uint8ClampedArray) and saturates out-of-int64
values to INT64_MAX. The remaining upper clamp (output > 255 → 255) is
one cmpdi + isel.
POWER9 path (7 insns, no branches):
    zeroDouble fpscratch
    xsmaxjdp   fpscratch, input, fpscratch   ; max(input, 0); NaN→0
    fctid      fpscratch, fpscratch
    mfvsrd     output, fpscratch
    li         max255, 255
    cmpdi      output, 255
    isel       output, max255, output, GreaterThan
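To sanity-check the intended semantics, the three steps of the POWER9 sequence can be modeled in portable C++. This is a minimal sketch, not the SpiderMonkey code. The function names are hypothetical, and std::nearbyint is assumed to run under the default FE_TONEAREST rounding mode.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical reference model of the result the commit describes;
// it does not emulate the instructions themselves.

// Step 1: max(input, 0) with NaN mapped to 0.
// Any comparison with NaN is false, so NaN falls through to 0.0.
static double maxWithZeroModel(double d) {
    return (d > 0.0) ? d : 0.0;
}

// Step 2: convert to int64, rounding to nearest with ties to even,
// saturating at INT64_MAX. The input is non-negative after step 1.
static int64_t roundToInt64Model(double d) {
    if (d >= 9223372036854775808.0) {  // 2^63, first value out of range
        return std::numeric_limits<int64_t>::max();
    }
    return static_cast<int64_t>(std::nearbyint(d));
}

// Step 3: upper clamp (the cmpdi + isel pair in the listing).
static uint8_t clampDoubleToUint8Model(double d) {
    int64_t v = roundToInt64Model(maxWithZeroModel(d));
    return static_cast<uint8_t>(v > 255 ? 255 : v);
}
```

Under this model, NaN, -Inf and negative inputs give 0, 2.5 gives 2 (ties to even), and anything that rounds above 255 gives 255.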
POWER8 path: unchanged (xsmaxjdp unavailable; fctid maps NaN to
INT64_MAX which would clamp to 255 instead of the spec-required 0,
so we keep the explicit NaN-filtering branches).
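For comparison, the two-branch shape that the POWER8 fallback keeps can be sketched in C++ as below. This only illustrates the control flow described above (a ≤0/NaN test and a ≥255 test ahead of the conversion). The function name is hypothetical, the actual emitted code is not shown in this message, and std::nearbyint is assumed to run under the default FE_TONEAREST rounding mode.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical model of the branchy fallback; not the emitted code.
static uint8_t clampBranchyModel(double d) {
    // Branch 1: d <= 0 or d is NaN. Written as !(d > 0) so that NaN,
    // for which every comparison is false, takes this path.
    if (!(d > 0.0)) {
        return 0;
    }
    // Branch 2: anything at or above 255 saturates.
    if (d >= 255.0) {
        return 255;
    }
    // Remaining inputs are in (0, 255); round to nearest, ties to even.
    return static_cast<uint8_t>(std::nearbyint(d));
}
```

Filtering NaN before the conversion is what lets this path return 0 for NaN without relying on how the conversion instruction treats it.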
Verified end-to-end:
- Real P9 jit-test --jitflags=none: 13715 PASS / 0 FAIL (default)
- Real P9 jit-test --jitflags=none MOZ_PPC64_FORCE_POWER8=1: 13715 / 0
- Real P9 jstests default: PASS
- Real P9 jstests MOZ_PPC64_FORCE_POWER8=1: PASS
- Sim MOZ_PPC64_FORCE_POWER9=1 jit-test: 13651 / 0
- Sim MOZ_PPC64_FORCE_POWER10=1 jit-test: 13651 / 0
- Sim MOZ_PPC64_FORCE_POWER8=1 jit-test: 13651 / 0
PLAN.md item mozilla-firefox#23 (POWER9 fast path; POWER8 fallback unchanged).
1 parent 88c5aeb commit 2d4ff1c
1 file changed
Lines changed: 26 additions & 7 deletions