Gte rewrite#2007
Conversation
46 tests covering register I/O, instruction correctness, and documented discrepancies between Sony's SDK reference and psx-spx. All tests verified against SCPH-5501 silicon. Key hardware findings: - ORGB saturates (not truncates) negative/large IR values - AVSZ3 uses SZ1+SZ2+SZ3 (not SZ0+SZ1+SZ2) - H register sign-extends on CFC2 read despite being unsigned - MVMVA cv=2 (FC) and mx=3 (garbage) produce deterministic results - RES1 (data reg 23) is fully read/write despite SDK saying prohibited - All single-16bit control registers sign-extend on CTC2 write - lm flag toggles IR clamp between [-0x8000,0x7fff] and [0,0x7fff] - GTE requires COP2 enable (CU2 bit in CP0 SR) and has no interlock Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
Split into sub-test files by instruction group for maintainability. Full instruction coverage: register I/O, NCLIP, AVSZ3/4, SQR, OP, GPF/GPL, RTPS/RTPT, MVMVA (all parameter combos + cv=2/mx=3 bugs), DPCS/DPCT/DCPL/INTPL, NCS/NCT/NCCS/NCCT/NCDS/NCDT/CC/CDP. Also fixes cester macro precedence issue with bitwise & in assertions, and adds ImGui context guard for headless pad polling. Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
Structured bitfield encoder for all GTE instructions with named macros, register indices, and FLAG bit definitions. Conventional fake-opcode values match Sony SDK numbering. All encodings verified against known-good opcodes via static assertions. Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
Replace all hand-coded opcodes and register macros with cop2.h API. Add systematic encoding tests: fake-field ignored verification, sf=0/sf=1 behavioral differences, lm=0/lm=1 differences, full MVMVA 64-combination sweep, mx=3 garbage matrix with all vectors, cv=2 FC bug across all matrices, NCLIP/AVSZ3 ignore unused fields. 117 tests, all verified on SCPH-5501 hardware. Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
Division edge cases: divide by zero, H=0, exact overflow boundary (H=SZ3*2-1 vs H=SZ3*2 vs H=SZ3*2+1). IR/color/screen coordinate saturation boundary tests with FLAG bit verification. Degenerate states: zero matrix, negative Z (behind camera), SQR of -0x8000, GPL with negative base, depth cue with FC < input. Verifies FLAG is cleared at instruction start without explicit gte_clear_flag. 142 tests, all verified on SCPH-5501 hardware. Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
…maly 44-bit MAC overflow detection for all three channels with FLAG bit verification. Division table spot-checks including the documented F015h/780Bh corner case, H/SZ3 boundary (>= not >), and SZ3=1. RTPS sf=0 IR3/FLAG.22 anomaly: IR3 saturates when MAC3 exceeds 16-bit range, but FLAG.22 only fires when MAC3>>12 exceeds it. 155 tests, all verified on SCPH-5501 hardware. Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
Replace the original C implementation with a clean, template-driven architecture. The old code relied on ~130 register access macros, had no separation of concerns, and clashed with codebase naming conventions. The rewrite decomposes the GTE into reusable pipeline stages (lightTransform, colorMatrix, depthCue, colorApply, pushColor) that compose into the 22 instruction implementations. sf and lm are compile-time template parameters, and MVMVA is fully templatized on mx/v/cv with a 256-entry consteval dispatch table. Split across four files: - gte.h: public interface only, no implementation details - gte-internal.h: register accessors, int44, limiters, pipeline stages - gte-transfer.cc: MFC2/MTC2/CFC2/CTC2/LWC2/SWC2 - gte-instructions.cc: instruction templates and dispatch The UNR reciprocal table is now consteval-generated using the table-generator framework instead of a hardcoded byte array. All 155 hardware-verified tests pass on interpreter and dynarec. Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
📝 WalkthroughWalkthroughRefactors the PSX GTE emulator into modular components: adds an internal helper header, separate instruction and transfer implementation files, replaces the monolithic gte.cc, introduces compile-time MVMVA dispatch, and adds extensive COP2 opcode helpers and a large GTE-focused test suite and integration tests. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 19
🧹 Nitpick comments (4)
tests/pcsxrunner/gte.cc (1)
23-35: Consider extracting the shared invocation setup to reduce test drift.Both tests differ only by backend flag; a tiny helper keeps future argument changes in one place.
Optional dedup refactor
+namespace { +int runGteTest(const char* backendFlag) { + MainInvoker invoker("-no-ui", "-run", "-bios", "src/mips/openbios/openbios.bin", "-testmode", backendFlag, + "-luacov", "-loadexe", "src/mips/tests/gte/gte.ps-exe"); + return invoker.invoke(); +} +} // namespace + TEST(GTE, Interpreter) { - MainInvoker invoker("-no-ui", "-run", "-bios", "src/mips/openbios/openbios.bin", "-testmode", "-interpreter", - "-luacov", "-loadexe", "src/mips/tests/gte/gte.ps-exe"); - int ret = invoker.invoke(); - EXPECT_EQ(ret, 0); + EXPECT_EQ(runGteTest("-interpreter"), 0); } TEST(GTE, Dynarec) { - MainInvoker invoker("-no-ui", "-run", "-bios", "src/mips/openbios/openbios.bin", "-testmode", "-dynarec", - "-luacov", "-loadexe", "src/mips/tests/gte/gte.ps-exe"); - int ret = invoker.invoke(); - EXPECT_EQ(ret, 0); + EXPECT_EQ(runGteTest("-dynarec"), 0); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/pcsxrunner/gte.cc` around lines 23 - 35, Both GTE tests duplicate MainInvoker construction; extract a small helper to centralize shared args and vary only the backend flag. Create a helper function (e.g., BuildGteInvoker or MakeGteInvoker) that takes the backend string ("-interpreter" or "-dynarec"), constructs and returns a MainInvoker with the common flags ("-no-ui", "-run", "-bios", "src/mips/openbios/openbios.bin", "-testmode", "-luacov", "-loadexe", "src/mips/tests/gte/gte.ps-exe"), then update the TEST(GTE, Interpreter) and TEST(GTE, Dynarec) cases to call that helper with the respective backend and invoke.invoke(); keep EXPECT_EQ(ret, 0) as-is.vsprojects/gui/gui.vcxproj.filters (1)
112-117: Keep widget files under the widgets filter for consistency.These new entries point to
src/gui/widgets/...but are mapped to top-level filters. Consider placing them underSource Files\widgetsandHeader Files\widgetsto match the rest of the project layout.Proposed filter alignment
- <ClCompile Include="..\..\src\gui\widgets\ram-viewer.cc"> - <Filter>Source Files</Filter> + <ClCompile Include="..\..\src\gui\widgets\ram-viewer.cc"> + <Filter>Source Files\widgets</Filter> </ClCompile> - <ClCompile Include="..\..\src\gui\widgets\zoomable-image.cc"> - <Filter>Source Files</Filter> + <ClCompile Include="..\..\src\gui\widgets\zoomable-image.cc"> + <Filter>Source Files\widgets</Filter> </ClCompile> @@ - <ClInclude Include="..\..\src\gui\widgets\ram-viewer.h"> - <Filter>Header Files</Filter> + <ClInclude Include="..\..\src\gui\widgets\ram-viewer.h"> + <Filter>Header Files\widgets</Filter> </ClInclude> - <ClInclude Include="..\..\src\gui\widgets\zoomable-image.h"> - <Filter>Header Files</Filter> + <ClInclude Include="..\..\src\gui\widgets\zoomable-image.h"> + <Filter>Header Files\widgets</Filter> </ClInclude>Also applies to: 219-224
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@vsprojects/gui/gui.vcxproj.filters` around lines 112 - 117, The new ClCompile entries for "..\..\src\gui\widgets\ram-viewer.cc" and "..\..\src\gui\widgets\zoomable-image.cc" are assigned the top-level "Source Files" filter; update their <Filter> values to the nested "Source Files\\widgets" to match project conventions, and do the same for the corresponding ClInclude entries (lines referenced around 219-224) by setting their <Filter> to "Header Files\\widgets" so widget sources/headers reside under the widgets sub-filter; locate the entries in gui.vcxproj.filters by the file names ram-viewer.cc and zoomable-image.cc (and the analogous header entries) and change the Filter elements accordingly.src/mips/tests/gte/Makefile (1)
34-34: Prefer appending toCPPFLAGSinstead of resetting it.At Line 34,
CPPFLAGS = ...can unintentionally drop inherited flags from parent invocations;+=is safer for composition.Minimal change
-CPPFLAGS = -DNOFLOATINGPOINT +CPPFLAGS += -DNOFLOATINGPOINT🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mips/tests/gte/Makefile` at line 34, The current Makefile line resets CPPFLAGS which can drop inherited flags; change the assignment for the CPPFLAGS variable from `CPPFLAGS = -DNOFLOATINGPOINT` to use appending (`CPPFLAGS += -DNOFLOATINGPOINT`) so any parent or environment-provided flags are preserved when building (update the CPPFLAGS line in the Makefile accordingly).src/mips/common/hardware/cop2.h (1)
182-205: The access helpers currently force magic register numbers.Because
regis stringized directly,cop2_put(COP2_IR0, v)expands to$COP2_IR0, not$8. That makes the newCOP2_*indices unusable with the primary API and explains why the new tests still have to hardcode raw register numbers.Suggested fix
+#define COP2_STRINGIZE_IMPL(x) `#x` +#define COP2_STRINGIZE(x) COP2_STRINGIZE_IMPL(x) + `#define` cop2_put(reg, val) do { \ uint32_t _v = (val); \ - __asm__ volatile("mtc2 %0, $" `#reg` \ + __asm__ volatile("mtc2 %0, $" COP2_STRINGIZE(reg) \ "\n\tnop\n\tnop" \ : : "r"(_v)); \ } while (0) `#define` cop2_get(reg, dest) do { \ - __asm__ volatile("mfc2 %0, $" `#reg` \ + __asm__ volatile("mfc2 %0, $" COP2_STRINGIZE(reg) \ : "=r"(dest)); \ } while (0) ... `#define` cop2_putc(reg, val) do { \ uint32_t _v = (val); \ - __asm__ volatile("ctc2 %0, $" `#reg` \ + __asm__ volatile("ctc2 %0, $" COP2_STRINGIZE(reg) \ "\n\tnop\n\tnop" \ : : "r"(_v)); \ } while (0) `#define` cop2_getc(reg, dest) do { \ - __asm__ volatile("cfc2 %0, $" `#reg` \ + __asm__ volatile("cfc2 %0, $" COP2_STRINGIZE(reg) \ : "=r"(dest)); \ } while (0)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mips/common/hardware/cop2.h` around lines 182 - 205, The macros cop2_put, cop2_get, cop2_putc and cop2_getc stringize their reg argument directly ("$" `#reg`), so symbolic macros like COP2_IR0 don't expand and produce "$COP2_IR0" instead of the numeric register; fix by adding an expand-then-stringify helper (e.g., a STRINGIFY/EXPAND_AND_STRINGIFY pair) and use that helper in the asm templates so reg is expanded before being turned into the "$N" token, ensuring calls like cop2_put(COP2_IR0, v) emit "$8" not "$COP2_IR0".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/gte-instructions.cc`:
- Around line 63-70: The PGXP path performs the ir*hOverSz3 multiplication in
32-bit due to the placement of the cast, causing potential overflow; update the
PGXP_pushSXYZ2s call so the multiply is widened before it happens (mirror the
main path): compute gteOFX() + (int64_t)ir1() * hOverSz3 * widescreenFactor and
gteOFY() + (int64_t)ir2() * hOverSz3 (i.e., move the (int64_t) cast to wrap
ir1()/ir2() before multiplying by hOverSz3 and apply widescreenFactor to the
widened product) and pass those results into limG1ia(...) and limG2ia(...).
In `@src/core/gte-internal.h`:
- Around line 378-396: The signed left-shifts on controlVector<cv, N>() and on
rfc()/gfc()/bfc() cause UB when values are negative; replace each `<< 12` with
multiplication by a 64-bit constant (e.g. `* INT64_C(4096)`) so expressions in
limB1/limB2/limB3 (the A1<sf>/A2<sf>/A3<sf> lines) and in mac1()/mac2()/mac3()
(and the depth-cue far-color usages rfc/gfc/bfc) use `* INT64_C(4096)` instead
of `<< 12`, ensuring the integer type is 64-bit to match int44/int64 usage.
In `@src/mips/common/hardware/cop2.h`:
- Around line 189-205: The mfc2/cfc2 read macros (cop2_get and cop2_getc)
currently return immediately and can let the compiler use dest before the CP2
move latency completes; update both macros to pad the assembly with two nops
like cop2_putc (add "\n\tnop\n\tnop" after the "mfc2 %0, $" `#reg` and "cfc2 %0,
$" `#reg` instructions) so the hardware read latency is respected and stale
readbacks are avoided.
In `@src/mips/tests/gte/gte-depthcue.c`:
- Around line 70-94: The test dpct_reads_fifo (and similarly intpl_color_push)
currently only asserts the CODE byte and misses verifying RGB packing and FIFO
ordering; update dpct_reads_fifo to assert the lower 24 bits of rgb0/rgb1/rgb2
match the original RGB inputs (for inputs written via cop2_put(20..22) of
0x00102030,0x00405060,0x00708090 expect 0x102030,0x405060,0x708090 in the lower
24 bits) and that cop2_get returns values with both CODE (top byte) and RGB
bytes correct (use cop2_get to read outputs and cester_assert_uint_eq on both
(val >> 24) & 0xff and val & 0x00ffffff); apply the same additional assertions
to the intpl_color_push test region to catch any pushColor/FIFO ordering or
RGB-packing regressions.
In `@src/mips/tests/gte/gte-edgecase.c`:
- Around line 141-180: The two tests edge_mac0_positive_overflow and
edge_mac0_negative_overflow are using vectors that cancel to zero or merely log
results; update the cop2_put inputs to produce true positive and true negative
MAC0 overflows (use asymmetric 16-bit coordinate triples that make the
cross-product exceed 32-bit signed bounds in each direction), invoke
cop2_cmd(COP2_NCLIP) as before, then read mac0 via cop2_get(24, mac0) and flags
via gte_read_flag() and replace prints with assertions that FLAG.16 (positive
overflow) is set in edge_mac0_positive_overflow and FLAG.15 (negative overflow)
is set in edge_mac0_negative_overflow; keep using the same helpers (cop2_put,
cop2_cmd, cop2_get, gte_read_flag) and update the test comments to reflect the
new input choices.
- Around line 505-546: Replace the ramsyscall_printf-only checks in the tests
edge_depthcue_fc_less_than_input and edge_intpl_fc_less_than_ir with
deterministic assertions: read the computed registers via cop2_get (e.g., MAC1
from index 25, RGB2 from index 22) and the GTE flags via gte_read_flag(),
compute or hardcode the expected MAC/RGB/FLAG values described in the comments,
and use the test framework assertions (e.g., CESTER_ASSERT_INT_EQ /
CESTER_ASSERT_UINT_EQ or existing assert helpers) to validate MAC1, RGB2 and
flag instead of only printing them; keep the same setup calls
(gte_set_far_color, cop2_put, cop2_cmd, gte_clear_flag) and ensure the assertion
messages reference the test names (edge_depthcue_fc_less_than_input,
edge_intpl_fc_less_than_ir) so failures are clear.
In `@src/mips/tests/gte/gte-encoding.c`:
- Around line 219-275: The tests enc_mvmva_full_sweep (and the similar blocks at
lines referenced) are only smoke tests because they end with
cester_assert_int_eq(1, 1); replace those no-op assertions with concrete
validations against expected GTE state: compute the expected MAC/IR/FLAG/rotated
outputs for each MVMVA_T invocation and assert the relevant COP2 registers
(e.g., MAC0/MAC1/MAC2, IR0..IR3, FLAGS) match; update the test body that uses
the MVMVA_T macro (and any helper invoked by it) to capture results after each
MVMVA_T and call cester_assert_* on the specific registers instead of the
always-true assertion so regressions in bitfield dispatch or MAC/FLAG handling
are detected.
- Around line 187-210: The test enc_sqr_lm_difference currently squares positive
inputs so lm has no effect; either rename the test to reflect it's a
positive-result sanity check (e.g., enc_sqr_positive_sanity) or change the
inputs used with cop2_put/COP2_SQR so that COP2_SQR(1, 0) produces a negative
IR1 (so lm matters) and COP2_SQR(1, 1) yields a clamped 0; update assertions to
check that ir1_lm0 is the negative result (or non-zero/has sign bit) and ir1_lm1
== 0, keeping references to enc_sqr_lm_difference, cop2_put, cop2_cmd, COP2_SQR,
cop2_get and the ir1_lm0/ir1_lm1 variables.
In `@src/mips/tests/gte/gte-lighting.c`:
- Around line 4-24: The test currently just prints results; replace or augment
that with concrete assertions using the cester test macros: assert that mac1 ==
0, mac2 == 0, mac3 == 0x1000 (from the comment "Stage 1: ... only IR3 = 0x1000")
and that rgb2 equals the packed saturated color derived from MAC/16 -> 255 for
each channel (i.e. packed 0x00FFFFFF per the comment). Update the block around
cop2_get(25..27) / cop2_get(22) in the ncs_z_normal_white_light test (and
replicate the same style of assertions for the other listed tests: lines 50-73,
76-98, 101-120, 123-141, 144-164, 167-188, 191-212, 215-233, 236-256), removing
or keeping ramsyscall_printf only for additional debug but ensuring the test
fails if the observed MAC/RGB values don’t match the expected comment values.
In `@src/mips/tests/gte/gte-mvmva.c`:
- Around line 117-158: The two tests mvmva_cv2_fc_bug and mvmva_mx3_garbage
currently only print MAC/FLAG (using ramsyscall_printf) so they won't fail the
CI when behavior changes; update each test to assert the expected MAC and FLAG
values (use cop2_get to read MAC registers 25/26/27 and gte_read_flag for FLAG
and replace the print with cester_assert_int_eq or equivalent assertions) or
remove/mark these tests as non-regression/disabled from the automated suite
until expected results are finalized; reference the test names mvmva_cv2_fc_bug
and mvmva_mx3_garbage and the register-read helpers (cop2_get, gte_read_flag)
when making the change.
In `@src/mips/tests/gte/gte-nclip.c`:
- Around line 39-79: In the CESTER_TESTs nclip_large_coords and nclip_overflow
you only log results; replace those logs with real assertions: in
nclip_large_coords assert that mac0 (read via cop2_get(24, mac0)) equals -2047
and that overflow bits FLAG.15 and FLAG.16 (from gte_read_flag()) are clear
((flag & ((1<<15)|(1<<16))) == 0); in nclip_overflow assert that at least one
overflow bit is set (((flag >> 15) & 1) || ((flag >> 16) & 1)) and optionally
assert mac0 is the expected wrapped/overflow value per the test; use the
project's CESTER_ASSERT_* macros to fail the test when these conditions are not
met.
In `@src/mips/tests/gte/gte-op.c`:
- Around line 66-81: The test CESTER_TEST(op_overflow_flag, gte_tests, ...) uses
identical diagonal and IR components so all cross products cancel and never
exercise the overflow path; replace the current cop2_putc/cop2_put inputs with
non-collinear large values (e.g., make at least one of R11/R22/R33 or IR
components different) so the MAC produces a value that exceeds the 44-bit
accumulator, then invoke cop2_cmd(COP2_OP_CP(0,0)) as before, call
gte_read_flag() and assert the expected overflow FLAG bit(s) instead of just
printing them; keep gte_clear_flag() and the same command sequence but change
the input vectors to trigger overflow and add an assertion on flag.
In `@src/mips/tests/gte/gte-precision.c`:
- Around line 257-293: The test CESTER_TEST(prec_rtps_sf0_ir3_flag_anomaly) only
logs MAC3/IR3/FLAG but doesn’t assert the expected anomalous behavior; update
the test to assert the specific expectations: after cop2_cmd(COP2_RTPS(...))
read MAC3 via cop2_get(27, mac3) and IR3 via cop2_get(11, ir3) and flag via
gte_read_flag(), then assert MAC3 == 0x8000, assert IR3 is the saturated value
(0x7FFF) and assert FLAG.22 ((flag >> 22) & 1) == 0; use the existing
helper/assertion framework in the test suite to fail the test if any of these
conditions are not met so the edge case is actually enforced.
- Around line 160-177: The test prec_div_100_over_1000 is missing assertions:
after computing sxy2 and extracting sx (via cop2_get(14, sxy2) and int16_t sx =
...), add CESTER assertions to validate the result and FLAG state — assert sx is
within a tight expected range (e.g., 99..101 or similar to match table rounding)
and assert the GTE flag indicates no overflow/exception (use the existing
gte_clear_flag call beforehand and check the flag read function you use
elsewhere in tests). Place these assertions in the test (after ramsyscall_printf
and before the test end) so regressions in the quotient-table or flag handling
are caught.
- Around line 99-120: The test prec_mac_double_overflow currently uses 0x7fff
which yields products that fit in 44 bits and so never exercises the “double
overflow” path; replace the operands so each product exceeds 44 bits then
cancels (e.g., set R22 and IR3 to 0x400000 so R22*IR3 == 1<<44, do the same for
R33 and IR2 so the subtraction yields 0) by changing the cop2_putc/cop2_put
calls that set R22/R33/IR2/IR3 (symbols cop2_putc(2), cop2_putc(4),
cop2_put(10), cop2_put(11)) to larger values like 0x400000, then add assertions
after cop2_get(25, mac1) and flag = gte_read_flag() to check MAC1 == 0 and that
the GTE overflow flag is set (use gte_read_flag() result to assert the overflow
bit).
In `@src/mips/tests/gte/gte-rtps.c`:
- Around line 20-39: The test rtps_offset_vertex currently only asserts sz3 and
never verifies projected coordinates; after reading sxy2 via cop2_get(14) and
extracting sx and sy, add assertions that sx ≈ 199 and sy ≈ 139 (allow ±1
tolerance to account for rounding) instead of re-asserting sz3 twice—use
cester_assert_int_within or equivalent to check sx and sy against those expected
values and keep the existing sz3 check to ensure scale remained 500.
- Around line 99-117: The test rtps_depth_cue currently only prints MAC0/IR0;
change it (and similarly rtps_sf0) to assert the expected clamped numeric
results and relevant GTE status flags instead of only printing: after calling
gte_clear_flag(), setting registers via cop2_putc/cop2_put and running
cop2_cmd(COP2_RTPS(...)), read back MAC0 and IR0 with cop2_get(24, mac0) and
cop2_get(8, ir0) and replace the ramsyscall_printf with assertions that mac0
equals the expected computed value and ir0 is clamped to the expected range
(e.g. 0..0x1000) and also read and assert the relevant FLAG bits (use the
existing gte_clear_flag() pattern and the GTE flag read helper or cop2_get for
the flag/status register) so the test fails when arithmetic clamping or flag
setting is incorrect; apply the same pattern to rtps_sf0 (lines ~120-138).
In `@src/mips/tests/gte/gte-sqr.c`:
- Around line 50-81: Add concrete assertions to both tests: in
sqr_saturation_shifted after cop2_get(9/10/11, ...) assert that IR1 == 0x4000
and that IR2 and IR3 equal the GTE saturation result used by this codebase (use
the project saturation constant if present, otherwise 0x8000/0x7fff per your GTE
semantics), and assert the saturation/overflow bit is set by checking
gte_read_flag() & <saturation_flag_constant> (or compare against the expected
flag mask). In sqr_negative_input after cop2_get(25/26/27, ...) assert MAC1 ==
100, MAC2 == 2500, and MAC3 == 10000 to validate squared signed inputs; use the
existing test assertion helpers (CESTER_ASSERT or equivalent) rather than
prints.
---
Nitpick comments:
In `@src/mips/common/hardware/cop2.h`:
- Around line 182-205: The macros cop2_put, cop2_get, cop2_putc and cop2_getc
stringize their reg argument directly ("$" `#reg`), so symbolic macros like
COP2_IR0 don't expand and produce "$COP2_IR0" instead of the numeric register;
fix by adding an expand-then-stringify helper (e.g., a
STRINGIFY/EXPAND_AND_STRINGIFY pair) and use that helper in the asm templates so
reg is expanded before being turned into the "$N" token, ensuring calls like
cop2_put(COP2_IR0, v) emit "$8" not "$COP2_IR0".
In `@src/mips/tests/gte/Makefile`:
- Line 34: The current Makefile line resets CPPFLAGS which can drop inherited
flags; change the assignment for the CPPFLAGS variable from `CPPFLAGS =
-DNOFLOATINGPOINT` to use appending (`CPPFLAGS += -DNOFLOATINGPOINT`) so any
parent or environment-provided flags are preserved when building (update the
CPPFLAGS line in the Makefile accordingly).
In `@tests/pcsxrunner/gte.cc`:
- Around line 23-35: Both GTE tests duplicate MainInvoker construction; extract
a small helper to centralize shared args and vary only the backend flag. Create
a helper function (e.g., BuildGteInvoker or MakeGteInvoker) that takes the
backend string ("-interpreter" or "-dynarec"), constructs and returns a
MainInvoker with the common flags ("-no-ui", "-run", "-bios",
"src/mips/openbios/openbios.bin", "-testmode", "-luacov", "-loadexe",
"src/mips/tests/gte/gte.ps-exe"), then update the TEST(GTE, Interpreter) and
TEST(GTE, Dynarec) cases to call that helper with the respective backend and
invoke.invoke(); keep EXPECT_EQ(ret, 0) as-is.
In `@vsprojects/gui/gui.vcxproj.filters`:
- Around line 112-117: The new ClCompile entries for
"..\..\src\gui\widgets\ram-viewer.cc" and
"..\..\src\gui\widgets\zoomable-image.cc" are assigned the top-level "Source
Files" filter; update their <Filter> values to the nested "Source
Files\\widgets" to match project conventions, and do the same for the
corresponding ClInclude entries (lines referenced around 219-224) by setting
their <Filter> to "Header Files\\widgets" so widget sources/headers reside under
the widgets sub-filter; locate the entries in gui.vcxproj.filters by the file
names ram-viewer.cc and zoomable-image.cc (and the analogous header entries) and
change the Filter elements accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1ee50744-d48f-48c7-bad5-94ab4a110609
📒 Files selected for processing (28)
src/core/gte-instructions.ccsrc/core/gte-internal.hsrc/core/gte-transfer.ccsrc/core/gte.ccsrc/core/gte.hsrc/mips/common/hardware/cop2.hsrc/mips/tests/Makefilesrc/mips/tests/gte/Makefilesrc/mips/tests/gte/gte-avsz.csrc/mips/tests/gte/gte-depthcue.csrc/mips/tests/gte/gte-edgecase.csrc/mips/tests/gte/gte-encoding.csrc/mips/tests/gte/gte-gpf-gpl.csrc/mips/tests/gte/gte-lighting.csrc/mips/tests/gte/gte-mvmva.csrc/mips/tests/gte/gte-nclip.csrc/mips/tests/gte/gte-op.csrc/mips/tests/gte/gte-precision.csrc/mips/tests/gte/gte-regio.csrc/mips/tests/gte/gte-rtps.csrc/mips/tests/gte/gte-sqr.csrc/mips/tests/gte/gte.ctests/pcsxrunner/gte.ccvsprojects/core/core.vcxprojvsprojects/core/core.vcxproj.filtersvsprojects/gui/gui.vcxproj.filtersvsprojects/tests/pcsxrunner/pcsxrunner.vcxprojvsprojects/tests/pcsxrunner/pcsxrunner.vcxproj.filters
💤 Files with no reviewable changes (1)
- src/core/gte.cc
| limB1<false>(A1<sf>(int44(controlVector<cv, 0>() << 12) + | ||
| matrixElement<mx, 0, 0>() * vertexX<v>())); | ||
| limB2<false>(A2<sf>(int44(controlVector<cv, 1>() << 12) + | ||
| matrixElement<mx, 1, 0>() * vertexX<v>())); | ||
| limB3<false>(A3<sf>(int44(controlVector<cv, 2>() << 12) + | ||
| matrixElement<mx, 2, 0>() * vertexX<v>())); | ||
| } else { | ||
| mac1() = A1<sf>(int44(controlVector<cv, 0>() << 12) + | ||
| matrixElement<mx, 0, 0>() * vertexX<v>() + | ||
| matrixElement<mx, 0, 1>() * vertexY<v>() + | ||
| matrixElement<mx, 0, 2>() * vertexZ<v>()); | ||
| mac2() = A2<sf>(int44(controlVector<cv, 1>() << 12) + | ||
| matrixElement<mx, 1, 0>() * vertexX<v>() + | ||
| matrixElement<mx, 1, 1>() * vertexY<v>() + | ||
| matrixElement<mx, 1, 2>() * vertexZ<v>()); | ||
| mac3() = A3<sf>(int44(controlVector<cv, 2>() << 12) + | ||
| matrixElement<mx, 2, 0>() * vertexX<v>() + | ||
| matrixElement<mx, 2, 1>() * vertexY<v>() + | ||
| matrixElement<mx, 2, 2>() * vertexZ<v>(), rawMac3); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, verify the file exists and check the line count
wc -l src/core/gte-internal.hRepository: grumpycoders/pcsx-redux
Length of output: 94
🏁 Script executed:
# View the specific lines mentioned in the review
sed -n '370,440p' src/core/gte-internal.h | cat -nRepository: grumpycoders/pcsx-redux
Length of output: 3925
🏁 Script executed:
# Search for the controlVector function definition to understand its return type
rg "controlVector" src/core/gte-internal.h -A 3 -B 3 | head -50Repository: grumpycoders/pcsx-redux
Length of output: 1727
🏁 Script executed:
# Search for int44 function definition
rg "int44" src/core/gte-internal.h -B 2 -A 2 | head -30Repository: grumpycoders/pcsx-redux
Length of output: 1178
🏁 Script executed:
# Check for similar shift operations on control-related variables
rg "<<\s*12" src/core/gte-internal.h | head -20Repository: grumpycoders/pcsx-redux
Length of output: 732
🏁 Script executed:
# Search for rfc, gfc, bfc function definitions
rg "rfc\(\)|gfc\(\)|bfc\(\)" src/core/gte-internal.h -B 2 -A 2 | head -40Repository: grumpycoders/pcsx-redux
Length of output: 848
🏁 Script executed:
# View lines around 433-436 to see the depth cue section
sed -n '425,445p' src/core/gte-internal.h | cat -nRepository: grumpycoders/pcsx-redux
Length of output: 1035
🏁 Script executed:
# Search for the register structures to understand the data types
rg "\.sd" src/core/gte-internal.h | head -20Repository: grumpycoders/pcsx-redux
Length of output: 951
🏁 Script executed:
# Search for all signed left-shift operations on int64_t values in the file
rg "<<\s*12\)" src/core/gte-internal.h | head -30Repository: grumpycoders/pcsx-redux
Length of output: 693
🏁 Script executed:
# Check if there are other register access patterns that might have the same issue
rg "\(\).*<<" src/core/gte-internal.h | grep -E "(rfc|gfc|bfc|controlVector|rbk|gbk|bbk|trX|trY|trZ)" | head -20Repository: grumpycoders/pcsx-redux
Length of output: 693
Replace signed left-shift operations with multiplication to avoid undefined behavior.
Left-shifting negative signed integers is undefined behavior in C++. This affects both the MVMVA control-vector path (lines 378-396) where controlVector<cv, N>() values are shifted, and the depth-cue far-color path (lines 433-436) where rfc(), gfc(), and bfc() values are shifted.
Replace all instances of << 12 with * INT64_C(4096):
Suggested fix
- limB1<false>(A1<sf>(int44(controlVector<cv, 0>() << 12) +
+ limB1<false>(A1<sf>(int44(controlVector<cv, 0>() * INT64_C(4096)) +
matrixElement<mx, 0, 0>() * vertexX<v>()));
- limB2<false>(A2<sf>(int44(controlVector<cv, 1>() << 12) +
+ limB2<false>(A2<sf>(int44(controlVector<cv, 1>() * INT64_C(4096)) +
matrixElement<mx, 1, 0>() * vertexX<v>()));
- limB3<false>(A3<sf>(int44(controlVector<cv, 2>() << 12) +
+ limB3<false>(A3<sf>(int44(controlVector<cv, 2>() * INT64_C(4096)) +
matrixElement<mx, 2, 0>() * vertexX<v>()));
...
- mac1() = A1<sf>(int44(controlVector<cv, 0>() << 12) +
+ mac1() = A1<sf>(int44(controlVector<cv, 0>() * INT64_C(4096)) +
matrixElement<mx, 0, 0>() * vertexX<v>() +
matrixElement<mx, 0, 1>() * vertexY<v>() +
matrixElement<mx, 0, 2>() * vertexZ<v>());
- mac2() = A2<sf>(int44(controlVector<cv, 1>() << 12) +
+ mac2() = A2<sf>(int44(controlVector<cv, 1>() * INT64_C(4096)) +
matrixElement<mx, 1, 0>() * vertexX<v>() +
matrixElement<mx, 1, 1>() * vertexY<v>() +
matrixElement<mx, 1, 2>() * vertexZ<v>());
- mac3() = A3<sf>(int44(controlVector<cv, 2>() << 12) +
+ mac3() = A3<sf>(int44(controlVector<cv, 2>() * INT64_C(4096)) +
matrixElement<mx, 2, 0>() * vertexX<v>() +
matrixElement<mx, 2, 1>() * vertexY<v>() +
matrixElement<mx, 2, 2>() * vertexZ<v>(), rawMac3);
...
- mac1() = A1<sf>(inR + ir0() * limB1<false>(A1<sf>((rfc() << 12) - inR)));
- mac2() = A2<sf>(inG + ir0() * limB2<false>(A2<sf>((gfc() << 12) - inG)));
+ mac1() = A1<sf>(inR + ir0() * limB1<false>(A1<sf>(rfc() * INT64_C(4096) - inR)));
+ mac2() = A2<sf>(inG + ir0() * limB2<false>(A2<sf>(gfc() * INT64_C(4096) - inG)));
int64_t rawMac3;
- mac3() = A3<sf>(inB + ir0() * limB3<false>(A3<sf>((bfc() << 12) - inB)), rawMac3);
+ mac3() = A3<sf>(inB + ir0() * limB3<false>(A3<sf>(bfc() * INT64_C(4096) - inB)), rawMac3);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/gte-internal.h` around lines 378 - 396, The signed left-shifts on
controlVector<cv, N>() and on rfc()/gfc()/bfc() cause UB when values are
negative; replace each `<< 12` with multiplication by a 64-bit constant (e.g. `*
INT64_C(4096)`) so expressions in limB1/limB2/limB3 (the A1<sf>/A2<sf>/A3<sf>
lines) and in mac1()/mac2()/mac3() (and the depth-cue far-color usages
rfc/gfc/bfc) use `* INT64_C(4096)` instead of `<< 12`, ensuring the integer type
is 64-bit to match int44/int64 usage.
| CESTER_TEST(prec_div_100_over_1000, gte_tests, | ||
| gte_set_identity_rotation(); | ||
| gte_set_translation(0, 0, 0); | ||
| cop2_putc(24, 0); | ||
| cop2_putc(25, 0); | ||
| cop2_putc(26, 100); // H=100 | ||
| cop2_putc(27, 0); | ||
| cop2_putc(28, 0); | ||
| cop2_put(0, (0 << 16) | 1000); // VX=1000 | ||
| cop2_put(1, 1000); // VZ=1000 | ||
| gte_clear_flag(); | ||
| cop2_cmd(COP2_RTPS(1, 0)); | ||
| uint32_t sxy2; | ||
| cop2_get(14, sxy2); | ||
| int16_t sx = (int16_t)(sxy2 & 0xffff); | ||
| ramsyscall_printf("div 100/1000: SX=%d\n", sx); | ||
| // SX = 1000 * (100/1000) = 100 (roughly, depends on table rounding) | ||
| ) |
There was a problem hiding this comment.
This division spot-check doesn’t assert anything.
prec_div_100_over_1000 currently passes regardless of the quotient-table implementation. Please assert the projected X result, or at least a tight accepted range plus FLAG state, so this catches regressions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/mips/tests/gte/gte-precision.c` around lines 160 - 177, The test
prec_div_100_over_1000 is missing assertions: after computing sxy2 and
extracting sx (via cop2_get(14, sxy2) and int16_t sx = ...), add CESTER
assertions to validate the result and FLAG state — assert sx is within a tight
expected range (e.g., 99..101 or similar to match table rounding) and assert the
GTE flag indicates no overflow/exception (use the existing gte_clear_flag call
beforehand and check the flag read function you use elsewhere in tests). Place
these assertions in the test (after ramsyscall_printf and before the test end)
so regressions in the quotient-table or flag handling are caught.
The PGXP projection path computed ir1() * hOverSz3 in 32-bit before widening to int64_t, causing overflow on large quotients. The main projection path already widened first - match that. The mfc2/cfc2 read macros (cop2_get, cop2_getc) lacked the two-nop delay that the write macros already had. Without them, the compiler can schedule the destination register use before the R3000 CP2 move latency completes, producing stale readbacks on real hardware. Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
Convert ~35 printf-only tests into proper regression tests by adding concrete assertions against values verified on a SCPH-5501. Tests previously passed unconditionally as long as they didn't crash - now they catch emulator regressions in depth cue, lighting, MVMVA, NCLIP, OP, RTPS, SQR, precision, and edge case instructions. All 155 tests pass on real silicon, interpreter, and dynarec. Signed-off-by: Nicolas 'Pixel' Noble <nicolas@nobis-crew.org>
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (4)
src/mips/tests/gte/gte-op.c (1)
66-82:⚠️ Potential issue | 🟡 MinorThis “overflow” test still only covers the cancellation case.
With equal diagonal and IR components, every cross-product term cancels, so
FLAG == 0is just the expected zero-result path.OPonly multiplies 16-bit matrix/IR operands, so it cannot hit the 44-bit MAC overflow described in the comment; either rename this to the cancellation case or move the overflow regression to an instruction with 32-bit bias terms.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mips/tests/gte/gte-op.c` around lines 66 - 82, The test op_overflow_flag is misleading: it only exercises the cancellation case (equal matrix and IR components) so MAC stays zero rather than demonstrating a 44-bit MAC overflow; update the test by either renaming it to reflect "cancellation" (e.g., op_cancellation_flag) and adjusting the comments, or modify it to actually trigger overflow by using an instruction that multiplies larger/32-bit bias terms instead of COP2_OP_CP (adjust calls around cop2_putc/cop2_put and cop2_cmd accordingly) and then assert the expected overflow FLAG via gte_read_flag; ensure references to COP2_OP_CP, cop2_putc, cop2_put, cop2_cmd, gte_read_flag and the test name op_overflow_flag are updated so intent matches behavior.src/mips/tests/gte/gte-precision.c (1)
97-122:⚠️ Potential issue | 🟡 MinorThis “double overflow” OP test still can’t hit the stated condition.
0x7fff * 0x7fffis only0x3fff0001, andOPhas no 32-bit addend likeTR/BK/FCto push the accumulator toward 44-bit overflow. As written this just proves cancellation to zero. Please rename/remove it or move the regression to an instruction that can actually overflow the MAC pipeline.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mips/tests/gte/gte-precision.c` around lines 97 - 122, The test prec_mac_double_overflow cannot produce the claimed intermediate 44-bit overflow because 0x7fff * 0x7fff = 0x3fff0001 and OP (COP2_OP_CP) has no extra 32-bit addend to push the MAC pipeline into overflow; update the test by either removing/renaming it or changing it to use an instruction that can create the required MAC overflow (e.g., use an operation that includes TR/BK/FC-style 32-bit addends or use larger operands that exceed 44-bit when multiplied), and adjust the setup that currently uses cop2_putc/cop2_put and the cop2_cmd(COP2_OP_CP(0,0)) invocation (prec_mac_double_overflow, cop2_putc, cop2_put, cop2_cmd, COP2_OP_CP) so the intermediate product actually overflows the MAC pipeline before cancellation.src/mips/tests/gte/gte-rtps.c (1)
101-121:⚠️ Potential issue | 🟡 MinorAlso assert
FLAG.12inrtps_depth_cue.This now pins
MAC0andIR0, but it still won't fail ifIR0clamps to0without raising theIR0_SATbit. Since this case is explicitly exercising the clamp path, please readgte_read_flag()here and assert bit 12 too.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mips/tests/gte/gte-rtps.c` around lines 101 - 121, The test rtps_depth_cue currently asserts MAC0 and IR0 but doesn't verify the IR0 saturation flag; call gte_read_flag() (or use the existing cop2_get for the flag register) after cop2_cmd(COP2_RTPS(...)) to read the GTE flags and assert that bit 12 (IR0_SAT / FLAG.12) is set; add a cester_assert that (flags & (1 << 12)) != 0 so the test fails if IR0 was clamped without raising the saturation flag.src/mips/tests/gte/gte-edgecase.c (1)
141-187:⚠️ Potential issue | 🟠 MajorThe MAC0 overflow tests still don't cover the overflow directions they claim.
edge_mac0_positive_overflownow proves the triangle cancels toMAC0 == 0and even assertsflag == 0, so it does not exercise positive overflow at all.edge_mac0_negative_overflowlikewise asserts an in-rangeMAC0value and only checksFLAG.16, not the negative-overflow bit. Please replace both input triangles with asymmetric coordinates that actually exceed signed 32-bit range and assertFLAG.16/FLAG.15respectively.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mips/tests/gte/gte-edgecase.c` around lines 141 - 187, The two tests edge_mac0_positive_overflow and edge_mac0_negative_overflow use triangles that cancel and do not trigger the claimed MAC0 overflows; replace the cop2_put(12/13/14, ...) coordinate triples in each test with asymmetric 16-bit vertex values that produce a true signed 32-bit positive overflow (for edge_mac0_positive_overflow) and a true signed 32-bit negative overflow (for edge_mac0_negative_overflow), keep the sequence gte_clear_flag(); cop2_cmd(COP2_NCLIP); cop2_get(24, mac0); flag = gte_read_flag(); and then update assertions to check the correct flag bits: for the positive-overflow test assert ((flag >> 16) & 1) == 1 (FLAG.16) and for the negative-overflow test assert ((flag >> 15) & 1) == 1 (FLAG.15); remove or adjust any exact mac0 equality asserts if you cannot determine the wrapped value, otherwise assert the expected wrapped MAC0 result consistent with the chosen coordinates.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/gte-instructions.cc`:
- Around line 73-75: The multiply/accumulate for MAC0 is currently done in
32-bit and can overflow before F() is called; change the computation so the
product and sum are performed in 64-bit (e.g., use an int64_t accumulator)
before calling F(), then pass that widened value to F() and to limH(), updating
the mac0() assignment site that uses gteDQB() + gteDQA() * hOverSz3 as well as
the similar ZSF * SZn summation sites around the block handling OTZ (the other
occurrence in the same function). Concretely, create/retain rawMac0 as an
int64_t, cast gteDQA(), hOverSz3, ZSF, SZn, etc. to int64_t before
multiplication/addition, compute the full 64-bit sum, then call F(<widened sum>,
rawMac0) and pass rawMac0 to limH().
- Around line 275-276: The NCLIP determinant expression computing mac0()
overflows because the six products are evaluated in 32-bit; update the
expression that sets mac0() (using sx0(), sx1(), sx2(), sy0(), sy1(), sy2()) so
every multiplication is performed in 64-bit by casting operands or the product
to int64_t (i.e., ensure all six terms like sx0()*sy1(), sx1()*sy2(), etc. are
cast to int64_t before summing) and then pass the final int64_t result into F().
In `@src/mips/tests/gte/gte-edgecase.c`:
- Around line 466-485: The test edge_ir0_at_max currently only checks FLAG bit
12 for saturation; add a direct assertion that the read IR0 value equals 0x1000
to enforce the boundary. After cop2_get(8, ir0) (and before or after checking
f12), add a cester_assert_uint_eq comparing ir0 & 0xffff to 0x1000 (or compare
the full ir0 if preferred) so the test fails if RTPS returns the wrong
unsaturated value; keep existing FLAG check (f12) as well.
In `@src/mips/tests/gte/gte-precision.c`:
- Around line 148-159: The test only checks FLAG.17 (f17) but never asserts the
computed quotient result stored in sx; add an assertion that verifies sx equals
the expected saturated value (for this case the comment shows SX should end up
saturated to 0x3FF), e.g. after reading sxy2/flag and computing sx assert sx ==
0x3FF (or an equivalent tight range/assertion masking to 0x3FF) using
cester_assert_*; do the same change for the other division spot-check referenced
(the similar block around the other test that reads sxy2/flag at the later spot)
so both the overflow bit and the actual SX result are validated (refer to
variables sxy2, flag, sx, f17 and the cester_assert_* calls).
In `@src/mips/tests/gte/gte-rtps.c`:
- Around line 149-177: The test rtpt_three_vertices only asserts V0 and sy1; add
assertions for V1's sx (sx1) and both components of V2 (sxy2) to fully verify
the triple-vertex RTPT path: extract sx1 and sy1 as already done and assert sx1
== 180 (int16_t cast), and extract sxy2 components (int16_t)(sxy2 & 0xffff) ==
160 and (int16_t)(sxy2 >> 16) == 140 so that V1's X and both X/Y of V2 are
validated (use the existing variables sxy1, sxy2 and the int16_t casts used for
sx1/sy1).
---
Duplicate comments:
In `@src/mips/tests/gte/gte-edgecase.c`:
- Around line 141-187: The two tests edge_mac0_positive_overflow and
edge_mac0_negative_overflow use triangles that cancel and do not trigger the
claimed MAC0 overflows; replace the cop2_put(12/13/14, ...) coordinate triples
in each test with asymmetric 16-bit vertex values that produce a true signed
32-bit positive overflow (for edge_mac0_positive_overflow) and a true signed
32-bit negative overflow (for edge_mac0_negative_overflow), keep the sequence
gte_clear_flag(); cop2_cmd(COP2_NCLIP); cop2_get(24, mac0); flag =
gte_read_flag(); and then update assertions to check the correct flag bits: for
the positive-overflow test assert ((flag >> 16) & 1) == 1 (FLAG.16) and for the
negative-overflow test assert ((flag >> 15) & 1) == 1 (FLAG.15); remove or
adjust any exact mac0 equality asserts if you cannot determine the wrapped
value, otherwise assert the expected wrapped MAC0 result consistent with the
chosen coordinates.
In `@src/mips/tests/gte/gte-op.c`:
- Around line 66-82: The test op_overflow_flag is misleading: it only exercises
the cancellation case (equal matrix and IR components) so MAC stays zero rather
than demonstrating a 44-bit MAC overflow; update the test by either renaming it
to reflect "cancellation" (e.g., op_cancellation_flag) and adjusting the
comments, or modify it to actually trigger overflow by using an instruction that
multiplies larger/32-bit bias terms instead of COP2_OP_CP (adjust calls around
cop2_putc/cop2_put and cop2_cmd accordingly) and then assert the expected
overflow FLAG via gte_read_flag; ensure references to COP2_OP_CP, cop2_putc,
cop2_put, cop2_cmd, gte_read_flag and the test name op_overflow_flag are updated
so intent matches behavior.
In `@src/mips/tests/gte/gte-precision.c`:
- Around line 97-122: The test prec_mac_double_overflow cannot produce the
claimed intermediate 44-bit overflow because 0x7fff * 0x7fff = 0x3fff0001 and OP
(COP2_OP_CP) has no extra 32-bit addend to push the MAC pipeline into overflow;
update the test by either removing/renaming it or changing it to use an
instruction that can create the required MAC overflow (e.g., use an operation
that includes TR/BK/FC-style 32-bit addends or use larger operands that exceed
44-bit when multiplied), and adjust the setup that currently uses
cop2_putc/cop2_put and the cop2_cmd(COP2_OP_CP(0,0)) invocation
(prec_mac_double_overflow, cop2_putc, cop2_put, cop2_cmd, COP2_OP_CP) so the
intermediate product actually overflows the MAC pipeline before cancellation.
In `@src/mips/tests/gte/gte-rtps.c`:
- Around line 101-121: The test rtps_depth_cue currently asserts MAC0 and IR0
but doesn't verify the IR0 saturation flag; call gte_read_flag() (or use the
existing cop2_get for the flag register) after cop2_cmd(COP2_RTPS(...)) to read
the GTE flags and assert that bit 12 (IR0_SAT / FLAG.12) is set; add a
cester_assert that (flags & (1 << 12)) != 0 so the test fails if IR0 was clamped
without raising the saturation flag.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 02366e64-b547-4798-bcf0-67f12f9cd266
📒 Files selected for processing (11)
src/core/gte-instructions.ccsrc/mips/common/hardware/cop2.hsrc/mips/tests/gte/gte-depthcue.csrc/mips/tests/gte/gte-edgecase.csrc/mips/tests/gte/gte-lighting.csrc/mips/tests/gte/gte-mvmva.csrc/mips/tests/gte/gte-nclip.csrc/mips/tests/gte/gte-op.csrc/mips/tests/gte/gte-precision.csrc/mips/tests/gte/gte-rtps.csrc/mips/tests/gte/gte-sqr.c
| int64_t rawMac0; | ||
| mac0() = F(gteDQB() + gteDQA() * hOverSz3, rawMac0); | ||
| ir0() = limH(rawMac0); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the specific lines mentioned in the review
cat -n src/core/gte-instructions.cc | sed -n '70,80p'Repository: grumpycoders/pcsx-redux
Length of output: 430
🏁 Script executed:
# Check lines 357-368
cat -n src/core/gte-instructions.cc | sed -n '355,370p'Repository: grumpycoders/pcsx-redux
Length of output: 663
🏁 Script executed:
# Let's also check lines around 272-276 for NCLIP
cat -n src/core/gte-instructions.cc | sed -n '270,280p'Repository: grumpycoders/pcsx-redux
Length of output: 569
🏁 Script executed:
# Search for the function definitions to understand their return types
rg -A 2 "int64_t|int32_t|uint32_t" src/core/gte-instructions.cc | head -100Repository: grumpycoders/pcsx-redux
Length of output: 3279
🏁 Script executed:
# Search for PGXP_NLCIP_valid or similar
rg "PGXP_N" src/core/Repository: grumpycoders/pcsx-redux
Length of output: 466
Widen the MAC0 math before passing it to F().
Line 74 and Lines 360-367 still do the multiply/accumulate in 32-bit first. gteDQA() * hOverSz3 and the ZSF * SZn sums can overflow before F(..., rawMac0) sees them, which gives you wrapped MAC0, wrong MAC0 flags, and wrong OTZ on large-depth cases. This contradicts the existing pattern in the same file (lines 65-66) where hOverSz3 multiplications are already widened to int64_t with an explicit comment noting the overflow risk.
Suggested fix
if (last) {
int64_t rawMac0;
- mac0() = F(gteDQB() + gteDQA() * hOverSz3, rawMac0);
+ mac0() = F(gteDQB() + (int64_t)gteDQA() * hOverSz3, rawMac0);
ir0() = limH(rawMac0);
}
}
@@
void PCSX::GTE::AVSZ3(uint32_t code) {
gteFlag() = 0;
int64_t rawMac0;
- mac0() = F(gteZSF3() * sz1() + gteZSF3() * sz2() + gteZSF3() * sz3(), rawMac0);
+ mac0() = F((int64_t)gteZSF3() * sz1() + (int64_t)gteZSF3() * sz2() +
+ (int64_t)gteZSF3() * sz3(),
+ rawMac0);
otz() = limD<true>(rawMac0);
}
@@
void PCSX::GTE::AVSZ4(uint32_t code) {
gteFlag() = 0;
int64_t rawMac0;
- mac0() = F(gteZSF4() * sz0() + gteZSF4() * sz1() + gteZSF4() * sz2() + gteZSF4() * sz3(), rawMac0);
+ mac0() = F((int64_t)gteZSF4() * sz0() + (int64_t)gteZSF4() * sz1() +
+ (int64_t)gteZSF4() * sz2() + (int64_t)gteZSF4() * sz3(),
+ rawMac0);
otz() = limD<true>(rawMac0);
}Also applies to: 357-368
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/gte-instructions.cc` around lines 73 - 75, The multiply/accumulate
for MAC0 is currently done in 32-bit and can overflow before F() is called;
change the computation so the product and sum are performed in 64-bit (e.g., use
an int64_t accumulator) before calling F(), then pass that widened value to F()
and to limH(), updating the mac0() assignment site that uses gteDQB() + gteDQA()
* hOverSz3 as well as the similar ZSF * SZn summation sites around the block
handling OTZ (the other occurrence in the same function). Concretely,
create/retain rawMac0 as an int64_t, cast gteDQA(), hOverSz3, ZSF, SZn, etc. to
int64_t before multiplication/addition, compute the full 64-bit sum, then call
F(<widened sum>, rawMac0) and pass rawMac0 to limH().
| mac0() = F((int64_t)sx0() * sy1() + sx1() * sy2() + sx2() * sy0() - | ||
| sx0() * sy2() - sx1() * sy0() - sx2() * sy1()); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd src/core && head -300 gte-instructions.cc | tail -50Repository: grumpycoders/pcsx-redux
Length of output: 1997
🏁 Script executed:
wc -l src/core/gte-instructions.ccRepository: grumpycoders/pcsx-redux
Length of output: 99
🏁 Script executed:
sed -n '270,280p' src/core/gte-instructions.ccRepository: grumpycoders/pcsx-redux
Length of output: 492
🏁 Script executed:
# Search for the class definition and relevant function signatures
rg -A 5 "sx0\(\)" src/core/gte-instructions.cc | head -30Repository: grumpycoders/pcsx-redux
Length of output: 409
🏁 Script executed:
# Find the F() function definition
rg "^\s*\w+\s+F\(" src/core/gte-instructions.cc -A 3 | head -20Repository: grumpycoders/pcsx-redux
Length of output: 49
🏁 Script executed:
# Get broader context around the problematic line
sed -n '260,290p' src/core/gte-instructions.ccRepository: grumpycoders/pcsx-redux
Length of output: 1412
🏁 Script executed:
# Find declarations of sx0, sy1, sy2, etc.
rg "sx0\(\)|sy0\(\)|sx1\(\)|sy1\(\)|sx2\(\)|sy2\(\)" src/core/gte-instructions.cc --context 2 | head -40Repository: grumpycoders/pcsx-redux
Length of output: 667
🏁 Script executed:
# Search for class definition and member function declarations
rg "class\s+GTE" src/core -A 30 | head -50Repository: grumpycoders/pcsx-redux
Length of output: 2227
🏁 Script executed:
# Look for the header file definition
fd -e h "gte" src/core/Repository: grumpycoders/pcsx-redux
Length of output: 125
🏁 Script executed:
# Check gte.h for function signatures
rg "sx0|sy0|sx1|sy1|sx2|sy2" src/core/gte.h -B 2 -A 1 | head -50Repository: grumpycoders/pcsx-redux
Length of output: 49
🏁 Script executed:
# Search for sx0 and sy0 definitions in gte files
rg "sx[0-2]\(\)|sy[0-2]\(\)" src/core/gte.h -A 2 -B 2Repository: grumpycoders/pcsx-redux
Length of output: 49
🏁 Script executed:
# Try broader search in gte-internal.h
rg "sx0|sy0|sx1|sy1|sx2|sy2" src/core/gte-internal.h | head -30Repository: grumpycoders/pcsx-redux
Length of output: 390
🏁 Script executed:
# Look for the actual register storage and accessor methods
rg "int16_t.*sx|int32_t.*sx" src/core/gte-internal.h | head -20Repository: grumpycoders/pcsx-redux
Length of output: 387
🏁 Script executed:
# Search for these as members/properties
ast-grep --pattern 'int16_t $_' src/core/gte-internal.hRepository: grumpycoders/pcsx-redux
Length of output: 49
🏁 Script executed:
# Let's look at the gte-internal.h header
head -100 src/core/gte-internal.hRepository: grumpycoders/pcsx-redux
Length of output: 4465
🏁 Script executed:
# Find the F() function definition
rg "F\(" src/core/gte-internal.h | head -20Repository: grumpycoders/pcsx-redux
Length of output: 143
🏁 Script executed:
# Look for F function more carefully
rg "^\s*\w+\s+F\(" src/core/gte-internal.h -A 3Repository: grumpycoders/pcsx-redux
Length of output: 49
🏁 Script executed:
# Search in gte-instructions.cc for F definition
rg "^\s*\w+\s+F\(" src/core/gte-instructions.cc -A 3Repository: grumpycoders/pcsx-redux
Length of output: 49
🏁 Script executed:
# Broader search for F definition
rg "^.*F\s*\(" src/core/ -A 2 | head -40Repository: grumpycoders/pcsx-redux
Length of output: 2126
🏁 Script executed:
# Look at what happens around F() usage and mac0 assignment
sed -n '1,100p' src/core/gte-internal.h | tail -60Repository: grumpycoders/pcsx-redux
Length of output: 2520
🏁 Script executed:
# Get F() function definition with more context
rg "inline int64_t F" src/core/gte-internal.h -A 10 | head -50Repository: grumpycoders/pcsx-redux
Length of output: 805
🏁 Script executed:
# Let's see where F() is defined - check line numbers
rg -n "inline int64_t F" src/core/gte-internal.hRepository: grumpycoders/pcsx-redux
Length of output: 151
🏁 Script executed:
# Now get those specific lines with context
sed -n '200,300p' src/core/gte-internal.hRepository: grumpycoders/pcsx-redux
Length of output: 3835
🏁 Script executed:
# Let me verify the max/min range analysis with concrete calculations
python3 << 'EOF'
import sys
# int16_t range: -32768 to 32767
# int32_t range: -2147483648 to 2147483647
max_int16 = 32767
min_int16 = -32768
max_int32 = 2147483647
min_int32 = -2147483648
# Maximum product of two int16_t
max_product = max_int16 * max_int16
min_product = min_int16 * min_int16
print(f"Max int16 product: {max_product}")
print(f"Min int16 product: {min_product}")
print(f"Fits in int32: {max_product <= max_int32 and min_product <= max_int32}")
# Now with the six-term determinant
# Worst case positive: three products added, three subtracted
# Best case for overflow: all six terms are max magnitude but cause cancellation
# Worst case: all three "+" terms are max, all three "-" terms are max
worst_positive = 3 * max_product
worst_negative = 3 * min_product
print(f"\nWorst case positive sum (3 max products): {worst_positive}")
print(f"Exceeds int32 max? {worst_positive > max_int32}")
print(f"Worst case negative sum (3 min products): {worst_negative}")
print(f"Below int32 min? {worst_negative < min_int32}")
EOFRepository: grumpycoders/pcsx-redux
Length of output: 300
🏁 Script executed:
# Let me also verify what the actual computation order is
# Looking at the exact expression again:
# (int64_t)sx0() * sy1() + sx1() * sy2() + sx2() * sy0() - sx0() * sy2() - sx1() * sy0() - sx2() * sy1()
# The key issue is that in C++, int16_t * int16_t promotes to int (32-bit)
# Let me check the exact behavior by looking at how compiler would handle this
python3 << 'EOF'
# C++ integer promotion rules:
# - If both operands are int16_t, they are promoted to int (usually 32-bit signed)
# - int * int = int (32-bit)
# - (int64_t) int16_t * int16_t: cast applies to first operand, so int64_t * int16_t = int64_t
# - But int16_t * int16_t = int (without the cast), and if only first cast, we get:
# (int64_t)(sx0()) * sy1() = int64_t (because cast has higher precedence than *)
# Then: int64_t + int + int - int - int - int
# The int additions/subtractions happen at int level before being promoted to int64_t
# This is the bug: intermediate int32_t operations can overflow
print("C++ arithmetic evaluation:")
print("(int64_t)sx0() * sy1() + sx1() * sy2() + ...")
print()
print("Evaluation order (with operator precedence and type promotion):")
print("1. (int64_t)sx0() -> int64_t cast")
print("2. (int64_t)sx0() * sy1() -> int64_t * int16_t -> int64_t")
print("3. sx1() * sy2() -> int16_t * int16_t -> int (32-bit signed)")
print("4. sx2() * sy0() -> int16_t * int16_t -> int (32-bit signed)")
print("5. (int64_t)sx0() * sy1() + sx1() * sy2() -> int64_t + int -> int64_t")
print("6. ... remaining int terms added/subtracted at int level first, then promoted")
print()
print("The bug: steps 3, 4, etc. compute products as 32-bit int, which can overflow!")
EOFRepository: grumpycoders/pcsx-redux
Length of output: 644
Add int64_t casts to all multiplications in the NCLIP determinant expression.
The six-term area calculation overflows in 32-bit arithmetic before reaching F(). Each uncast multiplication computes as int16_t * int16_t → int32_t. Individual products fit in 32-bit (max ~1.07 billion), but three terms summed exceed INT32_MAX (2.15 billion), causing intermediate overflow with wrapped values passed to F(). This results in incorrect MAC0 values and erroneous overflow flags.
Suggested fix
- mac0() = F((int64_t)sx0() * sy1() + sx1() * sy2() + sx2() * sy0() -
- sx0() * sy2() - sx1() * sy0() - sx2() * sy1());
+ mac0() = F((int64_t)sx0() * sy1() + (int64_t)sx1() * sy2() + (int64_t)sx2() * sy0() -
+ (int64_t)sx0() * sy2() - (int64_t)sx1() * sy0() - (int64_t)sx2() * sy1());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/gte-instructions.cc` around lines 275 - 276, The NCLIP determinant
expression computing mac0() overflows because the six products are evaluated in
32-bit; update the expression that sets mac0() (using sx0(), sx1(), sx2(),
sy0(), sy1(), sy2()) so every multiplication is performed in 64-bit by casting
operands or the product to int64_t (i.e., ensure all six terms like sx0()*sy1(),
sx1()*sy2(), etc. are cast to int64_t before summing) and then pass the final
int64_t result into F().
| CESTER_TEST(edge_ir0_at_max, gte_tests, | ||
| gte_set_identity_rotation(); | ||
| gte_set_translation(0, 0, 0); | ||
| cop2_putc(24, 0); | ||
| cop2_putc(25, 0); | ||
| cop2_putc(26, 200); | ||
| cop2_putc(27, 0); // DQA = 0 | ||
| cop2_putc(28, 0x1000000); // DQB = 0x1000000 -> MAC0=DQB, IR0=DQB>>12=0x1000 | ||
| cop2_put(0, 0); | ||
| cop2_put(1, 1000); | ||
| gte_clear_flag(); | ||
| cop2_cmd(COP2_RTPS(1, 0)); | ||
| uint32_t ir0, flag; | ||
| cop2_get(8, ir0); | ||
| flag = gte_read_flag(); | ||
| ramsyscall_printf("IR0 max: IR0=0x%04x FLAG=0x%08x\n", ir0 & 0xffff, flag); | ||
| // IR0 should be exactly 0x1000 | ||
| uint32_t f12 = (flag >> 12) & 1; | ||
| cester_assert_uint_eq(0, f12); // no saturation | ||
| ) |
There was a problem hiding this comment.
Assert the boundary value here, not just the absence of saturation.
This test is supposed to prove the exact IR0 == 0x1000 boundary, but it currently only checks that FLAG.12 stayed clear. If RTPS returns the wrong unsaturated value, this still passes. Please add a direct assertion on ir0.
Suggested change
cop2_get(8, ir0);
flag = gte_read_flag();
ramsyscall_printf("IR0 max: IR0=0x%04x FLAG=0x%08x\n", ir0 & 0xffff, flag);
// IR0 should be exactly 0x1000
+ cester_assert_uint_eq(0x1000, ir0);
uint32_t f12 = (flag >> 12) & 1;
cester_assert_uint_eq(0, f12); // no saturation📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| CESTER_TEST(edge_ir0_at_max, gte_tests, | |
| gte_set_identity_rotation(); | |
| gte_set_translation(0, 0, 0); | |
| cop2_putc(24, 0); | |
| cop2_putc(25, 0); | |
| cop2_putc(26, 200); | |
| cop2_putc(27, 0); // DQA = 0 | |
| cop2_putc(28, 0x1000000); // DQB = 0x1000000 -> MAC0=DQB, IR0=DQB>>12=0x1000 | |
| cop2_put(0, 0); | |
| cop2_put(1, 1000); | |
| gte_clear_flag(); | |
| cop2_cmd(COP2_RTPS(1, 0)); | |
| uint32_t ir0, flag; | |
| cop2_get(8, ir0); | |
| flag = gte_read_flag(); | |
| ramsyscall_printf("IR0 max: IR0=0x%04x FLAG=0x%08x\n", ir0 & 0xffff, flag); | |
| // IR0 should be exactly 0x1000 | |
| uint32_t f12 = (flag >> 12) & 1; | |
| cester_assert_uint_eq(0, f12); // no saturation | |
| ) | |
| CESTER_TEST(edge_ir0_at_max, gte_tests, | |
| gte_set_identity_rotation(); | |
| gte_set_translation(0, 0, 0); | |
| cop2_putc(24, 0); | |
| cop2_putc(25, 0); | |
| cop2_putc(26, 200); | |
| cop2_putc(27, 0); // DQA = 0 | |
| cop2_putc(28, 0x1000000); // DQB = 0x1000000 -> MAC0=DQB, IR0=DQB>>12=0x1000 | |
| cop2_put(0, 0); | |
| cop2_put(1, 1000); | |
| gte_clear_flag(); | |
| cop2_cmd(COP2_RTPS(1, 0)); | |
| uint32_t ir0, flag; | |
| cop2_get(8, ir0); | |
| flag = gte_read_flag(); | |
| ramsyscall_printf("IR0 max: IR0=0x%04x FLAG=0x%08x\n", ir0 & 0xffff, flag); | |
| // IR0 should be exactly 0x1000 | |
| cester_assert_uint_eq(0x1000, ir0); | |
| uint32_t f12 = (flag >> 12) & 1; | |
| cester_assert_uint_eq(0, f12); // no saturation | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/mips/tests/gte/gte-edgecase.c` around lines 466 - 485, The test
edge_ir0_at_max currently only checks FLAG bit 12 for saturation; add a direct
assertion that the read IR0 value equals 0x1000 to enforce the boundary. After
cop2_get(8, ir0) (and before or after checking f12), add a cester_assert_uint_eq
comparing ir0 & 0xffff to 0x1000 (or compare the full ir0 if preferred) so the
test fails if RTPS returns the wrong unsaturated value; keep existing FLAG check
(f12) as well.
| uint32_t sxy2, flag; | ||
| cop2_get(14, sxy2); | ||
| flag = gte_read_flag(); | ||
| int16_t sx = (int16_t)(sxy2 & 0xffff); | ||
| ramsyscall_printf("div 1/1: SX=%d FLAG=0x%08x\n", sx, flag); | ||
| // H=1, SZ3=1 -> H >= SZ3*2? 1 >= 2? No -> no overflow | ||
| // quotient = H*0x20000/SZ3 = 0x20000. Saturated to 0x1FFFF. | ||
| // SX = IR1 * quotient >> 16 = 0x1000 * 0x1FFFF >> 16 = 0x1FFF | ||
| // Then saturated to 0x3FF | ||
| uint32_t f17 = (flag >> 17) & 1; | ||
| cester_assert_uint_eq(0, f17); // no division overflow | ||
| ) |
There was a problem hiding this comment.
These division spot-checks still don’t assert the quotient.
Both cases read sx, but only FLAG.17 is asserted. A bad UNR-table entry can still leave the suite green as long as the overflow bit matches; please assert the expected sx value, or at least a tight accepted range, in both tests.
Also applies to: 195-203
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/mips/tests/gte/gte-precision.c` around lines 148 - 159, The test only
checks FLAG.17 (f17) but never asserts the computed quotient result stored in
sx; add an assertion that verifies sx equals the expected saturated value (for
this case the comment shows SX should end up saturated to 0x3FF), e.g. after
reading sxy2/flag and computing sx assert sx == 0x3FF (or an equivalent tight
range/assertion masking to 0x3FF) using cester_assert_*; do the same change for
the other division spot-check referenced (the similar block around the other
test that reads sxy2/flag at the later spot) so both the overflow bit and the
actual SX result are validated (refer to variables sxy2, flag, sx, f17 and the
cester_assert_* calls).
| CESTER_TEST(rtpt_three_vertices, gte_tests, | ||
| gte_set_identity_rotation(); | ||
| gte_set_translation(0, 0, 0); | ||
| gte_set_screen(160 << 16, 120 << 16, 200); | ||
| // V0 = (0, 0, 1000) | ||
| cop2_put(0, 0x00000000); | ||
| cop2_put(1, 1000); | ||
| // V1 = (100, 0, 1000) | ||
| cop2_put(2, (0 << 16) | 100); | ||
| cop2_put(3, 1000); | ||
| // V2 = (0, 100, 1000) | ||
| cop2_put(4, (100 << 16) | 0); | ||
| cop2_put(5, 1000); | ||
| gte_clear_flag(); | ||
| cop2_cmd(COP2_RTPT(1, 0)); | ||
| uint32_t sxy0, sxy1, sxy2; | ||
| cop2_get(12, sxy0); | ||
| cop2_get(13, sxy1); | ||
| cop2_get(14, sxy2); | ||
| // V0 at origin -> (160, 120) | ||
| cester_assert_int_eq(160, (int16_t)(sxy0 & 0xffff)); | ||
| cester_assert_int_eq(120, (int16_t)(sxy0 >> 16)); | ||
| // V1 at (100,0,1000) -> SX ~ 180 | ||
| int16_t sx1 = (int16_t)(sxy1 & 0xffff); | ||
| int16_t sy1 = (int16_t)(sxy1 >> 16); | ||
| ramsyscall_printf("RTPT: V1=(%d,%d) V2=(%d,%d)\n", sx1, sy1, | ||
| (int16_t)(sxy2 & 0xffff), (int16_t)(sxy2 >> 16)); | ||
| cester_assert_int_eq(120, sy1); // Y unchanged | ||
| ) |
There was a problem hiding this comment.
rtpt_three_vertices still leaves most of RTPT unverified.
Right now this test only pins V0 and sy1. A regression in V1 sx, or either coordinate of V2, still passes even though the case is meant to validate the triple-vertex path. Please assert sx1 and both components of sxy2 as well.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/mips/tests/gte/gte-rtps.c` around lines 149 - 177, The test
rtpt_three_vertices only asserts V0 and sy1; add assertions for V1's sx (sx1)
and both components of V2 (sxy2) to fully verify the triple-vertex RTPT path:
extract sx1 and sy1 as already done and assert sx1 == 180 (int16_t cast), and
extract sxy2 components (int16_t)(sxy2 & 0xffff) == 160 and (int16_t)(sxy2 >>
16) == 140 so that V1's X and both X/Y of V2 are validated (use the existing
variables sxy1, sxy2 and the int16_t casts used for sx1/sy1).
No description provided.