Commit f414be4

test(jelly-micro): add per-fixture recall floors (#1409)
* test(jelly-micro): add per-fixture recall floors

  Replace the trivially-passing recall >= 0 gate with a RECALL_FLOORS map keyed by fixture name. Fixtures that already reach 100% are locked at 1.0; partially-resolved fixtures (classes, defineProperty, super, super2) are locked at their current baseline so a single lost edge fails CI. Unresolvable fixtures (0% baseline) continue to default to 0.

  Closes #1387.

* test(jelly-micro): validate RECALL_FLOORS keys against discovered fixtures

  Add a startup check after discoverTests() that throws if any key in RECALL_FLOORS does not match an actual fixture directory. A renamed or mistyped fixture key would otherwise silently lower the recall floor to 0 with no warning, defeating the regression gate. The check is gated on tests.length > 0 so it does not fire in CI environments where the jelly-micro directory is gitignored.

* fix(jelly-micro): remove stale more1 entry from RECALL_FLOORS

  more1 was moved from jelly-micro to the pts-javascript fixture set in #1383 (commit ddfc14c). The RECALL_FLOORS map still referenced it, causing the startup sanity-check to throw in CI where the merged branch no longer includes more1 in jelly-micro.

* fix(jelly-micro): correct classes floor from 0.2 to 0.19 (6/31, not 6/30) (#1409)
1 parent 505e95a commit f414be4
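
The last bullet of the commit message turns on a small piece of arithmetic: 6/31 is about 0.194, so a floor of 0.2 would reject the current baseline, while 0.19 accepts it and still rejects 5/31 (about 0.161). The sketch below shows one way such floors could be derived, assuming they are the measured fraction rounded down to two decimals. That rounding rule is an assumption inferred from the committed values, and `floorFromBaseline` and `singleLossFails` are hypothetical helpers that do not appear in the commit.

```typescript
// Hypothetical helper (not part of the commit): derive a recall floor by
// rounding the measured fraction tp/total DOWN to two decimal places.
// Integer arithmetic (tp * 100 / total) avoids surprises such as
// 0.29 * 100 evaluating to 28.999999999999996.
function floorFromBaseline(tp: number, total: number): number {
  if (!Number.isInteger(tp) || !Number.isInteger(total) || total <= 0 || tp < 0 || tp > total) {
    throw new RangeError(`invalid baseline ${tp}/${total}`);
  }
  return Math.floor((tp * 100) / total) / 100;
}

// Hypothetical helper: would losing exactly one true positive drop the
// recall below the derived floor?
function singleLossFails(tp: number, total: number): boolean {
  if (tp === 0) return false; // nothing left to lose
  return (tp - 1) / total < floorFromBaseline(tp, total);
}

// Baselines taken from the comments in the RECALL_FLOORS map.
const baselines: Array<[name: string, tp: number, total: number]> = [
  ['classes', 6, 31],
  ['defineProperty', 3, 6],
  ['super', 5, 13],
  ['super2', 2, 5],
];

for (const [name, tp, total] of baselines) {
  console.log(name, floorFromBaseline(tp, total), singleLossFails(tp, total));
}

// The miscounted denominator yields a floor the real baseline cannot meet.
console.log(floorFromBaseline(6, 30)); // 0.2
console.log(6 / 31 >= 0.2); // false
console.log(6 / 31 >= floorFromBaseline(6, 31)); // true
```

Rounding down rather than to nearest keeps the floor at or below the measured recall, so the current baseline always passes its own gate.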

1 file changed

Lines changed: 47 additions & 2 deletions

tests/benchmarks/resolution/jelly-micro.test.ts
@@ -54,8 +54,53 @@ function discoverTests(): string[] {
     .sort();
 }
 
+/**
+ * Per-fixture minimum recall floors based on the baseline measured on origin/main
+ * (commit 784951d, June 2026). Fixtures not listed here default to 0 — they
+ * produce 0% recall and can only improve, never regress below zero.
+ *
+ * Format: fixture-name → minimum recall fraction in [0, 1].
+ * Exact fractions shown in comments; stored as the corresponding percentage
+ * value so a single lost TP triggers a failure.
+ *
+ * Note: more1 was moved to the pts-javascript fixture set in #1383 and is
+ * no longer part of jelly-micro.
+ */
+const RECALL_FLOORS: Record<string, number> = {
+  accessors3: 1.0, // 1/1
+  arguments: 1.0, // 1/1
+  classes: 0.19, // 6/31
+  defineProperty: 0.5, // 3/6
+  fun: 1.0, // 4/4
+  generators: 1.0, // 9/9
+  'receiver-callee-mixup': 1.0, // 1/1
+  rest: 1.0, // 1/1
+  spread: 1.0, // 4/4
+  super: 0.38, // 5/13
+  super2: 0.4, // 2/5
+  super3: 1.0, // 3/3
+  this: 1.0, // 1/1
+};
+
 const tests = discoverTests();
 
+// Sanity-check: every RECALL_FLOORS key must match a discovered fixture name.
+// A mismatch means either a fixture was renamed or the key was mistyped, and
+// the regression floor would silently degrade to 0 without this guard.
+// Only enforce when the fixture directory is present (CI skips the whole suite
+// when fixtures are absent).
+if (tests.length > 0) {
+  const testSet = new Set(tests);
+  for (const key of Object.keys(RECALL_FLOORS)) {
+    if (!testSet.has(key)) {
+      throw new Error(
+        `RECALL_FLOORS key "${key}" does not match any discovered fixture in ${FIXTURES_DIR}. ` +
+          'Update the key or remove the stale entry.',
+      );
+    }
+  }
+}
+
 // Per-test results collected for summary
 const allResults: Record<
   string,
@@ -194,8 +239,8 @@ describe.skipIf(tests.length === 0)('Jelly Micro-Test Benchmark', () => {
           for (const e of fn) console.log(` FN: ${e}`);
         }
 
-        // Soft gate: recall must be ≥ 0% (we don't gate yet — this benchmark is diagnostic)
-        expect(recall).toBeGreaterThanOrEqual(0);
+        const floor = RECALL_FLOORS[testName] ?? 0;
+        expect(recall).toBeGreaterThanOrEqual(floor);
       });
     });
   }
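
The key check and the floor lookup in this diff work as a pair: an unlisted fixture silently falls back to 0, so a stale or mistyped key has to be caught separately. The following is a minimal standalone sketch of that pairing, not the test file itself. `findStaleFloorKeys`, `floorFor`, and the sample data are hypothetical, and the own-property test in `floorFor` is an added precaution that the commit's `??` lookup does not include.

```typescript
type FloorMap = Readonly<Record<string, number>>;

// Hypothetical helper: list every floor key with no matching fixture.
// The commit throws on the first mismatch; returning all of them is a
// variation chosen here so one run reports every stale entry.
function findStaleFloorKeys(floors: FloorMap, discovered: readonly string[]): string[] {
  const known = new Set(discovered);
  return Object.keys(floors).filter((key) => !known.has(key));
}

// Hypothetical helper: look up a fixture's floor, defaulting to 0.
// Checking own properties means a fixture named like an inherited member
// (for example "constructor") still gets 0 instead of a prototype value.
function floorFor(floors: FloorMap, fixture: string): number {
  if (!Object.prototype.hasOwnProperty.call(floors, fixture)) return 0;
  const value = floors[fixture];
  return value ?? 0;
}

// Sample data for illustration only. "more1" plays the stale entry that
// the third bullet of the commit message describes.
const sampleFloors: FloorMap = { classes: 0.19, fun: 1.0, more1: 1.0 };
const sampleFixtures = ['classes', 'fun', 'spread'];

// Mirror the commit's guard: only enforce when fixtures were discovered.
if (sampleFixtures.length > 0) {
  const stale = findStaleFloorKeys(sampleFloors, sampleFixtures);
  console.log(stale); // [ 'more1' ]
}

console.log(floorFor(sampleFloors, 'classes')); // 0.19
console.log(floorFor(sampleFloors, 'spread')); // 0
console.log(floorFor(sampleFloors, 'constructor')); // 0
```

In the real suite the guard throws rather than logs, which is what turns a renamed fixture into a CI failure instead of a quietly disabled gate.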
