Merged
49 changes: 47 additions & 2 deletions tests/benchmarks/resolution/jelly-micro.test.ts
@@ -54,8 +54,53 @@ function discoverTests(): string[] {
.sort();
}

/**
* Per-fixture minimum recall floors based on the baseline measured on origin/main
* (commit 784951d, June 2026). Fixtures not listed here default to 0 — they
* produce 0% recall and can only improve, never regress below zero.
*
* Format: fixture-name → minimum recall fraction in [0, 1].
* Exact fractions are shown in comments; each floor is that fraction rounded
* down to two decimal places, so a single lost TP triggers a failure.
*
* Note: more1 was moved to the pts-javascript fixture set in #1383 and is
* no longer part of jelly-micro.
*/
const RECALL_FLOORS: Record<string, number> = {
accessors3: 1.0, // 1/1
arguments: 1.0, // 1/1
classes: 0.19, // 6/31
defineProperty: 0.5, // 3/6
fun: 1.0, // 4/4
generators: 1.0, // 9/9
'receiver-callee-mixup': 1.0, // 1/1
rest: 1.0, // 1/1
spread: 1.0, // 4/4
super: 0.38, // 5/13
super2: 0.4, // 2/5
super3: 1.0, // 3/3
this: 1.0, // 1/1
};
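
The doc comment's claim that a single lost TP triggers a failure can be checked by hand. The sketch below recomputes recall with one fewer true positive for each fixture whose floor is below 1.0. It is not part of the PR: the `baselines` table and the `failsAfterOneLostTP` helper are hypothetical names, and recall is assumed to be TP / total, expressed as a fraction in [0, 1].

```typescript
// Hypothetical check, not part of the PR. For each fixture with a fractional
// floor, the baseline must pass and losing one true positive must fail.
// Assumes recall = TP / total, expressed as a fraction in [0, 1].
const baselines: Record<string, { tp: number; total: number; floor: number }> = {
  classes: { tp: 6, total: 31, floor: 0.19 },
  defineProperty: { tp: 3, total: 6, floor: 0.5 },
  super: { tp: 5, total: 13, floor: 0.38 },
  super2: { tp: 2, total: 5, floor: 0.4 },
};

function failsAfterOneLostTP(tp: number, total: number, floor: number): boolean {
  const baselinePasses = tp / total >= floor;
  const regressedFails = (tp - 1) / total < floor;
  return baselinePasses && regressedFails;
}

for (const [name, { tp, total, floor }] of Object.entries(baselines)) {
  console.log(`${name}: ${failsAfterOneLostTP(tp, total, floor)}`);
}
```

Fixtures with a floor of 1.0 need no such check, since any lost TP drops recall below 1.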
Comment on lines +69 to +83

Contributor

P2 Stale RECALL_FLOORS keys won't surface fixture renames

RECALL_FLOORS entries that don't match any discovered fixture name are silently ignored: the affected test falls back to floor = 0 instead of the intended floor. If a fixture directory is renamed (e.g. super → super-v2) or a key is mistyped here, the regression protection is lost without any warning. Adding a quick sanity check after discoverTests(), verifying that every Object.keys(RECALL_FLOORS) entry appears in tests, would catch this at suite startup rather than silently lowering the bar.
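
A minimal sketch of the failure mode, assuming a fixture named `super` has been renamed to `super-v2`. The `floors` and `discovered` values are illustrative stand-ins, not the real map or the output of discoverTests():

```typescript
// Illustrative only: a floors map whose key no longer matches a fixture name.
const floors: Record<string, number> = { super: 0.38 };

// Hypothetical discovered fixtures after renaming `super` to `super-v2`.
const discovered = ['super-v2'];

for (const testName of discovered) {
  // Same lookup pattern as the test body: a missing key falls back to 0.
  const floor = floors[testName] ?? 0;
  console.log(`${testName}: floor = ${floor}`);
}

// Keys that match no discovered fixture; a startup check would flag these.
const staleKeys = Object.keys(floors).filter((k) => !discovered.includes(k));
console.log(staleKeys);
```

Here the renamed fixture runs with a floor of 0, and `super` shows up as the one stale key.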

Contributor Author

Fixed — added a startup sanity-check after discoverTests() that throws if any key in RECALL_FLOORS does not match a discovered fixture directory. Gated on tests.length > 0 so it does not fire in CI environments where the jelly-micro directory is gitignored (the suite is skipped there anyway).


const tests = discoverTests();

// Sanity-check: every RECALL_FLOORS key must match a discovered fixture name.
// A mismatch means either a fixture was renamed or the key was mistyped, and
// the regression floor would silently degrade to 0 without this guard.
// Only enforce when the fixture directory is present (CI skips the whole suite
// when fixtures are absent).
if (tests.length > 0) {
  const testSet = new Set(tests);
  for (const key of Object.keys(RECALL_FLOORS)) {
    if (!testSet.has(key)) {
      throw new Error(
        `RECALL_FLOORS key "${key}" does not match any discovered fixture in ${FIXTURES_DIR}. ` +
          'Update the key or remove the stale entry.',
      );
    }
  }
}

// Per-test results collected for summary
const allResults: Record<
string,
@@ -194,8 +239,8 @@ describe.skipIf(tests.length === 0)('Jelly Micro-Test Benchmark', () => {
for (const e of fn) console.log(` FN: ${e}`);
}

- // Soft gate: recall must be ≥ 0% (we don't gate yet — this benchmark is diagnostic)
- expect(recall).toBeGreaterThanOrEqual(0);
+ const floor = RECALL_FLOORS[testName] ?? 0;
+ expect(recall).toBeGreaterThanOrEqual(floor);
});
});
}