test(jelly-micro): add per-fixture recall floors by carlos-alm · Pull Request #1409 · optave/ops-codegraph-tool

carlos-alm · 2026-06-08T06:33:48Z

Summary

Replaces the trivially-passing recall >= 0 gate with a RECALL_FLOORS map keyed by fixture name
Fixtures at 100% recall are locked to 1.0 — any regression immediately fails CI
Partially-resolved fixtures (classes 20%, defineProperty 50%, super 38%, super2 40%) are locked at their current baseline; losing even a single TP trips the assertion
Fixtures that currently resolve 0 edges stay at 0 (they can only improve)
Baseline sourced from a clean origin/main run: precision=65.3%, recall=40.9%, TP=47 FP=25 FN=68
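
The gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: only `RECALL_FLOORS` is named in the PR, while `FixtureResult`, `floorFor`, and `checkRecall` are hypothetical names, and treating a fixture with zero named edges as recall 0 is an assumption.

```typescript
// Hypothetical result shape for one fixture: true positives and named expected edges.
type FixtureResult = { name: string; tp: number; named: number };

// A subset of the floors quoted in this PR, for illustration only.
const RECALL_FLOORS: Record<string, number> = {
  fun: 1.0, // 4/4
  defineProperty: 0.5, // 3/6
  super2: 0.4, // 2/5
};

// Fixtures without an entry fall back to a floor of 0, so they can only improve.
function floorFor(name: string): number {
  return RECALL_FLOORS[name] ?? 0;
}

// Compute recall and compare it with the fixture's floor.
// Assumption: a fixture with no named edges is treated as recall 0.
function checkRecall(r: FixtureResult): { recall: number; floor: number; ok: boolean } {
  const recall = r.named === 0 ? 0 : r.tp / r.named;
  const floor = floorFor(r.name);
  return { recall, floor, ok: recall >= floor };
}
```

In a Vitest suite, the `ok` comparison would presumably be expressed as an assertion such as `expect(recall).toBeGreaterThanOrEqual(floor)` inside each fixture's test.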

Test plan

npx vitest run tests/benchmarks/resolution/jelly-micro.test.ts — all 65 pass
npx biome check — no lint/format issues

Closes #1387.

Replace the trivially-passing recall >= 0 gate with a RECALL_FLOORS map keyed by fixture name. Fixtures that already reach 100% are locked at 1.0; partially-resolved fixtures (classes, defineProperty, super, super2) are locked at their current baseline so a single lost edge fails CI. Unresolvable fixtures (0% baseline) continue to default to 0. Closes #1387.

greptile-apps · 2026-06-08T06:36:25Z

Greptile Summary

This PR replaces the no-op recall >= 0 assertion with a per-fixture RECALL_FLOORS map, locking fully-resolved fixtures at 1.0 and partially-resolved ones at their current baseline fractions. It also adds the stale-key sanity check raised in the previous review thread.

The 13 fixtures listed in RECALL_FLOORS are correctly configured: every stored floor satisfies (TP−1)/named < floor ≤ TP/named, so a single lost TP trips the assertion.
The stale-key guard (module-level throw gated on tests.length > 0) prevents silently-degraded floors when a fixture directory is renamed or a key is mistyped.
The aggregate TP count from the listed floor entries (41) is 6 less than the baseline total of 47 stated in the PR description, suggesting at least a few unlisted fixtures have non-zero recall but a floor of 0 — those regressions would still go undetected.
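
The single-lost-TP property can be checked mechanically. The sketch below uses a few of the floors and TP/named counts quoted in the diff comments; `FloorEntry` and `tripsOnSingleLoss` are hypothetical names introduced here, not part of the PR.

```typescript
// Hypothetical record pairing a floor with the counts from its diff comment.
type FloorEntry = { name: string; floor: number; tp: number; named: number };

const entries: FloorEntry[] = [
  { name: "defineProperty", floor: 0.5, tp: 3, named: 6 },
  { name: "super", floor: 0.38, tp: 5, named: 13 },
  { name: "super2", floor: 0.4, tp: 2, named: 5 },
  { name: "fun", floor: 1.0, tp: 4, named: 4 },
  { name: "generators", floor: 1.0, tp: 9, named: 9 },
];

// A floor is "single-loss sensitive" when the current recall passes it
// and the recall after losing exactly one true positive does not.
function tripsOnSingleLoss(e: FloorEntry): boolean {
  const current = e.tp / e.named;
  const afterLoss = (e.tp - 1) / e.named;
  return current >= e.floor && afterLoss < e.floor;
}

const allSensitive: boolean = entries.every(tripsOnSingleLoss);
```

A floor of 0 never satisfies this check, which is why fixtures that are absent from the map cannot detect a regression.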

Confidence Score: 4/5

Safe to merge pending clarification that unlisted fixtures with non-zero recall are intentionally excluded from the floor map.

The listed floor values are all mathematically correct and the stale-key guard works as intended. The one concern is that the PR description's claim — 'fixtures that currently resolve 0 edges stay at 0' — does not match the aggregate numbers: 47 baseline TPs minus the 41 TPs represented in RECALL_FLOORS leaves 6 TPs spread across unlisted fixtures that can regress to 0% without failing CI.
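
The arithmetic behind the 6-TP gap can be reproduced from the per-fixture counts in the diff comments (excluding `more1`, which is not among the 13 listed fixtures). The variable names below are illustrative only, and the baseline of 47 is taken from the PR description.

```typescript
// TP counts per listed fixture, copied from the diff comments (e.g. "super: 0.38, // 5/13").
const listed: { name: string; tp: number }[] = [
  { name: "accessors3", tp: 1 },
  { name: "arguments", tp: 1 },
  { name: "classes", tp: 6 },
  { name: "defineProperty", tp: 3 },
  { name: "fun", tp: 4 },
  { name: "generators", tp: 9 },
  { name: "receiver-callee-mixup", tp: 1 },
  { name: "rest", tp: 1 },
  { name: "spread", tp: 4 },
  { name: "super", tp: 5 },
  { name: "super2", tp: 2 },
  { name: "super3", tp: 3 },
  { name: "this", tp: 1 },
];

// Baseline total TP as stated in the PR description.
const BASELINE_TP = 47;

const listedTotal: number = listed.reduce((sum, f) => sum + f.tp, 0);

// True positives that belong to fixtures without a floor entry.
const unguardedTruePositives: number = BASELINE_TP - listedTotal;
```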

tests/benchmarks/resolution/jelly-micro.test.ts — specifically whether the 6 TPs from fixtures not listed in RECALL_FLOORS are intentionally left unguarded.

Important Files Changed

| Filename | Overview |
| --- | --- |
| tests/benchmarks/resolution/jelly-micro.test.ts | Replaces the trivially-passing `recall >= 0` gate with per-fixture `RECALL_FLOORS`. All listed floor values are mathematically correct (a single lost TP trips each), and the stale-key sanity check is properly guarded. However, the aggregate TP count from listed entries (41) falls 6 short of the baseline total (47), meaning at least some unlisted fixtures have non-zero recall but a floor of 0, leaving their regressions undetected. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Module load] --> B{tests.length > 0?}
    B -- No --> C[Suite skipped - fixtures absent in CI]
    B -- Yes --> D[Sanity-check RECALL_FLOORS keys against discovered fixtures]
    D -- Stale key found --> E[throw Error - aborts test collection]
    D -- All keys valid --> F[Run describe suite]
    F --> G[For each fixture: beforeAll - buildGraph and query DB]
    G --> H[test: named-edge recall]
    H --> I[Compute recall = TP / named]
    I --> J[Lookup floor from RECALL_FLOORS, default 0]
    J --> K{recall >= floor?}
    K -- Yes --> L[Pass]
    K -- No --> M[Fail: regression detected]
    F --> N[afterAll: print aggregate summary]

Reviews (10): Last reviewed commit: "Merge branch 'main' into test/jelly-micr..."

greptile-apps · 2026-06-08T06:36:28Z

+const RECALL_FLOORS: Record<string, number> = {
+  accessors3: 1.0, // 1/1
+  arguments: 1.0, // 1/1
+  classes: 0.2, // 6/30
+  defineProperty: 0.5, // 3/6
+  fun: 1.0, // 4/4
+  generators: 1.0, // 9/9
+  more1: 1.0, // 10/10
+  'receiver-callee-mixup': 1.0, // 1/1
+  rest: 1.0, // 1/1
+  spread: 1.0, // 4/4
+  super: 0.38, // 5/13
+  super2: 0.4, // 2/5
+  super3: 1.0, // 3/3
+  this: 1.0, // 1/1
+};


Stale RECALL_FLOORS keys won't surface fixture renames

RECALL_FLOORS entries that don't match any discovered fixture name are silently ignored — the affected test would fall back to floor = 0 instead of the intended floor. If a fixture directory is renamed (e.g. super → super-v2) or a key is mistyped here, the regression protection is lost without any warning. Adding a quick sanity check after discoverTests() — verifying every Object.keys(RECALL_FLOORS) entry appears in tests — would catch this at suite startup rather than silently lowering the bar.

Fixed — added a startup sanity-check after discoverTests() that throws if any key in RECALL_FLOORS does not match a discovered fixture directory. Gated on tests.length > 0 so it does not fire in CI environments where the jelly-micro directory is gitignored (the suite is skipped there anyway).

…tures Add a startup check after discoverTests() that throws if any key in RECALL_FLOORS does not match an actual fixture directory. A renamed or mistyped fixture key would otherwise silently lower the recall floor to 0 with no warning, defeating the regression gate. The check is gated on tests.length > 0 so it does not fire in CI environments where the jelly-micro directory is gitignored.
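
A stale-key guard of the kind described here might look like the sketch below. The PR describes the behavior but this thread does not show the code, so `findStaleFloorKeys` and `assertNoStaleFloors` are hypothetical helpers, and the error message wording is an assumption.

```typescript
// Return every floor key that has no matching discovered fixture name.
function findStaleFloorKeys(floors: Record<string, number>, discovered: string[]): string[] {
  const known = new Set<string>(discovered);
  return Object.keys(floors).filter((key) => !known.has(key));
}

// Throw at startup if any floor key is stale.
// The check is skipped when no fixtures were discovered, mirroring the
// tests.length > 0 gate for environments where the fixture directory is absent.
function assertNoStaleFloors(floors: Record<string, number>, discovered: string[]): void {
  if (discovered.length === 0) return;
  const stale = findStaleFloorKeys(floors, discovered);
  if (stale.length > 0) {
    throw new Error(`RECALL_FLOORS has keys with no matching fixture: ${stale.join(", ")}`);
  }
}
```

Running the check at module level means a mismatch aborts test collection instead of quietly lowering a floor to 0.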

carlos-alm · 2026-06-08T08:48:55Z

@greptileai

more1 was moved from jelly-micro to the pts-javascript fixture set in #1383 (commit ddfc14c). The RECALL_FLOORS map still referenced it, causing the startup sanity-check to throw in CI where the merged branch no longer includes more1 in jelly-micro.

…387' into test/jelly-micro-recall-floors-1387-wt

carlos-alm · 2026-06-09T12:04:25Z

Addressed CI failures: removed stale more1 key from RECALL_FLOORS.

The more1 fixture was moved from jelly-micro to pts-javascript in #1383 (commit ddfc14c). The sanity-check guard (added in commit 48aeea8 per Greptile's suggestion) then correctly caught this as a stale entry, failing CI on all platforms. Removed more1: 1.0 from the map and updated the comment to explain why it's absent.

carlos-alm · 2026-06-09T12:04:35Z

@greptileai

…/30) (#1409)

…387' into test/jelly-micro-recall-floors-1387

carlos-alm · 2026-06-09T21:50:05Z

Fixed CI failure: the classes recall floor was set to 0.2 on the assumption of 6/30 edges, but the fixture's expected-edge set, deduplicated with a new Set() keyed on name@basename, has 31 unique entries. The actual baseline recall is therefore 6/31 ≈ 0.1935, which fails the 0.2 gate.

Corrected to classes: 0.19, // 6/31. The floor 0.19 satisfies 6/31 = 0.1935... >= 0.19 and would fail at 5/31 = 0.1613, preserving the single-TP regression sensitivity.
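
The floor arithmetic above can be verified with a short sketch. The edge shape and the helpers `edgeKey`, `countUniqueEdges`, and `passesFloor` are hypothetical. The comment only states that expected edges are deduplicated with a Set keyed on name@basename, so the key construction below is an assumption about how that might look.

```typescript
// Hypothetical expected-edge shape: a target name plus the file it lives in.
type ExpectedEdge = { name: string; file: string };

// Build a name@basename key (assumes "/"-separated paths).
function edgeKey(e: ExpectedEdge): string {
  const base = e.file.substring(e.file.lastIndexOf("/") + 1);
  return `${e.name}@${base}`;
}

// The recall denominator is the number of unique keys, not the raw edge count.
function countUniqueEdges(edges: ExpectedEdge[]): number {
  const keys = new Set<string>();
  edges.forEach((e) => keys.add(edgeKey(e)));
  return keys.size;
}

// The gate itself: recall must be at least the floor.
function passesFloor(tp: number, named: number, floor: number): boolean {
  return tp / named >= floor;
}

// 6/31 fails the old 0.2 floor but passes 0.19; 5/31 fails 0.19.
const oldFloorPasses = passesFloor(6, 31, 0.2);
const newFloorPasses = passesFloor(6, 31, 0.19);
const regressionPasses = passesFloor(5, 31, 0.19);
```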

carlos-alm · 2026-06-09T21:50:34Z

@greptileai

greptile-apps Bot reviewed Jun 8, 2026


carlos-alm and others added 4 commits June 8, 2026 11:29

fix: resolve merge conflicts with main

db459d5

Merge branch 'main' into test/jelly-micro-recall-floors-1387

ef9a741

Merge remote-tracking branch 'origin/test/jelly-micro-recall-floors-1…

adc1f04

…387' into test/jelly-micro-recall-floors-1387-wt

carlos-alm added 2 commits June 9, 2026 15:39

fix(jelly-micro): correct classes floor from 0.2 to 0.19 (6/31, not 6…

d6c568f

…/30) (#1409)

Merge remote-tracking branch 'origin/test/jelly-micro-recall-floors-1…

2abb560

…387' into test/jelly-micro-recall-floors-1387

Merge branch 'main' into test/jelly-micro-recall-floors-1387

fdc857b

carlos-alm merged commit f414be4 into main Jun 9, 2026
22 checks passed

carlos-alm deleted the test/jelly-micro-recall-floors-1387 branch June 9, 2026 22:34

github-actions Bot locked and limited conversation to collaborators Jun 9, 2026

carlos-alm merged 9 commits into main from test/jelly-micro-recall-floors-1387
