Increase TS E2E test data size to fix flaky js-ts-class #2051
Merged
The js-ts-class E2E test was flaky because n=100 is too small for the O(n²)→O(n) optimization to overcome Map/Set per-operation overhead. At n=100, the LLM correctly generates a Map-based O(n) solution but it benchmarks as slower (-10.6%) due to constant factor dominance. Bump to n=10,000 so the algorithmic improvement produces measurable speedup, making the 30% E2E threshold reliably achievable.
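For illustration, here is a minimal sketch of the two shapes of solution being compared. The function names and bodies are assumptions and not the actual fixture code: a nested-loop O(n²) baseline, and a Set-based O(n) rewrite of the kind the LLM is described as generating. `timeMs` is a hypothetical helper for eyeballing the difference at various n.

```typescript
// Hypothetical O(n^2) baseline: compare every pair of elements.
function findDuplicatesQuadratic(items: number[]): number[] {
  const duplicates: number[] = [];
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (items[i] === items[j] && duplicates.indexOf(items[i]) === -1) {
        duplicates.push(items[i]);
      }
    }
  }
  return duplicates;
}

// Hypothetical O(n) rewrite: a single pass with Set membership checks.
// Each has()/add() call carries hashing overhead, which is the constant
// factor that can dominate at small n.
function findDuplicatesLinear(items: number[]): number[] {
  const seen = new Set<number>();
  const duplicates = new Set<number>();
  for (let i = 0; i < items.length; i++) {
    if (seen.has(items[i])) {
      duplicates.add(items[i]);
    } else {
      seen.add(items[i]);
    }
  }
  return Array.from(duplicates);
}

// Hypothetical timing helper: wall-clock milliseconds for `runs` calls.
function timeMs(fn: (items: number[]) => number[], data: number[], runs: number): number {
  const start = Date.now();
  for (let r = 0; r < runs; r++) {
    fn(data);
  }
  return Date.now() - start;
}
```

Both functions return the same set of duplicated values, though not necessarily in the same order. Timing them with `timeMs` at n=100 and n=10,000 is one way to check where the crossover lies on a given machine.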
The change detection for JS E2E tests was missing the test fixture directory, so PRs that only modify JS test data (like this one) were skipped. Java already had its equivalent path included.
The profiler's save() was called every 100 hit() calls. With O(n²)
algorithms this produced hundreds of thousands of writeFileSync calls,
each truncating the file to 0 bytes before writing. If the subprocess
timed out (SIGKILL), the file was left at 0 bytes → JSONDecodeError.
Fixes:
- Move require('fs')/require('path') to module scope (not inside save())
- Reduce save-every-N from 100 → 10,000 hits (100x fewer syscalls)
- Pre-create output file with {} before running Jest (safety net)
- Handle empty files gracefully in parse_results
- Fix misleading "file not found" warning → "file empty or no timing data"
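The batching and empty-file handling above can be sketched roughly as follows. This is not the actual profiler code: `HitProfiler`, `WriteSink`, and this `parseResults` are hypothetical names, the sink stands in for the `fs.writeFileSync` call so the sketch has no file-system dependency, and `parseResults` is only a TypeScript analogue of the empty-file guard described for `parse_results`.

```typescript
// Hypothetical stand-in for the writeFileSync call in the real profiler.
type WriteSink = (json: string) => void;

class HitProfiler {
  private counts: Record<string, number> = {};
  private hitsSinceSave = 0;
  private write: WriteSink;
  private saveEvery: number;

  // saveEvery defaults to 10,000, the value the fix moves to (assumed wiring).
  constructor(write: WriteSink, saveEvery: number = 10000) {
    this.write = write;
    this.saveEvery = saveEvery;
  }

  // Record one hit; flush only once every `saveEvery` hits.
  hit(lineId: string): void {
    this.counts[lineId] = (this.counts[lineId] ?? 0) + 1;
    this.hitsSinceSave += 1;
    if (this.hitsSinceSave >= this.saveEvery) {
      this.save();
    }
  }

  // Serialize all counts and hand them to the sink.
  save(): void {
    this.write(JSON.stringify(this.counts));
    this.hitsSinceSave = 0;
  }
}

// Tolerant reader: an empty (e.g. truncated) file yields no data
// instead of a JSON parse error.
function parseResults(raw: string): Record<string, number> {
  const text = raw.trim();
  if (text.length === 0) {
    return {};
  }
  return JSON.parse(text) as Record<string, number>;
}
```

With a save interval of N, a run that records H hits triggers about H/N writes, so moving from 100 to 10,000 cuts the write count by a factor of 100, matching the figure in the list above.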
Summary
- Increase findDuplicates benchmark test data from n=100 to n=10,000

Problem
The js-ts-class E2E test is flaky because at n=100, the O(n²)→O(n) optimization the LLM correctly generates (Map-based deduplication) benchmarks as slower (-10.6%): Map/Set per-operation overhead dominates at small input sizes.

Fix
At n=10,000, the algorithmic improvement reliably produces measurable speedup, making the 30% E2E threshold consistently achievable.

Test plan
- npx jest passes (83/83)
- js-ts-class E2E passes with the new data size