Measure retained heap in lint memory test to remove flakiness#5097
Open
habdelra wants to merge 2 commits into
Open
Measure retained heap in lint memory test to remove flakiness#5097habdelra wants to merge 2 commits into
habdelra wants to merge 2 commits into
Conversation
The "memory usage during lint operations" test asserted a hard ceiling on the process heapUsed delta across ten concurrent /_lint requests without forcing a collection first, so the reading was dominated by transient garbage that GC had not yet reclaimed rather than by retained memory. The delta therefore tracked GC timing, not leaks, and intermittently tipped over the threshold. The test now warms up once (so one-time eslint/prettier initialization is not miscounted), forces a full GC before both the baseline and final reading, and asserts on the resulting retained-growth delta. A `forceGc` helper enables GC at runtime via the V8 flag API since the runner does not pass --expose-gc. Retained growth measures ~3MB locally; the bound is set to 20MB as a leak guard with ample headroom. A diagnostic line logs the retained-growth number and whether GC ran, so any future failure can be told apart from measurement noise. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR reduces flakiness in the realm-server lint endpoint memory test by shifting the measurement from “heapUsed at arbitrary GC timing” to a post-GC retained-heap delta, using a warm-up request plus an explicit GC trigger.
Changes:
- Add a
forceGc()helper to attempt to enable and run V8 garbage collection at runtime (without starting Node with--expose-gc). - Update the lint memory test to warm caches, force GC before baseline/final readings, tighten the retained-growth threshold, and emit a diagnostic line.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| packages/realm-server/tests/realm-endpoints/lint-test.ts | Reworks the memory-usage test to measure post-GC retained growth and adds diagnostics. |
| packages/realm-server/tests/helpers/prettier-test-utils.ts | Introduces forceGc() helper used by the memory test to trigger GC without --expose-gc. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- forceGc resolves the GC function once and caches it, so a missing globalThis.gc no longer creates a throwaway VM context on every call (which would itself allocate and perturb the heap reading). The runtime-enable path now swallows a ReferenceError and returns false instead of letting it escape, honoring the documented contract. - The memory test now keys its threshold on whether a collection actually ran: the tight 20MB retained-memory bound when GC was forced, falling back to the looser historical 45MB bound otherwise, so the test never becomes more flaky than before in an environment without runtime GC. - The warm-up request's status is asserted so a regression there is localized rather than silently shifting the baseline. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
lukemelia
approved these changes
Jun 3, 2026
Contributor
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
The
realm-endpoints/lint-test.ts > memory usage during lint operationstest is intermittently flaky. It asserts a hard ceiling on the processheapUseddelta across ten concurrent/_lintrequests, but never forces a garbage collection before reading. The "after" reading therefore includes transient garbage that GC simply hasn't reclaimed yet — so the delta measures GC timing, not retained memory, and occasionally tips over the threshold on a busy CI runner.Fix
Make the measurement reflect retained memory — the actual leak signal:
forceGchelper enables GC at runtime via the V8 flag API, because the test runner doesn't start Node with--expose-gc.[lint-memory-test] initial=… final=… retainedGrowth=… gcForced=…) so any future failure shows the real retained number and whether GC actually ran — distinguishing a real leak from measurement noise.Verification
Ran the module locally against the realm-server test stack:
🤖 Generated with Claude Code