fix: cap in-memory journal to prevent heap OOM under sustained load #114
Merged
…d load
The Journal class appended one entry per request (body + headers + fixture
reference) and never evicted, so long-running servers grew heap linearly.
Production showcase-aimock on Railway OOMs every ~18min (heap 0 -> 4GB)
because every LLM request flowing through proxy-only mode still appends to
Journal, even when fixture caching and on-disk recording are disabled.
Add a FIFO size cap:
- new JournalOptions.maxEntries constructor arg (0 = unbounded, preserves
prior behavior for existing callers, including 100+ test call sites)
- MockServerOptions.journalMaxEntries plumbs through createServer
- CLI --journal-max flag, defaulting to 1000 (sensible for long-running
mock proxies; tiny memory footprint; large enough for most test-harness
inspection use cases)
Eviction is single-shift-per-add (amortized O(1) under the never-overshoot-
by-more-than-one invariant).
Programmatic callers that want full unbounded history can pass 0 explicitly.
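
The FIFO cap described above can be sketched roughly as follows. This is a minimal illustration, not the actual `Journal` implementation: the class name `CappedJournal`, the simplified entry shape, and the `getAll`/`size` accessors are assumptions made for the example; only `maxEntries` and its `0 = unbounded` meaning come from the commit message.

```typescript
// Hypothetical, simplified entry shape; the real journal entry carries more fields.
interface JournalEntry {
  method: string;
  path: string;
  body: unknown;
}

interface JournalOptions {
  // 0 (or omitted) means unbounded, as described in the commit message.
  maxEntries?: number;
}

// Illustrative stand-in for Journal, not the project's class.
class CappedJournal {
  private readonly entries: JournalEntry[] = [];
  private readonly maxEntries: number;

  constructor(options: JournalOptions = {}) {
    this.maxEntries = options.maxEntries ?? 0;
  }

  add(entry: JournalEntry): void {
    this.entries.push(entry);
    // A push can overshoot the cap by at most one entry,
    // so a single shift() is enough to restore the invariant.
    if (this.maxEntries > 0 && this.entries.length > this.maxEntries) {
      this.entries.shift();
    }
  }

  getAll(): readonly JournalEntry[] {
    return this.entries;
  }

  get size(): number {
    return this.entries.length;
  }
}

// Usage: with a cap of 3, only the three most recent entries are retained.
const capped = new CappedJournal({ maxEntries: 3 });
for (let i = 0; i < 5; i++) {
  capped.add({ method: "POST", path: `/req/${i}`, body: null });
}
```

Because the cap is checked on every add, the array never grows past `maxEntries`, which is what bounds steady-state memory.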
jpr5 added a commit that referenced this pull request on Apr 17, 2026
…erver defaults (#115)

## Summary

Follow-up polish to v1.14.1's journal OOM fix (#114). Three independent improvements surfaced by CR:

1. **Read-path non-mutation bug fix.** `Journal.getFixtureMatchCount(fixture, testId)` was a read method that silently inserted an empty Map and triggered FIFO eviction for unknown testIds, so reads could evict live testIds. It is now split into a read-only public `getFixtureMatchCountsForTest` (returns a transient empty Map on miss) and a private `getOrCreateFixtureMatchCountsForTest` (the insert+evict write path, used only by `incrementFixtureMatchCount`).
2. **CLI validation hardening.** `--journal-max -5` was silently treated as unbounded; it is now rejected with a clear error. Same for the new `--fixture-counts-max` flag.
3. **`createServer()` default flip.** `journalMaxEntries` (1000) and `fixtureCountsMaxTestIds` (500) now default to finite caps for programmatic callers, so long-running embedders no longer inherit the original leak. Tests using `new Journal()` directly remain unbounded by default (back-compat). Opt in to unbounded via `journalMaxEntries: 0`.

## Test plan

- [x] 3 new tests: read-non-mutation, CLI negative rejection, `fixtureCountsMaxTestIds` cap FIFO eviction
- [x] Full suite: 2461 tests pass
- [x] Lint + build + prettier clean
- [x] CR R2 on combined diff: 0 blocking bugs

## Breaking change note

The `createServer()` default flip is a behavioral change for programmatic embedders that relied on unbounded journal retention. Opt back in with `createServer({ journalMaxEntries: 0 })` if needed. Documented in `types.ts` JSDoc.
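
The read/write split in item 1 can be illustrated with a small sketch. The method names are the ones given in the commit message; the wrapper class `FixtureMatchCounts`, the `Fixture` type, the `trackedTestIds` accessor, and the internal layout are assumptions for illustration and may not match the real `Journal`.

```typescript
// Hypothetical stand-in for the project's fixture type.
type Fixture = { name: string };

// Illustrative container; in the project this state lives on Journal.
class FixtureMatchCounts {
  private readonly byTestId = new Map<string, Map<Fixture, number>>();
  private readonly maxTestIds: number;

  // maxTestIds = 0 means unbounded (assumed to mirror the journal cap semantics).
  constructor(maxTestIds: number = 0) {
    this.maxTestIds = maxTestIds;
  }

  // Read path: never inserts and never evicts. A miss returns a throwaway empty Map.
  getFixtureMatchCountsForTest(testId: string): ReadonlyMap<Fixture, number> {
    return this.byTestId.get(testId) ?? new Map<Fixture, number>();
  }

  // Write path: inserts on miss and applies FIFO eviction over testIds.
  private getOrCreateFixtureMatchCountsForTest(testId: string): Map<Fixture, number> {
    let counts = this.byTestId.get(testId);
    if (counts === undefined) {
      counts = new Map<Fixture, number>();
      this.byTestId.set(testId, counts);
      if (this.maxTestIds > 0 && this.byTestId.size > this.maxTestIds) {
        // Map iterates in insertion order, so the first key is the oldest testId.
        const oldest = this.byTestId.keys().next().value;
        if (oldest !== undefined) this.byTestId.delete(oldest);
      }
    }
    return counts;
  }

  incrementFixtureMatchCount(fixture: Fixture, testId: string): void {
    const counts = this.getOrCreateFixtureMatchCountsForTest(testId);
    counts.set(fixture, (counts.get(fixture) ?? 0) + 1);
  }

  get trackedTestIds(): number {
    return this.byTestId.size;
  }
}
```

With this shape, looking up an unknown testId cannot push a live testId out of the map; only `incrementFixtureMatchCount` can trigger eviction.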
## Summary

`Journal.entries` was unbounded: every request across 26 handlers (~179 call sites: chat completions, messages, responses, gemini, bedrock, embeddings, images, speech, transcription, video, ollama, cohere, search, rerank, moderation, a2a, mcp, agui, ws-*, etc.) pushed a full `{ method, path, headers, body, response }` record and never evicted. At sustained prod traffic this grows heap ~3.8MB/sec → 4GB → OOM in ~18 minutes. Observed exactly that on the `showcase-aimock` Railway service: deterministic 0→4GB heap growth, then `FATAL ERROR: Reached heap limit Allocation failed`. The crash cascades to ~7 downstream showcase services that route through aimock via `OPENAI_BASE_URL`.

## Fix

FIFO size cap on `Journal.entries` via new `JournalOptions.maxEntries`. Default:

- Library (`new Journal()` / `createServer()`): unbounded (backwards-compat; 100+ test/library callers depend on this).
- CLI (`serve` / the GHCR image): 1000. Override via `--journal-max <N>`; `0` = unbounded (as is an omitted `maxEntries` for library callers).

Eviction is a single `shift()` per over-cap add. At cap=1000 × ~5KB/entry ≈ 5MB steady-state, well under heap limits.

## Test plan

- `src/__tests__/journal.test.ts` (cap behavior, FIFO ordering, uncapped default, `getLast`/`findByFixture` post-eviction, 100k-add cap invariant)

## Follow-ups (separate PR)

- Negative `--journal-max` values (currently silently treated as unbounded)
- `createServer()` default: flip to finite cap so long-running embedders don't inherit the leak
- `fixtureMatchCountsByTestId` Map cap (narrower but also unbounded)
- `Array.shift` is O(n); true O(1) would need a ring buffer
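
For the ring-buffer follow-up, one possible shape is sketched below. This is not code from the PR: `RingBuffer`, its methods, and the choice to require a positive fixed capacity are all assumptions for illustration. The point it shows is that a full buffer overwrites the oldest slot in place, so each add is O(1) without shifting the array.

```typescript
// Illustrative fixed-capacity ring buffer; a hypothetical replacement for
// push()+shift() on a plain array. Not part of the PR.
class RingBuffer<T> {
  private readonly slots: (T | undefined)[];
  private readonly capacity: number;
  private head = 0; // index of the oldest element
  private count = 0; // number of stored elements

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
    this.slots = new Array<T | undefined>(capacity).fill(undefined);
  }

  push(item: T): void {
    // When full, this index equals head, i.e. the oldest slot.
    const tail = (this.head + this.count) % this.capacity;
    this.slots[tail] = item;
    if (this.count === this.capacity) {
      // The oldest element was just overwritten; the next one becomes oldest.
      this.head = (this.head + 1) % this.capacity;
    } else {
      this.count++;
    }
  }

  // Returns elements oldest-first, matching FIFO journal ordering.
  toArray(): T[] {
    const out: T[] = [];
    for (let i = 0; i < this.count; i++) {
      out.push(this.slots[(this.head + i) % this.capacity] as T);
    }
    return out;
  }

  get size(): number {
    return this.count;
  }
}
```

A trade-off worth noting: this sketch has no unbounded mode, so keeping `0 = unbounded` would still need the plain-array path alongside it.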