fix: bound _serial_cache lifetime to prevent unbounded growth (#62) #64

Merged: jensens merged 1 commit into main from fix/serial-cache-leak on Apr 22, 2026

@jensens commented on Apr 22, 2026

Closes #62.

Summary

The _serial_cache on both PGJsonbStorageInstance and the main PGJsonbStorage was a plain dict with no eviction and no clearing — it grew monotonically for the life of the storage instance. On long-running pods (5 threads × moderate traffic) the leak was estimated at ~3.8 GB after 24 hours in #62, matching observed drift toward pod memory limits.
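As a rough sanity check on that estimate, the figures can be turned into a growth rate. This is only back-of-the-envelope arithmetic: it assumes decimal gigabytes, linear growth over the 24 hours, and an even split across the 5 threads, none of which is stated in the PR.

```python
# Figures from the PR text; the unit, linearity and even split are assumptions.
leaked_bytes = 3.8e9   # ~3.8 GB per pod, read as decimal gigabytes
hours = 24
threads = 5

per_hour_mb = leaked_bytes / hours / 1e6        # growth per pod
per_thread_hour_mb = per_hour_mb / threads      # growth per storage instance

print(f"{per_hour_mb:.0f} MB/hour per pod")            # 158 MB/hour per pod
print(f"{per_thread_hour_mb:.0f} MB/hour per thread")  # 32 MB/hour per thread
```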

Fix

Two complementary fixes, both of which keep the existing conflict-resolution behaviour intact:

  • History-preserving mode: swap _serial_cache = {} for a new _NoopSerialCache (drop-in dict interface that silently drops writes and always misses on reads). _do_loadSerial retrieves old revisions from object_history directly in this mode, so the cache is redundant.
  • History-free mode: clear the cache on afterCompletion. The base versions needed by tryToResolveConflict are consumed during tpc_vote within the same transaction, so bounding the cache lifetime to the enclosing tx is safe.
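
For the history-preserving case, a no-op cache with a dict-like interface could look roughly like the sketch below. The class body is not shown in this PR, so the name `NoopSerialCache` and the exact set of methods are assumptions; the sketch only illustrates the "drops writes, always misses" behaviour described above.

```python
class NoopSerialCache:
    """Hypothetical stand-in for _NoopSerialCache: dict-like, stores nothing."""

    def __setitem__(self, key, value):
        # Silently drop the write.
        pass

    def __getitem__(self, key):
        # Always a miss, exactly like looking up a key in an empty dict.
        raise KeyError(key)

    def get(self, key, default=None):
        return default

    def pop(self, key, *default):
        if default:
            return default[0]
        raise KeyError(key)

    def __contains__(self, key):
        return False

    def __len__(self):
        return 0

    def __iter__(self):
        return iter(())

    def clear(self):
        pass


cache = NoopSerialCache()
key = (b"\x00" * 8, b"\x00" * 7 + b"\x01")  # illustrative (oid, serial) pair
cache[key] = b"pickle bytes"                # the write is discarded
```

Because every read misses, callers fall through to whatever they already do on a cache miss, which in this mode is the object_history lookup.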

Applied symmetrically to both PGJsonbStorageInstance and PGJsonbStorage. No new config knob, no API change.
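
The history-free half can be illustrated with a toy storage that clears its cache at the transaction boundary. This is a sketch under stated assumptions, not the real PGJsonbStorage: `ToyHistoryFreeStorage`, its `load` signature, and the argument-less `afterCompletion` are hypothetical simplifications.

```python
class ToyHistoryFreeStorage:
    """Hypothetical toy model, not the real PGJsonbStorage."""

    def __init__(self):
        self._serial_cache = {}

    def load(self, oid, serial, data):
        # Remember the loaded state as a possible conflict-resolution base.
        self._serial_cache[(oid, serial)] = data
        return data

    def loadSerial(self, oid, serial):
        # Base-version lookup, as conflict resolution would do during tpc_vote.
        return self._serial_cache[(oid, serial)]

    def afterCompletion(self):
        # Transaction finished (commit or abort): drop everything cached for it.
        self._serial_cache.clear()


storage = ToyHistoryFreeStorage()
sizes = []
for tx in range(3):
    for i in range(100):
        storage.load(oid=i, serial=tx, data=b"state")
    sizes.append(len(storage._serial_cache))  # populated within the tx
    storage.afterCompletion()
    sizes.append(len(storage._serial_cache))  # emptied at the tx boundary

print(sizes)  # [100, 0, 100, 0, 100, 0]
```

The cache size is bounded by the number of objects touched in a single transaction instead of growing with the lifetime of the storage instance.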

Test plan

New tests/test_serial_cache.py with 4 tests, two per invariant:

  • History-preserving: _serial_cache stays at size 0 after arbitrary loads
  • History-preserving: still size 0 after repeated open/close cycles
  • History-free: cache populated during tx, emptied after afterCompletion
  • History-free: cache does not accumulate across tx cycles

Full test suite: 468 passed, zero regressions, including the conflict-resolution and conformance tests.

Risk

Minimal. The only behavioural change is that cross-transaction conflict resolution in history-free mode no longer finds its base version in a stale in-memory cache. That path was already broken (the row has been overwritten in PG by the time it matters), so the change makes an existing limitation explicit rather than introducing a new one.

🤖 Generated with Claude Code

The conflict-resolution cache was a plain dict with no eviction or
clearing. It grew monotonically for the life of the storage instance,
driving memory pressure on long-running pods (estimated ~3.8 GB leaked
per pod after 24 hours of moderate traffic at 5 threads).

Two complementary fixes:

- History-preserving: swap for _NoopSerialCache. object_history already
  serves _do_loadSerial, so the cache is redundant. Zero memory cost.
- History-free: clear on afterCompletion. Base versions needed by
  tryToResolveConflict are consumed during tpc_vote within the same
  transaction, so bounding the cache lifetime to the enclosing tx is
  safe.

Applies to both PGJsonbStorageInstance and main PGJsonbStorage. No new
config, no API change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jensens merged commit 3eb5430 into main on Apr 22, 2026
5 checks passed
@jensens deleted the fix/serial-cache-leak branch on April 22, 2026 23:30

Development

Successfully merging this pull request may close these issues.

_serial_cache grows unbounded (memory leak over long-running instances)
