v0.4.60 — Memory footprint round

Released: 2026-05-16

Scope: memory-footprint hardening. No user-visible API change; the on-disk MemDB format and the MCP JSON wire format are byte-identical to v0.4.50. Existing .memdb directories work unchanged.

What changed for users

Same accuracy. Same query latency. Smaller binary. Lower RAM. And the under-reported headline: RAM that no longer swings between runs, so a container memory limit can be sized from a single number.

| Axis | v0.4.50 | v0.4.60 | Δ |
| --- | --- | --- | --- |
| Binary size (release, ARM64) | 144 MB | 85 MB | −41% |
| Cold reindex peak RSS | 538 MB | 457 MB | −15.2% |
| Cold reindex variance (spread of 3 runs) | 145 MB | 4 MB | −97% |
| Concurrent rerank+embed peak RSS (32×200) | 1514 MB | 1289 MB | −14.9% |
| Concurrent rerank+embed throughput | 11.04 qps | 11.00 qps | flat |
| 1k find_symbol p50 latency | 0.24 ms | 0.24 ms | flat |
| 1k find_symbol acc@1 / acc@5 / acc@10 | 96.6% / 99.8% / 99.8% | identical | bit-identical |

Benched on Apple M3 Max, 14 cores, mempalace (127 files / 2,918 nodes / 7,559 edges), median of 3 cold-reindex runs and a single 32-concurrency × 200-query rerank+embed run.

Why variance is the win to care about

Pre-v0.4.60:

cold reindex peak RSS, v0.4.50, three runs on the same host:
  run 1:  538 MB
  run 2:  583 MB
  run 3:  648 MB
  spread: 145 MB (29% of median)

You'd need a 30% safety margin on every container memory limit or risk OOM on a bad run. With v0.4.60:

cold reindex peak RSS, v0.4.60, three runs on the same host:
  run 1:  455 MB
  run 2:  457 MB
  run 3:  459 MB
  spread:   4 MB (1% of median)

Pick a number, pin your limit, stop guessing.
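As a rough illustration of turning the three v0.4.60 runs above into a container limit, the sketch below takes the worst observed peak and adds headroom. The 10% headroom and the variable names are assumptions chosen for the example, not a recommendation from this release.

```shell
# Observed cold-reindex peak RSS in MB (the three v0.4.60 runs listed above).
runs="455 457 459"

# Headroom on top of the worst observed run. 10% is an arbitrary illustrative value.
headroom_pct=10

# Sort numerically so the first and last lines are the min and max.
# $runs is left unquoted on purpose so each value becomes its own line.
sorted=$(printf '%s\n' $runs | sort -n)
min=$(printf '%s\n' "$sorted" | head -n 1)
max=$(printf '%s\n' "$sorted" | tail -n 1)

spread=$((max - min))

# Integer ceiling of max * (100 + headroom_pct) / 100.
limit=$(( (max * (100 + headroom_pct) + 99) / 100 ))

echo "spread=${spread}MB limit=${limit}MB"
```

With these inputs the script prints `spread=4MB limit=505MB`. Swap in peak RSS figures measured on your own host and workload before pinning a real limit.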

New env vars

| Var | Default | Purpose |
| --- | --- | --- |
| MEMTRACE_UNIFIED_CACHE_MB | 256 | Single hot-cache budget shared across the embed-vector and backend page-cache layers. moka W-TinyLFU eviction. Set 0 to disable; raise to 512 / 1024 on RAM-rich hosts. Replaces several per-subsystem caches that compounded silently. |
| MEMTRACE_ORT_LOW_RSS | 0 (off) | Set to 1 to disable the ORT CPU memory arena on every Session::builder site. Saves ~3% peak RSS at a ~19% throughput cost on our shipping model sizes; default OFF. Useful only if you've swapped in much smaller custom models. |

Full table: environment-variables.md.
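In a POSIX shell, the two variables could be set as in the sketch below before starting memtrace from that same shell. How memtrace is launched (CLI, MCP client config, service unit) is not shown and depends on your setup; the values are examples taken from the table above.

```shell
# Raise the unified hot-cache budget from the 256 MB default to 512 MB.
export MEMTRACE_UNIFIED_CACHE_MB=512

# Keep the ORT CPU memory arena enabled (0 is the default; 1 trades throughput for RSS).
export MEMTRACE_ORT_LOW_RSS=0

# Show what any process started from this shell will inherit.
env | grep '^MEMTRACE_' | sort
```

Variables exported this way last only for the current shell session; a persistent setting belongs in whatever configuration starts the process.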

What changed internally (no action needed)

  • Allocator swap. mimalloc 3 is now the default on every target except musl and Windows MSVC (which keep the system allocator, unchanged from v0.4.50). --no-default-features falls back to jemalloc if you need the rebuild path.
  • Shared tree-sitter parser instances. One pooled parser per language instead of one per worker × language. Reduces resident parser-table bytes at steady state.
  • String interning + inline-small-string on the dup-heavy node identity fields. Custom serde keeps the JSON wire format byte-equal to v0.4.50.
  • Bitmap-backed adjacency primitive in the backend. Live adjacency stores migrate in a follow-up release.

Cross-platform

| Target | Allocator | Status |
| --- | --- | --- |
| macOS aarch64 (M-series) | mimalloc 3 | benched + shipped |
| macOS x86_64 (Intel) | mimalloc 3 | cfg-identical to aarch64 |
| Linux glibc x86_64 / aarch64 | mimalloc 3 | shipped |
| Linux musl | system malloc | unchanged from v0.4.50 |
| Windows MSVC | system malloc | unchanged from v0.4.50 |
| Windows GNU | mimalloc 3 | shipped |

Not fixed in this release

There's a separate bug class around daemon supervision — orphaned worker processes after an orchestrator's heartbeat goes stale, no process memory ceiling, no idle-shutdown timer, no spawn-time lock. A field report surfaced multiple memtrace processes totaling many GB of RSS after the orchestrator's state file went stale for hours. v0.4.60 does not address this.

A dedicated fix is tracked for an upcoming release. Mitigation until it ships: if Activity Monitor / Task Manager shows more than one memtrace process for the same data dir, kill the older ones manually — see troubleshooting.md.
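On macOS or Linux, a terminal check along these lines can stand in for Activity Monitor. It is only a sketch: it matches any command line containing the string memtrace, and it assumes the data dir is visible in the process arguments, which this document does not confirm. troubleshooting.md remains the reference procedure.

```shell
# Print PID, elapsed time, RSS (KiB), and full command line for each process
# whose command line mentions memtrace. The [m] stops grep from matching itself.
list_memtrace() {
  ps -A -o pid= -o etime= -o rss= -o args= | grep '[m]emtrace'
}

# Count matching lines; "|| true" keeps a zero-match grep from aborting under "set -e".
count=$(list_memtrace | grep -c . || true)

if [ "$count" -gt 1 ]; then
  echo "found $count memtrace processes; a longer elapsed time means an older process:"
  list_memtrace
else
  echo "found $count memtrace process(es); nothing to clean up"
fi
```

Once you have confirmed which PIDs point at the same data dir, the older ones can be stopped with `kill <pid>`.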

Upgrading

# npm install
npm install -g memtrace@0.4.60

# cargo install
cargo install memtrace-mcp --version 0.4.60

No data migration required. Existing .memdb directories deserialise as-is. Drop in MEMTRACE_UNIFIED_CACHE_MB=512 if you have RAM to spare and want a higher hot-cache hit rate.