Released: 2026-05-16
Scope: memory-footprint hardening. No user-visible API change; the
on-disk MemDB format and the MCP JSON wire format are byte-identical
to v0.4.50. Existing .memdb directories work unchanged.
Same accuracy. Same query latency. Smaller binary. Lower RAM. And the under-reported headline: RAM that no longer swings between runs, so container memory limits can actually be sized from one number.
| Axis | v0.4.50 | v0.4.60 | Δ |
|---|---|---|---|
| Binary size (release, ARM64) | 144 MB | 85 MB | −41% |
| Cold reindex peak RSS | 538 MB | 457 MB | −15.2% |
| Cold reindex variance (spread of 3 runs) | 145 MB | 4 MB | −97% |
| Concurrent rerank+embed peak RSS (32×200) | 1514 MB | 1289 MB | −14.9% |
| Concurrent rerank+embed throughput | 11.04 qps | 11.00 qps | flat |
| 1k find_symbol p50 latency | 0.24 ms | 0.24 ms | flat |
| 1k find_symbol acc@1 / acc@5 / acc@10 | 96.6% / 99.8% / 99.8% | identical | bit-identical |
Benched on Apple M3 Max, 14 cores, mempalace (127 files / 2,918 nodes / 7,559 edges), median of 3 cold-reindex runs and a single 32-concurrency × 200-query rerank+embed run.
Pre-v0.4.60:
cold reindex peak RSS, v0.4.50, three runs on the same host:
run 1: 538 MB
run 2: 583 MB
run 3: 648 MB
spread: 145 MB (29% of median)
You'd need a 30% safety margin on every container memory limit or risk OOM on a bad run. With v0.4.60:
cold reindex peak RSS, v0.4.60, three runs on the same host:
run 1: 455 MB
run 2: 457 MB
run 3: 459 MB
spread: 4 MB (1% of median)
Pick a number, pin your limit, stop guessing.
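To make "pick a number" concrete, here is a minimal sketch that derives the spread and a candidate limit from the three v0.4.60 runs above. The 10% headroom is an illustrative assumption, not a figure from the benchmarks; substitute your own policy.

```shell
# Peak RSS in MB from the three v0.4.60 cold-reindex runs quoted above.
set -- 455 457 459

# Sort numerically so min, median, and max can be picked by position.
sorted=$(printf '%s\n' "$@" | sort -n)
min=$(printf '%s\n' "$sorted" | sed -n '1p')
median=$(printf '%s\n' "$sorted" | sed -n '2p')
max=$(printf '%s\n' "$sorted" | sed -n '3p')

spread=$((max - min))
# Spread as an integer percentage of the median, rounded to nearest.
spread_pct=$(( (spread * 100 + median / 2) / median ))

# Assumed policy: worst observed run plus 10% headroom, rounded up.
headroom_pct=10
limit=$(( (max * (100 + headroom_pct) + 99) / 100 ))

echo "spread=${spread}MB (${spread_pct}% of median), candidate limit=${limit}MB"
```

The point of the exercise: with a 4 MB spread, the headroom term is a policy choice rather than a guard against run-to-run noise.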
| Var | Default | Purpose |
|---|---|---|
| MEMTRACE_UNIFIED_CACHE_MB | 256 | Single hot-cache budget shared across the embed-vector and backend page-cache layers. moka W-TinyLFU eviction. Set 0 to disable; raise to 512 / 1024 on RAM-rich hosts. Replaces several per-subsystem caches that compounded silently. |
| MEMTRACE_ORT_LOW_RSS | 0 (off) | Set to 1 to disable the ORT CPU memory arena on every Session::builder site. Saves ~3% peak RSS at a ~19% throughput cost on our shipping model sizes; default OFF. Useful only if you've swapped in much smaller custom models. |
Full table: environment-variables.md.
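A minimal sketch of setting the two variables above in a shell before memtrace starts. It assumes only that memtrace reads them from its process environment; how you launch memtrace (wrapper script, service unit, container) is up to your setup and not shown here.

```shell
# Raise the shared hot-cache budget from the default 256 to 512.
export MEMTRACE_UNIFIED_CACHE_MB=512

# Keep the ORT CPU memory arena enabled (the default).
export MEMTRACE_ORT_LOW_RSS=0

# Echo the effective settings so they are visible in startup logs.
echo "MEMTRACE_UNIFIED_CACHE_MB=${MEMTRACE_UNIFIED_CACHE_MB} MEMTRACE_ORT_LOW_RSS=${MEMTRACE_ORT_LOW_RSS}"
```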
- Allocator swap. mimalloc 3 is now the default on every target
except musl and Windows MSVC (which keep the system allocator,
unchanged from v0.4.50). --no-default-features falls back to
jemalloc if you need the rebuild path.
- Shared tree-sitter parser instances. One pooled parser per language instead of one per worker × language. Reduces resident parser-table bytes at steady state.
- String interning + inline-small-string on the dup-heavy node identity fields. Custom serde keeps the JSON wire format byte-equal to v0.4.50.
- Bitmap-backed adjacency primitive in the backend. Live adjacency stores migrate in a follow-up release.
| Target | Allocator | Status |
|---|---|---|
| macOS aarch64 (M-series) | mimalloc 3 | benched + shipped |
| macOS x86_64 (Intel) | mimalloc 3 | cfg-identical to aarch64 |
| Linux glibc x86_64 / aarch64 | mimalloc 3 | shipped |
| Linux musl | system malloc | unchanged from v0.4.50 |
| Windows MSVC | system malloc | unchanged from v0.4.50 |
| Windows GNU | mimalloc 3 | shipped |
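If you want the table above as a quick lookup in a build or deploy script, a sketch along these lines may help. The function name allocator_for_target is hypothetical and not part of memtrace; the patterns assume standard Rust target triples for the platforms listed.

```shell
# Map a Rust target triple to the allocator listed in the table above.
# Hypothetical helper for illustration; not shipped with memtrace.
allocator_for_target() {
  case "$1" in
    *-linux-musl*|*-windows-msvc)
      echo "system malloc" ;;
    aarch64-apple-darwin|x86_64-apple-darwin|*-linux-gnu*|*-windows-gnu*)
      echo "mimalloc 3" ;;
    *)
      # Targets the table does not mention are reported as such.
      echo "not listed" ;;
  esac
}

allocator_for_target x86_64-unknown-linux-musl
allocator_for_target aarch64-apple-darwin
```

On a machine with a Rust toolchain, the host triple can be read from the "host:" line of rustc -vV and passed in as the argument.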
There's a separate bug class around daemon supervision — orphaned
worker processes after an orchestrator's heartbeat goes stale, no
process memory ceiling, no idle-shutdown timer, no spawn-time lock.
A field report surfaced multiple memtrace processes totaling many
GB of RSS after the orchestrator's state file went stale for hours.
v0.4.60 does not address this.
A dedicated fix is tracked for an upcoming release. Mitigation
until it ships: if Activity Monitor / Task Manager shows more
than one memtrace process for the same data dir, kill the older
ones manually — see
troubleshooting.md.
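On macOS or Linux, the same check can be done from a terminal. A minimal sketch, assuming the processes show up with "memtrace" in their command line; it only lists and counts them. It cannot tell which data dir each process serves, so deciding which ones are stale stays a manual call. The helper names are hypothetical.

```shell
# Keep the header row plus any row whose command mentions memtrace.
# The [m] bracket keeps this awk process from matching itself.
filter_memtrace() {
  awk 'NR == 1 || /[m]emtrace/'
}

# Count data rows (everything after the header row).
count_rows() {
  awk 'NR > 1 { n++ } END { print n + 0 }'
}

if command -v ps >/dev/null 2>&1; then
  # pid, elapsed time, resident set size (KB), full command line.
  snapshot=$(ps -A -o pid,etime,rss,args 2>/dev/null | filter_memtrace)
  printf '%s\n' "$snapshot"
  n=$(printf '%s\n' "$snapshot" | count_rows)
  if [ "$n" -gt 1 ]; then
    echo "warning: $n memtrace processes found; check for stale ones" >&2
  fi
fi
```

The elapsed-time column is the useful one here: the longest-running entries are the candidates for "older" processes.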
# npm install
npm install -g memtrace@0.4.60
# cargo install
cargo install memtrace-mcp --version 0.4.60

No data migration required. Existing .memdb directories deserialise
as-is. Drop in MEMTRACE_UNIFIED_CACHE_MB=512 if you have RAM to spare
and want a higher hot-cache hit rate.