v9: Radically improve memory footprint by mourner · Pull Request #258 · mapbox/supercluster

mourner · 2026-05-19T20:07:40Z

Across four independent changes in this PR, building a Supercluster index over 1M points (maxZoom=17) now uses 94% less transient allocation, 57% lower peak heap, and 65% less retained memory, and runs 23% faster, with no public API changes and zero precision drift at GL JS tile params. This is the counterpart of similar improvements made in geojson-vt: mapbox/geojson-vt#191

1M uniform random points sample

metric	baseline	now	Δ
build time	7668 ms	5901 ms	−23%
transient alloc	4681 MB	303 MB	−94%
peak heap	1027 MB	444 MB	−57%
held memory*	730 MB	253 MB	−65%

*held = after dropping the caller's input reference.

Real-world dataset: 1.6M Overture places

metric	before	after	Δ
build time	5782 ms	5671 ms	−2%
transient alloc	3265 MB	347 MB	−89%
peak heap	990 MB	470 MB	−53%
held memory*	642 MB	82 MB	−87%

The build-time delta is modest on this dataset (real geographic distribution leaves no plateau zooms — every level produces fewer clusters than the previous), but the memory story is dramatic: a working set that previously needed nearly 1 GB peak and held 642 MB after build now peaks at 470 MB and retains just 82 MB once the caller releases their input.

What changed

Phase 1 — Int32 slab. The per-point cluster slab is now an Int32Array using a centered encoding (coord − 0.5) × 2³⁰, putting every stored value and every sqDist subtraction inside V8's 31-bit SMI fast path. Working precision improved 64× vs the prior Math.fround path. KDBush also moved to Int32Array storage. The per-zoom slab is sized tight to the actual output count, so slack stays bounded by N across the whole hierarchy.
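
The centered encoding can be sketched roughly as follows. Only the formula (coord − 0.5) × 2³⁰ comes from the description above; the helper names `encode` and `decode` and the use of `Math.round` are assumptions made for illustration, not the actual Supercluster code.

```javascript
// Hypothetical helpers: only the formula (coord - 0.5) * 2^30 is taken from the PR text.
const SCALE = 2 ** 30;

// Map a mercator coordinate in [0, 1] to an integer in [-2^29, 2^29].
// Rounding to nearest is an assumption; the real code may truncate instead.
function encode(coord) {
    return Math.round((coord - 0.5) * SCALE);
}

// Inverse mapping back to the [0, 1] mercator range.
function decode(value) {
    return value / SCALE + 0.5;
}

// Store a few coordinates in an Int32Array slab and read one back.
const slab = new Int32Array(3);
slab[0] = encode(0);
slab[1] = encode(0.5);
slab[2] = encode(0.123456789);

const roundTrip = decode(slab[2]);
const roundTripError = Math.abs(roundTrip - 0.123456789);
```

The quantization step is 2⁻³⁰ everywhere in the range. A float32 has a 24-bit significand, so its spacing for values in [0.5, 1) is 2⁻²⁴, which is presumably where the 64× figure comes from.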

Phase 2 — withinInto typed-array result (KDBush 4.1.0). The ~14M tree.within calls during a 1M build each allocated a growable JS Array for results, a growable JS Array for the DFS stack, and an iterator object at the consumer. KDBush 4.1.0 adds withinInto(qx, qy, r, out) → count with a caller-owned typed array and moves its DFS stack to a module-level Uint32Array(96). Supercluster passes a per-zoom Uint32Array(numItems), sized tight and cache-friendly. This phase alone cut transient alloc from 3.2 GB to 0.5 GB.
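
The caller-owned output buffer pattern might look like the sketch below. It uses a brute-force scan as a stand-in for the spatial index, so `withinIntoBruteForce` is a hypothetical function and not the KDBush implementation; only the idea of writing matches into a preallocated typed array and returning a count is taken from the description above.

```javascript
// Hypothetical stand-in for a spatial index query: a brute-force scan,
// NOT KDBush's tree traversal. Matching indices are written into `out`
// and the number of matches is returned, so no array is allocated per call.
function withinIntoBruteForce(xs, ys, qx, qy, r, out) {
    const r2 = r * r;
    let count = 0;
    for (let i = 0; i < xs.length; i++) {
        const dx = xs[i] - qx;
        const dy = ys[i] - qy;
        if (dx * dx + dy * dy <= r2) out[count++] = i;
    }
    return count;
}

const xs = new Int32Array([0, 10, 20, 30]);
const ys = new Int32Array([0, 0, 0, 0]);

// One result buffer sized to the item count, reused for every query.
const out = new Uint32Array(xs.length);

// First query: points within distance 6 of (5, 0).
const n1 = withinIntoBruteForce(xs, ys, 5, 0, 6, out);
const firstResult = Array.from(out.subarray(0, n1)); // copied here only for inspection

// Second query overwrites the same buffer; only out[0..n2) is meaningful.
const n2 = withinIntoBruteForce(xs, ys, 30, 0, 1, out);
```

The caller must read only the first `count` entries after each call, since the rest of the buffer holds stale values from earlier queries.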

Phase 3 — Drop retained GeoJSON wrappers. The instance no longer pins the caller's points array. We retain only what output paths actually read: props (references to caller-owned properties), coords (Float64 mercator, 16 B/point — needed for drift-free tile coords at extent: 8192, z18), and a lazily-allocated ids array (only if any input has id !== undefined). Released 132 MB of retained memory on the 1M bench.
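
A minimal sketch of this retention strategy, assuming GeoJSON point features as input. The function name `extractRetained` and the returned field layout are hypothetical; the projection helpers use the standard spherical mercator formulas.

```javascript
// Standard spherical mercator projection to the [0, 1] range.
function lngX(lng) {
    return lng / 360 + 0.5;
}
function latY(lat) {
    const sin = Math.sin(lat * Math.PI / 180);
    const y = 0.5 - 0.25 * Math.log((1 + sin) / (1 - sin)) / Math.PI;
    return y < 0 ? 0 : y > 1 ? 1 : y;
}

// Hypothetical: copy out only what output paths need, so the caller's
// feature objects and the input array itself are not retained.
function extractRetained(points) {
    const n = points.length;
    const props = new Array(n);             // references to caller-owned properties
    const coords = new Float64Array(n * 2); // 2 x 8 B = 16 B per point
    let ids = null;                         // allocated only if some feature has an id

    for (let i = 0; i < n; i++) {
        const p = points[i];
        props[i] = p.properties;
        const [lng, lat] = p.geometry.coordinates;
        coords[2 * i] = lngX(lng);
        coords[2 * i + 1] = latY(lat);
        if (p.id !== undefined) {
            if (ids === null) ids = new Array(n);
            ids[i] = p.id;
        }
    }
    return {props, coords, ids};
}

const sharedProps = {name: 'origin'};
const withoutIds = extractRetained([
    {type: 'Feature', properties: sharedProps, geometry: {type: 'Point', coordinates: [0, 0]}}
]);
const withIds = extractRetained([
    {type: 'Feature', properties: {}, geometry: {type: 'Point', coordinates: [0, 0]}},
    {type: 'Feature', id: 'a', properties: {}, geometry: {type: 'Point', coordinates: [180, 0]}}
]);
```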

Phase 4 — Reuse KDBush across plateau zooms. When _cluster returns written === numItems, no cluster branch was ever taken and the output is bit-identical to the input. We set trees[z] = trees[z+1] to share the parent KDBush by reference, leave prev/prevNum unchanged, and recycle the freshly-allocated out slab into the next iteration's allocation. Skips both a KDBush rebuild and an Int32 slab allocation per plateau zoom — biggest impact on real geographic datasets with sparse high-zoom data (−11% ms on the populated-places fixture).
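
The plateau shortcut can be illustrated with a simplified build loop. Everything here is a hypothetical stand-in: `buildHierarchy`, `clusterOnce`, and `buildIndex` are illustrative names, the data is one-dimensional, and slab recycling is omitted. The sketch shows only the decision to share the parent index by reference when a zoom level merges nothing.

```javascript
// Hypothetical sketch: walk zooms from maxZoom down to minZoom and reuse the
// parent index whenever clustering produced as many items as it consumed.
function buildHierarchy(items, minZoom, maxZoom, clusterOnce, buildIndex) {
    const trees = new Array(maxZoom + 2);
    let prev = items;
    trees[maxZoom + 1] = buildIndex(prev);

    for (let z = maxZoom; z >= minZoom; z--) {
        const out = clusterOnce(prev, z);
        if (out.length === prev.length) {
            // Plateau: nothing merged, so this zoom is identical to its parent.
            // Share the index by reference and leave `prev` unchanged.
            trees[z] = trees[z + 1];
            continue;
        }
        trees[z] = buildIndex(out);
        prev = out;
    }
    return trees;
}

// Toy 1D clustering: merge adjacent sorted values closer than a zoom-dependent radius.
function clusterOnce(sorted, z) {
    const r = 8 / 2 ** z;
    const out = [];
    for (let i = 0; i < sorted.length; i++) {
        if (i + 1 < sorted.length && sorted[i + 1] - sorted[i] <= r) {
            out.push((sorted[i] + sorted[i + 1]) / 2);
            i++; // consume the merged neighbour
        } else {
            out.push(sorted[i]);
        }
    }
    return out;
}

// Count how many indexes actually get built.
let builds = 0;
const buildIndex = (pts) => ({id: ++builds, pts});

const trees = buildHierarchy([0, 3, 20, 23], 0, 3, clusterOnce, buildIndex);
```

In this toy run, zooms 3 and 2 merge nothing and share the index built for zoom 4, zoom 1 merges both pairs and gets its own index, and zoom 0 shares zoom 1's index, so only two indexes are built for five levels.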

Compatibility

Public API unchanged. getClusters / getTile / getChildren / getLeaves return the same JSON shape.
Single-point feature coords come from the original Float64 mercator (no precision drift at any extent/zoom callers actually use).
Cluster id encoding now caps at ~67M input points (vs ~2³⁵ before). Documented, no realistic caller affected.

mourner added 8 commits May 18, 2026 23:49

upgrade dev deps

7ba611c

more reliable benchmark with memory stats

cf20790

use Int32 storage with higher precision

2f6bf78

fit coords in SMI fast path, less slack

97b12b7

use alloc-free neighbor search (nice speedup)

d5a2a09

do not retain input shape, only props/ids/coords

731f55e

small code cleanup

bcdad35

reuse parent zoom trees when no clusters form

1b279df

mourner added the "ai" label (AI coding agents co-authored the code) May 19, 2026

mourner changed the title ~~Optimizations~~ v8: Radically improve memory footprint May 19, 2026

mourner changed the title ~~v8: Radically improve memory footprint~~ v9: Radically improve memory footprint May 19, 2026

mourner wants to merge 8 commits into main from optimizations