Opt-in allocator arena release after Model.solve() to bound RSS across sequential solves #720

@MaykThewessen

Summary

Sequential PyPSA-Eur-style chunked solves (weekly LPs, identical structure, N iterations in one process) exhibit monotonic RSS growth until OOM, even after del model + gc.collect() between iterations. Root cause is platform allocator arena retention (glibc on Linux, libSystem on macOS), not Python-level leaks. Proposing a small opt-in helper that releases arena pages back to the kernel after the existing solve path completes.

This is orthogonal to:

Neither addresses the allocator-arena pattern: glibc / libSystem keep freed pages cached, so RSS grows monotonically across the chunk loop even when the Python-side working set is bounded.

Proposal

linopy/_memory.py (~30 LOC):

def release_allocator_pages() -> None:
    """gc.collect() then ask platform allocator to return arenas to kernel.

    Linux:   libc.malloc_trim(0)
    macOS:   libSystem.malloc_zone_pressure_relief(NULL, 0)
    Other:   gc only, silent no-op.
    """

Wired into Model.solve(release_memory: bool = False). Default unchanged. Also exposed at top level (linopy.release_allocator_pages()) so callers in chunked-solve loops can trim mid-pipeline without going through Model.solve.
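To make the proposal concrete, here is a minimal sketch of what the helper body might look like using only `ctypes` from the standard library. The library names (`libc.so.6`, `libSystem.dylib`), the module-level `logger`, and the exact error handling are assumptions for illustration, not the final implementation.

```python
import ctypes
import gc
import logging
import platform

logger = logging.getLogger(__name__)


def release_allocator_pages() -> None:
    """Run gc.collect(), then ask the platform allocator to return free pages."""
    gc.collect()
    system = platform.system()
    try:
        if system == "Linux":
            # glibc: int malloc_trim(size_t pad); pad=0 trims as much as possible.
            libc = ctypes.CDLL("libc.so.6")  # assumed soname; absent on musl
            libc.malloc_trim.argtypes = [ctypes.c_size_t]
            libc.malloc_trim.restype = ctypes.c_int
            libc.malloc_trim(0)
        elif system == "Darwin":
            # size_t malloc_zone_pressure_relief(malloc_zone_t *zone, size_t goal)
            # zone=NULL targets all zones; goal=0 releases as much as possible.
            libsystem = ctypes.CDLL("libSystem.dylib")
            relief = libsystem.malloc_zone_pressure_relief
            relief.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
            relief.restype = ctypes.c_size_t
            relief(None, 0)
        # Any other platform: gc.collect() only, silent no-op.
    except (OSError, AttributeError) as exc:
        # Library or symbol missing: log at debug level and carry on.
        logger.debug("allocator trim unavailable on %s: %s", system, exc)
```

Setting `argtypes` explicitly keeps the `size_t` argument at the right width on 64-bit platforms. In a chunked loop, the call would go after `del model` so the freed blocks are eligible for release.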

Benchmark (macOS arm64, linopy master 37af4ba)

5 sequential solves of an identical 5k var × 2k con dense-block LP via HiGHS, single Python process:

| metric | baseline | release_memory=True | Δ |
| --- | --- | --- | --- |
| RSS post-cleanup, iter 4 (GB) | 4.41 | 2.02 | −54 % |
| ru_maxrss (GB) | 5.30 | 4.32 | −19 % |
| wall-clock total (s) | 30.97 | 32.09 | +3.6 % (noise) |

Baseline RSS climbs monotonically from 3.71 to 4.41 GB across the 5 iterations. With trim, post-cleanup RSS drops back to 1.5–2.0 GB after each solve. Linux numbers via malloc_trim are typically more dramatic; happy to add a Linux benchmark before opening the PR.
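For anyone reproducing the ru_maxrss row across platforms, note that the kernel reports it in different units. Below is a small sketch of a unit-normalizing reader. `peak_rss_gb` is a hypothetical helper name, and this is not necessarily how the numbers above were collected.

```python
import resource
import sys


def peak_rss_gb() -> float:
    """Peak resident set size of the current process, in GB (10**9 bytes)."""
    raw = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    bytes_per_unit = 1 if sys.platform == "darwin" else 1024
    return raw * bytes_per_unit / 1e9


if __name__ == "__main__":
    print(f"peak RSS so far: {peak_rss_gb():.2f} GB")
```

Because ru_maxrss is a high-water mark, it can only show the reduction in peak; the post-cleanup row needs a current-RSS reading taken after each iteration.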

Tests

  • Platform-mocked dispatch (Linux / Darwin / Windows / FreeBSD).
  • Kwarg default backward-compat smoke test.
  • Repeated-call stability (no crashes if libc / libSystem lookup misses).
  • Swallows OSError / AttributeError, debug-logs on miss.
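The dispatch and miss-handling tests listed above could be sketched with `unittest.mock` roughly as follows. The compact `release_allocator_pages` here is a stand-in so the snippet runs on its own. The test names and patched library names are assumptions, and the real tests would import the helper from linopy instead.

```python
import ctypes
import gc
import platform
from unittest import mock


def release_allocator_pages() -> None:
    # Compact stand-in for the proposed helper (illustrative only).
    gc.collect()
    try:
        if platform.system() == "Linux":
            ctypes.CDLL("libc.so.6").malloc_trim(ctypes.c_size_t(0))
        elif platform.system() == "Darwin":
            lib = ctypes.CDLL("libSystem.dylib")
            lib.malloc_zone_pressure_relief(None, ctypes.c_size_t(0))
    except (OSError, AttributeError):
        pass


def test_linux_dispatch():
    fake = mock.Mock()
    with mock.patch("platform.system", return_value="Linux"), \
            mock.patch("ctypes.CDLL", return_value=fake) as cdll:
        release_allocator_pages()
    cdll.assert_called_once_with("libc.so.6")
    fake.malloc_trim.assert_called_once()


def test_darwin_dispatch():
    fake = mock.Mock()
    with mock.patch("platform.system", return_value="Darwin"), \
            mock.patch("ctypes.CDLL", return_value=fake) as cdll:
        release_allocator_pages()
    cdll.assert_called_once_with("libSystem.dylib")
    fake.malloc_zone_pressure_relief.assert_called_once()


def test_other_platform_is_noop():
    # Windows / FreeBSD etc.: no shared library should be loaded at all.
    with mock.patch("platform.system", return_value="Windows"), \
            mock.patch("ctypes.CDLL") as cdll:
        release_allocator_pages()
    cdll.assert_not_called()


def test_lookup_miss_is_swallowed():
    # Simulate a missing libc; the helper must not raise.
    with mock.patch("platform.system", return_value="Linux"), \
            mock.patch("ctypes.CDLL", side_effect=OSError("not found")):
        assert release_allocator_pages() is None
```

Patching `platform.system` and `ctypes.CDLL` keeps the tests platform-independent, so the Linux and Darwin branches can both be exercised on any CI runner.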

Open questions

  1. Naming: release_memory=True on Model.solve vs. top-level linopy.release_allocator_pages() vs. both. Leaning toward both — kwarg for the common single-solve case, public helper for chunked loops where users want to trim between solves without re-entering Model.solve.
  2. Interaction with #699 (Persistent solver: in-place model updates via ModelDiff): once the persistent solver lands, the native solver model + factor stay resident. The Python-side arena cache still grows, so the trim hook stays useful. Agree?
  3. Should the helper also clear m._xCounter / m._cCounter style state on Model.solve(release_memory=True)? Leaning no — out of scope for an allocator helper, separate concern.

Will open PR once direction is confirmed.

Refs: #219, #630, #699
