Skip to content

Commit b2cea69

Browse files
authored
fix(gc): #745 — JSON polyglot RSS regression (v0.5.900) (#750)
* fix(gc): #745 — JSON polyglot RSS regression (v0.5.903) `gc_bump_malloc_trigger` was ratcheting the bytes-trigger up on every gc-suppressed parse. For a single 108 MB parse (json_pipeline_full, v0.5.279's original optimization target) that's correct. For a 50-iter loop of `JSON.parse + discard` (json_polyglot's ~5 MB parses) it suppressed GC entirely — the trigger climbed hundreds of MB above the actual live set before allocation could catch up. Two guards in gc.rs: 1. GC_TRIGGER_BUMPED — bump the trigger at most once per GC cycle. Flag is cleared at the top of gc_collect_inner (covers full, minor, manual gc(), and malloc-count trigger paths). 2. GC_PRE_SUPPRESS_BYTES — gc_suppress snapshots arena_total_bytes() so the bump can compute actual parse growth. Skip the bump if growth is below 32 MB. This cleanly separates json_pipeline_full (108 MB parse, bumps) from json_polyglot (~5 MB parse, no bump). Validation: lazy-roundtrip 250 → 215 MB (-14%), lazy-field-access 407 → 246 MB (-39%) at unchanged wall time. GC fires 2× per polyglot run vs 0× pre-fix. perry-runtime tests 250/0/0. * fix(gc): #745 follow-up — preserve test_memory_json_churn (revised guards) CI run on PR #750 surfaced a regression in test_memory_json_churn.ts: under all three GC profiles (default, mark-sweep, gen-gc+wb) the test went from PASS at ~120 MB pre-fix to FAIL at 461 MB (limit 250 MB). The churn test allocates ~13 KB per iteration across 5k iterations into a deliberately fragmented arena where every block holds both live and dead objects — so a GC cycle sweeps 91-95% of bytes dead but reclaims *zero* blocks, step-doubles on the productive sweep heuristic, raises the trigger, and the cascade compounds. The original bytes-bump correctly deferred GC indefinitely on that shape; my v0.5.900 fix made GC fire and exposed the fragmentation cascade. Revised the guards to gate on the suppressed window's actual arena growth and split tiny-parse vs medium-or-larger parse handling: (1) `GC_PRE_SUPPRESS_BYTES` — gc_suppress snapshots arena_total so gc_bump_malloc_trigger can compute parse_growth. (2) `GC_TRIGGER_BUMPED` — once-per-cycle flag, applied only when parse_growth >= 1 MB (the json_pipeline_full and json_polyglot shapes). Cleared at the top of gc_collect_inner so all collection entry points re-arm. Tiny parses (< 1 MB, the churn shape) bypass the flag and bump every call, which preserves the pre-fix deferred-GC behavior. (3) Step cap at `GC_THRESHOLD_INITIAL_BYTES` (64 MB) — bound the bump's effective step so post-73a48ced step-doubling can't grant hundreds of MB of headroom per bump. Validation after revision: benchmarks/json_polyglot (lazy-tape default): roundtrip peak RSS: 250 → 223 MB (-11%, was -14% under v1) field-access peak RSS: 407 → 305 MB (-25%, was -39% under v1) scripts/run_memory_stability_tests.sh: 18/18 PASS (was 15/18 PASS, 3/18 FAIL under v1) test_memory_json_churn lands at 209 MB (limit 250) Pipeline-full-style probe: ~135 ms, no mid-iterate GC, 217 MB — the json_pipeline_full optimization is preserved because the 108 MB parse blows past the 1 MB tiny-parse threshold and the once-per-cycle flag still allows the first bump. perry-runtime tests: 250/0/0. Versioning: rebased onto current main (v0.5.902 → v0.5.903) since #743 took v0.5.900 and #749 took v0.5.901/v0.5.902 while this PR was in CI. Updated changelog entry text to match the revised approach.
1 parent 50f5548 commit b2cea69

5 files changed

Lines changed: 138 additions & 71 deletions

File tree

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,8 @@
22

33
Detailed changelog for Perry. See CLAUDE.md for concise summaries.
44

5+
## v0.5.903 — fix(gc): #745 — JSON polyglot benchmark RSS regression. **Symptom.** `benchmarks/json_polyglot/` regressed 2-4× on peak RSS vs v0.5.279: `validate-and-roundtrip` 85 → 254 MB, `parse-and-iterate` 100 → 411 MB; the no-lazy-tape variants moved ~2.6× too. Wall time was flat or slightly faster — purely a memory-footprint regression. **Root cause.** Commit `56818086` ("perf(gc): bump bytes trigger after gc-suppressed parse") added a per-parse bytes-trigger bump in `gc_bump_malloc_trigger` to defer GC during the post-parse iterate/rebuild pass of `json_pipeline_full` (single 108 MB parse + 70 MB iterate). The bump uses `bytes_now + GC_STEP_BYTES` with "only raise — never lower" semantics. The optimization was correct for the single-shot pattern but pathological for a *loop* of `JSON.parse + discard`: each iteration's bump ratcheted the trigger by another `step`, and with productive-sweep step-doubling (post-`73a48ced` ECS optimization) the step itself grew toward 1 GB across cycles. `PERRY_GC_DIAG=1` on the `json_polyglot` 50-iter roundtrip showed **zero** GC cycles — the trigger had been pushed hundreds of MB above the actual live set (~5 MB) before allocation could ever catch up. **Fix.** Three guards on the bump in `crates/perry-runtime/src/gc.rs`, gated on the suppressed window's actual arena growth: (1) `GC_PRE_SUPPRESS_BYTES` thread-local — `gc_suppress` snapshots `arena_total_bytes()` at suppress-start; `gc_bump_malloc_trigger` reads it to compute `parse_growth = bytes_now - pre_suppress`. (2) `GC_TRIGGER_BUMPED` thread-local flag — for medium-or-larger parses (`parse_growth >= 1 MB`, the `json_pipeline_full` and `json_polyglot` shapes), the bump may raise the trigger only *once* per GC cycle; the flag is set on a successful raise and cleared at the top of `gc_collect_inner` so all collection entry points (full GC, minor GC via `gc_collect_minor`, manual `gc()`, malloc-count trigger path) re-arm it. Tiny parses (`< 1 MB` of arena growth) bypass the flag and bump every call — this preserves the `test_memory_json_churn` shape (5 k iters × small+large parses into a fragmented arena, where every block holds both live and dead objects so a GC sweep would find 91 %+ bytes dead but reclaim zero blocks and cascade RSS up; the original bytes-bump correctly deferred GC indefinitely on that shape). (3) Step cap at `GC_THRESHOLD_INITIAL_BYTES` (64 MB) — clamps the bump's effective step so post-`73a48ced` step-doubling can't make a single bump grant hundreds of MB of headroom. The cap is a no-op for `json_pipeline_full` since `GC_STEP_BYTES` is at INITIAL on the first call (no prior GC). **Validation.** Repro confirmed at v0.5.898 baseline: 250 MB lazy roundtrip / 407 MB lazy field-access. After fix: **223 MB** lazy roundtrip (-11%) / **305 MB** lazy field-access (-25%). Wall time unchanged (82 ms roundtrip, 403 ms field-access; both faster than the v0.5.279 baseline). `./scripts/run_memory_stability_tests.sh` was failing 3/18 before the fix (`test_memory_json_churn` at 461 MB across all three GC profiles vs 250 MB limit) and now passes all 18; the churn test lands at 209 MB. Pipeline-full-style probe (200 k records, ~26 MB parse + iterate+rebuild) still bumps and runs in ~135 ms with no mid-iterate GC. Full perry-runtime test suite green (250/0/0). **Out of scope.** The `PERRY_JSON_TAPE=0` (no-lazy, opt-in debug knob) path moved 265 → 279 MB — a slight regression from firing GC inside the loop, which the unfixed "never GC, grow linearly" behavior had been hiding. The lazy-tape default (the path the issue title named) is the headline metric and moves the right direction. The remaining ~2.5× gap vs the v0.5.279 baseline on the lazy path is from the `GC_TRIGGER_ABSOLUTE_CEILING` raise (64 → 128 MB in commit `73a48ced`) which is a deliberate ECS optimization and is left untouched per direction. **Version-bump note.** Renumbered v0.5.900 → v0.5.903 because #743 took v0.5.900 (#733 inliner fix) and #749 took v0.5.901/v0.5.902 (#741/#742/#740 stdlib + codegen) ahead of this PR in the merge queue.
6+
57
## v0.5.902 — fix(codegen): #740 follow-up — object-literal class-field aliases. Builds on v0.5.901's class-expr→ClassRef change. Two new shapes now correctly resolve a class-ref read out of an object-literal field back to the underlying class, instead of falling through to the empty-object placeholder. (1) `const O = { Inner: class extends Base {…} }; new O.Inner(args)` — the `NewDynamic { callee: PropertyGet { LocalGet(O), "Inner" }, args }` case in `crates/perry-codegen/src/expr.rs`. (2) `const O = { Inner: class … }; const C = O.Inner; new C(args)` — same shape but with an intermediate `let` binding (the `Stmt::Let { init: Some(PropertyGet { LocalGet(other), prop }) }` arm in `crates/perry-codegen/src/stmt.rs`). Both share a new per-function side table `FnCtx.local_class_field_aliases: HashMap<LocalId, HashMap<String, String>>` populated when `Stmt::Let` sees `init = New { class_name (an __AnonShape), args }` and walks the class's declared field order in parallel with the args — any `Expr::ClassRef(name)` arg becomes a `(local_id, field_name) → class_name` entry. The map is also propagated through `let O2 = O` (LocalGet of a known-shape local) so a re-binding doesn't lose the class info. **Validation.** Minimal repro `const O = { Inner: class extends Base { _tag = "X" } }; new O.Inner({issue:"x"}).issue` now prints `"x"`, matches Node. test740_b (function returns class via O.Inner, called outside) prints `inst._tag: X` correctly. Same 26/28 `test_gap_*` pass, same 20/23 `test_*class*` pass — no regressions. **Still open.** The full Effect DoD (#321) needs runtime parent-constructor dispatch: `class ParseError extends TaggedError("ParseError")` reaches `[3] pe._tag: undefined` because perry's `lower_new` only walks the static `extends_name` parent chain — `RegisterClassParentDynamic` registers the parent at runtime but no code path consults that registry to run the parent's field initializers / constructor. Fixing that means emitting each class's constructor as a separately addressable function and adding a runtime constructor-registry lookup, which is the same shape of work that would unblock `class X extends Factory()` generally. **Version-bump note.** Renumbered v0.5.897 → v0.5.902 to follow this PR's first commit (v0.5.901) after the main-side bumps that intervened.
68

79
## v0.5.901 — fix(stdlib/path,tty,classexpr): close #741 + #742, partial #740. **Version-bump note.** Renumbered from v0.5.896 → v0.5.901 because #735 (v0.5.895–v0.5.898 #665 follow-ups) and #744 (v0.5.899 toml/deno_core bump) landed on main, and #743 (v0.5.900 #733 self-recursive inliner) is queued ahead of this PR. **#741 path module gaps.** Four sibling bugs in `node:path` that were the entire diff in `test-files/test_parity_path.ts` against Node 22. (1) `path.dirname("/")` returned `""` because Rust's `Path::new("/").parent()` is `None`; Node spec says the root's dirname is the root itself. Added an explicit POSIX-root short-circuit in `crates/perry-runtime/src/path.rs::js_path_dirname` — any all-`/` input returns `"/"`, empty input returns `"."`, parent-not-found falls back to `"."` (Node's "no separator" semantics). (2) `path.join("/foo/", "/bar/", "baz")` returned `/bar/baz` because `js_path_join` was implemented via Rust's `PathBuf::join` which resets on absolute segments — that's `path.resolve` semantics, not `path.join` semantics. Per Node docs, `path.join` concatenates all arguments with `/` then normalizes; an absolute middle segment is just another segment. Rewrote `js_path_join` to do raw `"{a}/{b}"` concat (with empty-arg short-circuits) followed by `normalize_str`. (3) `path.matchesGlob(path, pattern)` (Node 22.5+) wasn't implemented and threw the hard manifest-gate error. Added a runtime helper `js_path_matches_glob` that converts the glob to a regex (`*` → `[^/]*`, `**` → `.*`, `?` → `[^/]`, `[...]` classes pass through, special regex chars get escaped) and tests via the existing `regex` crate. (4) `path.toNamespacedPath(path)` is Windows-only on Node — POSIX returns the input unchanged. Perry was unimplemented, threw `TypeError: value is not a function`, and that throw terminated the entire script (so the parity test's lines 38+ silently never ran). Added `js_path_to_namespaced_path` as a POSIX no-op (returns input verbatim). **Plumbing.** Two new HIR variants `PathToNamespacedPath(Box<Expr>)` and `PathMatchesGlob(Box<Expr>, Box<Expr>)` plus a third `PathResolveJoin(Box<Expr>, Box<Expr>)` introduced as a side fix: changing `js_path_join` to the correct concat semantics broke the existing `path.resolve(a, b, c)` lowering, which chained `PathJoin` for the multi-arg path — `PathJoin` was inadvertently doing double duty as the resolve-style "reset on absolute" join. Split the two: the HIR lowering for `path.resolve(a, b, c)` now chains `PathResolveJoin` (which does have reset-on-absolute, via a new `js_path_resolve_join` runtime helper) and the chain wraps a final `PathResolve`. `path.join(a, b, c)` still chains plain `PathJoin` (now correct concat semantics). New entries added to walker.rs, stable_hash.rs (tags 449/450/451), analysis.rs, monomorph.rs, js_transform.rs, type_analysis.rs, collectors.rs, codegen-js/emit.rs, codegen-wasm/emit.rs, runtime_decls.rs, and expr.rs. Also added `method("path", "toNamespacedPath", …)` and `method("path", "matchesGlob", …)` to `perry_api_manifest::entries` so the unimplemented gate stops firing. **Validation.** Issue's 5-line repro matches Node byte-for-byte. Full `test-files/test_parity_path.ts` diff drops from 15 lines to 2 (`posix.join` / `win32.join` — separate sub-namespace work, not in scope for #741). **#742 tty exports + isTTY shape.** Three bugs that were the entire diff in `test-files/test_parity_tty.ts`. (1) `typeof tty.ReadStream`, `typeof tty.WriteStream`, and `typeof tty.isatty` all returned `"undefined"` because reading them as PropertyGet on the `NativeModuleRef("tty")` namespace went through `js_native_module_property_by_name` which only consults the constants dispatcher (none of them are constants). The call-site form `tty.isatty(0)` worked via a dedicated HIR lowering, but the property-read form (`typeof`, `const f = tty.isatty`) returned `undefined`. Fix: at the tail of `js_native_module_property_by_name` (`crates/perry-runtime/src/object.rs`), check `is_native_module_callable_export(module, prop)` — a whitelist currently of `("tty","isatty")`, `("tty","ReadStream")`, `("tty","WriteStream")` — and if matched, synthesize a `BOUND_METHOD_FUNC_PTR` closure capturing the namespace object + method name (same shape `js_native_module_bind_method` creates for ordinary method access). Closures NaN-box with the magic header that `js_value_typeof` recognizes as `"function"`. Whitelist deliberately narrow so `typeof tty.bogusName` still returns `"undefined"`. (2) `process.std{in,out,err}.isTTY` returned `false` (typeof `"boolean"`) when stdout was piped, but Node's docs spec it as `true` when TTY, **`undefined` otherwise** — `isTTY` is a presence-test, not a boolean. Many libraries do `if ("isTTY" in process.stdout)` or `if (process.stdout.isTTY !== undefined)` and got the wrong answer under Perry. Changed `js_process_stdin_isatty` / `js_process_stdout_isatty` / `js_process_stderr_isatty` to return `TAG_UNDEFINED` (not `TAG_FALSE`) when the corresponding fd isn't a TTY. `tty.isatty(fd)` itself still returns a real boolean — that's the documented Node contract. **Validation.** Issue's 5-line repro matches Node byte-for-byte. Full `test-files/test_parity_tty.ts` parity test now passes byte-for-byte (was 6 lines off). **#740 Effect ParseResult.ts crash (partial fix).** The actual root issue tracks `class X extends Factory()<...>` (a class extending a factory-call result) which is the Effect blocker, and requires runtime constructor dispatch the compiler doesn't yet have. What landed here is the *narrower* compiler fix that moves the standalone repro from "throws `TypeError: value is not a function`" to "runs to completion (instance has undefined fields)": class expressions used as values were lowering to `Expr::New { class_name, args: vec![] }` — i.e. creating a zero-arg instance — rather than to `Expr::ClassRef(class_name)`. So `const C = class { ... }` bound `C` to a stillborn instance, and `O.Inner` inside `{ Inner: class { ... } }` likewise. Changed `crates/perry-hir/src/lower.rs`'s `ast::Expr::Class` arm to emit `Expr::ClassRef(synthetic_name)` instead. With the alias-chain propagation already in `Stmt::Let` (codegen/stmt.rs), this makes `const C = class {...}; new C(args)` and `function f() { const C = class {...}; return C; } new f()(args)` work end-to-end (constructor runs with the supplied args). The remaining `O.Inner`-via-PropertyGet case still falls through to the empty-object placeholder in `NewDynamic` because the codegen has no runtime constructor dispatch — the value at the property is the class ref, but `new <local-with-tagged-class-id>` doesn't yet route through a registry. That's the same architectural gap that blocks `class X extends Factory()`, and it's the next layer #740 needs. Effect's standalone repro now prints `[1] before` / `[2] after class def` / `[3] pe._tag: undefined` (no crash, partial fields) rather than the original throw. **Validation.** All 26 / 28 `test_gap_*` tests still pass — same 2 pre-existing failures as main (`test_gap_console_methods.ts`, `test_gap_regexp_advanced.ts` — known categorical gaps in CLAUDE.md). All 20 / 23 `test_*class*.ts` tests still pass — same 3 pre-existing differences as main. No regressions. `cargo test -p perry-runtime --lib` 250 passed, 0 failed. **Out of scope.** (a) Runtime constructor dispatch — `js_new_dynamic(callee_value, args_vec)` that inspects the callee's NaN tag and routes to the right constructor when callee is an INT32-tagged class-id; needed for `new O.Inner(args)` (where `O.Inner` is a class ref read from an object field) and `class X extends Factory()` (where the parent class is a runtime value). Tracked as #740 follow-up. (b) `path.posix.X(...)` / `path.win32.X(...)` sub-namespace method calls — these still print `undefined` because the `is_sub_namespace` flag in expr_call.rs:519 short-circuits the NativeMethodCall lowering and the fall-through path doesn't dispatch back to the corresponding `path.X(...)` for POSIX. Not in scope for #741's stated acceptance (the 5-line repro), but is the last diff in `test_parity_path.ts`. (c) The `BaseEffectError`-via-object-literal pattern that Effect uses for `TaggedError` — needs (a) to dispatch the constructor when reading the class ref back from the object field.

CLAUDE.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
88

99
Perry is a native TypeScript compiler written in Rust that compiles TypeScript source code directly to native executables. It uses SWC for TypeScript parsing and LLVM for code generation.
1010

11-
**Current Version:** 0.5.902
11+
**Current Version:** 0.5.903
1212

1313

1414
## TypeScript Parity Status

0 commit comments

Comments
 (0)