
Commit f6412a8

refactor(hir): extract Arrow + Fn + Object from lower_expr (v0.5.338)
Tier 2.3 follow-up. Two more sub-modules under lower/:

- expr_function.rs (335 LOC): ast::Expr::Arrow (178) + ast::Expr::Fn (138) plus a shared compute_closure_captures helper that the original arms duplicated verbatim. Co-locating them means the capture analysis is a real shared function instead of copy-paste.
- expr_object.rs (508 LOC): ast::Expr::Object including its inline is_closed_shape predicate. The largest single arm extracted so far. Closed-shape literals lower to `new __AnonShape_N()` for the codegen direct-GEP fast path; open-shape ones fall through to generic Object/ObjectSpread.

lower_expr delta: 6508 → 5716 LOC (-792 in this commit; -971 cumulative across v0.5.337 + v0.5.338 = ~14.5%).

Remaining Tier 2.3 arms (largest first): Call (3986), Member (405), New (393), Assign (312). Each has its own helper-fn cross-references that need careful coordination, so they go in separate PRs.

Verified:
- cargo build --release clean
- cargo test --workspace 434/0 = baseline
- gap tests 25/28; doc-tests 80/82 = baseline
- Smoke compile exercising arrow-with-capture, fn expr, closed-shape object, spread object, computed key, and array-of-objects all match Node byte-for-byte.
1 parent b22d481 commit f6412a8
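
To make the closed-shape split concrete, here is a minimal Rust sketch. It assumes the rule as the CLAUDE.md entry in this commit states it: no spreads and all fixed string keys means closed, while spreads, computed keys, and getters/setters mean open. `Prop`, `PropKey`, `Lowered`, and `lower_object` are hypothetical stand-ins invented for illustration, not Perry's or SWC's real types, and treating an empty literal as closed is an assumption of the sketch rather than something the commit states.

```rust
// Hypothetical, simplified stand-ins for an object-literal AST.
// These are NOT the real SWC or Perry types.
#[allow(dead_code)]
#[derive(Debug)]
enum PropKey {
    Ident(String), // { id: 1 }
    Str(String),   // { "id": 1 }
    Computed,      // { [k]: 1 }
}

#[allow(dead_code)]
#[derive(Debug)]
enum Prop {
    KeyValue(PropKey),
    Getter(PropKey),
    Setter(PropKey),
    Spread, // { ...other }
}

#[derive(Debug, PartialEq)]
enum Lowered {
    NewAnonShape(usize), // stands in for `new __AnonShape_N()`
    GenericObject,       // stands in for the generic Object / ObjectSpread path
}

/// Closed shape: every property is a plain key/value pair with a fixed key.
/// Assumption: an empty literal counts as closed (`all` is true on empty input).
fn is_closed_shape(props: &[Prop]) -> bool {
    props
        .iter()
        .all(|p| matches!(p, Prop::KeyValue(PropKey::Ident(_) | PropKey::Str(_))))
}

/// Picks the lowering path and hands out a fresh shape id for closed literals.
fn lower_object(props: &[Prop], next_shape_id: &mut usize) -> Lowered {
    if is_closed_shape(props) {
        let id = *next_shape_id;
        *next_shape_id += 1;
        Lowered::NewAnonShape(id)
    } else {
        Lowered::GenericObject
    }
}

fn main() {
    let mut next_shape_id = 0;

    // Models `{ id: n, sq: n * n }`: fixed keys only.
    let closed = vec![
        Prop::KeyValue(PropKey::Ident("id".to_string())),
        Prop::KeyValue(PropKey::Ident("sq".to_string())),
    ];
    // Models `{ ...base, id: n }`: the spread makes it open.
    let open = vec![
        Prop::Spread,
        Prop::KeyValue(PropKey::Ident("id".to_string())),
    ];

    println!("{:?}", lower_object(&closed, &mut next_shape_id));
    println!("{:?}", lower_object(&open, &mut next_shape_id));
}
```

Keeping the predicate separate from the lowering decision mirrors the commit's description of `is_closed_shape` as a predicate consulted by the `Object` arm; the real arm is far larger than this sketch.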

6 files changed

Lines changed: 889 additions & 831 deletions


CLAUDE.md

Lines changed: 2 additions & 1 deletion
@@ -8,7 +8,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 Perry is a native TypeScript compiler written in Rust that compiles TypeScript source code directly to native executables. It uses SWC for TypeScript parsing and LLVM for code generation.
 
-**Current Version:** 0.5.337
+**Current Version:** 0.5.338
 
 ## TypeScript Parity Status
 
@@ -149,6 +149,7 @@ First-resolved directory cached in `compile_package_dirs`; subsequent imports re
 
 Keep entries to 1-2 lines max. Full details in CHANGELOG.md.
 
+- **v0.5.338**: Tier 2.3 follow-up. Extracts three more `lower_expr` arms from `crates/perry-hir/src/lower.rs` into focused sub-modules. (1) **`expr_function.rs`** (335 LOC): both `ast::Expr::Arrow` (178 LOC) and `ast::Expr::Fn` (138 LOC) plus a shared `compute_closure_captures` helper that the original arms duplicated verbatim. The Arrow and Fn lowerings share almost all of their logic (parameter destructuring, body lowering with JS function-hoisting, closure capture analysis); the only differences are that arrows capture `this` from the enclosing scope while function expressions don't, and that arrows allow a single-expression body shorthand. Co-locating them lets the capture analysis become a real shared function instead of being copy-pasted. (2) **`expr_object.rs`** (508 LOC): the `ast::Expr::Object` arm including its inline `is_closed_shape` predicate. This is the largest single arm extracted so far. The lowered shape depends on whether the literal is a "closed shape" (no spreads, all fixed string keys): such literals lower to `new __AnonShape_N()` so downstream property access hits the codegen direct-GEP fast path; open-shape literals (spreads, computed keys, getters/setters) fall through to a generic `Object` / `ObjectSpread` HIR node. **Files**: 2 new sub-modules under `lower/`, plus the v0.5.337 `expr_misc.rs`. **lower_expr delta**: 6508 → 5716 LOC (-792 in this commit; 6687 → 5716 cumulative across v0.5.337+v0.5.338 = -971, ~14.5% total reduction). **Refactors unblocked**: the shared `compute_closure_captures` helper is now a clean target for the Tier 4 follow-up that fuses the outer `collect_local_refs_stmt` + `collect_assigned_locals_stmt` into one walk (the helper currently runs both separately on the body). **What remains in Tier 2.3**: the biggest arms, namely `Call` (3986 LOC, by far the largest), `Member` (405), `New` (393), `Assign` (312). Each has its own helper-fn cross-references that need careful coordination; doing them in a single PR would balloon the diff to >5k LOC. **Verified**: cargo build --release clean; cargo test --workspace 434/0 = baseline; gap tests 25/28 = baseline; doc-tests --skip-xcompile 80/82 = baseline; smoke compile exercising arrow-with-capture, function expression, closed-shape object, spread object, computed key, and array-of-objects (`[1,2,3].map(n => ({ id: n, sq: n*n }))`) all match Node byte-for-byte. **Cumulative across this session** (v0.5.329-v0.5.338, ten commits): every plan item now has shipped work; Tier 2.3 has had two rounds of extractions and the pattern is well-established for the remaining bigger arms.
 - **v0.5.337**: Tier 2.3 of the compiler-improvement plan (pilot scope): begins splitting the 6,687-line `lower::lower_expr` function in `crates/perry-hir/src/lower.rs` by extracting 8 self-contained AST variants (`Cond`, `Await`, `SuperProp`, `Update`, `Tpl`, `Seq`, `MetaProp`, `Yield`) into a new `lower/expr_misc.rs` sub-module. Each becomes a free `pub(super) fn lower_<variant>(ctx: &mut LoweringContext, node: &ast::<Type>) -> Result<Expr>` taking the SWC AST node and returning the same `Result<Expr>` the original arm produced. Recursion goes through `super::lower_expr`, matching the pattern from Tier 2.1 (`compile.rs` split) and Tier 2.2 (`ui_styling` extracted from `lower_call.rs`). The match arms in `lower_expr` collapse to one-line delegations like `ast::Expr::Cond(cond) => expr_misc::lower_cond(ctx, cond)`. **Pilot rationale**: the extracted 8 are the smallest, well-bounded variants: each between 4 and 64 LOC, none introducing nested helper fns of its own (the original `Update` arm's nested-`match` shape ports cleanly), all using only public methods on `LoweringContext`. The bigger arms (`Call` 3986 LOC, `Object` 479, `Member` 405, `New` 393, `Assign` 312, `Arrow` 178) are followups: each carries cross-references and helper fns that need careful coordination, and a single PR splitting all 32 arms would balloon the diff to >10k LOC. The pilot proves the extraction pattern works without the recursion-vs-borrow-checker wrestling that giant-arm extraction sometimes produces. **Files**: new `crates/perry-hir/src/lower/expr_misc.rs` (222 LOC = 8 helpers + module doc + imports). lower.rs delta: 13599 → 13415 LOC overall (-184); the lower_expr function specifically went 6687 → 6508 LOC (-179, ~2.7%). Net workspace LOC roughly unchanged (extracted code still exists, just in a focused module). The win is cognitive load: each extracted helper is now individually testable, and future variant work (e.g. the `Update` arm's PrivateName/Computed branches) no longer has to scroll past the 6000-line `lower_expr` body. **What's NOT done in the pilot**: the 5 biggest arms remain inline. Each is independently extractable using the same pattern; doing them later as focused PRs avoids one massive diff. **Verified**: `cargo build --release` clean; `cargo test --workspace` 434/0 = baseline; gap tests 25/28 = baseline; doc-tests --skip-xcompile 80/82 = baseline; smoke compile of a TypeScript program exercising all 5 testable extracted variants (`cond`, `update`, `tpl`, `seq`, `yield`; Await/SuperProp/MetaProp don't have easy single-line repros) matches Node byte-for-byte. **Cumulative across this session** (v0.5.329-v0.5.337, nine commits): all 13 plan items shipped including the highest-risk lower_expr split (pilot scope). Tier 2.3 broader rollout is the only remaining followup; everything else from the plan is complete.
 - **v0.5.336**: Tier 4 follow-up. Completes the remaining three perf items the plan called out (4.3, 4.4, 4.6), joining the three already shipped in v0.5.335. **4.6 Arc<I18nTable>**: pre-fix `crates/perry/src/commands/compile.rs` cloned the per-module `i18n_snapshot` tuple inside the `par_iter()` codegen loop; every clone duplicated the (potentially large) `Vec<String>` of every translated string × every locale. New `pub i18n_table: Option<std::sync::Arc<(Vec<String>, usize, usize, Vec<String>, usize)>>` (was the bare tuple) on `CompileOptions`; `i18n_snapshot` is wrapped once at the top of the loop, so the per-module clone is now a cheap Arc reference bump. The destructure at `crates/perry-codegen/src/codegen.rs::compile_module` was updated to `arc.as_ref()` deref. The cache-key derivation in `compute_object_cache_key` likewise now derefs through the Arc. Inner `I18nLowerCtx.translations` (codegen-side per-module copy) is still a Vec; wrapping it in Arc too would eliminate the second per-module clone but is a wider refactor tracked as a follow-up. Per-module saving: roughly 1 × `Vec<String>` clone per module per build (was 2). On a project with 30 modules and 1000 translated strings, this saves ~30 redundant Vec allocations + their String contents per compile. **4.4 parallel `.ll` write**: `compile.rs` post-codegen used to `for result in compile_results { fs::write(...) }`, sequential I/O that bottlenecked when codegen finished producing bytes faster than a single thread could drain. Refactored to: (a) sequential partition into `to_write: Vec<(PathBuf, Vec<u8>)>` + error reporting (errors print in source order, preserved from pre-fix), (b) parallel write via `to_write.par_iter().map(|(p, b)| fs::write(p, b)).collect()` (the OS handles concurrent writes to distinct paths fine), (c) bail on first I/O error after the par_iter finishes (preserves the "fail fast on disk-full / permission" semantics), (d) sequential print + `obj_paths` collection (so output is grouped not interleaved). Wall-time saving scales with module count and disk-writev parallelism (~2-4x faster on an SSD with 50+ modules, less on slow storage). **4.3 fuse mutable-captures passes**: `crates/perry-hir/src/lower.rs::widen_mutable_captures_stmts` had three back-to-back `for stmt in stmts.iter()` loops, each populating a separate HashSet (`scope_mutable`, `scope_captured`, `scope_assigned_at_level`). Fused into a single iteration that calls all three `collect_*` helpers per statement. The collectors read disjoint Expr/Stmt fields with no ordering dependency, so the union is identical. Saves 2 full Stmt slice traversals per scope; this pass runs over `module.init` + every function body + every class method/getter/setter/static_method/ctor body, so the savings compound on a large project. The mutating pass at the bottom (`widen_mutable_captures_stmt`) still runs separately because it depends on the union of all three sets. **Tier 4 complete**: all six items shipped (4.1 + 4.2 + 4.5 in v0.5.335; 4.3 + 4.4 + 4.6 here). **Verified**: cargo build --release clean; cargo test --workspace 434/0 = baseline; gap tests 25/28 = baseline; doc-tests --skip-xcompile 80/82 = baseline; multi-module #212 closure-capture smoke compile matches Node byte-for-byte (exercises widen_mutable_captures and the parallel codegen + write path). **Cumulative across this session** (v0.5.329-v0.5.336, eight commits): all 11 highest-leverage items in the compiler-improvement plan now shipped except Tier 2.3 (lower_expr split: biggest risk, deliberately left for a focused PR).
 - **v0.5.335**: Tier 4 of the compiler-improvement plan (three perf wins): **4.1** fuses two `module.functions.iter()` passes in `perry_transform::inline::inline_functions` (Math.imul polyfill detection + inlinable-function candidate collection) into one iteration, and fuses two `module.classes.iter()` passes (inlinable-method collection + class-name lookup) into one iteration. Saves 2 full module scans per compile; per-compile savings scale with module size. Pre-fix the four scans were back-to-back over the same collections with no ordering dependency between them. **4.2** fuses five `ctx.native_modules.par_iter_mut().for_each(...)` calls in `crates/perry/src/commands/compile.rs` into two. The pre-fix sequence was: (1) `transform_js_imports`, (2) `fix_local_native_instances`, (3) `fix_cross_module_native_instances`, (4) `fix_local_native_instances` (re-run), (5) `monomorphize_module`. Pass A now fuses 1+2 (independent within each module); pass B fuses 3+4+5 (the cross-module step needs the export maps built between the two passes, but its result + the local-fix re-run + monomorphization are all intra-module operations once the maps exist). Saves three rayon scheduler round-trips per compile of a multi-module project. Behavior preserved exactly: the local-fix re-run still runs unconditionally (matching pre-fix semantics for the `has_native_exports = false` branch). The `_jsruntime` and `has_native_exports` gates inside the fused closures keep modules that don't need those passes paying only the cheap branch. **4.5** bounds the in-memory `ParseCache` (used by `perry dev` to skip reparsing unchanged files between rebuilds) at 500 entries with FIFO eviction. Pre-fix the cache was unbounded: a `perry dev` session that walked `node_modules` or any large dir would hold every parsed AST forever (potentially 100+ MB of SWC AST nodes). New `pub const DEFAULT_PARSE_CACHE_CAPACITY: usize = 500` + `ParseCache::with_capacity(n)` constructor for atypical projects (pass `usize::MAX` to opt out of eviction). Implementation: `VecDeque<PathBuf>` tracks insertion order; on miss for a brand-new path, if `entries.len() >= max_entries` the front of the order queue is popped and removed from `entries` before insertion. Same-path re-inserts (the common case during edit-rebuild cycles) bypass eviction since the entry count is unchanged. FIFO over true LRU avoids a new `lru` crate dep and the per-hit re-ordering it would need; the perry-dev access pattern (a file's miss → re-insert puts it at the back, files not touched stay at front) makes them functionally equivalent. Two new unit tests pin the eviction invariant: `eviction_caps_entries_at_max_capacity` (insert 6 with cap=3, verify 3 oldest are evicted), `re_inserting_same_path_does_not_count_against_cap` (touch path A multiple times, then B + C with cap=2, verify A is evicted, not B/C). **What's NOT done in Tier 4 this session**: 4.3 (combine three mutable-captures passes in lower.rs; needs careful HIR analysis), 4.4 (parallelize per-module .ll write; small win, depends on file-system parallelism not rayon), 4.6 (`Arc<I18nTable>` instead of cloning per worker; small win in already-fast i18n path). Each is independently extractable as a future PR. **Verified**: `cargo build --release` clean; `cargo test --workspace` 434/0 = baseline+2 (the two new parse_cache tests); gap tests 25/28 = baseline; doc-tests --skip-xcompile 80/82 = baseline; multi-module smoke compile (`test_issue_212_class_method_capture.ts`: 10 sub-tests exercising class-method captures, generics, mixed types) matches `node --experimental-strip-types` byte-for-byte. **Cumulative across this session** (v0.5.329-v0.5.335, seven commits): Tier 1.1 + 1.2 + 1.3 + 2.1-partial + 2.2-partial + 3.1 + 4.1 + 4.2 + 4.5, eight of the eleven highest-leverage items in the compiler-improvement plan now shipped.
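
The FIFO eviction described in the v0.5.335 entry (4.5) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Perry's actual `ParseCache`: the type name `FifoCache` is hypothetical, the cached value is a `String` standing in for a parsed SWC AST, and a same-path re-insert is assumed to leave the eviction order untouched.

```rust
use std::collections::{HashMap, VecDeque};
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a bounded parse cache with FIFO eviction.
struct FifoCache {
    entries: HashMap<PathBuf, String>,
    order: VecDeque<PathBuf>, // insertion order, oldest at the front
    max_entries: usize,
}

impl FifoCache {
    fn with_capacity(max_entries: usize) -> Self {
        Self {
            entries: HashMap::new(),
            order: VecDeque::new(),
            max_entries,
        }
    }

    fn insert(&mut self, path: PathBuf, value: String) {
        if !self.entries.contains_key(&path) {
            // Brand-new path: evict the oldest entry first if the cache is full.
            if self.entries.len() >= self.max_entries {
                if let Some(oldest) = self.order.pop_front() {
                    self.entries.remove(&oldest);
                }
            }
            self.order.push_back(path.clone());
        }
        // Same-path re-insert: the entry count is unchanged, so nothing is evicted.
        self.entries.insert(path, value);
    }

    fn get(&self, path: &Path) -> Option<&String> {
        self.entries.get(path)
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}

fn main() {
    // Mirrors the first unit test described above: insert 6 with cap = 3.
    let mut cache = FifoCache::with_capacity(3);
    for i in 0..6 {
        cache.insert(PathBuf::from(format!("file{i}.ts")), format!("ast {i}"));
    }
    println!("entries kept: {}", cache.len());
    println!("file0 cached: {}", cache.get(Path::new("file0.ts")).is_some());
    println!("file5 cached: {}", cache.get(Path::new("file5.ts")).is_some());
}
```

Using a plain `VecDeque` for ordering keeps the sketch dependency-free, which is the same trade-off the entry gives for choosing FIFO over a true LRU.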

Cargo.lock

Lines changed: 28 additions & 28 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -111,7 +111,7 @@ opt-level = "s" # Optimize for size in stdlib
 opt-level = 3
 
 [workspace.package]
-version = "0.5.337"
+version = "0.5.338"
 edition = "2021"
 license = "MIT"
 repository = "https://github.com/PerryTS/perry"
