You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
refactor(hir): extract Arrow + Fn + Object from lower_expr (v0.5.338)
Tier 2.3 follow-up. Two more sub-modules under lower/:
- expr_function.rs (335 LOC) — ast::Expr::Arrow (178) + ast::Expr::Fn
(138) plus a shared compute_closure_captures helper that the original
arms duplicated verbatim. Co-locating them means the capture analysis
is a real shared function instead of copy-paste.
- expr_object.rs (508 LOC) — ast::Expr::Object including its inline
is_closed_shape predicate. The largest single arm extracted so far.
Closed-shape literals lower to `new __AnonShape_N()` for the codegen
direct-GEP fast path; open-shape ones fall through to generic
Object/ObjectSpread.
lower_expr delta: 6508 → 5716 LOC (-792 in this commit; -971 cumulative
across v0.5.337 + v0.5.338 = ~14.5%).
Remaining Tier 2.3 arms (largest first): Call (3986), Member (405),
New (393), Assign (312). Each has its own helper-fn cross-references
that need careful coordination — separate PRs.
Verified:
- cargo build --release clean
- cargo test --workspace 434/0 = baseline
- gap tests 25/28; doc-tests 80/82 = baseline
- Smoke compile exercising arrow-with-capture, fn expr, closed-shape
object, spread object, computed key, and array-of-objects all match
Node byte-for-byte.
Copy file name to clipboardExpand all lines: CLAUDE.md
+2-1Lines changed: 2 additions & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -8,7 +8,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
8
8
9
9
Perry is a native TypeScript compiler written in Rust that compiles TypeScript source code directly to native executables. It uses SWC for TypeScript parsing and LLVM for code generation.
10
10
11
-
**Current Version:** 0.5.337
11
+
**Current Version:** 0.5.338
12
12
13
13
## TypeScript Parity Status
14
14
@@ -149,6 +149,7 @@ First-resolved directory cached in `compile_package_dirs`; subsequent imports re
149
149
150
150
Keep entries to 1-2 lines max. Full details in CHANGELOG.md.
151
151
152
+
- **v0.5.338** — Tier 2.3 follow-up: extracts three more `lower_expr` arms from `crates/perry-hir/src/lower.rs` into focused sub-modules. (1) **`expr_function.rs`** (335 LOC) — both `ast::Expr::Arrow` (178 LOC) and `ast::Expr::Fn` (138 LOC) plus a shared `compute_closure_captures` helper that the original arms duplicated verbatim. The Arrow + Fn lowering shares almost all of its logic (parameter destructuring, body lowering with JS function-hoisting, closure capture analysis); the only differences are arrows capture `this` from the enclosing scope while function expressions don't, and arrows allow a single-expression body shorthand. Co-locating them lets the capture analysis become a real shared function instead of being copy-pasted. (2) **`expr_object.rs`** (508 LOC) — the `ast::Expr::Object` arm including its inline `is_closed_shape` predicate. This is the largest single arm extracted so far. The lowered shape depends on whether the literal is a "closed shape" (no spreads, all fixed string keys) — such literals lower to `new __AnonShape_N()` so downstream property access hits the codegen direct-GEP fast path; open-shape literals (spreads, computed keys, getters/setters) fall through to a generic `Object` / `ObjectSpread` HIR node. **Files**: 2 new sub-modules under `lower/`, plus the v0.5.337 `expr_misc.rs`. **lower_expr delta**: 6508 → 5716 LOC (-792 in this commit; 6687 → 5716 cumulative across v0.5.337+v0.5.338 = -971, ~14.5% total reduction). **Unblocked refactors enabled**: the shared `compute_closure_captures` helper is now a clean target for the Tier 4 follow-up that fuses outer `collect_local_refs_stmt` + `collect_assigned_locals_stmt` into one walk (currently runs both separately on the body). **What remains in Tier 2.3**: the biggest arms — `Call` (3986 LOC, by far the largest), `Member` (405), `New` (393), `Assign` (312). Each has its own helper-fn cross-references that need careful coordination; doing them in a single PR would balloon the diff to >5k LOC. **Verified**: cargo build --release clean; cargo test --workspace 434/0 = baseline; gap tests 25/28 = baseline; doc-tests --skip-xcompile 80/82 = baseline; smoke compile exercising arrow-with-capture, function expression, closed-shape object, spread object, computed key, and array-of-objects (`[1,2,3].map(n => ({ id: n, sq: n*n }))`) all match Node byte-for-byte. **Cumulative across this session** (v0.5.329-v0.5.338, ten commits): all plan items have shipped work; Tier 2.3 has now had two rounds of extractions and the pattern is well-established for the remaining bigger arms.
152
153
- **v0.5.337** — Tier 2.3 of the compiler-improvement plan (pilot scope): begins splitting the 6,687-line `lower::lower_expr` function in `crates/perry-hir/src/lower.rs` by extracting 8 self-contained AST variants — `Cond`, `Await`, `SuperProp`, `Update`, `Tpl`, `Seq`, `MetaProp`, `Yield` — into a new `lower/expr_misc.rs` sub-module. Each becomes a free `pub(super) fn lower_<variant>(ctx: &mut LoweringContext, node: &ast::<Type>) -> Result<Expr>` taking the SWC AST node and returning the same `Result<Expr>` the original arm produced. Recursion goes through `super::lower_expr`, matching the pattern from Tier 2.1 (`compile.rs` split) and Tier 2.2 (`ui_styling` extracted from `lower_call.rs`). The match arms in `lower_expr` collapse to one-line delegations like `ast::Expr::Cond(cond) => expr_misc::lower_cond(ctx, cond)`. **Pilot rationale**: the extracted 8 are the smallest, well-bounded variants — each between 4 and 64 LOC, none introducing nested helper fns of its own (the original `Update` arm's nested-`match` shape ports cleanly), all using only public methods on `LoweringContext`. The bigger arms (`Call` 3986 LOC, `Object` 479, `Member` 405, `New` 393, `Assign` 312, `Arrow` 178) are followups: each carries cross-references and helper fns that need careful coordination, and a single PR splitting all 32 arms would balloon the diff to >10k LOC. The pilot proves the extraction pattern works without the recursion-vs-borrow-checker wrestling that giant-arm extraction sometimes produces. **Files**: new `crates/perry-hir/src/lower/expr_misc.rs` (222 LOC = 8 helpers + module doc + imports). lower.rs delta: 13599 → 13415 LOC overall (-184); the lower_expr function specifically went 6687 → 6508 LOC (-179, ~2.7%). Net workspace LOC roughly unchanged (extracted code still exists, just in a focused module). The win is cognitive load: each extracted helper is now individually testable, future variant work (e.g. the `Update` arm's PrivateName/Computed branches) doesn't have to scroll past the 6000-line `lower_expr` body. **What's NOT done in the pilot**: the 5 biggest arms remain inline. Each is independently extractable using the same pattern; doing them later as focused PRs avoids one massive diff. **Verified**: `cargo build --release` clean; `cargo test --workspace` 434/0 = baseline; gap tests 25/28 = baseline; doc-tests --skip-xcompile 80/82 = baseline; smoke compile of a TypeScript program exercising all 5 testable extracted variants (`cond`, `update`, `tpl`, `seq`, `yield` — Await/SuperProp/MetaProp don't have easy single-line repros) matches Node byte-for-byte. **Cumulative across this session** (v0.5.329-v0.5.337, nine commits): all 13 plan items shipped including the highest-risk lower_expr split (pilot scope). Tier 2.3 broader rollout is the only remaining followup; everything else from the plan is complete.
153
154
- **v0.5.336** — Tier 4 follow-up: completes the remaining three perf items the plan called out (4.3, 4.4, 4.6), now matching the four already shipped in v0.5.335. **4.6 Arc<I18nTable>**: pre-fix `crates/perry/src/commands/compile.rs` cloned the per-module `i18n_snapshot` tuple inside the `par_iter()` codegen loop — every clone duplicated the (potentially large) `Vec<String>` of every translated string × every locale. New `pub i18n_table: Option<std::sync::Arc<(Vec<String>, usize, usize, Vec<String>, usize)>>` (was the bare tuple) on `CompileOptions`; `i18n_snapshot` is wrapped once at the top of the loop, the per-module clone is now a cheap Arc reference bump. The destructure at `crates/perry-codegen/src/codegen.rs::compile_module` was updated to `arc.as_ref()` deref. The cache-key derivation in `compute_object_cache_key` likewise now derefs through the Arc. Inner `I18nLowerCtx.translations` (codegen-side per-module copy) is still a Vec — wrapping it in Arc too would eliminate the second per-module clone but is a wider refactor tracked as a follow-up. Per-module saving: roughly 1 × `Vec<String>` clone per module per build (was 2). On a project with 30 modules and 1000 translated strings, this saves ~30 redundant Vec allocations + their String contents per compile. **4.4 parallel `.ll` write**: `compile.rs` post-codegen used to `for result in compile_results { fs::write(...) }` — sequential I/O that bottlenecked when codegen finished producing bytes faster than a single thread could drain. Refactored to: (a) sequential partition into `to_write: Vec<(PathBuf, Vec<u8>)>` + error reporting (errors print in source order, preserved from pre-fix), (b) parallel write via `to_write.par_iter().map(|(p, b)| fs::write(p, b)).collect()` — the OS handles concurrent writes to distinct paths fine, (c) bail on first I/O error after the par_iter finishes (preserves the "fail fast on disk-full / permission" semantics), (d) sequential print + `obj_paths` collection (so output is grouped not interleaved). Wall-time saving scales with module count and disk-writev parallelism (~2-4x faster on a SSD with 50+ modules, less on slow storage). **4.3 fuse mutable-captures passes**: `crates/perry-hir/src/lower.rs::widen_mutable_captures_stmts` had three back-to-back `for stmt in stmts.iter()` loops, each populating a separate HashSet (`scope_mutable`, `scope_captured`, `scope_assigned_at_level`). Fused into a single iteration that calls all three `collect_*` helpers per statement. The collectors read disjoint Expr/Stmt fields with no ordering dependency, so the union is identical. Saves 2 full Stmt slice traversals per scope; this pass runs over `module.init` + every function body + every class method/getter/setter/static_method/ctor body, so the savings compound on a large project. The mutating pass at the bottom (`widen_mutable_captures_stmt`) still runs separately because it depends on the union of all three sets. **Tier 4 complete**: all six items shipped (4.1 + 4.2 + 4.5 in v0.5.335; 4.3 + 4.4 + 4.6 here). **Verified**: cargo build --release clean; cargo test --workspace 434/0 = baseline; gap tests 25/28 = baseline; doc-tests --skip-xcompile 80/82 = baseline; multi-module #212 closure-capture smoke compile matches Node byte-for-byte (exercises widen_mutable_captures and the parallel codegen + write path). **Cumulative across this session** (v0.5.329-v0.5.336, eight commits): all 11 highest-leverage items in the compiler-improvement plan now shipped except Tier 2.3 (lower_expr split — biggest risk, deliberately left for a focused PR).
154
155
- **v0.5.335** — Tier 4 of the compiler-improvement plan (three perf wins): **4.1** fuses two `module.functions.iter()` passes in `perry_transform::inline::inline_functions` (Math.imul polyfill detection + inlinable-function candidate collection) into one iteration, and fuses two `module.classes.iter()` passes (inlinable-method collection + class-name lookup) into one iteration. Saves 2 full module scans per compile; per-compile savings scale with module size. Pre-fix the four scans were back-to-back over the same collections with no ordering dependency between them. **4.2** fuses five `ctx.native_modules.par_iter_mut().for_each(...)` calls in `crates/perry/src/commands/compile.rs` into two. The pre-fix sequence was: (1) `transform_js_imports`, (2) `fix_local_native_instances`, (3) `fix_cross_module_native_instances`, (4) `fix_local_native_instances` (re-run), (5) `monomorphize_module`. Pass A now fuses 1+2 (independent within each module); pass B fuses 3+4+5 (the cross-module step needs the export maps built between the two passes, but its result + the local-fix re-run + monomorphization are all intra-module operations once the maps exist). Saves three rayon scheduler round-trips per compile of a multi-module project. Behavior preserved exactly: the local-fix re-run still runs unconditionally (matching pre-fix semantics for the `has_native_exports = false` branch). The `_jsruntime` and `has_native_exports` gates inside the fused closures keep modules that don't need those passes paying only the cheap branch. **4.5** bounds the in-memory `ParseCache` (used by `perry dev` to skip reparsing unchanged files between rebuilds) at 500 entries with FIFO eviction. Pre-fix the cache was unbounded — a `perry dev` session that walked `node_modules` or any large dir would hold every parsed AST forever (potentially 100+ MB of SWC AST nodes). New `pub const DEFAULT_PARSE_CACHE_CAPACITY: usize = 500` + `ParseCache::with_capacity(n)` constructor for atypical projects (pass `usize::MAX` to opt out of eviction). Implementation: `VecDeque<PathBuf>` tracks insertion order; on miss for a brand-new path, if `entries.len() >= max_entries` the front of the order queue is popped and removed from `entries` before insertion. Same-path re-inserts (the common case during edit-rebuild cycles) bypass eviction since the entry count is unchanged. FIFO over true LRU avoids a new `lru` crate dep and the per-hit re-ordering it would need; the perry-dev access pattern (a file's miss → re-insert puts it at the back, files not touched stay at front) makes them functionally equivalent. Two new unit tests pin the eviction invariant: `eviction_caps_entries_at_max_capacity` (insert 6 with cap=3, verify 3 oldest are evicted), `re_inserting_same_path_does_not_count_against_cap` (touch path A multiple times, then B + C with cap=2, verify A is evicted — not B/C). **What's NOT done in Tier 4 this session**: 4.3 (combine three mutable-captures passes in lower.rs — needs careful HIR analysis), 4.4 (parallelize per-module .ll write — small win, depends on file-system parallelism not rayon), 4.6 (`Arc<I18nTable>` instead of cloning per worker — small win in already-fast i18n path). Each is independently extractable as a future PR. **Verified**: `cargo build --release` clean; `cargo test --workspace` 434/0 = baseline+2 (the two new parse_cache tests); gap tests 25/28 = baseline; doc-tests --skip-xcompile 80/82 = baseline; multi-module smoke compile (`test_issue_212_class_method_capture.ts` — 10 sub-tests exercising class-method captures, generics, mixed types) matches `node --experimental-strip-types` byte-for-byte. **Cumulative across this session** (v0.5.329-v0.5.335, seven commits): Tier 1.1 + 1.2 + 1.3 + 2.1-partial + 2.2-partial + 3.1 + 4.1 + 4.2 + 4.5 — eight of the eleven highest-leverage items in the compiler-improvement plan now shipped.
0 commit comments