
Commit 7ef0f7a

feat: multi-threading support and parallel compiler pipeline (v0.4.0)
perry/thread module with three primitives:
- parallelMap(array, fn): data-parallel array processing across CPU cores
- parallelFilter(array, fn): data-parallel array filtering across CPU cores
- spawn(fn): background OS thread returning Promise

Threading infrastructure:
- SerializedValue deep-copy for safe cross-thread value transfer
- Thread-local arenas with Drop cleanup for worker threads
- Promise integration via PENDING_THREAD_RESULTS queue
- Compile-time mutable capture rejection for thread safety

Parallel compiler pipeline (rayon):
- Module codegen runs across all CPU cores
- Transform passes (JS imports, native instances, monomorphization) parallelized
- nm symbol scanning parallelized

Runtime improvements:
- Array.sort() upgraded from O(n²) insertion sort to O(n log n) TimSort hybrid

Documentation:
- New "Multi-Threading" section (4 pages) in docs
- Updated introduction, hello-world, and limitations pages
1 parent 00e466d commit 7ef0f7a

21 files changed

Lines changed: 2441 additions & 345 deletions


CLAUDE.md

Lines changed: 30 additions & 55 deletions
@@ -8,7 +8,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
88

99
Perry is a native TypeScript compiler written in Rust that compiles TypeScript source code directly to native executables. It uses SWC for TypeScript parsing and Cranelift for code generation.
1010

11-
**Current Version:** 0.3.2
11+
**Current Version:** 0.4.0
1212

1313
## Workflow Requirements
1414

@@ -36,13 +36,13 @@ TypeScript (.ts) → Parse (SWC) → AST → Lower → HIR → Transform → Cod
3636

3737
| Crate | Purpose |
3838
|-------|---------|
39-
| **perry** | CLI driver |
39+
| **perry** | CLI driver (parallel module codegen via rayon) |
4040
| **perry-parser** | SWC wrapper for TypeScript parsing |
4141
| **perry-types** | Type system definitions |
4242
| **perry-hir** | HIR data structures (`ir.rs`) and AST→HIR lowering (`lower.rs`) |
4343
| **perry-transform** | IR passes (closure conversion, async lowering, inlining) |
4444
| **perry-codegen** | Cranelift-based native code generation |
45-
| **perry-runtime** | Runtime: value.rs, object.rs, array.rs, string.rs, gc.rs, arena.rs, etc. |
45+
| **perry-runtime** | Runtime: value.rs, object.rs, array.rs, string.rs, gc.rs, arena.rs, thread.rs |
4646
| **perry-stdlib** | Node.js API support (mysql2, redis, fetch, fastify, ws, etc.) |
4747
| **perry-ui** / **perry-ui-macos** / **perry-ui-ios** | Native UI (AppKit/UIKit) |
4848
| **perry-jsruntime** | JavaScript interop via QuickJS |
@@ -66,6 +66,18 @@ Key functions: `js_nanbox_string/pointer/bigint`, `js_nanbox_get_pointer`, `js_g
6666

6767
Mark-sweep GC in `crates/perry-runtime/src/gc.rs` with conservative stack scanning. Arena objects (arrays, objects) discovered by linear block walking (zero per-alloc tracking). Malloc objects (strings, closures, promises, bigints, errors) tracked in thread-local Vec. Triggers on new arena block allocation (~8MB) or explicit `gc()` call. 8-byte GcHeader per allocation.
6868

69+
## Threading (`perry/thread`)
70+
71+
User code is single-threaded by default. `perry/thread` module provides three primitives with compile-time safety (no mutable captures allowed):
72+
73+
- **`parallelMap(array, fn)`** — data-parallel array processing across all CPU cores
74+
- **`parallelFilter(array, fn)`** — data-parallel array filtering across all CPU cores
75+
- **`spawn(fn)`** — background OS thread, returns Promise
76+
77+
Values cross threads via `SerializedValue` deep-copy (zero-cost for numbers, O(n) for strings/arrays/objects). Each thread has independent arena + GC. Arena `Drop` frees blocks when worker threads exit. Results from `spawn` flow back via `PENDING_THREAD_RESULTS` queue, drained during `js_promise_run_microtasks()`.
78+
79+
**Compiler pipeline** also parallelized via rayon: module codegen, transform passes, and nm symbol scanning.
80+
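As a reading aid for the threading paragraph added in this hunk, the deep-copy transfer and the result queue can be sketched in plain Rust. This is a minimal sketch under stated assumptions, not Perry's implementation: the variants of `SerializedValue`, the `PendingResults` type, its `spawn` and `drain` methods, the `promise_id` key, and `double_all` are all hypothetical. Only the names `SerializedValue` and `PENDING_THREAD_RESULTS` appear in the diff, and a standard-library channel stands in for the queue here.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical owned value representation (the variant set is an assumption).
/// Cloning a number is a plain copy; cloning text or arrays is O(n).
#[derive(Debug, Clone, PartialEq)]
enum SerializedValue {
    Number(f64),
    Text(String),
    Array(Vec<SerializedValue>),
}

/// Hypothetical stand-in for a pending-results queue:
/// worker threads push results, the main thread drains them.
struct PendingResults {
    tx: Sender<(u64, SerializedValue)>,
    rx: Receiver<(u64, SerializedValue)>,
}

impl PendingResults {
    fn new() -> Self {
        let (tx, rx) = channel();
        Self { tx, rx }
    }

    /// Run `work` on a background OS thread. The worker receives its own
    /// deep copy of `input`, so no memory is shared with the caller.
    fn spawn<F>(&self, promise_id: u64, input: &SerializedValue, work: F) -> thread::JoinHandle<()>
    where
        F: FnOnce(SerializedValue) -> SerializedValue + Send + 'static,
    {
        let owned = input.clone(); // deep copy made before crossing threads
        let tx = self.tx.clone();
        thread::spawn(move || {
            let result = work(owned);
            // Ignore the error case where the receiving side is already gone.
            let _ = tx.send((promise_id, result));
        })
    }

    /// Collect every finished result without blocking. In the design the diff
    /// describes, this step would run while microtasks are processed.
    fn drain(&self) -> Vec<(u64, SerializedValue)> {
        self.rx.try_iter().collect()
    }
}

/// Example worker: doubles every number and leaves text untouched.
fn double_all(value: SerializedValue) -> SerializedValue {
    match value {
        SerializedValue::Number(n) => SerializedValue::Number(n * 2.0),
        SerializedValue::Array(items) => {
            SerializedValue::Array(items.into_iter().map(double_all).collect())
        }
        other => other,
    }
}
```

Because the worker owns its copy, the caller's value is unchanged after the thread finishes, which is the property the "no mutable captures" rule is meant to protect.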
6981
## Native UI (`perry/ui`)
7082

7183
Declarative TypeScript compiles to AppKit/UIKit calls. 47 `perry_ui_*` FFI functions. Handle-based widget system (1-based i64 handles, NaN-boxed with POINTER_TAG). 5 reactive binding types dispatched from `state_set()`. `--target ios-simulator`/`--target ios` for cross-compilation.
@@ -84,39 +96,25 @@ Projects can list npm packages to compile natively instead of routing to V8. Con
8496
{ "perry": { "compilePackages": ["@noble/curves", "@noble/hashes"] } }
8597
```
8698

87-
**Implementation** (`crates/perry/src/commands/compile.rs`):
88-
- `CompilationContext.compile_packages: HashSet<String>` — packages to compile natively
89-
- `CompilationContext.compile_package_dirs: HashMap<String, PathBuf>` — dedup cache (first-found dir per package)
90-
- `resolve_package_source_entry()` — prefers `src/index.ts` over `lib/index.js`
91-
- `is_in_compile_package()` — checks if a file path is inside a listed package
92-
- `resolve_import()` — redirects compile packages to first-found dir for dedup, marks as `NativeCompiled`
93-
- `collect_modules()`: `.js` files inside compile packages bypass JS runtime routing
94-
95-
**Dedup logic**: When `@noble/hashes` appears in both `@noble/curves/node_modules/` and `@solana/web3.js/node_modules/`, the first-resolved directory is cached in `compile_package_dirs`. Subsequent imports redirect to the same copy, preventing duplicate linker symbols.
99+
**Dedup logic**: When `@noble/hashes` appears in multiple `node_modules/`, the first-resolved directory is cached in `compile_package_dirs`. Subsequent imports redirect to the same copy, preventing duplicate linker symbols.
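The first-resolved-wins behaviour described in the dedup paragraph can be illustrated with a small map-based cache. This is a sketch only: `DedupCache` and `resolve` are hypothetical names, and only the field name `compile_package_dirs` and its `HashMap<String, PathBuf>` type are taken from the diff.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the part of the compilation context that
/// remembers which directory each natively compiled package came from.
#[derive(Default)]
struct DedupCache {
    compile_package_dirs: HashMap<String, PathBuf>,
}

impl DedupCache {
    /// Return the directory to compile `package` from. The first directory
    /// seen for a package is cached; later lookups get that same directory
    /// even if the import was found in a different `node_modules/` copy.
    fn resolve(&mut self, package: &str, found_dir: &Path) -> PathBuf {
        self.compile_package_dirs
            .entry(package.to_string())
            .or_insert_with(|| found_dir.to_path_buf())
            .clone()
    }
}
```

With this shape, two imports of the same package from different nested copies both resolve to the first copy, so only one set of symbols reaches the linker.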
96100

97101
## Known Limitations
98102

99103
- **No runtime type checking**: Types erased at compile time. `typeof` via NaN-boxing tags. `instanceof` via class ID chain.
100-
- **Single-threaded**: User code on one thread. Async I/O on tokio worker pool. Use `spawn_for_promise_deferred()` for safe cross-thread data transfer.
104+
- **No shared mutable state across threads**: Thread primitives enforce immutable captures at compile time. No `SharedArrayBuffer` or `Atomics`.
101105

102106
## Common Pitfalls & Patterns
103107

104108
### NaN-Boxing Mistakes
105109
- **Double NaN-boxing**: If value is already F64, don't NaN-box again. Check `builder.func.dfg.value_type(val)`.
106110
- **Wrong tag**: Strings=STRING_TAG, objects=POINTER_TAG, BigInt=BIGINT_TAG.
107111
- **`as f64` vs `from_bits`**: `u64 as f64` is numeric conversion (WRONG). Use `f64::from_bits(u64)` to preserve bits.
108-
- **Handle extraction**: Handle-based objects are small integers NaN-boxed with POINTER_TAG. Use `js_nanbox_get_pointer`, not bitcast.
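The `as f64` versus `from_bits` pitfall in the list above is easy to reproduce in isolation. The sketch below uses a made-up tag constant and mask (`EXAMPLE_TAG`, `PAYLOAD_MASK`) and hypothetical helper names; Perry's actual tag layout is not shown in this diff.

```rust
/// Hypothetical tag in the NaN space, for illustration only.
const EXAMPLE_TAG: u64 = 0x7FFD_0000_0000_0000;
/// Lower 48 bits carry the payload in this sketch.
const PAYLOAD_MASK: u64 = 0x0000_FFFF_FFFF_FFFF;

/// Correct: reinterpret the 64 bits as an f64 without changing them.
fn nanbox(payload: u64) -> f64 {
    f64::from_bits(EXAMPLE_TAG | (payload & PAYLOAD_MASK))
}

/// Recover the payload by reading the raw bits back.
fn unbox(value: f64) -> u64 {
    value.to_bits() & PAYLOAD_MASK
}

/// Wrong: `as f64` is a numeric conversion. It yields a large finite number
/// close to the integer value, so the tag and payload bits are lost.
fn nanbox_wrong(payload: u64) -> f64 {
    (EXAMPLE_TAG | (payload & PAYLOAD_MASK)) as f64
}
```

The correctly boxed value is a NaN whose bits round-trip, while the wrongly converted one is an ordinary finite double with an unrelated bit pattern.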
109112

110113
### Cranelift Type Mismatches
111114
- Loop counter optimization produces I32 — always convert before passing to F64/I64 functions
112115
- Check `builder.func.dfg.value_type(val)` before conversion; handle F64↔I64, I32→F64, I32→I64
113-
- `is_pointer && !is_union` for variable type determination
114116
- Constructor parameters always F64 (NaN-boxed) at signature level
115117

116-
### Function Inlining (inline.rs)
117-
- `try_inline_call` returns body with `Stmt::Return`; in `Stmt::Expr` context, convert `Return(Some(e))` → `Expr(e)`
118-
- `substitute_locals` must handle ALL expression types
119-
120118
### Async / Threading
121119
- Thread-local arenas: JSValues from tokio workers invalid on main thread
122120
- Use `spawn_for_promise_deferred()` — return raw Rust data, convert to JSValue on main thread
@@ -126,71 +124,48 @@ Projects can list npm packages to compile natively instead of routing to V8. Con
126124
- ExternFuncRef values are NaN-boxed — use `js_nanbox_get_pointer` to extract
127125
- Module init order: topological sort by import dependencies
128126
- Optional params need `imported_func_param_counts` propagation through re-exports
129-
- Wrapper functions: truncate `call_args` to match declared signature
130127

131128
### Closure Captures
132129
- `collect_local_refs_expr()` must handle all expression types — catch-all silently skips refs
133130
- Captured string/pointer values must be NaN-boxed before storing, not raw bitcast
134131
- Loop counter i32 values: `fcvt_from_sint` to f64 before capture storage
135132

136-
### Loop Optimization
137-
- Pattern 3 accumulator (8 f64 accumulators) — skip for string variables
138-
- LICM: `hoisted_element_loads` + `hoisted_i32_products` cache invariant loads before inner loops
139-
- `try_compile_index_as_i32` keeps i32 ops contained to array indexing only
140-
141133
### Handle-Based Dispatch
142134
- TWO systems: `HANDLE_METHOD_DISPATCH` (methods) and `HANDLE_PROPERTY_DISPATCH` (properties)
143135
- Both must be registered. Small pointer detection: value < 0x100000 = handle.
144136

145-
### UI Codegen
146-
- NativeMethodCall has TWO arg paths: has-object and no-object — don't mix them up
147-
- `js_object_get_field_by_name_f64` (NOT `js_object_get_field_by_name`) for runtime field extraction
148-
149137
### objc2 v0.6 API
150138
- `define_class!` with `#[unsafe(super(NSObject))]`, `msg_send!` returns `Retained` directly
151139
- All AppKit constructors require `MainThreadMarker`
152-
- `CGPoint`/`CGSize`/`CGRect` in `objc2_core_foundation`
153140

154141
## Recent Changes
155142

156-
### v0.3.2
157-
- watchOS native app support (`--target watchos`/`--target watchos-simulator`): data-driven SwiftUI renderer, `perry-ui-watchos` crate with full `perry_ui_*` FFI surface, fixed PerryWatchApp.swift runtime, `perry setup watchos`, `perry run watchos`, `[watchos]` config in perry.toml
158-
159-
### v0.3.0
160-
- Compile-time i18n system (`perry/i18n` module): zero-ceremony localization for UI strings, `[i18n]` config in perry.toml, embedded 2D string table, native locale detection (all 6 platforms via OS APIs), `{param}` interpolation, CLDR plural rules (30+ locales), format wrappers (`Currency`, `Percent`, `ShortDate`, `LongDate`, `FormatNumber`, `FormatTime`, `Raw`), `perry i18n extract` CLI, iOS `.lproj` + Android `values-xx/` generation, compile-time validation
161-
162-
### v0.2.203
163-
- `perry setup` summary: show what's stored in global config vs project perry.toml for all platforms
143+
### v0.4.0
144+
- `perry/thread` module: `parallelMap`, `parallelFilter`, and `spawn` — real OS threads with compile-time safety. `SerializedValue` deep-copy, thread-local arenas with `Drop`, promise integration via `PENDING_THREAD_RESULTS`.
145+
- Parallel compiler pipeline via rayon: module codegen, transform passes, nm symbol scanning all across CPU cores.
146+
- Array.sort() upgraded from O(n²) insertion sort to O(n log n) TimSort-style hybrid.
147+
- Comprehensive threading docs in `docs/src/threading/` (4 pages).
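For readers unfamiliar with what a "TimSort-style hybrid" means in the `Array.sort()` entry above, here is a simplified sketch of the general idea: insertion sort on short fixed-size runs, followed by bottom-up merging. It is not the runtime's code. The run length of 32, the function names, and the `f64` element type are assumptions, and real TimSort additionally detects natural runs and uses galloping merges.

```rust
/// Assumed run length; small runs are cheap to sort by insertion.
const RUN: usize = 32;

/// Sort a short slice in place. Quadratic, but fast for small inputs.
fn insertion_sort(slice: &mut [f64]) {
    for i in 1..slice.len() {
        let mut j = i;
        while j > 0 && slice[j - 1] > slice[j] {
            slice.swap(j - 1, j);
            j -= 1;
        }
    }
}

/// Merge two sorted slices into `out`, whose length must equal the sum of
/// both input lengths. Taking from `left` on ties keeps the merge stable.
fn merge(left: &[f64], right: &[f64], out: &mut [f64]) {
    let (mut i, mut j, mut k) = (0, 0, 0);
    while i < left.len() && j < right.len() {
        if left[i] <= right[j] {
            out[k] = left[i];
            i += 1;
        } else {
            out[k] = right[j];
            j += 1;
        }
        k += 1;
    }
    let rest = left.len() - i;
    out[k..k + rest].copy_from_slice(&left[i..]);
    k += rest;
    out[k..].copy_from_slice(&right[j..]);
}

/// Hybrid sort: insertion-sort each run, then merge runs of doubling width.
/// Overall cost is O(n log n). NaN values are not given special treatment.
fn hybrid_sort(data: &mut [f64]) {
    let n = data.len();
    for chunk in data.chunks_mut(RUN) {
        insertion_sort(chunk);
    }
    let mut buf = data.to_vec();
    let mut width = RUN;
    while width < n {
        let mut start = 0;
        while start < n {
            let mid = (start + width).min(n);
            let end = (start + 2 * width).min(n);
            merge(&data[start..mid], &data[mid..end], &mut buf[start..end]);
            start = end;
        }
        data.copy_from_slice(&buf);
        width *= 2;
    }
}
```

The insertion-sort phase keeps the constant factor low on small or nearly sorted inputs, while the merge phase bounds the worst case, which is the motivation for moving away from a pure insertion sort.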
164148

165-
### v0.2.202
166-
- Fix `perry setup ios` not saving bundle_id to perry.toml
149+
### v0.3.3
150+
- `perry publish`: `.env` loading, `[publish] exclude` in perry.toml
167151

168-
### v0.2.201
169-
- `perry setup` improvements: auto-detect signing identity, show config paths, bundle_id priority lookup
170-
171-
### v0.2.200
172-
- Fix `perry setup` not saving to project perry.toml when file didn't exist
173-
- Audio capture API (`perry/system`): 6 platforms, A-weighted dB(A) metering
174-
- Camera API (`perry/ui`, iOS only): CameraView, freeze/unfreeze, pixel sampling
175-
176-
### v0.2.199
177-
- Fix `import * as X` namespace function calls, ScrollView in ZStack, SIGBUS during module init
152+
### v0.3.2
153+
- watchOS native app support (`--target watchos`/`--target watchos-simulator`)
178154

179-
### v0.2.198
180-
- WidgetKit: full iOS + Android + watchOS + Wear OS support, 4 new compile targets
155+
### v0.3.0
156+
- Compile-time i18n system (`perry/i18n` module): zero-ceremony localization, `[i18n]` config, embedded string table, native locale detection (6 platforms), CLDR plural rules, format wrappers
181157

182-
### Older (v0.2.37-v0.2.197)
158+
### Older (v0.2.37-v0.2.203)
183159
See CHANGELOG.md for detailed history. Key milestones:
160+
- v0.2.198: WidgetKit (iOS + Android + watchOS + Wear OS)
184161
- v0.2.191: Geisterhand UI testing framework
185162
- v0.2.183-189: WebAssembly target (`--target wasm`)
186163
- v0.2.180: `perry run` command with remote build fallback
187-
- v0.2.174: `perry/widget` module + `--target ios-widget`
188164
- v0.2.172: Codebase refactor (codegen.rs split into 12 modules, lower.rs into 8)
189165
- v0.2.156-162: Cross-platform UI parity (all 6 platforms at 100%)
190166
- v0.2.150-151: Native plugin system
191167
- v0.2.147: Mark-sweep garbage collection
192168
- v0.2.116: Native UI module (perry/ui)
193169
- v0.2.115: Integer function specialization (fibonacci 2x faster than Node)
194170
- v0.2.79: Fastify-compatible HTTP runtime
195-
- v0.2.49: First production worker (MySQL, LLM APIs, scoring)
196171
- v0.2.37: NaN-boxing foundation

Cargo.lock

Lines changed: 25 additions & 23 deletions
