Wasm: dispose per-dispatch MessageEvent/Event in worker-response handlers (4.12.1-local.10)
WasmAccelerator.EnsurePersistentHandlers installs persistent per-worker
OnMessage/OnError handlers. Each worker response delivers a MessageEvent
(and each error an Event) JSObject that the handler OWNS - SpawnDev.BlazorJS
does not auto-dispose an ActionEvent handler's argument (ActionCallback<T1>.Invoke
calls the delegate and never disposes the arg; confirmed by the library author).
The handlers never disposed msg/err, so every (dispatch x worker) response left a
MessageEvent reclaimable only by the finalizer (disposal-breakdown over a TurboQuant
lane: MessageEvent created=9971, proper=0, finalizer=9969). Between GCs this transient
pile-up spikes the main-thread V8 heap during a heavy dispatch storm - the likely
trigger of the ML late-lane heavy-test timeouts.
Fix: `using` the MessageEvent/Event arg in both handlers so each disposes
deterministically on every path (including the stray-message early return).
WasmDispatchResponse is a plain DTO, so it stays valid after msg is disposed.
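
The ownership rule behind this fix can be sketched in isolation. This is a minimal illustration, not the actual `WasmAccelerator` handler: `FakeMessageEvent`, `DispatchResult`, `HandlerSketch`, and the `Pending` set are hypothetical stand-ins for the JS-backed event, the plain response DTO, and the host's in-flight bookkeeping. It shows `using var` disposing the argument on both the stray-message early return and the normal path, while the copied-out plain value stays valid afterwards.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a JS-backed event object (NOT the SpawnDev.BlazorJS type).
sealed class FakeMessageEvent : IDisposable
{
    public int DispatchId { get; }
    public bool IsDisposed { get; private set; }
    public FakeMessageEvent(int dispatchId) => DispatchId = dispatchId;
    public void Dispose() => IsDisposed = true;
}

// Hypothetical plain DTO: holds only managed values, so it outlives the event.
sealed record DispatchResult(int DispatchId);

static class HandlerSketch
{
    // Dispatch ids the host is still waiting on (illustrative only).
    static readonly HashSet<int> Pending = new() { 1 };

    // The handler owns its argument, so it disposes it on every exit path.
    public static DispatchResult? OnMessage(FakeMessageEvent msgArg)
    {
        using var msg = msgArg;                 // disposed when the method exits
        if (!Pending.Remove(msg.DispatchId))
            return null;                        // stray message: msg is still disposed
        // Copy what is needed into a plain value before msg goes away.
        return new DispatchResult(msg.DispatchId);
    }
}
```

Calling `HandlerSketch.OnMessage(new FakeMessageEvent(1))` returns a `DispatchResult`, and an unknown id returns `null`; in both cases the event's `IsDisposed` is `true` once the call returns.
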
Also: corrected the WasmMemoryBuffer header comment (data buffers are staged through
the shared linear memory on the main thread, NOT zero-copy shared to workers - the
inaccurate line had seeded a worker-pinned-SAB hypothesis).
Guard: WasmTests.Wasm_DispatchResponse_DoesNotLeakMessageEvent (alive-MessageEvent
count via BlazorJS IDisposableTracker, kept off the tracker's Console/verbose paths).
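
A guard of this shape can be approximated without the real tracker. The sketch below does not use the BlazorJS `IDisposableTracker` API; `TrackedEvent` and `LeakGuardSketch` are hypothetical stand-ins that keep an alive-instance counter, assuming the guard only needs to compare the alive count before and after a burst of responses.

```csharp
using System;
using System.Threading;

// Hypothetical disposable that counts how many instances are currently alive.
sealed class TrackedEvent : IDisposable
{
    static int _alive;
    public static int Alive => Volatile.Read(ref _alive);

    bool _disposed;

    public TrackedEvent() => Interlocked.Increment(ref _alive);

    public void Dispose()
    {
        if (_disposed) return;          // make repeated Dispose calls harmless
        _disposed = true;
        Interlocked.Decrement(ref _alive);
    }
}

static class LeakGuardSketch
{
    // Feeds `dispatches` simulated responses to `handler` and returns
    // the change in the alive count across the whole burst.
    public static int AliveDelta(int dispatches, Action<TrackedEvent> handler)
    {
        int before = TrackedEvent.Alive;
        for (int i = 0; i < dispatches; i++)
            handler(new TrackedEvent());
        return TrackedEvent.Alive - before;
    }
}
```

A handler that ignores its argument (`e => { }`) leaves the delta equal to the number of dispatches, while one that takes ownership (`e => { using var owned = e; }`) brings it back to zero, which is the property a leak guard would assert.
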
NOTE: a separate, slower persistent retained-object climb (non-MessageEvent,
main-thread V8) remains under investigation via a CDP bytes-by-type + retainers heap
snapshot - tracked, distinct from this transient fix.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CHANGELOG.md
1 addition & 0 deletions
@@ -13,6 +13,7 @@ Wrapper-only (forks stay **2.0.16**). Adds a new selection-gate capability flag:
- **Wasm SIMD128 emitter foundation (Phase 1 of the SIMD port).** Additive groundwork only - no production kernel emits v128 yet, so the scalar path is byte-identical. Adds the v128 value type and the 0xFD-prefixed SIMD opcode set to `WasmOpCodes` (spec-verified; sub-opcodes are u32-LEB128 after the prefix, so multi-byte ones like `f32x4.add`=228 encode correctly), v128 emit helpers in `WasmModuleBuilder` (`EmitSimd`/`EmitSimdMem`/`EmitSimdLane`/`EmitV128Const`/`EmitI8x16Shuffle`), and the runtime SIMD capability surface: `WasmBackend.RuntimeSupportsWasmSimd` (via `System.Runtime.Intrinsics.Wasm.PackedSimd.IsSupported` - if the running Blazor WASM build has SIMD enabled, the browser/workers accept v128), `ForceScalar`/`ForceSimd` test overrides, `EffectiveWasmSimd`, `WasmCapabilityContext.WasmSimd`, and `WasmAccelerator.SupportsSimd`. **Non-SIMD devices stay first-class forever** (the scalar path is a supported mode, not a deprecated fallback - real hardware/browsers without wasm SIMD are common; see the dual-build technique in `BlazorWASMSIMDDetectExample`). Verified by the offline `DemoConsole -- wasm-simd-probe`: a hand-built v128 module is `wasm-validate`-clean and `wasm2wat`-decodes to the intended instructions.
- **Wasm: bound the persistent-worker module cache (late-lane memory-pressure fix).** The process-persistent worker pool keeps every distinct kernel's compiled `WebAssembly.Module` in a per-worker cache (`_modulesById`) for the tab's life. Across a long test lane each per-test accelerator's kernels get fresh ids, so the cache accumulated unbounded (measured 2 -> 1057 across a ~570-test lane) until late, heavy tests hit process-memory pressure and timed out (the committed shared linear memory was flat/small - the module cache was the driver). Fix: when cumulative kernels compiled since the last flush cross `WasmBackend.ModuleCacheFlushThreshold` (default 256; 0 disables), the host instructs the workers to drop their module/instance caches at the next fresh accelerator's FIRST dispatch (safe - that accelerator re-sends its own kernels; the cleared modules are disposed accelerators' dead weight). Bounds peak modules to ~the threshold. Short workloads never reach it -> never flush -> kernels stay fully warm. Diagnostics `WasmAccelerator.TotalKernelsCompiled` / `SharedWasmMemoryPages`; guard `WasmTests.Wasm_ModuleCacheFlush_DoesNotBreakCorrectness` (flushes every accelerator, asserts CPU-oracle).
- **Wasm: fixed a host-write SNAPSHOT SharedArrayBuffer leak (the real ML-lane heavy-test memory leak).** `WasmMemoryBuffer.PrepareHostWrite` allocates a full-buffer-size SharedArrayBuffer when a host write lands while a dispatch is in flight on that buffer (the lazy copy-out race defense). `CompleteDispatchIntent` removed the snapshot from its tracking dict but **never `Dispose()`d the SharedArrayBuffer** (despite its own doc claiming "that tier's SAB is freed"), and the all-intents-complete path dropped the dict without disposing either - so every materialized snapshot leaked a full-buffer-size JS SharedArrayBuffer. Under a long heavy-workload lane (ML's CopyFromCPU+dispatch pattern) this accumulated to ~1.5 GiB of JS heap, slowing late tests into timeouts (root-caused via a resident-memory trace: heap 154->1644 MiB; worker pool flat, linear memory flat, module cache flat by magnitude). Fix: dispose the snapshot SAB on release + on buffer dispose (`DisposeAllSnapshots`). New diagnostic `WasmMemoryBuffer.LiveSnapshotBytes`; guard `WasmTests.Wasm_HostWriteSnapshot_DoesNotLeakSAB` (deterministically materializes snapshots, asserts the resident bytes return to baseline). Also adds resident-count diagnostics `WasmMemoryBuffer.LiveBufferCount`/`LiveBufferBytes` + `WasmAccelerator.LiveAcceleratorCount`.
+- **Wasm: dispatch-response handlers now dispose the per-dispatch `MessageEvent`/`Event` JSObject.** `WasmAccelerator.EnsurePersistentHandlers` installs persistent per-worker `OnMessage`/`OnError` handlers; each worker response delivers a `MessageEvent` (and each error an `Event`) JSObject that the handler **owns** - SpawnDev.BlazorJS does not auto-dispose an `ActionEvent` handler's argument (`ActionCallback<T1>.Invoke` calls the delegate and never disposes the arg; confirmed by the library author). The handlers never disposed `msg`/`err`, so every (dispatch x worker) response created a `MessageEvent` that was reclaimed only by the finalizer (disposal-breakdown over a TurboQuant lane: `MessageEvent created=9971, proper=0, finalizer=9969`). Between GCs this transient pile-up spikes the main-thread V8 heap during a heavy dispatch storm - the likely trigger of the late-lane heavy-test timeouts. Fix: `using` the `MessageEvent`/`Event` arg in both handlers so each disposes deterministically on every path (including the stray-message early return). Guard `WasmTests.Wasm_DispatchResponse_DoesNotLeakMessageEvent` (alive-`MessageEvent` count via BlazorJS `IDisposableTracker`, kept off the tracker's verbose/Console paths). NOTE: a separate, slower persistent retained-object climb (non-`MessageEvent`) remains under investigation via a CDP bytes-by-type + retainers heap snapshot - tracked, distinct from this transient fix.
## 4.12.0 (2026-06-13) - Sync/async contract: async-only where it waits/observes, sync for fire-and-forget
SpawnDev.ILGPU/SpawnDev.ILGPU.csproj
2 additions & 2 deletions
@@ -4,9 +4,9 @@
<TargetFramework>net10.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
-<Version>4.12.1-local.9</Version>
+<Version>4.12.1-local.10</Version>
<!-- Brief current-version highlights only. Full per-version history with code samples lives in CHANGELOG.md (linked from the README). -->
-<PackageReleaseNotes>4.12.1: WebGPU cooperative GEMV grid-stride fix; ±inf/NaN scalar kernel params on WebGL+Wasm; AcceleratorRequirements.RequiresScatterStores flag; Wasm process-persistent shared Web Worker pool AND shared linear memory keyed per MaxLinearMemoryPages (default-WorkerCount accelerators share one pool + one WebAssembly.Memory per distinct max per tab, fixing worker-churn starvation and the WebAssembly.Memory-reservation accumulation across long test lanes — at both the default 1 GiB and custom maxes like 2 GiB); Wasm SIMD128 emitter foundation (additive groundwork, scalar path unchanged). Forks stay 2.0.16. Full per-version history with details: CHANGELOG.md at https://github.com/LostBeard/SpawnDev.ILGPU/blob/master/CHANGELOG.md</PackageReleaseNotes>
+<PackageReleaseNotes>4.12.1: Wasm process-persistent shared worker pool + shared linear memory (per MaxLinearMemoryPages) — fixes worker-churn starvation and WebAssembly.Memory accumulation across long test lanes; Wasm dispatch-response handlers now dispose the per-dispatch MessageEvent/Event JSObject (removes per-dispatch JS-object churn); WebGPU GEMV grid-stride fix; ±inf/NaN scalar kernel params on WebGL+Wasm; AcceleratorRequirements.RequiresScatterStores; Wasm SIMD128 emitter foundation (additive, scalar path unchanged). Forks stay 2.0.16. Full per-version history: CHANGELOG.md at https://github.com/LostBeard/SpawnDev.ILGPU/blob/master/CHANGELOG.md</PackageReleaseNotes>