Commit d9ab766

fix(webgpu): fixes per-frame memory growth (#140)
* fix(webgpu): release per-frame transients at their consumption point

Continuous WebGPU rendering leaked native heap render-proportionally: each frame mints single-use wrappers (swapchain texture view, command encoder, render/compute pass, command buffer) whose native Arc handle is dropped only by the V8 GC finalizer, which a tight requestAnimationFrame loop starves. global.gc() does not flush the finalizer tasks; backgrounding the loop drains them and the heap collapses.

Release each transient deterministically at the WebGPU-spec operation that consumes it, with the GC finalizer kept as a backstop:

  render/compute pass -> pass.end()
  command encoder     -> encoder.finish()
  command buffer      -> queue.submit()
  swapchain view      -> the owning GPUCanvasContext.presentSurface()

The native handle lives in ArcHandle (unique_ptr + stateless custom deleter, new ArcHandle.h, with a MutArcHandle variant for non-const C ABI releases). reset() releases the Arc once and nulls the pointer; ~unique_ptr (run by the GC finalizer via ObjectWrapperImpl's virtual destructor) is a no-op when already null, so exactly-once release is a type invariant. Hand-written destructors and manual null-guards are gone; C ABI call sites pass the raw pointer via .get(). The Rust crate is unchanged.

The swapchain view is the only transient tied to a context, so each GPUCanvasContext tracks its own views (swapchainContext_ stamped in getCurrentTexture, registered in GPUTexture.createView) and releases them in presentSurface(): per-context, so multiple canvases in one isolate never drain each other's in-flight views. JS destroy() calls are optional-chained, safe ahead of the native rebuild.

Verified: the C++ compiles and links via the Android NDK toolchain (Gradle assembleRelease).

* refactor(webgpu): own every native handle via ArcHandle (no hand-written dtors)

Convert the remaining 21 persistent WebGPU wrappers to hold their Rust Arc handle in ArcHandle/MutArcHandle, the same primitive as the five transients. Each hand-written destructor that called a raw canvas_native_webgpu_*_release is gone; the unique_ptr member releases the handle once on destruction, so the GC path is unchanged in behavior but the lifecycle now lives in the type. Accessors and C ABI call sites use handle.get(); GPUImpl's null-guarded release and GPUCanvasContextImpl's release (which keeps its raf_.reset() ordering) fold into the same model.

Behavior-preserving cleanup: these objects are not minted per frame and do not soak; this removes duplicated release bookkeeping and makes all 26 WebGPU wrappers consistent. Verified compiling via the Android NDK toolchain.

* fix(webgpu): free per-present read-back command encoder and buffer

presentSurface copies the swapchain texture into the read-back texture every frame (toDataURL support) using a command encoder + command buffer, but never dropped them, so they accumulated in wgpu-core's registry render-proportionally: a native-heap leak. Drop the command buffer after submit and the encoder after the copy, mirroring gpu_queue.rs::queue_submit.

Confirmed on a Moto G 2025: symbolized heapprofd named present_surface as the top net-retained allocator; the drops remove it (~0.31 -> ~0.18 MB/s under continuous render).

* fix(webgpu): release the swapchain GPUTexture wrapper at present

getCurrentTexture() returns a fresh per-frame GPUTexture wrapper holding an Arc clone of the surface texture; its only non-deterministic free path was the GC finalizer, which a tight render loop starves. Track the per-context swapchain textures and drop their native handle at presentSurface() (the texture's point of death) via a JS-exposed __releaseHandle on GPUTextureImpl that decrements the Arc.

__releaseHandle is distinct from destroy(): destroy() frees the GPU texture and must never run on the swapchain texture; __releaseHandle only drops our wrapper's handle. Completes deterministic per-frame release for the swapchain path.

* fix(webgpu): drop the swapchain texture and its clear_view at frame end

getCurrentTexture() registers a wgpu Texture (carrying an auto-created surface clear_view) in wgpu-core's hub each frame. surface present()/discard() only release the acquired-texture ref; the hub registry holds a second ref that must be dropped explicitly. CanvasGPUTexture::drop only did this for the None (app-created) branch; the Some (surface) branch was a no-op with its discard commented out, so the per-frame swapchain texture and its clear image view leaked in the hub every frame. Symbolized heapprofd named surface_get_current_texture (present.rs:220, ash create_image_view) as the top remaining grower.

Drop the surface texture in the Some branch (and discard first if it was acquired but never presented). The discard's error is logged rather than fatally aborted: the original code used handle_error_fatal, which crashes the process from within Drop and is almost certainly why this cleanup was commented out, trading a crash for a leak.

Verified on a Moto G 2025: continuous-render foreground soak goes from ~0.18 MB/s to +0.002 MB/s over 5 minutes (flat; matches the blank-app baseline). This was the last of three native leaks (per-frame wrapper Arcs; read-back encoder/buffer; this).

* style(webgpu): tighten inline comments to repo conventions

Trim the explanatory comment blocks added with the leak fixes down to terse one-liners matching the surrounding code. No behavior change.
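The exactly-once release invariant described in the commit message (reset() releases once and nulls the pointer, so a later finalizer run does nothing) can be modeled in a few lines. The following is only a TypeScript sketch of that idea, not the C++ ArcHandle itself; `Handle`, `encoderHandle`, and `releaseCount` are hypothetical names introduced for illustration.

```typescript
// Hypothetical TypeScript model of the exactly-once release invariant.
// The real ArcHandle is a C++ std::unique_ptr alias; names here are illustrative.
class Handle<T> {
  private ptr: T | null;
  private readonly release: (ptr: T) => void;

  constructor(ptr: T, release: (ptr: T) => void) {
    this.ptr = ptr;
    this.release = release;
  }

  // Raw access for call sites, mirroring ArcHandle's .get().
  get(): T | null {
    return this.ptr;
  }

  // Clears the pointer and runs the release callback; later calls do nothing.
  reset(): void {
    const p = this.ptr;
    if (p !== null) {
      this.ptr = null;
      this.release(p);
    }
  }
}

let releaseCount = 0;
const encoderHandle = new Handle({ id: 1 }, () => {
  releaseCount++;
});

encoderHandle.reset(); // deterministic release at the consumption point
encoderHandle.reset(); // stands in for the later GC finalizer: a no-op
```

Because the second reset() finds a null pointer, it does not matter whether the deterministic path or the finalizer runs first; the release callback runs once either way.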
1 parent cd52d3d commit d9ab766

62 files changed

Lines changed: 333 additions & 173 deletions

crates/canvas-c/src/webgpu/gpu_canvas_context.rs

Lines changed: 4 additions & 0 deletions
@@ -1495,9 +1495,13 @@ pub unsafe extern "C" fn canvas_native_webgpu_context_present_surface(
                     if error.is_none() {
                         global.queue_submit(data.device.queue.queue.id, &[id]).ok();
                     }
+
+                    global.command_buffer_drop(id);
                 }
                 Err(_) => {}
             }
+
+            global.command_encoder_drop(encoder);
         }
     }
 };

crates/canvas-c/src/webgpu/gpu_texture.rs

Lines changed: 8 additions & 15 deletions
@@ -45,28 +45,21 @@ impl Drop for CanvasGPUTexture {
         }
         match self.surface_id {
             Some(surface_id) => {
+                let global = self.instance.global();
                 let has_surface_presented = self
                     .has_surface_presented
                     .load(std::sync::atomic::Ordering::SeqCst);
 
+                // acquired but never presented: hand the image back to the swapchain
                 if !has_surface_presented {
-                    let global = self.instance.global();
-
-                    /*
-                    match global.surface_texture_discard(surface_id) {
-                        Ok(_) => {
-                            self.surface_id = None;
-                        }
-                        Err(cause) => {
-                            handle_error_fatal(
-                                global,
-                                cause,
-                                "canvas_native_webgpu_texture_release",
-                            )
-                        },
+                    if let Err(cause) = global.surface_texture_discard(surface_id) {
+                        log::warn!("surface_texture_discard failed: {cause:?}");
                     }
-                    */
                 }
+
+                // drop the hub ref so the surface texture and its clear_view are freed
+                global.texture_drop(self.texture);
+                self.surface_id = None;
             }
             None => {
                 let context = self.instance.global();
packages/canvas/WebGPU/Constants.ts

Lines changed: 2 additions & 0 deletions
@@ -39,3 +39,5 @@ export const native_ = Symbol('[[native]]');
 export const mapState_ = Symbol('[[mapState]]');
 export const contextPtr_ = Symbol('[[contextPtr]]');
 export const adapter_ = Symbol('[[adapter]]');
+// owning GPUCanvasContext stamped on the getCurrentTexture() texture (for release at present)
+export const swapchainContext_ = Symbol('[[swapchainContext]]');

packages/canvas/WebGPU/GPUCanvasContext.ts

Lines changed: 38 additions & 1 deletion
@@ -1,7 +1,8 @@
 import { Helpers } from '../helpers';
-import { adapter_, contextPtr_, GPUTextureUsage, native_ } from './Constants';
+import { adapter_, contextPtr_, GPUTextureUsage, native_, swapchainContext_ } from './Constants';
 import type { GPUDevice } from './GPUDevice';
 import { GPUTexture } from './GPUTexture';
+import type { GPUTextureView } from './GPUTextureView';
 import type { GPUAdapter } from './GPUAdapter';
 import type { GPUCanvasAlphaMode, GPUCanvasPresentMode, GPUExtent3D, GPUTextureFormat } from './Types';
 import type { CanvasRenderingContext } from '../common';
@@ -18,6 +19,15 @@ export class GPUCanvasContext implements CanvasRenderingContext {
   [native_] = null;
   [contextPtr_] = null;
 
+  // per-frame swapchain views and textures, released at the next presentSurface()
+  private _swapchainViews: GPUTextureView[] = [];
+  private _swapchainTextures: GPUTexture[] = [];
+
+  /** @internal */
+  _registerSwapchainView(view: GPUTextureView) {
+    this._swapchainViews.push(view);
+  }
+
   constructor(context: any, contextOptions: any = {}) {
     let nativeContext = '0';
     if (__ANDROID__) {
@@ -176,12 +186,39 @@ export class GPUCanvasContext implements CanvasRenderingContext {
     const result = GPUTexture.fromNative(texture);
     if (!result) {
       console.error('GPUCanvasContext.getCurrentTexture: native texture wrapper contained no texture');
+    } else {
+      // mark as swapchain-owned and track for release at present
+      (result as any)[swapchainContext_] = this;
+      this._swapchainTextures.push(result);
     }
     return result;
   }
 
   presentSurface(_texture?: GPUTexture) {
     this.native.presentSurface();
+    // release this frame's swapchain views and textures (their point of death)
+    const views = this._swapchainViews;
+    if (views.length > 0) {
+      this._swapchainViews = [];
+      for (let i = 0; i < views.length; i++) {
+        const view = views[i] as any;
+        view?.[native_]?.destroy?.();
+        if (view) {
+          view[native_] = null;
+        }
+      }
+    }
+    const texs = this._swapchainTextures;
+    if (texs.length > 0) {
+      this._swapchainTextures = [];
+      for (let i = 0; i < texs.length; i++) {
+        const tex = texs[i] as any;
+        tex?.[native_]?.__releaseHandle?.();
+        if (tex) {
+          tex[native_] = null;
+        }
+      }
+    }
   }
 
   getCapabilities(adapter: GPUAdapter): {
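The per-context ownership in this hunk is what keeps one canvas from releasing another canvas's in-flight views. A minimal sketch of that property, assuming simplified stand-ins: `FakeView`, `FakeContext`, `registerView`, and `present` are hypothetical names, not the classes in this package.

```typescript
// Hypothetical stand-ins; the real classes wrap native objects.
class FakeView {
  released = false;
  destroy(): void {
    this.released = true;
  }
}

class FakeContext {
  private views: FakeView[] = [];

  // Called when a view is created from this context's swapchain texture.
  registerView(view: FakeView): void {
    this.views.push(view);
  }

  // Releases only the views registered with this context; returns how many.
  present(): number {
    const views = this.views;
    this.views = []; // swap before the loop so the tracked list starts empty next frame
    for (let i = 0; i < views.length; i++) {
      views[i].destroy();
    }
    return views.length;
  }
}

const contextA = new FakeContext();
const contextB = new FakeContext();
const viewA = new FakeView();
const viewB = new FakeView();
contextA.registerView(viewA);
contextB.registerView(viewB);

const releasedByA = contextA.present(); // touches viewA only
const releasedAgain = contextA.present(); // nothing left to release
```

A single shared list would be simpler, but presenting one canvas would then release views that another canvas has not yet submitted.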

packages/canvas/WebGPU/GPUCommandEncoder.ts

Lines changed: 5 additions & 1 deletion
@@ -190,7 +190,11 @@ export class GPUCommandEncoder {
 
   finish(descriptor?: { label?: string }) {
     const ret = this[native_].finish(descriptor);
-    return GPUCommandBuffer.fromNative(ret);
+    const buffer = GPUCommandBuffer.fromNative(ret);
+    // finish() consumes the encoder; release it now instead of waiting for GC
+    this[native_]?.destroy?.();
+    this[native_] = null;
+    return buffer;
   }
 
   insertDebugMarker(markerLabel: string) {
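Two details of this hunk are easy to miss: the destroy() call is optional-chained, so it is skipped on a native object that has no destroy(), and the first line is not null-guarded, so calling finish() again on a consumed encoder throws. A sketch under simplified assumptions follows; `NativeEncoderLike`, `SketchEncoder`, and the string return value are hypothetical stand-ins for the real native objects.

```typescript
// Hypothetical stand-ins for the native objects; only the shape matters here.
interface NativeEncoderLike {
  finish(): string;
  destroy?: () => void; // absent on a native build that predates the fix
}

class SketchEncoder {
  native: NativeEncoderLike | null;

  constructor(native: NativeEncoderLike) {
    this.native = native;
  }

  finish(): string {
    // Not null-guarded, as in the diff: a second finish() throws a TypeError.
    const ret = (this.native as NativeEncoderLike).finish();
    // Optional call: skipped when the native object has no destroy().
    this.native?.destroy?.();
    this.native = null;
    return ret;
  }
}

let destroyCalls = 0;
const rebuilt = new SketchEncoder({
  finish: () => 'command-buffer',
  destroy: () => {
    destroyCalls++;
  },
});
const legacy = new SketchEncoder({ finish: () => 'command-buffer' });

const first = rebuilt.finish();
const second = legacy.finish(); // no destroy(): the optional call is skipped

let threwOnReuse = false;
try {
  rebuilt.finish(); // the handle is already null
} catch (err) {
  threwOnReuse = err instanceof TypeError;
}
```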

packages/canvas/WebGPU/GPUComputePassEncoder.ts

Lines changed: 4 additions & 1 deletion
@@ -19,7 +19,10 @@ export class GPUComputePassEncoder {
   }
 
   end() {
-    this[native_].end();
+    // end() consumes the pass; release it now instead of waiting for GC
+    const n = this[native_];
+    n?.end();
+    n?.destroy?.();
     this[native_] = null;
   }
 
2528

packages/canvas/WebGPU/GPUQueue.ts

Lines changed: 8 additions & 0 deletions
@@ -129,6 +129,14 @@ export class GPUQueue {
         return command[native_];
       }),
     );
+    // submit() consumes the command buffers; release them now instead of via GC
+    for (let i = 0; i < commands.length; i++) {
+      const command = commands[i] as any;
+      command?.[native_]?.destroy?.();
+      if (command) {
+        command[native_] = null;
+      }
+    }
   }
 }
 

packages/canvas/WebGPU/GPURenderPassEncoder.ts

Lines changed: 5 additions & 1 deletion
@@ -45,7 +45,11 @@ export class GPURenderPassEncoder {
   }
 
   end() {
-    this[native_].end();
+    // end() consumes the pass; release it now instead of waiting for GC
+    const n = this[native_];
+    n?.end();
+    n?.destroy?.();
+    this[native_] = null;
   }
 
   endOcclusionQuery() {
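Unlike GPUCommandEncoder.finish() in this commit, the new end() reads the handle into a local and optional-chains both calls, so a repeated end() on the same pass does nothing rather than throwing. A minimal sketch of that behavior, with hypothetical stand-ins `FakeNativePass` and `SketchPass`:

```typescript
// Hypothetical stand-in for the native pass object; counters record each call.
class FakeNativePass {
  ended = 0;
  destroyed = 0;
  end(): void {
    this.ended++;
  }
  destroy(): void {
    this.destroyed++;
  }
}

class SketchPass {
  native: FakeNativePass | null;

  constructor(native: FakeNativePass) {
    this.native = native;
  }

  end(): void {
    const n = this.native;
    n?.end(); // skipped when the pass was already ended
    n?.destroy?.(); // releases the native handle on the first call only
    this.native = null;
  }
}

const nativePass = new FakeNativePass();
const pass = new SketchPass(nativePass);
pass.end();
pass.end(); // second call finds a null handle and does nothing
```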

packages/canvas/WebGPU/GPUTexture.ts

Lines changed: 8 additions & 2 deletions
@@ -1,4 +1,4 @@
-import { native_ } from './Constants';
+import { native_, swapchainContext_ } from './Constants';
 import { GPUTextureView } from './GPUTextureView';
 import type { GPUTextureViewDescriptor } from './Interfaces';
 
@@ -49,6 +49,12 @@ export class GPUTexture {
 
   createView(desc?: GPUTextureViewDescriptor) {
     const view = this[native_].createView(desc);
-    return GPUTextureView.fromNative(view);
+    const ret = GPUTextureView.fromNative(view);
+    // swapchain-texture views die at present; register them with the owning context
+    const ctx = (this as any)[swapchainContext_];
+    if (ret && ctx) {
+      ctx._registerSwapchainView(ret);
+    }
+    return ret;
   }
 }
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+//
+// ArcHandle.h — RAII owner for the Rust Arc handles exposed across the canvas C ABI.
+//
+
+#ifndef CANVAS_WEBGPU_ARCHANDLE_H
+#define CANVAS_WEBGPU_ARCHANDLE_H
+
+#include <memory>
+
+// unique_ptr owner for a Rust Arc handle exposed across the C ABI. reset() runs the
+// matching ..._release once and nulls the pointer; the GC finalizer's later drop is
+// then a no-op. ArcHandle for const release fns, MutArcHandle for non-const.
+template <typename T, void (*Release)(const T *)>
+struct ArcDeleter {
+    void operator()(const T *ptr) const noexcept {
+        if (ptr != nullptr) {
+            Release(ptr);
+        }
+    }
+};
+
+template <typename T, void (*Release)(const T *)>
+using ArcHandle = std::unique_ptr<const T, ArcDeleter<T, Release>>;
+
+template <typename T, void (*Release)(T *)>
+struct MutArcDeleter {
+    void operator()(T *ptr) const noexcept {
+        if (ptr != nullptr) {
+            Release(ptr);
+        }
+    }
+};
+
+template <typename T, void (*Release)(T *)>
+using MutArcHandle = std::unique_ptr<T, MutArcDeleter<T, Release>>;
+
+#endif // CANVAS_WEBGPU_ARCHANDLE_H
