@@ -24,10 +24,14 @@ AsyncLocalStorage, microtask draining, and environment lifecycle management.
2424 `UvFsAwaitable` (fs operations), `UvFsStatAwaitable` (stat-family),
2525 `UvWorkAwaitable` (thread pool work), and `UvGetAddrInfoAwaitable`
2626 (DNS resolution). Each embeds the libuv request struct directly in the
27- coroutine frame, avoiding separate heap allocations.
27+ coroutine frame, avoiding separate heap allocations. Each also exposes a
28+ `cancelable_req()` method returning the underlying `uv_req_t*` for
29+ cancellation support during environment teardown.
2830
2931* `uv_promise.h` -- Helpers for bridging coroutines to JavaScript Promises:
30- `MakePromise()`, `ResolvePromise()`, `RejectPromiseWithUVError()`.
32+ `MakePromise()`, `ResolvePromise()`, `RejectPromiseWithUVError()`. The
33+ resolve and reject helpers guard against calling V8 APIs when the
34+ environment is shutting down (`can_call_into_js()` check).
3135
3236## Usage
3337
@@ -114,14 +118,19 @@ self-destructs when the coroutine completes.
114118`UvTrackedTask<T, Name>` follows the same lazy/fire-and-forget pattern but
115119adds three phases around `Start()`:
116120
117- 1. **Creation**: The coroutine frame is heap-allocated by the compiler.
118- The coroutine is suspended at `initial_suspend` (lazy).
121+ 1. **Creation**: The coroutine frame is allocated from the thread-local
122+ free-list (see "Frame allocator" below). The coroutine is suspended at
123+ `initial_suspend` (lazy).
119124
1201252. **`InitTracking(env)`**: Assigns an `async_id`, captures the current
121- `async_context_frame` (for AsyncLocalStorage propagation), creates a
122- resource object for `executionAsyncResource()`, emits the async\_hooks
123- `init` event and a trace event, registers in the Environment's coroutine
124- task list, and reports external memory to V8.
126+ `async_context_frame` (for AsyncLocalStorage propagation), emits a trace
127+ event, and registers in the Environment's coroutine task list for
128+ cancellation during teardown. If async\_hooks listeners are active
129+ (`kInit > 0` or `kUsesExecutionAsyncResource > 0`), a resource object
130+ is created for `executionAsyncResource()` and the `init` hook is emitted.
131+ The V8 string holding the type name is cached per template instantiation
132+ via `v8::Eternal<v8::String>`, so only the first coroutine of a given type
133+ pays the `String::NewFromUtf8` cost.
125134
1261353. **`Start()`**: Marks the task as detached (fire-and-forget) and resumes
127136 the coroutine. Each resume-to-suspend segment is wrapped in an
@@ -133,8 +142,10 @@ adds three phases around `Start()`:
133142
1341434. **Completion**: At `final_suspend`, the last `InternalCallbackScope` is
135144 closed (draining task queues), the async\_hooks `destroy` event is emitted,
136- the task is unregistered from the Environment, external memory accounting
137- is released, and the coroutine frame is freed.
145+ the task is unregistered from the Environment, and the coroutine frame is
146+ returned to the thread-local free-list. If a detached coroutine has a
147+ captured C++ exception that was never observed, `std::terminate()` is
148+ called rather than silently discarding it.
138149
139150## How the awaitable dispatch works
140151
@@ -144,7 +155,7 @@ directly in the coroutine frame. When the coroutine hits `co_await`:
1441551. `await_transform()` on the promise wraps it in a `TrackedAwaitable`.
1451562. `TrackedAwaitable::await_suspend()`:
146157 * Closes the current `InternalCallbackScope` (drains microtasks/nextTick).
147- * Records the `uv_req_t*` for cancellation support.
158+ * Records the `uv_req_t*` for cancellation support (via `cancelable_req()`).
148159 * Increments `request_waiting_` (event loop liveness).
149160 * Calls the inner `await_suspend()`, which dispatches the libuv call with
150161 `req_.data = this` pointing back to the awaitable.
@@ -157,6 +168,11 @@ directly in the coroutine frame. When the coroutine hits `co_await`:
157168 * Opens a new `InternalCallbackScope` for the next segment.
158169 * Returns the result (e.g., `req_.result` for fs operations).
159170
171+ The liveness counter and cancellation tracking are conditional on the inner
172+ awaitable having a `cancelable_req()` method (checked at compile time via a
173+ `requires` expression). When co\_awaiting another `UvTask` or `UvTrackedTask`
174+ (coroutine composition), these steps are skipped.
175+
160176## Environment teardown
161177
162178During `Environment::CleanupHandles()`, the coroutine task list is iterated and
@@ -166,36 +182,51 @@ in-flight libuv request (if any), which causes the libuv callback to fire with
166182The `request_waiting_` counter ensures the teardown loop waits for all
167183coroutine I/O to finish before destroying the Environment.
168184
185+ ## Frame allocator
186+
187+ Coroutine frames are allocated from a thread-local free-list rather than going
188+ through `malloc`/`free` on every creation and destruction. This is implemented
189+ via `promise_type::operator new` and `operator delete` in `TrackedPromiseBase`,
190+ which route through `CoroFrameAlloc()` and `CoroFrameFree()`.
191+
192+ The free-list uses size-class buckets with 256-byte granularity for frames
193+ up to 4096 bytes, which is enough for typical coroutine frames. Frames larger
194+ than 4096 bytes fall through to the global `operator new`. Since all coroutines
195+ run on the event loop thread, the free-list requires no locking.
196+
197+ After the first coroutine of a given size class completes, subsequent
198+ coroutines of the same size class reuse its frame from the free-list
199+ without calling `malloc`, as long as a cached frame is available.
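
A size-class free-list of this kind might look like the sketch below. It only borrows the numbers from the text (256-byte granularity, 4096-byte limit); the names `SketchFrameAlloc`, `SketchFrameFree`, `BucketIndex`, and `tl_buckets` are hypothetical, and the actual `CoroFrameAlloc()` / `CoroFrameFree()` may differ in layout and bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Assumed constants, taken from the figures quoted in the text.
constexpr std::size_t kGranularity = 256;
constexpr std::size_t kMaxPooledSize = 4096;
constexpr std::size_t kNumBuckets = kMaxPooledSize / kGranularity;  // 16

// A freed frame is reused as a list node; no extra storage is needed.
struct FreeNode {
  FreeNode* next;
};

// One singly linked list per size class. thread_local, so no locking.
thread_local FreeNode* tl_buckets[kNumBuckets] = {};

// Maps a size in [1, 4096] to a bucket index in [0, 15].
constexpr std::size_t BucketIndex(std::size_t size) {
  return (size + kGranularity - 1) / kGranularity - 1;
}

void* SketchFrameAlloc(std::size_t size) {
  // Oversized (or zero-sized) requests bypass the free-list.
  if (size == 0 || size > kMaxPooledSize) return ::operator new(size);
  const std::size_t idx = BucketIndex(size);
  if (FreeNode* node = tl_buckets[idx]) {
    tl_buckets[idx] = node->next;  // Pop a cached frame.
    return node;
  }
  // Miss: allocate the full bucket size so any frame of this
  // size class can reuse the block later.
  return ::operator new((idx + 1) * kGranularity);
}

void SketchFrameFree(void* ptr, std::size_t size) {
  assert(ptr != nullptr);
  if (size == 0 || size > kMaxPooledSize) {
    ::operator delete(ptr);
    return;
  }
  const std::size_t idx = BucketIndex(size);
  // Push the block onto its bucket instead of returning it to the heap.
  tl_buckets[idx] = new (ptr) FreeNode{tl_buckets[idx]};
}
```

In this sketch a freed block is never handed back to the heap, which is the same unbounded-retention behavior noted under "Free-list growth" in the limitations.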
200+
169201## Allocation comparison with ReqWrap
170202
171203For a single async operation (e.g., `fsPromises.access`):
172204
173- | | ReqWrap pattern | Coroutine pattern |
174- | -------------------- | --------------- | --------------------------------- |
175- | C++ heap allocations | 3 | 1 (coroutine frame) |
176- | V8 heap objects | 7 | 3 (resource + resolver + promise) |
177- | Total allocations | 10 | 4 |
205+ | | ReqWrap pattern | Coroutine (no hooks) | Coroutine (hooks active) |
206+ | -------------------- | --------------- | -------------------- | ------------------------ |
207+ | C++ heap allocations | 3 | 0 (free-list hit) | 0 (free-list hit) |
208+ | V8 heap objects | 7 | 2 (resolver+promise) | 3 (+ resource object) |
209+ | Total allocations | 10 | 2 | 3 |
178210
179211For a multi-step operation (open + stat + read + close):
180212
181- | | 4x ReqWrap | Single coroutine |
182- | ----------------------------- | ---------- | ----------------------------- |
183- | C++ heap allocations | 12 | 1 |
184- | V8 heap objects | 28 | 3 |
185- | Total allocations | 40 | 4 |
186- | InternalCallbackScope entries | 4 | 5 (one per segment + initial) |
213+ | | 4x ReqWrap | Single coroutine (no hooks) | Single coroutine (hooks active) |
214+ | ----------------------------- | ---------- | --------------------------- | ------------------------------- |
215+ | C++ heap allocations | 12 | 0 (free-list hit) | 0 (free-list hit) |
216+ | V8 heap objects | 28 | 2 | 3 |
217+ | Total allocations | 40 | 2 | 3 |
218+ | InternalCallbackScope entries | 4 | 5 (one per segment) | 5 |
187219
188220The coroutine frame embeds the `uv_fs_t` (\~440 bytes) directly. The compiler
189221may overlay non-simultaneously-live awaitables in the frame, so a multi-step
190222coroutine does not necessarily pay N times the `uv_fs_t` cost.
191223
192224## Known limitations
193225
194- * **Heap snapshot visibility**: The coroutine frame is a plain `malloc`
195- allocated by the C++ coroutine machinery. It is not visible to V8 heap
196- snapshots or `MemoryRetainer`. `AdjustAmountOfExternalAllocatedMemory` is
197- used to give V8 a rough signal of the external memory pressure, but the
198- exact frame contents are not inspectable.
226+ * **Heap snapshot visibility**: The coroutine frame is not visible to V8 heap
227+ snapshots or `MemoryRetainer`. The thread-local free-list allocator reduces
228+ malloc pressure but does not provide V8 with per-frame memory accounting.
229+ The exact frame contents are not inspectable from heap snapshot tooling.
199230
200231* **Snapshot serialization**: `UvTrackedTask` holds `v8::Global` handles that
201232 cannot be serialized into a startup snapshot. There is currently no safety
@@ -207,3 +238,15 @@ coroutine does not necessarily pay N times the `uv_fs_t` cost.
207238 The coroutine pattern uses free-form string names. The `init` trace event
208239 uses the provided name; the `destroy` trace event currently uses a generic
209240 `"coroutine"` category name rather than the per-instance name.
241+
242+ * **Free-list growth**: The thread-local free-list does not have a cap on the
243+ number of cached frames per size class. Under a workload that creates a
244+ large burst of concurrent coroutines and then goes idle, the free-list will
245+ retain all of those frames until the thread exits. A maximum per-bucket
246+ count could be added if this becomes a concern.
247+
248+ * **Static Eternal handles**: The cached type name `v8::Eternal<v8::String>`
249+ is a static variable per template instantiation. It is never freed and is
250+ shared across all Isolates on the same thread. This is safe for the
251+ single-Isolate case (the common case for Node.js), but would need
252+ per-Isolate caching if multiple Isolates use the same coroutine types.
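
The "one static per template instantiation" behavior behind the last item can be illustrated without V8. In this sketch `std::string` stands in for the `v8::Eternal<v8::String>` handle, and `BuildTypeName`, `CachedTypeName`, `FsAccessTag`, and `FsStatTag` are hypothetical names introduced only for the example.

```cpp
#include <cassert>
#include <string>

// Counts how often the "expensive" construction runs.
inline int g_build_count = 0;

// Hypothetical stand-in for creating the V8 string.
inline std::string BuildTypeName(const char* name) {
  assert(name != nullptr);
  ++g_build_count;
  return std::string(name);
}

// Hypothetical tag types, one per coroutine type name.
struct FsAccessTag {
  static constexpr const char* kName = "FSACCESS";
};
struct FsStatTag {
  static constexpr const char* kName = "FSSTAT";
};

// Each instantiation owns its own function-local static, initialized
// on first use and never destroyed before program exit.
template <typename Tag>
const std::string& CachedTypeName() {
  static const std::string cached = BuildTypeName(Tag::kName);
  return cached;
}
```

The cache here is keyed only by the template argument. Nothing ties the cached value to a particular owner, which is why a handle cached this way would need an additional per-Isolate key if several Isolates used the same coroutine types.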