 # Design Doc: Precise Futex Wakeups
 
-- **Status**: Draft
+- **Status**: Completed
 - **Bug**: https://github.com/emscripten-core/emscripten/issues/26633
 
 ## Context
-Currently, `emscripten_futex_wait` (in
-`system/lib/pthread/emscripten_futex_wait.c`) relies on a periodic wakeup loop
-for pthreads and the main runtime thread. This is done for two primary reasons:
+Historically, `emscripten_futex_wait` (in
+`system/lib/pthread/emscripten_futex_wait.c`) relied on a periodic wakeup loop
+for pthreads and the main runtime thread. This was done for two primary reasons:
 
-1. **Thread Cancellation**: To check if the calling thread has been cancelled while it is blocked.
+1. **Thread Cancellation**: To check if the calling thread had been cancelled while it was blocked.
 2. **Main Runtime Thread Events**: To allow the main runtime thread (even when not the main browser thread) to process its mailbox/event queue.
 
-The current implementation uses a 1ms wakeup interval for the main runtime
-thread and a 100ms interval for cancellable pthreads. This leads to unnecessary
+The old implementation used a 1ms wakeup interval for the main runtime
+thread and a 100ms interval for cancellable pthreads. This led to unnecessary
 CPU wakeups and increased latency for events.
 
 ## Goals
@@ -23,22 +23,21 @@ CPU wakeups and increased latency for events.
 
 ## Non-Goals
 - **Main Browser Thread**: Changes to the busy-wait loop in `futex_wait_main_browser_thread` are out of scope.
-- **Direct Atomics Usage**: Threads that call `atomic.wait` directly (bypassing `emscripten_futex_wait`) will remain un-interruptible.
+- **Direct Atomics Usage**: Threads that call `atomic.wait` directly (bypassing `emscripten_futex_wait`) remain un-interruptible.
 - **Wasm Workers**: Wasm Workers do not have a `pthread` structure, so they are not covered by this design.
 
-## Proposed Design
+## Design
 
 The core idea is to allow "side-channel" wakeups (cancellation, mailbox events)
 to interrupt the `atomic.wait` call by having the waker call `atomic.wake` on the
 same address the waiter is currently blocked on.
 
-As part of this design we will need to explicitly state that
-`emscripten_futex_wait` now supports spurious wakeups. i.e. it may return `0`
-(success) even if the underlying futex was not explicitly woken by the
-application.
+As part of this design, `emscripten_futex_wait` now explicitly supports spurious
+wakeups. i.e. it may return `0` (success) even if the underlying futex was not
+explicitly woken by the application.
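
Because a `0` return no longer proves that the application woke the futex, callers are expected to re-check their own predicate after every return. The following is a minimal sketch of that caller-side pattern in portable C11. The names `wait_fn`, `wait_until_set` and `fake_wait` are hypothetical and exist only for illustration; the `wait_fn` signature is modeled on `emscripten_futex_wait`, which real code would pass instead of the stand-in.

```c
#include <assert.h>
#include <math.h>       // INFINITY
#include <stdatomic.h>
#include <stdint.h>

// Hypothetical wait signature, modeled on emscripten_futex_wait.
typedef int (*wait_fn)(volatile void *addr, uint32_t val, double max_wait_ms);

// Blocks until *flag becomes non-zero. A return from `wait` is treated only
// as a hint to re-check the flag, never as proof that the flag changed.
// Returns the number of times `wait` was called.
static int wait_until_set(_Atomic uint32_t *flag, wait_fn wait) {
  int calls = 0;
  while (atomic_load(flag) == 0) {
    wait((volatile void *)flag, 0, INFINITY);
    calls++;
  }
  return calls;
}

// Stand-in for the real wait: returns 0 twice without the flag changing
// (two spurious wakeups), then sets the flag to model a real application wake.
static int fake_wait_calls = 0;
static int fake_wait(volatile void *addr, uint32_t val, double max_wait_ms) {
  (void)val;
  (void)max_wait_ms;
  if (++fake_wait_calls == 3) {
    atomic_store((_Atomic uint32_t *)addr, 1);
  }
  return 0;
}
```

With the stand-in above, `wait_until_set(&flag, fake_wait)` calls the wait function three times: the first two returns are ignored because the flag is still zero, and only the third ends the loop.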
 
 ### 1. `struct pthread` Extensions
-We will add a single atomic `wait_addr` field to `struct pthread` (in
+A single atomic `wait_addr` field was added to `struct pthread` (in
 `system/lib/libc/musl/src/internal/pthread_impl.h`).
 
 ```c
@@ -57,82 +56,80 @@ _Atomic uintptr_t wait_addr;
 ```
 
 ### 2. Waiter Logic (`emscripten_futex_wait`)
-The waiter will follow this logic:
+The waiter follows this logic:
 
-1. **Notification Loop**:
+1. **Publish Wait Address**:
    ```c
    uintptr_t expected_null = 0;
-   while (!atomic_compare_exchange_strong(&self->wait_addr, &expected_null, (uintptr_t)addr)) {
+   if (!atomic_compare_exchange_strong(&self->wait_addr, &expected_null, (uintptr_t)addr)) {
      // If the CAS failed, it means NOTIFY_BIT was set by another thread.
-     assert(expected_null == NOTIFY_BIT);
-     // Let the notifier know that we received the wakeup notification by
-     // resetting wait_addr.
-     self->wait_addr = 0;
-     handle_wakeup(); // Process mailbox or handle cancellation
-     // Reset expected_null because CAS updates it to the observed value on failure.
-     expected_null = 0;
+     assert(expected_null & NOTIFY_BIT);
+     // We don't wait at all; instead behave as if we spuriously woke up.
+     ret = ATOMICS_WAIT_OK;
+     goto done;
    }
    ```
 2. **Wait**: Call `ret = __builtin_wasm_memory_atomic_wait32(addr, val, timeout)`.
-3. **Unpublish & Check**:
+3. **Unpublish**:
    ```c
-   // Clear wait_addr and check if a notification arrived while we were sleeping.
-   if ((atomic_exchange(&self->wait_addr, 0) & NOTIFY_BIT) != 0) {
-     handle_wakeup();
-   }
+   done:
+   self->wait_addr = 0;
    ```
-4. **Return**: Return the result of the wait.
+4. **Handle side effects**: If the wake was due to cancellation or mailbox
+   events, these are handled after `emscripten_futex_wait` returns (or
+   internally via `pthread_testcancel` if cancellable).
 
 Note: We do **not** loop internally if `ret == ATOMICS_WAIT_OK`. Even if we
 suspect the wake was caused by a side-channel event, we must return to the user
 to avoid "swallowing" a simultaneous real application wake.
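
Taken together, the waiter steps can be sketched as one function that runs outside Emscripten. This is an illustration under stated assumptions, not the actual implementation: the value of `NOTIFY_BIT` is assumed (its real definition is not shown in this diff), the result codes are assumed to follow the Wasm `atomic.wait` convention, and `block_fn`, `waiter_sketch` and `block_times_out` are hypothetical names standing in for the real blocking wait.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

// Assumed values, for this sketch only.
#define NOTIFY_BIT ((uintptr_t)1)
enum { ATOMICS_WAIT_OK = 0, ATOMICS_WAIT_NOT_EQUAL = 1, ATOMICS_WAIT_TIMED_OUT = 2 };

// Hypothetical stand-in for the blocking atomic wait.
typedef int (*block_fn)(void *addr);

// Publishes `addr`, blocks unless a notification is already pending, then
// unpublishes. `did_block` reports whether the blocking wait was entered.
static int waiter_sketch(_Atomic uintptr_t *wait_addr, void *addr,
                         block_fn block, bool *did_block) {
  int ret;
  uintptr_t expected_null = 0;
  *did_block = false;
  if (!atomic_compare_exchange_strong(wait_addr, &expected_null, (uintptr_t)addr)) {
    // A notification arrived before we published: skip the wait and behave
    // like a spurious wakeup.
    assert(expected_null & NOTIFY_BIT);
    ret = ATOMICS_WAIT_OK;
  } else {
    *did_block = true;
    ret = block(addr);
  }
  // Unpublish. This also clears NOTIFY_BIT, which releases any waker that is
  // still polling wait_addr.
  atomic_store(wait_addr, 0);
  // No internal retry: the result goes straight back to the caller.
  return ret;
}

// Stand-in that models a wait ending by timeout.
static int block_times_out(void *addr) {
  (void)addr;
  return ATOMICS_WAIT_TIMED_OUT;
}
```

In both paths the function returns with `wait_addr` reset to `0`, and a pending notification turns into an immediate `ATOMICS_WAIT_OK` without ever blocking.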
 
-### 3. Waker Logic
-When a thread needs to wake another thread for a side-channel event:
+### 3. Waker Logic (`_emscripten_thread_notify`)
+When a thread needs to wake another thread for a side-channel event (e.g.
+enqueuing work or cancellation), it calls `_emscripten_thread_notify`:
 
-1. **Enqueue Work**: Add the task to the target's mailbox or set the cancellation flag.
-2. **Signal**:
-   ```c
-   uintptr_t addr = atomic_fetch_or(&target->wait_addr, NOTIFY_BIT);
-   if (addr == 0 || (addr & NOTIFY_BIT) != 0) {
-     // Either the thread wasn't waiting (it will see NOTIFY_BIT later),
-     // or someone else is already in the process of notifying it.
-     return;
-   }
-   // We set the bit and are responsible for waking the target.
-   // The target is currently waiting on `addr`.
-   while (target->wait_addr == (addr | NOTIFY_BIT)) {
-     emscripten_futex_wake((void*)addr, INT_MAX);
-     sched_yield();
-   }
-   ```
+```c
+void _emscripten_thread_notify(pthread_t target) {
+  uintptr_t addr = atomic_fetch_or(&target->wait_addr, NOTIFY_BIT);
+  if (addr == 0 || (addr & NOTIFY_BIT) != 0) {
+    // Either the thread wasn't waiting (it will see NOTIFY_BIT later),
+    // or someone else is already in the process of notifying it.
+    return;
+  }
+  // We set the bit and are responsible for waking the target.
+  // The target is currently waiting on `addr`.
+  while (target->wait_addr == (addr | NOTIFY_BIT)) {
+    emscripten_futex_wake((void*)addr, INT_MAX);
+    sched_yield();
+  }
+}
+```
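
The single `atomic_fetch_or` in the waker distinguishes three situations, which may be easier to see when separated out. The sketch below is illustrative only: `notify_outcome` and `classify_notify` are hypothetical names, and the value of `NOTIFY_BIT` is an assumption, since its definition is not shown in this diff.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NOTIFY_BIT ((uintptr_t)1)  // Assumed value, for this sketch only.

typedef enum {
  WAITER_NOT_WAITING,      // wait_addr was 0: the target sees the bit on its next wait
  NOTIFY_ALREADY_PENDING,  // another waker set the bit first and owns the wake
  MUST_WAKE                // this caller set the bit and must wake the target
} notify_outcome;

// Sets NOTIFY_BIT and reports which of the three cases the caller is in,
// based on the value wait_addr held just before the bit was set.
static notify_outcome classify_notify(_Atomic uintptr_t *wait_addr) {
  uintptr_t addr = atomic_fetch_or(wait_addr, NOTIFY_BIT);
  if (addr == 0) {
    return WAITER_NOT_WAITING;
  }
  if ((addr & NOTIFY_BIT) != 0) {
    return NOTIFY_ALREADY_PENDING;
  }
  return MUST_WAKE;
}
```

Only the `MUST_WAKE` case enters the wake loop, so at most one waker at a time polls a given target.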
 
 ### 4. Handling the Race Condition
 The protocol handles the "Lost Wakeup" race by having the waker loop until the
 waiter clears its `wait_addr`. If the waker sets the `NOTIFY_BIT` just before
 the waiter enters `atomic.wait`, the `atomic_wake` will be delivered once the
 waiter is asleep. If the waiter wakes up for any reason (timeout, real wake, or
-side-channel wake), its `atomic_exchange` will satisfy the waker's loop
-condition.
+side-channel wake), its reset of `wait_addr` to `0` will satisfy the waker's
+loop condition.
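
The retry behaviour can be illustrated with a deterministic, single-threaded model in which the first wake is lost because the waiter has not yet gone to sleep. This is a sketch under assumptions, not the real code path: `wake_fn`, `wake_until_cleared` and `fake_wake` are hypothetical names, the value of `NOTIFY_BIT` is assumed, and the stand-in replaces both `emscripten_futex_wake` and the waiter thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NOTIFY_BIT ((uintptr_t)1)  // Assumed value, for this sketch only.

// Hypothetical stand-in for waking the target thread.
typedef void (*wake_fn)(_Atomic uintptr_t *wait_addr);

// Models the waker loop: keep waking until the waiter resets wait_addr.
// Returns the number of wake attempts that were needed.
static int wake_until_cleared(_Atomic uintptr_t *wait_addr, uintptr_t addr,
                              wake_fn wake) {
  int attempts = 0;
  while (atomic_load(wait_addr) == (addr | NOTIFY_BIT)) {
    wake(wait_addr);
    attempts++;
  }
  return attempts;
}

// Models a waiter that is not yet asleep when the first wake is issued, so
// that wake is lost. The second wake reaches it and it resets wait_addr.
static int lost_wakes_remaining = 1;
static void fake_wake(_Atomic uintptr_t *wait_addr) {
  if (lost_wakes_remaining > 0) {
    lost_wakes_remaining--;
    return;
  }
  atomic_store(wait_addr, 0);
}
```

In this model the waker needs two attempts: the lost first wake leaves `wait_addr` unchanged, so the loop simply tries again until the waiter has unpublished.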
 
 ## Benefits
 
 - **Lower Power Consumption**: Threads can sleep indefinitely (or for the full duration of a user-requested timeout) without periodic wakeups.
-- **Lower Latency**: Mailbox events and cancellation requests are processed immediately rather than waiting for the next 1ms or 100ms tick.
-- **Simpler Loop**: The complex logic for calculating remaining timeout slices in `emscripten_futex_wait` is removed.
+- **Lower Latency**: Mailbox events and cancellation requests are processed immediately rather than waiting for the next tick.
+- **Simpler Loop**: The complex logic for calculating remaining timeout slices in `emscripten_futex_wait` was removed.
 
 ## Alternatives Considered
 - **Signal-based wakeups**: Not currently feasible in Wasm as signals are not
   implemented in a way that can interrupt `atomic.wait`.
 - **A single global "wake-up" address per thread**: This would require the
   waiter to wait on *two* addresses simultaneously (the user's futex and its
-  own wakeup address), which `atomic.wait` does not support. The proposed
+  own wakeup address), which `atomic.wait` does not support. The implemented
   design works around this by having the waker use the *user's* futex address.
 
 ## Security/Safety Considerations
-- **The `wait_addr` must be managed carefully** to ensure wakers don't
+- **The `wait_addr` is managed carefully** to ensure wakers don't
   call `atomic.wake` on stale addresses. Clearing the address upon wake
   mitigates this.
-- **The waker loop should have a reasonable fallback** (like a yield) to prevent a
-  busy-wait deadlock if the waiter is somehow prevented from waking up (though
-  `atomic.wait` is generally guaranteed to wake if `atomic.wake` is called).
+- **The waker loop has a yield** to prevent a busy-wait deadlock if the waiter
+  is somehow prevented from waking up (though `atomic.wait` is generally
+  guaranteed to wake if `atomic.wake` is called).