Commit 3281561

Implement precise pthread wakeups (#26659)
This change implements the design in docs/design/01-precise-futex-wakeups.md. As part of this change, the `emscripten_futex_wait` API now explicitly allows spurious wakeups, much like the corresponding Linux syscall. A couple of tests were using `emscripten_futex_wait` without a loop, so I added loops to them out of an abundance of caution. I also rolled the posixtestsuite submodule so that it now includes emscripten-core/posixtestsuite#12, which was needed because that test became more flaky with this change. Fixes: #26633
1 parent 4a659ff commit 3281561

31 files changed, +289 -186 lines changed

ChangeLog.md

Lines changed: 3 additions & 0 deletions
@@ -20,6 +20,9 @@ See docs/process.md for more on how version tagging works.
 
 5.0.7 (in development)
 ----------------------
+- The emscripten_futex_wait API is now documented to explicitly allow spurious
+  wakeups. This was part of an internal change to improve inter-thread
+  communication. (#26659)
 
 5.0.6 - 04/14/26
 ----------------

docs/design/01-precise-futex-wakeups.md

Lines changed: 54 additions & 57 deletions
@@ -1,18 +1,18 @@
 # Design Doc: Precise Futex Wakeups
 
-- **Status**: Draft
+- **Status**: Completed
 - **Bug**: https://github.com/emscripten-core/emscripten/issues/26633
 
 ## Context
-Currently, `emscripten_futex_wait` (in
-`system/lib/pthread/emscripten_futex_wait.c`) relies on a periodic wakeup loop
-for pthreads and the main runtime thread. This is done for two primary reasons:
+Historically, `emscripten_futex_wait` (in
+`system/lib/pthread/emscripten_futex_wait.c`) relied on a periodic wakeup loop
+for pthreads and the main runtime thread. This was done for two primary reasons:
 
-1. **Thread Cancellation**: To check if the calling thread has been cancelled while it is blocked.
+1. **Thread Cancellation**: To check if the calling thread had been cancelled while it was blocked.
 2. **Main Runtime Thread Events**: To allow the main runtime thread (even when not the main browser thread) to process its mailbox/event queue.
 
-The current implementation uses a 1ms wakeup interval for the main runtime
-thread and a 100ms interval for cancellable pthreads. This leads to unnecessary
+The old implementation used a 1ms wakeup interval for the main runtime
+thread and a 100ms interval for cancellable pthreads. This led to unnecessary
 CPU wakeups and increased latency for events.
 
 ## Goals
@@ -23,22 +23,21 @@ CPU wakeups and increased latency for events.
 
 ## Non-Goals
 - **Main Browser Thread**: Changes to the busy-wait loop in `futex_wait_main_browser_thread` are out of scope.
-- **Direct Atomics Usage**: Threads that call `atomic.wait` directly (bypassing `emscripten_futex_wait`) will remain un-interruptible.
+- **Direct Atomics Usage**: Threads that call `atomic.wait` directly (bypassing `emscripten_futex_wait`) remain un-interruptible.
 - **Wasm Workers**: Wasm Workers do not have a `pthread` structure, so they are not covered by this design.
 
-## Proposed Design
+## Design
 
 The core idea is to allow "side-channel" wakeups (cancellation, mailbox events)
 to interrupt the `atomic.wait` call by having the waker call `atomic.wake` on the
 same address the waiter is currently blocked on.
 
-As part of this design we will need to explicitly state that
-`emscripten_futex_wait` now supports spurious wakeups. i.e. it may return `0`
-(success) even if the underlying futex was not explicitly woken by the
-application.
+As part of this design, `emscripten_futex_wait` now explicitly supports spurious
+wakeups. i.e. it may return `0` (success) even if the underlying futex was not
+explicitly woken by the application.
 
 ### 1. `struct pthread` Extensions
-We will add a single atomic `wait_addr` field to `struct pthread` (in
+A single atomic `wait_addr` field was added to `struct pthread` (in
 `system/lib/libc/musl/src/internal/pthread_impl.h`).
 
 ```c
@@ -57,82 +56,80 @@ _Atomic uintptr_t wait_addr;
 ```
 
 ### 2. Waiter Logic (`emscripten_futex_wait`)
-The waiter will follow this logic:
+The waiter follows this logic:
 
-1. **Notification Loop**:
+1. **Publish Wait Address**:
    ```c
    uintptr_t expected_null = 0;
-   while (!atomic_compare_exchange_strong(&self->wait_addr, &expected_null, (uintptr_t)addr)) {
+   if (!atomic_compare_exchange_strong(&self->wait_addr, &expected_null, (uintptr_t)addr)) {
      // If the CAS failed, it means NOTIFY_BIT was set by another thread.
-     assert(expected_null == NOTIFY_BIT);
-     // Let the notifier know that we received the wakeup notification by
-     // resetting wait_addr.
-     self->wait_addr = 0;
-     handle_wakeup(); // Process mailbox or handle cancellation
-     // Reset expected_null because CAS updates it to the observed value on failure.
-     expected_null = 0;
+     assert(expected_null & NOTIFY_BIT);
+     // We don't wait at all; instead behave as if we spuriously woke up.
+     ret = ATOMICS_WAIT_OK;
+     goto done;
    }
    ```
 2. **Wait**: Call `ret = __builtin_wasm_memory_atomic_wait32(addr, val, timeout)`.
-3. **Unpublish & Check**:
+3. **Unpublish**:
    ```c
-   // Clear wait_addr and check if a notification arrived while we were sleeping.
-   if ((atomic_exchange(&self->wait_addr, 0) & NOTIFY_BIT) != 0) {
-     handle_wakeup();
-   }
+   done:
+   self->wait_addr = 0;
    ```
-4. **Return**: Return the result of the wait.
+4. **Handle side effects**: If the wake was due to cancellation or mailbox
+   events, these are handled after `emscripten_futex_wait` returns (or
+   internally via `pthread_testcancel` if cancellable).
 
 Note: We do **not** loop internally if `ret == ATOMICS_WAIT_OK`. Even if we
 suspect the wake was caused by a side-channel event, we must return to the user
 to avoid "swallowing" a simultaneous real application wake.
 
-### 3. Waker Logic
-When a thread needs to wake another thread for a side-channel event:
+### 3. Waker Logic (`_emscripten_thread_notify`)
+When a thread needs to wake another thread for a side-channel event (e.g.
+enqueuing work or cancellation), it calls `_emscripten_thread_notify`:
 
-1. **Enqueue Work**: Add the task to the target's mailbox or set the cancellation flag.
-2. **Signal**:
-   ```c
-   uintptr_t addr = atomic_fetch_or(&target->wait_addr, NOTIFY_BIT);
-   if (addr == 0 || (addr & NOTIFY_BIT) != 0) {
-     // Either the thread wasn't waiting (it will see NOTIFY_BIT later),
-     // or someone else is already in the process of notifying it.
-     return;
-   }
-   // We set the bit and are responsible for waking the target.
-   // The target is currently waiting on `addr`.
-   while (target->wait_addr == (addr | NOTIFY_BIT)) {
-     emscripten_futex_wake((void*)addr, INT_MAX);
-     sched_yield();
-   }
-   ```
+```c
+void _emscripten_thread_notify(pthread_t target) {
+  uintptr_t addr = atomic_fetch_or(&target->wait_addr, NOTIFY_BIT);
+  if (addr == 0 || (addr & NOTIFY_BIT) != 0) {
+    // Either the thread wasn't waiting (it will see NOTIFY_BIT later),
+    // or someone else is already in the process of notifying it.
+    return;
+  }
+  // We set the bit and are responsible for waking the target.
+  // The target is currently waiting on `addr`.
+  while (target->wait_addr == (addr | NOTIFY_BIT)) {
+    emscripten_futex_wake((void*)addr, INT_MAX);
+    sched_yield();
+  }
+}
+```
 
 ### 4. Handling the Race Condition
 The protocol handles the "Lost Wakeup" race by having the waker loop until the
 waiter clears its `wait_addr`. If the waker sets the `NOTIFY_BIT` just before
 the waiter enters `atomic.wait`, the `atomic_wake` will be delivered once the
 waiter is asleep. If the waiter wakes up for any reason (timeout, real wake, or
-side-channel wake), its `atomic_exchange` will satisfy the waker's loop
-condition.
+side-channel wake), its reset of `wait_addr` to `0` will satisfy the waker's
+loop condition.
 
 ## Benefits
 
 - **Lower Power Consumption**: Threads can sleep indefinitely (or for the full duration of a user-requested timeout) without periodic wakeups.
-- **Lower Latency**: Mailbox events and cancellation requests are processed immediately rather than waiting for the next 1ms or 100ms tick.
-- **Simpler Loop**: The complex logic for calculating remaining timeout slices in `emscripten_futex_wait` is removed.
+- **Lower Latency**: Mailbox events and cancellation requests are processed immediately rather than waiting for the next tick.
+- **Simpler Loop**: The complex logic for calculating remaining timeout slices in `emscripten_futex_wait` was removed.
 
 ## Alternatives Considered
 - **Signal-based wakeups**: Not currently feasible in Wasm as signals are not
   implemented in a way that can interrupt `atomic.wait`.
 - **A single global "wake-up" address per thread**: This would require the
   waiter to wait on *two* addresses simultaneously (the user's futex and its
-  own wakeup address), which `atomic.wait` does not support. The proposed
+  own wakeup address), which `atomic.wait` does not support. The implemented
   design works around this by having the waker use the *user's* futex address.
 
 ## Security/Safety Considerations
-- **The `wait_addr` must be managed carefully** to ensure wakers don't
+- **The `wait_addr` is managed carefully** to ensure wakers don't
   call `atomic.wake` on stale addresses. Clearing the address upon wake
   mitigates this.
-- **The waker loop should have a reasonable fallback** (like a yield) to prevent a
-  busy-wait deadlock if the waiter is somehow prevented from waking up (though
-  `atomic.wait` is generally guaranteed to wake if `atomic.wake` is called).
+- **The waker loop has a yield** to prevent a busy-wait deadlock if the waiter
+  is somehow prevented from waking up (though `atomic.wait` is generally
+  guaranteed to wake if `atomic.wake` is called).
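
Reviewer sketch (not part of the commit): the publish/notify handshake described in the design doc can be walked through with plain C11 atomics. The helper names `publish_wait_addr`, `unpublish_wait_addr`, and `notify_target` are hypothetical, and the actual blocking wait and the waker's `emscripten_futex_wake` retry loop are deliberately left out, so this only models the state transitions of `wait_addr`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NOTIFY_BIT ((uintptr_t)1)

// Waiter, step 1: publish the futex address. Returns false when a
// notification is already pending, in which case the waiter should skip the
// wait and behave as if it woke spuriously.
static bool publish_wait_addr(_Atomic uintptr_t* wait_addr, const void* addr) {
  assert(((uintptr_t)addr & NOTIFY_BIT) == 0);  // low bit must be free
  uintptr_t expected = 0;
  return atomic_compare_exchange_strong(wait_addr, &expected, (uintptr_t)addr);
}

// Waiter, step 3: clear the published address (and any pending NOTIFY_BIT).
static void unpublish_wait_addr(_Atomic uintptr_t* wait_addr) {
  atomic_store(wait_addr, 0);
}

// Waker: set NOTIFY_BIT. Returns the address this caller is responsible for
// waking, or 0 when the target was not waiting or another waker got there
// first.
static uintptr_t notify_target(_Atomic uintptr_t* wait_addr) {
  uintptr_t prev = atomic_fetch_or(wait_addr, NOTIFY_BIT);
  if (prev == 0 || (prev & NOTIFY_BIT) != 0) {
    return 0;
  }
  return prev;
}
```

In this model, a notification that arrives before the waiter publishes makes `publish_wait_addr` fail, and a notification that arrives while the address is published hands exactly one waker the address to wake.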

src/struct_info_generated.json

Lines changed: 1 addition & 1 deletion
@@ -1036,7 +1036,7 @@
     "p_proto": 8
   },
   "pthread": {
-    "__size__": 124,
+    "__size__": 128,
     "profilerBlock": 104,
     "stack": 48,
     "stack_size": 52,

src/struct_info_generated_wasm64.json

Lines changed: 1 addition & 1 deletion
@@ -1036,7 +1036,7 @@
     "p_proto": 16
   },
   "pthread": {
-    "__size__": 216,
+    "__size__": 224,
     "profilerBlock": 184,
     "stack": 80,
     "stack_size": 88,

system/include/emscripten/threading_primitives.h

Lines changed: 4 additions & 0 deletions
@@ -188,6 +188,10 @@ void emscripten_condvar_signal(emscripten_condvar_t * _Nonnull condvar, uint32_t
 
 // If the given memory address contains value val, puts the calling thread to
 // sleep waiting for that address to be notified.
+// Note: Like the Linux futex syscall, this API *does* allow spurious wakeups.
+// This differs from the WebAssembly `atomic.wait` instruction itself, which
+// does *not* allow spurious wakeups, and it means that most callers will want
+// to wrap this in some kind of loop.
 // Returns -EINVAL if addr is null.
 int emscripten_futex_wait(volatile void/*uint32_t*/ * _Nonnull addr, uint32_t val, double maxWaitMilliseconds);
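
Reviewer sketch (not part of the commit): because a return of `0` no longer proves that the futex word changed, a caller would typically re-check its predicate in a loop. To keep the example compilable outside Emscripten, the wait call is passed in as a function pointer; `futex_wait_fn`, `wait_until_set`, and `fake_spurious_wait` are hypothetical names, and under Emscripten one would presumably pass `emscripten_futex_wait` itself. Timeout accounting is omitted for brevity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in with the same parameter order as
// emscripten_futex_wait(addr, val, maxWaitMilliseconds).
typedef int (*futex_wait_fn)(volatile void* addr, uint32_t val, double max_wait_ms);

// Blocks until *flag becomes non-zero, tolerating spurious wakeups.
// Returns how many times the wait function was called.
static int wait_until_set(_Atomic uint32_t* flag, futex_wait_fn wait) {
  assert(flag != NULL && wait != NULL);
  int calls = 0;
  while (atomic_load(flag) == 0) {
    // The return value alone is not trusted; the loop condition decides.
    wait((volatile void*)flag, 0, 1000.0);
    calls++;
  }
  return calls;
}

// Test double: "wakes" twice without touching the flag (spurious wakeups),
// then sets the flag on the third call.
static int fake_wait_calls = 0;
static int fake_spurious_wait(volatile void* addr, uint32_t val, double max_wait_ms) {
  (void)val;
  (void)max_wait_ms;
  if (++fake_wait_calls == 3) {
    atomic_store((_Atomic uint32_t*)addr, 1);
  }
  return 0;
}
```

A caller that treated the first `0` return as "the flag is set" would proceed too early in this scenario; the loop absorbs the two spurious returns.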

system/lib/libc/musl/src/internal/pthread_impl.h

Lines changed: 14 additions & 0 deletions
@@ -111,6 +111,16 @@ struct pthread {
   // postMessage path. Once this becomes true, it remains true so we never
   // fall back to postMessage unnecessarily.
   _Atomic int waiting_async;
+  // The address the thread is currently waiting on in emscripten_futex_wait.
+  //
+  // This field encodes the state using the following bitmask:
+  // - NULL: Not waiting, no pending notification.
+  // - NOTIFY_BIT (0x1): Not waiting, but a notification was sent.
+  // - addr: Waiting on `addr`, no pending notification.
+  // - addr | NOTIFY_BIT: Waiting on `addr`, notification sent.
+  //
+  // Since futex addresses must be 4-byte aligned, the low bit is safe to use.
+  _Atomic uintptr_t wait_addr;
 #endif
 #ifdef EMSCRIPTEN_DYNAMIC_LINKING
   // When dynamic linking is enabled, threads use this to facilitate the
@@ -120,6 +130,10 @@ struct pthread {
 #endif
 };
 
+#ifdef __EMSCRIPTEN__
+#define NOTIFY_BIT (1 << 0)
+#endif
+
 enum {
   DT_EXITED,
   DT_EXITING,
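
Reviewer sketch (not part of the commit): the four `wait_addr` states listed in the new comment can be decoded with two bit tests. The helper names below are hypothetical and exist only to illustrate the encoding; the commit itself manipulates the field directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NOTIFY_BIT ((uintptr_t)1)  // same bit as the macro added in this hunk

// True when a futex address is published (states "addr" and
// "addr | NOTIFY_BIT").
static bool wait_addr_is_waiting(uintptr_t v) {
  return (v & ~NOTIFY_BIT) != 0;
}

// True when a notification is pending (states "NOTIFY_BIT" and
// "addr | NOTIFY_BIT").
static bool wait_addr_has_notification(uintptr_t v) {
  return (v & NOTIFY_BIT) != 0;
}

// Packs an address and a notification flag into one word. This relies on the
// 4-byte alignment of futex addresses, which leaves the low bits zero.
static uintptr_t wait_addr_encode(const void* addr, bool notified) {
  uintptr_t v = (uintptr_t)addr;
  assert((v & 3) == 0);
  return v | (notified ? NOTIFY_BIT : 0);
}
```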

system/lib/libc/musl/src/signal/setitimer.c

Lines changed: 13 additions & 0 deletions
@@ -8,6 +8,7 @@
 #include <emscripten/emscripten.h>
 #include <emscripten/threading.h>
 #include <assert.h>
+#include <math.h>
 #include <signal.h>
 #include <stdint.h>
 #include <stdio.h>
@@ -79,6 +80,18 @@ void _emscripten_check_timers(double now)
     }
   }
 }
+
+double _emscripten_next_timer()
+{
+  assert(emscripten_is_main_runtime_thread());
+  double next_timer = INFINITY;
+  for (int which = 0; which < 3; which++) {
+    if (current_timeout_ms[which]) {
+      next_timer = fmin(current_timeout_ms[which], next_timer);
+    }
+  }
+  return next_timer - emscripten_get_now();
+}
 #endif
 
 int setitimer(int which, const struct itimerval *restrict new, struct itimerval *restrict old)
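
Reviewer sketch (not part of the commit): the arithmetic in `_emscripten_next_timer` can be checked in isolation. This assumes, based only on the hunk above, that each `current_timeout_ms` entry is an absolute deadline in milliseconds and that `0` means the timer is disarmed. The function name `next_timer_delay_ms` and its parameters are hypothetical, and the minimum is computed with a plain comparison instead of `fmin`.

```c
#include <assert.h>
#include <math.h>    // only for the INFINITY macro
#include <stddef.h>

// Returns the delay from now_ms until the earliest armed deadline, or
// INFINITY when no timer is armed. A deadline of 0 is treated as disarmed.
static double next_timer_delay_ms(const double* deadlines_ms, size_t count, double now_ms) {
  assert(deadlines_ms != NULL);
  double next = INFINITY;
  for (size_t i = 0; i < count; i++) {
    if (deadlines_ms[i] != 0 && deadlines_ms[i] < next) {
      next = deadlines_ms[i];
    }
  }
  // INFINITY minus a finite value stays INFINITY, so "no timer" is preserved.
  return next - now_ms;
}
```

With deadlines `{0, 150, 120}` and a current time of `100`, the earliest armed deadline is `120`, so the delay is `20` ms; with all entries `0` the result stays infinite, which a caller could pass on as "no timeout".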

system/lib/libc/musl/src/thread/pthread_cancel.c

Lines changed: 8 additions & 0 deletions
@@ -108,5 +108,13 @@ int pthread_cancel(pthread_t t)
     pthread_exit(PTHREAD_CANCELED);
     return 0;
   }
+#ifdef __EMSCRIPTEN__
+  // Wake the target thread in case it is in emscripten_futex_wait. Normally,
+  // this is only required when the target is the main runtime thread and there
+  // is an event added to its system queue.
+  // However, all threads need to be interrupted like this in the case they are
+  // cancelled.
+  _emscripten_thread_notify(t);
+#endif
   return pthread_kill(t, SIGCANCEL);
 }
}

system/lib/pthread/em_task_queue.c

Lines changed: 5 additions & 0 deletions
@@ -14,6 +14,7 @@
 #include "em_task_queue.h"
 #include "proxying_notification_state.h"
 #include "thread_mailbox.h"
+#include "threading_internal.h"
 
 #define EM_TASK_QUEUE_INITIAL_CAPACITY 128
 
@@ -166,6 +167,7 @@ static bool em_task_queue_grow(em_task_queue* queue) {
 }
 
 void em_task_queue_execute(em_task_queue* queue) {
+  DBG("em_task_queue_execute");
   queue->processing = 1;
   pthread_mutex_lock(&queue->mutex);
   while (!em_task_queue_is_empty(queue)) {
@@ -178,6 +180,7 @@ void em_task_queue_execute(em_task_queue* queue) {
   }
   pthread_mutex_unlock(&queue->mutex);
   queue->processing = 0;
+  DBG("done em_task_queue_execute");
 }
 
 void em_task_queue_cancel(em_task_queue* queue) {
@@ -219,6 +222,7 @@ static void receive_notification(void* arg) {
   notification_state expected = NOTIFICATION_RECEIVED;
   atomic_compare_exchange_strong(
     &tasks->notification, &expected, NOTIFICATION_NONE);
+  DBG("receive_notification done");
 }
 
 static void cancel_notification(void* arg) {
@@ -246,6 +250,7 @@ bool em_task_queue_send(em_task_queue* queue, task t) {
   notification_state previous =
     atomic_exchange(&queue->notification, NOTIFICATION_PENDING);
   if (previous == NOTIFICATION_PENDING) {
+    DBG("em_task_queue_send NOTIFICATION_PENDING already set");
     emscripten_thread_mailbox_unref(queue->thread);
     return true;
   }
