Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
140 changes: 140 additions & 0 deletions docs/design/01-precise-futex-wakeups.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,140 @@
# Design Doc: Precise Futex Wakeups in Emscripten

- **Status**: Draft
- **Bug**: https://github.com/emscripten-core/emscripten/issues/26633

## Context
Currently, `emscripten_futex_wait` (in
`system/lib/pthread/emscripten_futex_wait.c`) relies on a periodic wakeup loop
for pthreads and the main runtime thread. This is done for two primary reasons:

1. **Thread Cancellation**: To check if the calling thread has been cancelled while it is blocked.
2. **Main Runtime Thread Events**: To allow the main runtime thread (even when not the main browser thread) to process its mailbox/event queue.

The current implementation uses a 1ms wakeup interval for the main runtime
thread and a 100ms interval for cancellable pthreads. This leads to unnecessary
Comment on lines +14 to +15
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be out of date once the new design lands. Should this be phrased as "the previous implementation" or edited out entirely to future-proof the doc?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure how to address this problem. The perhaps by updating these documents if/when the bugs get closed. i.e. perhaps when Draft is changed to Done or Complete?

CPU wakeups and increased latency for events.

## Goals
- Remove the periodic wakeup loop from `emscripten_futex_wait`.
- Implement precise, event-driven wakeups for cancellation and mailbox events.
- Maintain the existing `emscripten_futex_wait` API signature.
- Focus implementation on threads that support `atomic.wait` (pthreads and workers).

## Non-Goals
- **Main Browser Thread**: Changes to the busy-wait loop in `futex_wait_main_browser_thread` are out of scope.
- **Direct Atomics Usage**: Threads that call `atomic.wait` directly (bypassing `emscripten_futex_wait`) will remain un-interruptible.
- **Wasm Workers**: Wasm Worker do not have a `pthread` structure, they are not covered by this design.

## Proposed Design

The core idea is to allow "side-channel" wakeups (cancellation, mailbox events)
to interrupt the `atomic.wait` call by having the waker call `atomic.wake` on the
same address the waiter is currently blocked on.

As part of this design we will need to explicitly state that
`emscripten_futex_wait` now supports spurious wakeups. i.e. it may return `0`
(success) even if the underlying futex was not explicitly woken by the
application.

### 1. `struct pthread` Extensions
We will add the following fields to `struct pthread` (in
`system/lib/libc/musl/src/internal/pthread_impl.h`). All operations on these
fields must use `memory_order_seq_cst` to ensure the handshake is robust.

```c
// The address the thread is currently waiting on in emscripten_futex_wait.
// NULL if the thread is not currently in a futex wait.
_Atomic(void*) waiting_on_address;

// A counter that is incremented every time the thread wakes up from a futex wait.
// Used by wakers to ensure the target thread has actually acknowledged the wake.
_Atomic(uint32_t) wait_counter;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be named wake_counter.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its the waiter that increments it though. i.e. it counts how much times a given threads has called futex_wait.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it increments upon waking. The current name suggests to me that it should be incremented when going to sleep, but in fact it is incremented when waking.


// A bitmask of reasons why the thread was woken for a side-channel event.
_Atomic(uint32_t) wait_reasons;

#define WAIT_REASON_CANCEL (1 << 0)
#define WAIT_REASON_MAILBOX (1 << 1)
```

### 2. Waiter Logic (`emscripten_futex_wait`)
The waiter will follow this logic (using `SEQ_CST` for all atomic accesses):

1. **Pre-check**: Check `wait_reasons`. If non-zero, handle the reasons (e.g., process mailbox or handle cancellation).
2. **Publish**: Set `waiting_on_address = addr`.
3. **Counter Snapshot**: Read `current_counter = wait_counter`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't appear to be used for anything. (But it would be easier to be sure if looking at real code.)

4. **Double-check**: This is critical to avoid the race where a reason was added just before `waiting_on_address` was set. If `wait_reasons` is now non-zero, clear `waiting_on_address` and go to step 1.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be slightly simplified by using the low bits of waiting_on_address as the wait_reason, so both could atomically checked and updated in a CAS loop.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think we can just remove wait_reasons completely.

That was something that Gemini added rather than me, and I don't think its actually safe since we could have multiple wakers trying to wake a sleeping thread for different reasons.

This is also why the counter is needed since it removes the multiple-wakers problem.

5. **Wait**: Call `ret = __builtin_wasm_memory_atomic_wait32(addr, val, timeout)`.
6. **Unpublish**:
- Set `waiting_on_address = NULL`.
- Atomically increment `wait_counter`.
7. **Post-check**: Check `wait_reasons`. If non-zero, handle the reasons.
8. **Return**: Return the result of the wait to the caller.
- If `ret == ATOMICS_WAIT_OK`, return `0`.
- If `ret == ATOMICS_WAIT_TIMED_OUT`, return `-ETIMEDOUT`.
- If `ret == ATOMICS_WAIT_NOT_EQUAL`, return `-EWOULDBLOCK`.
Comment on lines +64 to +76
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a meta-comment, I'm not sure it's worth checking in a design doc if it is going to contain precise pseudocode listing like this. It might as well be real code.


Note: We do **not** loop internally if `ret == ATOMICS_WAIT_OK`. Even if we suspect the wake was caused by a side-channel event, we must return to the user to avoid "swallowing" a simultaneous real application wake that might not have changed the memory value.

### 3. Waker Logic
When a thread needs to wake another thread for a side-channel event (e.g., in `pthread_cancel` or `em_task_queue_enqueue`):

1. Atomically OR the appropriate bit into the target thread's `wait_reasons` (`SEQ_CST`).
2. Read `target_addr = target->waiting_on_address` (`SEQ_CST`).
3. If `target_addr` is not NULL:
- Read `start_c = target->wait_counter` (`SEQ_CST`).
- Enter a loop:
- Call `emscripten_futex_wake(target_addr, 1)`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wake just one waiter instead of all waiters at once? If all were woken at once, we could get rid of the wait_counter entirely, right? And maybe get rid of waiting_on_address as well?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I don't see any reason not to wake all waiters.

Regarding getting rid waiting_on_address .. I don't see how that is possible? If you need to wake a given thread you need to know what address its currently waiting on don't you?

Regarding getting rid of wait_counter, did you grok the reason why its actually needed? i.e. to fix the possible race between setting waiting_on_address and the actual atomic.wait instruction. Maybe you have some other idea about how to solve that race?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you grok

Probably not :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I got it. We need the waiting_on_address so we know what address to notify when we want to send a waiting thread a message. It's possible we could send the notification before the waiting thread actually goes to sleep, so to protect against that we notify it repeatedly until we observe it changing its wake count, at which point we know it will proceed to handle the notification.

But instead of a wake count, perhaps we could just have the waiting thread clear its wait_reasons when it reads them after waking up. Then the notifying thread can just observe the wait_reasons get cleared to know the target thread has woken up.

- Exit loop if `target->wait_counter != start_c` OR `target->waiting_on_address != target_addr`.
- **Yield**: Call `sched_yield()` (or a small sleep) to allow the target thread to proceed if it is currently being scheduled.

### 4. Handling the Race Condition
The "Lost Wakeup" race is handled by the combination of:
- The waiter double-checking `wait_reasons` after publishing its `waiting_on_address`.
- The waker looping `atomic.wake` until the waiter increments its `wait_counter`.

Even if the waker's first `atomic.wake` occurs after the waiter's double-check
but *before* the waiter actually enters the `atomic.wait` instruction, the waker
will continue to loop and call `atomic.wake` again. The subsequent call(s) will
successfully wake the waiter once it is actually sleeping.

Multiple wakers can safely call this logic simultaneously; they will all exit
the loop as soon as the waiter acknowledges the wake by incrementing the
counter.

### 5. Overlapping and Spurious Wakeups
The design must handle cases where "real" wakeups (triggered by the application) and "side-channel" wakeups (cancellation/mailbox) occur simultaneously.

1. **Spurious Wakeups for Other Threads**: If multiple threads are waiting on the same address (e.g., a shared mutex), a side-channel `atomic_wake(addr, 1)` targeted at Thread A might be delivered by the kernel to Thread B.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no kernel in this case. Is this describing how native futexes work?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess "kernel" should be replaced by "host" here. Would that makes sense here. This sentence does make sense to me with that replacment.

Its referring to the fact that normal wakups can overlap with the side channel wakeups.

- **Thread B's response**: It will wake up, increment its `wait_counter`, see that its `wait_reasons` are empty, and return `0` to its caller.
- **Thread C (the waker)**: It will see that Thread A's `wait_counter` has *not* changed and `waiting_on_address` is still `addr`. It will therefore continue its loop and call `atomic_wake` again until Thread A is finally woken.
- **Result**: Thread B experiences a "spurious" wakeup. This is acceptable and expected behavior for futex-based synchronization.
2. **Handling Side-Channel Success**: If Thread A is woken by the side-channel, it handles the event and returns `0`. The user's code will typically see that its own synchronization condition is not yet met and immediately call `emscripten_futex_wait` again. This effectively "resumes" the wait from the user's perspective while having allowed the side-channel event to be processed.
3. **No Lost "Real" Wakeups**: By returning to the caller whenever `atomic.wait` returns `OK`, we ensure that we never miss or swallow a real application-level `atomic.wake`.

### 6. Counter Wrap-around
The `wait_counter` is a `uint32_t` and will wrap around to zero after $2^{32}$ wakeups. This is safe because:
1. **Impossibility of Racing**: For the waker to "miss" a wake-up due to wrap-around, the waiter would have to wake up and re-enter a sleep state exactly $2^{32}$ times in the very brief window between the waker's `atomic_wake` and its subsequent check of `wait_counter`. Even at extreme wakeup frequencies (e.g., 1 million per second), this would take over an hour.
2. **Address Change Check**: The waker loop also checks `target->waiting_on_address != target_addr`. If the waiter wakes up and either stops waiting or starts waiting on a *different* address, the waker will exit the loop regardless of the counter value.

### 6. Benefits
- **Lower Power Consumption**: Threads can sleep indefinitely (or for the full duration of a user-requested timeout) without periodic wakeups.
- **Lower Latency**: Mailbox events and cancellation requests are processed immediately rather than waiting for the next 1ms or 100ms tick.
- **Simpler Loop**: The complex logic for calculating remaining timeout slices in `emscripten_futex_wait` is removed.

## Alternatives Considered
- **Signal-based wakeups**: Not currently feasible in Wasm as signals are not
implemented in a way that can interrupt `atomic.wait`.
- **A single global "wake-up" address per thread**: This would require the
waiter to wait on *two* addresses simultaneously (the user's futex and its
own wakeup address), which `atomic.wait` does not support. The proposed
design works around this by having the waker use the *user's* futex address.

## Security/Safety Considerations
- The `waiting_on_address` must be managed carefully to ensure wakers don't
call `atomic.wake` on stale addresses. The `wait_counter` and clearing the
address upon wake mitigate this.
- The waker loop should have a reasonable fallback (like a yield) to prevent a
busy-wait deadlock if the waiter is somehow prevented from waking up (though
`atomic.wait` is generally guaranteed to wake if `atomic.wake` is called).
10 changes: 10 additions & 0 deletions docs/design/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# Emscripten Design Documents

This directory contains design documents for emscripten features and major
changes/refactors.

We are experimenting with keeping these document here under source control with
the hope that this will increase understandability of the codebase. This has
some advantages over doing all planning in Google Docs or GitHub issues. For
example, it allows us to track the history of designs and it allows them to be
searchable using standard tools like `git grep`.
Loading