Commit 2d45407

Rebase CABI onto explicit stack-switching interface (no behavior change)
1 parent 2acb403 commit 2d45407

4 files changed

+882
-643
lines changed

design/mvp/CanonicalABI.md

Lines changed: 178 additions & 23 deletions
@@ -16,6 +16,7 @@ specified here.
1616
* [Component Instance State](#component-instance-state)
1717
* [Table State](#table-state)
1818
* [Resource State](#resource-state)
19+
* [Stack Switching](#stack-switching)
1920
* [Thread State](#thread-state)
2021
* [Waitable State](#waitable-state)
2122
* [Task State](#task-state)
@@ -587,6 +588,151 @@ class ResourceType(Type):
587588
```
588589

589590

591+
#### Stack Switching
592+
593+
Component Model concurrency support is defined in terms of the Core WebAssembly
594+
[stack-switching] proposal's `cont.new`, `resume` and `suspend` instructions.
595+
However, the Component Model only needs a limited subset of the full
596+
[stack-switching] proposal:
597+
598+
First, there are only two global [control tags] used with `suspend`:
599+
```wat
600+
(tag $block (param $switch-to (ref null $Thread)) (result $cancelled bool))
601+
(tag $current-thread (result (ref $Thread)))
602+
```
603+
The `$block` tag is used to suspend a [thread] until some future event. The
604+
parameters and results will be described in the next section, where they are
605+
used to define `Thread`. The `$current-thread` tag is used to retrieve the
606+
current `Thread`, which is semantically stored in the `resume` handler's local
607+
state (although an optimizing implementation would instead maintain the current
608+
thread in the VM's execution context so that it could be retrieved with a single
609+
load and/or kept in register state).
610+
611+
Second, there is only a single continuation type used with `resume`:
612+
```wat
613+
(type $ct (cont (func (param $cancelled bool) (result (ref null $Thread)))))
614+
```
615+
Thus, continuations are only produced for the `$block` event; the continuation
616+
produced for `$current-thread` is immediately resumed and so never "escapes".
617+
618+
Third, *every* `resume` performed by the Canonical ABI always handles *both*
619+
`$block` and `$current-thread` and thus every Canonical ABI `suspend` is always
620+
handled by the innermost Canonical ABI `resume` without a dynamic handler/tag
621+
search.
622+
623+
Given these restrictions, specialized versions of `cont.new`, `resume` and
624+
`suspend` that are "monomorphized" to the above types and tags can be easily
625+
implemented in terms of Python's standard preemptive threading primitives, using
626+
[`threading.Thread`] to provide a native stack, [`threading.Lock`] to only allow
627+
a single `threading.Thread` to execute at a time, and [`threading.local`] to
628+
maintain the dynamic handler scope using thread-local storage. This could have
629+
been implemented more directly and efficiently using [fibers], but the Python
630+
standard library doesn't have fibers. However, a realistic implementation is
631+
expected to use (a pool of) fibers.
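
As a rough illustration of the lock-based handoff described above (not part of the commit), the sketch below shows two preemptive `threading.Thread`s coerced into strict alternation with two `threading.Lock`s. The names `held_lock`, `to_worker`, `to_main`, `worker` and `log` are hypothetical and chosen only for this example.

```python
import threading

def held_lock() -> threading.Lock:
  # Hypothetical helper: a lock that starts out held, so the first
  # acquire() by the other side blocks until someone releases it.
  lock = threading.Lock()
  lock.acquire()
  return lock

log = []
to_worker = held_lock()  # released by main to transfer control to the worker
to_main = held_lock()    # released by the worker to transfer control back

def worker():
  for i in range(3):
    to_worker.acquire()   # sleep until main hands control over
    log.append(f"worker {i}")
    to_main.release()     # hand control back to main

t = threading.Thread(target=worker)
t.start()

for i in range(3):
  log.append(f"main {i}")
  to_worker.release()     # transfer control to the worker ...
  to_main.acquire()       # ... and sleep until it transfers control back
t.join()

print(log)
```

Because each side releases the other's lock and then immediately blocks on its own, only one of the two threads is ever runnable, which is the fiber-like behavior the paragraph above relies on. Note that a `threading.Lock` may be released by a thread other than the one that acquired it, which is what makes this handoff possible.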
632+
633+
Starting with `cont.new`, the monomorphized version takes a function type
634+
matching `$ct` above:
```python
class Continuation:
  lock: threading.Lock
  handler: Handler
  cancelled: Cancelled

class Handler:
  tls = threading.local()
  lock: threading.Lock
  current: Thread
  cont: Optional[Continuation]
  switch_to: Optional[Thread]

def cont_new(f: Callable[[Cancelled], Optional[Thread]]) -> Continuation:
  cont = Continuation()
  cont.lock = threading.Lock()
  cont.lock.acquire()
  def wrapper():
    cont.lock.acquire()  # blocks until the first `resume` releases `cont.lock`
    Handler.tls.value = cont.handler
    switch_to = f(cont.cancelled)  # the result of `f` is the thread to switch to
    handler = Handler.tls.value  # re-load: may have changed while `f` was suspended
    handler.cont = None
    handler.switch_to = switch_to
    handler.lock.release()
  threading.Thread(target = wrapper).start()
  return cont
```
663+
Here, `Continuation` is used to pass parameters from `resume` to the
664+
continuation's thread. These parameters are set on `Continuation` right before
665+
`resume` calls `Continuation.lock.release()` to transfer control flow to the
666+
continuation. The `Handler` object is created by `resume` with the expectation
667+
that `Handler.lock.release()` will be called to transfer control flow and
668+
results back to the `resume` handler. The `Handler` is stored in the thread-local
669+
storage of the internal `threading.Thread` to implement the dynamic scoping
670+
required by stack-switching. Because a single `threading.Thread` can be
671+
suspended and resumed many times (each time with a new `Continuation` /
672+
`Handler`), the `Handler` must be re-loaded from TLS after `f` returns since it
673+
may have changed.
674+
675+
Next, `resume` is monomorphized to take a continuation of type `$ct`, the
676+
`cancelled` argument passed to `$ct` and, lastly, the `current` `Thread` which
677+
is to be immediately returned by the `(on $current-thread)` handler. The
678+
`(on $block)` and "returned without suspending" cases are merged into a single
679+
return value, where the latter "returned without suspending" case produces
680+
`None` for the returned `Optional[Continuation]`.
```python
def resume(cont: Continuation, cancelled: Cancelled, current: Thread) -> \
    tuple[Optional[Continuation], Optional[Thread]]:
  handler = Handler()
  handler.lock = threading.Lock()
  handler.lock.acquire()
  handler.current = current
  cont.handler = handler
  cont.cancelled = cancelled
  cont.lock.release()    # transfer control to the continuation's thread
  handler.lock.acquire() # wait until it blocks or returns
  return (handler.cont, handler.switch_to)
```
694+
695+
Lastly, `suspend` is monomorphized into two functions for the `$block` and
696+
`$current-thread` tags shown above, so that their signatures and implementations
697+
can be specialized. Since `$current-thread` has a trivial handler that simply
698+
returns the `current` `Thread` passed to `resume`, it can simply return
699+
`Handler.current` directly without any stack switching.
```python
def block(switch_to: Optional[Thread]) -> Cancelled:
  cont = Continuation()
  cont.lock = threading.Lock()
  cont.lock.acquire()
  handler = Handler.tls.value
  handler.cont = cont
  handler.switch_to = switch_to
  handler.lock.release() # transfer control back to the innermost `resume`
  cont.lock.acquire()    # wait until the new continuation is resumed
  Handler.tls.value = cont.handler
  return cont.cancelled

def current_thread() -> Thread:
  return Handler.tls.value.current
```
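
To see the three primitives working together, here is a self-contained sketch (not part of the commit) that condenses the definitions above and drives one continuation through a block and a return. It makes several assumptions for the sake of being runnable on its own: plain strings stand in for `Thread`, `bool` stands in for `Cancelled`, and `held_lock`, `body` and `trace` are hypothetical names. The two `resume` calls pass different `current` labels purely to make the handler re-load visible.

```python
import threading
from typing import Callable, Optional

def held_lock() -> threading.Lock:
  # Hypothetical helper: a lock that starts out held.
  lock = threading.Lock()
  lock.acquire()
  return lock

class Continuation:
  lock: threading.Lock
  handler: "Handler"
  cancelled: bool                # stand-in for `Cancelled`

class Handler:
  tls = threading.local()
  lock: threading.Lock
  current: str                   # stand-in for `Thread`
  cont: Optional[Continuation]
  switch_to: Optional[str]

def cont_new(f: Callable[[bool], Optional[str]]) -> Continuation:
  cont = Continuation()
  cont.lock = held_lock()
  def wrapper():
    cont.lock.acquire()          # wait for the first resume
    Handler.tls.value = cont.handler
    switch_to = f(cont.cancelled)
    handler = Handler.tls.value  # re-load: replaced on every resume
    handler.cont = None          # None signals "returned without suspending"
    handler.switch_to = switch_to
    handler.lock.release()
  threading.Thread(target=wrapper).start()
  return cont

def resume(cont: Continuation, cancelled: bool, current: str):
  handler = Handler()
  handler.lock = held_lock()
  handler.current = current
  cont.handler = handler
  cont.cancelled = cancelled
  cont.lock.release()
  handler.lock.acquire()
  return (handler.cont, handler.switch_to)

def block(switch_to: Optional[str]) -> bool:
  cont = Continuation()
  cont.lock = held_lock()
  handler = Handler.tls.value
  handler.cont = cont
  handler.switch_to = switch_to
  handler.lock.release()
  cont.lock.acquire()
  Handler.tls.value = cont.handler
  return cont.cancelled

def current_thread() -> str:
  return Handler.tls.value.current

trace = []

def body(cancelled: bool) -> Optional[str]:
  trace.append(("start", current_thread(), cancelled))
  cancelled = block(None)        # suspend back to the first resume
  trace.append(("woke", current_thread(), cancelled))
  return "next"                  # becomes the final switch_to

k = cont_new(body)
k, switch_to = resume(k, False, "thread-A")
trace.append(("blocked", k is not None, switch_to))
k, switch_to = resume(k, True, "thread-B")
trace.append(("returned", k is None, switch_to))
print(trace)
```

The first `resume` returns a fresh `Continuation` because `body` called `block`; the second returns `None` for the continuation because `body` ran to completion, with its return value delivered as `switch_to`.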
716+
717+
In the future, when Core WebAssembly gets [stack-switching], the Component Model
718+
`$block` and `$current-thread` tags would not be exposed to Core WebAssembly.
719+
Thus, an optimizing implementation would continue to be able to implement
720+
`block()` as a direct control flow transfer to the innermost `resume()` and
721+
`current_thread()` via implicit context, both without an O(N) handler-stack tag
722+
search. In particular, this avoids the pathological O(N<sup>2</sup>) behavior
723+
which would otherwise arise if Component Model cooperative threads were used in
724+
conjunction with deeply-nested Core WebAssembly handlers.
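
One plausible reading of the quadratic cost mentioned above can be sketched with a toy cost model (an assumption of this example, not something the commit specifies): if each suspend had to walk outward past N nested handlers that do not handle its tag, then N suspends would cost on the order of N<sup>2</sup> steps, whereas a direct transfer to the innermost `resume` costs a constant per suspend. The functions `search_cost` and `total_cost` are hypothetical.

```python
def search_cost(depth: int) -> int:
  # Toy model: walk outward past `depth` handlers that don't handle
  # the tag, then one more step to reach the handler that does.
  return depth + 1

def total_cost(n: int, direct: bool) -> int:
  # n suspends, each performed beneath n non-matching nested handlers.
  # With a direct transfer, each suspend costs a single step.
  return sum(1 if direct else search_cost(n) for _ in range(n))

for n in (10, 100, 1000):
  print(n, total_cost(n, direct=False), total_cost(n, direct=True))
```

In this model the searched cost grows as n * (n + 1) while the direct cost grows as n.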
725+
726+
Additionally, once Core WebAssembly has stack switching, any unhandled events
727+
that originate in Core WebAssembly would turn into traps if they reach a
728+
component boundary (just like unhandled exceptions do now; see
729+
`call_and_trap_on_throw` below). Thus, all cross-component/cross-language stack
730+
switching would continue to be mediated by the Component Model's types and
731+
Canonical ABI, with Core WebAssembly stack-switching used to implement
732+
intra-component concurrency according to the language's own internal ABI, which
733+
can be different inside each component.
734+
735+
590736
#### Thread State
591737

592738
As described in the [concurrency explainer], threads are created both
@@ -595,9 +741,8 @@ As described in the [concurrency explainer], threads are created both
595741
`canon_thread_new_indirect` below). Threads are represented here by the
596742
`Thread` class and the [current thread] is represented by explicitly threading
597743
a reference to a `Thread` through all Core WebAssembly calls so that the
598-
`thread` parameter always points to "the current thread". The `Thread` class
599-
provides a set of primitive control-flow operations that are used by the rest
600-
of the Canonical ABI definitions.
744+
`thread` parameter always points to "the current thread".
745+
601746

602747
While `Thread`s are semantically created for each component export call by the
603748
Python `canon_lift` code, an optimizing runtime should be able to allocate
@@ -608,19 +753,21 @@ their own per-component-instance `threads` table so that thread table indices
608753
and elements can be more-readily reused between calls without interference from
609754
the other kinds of handles.
610755

611-
`Thread` is implemented using the Python standard library's [`threading`]
612-
module. While a Python [`threading.Thread`] is a preemptively-scheduled [kernel
613-
thread], it is coerced to behave like a cooperatively-scheduled [fiber] by
614-
careful use of [`threading.Lock`]. If Python had built-in fibers (or algebraic
615-
effects), those could have been used instead since all that's needed is the
616-
ability to switch stacks. In any case, the use of `threading.Thread` is
617-
encapsulated by the `Thread` class so that the rest of the Canonical ABI can
618-
simply use `suspend`, `resume`, etc.
619-
620-
When a `Thread` is suspended and then resumed, it receives a `Cancelled`
621-
value indicating whether the caller has cooperatively requested that the thread
622-
cancel itself which is communicated to Core WebAssembly with the following
623-
integer values:
756+
The `Thread` class is implemented in terms of the `cont_new`, `resume` and
757+
`suspend` control-flow primitives, which are defined in the previous section.
758+
`Thread` defines a set of higher-level concurrency operations that are used by
759+
all the other Canonical ABI definitions. In particular, `Thread` layers on the
760+
concepts of:
761+
* [waiting on external I/O]
762+
* [async call stack]
763+
* [thread index]
764+
* [thread-local storage]
765+
* [cancellation]
766+
767+
When a `Thread` suspends and then resumes, it receives a `Cancelled` value
768+
indicating whether the caller has cooperatively requested that the thread cancel
769+
itself, which is communicated to Core WebAssembly with the following integer
770+
values:
624771
```python
625772
class Cancelled(IntEnum):
626773
FALSE = 0
@@ -3478,10 +3625,11 @@ optimization to avoid allocating stacks for async languages that have avoided
34783625
the need for stackful coroutines by design (e.g., `async`/`await` in JS,
34793626
Python, C# and Rust).
34803627

3481-
Uncaught Core WebAssembly [exceptions] result in a trap at component
3482-
boundaries. Thus, if a component wishes to signal an error, it must use some
3483-
sort of explicit type such as `result` (whose `error` case particular language
3484-
bindings may choose to map to and from exceptions):
3628+
Uncaught Core WebAssembly [exceptions] or, in a future with [stack-switching],
3629+
unhandled events, result in a trap at component boundaries. Thus, if a component
3630+
wishes to signal an error, it must use some sort of explicit type such as
3631+
`result` (whose `error` case particular language bindings may choose to map to
3632+
and from exceptions):
34853633
```python
34863634
def call_and_trap_on_throw(callee, thread, args):
34873635
try:
@@ -4980,16 +5128,22 @@ def canon_thread_available_parallelism():
49805128
[Shared-Everything Dynamic Linking]: examples/SharedEverythingDynamicLinking.md
49815129
[Concurrency Explainer]: Concurrency.md
49825130
[Suspended]: Concurrency#thread-built-ins
5131+
[Thread Index]: Concurrency#thread-built-ins
5132+
[Async Call Stack]: Concurrency.md#subtasks-and-supertasks
49835133
[Structured Concurrency]: Concurrency.md#subtasks-and-supertasks
49845134
[Recursive Reentrance]: Concurrency.md#subtasks-and-supertasks
49855135
[Backpressure]: Concurrency.md#backpressure
5136+
[Thread]: Concurrency.md#threads-and-tasks
5137+
[Threads]: Concurrency.md#threads-and-tasks
49865138
[Current Thread]: Concurrency.md#current-thread-and-task
49875139
[Current Task]: Concurrency.md#current-thread-and-task
49885140
[Block]: Concurrency.md#blocking
5141+
[Waiting on External I/O]: Concurrency.md#blocking
49895142
[Subtasks]: Concurrency.md#subtasks-and-supertasks
49905143
[Readable and Writable Ends]: Concurrency.md#streams-and-futures
49915144
[Readable or Writable End]: Concurrency.md#streams-and-futures
49925145
[Thread-Local Storage]: Concurrency.md#thread-local-storage
5146+
[Cancellation]: Concurrency.md#cancellation
49935147
[Subtask State Machine]: Concurrency.md#cancellation
49945148
[Stream Readiness]: Concurrency.md#stream-readiness
49955149

@@ -5012,6 +5166,7 @@ def canon_thread_available_parallelism():
50125166
[WASI]: https://github.com/webassembly/wasi
50135167
[Deterministic Profile]: https://github.com/WebAssembly/profiles/blob/main/proposals/profiles/Overview.md
50145168
[stack-switching]: https://github.com/WebAssembly/stack-switching
5169+
[Control Tags]: https://github.com/WebAssembly/stack-switching/blob/main/proposals/stack-switching/Explainer.md#declaring-control-tags
50155170
[`memaddr`]: https://webassembly.github.io/spec/core/exec/runtime.html#syntax-memaddr
50165171
[`memaddrs` table]: https://webassembly.github.io/spec/core/exec/runtime.html#syntax-moduleinst
50175172
[`memidx`]: https://webassembly.github.io/spec/core/syntax/modules.html#syntax-memidx
@@ -5027,8 +5182,7 @@ def canon_thread_available_parallelism():
50275182
[Code Units]: https://www.unicode.org/glossary/#code_unit
50285183
[Surrogate]: https://unicode.org/faq/utf_bom.html#utf16-2
50295184
[Name Mangling]: https://en.wikipedia.org/wiki/Name_mangling
5030-
[Kernel Thread]: https://en.wikipedia.org/wiki/Thread_(computing)#kernel_thread
5031-
[Fiber]: https://en.wikipedia.org/wiki/Fiber_(computer_science)
5185+
[Fibers]: https://en.wikipedia.org/wiki/Fiber_(computer_science)
50325186
[Asyncify]: https://emscripten.org/docs/porting/asyncify.html
50335187

50345188
[`import_name`]: https://clang.llvm.org/docs/AttributeReference.html#import-name
@@ -5039,7 +5193,8 @@ def canon_thread_available_parallelism():
50395193

50405194
[`threading`]: https://docs.python.org/3/library/threading.html
50415195
[`threading.Thread`]: https://docs.python.org/3/library/threading.html#thread-objects
5042-
[`threading.Lock`]: https://docs.python.org/3/library/threading.html#lock-objects
5196+
[`threading.Lock`]: https://docs.python.org/3/library/threading.html#lock-objects
5197+
[`threading.local`]: https://docs.python.org/3/library/threading.html#thread-local-data
50435198

50445199
[OIO]: https://en.wikipedia.org/wiki/Overlapped_I/O
50455200
[io_uring]: https://en.wikipedia.org/wiki/Io_uring

design/mvp/Concurrency.md

Lines changed: 1 addition & 0 deletions
@@ -348,6 +348,7 @@ feature is necessary in any case (due to iloops and traps).
348348

349349
### Current Thread and Task
350350

351+
TODO
351352
At any point in time while executing Core WebAssembly code or a [canonical
352353
built-in] called by Core WebAssembly code, there is a well-defined **current
353354
thread** whose containing task is the **current task**. The "current thread" is
