@@ -16,6 +16,7 @@ specified here.
1616 * [ Component Instance State] ( #component-instance-state )
1717 * [ Table State] ( #table-state )
1818 * [ Resource State] ( #resource-state )
19+ * [ Stack Switching] ( #stack-switching )
1920 * [ Thread State] ( #thread-state )
2021 * [ Waitable State] ( #waitable-state )
2122 * [ Task State] ( #task-state )
@@ -587,69 +588,200 @@ class ResourceType(Type):
587588```
588589
589590
590- #### Thread State
591+ #### Stack Switching
591592
592- As described in the [ concurrency explainer] , threads are created both
593- * implicitly* , when calling a component export (in ` canon_lift ` below), and
594- * explicitly* , when core wasm code calls the ` thread.new-indirect ` built-in (in
595- ` canon_thread_new_indirect ` below). Threads are represented here by the
596- ` Thread ` class and the [ current thread] is represented by explicitly threading
597- a reference to a ` Thread ` through all Core WebAssembly calls so that the
598- ` thread ` parameter always points to "the current thread". The ` Thread ` class
599- provides a set of primitive control-flow operations that are used by the rest
600- of the Canonical ABI definitions.
601-
602- While ` Thread ` s are semantically created for each component export call by the
603- Python ` canon_lift ` code, an optimizing runtime should be able to allocate
604- ` Thread ` s lazily, only when needed for actual thread switching operations,
605- thereby avoiding cross-component call overhead for simple, short-running
606- cross-component calls. To assist in this optimization, ` Thread ` s are put into
607- their own per-component-instance ` threads ` table so that thread table indices
608- and elements can be more-readily reused between calls without interference from
609- the other kinds of handles.
610-
611- ` Thread ` is implemented using the Python standard library's [ ` threading ` ]
612- module. While a Python [ ` threading.Thread ` ] is a preemptively-scheduled [ kernel
613- thread] , it is coerced to behave like a cooperatively-scheduled [ fiber] by
614- careful use of [ ` threading.Lock ` ] . If Python had built-in fibers (or algebraic
615- effects), those could have been used instead since all that's needed is the
616- ability to switch stacks. In any case, the use of ` threading.Thread ` is
617- encapsulated by the ` Thread ` class so that the rest of the Canonical ABI can
618- simply use ` suspend ` , ` resume ` , etc.
619-
620- When a ` Thread ` is suspended and then resumed, it receives a ` Cancelled `
621- value indicating whether the caller has cooperatively requested that the thread
622- cancel itself which is communicated to Core WebAssembly with the following
623- integer values:
593+ Component Model concurrency support is defined in terms of the Core WebAssembly
594+ [ stack-switching] proposal's ` cont.new ` , ` resume ` and ` suspend ` instructions.
595+ However, the Component Model only needs a limited subset of the full
596+ [ stack-switching] proposal:
597+
598+ First, there are only two global [ control tags] used with ` suspend ` :
599+ ``` wat
600+ (tag $block (param $switch-to (ref null $Thread)) (result $cancelled bool))
601+ (tag $current-thread (result (ref $Thread)))
602+ ```
603+ The ` $block ` tag is used to suspend a [ thread] until some future event. The
604+ parameters and results will be described in the next section, where they are
605+ used to define ` Thread ` . The ` $current-thread ` tag is used to retrieve the
606+ current ` Thread ` , which is semantically stored in the ` resume ` handler's local
607+ state (although an optimizing implementation would instead maintain the current
608+ thread in the VM's execution context so that it could be retrieved with a single
609+ load and/or kept in register state).
610+
611+ Second, there is only a single continuation type used with ` resume ` :
612+ ``` wat
613+ (type $ct (cont (func (param $cancelled bool) (result (ref null $Thread)))))
614+ ```
615+ Thus, continuations are only produced for the ` $block ` event; the continuation
616+ produced for ` $current-thread ` is immediately resumed and so never "escapes".
617+
618+ Third, * every* ` resume ` performed by the Canonical ABI always handles * both*
619+ ` $block ` and ` $current-thread ` and thus every Canonical ABI ` suspend ` is always
620+ handled by the innermost Canonical ABI ` resume ` without a dynamic handler/tag
621+ search.
622+
623+ Given these restrictions, specialized versions of ` cont.new ` , ` resume ` and
624+ ` suspend ` that are "monomorphized" to the above types and tags can be easily
625+ implemented in terms of Python's standard preemptive threading primitives, using
626+ [ ` threading.Thread ` ] to provide a native stack, [ ` threading.Lock ` ] to only allow
627+ a single ` threading.Thread ` to execute at a time, and [ ` threading.local ` ] to
628+ maintain the dynamic handler scope using thread-local storage. This could have
629+ been implemented more directly and efficiently using [ fibers] , but the Python
630+ standard library doesn't have fibers. However, a realistic implementation is
631+ expected to use (a pool of) fibers.
632+
633+ Starting with ` cont.new ` , the monomorphized version takes a function whose type
634+ matches the function type wrapped by ` $ct ` above:
624635``` python
625636class Cancelled (IntEnum ):
626637 FALSE = 0
627638 TRUE = 1
628- ```
639+
640+ class Continuation :
641+ lock: threading.Lock
642+ handler: Handler
643+ cancelled: Cancelled
644+
645+ class Handler :
646+ tls = threading.local()
647+ lock: threading.Lock
648+ current: Thread
649+ cont: Optional[Continuation]
650+ switch_to: Optional[Thread]
651+
652+ def cont_new (f : Callable[[Cancelled], Optional[Thread]]) -> Continuation:
653+ cont = Continuation()
654+ cont.lock = threading.Lock()
655+ cont.lock.acquire()
656+ def wrapper ():
657+ cont.lock.acquire()
658+ Handler.tls.value = cont.handler
659+ switch_to = f(cont.cancelled)
660+ handler = Handler.tls.value
661+ handler.cont = None
662+ handler.switch_to = switch_to
663+ handler.lock.release()
664+ threading.Thread(target = wrapper).start()
665+ return cont
666+ ```
667+ Here, ` Continuation ` is used to pass parameters from ` resume ` to the
668+ continuation's thread. These parameters are set on ` Continuation ` right before
669+ ` resume ` calls ` Continuation.lock.release() ` to transfer control flow to the
670+ continuation. The ` Handler ` object is created by ` resume ` with the expectation
671+ that ` Handler.lock.release() ` will be called to transfer control flow and
672+ results back to the ` resume ` handler. The ` Handler ` is stored in the thread-local
673+ storage of the internal ` threading.Thread ` to implement the dynamic scoping
674+ required by stack-switching. Because a single ` threading.Thread ` can be
675+ suspended and resumed many times (each time with a new ` Continuation ` /
676+ ` Handler ` ), the ` Handler ` must be re-loaded from TLS after ` f ` returns since it
677+ may have changed.
678+
679+ Next, ` resume ` is monomorphized to take a continuation of type ` $ct ` , the
680+ ` cancelled ` argument passed to ` $ct ` and, lastly, the ` current ` ` Thread ` which
681+ is to be immediately returned by the ` (on $current-thread) ` handler. The
682+ ` (on $block) ` and "returned without suspending" cases are merged into a single
683+ return value, where the latter "returned without suspending" case produces
684+ ` None ` for the returned ` Optional[Continuation] ` .
685+ ``` python
686+ def resume (cont : Continuation, cancelled : Cancelled, current : Thread) -> \
687+ tuple[Optional[Continuation], Optional[Thread]]:
688+ handler = Handler()
689+ handler.lock = threading.Lock()
690+ handler.lock.acquire()
691+ handler.current = current
692+ cont.handler = handler
693+ cont.cancelled = cancelled
694+ cont.lock.release()
695+ handler.lock.acquire()
696+ return (handler.cont, handler.switch_to)
697+ ```
698+
699+ Lastly, ` suspend ` is monomorphized into 2 functions for the ` $block ` and
700+ ` $current-thread ` tags shown above, so that their signatures and implementations
701+ can be specialized. Since ` $current-thread ` has a trivial handler that returns
702+ the ` current ` ` Thread ` passed to ` resume ` , ` current_thread ` can read ` current `
703+ from the ` Handler ` in thread-local storage without any stack switching.
704+ ``` python
705+ def block (switch_to : Optional[Thread]) -> Cancelled:
706+ cont = Continuation()
707+ cont.lock = threading.Lock()
708+ cont.lock.acquire()
709+ handler = Handler.tls.value
710+ handler.cont = cont
711+ handler.switch_to = switch_to
712+ handler.lock.release()
713+ cont.lock.acquire()
714+ Handler.tls.value = cont.handler
715+ return cont.cancelled
716+
717+ def current_thread () -> Thread:
718+ return Handler.tls.value.current
719+ ```
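As a rough illustration of how these primitives hand control back and forth, the following self-contained sketch restates them in runnable form and drives one continuation through a `block` and a final return. It is not part of the specification: the `Thread` class is a hypothetical stand-in that only carries a name, the field initialization is moved into `__init__` for brevity, and `body`, `t1`, `other` and `log` are names invented for the demo. The sketch assigns the result of `f` to `switch_to` before storing it on the handler.

```python
import threading
from enum import IntEnum
from typing import Callable, Optional

class Cancelled(IntEnum):
    FALSE = 0
    TRUE = 1

class Thread:
    # Hypothetical stand-in for the Canonical ABI `Thread`; it only has a name.
    def __init__(self, name: str):
        self.name = name

class Continuation:
    def __init__(self):
        self.lock = threading.Lock()
        self.lock.acquire()              # stays held until resume() releases it
        self.handler: Optional["Handler"] = None
        self.cancelled = Cancelled.FALSE

class Handler:
    tls = threading.local()
    def __init__(self, current: Thread):
        self.lock = threading.Lock()
        self.lock.acquire()              # stays held until the continuation yields
        self.current = current
        self.cont: Optional[Continuation] = None
        self.switch_to: Optional[Thread] = None

def cont_new(f: Callable[[Cancelled], Optional[Thread]]) -> Continuation:
    cont = Continuation()
    def wrapper():
        cont.lock.acquire()              # wait for the first resume()
        Handler.tls.value = cont.handler
        switch_to = f(cont.cancelled)
        handler = Handler.tls.value      # may differ from the first handler
        handler.cont = None              # None signals "returned without suspending"
        handler.switch_to = switch_to
        handler.lock.release()
    threading.Thread(target=wrapper).start()
    return cont

def resume(cont: Continuation, cancelled: Cancelled, current: Thread):
    handler = Handler(current)
    cont.handler = handler
    cont.cancelled = cancelled
    cont.lock.release()                  # transfer control to the continuation
    handler.lock.acquire()               # wait until it blocks or returns
    return (handler.cont, handler.switch_to)

def block(switch_to: Optional[Thread]) -> Cancelled:
    cont = Continuation()
    handler = Handler.tls.value
    handler.cont = cont
    handler.switch_to = switch_to
    handler.lock.release()               # transfer control back to resume()
    cont.lock.acquire()                  # wait for the next resume()
    Handler.tls.value = cont.handler
    return cont.cancelled

def current_thread() -> Thread:
    return Handler.tls.value.current

# Demo: run `body` until it blocks, then resume it with a cancellation request.
log = []
t1, other = Thread("t1"), Thread("other")

def body(cancelled: Cancelled) -> Optional[Thread]:
    log.append(("start", current_thread().name, cancelled))
    log.append(("woken", current_thread().name, block(other)))
    return None

first_cont, first_switch = resume(cont_new(body), Cancelled.FALSE, t1)
second_cont, second_switch = resume(first_cont, Cancelled.TRUE, t1)
print(log)
```

The first `resume` returns a fresh `Continuation` together with the `switch_to` hint passed to `block`. The second `resume` returns `None` for the continuation because `body` ran to completion. At any moment only one of the two Python threads is runnable, since each side waits on a lock that only the other side releases.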
720+
721+ In the future, when Core WebAssembly gets [ stack-switching] , the Component Model
722+ ` $block ` and ` $current-thread ` tags would not be exposed to Core WebAssembly.
723+ Thus, an optimizing implementation would continue to be able to implement
724+ ` block() ` as a direct control flow transfer to the innermost ` resume() ` and
725+ ` current_thread() ` via implicit context, both without an O(n) handler-stack tag
726+ search. In particular, this avoids the pathological O(n<sup>2</sup>) behavior
727+ which would otherwise arise if Component Model cooperative threads were used in
728+ conjunction with deeply-nested Core WebAssembly handlers.
729+
730+ Additionally, once Core WebAssembly has stack switching, any unhandled events
731+ that originate in Core WebAssembly would turn into traps if they reach a
732+ component boundary (just like unhandled exceptions do now; see
733+ ` call_and_trap_on_throw ` below). Thus, all cross-component/cross-language stack
734+ switching would continue to be mediated by the Component Model's types and
735+ Canonical ABI, with Core WebAssembly stack-switching used to implement
736+ intra-component concurrency according to the language's own internal ABI, which
737+ can be different inside each component.
738+
739+
740+ #### Thread State
741+
742+ As described in the [ concurrency explainer] , threads are created both
743+ * implicitly* , when calling a component export (in ` canon_lift ` below), and
744+ * explicitly* , when core wasm code calls the ` thread.new-indirect ` built-in (in
745+ ` canon_thread_new_indirect ` below). While threads are * logically* created for
746+ each component export call, an optimizing runtime should be able to allocate
747+ threads lazily when needed for actual thread switching operations, thereby
748+ avoiding cross-component call overhead for simple, short-running cross-component
749+ calls. To assist in this optimization, threads are put into their own
750+ ` ComponentInstance.threads ` table to reduce interference from the other kinds of
751+ handles.
752+
753+ Threads are represented in the Canonical ABI by the ` Thread ` class defined in
754+ this section. The ` Thread ` class is implemented in terms of the ` cont_new ` ,
755+ ` resume ` , ` block ` and ` current_thread ` stack-switching primitives defined in the
756+ previous section. ` Thread ` defines a set of higher-level concurrency operations
757+ that are used by all the other Canonical ABI definitions. In particular, a
758+ "thread" adds the higher-level concepts of:
759+ * [ waiting on external I/O]
760+ * [ async call stack]
761+ * [ thread index]
762+ * [ thread-local storage]
763+ * [ cancellation]
629764
630765Introducing the ` Thread ` class in chunks, a ` Thread ` has the following fields
631766and can be in one of the following 3 states based on these fields:
632- * ` running ` : actively executing with a "parent" thread that is waiting
633- to run once the ` running ` thread suspends or returns
634- * ` suspended ` : waiting to be ` resume ` d by another thread
635- * ` waiting ` : waiting to be ` resume ` d by ` Store.tick ` once ` ready `
767+ * ` running ` : actively executing on the stack
768+ * ` suspended ` : waiting to be resumed by some other thread ` running ` in
769+ the same component instance (via its ` index ` )
770+ * ` pending ` : waiting to be resumed by the host (in ` Store.tick ` once ` ready ` )
636771
637772``` python
638773class Thread :
639- task: Task
640- fiber: threading.Thread
641- fiber_lock: threading.Lock
642- parent_lock: Optional[threading.Lock]
774+ cont: Optional[Continuation]
643775 ready_func: Optional[Callable[[], bool ]]
644- cancellable: bool
645- cancelled: Cancelled
776+ task: Task
646777 index: Optional[int ]
647778 context: list[int ]
779+ cancellable: bool
648780
649781 CONTEXT_LENGTH = 2
650782
651783 def running (self ):
652- return self .parent_lock is not None
784+ return self .cont is None
653785
654786 def suspended (self ):
655787 return not self .running() and self .ready_func is None
@@ -3494,10 +3626,11 @@ optimization to avoid allocating stacks for async languages that have avoided
34943626the need for stackful coroutines by design (e.g., ` async ` /` await ` in JS,
34953627Python, C# and Rust).
34963628
3497- Uncaught Core WebAssembly [ exceptions] result in a trap at component
3498- boundaries. Thus, if a component wishes to signal an error, it must use some
3499- sort of explicit type such as ` result ` (whose ` error ` case particular language
3500- bindings may choose to map to and from exceptions):
3629+ Uncaught Core WebAssembly [ exceptions] or, in a future with [ stack-switching] ,
3630+ unhandled events, result in a trap at component boundaries. Thus, if a component
3631+ wishes to signal an error, it must use some sort of explicit type such as
3632+ ` result ` (whose ` error ` case particular language bindings may choose to map to
3633+ and from exceptions):
35013634``` python
35023635def call_and_trap_on_throw (callee , thread , args ):
35033636 try :
@@ -4981,16 +5114,22 @@ def canon_thread_available_parallelism():
49815114[ Shared-Everything Dynamic Linking ] : examples/SharedEverythingDynamicLinking.md
49825115[ Concurrency Explainer ] : Concurrency.md
49835116[ Suspended ] : Concurrency#thread-built-ins
5117+ [ Thread Index ] : Concurrency#thread-built-ins
5118+ [ Async Call Stack ] : Concurrency.md#subtasks-and-supertasks
49845119[ Structured Concurrency ] : Concurrency.md#subtasks-and-supertasks
49855120[ Recursive Reentrance ] : Concurrency.md#subtasks-and-supertasks
49865121[ Backpressure ] : Concurrency.md#backpressure
5122+ [ Thread ] : Concurrency.md#threads-and-tasks
5123+ [ Threads ] : Concurrency.md#threads-and-tasks
49875124[ Current Thread ] : Concurrency.md#current-thread-and-task
49885125[ Current Task ] : Concurrency.md#current-thread-and-task
49895126[ Block ] : Concurrency.md#blocking
5127+ [ Waiting on External I/O ] : Concurrency.md#blocking
49905128[ Subtasks ] : Concurrency.md#subtasks-and-supertasks
49915129[ Readable and Writable Ends ] : Concurrency.md#streams-and-futures
49925130[ Readable or Writable End ] : Concurrency.md#streams-and-futures
49935131[ Thread-Local Storage ] : Concurrency.md#thread-local-storage
5132+ [ Cancellation ] : Concurrency.md#cancellation
49945133[ Subtask State Machine ] : Concurrency.md#cancellation
49955134[ Stream Readiness ] : Concurrency.md#stream-readiness
49965135
@@ -5013,6 +5152,7 @@ def canon_thread_available_parallelism():
50135152[ WASI ] : https://github.com/webassembly/wasi
50145153[ Deterministic Profile ] : https://github.com/WebAssembly/profiles/blob/main/proposals/profiles/Overview.md
50155154[ stack-switching ] : https://github.com/WebAssembly/stack-switching
5155+ [ Control Tags ] : https://github.com/WebAssembly/stack-switching/blob/main/proposals/stack-switching/Explainer.md#declaring-control-tags
50165156[ `memaddr` ] : https://webassembly.github.io/spec/core/exec/runtime.html#syntax-memaddr
50175157[ `memaddrs` table ] : https://webassembly.github.io/spec/core/exec/runtime.html#syntax-moduleinst
50185158[ `memidx` ] : https://webassembly.github.io/spec/core/syntax/modules.html#syntax-memidx
@@ -5028,8 +5168,7 @@ def canon_thread_available_parallelism():
50285168[ Code Units ] : https://www.unicode.org/glossary/#code_unit
50295169[ Surrogate ] : https://unicode.org/faq/utf_bom.html#utf16-2
50305170[ Name Mangling ] : https://en.wikipedia.org/wiki/Name_mangling
5031- [ Kernel Thread ] : https://en.wikipedia.org/wiki/Thread_(computing)#kernel_thread
5032- [ Fiber ] : https://en.wikipedia.org/wiki/Fiber_(computer_science)
5171+ [ Fibers ] : https://en.wikipedia.org/wiki/Fiber_(computer_science)
50335172[ Asyncify ] : https://emscripten.org/docs/porting/asyncify.html
50345173
50355174[ `import_name` ] : https://clang.llvm.org/docs/AttributeReference.html#import-name
@@ -5040,7 +5179,8 @@ def canon_thread_available_parallelism():
50405179
50415180[ `threading` ] : https://docs.python.org/3/library/threading.html
50425181[ `threading.Thread` ] : https://docs.python.org/3/library/threading.html#thread-objects
5043- [ `threading.Lock` ] : https://docs.python.org/3/library/threading.html#lock-objects
5182+ [ `threading.Lock` ] : https://docs.python.org/3/library/threading.html#lock-objects
5183+ [ `threading.local` ] : https://docs.python.org/3/library/threading.html#thread-local-data
50445184
50455185[ OIO ] : https://en.wikipedia.org/wiki/Overlapped_I/O
50465186[ io_uring ] : https://en.wikipedia.org/wiki/Io_uring