Replace GetHost with a function pointer, add HasData#10770
Merged
Merged
Conversation
Member
Author
|
cc @lann on this as well as it relates to Spin factors and such |
alexcrichton
added a commit
to alexcrichton/spin
that referenced
this pull request
May 12, 2025
This commit is a refactoring to replace the concrete `InitContext` structure with a generic type parameter instead. The motivation for this is to prepare for handling bytecodealliance/wasmtime#10770. That's a big change to how `add_to_linker` functions work, notably that the argument to the generated functions is a `fn`, not a `F: Fn`. WASI factors currently rely on the closure-like nature this argument which means it's not compatible with that change. The refactoring to use a `trait InitContext` here enables plumbing a type parameter through a function to be able to get a function pointer without relying on closures.
alexcrichton
added a commit
to alexcrichton/spin
that referenced
this pull request
May 12, 2025
This commit is a refactoring to replace the concrete `InitContext` structure with a generic type parameter instead. The motivation for this is to prepare for handling bytecodealliance/wasmtime#10770. That's a big change to how `add_to_linker` functions work, notably that the argument to the generated functions is a `fn`, not a `F: Fn`. WASI factors currently rely on the closure-like nature this argument which means it's not compatible with that change. The refactoring to use a `trait InitContext` here enables plumbing a type parameter through a function to be able to get a function pointer without relying on closures. Signed-off-by: Alex Crichton <alex@alexcrichton.com>
pchickey
approved these changes
May 12, 2025
Contributor
pchickey
left a comment
There was a problem hiding this comment.
Thanks. I greatly prefer this to GetHost. Every time I had to use GetHost I had to look at other use sites and think hard about how to get it right, whereas this is immediately obvious. Also, changing from a closure to a function pointer should hopefully make it more clear that the purpose is projection, rather than causing a side effect, as it has been misused at least once in the past.
This commit is a refactoring to the fundamentals of the `bindgen!` macro and the functions that it generates. Prior to this change the fundamental entrypoint generated by `bindgen!` was a function `add_to_linker_get_host` which takes a value of type `G: GetHost`. This `GetHost` implementation is effectively an alias for a closure whose return value is able to close over the parameter given lfietime-wise. The `GetHost` abstraction was added to Wasmtime originally to enable using any type that implements `Host` traits, not just `&mut U` as was originally supported. The definition of `GetHost` was _just_ right to enable a type such as `MyThing<&mut T>` to implement `Host` and a closure could be provided that could return it. At the time that `GetHost` was added it was known to be problematic from an understandability point of view, namely: * It has a non-obvious definition. * It's pretty advanced Rust voodoo to understand what it's actually doing * Using `GetHost` required lots of `for<'a> ...` in places which is unfamiliar syntax for many. * `GetHost` values couldn't be type-erased (e.g. put in a trait object) as we couldn't figure out the lifetime syntax to do so. Despite these issues it was the only known solution at hand so we landed it and kept the previous `add_to_linker` style (`&mut T -> &mut U`) as a convenience. While this has worked reasonable well (most folks just try to not look at `GetHost`) it has reached a breaking point in the WASIp3 work. In the WASIp3 work it's effectively now going to be required that the `G: GetHost` value is packaged up and actually stored inside of accessors provided to host functions. This means that `GetHost` values now need to not only be taken in `add_to_linker` but additionally provided to the rest of the system through an "accessor". This was made possible in bytecodealliance#10746 by moving the `GetHost` type into Wasmtime itself (as opposed to generated code where it lived prior). While this worked with WASIp3 and it was possible to plumb `G: GetHost` safely around, this ended up surfacing more issues. Namely all "concurrent" host functions started getting significantly more complicated `where` clauses and type signatures. At the end of the day I felt that we had reached the end of the road to `GetHost` and wanted to search for alternatives, hence this change. The fundamental purpose of `GetHost` was to be able to express, in a generic fashion: * Give me a closure that takes `&mut T` and returns `D`. * The `D` type can close over the lifetime in `&mut T`. * The `D` type must implement `bindgen!`-generated traits. A realization I had was that we could model this with a generic associated type in Rust. Rust support for generic associated types is relatively new and not something I've used much before, but it ended up being a perfect model for this. The definition of the new `HasData` trait is deceptively simple: trait HasData { type Data<'a>; } What this enables us to do though is to generate `add_to_linker` functions that look like this: fn add_to_linker<T, D>( linker: &mut Linker<T>, getter: fn(&mut T) -> D::Data<'_>, ) -> Result<()> where D: HasData, for<'a> D::Data<'a>: Host; This definition here models `G: GetHost` as a literal function pointer, and the ability to close over the `&mut T` lifetime with type (not just `&mut U`) is expressed through the type constructor `type Data<'a>`). Ideally we could take a generic generic associated type (I'm not even sure what to call that), but that's not something Rust has today. Overall this felt like a much simpler way of modeling `GetHost` and its requirements. This plumbed well throughout the WASIp3 work and the signatures for concurrent functions felt much more appropriate in terms of complexity after this change. Taking this change to the limit means that `GetHost` in its entirety could be purged since all usages of it could be replaced with `fn(&mut T) -> D::Data<'a>`, a hopefully much more understandable type. This change is not all rainbows however, there are some gotchas that remain: * One is that all `add_to_linker` generated functions have a `D: HasData` type parameter. This type parameter cannot be inferred and must always be explicitly specified, and it's not easy to know what to supply here without reading documentation. Actually supplying the type parameter is quite easy once you know what to do (and what to fill in), but it may involve defining a small struct with a custom `HasData` implementation which can be non-obvious. * Another is that the `G: GetHost` value was previously a full Rust closure, but now it's transitioning to a function pointer. This is done in preparation for WASIp3 work where the function needs to be passed around, and doing that behind a generic parameter is more effort than it's worth. This means that embedders relying on the true closure-like nature here will have to update to using a function pointer instead. * The function pointer is stored in locations that require `'static`, and while `fn(T)` might be expected to be `'static` regardless of `T` is is, in fact, not. This means that practically `add_to_linker` requires `T: 'static`. Relative to just before this change this is a possible regression in functionality, but there orthogonal reasons beyond just this that we want to start requiring `T: 'static` anyway. That means that this isn't actually a regression relative to bytecodealliance#10760, a related change. The first point is partially ameliorated with WASIp3 work insofar that the `D` type parameter will start serving as a location to specify where concurrent implementations are found. These concurrent methods don't take `&mut self` but instead are implemented for `T: HasData` types. In that sense it's more justified to have this weird type parameter, but in the meantime without this support it'll feel a bit odd to have this little type parameter hanging off the side. This change has been integrated into the WASIp3 prototyping repository with success. This has additionally been integrated into the Spin embedding which has one of the more complicated reliances on `*_get_host` functions known. Given that it's expected that while this is not necessarily a trivial change to rebase over it should at least be possible. Finally the `HasData` trait here has been included with what I'm hoping is a sufficient amount of documentation to at least give folks a spring board to understand it. If folks have confusion about this `D` type parameter my hope is they'll make their way to `HasData` which showcases various patterns for "librarifying" host implementations of WIT interfaces. These patterns are all used throughout Wasmtime and WASI currently in crates and tests and such.
alexcrichton
added a commit
to alexcrichton/spin
that referenced
this pull request
May 19, 2025
This commit is a refactoring to replace the concrete `InitContext` structure with a generic type parameter instead. The motivation for this is to prepare for handling bytecodealliance/wasmtime#10770. That's a big change to how `add_to_linker` functions work, notably that the argument to the generated functions is a `fn`, not a `F: Fn`. WASI factors currently rely on the closure-like nature this argument which means it's not compatible with that change. The refactoring to use a `trait InitContext` here enables plumbing a type parameter through a function to be able to get a function pointer without relying on closures. Signed-off-by: Alex Crichton <alex@alexcrichton.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This commit is a refactoring to the fundamentals of the
bindgen!macroand the functions that it generates. Prior to this change the
fundamental entrypoint generated by
bindgen!was a functionadd_to_linker_get_hostwhich takes a value of typeG: GetHost. ThisGetHostimplementation is effectively an alias for a closure whosereturn value is able to close over the parameter given lfietime-wise.
The
GetHostabstraction was added to Wasmtime originally to enableusing any type that implements
Hosttraits, not just&mut Uas wasoriginally supported. The definition of
GetHostwas just right toenable a type such as
MyThing<&mut T>to implementHostand aclosure could be provided that could return it. At the time that
GetHostwas added it was known to be problematic from anunderstandability point of view, namely:
doing
GetHostrequired lots offor<'a> ...in places which isunfamiliar syntax for many.
GetHostvalues couldn't be type-erased (e.g. put in a trait object)as we couldn't figure out the lifetime syntax to do so.
Despite these issues it was the only known solution at hand so we landed
it and kept the previous
add_to_linkerstyle (&mut T -> &mut U) as aconvenience. While this has worked reasonable well (most folks just try
to not look at
GetHost) it has reached a breaking point in the WASIp3work.
In the WASIp3 work it's effectively now going to be required that the
G: GetHostvalue is packaged up and actually stored inside ofaccessors provided to host functions. This means that
GetHostvaluesnow need to not only be taken in
add_to_linkerbut additionallyprovided to the rest of the system through an "accessor". This was made
possible in #10746 by moving the
GetHosttype into Wasmtime itself (asopposed to generated code where it lived prior).
While this worked with WASIp3 and it was possible to plumb
G: GetHostsafely around, this ended up surfacing more issues. Namely all
"concurrent" host functions started getting significantly more
complicated
whereclauses and type signatures. At the end of the day Ifelt that we had reached the end of the road to
GetHostand wanted tosearch for alternatives, hence this change.
The fundamental purpose of
GetHostwas to be able to express, in ageneric fashion:
&mut Tand returnsD.Dtype can close over the lifetime in&mut T.Dtype must implementbindgen!-generated traits.A realization I had was that we could model this with a generic
associated type in Rust. Rust support for generic associated types is
relatively new and not something I've used much before, but it ended up
being a perfect model for this. The definition of the new
HasDatatrait is deceptively simple:
What this enables us to do though is to generate
add_to_linkerfunctions that look like this:
This definition here models
G: GetHostas a literal function pointer,and the ability to close over the
&mut Tlifetime with type (not just&mut U) is expressed through the type constructortype Data<'a>).Ideally we could take a generic generic associated type (I'm not even
sure what to call that), but that's not something Rust has today.
Overall this felt like a much simpler way of modeling
GetHostand itsrequirements. This plumbed well throughout the WASIp3 work and the
signatures for concurrent functions felt much more appropriate in terms
of complexity after this change. Taking this change to the limit means
that
GetHostin its entirety could be purged since all usages of itcould be replaced with
fn(&mut T) -> D::Data<'a>, a hopefully muchmore understandable type.
This change is not all rainbows however, there are some gotchas that
remain:
One is that all
add_to_linkergenerated functions have aD: HasDatatype parameter. This type parameter cannot be inferred andmust always be explicitly specified, and it's not easy to know what to
supply here without reading documentation. Actually supplying the type
parameter is quite easy once you know what to do (and what to fill
in), but it may involve defining a small struct with a custom
HasDataimplementation which can be non-obvious.Another is that the
G: GetHostvalue was previously a full Rustclosure, but now it's transitioning to a function pointer. This is
done in preparation for WASIp3 work where the function needs to be
passed around, and doing that behind a generic parameter is more
effort than it's worth. This means that embedders relying on the true
closure-like nature here will have to update to using a function
pointer instead.
The function pointer is stored in locations that require
'static,and while
fn(T)might be expected to be'staticregardless ofTis is, in fact, not. This means that practically
add_to_linkerrequires
T: 'static. Relative to just before this change this is apossible regression in functionality, but there orthogonal reasons
beyond just this that we want to start requiring
T: 'staticanyway.That means that this isn't actually a regression relative to Add
T: 'statictoStore<T>#10760, arelated change.
The first point is partially ameliorated with WASIp3 work insofar that
the
Dtype parameter will start serving as a location to specify whereconcurrent implementations are found. These concurrent methods don't
take
&mut selfbut instead are implemented forT: HasDatatypes. Inthat sense it's more justified to have this weird type parameter, but in
the meantime without this support it'll feel a bit odd to have this
little type parameter hanging off the side.
This change has been integrated into the WASIp3 prototyping repository
with success. This has additionally been integrated into the Spin
embedding which has one of the more complicated reliances on
*_get_hostfunctions known. Given that it's expected that while thisis not necessarily a trivial change to rebase over it should at least be
possible.
Finally the
HasDatatrait here has been included with what I'm hopingis a sufficient amount of documentation to at least give folks a spring
board to understand it. If folks have confusion about this
Dtypeparameter my hope is they'll make their way to
HasDatawhich showcasesvarious patterns for "librarifying" host implementations of WIT
interfaces. These patterns are all used throughout Wasmtime and WASI
currently in crates and tests and such.