Skip to content
This repository was archived by the owner on Sep 8, 2025. It is now read-only.

Merge with upstream#169

Merged
alexcrichton merged 44 commits into
bytecodealliance:mainfrom
alexcrichton:merge-
May 19, 2025
Merged

Merge with upstream#169
alexcrichton merged 44 commits into
bytecodealliance:mainfrom
alexcrichton:merge-

Conversation

@alexcrichton
Copy link
Copy Markdown
Member

Lots of conflicts, mostly of my own making. Undoing some temporary workarounds from #159 too

alexcrichton and others added 30 commits May 5, 2025 20:56
This commit fixes some more fallout found on oss-fuzz about the x64
generating rounding builtins when it shouldn't be. This situation is
caused by simd float rounding instructions which the x64 backend lowers
to libcall-per-element and now needs to move that logic to the frontend
instead.
This refactors implementation of call ABI handling across architectures
with the goal of bringing s390x in line with other platforms.

The main idea is to
- handle main call instruction selection and generation in ISLE
  (like s390x but unlike other platforms today)
- handle argument setup mostly outside of ISLE
  (like other platforms but unlike s390x today)
- handle return value processing as part of the call instructio
  (like all platforms today)

All platforms now emit the main call instruction directly from ISLE,
which e.g. handles selection of the correct ISA instruction depending
on the call destination.  This ISLE code calls out to helper routines
to handle argument and return value processing.  These helpers are
mostly common code and provided by the Callee and/or Lower layers,
with some platform-specific additions via ISLE Context routines.

The old CallSite abstraction is no longer needed; most of the
differences between call and return_call handling disappear.
(There is still a common-code CallInfo vs. a platform-specifc
ReturnCallInfo.  At this point, it should be relatively straight-
forward to make CallInfo platform-specific as well if desired,
but this is not done here.)

Some ISLE infrastructure for iterators / loops, which was only
ever used by the s390x argument processing code, has been removed.

s390x now closely matches all other platforms, with only a few
special cases (slightly different tail-call ABI requires some
differences in stack offset computations; we still need to
handle vector lane swaps for cross-ABI calls), which should
simplify future maintenance.
This commit is similar to #10699, another instance of a libcall popping
up late in the x64 backend. Fuzzing found this issue and to help verify
this is the last one I've run the whole `*.wast` test suite with the
x86_64 baseline (no target features) and saw the panic before this PR
and no more panics after.
* x64: convert `phadd` instructions

* review: use `rustfmt::skip` instead of re-configuring `rustfmt`
This fixes a typo in a file introduced in #10735.
* Update to wit-bindgen 0.42

* Update vets
The public function `generated_files` in `cranelift-assembler-x64` makes
the generated `rlib` non-deterministic because it contains the full
paths of generated files. But this function is only used in `main.rs` of
the same crate, so this change inlines it there to keep the library artifact
deterministic while maintaining the same behavior.
* Move the `GetHost` trait used in `bindgen!` into Wasmtime

Turns out we don't actually need to generate this `GetHost` trait, we
can instead have it live in one location with extra documentation. There
are already extra bounds on the `Host` associated type at all call-sites
so there's no need to additionally have trait bounds in the trait
definition, meaning the trait definition is always the same and it can
move within Wasmtime.

This shouldn't have any impact on any embedders today, it's just moving
things around.

* Review comments
…10747)

* Cranelift: update to regalloc2 0.12.2; support many return values.

Prior versions of regalloc2 could not support more than 255 operands on
an instruction, and together with the integrated return-value loads on
call instructions introduced in #10502, this caused issues with calls
with many returns. This PR upgrades to a version of RA2 that supports up
to `2^16 - 1` operands per instruction (well in excess of the maximum of
1000 return/result values per Wasm's implementation limits, for
example).

Fixes #10741.

* Update vets

---------

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
This does not change functionality; instead, it reorganizes how REX
prefixes were encoded by the assembler to simplify code generation.
`RexFlags` now becomes the more-correct `RexPrefix` and constructors
like `RexPrefix::two_op` build the necessary byte for emission with
`RexPrefix::encode`.
This change simplifies the generated code for encoding REX suffixes
(ModR/M byte, SIB byte, displacement) but does not alter the underlying
logic. As a part of this refactoring, the displacement struct is now
appropriately named `Disp` and several public functions no longer need
to be so.
This replaces all uses of `neg` and `not` with instructions from the new
assembler.
* x64: convert more logical operations

This uses the new assembler for packed operations like `pand`, `pandn`,
`por`, and `pxor`.

* review: remove unnecessary `is_xmm_mem`
* winch(aarch64): Run wast test in CI

Closes bytecodealliance/wasmtime#9566

This commit takes inspiration from
https://github.com/bytecodealliance/wasmtime/pull/10738/files

During the 2025-05-07 Cranelift meeting we discussed potential avenues
to start running wast tests for winch-aarch64. One potential idea was
to declare an allowlist of tests that should be exectued and
potentially ignore everyhing else. Although that idea works, I decided
to diverge a bit from it, in favor of introducing a very strict
classification of the state of tests for aarch64, namely, in this
commit tests are classified as:

* Run and expected to pass
* Run and exepected to fail in a recoverable way
* Don't run since it's known that they produce an unrecoverable error
e.g., segafault.

This approach is probably more verbose than the one discussed in the
meeting, however, I think it's easier to have a global view of that
status for aarch64 and potentially other backends in the future.

* Formatting

* Extract common tests that fail in all targets

* Move the `Compiler::should_fail` call at runtime

`Compiler::should_fail` makes use of `#[cfg(target_arg="...")]` directives to
derive the compiler-arch-Wasm proposal support matrix, however,
`Compiler::should_fail` is also called from the `wasmtime_test` macro
which introduces conflicts with the `target_arch` resolution (within macros,
these directives resolve to the host compiler, instead of the target).

This commit emits `Compiler::should_fail` at runtime instead of at
compile time, which fixes the issue mentioned above.

* Use `.unwrap` when test is expected to pass
* asm: implement write-only operands

This change adds support to the new assembler for write-only operands.
This implementation appeared first in [#10754] but is split out here to
unblock implementation of instructions that require it: multiplication,
conversions, moves, etc. This starts roughly the same as what was
implemented for write-only XMMs in [#10754] but includes support for
write-only GPRs as well and generates the temporary registers which are
needed.

[#10754]: bytecodealliance/wasmtime#10754

Co-authored-by: Johnnie Birch <johnnie.l.birch.jr@intel.com>

* fix: use `to_reg()` to extract the Cranelift type from `Writable`

---------

Co-authored-by: Johnnie Birch <johnnie.l.birch.jr@intel.com>
I ended up needing this in Spin, and it's already needed in Wasmtime's
generated bindings, so let's commit to exposing this.
* Add `T: 'static` to `Store<T>

Since the beginning the `T` type parameter on `Store<T>` has had no
bounds on it. This was intended for maximal flexibility in terms of what
embedders place within a `Store<T>` and I've personally advocated that
we need to keep it this way. In the development of the WASIp3 work,
however, I've at least personally reached the conclusion that this is no
longer tenable and proceeding will require adding a `'static` bound to
data within a store.

Wasmtime today [already] carries unsafe `transmute`s to work around this
lack of `'static` bound, and while the number of `unsafe` parts is
relatively small right now we're still fundamentally lying to the
compiler about lifetime bounds internally. With the WASIp3 async work
this degree of "lying" has become even worse. Joel has written up some
examples [on Zulip] about how the Rust compiler is requiring `'static`
bounds in surprising ways. These patterns are cropping up quite
frequently in the WASIp3 work and it's becoming particularly onerous
maintaining all of the `unsafe` and ensuring that everything is in sync.

In the WASIp3 repository I've additionally [prototyped a change] which
would additionally practically require `T: 'static` in more locations.
This change is one I plan on landing in Wasmtime in the near future and
while its main motivations are for enabling WASIp3 work it is also a
much nicer system than what we have today, in my opinion.

Overall the cost of not having `T: 'static` on `Store<T>` is effectively
becoming quite costly, in particular with respect to WASIp3 work. This
is coupled with all known embedders already using `T: 'static` data
within a `Store<T>` so the expectation of the impact of this change is
not large. The main downside of this change as a result is that when and
where to place `'static` bounds is sort of a game of whack-a-mole with
the compiler. For example I changed `Store<T>` to require `'static`
here, but the rest of the change is basically "hit compile until rustc
says it's ok". There's not necessarily a huge amount of rhyme-or-reason
to where `'static` bounds crop up, which can be surprising or difficult
to work with for users.

In the end I feel that this change is necessary and one we can't shy
away from. If problems crop up we'll need to figure out how to thread
that needle at that time, but I'm coming around to thinking that
`T: 'static` is just a fundamental constraint we'll have to take on at
this time. Maybe a future version of Rust that fixes some of Joel's
examples (if they can be fixed, we're not sure of that) we could
consider relaxing this but that's left for future work.

[already]: https://github.com/bytecodealliance/wasmtime/blob/35053d6d8d1a5d4692cf636cba0c920b4a79a44b/crates/wasmtime/src/runtime/store.rs#L602-L611
[on Zulip]: https://rust-lang.zulipchat.com/#narrow/channel/122651-general/topic/.22type.20may.20not.20live.20long.20enough.22.20for.20generic.20closure/near/473862072
[prototyped a change]: bytecodealliance#158

* Remove a no-longer-necessary `unsafe` block

* Update test expectations

* Fix gc-disabled builds
* Replace `GetHost` with a function pointer, add `HasData`

This commit is a refactoring to the fundamentals of the `bindgen!` macro
and the functions that it generates. Prior to this change the
fundamental entrypoint generated by `bindgen!` was a function
`add_to_linker_get_host` which takes a value of type `G: GetHost`. This
`GetHost` implementation is effectively an alias for a closure whose
return value is able to close over the parameter given lfietime-wise.

The `GetHost` abstraction was added to Wasmtime originally to enable
using any type that implements `Host` traits, not just `&mut U` as was
originally supported. The definition of `GetHost` was _just_ right to
enable a type such as `MyThing<&mut T>` to implement `Host` and a
closure could be provided that could return it. At the time that
`GetHost` was added it was known to be problematic from an
understandability point of view, namely:

* It has a non-obvious definition.
* It's pretty advanced Rust voodoo to understand what it's actually
  doing
* Using `GetHost` required lots of `for<'a> ...` in places which is
  unfamiliar syntax for many.
* `GetHost` values couldn't be type-erased (e.g. put in a trait object)
  as we couldn't figure out the lifetime syntax to do so.

Despite these issues it was the only known solution at hand so we landed
it and kept the previous `add_to_linker` style (`&mut T -> &mut U`) as a
convenience. While this has worked reasonable well (most folks just try
to not look at `GetHost`) it has reached a breaking point in the WASIp3
work.

In the WASIp3 work it's effectively now going to be required that the
`G: GetHost` value is packaged up and actually stored inside of
accessors provided to host functions. This means that `GetHost` values
now need to not only be taken in `add_to_linker` but additionally
provided to the rest of the system through an "accessor". This was made
possible in #10746 by moving the `GetHost` type into Wasmtime itself (as
opposed to generated code where it lived prior).

While this worked with WASIp3 and it was possible to plumb `G: GetHost`
safely around, this ended up surfacing more issues. Namely all
"concurrent" host functions started getting significantly more
complicated `where` clauses and type signatures. At the end of the day I
felt that we had reached the end of the road to `GetHost` and wanted to
search for alternatives, hence this change.

The fundamental purpose of `GetHost` was to be able to express, in a
generic fashion:

* Give me a closure that takes `&mut T` and returns `D`.
* The `D` type can close over the lifetime in `&mut T`.
* The `D` type must implement `bindgen!`-generated traits.

A realization I had was that we could model this with a generic
associated type in Rust. Rust support for generic associated types is
relatively new and not something I've used much before, but it ended up
being a perfect model for this. The definition of the new `HasData`
trait is deceptively simple:

    trait HasData {
        type Data<'a>;
    }

What this enables us to do though is to generate `add_to_linker`
functions that look like this:

    fn add_to_linker<T, D>(
        linker: &mut Linker<T>,
        getter: fn(&mut T) -> D::Data<'_>,
    ) -> Result<()>
      where
        D: HasData,
        for<'a> D::Data<'a>: Host;

This definition here models `G: GetHost` as a literal function pointer,
and the ability to close over the `&mut T` lifetime with type (not just
`&mut U`) is expressed through the type constructor `type Data<'a>`).
Ideally we could take a generic generic associated type (I'm not even
sure what to call that), but that's not something Rust has today.

Overall this felt like a much simpler way of modeling `GetHost` and its
requirements. This plumbed well throughout the WASIp3 work and the
signatures for concurrent functions felt much more appropriate in terms
of complexity after this change. Taking this change to the limit means
that `GetHost` in its entirety could be purged since all usages of it
could be replaced with `fn(&mut T) -> D::Data<'a>`, a hopefully much
more understandable type.

This change is not all rainbows however, there are some gotchas that
remain:

* One is that all `add_to_linker` generated functions have a `D:
  HasData` type parameter. This type parameter cannot be inferred and
  must always be explicitly specified, and it's not easy to know what to
  supply here without reading documentation. Actually supplying the type
  parameter is quite easy once you know what to do (and what to fill
  in), but it may involve defining a small struct with a custom
  `HasData` implementation which can be non-obvious.

* Another is that the `G: GetHost` value was previously a full Rust
  closure, but now it's transitioning to a function pointer. This is
  done in preparation for WASIp3 work where the function needs to be
  passed around, and doing that behind a generic parameter is more
  effort than it's worth. This means that embedders relying on the true
  closure-like nature here will have to update to using a function
  pointer instead.

* The function pointer is stored in locations that require `'static`,
  and while `fn(T)` might be expected to be `'static` regardless of `T`
  is is, in fact, not. This means that practically `add_to_linker`
  requires `T: 'static`. Relative to just before this change this is a
  possible regression in functionality, but there orthogonal reasons
  beyond just this that we want to start requiring `T: 'static` anyway.
  That means that this isn't actually a regression relative to #10760, a
  related change.

The first point is partially ameliorated with WASIp3 work insofar that
the `D` type parameter will start serving as a location to specify where
concurrent implementations are found. These concurrent methods don't
take `&mut self` but instead are implemented for `T: HasData` types. In
that sense it's more justified to have this weird type parameter, but in
the meantime without this support it'll feel a bit odd to have this
little type parameter hanging off the side.

This change has been integrated into the WASIp3 prototyping repository
with success. This has additionally been integrated into the Spin
embedding which has one of the more complicated reliances on
`*_get_host` functions known. Given that it's expected that while this
is not necessarily a trivial change to rebase over it should at least be
possible.

Finally the `HasData` trait here has been included with what I'm hoping
is a sufficient amount of documentation to at least give folks a spring
board to understand it. If folks have confusion about this `D` type
parameter my hope is they'll make their way to `HasData` which showcases
various patterns for "librarifying" host implementations of WIT
interfaces. These patterns are all used throughout Wasmtime and WASI
currently in crates and tests and such.

* Update expanded test expectations
This switches s390x to use the common-code VCodeConstant
literal pool instead of constants inlined into the instruction
stream.
The underlying issue was incidentally fixed in
c22b3cb, so this commit just adds a test to
make sure that the bug doesn't regress again in the future.

Fixes #10772
The `--memory-size` argument caused failures when running the `mpk.rs`
example. This was due to some `clap`-related conversions that no longer
worked as when this was written:

```
thread 'main' panicked at examples/mpk.rs:69:18:
Mismatch between definition and access of `memory_size`. Could not
downcast to usize, need to downcast to u64
```

This change explicitly uses `u64` for the CLI argument, fixing the
failure.
Use `$crate` in the `entity_impl` macro for references to itself, so that
they resolve even if `entity_impl` isn't imported into the global scope.
* c-api: new wasmtime_trap_new_code function

* review comments

* remove accidental include
* winch: Implement check_stack for aarch64

* winch: update aarch64 snapshot tests for check_stack change

* winch: address review comments on aarch64 check_stack

* winch: update aarch64 snapshot tests: check_stack uses unsigned compare

* winch: update aarch64 snapshot tests: stack grows down

* winch: aarch64 check_stack correct direction and alignment

* winch: aarch64 enable stack_stack wast checks
This commit is a follow-up to
bytecodealliance/wasmtime#10730 and closes
bytecodealliance/wasmtime#10751.

In bytecodealliance/wasmtime#10751, I forgot
to update the branch emission heuristic checks.

Typically, branch-emitting code, such as br, will handle the results
in the maybe_pop_results callback. In this case, we can safely assert
that the stack pointer should be greater than or equal to the target
base stack pointer offset before calling the callback. However, for
certain cases, like `br_table`, result handling must occur before
invoking `CodeContext::br`, which can lead to assertion failures.

To maintain the invariant check while accommodating both scenarios,
this commit moves the assertion to occur after the callback is
invoked.
* Migrate run-tests.sh to Python

Try to fix MinGW breakage in CI

prtest:full

* Use python shell

* Try again

* Exit Python with the same code as Cargo
* Classify Winch with Tier 1 support

This commit classifies the Winch compiler in our documentation as a Tier
1 feature. This reflects the degree of work and investment put into it
as well as the thorough quality vetting that has already happened in
production. For Wasmtime this means that bugs in Winch are candidates
for security releases as well.

This additionally refines our tiers-of-support documentation to reflect
how not all compilers are required to support all features. Tables are
added of per-architecture support for each compiler as well as refined
wording of the various requirements of what is required for each tier as
well.

* Re-organize some docs
alexcrichton and others added 12 commits May 18, 2025 22:59
Looks to fix an issue coming up on CI and MinGW

prtest:full
x86 CPUs have a set of conversions of the form: `cvt{from}2{to}`, where
`from` and `to` can be various XMM-held types (e.g., `ss`, `ps, `si`,
etc.). These also have their truncating versions, `cvtt*`. This change
defines all of the instructions Cranelift needs (there are a few more)
and wires them up in ISLE and in various emitted sequences. For these
sequences, this chooses to factor out the choice of instruction into a
closure since we no longer can pass around an opcode.
* Update MSRV to 1.85.0

This makes us eligible to update to the 2024 Rust edition but I'm
planning on deferring that to a separate follow-up change.

prtest:full

* Fix unused mut

* Try harder to link `cosf` and thus `libm`
* Update Wasmtime to the 2024 Rust Edition

Now that our MSRV supports the 2024 edition it's possible to make this
switch. This commit moves Wasmtime to the 2024 Edition to keep
up-to-date with Rust idioms and access many of the edition features
exclusive to the 2024 edition.

prtest:full

* Reformat with the 2024 edition
@alexcrichton alexcrichton enabled auto-merge May 19, 2025 18:19
@alexcrichton alexcrichton added this pull request to the merge queue May 19, 2025
Merged via the queue into bytecodealliance:main with commit c3b6c9c May 19, 2025
155 checks passed
@alexcrichton alexcrichton deleted the merge- branch May 19, 2025 19:25
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.