Skip to content
This repository was archived by the owner on Sep 8, 2025. It is now read-only.

Merge with main again#112

Merged
alexcrichton merged 14 commits into
bytecodealliance:mainfrom
alexcrichton:another-merge
Apr 7, 2025
Merged

Merge with main again#112
alexcrichton merged 14 commits into
bytecodealliance:mainfrom
alexcrichton:another-merge

Conversation

@alexcrichton
Copy link
Copy Markdown
Member

Trying to reduce conflicts from version bumps...

cfallin and others added 14 commits April 5, 2025 05:01
… (#10502)

* Cranelift: remove return-value instructions after calls at callsites.

This PR addresses the issues described in #10488 in a more head-on
way: it removes the use of separate "return-value instructions" that
load return values from the stack, instead folding these loads into
the semantics of the call VCode instruction.

This is a prerequisite for exception-handling: we need calls to be
workable as terminators, meaning that we cannot require any
other (VCode) instructions after the call to define the return values.

In principle, this PR starts simply enough: the return-locations list
on the `CallInfo` that each backend uses to provide regalloc metadata
is updated to support a notion of "register or stack address" as the
source of each return value, and this list is now used for both kinds
of returns, not just returns in registers. Shared code is defined in
`machinst::abi` used by all backends to perform the requisite loads.

In order to make this work with more defined values than fit in
registers, however, this PR also had to add support for
"any"-constrained registers to Cranelift, and handling allocations
that may be spillslots. This has always been supported by RA2, but
this is the first time that Cranelift uses them directly (previously
they were used only internally in RA2 as lowerings from other kinds of
constraints like safepoints). This requires encoding a spillslot index
in our `Reg` type.

There is a little bit of complexity around handling the loads/defs as
well: if we have a return value on-stack, and we need to put it in a
spillslot, we cannot do a memory-to-memory move directly, so we need a
temporary register. Earlier versions of this PR allocated another temp
as a vreg on the call, but this doesn't work with all calling
conventions (too many clobbers). For simplicity I picked a particular
register that is (i) clobbered by calls and (ii) not used for return
values for each architecture (x86-64's tailcall needed to lose one
return-in-register slot to make this work).

This removes retval insts from the shared ABI infra completely. s390x
is different, still, because it handles callsite lowering from ISLE;
we will need to address that separately for exception support there.

* Fix is_included_in_clobbers on aarch64: new defs must skip optimization.

* Review feedback: add assert.

* Review feedback: handle retval temp reg via ABI trait method.

* Update is_clobbered_in_inst to affect only clobbers, not all defs.
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
…k` module to the top level of the crate (#10517)

Split off from bytecodealliance/wasmtime#10503
Currently the `wast` test suite does a combinatorial matrix of `compiler
x pooling x allocator` to produce the list of tests to run. We also have
a `wast_tests` fuzzer, however, which configures even more knobs and run
tests under these configurations. In the interest of saving time on CI
and avoiding too much combinatorial explosion this commit changes the
`wast` test suite to avoiding combinatorics and instead running in only
"interesting" combinations.

The goal here is to weed out common issues on CI quickly, such as
compiler-specific behaviors, but leave the full spectrum of
combinatorial tests to fuzzing. For example we probably aren't getting
much extra coverage running Winch with the pooling allocator over
running Cranelift with the pooling allocator.

This reduces the number of tests from ~4500 to ~2500 which should
hopefully lead to some minor improvements in CI time.
This commit implements a workaround for rust-lang/rust#139487 which is
necessary to have these implementations continue to work on more recent
compilers due to the underlying implementation changing.
These tests assume that a very large allocation will fail due to
exceeding address space limits.  This is not the case on s390x,
where the full 64-bit address space is available for allocation.
(Of course, actually using that much memory would still fail at
some point, but the mere allocation may not, depending on the
kernel's memory overcommit handling configuration.)

Note that running the tests under qemu will cause the expected
failure, which is why this problem is not noticed by the CI.

Simply disable these test on s390x.
This fixes the Pulley ABI code to only have a tail-call-specific
adjustment on the tail ABI, not all ABIs. This matches what other
backends such as riscv64 do as well. This is not exposed through wasm I
believe because no native signature can exercise this code (I think) and
all wasm code uses the `tail` ABI anyway.

This was discovered by running Cranelift filetests using ASAN which is
something I hope to be able to add to our CI in the near future.
* Support adding additional capacity to `FreeList`s

Split out from bytecodealliance/wasmtime#10503

* Add a test for `FreeList::add_capacity`

* Also test when old capacity is not a multiple of our alignment

* Add comments about alignment-rounding and free list capacity

* Don't run little example as a doc test
…nt` (#10526)

And let alias analysis dedupe and clean up multiple uses, instead of trying to
do that manually in our Wasm-to-CLIF frontend by loading the pointer once in the
entry block and trying to keep track of whether it is actually necessary to
pre-declare the pointer and all that.

Split off from bytecodealliance/wasmtime#10503
* Cranelift: update to regalloc2 0.11.3.

This pulls in a fix for a fuzzbug found after #10502 started generating
more challenging constraints for regalloc. The fix in
bytecodealliance/regalloc2#214 updates bundle-splitting logic to
properly handle bundles with multiple live-ranges all covering one
instruction.

* Update test expectations after regalloc perturbation.
…531)

This is a follow-on to bytecodealliance/wasmtime#10502
implementing the same logic for s390x.

Like other platforms, we now no longer emit any instructions handling
return values after the call instruction; instead, everything is done
within a pseudo Call instruction.  Unlike other platforms, this also
has to handle vector lane swapping when calling between ABIs that mix
BE and LE lane orders.  The pseudo Call instruction needs enough type
information to make this choice, therefore I had to add a type field
to the RetLocation::Reg case (the ::Stack case already had one).

All other changes are contained within the s390x back-end.  To simplify
the implementation, the patch also implements a number of clean-ups:
- Introduce a MemArg::SpillOffset variant
- Remove the (unnecessary on this platform) island code size check
- Merge the CallInd instructions into the base Call instructions,
  using a new CallInstDest type to carry the call destination
…ent` (#10541)

* Stop returning a `WasmResult` from the bounds-checking code; it is actually
  infallible and we were just plumbing around results for no reason.

* Split out a few small helper methods from `FuncEnvironment::make_heap`. This
  allows for reusing them in
  bytecodealliance/wasmtime#10503
* Apply argument extensions to component intrinsics too

This commit fixes the definition of host signatures for component
builtins to work in the same way that core builtins do which is to use
platform-specific sign-extension-rules by default. This fixes a
regression found in the wasip3-prototyping repository where new
component intrinsics didn't have this set and subsequently were failing
in riscv64 CI.

* Add some tests
@alexcrichton alexcrichton enabled auto-merge April 7, 2025 21:11
@alexcrichton alexcrichton added this pull request to the merge queue Apr 7, 2025
Merged via the queue into bytecodealliance:main with commit aa801ba Apr 7, 2025
79 checks passed
@alexcrichton alexcrichton deleted the another-merge branch April 7, 2025 21:58
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants