Skip to content

Commit aaa7a8a

Browse files
committed
Added challenge 30 progressing public mir proposal document
1 parent 2f8269d commit aaa7a8a

1 file changed

Lines changed: 191 additions & 0 deletions

File tree

Lines changed: 191 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,191 @@
1+
# Challenge 30: Progressing Public MIR for verification tool infrastructure
2+
3+
- **Status:** *Open*
4+
- **Solution:** *Option field to point to the PR that solved this challenge.*
5+
- **Tracking Issue:** *TBD*
6+
- **Start date:** *TBD*
7+
- **End date:** *TBD*
8+
- **Reward:** *TBD*
9+
10+
-------------------
11+
12+
13+
## Goal
14+
15+
Progress the [`rustc_public`](https://github.com/rust-lang/project-stable-mir)
16+
(formerly Stable MIR) interface so that verification tools targeting the Rust
17+
standard library can consume post-analysis MIR without falling back to
18+
`rustc_internal` APIs. This challenge is infrastructure-focused: it does not
19+
verify a specific standard library module, but rather removes a class of
20+
fragility that affects tools in this repository that operates on Public MIR
21+
(Kani, KMIR, and any future consumer of Public MIR).
22+
23+
## Motivation
24+
25+
The Rust compiler exposes multiple intermediate representations. MIR after
26+
borrow-check is the most useful endpoint for third-party static and dynamic
27+
analyses, and the `rustc_public` crate (split into `rustc_public` and
28+
`rustc_public_bridge` upstream) is the stable(r) surface intended for such tools.
29+
30+
In practice, `rustc_public` is incomplete. Tools that consume MIR today
31+
routinely round-trip through `rustc_internal` to retrieve information that
32+
should be available through the public API. Two concrete examples from
33+
verification tools used or targeted by this repository:
34+
35+
- Kani maintains an internal MIR helper
36+
[internal_mir.rs](https://github.com/model-checking/kani/blob/main/kani-compiler/src/kani_middle/transform/internal_mir.rs)
37+
that reaches into compiler-private types to reconstruct data the public API
38+
cannot yet provide. The file exists because Kani's delayed-UB instrumentation
39+
relies on rustc's MIR dataflow framework, which operates on internal MIR only.
40+
Kani carries them in-tree today, but without upstream inclusion the benefit
41+
is likely not distributed to other consumers of Public MIR. This challenge
42+
therefore cannot be treated as a purely mechanical lifting of Kani's helpers
43+
and other missing components; part of the work is re-opening that design
44+
conversation in the workgroup.
45+
- Stable MIR JSON (the serialisation layer used by KMIR) has had to carry
46+
fixes to issues such as [94](https://github.com/rust-lang/project-stable-mir/issues/94)
47+
that were simply misuse of internel vs. public APIs.
48+
49+
Internal APIs may change on any nightly, reducing the effectiveness of a
50+
stablised interface. Furthermore, tools that require calls an
51+
internal API instead of a public one run the risk of accessing an _incorrect_
52+
API for their purpose. This is both easy to do and hard to detect. This
53+
creates friction for verification tooling built on Public MIR and effictively
54+
narrows the verification techniques available by creating unnecessary hurdles
55+
that are antipatterns to the goals of Public MIR (discouraging adoption).
56+
57+
## Description
58+
59+
`rustc_public` today exposes a mostly-stable view of MIR, but three classes
60+
of problem push tool authors back into `rustc_internal`:
61+
62+
1. **Missing coverage.** In-source `todo!()` and `FIXME` markers are a
63+
direct in-source signal, but are not entirely informative. These are often
64+
linked to experimental or planned features in Public MIR and not
65+
representative of the gaps downstream users are running into currently.
66+
However, these places will be considered and closed opportunitistically.
67+
68+
The higher-signal picture of where `rustc_public` falls short for
69+
verification tools is the workarounds tool authors have had to produce to
70+
get the data unavailable through the Public MIR API. Kani documents this in
71+
[internal_mir.rs](https://github.com/model-checking/kani/blob/main/kani-compiler/src/kani_middle/transform/internal_mir.rs),
72+
with similar challenges faced by
73+
[stable-mir-json](https://github.com/runtimeverification/stable-mir-json/). The
74+
[project-stable-mir issue tracker](https://github.com/rust-lang/project-stable-mir/issues)
75+
also offers insight into current gaps that are actually being hit by consumers
76+
of Public MIR. The subject of this challenge is that body of work: the design
77+
conversations with the workgroup, the upstream API additions that unblock tool
78+
migrations, and the consumer-side deletions that follow.
79+
2. **Dialect ambiguity.** The internal MIR pipeline exposes four distinct
80+
phases (built, const-eval, optimised, post-analysis). The access point
81+
for all versions of MIR is the `after_analysis` callback. `rustc_public`
82+
itself does not commit to any single phase at the API boundary, with
83+
different dialects being accessible by applying transformations on the
84+
_internal_ MIR and then bridging to _public_ MIR. The MIR collected via
85+
the callback is polymorphic; tools that want monomorphised MIR
86+
therefore perform manual monomorphisation from that callback. However,
87+
subtle intricacies with this method and missing coverage (see 1.) make
88+
it easy to mix monomorphised MIR with MIR from another phase leaving the
89+
resulting MIR in a state that no specific tool is designed to consume.
90+
91+
Clarifying what phase(s) `rustc_public` is willing to expose, and providing
92+
documentation, better API coverage, and potentially enforced guards on
93+
dialect access would let tools access the dialects they expect in a clearly
94+
defined manner.
95+
3. **Typing-context leakage.** Several operations that appear public still
96+
require reaching into `TyCtxt`. Lifting these behind `rustc_public`
97+
wrappers is the remainder of the work started upstream; see the discussion
98+
in the project-stable-mir workgroup and related issues
99+
(e.g. [project-stable-mir#94](https://github.com/rust-lang/project-stable-mir/issues/94)).
100+
101+
The challenge is a time-boxed effort to close these gaps upstream, and to
102+
demonstrate the result by removing the corresponding internal-API usage from
103+
verification tools that (current or future) that can contribute to
104+
verification of Rust `std` library.
105+
106+
### Assumptions
107+
108+
- Experimental nightly-only constructs that are not yet meaningfully present
109+
in internal MIR (for example, features gated behind unstable flags such as
110+
`tail_call`-style terminators) are out of scope beyond mechanical plumbing.
111+
If supporting them in `rustc_public` would require redesigning the
112+
corresponding internal MIR surface, that redesign is out of scope.
113+
- The split between `rustc_public` (stable-facing crate) and
114+
`rustc_public_bridge` (internal adapter) is assumed to be the intended
115+
long-term architecture. Work targets these crates rather than proposing a
116+
new location.
117+
- Solutions are landed upstream in `rust-lang/rust` via the
118+
[project-stable-mir](https://github.com/rust-lang/project-stable-mir)
119+
workgroup.
120+
121+
### Success Criteria
122+
123+
Since this challenge does not directly verify components of `std`, but
124+
improves the landscape for verification, the success criteria are stated as
125+
API-surface and tool-integration deliverables rather than per-function
126+
safety contracts.
127+
128+
**A. Time-boxed, best-effort migration.** Within the engagement, convert as
129+
many verification-tool API calls from `rustc_internal` to `rustc_public` as
130+
is feasible, and implement missing `rustc_public` surface where a conversion
131+
target does not yet exist. A complete migration of the consumer surfaces
132+
named in the Motivation (Kani's `internal_mir.rs`, stable-mir-json, and
133+
similar shims) is attempted, with the understanding that community consensus
134+
has historically blocked some conversions.
135+
136+
Where a similar blocker re-surfaces during the engagement, the deliverable
137+
for that surface is a **taxonomy of possible solutions** (for example:
138+
lift conversions into `rustc_public`, expose a narrower data-only API,
139+
add an opt-in dataflow adapter, or leave the consumer-side shim documented
140+
as intentional) with enough analysis for the workgroup to continue the
141+
conversation after the engagement closes. A blocker presented with this
142+
analysis counts toward the criterion; a blocker only noted without analysis
143+
does not.
144+
145+
**B. Dialect clarity.** The four MIR dialects exposed through the compiler
146+
pipeline are addressed in one of two ways at the `rustc_public` boundary:
147+
(i) the boundary is explicitly restricted to a documented subset (the
148+
specific phase or phases are to be decided in workgroup discussion, with
149+
post-analysis and post-monomorphisation as the leading candidates), or
150+
(ii) the API documents the invariants each reachable dialect satisfies so
151+
that consumers can tell which state they are observing.
152+
153+
**C. Residual roundtrips are documented.** Any internal-MIR roundtrips that
154+
cannot be removed within the time box are enumerated in an upstream-visible
155+
document (workgroup notes or tracking issue) with a short description of
156+
why each roundtrip is still required. This is an explicit success criterion:
157+
a catalogued gap is valuable in lieu of a closed one.
158+
159+
Additional scope for the time-boxed effort may be drawn from the
160+
[open issues on project-stable-mir](https://github.com/rust-lang/project-stable-mir/issues),
161+
prioritised by impact on verification tools targeting this repository. Which
162+
of those issues are addressed, deferred, or re-scoped during the engagement
163+
is itself part of the workgroup conversation, and a justified triage of that
164+
issue list counts toward criterion **C**.
165+
166+
**D. Verification-tool CI still passes.** At least one verification tool
167+
used in this repository's CI (Kani, or KMIR once integrated) is rebuilt
168+
against a toolchain carrying the changes, with all of its proofs for the
169+
currently-resolved challenges still succeeding. This ensures the
170+
stabilisation work does not regress existing verification results.
171+
172+
## Correctness Criteria
173+
174+
The Rust reference's list of undefined behaviour does not apply to this
175+
challenge, because the deliverable is compiler infrastructure rather than
176+
a run-time construct. In its place, solutions must satisfy the following
177+
correctness properties:
178+
179+
- For every new `rustc_public` lowering added under criterion **A**, the
180+
public surface faithfully represents the post-analysis internal MIR node:
181+
a round-trip lowering (where the construct allows one) preserves shape and
182+
referenced identifiers, and the serialised output from Stable MIR JSON
183+
round-trips through its consumer without data loss.
184+
- `rustc_public` APIs introduced to replace consumer-side shims under
185+
criterion **A** do not require the caller to hold or construct a `TyCtxt`,
186+
and do not re-export `rustc_internal` types in their signatures.
187+
- Solutions are landed as merged pull requests against `rust-lang/rust`.
188+
189+
Note: All solutions to verification challenges need to satisfy the criteria
190+
established in the [challenge book](../general-rules.md) in addition to the
191+
ones listed above.

0 commit comments

Comments
 (0)