|
| 1 | +# Challenge 30: Progressing Public MIR for verification tool infrastructure |
| 2 | + |
| 3 | +- **Status:** *Open* |
| 4 | +- **Solution:** *Option field to point to the PR that solved this challenge.* |
| 5 | +- **Tracking Issue:** *TBD* |
| 6 | +- **Start date:** *TBD* |
| 7 | +- **End date:** *TBD* |
| 8 | +- **Reward:** *TBD* |
| 9 | + |
| 10 | +------------------- |
| 11 | + |
| 12 | + |
| 13 | +## Goal |
| 14 | + |
| 15 | +Progress the [`rustc_public`](https://github.com/rust-lang/project-stable-mir) |
| 16 | +(formerly Stable MIR) interface so that verification tools targeting the Rust |
| 17 | +standard library can consume post-analysis MIR without falling back to |
| 18 | +`rustc_internal` APIs. This challenge is infrastructure-focused: it does not |
| 19 | +verify a specific standard library module, but rather removes a class of |
| 20 | +fragility that affects tools in this repository that operates on Public MIR |
| 21 | +(Kani, KMIR, and any future consumer of Public MIR). |
| 22 | + |
| 23 | +## Motivation |
| 24 | + |
| 25 | +The Rust compiler exposes multiple intermediate representations. MIR after |
| 26 | +borrow-check is the most useful endpoint for third-party static and dynamic |
| 27 | +analyses, and the `rustc_public` crate (split into `rustc_public` and |
| 28 | +`rustc_public_bridge` upstream) is the stable(r) surface intended for such tools. |
| 29 | + |
| 30 | +In practice, `rustc_public` is incomplete. Tools that consume MIR today |
| 31 | +routinely round-trip through `rustc_internal` to retrieve information that |
| 32 | +should be available through the public API. Two concrete examples from |
| 33 | +verification tools used or targeted by this repository: |
| 34 | + |
| 35 | +- Kani maintains an internal MIR helper |
| 36 | + [internal_mir.rs](https://github.com/model-checking/kani/blob/main/kani-compiler/src/kani_middle/transform/internal_mir.rs) |
| 37 | + that reaches into compiler-private types to reconstruct data the public API |
| 38 | + cannot yet provide. The file exists because Kani's delayed-UB instrumentation |
| 39 | + relies on rustc's MIR dataflow framework, which operates on internal MIR only. |
| 40 | + Kani carries them in-tree today, but without upstream inclusion the benefit |
| 41 | + is likely not distributed to other consumers of Public MIR. This challenge |
| 42 | + therefore cannot be treated as a purely mechanical lifting of Kani's helpers |
| 43 | + and other missing components; part of the work is re-opening that design |
| 44 | + conversation in the workgroup. |
| 45 | +- Stable MIR JSON (the serialisation layer used by KMIR) has had to carry |
| 46 | + fixes to issues such as [94](https://github.com/rust-lang/project-stable-mir/issues/94) |
| 47 | + that were simply misuse of internel vs. public APIs. |
| 48 | + |
| 49 | +Internal APIs may change on any nightly, reducing the effectiveness of a |
| 50 | +stablised interface. Furthermore, tools that require calls an |
| 51 | +internal API instead of a public one run the risk of accessing an _incorrect_ |
| 52 | +API for their purpose. This is both easy to do and hard to detect. This |
| 53 | +creates friction for verification tooling built on Public MIR and effictively |
| 54 | +narrows the verification techniques available by creating unnecessary hurdles |
| 55 | +that are antipatterns to the goals of Public MIR (discouraging adoption). |
| 56 | + |
| 57 | +## Description |
| 58 | + |
| 59 | +`rustc_public` today exposes a mostly-stable view of MIR, but three classes |
| 60 | +of problem push tool authors back into `rustc_internal`: |
| 61 | + |
| 62 | +1. **Missing coverage.** In-source `todo!()` and `FIXME` markers are a |
| 63 | + direct in-source signal, but are not entirely informative. These are often |
| 64 | + linked to experimental or planned features in Public MIR and not |
| 65 | + representative of the gaps downstream users are running into currently. |
| 66 | + However, these places will be considered and closed opportunitistically. |
| 67 | + |
| 68 | + The higher-signal picture of where `rustc_public` falls short for |
| 69 | + verification tools is the workarounds tool authors have had to produce to |
| 70 | + get the data unavailable through the Public MIR API. Kani documents this in |
| 71 | + [internal_mir.rs](https://github.com/model-checking/kani/blob/main/kani-compiler/src/kani_middle/transform/internal_mir.rs), |
| 72 | + with similar challenges faced by |
| 73 | + [stable-mir-json](https://github.com/runtimeverification/stable-mir-json/). The |
| 74 | + [project-stable-mir issue tracker](https://github.com/rust-lang/project-stable-mir/issues) |
| 75 | + also offers insight into current gaps that are actually being hit by consumers |
| 76 | + of Public MIR. The subject of this challenge is that body of work: the design |
| 77 | + conversations with the workgroup, the upstream API additions that unblock tool |
| 78 | + migrations, and the consumer-side deletions that follow. |
| 79 | +2. **Dialect ambiguity.** The internal MIR pipeline exposes four distinct |
| 80 | + phases (built, const-eval, optimised, post-analysis). The access point |
| 81 | + for all versions of MIR is the `after_analysis` callback. `rustc_public` |
| 82 | + itself does not commit to any single phase at the API boundary, with |
| 83 | + different dialects being accessible by applying transformations on the |
| 84 | + _internal_ MIR and then bridging to _public_ MIR. The MIR collected via |
| 85 | + the callback is polymorphic; tools that want monomorphised MIR |
| 86 | + therefore perform manual monomorphisation from that callback. However, |
| 87 | + subtle intricacies with this method and missing coverage (see 1.) make |
| 88 | + it easy to mix monomorphised MIR with MIR from another phase leaving the |
| 89 | + resulting MIR in a state that no specific tool is designed to consume. |
| 90 | + |
| 91 | + Clarifying what phase(s) `rustc_public` is willing to expose, and providing |
| 92 | + documentation, better API coverage, and potentially enforced guards on |
| 93 | + dialect access would let tools access the dialects they expect in a clearly |
| 94 | + defined manner. |
| 95 | +3. **Typing-context leakage.** Several operations that appear public still |
| 96 | + require reaching into `TyCtxt`. Lifting these behind `rustc_public` |
| 97 | + wrappers is the remainder of the work started upstream; see the discussion |
| 98 | + in the project-stable-mir workgroup and related issues |
| 99 | + (e.g. [project-stable-mir#94](https://github.com/rust-lang/project-stable-mir/issues/94)). |
| 100 | + |
| 101 | +The challenge is a time-boxed effort to close these gaps upstream, and to |
| 102 | +demonstrate the result by removing the corresponding internal-API usage from |
| 103 | +verification tools that (current or future) that can contribute to |
| 104 | +verification of Rust `std` library. |
| 105 | + |
| 106 | +### Assumptions |
| 107 | + |
| 108 | +- Experimental nightly-only constructs that are not yet meaningfully present |
| 109 | + in internal MIR (for example, features gated behind unstable flags such as |
| 110 | + `tail_call`-style terminators) are out of scope beyond mechanical plumbing. |
| 111 | + If supporting them in `rustc_public` would require redesigning the |
| 112 | + corresponding internal MIR surface, that redesign is out of scope. |
| 113 | +- The split between `rustc_public` (stable-facing crate) and |
| 114 | + `rustc_public_bridge` (internal adapter) is assumed to be the intended |
| 115 | + long-term architecture. Work targets these crates rather than proposing a |
| 116 | + new location. |
| 117 | +- Solutions are landed upstream in `rust-lang/rust` via the |
| 118 | + [project-stable-mir](https://github.com/rust-lang/project-stable-mir) |
| 119 | + workgroup. |
| 120 | + |
| 121 | +### Success Criteria |
| 122 | + |
| 123 | +Since this challenge does not directly verify components of `std`, but |
| 124 | +improves the landscape for verification, the success criteria are stated as |
| 125 | +API-surface and tool-integration deliverables rather than per-function |
| 126 | +safety contracts. |
| 127 | + |
| 128 | +**A. Time-boxed, best-effort migration.** Within the engagement, convert as |
| 129 | +many verification-tool API calls from `rustc_internal` to `rustc_public` as |
| 130 | +is feasible, and implement missing `rustc_public` surface where a conversion |
| 131 | +target does not yet exist. A complete migration of the consumer surfaces |
| 132 | +named in the Motivation (Kani's `internal_mir.rs`, stable-mir-json, and |
| 133 | +similar shims) is attempted, with the understanding that community consensus |
| 134 | +has historically blocked some conversions. |
| 135 | + |
| 136 | +Where a similar blocker re-surfaces during the engagement, the deliverable |
| 137 | +for that surface is a **taxonomy of possible solutions** (for example: |
| 138 | +lift conversions into `rustc_public`, expose a narrower data-only API, |
| 139 | +add an opt-in dataflow adapter, or leave the consumer-side shim documented |
| 140 | +as intentional) with enough analysis for the workgroup to continue the |
| 141 | +conversation after the engagement closes. A blocker presented with this |
| 142 | +analysis counts toward the criterion; a blocker only noted without analysis |
| 143 | +does not. |
| 144 | + |
| 145 | +**B. Dialect clarity.** The four MIR dialects exposed through the compiler |
| 146 | +pipeline are addressed in one of two ways at the `rustc_public` boundary: |
| 147 | +(i) the boundary is explicitly restricted to a documented subset (the |
| 148 | +specific phase or phases are to be decided in workgroup discussion, with |
| 149 | +post-analysis and post-monomorphisation as the leading candidates), or |
| 150 | +(ii) the API documents the invariants each reachable dialect satisfies so |
| 151 | +that consumers can tell which state they are observing. |
| 152 | + |
| 153 | +**C. Residual roundtrips are documented.** Any internal-MIR roundtrips that |
| 154 | +cannot be removed within the time box are enumerated in an upstream-visible |
| 155 | +document (workgroup notes or tracking issue) with a short description of |
| 156 | +why each roundtrip is still required. This is an explicit success criterion: |
| 157 | +a catalogued gap is valuable in lieu of a closed one. |
| 158 | + |
| 159 | +Additional scope for the time-boxed effort may be drawn from the |
| 160 | +[open issues on project-stable-mir](https://github.com/rust-lang/project-stable-mir/issues), |
| 161 | +prioritised by impact on verification tools targeting this repository. Which |
| 162 | +of those issues are addressed, deferred, or re-scoped during the engagement |
| 163 | +is itself part of the workgroup conversation, and a justified triage of that |
| 164 | +issue list counts toward criterion **C**. |
| 165 | + |
| 166 | +**D. Verification-tool CI still passes.** At least one verification tool |
| 167 | +used in this repository's CI (Kani, or KMIR once integrated) is rebuilt |
| 168 | +against a toolchain carrying the changes, with all of its proofs for the |
| 169 | +currently-resolved challenges still succeeding. This ensures the |
| 170 | +stabilisation work does not regress existing verification results. |
| 171 | + |
| 172 | +## Correctness Criteria |
| 173 | + |
| 174 | +The Rust reference's list of undefined behaviour does not apply to this |
| 175 | +challenge, because the deliverable is compiler infrastructure rather than |
| 176 | +a run-time construct. In its place, solutions must satisfy the following |
| 177 | +correctness properties: |
| 178 | + |
| 179 | +- For every new `rustc_public` lowering added under criterion **A**, the |
| 180 | + public surface faithfully represents the post-analysis internal MIR node: |
| 181 | + a round-trip lowering (where the construct allows one) preserves shape and |
| 182 | + referenced identifiers, and the serialised output from Stable MIR JSON |
| 183 | + round-trips through its consumer without data loss. |
| 184 | +- `rustc_public` APIs introduced to replace consumer-side shims under |
| 185 | + criterion **A** do not require the caller to hold or construct a `TyCtxt`, |
| 186 | + and do not re-export `rustc_internal` types in their signatures. |
| 187 | +- Solutions are landed as merged pull requests against `rust-lang/rust`. |
| 188 | + |
| 189 | +Note: All solutions to verification challenges need to satisfy the criteria |
| 190 | +established in the [challenge book](../general-rules.md) in addition to the |
| 191 | +ones listed above. |
0 commit comments