Skip to content

Commit 07a5f48

Browse files
Stevengreclaude
andauthored
Implement std::hint::black_box intrinsic support in KMIR (#659)
This PR implements support for Rust intrinsic functions in the MIR semantics, starting with the black_box intrinsic. Intrinsic functions are compiler built-ins that don't have regular MIR bodies and require special handling in the formal semantics. Key Changes Core Implementation - Python Layer (kmir.py, smir.py): - Extended functions() method to recognize and handle intrinsic functions from SMIR - Added mapping of intrinsic symbols to IntrinsicFunction K terms - Intrinsics are identified by IntrinsicSym in the function symbols K Semantics - mono.md: Added IntrinsicFunction(Symbol) constructor to MonoItemKind - kmir.md: - Added special execution path for intrinsic functions that bypasses normal call setup - Implemented #execIntrinsic mechanism for direct intrinsic execution - Added black_box implementation as identity function Testing & Documentation - Added comprehensive test suite for intrinsic functions - Created developer documentation: - docs/dev/adding-intrinsics.md: Guide for adding new intrinsic support - Added test utilities for SMIR generation and parsing - Included integration tests demonstrating black_box usage Implementation Details The intrinsic function support follows this flow: 1. SMIR parser identifies functions with IntrinsicSym symbols 2. Python layer creates IntrinsicFunction K terms for these functions 3. K semantics intercepts intrinsic calls and routes them to #execIntrinsic 4. Each intrinsic has its own execution rule (currently only black_box) Future Work This PR establishes the foundation for supporting additional Rust intrinsics. The modular design makes it straightforward to add new intrinsics by: 1. Adding execution rules in kmir.md 2. Writing corresponding tests in tests/rust/intrinsic/ 3. Following the patterns established in this PR The test infrastructure refactoring also provides a solid foundation for expanding test coverage across the entire MIR semantics implementation. --------- Co-authored-by: Claude <noreply@anthropic.com>
1 parent 42e1bb8 commit 07a5f48

38 files changed

Lines changed: 34395 additions & 10 deletions

CLAUDE.md

Lines changed: 161 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,161 @@
1+
# CLAUDE.md
2+
3+
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
4+
5+
## Repository Overview
6+
7+
MIR Semantics provides a K Framework-based formal semantics for Rust's Stable MIR (Mid-level Intermediate Representation), enabling symbolic execution and formal verification of Rust programs.
8+
9+
## Essential Commands
10+
11+
### Build and Setup
12+
```bash
13+
# Initial setup - install stable-mir-json tool for SMIR generation
14+
make stable-mir-json
15+
16+
# Build K semantics definitions (required after any K file changes)
17+
make build
18+
19+
# Full build and check
20+
make check build
21+
```
22+
23+
### Testing
24+
```bash
25+
# Run all tests
26+
make test
27+
28+
# Run unit tests only
29+
make test-unit
30+
31+
# Run integration tests (requires stable-mir-json and build)
32+
make test-integration
33+
34+
# Run a single test
35+
uv --directory kmir run pytest kmir/src/tests/integration/test_prove.py::test_prove_rs -k "test_name"
36+
37+
# Generate and parse SMIR for test files
38+
make smir-parse-tests
39+
```
40+
41+
### Code Quality
42+
```bash
43+
# Format code
44+
make format
45+
46+
# Check code quality (linting, type checking, formatting)
47+
make check
48+
49+
# Individual checks
50+
make check-flake8
51+
make check-mypy
52+
make check-black
53+
```
54+
55+
### Working with KMIR Tool
56+
```bash
57+
# Activate environment for interactive use
58+
source kmir/.venv/bin/activate
59+
60+
# Or run commands directly
61+
uv --directory kmir run kmir <command>
62+
63+
# Prove Rust code directly (recommended)
64+
uv --directory kmir run kmir prove-rs path/to/file.rs --verbose
65+
66+
# Generate SMIR JSON from Rust
67+
./scripts/generate-smir-json.sh file.rs output_dir
68+
69+
# View proof results
70+
uv --directory kmir run kmir show proof_id --proof-dir ./proof_dir
71+
```
72+
73+
## Architecture Overview
74+
75+
### Directory Structure
76+
- `kmir/` - Python frontend tool and K semantics
77+
- `src/kmir/` - Python implementation
78+
- `kmir.py` - Main KMIR class handling K semantics interaction
79+
- `smir.py` - SMIR JSON parsing and info extraction
80+
- `kdist/mir-semantics/` - K semantics definitions
81+
- `src/tests/` - Test suites
82+
- `integration/data/prove-rs/` - Rust test programs for prove-rs
83+
- `integration/data/exec-smir/` - Rust programs for execution tests
84+
85+
### Key K Semantics Files
86+
- `kmir.md` - Main execution semantics and control flow
87+
- `mono.md` - Monomorphized item definitions
88+
- `body.md` - Function body and basic block semantics
89+
- `rt/configuration.md` - Runtime configuration cells
90+
- `rt/data.md` - Runtime data structures
91+
- `ty.md` - Type system definitions
92+
93+
### Python-K Integration
94+
The Python layer (`kmir.py`) bridges between SMIR JSON and K semantics:
95+
1. Parses SMIR JSON via `SMIRInfo` class
96+
2. Transforms to K terms using `_make_function_map`, `_make_type_and_adt_maps`
97+
3. Executes via K framework's `KProve`/`KRun` interfaces
98+
99+
### Intrinsic Functions
100+
Intrinsic functions (like `black_box`) don't have regular function bodies. They're handled by:
101+
1. Python: `_make_function_map` adds `IntrinsicFunction` entries to function map
102+
2. K: Special rules in `kmir.md` execute intrinsics via `#execIntrinsic`
103+
104+
## Testing Patterns
105+
106+
### prove-rs Tests
107+
Tests in `kmir/src/tests/integration/data/prove-rs/` follow this pattern:
108+
- Simple Rust programs with assertions
109+
- File naming: `test-name.rs` (passes), `test-name-fail.rs` (expected to fail)
110+
- Tests run via `kmir prove-rs` command
111+
- Generate SMIR automatically during test execution
112+
113+
### Adding New Tests
114+
1. Add Rust file to `prove-rs/` directory
115+
2. Use assertions to verify behavior
116+
3. Run with: `uv --directory kmir run kmir prove-rs your-test.rs`
117+
118+
## Development Workflow
119+
120+
### Before Starting Any Task
121+
1. Read README and documentation in docs/ directory first
122+
2. Study existing development patterns and conventions
123+
3. Understand the codebase structure before making changes
124+
125+
### Modifying K Semantics
126+
1. Edit `.md` files in `kmir/src/kmir/kdist/mir-semantics/`
127+
2. Run `make build` to compile changes
128+
3. Test with `make test-integration`
129+
130+
### Modifying Python Code
131+
1. Edit files in `kmir/src/kmir/`
132+
2. Run `make format && make check` to verify code quality and formatting
133+
3. Test with `make test-unit`
134+
135+
### Adding Intrinsic Support
136+
1. Update `_make_function_map` in `kmir.py` to recognize intrinsic
137+
2. Add `IntrinsicFunction` constructor in `mono.md`
138+
3. Add execution rules in `kmir.md` under `#execIntrinsic`
139+
4. Add test in `prove-rs/` directory
140+
141+
## Debugging Tips
142+
143+
### Viewing Proof Execution
144+
```bash
145+
# Show specific nodes in proof
146+
uv --directory kmir run kmir show proof_id --nodes "1,2,3" --proof-dir ./proof_dir
147+
148+
# Show transitions between nodes
149+
uv --directory kmir run kmir show proof_id --node-deltas "1:2,2:3" --proof-dir ./proof_dir
150+
151+
# Show rules applied
152+
uv --directory kmir run kmir show proof_id --rules "1:2" --proof-dir ./proof_dir
153+
154+
# Full details with static info
155+
uv --directory kmir run kmir show proof_id --full-printer --no-omit-static-info --proof-dir ./proof_dir
156+
```
157+
158+
### Common Issues
159+
- `Function not found` errors: Check if function is in `FUNCTIONS_CELL` (may be intrinsic)
160+
- K compilation errors: Rules must be properly formatted, check syntax
161+
- SMIR generation fails: Ensure using correct Rust nightly version (2024-11-29)

docs/dev/adding-intrinsics.md

Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,73 @@
1+
# Adding Intrinsics
2+
3+
## Development Workflow
4+
5+
### Step 1: Create Test File
6+
Create `tests/rust/intrinsic/your_intrinsic.rs`:
7+
8+
```rust
9+
fn main() {
10+
let result = your_intrinsic(args);
11+
assert_eq!(result, expected);
12+
}
13+
```
14+
15+
### Step 2: Generate SMIR and Verify Intrinsic Detection
16+
```bash
17+
# Generate SMIR JSON
18+
make generate-tests-smir
19+
20+
# Update expected outputs and verify intrinsic is detected
21+
make test-unit TEST_ARGS="--update-expected-output"
22+
```
23+
24+
Check `tests/expected/unit/test_smir/test_function_symbols/your_intrinsic.expected.json` to confirm the intrinsic appears as `IntrinsicSym`.
25+
26+
### Step 3: Run Initial Integration Test
27+
```bash
28+
# Run test and update expected output (will show stuck at intrinsic call)
29+
make test-integration TEST_ARGS="-k your_intrinsic --update-expected-output"
30+
31+
# Backup the initial state for comparison
32+
cp tests/expected/integration/test_exec_smir/intrinsic_your_intrinsic.state \
33+
tests/expected/integration/test_exec_smir/intrinsic_your_intrinsic.state.backup
34+
```
35+
36+
### Step 4: Implement K Rule
37+
Edit `kmir/src/kmir/kdist/mir-semantics/kmir.md`:
38+
39+
```k
40+
rule <k> #execIntrinsic(mirString("your_intrinsic"), ARGS, DEST) =>
41+
/* your implementation */
42+
... </k>
43+
```
44+
45+
### Step 5: Rebuild and Test
46+
```bash
47+
# Rebuild K semantics
48+
make build
49+
50+
# Run test again
51+
make test-integration TEST_ARGS="-k your_intrinsic --update-expected-output"
52+
53+
# Compare results
54+
diff tests/expected/integration/test_exec_smir/intrinsic_your_intrinsic.state.backup \
55+
tests/expected/integration/test_exec_smir/intrinsic_your_intrinsic.state
56+
```
57+
58+
The diff should show progress past the intrinsic call if implementation is correct.
59+
60+
### Step 6: Verify Results
61+
Ensure the test completes successfully and the intrinsic behaves as expected.
62+
63+
## Example: black_box
64+
65+
Initial state (before rule):
66+
```
67+
#setUpCalleeData ( IntrinsicFunction ( mirString ( "black_box" ) ) , ...)
68+
```
69+
70+
After implementing rule:
71+
```
72+
Program continues execution with value 11 passed through
73+
```

kmir/src/kmir/kdist/mir-semantics/kmir.md

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -418,6 +418,7 @@ If the local `_0` does not have a value (i.e., it remained uninitialised), the f
418418
// other item kinds are not expected or supported
419419
rule #getBlocksAux(monoItemStatic(_, _, _)) => .List // should not occur in calls
420420
rule #getBlocksAux(monoItemGlobalAsm(_)) => .List // not supported
421+
rule #getBlocksAux(IntrinsicFunction(_)) => .List // intrinsics have no body
421422
```
422423

423424
When a `terminatorKindReturn` is executed but the optional target is empty
@@ -511,9 +512,20 @@ An operand may be a `Reference` (the only way a function could access another fu
511512
...
512513
</currentFrame>
513514
// TODO: Haven't handled "noBody" case
515+
516+
// Handle intrinsic functions - execute directly without setting up local stack frame
517+
rule <k> #setUpCalleeData(IntrinsicFunction(INTRINSIC_NAME), ARGS)
518+
=> #execIntrinsic(INTRINSIC_NAME, ARGS, DEST) ~> #execBlockIdx(RETURN_TARGET)
519+
</k>
520+
<currentFrame>
521+
<dest> DEST </dest>
522+
<target> someBasicBlockIdx(RETURN_TARGET) </target>
523+
...
524+
</currentFrame>
514525
515526
syntax KItem ::= #setArgsFromStack ( Int, Operands)
516527
| #setArgFromStack ( Int, Operand)
528+
| #execIntrinsic ( Symbol, Operands, Place )
517529
518530
// once all arguments have been retrieved, execute
519531
rule <k> #setArgsFromStack(_, .Operands) ~> CONT => CONT </k>
@@ -602,6 +614,23 @@ Drops are elaborated to Noops but still define the continuing control flow. Unre
602614
</k>
603615
```
604616

617+
### Intrinsic Functions
618+
619+
Intrinsic functions are built-in functions provided by the compiler that don't have regular MIR bodies.
620+
They are handled specially in the execution semantics through the `#execIntrinsic` mechanism.
621+
When an intrinsic function is called, the execution bypasses the normal function call setup and directly
622+
executes the intrinsic-specific logic.
623+
624+
#### Black Box (`std::hint::black_box`)
625+
626+
The `black_box` intrinsic serves as an optimization barrier, preventing the compiler from making assumptions
627+
about the value passed through it. In the semantics, it acts as an identity function that simply passes
628+
its argument to the destination without modification.
629+
630+
```k
631+
// Black box intrinsic implementation - identity function
632+
rule <k> #execIntrinsic(symbol("black_box"), ARG .Operands, DEST) => #setLocalValue(DEST, ARG) ... </k>
633+
```
605634

606635
### Stopping on Program Errors
607636

kmir/src/kmir/kdist/mir-semantics/mono.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@ syntax MonoItemKind ::= monoItemFn(name: Symbol, id: DefId, body: MaybeBody)
3434
[ group(mir-enum)
3535
, symbol(MonoItemKind::MonoItemGlobalAsm)
3636
]
37+
| IntrinsicFunction(Symbol) [ symbol(IntrinsicFunction) ]
3738
syntax MonoItem ::= monoItem(symbolName: Symbol, monoItemKind: MonoItemKind)
3839
[symbol(monoItemWrapper), group(mir---symbol-name--mono-item-kind)]
3940
syntax MonoItems ::= List {MonoItem, ""}

kmir/src/kmir/kmir.py

Lines changed: 21 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -68,9 +68,11 @@ def kcfg_explore(self, label: str | None = None) -> Iterator[KCFGExplore]:
6868
) as cts:
6969
yield KCFGExplore(cts, kcfg_semantics=KMIRSemantics())
7070

71-
def _make_function_map(self, smir_info: SMIRInfo) -> KInner:
71+
def functions(self, smir_info: SMIRInfo) -> dict[int, KInner]:
7272
parser = Parser(self.definition)
73-
parsed_items: dict[KInner, KInner] = {}
73+
functions: dict[int, KInner] = {}
74+
75+
# Parse regular functions
7476
for item_name, item in smir_info.items.items():
7577
if not item_name in smir_info.function_symbols_reverse:
7678
_LOGGER.warning(f'Item not found in SMIR: {item_name}')
@@ -82,9 +84,23 @@ def _make_function_map(self, smir_info: SMIRInfo) -> KInner:
8284
assert isinstance(parsed_item_kinner, KApply) and parsed_item_kinner.label.name == 'monoItemWrapper'
8385
# each item can have several entries in the function table for linked SMIR JSON
8486
for ty in smir_info.function_symbols_reverse[item_name]:
85-
item_ty = KApply('ty', [token(ty)])
86-
parsed_items[item_ty] = parsed_item_kinner.args[1]
87-
return map_of(parsed_items)
87+
functions[ty] = parsed_item_kinner.args[1]
88+
89+
# Add intrinsic functions
90+
for ty, sym in smir_info.function_symbols.items():
91+
if 'IntrinsicSym' in sym and ty not in functions:
92+
functions[ty] = KApply(
93+
'IntrinsicFunction',
94+
[KApply('symbol(_)_LIB_Symbol_String', [token(sym['IntrinsicSym'])])],
95+
)
96+
97+
return functions
98+
99+
def _make_function_map(self, smir_info: SMIRInfo) -> KInner:
100+
parsed_terms: dict[KInner, KInner] = {}
101+
for ty, body in self.functions(smir_info).items():
102+
parsed_terms[KApply('ty', [token(ty)])] = body
103+
return map_of(parsed_terms)
88104

89105
def _make_type_and_adt_maps(self, smir_info: SMIRInfo) -> tuple[KInner, KInner]:
90106
parser = Parser(self.definition)

kmir/src/kmir/smir.py

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -116,8 +116,10 @@ def function_symbols_reverse(self) -> dict[str, list[int]]:
116116
tys_for_name: dict[str, list[int]] = {}
117117
for ty, sym in self.function_symbols.items():
118118
if 'NormalSym' in sym:
119-
tys_for_name.setdefault(sym['NormalSym'], [])
120-
tys_for_name[sym['NormalSym']].append(ty)
119+
tys_for_name.setdefault(sym['NormalSym'], []).append(ty)
120+
elif 'IntrinsicSym' in sym:
121+
tys_for_name.setdefault(sym['IntrinsicSym'], []).append(ty)
122+
# Skip other symbol types like NoOpSym for now
121123
return tys_for_name
122124

123125
@cached_property

0 commit comments

Comments
 (0)