
Commit 05552e7

Stevengre and claude committed
feat(rt): add address allocation model for pointer-to-integer semantics
Implements Miri-style lazy address allocation (Issue #1002):

Configuration:
- `<addressMap>`: maps allocation keys to (base_address, size) pairs
- `<nextAddress>`: next available address (starts at 4096 to avoid NULL)
- `<exposedSet>`: tracks allocations with exposed provenance

Semantics:
- `#allocAddressFor`: lazily assigns aligned base addresses on demand
- `#alignUp`: aligns addresses to type alignment requirements
- `castKindTransmute` PtrLocal→int: computes base + byte_offset
- `castKindPointerExposeAddress`: same as transmute but also exposes provenance

Verified via LLVM backend execution:
- interior-mut3-fail.rs reaches #EndProgram (alignment check passes with addr=4096)
- Address uniqueness: different locals get different base addresses

Note: Haskell backend (prove) performance regresses because the 3 new configuration cells increase the matching state space. This needs further optimization (e.g. cell multiplicity annotations or rule priorities).

Closes #1002
Supersedes #812, #877
Fixes #638

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 92056cd commit 05552e7

4 files changed

Lines changed: 243 additions & 2 deletions

File tree

kmir/src/kmir/kdist/mir-semantics/rt/configuration.md

Lines changed: 7 additions & 0 deletions
@@ -8,6 +8,7 @@ Essential parts of the configuration:
 * `currentFrame`, an unpacked version of the top of `stack`
 * the `functions` map to look up function bodies when they are called
 * the `memory` cell which abstracts allocated heap data
+* `addressMap`, `nextAddress`, and `exposedSet` for pointer-to-integer semantics
 
 The entire program's return value (`retVal`) is held in a separate cell.
 
@@ -19,6 +20,8 @@ requires "./value.md"
 module KMIR-CONFIGURATION
   imports INT-SYNTAX
   imports BOOL-SYNTAX
+  imports MAP
+  imports SET
   imports RT-VALUE-SYNTAX
 
   syntax RetVal ::= return( Value )
@@ -45,6 +48,10 @@ module KMIR-CONFIGURATION
         </currentFrame>
         // remaining call stack (without top frame)
         <stack> .List </stack>
+        // address allocation model for pointer-to-integer casts
+        <addressMap> .Map </addressMap>
+        <nextAddress> 4096 </nextAddress>
+        <exposedSet> .Set </exposedSet>
      </kmir>
 ```

kmir/src/kmir/kdist/mir-semantics/rt/data.md

Lines changed: 211 additions & 2 deletions
@@ -18,6 +18,7 @@ module RT-DATA
   imports BYTES
   imports LIST
   imports MAP
+  imports SET
   imports K-EQUAL
 
   imports BODY
@@ -1615,6 +1616,105 @@ What can be supported without additional layout consideration is trivial casts b
   requires lookupTy(TY_SOURCE) ==K lookupTy(TY_TARGET)
 ```
 
+Transmuting a pointer to an integer discards provenance and reinterprets the pointer's
+address as a value of the target integer type.
+The address is computed from a lazily-assigned base address for the allocation plus the
+pointer offset within that allocation, following Miri's approach.
+
+An allocation key uniquely identifies a stack allocation by its stack depth and local index.
+
+```k
+syntax AllocKey ::= allocKey ( Int , Local ) [symbol(allocKey)]
+syntax AddrEntry ::= addrEntry ( Int , Int ) [symbol(addrEntry)]
+
+syntax Int ::= #alignUp ( Int , Int ) [function, total]
+rule #alignUp(ADDR, ALIGN) => ((ADDR +Int ALIGN -Int 1) /Int ALIGN) *Int ALIGN
+  requires ALIGN >Int 0
+  [preserves-definedness]
+rule #alignUp(ADDR, _) => ADDR [owise]
+
+syntax Int ::= #addrEntryBase ( AddrEntry ) [function, total]
+rule #addrEntryBase(addrEntry(BASE, _)) => BASE
+
+syntax Int ::= #ptrElemSize ( TypeInfo ) [function, total]
+rule #ptrElemSize(TY_INFO) => #sizeOf(TY_INFO) requires #sizeOf(TY_INFO) >Int 0
+rule #ptrElemSize(TY_INFO) => 1 requires #sizeOf(TY_INFO) ==Int 0 // ZST
+rule #ptrElemSize(_) => 1 [owise]
+
+syntax KItem ::= #allocAddressFor ( Int , Local , Int , Int )
+// #allocAddressFor(STACK_DEPTH, LOCAL, SIZE, ALIGN)
+// Lazily assigns an aligned base address for the allocation (STACK_DEPTH, LOCAL).
+rule <k> #allocAddressFor(STACK_DEPTH, LOCAL, SIZE, ALIGN)
+      => .K
+     ...
+     </k>
+     <addressMap>
+       MAP => MAP[allocKey(STACK_DEPTH, LOCAL) <- addrEntry(#alignUp(NEXT, ALIGN), SIZE)]
+     </addressMap>
+     <nextAddress>
+       NEXT => #alignUp(NEXT, ALIGN) +Int maxInt(SIZE, 1)
+     </nextAddress>
+  requires notBool allocKey(STACK_DEPTH, LOCAL) in_keys(MAP)
+
+// Already allocated: no-op
+rule <k> #allocAddressFor(STACK_DEPTH, LOCAL, _, _) => .K ... </k>
+     <addressMap> MAP </addressMap>
+  requires allocKey(STACK_DEPTH, LOCAL) in_keys(MAP)
+```
+
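The allocation machinery in this hunk can be cross-checked with a small executable model. The following is a minimal Rust sketch under assumed names (`align_up`, `AddressModel`, and `alloc_address_for` are hypothetical mirrors of `#alignUp`, the new configuration cells, and `#allocAddressFor`; they are not part of the semantics):

```rust
use std::collections::HashMap;

/// Round `addr` up to the next multiple of `align` (mirrors `#alignUp`).
/// `align == 0` returns `addr` unchanged, like the K `[owise]` rule.
fn align_up(addr: usize, align: usize) -> usize {
    if align == 0 {
        return addr;
    }
    (addr + align - 1) / align * align
}

/// Allocation key: (stack depth, local index), mirroring `allocKey`.
type AllocKey = (usize, usize);
/// (base_address, size), mirroring `addrEntry`.
type AddrEntry = (usize, usize);

struct AddressModel {
    address_map: HashMap<AllocKey, AddrEntry>, // <addressMap>
    next_address: usize,                       // <nextAddress>, starts at 4096
}

impl AddressModel {
    fn new() -> Self {
        AddressModel { address_map: HashMap::new(), next_address: 4096 }
    }

    /// Lazily assign an aligned base address (mirrors `#allocAddressFor`);
    /// a second call with the same key is a no-op returning the same base.
    fn alloc_address_for(&mut self, key: AllocKey, size: usize, align: usize) -> usize {
        if let Some(&(base, _)) = self.address_map.get(&key) {
            return base;
        }
        let base = align_up(self.next_address, align);
        self.address_map.insert(key, (base, size));
        // ZSTs still advance the counter by 1 (maxInt(SIZE, 1) in the K rule).
        self.next_address = base + size.max(1);
        base
    }
}

fn main() {
    let mut m = AddressModel::new();
    let a = m.alloc_address_for((0, 1), 4, 4);
    let b = m.alloc_address_for((0, 2), 4, 4);
    assert_eq!(a, 4096); // first allocation lands at the starting address
    assert_eq!(b, 4100); // a distinct local gets a distinct base
    assert_eq!(m.alloc_address_for((0, 1), 4, 4), a); // idempotent
    println!("a = {a}, b = {b}");
}
```

Running this exhibits the properties the commit message reports as verified: the first allocation lands at 4096, distinct locals get distinct base addresses, and re-requesting an address is a no-op.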
+Transmuting a `PtrLocal` to an integer type: look up or lazily assign the base address,
+then compute `base + byte_offset`, where the byte offset comes from the pointer's metadata.
+
+```k
+// Case 1: address already assigned — compute base + offset
+rule <k> #cast(
+           PtrLocal(STACK_DEPTH, place(LOCAL, _PROJS), _MUT, metadata(_SIZE, PTR_OFFSET, _ORIGIN)),
+           castKindTransmute,
+           TY_SOURCE,
+           TY_TARGET
+         )
+      =>
+         #intAsType(
+           #addrEntryBase({MAP[allocKey(STACK_DEPTH, LOCAL)]}:>AddrEntry)
+             +Int PTR_OFFSET *Int #ptrElemSize(#lookupMaybeTy(pointeeTy(#lookupMaybeTy(TY_SOURCE)))),
+           64,
+           #numTypeOf(lookupTy(TY_TARGET))
+         )
+     ...
+     </k>
+     <addressMap> MAP </addressMap>
+  requires #isIntType(lookupTy(TY_TARGET))
+   andBool allocKey(STACK_DEPTH, LOCAL) in_keys(MAP)
+  [preserves-definedness]
+
+// Case 2: address not yet assigned — allocate first, original #cast stays on <k>
+rule <k> #cast(
+           PtrLocal(STACK_DEPTH, place(LOCAL, PROJS), MUT, META),
+           castKindTransmute,
+           TY_SOURCE,
+           TY_TARGET
+         )
+      =>
+         #allocAddressFor(
+           STACK_DEPTH,
+           LOCAL,
+           #sizeOf(#lookupMaybeTy(pointeeTy(#lookupMaybeTy(TY_SOURCE)))),
+           #alignOf(#lookupMaybeTy(pointeeTy(#lookupMaybeTy(TY_SOURCE))))
+         )
+      ~> #cast(
+           PtrLocal(STACK_DEPTH, place(LOCAL, PROJS), MUT, META),
+           castKindTransmute,
+           TY_SOURCE,
+           TY_TARGET
+         )
+     ...
+     </k>
+     <addressMap> MAP </addressMap>
+  requires #isIntType(lookupTy(TY_TARGET))
+   andBool notBool allocKey(STACK_DEPTH, LOCAL) in_keys(MAP)
+  [preserves-definedness]
+```
+
 Other `Transmute` casts that can be resolved are round-trip casts from type A to type B and then directly back from B to A.
 The first cast is reified as a `thunk`, the second one resolves it and eliminates the `thunk`:
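The `base + offset * elem_size` computation in Case 1 is ordinary pointer arithmetic. As a standalone Rust illustration (not part of the semantics): the address distance between two elements of the same allocation is exactly the element offset times the element size, which is the quantity the rule adds to the base address.

```rust
fn main() {
    let arr: [u32; 4] = [1, 2, 3, 4];
    // Addresses of two elements of the same allocation.
    let a0 = &arr[0] as *const u32 as usize;
    let a2 = &arr[2] as *const u32 as usize;
    // The rule's offset contribution: PTR_OFFSET * #ptrElemSize(...).
    assert_eq!(a2 - a0, 2 * std::mem::size_of::<u32>());
    println!("delta = {}", a2 - a0);
}
```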
@@ -1879,12 +1979,121 @@ the safety of this cast. The logic of the semantics and safety of this cast for
 ```
 
 
+### Pointer provenance casts
+
+`PointerExposeAddress` converts a pointer to an integer (like `ptr as usize`), exposing
+the pointer's provenance so that a future `PointerWithExposedProvenance` cast can recover it.
+
+```k
+// PointerExposeAddress for PtrLocal: compute address and expose provenance
+rule <k> #cast(
+           PtrLocal(STACK_DEPTH, place(LOCAL, _PROJS), _MUT, metadata(_SIZE, PTR_OFFSET, _ORIGIN)),
+           castKindPointerExposeAddress,
+           TY_SOURCE,
+           TY_TARGET
+         )
+      =>
+         #intAsType(
+           #addrEntryBase({MAP[allocKey(STACK_DEPTH, LOCAL)]}:>AddrEntry)
+             +Int PTR_OFFSET *Int #ptrElemSize(#lookupMaybeTy(pointeeTy(#lookupMaybeTy(TY_SOURCE)))),
+           64,
+           #numTypeOf(lookupTy(TY_TARGET))
+         )
+     ...
+     </k>
+     <addressMap> MAP </addressMap>
+     <exposedSet> EXPOSED => EXPOSED |Set SetItem(allocKey(STACK_DEPTH, LOCAL)) </exposedSet>
+  requires #isIntType(lookupTy(TY_TARGET))
+   andBool allocKey(STACK_DEPTH, LOCAL) in_keys(MAP)
+  [preserves-definedness]
+
+rule <k> #cast(
+           PtrLocal(STACK_DEPTH, place(LOCAL, PROJS), MUT, META),
+           castKindPointerExposeAddress,
+           TY_SOURCE,
+           TY_TARGET
+         )
+      =>
+         #allocAddressFor(
+           STACK_DEPTH,
+           LOCAL,
+           #sizeOf(#lookupMaybeTy(pointeeTy(#lookupMaybeTy(TY_SOURCE)))),
+           #alignOf(#lookupMaybeTy(pointeeTy(#lookupMaybeTy(TY_SOURCE))))
+         )
+      ~> #cast(
+           PtrLocal(STACK_DEPTH, place(LOCAL, PROJS), MUT, META),
+           castKindPointerExposeAddress,
+           TY_SOURCE,
+           TY_TARGET
+         )
+     ...
+     </k>
+     <addressMap> MAP </addressMap>
+  requires #isIntType(lookupTy(TY_TARGET))
+   andBool notBool allocKey(STACK_DEPTH, LOCAL) in_keys(MAP)
+  [preserves-definedness]
+
+// PointerExposeAddress for Reference
+rule <k> #cast(
+           Reference(STACK_DEPTH, place(LOCAL, _PROJS), _MUT, metadata(_SIZE, PTR_OFFSET, _ORIGIN)),
+           castKindPointerExposeAddress,
+           TY_SOURCE,
+           TY_TARGET
+         )
+      =>
+         #intAsType(
+           #addrEntryBase({MAP[allocKey(STACK_DEPTH, LOCAL)]}:>AddrEntry)
+             +Int PTR_OFFSET *Int #ptrElemSize(#lookupMaybeTy(pointeeTy(#lookupMaybeTy(TY_SOURCE)))),
+           64,
+           #numTypeOf(lookupTy(TY_TARGET))
+         )
+     ...
+     </k>
+     <addressMap> MAP </addressMap>
+     <exposedSet> EXPOSED => EXPOSED |Set SetItem(allocKey(STACK_DEPTH, LOCAL)) </exposedSet>
+  requires #isIntType(lookupTy(TY_TARGET))
+   andBool allocKey(STACK_DEPTH, LOCAL) in_keys(MAP)
+  [preserves-definedness]
+
+rule <k> #cast(
+           Reference(STACK_DEPTH, place(LOCAL, PROJS), MUT, META),
+           castKindPointerExposeAddress,
+           TY_SOURCE,
+           TY_TARGET
+         )
+      =>
+         #allocAddressFor(
+           STACK_DEPTH,
+           LOCAL,
+           #sizeOf(#lookupMaybeTy(pointeeTy(#lookupMaybeTy(TY_SOURCE)))),
+           #alignOf(#lookupMaybeTy(pointeeTy(#lookupMaybeTy(TY_SOURCE))))
+         )
+      ~> #cast(
+           Reference(STACK_DEPTH, place(LOCAL, PROJS), MUT, META),
+           castKindPointerExposeAddress,
+           TY_SOURCE,
+           TY_TARGET
+         )
+     ...
+     </k>
+     <addressMap> MAP </addressMap>
+  requires #isIntType(lookupTy(TY_TARGET))
+   andBool notBool allocKey(STACK_DEPTH, LOCAL) in_keys(MAP)
+  [preserves-definedness]
+```
+
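On the Rust side, the cast these rules model is the plain `ptr as usize` cast, which (unlike a `transmute`) exposes provenance so the address can later be turned back into a usable pointer. A minimal illustration of the expose/recover round trip, in the pre-strict-provenance `as`-cast form:

```rust
fn main() {
    let x: u32 = 7;
    let p: *const u32 = &x;
    // PointerExposeAddress: yields the address and marks the allocation's
    // provenance as exposed.
    let addr = p as usize;
    // PointerWithExposedProvenance: recover a pointer from the exposed
    // address; dereferencing it is valid because the provenance was exposed.
    let q = addr as *const u32;
    unsafe {
        assert_eq!(*q, 7);
    }
    println!("round-tripped through address {addr:#x}");
}
```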
+`PointerWithExposedProvenance` converts an integer to a pointer (`ptr::with_exposed_provenance`),
+recovering the provenance that was previously exposed.
+
+```k
+// TODO: PointerWithExposedProvenance requires searching the address map and exposed set
+// to recover the original allocation. This is left for future implementation.
+```
+
 ### Other casts involving pointers
 
 | CastKind                     | Description |
 |------------------------------|-------------|
-| PointerExposeAddress         |             |
-| PointerWithExposedProvenance |             |
 | FnPtrToPtr                   |             |
 
 ## Decoding constants from their bytes representation to values
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+/// Test that a stack pointer transmuted to usize is non-zero.
+use std::mem::transmute;
+
+fn main() {
+    let x: u32 = 42;
+    let ptr: *const u32 = &x;
+    unsafe {
+        let addr: usize = transmute(ptr);
+        assert!(addr != 0);
+    }
+}
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+/// Test that two different local pointers transmute to different addresses.
+use std::mem::transmute;
+
+fn main() {
+    let a: u32 = 42;
+    let b: u32 = 99;
+    let pa: *const u32 = &a;
+    let pb: *const u32 = &b;
+    unsafe {
+        let addr_a: usize = transmute(pa);
+        let addr_b: usize = transmute(pb);
+        assert!(addr_a != addr_b);
+    }
+}
