|
| 1 | +# Allocation Decoding in MIR-Semantics |
| 2 | + |
| 3 | +This module provides functions for decoding byte representations of various allocations into |
| 4 | +high-level `Value` representations used by the MIR semantics. |
| 5 | + |
| 6 | +When Rust code contains constants (arrays, structs, enums, etc.), the compiler stores these as |
| 7 | +byte sequences in the SMIR JSON output. |
| 8 | +The semantics needs to decode these bytes back into structured values that can be operated on at |
| 9 | +runtime. |
| 10 | +This module contains the decoding functions for different allocation types, handling the conversion |
| 11 | +from raw bytes to typed `Value` objects according to Rust's memory layout rules. |
| 12 | + |
| 13 | +```k |
| 14 | +requires "../ty.md" |
| 15 | +requires "value.md" |
| 16 | +requires "numbers.md" |
| 17 | +
|
| 18 | +module RT-DECODING |
| 19 | + imports BOOL |
| 20 | + imports MAP |
| 21 | +
|
| 22 | + imports TYPES |
| 23 | + imports RT-VALUE-SYNTAX |
| 24 | + imports RT-NUMBERS |
| 25 | + imports RT-TYPES |
| 26 | +``` |
| 27 | + |
| 28 | +## Element Decoding Interface to turn bytes into a `Value` |
| 29 | + |
| 30 | +This recursive decoder function checks byte length and decodes the bytes to a `Value` of the given type. |
| 31 | + |
| 32 | +This is currently only defined for `PrimitiveType`s (primitive types in MIR). |
| 33 | +and arrays (where layout is trivial). |
| 34 | + |
| 35 | +### Decoding `PrimitiveType`s |
| 36 | + |
| 37 | +```k |
| 38 | + syntax Evaluation ::= #decodeValue ( Bytes , TypeInfo , Map ) [function, total] |
| 39 | + | UnableToDecode( Bytes , TypeInfo ) |
| 40 | +
|
| 41 | + // Boolean: should be one byte with value one or zero |
| 42 | + rule #decodeValue(BYTES, typeInfoPrimitiveType(primTypeBool), _TYPEMAP) => BoolVal(false) |
| 43 | + requires 0 ==Int Bytes2Int(BYTES, LE, Unsigned) andBool lengthBytes(BYTES) ==Int 1 |
| 44 | +
|
| 45 | + rule #decodeValue(BYTES, typeInfoPrimitiveType(primTypeBool), _TYPEMAP) => BoolVal(true) |
| 46 | + requires 1 ==Int Bytes2Int(BYTES, LE, Unsigned) andBool lengthBytes(BYTES) ==Int 1 |
| 47 | +
|
| 48 | + // Integer: handled in separate module for numeric operation_s |
| 49 | + rule #decodeValue(BYTES, TYPEINFO, _TYPEMAP) => #decodeInteger(BYTES, #intTypeOf(TYPEINFO)) |
| 50 | + requires #isIntType(TYPEINFO) andBool lengthBytes(BYTES) ==Int #elemSize(TYPEINFO) |
| 51 | + [preserves-definedness] |
| 52 | +
|
| 53 | + // TODO Char type |
| 54 | + // rule #decodeConstant(constantKindAllocated(allocation(BYTES, _, _, _)), typeInfoPrimitiveType(primTypeChar)) => typedValue(Str(...), TY, mutabilityNot) |
| 55 | +
|
| 56 | + // TODO Float decoding: not supported natively in K |
| 57 | +``` |
| 58 | + |
| 59 | + |
| 60 | +### Array decoding |
| 61 | + |
| 62 | +Arrays are decoded iteratively, using a known (expected) length or the length of the byte array. |
| 63 | + |
| 64 | +```k |
| 65 | +rule #decodeValue(BYTES, typeInfoArrayType(ELEMTY, someTyConst(tyConst(LEN, _))), TYPEMAP) |
| 66 | + => #decodeArrayAllocation(BYTES, {TYPEMAP[ELEMTY]}:>TypeInfo, readTyConstInt(LEN, TYPEMAP)) |
| 67 | + requires ELEMTY in_keys(TYPEMAP) |
| 68 | + andBool isTypeInfo(TYPEMAP[ELEMTY]) |
| 69 | + andBool isInt(readTyConstInt(LEN, TYPEMAP)) |
| 70 | + [preserves-definedness] |
| 71 | +
|
| 72 | +rule #decodeValue(BYTES, typeInfoArrayType(ELEMTY, noTyConst), TYPEMAP) |
| 73 | + => #decodeSliceAllocation(BYTES, {TYPEMAP[ELEMTY]}:>TypeInfo) |
| 74 | + requires ELEMTY in_keys(TYPEMAP) |
| 75 | + andBool isTypeInfo(TYPEMAP[ELEMTY]) |
| 76 | +``` |
| 77 | + |
| 78 | +### Error marker (becomes thunk) for other (unimplemented) cases |
| 79 | + |
| 80 | +All unimplemented cases will become thunks by way of this default rule: |
| 81 | + |
| 82 | +```k |
| 83 | + rule #decodeValue(BYTES, TYPEINFO, _TYPEMAP) => UnableToDecode(BYTES, TYPEINFO) [owise] |
| 84 | +``` |
| 85 | + |
| 86 | +## Helper function to determine the expected byte length for a type |
| 87 | + |
| 88 | +```k |
| 89 | + // TODO: this function should go into the rt/types.md module |
| 90 | + syntax Int ::= #elemSize ( TypeInfo ) [function] |
| 91 | +``` |
| 92 | + |
| 93 | +Known element sizes for common types: |
| 94 | + |
| 95 | +```k |
| 96 | + rule #elemSize(typeInfoPrimitiveType(primTypeBool)) => 1 |
| 97 | + rule #elemSize(TYPEINFO) => #bitWidth(#intTypeOf(TYPEINFO)) /Int 8 |
| 98 | + requires #isIntType(TYPEINFO) |
| 99 | +
|
| 100 | + rule 0 <=Int #elemSize(_) => true [simplification, preserves-definedness] |
| 101 | +``` |
| 102 | + |
| 103 | + |
| 104 | + |
| 105 | +## Array Allocations |
| 106 | + |
| 107 | +Array allocations contain homogeneous elements stored contiguously in memory. |
| 108 | +The main function `#decodeArrayAllocation` takes the raw bytes of an array allocation along with |
| 109 | +type information and converts it into a `Range` value containing the decoded elements. |
| 110 | + |
| 111 | +The decoding process: |
| 112 | +1. Takes the byte array, element type information, and array length |
| 113 | +2. Iteratively consumes elements from the front of the byte array |
| 114 | +3. Decodes each element according to its type using `#decodeElement` |
| 115 | +4. Accumulates the decoded elements into a list |
| 116 | +5. Returns a `Range` value containing all elements |
| 117 | + |
| 118 | +The byte consumption approach allows for validation - if there are surplus bytes or insufficient |
| 119 | +bytes for the declared array length, the function will get stuck rather than produce incorrect |
| 120 | +results. |
| 121 | + |
| 122 | +```k |
| 123 | + syntax Value ::= #decodeArrayAllocation ( Bytes, TypeInfo, Int ) [function] |
| 124 | + // bytes, element type info, array length |
| 125 | +
|
| 126 | + rule #decodeArrayAllocation(BYTES, ELEMTYPEINFO, LEN) |
| 127 | + => Range(#decodeArrayElements(BYTES, ELEMTYPEINFO, LEN, .List)) |
| 128 | +
|
| 129 | + syntax List ::= #decodeArrayElements ( Bytes, TypeInfo, Int, List ) [function] |
| 130 | + // bytes, elem type info, remaining length, accumulated list |
| 131 | +
|
| 132 | + rule #decodeArrayElements(BYTES, _ELEMTYPEINFO, LEN, ACC) |
| 133 | + => ACC |
| 134 | + requires LEN <=Int 0 |
| 135 | + andBool lengthBytes(BYTES) ==Int 0 // exact match - no surplus bytes |
| 136 | + [preserves-definedness] |
| 137 | +
|
| 138 | + rule #decodeArrayElements(BYTES, ELEMTYPEINFO, LEN, ACC) |
| 139 | + => #decodeArrayElements( |
| 140 | + substrBytes(BYTES, #elemSize(ELEMTYPEINFO), lengthBytes(BYTES)), |
| 141 | + ELEMTYPEINFO, |
| 142 | + LEN -Int 1, |
| 143 | + ACC ListItem(#decodeValue( |
| 144 | + substrBytes(BYTES, 0, #elemSize(ELEMTYPEINFO)), |
| 145 | + ELEMTYPEINFO, |
| 146 | + .Map // HACK |
| 147 | + )) |
| 148 | + ) |
| 149 | + requires LEN >Int 0 |
| 150 | + andBool lengthBytes(BYTES) >=Int #elemSize(ELEMTYPEINFO) // enough bytes remaining |
| 151 | + [preserves-definedness] |
| 152 | +``` |
| 153 | + |
| 154 | +## Slice Allocations |
| 155 | + |
| 156 | +Slices are arrays with dynamic length. |
| 157 | +The `#decodeSliceAllocation` function computes the array length by dividing the total byte length |
| 158 | +by the element size, then uses the same element-by-element decoding approach as arrays. |
| 159 | + |
| 160 | +```k |
| 161 | + syntax Value ::= #decodeSliceAllocation ( Bytes, TypeInfo ) [function] |
| 162 | + // ------------------------------------------------------------------- |
| 163 | + rule #decodeSliceAllocation(BYTES, ELEMTYPEINFO) |
| 164 | + => Range(#decodeArrayElements(BYTES, ELEMTYPEINFO, |
| 165 | + lengthBytes(BYTES) /Int #elemSize(ELEMTYPEINFO), .List)) |
| 166 | + requires lengthBytes(BYTES) %Int #elemSize(ELEMTYPEINFO) ==Int 0 // element size divides cleanly |
| 167 | + andBool 0 <Int #elemSize(ELEMTYPEINFO) |
| 168 | + [preserves-definedness] |
| 169 | +``` |
| 170 | + |
| 171 | +```k |
| 172 | +endmodule |
| 173 | +``` |
0 commit comments