|
| 1 | +# stdarch-gen-arm generator guide |
| 2 | +## Running the generator |
| 3 | +- Run: `cargo run --bin=stdarch-gen-arm -- crates/stdarch-gen-arm/spec` |
| 4 | +``` |
| 5 | +$ cargo run --bin=stdarch-gen-arm -- crates/stdarch-gen-arm/spec |
| 6 | + Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.18s |
| 7 | + Running `target/debug/stdarch-gen-arm crates/stdarch-gen-arm/spec` |
| 8 | +``` |
| 9 | +## Input/Output |
| 10 | +### Input files (intrinsic YAML definitions) |
| 11 | + - `crates/stdarch-gen-arm/spec/<feature>/*.spec.yml` |
| 12 | +### Output files |
| 13 | + - Generated intrinsics: |
| 14 | + - `crates/core_arch/src/<arch>/<feature>/generated.rs` |
| 15 | + - Generated load/store tests: |
| 16 | + - `crates/core_arch/src/<arch>/<feature>/ld_st_tests_<arch>.rs` |
| 17 | + - Only generated when `test: { load: <idx> }` or `test: { store: <idx> }` is set for SVE/SVE2 intrinsics. |
| 18 | +## `.spec.yml` file anatomy |
| 19 | +``` |
| 20 | +--- |
| 21 | +Configs |
| 22 | +--- |
| 23 | +Variable definitions |
| 24 | +--- |
| 25 | + |
| 26 | +Intrinsic definitions |
| 27 | + |
| 28 | +--- |
| 29 | +``` |
| 30 | +- If you're new to YAML syntax, consider [reviewing](https://quickref.me/yaml.html) some of the less obvious syntax and features. |
| 31 | +- For example, mapping an attribute to a sequence can be done in two different ways: |
| 32 | +```yaml |
| 33 | +attribute: [item_a, item_b, item_c] |
| 34 | +``` |
| 35 | +or |
| 36 | +```yaml |
| 37 | +attribute: |
| 38 | + - item_a |
| 39 | + - item_b |
| 40 | + - item_c |
| 41 | +``` |
| 42 | +## Configs |
| 43 | +- Mappings defining top-level settings applied to all intrinsics: |
| 44 | +- `arch_cfgs` |
| 45 | + - Sequence of mappings specifying `arch_name`, `target_feature` (sequence), and `llvm_prefix`. |
| 46 | +- `uses_neon_types` (_Optional_) |
| 47 | + - A boolean specifying whether to emit NEON type imports in generated code. |
| 48 | +- `auto_big_endian` (_Optional_) |
| 49 | + - A boolean specifying whether to auto-generate big-endian shuffles when possible. |
| 50 | +- `auto_llvm_sign_conversion` (_Optional_) |
| 51 | + - A boolean specifying whether to auto-convert LLVM wrapper args to signed types. |
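As a rough illustration, a config block using the keys above might look like the following. The concrete values (`aarch64`, `neon`, `llvm.aarch64.neon`, and the boolean settings) are assumptions chosen for the example, not copied from a particular spec file.

```yaml
# Hypothetical config block; every value here is an illustrative assumption.
arch_cfgs:
  - arch_name: aarch64
    target_feature: [neon]          # a sequence, even for a single feature
    llvm_prefix: llvm.aarch64.neon
uses_neon_types: true               # optional
auto_big_endian: true               # optional
auto_llvm_sign_conversion: false    # optional
```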
| 52 | +## Variable definitions |
| 53 | +- Defines YAML anchors/variables to avoid repetition. |
| 54 | +- Commonly used for stability attributes, cfgs and target features. |
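A minimal sketch of the anchor/alias pattern, assuming an anchor named `neon-stable` (the name is hypothetical; the attribute itself is taken from the example later in this guide). The `attr` key is shown at the top level only for brevity; in a real spec it sits inside an intrinsic definition.

```yaml
# Define the attribute once and give it an anchor (`&neon-stable`).
neon-stable: &neon-stable
  FnCall: [stable, ['feature = "neon_intrinsics"', 'since = "1.59.0"']]

# Reuse it wherever it is needed through the alias (`*neon-stable`).
attr:
  - *neon-stable
  - FnCall: [cfg_attr, [test, {FnCall: [assert_instr, [cmtst]]}]]
```

Standard YAML alias resolution makes the first `attr` entry identical to writing the `FnCall: [stable, ...]` mapping out in full.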
| 55 | +## Intrinsic definitions |
| 56 | +### Example |
| 57 | +```yaml |
| 58 | + - name: "vtst{neon_type[0].no}" |
| 59 | + doc: "Signed compare bitwise Test bits nonzero" |
| 60 | + arguments: ["a: {neon_type[0]}", "b: {neon_type[0]}"] |
| 61 | + return_type: "{neon_type[1]}" |
| 62 | + attr: |
| 63 | + - FnCall: [cfg_attr, [test, {FnCall: [assert_instr, [cmtst]]}]] |
| 64 | + - FnCall: [stable, ['feature = "neon_intrinsics"', 'since = "1.59.0"']] |
| 65 | + safety: safe |
| 66 | + types: |
| 67 | + - [int64x1_t, uint64x1_t, 'i64x1', 'i64x1::new(0)'] |
| 68 | + - [int64x2_t, uint64x2_t, 'i64x2', 'i64x2::new(0, 0)'] |
| 69 | + - [poly64x1_t, uint64x1_t, 'i64x1', 'i64x1::new(0)'] |
| 70 | + - [poly64x2_t, uint64x2_t, 'i64x2', 'i64x2::new(0, 0)'] |
| 71 | + compose: |
| 72 | + - Let: [c, "{neon_type[0]}", {FnCall: [simd_and, [a, b]]}] |
| 73 | + - Let: [d, "{type[2]}", "{type[3]}"] |
| 74 | + - FnCall: [simd_ne, [c, {FnCall: [transmute, [d]]}]] |
| 75 | +``` |
| 76 | + |
| 77 | +### Explanation of fields |
| 78 | +- `name` |
| 79 | + - The name of the intrinsic |
| 80 | + - Often built from a base name followed by a type suffix |
| 81 | +- `doc` (_Optional_) |
| 82 | + - A string explaining the purpose of the intrinsic |
| 83 | +- `static_defs` (_Optional_) |
| 84 | + - A sequence of const generics of the format `"const <NAME>: <type>"` |
| 85 | +- `arguments` |
| 86 | + - A sequence of strings in the format `"<argname>: <argtype>"` |
| 87 | +- `return_type` (_Optional_) |
| 88 | + - A string specifying the return type. If omitted, the intrinsic returns `()`. |
| 89 | +- `attr` (_Optional_) |
| 90 | + - A sequence of items defining the attributes to be applied to the intrinsic. Often stability attributes, target features, or `assert_instr` tests. At least one of `attr` or `assert_instr` must be set. |
| 91 | +- `target_features` (_Optional_) |
| 92 | + - A sequence of target features to enable for this intrinsic (merged with any global `arch_cfgs` settings). |
| 93 | +- `assert_instr` (_Optional_) |
| 94 | + - A sequence of strings expected to be found in the assembly. Required if `attr` is not set. |
| 95 | +- `safety` (_Optional_) |
| 96 | + - Use `safe`, or map `unsafe:` to a sequence of unsafety comments: |
| 97 | + - `custom: "<string>"` |
| 98 | + - `uninitialized` |
| 99 | + - `pointer_offset`, `pointer_offset_vnum`, or `dereference` (optionally qualified with `predicated`, `predicated_non_faulting`, or `predicated_first_faulting`) |
| 100 | + - `unpredictable_on_fault` |
| 101 | + - `non_temporal` |
| 102 | + - `neon` |
| 103 | + - `no_provenance: "<string>"` |
| 104 | +- `substitutions` (_Optional_) |
| 105 | + - Mappings of custom wildcard names to either `MatchSize` or `MatchKind` expressions |
| 106 | +- `types` |
| 107 | + - A sequence or sequence of sequences specifying the types to use when producing each intrinsic variant. These sequences can then be indexed by wildcards. |
| 108 | +- `constraints` (_Optional_) |
| 109 | + - A sequence of mappings. Each specifies a variable and a constraint. The available mappings are: |
| 110 | + - Assert a variable's value exists in a sequence of `i32` values |
| 111 | + - Usage: `{ variable: <name>, any_values: [<i32>,...] }` |
| 112 | + - Assert a variable's value exists in a range (inclusive) |
| 113 | + - Usage: `{ variable: <name>, range: [<i32>, <i32>] }` |
| 114 | + - Assert a variable's value exists in a range via a match (inclusive) |
| 115 | + - Usage: `{ variable: <name>, range: <MatchSize returning [i32,i32]> }` |
| 116 | + - Assert a variable's value does not exceed the number of elements in an SVE type `<type>`. |
| 117 | + - Usage: `{ variable: <name>, sve_max_elems_type: <type> }` |
| 118 | + - Assert a variable's value does not exceed the number of elements in a vector type `<type>`. |
| 119 | + - Usage: `{ variable: <name>, vec_max_elems_type: <type> }` |
| 120 | +- `predication_methods` (_Optional_) |
| 121 | + - Configuration for predicate-form variants. Only used when the intrinsic name includes an `_m*_` wildcard (e.g., `{_mx}`, `{_mxz}`). |
| 122 | + - `zeroing_method`: Required when requesting `_z`; either `{ drop: <arg> }` to remove an argument and replace it with a zero initialiser, or `{ select: <predicate_var> }` to select zeros into a predicate. |
| 123 | + - `dont_care_method`: How `_x` should be implemented (`inferred`, `as_zeroing`, or `as_merging`). |
| 124 | +- `compose` |
| 125 | + - A sequence of expressions that make up the body of the intrinsic |
| 126 | +- `big_endian_inverse` (_Optional_) |
| 127 | + - A boolean, default false. If true, generates two implementations of each intrinsic variant, one for each endianness, and attempts to automatically generate the required bit swizzles |
| 128 | +- `visibility` (_Optional_) |
| 129 | + - Function visibility. One of `public` (default) or `private`. |
| 130 | +- `n_variant_op` (_Optional_) |
| 131 | + - Enables generation of an `_n` variant when the intrinsic name includes the `{_n}` wildcard. Set to the name of the operand that should be splatted (broadcast across all lanes) for the `_n` form. |
| 132 | +- `test` (_Optional_) |
| 133 | + - When set, load/store tests are automatically generated. |
| 134 | + - A mapping of either `load` or `store` to a number that indexes `types` to specify the type that the test should be addressing in memory. |
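The fields above that take structured values (`static_defs`, `constraints`, `safety`) can be sketched as follows. This is a fragment of a hypothetical intrinsic definition with the other fields omitted; the const generic names `LANE` and `ROT` and all numeric values are assumptions made for illustration.

```yaml
# Fragment of a hypothetical intrinsic definition (other fields omitted).
static_defs: ["const LANE: i32", "const ROT: i32"]
constraints:
  # Inclusive range: 0 <= LANE <= 3.
  - { variable: LANE, range: [0, 3] }
  # ROT must be exactly one of the listed values.
  - { variable: ROT, any_values: [0, 90, 180, 270] }
# Either `safety: safe`, or `unsafe` mapped to a sequence of unsafety comments.
safety:
  unsafe: [neon]
```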
| 135 | +### Expressions |
| 136 | +#### Common |
| 137 | +- `Let` |
| 138 | + - Defines a variable |
| 139 | + - Usage: `Let: [<variable>, <type(optional)>, <expression>]` |
| 140 | +- `Const` |
| 141 | + - Defines a const |
| 142 | + - Usage: `Const: [<variable>, <type>, <expression>]` |
| 143 | +- `Assign` |
| 144 | + - Performs variable assignment |
| 145 | + - Usage: `Assign: [<variable>, <expression>]` |
| 146 | +- `FnCall` |
| 147 | + - Performs a function call |
| 148 | + - Usage: `FnCall: [<function pointer: expression>, [<argument: expression>, ... ], [<turbofish argument: expression>, ...](optional), <unsafe wrapper(optional): bool>]` |
| 149 | +- `MacroCall` |
| 150 | + - Performs a macro call |
| 151 | + - Usage: `MacroCall: [<macro name>, <token stream>]` |
| 152 | +- `MethodCall` |
| 153 | + - Performs a method call |
| 154 | + - Usage: `MethodCall: [<object: expression>, <method name>, [<argument: expression>, ... ]]` |
| 155 | +- `LLVMLink` |
| 156 | + - Creates an LLVM link and stores the function's name in the wildcard `{llvm_link}` for later use in subsequent expressions. |
| 157 | + - If `arguments` and `return_type` are left unset, they are inherited from the intrinsic's signature. The `links` are also set automatically if unset. |
| 158 | + - Usage: |
| 159 | +```yaml |
| 160 | +LLVMLink: |
| 161 | + name: <name> |
| 162 | + arguments: [<expression>, ... ](optional) |
| 163 | + return_type: <return type>(optional) |
| 164 | + links: (optional) |
| 165 | + - link: <link> |
| 166 | + arch: <arch> |
| 167 | + - ... |
| 168 | +``` |
| 169 | +- `Identifier` |
| 170 | + - Emits a symbol. Prepend with a `$` to treat it as a scope variable, which engages variable tracking and enables inference. For example, `my_function_name` for a generic symbol or `$my_variable` for a variable. |
| 171 | + - Usage: `Identifier: [<symbol name>, <Variable|Symbol>]` |
| 172 | +- `CastAs` |
| 173 | + - Casts an expression to an unchecked type |
| 174 | + - Usage: `CastAs: [<expression>, <type>]` |
| 175 | +- `MatchSize` |
| 176 | + - Allows for conditional generation depending on the size of a specified type |
| 177 | + - Usage: |
| 178 | +```yaml |
| 179 | +MatchSize: |
| 180 | + - <type> |
| 181 | + - default: <expression> |
| 182 | + byte(optional): <expression> |
| 183 | + halfword(optional): <expression> |
| 184 | + doubleword(optional): <expression> |
| 185 | +``` |
| 186 | +- `MatchKind` |
| 187 | + - Allows for conditional generation depending on the kind of a specified type |
| 188 | +```yaml |
| 189 | +MatchKind: |
| 190 | + - <type> |
| 191 | + - default: <expression> |
| 192 | + float(optional): <expression> |
| 193 | + unsigned(optional): <expression> |
| 194 | +``` |
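As a hedged sketch of how `MatchSize` and `MatchKind` might appear inside a `compose` sequence, the fragment below picks a constant based on the element type of a hypothetical intrinsic. The variable names (`max_shift`, `zero`) and the constants are assumptions, and it is assumed that `default` covers any case without its own key (here, word-sized and non-float types).

```yaml
# Hypothetical `compose` fragment; names and constants are illustrative.
compose:
  # Pick a per-size constant: byte, halfword and doubleword are listed
  # explicitly, and `default` is assumed to cover the remaining case.
  - Let:
      - max_shift
      - i32
      - MatchSize:
          - "{type}"
          - default: { IntConstant: 31 }
            byte: { IntConstant: 7 }
            halfword: { IntConstant: 15 }
            doubleword: { IntConstant: 63 }
  # Pick a zero value whose form depends on the kind of the type.
  - Let:
      - zero
      - "{type}"
      - MatchKind:
          - "{type}"
          - default: { IntConstant: 0 }
            float: { FloatConstant: 0.0 }
```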
| 195 | +#### Rarely Used |
| 196 | +- `IntConstant` |
| 197 | + - Constant signed integer expression |
| 198 | + - Usage: `IntConstant: <i32>` |
| 199 | +- `FloatConstant` |
| 200 | + - Constant floating-point expression |
| 201 | + - Usage: `FloatConstant: <f32>` |
| 202 | +- `BoolConstant` |
| 203 | + - Constant boolean expression |
| 204 | + - Usage: `BoolConstant: <bool>` |
| 205 | +- `Array` |
| 206 | + - An array of expressions |
| 207 | + - Usage: `Array: [<expression>, ...]` |
| 208 | +- `SvUndef` |
| 209 | + - Returns the LLVM `undef` symbol |
| 210 | + - Usage: `SvUndef` |
| 211 | +- `Multiply` |
| 212 | + - Simply `*` |
| 213 | + - Usage: `Multiply: [<expression>, <expression>]` |
| 214 | +- `Xor` |
| 215 | + - Simply `^` |
| 216 | + - Usage: `Xor: [<expression>, <expression>]` |
| 217 | +- `ConvertConst` |
| 218 | + - Converts the specified constant to the specified type's kind |
| 219 | + - Usage: `ConvertConst: [<type>, <i32>]` |
| 220 | +- `Type` |
| 221 | + - Yields the given type in the Rust representation |
| 222 | + - Usage: `Type: [<type>]` |
| 223 | + |
| 224 | +### Wildstrings |
| 225 | +- Wildstrings let you take advantage of wildcards. |
| 226 | +- For example, they are often used in intrinsic names: `name: "vtst{neon_type[0].no}"` |
| 227 | +- As shown above, wildcards are identified by the surrounding curly brackets. |
| 228 | +- Double curly brackets can be used to escape wildcard functionality if you need literal curly brackets in the generated intrinsic. |
| 229 | +### Wildcards |
| 230 | +Wildcards are heavily used in the spec. They let you write generalised definitions for a group of intrinsics that generate multiple variants. The wildcard itself is replaced with the relevant string in each variant. |
| 231 | +Ignoring endianness, for each row in the `types` field of an intrinsic in the spec, a variant of the intrinsic will be generated. That row's contents can be indexed by the wildcards. Below is the behaviour of each wildcard. |
| 232 | +- `type[<index: usize>]` |
| 233 | + - Replaced in each variant with the value in the indexed position in the relevant row of the `types` field. |
| 234 | + - For unnested sequences of `types` (i.e., `types` is a sequence where each element is a single item, not another sequence), the square brackets can be omitted. Simply: `type` |
| 235 | +- `neon_type[<index: usize>]` |
| 236 | + - Extends the behaviour of `type` with some NEON-specific features and inference. |
| 237 | + - Tuples: This wildcard can also be written as `neon_type_x<n>` where `n` is in the set `{2,3,4}`. This generates the `n`-tuple variant of the (inferred) NEON type. |
| 238 | + - Suffixes: These modify the behaviour of the wildcard from simple substitution. |
| 239 | + - `no` - normal behaviour. Tries to do as much work as it can for you, inferring when to emit: |
| 240 | + - Regular type-size suffixes: `_s8`, `_u16`, `_f32`, ... |
| 241 | + - `q` variants for double-width (128-bit) vector types: `q_s8`, `q_u16`, `q_f32`, ... |
| 242 | + - `_x<n>` variants for tuple vector types: `_s8_x2`, `_u32_x3`, `_f64_x4`, ... |
| 243 | + - As well as any combination of the above: `q_s16_x2` ... |
| 244 | + - Most of the other suffixes modify the normal behaviour by disabling features or adding new ones. (See table below) |
| 245 | +- `sve_type[<index: usize>]` |
| 246 | + - Similar to `neon_type`, but without the suffixes. |
| 247 | +- `size[<index: usize>]` |
| 248 | + - The size (in bits) of the indexed type. |
| 249 | +- `size_minus_one[<index: usize>]` |
| 250 | + - Emits the size (in bits) of the indexed type minus one. |
| 251 | +- `size_literal[<index: usize>]` |
| 252 | + - The literal representation of the indexed type's size. |
| 253 | + - `b`: byte, `h`: halfword, `w`: word, or `d`: doubleword. |
| 254 | +- `type_kind[<index: usize>]` |
| 255 | + - The literal representation of the indexed type's kind. |
| 256 | + - `f`: float, `s`: signed, `u`: unsigned, `p`: polynomial, `b`: boolean. |
| 257 | +- `size_in_bytes_log2[<index: usize>]` |
| 258 | + - Log2 of the size of the indexed type in *bytes*. |
| 259 | +- `predicate[<index: usize>]` |
| 260 | + - SVE predicate vector type inferred from the indexed type. |
| 261 | +- `max_predicate` |
| 262 | + - The same as `predicate`, but uses the largest type in the relevant `types` sequence/row. |
| 263 | +- `_n` |
| 264 | + - Emits the current N-variant suffix when `n_variant_op` is configured. |
| 265 | +- `<wildcard> as <type>` |
| 266 | + - If `<wildcard>` evaluates to a vector, it produces a vector of the same shape, but with `<type>` as the base type. |
| 267 | +- `llvm_link` |
| 268 | + - If the `LLVMLink` mapping has been set for an intrinsic, this will give the name of the link. |
| 269 | +- `_m*` |
| 270 | + - Predicate form masks. Use wildcards such as `{_mx}` or `{_mxz}` to expand merging/don't-care/zeroing variants according to the mask. |
| 271 | +- `<custom>` |
| 272 | + - You may simply call upon wildcards defined under `substitutions`. |
| 273 | +### neon_type suffixes |
| 274 | + |
| 275 | +| suffix | implication | |
| 276 | +| ----------------- | --------------------------------------------- | |
| 277 | +| `.no` | Normal | |
| 278 | +| `.noq` | Never include `q`s | |
| 279 | +| `.nox` | Never include `_x<n>`s | |
| 280 | +| `.N` | Include `_n_`, e.g., `_n_s8` | |
| 281 | +| `.noq_N` | Include `_n_`, but never `q`s | |
| 282 | +| `.dup` | Include `_dup_`, e.g., `_dup_s8` | |
| 283 | +| `.dup_nox` | Include `_dup_` but never `_x<n>`s | |
| 284 | +| `.lane` | Include `_lane_`, e.g., `_lane_s8` | |
| 285 | +| `.lane_nox` | Include `_lane_`, but never `_x<n>`s | |
| 286 | +| `.rot90` | Include `_rot90_`, e.g., `_rot90_s8` | |
| 287 | +| `.rot180` | Include `_rot180_`, e.g., `_rot180_s8` | |
| 288 | +| `.rot270` | Include `_rot270_`, e.g., `_rot270_s8` | |
| 289 | +| `.rot90_lane` | Include `_rot90_lane_` | |
| 290 | +| `.rot180_lane` | Include `_rot180_lane_` | |
| 291 | +| `.rot270_lane` | Include `_rot270_lane_` | |
| 292 | +| `.rot90_laneq` | Include `_rot90_laneq_` | |
| 293 | +| `.rot180_laneq` | Include `_rot180_laneq_` | |
| 294 | +| `.rot270_laneq` | Include `_rot270_laneq_` | |
| 295 | +| `.base` | Produce only the size, e.g., `8`, `16` | |
| 296 | +| `.u` | Produce the type's unsigned equivalent | |
| 297 | +| `.laneq_nox` | Include `_laneq_`, but never `_x<n>`s | |
| 298 | +| `.tuple` | Produce only the size of the tuple, e.g., `3` | |
| 299 | +| `.base_byte_size` | Produce only the size in bytes. | |
| 300 | + |