<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<title>Prefix Specification</title>
<link rel="icon" href="./icon.png" />
<link rel="stylesheet" href="./style.css" />
</head>
<body>
<div class="container" id="content">Rendering specification…</div>
<!-- Markdown source is embedded below. marked.js will render it into #content. -->
<script id="md" type="text/markdown">
# <a href="index.html"><img class="title-icon" alt="Prefix icon" src="./icon.png"></a> <span class="semibold">Pre<span class="grey">fix</span></span> Specification
---
## Table of contents
---
## 1. Preamble
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [IETF RFC 2119](https://datatracker.ietf.org/doc/html/rfc2119), as clarified in [IETF RFC 8174](https://datatracker.ietf.org/doc/html/rfc8174).
Prefix is a statically typed, imperative, interpreted programming language focused on explicit, readable source code.
Programs MUST consist of variable declarations via assignment, expressions, and control-flow constructs.
---
## 2. Lexical structure
Source files MUST consist of ASCII characters. Non-ASCII characters MUST NOT appear in syntactic elements that enter the program namespace, such as identifiers.
---
### 2.1 Whitespace, comments, and logical lines
Whitespace consists of the space character (`U+0020`), horizontal tab, carriage return, and line feed. Whitespace MAY appear freely between tokens and is otherwise ignored, except that logical newlines delimit top-level statements.
Comments MUST begin with `!` and continue to the end of the current line. Comments have no effect on execution.
The semicolon character (`;`) MUST act as a newline-token alias outside string literals. Each `;` therefore separates statements exactly as a physical newline would.
The caret character (`^`) MUST act as a line-continuation marker when it is immediately followed by a newline or by a comment and that comment's terminating newline. In those cases the continuation marker and intervening text are ignored by the lexer. A caret inside a string literal has no continuation meaning. Any other use of `^` MUST raise a syntax error.
---
### 2.2 Tokens and reserved words
The lexer MUST distinguish at least the following token classes: numeric literals, string literals, identifiers, keywords and built-ins, and delimiters. The base delimiters of the language are `(`, `)`, `{`, `}`, `[`, `]`, `<`, `>`, `,`, `:`, `=`, and `~`. The pointer marker `@` MUST be tokenized separately and MUST NOT be treated as part of an identifier.
Keywords and built-in names MUST be matched case-sensitively and MUST be written in their canonical uppercase forms. If a reserved word is written in any other case, it MUST be tokenized as an identifier instead.
The character `-` MUST be interpreted only as the leading sign of a numeric literal. Any other use of `-` MUST raise a syntax error.
The character `~` MUST be reserved for coerced function parameters and MUST NOT appear inside identifiers.
---
### 2.3 Identifiers
Identifiers MUST be non-empty and case-sensitive. Variables and user-defined functions share a single flat namespace, so one name MUST NOT denote both a variable and a function. A user-defined function name MUST NOT conflict with any built-in operator or function name.
Identifiers MUST NOT contain non-ASCII characters or any of the following characters: `{`, `}`, `[`, `]`, `(`, `)`, `=`, `,`, `!`, `~`, or `@`. The first character of an identifier MUST NOT be `0` or `1`.
The first identifier character MAY be a letter `A-Z` or `a-z`, a decimal digit `2-9`, or one of `/`, `$`, `%`, `&`, `_`, `+`, `|`, or `?`. Subsequent identifier characters MAY additionally include the digits `0` and `1`. This permissive ASCII-only character set preserves an unambiguous distinction between identifiers and numeric literals, which MUST begin with a `0`-prefixed base marker.
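As an illustration of the character rules above, the following Python sketch checks a candidate identifier. It is not part of the specification: the name `is_valid_identifier` is hypothetical, and the sketch assumes that the characters listed as permitted form the complete identifier alphabet, which the text does not state outright. It does not check whether the name collides with a reserved word.

```python
import string

# Characters permitted in first position, per section 2.3.
FIRST_CHARS = set(string.ascii_letters + "23456789" + "/$%&_+|?")
# Later positions additionally allow the digits 0 and 1.
REST_CHARS = FIRST_CHARS | set("01")


def is_valid_identifier(name: str) -> bool:
    """Return True if `name` satisfies the identifier character rules.

    Assumption: any character outside the listed sets is rejected.
    """
    if not name:
        return False  # identifiers are non-empty
    if name[0] not in FIRST_CHARS:
        return False  # rejects a leading 0 or 1, among others
    return all(ch in REST_CHARS for ch in name[1:])
```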
---
### 2.4 Pointer token
The syntax `@name` MUST form a pointer literal, where `@` is a dedicated lexical token and `name` is an identifier resolved in the current lexical environment. Because `@` is a separate token, it MUST NOT appear inside identifier names.
---
## 3. Statements
A Prefix program MUST consist of zero or more statements separated by logical newlines. Each top-level expression statement or assignment MUST occupy its own logical line.
---
### 3.1 Blocks and statement forms
Blocks MUST be enclosed in matching curly braces as `{ statement1 ... statementN }`. Blocks are used as the bodies of conditionals, loops, thread constructs, exception handlers, and function definitions.
The core statement forms of the language MUST include typed declarations, assignments, expression statements, conditional execution, loops, asynchronous execution, exception handling, loop-control statements, and low-level jump statements.
---
### 3.2 Declarations and assignment
A symbol MAY be declared without a value using `TYPE name`. Such a declaration records the symbol's static type but does not create a readable runtime binding until an assignment occurs. The type and name MUST be separated by one or more space characters, and no other character MAY appear between them.
The first assignment to a symbol MUST use a typed form such as `TYPE name = expression`, with one or more spaces between the type and name and optional spaces around `=`. Subsequent assignments MAY omit the type annotation, but the symbol's type MUST remain unchanged for the lifetime of that name, including after deletion and re-assignment.
Assignments to undeclared identifiers without a type annotation MUST raise a runtime error. A binding MUST remain allocated until it is deleted with `DEL(name)`.
The built-in `ASSIGN(target, expression)` MUST provide expression-position assignment. The `target` MAY be a plain identifier or an indexed target. If the identifier does not yet exist, the typed form `ASSIGN(TYPE name, expression)` MUST be used, and that typed form is valid only for plain identifiers.
Indexed assignment to a tensor MUST use the form `tensor[i1, i2, ..., iN] = expression`. The base symbol MUST already be a `TNS`, the index count MUST match the tensor rank, indices MUST follow the language's one-based and negative-index rules, and the assignment MUST preserve the static element type at the selected position.
Indexed assignment to a map MUST use the same form as ordinary map access, such as `map<k1, k2> = expression`. Intermediate nested maps MUST be created on demand when assigning to deeper keys. Reassigning an existing key to a value of a different static type MUST raise a runtime error.
If a tensor target uses slice indices such as `lo:hi` or `*`, the right-hand side MUST evaluate to a `TNS` whose shape exactly matches the selected slice, and every written element MUST satisfy the target's static type constraints.
---
### 3.3 Conditionals and conditions
Conditional execution MUST use `IF(condition){ ... }`, optionally followed by one or more `ELSEIF(condition){ ... }` clauses and at most one trailing `ELSE{ ... }` clause. An `ELSEIF` or `ELSE` MUST immediately follow a preceding `IF` or `ELSEIF`; otherwise it is a syntax error.
The interpreter MUST evaluate the initial `IF` condition first, then each `ELSEIF` condition in order until one is truthy. If no earlier branch is taken and an `ELSE` clause is present, its block MUST execute.
Conditions MUST accept values of any type. When evaluating a condition, the runtime MUST coerce the value to its boolean representation according to that type's conversion rules (see [4](#4-types) for type-specific boolean semantics).
---
### 3.4 Exception handling
Structured exception handling MUST use either `TRY{ ... }CATCH{ ... }` or `TRY{ ... }CATCH(SYMBOL name){ ... }`. A `TRY` block MUST be followed immediately by exactly one `CATCH` block, and a standalone `CATCH` is a syntax error.
If the `TRY` block completes without a runtime error, the `CATCH` block MUST be skipped. If a runtime error occurs while executing the `TRY` block, execution of that block MUST stop and the matching `CATCH` block MUST execute.
In the parameterized form, `CATCH(SYMBOL name)` MUST create a temporary `STR` binding named `name` in the handler's lexical environment containing the triggering error message. That temporary binding MUST shadow any existing binding of the same name only for the duration of the `CATCH` block.
`TRY` and `CATCH` MUST intercept interpreter-level errors. Errors raised inside `ASYNC` or other background execution contexts MUST be reported through the runtime's error-reporting mechanisms and MUST NOT synchronously transfer control to a surrounding `CATCH` block in another thread.
---
### 3.5 Loops
`WHILE(condition){ ... }` MUST repeatedly evaluate its condition before each iteration and MUST terminate when the condition becomes false.
`FOR(counter, target){ ... }` MUST evaluate `target` exactly once at loop entry, require that value to be an `INT`, initialize `counter` to `1`, and execute the body once for each value through `target`, inclusive. The loop counter MUST be loop-local and MUST NOT persist after the loop finishes, though other symbols declared in the body remain bound in the enclosing environment.
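The `FOR` rules can be sketched in Python as follows. This is an illustrative model only, not the interpreter's implementation: the environment is modeled as a plain `dict`, the body as a callable, and the names `run_for` and `_MISSING` are hypothetical. Restoring a prior binding of the counter follows the scoping rule given in section 5.2.

```python
_MISSING = object()  # sentinel meaning "the counter name had no prior binding"


def run_for(env: dict, counter: str, target, body) -> None:
    """Run `body(env)` for counter values 1..target inclusive.

    `target` is assumed to be already evaluated (exactly once) by the caller.
    """
    if isinstance(target, bool) or not isinstance(target, int):
        raise TypeError("FOR target must be an INT")
    saved = env.get(counter, _MISSING)
    try:
        for value in range(1, target + 1):
            env[counter] = value
            body(env)
    finally:
        # The counter is loop-local: drop it or restore the earlier binding.
        if saved is _MISSING:
            env.pop(counter, None)
        else:
            env[counter] = saved
```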
`PARFOR(counter, target){ ... }` MUST evaluate `target` once, require an `INT`, then execute the iterations `1..target` concurrently. The interpreter MUST wait for all iterations to finish before continuing. The loop counter is iteration-local and does not persist afterward, while other symbols declared in iteration bodies are merged back into the enclosing environment. Concurrent writes to shared identifiers are therefore race-prone, and the resulting value is implementation-defined.
Runtime errors raised inside a `PARFOR` iteration MUST be collected and re-raised after all iterations have joined. `CONTINUE()` inside `PARFOR` MUST end only the current iteration. `BREAK(n)` inside any `PARFOR` iteration MUST terminate the `PARFOR` as a whole by preventing new iterations from starting, waiting for in-flight iterations to finish, and then propagating the break to surrounding loop handlers.
---
### 3.6 Asynchronous execution and threads
The statement form `ASYNC{ ... }` MUST start a background task that executes the enclosed block asynchronously relative to the main flow of execution while sharing the same lexical environment. Mutations performed inside the block are therefore visible to the rest of the program as they occur.
When `ASYNC{ ... }` appears in statement position, its resulting `THR` handle is ignored. When it appears in expression position, it MUST evaluate to a `THR` handle. If that expression is supplied as an argument to an enclosing call expression, the runtime MUST defer starting the worker until the enclosing call expression has completed.
The statement form `THR(symbol){ ... }` MUST create a background task and bind its handle to `symbol` with static type `THR` in the current environment.
Runtime errors raised inside asynchronous execution MUST be recorded by the interpreter's error-reporting and state-logging mechanisms and MUST NOT synchronously abort the main thread.
---
### 3.7 Loop control and jumps
`BREAK(n)` MUST terminate the innermost `n` enclosing loops. The argument `n` MUST evaluate to a strictly positive integer. Using `BREAK` outside a loop, passing `n <= 0`, or requesting more loop exits than are currently available MUST raise a runtime error.
`CONTINUE()` MUST skip the remainder of the current iteration of the innermost enclosing loop and proceed to the next iteration. If no further iteration would occur, `CONTINUE()` MUST have the same effect as `BREAK(1)`. Using `CONTINUE()` outside a loop MUST raise a runtime error.
`GOTOPOINT(n)` MUST register a jump target identified by the runtime value `n`, where `n` MAY be either `INT` or `STR`. Negative integer gotopoint identifiers are invalid.
`GOTO(n)` MUST jump to a previously registered gotopoint with matching type and value. The jump target MUST be visible within the same containing function or top-level scope unless an implementation explicitly widens that scope. `GOTO` MAY jump either forward or backward. Jumping to an unregistered target MUST raise a runtime error.
Implementations MUST record `BREAK`, `CONTINUE`, `GOTOPOINT`, and `GOTO` in the state log so deterministic replay and traceback generation remain possible.
---
## 4. Types
Prefix MUST support eight types: `BOOL` (booleans), `STR` (strings), `INT` (integers), `FLT` (floating-point numbers), `TNS` (tensors), `MAP` (maps), `FUNC` (functions), and `THR` (threads). The type of a symbol is determined when it is first declared, and cannot be changed, even after deletion of the original symbol.
---
### 4.0 Booleans
`BOOL` MUST represent the two truth values of the language.
---
#### 4.0.1 Boolean literals
The only `BOOL` literals are `TRUE` and `FALSE`.
---
#### 4.0.2 Boolean representation
When a `BOOL` is used in a boolean context, `FALSE` MUST remain false and `TRUE` MUST remain true.
When a `BOOL` is converted to `STR`, `TRUE` MUST render as `TRUE` and `FALSE` MUST render as `FALSE`.
---
### 4.1 Strings
`STR` MUST consist of a sequence of Unicode characters.
---
#### 4.1.1 String literals
`STR` literals MUST be enclosed in either single quotes (`'`) or double quotes (`"`). The closing quote character MUST match the opening quote character.
---
#### 4.1.2 Escape sequences
The character `\` MUST begin an escape sequence in a `STR` literal. The following escape sequences MUST be supported:
- `\\` = U+005C
- `\"` = U+0022
- `\'` = U+0027
- `\a` = U+0007
- `\b` = U+0008
- `\f` = U+000C
- `\n` = U+000A
- `\r` = U+000D
- `\t` = U+0009
- `\v` = U+000B
- `\e` = U+001B
- `\xHH` = exactly two hexadecimal digits (`0-9`|`A-F`|`a-f`). Produces the code point in the range U+0000..U+00FF given by 0xHH.
- `\uHHHH` = exactly four hexadecimal digits (`0-9`|`A-F`|`a-f`). Produces code point U+HHHH.
- `\UHHHHHHHH` = exactly eight hexadecimal digits (`0-9`|`A-F`|`a-f`). Produces code point U+HHHHHHHH.
- `\R` = raw mode toggle. In raw mode, all escape sequences (with the exception of `\R`) are treated as plain text until the next `\R` or the end of the string literal.
If an invalid escape sequence is encountered in a `STR` literal, or a non-hexadecimal character appears where an escape sequence requires a hexadecimal digit, an error MUST be raised.
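The escape rules above can be sketched as a small decoder over the text between the quotes. This Python version is a hedged illustration rather than the reference lexer: the name `decode_escapes` is hypothetical, errors are modeled as `ValueError`, and a backslash at the very end of the body is assumed to be an error outside raw mode.

```python
SIMPLE = {
    "\\": "\\", '"': '"', "'": "'", "a": "\a", "b": "\b", "f": "\f",
    "n": "\n", "r": "\r", "t": "\t", "v": "\v", "e": "\x1b",
}
HEX_WIDTH = {"x": 2, "u": 4, "U": 8}
HEX_DIGITS = set("0123456789abcdefABCDEF")


def decode_escapes(body: str) -> str:
    """Decode the escape sequences in the body of a STR literal."""
    out = []
    raw = False  # toggled by \R
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        code = body[i + 1] if i + 1 < len(body) else ""
        if code == "R":
            raw = not raw  # \R itself is never emitted
            i += 2
        elif raw:
            out.append(ch)  # in raw mode the backslash is plain text
            i += 1
        elif code in SIMPLE:
            out.append(SIMPLE[code])
            i += 2
        elif code in HEX_WIDTH:
            width = HEX_WIDTH[code]
            digits = body[i + 2:i + 2 + width]
            if len(digits) != width or not set(digits) <= HEX_DIGITS:
                raise ValueError(f"bad hexadecimal escape at offset {i}")
            out.append(chr(int(digits, 16)))  # chr rejects values above U+10FFFF
            i += 2 + width
        else:
            raise ValueError(f"invalid escape sequence at offset {i}")
    return "".join(out)
```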
---
#### 4.1.3 Boolean representation
When a `STR` is used in a boolean context, an empty string MUST be treated as `FALSE`, and any non-empty string MUST be treated as `TRUE`.
---
### 4.2 Integers
`INT` MUST be conceptually unbounded integers with an attached numeric base. Implementations MAY implement a size limit, but MUST support at least 32-bit signed integers. Implementations SHOULD support at least 64-bit signed integers.
---
#### 4.2.1 Integer literals
`INT` literals MUST include a base prefix. Valid prefixes are `0b`, `0o`, `0d`, `0x`, `0t`, `0c`, `0s`, and `0rNN` where `NN` is two decimal digits and `2 <= NN <= 64`.
`INT` literals MUST support negative values via prefixing with a `-` before the base prefix, and positive values with no sign. `INT` literals MUST NOT support a `+` prefix.
Literal digits MUST match the selected base's alphabet. Bases smaller than 2 or larger than 64 MUST raise an error.
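A minimal Python sketch of these literal rules is shown below, under several labeled assumptions: `0b`, `0o`, `0d`, and `0x` are taken to mean bases 2, 8, 10, and 16; the prefixes `0t`, `0c`, and `0s` are omitted because their bases are not defined in this section; and digits for bases up to 36 are assumed to be `0-9` followed by `A-Z`, case-insensitively. The name `parse_int_literal` is hypothetical.

```python
import re
import string

# Assumed conventional meanings; 0t, 0c, and 0s are left out of the sketch.
FIXED_PREFIXES = {"0b": 2, "0o": 8, "0d": 10, "0x": 16}
# Assumed alphabet for bases up to 36: 0-9 followed by A-Z, case-insensitive.
DIGITS = string.digits + string.ascii_uppercase


def parse_int_literal(text: str) -> tuple[int, int]:
    """Return (value, base) for an INT literal, or raise ValueError."""
    if not text.isascii():
        raise ValueError("source text must be ASCII")
    negative = text.startswith("-")
    body = text[1:] if negative else text  # a leading '+' is never accepted
    if body[:2] in FIXED_PREFIXES:
        base, digits = FIXED_PREFIXES[body[:2]], body[2:]
    elif re.match(r"0r[0-9]{2}", body):
        base, digits = int(body[2:4]), body[4:]
    else:
        raise ValueError(f"missing or unknown base prefix: {text!r}")
    if not 2 <= base <= 64:
        raise ValueError(f"base out of range: {base}")
    if base > 36:
        raise NotImplementedError("no digit alphabet is assumed above base 36")
    if not digits:
        raise ValueError("literal has no digits")
    value = 0
    for ch in digits.upper():
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit
    return (-value if negative else value, base)
```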
---
#### 4.2.2 Boolean representation
When an `INT` is used in a boolean context, `0` MUST be treated as `FALSE`, and any non-zero value MUST be treated as `TRUE`.
---
### 4.3 Floating-point numbers
`FLT` MUST be a compliant implementation of IEEE 754 floating-point numbers and carry an attached numeric base (or base-NaN for special values). `FLT` SHOULD use 64-bit double-precision representation.
---
#### 4.3.1 Floating-point literals
`FLT` literals MUST include a base prefix using the same prefix forms as `INT` literals.
`FLT` literals MUST consist of a prefixed integer part, a radix point (`.`), and a fractional part. The integer and fractional parts MUST each contain at least one digit in the selected base.
`FLT` literals MUST support negative values via prefixing with a `-` before the base prefix, and positive values with no sign. `FLT` literals MUST NOT support a `+` prefix.
---
#### 4.3.2 Special values
`FLT` MUST support the following special values:
- `INF` - Infinity. `INF` MUST support negative values via prefixing with a `-` (`-INF`).
- `NaN` - Quiet Not a Number. `NaN` MUST NOT support negative values. Attempting to create a negative `NaN` MUST cause an error.
`INF`, `-INF`, and `NaN` MUST NOT include a base prefix and are considered base-NaN.
When numeric values are converted to `STR`, they MUST render in their own base and MUST include the base prefix. `INF`, `-INF`, and `NaN` render without a base prefix.
Mathematical operations MUST produce results in the highest base present among numeric operands.
---
#### 4.3.3 Boolean representation
When a `FLT` is used in a boolean context, `0.0` MUST be treated as `FALSE`, and any non-zero value MUST be treated as `TRUE`.
---
### 4.4 Tensors
`TNS` MUST represent a non-scalar aggregate of elements. The elements MAY be of type `BOOL`, `INT`, `FLT`, `STR`, `MAP`, `THR`, `FUNC`, or `TNS`. `TNS` values MUST be atomic (de-referenced) types, where assigning a `TNS` to another identifier copies the container object by default.
---
#### 4.4.1 Tensor literals
`TNS` literals MUST be enclosed in square brackets (`[` and `]`). Each pair of matching brackets MUST introduce a dimension, and nested brackets MUST form a rectangular shape where all sublists at a given depth have the same length. A `TNS` literal that mixes sub-brackets of differing lengths MUST raise a syntax error.
If an element evaluates to a `TNS` value, it MUST occupy a single position and MUST NOT contribute additional dimensions to the shape.
---
#### 4.4.2 Tensor indexing
`TNS` indexing MUST use square brackets (`[` and `]`) with a comma-separated list of indices. Indexing MUST be one-based. Negative indices MUST count backwards from the end of the dimension.
Slice indexing MUST be supported in any index position using a range of the form `lo:hi`. The selected slice MUST be inclusive of both endpoints. The symbol `*` MAY be used in an index position to denote a full-dimension slice, selecting every element along that axis.
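To make the indexing rules concrete, here is a small Python sketch that resolves one index position against a dimension of a given length. It is illustrative only: the names `to_zero_based` and `select_positions` are hypothetical, a `lo:hi` slice is modeled as a `(lo, hi)` tuple, and treating index `0` and out-of-range indices as errors is an assumption, since this section does not say how they are handled.

```python
def to_zero_based(index: int, length: int) -> int:
    """Map a one-based or negative index onto a zero-based position.

    Assumption: index 0 and out-of-range indices are errors.
    """
    if index == 0 or abs(index) > length:
        raise IndexError(f"index {index} is invalid for a dimension of length {length}")
    return index - 1 if index > 0 else length + index


def select_positions(spec, length: int) -> list[int]:
    """Resolve one index position: an int, a (lo, hi) pair for `lo:hi`, or "*"."""
    if spec == "*":
        return list(range(length))  # full-dimension slice
    if isinstance(spec, tuple):
        lo, hi = spec
        start, stop = to_zero_based(lo, length), to_zero_based(hi, length)
        return list(range(start, stop + 1))  # both endpoints are included
    return [to_zero_based(spec, length)]
```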
---
#### 4.4.3 Boolean representation
When a `TNS` is used in a boolean context, a `TNS` MUST be treated as `TRUE` if it contains any truthy elements, and `FALSE` otherwise.
---
### 4.5 Maps
`MAP` MUST represent an N-dimensional associative container mapping keys to values of a single static type. `MAP` values MUST be atomic (de-referenced) types, where assigning a `MAP` to another identifier copies the container object by default. Maps MUST preserve insertion order.
---
#### 4.5.1 Map literals
`MAP` literals MUST be enclosed in angle brackets (`<` and `>`), consisting of a comma-separated list of key-value bindings of the form `key = value`. Values MAY themselves be maps.
The literal `SELF` MAY be evaluated inside `MAP` literals. When used as a value within a map literal, `SELF` MUST denote a pointer (alias) to the enclosing `MAP` being constructed.
---
#### 4.5.2 Map indexing
`MAP` indexing MUST use angle brackets (`<` and `>`) with a comma-separated list of keys. Each key MUST evaluate to a scalar value of type `INT`, `FLT`, or `STR`. Tensor values MUST NOT be permitted as keys. The number of keys supplied MAY vary between lookups.
Looking up a key that does not exist MUST raise a runtime error.
Intermediate nested maps MUST be created on-demand when assigning to deeper keys. Assigning a value of a different type to an existing key MUST raise a runtime error.
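The lookup and assignment rules above can be modeled with nested Python dictionaries. This is a sketch under stated assumptions, not the interpreter's data structure: `map_lookup` and `map_assign` are hypothetical names, runtime errors are modeled as `RuntimeError`, and Python's runtime type stands in for the static type of a stored value.

```python
def map_lookup(root: dict, keys: list):
    """Follow `keys` through nested maps; a missing key is a runtime error."""
    node = root
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            raise RuntimeError(f"key {key!r} does not exist")
        node = node[key]
    return node


def map_assign(root: dict, keys: list, value) -> None:
    """Assign `value` at `keys`, creating intermediate maps on demand."""
    node = root
    for key in keys[:-1]:
        if key not in node:
            node[key] = {}  # intermediate map created on demand
        elif not isinstance(node[key], dict):
            raise RuntimeError(f"key {key!r} does not hold a map")
        node = node[key]
    last = keys[-1]
    # Python's runtime type stands in for the static type (an assumption).
    if last in node and type(node[last]) is not type(value):
        raise RuntimeError(f"key {last!r} already holds a different type")
    node[last] = value
```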
When a `MAP` index expression is used in call position and the indexed value resolves to a `FUNC`, the interpreter MUST bind `SELF` for the duration of that call to the `MAP` value that supplied the function. If the `MAP` expression refers to a visible pointer alias to a `MAP` binding, `SELF` MUST alias the underlying binding so that mutations through `SELF` update the original map. Otherwise `SELF` MUST follow normal `MAP` value semantics, so calling through a non-pointer value mutates only the local copy.
---
#### 4.5.3 Boolean representation
When a `MAP` is used in a boolean context, a `MAP` MUST be treated as `TRUE` if it contains any key-value pairs, and `FALSE` otherwise.
---
### 4.6 Functions
`FUNC` MUST represent a reference to a user-defined function body, including its lexical closure. `FUNC` values MAY be assigned to variables, stored in `TNS` or `MAP` objects, passed as arguments, or returned.
---
#### 4.6.1 Function literals
`FUNC` statements MUST define explicit parameter and return types. The definition MUST use the `FUNC` keyword followed by the return type, the function name, a comma-separated list of typed parameters in parentheses, and the body in curly braces. Type annotations in function definitions MUST use the `TYPE name` form, with one or more space characters between the type and the name.
Parameters MAY declare a call-time default value. Positional parameters MUST appear before any parameters with defaults. A `RETURN(value)` statement terminates the function; the returned value MUST match the declared return type. A function with a `MAP`, `TNS`, `FUNC`, or `THR` return type MUST execute an explicit `RETURN`; if control reaches the end of its body without one, a runtime error MUST be raised.
---
#### 4.6.2 Lambdas
Lambdas MUST construct an anonymous `FUNC` value without binding it to a name in the global function table. They MUST be defined using the `LAMBDA` keyword.
Evaluating a `LAMBDA` expression MUST capture the current lexical environment to produce a first-class `FUNC` value. Parameter typing, default values, and return type semantics MUST follow the same rules as `FUNC` definitions.
---
#### 4.6.3 Coerced type parameters
If the type declaration of a `FUNC` parameter is prefixed with `~`, the parameter MUST accept values of any type that can be converted to the declared type, and convert them to the declared type at call time. If conversion is not possible, a runtime error MUST be raised.
---
#### 4.6.4 Boolean representation
When a `FUNC` is used in a boolean context, a `FUNC` MUST be treated as `TRUE`.
---
### 4.7 Threads
`THR` MUST represent a handle to a background task that executes asynchronously relative to the rest of the program.
---
#### 4.7.1 Thread literals
`THR` handles MUST be created using the `ASYNC{ ... }` expression or the `THR(symbol){ ... }` statement. The enclosed block MUST execute its own statements sequentially, but asynchronously relative to the calling thread. The spawned task MUST share the main program namespace (executing with the same lexical environment).
---
#### 4.7.2 Boolean representation
When a `THR` is used in a boolean context, a completed or stopped `THR` MUST be treated as `FALSE`, and any other state MUST be treated as `TRUE`.
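The boolean-representation rules of sections 4.0 through 4.7 can be collected into one function. The Python sketch below is an illustration under assumptions: `TNS` is modeled as a nested `list`, `MAP` as a `dict`, `FUNC` as any callable, and `THR` as the hypothetical `Thread` class, whose state names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    """Hypothetical stand-in for a THR handle."""
    state: str  # e.g. "running", "completed", "stopped" (assumed names)


def truthy(value) -> bool:
    """Coerce a modeled Prefix value to its boolean representation."""
    if isinstance(value, bool):  # checked first: bool is a subclass of int
        return value
    if isinstance(value, (int, float)):
        return value != 0  # INT and FLT: only zero is false
    if isinstance(value, str):
        return value != ""  # STR: only the empty string is false
    if isinstance(value, list):
        return any(truthy(element) for element in value)  # TNS
    if isinstance(value, dict):
        return len(value) > 0  # MAP: true when it has any key-value pair
    if isinstance(value, Thread):
        return value.state not in ("completed", "stopped")  # THR
    if callable(value):
        return True  # FUNC is always true
    raise TypeError(f"unsupported value: {value!r}")
```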
---
## 5. Namespaces and symbols
Prefix MUST maintain a typed lexical environment mapping identifiers to bindings. Variables and user-defined functions share a single flat identifier namespace within each scope. Imported modules expose distinct qualified namespaces under the module name or an explicit import alias.
---
### 5.1 Declarations, bindings, and lifetime
A declaration MUST record the symbol's static type without necessarily creating a readable runtime value. A readable binding is created only when a value is first assigned. In all type annotations, the type and name MUST be separated by one or more space characters, and other characters MUST NOT appear between them.
Reading an undeclared symbol, a declared-but-never-assigned symbol, or a deleted symbol MUST raise a runtime error. Removing a binding MUST preserve its recorded static type, so any later re-assignment to the same name MUST still match the original type.
Re-assignment MUST preserve the declared type for the lifetime of the symbol, including after deletion and re-creation. Symbol existence, lifetime, and mutability MAY be inspected or modified only through the language's dedicated symbol-management facilities.
---
### 5.2 Lexical environments and scope
The top-level program MUST execute in a global environment. Function calls MUST create new activation records that close over the lexical environment captured when the callable was defined.
Unqualified identifier lookup MUST search the current environment first and then enclosing lexical environments. Loop counters introduced by iterative constructs are loop-local: any prior binding with the same name MUST be restored when the loop completes. Other symbols first declared inside loop bodies remain bound in the enclosing environment after the loop finishes.
Modules loaded through the language's import facilities MUST execute in their own top-level environments. Their bindings are exposed only through qualified names, and repeated imports of the same module MUST reuse the same cached module namespace rather than re-executing the module source.
---
### 5.3 Function symbols and call binding
Named function definitions MUST bind callable values into the surrounding environment. Anonymous function forms MUST instead produce callable values without introducing a new top-level name.
Calls MAY target any expression that evaluates to a callable value. Arguments MUST be evaluated left-to-right. Positional arguments MUST bind from left to right, and named arguments MUST appear only after all positional arguments. Unknown names, duplicate names, or too many positional arguments MUST raise a runtime error.
Default parameter expressions MUST be evaluated at call time in the function's lexical environment after earlier parameters have been bound. For an ordinary parameter, the supplied or defaulted value MUST already have the declared type. For a coerced parameter, the runtime MUST attempt conversion to the declared type and MUST raise a runtime error if that conversion fails.
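The binding rules in this section can be sketched in Python as follows. The sketch is hedged and simplified: type checking and coercion are omitted, named arguments are passed as a list of pairs so that duplicates remain detectable, each default is modeled as a callable that receives the parameters bound so far, and treating a missing argument without a default as an error is an assumption. The names `bind_arguments` and `REQUIRED` are hypothetical.

```python
REQUIRED = object()  # marks a parameter that has no default


def bind_arguments(params, positional, named):
    """Bind call arguments to parameters and return {name: value}.

    params:     list of (name, default); default is REQUIRED or a callable
                taking the dict of parameters bound so far.
    positional: list of already-evaluated positional argument values.
    named:      list of (name, value) pairs, in source order.
    """
    if len(positional) > len(params):
        raise RuntimeError("too many positional arguments")
    bound = {}
    for (name, _), value in zip(params, positional):
        bound[name] = value  # positional arguments bind left to right
    known = {name for name, _ in params}
    for name, value in named:
        if name not in known:
            raise RuntimeError(f"unknown argument name: {name}")
        if name in bound:
            raise RuntimeError(f"duplicate argument: {name}")
        bound[name] = value
    for name, default in params:
        if name in bound:
            continue
        if default is REQUIRED:
            raise RuntimeError(f"missing argument: {name}")  # assumed behavior
        bound[name] = default(bound)  # evaluated at call time
    return {name: bound[name] for name, _ in params}
```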
Any explicit return value MUST match the declared return type. If execution reaches the end of a function body without an explicit return, the runtime MUST apply the following default-return rules: `BOOL` functions MUST return `FALSE`, `INT` functions MUST return `0`, `FLT` functions MUST return `0.0`, and `STR` functions MUST return the empty string. Functions declared with return type `TNS`, `MAP`, `FUNC`, or `THR` MUST raise a runtime error if control reaches the end of the body without an explicit `RETURN`.
---
### 5.4 Pointers, aliasing, and symbol state
The language's pointer mechanism MUST create an alias to an existing visible binding. Pointer creation MUST resolve any existing pointer chain so the new pointer refers to the underlying non-pointer target. Reading through a symbol bound to a pointer MUST behave as an ordinary dereference, so plain reads yield the pointed-to value.
Assigning through a symbol whose current binding is a pointer MUST update the pointed-to target rather than replace the pointer object. Pointer cycles, including direct self-reference, MUST be rejected as runtime errors.
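A compact Python model of these aliasing rules is given below. It is a sketch, not the runtime's representation: the environment is a flat `dict`, the `Pointer` class and the function names are hypothetical, `bind_pointer` is assumed to be used only for creating a new pointer, and frozen bindings are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """Hypothetical alias record naming the binding it points to."""
    target: str


def resolve(env: dict, name: str) -> str:
    """Follow any pointer chain from `name` to the underlying binding's name."""
    seen = set()
    while isinstance(env.get(name), Pointer):
        if name in seen:
            raise RuntimeError("pointer cycle")
        seen.add(name)
        name = env[name].target
    if name not in env:
        raise RuntimeError(f"no visible binding named {name!r}")
    return name


def bind_pointer(env: dict, name: str, target: str) -> None:
    """Create `name` as a pointer to the non-pointer binding behind `target`."""
    root = resolve(env, target)
    if root == name:
        raise RuntimeError("pointer cycle")  # includes direct self-reference
    env[name] = Pointer(root)


def read(env: dict, name: str):
    return env[resolve(env, name)]  # plain reads dereference


def assign(env: dict, name: str, value) -> None:
    env[resolve(env, name)] = value  # writes update the pointed-to target
```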
When a built-in operator produces a transformed value from one or more operand arguments, any operand argument that is a pointer literal MUST be written back through the alias for that operand. Arguments that act only as flags, modes, delimiters, bounds, or other control inputs MUST NOT be written back unless they are themselves part of the transformed output.
Because `TNS` and `MAP` are atomic container types, ordinary assignment of those values MUST duplicate the container object by default. Shared mutable aliasing for containers therefore requires the explicit alias mechanism.
Bindings MAY be frozen or permanently frozen by the runtime's symbol-state facilities. A frozen binding MUST reject reassignment and deletion. A permanently frozen binding MUST additionally reject re-enabling mutation. Creating a new pointer to a frozen binding MUST raise a runtime error, though an existing aliased binding MAY itself later become frozen.
---
### 5.5 Execution state and replay
Program execution MUST begin from a seed configuration containing the parsed program, the initial environments, and explicit models of I/O and nondeterministic inputs. The interpreter MUST advance execution by repeatedly applying a fixed, program-independent small-step transition function.
Each intermediate machine state MUST be serializable and sufficiently human-readable for tracing and replay. The interpreter's state log MUST record the sequence of visited states, control-flow decisions, I/O events, and nondeterministic choices so that execution can be replayed deterministically from the seed configuration together with the recorded inputs.
---
## 6. Execution model
The execution model defines how the interpreter advances program state and how runtime effects are recorded.
Implementations MUST describe execution in terms of a deterministic small-step transition function that consumes a seed configuration and produces a sequence of serializable states. The state log produced by these transitions MUST be sufficient for deterministic replay, diagnostics, and tooling integration.
---
### 6.1 Seed configuration and machine state
Execution MUST begin from a seed configuration. At minimum, that seed configuration MUST contain the parsed program representation, the initial lexical environments, the module cache state, the initial control position, and explicit models of external inputs, I/O history, and any other nondeterministic data sources that execution may consult.
Each machine state MUST contain enough information for the interpreter to compute the next transition without relying on hidden process state. The state representation MAY be implementation-specific, but it MUST preserve the currently active control location, the visible environments, any pending call frames, the set of live asynchronous or parallel tasks, and the execution-log metadata needed for replay and traceback construction.
Two executions that begin from equivalent seed configurations and consume the same recorded nondeterministic choices MUST be observationally equivalent: they MUST visit the same sequence of abstract machine states and produce the same externally visible results.
---
### 6.2 Small-step transition semantics
The interpreter MUST advance execution by repeatedly applying a single, fixed transition function to the current machine state. That transition function MUST be program-independent: the same transition rules apply to every Prefix program, with the current state supplying the specific code location, environment contents, and pending effects.
Expression evaluation, statement execution, function call and return, module loading, thread creation, loop control transfer, exception propagation, and I/O MUST all be modeled as one or more explicit state transitions. A transition MAY be purely internal, or it MAY emit an observable effect record such as consumed input, produced output, or a runtime error.
Execution MUST terminate only when the interpreter reaches a terminal state. Terminal states include normal completion of the top-level program, explicit process termination, and unhandled runtime failure.
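A minimal sketch of this shape, using a toy stack machine rather than Prefix itself: the instruction set (`PUSH`, `ADD`, `PRINT`), the `State` fields, and the effect tuples are all invented for illustration. The point is only that `step` is the same function for every program, that the state carries everything the next transition needs, and that execution stops at a terminal state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    program: tuple            # parsed program, carried inside the state
    pc: int = 0               # control position
    stack: tuple = ()
    status: str = "running"   # "running", "done", or "failed"


def step(state):
    """Apply one transition; returns (next_state, effect_or_None)."""
    if state.pc >= len(state.program):
        return replace(state, status="done"), None
    op, arg = state.program[state.pc]
    if op == "PUSH":
        return replace(state, pc=state.pc + 1, stack=state.stack + (arg,)), None
    if op == "ADD" and len(state.stack) >= 2:
        *rest, a, b = state.stack
        return replace(state, pc=state.pc + 1, stack=(*rest, a + b)), None
    if op == "PRINT" and state.stack:
        # Observable effect: emitted as an explicit record, not a host print.
        return replace(state, pc=state.pc + 1), ("output", state.stack[-1])
    return replace(state, status="failed"), ("error", f"cannot apply {op}")


def run(program):
    """Repeat the fixed transition function until a terminal state is reached."""
    state = State(program=tuple(program))
    effects = []
    while state.status == "running":
        state, effect = step(state)
        if effect is not None:
            effects.append(effect)
    return state, effects
```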
---
### 6.3 State logging and observable effects
Every non-private execution MUST produce a state log whose entries are ordered by a monotonically increasing rewrite-step index. Each logged transition MUST identify at least the source state, the resulting state, and the rewrite or transition rule that was applied. When a transition corresponds to a source location, the implementation SHOULD also record that source location together with a short statement excerpt.
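One possible shape for such a log is sketched below in Python. The field names `step_index`, `from_state_id`, and `to_state_id` follow [7.3](#73-state-log-linkage); deriving `state_id` from a hash of the canonical JSON serialization is an assumption of this sketch, as are the `StateLog` class and the `rule` key.

```python
import hashlib
import json


def state_id(state):
    """Stable identifier derived from a canonical JSON serialization (assumed scheme)."""
    blob = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()[:12]


class StateLog:
    def __init__(self):
        self.entries = []

    def record(self, before, after, rule, source_location=None):
        """Append one transition; step_index increases by one per entry."""
        entry = {
            "step_index": len(self.entries),
            "from_state_id": state_id(before),
            "to_state_id": state_id(after),
            "rule": rule,
            "source_location": source_location,  # optional (SHOULD, not MUST)
        }
        self.entries.append(entry)
        return entry
```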
Observable effects MUST be represented explicitly in the execution record rather than being left implicit in host-language side effects. This includes printed output, consumed input, extension hook activity, runtime error creation, thread lifecycle events, and scheduler decisions that influence visible behavior.
Logging configuration MAY change how much diagnostic data is retained, but it MUST NOT change the language semantics of the running program. In particular, enabling verbose tracing or disabling snapshots for privacy MUST affect diagnostics only, except that disabling the state logger necessarily removes replay artifacts that depend on that log.
---
### 6.4 Concurrency, scheduling, and interleaving
`ASYNC`, `THR`, and `PARFOR` introduce multiple runnable execution contexts that share the language's lexical environment according to the rules defined elsewhere in this specification. Unless a construct states otherwise, the implementation MAY choose any scheduling order for runnable contexts.
Scheduling decisions that can affect observable behavior MUST be treated as explicit nondeterministic choices. When replay logging is enabled, those choices MUST be recorded in sufficient detail for the implementation to reproduce the same interleaving on replay.
Each individual rewrite step MUST belong to exactly one active execution context. The implementation MAY choose statement-level or finer-grained interleaving, but whatever granularity it uses for a given execution MUST be reflected in the state log in enough detail to explain shared-state races, asynchronous errors, and cross-thread visibility of mutations.
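The recording and replay of scheduling choices can be illustrated with a toy scheduler. In this Python sketch each execution context is just a list of step labels, and the function names (`run_interleaved`, `random_chooser`, `replay_chooser`) are hypothetical; a real implementation would schedule actual machine transitions.

```python
import random


def run_interleaved(tasks, choose):
    """Run contexts step by step; `choose` picks the next runnable context."""
    remaining = {name: list(steps) for name, steps in tasks.items()}
    trace, choices = [], []
    while any(remaining.values()):
        runnable = sorted(name for name, steps in remaining.items() if steps)
        picked = choose(runnable)
        choices.append(picked)                             # recorded nondeterministic choice
        trace.append((picked, remaining[picked].pop(0)))   # each step has exactly one owner
    return trace, choices


def random_chooser(seed):
    rng = random.Random(seed)
    return lambda runnable: rng.choice(runnable)


def replay_chooser(recorded):
    """Reproduce a previously recorded schedule, failing loudly on divergence."""
    pending = iter(recorded)

    def choose(runnable):
        picked = next(pending)
        if picked not in runnable:
            raise RuntimeError("replay diverged from the recorded schedule")
        return picked

    return choose
```

Running once with `random_chooser` and then again with `replay_chooser(choices)` yields the same interleaving, which is the property the replay requirement asks for.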
Runtime errors raised in background or parallel contexts MUST be attributed to the originating execution context in the log and reported according to the control-flow rules in [3.4](#34-exception-handling), [3.5](#35-loops), and [7](#7-tracebacks-and-error-handling). A surrounding handler in another execution context MUST NOT intercept such an error synchronously unless some future language construct explicitly specifies that behavior.
---
### 6.5 Deterministic replay requirements
When state logging is enabled, the interpreter MUST support deterministic replay from the seed configuration together with the recorded sequence of external inputs and nondeterministic choices. Replay MUST reconstruct the same transition sequence, the same observable effects, and the same terminal outcome as the original execution.
Replay tooling SHOULD be able to resume from any serialized intermediate state whose required predecessor data is available. If an implementation redacts or omits diagnostic data, it MUST clearly indicate that omission and MUST NOT present the resulting replay record as complete when it is not.
The state log defined by this section is the normative source for tracebacks, diagnostics, and replay-oriented tooling. Any alternate implementation strategy remains conforming only if it preserves the same externally visible execution behavior and can expose an equivalent serialized transition history.
---
## 7. Tracebacks and error handling
Structured exception handling is defined in [3.4](#34-exception-handling). This section defines how unhandled runtime failures MUST be reported and how those reports are tied to the serialized execution log.
---
### 7.1 Runtime failures and traceback triggers
A traceback MUST be produced whenever a runtime error prevents normal forward execution, including but not limited to undefined identifier access, type mismatch, divide-by-zero, invalid control-flow transfer, or failed assertion.
Errors raised inside background execution contexts MUST be recorded through the same error-reporting and state-logging mechanisms, but they MUST NOT synchronously transfer control to an exception handler running on a different thread. Nested handlers MUST continue to obey the language's innermost-handler rule.
---
### 7.2 Required traceback contents
The interpreter MUST emit a deterministic, human-readable traceback derived from the state log. Frames MUST be listed from the outermost call site to the innermost frame where the failure occurred.
Each frame MUST include the active function name, or `<top-level>` for global code, together with a precise source location, a short excerpt of the relevant statement, and references that identify the corresponding serialized states in the state log. The innermost frame MUST additionally identify the failing rewrite step that produced the error.
---
### 7.3 State-log linkage
When verbose logging is enabled, the state log MUST make at least the following fields available for traceback construction:
- `step_index`, a monotonically increasing rewrite-step counter.
- `state_id`, a stable identifier for each serialized state.
- `source_location`, when applicable, including file, line, and statement text.
- `frame_id`, when a call frame is active.
- `env_snapshot`, containing the selected local environment and any included global bindings.
- `rewrite_record`, containing the rewrite rule name, its inputs or parameters, and the `from_state_id` and `to_state_id` pair.
Implementations MAY redact or elide large or sensitive values, but they MUST clearly indicate when snapshot data has been omitted.
---
### 7.4 Presentation modes
The default traceback form MUST be a concise textual stack trace. The interpreter MUST also support a verbose mode that includes per-frame state snapshots and full rewrite records, and a machine-readable form suitable for editors and diagnostics tooling.
The following concise textual layout is RECOMMENDED:
```text
Traceback (most recent call last):
File "<file>", line <line>, in <function_or_<top-level>>
<statement excerpt>
State log index: <step_index> State id: <state_id>
```
The final line of the traceback MUST identify the runtime error, the failing rewrite rule when known, and the `step_index` at which execution failed.
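A formatter for the recommended concise layout might look like the following Python sketch. The frame dictionary keys and the exact wording of the final line are assumptions; the specification only requires that the final line identify the error, the failing rule when known, and the `step_index`.

```python
def format_traceback(frames, error, rule, step_index):
    """Render frames (outermost first) in the recommended concise layout."""
    lines = ["Traceback (most recent call last):"]
    for frame in frames:
        lines.append(
            f'File "{frame["file"]}", line {frame["line"]}, in {frame["function"]}'
        )
        lines.append(frame["excerpt"])
        lines.append(
            f'State log index: {frame["step_index"]} State id: {frame["state_id"]}'
        )
    # Final-line wording is illustrative, not mandated by the spec.
    rule_text = rule if rule is not None else "<unknown rule>"
    lines.append(f"{error} (rule: {rule_text}, step_index: {step_index})")
    return "\n".join(lines)
```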
---
### 7.5 Point of origin and replay integration
The traceback MUST clearly identify the point of origin of the failure by including the source location and failing step index. When available, implementations SHOULD also identify the exact token or sub-expression that triggered the error and SHOULD include a small surrounding source excerpt.
The disassembler or replay tooling MUST support jumping from a traceback frame to the referenced state-log entry and source location. Because the state log records I/O and nondeterministic choices explicitly, the failing step MUST be replayable exactly from the seed configuration together with the recorded inputs.
---
## 8. Interpreter use
This section defines how the reference interpreter accepts source input, loads extensions, exposes diagnostics flags, and operates in interactive mode.
---
### 8.1 Invocation and program source
The interpreter MUST accept a single program argument. Without `-source`, that argument MUST be interpreted as a path to a Prefix source file. With `-source`, the same argument MUST instead be interpreted as source text and parsed directly without reading a file.
If no program argument and no `-source` text are supplied, the interpreter MUST enter REPL mode. When the only supplied positional arguments are extensions, the interpreter MUST load those extensions and then enter the REPL.
---
### 8.2 Extensions and `EXTEND`
The interpreter MAY accept zero or more compiled extension-library arguments before the program argument. Those libraries MUST be loaded before parsing so that extension-defined operators and hooks are available during parse and execution.
A compiled extension MUST be a platform-specific dynamic library (`.dll` on Windows, `.so` on Unix-like systems, and `.dylib` on macOS) and MUST export a public initialization function named `prefix_extension_init`. The interpreter MUST call that function with the extension context defined in `prefix_extension.h`, which provides the API version constant together with registration callbacks for operators, custom types, event handlers, periodic hooks, and optional REPL replacement.
Compiled extensions MUST be built against the public extension API declared in `prefix_extension.h`. That API MUST expose at least `PREFIX_EXTENSION_API_VERSION`, operator-registration callbacks, custom-type hooks, event-handler registration, periodic-hook registration, REPL-replacement registration, and helper facilities for constructing and inspecting Prefix runtime values.
At load time, the interpreter MUST use the host platform's dynamic-library facilities to load the extension, locate the exported `prefix_extension_init` symbol, invoke it with the extension context, and record all registered operators and hooks. On Windows this SHOULD use `LoadLibraryEx`; on Unix-like systems this SHOULD use `dlopen` or an equivalent facility.
Extensions that opt into module-qualified registration MUST do so with a dedicated operator-registration flag. Only operators registered with that flag MAY be exposed under the extension name as a dotted prefix rather than injected into the global built-in namespace. `PREFIX_EXTENSION_ASMODULE` by itself MUST NOT imply module-qualified exposure. Event-handler registration MUST support the runtime lifecycle hooks defined by the interpreter, including program start and end, error reporting, statement boundaries, and call boundaries.
The extension context MUST provide registration surfaces equivalent to the following:
- `register_operator(name, handler_fn, flags)` to add new callable operators. The `flags` parameter MUST support `PREFIX_EXTENSION_ASMODULE` together with a separate module-restriction option equivalent to `PREFIX_EXTENSION_MODULE_RESTRICTED`. An operator MUST be exposed under the extension name as a dotted prefix such as `mymod.FOO` only when that module-restriction flag is present.
- `register_periodic_hook(N, handler_fn)` to request execution after rewrite steps whose `step_index % N == 0`.
- `register_event_handler(event_name, handler_fn)` for lifecycle events including `program_start`, `program_end`, `on_error`, `before_statement`, `after_statement`, `before_call`, and `after_call`.
- `register_repl_handler(repl_fn)` to replace or augment the default REPL implementation.
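The registration surfaces listed above can be modelled in a few lines. The real API is the C header `prefix_extension.h`; the Python class below is only a behavioural sketch, and the numeric flag values, the `ExtensionContext` class, and the `after_step` helper are assumptions made for illustration.

```python
# Hypothetical flag values; the real constants live in prefix_extension.h.
PREFIX_EXTENSION_ASMODULE = 0x1
PREFIX_EXTENSION_MODULE_RESTRICTED = 0x2

LIFECYCLE_EVENTS = {
    "program_start", "program_end", "on_error",
    "before_statement", "after_statement", "before_call", "after_call",
}


class ExtensionContext:
    def __init__(self, extension_name):
        self.extension_name = extension_name
        self.operators = {}
        self.periodic_hooks = []
        self.event_handlers = {}

    def register_operator(self, name, handler_fn, flags=0):
        # Dotted exposure only with the module-restriction flag;
        # PREFIX_EXTENSION_ASMODULE alone leaves the name global.
        if flags & PREFIX_EXTENSION_MODULE_RESTRICTED:
            exposed = f"{self.extension_name}.{name}"
        else:
            exposed = name
        self.operators[exposed] = handler_fn
        return exposed

    def register_periodic_hook(self, n, handler_fn):
        if n <= 0:
            raise ValueError("N must be positive")
        self.periodic_hooks.append((n, handler_fn))

    def register_event_handler(self, event_name, handler_fn):
        if event_name not in LIFECYCLE_EVENTS:
            raise ValueError(f"unknown lifecycle event: {event_name}")
        self.event_handlers.setdefault(event_name, []).append(handler_fn)

    def after_step(self, step_index):
        """Called by the interpreter model after each rewrite step."""
        for n, handler_fn in self.periodic_hooks:
            if step_index % n == 0:
                handler_fn(step_index)
```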
The interpreter MUST NOT load `.prex` pointer files.
Runtime extension loading from Prefix source MUST use `EXTEND(EXTENSION ext)`, defined in [9.1.8](#918-function-and-module-operators). The `ext` specifier MUST exclude the platform filename suffix (`.dll`, `.so`, `.dylib`) and MAY use package semantics with `..`. When `ext` names a package, the loader MUST attempt `ext..init`.
`EXTEND` MUST resolve extension libraries using the same extension search roots used by compiled-library loading: the calling module directory when available, then the current working directory, then the interpreter's `ext/std`, `ext/usr`, `lib/std`, and `lib/usr` roots, with bundled roots consulted before user roots.
---
### 8.3 Flags, diagnostics, and exit codes
With `-verbose`, tracebacks MUST include the environment snapshots and state-log references required by [7](#7-tracebacks-and-error-handling). With `-private`, the interpreter MUST disable the state logger and suppress environment snapshots while still emitting a traceback that clearly states that snapshot data has been withheld.
The interpreter MAY support additional flags and alternative argument ordering, but the semantics of the program argument, `-source`, `-verbose`, and `-private` MUST remain stable for tooling.
Normal program termination MUST return process exit code `0`. Uncaught runtime failure MUST return exit code `1`. Any explicit process-termination request issued by the program MUST terminate the interpreter immediately and return the supplied integer as the process exit code.
---
### 8.4 REPL behaviour
The REPL MUST execute Prefix statements using the same parser, runtime, built-ins, and state-logging semantics as file-mode execution. It MUST present a primary prompt for new input and a continuation prompt while the user is entering a multi-line construct.
A complete single-line top-level statement MUST be parsed and executed immediately. If the user begins a multi-line block, the REPL MUST buffer input until the block is complete and then execute the collected text as a unit.
Top-level bindings and the state log MUST persist for the lifetime of the REPL session unless the user explicitly deletes symbols. Errors raised in the REPL MUST produce tracebacks using the same concise and verbose formats as file execution.
The REPL MUST support immediate termination through the language's normal process-termination mechanism. Implementations MAY additionally provide meta-commands such as `.exit` or an EOF shortcut.
---
## 9. Built-ins
---
### 9.1 Operators
Built-in operators are pre-defined callable entities provided by the interpreter. They share the same uniform function-call syntax as user-defined functions: `NAME(arg1, arg2, ..., argN)`. The argument list MAY be empty for nullary operators. Arguments MUST be separated by commas; spaces MAY appear freely around commas and parentheses. Each operator has a fixed or variable arity; supplying the wrong number of arguments MUST raise a runtime error.
The subsections below use a type-annotation convention. A type annotation is written as `TYPE name`, where `TYPE` is followed by one or more space characters and then the annotated name. A union notation such as `INT|FLT` restricts an argument to any one of the listed types. `ANY` denotes any runtime type (`BOOL`, `INT`, `FLT`, `STR`, `TNS`, `MAP`, `FUNC`, or `THR`), including extension-defined types when extensions are active, unless the signature narrows the set explicitly. `SYMBOL` is a pseudo-type indicating that the argument MUST be a plain unquoted identifier; such operators receive the symbol name rather than its runtime value.
`MODULE` is a pseudo-type indicating that the argument MUST be a plain unquoted module identifier or a package-qualified module name using the language's `..` separator. A slash-separated signature such as `ADD/SUB/MUL` denotes a family of distinct operators that share the same argument rules and differ only in the named operation.
`EXTENSION` is a pseudo-type indicating that the argument MUST be a plain unquoted extension specifier used by `EXTEND`, excluding the platform filename suffix and optionally using package-style `..` separators.
`INT` and `FLT` are not interoperable. Unless an operator explicitly permits or requires type mixing, all numeric operands MUST share the same numeric type; supplying mismatched types MUST raise a runtime error. The numeric base of an operation's result MUST be the highest base present among its numeric operands.
Built-in operator names MUST be matched case-sensitively and MUST be written in their canonical form. A user-defined function MUST NOT share a name with any built-in operator; such a conflict MUST raise a runtime error. Extensions MAY register additional operators, which are dispatched through the same call syntax and MAY be qualified with a dotted extension-name prefix. The names of extension-registered operators MUST NOT conflict with those of built-in operators.
---
#### 9.1.1 Symbol and type operators
- `BOOL DEL(SYMBOL name)` = MUST delete the readable runtime binding designated by `name`. Deletion MUST preserve the symbol's recorded static type as defined in [5.1](#51-declarations-bindings-and-lifetime). Deleting an undeclared, never-assigned, or already-deleted symbol MUST raise a runtime error. Deleting a frozen or permanently frozen binding MUST raise a runtime error. On success, `DEL` MUST return `FALSE`.
- `BOOL DEL(target<key>)` or `BOOL DEL(target<k1, k2, ...>)` = MUST delete the map entry identified by the final key in the chain of map indices. `target` MUST resolve to a `MAP` value and all intermediate lookups MUST resolve to nested `MAP` values; indexing a non-`MAP` MUST raise a runtime error. If an intermediate key does not exist, the operation MUST be a no-op. After deletion the modified map MUST be written back to the environment. `DEL` in indexed form MUST return `FALSE` whether it deleted an entry or completed as a no-op.
- `ANY ASSIGN(target, ANY value)` = MUST evaluate `value`, assign it to `target`, and return the assigned value. `target` MAY be a plain identifier, a tensor indexed target, or a map indexed target. If a plain identifier has not yet been declared, the typed form `ASSIGN(TYPE name, value)` MUST be used. The typed form MUST NOT be used with indexed targets.
- `BOOL EXIST(SYMBOL name)` = MUST return `TRUE` if `name` resolves to a visible readable binding in the current lexical environment chain and `FALSE` otherwise.
- `BOOL ISBOOL/ISINT/ISFLT/ISSTR/ISTNS/ISMAP/ISFUNC/ISTHR(ANY value)` = MUST return `TRUE` if the runtime type of `value` is respectively `BOOL`, `INT`, `FLT`, `STR`, `TNS`, `MAP`, `FUNC`, or `THR`, and `FALSE` otherwise.
- `STR TYPE(ANY value)` = MUST return the runtime type name of `value` as a `STR`. For core language values, the result MUST be one of `BOOL`, `INT`, `FLT`, `STR`, `TNS`, `MAP`, `FUNC`, or `THR`. For null or otherwise unrecognized internal values, the result MUST be `NULL`. Extension-defined values MUST report their registered type names.
- `STR SIGNATURE(SYMBOL name)` = MUST return a textual signature for `name`. If `name` denotes a user-defined function, the result MUST use the canonical function-signature form of this specification. For any other visible binding, the result MUST be `TYPE name`.
- `BOOL FREEZE(SYMBOL name)`, `BOOL THAW(SYMBOL name)`, and `BOOL PERMAFREEZE(SYMBOL name)` = MUST modify the mutability state of the binding designated by `name`. `FREEZE` MUST prevent reassignment and deletion until thawed. `PERMAFREEZE` MUST permanently prevent reassignment, deletion, and later thawing. `THAW` MUST clear a non-permanent freeze, MUST raise a runtime error when applied to a permanently frozen binding, and MUST otherwise succeed as a no-op when applied to a binding that is not currently frozen. `FREEZE`, `THAW`, and `PERMAFREEZE` MUST each return `FALSE` on success and MUST raise a runtime error if `name` is undefined.
- `BOOL FROZEN(SYMBOL name)` and `BOOL PERMAFROZEN(SYMBOL name)` = MUST report the freeze state of `name`. `FROZEN` MUST return `TRUE` for any frozen or permanently frozen binding and `FALSE` otherwise. `PERMAFROZEN` MUST return `TRUE` only for permanently frozen bindings and `FALSE` otherwise. If `name` is undefined, both operators MUST return `FALSE`.
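The freeze-state rules in the last two entries form a small state machine, sketched below in Python. The `SymbolStates` class is hypothetical, and one detail is an assumption: the text does not say what `FREEZE` does to a permanently frozen binding, so this sketch leaves such a binding permanently frozen.

```python
class PrefixRuntimeError(Exception):
    """Stand-in for a Prefix runtime error."""


THAWED, FROZEN, PERMAFROZEN = "thawed", "frozen", "permafrozen"


class SymbolStates:
    def __init__(self):
        self.state = {}

    def declare(self, name):
        self.state[name] = THAWED

    def _require(self, name):
        if name not in self.state:
            raise PrefixRuntimeError(f"undefined symbol: {name}")

    def freeze(self, name):
        self._require(name)
        if self.state[name] != PERMAFROZEN:  # assumption: never downgrade
            self.state[name] = FROZEN
        return False  # FREEZE returns FALSE on success

    def thaw(self, name):
        self._require(name)
        if self.state[name] == PERMAFROZEN:
            raise PrefixRuntimeError(f"{name} is permanently frozen")
        self.state[name] = THAWED  # no-op when not currently frozen
        return False

    def permafreeze(self, name):
        self._require(name)
        self.state[name] = PERMAFROZEN
        return False

    def frozen(self, name):
        # Undefined names report FALSE rather than raising.
        return self.state.get(name) in (FROZEN, PERMAFROZEN)

    def permafrozen(self, name):
        return self.state.get(name) == PERMAFROZEN
```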
---
#### 9.1.2 Conversion and construction operators
- `BOOL BOOL(ANY value)`, `INT INT(BOOL|INT|FLT|STR value)`, and `FLT FLT(BOOL|INT|FLT|STR value)` = MUST perform explicit scalar conversion. `BOOL(value)` MUST apply the language's truthiness rules and return a `BOOL`. Converting a `FLT` to `INT` MUST truncate toward zero. Converting `BOOL` to `INT` or `FLT` MUST map `TRUE` to `1` or `1.0` and `FALSE` to `0` or `0.0`. Converting a `STR` to `INT` MUST parse a base-prefixed integer literal, and if parsing fails the result MUST follow the string's boolean representation. Converting a `STR` to `FLT` MUST parse a base-prefixed floating-point literal or the special values `INF`, `-INF`, and `NaN`; invalid input MUST raise a runtime error. Passing any other type MUST raise a runtime error.
- `STR STR(BOOL|STR|INT|FLT value)` = MUST convert `value` to a `STR`. For `BOOL`, the result MUST be `TRUE` or `FALSE`. For `INT` and `FLT`, the result MUST be the base-prefixed numeric spelling, except that `INF`, `-INF`, and `NaN` render without a prefix. For `STR`, the result MUST be a copy.
- `INT|FLT CONVERT(INT|FLT value, INT base)` and `INT BASE(INT|FLT value)` = MUST expose numeric-base manipulation. `CONVERT` MUST return `value` represented in `base`, where `base` MUST be between `2` and `64`, inclusive. `BASE` MUST return the stored base of a numeric value. Asking for an invalid base MUST raise a runtime error.
- `TNS BYTES(INT value, STR endian = "big")` = MUST convert a non-negative integer to a one-dimensional tensor of byte-sized `INT` elements in the range `0` through `0xFF`. `endian` MUST be either `"big"` or `"little"`. `BYTES(0)` MUST return a single zero byte. A negative integer or invalid endianness MUST raise a runtime error.
- `ANY COPY(ANY value)` and `ANY DEEPCOPY(ANY value)` = MUST return, respectively, a shallow copy and a deep copy of `value`. For primitive scalar values, both operations MAY return an equivalent scalar value. For `TNS` and `MAP`, `COPY` MUST duplicate only the outer container, while `DEEPCOPY` MUST recursively duplicate nested containers so that the result shares no mutable container objects with the input.
- `TNS TINT/TFLT/TSTR(TNS tensor)` = MUST convert each scalar element of `tensor` elementwise to `INT`, `FLT`, or `STR`, respectively. If any element cannot be converted, the operator MUST raise a runtime error.
- `FLT ROUND(FLT value, STR mode = "floor", INT ndigits = 0)` = MUST round `value` to `ndigits` places to the right of the radix point, where negative `ndigits` values round to positions to the left of the radix point. Supported modes MUST include `"floor"`, `"ceiling"` (also spelled `"ceil"`), `"zero"`, and `"logical"` (also spelled `"half-up"`). When exactly two arguments are supplied and the second argument is an `INT`, it MUST be interpreted as `ndigits` and the mode MUST default to `"floor"`.
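As a concrete illustration of one entry above, the `BYTES` rules map closely onto Python's `int.to_bytes`. The function name `prefix_bytes` is hypothetical, a Python list stands in for a one-dimensional `TNS`, and `ValueError` stands in for a Prefix runtime error.

```python
def prefix_bytes(value, endian="big"):
    """Model of BYTES: minimal-length byte list for a non-negative integer."""
    if value < 0:
        raise ValueError("BYTES requires a non-negative integer")
    if endian not in ("big", "little"):
        raise ValueError('endian must be "big" or "little"')
    # At least one byte, so that zero yields a single zero byte.
    length = max(1, (value.bit_length() + 7) // 8)
    return list(value.to_bytes(length, endian))
```

For example, `prefix_bytes(0x1234)` gives `[0x12, 0x34]`, and the little-endian form gives `[0x34, 0x12]`.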
---
#### 9.1.3 Arithmetic operators
- `INT|FLT ADD/SUB/MUL/DIV/CDIV/POW/MOD(INT|FLT a, INT|FLT b)` = MUST implement, respectively, addition, subtraction, multiplication, division, ceiling division, exponentiation, and remainder. Except where an operator explicitly states otherwise, `a` and `b` MUST have the same numeric type. Division by zero MUST raise a runtime error.
- `INT|FLT NEG/ABS(INT|FLT value)` = MUST return the additive inverse or absolute value of `value`, respectively.
- `INT|FLT GCD/LCM(INT|FLT a, INT|FLT b)` = MUST compute the greatest common divisor or least common multiple of `a` and `b`. Mixed `INT` and `FLT` operands MUST be rejected.
- `INT|FLT ROOT(INT|FLT x, INT|FLT n)` = MUST compute the `n`th root of `x`. For `INT` operands, positive `n` MUST produce the greatest integer `r` such that `r^n <= x` when that notion is defined; `n = 0` MUST raise a runtime error; and negative-`n` integer results are valid only where the reciprocal remains an integer. For `FLT` operands, the result MUST be `x^(1/n)`, with negative `x` permitted only when `n` denotes an odd integer. Invalid roots MUST raise a runtime error.
- `INT IADD/ISUB/IMUL/IDIV/IPOW/IROOT(INT|FLT a, INT|FLT b)` and `FLT FADD/FSUB/FMUL/FDIV/FPOW/FROOT(INT|FLT a, INT|FLT b)` = MUST first coerce both operands to the target numeric type and then perform the corresponding arithmetic operation. Failed coercion MUST raise a runtime error.
- `INT|FLT SUM/PROD(INT|FLT a1, ..., INT|FLT aN)`, `INT ISUM/IPROD(INT|FLT a1, ..., INT|FLT aN)`, and `FLT FSUM/FPROD(INT|FLT a1, ..., INT|FLT aN)` = MUST compute the corresponding aggregate sum or product across all arguments. The unprefixed forms MUST reject mixed `INT` and `FLT` operands.
- `INT|FLT|STR MAX/MIN(INT|FLT|STR a1, ..., INT|FLT|STR aN)` = MUST return the numeric maximum or minimum for numeric arguments and the longest or shortest element for string arguments. Mixing runtime types MUST raise a runtime error.
- `INT|FLT|STR MAX/MIN(TNS t1, ..., TNS tN)` = MUST flatten the supplied tensors and apply the same maximum or minimum rules to their scalar elements. Every encountered element MUST be scalar, and all encountered scalar elements MUST have the same runtime type.
- `INT|FLT LOG(INT|FLT value)` and `INT CLOG(INT value)` = MUST compute, respectively, floor base-2 logarithm and ceiling base-2 logarithm. The argument MUST be strictly positive.
- `INT ILEN(INT value)` = MUST return the length of the absolute value of `value` in binary digits. `ILEN(0)` MUST return `1`.
- `INT LEN(INT|STR a1, ..., INT|STR aN)` = MUST return the number of supplied arguments. Passing a `TNS` or any other unsupported type MUST raise a runtime error.
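Several of the integer rules above have compact exact-arithmetic formulations. The Python sketch below covers only the `INT` cases of `CDIV`, `LOG`, `CLOG`, `ILEN`, and `ROOT` (restricted to non-negative `x` and positive `n`); the function names are hypothetical and Python exceptions stand in for Prefix runtime errors.

```python
def cdiv(a, b):
    """Ceiling division on integers."""
    if b == 0:
        raise ZeroDivisionError("division by zero")
    return -(-a // b)


def floor_log2(value):
    """Floor of the base-2 logarithm; argument must be strictly positive."""
    if value <= 0:
        raise ValueError("argument must be strictly positive")
    return value.bit_length() - 1


def ceil_log2(value):
    """Ceiling of the base-2 logarithm; argument must be strictly positive."""
    if value <= 0:
        raise ValueError("argument must be strictly positive")
    return (value - 1).bit_length()


def ilen(value):
    """Number of binary digits in abs(value); zero counts as one digit."""
    return max(1, abs(value).bit_length())


def iroot(x, n):
    """Greatest integer r with r**n <= x (sketch covers x >= 0, n > 0 only)."""
    if n <= 0 or x < 0:
        raise ValueError("sketch covers x >= 0 and n > 0 only")
    lo, hi = 0, max(1, x)
    while lo < hi:  # binary search for the largest r that satisfies the bound
        mid = (lo + hi + 1) // 2
        if mid ** n <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo
```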
---
#### 9.1.4 Logical and comparison operators
- `BOOL AND/OR/XOR(ANY a1, ..., ANY aN)` and `BOOL NOT(ANY value)` = MUST perform boolean conjunction, disjunction, exclusive disjunction, and negation using the language's truthiness rules.
- `BOOL ALL(ANY a1, ..., ANY aN)` and `BOOL ANY(ANY a1, ..., ANY aN)` = MUST compute boolean conjunction and disjunction across the supplied arguments and return the result as `BOOL`.
- `BOOL BOOL(ANY value)` = MUST return the truthiness of `value` as `FALSE` or `TRUE` according to the boolean rules of the value's runtime type.
- `INT BAND/BOR/BXOR(INT a, INT b)`, `INT BNOT(INT value)`, and `INT SHL/SHR(INT a, INT b)` = MUST perform bitwise boolean operations and bit shifting on integer operands. These operators MUST reject any `INT` operand whose radix is not 2.
- `BOOL EQ/NEQ(ANY a, ANY b)` = MUST test equality or inequality using the language's normal value-comparison rules.
- `BOOL GT/LT/GTE/LTE(INT|FLT a, INT|FLT b)` = MUST perform the corresponding ordered comparison on like-typed numeric operands and return `TRUE` when the relation holds and `FALSE` otherwise. Mixed `INT` and `FLT` comparisons MUST raise a runtime error.
---
#### 9.1.5 String operators
- `INT SLEN(STR value)` = MUST return the character length of `value`.
- `STR UPPER/LOWER(STR value)` = MUST return the uppercase or lowercase form of `value`, respectively.
- `INT|STR FLIP(INT|STR value)` = MUST reverse `value`. For `STR`, the result MUST be the character-reversed string. For `INT`, the result MUST reverse the integer's binary-digit spelling while preserving the sign.
- `INT|STR SLICE(INT|STR value, INT start, INT end)` = MUST slice either an integer or a string. For `STR`, the operator MUST return the inclusive character slice from `start` to `end`, counting from `1`, with negative indices counting from the end. For `INT`, the operator MUST return the corresponding inclusive slice of the integer's digit or bit representation according to the implementation's integer-slicing rules.
- `INT|STR JOIN(INT|STR a1, ..., INT|STR aN)` = MUST concatenate all arguments. When all arguments are `STR`, the result MUST be their direct concatenation. When all arguments are `INT`, the result MUST be an integer formed by concatenating their binary-digit spellings. Mixing positive and negative `INT` values or mixing `INT` and `STR` MUST raise a runtime error.
- `TNS SPLIT(STR value, STR delimiter = " ")` = MUST split `value` on the exact substring `delimiter` and return the parts as a one-dimensional tensor of `STR`. `delimiter` MUST be non-empty. Consecutive delimiters and trailing delimiters MUST produce empty-string elements.
- `STR STRIP(STR value, STR remove)` = MUST return `value` with every occurrence of the non-empty substring `remove` removed.
- `STR REPLACE(STR value, STR old, STR new)` = MUST return `value` with every occurrence of the non-empty substring `old` replaced by `new`.
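The indexing conventions in this list are easy to get wrong, so the Python sketch below models three of them: the one-based inclusive `SLICE` on strings, `SPLIT`, and `FLIP` on integers. Function names are hypothetical. The text does not say how out-of-range or reversed slice bounds are handled; raising an error for them is an assumption of this sketch.

```python
def str_slice(value, start, end):
    """Inclusive, one-based slice; negative indices count from the end."""
    n = len(value)

    def normalize(i):
        return n + i + 1 if i < 0 else i

    lo, hi = normalize(start), normalize(end)
    if not (1 <= lo <= hi <= n):
        # Assumption: the spec does not define this case.
        raise ValueError("slice bounds out of range")
    return value[lo - 1:hi]


def split(value, delimiter=" "):
    """Split on an exact substring, keeping empty parts."""
    if delimiter == "":
        raise ValueError("delimiter must be non-empty")
    return value.split(delimiter)


def flip_int(value):
    """Reverse the binary digits of abs(value), preserving the sign."""
    sign = -1 if value < 0 else 1
    return sign * int(bin(abs(value))[2:][::-1], 2)
```

With these definitions, `str_slice("prefix", 2, 4)` yields `"ref"` and `str_slice("prefix", -3, -1)` yields `"fix"`.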
---
#### 9.1.6 Tensor operators
- `TNS TNS(TNS shape, ANY value)` and `TNS TNS(STR value)` = MUST construct tensors. In the shape form, `shape` MUST be a one-dimensional tensor of positive `INT` lengths and the result MUST be filled with `value`. In the string form, the result MUST be a one-dimensional tensor of one-character `STR` elements.
- `TNS SHAPE(TNS tensor)` and `INT TLEN(TNS tensor, INT dim)` = MUST expose tensor shape information. `SHAPE` MUST return a one-dimensional tensor of dimension lengths, and `TLEN` MUST return the length of the specified one-based dimension. An out-of-range dimension MUST raise a runtime error.
- `TNS TFLIP(TNS tensor, INT dim)` = MUST return a new tensor whose elements are reversed along the one-based dimension `dim`.
- `TNS SCAT(TNS src, TNS dst, TNS ind)` = MUST return a copy of `dst` with a rectangular slice replaced by `src`. `ind` MUST encode one inclusive `[lo, hi]` range pair per destination dimension. The selected slice shape MUST exactly match the shape of `src`, and out-of-range indices MUST raise a runtime error.
- `TNS APPEND(ANY elem, TNS tns)` = MUST return a new one-dimensional tensor equal to `tns` with `elem` appended as the last element. If `tns` is not one-dimensional the operator MUST raise a runtime error. Prefix `TNS` values MAY contain heterogeneous element types; `APPEND` therefore MUST accept any element type and simply append it as the last element.
- `TNS FILL(TNS tensor, ANY value)` = MUST return a new tensor with the same shape as `tensor`, filled with `value`. `value` MUST satisfy the target tensor's element-type constraints.
- `TNS CONV(TNS x, TNS kernel, INT stride_w = 1, INT stride_h = 1, INT pad_w = 0, INT pad_h = 0, TNS bias = [])` = MUST support both the legacy two-argument N-dimensional convolution form and the extended 2-D multi-output form. In the legacy form, `kernel` MUST have the same rank as `x`, every kernel dimension length MUST be odd, boundary sampling MUST clamp to the nearest valid index, and the result MUST have the same shape as `x`. In the extended form, when any keyword argument is supplied and `x` is rank 3 while `kernel` is rank 4 with shape `[kw, kh, in_c, out_c]`, the operator MUST perform 2-D convolution with the given strides, explicit zero padding, and optional per-output-channel bias, returning shape `[out_w, out_h, out_c]`. Where both inputs are `INT`, the output MUST be `INT`; otherwise it MUST be `FLT`.
- `BOOL IN(ANY value, TNS tensor)` = MUST return `TRUE` if any tensor element is equal to `value` and `FALSE` otherwise.
- `TNS MADD/MSUB/MMUL/MDIV(TNS x, TNS y)` = MUST perform elementwise tensor-tensor arithmetic. The input shapes MUST match exactly, the element types MUST be uniformly numeric and mutually compatible, and division by zero MUST raise a runtime error.
- `TNS MSUM/MPROD(TNS t1, ..., TNS tN)` = MUST perform elementwise sum or product across tensors of identical shape and mutually compatible numeric element type.
- `TNS TADD/TSUB/TMUL/TDIV/TPOW(TNS tensor, INT|FLT scalar)` = MUST perform tensor-scalar arithmetic. All tensor elements and the scalar MUST share the same numeric type, except where a specific operator explicitly permits widening. Division by zero MUST raise a runtime error.
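As a rough illustration of the `SCAT` rule, the following Python sketch handles the two-dimensional case only. It assumes tensors are represented as rectangular nested lists and that the `[lo, hi]` pairs in `ind` are one-based, like the dimension arguments of `TLEN` and `TFLIP`; neither assumption is stated by this subsection, and `scat_2d` is a hypothetical name.

```python
def scat_2d(src, dst, ind):
    """Return a copy of dst with the slice selected by ind replaced by src.

    ind is [[row_lo, row_hi], [col_lo, col_hi]], inclusive and one-based
    (assumed). dst itself is never modified.
    """
    rows, cols = len(dst), len(dst[0])
    (r_lo, r_hi), (c_lo, c_hi) = ind
    if not (1 <= r_lo <= r_hi <= rows and 1 <= c_lo <= c_hi <= cols):
        raise IndexError("SCAT: index out of range")
    height, width = r_hi - r_lo + 1, c_hi - c_lo + 1
    if len(src) != height or any(len(row) != width for row in src):
        raise ValueError("SCAT: slice shape must match the shape of src")
    out = [list(row) for row in dst]  # copy so the caller's dst is untouched
    for offset, row in enumerate(src):
        out[r_lo - 1 + offset][c_lo - 1:c_hi] = row
    return out
```

With a 3x3 zero tensor as `dst`, `src = [[1, 2], [3, 4]]`, and `ind = [[2, 3], [1, 2]]`, the sketch returns `[[0, 0, 0], [1, 2, 0], [3, 4, 0]]`.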
---
#### 9.1.7 Map operators
- `TNS KEYS(MAP map)` and `TNS VALUES(MAP map)` = MUST return one-dimensional tensors containing, respectively, the keys and values of `map` in insertion order.
- `BOOL KEYIN(INT|FLT|STR key, MAP map)` and `BOOL VALUEIN(ANY value, MAP map)` = MUST return `TRUE` when the given key or value occurs in `map` and `FALSE` otherwise.
- `BOOL MATCH(MAP map, MAP template, INT typing = 0, INT recurse = 0, INT shape = 0)` = MUST return `TRUE` if every key in `template` is present in `map`. When `typing` is non-zero, the corresponding stored value types MUST also match. When `shape` is non-zero, corresponding tensor values MUST additionally have identical shapes. When `recurse` is non-zero, the same rules MUST be applied recursively to nested maps. Any failed condition MUST produce `FALSE` rather than raising an error.
- `MAP INV(MAP map)` = MUST return a new map in which keys and values are swapped: each value of `map` becomes a key, and its original key becomes the associated value. Every value in `map` MUST be a scalar key type (`INT`, `FLT`, or `STR`), and duplicate values MUST raise a runtime error.
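The `MATCH` and `INV` rules can be sketched with Python dictionaries standing in for `MAP` values and nested lists standing in for `TNS` values. The helper names are hypothetical, and comparing Python types is only an approximation of comparing Prefix value types.

```python
def tensor_shape(tensor):
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> [2, 2]."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0] if tensor else None
    return shape


def prefix_match(mapping, template, typing=0, recurse=0, shape=0):
    """Model MATCH: every failed condition yields False, never an error."""
    for key, t_val in template.items():
        if key not in mapping:
            return False
        m_val = mapping[key]
        if typing and type(m_val) is not type(t_val):
            return False
        if shape and isinstance(m_val, list) and isinstance(t_val, list):
            if tensor_shape(m_val) != tensor_shape(t_val):
                return False
        if recurse and isinstance(m_val, dict) and isinstance(t_val, dict):
            if not prefix_match(m_val, t_val, typing, recurse, shape):
                return False
    return True


def prefix_inv(mapping):
    """Model INV: swap keys and values, rejecting non-scalar or duplicate values."""
    inverted = {}
    for key, value in mapping.items():
        if isinstance(value, bool) or not isinstance(value, (int, float, str)):
            raise TypeError("INV: values must be INT, FLT, or STR")
        if value in inverted:
            raise ValueError("INV: duplicate value")
        inverted[value] = key
    return inverted
```

Note that without `recurse`, nested maps are compared only by the presence of their key, so `prefix_match({"m": {"x": 1}}, {"m": {"y": 1}})` is true while the same call with `recurse=1` is false.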
---
#### 9.1.8 Function and module operators
- `RUN(STR source)` = MUST parse and execute `source` as Prefix code in the caller's current lexical environment. Parse failure or runtime failure inside the executed source MUST raise a runtime error.
- `BOOL ASSERT(ANY value)` = MUST raise a runtime error if `value` is falsey according to the language's truthiness rules. On success, it MUST return `TRUE`.
- `BOOL REFUTE(~BOOL cond)` = MUST raise a runtime error if `cond` is truthy according to the language's truthiness rules. On success, it MUST return `TRUE`.
- `THROW()` or `THROW(INT|STR a1, ..., INT|STR aN)` = MUST raise a runtime error. When one or more arguments are supplied, the error message MUST be formed by concatenating those arguments using the same rendering rules as `PRINT`, but without a trailing newline. When no arguments are supplied, the error message MUST be `"Exception thrown"`.
- `BOOL MAIN()` = MUST return `TRUE` when the call site is in the primary program source and `FALSE` when the call site is executing from imported module code. The result MUST depend on the source location of the call expression rather than on the dynamic caller stack.
- `BOOL EXTEND(EXTENSION ext)` = MUST load the compiled extension designated by `ext`. The specifier `ext` MUST exclude the platform filename suffix and MAY use package-style `..` separators. If `ext` resolves to a package, the loader MUST attempt `ext..init`. Relative extension names MUST resolve using the calling module's source directory when available, then the current working directory, then the configured extension roots `ext/std`, `ext/usr`, `lib/std`, and `lib/usr` (with bundled roots searched before user roots). Repeating an `EXTEND` request for an already-loaded extension exposure MUST be idempotent. On success, `EXTEND` MUST return `FALSE`.
Operators registered with the module-restriction flag (for example `PREFIX_EXTENSION_MODULE_RESTRICTED`) MUST be exposed only under the extending module's namespace. Importing that module MUST expose the extension namespace qualifier under the importing module's qualified name. `PREFIX_EXTENSION_ASMODULE` by itself MUST NOT restrict an operator to the extending module's namespace or cause that qualified exposure. Extension side effects outside operator registration (for example process-global hooks or host-side state) remain global.
- `BOOL IMPORT(MODULE name)` or `BOOL IMPORT(MODULE name, SYMBOL alias)` = MUST load the named module, execute it in its own top-level environment on first import, cache that environment, and expose its bindings under the module name or the supplied alias. The implementation MUST search the referring source directory first, then bundled library locations as described in [8.2](#82-extensions-and-extend), [10](#10-standard-library), and the language's module-search rules. Re-importing an already loaded module MUST reuse the cached module namespace rather than re-executing the module.
Prefix package namespaces MUST use `..` as the package separator. The canonical form is `package..subpackage..module`. When `IMPORT(pkg)` is used and a package directory named `pkg` exists, the interpreter MUST prefer package resolution and attempt to load `pkg/init.pre`. If that package directory exists but contains no `init.pre`, the import MUST raise a runtime error. When `IMPORT(pkg..mod)` is used, the interpreter MUST resolve to `pkg/mod.pre`. If both a package directory and a same-named module file exist in the same search location, the package MUST take precedence.
When the referring source was itself loaded from a file, the search MUST begin in that source file's directory. When the referring source is executing via `-source`, the primary search directory MUST be the current working directory. After local search, the interpreter MUST consult `lib/std/` immediately before `lib/usr/`.
On first import, the module source MUST execute in its own isolated top-level environment. Unqualified identifiers during that execution MUST resolve within the module's own namespace. After execution completes, that namespace MUST be cached and reused by subsequent imports in the same interpreter process. Multiple aliases for the same module MUST provide distinct qualified views into the same cached module instance.
Module import MUST NOT implicitly load companion extension pointer files. Runtime extensions for module code MUST be loaded explicitly via `EXTEND` inside the module. On success, `IMPORT` MUST return `FALSE`.
- `BOOL IMPORT_PATH(STR path)` or `BOOL IMPORT_PATH(STR path, SYMBOL alias)` = MUST load a module from an explicit filesystem path and expose it under `alias` or, if omitted, under a basename-derived module name. The argument MUST at minimum accept an absolute path to a `.pre` source file. The module's basename, excluding the `.pre` suffix, MUST be used as the default qualified module name when no alias is supplied. The loaded module MUST otherwise obey the same isolation, caching, and exposure rules as `IMPORT`. Runtime extensions for that module MUST be loaded explicitly via `EXTEND`. On success, `IMPORT_PATH` MUST return `FALSE`.
- `BOOL EXPORT(SYMBOL symbol, MODULE module)` = MUST copy the caller's current binding for `symbol` into the namespace of the already imported module designated by `module`, making the exported binding available through that module's qualified namespace. `EXPORT` MUST return `FALSE` on success and MUST raise a runtime error if `module` is not currently imported.
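The package-resolution and search-order rules for `IMPORT` can be sketched as follows. This is a simplified model under stated assumptions: `install_root` is a hypothetical parameter naming the directory that contains `lib/std` and `lib/usr`, and caching, aliasing, and execution of the module are not modelled.

```python
from pathlib import Path


def search_order(referring_file, cwd, install_root):
    """Local directory first, then lib/std, then lib/usr.

    referring_file is None when the program was supplied via -source.
    """
    local = Path(referring_file).parent if referring_file is not None else Path(cwd)
    root = Path(install_root)
    return [local, root / "lib" / "std", root / "lib" / "usr"]


def resolve_module(name, search_dirs):
    """Map a name such as 'pkg..mod' to the .pre file that would be loaded."""
    parts = name.split("..")
    for base in search_dirs:
        stem = Path(base).joinpath(*parts)
        if stem.is_dir():
            # A package directory takes precedence over a same-named module file.
            init = stem / "init.pre"
            if init.is_file():
                return init
            raise RuntimeError(f"package {name!r} has no init.pre")
        candidate = stem.with_name(stem.name + ".pre")
        if candidate.is_file():
            return candidate
    raise RuntimeError(f"module {name!r} not found")
```

Given a directory containing `pkg/init.pre`, `pkg/mod.pre`, and `pkg.pre`, the sketch resolves `pkg` to `pkg/init.pre` and `pkg..mod` to `pkg/mod.pre`.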
---
#### 9.1.9 Concurrency operators
The expression form `ASYNC{ ... }` and the statement form `THR(symbol){ ... }` are defined in [3.6](#36-asynchronous-execution-and-threads). The operators in this subsection act on `THR` handles.
- `BOOL PARALLEL(TNS functions)` or `BOOL PARALLEL(FUNC f1, FUNC f2, ..., FUNC fN)` = MUST invoke each supplied function concurrently with no arguments, wait for all invocations to complete, and return `FALSE` on success. Any non-`FUNC` argument or any worker failure MUST raise a runtime error.
- `THR AWAIT(THR thread)` = MUST block until `thread` has finished and then return the same thread handle.
- `THR PAUSE(THR thread, FLT seconds = -1)` = MUST pause `thread`. If `seconds` is supplied and is non-negative, the runtime MUST automatically resume the thread after that duration. Pausing a thread that is already paused MUST raise a runtime error.
- `THR RESUME(THR thread)` = MUST resume a paused thread. Resuming a thread that is not paused MUST raise a runtime error.
- `BOOL PAUSED(THR thread)` = MUST return `TRUE` when `thread` is paused and `FALSE` otherwise.
- `THR STOP(THR thread)` = MUST cooperatively stop `thread`, mark it finished, and return the resulting handle.
- `THR RESTART(THR thread)` = MUST reinitialize `thread` and begin its execution again, returning the restarted handle.
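The error rules for `PAUSE`, `RESUME`, and `PAUSED` amount to a small state machine. The Python sketch below models only that bookkeeping, using a hypothetical `ThreadHandle` class; it does not run any code concurrently and leaves out the timed auto-resume of `PAUSE`.

```python
class ThreadHandle:
    """Bookkeeping-only model of a THR handle's paused flag."""

    def __init__(self):
        self.paused = False

    def pause(self):
        # Pausing an already paused thread is an error.
        if self.paused:
            raise RuntimeError("PAUSE: thread is already paused")
        self.paused = True
        return self  # the operators return the thread handle

    def resume(self):
        # Resuming a thread that is not paused is an error.
        if not self.paused:
            raise RuntimeError("RESUME: thread is not paused")
        self.paused = False
        return self

    def is_paused(self):
        return self.paused
```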
---
#### 9.1.10 Serialization
This section specifies the on-the-wire JSON encoding used by the `SER` and `UNSER` operators. Implementations MUST follow this schema and the encoding rules below to ensure interoperability.
- The serializer emits a single JSON text (compact, no insignificant whitespace). Decoders MUST accept any JSON whitespace allowed by the JSON specification.
- All emitted JSON string values MUST be ASCII-only: characters with code < 0x20 or >= 0x7F MUST be escaped as `\uXXXX`. The serializer MUST also escape `"` and `\`, and MUST escape backspace, form feed, newline, carriage return, and tab (`\b`, `\f`, `\n`, `\r`, `\t`), as JSON requires.
- The top-level JSON value for any serialized runtime value MUST be an object with a discriminator string field named `t`.
The canonical object forms produced by the reference implementation are (examples shown compactly):
```json
{"t":"BOOL","v":true|false}
{"t":"INT","v":"<Prefix integer literal>"}
{"t":"FLT","v":"<Prefix floating-point literal or INF or -INF or NaN>"}
{"t":"STR","v":"<string contents>"}
{"t":"TNS","shape":[d1,d2,...,dN],"v":[e1,e2,...,eM]}
{"t":"MAP","v":[{"k":k1,"v":v1},{"k":k2,"v":v2},...]}
{"t":"FUNC",...}
{"t":"THR",...}
```
- For `BOOL`, `v` is a JSON boolean. For `INT`, `v` is a JSON string holding the Prefix base-prefixed integer spelling. For `FLT`, `v` is a JSON string holding the Prefix base-prefixed floating-point spelling, except that `INF`, `-INF`, and `NaN` are emitted exactly as those strings with no base prefix.
- For `TNS`, `shape` is an array of positive integers and `v` is the flattened element array in row-major order.
- For `MAP`, `v` is an array of objects preserving insertion order; each element MUST be an object with fields `k` and `v` containing the serialized key and value.
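A minimal Python sketch of the value forms above, covering `BOOL`, `STR`, `TNS`, and `MAP` only. It assumes tensor elements are themselves emitted as tagged objects, represents tensors as rectangular nested lists, and skips `INT` and `FLT` because their Prefix spellings are defined elsewhere in the specification. The function names are hypothetical.

```python
import json


def flatten(tensor):
    """Return (shape, row-major flat list) for a rectangular nested list."""
    shape, probe = [], tensor
    while isinstance(probe, list):
        shape.append(len(probe))
        probe = probe[0]  # dimension lengths are positive, so this is safe
    flat = [tensor]
    for _ in shape:
        flat = [elem for sub in flat for elem in sub]
    return shape, flat


def encode_value(value):
    """Build the tagged JSON object for a BOOL, STR, TNS, or MAP value."""
    if isinstance(value, bool):
        return {"t": "BOOL", "v": value}
    if isinstance(value, str):
        return {"t": "STR", "v": value}
    if isinstance(value, list):
        shape, flat = flatten(value)
        return {"t": "TNS", "shape": shape, "v": [encode_value(e) for e in flat]}
    if isinstance(value, dict):  # dicts preserve insertion order
        return {"t": "MAP", "v": [{"k": encode_value(k), "v": encode_value(v)}
                                  for k, v in value.items()]}
    raise TypeError("type not covered by this sketch")


def ser_sketch(value):
    """Emit compact, ASCII-only JSON text."""
    text = json.dumps(encode_value(value), separators=(",", ":"), ensure_ascii=True)
    # json.dumps leaves DEL (0x7F) unescaped; the rule above requires escaping it.
    return text.replace("\x7f", "\\u007f")
```

For example, `ser_sketch(True)` produces `{"t":"BOOL","v":true}`.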
To represent shared/recursive structures, the serializer emits auxiliary records with generated IDs:
```json
{"t":"ENV","id":"eN","ref":true}
{"t":"ENV","id":"eN","def":{"values":{...},"declared":{...},"frozen":[...],"permafrozen":[...],"parent":<ENV-or-null>}}
{"t":"PTR","name":"<identifier>","env":<ENV>,"value_type":"BOOL|INT|FLT|STR|TNS|MAP|FUNC|THR|UNKNOWN"}
{"t":"FUNC","id":"fN","ref":true}
{"t":"FUNC","id":"fN","name":"<name>","return":"<TYPE>","params":[...],"def":{"name":"<name>","return":"<TYPE>","params":[...],"body":<Stmt>,"closure":<ENV-or-null>}}
{"t":"THR","id":"tN","state":"running|paused|finished","paused":<bool>,"finished":<bool>,"stop":<bool>,"env":<ENV-or-null>,"block":<Stmt-or-null>}
```
- Generated environment IDs are `e1`, `e2`, ... in first-encounter order. Function IDs are `f1`, `f2`, ...; thread IDs are `t1`, `t2`, ...
- `ENV` and `FUNC` records are emitted as a full `def` object on first encounter; subsequent encounters use the `{..."ref":true}` form. This reference emitter behavior is normative: decoders MUST resolve `ENV` and `FUNC` records by `id` and MUST accept reference-only forms. The reference implementation does not emit a reference-only form for `THR`; repeated `THR` objects MAY appear with the same `id`, and decoders MUST coalesce them by `id`.
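The ID and reference rules can be sketched with a small table keyed by object identity. This is an illustration only: `RefTable`, `Env`, and `encode_env` are hypothetical names, and the `def` payload is abbreviated to its `parent` field.

```python
class RefTable:
    """Assign IDs in first-encounter order and emit ref forms afterwards."""

    def __init__(self, prefix):
        self.prefix = prefix  # "e" for ENV, "f" for FUNC
        self.ids = {}         # id(object) -> generated ID

    def emit(self, obj, tag, build_def):
        key = id(obj)
        if key in self.ids:
            return {"t": tag, "id": self.ids[key], "ref": True}
        new_id = f"{self.prefix}{len(self.ids) + 1}"
        # Register before building the body so cycles resolve to ref forms.
        self.ids[key] = new_id
        return {"t": tag, "id": new_id, "def": build_def(obj)}


class Env:
    """Stand-in environment holding only a parent link."""

    def __init__(self, parent=None):
        self.parent = parent


def encode_env(env, table):
    if env is None:
        return None
    return table.emit(env, "ENV",
                      lambda e: {"parent": encode_env(e.parent, table)})
```

Encoding a child environment first assigns it `e1` and its parent `e2`; encoding the parent again afterwards yields `{"t":"ENV","id":"e2","ref":true}`.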
The `ENV.def.values` object contains one member per initialized binding (keyed by symbol name). Pointer bindings are represented as `PTR` objects. `ENV.def.declared` maps symbol names to declared type names. `frozen` and `permafrozen` are arrays of symbol names. `parent` is another `ENV` object or `null`.
Function parameter records use the shape:
```json
{"name":"<param name>","type":"<TYPE>","coerced":<bool>,"default":<Expr-or-null>}
```
AST nodes embedded in function bodies, thread blocks, and default expressions are objects with an `n` kind field and a `loc` location object of the form:
```json
{"file":"<unknown>","line":N,"column":N,"statement":""}
```
Canonical expression node shapes emitted by the reference serializer include, for example:
```json
{"n":"Literal","loc":<loc>,"value":<json-number-or-json-string-or-json-boolean>,"base":<json-number-or-null>,"literal_type":"BOOL|INT|FLT|STR"}
{"n":"TensorLiteral","loc":<loc>,"items":[<Expr>,...]}
{"n":"MapLiteral","loc":<loc>,"items":[{"k":<Expr>,"v":<Expr>},...]}
{"n":"Identifier","loc":<loc>,"name":"<identifier>"}
{"n":"TypedIdentifier","loc":<loc>,"decl_type":"<TYPE>","name":"<identifier>"}
{"n":"PointerExpression","loc":<loc>,"target":"<identifier>"}
{"n":"CallExpression","loc":<loc>,"callee":<Expr>,"args":[{"n":"CallArgument","name":null,"expression":<Expr>},...,{"n":"CallArgument","name":"<keyword>","expression":<Expr>}]}
{"n":"AsyncExpression","loc":<loc>,"block":<Stmt>}
{"n":"IndexExpression","loc":<loc>,"base":<Expr>,"indices":[<Expr>,...],"is_map":false}
{"n":"Range","loc":<loc>,"lo":<Expr>,"hi":<Expr>}
{"n":"Star","loc":<loc>}
```
And example statement node shapes include:
```json
{"n":"Block","loc":<loc>,"statements":[<Stmt>,...]}
{"n":"Assignment","loc":<loc>,"target":"<identifier>","declared_type":"<TYPE>"|null,"expression":<Expr>}
{"n":"TensorSetStatement","loc":<loc>,"target":<Expr>,"value":<Expr>}
{"n":"Declaration","loc":<loc>,"name":"<identifier>","declared_type":"<TYPE>"}
{"n":"ExpressionStatement","loc":<loc>,"expression":<Expr>}
{"n":"IfStatement","loc":<loc>,"condition":<Expr>,"then_block":<Stmt>,"elifs":[{"n":"IfBranch","condition":<Expr>,"block":<Stmt>},...],"else_block":<Stmt-or-null>}
{"n":"WhileStatement","loc":<loc>,"condition":<Expr>,"block":<Stmt>}
{"n":"ForStatement","loc":<loc>,"counter":"<identifier>","target_expr":<Expr>,"block":<Stmt>}
{"n":"ParForStatement","loc":<loc>,"counter":"<identifier>","target_expr":<Expr>,"block":<Stmt>}
{"n":"FuncDef","loc":<loc>,"name":"<identifier>","params":[{"n":"Param","type":"<TYPE>","coerced":<bool>,"name":"<identifier>","default":<Expr-or-null>},...],"return_type":"<TYPE>","body":<Stmt>}
{"n":"ReturnStatement","loc":<loc>,"expression":<Expr-or-null>}
{"n":"PopStatement","loc":<loc>,"expression":{"n":"Identifier","loc":<loc>,"name":"<identifier>"}}
{"n":"BreakStatement","loc":<loc>,"expression":<Expr>}
{"n":"GotoStatement","loc":<loc>,"expression":<Expr>}
{"n":"GotopointStatement","loc":<loc>,"expression":<Expr>}
{"n":"ContinueStatement","loc":<loc>}
{"n":"AsyncStatement","loc":<loc>,"block":<Stmt>}
{"n":"ThrStatement","loc":<loc>,"symbol":"<identifier>","block":<Stmt>}
{"n":"TryStatement","loc":<loc>,"try_block":<Stmt>,"catch_symbol":"<identifier>"|null,"catch_block":<Stmt>}
```
The operator semantics that use this format are defined next.
- `STR SER(ANY value)` = MUST serialize `value` to a `STR` containing a single JSON text that conforms to the encoding rules and schema described above. `SER` returns that `STR` on success.
- `ANY UNSER(STR text)` = MUST parse `text` as JSON and reconstruct the runtime value according to the schema above. `UNSER` MUST raise a runtime error for invalid JSON, invalid/missing fields, invalid numeric spellings, unsupported key types for `MAP`, unknown discriminators that cannot be reconstructed, or other structural violations.
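The validation behavior required of `UNSER` can be sketched for a subset of the schema. The Python below reconstructs only `BOOL`, `STR`, `TNS`, and `MAP` values with `STR` keys, and it assumes tensor elements are tagged objects; a Python `ValueError` stands in for a Prefix runtime error, and the function names are hypothetical.

```python
import json
import math


def decode_value(node):
    """Rebuild a value from its tagged JSON object, rejecting bad structure."""
    if not isinstance(node, dict) or not isinstance(node.get("t"), str):
        raise ValueError("UNSER: missing discriminator field 't'")
    tag, payload = node["t"], node.get("v")
    if tag == "BOOL":
        if not isinstance(payload, bool):
            raise ValueError("UNSER: BOOL requires a JSON boolean")
        return payload
    if tag == "STR":
        if not isinstance(payload, str):
            raise ValueError("UNSER: STR requires a JSON string")
        return payload
    if tag == "TNS":
        shape = node.get("shape")
        if (not isinstance(shape, list) or not shape
                or any(type(d) is not int or d < 1 for d in shape)):
            raise ValueError("UNSER: shape must list positive integers")
        if not isinstance(payload, list) or len(payload) != math.prod(shape):
            raise ValueError("UNSER: element count does not match shape")
        elems = [decode_value(e) for e in payload]
        # Regroup the row-major flat list, innermost dimension first.
        for dim in reversed(shape[1:]):
            elems = [elems[i:i + dim] for i in range(0, len(elems), dim)]
        return elems
    if tag == "MAP":
        if not isinstance(payload, list):
            raise ValueError("UNSER: MAP requires an array of entries")
        out = {}
        for entry in payload:
            if not isinstance(entry, dict) or "k" not in entry or "v" not in entry:
                raise ValueError("UNSER: MAP entry needs 'k' and 'v'")
            key = decode_value(entry["k"])
            if not isinstance(key, str):
                raise ValueError("UNSER: this sketch supports STR keys only")
            out[key] = decode_value(entry["v"])
        return out
    raise ValueError(f"UNSER: unsupported discriminator {tag!r}")


def unser_sketch(text):
    # json.loads raises json.JSONDecodeError, a ValueError subclass.
    return decode_value(json.loads(text))
```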
---
#### 9.1.11 File and host operators
- `STR READFILE(STR path, STR coding = "UTF-8")` = MUST read the filesystem object at `path` and return its contents as a `STR`. Supported `coding` values MUST include `UTF-8`, `UTF-8 BOM`, `UTF-16 LE`, `UTF-16 BE`, `ANSI`, `binary`, and `hexadecimal`, along with the aliases `bin` for `binary` and `hex` for `hexadecimal`. Coding names MUST be matched case-insensitively. Text decoders MUST tolerate invalid data using replacement semantics. `UTF-8` MUST accept and strip a BOM if present. `ANSI` MUST map to Windows-1252 on Windows and Latin-1 on other hosts. `binary` MUST return an 8-bit-per-byte bitstring, and `hexadecimal` MUST return lowercase hexadecimal text. Read failure MUST raise a runtime error.
- `BOOL WRITEFILE(STR blob, STR path, STR coding = "UTF-8")` = MUST write `blob` to `path` using the selected coding rules. Coding names MUST be matched case-insensitively. `UTF-8 BOM` MUST emit a BOM. `ANSI` MUST map to Windows-1252 on Windows and Latin-1 on other hosts. `binary` MUST require a bitstring whose length is a multiple of 8, and `hexadecimal` MUST require valid hexadecimal text. Invalid coding names or malformed binary or hexadecimal input MUST raise a runtime error. Ordinary I/O failure MUST cause the operator to return `FALSE`; successful writes MUST return `TRUE`.
- `BOOL EXISTFILE(STR path)` = MUST return `TRUE` if a filesystem object exists at `path` and `FALSE` otherwise.
- `BOOL DELETEFILE(STR path)` = MUST delete the filesystem object at `path` and return `TRUE` on success. Missing files, permission failures, and other deletion failures MUST raise a runtime error.
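The `binary` and `hexadecimal` codings used by `READFILE` and `WRITEFILE` can be sketched as conversions between raw bytes and text. The helper names are hypothetical, and the most-significant-bit-first ordering within each byte is an assumption, since the text above fixes only the width of 8 bits per byte.

```python
def bytes_to_bitstring(data: bytes) -> str:
    """READFILE 'binary': 8 characters of '0'/'1' per byte (MSB first, assumed)."""
    return "".join(format(byte, "08b") for byte in data)


def bitstring_to_bytes(bits: str) -> bytes:
    """WRITEFILE 'binary': reject lengths that are not a multiple of 8."""
    if len(bits) % 8 != 0 or set(bits) - {"0", "1"}:
        raise ValueError("malformed bitstring")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def bytes_to_hex(data: bytes) -> str:
    """READFILE 'hexadecimal': lowercase hexadecimal text."""
    return data.hex()


def hex_to_bytes(text: str) -> bytes:
    """WRITEFILE 'hexadecimal': reject anything but an even run of hex digits."""
    if len(text) % 2 != 0 or any(c not in "0123456789abcdefABCDEF" for c in text):
        raise ValueError("malformed hexadecimal text")
    return bytes.fromhex(text)
```

For example, `bytes_to_bitstring(b"A")` yields `"01000001"` and `bytes_to_hex(b"\xff\x00")` yields `"ff00"`.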