# Expressions
An expression is the syntactic unit in a Stan program that denotes a
value. Every expression in a well-formed Stan program has a type that
is determined statically (at compile time), based only on the type of
its variables and the types of the functions used in it. If an
expression's type cannot be determined statically, the Stan compiler
will report the location of the problem.
This chapter covers the syntax, typing, and usage of the various forms
of expressions in Stan.
## Numeric literals
The simplest form of expression is a literal that denotes a primitive
numerical value.
### Integer literals {-}
Integer literals represent integers of type `int`. Integer
literals are written in base 10 without any separators. Integer
literals may contain a single negative sign. (The expression
`--1` is interpreted as the negation of the literal `-1`.)
The following list contains well-formed integer literals.
```
0, 1, -1, 256, -127098, 24567898765
```
Integer literals must have values that fall within the bounds for
integer values (see [section](#numerical-data-types.section)).
Integer literals may not contain decimal points (`.`). Thus the
expressions `1.` and `1.0` are of type `real` and may
not be used where a value of type `int` is required.
### Real literals {-}
A number written with a period or with scientific notation is assigned
to the continuous numeric type `real`. Real literals are
written in base 10 with a period (`.`) as a separator and
optionally an exponent with optional sign. Examples
of well-formed real literals include the following.
```
0.0, 1.0, 3.14, -217.9387, 2.7e3, -2E-5, 1.23e+3
```
The notation `e` or `E` followed by a positive or negative
integer denotes the power of 10 by which the preceding number is
multiplied. For instance, `2.7e3` and `2.7e+3` denote $2.7 \times 10^3$,
whereas `-2E-5` denotes $-2 \times 10^{-5}$.
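As a minimal sketch of how literal types interact with declarations (the variable names are illustrative and not part of the text above), the following block assigns integer and real literals to variables.
```stan
transformed data {
  int n = 256;        // integer literal
  real x = 2.7e3;     // real literal in scientific notation
  real y = 1;         // legal: an int value is promoted to real
  // int m = 1.0;     // illegal: 1.0 is of type real, not int
}
```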
### Imaginary literals {-}
A number followed by the character `i` denotes an imaginary number and
is assigned to the numeric type `complex`. The number preceding `i`
may be either a real or integer literal and determines the magnitude
of the imaginary number. Examples of well-formed imaginary literals
include the following.
```
1i, 2i, -325.786i, 1e10i, 2.87e-10i
```
Note that the character `i` by itself is _not_ a well-formed imaginary
literal. The unit imaginary number must be written as `1i`.
### Complex literals {-}
Stan does not include complex literals directly, but a real or integer
literal can be added to an imaginary literal to derive an expression
that behaves like a complex literal. Examples include the following.
```
1 + 2i, -3.2e9 + 1e10i
```
These will be assigned the type `complex`, which is the result of
adding a real or integer and a complex number. They will also
function like literals in the sense that the C++ compiler is able to
reduce them to a single complex constant at compile time.
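The following sketch shows such an expression assigned to a `complex` variable. It assumes the built-in functions `get_real` and `get_imag`, which are not described in this section, and the variable names are illustrative.
```stan
transformed data {
  complex z = 3 + 4i;        // integer literal plus imaginary literal
  real re = get_real(z);     // real component, 3
  real im = get_imag(z);     // imaginary component, 4
}
```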
## Variables {#variables.section}
A variable by itself is a well-formed expression of the same type as
the variable. Variables in Stan consist of ASCII strings containing
only the basic lower-case and upper-case Roman letters, digits, and
the underscore (`_`) character. Variables must start with a
letter (`a--z` and `A--Z`) and may not end with two underscores
(`__`).
Examples of legal variable identifiers are as follows.
```
a, a3, a_3, Sigma, my_cpp_style_variable, myCamelCaseVariable
```
Unlike in R and BUGS, variable identifiers in Stan may not contain
a period character.
### Reserved names {-}
Stan reserves many strings for internal use and these may not be used
as the name of a variable. An attempt to name a variable after an
internal string results in the `stanc` translator halting with an
error message indicating which reserved name was used and its location
in the model code.
#### Model name {-}
The name of the model cannot be used as a variable within the model.
This is usually not a problem because the default in `bin/stanc`
is to append `_model` to the name of the file containing the
model specification. For example, if the model is in file
`foo.stan`, it would not be legal to have a variable named
`foo_model` when using the default model name through
`bin/stanc`. With user-specified model names, variables cannot
match the model.
#### User-defined function names {-}
User-defined function names cannot be used as a variable within the
model.
#### Reserved words from Stan language {-}
The following list contains reserved words for Stan's programming
language. Not all of these features are implemented in Stan yet, but
the tokens are reserved for future use.
```
for, in, while, repeat, until, if, then, else,
true, false, target, struct, typedef, export,
auto, extern, var, static
```
Variables should not be named after types, either, and thus may not be
any of the following.
```
int, real, complex, vector, simplex, unit_vector,
ordered, positive_ordered, row_vector, matrix,
cholesky_factor_corr, cholesky_factor_cov,
corr_matrix, cov_matrix
```
The following built-in functions are also reserved and
cannot be used as variable names:
```
print, reject, profile, get_lp, increment_log_prob,
target
```
The following block identifiers are reserved and cannot be used as variable names:
```
functions, model, data, parameters, quantities,
transformed, generated
```
#### Reserved distribution names {-}
Variable names will also conflict with the names of distributions
suffixed with `_lpdf`, `_lpmf`, `_lcdf`, `_lccdf`, `_cdf`, and
`_ccdf`, such as `normal_lcdf`; this also holds for the deprecated
forms `_log`, `_cdf_log`, and `_ccdf_log`. No user-defined variable
can take a name ending in `_lupdf` or `_lupmf` even if a corresponding
`_lpdf` or `_lpmf` is not defined.
Using any of these variable names causes the `stanc` translator
to halt and report the name and location of the variable causing the
conflict.
#### Reserved names from backend languages {-}
Stan primarily generates code in C++, which has its own
reserved words. It is legal to give a variable any of the
following names; however, doing so will lead to it being renamed
`_stan_NAME` (e.g., `_stan_public`) behind the scenes (in the
generated C++ code).
<!-- corresponds to the list in stanc3/src/stan_math_backend/Mangle.ml -->
```
alignas, alignof, and, and_eq, asm, bitand, bitor, bool,
case, catch, char, char16_t, char32_t, class, compl, const,
constexpr, const_cast, decltype, default, delete, do,
double, dynamic_cast, enum, explicit, float, friend, goto,
inline, long, mutable, namespace, new, noexcept, not, not_eq,
nullptr, operator, or, or_eq, private, protected, public,
register, reinterpret_cast, short, signed, sizeof,
static_assert, static_cast, switch, template, this, thread_local,
throw, try, typeid, typename, union, unsigned, using, virtual,
volatile, wchar_t, xor, xor_eq, fvar, STAN_MAJOR, STAN_MINOR,
STAN_PATCH, STAN_MATH_MAJOR, STAN_MATH_MINOR, STAN_MATH_PATCH
```
<!-- mention python here when TFP backend finished -->
### Legal characters {-}
The legal characters for variable identifiers are given in the
[identifier characters table](#identifier-characters-table).
**Identifier Characters Table.** <a id="identifier-characters-table"></a>
*The alphanumeric characters and underscore in base ASCII are the only
legal characters in Stan identifiers.*
| characters | ASCII code points |
| :--------: | :---------------: |
| `a -- z` | 97 -- 122 |
| `A -- Z` | 65 -- 90 |
| `0 -- 9` | 48 -- 57 |
| `_` | 95 |
Although not the most expressive character set, ASCII is the most
portable and least prone to corruption through improper character
encodings or decodings. Sticking to this range of ASCII makes Stan
compatible with Latin-1 or UTF-8 encodings of these characters, which
are byte-for-byte identical to ASCII.
#### Comments allow ASCII-compatible encoding {-}
Within comments, Stan can work with any ASCII-compatible character
encoding, such as ASCII itself, UTF-8, or Latin-1. It is up to user
shells and editors to display them properly.
## Vector, matrix, and array expressions {#vector-matrix-array-expressions.section}
Expressions for the Stan container objects arrays, vectors, and
matrices can be constructed via a sequence of expressions
enclosed in either curly braces for arrays, or square brackets for
vectors and matrices.
### Vector expressions {-}
Square brackets may be wrapped around a sequence of comma separated
primitive expressions to produce a row vector expression. For
example, the expression `[ 1, 10, 100 ]` denotes a row vector of
three elements with real values 1.0, 10.0, and 100.0.
Applying the transpose operator to a row vector expression produces
a vector expression.
This syntax provides a way to declare and define small vectors in a single line, as follows.
```stan
row_vector[2] rv2 = [ 1, 2 ];
vector[3] v3 = [ 3, 4, 5 ]';
```
The vector expression values may be compound expressions
or variable names, so it is legal to write
`[ 2 * 3, 1 + 4 ]` or `[ x, y ]`, provided that `x`
and `y` are primitive variables.
### Matrix expressions {-}
A matrix expression consists of square brackets wrapped
around a sequence of comma separated row vector expressions.
This syntax provides a way to declare and define a matrix in a single
line, as follows.
```stan
matrix[3, 2] m1 = [ [ 1, 2 ], [ 3, 4 ], [ 5, 6 ] ];
```
Any expression denoting a row vector can be used in a matrix expression.
For example, the following code is valid:
```stan
vector[2] vX = [ 1, 10 ]';
row_vector[2] vY = [ 100, 1000 ];
matrix[3, 2] m2 = [ vX', vY, [ 1, 2 ] ];
```
#### No empty vector or matrix expressions {-}
The empty expression `[ ]` is ambiguous and therefore is not
allowed and similarly expressions such as `[ [ ] ]` or
`[ [ ], [ ] ]` are not allowed.
### Array expressions {-}
Curly braces may be wrapped around a sequence of expressions to
produce an array expression. For example, the expression
`{ 1, 10, 100 }` denotes an integer array of three elements with
values 1, 10, and 100. This syntax is particularly convenient to
define small arrays in a single line, as follows.
```stan
array[3] int a = { 1, 10, 100 };
```
The values may be compound expressions, so it is legal to write
`{ 2 * 3, 1 + 4 }`. It is also possible to write two dimensional
arrays directly, as in the following example.
```stan
array[2, 3] int b = { { 1, 2, 3 }, { 4, 5, 6 } };
```
This way, `b[1]` is `{ 1, 2, 3 }` and `b[2]` is
`{ 4, 5, 6 }`.
Whitespace is always interchangeable in Stan, so the above can be laid
out as follows to more clearly indicate the row and column structure
of the resulting two dimensional array.
```stan
array[2, 3] int b = { { 1, 2, 3 },
                      { 4, 5, 6 } };
```
### Array expression types {-}
Any type of expression may be used within braces to form an array
expression. In the simplest case, all of the elements will be of the
same type and the result will be an array of elements of that type.
For example, the elements of the array can be vectors, in which case
the result is an array of vectors.
```stan
vector[3] b;
vector[3] c;
// ...
array[2] vector[3] d = { b, c };
```
The elements may also be a mixture of `int` and `real` typed
expressions, in which case the result is an array of real values.
```stan
array[2] real b = { 1, 1.9 };
```
### Restrictions on values {-}
There are some restrictions on how array expressions may be used that
arise from their types being calculated bottom up and the basic data
type and assignment rules of Stan.
#### Rectangular array expressions only {-}
Although it is tempting to try to define a ragged array expression,
all Stan data types are rectangular (or boxes or other
higher-dimensional generalizations). Thus the following nested array
expression will cause an error when it tries to create a
non-rectangular array.
```stan
{ { 1, 2, 3 }, { 4, 5 } } // compile time error: size mismatch
```
This may appear to be OK, because it is creating a two-dimensional
integer array (`array[,] int`) out of two one-dimensional
integer arrays (`array[] int`). But it is not allowed because the two
one-dimensional arrays are not the same size. If the elements are
array expressions, this can be diagnosed at compile time. If one or
both elements are variables, then the mismatch won't be caught until
runtime.
```stan
{ { 1, 2, 3 }, m } // runtime error if m not size 3
```
#### No empty array expressions {-}
Because there is no way to infer the type of the result, the empty
array expression (`{ }`) is not allowed. This does not sacrifice
expressive power, because a declaration is sufficient to initialize a
zero-element array.
```stan
array[0] int a; // a is fully defined as zero element array
```
#### Integer only array expressions {-}
If an array expression contains only integer elements, such as
`{ 1, 2, 3 }`, then the result type will be an integer array,
`array[] int`. This means that the following will *not* be
legal.
```stan
array[2] real a = { -3, 12 };
// error: array[] int can't be assigned to array[] real
```
An integer array may not be assigned to a real array. However, this
problem is easily sidestepped by using real literal expressions.
```stan
array[2] real a = { -3.0, 12.0 };
```
Now the types match and the assignment is allowed.
## Parentheses for grouping
Any expression wrapped in parentheses is also an expression. Like in
C++, but unlike in R, only the round parentheses, `(` and
`)`, are allowed. The square brackets `[` and `]` are
reserved for array indexing and the curly braces `{` and
`}` for grouping statements.
With parentheses it is possible to explicitly group subexpressions
with operators. Without parentheses, the expression `1 + 2 * 3`
has a subexpression `2 * 3` and evaluates to 7. With
parentheses, this grouping may be made explicit with the expression
`1 + (2 * 3)`. More importantly, the expression `(1 + 2) *
3` has `1 + 2` as a subexpression and evaluates to 9.
## Arithmetic and matrix operations on expressions {#arithmetic-expressions.section}
For integer and real-valued expressions, Stan supports the basic
binary arithmetic operations of addition (`+`), subtraction
(`-`), multiplication (`*`) and division (`/`) in the
usual ways.
For integer expressions, Stan supports the modulus (`%`) binary
arithmetic operation. Stan also supports the unary operation of
negation for integer and real-valued expressions. For example,
assuming `n` and `m` are integer variables and `x` and
`y` real variables, the following expressions are legal.
```stan
3.0 + 0.14
-15
2 * 3 + 1
(x - y) / 2.0
(n * (n + 1)) / 2
x / n
m % n
```
The negation, addition, subtraction, and multiplication operations are
extended to matrices, vectors, and row vectors. The transpose
operation, written using an apostrophe (`'`) is also supported
for vectors, row vectors, and matrices. Return types for matrix
operations are the smallest types that can be statically guaranteed to
contain the result. The full set of allowable input types and
corresponding return types is detailed in the list of functions.
For example, if `y` and `mu` are variables of type `vector` and
`Sigma` is a variable of type `matrix`, then `(y - mu)' * Sigma * (y -
mu)` is a well-formed expression of type `real`. The type of the
complete expression is inferred working outward from the
subexpressions. The subexpression(s) `y - mu` are of type `vector`
because the variables `y` and `mu` are of type `vector`. The
transpose of this expression, the subexpression `(y - mu)'` is of type
`row_vector`. Multiplication is left associative and transpose has
higher precedence than multiplication, so the above expression is
equivalent to the following fully specified form `(((y - mu)') *
Sigma) * (y - mu)`.
The type of subexpression `(y - mu)' * Sigma` is inferred to be
`row_vector`, being the result of multiplying a row vector by a
matrix. The whole expression's type is thus the type of a row vector
multiplied by a (column) vector, which produces a `real` value.
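The inference steps above can be sketched with explicitly typed intermediate variables. The surrounding blocks and the names `K`, `d`, `r`, `q`, and `q2` are illustrative assumptions; the commented alternative relies on the built-in `quad_form` function, which is documented separately.
```stan
data {
  int<lower=1> K;
  vector[K] y;
  vector[K] mu;
  matrix[K, K] Sigma;
}
transformed data {
  vector[K] d = y - mu;                     // vector - vector: vector
  row_vector[K] r = d' * Sigma;             // row_vector * matrix: row_vector
  real q = r * d;                           // row_vector * vector: real
  real q2 = (y - mu)' * Sigma * (y - mu);   // same value in one expression
  // real q3 = quad_form(Sigma, y - mu);    // built-in equivalent
}
```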
Stan provides elementwise matrix multiplication (e.g., `a .* b`) and
division (e.g., `a ./ b`) operations. These provide a shorthand to
replace loops, but are not intrinsically more efficient than a version
programmed with elementwise calculations and assignments in a loop.
For example, given declarations,
```stan
vector[N] a;
vector[N] b;
vector[N] c;
```
the assignment,
```stan
c = a .* b;
```
produces the same result with roughly the same efficiency as the loop
```stan
for (n in 1:N) {
  c[n] = a[n] * b[n];
}
```
Stan supports exponentiation (`^`) of integer and
real-valued expressions. The return type of exponentiation is always
`real`. For example, assuming `n` and `m` are integer
variables and `x` and `y` real variables, the following
expressions are legal.
```stan
3 ^ 2
3.0 ^ -2
3.0 ^ 0.14
x ^ n
n ^ x
n ^ m
x ^ y
```
Exponentiation is right associative, so the expression `2 ^ 3 ^ 4`
is equivalent to the fully specified form `2 ^ (3 ^ 4)`.
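A small sketch of both points, the right associativity and the real-valued result, using illustrative variable names:
```stan
transformed data {
  real a = 2 ^ 3 ^ 2;      // parsed as 2 ^ (3 ^ 2) = 2 ^ 9 = 512
  real b = (2 ^ 3) ^ 2;    // explicit grouping: 8 ^ 2 = 64
  real c = 3 ^ 2;          // two integer operands still yield a real
}
```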
### Operator precedence and associativity {-}
The precedence and associativity of operators, as well as built-in
syntax such as array indexing and function application, are given in
tabular form in the [operator precedence table](#operator-precedence-table).
**Operator Precedence Table.** <a id="operator-precedence-table"></a>
*Stan's unary, binary, and ternary
operators, with their precedences, associativities, place in an
expression, and a description. The last two lines list the precedence
of function application and array, matrix, and vector indexing. The
operators are listed in order of precedence, from least tightly
binding to most tightly binding. The full set of legal arguments and
corresponding result types are provided in the function documentation
for the operators (e.g.,* `operator*(int, int):int` *indicates the
application of the multiplication operator to two integers, which
returns an integer). Parentheses may be used to group expressions
explicitly rather than relying on precedence and
associativity.*
| Op. | Prec. | Assoc. | Placement | Description |
| :-----: | ----: | :----: | :------------ | :------------------------- |
| `? ~ :` | 10 | right | ternary infix | conditional |
| `||` | 9 | left | binary infix | logical or |
| `&&` | 8 | left | binary infix | logical and |
| `==` | 7 | left | binary infix | equality |
| `!=` | 7 | left | binary infix | inequality |
| `<` | 6 | left | binary infix | less than |
| `<=` | 6 | left | binary infix | less than or equal |
| `>` | 6 | left | binary infix | greater than |
| `>=` | 6 | left | binary infix | greater than or equal |
| `+` | 5 | left | binary infix | addition |
| `-` | 5 | left | binary infix | subtraction |
| `*` | 4 | left | binary infix | multiplication |
| `.*` | 4 | left | binary infix | elementwise multiplication |
| `/` | 4 | left | binary infix | (right) division |
| `./` | 4 | left | binary infix | elementwise division |
| `%` | 4 | left | binary infix | modulus |
| `\` | 3 | left | binary infix | left division |
| `%/%` | 3 | left | binary infix | integer division |
| `!` | 2 | n/a | unary prefix | logical negation |
| `-` | 2 | n/a | unary prefix | negation |
| `+` | 2 | n/a | unary prefix | promotion (no-op in Stan) |
| `^` | 1 | right | binary infix | exponentiation |
| `.^` | 1 | right | binary infix | elementwise exponentiation |
| `'` | 0 | n/a | unary postfix | transposition |
| `()` | 0 | n/a | prefix, wrap | function application |
| `[]` | 0 | left | prefix, wrap | array, matrix indexing |
Other expression-forming operations, such as function application and
subscripting bind more tightly than any of the arithmetic operations.
The precedence and associativity determine how expressions are
interpreted. Because addition is left associative, the expression
`a + b + c` is interpreted as `(a + b) + c`. Similarly,
`a / b * c` is interpreted as `(a / b) * c`.
Because multiplication has higher precedence than addition, the
expression `a * b + c` is interpreted as `(a * b) + c` and the
expression `a + b * c` is interpreted as `a + (b * c)`. Similarly,
`2 * x + 3 * - y` is interpreted as `(2 * x) + (3 * (-y))`.
Transposition and exponentiation bind more tightly
than any other arithmetic or logical operation.
For vectors, row vectors, and matrices,
`-u'` is interpreted as `-(u')`, `u * v'` as
`u * (v')`, and `u' * v` as `(u') * v`.
For integers and reals, `-n ^ 3` is interpreted as `-(n ^ 3)`.
## Conditional operator {#conditional-operator.section}
### Conditional operator syntax {-}
The ternary conditional operator is unique in that it takes three
arguments and uses a mixed syntax. If `a` is an expression of
type `int` and `b` and `c` are expressions that can be
converted to one another (e.g., compared with `==`), then
```stan
a ? b : c
```
is an expression of the promoted type of `b` and `c`. The
only promotion allowed in Stan is integer -> real -> complex; e.g. if one
argument is of type `int` and the other of type `real`, the
conditional expression as a whole is of type `real`.
In other cases, the arguments have to be of the same underlying Stan type
(i.e., constraints don't count, only the shape) and the conditional
expression is of that type.
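The two typing cases can be sketched as follows; the variable names are illustrative assumptions.
```stan
transformed data {
  int n = 3;
  real x = 2.5;
  vector[2] u = [ 1, 2 ]';
  vector[2] v = [ 3, 4 ]';
  real r = n > 2 ? n : x;        // int and real results: type real
  vector[2] w = n > 2 ? u : v;   // both results of type vector
}
```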
#### Conditional operator precedence {-}
The conditional operator is the most loosely binding operator, so its
arguments rarely require parentheses for disambiguation. For example,
```stan
a > 0 || b < 0 ? c + d : e - f
```
is equivalent to the explicitly grouped version
```stan
(a > 0 || b < 0) ? (c + d) : (e - f)
```
The latter is easier to read even if the parentheses are not strictly
necessary.
#### Conditional operator associativity {-}
The conditional operator is right associative, so that
```stan
a ? b : c ? d : e
```
parses as if explicitly grouped as
```stan
a ? b : (c ? d : e)
```
Again, the explicitly grouped version is easier to read.
### Conditional operator semantics {-}
Stan's conditional operator works very much like its C++ analogue.
The first argument must be an expression denoting an integer.
Typically this is a variable or a relational expression, as in the
variable `a` in the example above. Then there are two result
arguments, the first being the result returned if the condition
evaluates to true (i.e., non-zero) and the second if the condition
evaluates to false (i.e., zero). In the example above, the value
`b` is returned if the condition evaluates to a non-zero value
and `c` is returned if the condition evaluates to zero.
#### Lazy evaluation of results {-}
The key property of the conditional operator that makes it so useful
in high-performance computing is that it only evaluates the returned
subexpression, not the alternative expression. In other words, it is
not like a typical function that evaluates its argument expressions
eagerly in order to pass their values to the function. As usual, the
saving is mostly in the derivatives that do not get computed rather
than the unnecessary function evaluation itself.
#### Promotion to parameter {-}
If one return expression is a data value (an expression involving only
constants and variables defined in the data or transformed data
block), and the other is not, then the ternary operator will promote
the data value to a parameter value. This can cause needless work
calculating derivatives in some cases and be less efficient than a full
`if`-`else` conditional statement. For example,
```stan
data {
  array[10] real x;
  // ...
}
parameters {
  array[10] real z;
  // ...
}
model {
  y ~ normal(cond ? x : z, sigma);
  // ...
}
```
would be more efficiently (if not more transparently) coded as
```stan
if (cond) {
  y ~ normal(x, sigma);
} else {
  y ~ normal(z, sigma);
}
```
The conditional statement, like the conditional operator, only
evaluates one of the result statements. In this case, the variable
`x` will not be promoted to a parameter and thus not cause any
needless work to be carried out when propagating the chain rule during
derivative calculations.
## Indexing {#language-indexing.section}
Stan arrays, matrices, vectors, and row vectors are all accessed
using the same array-like notation. For instance, if `x` is a
variable of type `array[] real` (a one-dimensional array of reals)
then `x[1]` is the value of the first element of the
array.
Subscripting has higher precedence than any of the arithmetic
operations. For example, `alpha * x[1]` is equivalent to
`alpha * (x[1])`.
Multiple subscripts may be provided within a single pair of square
brackets. If `x` is of type `array[,] real`, a two-dimensional
array, then `x[2, 501]` is of type `real`.
### Accessing subarrays {-}
The subscripting operator also returns subarrays of arrays. For
example, if `x` is of type `array[,,] real`, then `x[2]`
is of type `array[,] real`, and `x[2, 3]` is of type
`array[] real`. As a result, the expressions `x[2, 3]` and
`x[2][3]` have the same meaning.
### Accessing matrix rows {-}
If `Sigma` is a variable of type `matrix`, then
`Sigma[1]` denotes the first row of `Sigma` and has the
type `row_vector`.
### Mixing array and vector/matrix indexes {-}
Stan supports mixed indexing of arrays and their vector, row vector
or matrix values. For example, if `m` is of type
`array[,] matrix`, a two-dimensional array of matrices, then
`m[1]` refers to the first row of the array, which is a
one-dimensional array of matrices. More than one index may be used,
so that `m[1, 2]` is of type `matrix` and denotes the matrix
in the first row and second column of the array. Continuing to add
indices, `m[1, 2, 3]` is of type `row_vector` and denotes
the third row of the matrix denoted by `m[1, 2]`. Finally,
`m[1, 2, 3, 4]` is of type `real` and denotes the value in the
third row and fourth column of the matrix that is found at the first
row and second column of the array `m`.
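The progression of types described above can be sketched with one declaration per indexing depth. The sizes and the names `a`, `b`, `c`, and `d` are illustrative assumptions.
```stan
data {
  array[2, 3] matrix[4, 5] m;        // two-dimensional array of matrices
}
transformed data {
  array[3] matrix[4, 5] a = m[1];    // first row of the array
  matrix[4, 5] b = m[1, 2];          // matrix at array position (1, 2)
  row_vector[5] c = m[1, 2, 3];      // third row of that matrix
  real d = m[1, 2, 3, 4];            // entry in row 3, column 4
}
```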
## Multiple indexing and range indexing {#language-multi-indexing.section}
In addition to single integer indexes, as described in
[the language indexing section](#language-indexing.section), Stan supports multiple indexing.
Multiple indexes can be integer arrays of indexes, lower
bounds, upper bounds, lower and upper bounds, or simply shorthand for
all of the indexes. A complete table of index types is given in the
[indexing options table](#index-types-table).
**Indexing Options Table.** <a id="index-types-table"></a>
*Types of indexes and examples with one-dimensional containers of size
`N` and an integer array `ii` of type `array[] int` and size `K`.*
| index type | example | value |
| :-----------: | :------: | :-------------------------: |
| integer | `a[11]` | value of `a` at index 11 |
| integer array | `a[ii]` | `a[ii[1]]`, ..., `a[ii[K]]` |
| lower bound | `a[3:]` | `a[3]`, ..., `a[N]` |
| upper bound | `a[:5]` | `a[1]`, ..., `a[5]` |
| range | `a[2:7]` | `a[2]`, ..., `a[7]` |
| all | `a[:]` | `a[1]`, ..., `a[N]` |
| all | `a[]` | `a[1]`, ..., `a[N]` |
### Multiple index semantics {-}
The fundamental semantic rule for dealing with multiple indexes is the
following. If `idxs` is a multiple index, then it produces an
indexable position in the result. To evaluate that index position in
the result, the index is first passed to the multiple index, and the
resulting index used.
```stan
a[idxs, ...][i, ...] = a[idxs[i], ...][...]
```
On the other hand, if `idx` is a single index, it reduces the
dimensionality of the output, so that
```stan
a[idx, ...] = a[idx][...]
```
The only issue is what happens with matrices and vectors. Vectors
work just like arrays. Matrices with multiple row indexes and
multiple column indexes produce matrices. Matrices with multiple row
indexes and a single column index become (column) vectors. Matrices
with a single row index and multiple column indexes become row
vectors. The types are summarized in the
[matrix indexing table](#matrix-indexing-table).
**Matrix Indexing Table.** <a id="matrix-indexing-table"></a>
*Special rules for reducing matrices based on whether the argument is
a single or multiple index. Examples are for a matrix `a`, with
integer single indexes `i` and `j` and integer array multiple
indexes `is` and `js`. The same typing rules apply for all multiple
indexes.*
| example | row index | column index | result type |
| :---------: | :-------: | :----------: | :---------: |
| `a[i]` | single | n/a | row vector |
| `a[is]` | multiple | n/a | matrix |
| `a[i, j]` | single | single | real |
| `a[i, js]` | single | multiple | row vector |
| `a[is, j]` | multiple | single | vector |
| `a[is, js]` | multiple | multiple | matrix |
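
The result types in the table can be checked by declaring a variable of the expected type for each row. The matrix dimensions and index values below are assumptions chosen only for illustration; the names `i`, `j`, `is`, and `js` follow the table.

```stan
transformed data {
  matrix[3, 4] a = rep_matrix(0, 3, 4);
  int i = 2;
  int j = 3;
  array[2] int is = {1, 3};
  array[3] int js = {4, 1, 2};

  row_vector[4] r1 = a[i];       // single row index: row vector
  matrix[2, 4] m1 = a[is];       // multiple row index: matrix
  real x = a[i, j];              // single, single: real
  row_vector[3] r2 = a[i, js];   // single, multiple: row vector
  vector[2] v = a[is, j];        // multiple, single: (column) vector
  matrix[2, 3] m2 = a[is, js];   // multiple, multiple: matrix
}
```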
Evaluation of matrices with multiple indexes is defined to respect the
following distributivity conditions.
```stan
m[idxs1, idxs2][i, j] = m[idxs1[i], idxs2[j]]
m[idxs, idx][j] = m[idxs[j], idx]
m[idx, idxs][j] = m[idx, idxs[j]]
```
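
A small worked instance of these conditions, using an assumed 3 x 3 matrix and assumed index arrays, might look as follows. The comments trace each element back to the original matrix.

```stan
transformed data {
  matrix[3, 3] m = [[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]];
  array[2] int idxs1 = {3, 1};
  array[2] int idxs2 = {2, 3};

  // s[i, j] equals m[idxs1[i], idxs2[j]]:
  // s[1, 2] is m[3, 3] (9) and s[2, 1] is m[1, 2] (2).
  matrix[2, 2] s = m[idxs1, idxs2];

  // c[j] equals m[idxs1[j], 2]: c[1] is m[3, 2] (8), c[2] is m[1, 2] (2).
  vector[2] c = m[idxs1, 2];

  // r[j] equals m[3, idxs2[j]]: r[1] is m[3, 2] (8), r[2] is m[3, 3] (9).
  row_vector[2] r = m[3, idxs2];
}
```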
Evaluation of arrays of matrices and arrays of vectors or row vectors
is defined recursively, beginning with the array dimensions.
## Function application {#function-application.section}
Stan provides a range of built-in mathematical and statistical
functions, which are documented in the built-in function documentation.
Expressions in Stan may consist of the name of a function followed by a
sequence of zero or more argument expressions. For instance,
`log(2.0)` is the expression of type `real` denoting the
result of applying the natural logarithm to the value of the real
literal `2.0`.
Syntactically, function application has higher precedence than any of
the other operators, so that `y + log(x)` is interpreted as
`y + (log(x))`.
### Type signatures and result type inference {-}
Each function has a type signature which determines the allowable type
of its arguments and its return type. For instance, the function
signature for the logarithm function can be expressed as
```stan
real log(real);
```
and the signature for the `lmultiply` function is
`real lmultiply(real, real);`
A function is uniquely determined by its name and its sequence of
argument types. For instance, the following two functions are
different functions.
`real mean(array [] real);`
`real mean(vector);`
The first applies to a one-dimensional array of real values and the
second to a vector.
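
To illustrate, the same name `mean` can be applied to both container types, and the argument type selects the signature. The variable names and values in this sketch are assumptions.

```stan
transformed data {
  array[3] real xs = {1.0, 2.0, 4.0};
  vector[3] v = [1.0, 2.0, 4.0]';

  real m1 = mean(xs);   // uses real mean(array[] real)
  real m2 = mean(v);    // uses real mean(vector)
}
```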
The identity conditions for functions explicitly forbid having two
functions with the same name and argument types but different return
types. This restriction also makes it possible to infer the type of a
function expression compositionally by examining only the types of its
subexpressions.
### Constants {-}
Constants in Stan are nothing more than nullary (no-argument)
functions. For instance, the mathematical constants $\pi$ and $e$ are
represented as nullary functions named `pi()` and `e()`.
See the [built-in constants section](#built-in-constants.section) for a list of built-in constants.
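
For example, constants are written with empty parentheses wherever a real-valued expression is allowed. The variable names and the radius of 1.5 below are illustrative assumptions.

```stan
transformed data {
  real circumference = 2 * pi() * 1.5;   // circle of radius 1.5
  real log_of_e = log(e());              // natural log of e
}
```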
### Type promotion and function resolution {-}
Because of integer to real type promotion, rules must be established
for which function is called given a sequence of argument types. The
scheme employed by Stan is the same as that used by C++, which
resolves a function call to the function requiring the minimum number
of type promotions.
For example, consider a situation in which the following two function
signatures have been registered for `foo`.
```stan
real foo(real, real);
int foo(int, int);
```
The use of `foo` in the expression `foo(1.0, 1.0)` resolves
to `foo(real, real)`, and thus the expression `foo(1.0, 1.0)`
itself is assigned a type of `real`.
Because integers may be promoted to real values, the expression
`foo(1, 1)` could potentially match either `foo(real, real)`
or `foo(int, int)`. The former requires two type promotions and
the latter requires none, so `foo(1, 1)` is resolved to function
`foo(int, int)` and is thus assigned the type `int`.
The expression `foo(1, 1.0)` has argument types `(int, real)`
and thus does not explicitly match either function signature. By
promoting the integer expression `1` to type `real`, it is
able to match `foo(real, real)`, and hence the type of the
function expression `foo(1, 1.0)` is `real`.
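
The three cases above can be sketched with user-defined functions. This assumes a Stan release that permits overloading of user-defined functions; the function bodies are hypothetical and chosen only so the program is complete.

```stan
functions {
  // Hypothetical bodies; only the signatures matter for resolution.
  real foo(real x, real y) {
    return x + y;
  }
  int foo(int x, int y) {
    return x + y;
  }
}
transformed data {
  real a = foo(1.0, 1.0);   // exact match for foo(real, real)
  int b = foo(1, 1);        // exact match for foo(int, int), no promotions
  real c = foo(1, 1.0);     // first argument promoted, foo(real, real)
}
```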
In some cases (though not for any built-in Stan functions), a
situation may arise in which the function referred to by an
expression remains ambiguous. For example, consider a situation in
which there are exactly two functions named `bar` with the
following signatures.
```stan
real bar(real, int);
real bar(int, real);
```
With these signatures, the expressions `bar(1.0, 1)` and
`bar(1, 1.0)` resolve to the first and second of the above
functions, respectively. The expression `bar(1.0, 1.0)` is
illegal because real values may not be demoted to integers. The
expression `bar(1, 1)` is illegal for a different reason. If the
first argument is promoted to a real value, it matches the first
signature, whereas if the second argument is promoted to a real value,
it matches the second signature. The problem is that these both
require one promotion, so the function name `bar` is ambiguous.
If there is not a unique function requiring fewer promotions than all
others, as with `bar(1, 1)` given the two declarations above,
the Stan compiler will flag the expression as illegal.
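
A sketch of the `bar` case follows, again assuming a Stan release that permits overloading of user-defined functions. The function bodies are hypothetical, and the two illegal calls are left as comments so the program remains valid.

```stan
functions {
  // Hypothetical bodies; only the signatures matter for resolution.
  real bar(real x, int n) {
    return x * n;
  }
  real bar(int n, real x) {
    return n + x;
  }
}
transformed data {
  real u = bar(1.0, 1);   // resolves to bar(real, int)
  real v = bar(1, 1.0);   // resolves to bar(int, real)
  // real w = bar(1, 1);      // illegal: both signatures need one promotion
  // real z = bar(1.0, 1.0);  // illegal: real cannot be demoted to int
}
```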
### Random-number generating functions {-}
For most of the distributions supported by Stan, there is a
corresponding random-number generating function. These random number
generators are named by the distribution with the suffix `_rng`.
For example, a univariate normal random number can be generated by
`normal_rng(0, 1)`; only the parameters of the distribution,
here a location (0) and scale (1), are specified because the variate is
generated.
#### Random-number generators locations {-}
The use of random-number generating functions is restricted to the
transformed data and generated quantities blocks; attempts to use them
elsewhere will result in a parsing error with a diagnostic message.
They may also be used in the bodies of user-defined functions whose
names end in `_rng`.
This allows the random number generating functions to be used for
simulation in general, and for Bayesian posterior predictive checking
in particular.
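
A minimal sketch of both permitted uses is given below. The function name `shifted_normal_rng` and the variable names are hypothetical; the only requirement taken from the text is that the user-defined function's name ends in `_rng`.

```stan
functions {
  // The name ends in _rng, so the body may call random number generators.
  real shifted_normal_rng(real shift) {
    return shift + normal_rng(0, 1);
  }
}
generated quantities {
  real z = normal_rng(0, 1);                  // direct use in generated quantities
  real z_shifted = shifted_normal_rng(2.5);   // use through a user-defined _rng function
}
```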
#### Posterior predictive checking {-}
Posterior predictive checks typically use the parameters of the model
to generate simulated data (at the individual level and, optionally, at
the group level for hierarchical models). The simulated data can then
be compared to the actual data, informally using plots and formally by
means of test statistics, in order to assess the suitability of the
model; see Chapter 6 of [@GelmanEtAl:2013] for more information on
posterior predictive checks.
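
As a sketch under stated assumptions, a simple normal model might generate replicated data and one test statistic as follows. The model, the names `y_rep` and `max_rep`, and the choice of the maximum as the statistic are illustrative only; priors are omitted for brevity.

```stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
generated quantities {
  // One replicated data set per posterior draw, simulated from the fitted model.
  array[N] real y_rep = normal_rng(rep_vector(mu, N), sigma);
  // A test statistic on the replicated data, to compare with max(y).
  real max_rep = max(y_rep);
}
```

Comparing the posterior draws of `max_rep` with the observed `max(y)` outside of Stan gives one formal check of the kind described above.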
## Type inference