Commit d1ea3f4

update docs for PRNG
1 parent 31249fb commit d1ea3f4

1 file changed: +27 -2 lines changed

designs/0020-parallel-chain-api.md

Lines changed: 27 additions & 2 deletions
@@ -80,6 +80,22 @@ The services API on the backend has a prototype implementation found [here](http
8080

8181
Then a [`tbb::parallel_for()`](https://github.com/stan-dev/stan/blob/147fba5fb93aa007ec42744a36d97cc84c291945/src/stan/services/sample/hmc_nuts_dense_e_adapt.hpp#L261) is used to run each of the samplers.
8282

83+
PRNGs will be initialized as in the following pseudocode, where a constant stride is used to seek each chain's PRNG to a disjoint region of the stream.
84+
85+
```cpp
86+
inline boost::ecuyer1988 create_rng(unsigned int seed, unsigned int init_chain_id, unsigned int chain_num) {
87+
// Initialize L'Ecuyer generator
88+
boost::ecuyer1988 rng(seed);
89+
90+
// Seek generator to disjoint region for each chain
91+
static uintmax_t DISCARD_STRIDE = static_cast<uintmax_t>(1) << 50;
92+
rng.discard(DISCARD_STRIDE * (init_chain_id + chain_num - 1));
93+
return rng;
94+
}
95+
```
96+
97+
The constant stride guarantees reproducibility for a given seed, whether all chains run in one multi-chain program or are split across several programs that each run multiple chains, as noted below.
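
As a rough illustration of why the stride gives this property, the sketch below mirrors the pseudocode under two loudly stated assumptions: `std::minstd_rand` stands in for `boost::ecuyer1988`, and the stride is a small demo value rather than `1 << 50`, because `discard` on the standard engine may take time linear in its argument. The names `create_rng_sketch`, `first_draws`, and `kDemoStride` are hypothetical and not part of the design.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for create_rng(): same offset formula as the
// pseudocode, but with std::minstd_rand instead of boost::ecuyer1988.
inline std::minstd_rand create_rng_sketch(unsigned int seed,
                                          unsigned int init_chain_id,
                                          unsigned int chain_num) {
  std::minstd_rand rng(seed);
  // Assumption: small demo stride; the design uses 1 << 50.
  static const std::uintmax_t kDemoStride = 1000;
  rng.discard(kDemoStride * (init_chain_id + chain_num - 1));
  return rng;
}

// Collect the first n draws of the chain identified by (init_chain_id, chain_num).
inline std::vector<std::uint_fast32_t> first_draws(unsigned int seed,
                                                   unsigned int init_chain_id,
                                                   unsigned int chain_num,
                                                   int n) {
  std::minstd_rand rng = create_rng_sketch(seed, init_chain_id, chain_num);
  std::vector<std::uint_fast32_t> draws;
  for (int i = 0; i < n; ++i) {
    draws.push_back(rng());
  }
  return draws;
}
```

Because the offset depends only on `init_chain_id + chain_num - 1`, calling `first_draws(123, 1, 3, 5)` and `first_draws(123, 3, 1, 5)` should yield the same draws: the third chain of a program started with `id=1` and the first chain of a program started with `id=3` seek to the same region.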
98+
8399
### Recommended Upstream Initialization
84100
85101
Upstream packages can generate `init` and `init_inv_metric` as they wish, though for cmdstan the prototype follows these rules for reading user input.
@@ -89,7 +105,7 @@ If the user specifies their init as `{file_name}.{file_ending}` with an input `i
89105
For example, if a user specifies `chains=4`, `id=2`, and their init file as `init=init.data.R` then the program
90106
will first search for `init.data_2.R` and if it finds it will then search for `init.data_3.R`,
91107
`init.data_4.R`, `init.data_5.R` and will fail if any of these files is not found. If the program fails to find `init.data_2.R` then it will attempt
92-
to find `init.data.R` and if successfull will use these initial values for all chains. If neither
108+
to find `init.data.R` and if successful will use these initial values for all chains. If neither
93109
are found then an error will be thrown.
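
The lookup rules above can be sketched roughly as follows. This is a minimal illustration, not the prototype's code: the function names `per_chain_init_names` and `resolve_init_files` are hypothetical, and file existence is passed in as a caller-supplied predicate (an assumption made so the logic can be exercised without touching the filesystem).

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: build "{file_name}_{id}.{file_ending}" for `chains`
// consecutive ids, splitting the init name at its last '.'.
inline std::vector<std::string> per_chain_init_names(const std::string& init,
                                                     unsigned int id,
                                                     unsigned int chains) {
  const std::size_t dot = init.find_last_of('.');
  const bool has_dot = dot != std::string::npos;
  const std::string stem = has_dot ? init.substr(0, dot) : init;
  const std::string ending = has_dot ? init.substr(dot) : std::string();
  std::vector<std::string> names;
  for (unsigned int i = 0; i < chains; ++i) {
    names.push_back(stem + "_" + std::to_string(id + i) + ending);
  }
  return names;
}

// Hypothetical resolver: per-chain files win if the first one exists (then all
// must exist); otherwise fall back to the shared file; otherwise throw.
inline std::vector<std::string> resolve_init_files(
    const std::string& init, unsigned int id, unsigned int chains,
    const std::function<bool(const std::string&)>& exists) {
  const std::vector<std::string> names = per_chain_init_names(init, id, chains);
  if (!names.empty() && exists(names.front())) {
    for (const std::string& name : names) {
      if (!exists(name)) {
        throw std::runtime_error("missing per-chain init file: " + name);
      }
    }
    return names;
  }
  if (exists(init)) {
    return std::vector<std::string>(chains, init);  // same values for all chains
  }
  throw std::runtime_error("no init file found for: " + init);
}
```

With `init=init.data.R`, `id=2`, and `chains=4`, `per_chain_init_names` produces `init.data_2.R` through `init.data_5.R`, matching the example in the text.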
94110
95111
Documentation must be added to clarify reproducibility between a multi-chain program and running multiple chains across several programs. This requires
@@ -111,9 +127,18 @@ examples/bernoulli/bernoulli sample data file=examples/bernoulli/bernoulli.data.
111127
examples/bernoulli/bernoulli sample data file=examples/bernoulli/bernoulli.data.R chains=1 id=4 random seed=123 output file=output4.csv
112128
```
113129

130+
In general, the constant stride allows for the following, where `n1 + n2 + n3 + n4 = N` chains.
131+
132+
```
133+
seed=848383, id=1, chains=n1
134+
seed=848383, id=1 + n1, chains=n2
135+
seed=848383, id=1 + n1 + n2, chains=n3
136+
seed=848383, id=1 + n1 + n2 + n3, chains=n4
137+
```
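
The bookkeeping behind this split can be checked with a small sketch. It assumes that a run with `id` and `chains` covers the consecutive chain ids `id, id + 1, ..., id + chains - 1` (the same numbering the init file example uses), and the helper names `chain_ids` and `split_run_ids` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Chain ids covered by one program run: id, id + 1, ..., id + chains - 1.
inline std::vector<unsigned int> chain_ids(unsigned int id, unsigned int chains) {
  std::vector<unsigned int> ids;
  for (unsigned int i = 0; i < chains; ++i) {
    ids.push_back(id + i);
  }
  return ids;
}

// Chain ids covered by several runs, where each run starts at
// 1 + (sum of the sizes of the runs before it), as in the listing above.
inline std::vector<unsigned int> split_run_ids(const std::vector<unsigned int>& sizes) {
  std::vector<unsigned int> all;
  unsigned int id = 1;
  for (unsigned int n : sizes) {
    const std::vector<unsigned int> ids = chain_ids(id, n);
    all.insert(all.end(), ids.begin(), ids.end());
    id += n;  // next run starts right after this one
  }
  return all;
}
```

For example, splitting `N = 8` chains as `n1 = 2`, `n2 = 1`, `n3 = 3`, `n4 = 2` covers exactly the ids of a single run with `id=1, chains=8`. Since each chain's PRNG offset is a fixed multiple of the stride determined by its id, the split runs would be expected to seek to the same regions as the single run.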
138+
114139

115140

116141
# Drawbacks
117142
[drawbacks]: #drawbacks
118143

119-
This does add overhead to existing implimentations in managing the per chain IO.
144+
This does add overhead to existing implementations in managing the per chain IO.
