Commit 82159ed

Add parallel experiments via GNU parallel and geometric mean aggregation
Experiments across multiple inputs/seeds should be parallelized with GNU parallel to reduce wall-clock time. Geometric mean is the default summary statistic for aggregating multiple measurements.
1 parent 585dd47 commit 82159ed

1 file changed: +12 -0 lines changed

AAE.md

Lines changed: 12 additions & 0 deletions
@@ -123,6 +123,18 @@ Run the benchmark and collect results:
- Extract the key metric(s) from the log.
- If the run crashes, diagnose from the log tail.

**Parallel runs**: When the benchmark must be run across multiple inputs, seeds, or configurations, use GNU `parallel` to execute them concurrently. For example:
```bash
mkdir -p logs  # the redirect below fails if logs/ does not exist
parallel --jobs 4 './bench.sh --input {} > logs/run_{}.log 2>&1' ::: input1 input2 input3 input4
```
This reduces wall-clock time per iteration: with `--jobs 4`, four independent runs finish in roughly the time of the slowest single run rather than the sum of all four. Only parallelize when runs are independent and the system has sufficient resources (CPU cores, memory, GPU slots); otherwise contention distorts the measurements.
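
One way to keep the job count within the machine's capacity is sketched below. The helper name `job_slots`, the `--seed` flag on `bench.sh`, and the `logs/joblog.tsv` path are assumptions made for illustration and are not part of this commit. `--joblog` and the positional replacement strings `{1}` and `{2}` are standard GNU `parallel` features.

```bash
#!/usr/bin/env bash
# Hypothetical helper: cap the requested job count at the number of online cores.
# Usage: job_slots REQUESTED [CORES]
job_slots() {
  local requested=$1
  # Fall back to the system's online core count when CORES is not given.
  local cores=${2:-$(getconf _NPROCESSORS_ONLN)}
  if [ "$requested" -lt "$cores" ]; then
    echo "$requested"
  else
    echo "$cores"
  fi
}

# Example invocation, commented out so the snippet has no side effects.
# It runs every input/seed combination and records per-job runtimes and
# exit codes in a job log:
# mkdir -p logs
# parallel --jobs "$(job_slots 4)" --joblog logs/joblog.tsv \
#   './bench.sh --input {1} --seed {2} > logs/run_{1}_{2}.log 2>&1' \
#   ::: input1 input2 ::: 1 2 3
```

Capping by core count addresses CPU contention only. Memory and GPU slots still need to be checked separately.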

**Aggregation**: When an experiment produces multiple measurements (e.g. across inputs or seeds), use the **geometric mean** as the default summary statistic. The geometric mean is appropriate for ratios and multiplicative quantities like speedups or normalized scores, and is less sensitive to large outliers than the arithmetic mean. It is only meaningful for strictly positive values. Compute it as:
```
geometric_mean = (x1 * x2 * ... * xn) ^ (1/n)
```
Use the arithmetic mean only when the metric is inherently additive (e.g. total time summed across phases) and the user explicitly requests it.
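
The geometric mean formula above can be evaluated from the shell as a minimal sketch, assuming one measurement per line on stdin. The function name `geomean` is hypothetical. It computes `exp(mean(log x))`, which is mathematically equivalent to the n-th root of the product but avoids overflow or underflow when there are many measurements.

```bash
#!/usr/bin/env bash
# geomean: read one positive number per line on stdin, print the geometric mean.
# Exits with status 1 on empty input or on any value <= 0.
geomean() {
  awk '
    NF == 0 { next }              # skip blank lines
    $1 <= 0 { bad = 1; exit }     # geometric mean is undefined here; jump to END
    { sum += log($1); n++ }       # accumulate the sum of logarithms
    END {
      if (bad || n == 0) exit 1
      printf "%.6f\n", exp(sum / n)
    }
  '
}

printf '%s\n' 1 4 16 | geomean   # prints 4.000000
```

For comparison, the arithmetic mean of 1, 4, and 16 is 7, which is pulled much further toward the largest value.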

### 6. Evaluate (Keep or Discard)

Compare the result to the current best:
