64 changes: 30 additions & 34 deletions README.md
@@ -165,9 +165,33 @@ The precise run configurations should be taken from the data spreadsheets that l

### Correctness testing

Correctness can be verified using the [validate.py](./validate.py) script.
To check that the benchmark software is working correctly, a matrix
comparison job must be run on 1 GPU and on 8 GPUs with no collocation (`qmode=1`)
and 10000 total dofs (`ndofs_global`). Both runs should produce the same
output `ynorm` and `znorm` (within numerical roundoff precision).
For a problem with 10000 dofs, the numerical value of the `ynorm` and
`znorm` should be 1.141577508 to 9 decimal places. The console output
and the JSON file should be reported.

The same correctness test should be performed with the CG operator on
1 and 8 GPUs:

- Correctness comparison with matrix result: `bench_dolfinx --mat_comp --cg
--ndofs_global=10000 --degree=3 --json mat_comp_cg.json`

In this case, `ynorm` and `znorm` should be 167.5924472. Console output
and JSON should be reported.

The validation script should be run as follows and produce output similar to the following:
For both of these tests, the [validate.py](./validate.py) script can be used to check
that the values of `ynorm` and `znorm` meet the accuracy requirements (see the description
below of how to use the script).
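
As a rough illustration of the kind of check described above, the following sketch compares `ynorm` and `znorm` against the reference values quoted in this section. It is not the code in `validate.py`: the key names `ynorm` and `znorm` in the parsed JSON, the helper name `norms_match`, and the tolerances are all assumptions made for this example.

```python
import math

# Reference norms quoted in the README for ndofs_global=10000.
REFERENCE_NORMS = {
    "mat_comp": 1.141577508,     # matrix comparison, stated to 9 decimal places
    "mat_comp_cg": 167.5924472,  # CG operator test, stated to 7 decimal places
}

# Assumed tolerances: half a unit in the last quoted decimal place.
TOLERANCES = {
    "mat_comp": 5e-10,
    "mat_comp_cg": 5e-8,
}


def norms_match(result, test_name):
    """Return True if both norms agree with the reference for test_name.

    `result` is a dict with (assumed) keys "ynorm" and "znorm", for example
    the parsed contents of the JSON file written by the benchmark.
    """
    reference = REFERENCE_NORMS[test_name]
    abs_tol = TOLERANCES[test_name]
    return all(
        math.isclose(result[key], reference, rel_tol=0.0, abs_tol=abs_tol)
        for key in ("ynorm", "znorm")
    )
```

For instance, `norms_match({"ynorm": 1.141577508, "znorm": 1.141577508}, "mat_comp")` returns `True` under these assumptions, while a value that differs in the fifth decimal place would be rejected.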

### Benchmark run validation

Benchmark runs can be verified using the [validate.py](./validate.py) script.

The validation script takes as input the JSON and console output from the
benchmark code. For example:
```
./validate.py output.json output.out

@@ -178,49 +202,20 @@ The validation script should be run as follows and produce output similar to the
nreps : 1000
scalar size : 64

MAT COMP performance: 0.2957402083152624 Gdofs/s
Stencil performance: 0.2957402083152624 Gdofs/s

Validation: PASSED


```

Sanity check: The matrix comparison must be run on 1 GPU and 8 GPUs with no collocation (`qmode=1`), 10000
total dofs (`ndofs_global`), and in both cases should produce the same
output `ynorm` and `znorm` (within numerical roundoff precision).
For a problem with 10000 dofs, the numerical value of the `ynorm` and
`znorm` should be 1.141577508 to 9 decimal places. The console output
and the JSON file should be reported.

For the acceptance tests, with `--qmode=0`, all GPU-based computations must
yield the same answer as a CPU-based variant, subject to numerical
roundoffs.

The same correctness test should be performed with the CG operator on
1 and 8 GPUs:

- Correctness comparison with matrix result: `bench_dolfinx --mat_comp --cg
--ndofs_global=10000 --degree=3 --json mat_comp_cg.json`

In this case, `ynorm` and `znorm` should be 167.5924472. Console output
and JSON should be reported.


### Performance results

In addition to testing for correctness, `validate.py` will also print the Computation Rate, which is the sole FoM for the benchmark.
In addition to validating the benchmark run, `validate.py` will also
print the Computation Rate, which is the sole FoM for the benchmark.
The Computation Rate printed by `validate.py` corresponds to the
total throughput in billion degrees of freedom per second (Gdofs/s).
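
To make the unit concrete, here is a small sketch of how a throughput in Gdofs/s could be derived from a run. The README does not state the formula used by the benchmark, so this assumes that each of the `nreps` repetitions processes `ndofs_global` degrees of freedom; the function name and its parameters are hypothetical.

```python
def gdofs_per_second(ndofs_global, nreps, elapsed_seconds):
    """Throughput in billions of degrees of freedom per second (Gdofs/s).

    Assumption: each of the nreps repetitions processes ndofs_global dofs,
    and elapsed_seconds is the wall time for all repetitions together.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return ndofs_global * nreps / elapsed_seconds / 1e9
```

Under this assumption, 200 million dofs applied 1000 times in 100 seconds corresponds to 2 Gdofs/s.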

<!--
The minimum problem size allowed is 200M DoFs at Q3 and 350M DoFs at
Q6. Performance may improve with larger problem sizes, subject to
memory available.

The throughput tests can in principle be run on any number of GPUs. The
problem size can be increased to use more GPU memory.
-->

### Reference data

#### LUMI-G (MI250x): Throughput in GDoFs/s for 2-64 nodes (8-256 GPUs)
@@ -295,6 +290,7 @@ The following changes to this document have been made since initial release:

| <div style="width:90px">Date</div> | Change |
|-----------:|--------|
| 2026-06-09 | Fixes validate.py script and clarifies wording around correctness and validation |
| 2026-06-05 | Removes incorrect `--mat-comp` option from suggested configurations |
| 2026-05-29 | Correct validation script to support CG correctness test |

8 changes: 2 additions & 6 deletions validate.py
@@ -56,10 +56,6 @@ def parse_output( json_fname, console_fname ):
     print(f' nreps : {d["nreps"]}')
     print(f' scalar size : {d["fp"]}')

-    if not d["is_mat_comp"]:
-        print(f'\n ERROR: Benchmark must be run with --mat_comp')
-        valid = False
-
     if d["fp"] != 64:
         print(f'\n ERROR: Benchmark must be run with 64-bit precision')
         valid = False
@@ -103,9 +99,9 @@ def parse_output( json_fname, console_fname ):
         valid = False

     if d["is_cg"] and valid:
-        print(f'\n CG performance: {d["gdofs"]} Gdofs/s')
+        print(f'\n Stencil+CG performance: {d["gdofs"]} Gdofs/s')
     else:
-        print(f'\n MAT COMP performance: {d["gdofs"]} Gdofs/s')
+        print(f'\n Stencil performance: {d["gdofs"]} Gdofs/s')

     print("\n Validation:", ("PASSED" if valid else "FAILED") )
     print()