diff --git a/README.md b/README.md
index b28c771..9abc972 100644
--- a/README.md
+++ b/README.md
@@ -165,9 +165,33 @@ The precise run configurations should be taken from the data spreadsheets that l
 
 ### Correctness testing
 
-Correctness can be verified using the [validate.py](./validate.py) script.
+To check that the benchmark software is working correctly, a matrix
+comparison job must be run on 1 GPU and 8 GPUs with no collocation (`qmode=1`), 10000
+total dofs (`ndofs_global`), and in both cases should produce the same
+output `ynorm` and `znorm` (within numerical roundoff precision).
+For a problem with 10000 dofs, the numerical value of the `ynorm` and
+`znorm` should be 1.141577508 to 9 decimal places. The console output
+and the JSON file should be reported.
+
+The same correctness test should be performed with the CG operator on
+1 and 8 GPUS:
+
+- Correctness comparison with matrix result: `bench_dolfinx --mat_comp --cg
+  --ndofs_global=10000 --degree=3 --json mat_comp_cg.json`
+
+In this case, `ynorm` and `znorm` should be 167.5924472. Console output
+and JSON should be reported.
 
-The validation script should be run as follows and produce output similar to the following:
+For both these tests, the [validate.py](./validate.py) script can be used to check
+the values of `ynorm` and `znorm` meet the accuracy requirements (see description
+below of how to use the script).
+
+### Benchmark run validation
+
+Benchmark runs can be verified using the [validate.py](./validate.py) script.
+
+The validation script takes as input the JSON and console output from the
+benchmark code. For example:
 
 ```
 ./validate output.json output.out
@@ -178,49 +202,20 @@ The validation script should be run as follows and produce output similar to the
  nreps : 1000
  scalar size : 64
 
- MAT COMP performance: 0.2957402083152624 Gdofs/s
+ Stencil performance: 0.2957402083152624 Gdofs/s
 
  Validation: PASSED
 ```
 
-Sanity check: The matrix comparison must be run on 1 GPU and 8 GPUs with no collocation (`qmode=1`), 10000
-total dofs (`ndofs_global`), and in both cases should produce the same
-output `ynorm` and `znorm` (within numerical roundoff precision).
-For a problem with 10000 dofs, the numerical value of the `ynorm` and
-`znorm` should be 1.141577508 to 9 decimal places. The console output
-and the JSON file should be reported.
-
-For the acceptance tests, with `--qmode=0`, all GPU-based computations must
-yield the same answer as a CPU-based variant, subject to numerical
-roundoffs.
-
-The same correctness test should be performed with the CG operator on
-1 and 8 GPUS:
-
-- Correctness comparison with matrix result: `bench_dolfinx --mat_comp --cg
-  --ndofs_global=10000 --degree=3 --json mat_comp_cg.json`
-
-In this case, `ynorm` and `znorm` should be 167.5924472. Console output
-and JSON should be reported.
-
-
 ### Performance results
 
-In addition to testing for correctness, `validate.py` will also print the Computation Rate, which is the sole FoM for the benchmark.
+In addition to validating the benchmark run, `validate.py` will also
+print the Computation Rate, which is the sole FoM for the benchmark.
 The Computation Rate printed by `validate.py` corresponds to the total throughput in
 billion degrees of freedom per second (Gdofs/s).
 
-
-
 ### Reference data
 
 #### LUMI-G (MI250x): Throughput in GDoFs/s for 2-64 nodes (8-256 GPUs)
@@ -295,6 +290,7 @@ The following changes to this document have been made since initial release:
 
 | Date | Change |
 |-----------:|--------|
+| 2026-06-09 | Fixes validate.py script and clarifies wording around correctness and validation |
 | 2026-06-05 | Removes incorrect `--mat-comp` option from suggested configurations |
 | 2026-05-29 | Correct validation script to support CG correctness test |
diff --git a/validate.py b/validate.py
index 93d7c2f..829663f 100755
--- a/validate.py
+++ b/validate.py
@@ -56,10 +56,6 @@ def parse_output( json_fname, console_fname ):
 print(f' nreps : {d["nreps"]}')
 print(f' scalar size : {d["fp"]}')
 
-if not d["is_mat_comp"]:
-    print(f'\n ERROR: Benchmark must be run with --mat_comp')
-    valid = False
-
 if d["fp"] != 64:
     print(f'\n ERROR: Benchmark must be run with 64-bit precision')
     valid = False
@@ -103,9 +99,9 @@ def parse_output( json_fname, console_fname ):
     valid = False
 
 if d["is_cg"] and valid:
-    print(f'\n CG performance: {d["gdofs"]} Gdofs/s')
+    print(f'\n Stencil+CG performance: {d["gdofs"]} Gdofs/s')
 else:
-    print(f'\n MAT COMP performance: {d["gdofs"]} Gdofs/s')
+    print(f'\n Stencil performance: {d["gdofs"]} Gdofs/s')
 
 print("\n Validation:", ("PASSED" if valid else "FAILED") )
 print()