model_performance.py benchmarks the existing Graph_Build executable for all
target networks:
alexnet_mnist
googlenet
densenet
resnet
yolo
It measures wall time and RSS memory timeline for two stages:
compile: from process start until Graph_Build prints "Starting inference..."
inference: from "Starting inference..." until "Inference completed successfully."
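The stage split described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name `split_stages`, the substring matching, and the injectable `clock` parameter are assumptions made for the example. Only the two marker strings come from the description.

```python
import time
from typing import Callable, Dict, Iterable, Optional

# Marker strings from the description above; substring matching is an assumption.
INFERENCE_START = "Starting inference..."
INFERENCE_DONE = "Inference completed successfully."


def split_stages(lines: Iterable[str], start_time: float,
                 clock: Callable[[], float] = time.perf_counter) -> Dict[str, float]:
    """Return compile and inference wall times (seconds) from a live line stream.

    start_time is the clock reading taken when the process was launched.
    """
    inference_started: Optional[float] = None
    inference_finished: Optional[float] = None
    for line in lines:
        now = clock()  # timestamp each line as it arrives
        if inference_started is None and INFERENCE_START in line:
            inference_started = now
        elif inference_started is not None and INFERENCE_DONE in line:
            inference_finished = now
            break
    if inference_started is None or inference_finished is None:
        raise RuntimeError("stage markers not found in output")
    return {
        "compile_s": inference_started - start_time,
        "inference_s": inference_finished - inference_started,
    }
```

In a real run, `lines` could be the `stdout` pipe of a `subprocess.Popen` object opened in text mode, with `start_time` taken from the same clock just before launching the process.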
The benchmark does not modify C++ code. It reads the executable output live, samples process memory while the command is running, stores the full RSS sample series, and writes a memory plot for every measured run.
Install matplotlib to generate memory plots. Install psutil to measure RSS
for the full process tree on every platform. Without psutil, Linux uses
/proc, while macOS and Windows use parent-process RSS fallbacks.
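One way such a psutil-or-/proc fallback might look is sketched below. The helper names `parse_vmrss_bytes` and `read_rss_bytes` are hypothetical, and for brevity the /proc branch reads only a single process, which may differ from what the script does on Linux.

```python
from pathlib import Path
from typing import Optional


def parse_vmrss_bytes(status_text: str) -> Optional[int]:
    """Extract VmRSS from the text of /proc/<pid>/status, converted to bytes."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line format: "VmRSS:     12345 kB"
            return int(line.split()[1]) * 1024
    return None


def read_rss_bytes(pid: int) -> Optional[int]:
    """RSS in bytes: process tree via psutil if installed, else one process via /proc."""
    try:
        import psutil
    except ImportError:
        psutil = None
    if psutil is not None:
        try:
            root = psutil.Process(pid)
            tree = [root, *root.children(recursive=True)]
            return sum(p.memory_info().rss for p in tree)
        except psutil.Error:
            return None  # process exited between samples
    try:
        return parse_vmrss_bytes(Path(f"/proc/{pid}/status").read_text())
    except OSError:
        return None  # no /proc (non-Linux) or process already gone
```

Calling `read_rss_bytes` in a loop with a short sleep while the command runs would yield a sample series like the one the benchmark stores.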
Build the project first:
cmake -S . -B build
cmake --build build --target Graph_Build --parallel
Run the default benchmark over every model with available JSON/input assets:
python3 benchmarks/model_performance.py
Run selected models and variants:
python3 benchmarks/model_performance.py \
--model googlenet,resnet \
--variant target \
--repeat 3 \
--warmup 1
The JSON report includes memory_samples for every run. PNG plots are written
to benchmark_results/memory_plots by default. Use --samples-csv-out to export
the memory timeline to CSV and --plots-dir to choose another plot directory.
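As a rough illustration of what a memory timeline export involves, the sketch below writes a sample series to CSV and returns the peak RSS. The `(elapsed_s, rss_bytes)` sample shape, the column names, and the function `write_samples_csv` are all assumptions. The actual layout of `memory_samples` and of the exported CSV is defined by the script and may differ.

```python
import csv
import io
from typing import Iterable, TextIO, Tuple


def write_samples_csv(samples: Iterable[Tuple[float, int]], out: TextIO) -> int:
    """Write (elapsed_s, rss_bytes) rows to out and return the peak RSS in bytes."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["elapsed_s", "rss_bytes"])  # hypothetical column names
    peak = 0
    for elapsed_s, rss_bytes in samples:
        writer.writerow([f"{elapsed_s:.3f}", rss_bytes])
        peak = max(peak, rss_bytes)
    return peak


# Example with made-up sample values, written to an in-memory buffer.
buf = io.StringIO()
peak = write_samples_csv([(0.0, 100), (0.05, 300), (0.1, 200)], buf)
```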
Use --variant target for the full target matrix: every supported parallel
backend with fusion off/on, plus oneDNN with fusion off/on. Fusion-on uses the
existing Conv+Relu fused layer for naive/parallel backends and existing
post-ops mode for oneDNN.
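The shape of the target matrix can be pictured with a short sketch. The backend names `backend_a` and `backend_b` are placeholders, and the `onednn` spelling and the function `target_matrix` are assumptions. The real list of supported backends comes from the script.

```python
from itertools import product
from typing import List, Tuple


def target_matrix(parallel_backends: List[str]) -> List[Tuple[str, bool]]:
    """Pair each parallel backend, then oneDNN, with fusion off and fusion on."""
    backends = [*parallel_backends, "onednn"]  # identifier spelling is assumed
    return list(product(backends, (False, True)))


# Placeholder backend names; each tuple is (backend, fusion_enabled).
matrix = target_matrix(["backend_a", "backend_b"])
```

With two parallel backends this yields six variants per model: each backend and oneDNN, once with fusion off and once with fusion on.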
Use --strict-assets to fail when a model JSON or input image directory is
missing instead of skipping that model.