Locust Compare

Compare performance results between two Locust runs and show changes relative to a base run. Works with both Locust report.csv outputs and the per-feature HTML reports generated by the Locust web UI.

Features

Compare any two runs (base vs. current).
Parses report.csv for aggregated and per-endpoint metrics.
Parses per-feature .html pages and compares the latest history sample.
Outputs human-readable tables, markdown with emoji indicators, or machine-friendly JSON.

Requirements

Python 3.8+ (no third-party dependencies).

Installation

With uvx (recommended)

Run directly from GitHub without cloning:

uvx --from 'git+https://github.com/dev-ankit/python-tools.git#subdirectory=tools/locust-compare' locust-compare <base_dir> <current_dir>

Or from a local clone (run from the tools/locust-compare/ directory):

git clone https://github.com/dev-ankit/python-tools.git
cd python-tools/tools/locust-compare
uvx --from . locust-compare test_runs/HTML-Report-292 test_runs/HTML-Report-294

Once published to PyPI, you can run without any prefix:

uvx locust-compare <base_dir> <current_dir>

With pip

cd tools/locust-compare
pip install .
locust-compare <base_dir> <current_dir>

Direct execution

python3 compare_runs.py <base_dir> <current_dir>

Quick Start

Compare two run directories (each containing a report.csv and HTML files):

locust-compare test_runs/HTML-Report-292 test_runs/HTML-Report-294

Compare two specific CSV files:

locust-compare test_runs/HTML-Report-292/report.csv test_runs/HTML-Report-294/report.csv

JSON output for scripting:

locust-compare test_runs/HTML-Report-292 test_runs/HTML-Report-294 -o json

Markdown output with emoji indicators (✅ better, ❌ worse, ➖ same):

python3 compare_runs.py test_runs/HTML-Report-292 test_runs/HTML-Report-294 -o markdown

Colorize text output (green=better, red=worse):

locust-compare test_runs/HTML-Report-292 test_runs/HTML-Report-294 --color

Exit code is 0 on success and 1 on error.

What It Compares

From CSV report.csv (Aggregated and each request row):

Requests/s, Request Count, Failure Count
Average, Median, Min, Max Response Time
Percentiles: 50%, 66%, 75%, 80%, 90%, 95%, 98%, 99%, 99.9%, 99.99%, 100% (if present)
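
To illustrate the CSV side, here is a minimal sketch of reading a report.csv into per-request metrics. It is not the code in compare_runs.py: the `load_rows` helper and the trimmed sample data are hypothetical, and the column names are assumed to match the metric names listed above.

```python
import csv
import io

# Hypothetical sample in the shape of a Locust stats CSV (values are made up).
SAMPLE_CSV = """Type,Name,Request Count,Failure Count,Average Response Time,Requests/s,95%
GET,/items,1000,3,80.5,190.2,140
,Aggregated,1500,7,85.2,286.2,150
"""


def load_rows(text):
    """Map each request name to its metrics, parsed as floats where possible."""
    rows = {}
    for record in csv.DictReader(io.StringIO(text)):
        metrics = {}
        for column, raw in record.items():
            if column in ("Type", "Name"):
                continue  # identifying columns, not metrics
            try:
                metrics[column] = float(raw)
            except (TypeError, ValueError):
                metrics[column] = None  # missing or non-numeric value
        rows[record["Name"]] = metrics
    return rows


rows = load_rows(SAMPLE_CSV)
print(rows["Aggregated"]["95%"])  # 150.0
```

Keying rows by request name makes it straightforward to line up the same endpoint across the base and current runs.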

From HTML feature pages (last entry in window.templateArgs.history):

Requests/s (current_rps)
Average Response Time (total_avg_response_time)
50% (response_time_percentile_0.5)
95% (response_time_percentile_0.95)

If a metric is not available for an item, it is shown as -.
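
The HTML comparison can be pictured with the sketch below, which pulls the last history entry out of a page. It assumes the page assigns a JSON object literal to window.templateArgs and that each history entry carries the four fields listed above as plain numbers. The `last_history_sample` helper and the sample page are hypothetical, and real Locust reports may differ between versions.

```python
import json

# Hypothetical, heavily trimmed page; a real report embeds far more data.
SAMPLE_HTML = """<html><body><script>
window.templateArgs = {"history": [
  {"current_rps": 250.0, "total_avg_response_time": 90.1,
   "response_time_percentile_0.5": 70, "response_time_percentile_0.95": 170},
  {"current_rps": 271.5, "total_avg_response_time": 71.8,
   "response_time_percentile_0.5": 60, "response_time_percentile_0.95": 160}
]};
</script></body></html>"""

# Display name -> field name in a history entry (taken from the list above).
FIELDS = {
    "Requests/s": "current_rps",
    "Average Response Time": "total_avg_response_time",
    "50%": "response_time_percentile_0.5",
    "95%": "response_time_percentile_0.95",
}


def last_history_sample(html):
    """Return the metrics of the final history entry, or None if absent."""
    start = html.find("window.templateArgs")
    if start == -1:
        return None
    brace = html.find("{", start)
    if brace == -1:
        return None
    # raw_decode parses one JSON value and ignores whatever follows it.
    args, _ = json.JSONDecoder().raw_decode(html, brace)
    history = args.get("history") or []
    if not history:
        return None
    last = history[-1]
    return {label: last.get(key) for label, key in FIELDS.items()}


print(last_history_sample(SAMPLE_HTML)["Requests/s"])  # 271.5
```

A field that is absent from the last entry comes back as None, which corresponds to the - placeholder in the tables.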

Example Output (truncated)

Markdown Output Example

The -o markdown flag produces markdown tables with emoji indicators for verdicts:

## Aggregated

| Metric | Base | Current | Diff | % Change | Verdict |
| --- | --- | --- | --- | --- | --- |
| Requests/s | 286.200 | 300 | +13.800 | +4.8% | ✅ |
| Request Count | 1500 | 1800 | +300 | +20.0% | ✅ |
| Failure Count | 7 | 4 | -3 | -42.9% | ✅ |
| Average Response Time | 85.200 | 78.500 | -6.700 | -7.9% | ✅ |
| 95% | 150 | 140 | -10 | -6.7% | ✅ |

Verdict emojis:

✅ Better performance
❌ Worse performance
➖ No change

JSON Schema

The -o json output is a single JSON object with one key per compared item.

CSV items use their request name; the aggregated row is keyed as Aggregated.
HTML feature pages are keyed as HTML:<feature_file_stem>.

Each item maps metric names to an object with:

{
  "base": number | null,
  "current": number | null,
  "diff": number | null,
  "pct_change": number | null
}

Example (truncated):

{
  "Aggregated": {
    "Requests/s": {"base": 268.623, "current": 196.786, "diff": -71.836, "pct_change": -26.72},
    "Average Response Time": {"base": 71.801, "current": 98.069, ...}
  },
  "HTML:conferences_widget_all_lists": {
    "Requests/s": {"base": 271.5, "current": 189.8, ...},
    "95%": {"base": 160, "current": 190, ...}
  }
}
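
For scripting, the JSON can be consumed along these lines. This is a sketch under assumptions: the payload is a hypothetical sample in the documented shape (in practice it would come from running the tool with -o json), and `slower_items` and its 10% threshold are illustrative, not part of the tool.

```python
import json

# Hypothetical payload in the documented shape; "GET /example" is made up.
payload = json.loads("""
{
  "Aggregated": {
    "95%": {"base": 150, "current": 140, "diff": -10, "pct_change": -6.7}
  },
  "GET /example": {
    "95%": {"base": 100, "current": 130, "diff": 30, "pct_change": 30.0},
    "Average Content Size": {"base": null, "current": 512,
                             "diff": null, "pct_change": null}
  }
}
""")


def slower_items(report, metric="95%", threshold_pct=10.0):
    """Return item names whose metric grew by more than threshold_pct percent."""
    flagged = []
    for item, metrics in report.items():
        pct = (metrics.get(metric) or {}).get("pct_change")
        # pct_change is null when the base value is 0 or missing.
        if pct is not None and pct > threshold_pct:
            flagged.append(item)
    return flagged


print(slower_items(payload))  # ['GET /example']
```

A CI job could fail the build when the returned list is non-empty.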

Notes & Limitations

For HTML pages, only the last sample in window.templateArgs.history is compared. This typically represents the end-state of the run. If you prefer a different aggregation (mean/max), open an issue or adjust the code where noted.
Request Count and Failure Count are not available from HTML pages and are displayed as -.
If the base value is 0 or missing, percent change is shown as -.
The tool skips non-feature HTML pages such as htmlpublisher-wrapper.html.
The tool prints a Verdict column. By default, it evaluates improvements as:
- Higher is better: Requests/s, Request Count.
- Lower is better: all response-time metrics and percentiles, Failure Count, Failures/s.
- Neutral (no verdict): other metrics (e.g., Average Content Size).
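
The percent-change and verdict rules above can be sketched as follows. This is an illustration rather than the code in compare_runs.py: the `compare` and `verdict` helpers are hypothetical, and recognising response-time metrics and percentiles by their name suffix is an assumption.

```python
HIGHER_IS_BETTER = {"Requests/s", "Request Count"}
LOWER_IS_BETTER = {"Failure Count", "Failures/s"}


def compare(base, current):
    """Build the base/current/diff/pct_change record for one metric."""
    if base is None or current is None:
        return {"base": base, "current": current,
                "diff": None, "pct_change": None}
    diff = current - base
    # Percent change is undefined when the base value is 0.
    pct = None if base == 0 else diff / base * 100
    return {"base": base, "current": current, "diff": diff, "pct_change": pct}


def verdict(metric, diff):
    """Return 'better', 'worse', 'same', or None for neutral metrics."""
    if diff is None:
        return None
    lower = (metric in LOWER_IS_BETTER
             or metric.endswith("Response Time")
             or metric.endswith("%"))
    if metric not in HIGHER_IS_BETTER and not lower:
        return None  # neutral metric, e.g. Average Content Size
    if diff == 0:
        return "same"
    improved = diff > 0 if metric in HIGHER_IS_BETTER else diff < 0
    return "better" if improved else "worse"


result = compare(7, 4)
print(round(result["pct_change"], 1), verdict("Failure Count", result["diff"]))
# -42.9 better
```

The three verdict strings map onto the ✅, ❌, and ➖ indicators used in the markdown output.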

Tool Layout

tools/locust-compare/
├── compare_runs.py     # CLI tool
├── pyproject.toml      # Package configuration
├── tests/              # Test suite
└── test_runs/          # Sample Locust outputs for trying the tool
    ├── HTML-Report-292/
    └── HTML-Report-294/

Contributing

Small and simple by design. If you need additional metrics, output formats, or aggregation modes, feel free to extend compare_runs.py or open a PR.
