Task Runner Comparison: Re-evaluating `tox` #2559

danceratopz · 2026-03-25T23:43:28Z

danceratopz
Mar 25, 2026
Maintainer

Goal TLDR

Now that uv handles environment isolation, most of tox's functionality is redundant. This document evaluates lighter alternatives that pair with uv run, with the goal of improving the local developer experience.

Why replace tox?

tox's core value is its environment matrix: testing across Python 3.10/3.11/3.12/3.13 with different dependency sets in isolated venvs. This project doesn't use that. Our 14 tox environments are named command sequences; in execution-specs, tox is a testing orchestrator being used as a task runner.

Meanwhile, uv's dependency-groups and uv run give us the same isolation guarantees without a separate tool managing venvs. The dev group unions everything for local development, and tasks can sync a specific group (for example, `uv sync --group test`) when needed. tox's venv-per-environment model is therefore redundant.
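To make "the dev group unions everything" concrete, here is a minimal Python sketch of how include-style dependency groups flatten into one list. It assumes the groups are declared with `include-group` entries as described in PEP 735; the `lint` group, the package names, and the `resolve_group` helper are hypothetical and are not taken from this project's pyproject.toml or from uv's implementation.

```python
# Hypothetical stand-in for a [dependency-groups] table in pyproject.toml.
GROUPS = {
    "test": ["pytest"],
    "lint": ["ruff", "mypy"],
    "dev": [{"include-group": "test"}, {"include-group": "lint"}],
}


def resolve_group(groups, name, _stack=()):
    """Flatten one group, following include-group entries recursively."""
    if name in _stack:
        raise ValueError(f"cyclic include: {' -> '.join(_stack + (name,))}")
    resolved = []
    for entry in groups[name]:
        if isinstance(entry, dict):
            # An include-group entry pulls in another group's requirements.
            included = resolve_group(groups, entry["include-group"], _stack + (name,))
        else:
            included = [entry]
        for requirement in included:
            if requirement not in resolved:  # keep order, drop duplicates
                resolved.append(requirement)
    return resolved


print(resolve_group(GROUPS, "dev"))
```

With this layout, a task that only needs the test tooling can sync the `test` group alone, while local development syncs `dev` and gets the union.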

Switching to a dedicated, lighter command runner would offer better ergonomics for local development (grouped task listing, simpler argument passthrough).

Other Build Tools

The following table lists tools that I think are heavier than what we need:

Tool	Solves	Why not
nox	venv-per-task isolation	Same structural overhead as tox; uv already handles this.
doit	Incremental rebuilds via file dependency tracking	We don't need "rebuild only if changed."
invoke/fabric	Python-native task definitions, SSH execution	Both effectively in maintenance mode (last releases mid-2023).
Task	YAML-based task orchestration	YAML is verbose for shell one-liners.
waf	Compiling C/C++/Fortran with complex dependency graphs	Build system, not a command runner.

As our tasks mainly consist of uv run <tool> <args>, a thin command runner seems a better fit. just is a good candidate and make, although not as light, is a worthwhile candidate due to its ubiquity.

Serious contenders: make vs just

Both are language-agnostic command runners that define named tasks (targets/recipes) mapping to shell commands. Neither manages virtual environments; they pair with uv run for that.

make

Pros:

Available out of the box on most Unix-like systems; typically no installation required.
Well-understood by most developers.
Massive ecosystem of tutorials and examples.

Cons:

Designed for file-based builds, not command running. Every command-only target needs a .PHONY declaration; otherwise make skips the target whenever a file of the same name exists.
Error messages are often cryptic.

just

Pros:

Purpose-built for command running. All recipes are commands by default (no .PHONY).
Clear error messages.

Cons:

Requires installation (not pre-installed on most systems).
Smaller ecosystem and community than make.

CLI ergonomics

How common developer commands compare across the three tools.

	tox	make	just
List tasks	`tox list`	(hand-written `help` target; not introspective)	`just` (list configured as default)
Basic task execution	`tox -e static`	`make static`	`just static`
Execution w/arg passthrough	`tox -e py3 -- tests/amsterdam/ -x`	`make fill ARGS="tests/amsterdam/ -x"`	`just fill tests/amsterdam/ -x`
Parallel task execution	`tox -p -e static` (if we split into multiple envs and apply the static label)	`make -j static`	`just static` (via parallel attribute)

Config file comparison

	tox	make	just
Task descriptions	`description =` field	`##` comment convention (parsed by `help` target)	`#` comment above recipe (shown in `just --list`)
Arg passthrough	`{posargs}` placeholder	`$(ARGS)` variable	`*args` parameter, interpolated with `{{ args }}`
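Indentation	Free-form	Tabs required (space-indented recipes fail with a `missing separator` error)	Spaces or tabs, used consistently within a recipe
Shell model	Managed by tox (each command is a separate invocation)	Each line runs in a separate shell	Each line runs in a separate shell by default; shebang recipes run as a single script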
Env var defaults	`{env:VAR:default}`	`$${VAR:-default}`	`env('VAR', 'default')`
Env var isolation	Isolated; requires `passenv` per var	Inherits full shell environment	Inherits full shell environment
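The "each line runs in a separate shell" model in the table has a practical consequence: shell state such as a `cd` does not carry over to the next line. The Python sketch below only emulates the two execution models; it does not invoke any of the tools. It assumes a POSIX `sh` on the PATH, and the helper names are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def run_linewise(lines, cwd):
    """Run every line in its own `sh -c` process (one shell per line)."""
    outputs = []
    for line in lines:
        result = subprocess.run(
            ["sh", "-c", line], cwd=cwd, capture_output=True, text=True, check=True
        )
        outputs.append(result.stdout.strip())
    return outputs


def run_single_shell(lines, cwd):
    """Run all lines as one script inside a single `sh` process."""
    result = subprocess.run(
        ["sh", "-c", "\n".join(lines)],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip().splitlines()


def demo():
    """Return (root dir name, linewise outputs, single-shell outputs)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "sub").mkdir()
        lines = ["cd sub", 'basename "$(pwd)"']
        return root.name, run_linewise(lines, root), run_single_shell(lines, root)


root_name, linewise, single = demo()
print("one shell per line:", linewise[-1])  # still the root directory
print("single shell:", single[-1])          # the `cd sub` took effect
```

In the per-line model, commands that depend on each other have to be chained on one line (for example `cd sub && <command>`) so that they share a shell.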

Boilerplate per task

tox.ini

[testenv:static]
description = Run static checks
commands =
    ...

Makefile

.PHONY: static
## Run static checks
static:
	...

Justfile

# Run static checks
static:
    ...

Task listing output

tox — ordered by declaration in envlist:

❯ tox list
default environments:
py3                           -> Fill the tests using EELS (with Python)
pypy3                         -> Fill the tests using EELS (with PyPy)
json_loader                   -> Fill and run the spec against test fixtures
static                        -> Run spelling, lint, typechecking and dependency checks
tests_pytest_py3              -> Run the testing package unit tests (with Python)
...

make — whatever the hand-written help target outputs (no standard format).

just — grouped by [group] attribute; recipes can appear in multiple groups:

❯ just
Available recipes:
    [benchmark tests]
    benchmark-fixed-opcode-cli *args    # Fill benchmark tests with --fixed-opcode-count 1
    benchmark-fixed-opcode-config *args # Run benchmark_parser, then fill benchmark tests using its config
    benchmark-gas-values *args          # Fill benchmark tests with --gas-benchmark-values
    tests_benchmark_pytest_py3 *args    # Run benchmark framework unit tests (with Python)

    [consensus tests]
    coverage                            # Generate HTML coverage report from last just fill run
    fill *args                          # Fill the consensus tests using EELS (with Python)
    ...

Adoption

make is ubiquitous. Many Python projects use Makefiles as task runners (Django, FastAPI, and Flask projects commonly include them), though this is a pragmatic reuse of a build tool rather than an endorsed pattern. The Scientific Python Development Guide mentions make as traditional but recommends nox for Python-specific workflows.

just is newer and has smaller overall adoption, but is gaining traction specifically in the modern Python tooling ecosystem. Notably, Astral (the team behind ruff and uv) uses Justfiles in their own projects.

Other projects using just include behave (Python testing framework), GluonTS (AWS time series library), and takopi (coding agent bridge to Telegram).

danceratopz · 2026-03-25T23:53:18Z

danceratopz
Mar 25, 2026
Maintainer Author

This PR implements the migration to just:

feat(ci,tooling): replace tox with just #2555


felix314159 · 2026-03-26T11:08:41Z

felix314159
Mar 26, 2026
Maintainer

after checking our 'discussions' page first thing in the morning for 4 months we finally have new content! thanks for the comparison, i think 'better error messages than make' and 'astral uses just' are pretty good arguments that i did not consider. since you want just so much, i am fine with it

