Task Runner Comparison: Re-evaluating tox
#2559
Unanswered
danceratopz
asked this question in
General
Replies: 2 comments
This PR implements the migration to |
after checking our 'discussions' page first thing in the morning for 4 months we finally have new content! thanks for the comparison, i think 'better error messages than make' and 'astral uses just' are pretty good arguments that i did not consider. since you want |
Goal TLDR
Now that uv handles environment isolation, most of tox's functionality is redundant. This document evaluates lighter alternatives that pair with `uv run`, with the goal of improving the local developer experience.
Why replace tox?
tox's core value is its environment matrix: testing across Python 3.10/3.11/3.12/3.13 with different dependency sets in isolated venvs. This project doesn't use that. Our 14 tox environments are named command sequences; in execution-specs, tox is a testing orchestrator being used as a task runner.
Meanwhile, uv's `dependency-groups` and `uv run` give us the same isolation guarantees without a separate tool managing venvs. The `dev` group unions everything for local dev, and tasks can use specific groups (`uv sync --group test`) when need be. tox's venv-per-environment model is redundant when uv already handles this.
Switching to a dedicated, lighter command runner would offer better ergonomics for local development (grouped task listing, simpler argument passthrough).
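As a rough sketch of how a thin runner could lean on uv's dependency groups, a `just` recipe might select a group directly through `uv run`. The recipe name and the use of pytest are assumptions for illustration; only the `test` group name comes from the text above.

```just
# Hypothetical recipe: run tests with the `test` dependency group included.
# `uv run --group test` asks uv to include that group in the environment.
test *args:
    uv run --group test pytest {{ args }}
```

Invoked as `just test tests/amsterdam/ -x`, the extra arguments would be forwarded to pytest unchanged.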
Other Build Tools
The following table lists tools that I think are heavier than what we need:
As our tasks mainly consist of `uv run <tool> <args>`, a thin command runner seems a better fit. `just` is a good candidate and `make`, although not as light, is a worthwhile candidate due to its ubiquity.
Serious contenders: make vs just
Both are language-agnostic command runners that define named tasks (targets/recipes) mapping to shell commands. Neither manages virtual environments; they pair with `uv run` for that.
make
Pros:
Cons:
`.PHONY` declarations.
just
Pros:
`.PHONY`).
Cons:
CLI ergonomics
How common developer commands compare across the three tools.
| tox | make | just |
|---|---|---|
| `tox list` | `make help` (hand-written `help` target; not introspective) | `just` (list configured as default) |
| `tox -e static` | `make static` | `just static` |
| `tox -e py3 -- tests/amsterdam/ -x` | `make fill ARGS="tests/amsterdam/ -x"` | `just fill tests/amsterdam/ -x` |
| `tox -p -e static` (if we split into multiple envs and apply the static label) | `make -j static` | `just static` (via parallel attribute) |
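For the parallel case, here is a minimal sketch of what the `just` side could look like. It assumes a `just` version that supports the `[parallel]` attribute, and it uses two hypothetical sub-recipes (`lint` and `typecheck`) with ruff and mypy as stand-in tools.

```just
# Hypothetical aggregate recipe: [parallel] runs the listed dependencies concurrently.
[parallel]
static: lint typecheck

# Stand-in lint step (tool choice is an assumption).
lint:
    uv run ruff check

# Stand-in type-check step (tool choice and path are assumptions).
typecheck:
    uv run mypy src
```

With this layout `just static` stays a single command, while `just lint` and `just typecheck` remain individually runnable.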
Config file comparison
| tox | make | just |
|---|---|---|
| `description =` field | `##` comment convention (parsed by `help` target) | `#` comment above recipe (shown in `just --list`) |
| `{posargs}` placeholder | `$(ARGS)` variable | `*args` parameter, interpolated with `{{ args }}` |
| `{env:VAR:default}` | `$${VAR:-default}` | `env('VAR', 'default')` |
| `passenv` per var | | |
Boilerplate per task
tox.ini
Makefile
Justfile
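As an illustration of the per-task boilerplate on the `just` side, a single task could be declared roughly as follows. This is a sketch rather than the project's actual Justfile: the recipe body, the group names, and the use of pytest are assumptions.

```just
# Show the recipe list when `just` is run with no arguments.
default:
    @just --list

# Hypothetical task: extra arguments are passed through to the tool.
[group('test')]
[group('ci')]
fill *args:
    uv run pytest {{ args }}
```

The task itself needs only a name, an optional `*args` parameter, and the command line; the `[group(...)]` attributes affect only how the recipe is listed.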
Task listing output
tox (ordered by declaration in `envlist`):
make (whatever the hand-written `help` target outputs; no standard format).
just (grouped by the `[group]` attribute; recipes can appear in multiple groups):
Adoption
make is ubiquitous. Many Python projects use Makefiles as task runners (Django, FastAPI, and Flask projects commonly include them), though this is a pragmatic reuse of a build tool rather than an endorsed pattern. The Scientific Python Development Guide mentions make as traditional but recommends nox for Python-specific workflows.
just is newer and has smaller overall adoption, but is gaining traction specifically in the modern Python tooling ecosystem. Notably, Astral (the team behind ruff and uv) uses Justfiles in their own projects:
Other projects using just include behave (Python testing framework), GluonTS (AWS time series library), and takopi (Coding agent bridge to Telegram).