labeille works in two phases: resolve discovers packages and builds a registry, then run executes their test suites against a JIT-enabled CPython build and detects crashes. Between these phases, packages need to be enriched with specific installation and test instructions.
labeille resolve → enrich registry → labeille run → analyze results
│ │ │ │
Fetch PyPI Fill in YAML Clone, build, Crashes, timing,
metadata, install/test test, detect comparisons
find repos commands crashes
↕
laruche (external registry repo)
The registry is maintained externally in
laruche and synced locally via
labeille registry sync. Resolve creates skeleton entries, enrichment fills
them in, and run consumes them. Each phase can be run independently and
repeatedly.
labeille resolve fetches PyPI metadata for each package and extracts:
- Source repository URL — from PyPI metadata fields (project_urls, home_page, etc.)
- Extension type —
pure,extensions, orunknown, determined from wheel tag analysis - PyPI URL — direct link to the package page
It writes two things to the registry directory (default: ~/.local/share/labeille/registry/):
index.yaml— a sorted list of all tracked packages with download countspackages/{name}.yaml— per-package YAML with the discovered metadata and empty fields for enrichment
You can feed packages to resolve from multiple sources:
# Positional arguments
labeille resolve requests click flask
# From a file (one package name per line)
labeille resolve --from-file packages.txt
# From a JSON file with download counts (e.g. from BigQuery)
labeille resolve --from-json top-pypi.json --top 500
# Combine sources — all are merged
labeille resolve requests --from-file extras.txt --from-json top-pypi.json --top 100Resolve never overwrites package files that have been manually enriched
(enriched: true). Re-running resolve on an existing registry safely adds new
packages without touching ones you've already configured.
| Option | Description |
|---|---|
PACKAGES |
Package names (positional, multiple) |
--from-file FILE |
File with one package name per line |
--from-json FILE |
JSON file with download counts |
--top N |
Top N packages by downloads (requires --from-json) |
--registry-dir PATH |
Registry directory (default: ~/.local/share/labeille/registry/) |
--workers N |
Parallel PyPI API requests (default: 1) |
--timeout SECONDS |
PyPI API request timeout (default: 10) |
--dry-run |
Show what would be done without writing files |
-v, --verbose |
Detailed output |
-q, --quiet |
Only show errors |
--log-file PATH |
Log file path (default: labeille-resolve.log) |
labeille run reads the registry and, for each enriched package:
- Clones the source repository (shallow by default)
- Creates a virtual environment with the target Python
- Installs the package using its
install_command - Runs the test suite using its
test_command - Detects crashes from exit codes, signals, and stderr patterns
- Records results as JSONL with full metadata
The target Python has PYTHON_JIT=1 and PYTHONFAULTHANDLER=1 set automatically
to enable the JIT and get crash tracebacks.
# Run all enriched packages
labeille run --target-python ~/cpython/python
# Run specific packages
labeille run --target-python ~/cpython/python --packages requests,click,flask
# Run top 50 by downloads, 4 workers
labeille run --target-python ~/cpython/python --top 50 --workers 4
# Test a specific git revision
labeille run --target-python ~/cpython/python --packages=urllib3@abc1234 --no-shallow| Option | Description |
|---|---|
--packages CSV |
Comma-separated filter (supports name@revision syntax) |
--top N |
Top N packages by download count |
--skip-extensions |
Skip C extension packages (test pure Python only) |
--force-run |
Override skip and skip_versions flags |
--skip-completed |
Resume: skip already-tested packages |
| Status | Meaning |
|---|---|
pass |
Test suite exited 0 (all tests passed) |
fail |
Test suite exited 1 (some tests failed — normal) |
crash |
Segfault, abort, or assertion failure detected |
timeout |
Test suite exceeded the time limit |
install_error |
Package installation failed |
clone_error |
Git clone failed |
error |
Other unexpected error |
labeille detects crashes from:
- Exit codes: negative exit codes (signals), exit code 134 (SIGABRT), 139 (SIGSEGV)
- Stderr patterns: ASAN reports, Python fatal errors, assertion failures
- Signal names: extracted from the exit code for crash signature
Each crash gets a signature (signal + stderr context) for deduplication and comparison across runs.
--workers N runs N packages in parallel. Each worker handles one package
end-to-end. Memory scales linearly with worker count. For ASAN-enabled builds,
2-3 workers is the practical limit due to ~2-3x memory overhead per process.
By default, repos and venvs are created in temporary directories and cleaned up after each package. For faster repeated runs:
# Reuse repos and venvs across runs
labeille run --target-python ~/cpython/python --work-dir ~/labeille-work
# Or set them individually
labeille run --target-python ~/cpython/python \
--repos-dir ~/repos --venvs-dir ~/venvs
# Force reinstall when install commands change
labeille run --target-python ~/cpython/python --work-dir ~/work --refresh-venvs# Start a named run
labeille run --target-python ~/cpython/python --run-id my-batch
# Resume after interruption (skips already-tested packages)
labeille run --target-python ~/cpython/python --run-id my-batch --skip-completedOverride registry settings without editing YAML files:
# Inject extra dependencies into every venv
labeille run --target-python ~/cpython/python --extra-deps "trustme,uvicorn"
# Override all test commands
labeille run --target-python ~/cpython/python --test-command-override "python -m pytest -x"
# Append flags to test commands
labeille run --target-python ~/cpython/python --test-command-suffix "--tb=long -v"
# Use a fork's repo
labeille run --target-python ~/cpython/python --repo-override "requests=https://github.com/me/requests"| Option | Description |
|---|---|
--target-python PATH |
Python interpreter to test with (required) |
--registry-dir PATH |
Registry directory (default: ~/.local/share/labeille/registry/) |
--results-dir PATH |
Output directory (default: results) |
--packages CSV |
Comma-separated filter (supports name@revision) |
--top N |
Top N packages by download count |
--workers N |
Parallel package execution (default: 1) |
--timeout SECONDS |
Per-package timeout (default: 600) |
--skip-extensions |
Skip C extension packages |
--skip-completed |
Resume: skip already-tested packages |
--force-run |
Override skip and skip_versions flags |
--stop-after-crash N |
Stop after N crashes found |
--run-id ID |
Custom run identifier (default: timestamp) |
--work-dir PATH |
Base directory for repos and venvs |
--repos-dir PATH |
Persistent repo clones |
--venvs-dir PATH |
Persistent venvs |
--keep-work-dirs |
Don't clean up temporary directories |
--refresh-venvs |
Delete and recreate existing venvs |
--extra-deps CSV |
Inject additional packages into every venv |
--test-command-override STR |
Replace all test commands |
--test-command-suffix STR |
Append flags to test commands |
--repo-override PKG=URL |
Override repo URL (repeatable) |
--clone-depth N |
Git clone depth |
--no-shallow |
Full clone (needed for old revisions) |
--installer {auto,uv,pip} |
Package installer backend (default: auto) |
--env KEY=VALUE |
Extra environment variables (repeatable) |
--dry-run |
Don't actually execute tests |
-v, --verbose |
Show all details |
-q, --quiet |
Only show crashes |
The registry connects resolve and run. It is maintained as a separate project,
laruche, and synced locally via
labeille registry sync. By default, the registry lives at
~/.local/share/labeille/registry/.
- Resolve creates it — writes skeleton YAML with repo URL and extension type
- Enrichment fills it in — adds install_command, test_command, dependencies
- Run consumes it — reads the commands and executes them
This separation means you resolve once and run many times against different Python builds. See enrichment.md for the enrichment guide and the laruche repository for the full field schema and enrichment documentation.
# 1. Build a registry from the top 100 PyPI packages
labeille resolve --from-json top-pypi.json --top 100
# 2. Enrich packages (see doc/enrichment.md)
# Fill in install_command, test_command, etc. for each package
# 3. Run tests against a JIT-enabled CPython build
labeille run --target-python ~/jit_cpython/python \
--results-dir results \
--work-dir ~/labeille-work \
--workers 4
# 4. Analyze results
labeille analyze run # Summary
labeille analyze run -q # Crashes only
labeille analyze package requests # Deep dive on one package
# 5. Compare with a previous run
labeille analyze compare run_001 run_002When a crash is found:
# 1. Reproduce the crash
labeille run --target-python ~/jit_cpython/python --packages=urllib3 \
--work-dir ~/investigate
# 2. Check if HEAD still crashes
labeille run --target-python ~/jit_cpython/python --packages=urllib3
# 3. Test at the specific revision where the crash was found
labeille run --target-python ~/jit_cpython/python \
--packages=urllib3@abc1234 --no-shallow
# 4. Bisect across the package's git history
labeille bisect --target-python ~/jit_cpython/python \
--package urllib3 --good v2.0.0 --bad HEADIf the package code didn't change between runs but the crash appeared, the regression is almost certainly on the CPython/JIT side.
Each run creates a directory under results/{run_id}/:
run_meta.json— Run metadata: Python version, JIT status, hostname, timingresults.jsonl— One JSON line per package with status, exit code, signal, crash signature, timing, and installed dependency versionscrashes/— Full stderr captures for crashed packagesrun.log— Detailed debug log
Use labeille analyze commands to examine results, or process the JSONL directly.
Once you have the basic workflow running, labeille offers specialized testing modes:
- Benchmarking — Compare test suite performance across conditions (JIT vs no-JIT, different interpreters, with/without coverage)
- Free-threaded testing — Test packages against free-threaded CPython builds to detect crashes, deadlocks, and race conditions
- Compatibility analysis — Survey C extension packages for build compatibility against any Python version