This document describes the mutation testing implementation for Mostro, a critical Rust daemon handling Bitcoin P2P trades over Lightning Network and Nostr.
Mutation testing is a technique to measure the quality and effectiveness of the test suite. Unlike code coverage (which only checks if code is executed), mutation testing verifies that tests actually detect bugs.
-
Creating mutants: The tool makes small, controlled changes (mutations) to the source code
- Change
==to!=,+to-,truetofalse - Remove a line of code, change a condition boundary
- Replace
&&with||,>with>= - Replace function return value with default
- Change
-
Running the test suite: Tests are executed against each mutated version
-
Measuring survival rate:
- ✅ Mutant killed: Test failed → Good! Tests detected the artificial bug
- ❌ Mutant survived: Test passed → Bad! Tests did not catch the change
Score = (Mutants killed / Total mutants) × 100
- > 80%: Excellent test quality
- 50-80%: Acceptable, room for improvement
- < 50%: Poor test quality, needs immediate attention
- Financial security: Trading logic, escrow handling, and dispute resolution must be bulletproof
- Detect weak tests: Find tests that "cover" code but don't actually verify correctness
- Force better assertions: Encourages specific, strict assertions instead of generic ones
- Find edge cases: Surviving mutants often reveal untested boundary conditions
- CI integration: Can fail builds if mutation score drops below threshold
- Documentation: Living documentation of what behavior is actually tested
- Refactoring safety: High mutation score gives confidence when refactoring critical code
We use cargo-mutants as it is the most mature mutation testing tool for Rust.
cargo install cargo-mutants# Test all mutants
cargo mutants
# Test with specific packages
cargo mutants -p mostro-core
# Test specific files
cargo mutants --file src/flow.rs
# Test with sharding (for CI parallelization)
cargo mutants --shard 1/4
# Output to specific directory
cargo mutants --output mutants.outThe .mutants.toml file configures mutation testing behavior:
# Examine only source files
examine_globs = [
"src/**/*.rs",
]
# Exclude generated code, tests, and non-critical modules
exclude_globs = [
"**/target/**",
"**/tests/**",
"**/examples/**",
"src/main.rs", # Entry point, minimal logic
"src/cli.rs", # CLI parsing
"src/config/**", # Configuration loading
"src/scheduler.rs", # Background jobs
"src/bitcoin_price.rs", # External API calls
"src/rpc/**", # RPC server
]
# Timeout for each mutant (seconds)
# Mostro tests involve DB operations, so we need generous timeout
timeout = 600
# Number of parallel jobs
jobs = 4
# Output directory
output_dir = "mutants.out"
# Additional cargo test arguments
test_tool_options = ["--", "--test-threads=1"]Mutation testing runs in CI on every PR and on main branch:
- PR workflow: Runs mutation testing on changed files only (faster feedback)
- Main branch: Runs full mutation testing weekly (baseline tracking)
- Release gate: Mutation score must not decrease from previous release
Initially, mutation testing runs in "report only" mode:
- Results are uploaded as artifacts
- CI does NOT fail on low mutation score
- Team reviews results and improves tests incrementally
Once baseline is established:
- CI fails if mutation score drops below threshold
- New code must maintain or improve mutation score
Critical modules that should have high mutation scores (>80%):
| Module | Priority | Rationale |
|---|---|---|
src/flow.rs |
Critical | Order state transitions, trade logic |
src/db.rs |
Critical | Database operations, state persistence |
src/util.rs |
Critical | Utility functions used across codebase |
src/nip33.rs |
High | Nostr event tagging |
src/lnurl.rs |
High | LNURL handling |
src/messages.rs |
Medium | Message formatting |
src/models.rs |
Medium | Data models |
- Install and configure cargo-mutants
- Create
.mutants.tomlconfiguration - Add mutation testing CI workflow
- Document the strategy
- Run full mutation testing on current codebase
- Document current mutation score per module
- Identify "survivors" (mutants that passed tests)
- Prioritize critical modules
- Improve tests for
src/flow.rs(target: 80%) - Improve tests for
src/db.rs(target: 80%) - Improve tests for
src/util.rs(target: 80%)
- Set minimum mutation score threshold in CI
- Fail builds if score drops
- Track score trends over time
INFO Found 245 mutants to test
INFO 189 mutants killed (77.1%)
INFO 56 mutants survived (22.9%)
INFO 0 mutants timed out
INFO Mutation score: 77.1%
Each surviving mutant represents a potential gap in testing:
INFO src/flow.rs:245:9: replace Order::validate -> bool with true
This mutant replaced the validate method with return true, and tests still passed. This means:
- Either the validation logic is not tested
- Or tests don't verify validation failures
Add tests that would catch the mutation:
#[test]
fn test_order_validation_rejects_invalid() {
let order = Order::new(/* invalid data */);
assert!(!order.validate()); // This would catch the mutant
}# Quick check (mutants only in changed files)
./scripts/mutation-test.sh quick
# Full run (takes ~30-60 min)
./scripts/mutation-test.sh full
# Specific module
./scripts/mutation-test.sh file src/flow.rs
# Show results summary
./scripts/mutation-test.sh report
# Or run cargo-mutants directly:
cargo mutants --file src/flow.rs --output mutants.outMutation testing is computationally expensive:
- Full run: ~30-60 minutes depending on hardware
- Each mutant requires a full test suite run
- Use sharding for parallelization in CI
- Start with critical modules only
If mutants timeout, increase the timeout in .mutants.toml:
timeout = 900 # 15 minutesSome mutations may be equivalent (changing code that doesn't affect behavior). Add to exclude list:
exclude_re = [
"replace .*::default\\(\\) -> Self with", # Default impls often equivalent
]If mutation causes compilation errors (not test failures), cargo-mutants should handle this automatically. If not:
# Exclude files with heavy macros
exclude_globs = [
"src/proto/**",
]-
.mutants.tomlcreated with appropriate configuration - CI workflow added for mutation testing
- Documentation written (
docs/MUTATION_TESTING.md) - Non-blocking in CI (report-only mode initially)
- Baseline mutation score documented (Phase 2)
- Tests improved for critical survivors (Phase 3)