src/blockchain/smart-contract-security/mutation-testing-with-slither.md
+71 -17 (71 additions & 17 deletions)
@@ -1,8 +1,8 @@
-# Mutation Testing for Solidity with Slither (slither-mutate)
+# Mutation Testing for Smart Contracts (slither-mutate, mewt, MuTON)

 {{#include ../../banners/hacktricks-training.md}}

-Mutation testing "tests your tests" by systematically introducing small changes (mutants) into your Solidity code and re-running your test suite. If a test fails, the mutant is killed. If the tests still pass, the mutant survives, revealing a blind spot in your test suite that line/branch coverage cannot detect.
+Mutation testing "tests your tests" by systematically introducing small changes (mutants) into contract code and re-running the test suite. If a test fails, the mutant is killed. If the tests still pass, the mutant survives, revealing a blind spot that line/branch coverage cannot detect.

 Key idea: Coverage shows code was executed; mutation testing shows whether behavior is actually asserted.

@@ -22,20 +22,37 @@ function verifyMinimumDeposit(uint256 deposit) public returns (bool) {

 Unit tests that only check a value below and a value above the threshold can reach 100% line/branch coverage while failing to assert the equality boundary (==). A refactor to `deposit >= 2 ether` would still pass such tests, silently breaking protocol logic.

-Mutation testing exposes this gap by mutating the condition and verifying your tests fail.
+Mutation testing exposes this gap by mutating the condition and verifying tests fail.
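
The boundary gap described above can be reproduced outside Solidity. The Python sketch below is illustrative only: the body of `verifyMinimumDeposit` is not shown in this hunk, so the 1 ether threshold, the `>=` to `>` mutation, and all function names are assumptions. A suite that checks only one value below and one value above the threshold lets the mutant survive; adding the equality case kills it.

```python
ETHER = 10**18
THRESHOLD = 1 * ETHER  # assumed value; the real threshold is not shown in this diff


def verify_minimum_deposit(deposit: int) -> bool:
    """Assumed original condition: accept deposits at or above the threshold."""
    return deposit >= THRESHOLD


def verify_minimum_deposit_mutant(deposit: int) -> bool:
    """Mutant: `>=` replaced with `>`, so the equality boundary is rejected."""
    return deposit > THRESHOLD


def weak_suite(fn) -> bool:
    """Checks one value below and one value above the threshold only."""
    return fn(THRESHOLD // 2) is False and fn(2 * THRESHOLD) is True


def strong_suite(fn) -> bool:
    """Adds the equality boundary to the weak suite."""
    return weak_suite(fn) and fn(THRESHOLD) is True


def mutant_survives(suite, mutant) -> bool:
    """A mutant survives when the suite still passes against it."""
    return suite(mutant)
```

Both suites pass against the original function and reach every branch, but only `strong_suite` distinguishes the mutant from the original.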

-## Common Solidity mutation operators
+For smart contracts, surviving mutants frequently map to missing checks around:
+- Authorization and role boundaries
+- Accounting/value-transfer invariants
+- Revert conditions and failure paths
+- Boundary conditions (`==`, zero values, empty arrays, max/min values)

-Slither’s mutation engine applies many small, semantics-changing edits, such as:
-- Operator replacement: `+` ↔ `-`, `*` ↔ `/`, etc.
 Mutation campaigns can take hours or days. Tips to reduce cost:
 - Scope: Start with critical contracts/directories only, then expand.
-- Prioritize mutators: If a high-priority mutant on a line survives (e.g., entire line commented), you can skip lower-priority variants for that line.
+- Prioritize mutators: If a high-priority mutant on a line survives (for example `revert()` or comment-out), skip lower-priority variants for that line.
+- Use two-phase campaigns: run focused/fast tests first, then re-test only uncaught mutants with the full suite.
+- Map mutation targets to specific test commands when possible (for example auth code -> auth tests).
+- Restrict campaigns to high/medium severity mutants when time is tight.
 - Parallelize tests if your runner allows it; cache dependencies/builds.
 - Fail-fast: stop early when a change clearly demonstrates an assertion gap.

+The runtime math is brutal: `1000 mutants x 5-minute tests ~= 83 hours`, so campaign design matters as much as the mutator itself.
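
The arithmetic behind that estimate, and the effect of a two-phase campaign, can be sketched as a small cost model. This is a rough sketch under stated assumptions: it assumes strictly sequential runs, and the 30-second fast suite and 20% phase-one survival rate used in the usage note are hypothetical numbers, not measurements.

```python
def campaign_hours(mutants: int, minutes_per_run: float) -> float:
    """Sequential cost of running one test command per mutant, in hours."""
    return mutants * minutes_per_run / 60


def two_phase_hours(
    mutants: int,
    fast_minutes: float,
    full_minutes: float,
    fast_survival_rate: float,
) -> float:
    """Cost of a two-phase campaign.

    Phase 1 runs the fast suite against every mutant. Phase 2 re-runs the
    full suite only against mutants the fast suite did not catch.
    """
    survivors = round(mutants * fast_survival_rate)
    return campaign_hours(mutants, fast_minutes) + campaign_hours(survivors, full_minutes)
```

With these assumptions, `campaign_hours(1000, 5)` gives roughly 83.3 hours, while `two_phase_hours(1000, 0.5, 5, 0.2)` gives about 25 hours. Parallel runners would divide both figures further.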
+
+## Persistent campaigns and triage at scale
+
+One weakness of older workflows is dumping results only to `stdout`. For long campaigns, this makes pause/resume, filtering, and review harder.
+
+`mewt`/`MuTON` improve this by storing mutants and outcomes in SQLite-backed campaigns. Benefits:
+- Pause and resume long runs without losing progress
+- Filter only uncaught mutants in a specific file or mutation class
+- Export/translate results to SARIF for review tooling
+- Give AI-assisted triage smaller, filtered result sets instead of raw terminal logs
+
+Persistent results are especially useful when mutation testing becomes part of an audit pipeline instead of a one-off manual review.
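
To make the idea of a persistent campaign concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table layout, column names, status values, and helper functions are all hypothetical and chosen for illustration; they are not the actual schema or API of `mewt` or `MuTON`.

```python
import sqlite3


def open_campaign(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a campaign database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS mutants (
               id INTEGER PRIMARY KEY,
               file TEXT NOT NULL,
               line INTEGER NOT NULL,
               mutation_class TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    return conn


def add_mutant(conn: sqlite3.Connection, file: str, line: int, mutation_class: str) -> int:
    """Register a mutant; it stays 'pending' until a test run records an outcome."""
    cur = conn.execute(
        "INSERT INTO mutants (file, line, mutation_class) VALUES (?, ?, ?)",
        (file, line, mutation_class),
    )
    conn.commit()
    return cur.lastrowid


def record_outcome(conn: sqlite3.Connection, mutant_id: int, status: str) -> None:
    """Persist 'caught' or 'uncaught' so an interrupted run loses no progress."""
    conn.execute("UPDATE mutants SET status = ? WHERE id = ?", (status, mutant_id))
    conn.commit()


def pending(conn: sqlite3.Connection) -> list:
    """Mutants still to be tested; resuming a campaign means iterating these."""
    rows = conn.execute("SELECT id FROM mutants WHERE status = 'pending' ORDER BY id")
    return [row[0] for row in rows]


def uncaught(conn: sqlite3.Connection, file: str, mutation_class=None) -> list:
    """Uncaught mutants in one file, optionally narrowed to one mutation class."""
    sql = "SELECT id, line, mutation_class FROM mutants WHERE status = 'uncaught' AND file = ?"
    params = [file]
    if mutation_class is not None:
        sql += " AND mutation_class = ?"
        params.append(mutation_class)
    return conn.execute(sql + " ORDER BY line", params).fetchall()
```

Because every outcome is committed as soon as it is known, a resumed run only needs to process `pending(conn)`, and triage can start from a narrow query such as `uncaught(conn, "Vault.sol", "operator")` (the file name here is made up).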
+
 ## Triage workflow for surviving mutants

 1) Inspect the mutated line and behavior.
@@ -93,7 +127,10 @@ Mutation campaigns can take hours or days. Tips to reduce cost:
 4) Add invariants for fuzz tests.
 - E.g., conservation of value, non-negative balances, authorization invariants, monotonic supply where applicable.

-5) Re-run slither-mutate until survivors are killed or explicitly justified.
+5) Separate true positives from semantic no-ops.
+- Example: `x > 0` -> `x != 0` is meaningless when `x` is unsigned.
+
+6) Re-run the campaign until survivors are killed or explicitly justified.
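
The semantic no-op mentioned in step 5 can be checked exhaustively on a small domain. The sketch below uses an 8-bit range as a stand-in for `uint256` (an assumption made only to keep the loop small; the argument is the same for any unsigned width) and contrasts it with a signed range, where the same mutation does change behavior.

```python
def original(x: int) -> bool:
    """Original condition."""
    return x > 0


def mutant(x: int) -> bool:
    """Mutated condition: `>` replaced with `!=`."""
    return x != 0


def equivalent_on(domain) -> bool:
    """True when the original and the mutant agree on every value in the domain."""
    return all(original(x) == mutant(x) for x in domain)


UINT8 = range(0, 2**8)        # unsigned: no negative values exist
INT8 = range(-(2**7), 2**7)   # signed: negative values separate the two conditions
```

`equivalent_on(UINT8)` is true, so no test can kill this mutant on an unsigned variable and it can be justified rather than chased. `equivalent_on(INT8)` is false, so the same survivor on a signed variable would be a real gap.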

 ## Case study: revealing missing state assertions (Arkis protocol)

@@ -107,21 +144,38 @@ Commenting out the assignment didn’t break the tests, proving missing post-sta

 Guidance: Treat survivors that affect value transfers, accounting, or access control as high-risk until killed.

+## Do not blindly generate tests to kill every mutant
+
+Mutation-driven test generation can backfire if the current implementation is wrong. Example: mutating `priority >= 2` to `priority > 2` changes behavior, but the right fix is not always "write a test for `priority == 2`". That behavior may itself be the bug.
+
+Safer workflow:
+- Use surviving mutants to identify ambiguous requirements
+- Validate expected behavior from specs, protocol docs, or reviewers
+- Only then encode the behavior as a test/invariant
+
+Otherwise, you risk hard-coding implementation accidents into the test suite and gaining false confidence.
- Persist results when the tooling supports it, and filter uncaught mutants before triage.
+- Use two-phase or per-target campaigns to keep runtime manageable.
 - Iterate until all mutants are killed or justified with comments and rationale.

 ## References

+- [Mutation testing for the agentic era](https://blog.trailofbits.com/2026/04/01/mutation-testing-for-the-agentic-era/)
 - [Use mutation testing to find the bugs your tests don't catch (Trail of Bits)](https://blog.trailofbits.com/2025/09/18/use-mutation-testing-to-find-the-bugs-your-tests-dont-catch/)
 - [Arkis DeFi Prime Brokerage Security Review (Appendix C)](https://github.com/trailofbits/publications/blob/master/reviews/2024-12-arkis-defi-prime-brokerage-securityreview.pdf)