Commit e845c06

Merge pull request #2161 from HackTricks-wiki/update_Trailmark_turns_code_into_graphs_20260423_132529
Trailmark turns code into graphs
2 parents 6e2ce60 + 05ae3e9 commit e845c06

1 file changed

Lines changed: 66 additions & 0 deletions

src/generic-methodologies-and-resources/fuzzing.md

@@ -141,11 +141,77 @@ python3 infra/helper.py introspector libdwarf --public-corpora

Use the report to decide whether to add a new harness for an untested parser path, expand the corpus for a specific feature, or split a monolithic harness into smaller entry points.

## Graph-First Fuzz Target Selection And Mutation Triage

If you already have **static-analysis findings**, **mutation-testing survivors**, and **coverage reports**, don't triage them as independent lists. Build a **call graph** first, annotate nodes with **cyclomatic complexity**, **entrypoint/untrusted-input reachability**, and any external findings, then ask graph questions:

- Which high-complexity functions are reachable from untrusted input?
- Which mutation survivors sit on paths from parsers/handlers to security-critical code?
- Which functions are architectural choke points with unusually high **blast radius**?

This usually surfaces better fuzz targets than "lowest coverage" alone. A parser/decoder with **high complexity** and confirmed **external reachability** is a stronger harness candidate than an isolated internal helper with weak coverage but no attacker-controlled path.

### Practical triage workflow

1. Build a **code graph** from the codebase and extract per-function complexity/branch metrics.
2. Enumerate **entrypoints** that accept attacker-controlled input: request handlers, decoders, importers, protocol parsers, CLI/file readers.
3. Run **path queries** from those entrypoints to candidate functions to separate reachable attack surface from dead/internal-only code.
4. Prioritize nodes that combine:
   - high **cyclomatic complexity**
   - confirmed **reachability from untrusted input**
   - high **blast radius** or many downstream dependents
   - corroborating evidence such as **SARIF** findings, audit notes, or mutation survivors
5. Write focused harnesses for the best-scoring nodes first, especially **parsers/codecs** such as hex/Base64/IP/message decoders.
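The workflow above can be sketched in plain Python without any particular tool. This is a minimal illustration, not Trailmark's scoring: the call graph, complexity values, finding counts, and the multiplicative score are all hypothetical, and "blast radius" is approximated here as the number of transitive callers of a function.

```python
from collections import deque

# Hypothetical call graph (caller -> callees); every name is illustrative.
CALLS = {
    "handle_request": ["parse_header", "log_event"],
    "parse_header": ["parse_ipv6", "decode_hex"],
    "parse_ipv6": ["decode_hex"],
    "decode_hex": [],
    "log_event": [],
    "legacy_export": ["format_table"],
    "format_table": [],
}
# Hypothetical per-function cyclomatic complexity.
COMPLEXITY = {
    "handle_request": 6, "parse_header": 14, "parse_ipv6": 22,
    "decode_hex": 9, "log_event": 2, "legacy_export": 18, "format_table": 25,
}
# Hypothetical count of corroborating findings (SAST hits, survivors, notes).
FINDINGS = {"parse_ipv6": 2, "format_table": 1}
ENTRYPOINTS = ["handle_request"]


def reachable(graph, starts):
    """Breadth-first search: every node reachable from the start nodes."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(graph):
    """Invert the edges so that each callee maps to its direct callers."""
    rev = {node: [] for node in graph}
    for caller, callees in graph.items():
        for callee in callees:
            rev.setdefault(callee, []).append(caller)
    return rev


def rank_targets(graph, complexity, findings, entrypoints):
    """Rank functions reachable from entrypoints by an assumed combined score."""
    exposed = reachable(graph, entrypoints)
    callers = reverse(graph)
    ranked = []
    for fn in graph:
        if fn not in exposed:
            continue  # no attacker-controlled path: not a fuzz target here
        blast = len(reachable(callers, [fn])) - 1  # transitive callers, minus fn
        score = complexity.get(fn, 1) * (1 + blast) * (1 + findings.get(fn, 0))
        ranked.append((score, fn))
    return [fn for _, fn in sorted(ranked, reverse=True)]


RANKED = rank_targets(CALLS, COMPLEXITY, FINDINGS, ENTRYPOINTS)
print(RANKED)
```

In this toy graph `format_table` has the highest complexity but never appears in the ranking because nothing connects it to `handle_request`, while `parse_ipv6` ranks first because it is complex, exposed, and already has findings.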

### Mutation survivors: equivalent vs actionable

Mutation testing often produces a noisy survivor list. Before treating every survivor as a security gap, use the graph to ask:

- Is the mutated function reachable from an attacker-controlled entrypoint?
- Are all call paths constrained by stronger invariants than the mutated check?
- Does the node sit in dead code, formatting-only logic, or in a high-impact arithmetic/parser path?

Survivors that remain unreachable or structurally constrained are often **equivalent mutants**. Survivors that stay **reachable** and touch **boundary conditions**, **overflow/carry paths**, or **security-critical arithmetic/parsing** should be promoted into:

- new fuzz harnesses
- direct property/invariant tests
- targeted edge-case vectors
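A rough sketch of this triage, assuming survivors have already been attributed to functions and tagged by hand or by a mutation tool. The survivor records, tag names, and bucket labels are hypothetical, and the "constrained by stronger invariants" question is not modeled because it needs manual review.

```python
from collections import deque

# Hypothetical call graph (caller -> callees) and entrypoints.
CALLS = {
    "handle_request": ["parse_length"],
    "parse_length": ["add_with_carry"],
    "add_with_carry": [],
    "debug_dump": ["format_hex"],
    "format_hex": [],
}
ENTRYPOINTS = ["handle_request"]

# Assumed tags marking mutations that touch security-relevant logic.
SENSITIVE_TAGS = {"boundary", "overflow", "parsing"}

# Hypothetical survivor records as a mutation tool might report them.
SURVIVORS = [
    {"id": "m1", "function": "parse_length", "tags": {"boundary"}},
    {"id": "m2", "function": "format_hex", "tags": {"formatting"}},
    {"id": "m3", "function": "add_with_carry", "tags": {"overflow"}},
    {"id": "m4", "function": "handle_request", "tags": {"logging"}},
]


def reachable_from(graph, starts):
    """Breadth-first search over the call graph from the given entrypoints."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def triage(survivors, graph, entrypoints):
    """Split survivors into promote / review / likely-equivalent buckets."""
    exposed = reachable_from(graph, entrypoints)
    buckets = {"promote": [], "review": [], "likely_equivalent": []}
    for survivor in survivors:
        if survivor["function"] not in exposed:
            # Unreachable from untrusted input: deprioritize, do not discard.
            buckets["likely_equivalent"].append(survivor["id"])
        elif survivor["tags"] & SENSITIVE_TAGS:
            # Reachable and touching boundary/overflow/parsing logic.
            buckets["promote"].append(survivor["id"])
        else:
            buckets["review"].append(survivor["id"])
    return buckets


BUCKETS = triage(SURVIVORS, CALLS, ENTRYPOINTS)
print(BUCKETS)
```

Survivors in the `promote` bucket are the ones worth turning into harnesses, property tests, or edge-case vectors; the `likely_equivalent` bucket is a prioritization hint, not proof of equivalence.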

### Correlate external findings onto the graph

If your SAST pipeline exports **SARIF**, project findings onto graph nodes by **file + line range** and use the graph to expand the impact:

- compute the **blast radius** of the flagged function
- check whether the finding is on any path from an entrypoint
- cluster nearby findings that collapse into the same choke point

This is useful when deciding whether to spend fuzzing time on a specific function: a node that is **reachable**, **complex**, and already has **SAST hits** is often a better target than a merely complex node with no attacker path.
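The file + line range projection can be illustrated with the standard SARIF 2.1.0 layout (`runs[].results[].locations[].physicalLocation`). The SARIF fragment and the function table below are made-up sample data, and the sketch assumes URIs in the report already match the paths used by the graph; real reports may need path normalization first.

```python
# Made-up SARIF 2.1.0 fragment; only the fields used below are included.
SARIF = {
    "version": "2.1.0",
    "runs": [{
        "results": [
            {"ruleId": "int-overflow", "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "src/net.c"},
                "region": {"startLine": 42}}}]},
            {"ruleId": "oob-read", "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "src/net.c"},
                "region": {"startLine": 55}}}]},
            {"ruleId": "unchecked-return", "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "src/report.c"},
                "region": {"startLine": 200}}}]},
        ]
    }],
}

# Hypothetical graph nodes with their source ranges (inclusive line numbers).
FUNCTIONS = [
    {"name": "parse_ipv6", "file": "src/net.c", "start": 30, "end": 80},
    {"name": "decode_hex", "file": "src/net.c", "start": 82, "end": 110},
    {"name": "format_table", "file": "src/report.c", "start": 10, "end": 60},
]


def findings_by_function(sarif, functions):
    """Map each SARIF result to the function whose line range contains it."""
    hits = {}
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri")
                line = physical.get("region", {}).get("startLine")
                if uri is None or line is None:
                    continue  # result has no usable source location
                for fn in functions:
                    if fn["file"] == uri and fn["start"] <= line <= fn["end"]:
                        hits.setdefault(fn["name"], []).append(result.get("ruleId"))
    return hits


HITS = findings_by_function(SARIF, FUNCTIONS)
print(HITS)
```

Here both `src/net.c` findings collapse onto `parse_ipv6`, which is the kind of clustering that points at a choke point, while the `src/report.c` finding falls outside every known function range and is left unmapped.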

Example workflow with Trailmark:

```bash
uv pip install trailmark
trailmark analyze --complexity 10 path/to/project
```

```python
from trailmark.query.api import QueryEngine

engine = QueryEngine.from_directory("path/to/project", language="c")
engine.preanalysis()
engine.complexity_hotspots(10)
engine.paths_between("handle_request", "parse_ipv6")
```

The important methodology is the intersection: **complexity × exposure × impact**. Use the graph to pick fuzz targets with the highest expected security value, then use mutation survivors to decide which boundaries and invariants your harness must stress.

## References

- [Mutational grammar fuzzing](https://projectzero.google/2026/03/mutational-grammar-fuzzing.html)
- [Jackalope](https://github.com/googleprojectzero/Jackalope)
- [AFL++ Fuzzing in Depth](https://aflplus.plus/docs/fuzzing_in_depth/)
- [AFLNet Five Years Later: On Coverage-Guided Protocol Fuzzing](https://arxiv.org/abs/2412.20324)
- [Trailmark turns code into graphs](https://blog.trailofbits.com/2026/04/23/trailmark-turns-code-into-graphs/)
- [trailofbits/trailmark](https://github.com/trailofbits/trailmark)

{{#include ../banners/hacktricks-training.md}}