Commit deab819

docs(blog): tighten determinism and rules-vs-engine wording
- AppSec Agent: rephrase the determinism payoff line, confining non-determinism to discovery instead of asserting both sides
- XSS comparison: resolve the rule-format-vs-syntax contradiction and replace source-to-sink jargon with plainer wording
1 parent 27e87a5 commit deab819

2 files changed

Lines changed: 3 additions & 3 deletions

src/content/blog/appsec-agent.mdx

Lines changed: 1 addition & 1 deletion
@@ -138,7 +138,7 @@ When the agent can name *why* a finding is false — a neutralizing check it had

## The payoff

141-
You spend the model's discovery power exactly once per pattern, distill the result into an artifact, and let the engine amortize it across the entire codebase and every future commit. Re-runs are deterministic — the same scan over the same code produces the same findings, bit for bit — and they cost CPU, not tokens. The discovery is non-deterministic. Everything downstream of it is deterministic.
141+
You spend the model's discovery power exactly once per pattern, distill the result into an artifact, and let the engine amortize it across the entire codebase and every future commit. Re-runs are deterministic — the same scan over the same code produces the same findings — and they cost CPU, not tokens. The non-determinism is confined to discovery, and everything downstream is deterministic.

Here is what that looks like in practice, on [Komga](https://github.com/gotson/komga) — an open-source media server, about 137,000 lines of Kotlin — starting cold, with the agent at Deep level:

src/content/blog/semgrep-vs-codeql-vs-opentaint.mdx

Lines changed: 2 additions & 2 deletions
@@ -15,7 +15,7 @@ keywords:
author: "Seqra Team"
---

18-
Good rules are a big part of what makes a SAST tool accurate, and that isn't going to change. What has changed is how easy rules are to write. Encoding a known vulnerability pattern as a rule used to take real expertise. Now AI can handle most of that work, and the easier the rule format is to work with, the better the result. So rules themselves aren't really where tools differ anymore. The harder problem — the one no amount of rule tuning can fix — is the engine itself: how far it can actually trace a value through the code. If the engine can't follow data through a constructor or a virtual call, even a perfect rule won't catch the bug.
18+
Good rules are a big part of what makes a SAST tool accurate, and that isn't going to change. What has changed is how easy rules are to write. Encoding a known vulnerability pattern as a rule used to take real expertise. Now AI can handle most of that work, and a friendlier rule format lets it handle more of the rest. So rules themselves aren't really where tools differ anymore. The harder problem — the one no amount of rule tuning can fix — is the engine itself: how far it can actually trace a value through the code. If the engine can't follow data through a constructor or a virtual call, even a perfect rule won't catch the bug.

To see where that limit falls, we tested three tools — Semgrep, CodeQL, and OpenTaint — on five XSS examples in a Java Spring application. Every example is the same basic bug: a controller reads a request parameter and writes it straight into the HTML it returns. What changes is how the value gets from input to output. The first case returns it directly. After that it passes through a local variable, then a helper method, then a constructor chain, and finally a builder that uses virtual dispatch. Each step adds more code between the user input and where it's used, and makes the bug a little harder to trace.

@@ -437,7 +437,7 @@ Each tool plateaus at a different depth of analysis:
- **CodeQL** covers most cases but hits its limits at deep field chains and virtual calls.
- **OpenTaint** tracks data through all five cases — including builder state, constructor chains, and interface dispatch — using the same pattern rules throughout.

440-
What separates the tools here isn't rule syntax; they all express roughly the same source-to-sink intent. It's how far each engine carries a tracked value on its own. In OpenTaint the same pattern rule that catches a value returned directly also catches one routed through a builder. The assignments, inter-procedural calls, field state, and virtual dispatch in between are resolved by the engine, not spelled out in the rule.
440+
What separates the tools here isn't the rules. All three express roughly the same intent: untrusted input reaching a dangerous use. It's how far each engine carries a tracked value on its own. In OpenTaint the same pattern rule that catches a value returned directly also catches one routed through a builder. The assignments, inter-procedural calls, field state, and virtual dispatch in between are resolved by the engine, not spelled out in the rule.

Real codebases are full of these patterns. As code grows it adds helpers, builders, persistence layers, and interface calls, and each one is another place a scanner can lose the value it is tracking. The more layers there are, the more a tool misses. This is why, over time, the engine matters more than the rules. A rule that says *what* to look for and leaves the *how* of tracking to the engine is the one that keeps working as the code gets more complex.
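
To make the five shapes from the comparison concrete, here is a minimal sketch in plain Java. It is not the test application from the post: the class and method names (`XssShapes`, `viaBuilder`, and so on) are hypothetical, and Spring's controller annotations are left out so the file runs on its own. Each method takes an untrusted `name` and returns HTML with that value embedded unescaped, so all five are the same bug with progressively more code between input and output.

```java
// Hypothetical illustration only; names are not taken from the post's test app.
final class XssShapes {

    // Case 1: the untrusted value is returned directly.
    static String direct(String name) {
        return "<p>" + name + "</p>";
    }

    // Case 2: the value passes through a local variable first.
    static String viaLocal(String name) {
        String copy = name;
        return "<p>" + copy + "</p>";
    }

    // Case 3: the value passes through a helper method.
    static String wrap(String text) {
        return "<p>" + text + "</p>";
    }

    static String viaHelper(String name) {
        return wrap(name);
    }

    // Case 4: the value is stored in fields through a constructor chain.
    static final class Message {
        final String text;
        Message(String text) { this.text = text; }
    }

    static final class Page {
        final Message message;
        Page(Message message) { this.message = message; }
        String render() { return "<p>" + message.text + "</p>"; }
    }

    static String viaConstructors(String name) {
        return new Page(new Message(name)).render();
    }

    // Case 5: the value is held as builder state and reached via an interface call.
    interface HtmlBuilder {
        HtmlBuilder body(String text);
        String build();
    }

    static final class ParagraphBuilder implements HtmlBuilder {
        private String body = "";
        public HtmlBuilder body(String text) { this.body = text; return this; }
        public String build() { return "<p>" + body + "</p>"; }
    }

    static String viaBuilder(String name) {
        HtmlBuilder builder = new ParagraphBuilder(); // call target resolved at run time
        return builder.body(name).build();
    }

    public static void main(String[] args) {
        String payload = "<script>alert(1)</script>";
        // Every variant emits the payload unescaped.
        System.out.println(direct(payload));
        System.out.println(viaLocal(payload));
        System.out.println(viaHelper(payload));
        System.out.println(viaConstructors(payload));
        System.out.println(viaBuilder(payload));
    }
}
```

A rule only has to say that `name` is untrusted and that the returned HTML is a dangerous use. Whether a scanner reports the last two methods depends on whether its engine follows the value through fields and interface calls.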
