docs(blog): clarify engine-vs-rules framing in XSS comparison
Reword Semgrep CE/Code descriptions for accuracy, note sanitized
variants in each case, move Scope section to the end, and refresh
the spring-analyzer link text and updated date.
src/content/blog/semgrep-vs-codeql-vs-opentaint.mdx
Lines changed: 21 additions & 21 deletions
@@ -2,7 +2,7 @@
2
2
title: "Semgrep vs. CodeQL vs. OpenTaint: XSS Detection Depth Compared"
3
3
description: "We tested Semgrep, CodeQL, and OpenTaint on five progressively harder XSS cases in a Java Spring application — from direct returns to builder patterns with virtual dispatch — to show where each tool's analysis model hits its limit."
4
4
date: "2026-03-24"
5
-
updatedDate: "2026-05-13"
5
+
updatedDate: "2026-06-11"
6
6
keywords:
7
7
- "semgrep vs codeql"
8
8
- "xss detection comparison"
@@ -15,14 +15,14 @@ keywords:
15
15
author: "Seqra Team"
16
16
---
17
17
18
-
Good rules are a big part of what makes a SAST tool accurate, and that isn't going to change. What has changed is how easy rules are to write. Encoding a known vulnerability pattern as a rule used to take real expertise. Now AI can handle most of that work. So rules aren't really where tools differ anymore. The harder problem — the one no amount of rule tuning can fix — is the engine itself: how far it can actually trace a value through the code. If the engine can't follow data through a constructor or a virtual call, even a perfect rule won't catch the bug.
18
+
Good rules are a big part of what makes a SAST tool accurate, and that isn't going to change. What has changed is how easy rules are to write. Encoding a known vulnerability pattern as a rule used to take real expertise. Now AI can handle most of that work — and the easier the rule format is to work with, the better the result. So rules themselves aren't really where tools differ anymore. The harder problem — the one no amount of rule tuning can fix — is the engine itself: how far it can actually trace a value through the code. If the engine can't follow data through a constructor or a virtual call, even a perfect rule won't catch the bug.
19
19
20
20
To see where that limit falls, we tested three tools — Semgrep, CodeQL, and OpenTaint — on five XSS examples in a Java Spring application. Every example is the same basic bug: a controller reads a request parameter and writes it straight into the HTML it returns. What changes is how the value gets from input to output. The first case returns it directly. After that it passes through a local variable, then a helper method, then a constructor chain, and finally a builder that uses virtual dispatch. Each step adds more code between the user input and where it's used, and makes the bug a little harder to trace.
21
21
22
-
Each case measures two outcomes: false negatives (vulnerabilities the tool fails to detect) and false positives (secure code paths the tool incorrectly flags). The three tools under test:
22
+
Each case measures two outcomes: false negatives (vulnerabilities the tool fails to detect) and false positives (secure code paths the tool incorrectly flags). Every case after the first pairs the vulnerable endpoint with a sanitized variant the tool should leave alone. The three tools under test:
23
23
24
-
- **Semgrep** matches patterns syntactically, with taint-analysis support and broader inter-procedural coverage in Semgrep Code, its paid commercial edition. Results below distinguish Semgrep CE and Semgrep Code where they diverge.
25
-
- **CodeQL** runs semantic analysis through a dedicated query language. We use its default `java/xss` rule. Free for open-source repositories, requires GitHub Advanced Security for private repos.
24
+
- **Semgrep** matches patterns syntactically and offers a taint mode for local dataflow. Its paid commercial edition, Semgrep Code, adds broader inter-procedural coverage. Results below distinguish Semgrep CE and Semgrep Code where they diverge.
25
+
- **CodeQL** runs semantic analysis through a dedicated query language. We use its default `java/xss` rule. Free for open-source repositories. Private repos require GitHub Advanced Security.
26
26
- **OpenTaint** interprets Semgrep-style patterns as dataflow queries — metavariables are tracked as program values, not syntactic placeholders. Runs whole-program analysis against a build artifact, which is what enables the deeper tracking shown in the later cases. Java and Kotlin today, Apache 2.0 / MIT licensed.
27
27
28
28
## Five test cases
@@ -41,7 +41,7 @@ These are ordinary patterns — a variable, a helper method, a constructor, a bu
41
41
42
42
### Syntax matching — direct return
43
43
44
-
Here a profile page takes a greeting from the URL and writes it back into an HTML response. This is the simplest case: one endpoint, one parameter, no helpers. The controller below implements it.
44
+
Here a profile page takes a message from the URL and writes it back into an HTML response. This is the simplest case: one endpoint, one parameter, no helpers. The controller below implements it.
45
45
46
46
```java
47
47
// ProfileController.java
@@ -67,9 +67,9 @@ patterns:
67
67
}
68
68
```
69
69
70
-
All three tools detect this case. No surprise — this is the simplest form of XSS.
- ✅ **Semgrep**, ✅ **CodeQL**, ✅ **OpenTaint**: All three detect the vulnerability — no surprise for the simplest form of XSS.
73
73
74
74
### Local dataflow — variable assignment
75
75
@@ -147,7 +147,7 @@ public String displaySecureUserStatus(
147
147
}
148
148
```
149
149
150
-
The basic rules above still flag this as vulnerable because they don't recognize the sanitization function. To handle it, enhance the pattern rule with a negative pattern:
150
+
The basic rules above still flag this secure version as vulnerable because they don't recognize the sanitization function. For the pattern rule, the fix is a negative pattern. This only matters for OpenTaint — Semgrep's pattern rule already misses this case entirely, so there is nothing for a negative pattern to suppress:
151
151
152
152
```yaml
153
153
# pattern.xss — with sanitization
@@ -182,7 +182,7 @@ Results:
182
182
- ✅ **Semgrep (taint)**: Detects the vulnerability and can recognize sanitization.
183
183
- ✅ **CodeQL** and ✅ **OpenTaint (pattern and taint)**: Correctly handle both vulnerable and secure code.
184
184
185
-
From this point forward, Semgrep's taint rules are used — pattern rules are insufficient. OpenTaint's pattern rule from this case is reused unchanged for all remaining examples; results are shown for both rule types.
185
+
From this point forward, Semgrep's taint rules are used — pattern rules are insufficient. OpenTaint's pattern rule from this case is reused unchanged for all remaining examples. Results are shown for both rule types.
186
186
187
187
### Inter-procedural analysis — function call boundary
This is where the tools separate. Semgrep CE does not model what happens inside the callee — it can be configured to ignore callees, which avoids false positives on the secure version but introduces false negatives on the vulnerable one. Semgrep Code inspects the callee's body and handles both correctly.
224
+
This is where the tools separate. Semgrep CE does not model what happens inside the callee. By default it assumes a call on tainted arguments returns tainted data, which catches the vulnerable version but also flags the secure one. It can instead be configured to trust callees, which clears the false positive but misses the real bug. Either way, it gets one of the two versions wrong. Semgrep Code inspects the callee's body and handles both correctly.
225
225
226
226
Results:
227
227
228
228
- ⚠️ **Semgrep CE**: Can either produce false positives or false negatives — cannot see inside the callee.
229
229
- ✅ **Semgrep Code**: Correctly handles both vulnerable and secure code.
230
230
- ✅ **CodeQL** and ✅ **OpenTaint**: Correctly handle both vulnerable and secure code.
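
A minimal sketch of the call-boundary problem described above, written in plain Java without Spring so it runs on its own. All names here (`CalleeBoundaryDemo`, `wrapRaw`, `wrapEscaped`, `escapeHtml`) are hypothetical and not taken from the demo project, and `escapeHtml` is a hand-written stand-in for a real sanitizer such as `HtmlUtils.htmlEscape`. The two endpoints look the same from the caller's side; the only difference sits inside the helper.

```java
// Hypothetical sketch: names are illustrative, not taken from the demo project.
final class CalleeBoundaryDemo {

    // Minimal HTML escaper standing in for a real sanitizer.
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Helper that passes the value through unchanged: taint survives the call.
    static String wrapRaw(String message) {
        return "<p>" + message + "</p>";
    }

    // Helper that escapes inside its body: the result is safe, but only an
    // analyzer that looks inside the callee can tell.
    static String wrapEscaped(String message) {
        return "<p>" + escapeHtml(message) + "</p>";
    }

    // Stand-ins for the two controller endpoints; `param` plays the request parameter.
    static String vulnerableEndpoint(String param) {
        return wrapRaw(param);
    }

    static String secureEndpoint(String param) {
        return wrapEscaped(param);
    }

    public static void main(String[] args) {
        String payload = "<script>alert(1)</script>";
        System.out.println(vulnerableEndpoint(payload));
        System.out.println(secureEndpoint(payload));
    }
}
```

An engine that treats every call on tainted input as returning tainted data flags both endpoints. One that trusts callees flags neither. Telling them apart requires reading `wrapRaw` and `wrapEscaped` themselves.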
231
231
232
-
From this point, Semgrep Code is used for remaining examples since inter-procedural analysis is essential.
232
+
From this point, the prose follows Semgrep Code, since inter-procedural analysis is essential. Semgrep CE was still run on the remaining cases — its results appear in the summary table, where it matches Semgrep Code from here on.
233
233
234
234
### Field sensitivity — constructor chains
235
235
@@ -262,7 +262,7 @@ public MessageContent(String text) {
262
262
}
263
263
```
264
264
265
-
All three tools detect this first constructor-based example. Each version also has a secure variant that reads `secureText` instead of `text` — the `MessageContent` constructor escapes the value with `HtmlUtils.htmlEscape` before storing it:
265
+
All three tools detect this first constructor-based example. This case has two versions — the three-deep chain above and a six-deep one below — and each has a secure variant that reads `secureText` instead of `text`. The `MessageContent` constructor escapes the value with `HtmlUtils.htmlEscape` before storing it:
266
266
267
267
```java
268
268
// Profile.java
@@ -326,7 +326,7 @@ Results:
326
326
327
327
### Pointer analysis — builder pattern with virtual dispatch
328
328
329
-
The final case uses a builder pattern. Method chaining returns the same instance, and a field assigned in one call is read in the next — the analyzer must carry the field across the chained call to keep the value reachable at the sink.
329
+
The final case uses a builder pattern. Method chaining returns the same instance, and a field assigned in one call is read in the next — the analyzer must carry the field across the chained call to keep the value reachable at the sink. The pointer analysis named in the table comes into play in the later variants, where what a reference actually points at decides the verdict.
330
330
331
331
```java
332
332
// MessageController.java
@@ -412,10 +412,6 @@ Results:
412
412
- ⚠️ **CodeQL**: Handles the simple builder but misses the interface-based version.
413
413
- ✅ **OpenTaint**: Detects both patterns, resolves virtual dispatch, and correctly filters the secure `EscapeFormatter` variant.
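
The shape of the interface-based builder can be sketched in plain Java as follows. This is an illustration under assumptions, not the code from the demo project: `BuilderDispatchDemo`, `Formatter`, `RawFormatter`, `MessageBuilder`, and `render` are hypothetical names, and only `EscapeFormatter` echoes a name mentioned in the results above. The point it shows is that `build()` calls `formatter.format(text)` through an interface, so whether the output is safe depends on which object the `formatter` field points at when the chain finishes.

```java
// Hypothetical sketch: names other than EscapeFormatter are illustrative.
final class BuilderDispatchDemo {

    interface Formatter {
        String format(String text);
    }

    // Returns the input unchanged, so tainted input stays tainted.
    static final class RawFormatter implements Formatter {
        public String format(String text) {
            return text;
        }
    }

    // Escapes HTML metacharacters; '&' is replaced first to avoid double escaping.
    static final class EscapeFormatter implements Formatter {
        public String format(String text) {
            return text.replace("&", "&amp;")
                       .replace("<", "&lt;")
                       .replace(">", "&gt;")
                       .replace("\"", "&quot;")
                       .replace("'", "&#39;");
        }
    }

    // Each setter stores a field and returns the same instance for chaining.
    static final class MessageBuilder {
        private String text;
        private Formatter formatter = new RawFormatter();

        MessageBuilder text(String text) {
            this.text = text;
            return this;
        }

        MessageBuilder formatter(Formatter formatter) {
            this.formatter = formatter;
            return this;
        }

        // The virtual call: the verdict depends on the runtime type of `formatter`.
        String build() {
            return "<div>" + formatter.format(text) + "</div>";
        }
    }

    // Stand-in for a controller method; `param` plays the request parameter.
    static String render(String param, Formatter formatter) {
        return new MessageBuilder().text(param).formatter(formatter).build();
    }

    public static void main(String[] args) {
        String payload = "<b>x</b>";
        System.out.println(render(payload, new RawFormatter()));
        System.out.println(render(payload, new EscapeFormatter()));
    }
}
```

To get both variants right, an analyzer has to keep the `text` field alive across the chained calls and also resolve which `Formatter` implementation the call in `build()` dispatches to.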
414
414
415
-
## Scope
416
-
417
-
This is a narrow test: five cases in one Spring Boot application. It shows how deeply each tool can follow data flow, but it says nothing about how they handle other languages or how they perform on a large codebase. A tool that catches all five cases here could still miss things in a different framework.
418
-
419
415
## Results summary
420
416
421
417
| Test Case | Semgrep CE | Semgrep Code | CodeQL | OpenTaint |
@@ -437,14 +433,18 @@ This is a narrow test: five cases in one Spring Boot application. It shows how d
437
433
Each tool plateaus at a different depth of analysis:
438
434
439
435
- **Semgrep CE** handles syntax matching and local taint tracking but stops at function boundaries.
440
-
- **Semgrep Code** extends through inter-procedural analysis and field sensitivity but produces false positives on secure field variants and does not follow builder patterns or virtual dispatch.
441
-
- **CodeQL** covers most cases but its analysis limits surface at deep field chains and virtual calls.
436
+
- **Semgrep Code** extends through inter-procedural analysis and deep field chains, but it cannot tell a sanitized field from a tainted one on the same object and does not follow builder patterns or virtual dispatch.
437
+
- **CodeQL** covers most cases but hits its limits at deep field chains and virtual calls.
442
438
- **OpenTaint** tracks data through all five cases — including builder state, constructor chains, and interface dispatch — using the same pattern rules throughout.
443
439
444
440
What separates the tools here isn't rule syntax — they all express roughly the same source-to-sink intent. It's how far each engine carries a tracked value on its own. In OpenTaint the same pattern rule that catches a value returned directly also catches one routed through a builder. The assignments, inter-procedural calls, field state, and virtual dispatch in between are resolved by the engine, not spelled out in the rule.
445
441
446
442
Real codebases are full of these patterns. As code grows it adds helpers, builders, persistence layers, and interface calls, and each one is another place a scanner can lose the value it is tracking. The more layers there are, the more a tool misses. This is why, over time, the engine matters more than the rules. A rule that says *what* to look for and leaves the *how* of tracking to the engine is the one that keeps working as the code gets more complex.
447
443
448
-
All five cases are runnable end-to-end in the [java-spring-demo project](https://github.com/seqra/java-spring-demo). For a deeper look at what Spring-specific data flows OpenTaint can model — dependency injection, JPA persistence, and cross-endpoint tracking — see [Taint Analysis for Spring: Data Flow Beyond the Call Graph](/blog/spring-analyzer).
444
+
## Scope
445
+
446
+
The test is narrow by design. A single common vulnerability and deliberately simple rules take rule quality out of the equation — the only variable left is how far each engine can carry a value. That is also all the results measure. They say nothing about language coverage or performance on a large codebase.
447
+
448
+
All five cases are runnable end-to-end in the [java-spring-demo project](https://github.com/seqra/java-spring-demo). For a deeper look at what Spring-specific data flows OpenTaint can model — dependency injection, JPA persistence, and cross-endpoint tracking — see [Taint Analysis for Spring: Security Beyond Syntax](/blog/spring-analyzer).
449
449
450
450
To try OpenTaint on your own project, see the [quick start guide](https://github.com/seqra/opentaint#quick-start).