Commit 4975333

fix: Add secure variants to field-sensitivity comparison
1 parent 6370d0c commit 4975333

1 file changed

Lines changed: 46 additions & 6 deletions

src/content/blog/semgrep-vs-codeql-vs-opentaint.mdx

@@ -267,7 +267,33 @@ public MessageContent(String text) {
267267
}
268268
```
269269

270-
All three tools detect this first constructor-based example. The next variant extends the field chain further:
270+
All three tools detect this first constructor-based example. Each version also has a secure variant that reads `secureText` instead of `text` — the `MessageContent` constructor escapes the value with `HtmlUtils.htmlEscape` before storing it:
271+
272+
```java
273+
// Profile.java
274+
public class MessageContent {
275+
public String text;
276+
public String secureText;
277+
278+
public MessageContent(String text) {
279+
this.text = "<h1>Notification: " + text + "</h1>";
280+
this.secureText = "<h1>Notification: " + HtmlUtils.htmlEscape(text) + "</h1>";
281+
}
282+
}
283+
```
284+
285+
```java
286+
// NotificationController.java — secure version
287+
@GetMapping("/notifications/secureTemplate")
288+
@ResponseBody
289+
public String generateSecureTemplate(
290+
@RequestParam(defaultValue = "New Message") String content) {
291+
Profile.MessageTemplate template = new Profile.MessageTemplate(content);
292+
return template.body.content.secureText;
293+
}
294+
```
295+
296+
The next variant extends the field chain further:
271297

272298
```java
273299
// NotificationController.java
@@ -281,13 +307,27 @@ public String generateNotification(
281307
}
282308
```
283309

284-
Here the tools diverge. Semgrep Code and OpenTaint track the deeper field chain. CodeQL does not report the vulnerability — its taint-tracking model does not propagate through field stores and loads on heap objects beyond a limited depth, so the six-deep accessor chain exceeds what its default `java/xss` query tracks.
310+
And its secure counterpart:
311+
312+
```java
313+
// NotificationController.java — secure version
314+
@GetMapping("/notifications/secureGenerate")
315+
@ResponseBody
316+
public String generateSecureNotification(
317+
@RequestParam(defaultValue = "New Message") String content) {
318+
Profile.UserProfile profile = new Profile.UserProfile(content);
319+
320+
return profile.settings.config.template.body.content.secureText;
321+
}
322+
```
323+
324+
Here the tools diverge. Semgrep Code and OpenTaint track the deeper field chain. CodeQL does not report the vulnerability — its taint-tracking model does not propagate through field stores and loads on heap objects beyond a limited depth, so the six-deep accessor chain exceeds what its default `java/xss` query tracks. On the secure variants, Semgrep Code produces false positives — it flags the sanitized `secureText` field as vulnerable, unable to distinguish it from the tainted `text` field on the same object.
285325

286326
Results:
287327

288-
- **Semgrep Code**: Detects both simple and complex versions.
328+
- ⚠️ **Semgrep Code**: Detects both simple and complex vulnerable versions but produces false positives on secure variants.
289329
- ⚠️ **CodeQL**: Handles the simple version but misses the complex one.
290-
- ✅ **OpenTaint**: Correctly handles both versions, including secure variants.
330+
- ✅ **OpenTaint**: Correctly handles both versions, filtering out secure variants.
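
The false positive described above can be illustrated with a toy model. This is only a sketch and is not taken from the post or from any of the three tools: `FieldSensitivityDemo`, `TaintStore`, and the access-path strings are hypothetical names, and real analyzers are far more involved. It assumes a store that tracks taint either per access path (field-sensitive) or per base object (field-insensitive), and shows why the second mode cannot tell `secureText` from `text`.

```java
import java.util.HashSet;
import java.util.Set;

public class FieldSensitivityDemo {

    /** Toy taint store (hypothetical): records which access paths hold tainted data. */
    public static final class TaintStore {
        private final boolean fieldSensitive;
        private final Set<String> tainted = new HashSet<>();

        public TaintStore(boolean fieldSensitive) {
            this.fieldSensitive = fieldSensitive;
        }

        // Field-insensitive mode collapses "content.text" to its base object "content".
        private String key(String accessPath) {
            if (fieldSensitive) {
                return accessPath;
            }
            int dot = accessPath.indexOf('.');
            return dot < 0 ? accessPath : accessPath.substring(0, dot);
        }

        /** Models a field store; only tainted values are recorded. */
        public void store(String accessPath, boolean valueIsTainted) {
            if (valueIsTainted) {
                tainted.add(key(accessPath));
            }
        }

        /** Models a field load; returns whether the loaded value is considered tainted. */
        public boolean load(String accessPath) {
            return tainted.contains(key(accessPath));
        }
    }

    /** Models the MessageContent constructor: a raw store to text, an escaped store to secureText. */
    public static TaintStore analyze(boolean fieldSensitive) {
        TaintStore store = new TaintStore(fieldSensitive);
        store.store("content.text", true);        // user input concatenated directly
        store.store("content.secureText", false); // value passed through an HTML escaper first
        return store;
    }

    public static void main(String[] args) {
        TaintStore precise = analyze(true);
        TaintStore coarse = analyze(false);
        System.out.println("field-sensitive, secureText flagged: " + precise.load("content.secureText"));
        System.out.println("field-insensitive, secureText flagged: " + coarse.load("content.secureText"));
    }
}
```

In this sketch both modes flag the load of `content.text`, but only the field-insensitive mode also flags `content.secureText`, because the tainted store to `text` marks the whole `content` object.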
291331

292332
### Pointer analysis — builder pattern with virtual dispatch
293333

@@ -394,7 +434,7 @@ This comparison has deliberate constraints worth naming.
394434
| 1. **Direct return** | ✅ Pattern<br/>✅ Taint | ✅ Pattern<br/>✅ Taint | ✅ Built-in | ✅ Pattern<br/>✅ Taint |
395435
| 2. **Local variable assignment** | ❌ Pattern<br/>✅ Taint | ❌ Pattern<br/>✅ Taint | ✅ Built-in | ✅ Pattern<br/>✅ Taint |
396436
| 3. **Inter-procedural flow** | ❌ Pattern<br/>⚠️ Taint | ❌ Pattern<br/>✅ Taint | ✅ Built-in | ✅ Pattern<br/>✅ Taint |
397-
| 4. **Field sensitivity — constructor chains** | ❌ Pattern<br/>⚠️ Taint | ❌ Pattern<br/> Taint | ⚠️ Built-in | ✅ Pattern<br/>✅ Taint |
437+
| 4. **Field sensitivity — constructor chains** | ❌ Pattern<br/>⚠️ Taint | ❌ Pattern<br/>⚠️ Taint | ⚠️ Built-in | ✅ Pattern<br/>✅ Taint |
398438
| 5. **Pointer analysis — builder pattern with virtual dispatch** | ❌ Pattern<br/>❌ Taint | ❌ Pattern<br/>❌ Taint | ⚠️ Built-in | ✅ Pattern<br/>✅ Taint |
399439

400440
### Legend
@@ -408,7 +448,7 @@ This comparison has deliberate constraints worth naming.
408448
Each tool plateaus at a different depth of analysis:
409449

410450
- **Semgrep CE** handles syntax matching and local taint tracking but stops at function boundaries.
411-
- **Semgrep Code** extends through inter-procedural analysis and field sensitivity but does not follow builder patterns or virtual dispatch.
451+
- **Semgrep Code** extends through inter-procedural analysis and field sensitivity but produces false positives on secure field variants and does not follow builder patterns or virtual dispatch.
412452
- **CodeQL** covers most cases but its analysis limits surface at deep field chains and virtual calls.
413453
- **OpenTaint** tracks data through all five cases — including builder state, constructor chains, and interface dispatch — using the same pattern rules throughout.
414454
