docs(blog): align spring-analyzer with content style guide
Restructures the post per marketing/content-style-guide.md:
- Cold open uses one gathering sentence + through-line closer; convert
numbered preview to prose
- Add bridge between Single-Request subsections so the next H3 starts
mid-argument
- Replace "messaging feature shows this pattern" with the worked
scenario paragraph anchoring the Cross-Endpoint subsections
- Rename "Configuration-Aware Sinks" -> "Conditional Sinks" and
reframe body around the rule's condition rather than configuration
- Replace recap conclusion with the synthesis closing
- Vary repeated "Consider a..." opener
- Collapse three closing CTAs to a single demo-repo link
-Spring Boot's annotation-driven architecture creates data flows that are invisible to conventional static analysis. An `@Autowired` injection crosses class boundaries with no call site in the source. JPA persistence links two HTTP endpoints through the database with no shared code path. A template engine's configuration determines whether user input reaching `template.process()` is exploitable — or harmless.
+Spring Boot's annotation-driven architecture creates data flows that are invisible to conventional static analysis. The mechanism is everywhere: an `@Autowired` injection crosses class boundaries with no call site the parser can see; JPA persistence links two HTTP endpoints through a database row with no shared code path; a template engine's configuration decides — at runtime, from a flag set in some other file — whether the call to `template.process()` is exploitable or harmless. Three different invisibilities, three different framework features, the same blind spot.

-These are not edge cases. They are the default architecture of most Java web applications. And they create three progressively harder challenges for static analysis:
-
-1. **Following data through dependency injection** — where `@Autowired` wiring replaces explicit call sites and data crosses file, class, and service boundaries.
-2. **Connecting endpoints through persistence** — where the database or service state is the only link between the request that stores a payload and the request that renders it.
-3. **Distinguishing dangerous fields from safe ones** — at per-column granularity in persisted entities, and per-configuration granularity in template engines.
-
-Each level requires a deeper understanding of Spring's framework semantics. OpenTaint models all three.
+These are not edge cases. They are the default architecture of most Java web applications. The post walks three progressively harder challenges — following data through dependency injection, connecting endpoints through persistence, and distinguishing dangerous fields from safe ones at per-column granularity — and shows what each demands of the engine. Generic SAST plateaus at the first; OpenTaint models all three.
## Single-Request Flows
-Before tackling cross-endpoint flows, we start with what happens within a single HTTP request — following data through DI boundaries and recognizing when a sink is actually dangerous based on its configuration.
+Before tackling cross-endpoint flows, we start with what happens within a single HTTP request — following data through DI boundaries and recognizing when a sink is actually dangerous based on the receiver state the rule's condition names.
For JVM languages, OpenTaint operates on bytecode rather than source text. This requires a successful build before scanning, but gives precise resolution of inheritance, generics, and library calls. That precision matters in Spring — runtime behavior depends on bean wiring, annotation metadata, and framework conventions that AST-only tools treat as opaque.
@@ -78,11 +72,13 @@ public class TemplateRenderingService {
OpenTaint traces the complete path: `@RequestBody RenderRequest` → `renderFromRequest()` → `request.getTemplateContent()` → `renderFromContent()` → `templateEngine.process()`. The data crosses a class boundary, passes through DTO field access, and flows through an `@Autowired` service — all tracked as a single inter-procedural data flow.
-### Configuration-Aware Sinks
+Resolving the chain across DI boundaries is necessary but not sufficient. The next question is whether the call at the end of the chain is genuinely dangerous — and that depends on receiver state the chain itself doesn't carry, expressed as a condition the rule must encode.
+
+### Conditional Sinks

-Not every call to a template engine is equally dangerous. Whether user input reaching `template.process()` constitutes an SSTI depends on how the template engine is configured. An analyzer that doesn't distinguish between hardened and default configurations either flags both (noise) or misses both.
+Not every call to a template engine is equally dangerous. The rule for `template.process()` carries a condition: the call is a sink only when the receiver permits class loading. An analyzer with no way to express that condition either flags every call (noise) or none (missed RCE).

-Consider a controller that passes user-controlled template content to two different Freemarker services:
+The same controller exposes two parallel endpoints, each routing user-controlled template content to a different Freemarker service:
-OpenTaint resolves `@Autowired` bean constructors and tracks configuration state. It flags the marketing service — `UNRESTRICTED_RESOLVER` allows class loading, enabling remote code execution — and suppresses the notification service, where `ALLOWS_NOTHING_RESOLVER` prevents class instantiation.
+OpenTaint resolves `@Autowired` bean constructors and tracks the receiver state the rule's condition names. It flags the marketing service — `UNRESTRICTED_RESOLVER` allows class loading, enabling remote code execution — and suppresses the notification service, where `ALLOWS_NOTHING_RESOLVER` prevents class instantiation.
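To make the idea of a conditional sink concrete, here is a minimal sketch in plain Java. It is not OpenTaint's rule format or engine: `ConditionalSinkSketch`, `ResolverPolicy`, `EngineState`, `isSink`, and the bean names are all hypothetical, invented for illustration. The two policies only mirror the `UNRESTRICTED_RESOLVER` and `ALLOWS_NOTHING_RESOLVER` settings described above.

```java
import java.util.List;

final class ConditionalSinkSketch {

    // Hypothetical stand-in for the class-resolver setting on a template engine.
    enum ResolverPolicy {
        UNRESTRICTED(true),     // mirrors UNRESTRICTED_RESOLVER: class loading allowed
        ALLOWS_NOTHING(false);  // mirrors ALLOWS_NOTHING_RESOLVER: class loading blocked

        final boolean permitsClassLoading;

        ResolverPolicy(boolean permitsClassLoading) {
            this.permitsClassLoading = permitsClassLoading;
        }
    }

    // Hypothetical abstract state an analyzer might attach to a receiver object.
    static final class EngineState {
        final String beanName;
        final ResolverPolicy policy;

        EngineState(String beanName, ResolverPolicy policy) {
            this.beanName = beanName;
            this.policy = policy;
        }
    }

    // The call counts as a sink only when tainted data arrives
    // AND the receiver's state satisfies the rule's condition.
    static boolean isSink(EngineState receiver, boolean argumentTainted) {
        return argumentTainted && receiver.policy.permitsClassLoading;
    }

    public static void main(String[] args) {
        List<EngineState> receivers = List.of(
            new EngineState("marketingTemplates", ResolverPolicy.UNRESTRICTED),
            new EngineState("notificationTemplates", ResolverPolicy.ALLOWS_NOTHING));
        for (EngineState r : receivers) {
            String verdict = isSink(r, true) ? "report" : "suppress";
            System.out.println(r.beanName + " -> " + verdict);
        }
    }
}
```

The point of the sketch is the shape of the decision, not the mechanics: taint reaching the call is necessary, and the receiver's state decides whether the finding is reported or suppressed.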
## Cross-Endpoint Flows
@@ -118,7 +114,7 @@ Detecting these stored vulnerabilities requires modeling data flow across persis
### Through the Database
-A messaging feature shows this pattern. A POST endpoint accepts user content and stores it via JPA; a separate GET endpoint retrieves that content and serves it as HTML.
+Imagine a per-thread message board — a small collaboration feature where users post short notes that other users read on the thread page. A POST endpoint creates each note and stores it in the database; the thread page renders the stored notes as HTML so links and formatting come through. Two endpoints, no shared code path. The controller and service below implement this.

<Mermaid chart={`sequenceDiagram
actor Attacker
@@ -273,30 +269,10 @@ Without column-level tracking, the choice is between flagging all three endpoint
The same logic applies to sanitizers at read time. The `GET /api/messages/{id}/content/safe` endpoint passes content through `HtmlUtils.htmlEscape()` before returning it — OpenTaint sees the sanitizer and suppresses the finding for that path as well.
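As a rough illustration of how per-column taint and a read-time sanitizer combine, here is a minimal sketch in plain Java. It is not how OpenTaint is implemented: `ColumnTaintSketch`, `TAINTED_AT_WRITE`, `SANITIZERS`, and `shouldReport` are hypothetical names, the `title` column is an assumed example, and treating unknown columns as tainted is a design choice of this sketch.

```java
import java.util.Map;
import java.util.Set;

final class ColumnTaintSketch {

    // Hypothetical taint facts recorded at write time, one per persisted column.
    static final Map<String, Boolean> TAINTED_AT_WRITE = Map.of(
        "content", true,   // stored raw from the request
        "title", false);   // assumed to be escaped before it was stored

    // Hypothetical set of read-path calls the rules recognize as sanitizers.
    static final Set<String> SANITIZERS = Set.of("HtmlUtils.htmlEscape");

    // A read endpoint is reported only if the column it returns is tainted
    // and no recognized sanitizer sits between the read and the response.
    static boolean shouldReport(String column, String readPathCall) {
        // Assumption of this sketch: columns with no recorded fact count as tainted.
        boolean tainted = TAINTED_AT_WRITE.getOrDefault(column, true);
        boolean sanitized = readPathCall != null && SANITIZERS.contains(readPathCall);
        return tainted && !sanitized;
    }

    public static void main(String[] args) {
        System.out.println("raw content:     " + shouldReport("content", null));
        System.out.println("escaped content: " + shouldReport("content", "HtmlUtils.htmlEscape"));
        System.out.println("title:           " + shouldReport("title", null));
    }
}
```

Only the first case is reported: the tainted column returned without a sanitizer. The escaped read path and the column that was never tainted are both suppressed.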

-## Limitations
-
-We deliberately over-approximate in several areas to preserve coverage. False positives can appear from these deliberate trade-offs:
-
-1. **Path insensitivity** — the analyzer does not track conditional branches, so it may report findings for paths guarded by validation logic. This is a common trade-off in static analysis: tracking all branches is expensive and path explosion makes the naive analysis impractical at scale. The practical mitigation is to use recognized sanitizer functions — which the rules understand — instead of conditional checks.
-2. **Configuration insensitivity** — the analyzer currently does not model Spring bean profiles or `@Conditional` annotations, so it resolves all potential bean instances on `@Autowired` and may produce findings from configurations that aren't actually active. This over-approximation ensures no injection is missed through an unexpected bean, but can be noisy in projects with many profiles.
-
-Beyond these deliberate trade-offs, there are also straightforward gaps:
-
-1. **Missing rules** — a source, sink, sanitizer, or propagator is not covered in the rules. Fixed by adding more rules.
-2. **Analyzer bugs** are possible.
-
-If you see an uncaught issue you expect to be detected or vice-versa, report it via GitHub Issues or Discord.
-
## Conclusion

-Each section showed a harder problem: following data through dependency injection, connecting endpoints through persistence, and distinguishing dangerous fields from safe ones within those flows. Each required the engine to model a layer of Spring semantics that generic static analyzers ignore — bean wiring, JPA repository boundaries, template engine configuration, and per-column taint state.
+The call graph is the wrong primitive for framework-driven Java. Annotations replace explicit calls; persistence connects endpoints with no shared code; configuration decides whether a sink is a sink. An analyzer built on the call graph plus pattern matching cannot see these flows — not because they are rare, but because the abstraction is wrong. OpenTaint commits to a richer abstraction: bean wiring, persistence boundaries, conditional sinks, per-column taint. The cost is whole-program analysis that needs a build. The payoff is the findings the call graph alone cannot reach.

-On the demo project, these capabilities surface realistic vulnerabilities that conventional tools miss: cross-endpoint stored XSS through JPA and service fields, configuration-dependent SSTI in Freemarker, and deep DI-mediated injection paths — while suppressing sanitized columns and hardened configurations that would otherwise be false positives.
-
-Each new service, persistence layer, and injection boundary is another place for the data trail to break. OpenTaint follows it through all of them.
+Clone the [purpose-built Spring Boot demo](https://github.com/seqra/java-spring-demo) and reproduce every finding in this post.
For a side-by-side comparison of how Semgrep, CodeQL, and OpenTaint handle progressively harder XSS cases — from direct returns to builder patterns with virtual dispatch — see [Semgrep vs. CodeQL vs. OpenTaint: XSS Detection Depth Compared](/blog/semgrep-vs-codeql-vs-opentaint).
-
-All examples in this post come from a [purpose-built Spring Boot demo](https://github.com/seqra/java-spring-demo) — clone it to reproduce every finding.
-
-Try it on your own project — see the [quick start guide](https://github.com/seqra/opentaint#quick-start).