src/content/blog/spring-analyzer.mdx
8 additions & 8 deletions
@@ -1,6 +1,6 @@
---
title: "Taint Analysis for Spring: Security Beyond Syntax"
-
description: "AST-pattern matchers break where Spring's architecture begins — interprocedural flow across class boundaries, conditionally dangerous APIs configured at bean wiring time, JPA persistence. OpenTaint traces tainted data through every layer, from injected services to database storage to dangerous API calls, distinguishing raw columns from sanitized ones."
+
description: "AST-pattern matchers miss the data flows Spring's architecture creates: interprocedural flow across class boundaries, conditionally dangerous APIs configured at bean wiring time, and JPA persistence. OpenTaint traces tainted data through every layer, from injected services to database storage to dangerous API calls, distinguishing raw columns from sanitized ones."
-
Spring Boot's annotation-driven architecture creates data flows that are invisible to AST-pattern matchers. The mechanism is everywhere: dependency injection wires `@Autowired` beans together at startup with no call site the parser can see; a template engine's configuration decides — at runtime, from a flag set in another class — whether the call to `template.process()` is exploitable or harmless; JPA persistence links two HTTP endpoints through a database row with no shared code path. Three different invisibilities, three different framework features, the same blind spot.
+
Spring Boot wires an application together with annotations, and that creates data flows a pattern matcher reading one file at a time cannot see. `@Autowired` beans are connected at startup, with no call site in the source. A template call can be safe or exploitable depending on a flag set in another class. Two endpoints can be linked by nothing more than a row in the database. None of this is unusual — it is how most Java web applications are built.
-
These are not edge cases. They are the default architecture of most Java web applications. The post walks three progressively harder challenges — following data across function and class boundaries, recognizing when an `@Autowired` constructor makes an otherwise-benign call dangerous, and connecting endpoints through persistence with per-column precision — and shows what each demands of the engine. AST-pattern matchers plateau at the first; OpenTaint models all three.
+
This post works through three cases, each harder than the last. First, following data across function and class boundaries. Then, recognizing when an `@Autowired` constructor turns a harmless call dangerous. Finally, connecting two endpoints through stored data, with enough precision to tell a sanitized column from a raw one. A pattern matcher stops at the first. OpenTaint handles all three.
## Single-Request Flows
@@ -29,7 +29,7 @@ For JVM languages, OpenTaint operates on bytecode rather than source text. This
### Following Data Across Function and Class Boundaries
-
Consider a campaign management endpoint that lets users preview custom templates. The controller receives a JSON request body and delegates to an `@Autowired` service:
+
A campaign management endpoint lets users preview custom templates. The controller receives a JSON request body and delegates to an `@Autowired` service:
```java
// CampaignController.java
@@ -72,7 +72,7 @@ public class TemplateRenderingService {
OpenTaint traces the complete path: `@RequestBody RenderRequest` → `renderFromRequest()` → `request.getTemplateContent()` → `renderFromContent()` → `templateEngine.process()`. The data crosses a class boundary, passes through DTO field access, and flows through an `@Autowired` service — all tracked as a single inter-procedural data flow.
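One way to picture what such a trace involves is a reachability search over a flow graph. The sketch below is a toy model and not OpenTaint's engine: the `TaintPathSketch` class and its methods are hypothetical names, and the edges are written by hand from the path listed above rather than derived from bytecode.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a hand-built flow graph searched breadth-first.
// Hypothetical class; node labels are copied from the path described in the post.
final class TaintPathSketch {
    private final Map<String, List<String>> edges = new HashMap<>();

    void addFlow(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Returns the first source-to-sink path found, or an empty list if there is none.
    List<String> findPath(String source, String sink) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(source, null); // the source has no predecessor
        queue.add(source);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (node.equals(sink)) {
                List<String> path = new ArrayList<>();
                for (String n = node; n != null; n = parent.get(n)) {
                    path.add(n);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : edges.getOrDefault(node, List.of())) {
                if (!parent.containsKey(next)) { // visit each node once
                    parent.put(next, node);
                    queue.add(next);
                }
            }
        }
        return List.of();
    }

    // Assumption: one edge per hop in the path, entered manually.
    static TaintPathSketch campaignPreviewGraph() {
        TaintPathSketch g = new TaintPathSketch();
        g.addFlow("@RequestBody RenderRequest", "renderFromRequest()");
        g.addFlow("renderFromRequest()", "request.getTemplateContent()");
        g.addFlow("request.getTemplateContent()", "renderFromContent()");
        g.addFlow("renderFromContent()", "templateEngine.process()");
        return g;
    }

    public static void main(String[] args) {
        List<String> path = campaignPreviewGraph()
                .findPath("@RequestBody RenderRequest", "templateEngine.process()");
        System.out.println(String.join(" -> ", path));
    }
}
```

A real analyzer has to discover these edges itself, including the ones created by bean injection and DTO field access. The search over them is the easy part.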
-
Tracing the chain across function and class boundaries is necessary but not sufficient. With Thymeleaf, once the trace reaches `templateEngine.process()` on a user-controlled body, the call is exploitable on its own — the API and the taint source are enough to confirm the finding. Other engines aren't so obliging. Freemarker's `template.process()`, for instance, is exploitable only when the engine was wired up with a permissive class resolver — and that choice is made inside the engine's `@Autowired` constructor.
+
Tracing the chain across function and class boundaries is necessary but not sufficient. With Thymeleaf, once the trace reaches `templateEngine.process()` on a user-controlled body, the call is exploitable on its own — the API and the taint source are enough to confirm the finding. Not every template engine is this simple. Freemarker's `template.process()`, for instance, is exploitable only when the engine was wired up with a permissive class resolver — and that choice is made inside the engine's `@Autowired` constructor.
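The difference between an unconditionally dangerous call and a conditionally dangerous one can be sketched as a predicate over two facts: whether the template body is tainted, and what state the receiver was left in by its constructor. Everything in the sketch below is hypothetical (`ResolverPolicy`, `EngineState`, `isFinding`); it models the idea only and does not use Freemarker's or OpenTaint's actual API.

```java
// Toy model only; all names here are hypothetical and stand in for
// whatever abstract state a real analyzer tracks for the engine bean.
final class ConditionalSinkSketch {
    enum ResolverPolicy { PERMISSIVE, RESTRICTED }

    // Abstract state of a template engine bean after its constructor has run.
    static final class EngineState {
        final ResolverPolicy resolverPolicy;

        EngineState(ResolverPolicy resolverPolicy) {
            this.resolverPolicy = resolverPolicy;
        }
    }

    // A process() call is reported only when the body is tainted AND
    // the receiver was configured with a permissive class resolver.
    static boolean isFinding(boolean templateBodyTainted, EngineState receiver) {
        return templateBodyTainted
                && receiver.resolverPolicy == ResolverPolicy.PERMISSIVE;
    }

    public static void main(String[] args) {
        EngineState permissive = new EngineState(ResolverPolicy.PERMISSIVE);
        EngineState restricted = new EngineState(ResolverPolicy.RESTRICTED);
        System.out.println("tainted + permissive: " + isFinding(true, permissive));
        System.out.println("tainted + restricted: " + isFinding(true, restricted));
    }
}
```

The point of the model is that the second input cannot be read off the call site. It has to be recovered from wherever the bean was constructed.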
### When Autowired Constructors Matter
@@ -110,13 +110,13 @@ OpenTaint resolves `@Autowired` bean constructors and tracks the receiver state
## Cross-Endpoint Flows
-
Single-request flows, however complex, have a property that makes them tractable: a code path connects the user input to the dangerous call. Cross-endpoint vulnerabilities don't have this property. An attacker submits a payload through one endpoint; a different endpoint reads it and renders it. No code path connects the two — the database or service state is the only link.
+
Single-request flows, however complex, have one thing in common: a single code path connects the user input to the dangerous call. Cross-endpoint vulnerabilities don't. An attacker submits a payload through one endpoint, and a different endpoint reads it back and renders it. No code path connects the two. The only link is the database, or some state the two endpoints share.
Detecting these stored vulnerabilities requires modeling data flow across persistence boundaries, not just within them.
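As a rough illustration of what modeling a persistence boundary can mean, the sketch below keeps one taint bit per column: a write records whether raw user input reaches the column, and a read from that column inherits the answer. This is a simplified assumption about the approach, not OpenTaint's implementation, and the class, method, and column names (`StoredTaintSketch`, `note.body`, `note.body_sanitized`) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: per-column taint summary shared by write and read endpoints.
// Hypothetical names throughout; a real engine derives these facts from the code.
final class StoredTaintSketch {
    // column name -> true if any write stores unsanitized user input into it
    private final Map<String, Boolean> rawColumns = new HashMap<>();

    // Called for each store observed in a write endpoint.
    void recordWrite(String column, boolean valueTainted, boolean sanitizedBeforeStore) {
        boolean raw = valueTainted && !sanitizedBeforeStore;
        // Once a column has received raw input, it stays marked.
        rawColumns.merge(column, raw, Boolean::logicalOr);
    }

    // Called for each load observed in a read endpoint.
    boolean readIsTainted(String column) {
        return rawColumns.getOrDefault(column, false);
    }

    public static void main(String[] args) {
        StoredTaintSketch db = new StoredTaintSketch();
        db.recordWrite("note.body", true, false);          // raw user input stored
        db.recordWrite("note.body_sanitized", true, true); // sanitized before store
        System.out.println("note.body tainted: " + db.readIsTainted("note.body"));
        System.out.println("note.body_sanitized tainted: "
                + db.readIsTainted("note.body_sanitized"));
    }
}
```

Tracking columns separately is what lets a read of the sanitized column stay quiet while a read of the raw column in the same row is still reported.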
### Through the Database
-
Imagine a per-thread message board — a small collaboration feature where users post short notes that other users read on the thread page. A POST endpoint creates each note and stores it in the database; the thread page renders the stored notes as HTML so links and formatting come through. Two endpoints, no shared code path. The controller and service below implement this.
+
Take a per-thread message board, a small collaboration feature where users post short notes that others read on the thread page. A POST endpoint creates each note and stores it in the database. The thread page renders the stored notes as HTML, so links and formatting come through. Two endpoints, no shared code path. The controller and service below implement it.
<Mermaid chart={`sequenceDiagram
actor Attacker
@@ -273,7 +273,7 @@ The same logic applies to sanitizers at read time. The `GET /api/messages/{id}/c
## Conclusion
-
In framework-driven Java, the data flow that matters spans the whole program — long call chains across class boundaries, `@Autowired` constructor configuration that decides whether a call is dangerous, JPA persistence joining endpoints with no shared code. Spring assembles these connections at startup; reading the source one file at a time can't follow them. No amount of pattern depth fixes that — the abstraction itself is wrong. OpenTaint commits to a richer abstraction: bean wiring, persistence boundaries, conditionally dangerous APIs, per-column taint. The cost is a successful build before scanning, and whole-program analysis instead of file-by-file. The payoff is the findings that syntactic analysis alone cannot reach.
+
In framework-driven Java, the data flow that matters spans the whole program: long call chains across class boundaries, `@Autowired` constructor configuration that decides whether a call is dangerous, and JPA persistence that joins endpoints with no shared code. Spring assembles these connections at startup, so reading the source one file at a time cannot follow them. Making the rules deeper does not help, because the problem is not the rules. It is what the analyzer looks at. OpenTaint looks at more: bean wiring, persistence boundaries, conditionally dangerous APIs, and per-column taint. That costs a successful build before scanning, and whole-program analysis instead of file-by-file. In return, it finds the bugs that pattern matching alone cannot reach.
Clone the [purpose-built Spring Boot demo](https://github.com/seqra/java-spring-demo) and reproduce every finding in this post.