Skip to content

Add Java taint engine coverage#459

Open
Darkroom4364 wants to merge 1 commit into
mainfrom
issue-449-java-taint
Open

Add Java taint engine coverage#459
Darkroom4364 wants to merge 1 commit into
mainfrom
issue-449-java-taint

Conversation

@Darkroom4364
Copy link
Copy Markdown
Collaborator

@Darkroom4364 Darkroom4364 commented May 27, 2026

Summary

  • add a Java taint engine for servlet and Spring MVC sources
  • register Java taint rules for command injection, SQL injection, SSRF, and unsafe deserialization
  • add positive, negative, realistic, and cross-language matrix coverage
  • document Java taint scope, sources, sinks, and sanitizer behavior

Closes #449.

Verification

  • cargo fmt --check
  • cargo test java_taint
  • cargo test realistic_java_spring_controller
  • cargo test taint_explain_matrix_cells_render_traces
  • cargo test rule_inventory
  • cargo clippy --all-targets --all-features -- -D warnings
  • target/debug/foxguard --config /dev/null --severity medium src/rules/java_taint.rs -f json
  • cargo test

Summary by CodeRabbit

  • New Features

    • Added Java taint analysis with four security rules: SQL injection, command injection, SSRF, and unsafe deserialization detection.
    • Extended taint engine support from three to six languages.
  • Documentation

    • Updated taint tracking documentation with expanded language support details and Java framework coverage.
  • Tests

    • Added comprehensive test coverage for Java taint analysis including real-world Spring controller scenarios.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 27, 2026

📝 Walkthrough

Walkthrough

This PR implements Java taint analysis by adding a new intraprocedural engine in src/rules/java_taint.rs, integrating it into the rule framework via src/rules/java.rs, wiring it into scanner dispatch, and including comprehensive test fixtures and documentation updates covering SQL injection, command injection, SSRF, and unsafe deserialization patterns.

Changes

Java Taint Engine

Layer / File(s) Summary
Core taint analysis engine
src/rules/java_taint.rs
Implements intraprocedural, flow-insensitive taint analysis: collects sources from servlet request methods, Spring @RequestParam annotations, and System.getenv; propagates taint through variable initializers and assignments over three passes; detects sinks via method invocation and object creation matchers; emits TaintFinding with source/sink metadata and hop counts. Includes four per-rule TaintSpec definitions (SQL, command, SSRF, deserialization) and unit tests covering positive and negative cases.
Rule framework integration
src/rules/java.rs, src/rules/mod.rs
Maps Java taint findings into the Finding schema via map_java_taint_finding and JavaTaintRuleMeta, implements run_java_taint_batched and run_java_taint_single executors, exports four new taint rules (TaintSqlInjection, TaintCommandInjection, TaintSsrf, TaintUnsafeDeserialization), adds pub mod java_taint, introduces TaintEngine::Java variant, and wires Java taint specs into rule registration and language selection.
Scanner dispatch
src/engine/scanner.rs
Adds Java taint rule execution block that filters enabled rules by TaintEngine::Java and invokes run_java_taint_batched before Python taint dispatch.
Test fixtures and integration
tests/fixtures/vulnerable_java_taint.java, tests/fixtures/safe_java_taint.java, tests/fixtures/realistic/java_spring_controller.java, tests/integration.rs, tests/realistic_fixtures.rs, tests/cross_language_matrix.rs
Three fixtures demonstrating vulnerable flows (SQL/command/SSRF/deserialization), safe patterns, and realistic Spring controller scenarios. Integration tests validate positive (vulnerable fixture catches 1 finding per rule) and negative (safe fixture produces 0 findings) outcomes, realistic fixture assertion of 7 findings, and cross-language matrix coverage update marking Java taint_explain as Covered.
Documentation and rule registry
docs/taint-tracking.md, www/src/data/rules.ts
Updates language support from three to six languages, expands known limitations, adds "Supported Java frameworks" section documenting servlet/Spring sources and analysis scope, rewrites performance section for multi-language context, and registers four new Java taint rules with CWE mappings.

Sequence Diagram

sequenceDiagram
  participant Scanner as Scanner
  participant JavaTaint as java_taint<br/>(analyze_tree)
  participant Sources as Param/Request<br/>Source Matchers
  participant Propagate as Taint<br/>Propagation
  participant Sinks as SQL/Command/<br/>SSRF/Deser Sinks
  participant Finding as Finding<br/>Mapper
  Scanner->>JavaTaint: per-rule TaintSpec
  JavaTaint->>Sources: collect `@RequestParam/servlet` params
  Sources->>Propagate: tainted variable names
  Propagate->>Sinks: taint state after 3 passes
  Sinks->>Sinks: match invocations to sink specs
  Sinks->>Finding: emit when arg/receiver tainted
  Finding->>Scanner: mapped Finding objects
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • 0sec-labs/foxguard#453: Adds C taint engine using the same scanner dispatch and run_*_taint_batched pattern.
  • 0sec-labs/foxguard#397: Updates docs/taint-tracking.md and TaintEngine infrastructure around language selection.

Suggested reviewers

  • peaktwilight

Poem

🐰 A rabbit's verse on taint flows bright,
Java sinks detected left and right,
SQL, command, SSRF's call,
Deserialization caught by all!
Sources mapped and hopcount tracked,
Six languages now fully stacked! 🔐

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 35.53% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and concisely describes the main change: adding Java taint engine coverage, which is the primary focus of the pull request.
Linked Issues check ✅ Passed The PR fully addresses all acceptance criteria from issue #449: Java is chosen and documented, engine emits all required metadata (source_line, source_description, sink_line, sink_description, confidence), four vulnerability types are implemented (command/SQL/SSRF/deserialization), comprehensive tests are added, and docs are updated.
Out of Scope Changes check ✅ Passed All changes are directly related to Java taint engine implementation and testing. Documentation updates, test fixtures, rule implementations, and integration tests are all within the defined scope of issue #449.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch issue-449-java-taint

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

void runExport(HttpServletRequest request) throws Exception {
String report = request.getParameter("report");
String command = "/usr/local/bin/export " + report;
Runtime.getRuntime().exec(command);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foxguard · CRITICAL · java/no-command-injection (CWE-78)

Runtime.exec() called with dynamic argument — risk of command injection


String proxy(HttpServletRequest req) throws Exception {
String url = req.getParameter("url");
URL target = new URL(url);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foxguard · HIGH · java/no-ssrf (CWE-918)

new URL() with dynamic argument — validate and allowlist target hosts to prevent SSRF

Object importSnapshot(HttpServletRequest request) throws Exception {
InputStream input = request.getInputStream();
ObjectInputStream stream = new ObjectInputStream(input);
return stream.readObject();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foxguard · CRITICAL · java/no-unsafe-deserialization (CWE-502)

readObject() on untrusted data can lead to remote code execution — use allowlist-based deserialization

void deserialize(HttpServletRequest request) throws Exception {
InputStream input = request.getInputStream();
ObjectInputStream stream = new ObjectInputStream(new ByteArrayInputStream(new byte[0]));
stream.readObject();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foxguard · CRITICAL · java/no-unsafe-deserialization (CWE-502)

readObject() on untrusted data can lead to remote code execution — use allowlist-based deserialization


void command(HttpServletRequest request) throws Exception {
String command = request.getParameter("cmd");
Runtime.getRuntime().exec(command);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foxguard · CRITICAL · java/no-command-injection (CWE-78)

Runtime.exec() called with dynamic argument — risk of command injection


void ssrf(HttpServletRequest req) throws Exception {
String target = req.getParameter("url");
new URL(target);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foxguard · HIGH · java/no-ssrf (CWE-918)

new URL() with dynamic argument — validate and allowlist target hosts to prevent SSRF

void deserialize(HttpServletRequest request) throws Exception {
InputStream input = request.getInputStream();
ObjectInputStream stream = new ObjectInputStream(input);
stream.readObject();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foxguard · CRITICAL · java/no-unsafe-deserialization (CWE-502)

readObject() on untrusted data can lead to remote code execution — use allowlist-based deserialization

Comment thread tests/integration.rs

#[test]
fn test_vulnerable_java_taint_catches_every_flow() {
let output = foxguard_cmd_isolated()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)

.expect() can panic at runtime — use proper error handling with ? or match

Comment thread tests/integration.rs

#[test]
fn test_safe_java_taint_has_no_taint_findings() {
let output = foxguard_cmd_isolated()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)

.expect() can panic at runtime — use proper error handling with ? or match

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/rules/java_taint.rs`:
- Around line 268-299: collect_param_sources currently traverses the entire
subtree via walk_tree(scope_node, ...) and thus picks up formal_parameter nodes
inside nested lambdas, causing parent-scope taints; change the traversal so it
only considers formal_parameter nodes directly declared in the current scope
(e.g., iterate the immediate children of scope_node or stop descent when
encountering lambda/method nodes) before calling formal_parameter_name/node_text
and state.taint, ensuring nested function/lambda nodes are not descended into.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 60810c0e-6cc4-4c8d-9f67-e3dc0dc13914

📥 Commits

Reviewing files that changed from the base of the PR and between 84f6ae5 and f073871.

📒 Files selected for processing (12)
  • docs/taint-tracking.md
  • src/engine/scanner.rs
  • src/rules/java.rs
  • src/rules/java_taint.rs
  • src/rules/mod.rs
  • tests/cross_language_matrix.rs
  • tests/fixtures/realistic/java_spring_controller.java
  • tests/fixtures/safe_java_taint.java
  • tests/fixtures/vulnerable_java_taint.java
  • tests/integration.rs
  • tests/realistic_fixtures.rs
  • www/src/data/rules.ts

Comment thread src/rules/java_taint.rs
Comment on lines +268 to +299
walk_tree(scope_node, source, &mut |node, src| {
if node.kind() != "formal_parameter" {
return;
}
let Some(name) = formal_parameter_name(node, src) else {
return;
};
let text = node_text(node, src);
let matched_annotation = annotation_names
.iter()
.find(|annotation| text.contains(&format!("@{annotation}")));

if let Some(annotation) = matched_annotation {
state.taint(
name.to_string(),
TaintInfo {
description: format!("@{annotation} parameter '{name}'"),
line: node.start_position().row + 1,
hops: 0,
},
);
} else if bare_names.contains(&name) {
state.taint(
name.to_string(),
TaintInfo {
description: format!("parameter '{name}'"),
line: node.start_position().row + 1,
hops: 0,
},
);
}
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid collecting parameter sources from nested scopes.

collect_param_sources currently walks the full subtree of the scope and will include formal_parameter nodes from nested lambdas. That can taint names in the parent scope incorrectly and create false positives.

Suggested fix
-    walk_tree(scope_node, source, &mut |node, src| {
+    walk_scope_nodes(scope_node, source, &mut |node, src| {
         if node.kind() != "formal_parameter" {
             return;
         }
         let Some(name) = formal_parameter_name(node, src) else {
             return;
         };
         let text = node_text(node, src);
         let matched_annotation = annotation_names
             .iter()
             .find(|annotation| text.contains(&format!("@{annotation}")));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
walk_tree(scope_node, source, &mut |node, src| {
if node.kind() != "formal_parameter" {
return;
}
let Some(name) = formal_parameter_name(node, src) else {
return;
};
let text = node_text(node, src);
let matched_annotation = annotation_names
.iter()
.find(|annotation| text.contains(&format!("@{annotation}")));
if let Some(annotation) = matched_annotation {
state.taint(
name.to_string(),
TaintInfo {
description: format!("@{annotation} parameter '{name}'"),
line: node.start_position().row + 1,
hops: 0,
},
);
} else if bare_names.contains(&name) {
state.taint(
name.to_string(),
TaintInfo {
description: format!("parameter '{name}'"),
line: node.start_position().row + 1,
hops: 0,
},
);
}
});
walk_scope_nodes(scope_node, source, &mut |node, src| {
if node.kind() != "formal_parameter" {
return;
}
let Some(name) = formal_parameter_name(node, src) else {
return;
};
let text = node_text(node, src);
let matched_annotation = annotation_names
.iter()
.find(|annotation| text.contains(&format!("@{annotation}")));
if let Some(annotation) = matched_annotation {
state.taint(
name.to_string(),
TaintInfo {
description: format!("@{annotation} parameter '{name}'"),
line: node.start_position().row + 1,
hops: 0,
},
);
} else if bare_names.contains(&name) {
state.taint(
name.to_string(),
TaintInfo {
description: format!("parameter '{name}'"),
line: node.start_position().row + 1,
hops: 0,
},
);
}
});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rules/java_taint.rs` around lines 268 - 299, collect_param_sources
currently traverses the entire subtree via walk_tree(scope_node, ...) and thus
picks up formal_parameter nodes inside nested lambdas, causing parent-scope
taints; change the traversal so it only considers formal_parameter nodes
directly declared in the current scope (e.g., iterate the immediate children of
scope_node or stop descent when encountering lambda/method nodes) before calling
formal_parameter_name/node_text and state.taint, ensuring nested function/lambda
nodes are not descended into.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add next-language taint engine coverage for Java or C#

1 participant