Add Java taint engine coverage#459
Conversation
📝 WalkthroughWalkthroughThis PR implements Java taint analysis by adding a new intraprocedural engine in ChangesJava Taint Engine
Sequence DiagramsequenceDiagram
participant Scanner as Scanner
participant JavaTaint as java_taint<br/>(analyze_tree)
participant Sources as Param/Request<br/>Source Matchers
participant Propagate as Taint<br/>Propagation
participant Sinks as SQL/Command/<br/>SSRF/Deser Sinks
participant Finding as Finding<br/>Mapper
Scanner->>JavaTaint: per-rule TaintSpec
JavaTaint->>Sources: collect `@RequestParam/servlet` params
Sources->>Propagate: tainted variable names
Propagate->>Sinks: taint state after 3 passes
Sinks->>Sinks: match invocations to sink specs
Sinks->>Finding: emit when arg/receiver tainted
Finding->>Scanner: mapped Finding objects
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| void runExport(HttpServletRequest request) throws Exception { | ||
| String report = request.getParameter("report"); | ||
| String command = "/usr/local/bin/export " + report; | ||
| Runtime.getRuntime().exec(command); |
There was a problem hiding this comment.
foxguard · CRITICAL · java/no-command-injection (CWE-78)
Runtime.exec() called with dynamic argument — risk of command injection
|
|
||
| String proxy(HttpServletRequest req) throws Exception { | ||
| String url = req.getParameter("url"); | ||
| URL target = new URL(url); |
There was a problem hiding this comment.
foxguard · HIGH · java/no-ssrf (CWE-918)
new URL() with dynamic argument — validate and allowlist target hosts to prevent SSRF
| Object importSnapshot(HttpServletRequest request) throws Exception { | ||
| InputStream input = request.getInputStream(); | ||
| ObjectInputStream stream = new ObjectInputStream(input); | ||
| return stream.readObject(); |
There was a problem hiding this comment.
foxguard · CRITICAL · java/no-unsafe-deserialization (CWE-502)
readObject() on untrusted data can lead to remote code execution — use allowlist-based deserialization
| void deserialize(HttpServletRequest request) throws Exception { | ||
| InputStream input = request.getInputStream(); | ||
| ObjectInputStream stream = new ObjectInputStream(new ByteArrayInputStream(new byte[0])); | ||
| stream.readObject(); |
There was a problem hiding this comment.
foxguard · CRITICAL · java/no-unsafe-deserialization (CWE-502)
readObject() on untrusted data can lead to remote code execution — use allowlist-based deserialization
|
|
||
| void command(HttpServletRequest request) throws Exception { | ||
| String command = request.getParameter("cmd"); | ||
| Runtime.getRuntime().exec(command); |
There was a problem hiding this comment.
foxguard · CRITICAL · java/no-command-injection (CWE-78)
Runtime.exec() called with dynamic argument — risk of command injection
|
|
||
| void ssrf(HttpServletRequest req) throws Exception { | ||
| String target = req.getParameter("url"); | ||
| new URL(target); |
There was a problem hiding this comment.
foxguard · HIGH · java/no-ssrf (CWE-918)
new URL() with dynamic argument — validate and allowlist target hosts to prevent SSRF
| void deserialize(HttpServletRequest request) throws Exception { | ||
| InputStream input = request.getInputStream(); | ||
| ObjectInputStream stream = new ObjectInputStream(input); | ||
| stream.readObject(); |
There was a problem hiding this comment.
foxguard · CRITICAL · java/no-unsafe-deserialization (CWE-502)
readObject() on untrusted data can lead to remote code execution — use allowlist-based deserialization
|
|
||
| #[test] | ||
| fn test_vulnerable_java_taint_catches_every_flow() { | ||
| let output = foxguard_cmd_isolated() |
There was a problem hiding this comment.
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
|
|
||
| #[test] | ||
| fn test_safe_java_taint_has_no_taint_findings() { | ||
| let output = foxguard_cmd_isolated() |
There was a problem hiding this comment.
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/rules/java_taint.rs`:
- Around line 268-299: collect_param_sources currently traverses the entire
subtree via walk_tree(scope_node, ...) and thus picks up formal_parameter nodes
inside nested lambdas, causing parent-scope taints; change the traversal so it
only considers formal_parameter nodes directly declared in the current scope
(e.g., iterate the immediate children of scope_node or stop descent when
encountering lambda/method nodes) before calling formal_parameter_name/node_text
and state.taint, ensuring nested function/lambda nodes are not descended into.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 60810c0e-6cc4-4c8d-9f67-e3dc0dc13914
📒 Files selected for processing (12)
docs/taint-tracking.mdsrc/engine/scanner.rssrc/rules/java.rssrc/rules/java_taint.rssrc/rules/mod.rstests/cross_language_matrix.rstests/fixtures/realistic/java_spring_controller.javatests/fixtures/safe_java_taint.javatests/fixtures/vulnerable_java_taint.javatests/integration.rstests/realistic_fixtures.rswww/src/data/rules.ts
| walk_tree(scope_node, source, &mut |node, src| { | ||
| if node.kind() != "formal_parameter" { | ||
| return; | ||
| } | ||
| let Some(name) = formal_parameter_name(node, src) else { | ||
| return; | ||
| }; | ||
| let text = node_text(node, src); | ||
| let matched_annotation = annotation_names | ||
| .iter() | ||
| .find(|annotation| text.contains(&format!("@{annotation}"))); | ||
|
|
||
| if let Some(annotation) = matched_annotation { | ||
| state.taint( | ||
| name.to_string(), | ||
| TaintInfo { | ||
| description: format!("@{annotation} parameter '{name}'"), | ||
| line: node.start_position().row + 1, | ||
| hops: 0, | ||
| }, | ||
| ); | ||
| } else if bare_names.contains(&name) { | ||
| state.taint( | ||
| name.to_string(), | ||
| TaintInfo { | ||
| description: format!("parameter '{name}'"), | ||
| line: node.start_position().row + 1, | ||
| hops: 0, | ||
| }, | ||
| ); | ||
| } | ||
| }); |
There was a problem hiding this comment.
Avoid collecting parameter sources from nested scopes.
collect_param_sources currently walks the full subtree of the scope and will include formal_parameter nodes from nested lambdas. That can taint names in the parent scope incorrectly and create false positives.
Suggested fix
- walk_tree(scope_node, source, &mut |node, src| {
+ walk_scope_nodes(scope_node, source, &mut |node, src| {
if node.kind() != "formal_parameter" {
return;
}
let Some(name) = formal_parameter_name(node, src) else {
return;
};
let text = node_text(node, src);
let matched_annotation = annotation_names
.iter()
.find(|annotation| text.contains(&format!("@{annotation}")));📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| walk_tree(scope_node, source, &mut |node, src| { | |
| if node.kind() != "formal_parameter" { | |
| return; | |
| } | |
| let Some(name) = formal_parameter_name(node, src) else { | |
| return; | |
| }; | |
| let text = node_text(node, src); | |
| let matched_annotation = annotation_names | |
| .iter() | |
| .find(|annotation| text.contains(&format!("@{annotation}"))); | |
| if let Some(annotation) = matched_annotation { | |
| state.taint( | |
| name.to_string(), | |
| TaintInfo { | |
| description: format!("@{annotation} parameter '{name}'"), | |
| line: node.start_position().row + 1, | |
| hops: 0, | |
| }, | |
| ); | |
| } else if bare_names.contains(&name) { | |
| state.taint( | |
| name.to_string(), | |
| TaintInfo { | |
| description: format!("parameter '{name}'"), | |
| line: node.start_position().row + 1, | |
| hops: 0, | |
| }, | |
| ); | |
| } | |
| }); | |
| walk_scope_nodes(scope_node, source, &mut |node, src| { | |
| if node.kind() != "formal_parameter" { | |
| return; | |
| } | |
| let Some(name) = formal_parameter_name(node, src) else { | |
| return; | |
| }; | |
| let text = node_text(node, src); | |
| let matched_annotation = annotation_names | |
| .iter() | |
| .find(|annotation| text.contains(&format!("@{annotation}"))); | |
| if let Some(annotation) = matched_annotation { | |
| state.taint( | |
| name.to_string(), | |
| TaintInfo { | |
| description: format!("@{annotation} parameter '{name}'"), | |
| line: node.start_position().row + 1, | |
| hops: 0, | |
| }, | |
| ); | |
| } else if bare_names.contains(&name) { | |
| state.taint( | |
| name.to_string(), | |
| TaintInfo { | |
| description: format!("parameter '{name}'"), | |
| line: node.start_position().row + 1, | |
| hops: 0, | |
| }, | |
| ); | |
| } | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/rules/java_taint.rs` around lines 268 - 299, collect_param_sources
currently traverses the entire subtree via walk_tree(scope_node, ...) and thus
picks up formal_parameter nodes inside nested lambdas, causing parent-scope
taints; change the traversal so it only considers formal_parameter nodes
directly declared in the current scope (e.g., iterate the immediate children of
scope_node or stop descent when encountering lambda/method nodes) before calling
formal_parameter_name/node_text and state.taint, ensuring nested function/lambda
nodes are not descended into.
Summary
Closes #449.
Verification
Summary by CodeRabbit
New Features
Documentation
Tests