feature: Malicious packages scanner [TAROT-3600] #175
Pull Request Overview
This PR integrates OpenSSF's malicious packages database into Trivy to detect known malicious packages. This adds a new "Malicious packages detection" rule at the highest severity level to identify packages containing malware, typosquatting attacks, and dependency confusion attacks.
- Implements OpenSSF scanner with OSV format parsing for malicious package detection
- Adds new `malicious_packages` rule with critical severity level
- Integrates scanning into the main tool execution flow
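As a rough illustration of the OSV parsing mentioned above, here is a minimal sketch in Go. It is not the scanner added in this PR: `osvEntry`, `matchMalicious`, and the sample record (including the `MAL-0000-0000` ID and the package name) are hypothetical. It reads only the `id`, `summary`, and `affected[].package` / `affected[].versions` fields of an OSV record and ignores `ranges`, which a real scanner would also need to evaluate.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// osvEntry models only the OSV fields this sketch needs (hypothetical type).
type osvEntry struct {
	ID       string `json:"id"`
	Summary  string `json:"summary"`
	Affected []struct {
		Package struct {
			Ecosystem string `json:"ecosystem"`
			Name      string `json:"name"`
		} `json:"package"`
		Versions []string `json:"versions"`
	} `json:"affected"`
}

// matchMalicious reports whether the OSV record lists the given package version.
// Assumption: a record with no explicit versions flags every version of the package.
// OSV "ranges" are deliberately not handled here.
func matchMalicious(raw []byte, ecosystem, name, version string) (string, bool, error) {
	var e osvEntry
	if err := json.Unmarshal(raw, &e); err != nil {
		return "", false, fmt.Errorf("parse OSV record: %w", err)
	}
	for _, a := range e.Affected {
		if !strings.EqualFold(a.Package.Ecosystem, ecosystem) || a.Package.Name != name {
			continue
		}
		if len(a.Versions) == 0 {
			return e.ID, true, nil
		}
		for _, v := range a.Versions {
			if v == version {
				return e.ID, true, nil
			}
		}
	}
	return "", false, nil
}

// sampleRecord is made-up data in OSV shape, used only for this example.
const sampleRecord = `{
  "id": "MAL-0000-0000",
  "summary": "Malicious code in example-malicious-pkg (npm)",
  "affected": [
    {
      "package": {"ecosystem": "npm", "name": "example-malicious-pkg"},
      "versions": ["1.0.0", "1.0.1"]
    }
  ]
}`

func main() {
	id, hit, err := matchMalicious([]byte(sampleRecord), "npm", "example-malicious-pkg", "1.0.1")
	fmt.Println(id, hit, err)
}
```

Matching on exact version strings keeps the sketch short; comparing against version ranges is where the semver and ecosystem-specific error handling raised later in this review would come in.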
Reviewed Changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| internal/tool/tool.go | Integrates OpenSSF scanner into main execution flow and adds new rule pattern |
| internal/tool/openssf_scanner.go | Core implementation of OpenSSF malicious packages scanner with OSV parsing |
| internal/docgen/rule.go | Adds rule definition for malicious packages detection |
| docs/patterns.json | Adds pattern configuration for malicious packages rule |
| docs/multiple-tests/pattern-malicious/* | Test files demonstrating malicious package detection |
| docs/description/* | Documentation files for the new pattern |
| Dockerfile | Copies OpenSSF cache data into container |
| .circleci/config.yml | Downloads OpenSSF database during CI build |
afsmeira left a comment
There are still comments to address, and unit tests to add.
…s to accelerate scanning.
… package-lock.json also
… reporting in openssf scanner
204371d to 438d8ef
4f94427 to 0cfdc38
0cfdc38 to d6f2a33
6a64afb to b96b28d
Pull Request Overview
Adds OpenSSF malicious-packages detection: CI steps to build the index, packaging the index into the image, a MaliciousPackagesScanner with tests, wiring into codacyTrivy, docs and a build script. Static analysis shows one existing Revive warning about exported New returning an unexported type; PR changes intentionally modify New signature to return (*codacyTrivy, error). Overall good coverage (lots of unit tests). Key risks: error handling around semver/ecosystem versions, potential performance/memory impacts loading a ~227MB DB at runtime (even gzipped), and a behavioral change in New() API.
About this PR
Loading the full OpenSSF index into memory (even gzipped) could increase container start time and memory. Consider measuring memory use and supporting a streaming/lookup-backed index or lazy loading, and add runtime telemetry or config to opt-out in constrained environments.
Medium risk | High confidence
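One way to read the lazy-loading suggestion above is sketched below in Go. Everything here is an assumption for illustration: the `lazyIndex` type, the `"ecosystem/name"` key layout, and the gzipped-JSON encoding are not taken from this PR. The sketch defers opening and decoding the index until the first lookup by using `sync.Once`, so start-up pays nothing when no lookup happens. It still decodes the whole index into memory on first use, so it does not address peak memory.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
	"sync"
)

// lazyIndex defers decompressing and decoding the index until the first lookup.
type lazyIndex struct {
	open func() (io.ReadCloser, error) // hypothetical source, e.g. a wrapper around os.Open
	once sync.Once
	data map[string][]string // "ecosystem/name" -> malicious versions (assumed layout)
	err  error
}

// load runs at most once; it records any failure so later lookups can report it.
func (l *lazyIndex) load() {
	rc, err := l.open()
	if err != nil {
		l.err = err
		return
	}
	defer rc.Close()
	zr, err := gzip.NewReader(rc)
	if err != nil {
		l.err = err
		return
	}
	defer zr.Close()
	l.err = json.NewDecoder(zr).Decode(&l.data)
}

// Versions returns the malicious versions recorded for a package, loading the index on first use.
func (l *lazyIndex) Versions(ecosystem, name string) ([]string, error) {
	l.once.Do(l.load)
	if l.err != nil {
		return nil, l.err
	}
	return l.data[ecosystem+"/"+name], nil
}

// gzipJSON builds a small gzipped JSON payload for the demo.
func gzipJSON(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(v); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// newInMemoryIndex serves the payload from memory and counts how often it is opened.
func newInMemoryIndex(payload []byte, opens *int) *lazyIndex {
	return &lazyIndex{open: func() (io.ReadCloser, error) {
		*opens++
		return io.NopCloser(bytes.NewReader(payload)), nil
	}}
}

func main() {
	payload, err := gzipJSON(map[string][]string{"npm/example-malicious-pkg": {"1.0.1"}})
	if err != nil {
		fmt.Println("build payload:", err)
		return
	}
	opens := 0
	idx := newInMemoryIndex(payload, &opens)
	fmt.Println("opens before lookup:", opens)
	versions, err := idx.Versions("npm", "example-malicious-pkg")
	fmt.Println(versions, err, "opens after lookup:", opens)
}
```

A lookup-backed store (for example an on-disk key-value file) would be the next step if memory, rather than start-up time, turns out to be the constraint.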
This introduces a new exported constructor New(maliciousPackagesIndexPath string) that returns (*codacyTrivy, error) and may break callers that expected the previous parameterless New() returning a value. Ensure all call sites updated (CI/main binary updated in this PR) and consider a compat shim New() for backward compatibility if other consumers exist.
Medium risk | High confidence
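Since Go has no function overloading, a parameterless `New()` cannot coexist with `New(maliciousPackagesIndexPath string)` under the same name. A compat shim would therefore need a second name for one of the two. The sketch below shows one possible shape, purely as an assumption: `NewWithIndex`, the struct field, and the choice to disable malicious-package scanning in the legacy constructor are all hypothetical, and the original return type of `New()` is not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
)

// codacyTrivy is a stand-in for the tool type named in the review; the field is hypothetical.
type codacyTrivy struct {
	maliciousPackagesIndexPath string
}

// NewWithIndex mirrors the new constructor shape: it takes the index path and can fail.
func NewWithIndex(maliciousPackagesIndexPath string) (*codacyTrivy, error) {
	if maliciousPackagesIndexPath == "" {
		return nil, errors.New("malicious packages index path is empty")
	}
	return &codacyTrivy{maliciousPackagesIndexPath: maliciousPackagesIndexPath}, nil
}

// New keeps a parameterless constructor for older callers.
// Assumption: without an index path, malicious-package scanning is simply disabled.
func New() *codacyTrivy {
	return &codacyTrivy{}
}

// maliciousScanEnabled reports whether an index path was configured.
func (t *codacyTrivy) maliciousScanEnabled() bool {
	return t.maliciousPackagesIndexPath != ""
}

func main() {
	legacy := New()
	fmt.Println("legacy scan enabled:", legacy.maliciousScanEnabled())

	// The path below is a placeholder, not the location used by this PR.
	tool, err := NewWithIndex("/path/to/index.json.gz")
	if err != nil {
		fmt.Println("constructor failed:", err)
		return
	}
	fmt.Println("new scan enabled:", tool.maliciousScanEnabled())
}
```

Whether silently disabling the scanner is acceptable for legacy callers is a product decision; failing loudly may be safer if no external consumers exist.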
Good unit test coverage for scanner logic. Add an integration test (or smoke test) that exercises building and loading the real index artifact or a realistic-sized sample to validate CI performance and failure modes.
Low risk | High confidence
BTW, I think there is a risk of overlap between a vulnerable dependency and a malicious package. I've seen some malicious packages having a CVE in their metadata. In those cases, an analysis would create two issues: one for the vulnerable dependency pattern and one for the malicious package pattern. I figure this would be rare.
I'm fine with this. There are already multiple CVEs detected per line in package files.
OpenSSF publishes a regularly updated list of third-party packages that contain malware. This PR adds a new Trivy rule, "Malicious packages detection", at the highest severity level.
The OpenSSF DB is 227 MB, about one third the size of Trivy's vulnerability DB (734 MB), so I would hope it does not add a large extra burden on processing times. We would probably want to run malicious package detection both on commit and in the nightly SCA process, since packages can be retroactively designated as malicious.