Conversation
|
i think its better to do this in the docker image publishing step, similar to what Superset is doing |
|
@kevinjqliu I'm not sure I follow. How's this relate to publishing docker images? This is to identify CVEs in the Kafka Connect connector itself. thanks. |
|
Looking at how Trivy is used in Superset, its scanning the docker image only. In this PR, we're unpacking the jars to scan. I think it makes more sense to build a kafka connect image and then use Trivy to scan the image. Just my preference |
Not really clear why we would add the infra to build a docker image just to use trivy. Seems like a direct scan is more parsimonious |
rymurr
left a comment
There was a problem hiding this comment.
LGTM, I just want to see if we can avoid 2x build w/o over complicating the CI job
Just looking at general patterns from the apache repos. https://grep.app/search?f.repo.pattern=apache%2F&q=uses%3A+aquasecurity%2Ftrivy-action The only instance of From the docs, it seems like Filesystem scan and Container image scan are similar in that they both scan for Vulnerabilities, Misconfigurations, Secrets, and Licenses. I think it would be helpful here if you can run the change on your fork repo and see if the fs trivy scan catches the currently open CVE for kafka connect (#15440) |
rymurr
left a comment
There was a problem hiding this comment.
The changes themselves look good to me. I am just confused/questioning the github actions config...
My assumption is we want:
- results to be uploaded regardless of pass/fail states of steps above
- the job to fail if it detects errors
- the 2nd scan to run regardless of if the other failed
I think whats happening now is the first always looks like it fails, the second always looks like it passes and the entire job always looks like it passes.
It might be worth trying this a few times in your own fork to make sure this actaully has the edge cases we expect.
There was a problem hiding this comment.
Thanks for the PR! I think this is a great addition, however I'm concerned about supply chain risk due to the recent (second) compromise of trivy's infra.
See
- https://opensourcemalware.com/repository/https%3A%2F%2Fgithub.com%2Faquasecurity%2Ftrivy%2F
- https://labs.boostsecurity.io/articles/20-days-later-trivy-compromise-act-ii/
- https://www.wiz.io/blog/trivy-compromised-teampcp-supply-chain-attack
I see we are using aquasecurity/trivy-action@e368e328979b113139d6f9068e03accaed98a518 . Glad to see we're using the commit hash.
I believe this is still ongoing; we should wait until theres resolution to proceed with this PR.
EDIT: looks like trivy was just pulled from the list of allowed actions by asf-infra apache/infrastructure-actions#548
https://infra.apache.org/blog/trivy_security_incident.html
@kevinjqliu PR to add it: apache/infrastructure-actions#573 |
|
apache/infrastructure-actions#582 is merged ASF Infra has already allowlisted it |
Thanks @kevinjqliu. |
7f169af to
9d21bb3
Compare
|
OK, I think this PR is ready for folk to take another look please. How it works:
Here are the execution paths, expected behaviours, and test evidence:
N.B. To test "no CVEs present" condition, known CVEs were temporarily suppressed via |
Add a Trivy vulnerability scan to the Kafka Connect CI workflow that scans bundled JARs for known CVEs with available fixes. The scan runs after the existing check task on JVM 21 only (dependency CVEs are JVM-independent). It builds distZip, unpacks it, and scans using rootfs mode via lhotari/sandboxed-trivy-action (ASF-allowlisted). Behaviour: - Flag, don't block: continue-on-error keeps the job green when CVEs are found. The step shows an orange warning icon in the GitHub UI. - On push to main/release branches: SARIF results are uploaded to the GitHub Security tab for ongoing tracking. - On PRs: SARIF upload is skipped (GitHub only accepts results from default/protected branches). Findings are visible in the CI log. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9d21bb3 to
4720573
Compare
ed906dd to
0264d75
Compare
| # Behaviour: | ||
| # - Flag, don't block: the scan step uses exit-code 1 so it | ||
| # "fails" when CVEs are found, but continue-on-error keeps | ||
| # the overall job green. GitHub Actions shows the step with |
There was a problem hiding this comment.
I don't think this is true. Perhaps a hallucination.
My suggestion would be to invert the flow.
- remove
continue-on-errorandexit-codefrom the first run. - change the
ifstatement toalways() && matrix.jvm==21
This job will fail adn show 'red' in the UI but its a non-blocking check and the PR can still be merged. the always clause ensures that the results are always uploaded and printed regardless of the state of the scan.
I think this has the desired ux: the user knows they have a CVE and why but it isn't blocking for the PR (eg if the patched version hasn';t been released yet)
There was a problem hiding this comment.
Makes sense, thanks @rymurr. Sounds like a good option.
Out of interest where can I see which checks will block a merge and which won't?
Re. exit-code, don't we still want this to be 1, otherwise the step won't fail (it defaults to 0) and show the job as red?
There was a problem hiding this comment.
OK, I've made this change now, and updated the above detail & testing to reflect it.
@rymurr please could you take another look?
c7265f0 to
6d19226
Compare
6d19226 to
d1b2ca2
Compare
Summary
checkcompletes, it buildsdistZip, unpacks it, and scans the bundled jars for CRITICAL/HIGH CVEsContext
Discussion on dev@ mailing list: https://lists.apache.org/thread/kbf98950pzstzgon92st7mh9vrbv5yhb
Confluent Marketplace requires a Trivy scan before listing connectors. This has previously caught CVEs that needed patching (e.g. #14985). Running the scan in CI catches vulnerabilities during development and — critically — on RC tags before the release vote starts, when fixes can still be applied.
This is independent of #15212 (adding the KC artifact to the release process) and can land in either order.