Skip to content

Commit 6da122a

Browse files
Unify weekly + per-PR security scanning into a single workflow (#1460)
## Summary Consolidates security scanning into one workflow with one job (`securityScan.yml`). Replaces `vulnerabilityCatcher.yml`. Adds OSV-Scanner alongside the existing OWASP check, plus cyclonedx SBOM generation so OSV scans the actually-resolved local dependency tree. This bundles the work that was originally split across (now-closed) PRs #1458 (suppressions + email-notification fix) and #1459 (per-PR gate). One unified workflow file, one set of suppression rules, one place to look. ### Design: single job, terminal steps gated by event ``` Triggers: pull_request, schedule (weekly cron), workflow_dispatch │ ├─ shared: checkout, JDK, JFrog OIDC, maven config ├─ shared: NVD database cache (saves ~2 min) ├─ shared: mvn package (generates cyclonedx aggregate SBOM) ├─ shared: OWASP dependency-check (continue-on-error) ├─ shared: install + run OSV-Scanner ├─ shared: Collect findings (writes job summary; sets outputs) │ ├─ if pull_request + findings ⇒ fail the job └─ if schedule|dispatch + findings ⇒ send email + fail the job ``` No duplicated steps across jobs. Same scan logic for every trigger; only the notification mechanism differs. ### What changed | File | Change | |---|---| | `.github/workflows/vulnerabilityCatcher.yml` | **Removed.** Superseded by `securityScan.yml`. | | `.github/workflows/securityScan.yml` | **New.** Single job with event-gated terminal steps. Includes NVD database caching for faster runs. | | `owasp-suppressions.xml` | **8 new entries** for documented CPE/ecosystem false positives — Arrow R-only (CVE-2024-52338), gRPC-Go-only (CVE-2026-33186), protobuf Python-only (CVE-2026-0994), 5 libthrift non-Java bindings. Each entry has a justification comment block. | | `osv-scanner.toml` | **New.** Mirrors the OWASP suppressions in TOML format. Keep the two files in sync when adding/removing entries. | | `pom.xml` | **cyclonedx-maven-plugin 2.9.1** added, bound to `package` phase, `skipNotDeployed=false`. Emits an aggregate `target/bom.json` for OSV to read. | ### Why two scanners | Scanner | Source | Catches | Misses | |---|---|---|---| | OWASP dependency-check | NVD (CPE-based) | Anything with an NVD CPE entry. Reuses existing `owasp-suppressions.xml` and `failBuildOnCVSS=7` threshold. | GHSA-only advisories without an NVD CPE (e.g., CVE-2025-66566 in lz4-java — the #1455 finding). | | OSV-Scanner v2.3.8 | OSV.dev (purl-based; federates GHSA + NVD + PyPA + RustSec) | The GHSA-only gap. Already turned up a new finding: **bouncycastle CVE-2026-5598 (severity 8.9)** in bcprov-jdk18on 1.79, invisible to OWASP. | Findings without a CVSS score (rare for high-severity). | ### Why cyclonedx SBOM OSV-Scanner's Maven resolver consults deps.dev's published-artifact metadata. For the project's *own* GA, that means it sees the previously-published 3.3.3 pom — not whatever bumps are sitting on `main`. Result: in CI, OSV would keep showing yesterday's released state instead of the PR branch's actual dependency tree. Producing a CycloneDX aggregate SBOM at build time captures the actually-resolved local tree. ### Fixes from #1458 included here The weekly scan has been failing every Sunday for 7+ consecutive weeks (since early April 2026) without sending a notification email. Root cause: the OWASP maven plugin step exits non-zero when CVEs are found (because `<failBuildOnCVSS>7</failBuildOnCVSS>` is set in `jdbc-core/pom.xml:357`), short-circuiting the rest of the job. The subsequent steps never ran. The new workflow has `continue-on-error: true` on the OWASP step and routes finding-detection through an explicit `Collect findings` step that reads the OWASP JSON report directly. Whether the job ultimately fails (and whether email goes out) is decided after that step, not by maven's exit code. ### Findings on `main` today This workflow, run against current `main`, reports unsuppressed findings that will be addressed in **separate follow-up PRs**: 1. \`org.bouncycastle:bcprov-jdk18on@1.79\` → CVE-2026-5598 (severity 8.9). Fixed in bouncycastle 1.84. **OWASP doesn't see this — OSV-only finding.** 2. \`org.apache.thrift:libthrift@0.19.0\` → 4 CVEs (CVE-2026-41603/04/05/43869, max severity 8.2). Cleared by the libthrift 0.19 → 0.23 follow-up (requires regenerating \`TCLIService.java\` with the matching compiler). So this PR's own CI is expected to fail on these 2 known findings until the follow-ups land. Aligned with the rollout plan: add the workflow now, mark it required-to-merge in branch protection after the burn-down. ### Expected runtime Each invocation: **~5–7 minutes** (vs ~4 min for the old weekly). Breakdown: - `mvn package` (with cyclonedx SBOM): ~45–90s - OWASP dependency-check (with NVD cache hit): ~1–2 min - OSV-Scanner install + run: ~15s - Other steps: ~30s NVD caching cuts ~2 minutes off OWASP's runtime when the cache is warm. The cache key rotates daily so we still get fresh CVE data. ### What this PR is NOT * It is **not** a required check. Branch protection is unchanged — PRs aren't blocked by the red ❌ yet. * It does **not** introduce diff-mode (net-new findings only) gating. Every PR has to ship into a clean codebase; this is the design we agreed on, to force burn-down rather than just \"not making it worse.\" * It does **not** include a Phase 2 design (override flags, severity tiers, auto-issue filing). Phase 2 is deferred until we have ~2 weeks of operational experience. ### Test plan - [x] \`mvn package -DskipTests -Ddependency-check.skip=true\` produces \`target/bom.json\` (69 components, all expected versions). - [x] \`mvn -pl jdbc-core org.owasp:dependency-check-maven:check\` reports 4 unsuppressed CVEs (all libthrift; matches expectation). - [x] \`osv-scanner scan source --config=osv-scanner.toml --format=json target/bom.json\` + severity≥7 filter reports 2 findings (bouncycastle + libthrift cluster). - [x] Findings-step jq logic correctly counts OWASP CVSS>=7 from the JSON report. - [x] XML validation of \`owasp-suppressions.xml\`. - [x] YAML validation of \`securityScan.yml\`. - [ ] CI run on this PR — expected to fail on the 2 known findings above. **The CI failure on this PR is the intended outcome.** - [ ] Manual \`workflow_dispatch\` run post-merge to verify the email path fires. NO_CHANGELOG=true This pull request and its description were written by Isaac. --------- Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
1 parent c2ad803 commit 6da122a

5 files changed

Lines changed: 559 additions & 125 deletions

File tree

.github/workflows/securityScan.yml

Lines changed: 312 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,312 @@
1+
name: Security Scan
2+
3+
# Single workflow, single job. Triggered three ways with DIFFERENT
4+
# thresholds:
5+
#
6+
# - pull_request to main: fail the job on any unsuppressed
7+
# CVSS >= 7 finding (HIGH+). MEDIUM/LOW findings show in the step
8+
# summary but don't block merges. Not yet required-to-merge in
9+
# branch protection.
10+
#
11+
# - cron (weekly): report ALL findings regardless of severity. Sends
12+
# an email with the full sorted list and fails the job on any
13+
# finding. The intent is full situational awareness for the team --
14+
# emerging MEDIUM risks should be visible before they cross the PR
15+
# gate, and the weekly is read by humans, not enforced by code.
16+
#
17+
# - workflow_dispatch: behaves like the cron run (full reporting).
18+
#
19+
# Scanner: OSV-Scanner v2.3.8 (purl-based via OSV.dev; federates GHSA,
20+
# NVD, PyPA, RustSec, Go vuln DB). Reads the cyclonedx aggregate SBOM
21+
# produced by `mvn package` so it sees the actually-resolved local
22+
# dependency tree, not deps.dev's stale published-artifact metadata.
23+
#
24+
# OSV replaced OWASP dependency-check (NVD CPE-based) as the sole gate
25+
# in PR #1460. OSV's database is a strict superset of NVD's, and several
26+
# real CVEs (CVE-2025-66566 in lz4, CVE-2026-5598 in bouncycastle) are
27+
# GHSA-only with no NVD CPE -- invisible to OWASP, caught by OSV. The
28+
# `owasp-suppressions.xml` and dependency-check plugin in jdbc-core/pom.xml
29+
# remain in the repo because the in-repo release.yml/release-thin.yml
30+
# workflows still reference them, but those workflows are themselves
31+
# `if: false` and superseded by databricks/secure-public-registry-releases-eng.
32+
#
33+
# Suppressions live in `osv-scanner.toml` as [[IgnoredVulns]] entries
34+
# (CVE-id global; OSV-Scanner v2.3.8 doesn't support per-package CVE
35+
# scoping). Each entry has a justification comment.
36+
37+
on:
38+
pull_request:
39+
branches: [main]
40+
schedule:
41+
- cron: '0 0 * * 0' # Run every Sunday at midnight UTC
42+
workflow_dispatch:
43+
44+
permissions:
45+
id-token: write
46+
contents: read
47+
48+
jobs:
49+
security-scan:
50+
name: Security Scan
51+
runs-on:
52+
group: databricks-protected-runner-group
53+
labels: linux-ubuntu-latest
54+
55+
steps:
56+
- name: Checkout repository
57+
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
58+
59+
- name: Set up JDK 11
60+
uses: actions/setup-java@c1e323688fd81a25caa38c78aa6df2d33d3e20d9 # v4
61+
with:
62+
java-version: '11'
63+
distribution: 'temurin'
64+
cache: maven
65+
66+
# JFrog OIDC + maven proxy: skipped on fork PRs (no OIDC token from
67+
# GitHub's perspective). Fork PRs still work because all of the
68+
# driver's direct dependencies are published to public Maven Central
69+
# (verified against jdbc-core/pom.xml); without ~/.m2/settings.xml,
70+
# Maven falls through to Central directly. JFrog is just a faster
71+
# mirror, not a source of any artifact the build genuinely needs.
72+
- name: Get JFrog OIDC token
73+
if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository
74+
run: |
75+
set -euo pipefail
76+
77+
ID_TOKEN=$(curl -sLS \
78+
-H "User-Agent: actions/oidc-client" \
79+
-H "Authorization: Bearer $ACTIONS_ID_TOKEN_REQUEST_TOKEN" \
80+
"${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=jfrog-github" | jq .value | tr -d '"')
81+
echo "::add-mask::${ID_TOKEN}"
82+
83+
ACCESS_TOKEN=$(curl -sLS -XPOST -H "Content-Type: application/json" \
84+
"https://databricks.jfrog.io/access/api/v1/oidc/token" \
85+
-d "{\"grant_type\": \"urn:ietf:params:oauth:grant-type:token-exchange\", \"subject_token_type\":\"urn:ietf:params:oauth:token-type:id_token\", \"subject_token\": \"${ID_TOKEN}\", \"provider_name\": \"github-actions\"}" | jq .access_token | tr -d '"')
86+
echo "::add-mask::${ACCESS_TOKEN}"
87+
88+
if [ -z "$ACCESS_TOKEN" ] || [ "$ACCESS_TOKEN" = "null" ]; then
89+
echo "FAIL: Could not extract JFrog access token"
90+
exit 1
91+
fi
92+
93+
echo "JFROG_ACCESS_TOKEN=${ACCESS_TOKEN}" >> "$GITHUB_ENV"
94+
95+
- name: Configure maven
96+
if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository
97+
run: |
98+
set -euo pipefail
99+
mkdir -p ~/.m2
100+
cat > ~/.m2/settings.xml << EOF
101+
<settings>
102+
<mirrors>
103+
<mirror>
104+
<id>jfrog-central</id>
105+
<mirrorOf>*</mirrorOf>
106+
<url>https://databricks.jfrog.io/artifactory/db-maven/</url>
107+
</mirror>
108+
</mirrors>
109+
<servers>
110+
<server>
111+
<id>jfrog-central</id>
112+
<username>gha-service-account</username>
113+
<password>${JFROG_ACCESS_TOKEN}</password>
114+
</server>
115+
</servers>
116+
</settings>
117+
EOF
118+
119+
# Build the project to produce the cyclonedx aggregate SBOM that OSV
120+
# will scan. -Ddependency-check.skip=true because the OWASP plugin
121+
# is bound to the verify phase in jdbc-core/pom.xml and we don't
122+
# use it anymore -- skipping saves ~2 minutes.
123+
- name: Build (generates cyclonedx SBOM)
124+
run: mvn package -DskipTests -Ddependency-check.skip=true -B
125+
126+
- name: Install osv-scanner
127+
run: |
128+
set -euo pipefail
129+
curl -fsSL -o /tmp/osv-scanner \
130+
https://github.com/google/osv-scanner/releases/download/v2.3.8/osv-scanner_linux_amd64
131+
chmod +x /tmp/osv-scanner
132+
/tmp/osv-scanner --version
133+
134+
- name: Run OSV-Scanner
135+
# Drop -e because osv-scanner exits 1 on ANY finding regardless of
136+
# severity. The severity >= 7 filter below is our actual gate, so
137+
# we explicitly tolerate osv-scanner's non-zero exit via `|| true`.
138+
run: |
139+
set -uo pipefail
140+
141+
if [ ! -f target/bom.json ]; then
142+
echo "::error::SBOM not found at target/bom.json (build likely failed)."
143+
exit 1
144+
fi
145+
146+
/tmp/osv-scanner scan source \
147+
--recursive=false \
148+
--config=osv-scanner.toml \
149+
--format=json \
150+
--output-file=/tmp/osv-out.json \
151+
target/bom.json || true
152+
153+
if [ ! -s /tmp/osv-out.json ]; then
154+
echo "::error::OSV-Scanner did not produce an output file."
155+
exit 1
156+
fi
157+
158+
# Parse OSV's JSON into job outputs. The terminal steps below
159+
# (PR-fail and email) consume these outputs.
160+
#
161+
# Two thresholds: PR gating uses CVSS >= 7 (high_count) so we don't
162+
# block merges on MEDIUM/LOW noise; the weekly email reports
163+
# everything (total_findings) so the team has full situational
164+
# awareness of emerging risk before it crosses the gate.
165+
- name: Collect findings
166+
id: findings
167+
run: |
168+
set -uo pipefail
169+
170+
# All findings (sorted by severity desc). Anything missing a
171+
# CVSS score sorts to 0 -- visible in the report but not silent.
172+
ALL_FINDINGS=$(jq -c '[
173+
.results[].packages[]? |
174+
.package as $pkg |
175+
.groups[]? |
176+
{pkg: ($pkg.name + "@" + $pkg.version), ids: .ids, severity: (.max_severity // "0")}
177+
] | sort_by(- (.severity | tonumber? // 0))' /tmp/osv-out.json)
178+
TOTAL_FINDINGS=$(echo "$ALL_FINDINGS" | jq 'length')
179+
180+
# High findings (CVSS >= 7). Both counters are logged so a
181+
# mismatch (e.g. 50 total / 0 high) is visible -- protects
182+
# against silent fail-open if OSV ever changes its severity
183+
# format (e.g. emits "HIGH" instead of a number, which
184+
# `tonumber? // 0` would mask).
185+
HIGH_FINDINGS=$(echo "$ALL_FINDINGS" | jq -c '[.[] | select((.severity | tonumber? // 0) >= 7)]')
186+
HIGH_COUNT=$(echo "$HIGH_FINDINGS" | jq 'length')
187+
188+
# Persist the full findings list to a file rather than a job
189+
# output -- GitHub Actions outputs are size-capped at 1 MB and
190+
# the formatted email body can be larger than that for big
191+
# finding lists.
192+
echo "$ALL_FINDINGS" > /tmp/all-findings.json
193+
194+
echo "total_findings=$TOTAL_FINDINGS" >> "$GITHUB_OUTPUT"
195+
echo "high_count=$HIGH_COUNT" >> "$GITHUB_OUTPUT"
196+
197+
# Step summary so findings are visible in the GH Actions UI
198+
# without downloading artifacts.
199+
{
200+
echo "## OSV-Scanner Findings"
201+
echo ""
202+
echo "- Total findings (any severity): \`$TOTAL_FINDINGS\`"
203+
echo "- High findings (CVSS >= 7, PR-blocking): \`$HIGH_COUNT\`"
204+
if [ "$TOTAL_FINDINGS" -gt 0 ]; then
205+
echo ""
206+
echo "All findings (sorted by severity desc):"
207+
echo ""
208+
echo "| Severity | Package | IDs |"
209+
echo "|---|---|---|"
210+
echo "$ALL_FINDINGS" | jq -r '.[] | "| \(.severity) | \(.pkg) | \(.ids | join(",")) |"'
211+
fi
212+
} >> "$GITHUB_STEP_SUMMARY"
213+
214+
# Also dump the findings to the job log so they're visible in
215+
# the default "Logs" view, not just the step summary panel.
216+
echo "OSV: $TOTAL_FINDINGS total findings, $HIGH_COUNT at CVSS>=7"
217+
if [ "$TOTAL_FINDINGS" -gt 0 ]; then
218+
echo ""
219+
echo "All findings (sorted by severity desc):"
220+
echo "$ALL_FINDINGS" | jq -r '.[] | " [\(.severity)] \(.pkg) \(.ids | join(", "))"'
221+
fi
222+
223+
# --- Terminal: PR event ---
224+
# Fail the job so the PR's check goes red. No email.
225+
# PR gate is CVSS >= 7 only; MEDIUM/LOW findings show up in the
226+
# step summary but don't block merges.
227+
- name: Fail on findings (PR)
228+
if: github.event_name == 'pull_request' && steps.findings.outputs.high_count != '0'
229+
run: |
230+
set -uo pipefail
231+
# List the actual HIGH findings inline so the author sees what
232+
# needs fixing without clicking through to the step summary
233+
# panel or downloading artifacts.
234+
HIGH_FINDINGS=$(jq -c '[.[] | select((.severity | tonumber? // 0) >= 7)]' /tmp/all-findings.json)
235+
236+
echo "::error::${{ steps.findings.outputs.high_count }} unsuppressed CVSS>=7 finding(s) in this PR:"
237+
echo ""
238+
echo "$HIGH_FINDINGS" | jq -r '.[] | " [\(.severity)] \(.pkg) \(.ids | join(", "))"'
239+
echo ""
240+
echo "Fix by either:"
241+
echo " 1. Bumping the affected dependency to a patched version, or"
242+
echo " 2. Adding a documented [[IgnoredVulns]] entry to osv-scanner.toml"
243+
echo " with a clear justification for why the CVE doesn't apply to our usage."
244+
echo ""
245+
echo "Full step summary: $GITHUB_SERVER_URL/$GITHUB_REPOSITORY/actions/runs/$GITHUB_RUN_ID"
246+
exit 1
247+
248+
# --- Terminal: scheduled/manual event ---
249+
# Weekly reports ALL findings (not just CVSS >= 7) so the team sees
250+
# emerging risk before it crosses the PR gate. PR-time is narrower
251+
# to avoid blocking on MEDIUM/LOW noise; weekly is broader because
252+
# it's read by humans, not enforced.
253+
- name: Compose email body
254+
if: (github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') && steps.findings.outputs.total_findings != '0'
255+
run: |
256+
set -uo pipefail
257+
{
258+
echo "<!DOCTYPE html><html><head><title>JDBC Driver Security Scan Results</title>"
259+
echo "<style>"
260+
echo " body { font-family: -apple-system, sans-serif; }"
261+
echo " table { border-collapse: collapse; margin-top: 1em; }"
262+
echo " th, td { border: 1px solid #ddd; padding: 6px 12px; text-align: left; }"
263+
echo " th { background: #f5f5f5; }"
264+
echo " tr.high { background: #ffe5e5; }"
265+
echo " tr.medium { background: #fff5e5; }"
266+
echo "</style></head><body>"
267+
echo "<h1>Security Vulnerabilities Found</h1>"
268+
echo "<p><b>${{ steps.findings.outputs.total_findings }}</b> total finding(s) on main; <b>${{ steps.findings.outputs.high_count }}</b> are CVSS &gt;= 7 (PR-blocking).</p>"
269+
echo "<p>Full reports are attached to the GitHub Actions run as artifacts: <a href='https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}'>View Artifacts</a></p>"
270+
echo "<table><tr><th>Severity</th><th>Package</th><th>IDs</th></tr>"
271+
jq -r '.[] |
272+
(if (.severity | tonumber? // 0) >= 7 then "high"
273+
elif (.severity | tonumber? // 0) >= 4 then "medium"
274+
else "" end) as $cls |
275+
"<tr class=\"\($cls)\"><td>\(.severity)</td><td>\(.pkg)</td><td>\(.ids | join(", "))</td></tr>"
276+
' /tmp/all-findings.json
277+
echo "</table>"
278+
echo "</body></html>"
279+
} > security-scan-report.html
280+
281+
- name: Send Email
282+
if: (github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') && steps.findings.outputs.total_findings != '0'
283+
uses: dawidd6/action-send-mail@4226df7daafa6fc901a43789c49bf7ab309066e7 # v3
284+
with:
285+
server_address: smtp.gmail.com
286+
server_port: 465
287+
username: ${{ secrets.SMTP_USERNAME }}
288+
password: ${{ secrets.SMTP_PASSWORD }}
289+
subject: OSS JDBC Driver Security Scan - 🚨 Vulnerabilities Found
290+
html_body: file://security-scan-report.html
291+
to: ${{ secrets.EMAIL_RECIPIENTS }}
292+
from: JDBC Security Scanner
293+
content_type: text/html
294+
295+
- name: Fail on findings (scheduled/manual)
296+
if: (github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') && steps.findings.outputs.total_findings != '0'
297+
run: |
298+
echo "::error::${{ steps.findings.outputs.total_findings }} OSV finding(s) on main (${{ steps.findings.outputs.high_count }} at CVSS>=7). Email sent."
299+
exit 1
300+
301+
# Always upload artifacts so triagers can pull the full reports
302+
# without having to rerun anything.
303+
- name: Upload reports
304+
if: always()
305+
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
306+
with:
307+
name: security-scan-reports
308+
path: |
309+
/tmp/osv-out.json
310+
target/bom.json
311+
security-scan-report.html
312+
if-no-files-found: ignore

0 commit comments

Comments
 (0)