Commit dbcf711

CI repair: pip/JFrog routing, PGP keysmap drift, POM transit, Python lockfile, test progress (#26)
* fix(ci): pre-bootstrap JFrog pip routing before setup-python

The hardened runner groups (larger-runners, databrickslabs-protected-runner-group) block egress to pypi.org by design; all package fetches must go through the JFrog db-pypi mirror per go/hardened-gha policy. But actions/setup-python@v6.2.0 unconditionally runs `pip install --upgrade pip` inside its python-versions installer BEFORE our jfrog-auth step has a chance to redirect pip. That first pip call falls outside the egress allowlist and dies with SSL UNEXPECTED_EOF against pypi.org, failing the entire Scala build at the "Configure python interpreter" step.

Fix: add a new jfrog-pip-bootstrap composite that runs BEFORE setup-python. It mints a JFrog OIDC access token (reusing the existing jfrog-auth shell script, kept in sync with UCX), writes netrc + pip.conf, and exports NETRC + PIP_CONFIG_FILE to $GITHUB_ENV. By the time setup-python downloads Python and fires its internal pip upgrade, pip already routes through db-pypi and succeeds.

The main jfrog-auth composite still runs after setup-python. It is now solely responsible for Maven config (which needs setup-java to have written ~/.m2/settings.xml first) and an idempotent pip refresh with a fresh token.

Wired into both scala_build and python_build. No workflow changes: all callers (build_main, build_scala, build_python, build_scala_by_package, codecov-scala-parallel, codecov-upload) already declare id-token: write.

Co-authored-by: Isaac

* fix(security): drop unknown failNoKey param on pgpverify 1.19.1

pgpverify-maven-plugin renamed-then-removed the `failNoKey` parameter sometime before 1.13.0; on 1.19.1 every build that runs the `check` mojo emits:

    Warning: Parameter 'failNoKey' is unknown for plugin 'pgpverify-maven-plugin:1.19.1:check (default)'

The lockdown intent ("fail when a dependency's signing key is not in the keysmap") is the DEFAULT behavior in 1.19.1: it's enforced by the keysmap file itself (no `noKey` tokens in .maven-keys.list), which .maven-keys.list already documents as policy. Removing the no-op parameter does not weaken the security posture.

Changes:

  * pom.xml: drop <failNoKey>true</failNoKey> from the verify-pgp profile's plugin config; replace with a comment so this isn't re-added on the next lockdown pass.
  * scripts/security/maven-pgp-bootstrap: drop the matching -DfailNoKey=false override (same unknown-parameter no-op).

Co-authored-by: Isaac

* fix(security): drop deprecated failNoSignature param on pgpverify 1.19.1

pgpverify-maven-plugin 1.13.0 deprecated `failNoSignature` in favor of expressing the requirement through the keysmap file. On 1.19.1 the `check` mojo emits:

    Warning: Parameter 'failNoSignature' (user property 'pgpverify.failNoSignature') is deprecated: Deprecated as of 1.13.0: this requirement can be expressed through the keysMap.

.maven-keys.list already documents "DO NOT USE noSig" as policy and contains no `noSig` tokens, so the strict-by-default behavior we want ("fail on unsigned artifact") is enforced through the keysmap. The mojo parameter is redundant; remove it.

The verify-pgp profile now relies on:

  * failWeakSignature=true (still a valid mojo parameter)
  * keysmap entries without `noSig` (enforces "must be signed")
  * keysmap entries without `noKey` (enforces "key must be mapped")

Updated the inline comment to record both the failNoSignature deprecation and the prior failNoKey removal so the next lockdown pass doesn't re-add either parameter.

Bootstrap script (scripts/security/maven-pgp-bootstrap) intentionally keeps -DfailNoSignature=false / -DfailWeakSignature=false: those are local-only relaxations needed for key discovery; the deprecation warning surfaces during local bootstrap, not in CI.

Co-authored-by: Isaac

* fix(security): pin Maven lifecycle plugin versions (Category A drift)

The hardened verify-pgp profile fails with "Not allowed artifact ... and keyID" for five lifecycle plugins that pom.xml never pinned:

    org.apache.maven.plugins:maven-install-plugin:3.1.4
    org.apache.maven.plugins:maven-deploy-plugin:3.1.4
    org.apache.maven.plugins:maven-compiler-plugin:3.15.0
    org.codehaus.plexus:plexus-compiler-{api,javac,manager}:2.16.2
    org.apache.maven.shared:file-management:3.2.0

Maven's Super POM auto-resolves these when the project doesn't pin them. Resolution drifts across Maven minor versions and across CI runner image upgrades, swapping in new (legit but un-mapped) signing keys. The PGP verify step then rejects them, not because they're untrusted, but because .maven-keys.list was bootstrapped against an earlier closure.

Pin maven-compiler-plugin, maven-install-plugin, and maven-deploy-plugin in the root <build><plugins> block. plexus-compiler-* and file-management come in transitively from the pinned plugins, so they freeze with them.

This commit does NOT update .maven-keys.list: that requires running scripts/security/maven-pgp-bootstrap in the geobrix-dev container after this pin lands, cross-checking each new fingerprint against the trust anchors listed at the top of .maven-keys.list (apache KEYS, mojohaus, etc.), and committing the reviewed entries as a follow-up.

Category B (POM-only "PGP Signature INVALID") is investigated separately via the diag-pgpverify-pom-transit workflow added in the next commit.

Co-authored-by: Isaac

* chore(diag): add POM transit diagnostic for Category B PGP failures

Category B from the post-lockdown verify-pgp build: "PGP Signature INVALID" on .pom files only (never .jar). Hypothesis: the JFrog db-maven mirror is mutating POM bytes (line endings, XML normalization, or appended _remote.repositories metadata) between Maven Central and what CI sees, breaking the upstream signature.

This adds:

  * scripts/security/diag-pgpverify-pom-transit: fetches each of the six failing .pom + .pom.asc files from db-maven via NETRC auth, computes sha256, compares against a known-good Maven Central reference (captured 2026-05-18 via maven-proxy.dev.databricks.com and recorded inline), prints a hex dump of the first/last bytes on mismatch, and runs gpg --verify standalone if gpg is on PATH.
  * .github/workflows/diag-pgpverify-pom-transit.yml: workflow_dispatch one-shot that authenticates to JFrog (OIDC), installs gpg, and runs the script on the protected runner group. No untrusted github.event.* values reach any run block, only static script invocations.

Trigger via Actions → "Diag — POM transit (db-maven vs Maven Central)" → Run workflow. The exit code surfaces red/green; the logs carry the sha256 diff and gpg output.

Delete this workflow + script once the investigation concludes.
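
The digest comparison described for the diagnostic can be illustrated offline. The following is a minimal Python sketch, not the diagnostic script itself: the function names `sha256_hex` and `transit_report` and the sample POM bytes are hypothetical, and nothing is fetched from db-maven. It only shows why a line-ending rewrite, one of the hypothesised mutations, is enough to change the sha256 of a file and therefore to invalidate a detached signature computed over the original bytes.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex sha256 digest of raw artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def transit_report(name: str, reference: bytes, mirrored: bytes) -> str:
    """Compare mirror bytes against a reference copy and describe the result."""
    if sha256_hex(reference) == sha256_hex(mirrored):
        return f"{name}: MATCH"
    delta = len(mirrored) - len(reference)
    # Show the first bytes of each side so line-ending changes are visible.
    return (
        f"{name}: MISMATCH (size delta {delta:+d} bytes)\n"
        f"  reference head: {reference[:24].hex(' ')}\n"
        f"  mirrored head:  {mirrored[:24].hex(' ')}"
    )


# Hypothetical POM content; real artifacts are far larger.
reference_pom = b"<project>\n  <modelVersion>4.0.0</modelVersion>\n</project>\n"
# Simulate one suspected mutation: LF rewritten to CRLF in transit.
mirrored_pom = reference_pom.replace(b"\n", b"\r\n")

print(transit_report("example.pom", reference_pom, mirrored_pom))
print(transit_report("example.pom.asc", b"signature-bytes", b"signature-bytes"))
```

A matching `.pom.asc` alongside a mismatching `.pom` would point at the POM bytes, rather than the signature file, as the part altered in transit.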

Co-authored-by: Isaac

* fix(security): add keysmap entries for pinned lifecycle plugins (Category A)

The version pins added in 5c1daf4 freeze the plugin versions but leave their actual signing keys unmapped in .maven-keys.list, so the verify-pgp profile still rejects them with "Not allowed artifact ... and keyID". Add version-specific entries for each.

Both signing keys verified against keyserver.ubuntu.com on 2026-05-18:

    0x84789D24DF77A32433CE1F079EB80E92EB2135B1
      Slawomir Jaranowski <sjaranowski@apache.org>
      Apache Maven committer + lead developer of pgpverify-maven-plugin itself.
      RSA 4096, created 2021-12-22, active.
      Now signs newer maven-install-plugin, maven-deploy-plugin, file-management releases.

    0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
      Sylwester Lachiewicz <slachiewicz@apache.org>
      Apache committer; Mojo Codehaus / Maven plugins maintainer.
      RSA 4096, created 2020-05-09, active.
      Signs newer maven-compiler-plugin and plexus-compiler-{api,javac,manager} releases.

Neither key appears in downloads.apache.org/maven/KEYS, which only lists release-signing keys for current/past PMC members; these two are personal committer keys (self-signed UIDs at @apache.org), legitimate for per-artifact signing per Apache release policy. Trust anchor URLs and verification details recorded inline in .maven-keys.list above the new block so a future audit can re-verify.

Entries are added at the most-specific level (g:a:p:v) so older versions that may exist in other build contexts remain bound to their original keysmap entry; versions only get the new keys when they're the exact ones pom.xml pins.

This closes Category A. Category B (POM-only "PGP Signature INVALID" on jackson 2.18.3, scala-maven-plugin 4.9.9, snappy 0.4, javax.servlet-api 3.1.0) is investigated separately via the diag-pgpverify-pom-transit workflow.

Co-authored-by: Isaac

* chore(diag): also fire POM transit diagnostic on push to ci-fix-jfrog

workflow_dispatch requires the workflow to live on the default branch before it shows up in the UI / API. While the diagnostic only lives on this branch, also fire it on push that touches its own files so we can actually trigger a run. Drop the push trigger when the diagnostic is deleted.

Co-authored-by: Isaac

* fix(security): badSig overrides for 6 POMs byte-mutated by db-maven (Cat B)

Final root cause from the diag-pgpverify-pom-transit workflow (run 26056343517 on this branch): the JFrog db-maven mirror applies a text-artifact transformation to .pom files in transit. Per-file sha256 comparison vs Maven Central:

    .pom files: ALL 6 MISMATCH (size delta -348 to +170 bytes)
    .pom.asc:   ALL 6 MATCH (signatures untouched)

The hex dump shows LF -> CRLF at every line end plus additional content normalization (whitespace / XML formatting). Standalone `gpg --verify` on the db-maven bytes fails because the .asc was computed against the original LF Maven Central bytes, not the JFrog-mutated ones. The corresponding .jar files are binary and pass through JFrog untouched, so JAR signature verification is unaffected.

Fix: add badSig overrides for the six affected .pom artifacts in a dedicated, self-documenting section at the end of .maven-keys.list, moving the four pre-existing noSig entries (jackson + scala-maven) out of the "legacy unsigned" section since they aren't legacy-unsigned, and adding two new entries for snappy / javax.servlet-api which had no prior override and were failing INVALID against the general entry.

`badSig` is narrower than `noSig`: it only tolerates a crypto failure on a signature that DOES exist on the server. JAR verification is unchanged (the keysmap's general fingerprint entries still match the real signing keys, so JARs continue to verify strictly).

Residual security risk documented inline: db-maven is now part of the POM-content trust boundary.
An attacker who compromises the mirror could rewrite POMs to regress to a previously-trusted-but-vulnerable signed JAR, but only among artifacts already in the mirror's allowlist; the JAR signatures themselves still gate executable code.

Action item (in the comment): file a JFrog admin ticket asking db-maven to disable text-artifact transformations so .pom bytes pass through verbatim, then delete this section.

Co-authored-by: Isaac

* fix(ci): bump setuptools to 80.9.0 for PEP 639 SPDX license parsing

GDAL 3.11.4's sdist pyproject.toml uses the PEP 639 SPDX form for license metadata:

    [project]
    license = "MIT"

setuptools 74.0.0 (previously pinned) predates PEP 639 support and rejects the string form, demanding the older table form ({file=...} or {text=...}). The Scala build step's GDAL install (--no-build-isolation --no-binary :all: gdal[numpy]==3.11.4) fails at metadata generation:

    invalid pyproject.toml config: `project.license`.
    configuration error: `project.license` must be valid exactly by one definition (2 matches found):
    ...
    GIVEN VALUE: "MIT"
    OFFENDING RULE: 'oneOf'

PEP 639 SPDX-string parsing landed in setuptools 77.0.0. Bumping to 80.9.0 (current stable post-PEP-639) unblocks the GDAL sdist build while staying within the hash-pinned closure.

Regenerated requirements-ci.txt via:

    cd python/geobrix
    uv pip compile --generate-hashes --python-version 3.12 \
      --output-file requirements-ci.txt requirements-ci.in

setuptools is consumed via --no-build-isolation (Scala build action line 102 + Python build action line 84): pip uses the host setuptools installed by requirements-ci.txt rather than fetching a build-time isolated copy, so bumping it here is what GDAL's build sees.

Co-authored-by: Isaac

* fix(build): set project.build.sourceEncoding to silence platform-encoding warning

maven-resources-plugin 3.4.0 (and all other modern Maven plugins) look up the standard property names `project.build.sourceEncoding` and `project.reporting.outputEncoding`.
The pom.xml had a custom `<encoding>` property used by some other build pieces, but the standard names were unset, causing:

    [INFO] --- resources:3.4.0:resources (default-resources) @ geobrix ---
    Warning: Using platform encoding (UTF-8 actually) to copy filtered resources, i.e. build is platform dependent!

Set both standard properties to UTF-8. Custom `<encoding>` property is preserved for any existing references.

Co-authored-by: Isaac

* feat(test): per-suite progress reporter + per-test durations

Long mvn test runs (full Scala suite is 10-15 min locally and on CI) gave no visibility into how far along the run was, just an unbroken stream of `SuiteName:` / `- test case` lines. Add two complementary improvements:

1. Custom ScalaTest reporter at src/test/scala/com/databricks/labs/gbx/util/ProgressReporter.scala

   Fires after every SuiteCompleted event with a one-line marker:

       [progress] suite #12 done · SpatialRefOpsTest · 215 ms · tests=6 (0 failed) · totals: 312 tests, 0 failed · elapsed 3m 24s

   On RunCompleted it prints a final summary. Counters are AtomicInteger + ThreadLocal so the reporter stays correct if scalatest ever runs suites in parallel.

   Wired in pom.xml via scalatest-maven-plugin's <reporters> config.

2. Add `D` to the stdout config flags (`FS` -> `FSD`) so each test line carries its duration:

       - fromEPSGCode(4326) should return SpatialReference with EPSG 4326 (12 ms)

   Useful for spotting stuck/slow tests at a glance.

No new dependencies: ProgressReporter only uses org.scalatest.Reporter and org.scalatest.events._ (already on the test classpath from scalatest_2.13:3.2.14).

Co-authored-by: Isaac

* fix(python): switch project.license to PEP 639 SPDX form

setuptools >=77 deprecates the TOML-table form of `project.license` (`{ text = "..." }` / `{ file = "..." }`) and emits a warning on every build:

    SetuptoolsDeprecationWarning: `project.license` as a TOML table is deprecated.
    Please use a simple string containing a SPDX expression for `project.license`.
    You can also use `project.license-files`.
    ...
    By 2027-Feb-18, you need to update your project and remove deprecated calls or your builds will no longer be supported.

Since we bumped setuptools to 80.9.0 in 8672256 to unblock GDAL 3.11 sdist (whose pyproject.toml uses the new form), our own pyproject.toml now trips the deprecation in the other direction.

Use PEP 639's LicenseRef-<id> namespace for the proprietary Databricks License (no SPDX identifier exists). Canonical license text continues to live in ../../LICENSE at the repo root; the `LicenseRef-` prefix makes it valid PEP 639 syntax without claiming an SPDX-managed name. Classifiers entry (License :: Other/Proprietary License) is left in place for downstream consumers reading legacy Trove metadata.

Co-authored-by: Isaac

* feat(test): show suite #N/M progress with auto-discovered total

Per request: surface the denominator so progress reads as

    [progress] suite #45/64 done · H3Test · 9 ms · …

instead of just `#45`.

ScalaTest's events don't carry a total-suite count anywhere (RunStarting has testCount but no suiteCount; DiscoveryCompleted is empty), so the reporter computes M itself: on first use, walk `target/test-classes/` and count `*Test.class` files (geobrix's only test-class naming convention, verified with `find src/test/scala -name '*Test.scala' | wc -l`; all 64 match). The result caches via `lazy val`; discovery is a few-ms file walk that runs once per test JVM. Override path with `-DgbxTestClassesDir=…` if the discovery dir ever moves.

If discovery fails or the dir is missing (e.g., running the reporter in an unusual classpath layout), totalSuites = 0 and the formatter falls back to just `#N` rather than printing a garbage denominator.

For filtered runs (`-Dsuites=com.databricks.labs.gbx.gridx.*`), M still reports the full discoverable count, so `#3/64` reads as "3 of 64 available": a useful upper bound rather than misleading "3 of 3 selected" semantics that would require coordinating with ScalaTest's runner internals.

Co-authored-by: Isaac

* fix(python): drop License Trove classifier; PEP 639 forbids overlap

Follow-on to fde207a (switched to `license = "LicenseRef-..."`). setuptools 80.9.0 doesn't just warn about the overlap; it raises InvalidConfigError when both an SPDX license expression and a License classifier are present:

    setuptools.errors.InvalidConfigError: License classifiers have been superseded by license expressions (see https://peps.python.org/pep-0639/). Please remove:

    License :: Other/Proprietary License

Removed the classifier; the license is fully expressed via the SPDX `license` field above. The remaining classifiers (Topic, Programming Language, Operating System) are unaffected.

Co-authored-by: Isaac

* fix(test): filter ProgressReporter total to runnable Suite classes only

Reported denominator was 73 against an actual run of 63: naive filename-only counting of `*Test.class` files included abstract base classes and helper traits that compile to `*Test.class` but ScalaTest doesn't run. Match ScalaTest's own discovery filter:

  * concrete (not abstract, not interface)
  * public
  * extends `org.scalatest.Suite`
  * has a public no-arg constructor

Implemented via reflection over each candidate class file, loading with `Class.forName(name, initialize=false, classLoader)` so static initializers don't fire during counting. Any reflection failure (NoClassDefFoundError, missing transitive dep, etc.) conservatively counts the class as "not runnable"; better to undercount M than to inflate it. Discovery happens once per JVM via `lazy val`.

This drops the denominator from 73 -> 63 (matches actual full-run suite count) without changing local/CI invocation; everything stays internal to the reporter.

Co-authored-by: Isaac

* fix(test): decrement progress M when a SuiteCompleted reports 0 tests

The reflection-based discovery in da10881 still landed at M=73 against an actual run of 63: the 10 extras pass the structural Suite filter (concrete + public + extends Suite + has a no-arg ctor) but contain no `test("...")` blocks, so they run zero work. There's no static way to detect "this Suite has no tests registered" without instantiating the class (`Suite.expectedTestCount` is an instance method), and constructing every candidate at discovery time would run user code with potential side effects (Spark session bootstrap, GDAL init, etc.).

Switch M to a mutable AtomicInteger seeded from discovery. On every SuiteCompleted:

  * tests > 0: increment #N, emit the progress line as usual
  * tests == 0: decrement M, suppress the progress line entirely

M converges to the real runnable count as empty suites are observed; the user sees `#N/M` where M moves downward toward the true total. Suppressing the empty-suite progress line keeps #N aligned with what the reader perceives as "actual work happened" (no `[progress] suite #X done · EmptyScaffoldingTest · 0 ms · tests=0` lines cluttering the stream).

Also defers the discovery scan from `lazy val` initialization to the first SuiteCompleted event so the file walk happens after the test classpath is fully assembled in the forked JVM, not during reporter construction.
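
The counter behaviour described above (increment N on non-empty suites, decrement M and stay silent on empty ones, fall back to plain `#N` when no total is known) can be sketched in a few lines. This is a hypothetical Python illustration of that bookkeeping only, not the Scala `ProgressReporter`; the class name `ProgressTracker`, its method, and the shortened line format (no durations or running totals) are assumptions made for the example.

```python
import threading
from typing import Optional


class ProgressTracker:
    """Tracks completed suites (N) against a discovered total (M)."""

    def __init__(self, discovered_total: int) -> None:
        self._lock = threading.Lock()
        self.completed = 0             # N: suites that ran at least one test
        self.total = discovered_total  # M: seeded from discovery, may shrink

    def suite_completed(self, name: str, tests: int, failed: int = 0) -> Optional[str]:
        """Return the progress line for a finished suite, or None if it was empty."""
        with self._lock:
            if tests == 0:
                # Empty suite: shrink M and emit nothing.
                self.total = max(self.total - 1, 0)
                return None
            self.completed += 1
            n, m = self.completed, self.total
        # Fall back to plain "#N" when no usable total is known.
        position = f"#{n}/{m}" if m > 0 else f"#{n}"
        return f"[progress] suite {position} done · {name} · tests={tests} ({failed} failed)"


tracker = ProgressTracker(discovered_total=3)
events = [("H3Test", 6), ("EmptyScaffoldingTest", 0), ("SpatialRefOpsTest", 4)]
for suite, count in events:
    line = tracker.suite_completed(suite, count)
    if line is not None:
        print(line)
```

In this run the discovered total starts at 3, the empty suite lowers it to 2, and the last line reads `#2/2`, which is the downward convergence of M the commit describes.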

Co-authored-by: Isaac

* fix(test): honor -Dsuites filter in ProgressReporter discovery

Previous count (73) stayed put through the entire run because the 10 extras are classes ScalaTest never sees: they pass the structural Suite filter but live outside the `com.databricks.labs.gbx.*` namespace that our build passes via `-Dsuites=…`, so the runner doesn't load them and no SuiteCompleted ever fires; the decrement-on-empty fallback can't fix what never appears. Found via:

    diff <(find target/test-classes -name '*Test.class' -not -name '*$*' \
           | sed 's|target/test-classes/||;s|.class$||;s|/|.|g' | sort) \
         <(find src/test/scala -name '*Test.scala' \
           | sed 's|src/test/scala/||;s|.scala$||;s|/|.|g' | sort)

The 10 extras:

  * org.apache.spark.sql.adapters.SparkAdaptersTest (× 1)
  * docs.tests.scala.… (× 5)
  * tests.docs.scala.… (× 4)

The latter 9 come in via build-helper-maven-plugin's `add-test-source` config (docs/tests/scala/), but only the `Scala doc tests` step uses them, and that step is currently disabled (`if: false` in build_main.yml). For everything else they're dead weight at the .class level.

scalatest-maven-plugin forwards `-Dsuites=…` into the test JVM as a system property, so ProgressReporter can now read it and apply the same pattern matching:

  * exact `com.x.YTest` → equality
  * package wildcard `com.x.*` → prefix match on `com.x.`
  * comma-separated list → any-match
  * empty / unset → accept all

The runtime "decrement on 0-test SuiteCompleted" guard from be883e3 is kept as belt-and-braces for any future Suite class that registers zero tests at runtime.

Expected behavior post-change: with default `-Dsuites=com.databricks.labs.gbx.*` the denominator becomes 63 from the first progress line and stays there.
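
The four matching rules listed for the `suites` filter (exact name, package wildcard, comma-separated list, empty or unset) are simple enough to sketch. The following is a hypothetical Python rendering of those rules as stated in the message, not the reporter's Scala code; the function name `matches_suites_filter` is an assumption for the example.

```python
from typing import Optional


def matches_suites_filter(class_name: str, suites: Optional[str]) -> bool:
    """Return True if class_name is selected by a -Dsuites style filter."""
    patterns = [p.strip() for p in (suites or "").split(",") if p.strip()]
    if not patterns:
        return True  # empty / unset: accept all
    for pattern in patterns:
        if pattern.endswith(".*"):
            # Package wildcard "com.x.*": prefix match on "com.x."
            if class_name.startswith(pattern[:-1]):
                return True
        elif class_name == pattern:
            # Exact fully qualified name: equality
            return True
    return False


default_filter = "com.databricks.labs.gbx.*"
candidates = [
    "com.databricks.labs.gbx.gridx.H3Test",
    "org.apache.spark.sql.adapters.SparkAdaptersTest",
]
for name in candidates:
    print(name, matches_suites_filter(name, default_filter))
```

Under the default filter the `com.databricks.labs.gbx` class is kept and `SparkAdaptersTest` is excluded, which is the effect the commit relies on to bring the denominator down.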

Co-authored-by: Isaac

* fix(test): forward Maven `suites` property into test JVM as `gbx.suites`

The previous attempt to read `-Dsuites=…` from `sys.props("suites")` inside ProgressReporter found nothing: scalatest-maven-plugin *consumes* the Maven `suites` property and translates it into scalatest runner args (`-w`, `-m`, etc.); it does not forward it as a `-D` system property on the forked test JVM. So discovery never saw the filter and stayed at the unfiltered 73.

Fix in two parts:

1. pom.xml: define a default `<suites></suites>` property so the `${suites}` substitution always resolves, and append `-Dgbx.suites=${suites}` to scalatest-maven-plugin's <argLine>. This explicitly propagates the Maven property value to the forked JVM as a system property.
2. ProgressReporter: prefer `sys.props("gbx.suites")` (the forwarded value) and fall back to `sys.props("suites")` (only set when the user passes `-Dsuites=…` at the JVM level rather than to Maven; useful for direct invocation).

After this, with the default CI invocation `mvn -Dsuites=com.databricks.labs.gbx.*`, the JVM sees `-Dgbx.suites=com.databricks.labs.gbx.*` and the reporter's matcher correctly excludes the 10 phantom classes (SparkAdaptersTest + docs.tests.scala.*).

Co-authored-by: Isaac

* chore(docs): rebuild doc-snippet-inventory.json [skip ci]

* style(python): satisfy isort + black for PR lint gate

CI surfaced one isort failure (test_sample_bundle.py: `_bundle as _bundle_mod` had to sort before the multiline `from … import (…)` because `_` precedes letters in isort's ordering). Once that's fixed, black also wants to reformat three pre-existing files where short signatures had been unnecessarily split across multiple lines:

    src/databricks/labs/gbx/gridx/bng/functions.py (3 fn signatures)
    src/databricks/labs/gbx/rasterx/functions.py (1 fn signature)
    test/gridx/test_bng_functions.py (mechanical)

These are pure reformats with no behavior change.
Verified locally with the pinned versions from requirements-ci.in (isort==8.0.1, black==26.3.1, flake8==7.3.0). All 26 .py files under src/ and test/ now pass all three gates cleanly.

In test_sample_bundle.py, also replaced the previous pair of mis-ordered comments above the imports with a single accurate comment; isort's auto-fix had stacked both comments above the `_bundle_mod` line, leaving "Public API from package" attached to the wrong import.

Co-authored-by: Isaac

---------

Co-authored-by: Michael Johns <user.name>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2 parents 7b7677f + 89d546f commit dbcf711

17 files changed

Lines changed: 718 additions & 14527 deletions

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+name: 'Bootstrap pip JFrog routing (pre-setup-python)'
+description: |
+  Acquire a JFrog OIDC access token and pre-configure pip + netrc so that any
+  pip invocation routes through the JFrog db-pypi mirror instead of pypi.org —
+  including actions/setup-python's internal "Upgrading pip" step, which runs
+  before any other action gets a chance to redirect pip.
+
+  Why this exists: the Databricks hardened runner groups (larger-runners,
+  databrickslabs-protected-runner-group) block egress to pypi.org by design;
+  the go/hardened-gha policy is that all package fetches go through JFrog.
+  Without this pre-bootstrap, setup-python's bundled "pip install --upgrade pip"
+  hits pypi.org and dies with SSL EOF before .github/actions/jfrog-auth
+  ever gets to configure pip.
+
+  Reuses the OIDC exchange script from .github/actions/jfrog-auth/jfrog-auth
+  to avoid duplicating curl logic that we keep in sync with UCX. The main
+  jfrog-auth composite still runs later in the same job (idempotent for pip;
+  primary for Maven, which needs setup-java to have run first).
+
+  Caller job MUST declare:
+    permissions:
+      id-token: write
+runs:
+  using: "composite"
+  steps:
+    - id: jfrog-auth
+      name: Acquire JFrog OIDC access token
+      shell: bash
+      run: |
+        if [[ -z "${ACTIONS_ID_TOKEN_REQUEST_URL}" ]] || [[ -z "${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" ]]; then
+          printf '::error::%s\n' 'This action uses OIDC: job must have "id-token: write" permission'
+          exit 1
+        fi
+        "${GITHUB_WORKSPACE}/.github/actions/jfrog-auth/jfrog-auth" \
+          "${ACTIONS_ID_TOKEN_REQUEST_URL}" \
+          "${ACTIONS_ID_TOKEN_REQUEST_TOKEN}"
+
+    - name: Write pip.conf + netrc for JFrog (db-pypi)
+      shell: bash
+      env:
+        JFROG_ACCESS_TOKEN: "${{ steps.jfrog-auth.outputs.jfrog-access-token }}"
+      run: |
+        umask 077
+        cat > "${RUNNER_TEMP}/.netrc" << EOF
+        machine databricks.jfrog.io
+        login gha-service-account
+        password ${JFROG_ACCESS_TOKEN}
+        EOF
+        # Same db-pypi URL the main jfrog-auth composite uses; the later
+        # jfrog-auth run will overwrite this file with an identical value
+        # (modulo a fresh token in netrc) — idempotent by design.
+        cat > "${RUNNER_TEMP}/.pip.conf" << 'EOF'
+        [global]
+        index-url = https://databricks.jfrog.io/artifactory/api/pypi/db-pypi/simple
+        EOF
+        printf '%s=%s\n' 'NETRC' "${RUNNER_TEMP}/.netrc" >> "${GITHUB_ENV}"
+        printf '%s=%s\n' 'PIP_CONFIG_FILE' "${RUNNER_TEMP}/.pip.conf" >> "${GITHUB_ENV}"
+        printf '::debug::%s\n' 'Pre-bootstrap: configured JFrog access for pip.'
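
The two files this step writes have a simple shape, and one way to sanity-check that shape is to write an equivalent pair and parse it back with standard-library readers. The sketch below is a hypothetical local illustration in Python, not part of the action: `write_pip_routing` is an invented helper, the token is a placeholder, and the host, login, and index URL are copied from the step above. It assumes nothing about how pip itself resolves `NETRC` or `PIP_CONFIG_FILE`.

```python
import configparser
import netrc
import os
import tempfile

# Values copied from the step above; the token used below is a placeholder.
HOST = "databricks.jfrog.io"
LOGIN = "gha-service-account"
INDEX_URL = "https://databricks.jfrog.io/artifactory/api/pypi/db-pypi/simple"


def write_pip_routing(directory, token):
    """Write a netrc and pip.conf pair shaped like the ones the step creates."""
    netrc_path = os.path.join(directory, ".netrc")
    pip_conf_path = os.path.join(directory, ".pip.conf")
    # Create the credential file owner-only, similar in effect to `umask 077`.
    fd = os.open(netrc_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"machine {HOST}\nlogin {LOGIN}\npassword {token}\n")
    with open(pip_conf_path, "w") as handle:
        handle.write(f"[global]\nindex-url = {INDEX_URL}\n")
    return netrc_path, pip_conf_path


with tempfile.TemporaryDirectory() as tmp:
    netrc_path, pip_conf_path = write_pip_routing(tmp, "placeholder-token")
    # Parse both files back with the standard-library readers.
    login, _account, password = netrc.netrc(netrc_path).authenticators(HOST)
    parser = configparser.ConfigParser()
    parser.read(pip_conf_path)
    index_url = parser["global"]["index-url"]

print(f"login={login} index-url={index_url}")
```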

.github/actions/python_build/action.yml

Lines changed: 11 additions & 2 deletions
@@ -11,15 +11,24 @@ inputs:
 runs:
   using: "composite"
   steps:
+    # Pre-route pip at db-pypi BEFORE actions/setup-python runs: setup-python's
+    # python-versions installer unconditionally runs `pip install --upgrade pip`,
+    # which on the hardened runner group hits the egress allowlist and fails
+    # SSL-EOF against pypi.org. This step writes netrc + pip.conf so that
+    # internal pip call (and every later one) routes through JFrog instead.
+    # Idempotent if scala_build already ran in the same job — the env vars
+    # NETRC + PIP_CONFIG_FILE just get re-set to the same paths with a fresh token.
+    - name: Pre-bootstrap pip for JFrog (pre-setup-python)
+      uses: ./.github/actions/jfrog-pip-bootstrap
     - name: Configure python interpreter
       uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
       with:
         cache: 'pip'
         cache-dependency-path: '.ci-pip-cache-key'
         python-version: ${{ matrix.python }}
     # Route pip through JFrog (OIDC) per go/hardened-gha policy.
-    # Idempotent if scala_build already ran in the same job (re-auths but env vars stay set).
-    # Caller job must declare `permissions: id-token: write`.
+    # Idempotent: jfrog-pip-bootstrap already configured pip; this re-runs the
+    # same write with a fresh token. Caller job must declare `permissions: id-token: write`.
     - name: Authenticate for JFrog (pip via OIDC)
       uses: ./.github/actions/jfrog-auth
     - name: Add packaged GDAL dependencies

.github/actions/scala_build/action.yml

Lines changed: 12 additions & 3 deletions
@@ -27,16 +27,25 @@ runs:
     - name: Set Maven opts for coverage and parallel builds
       shell: bash
       run: echo "MAVEN_OPTS=-Xmx4g -XX:+UseG1GC" >> $GITHUB_ENV
+    # Pre-route pip at db-pypi BEFORE actions/setup-python runs: setup-python's
+    # python-versions installer unconditionally runs `pip install --upgrade pip`,
+    # which on the hardened runner group hits the egress allowlist and fails
+    # SSL-EOF against pypi.org. This step writes netrc + pip.conf so that
+    # internal pip call (and every later one) routes through JFrog instead.
+    - name: Pre-bootstrap pip for JFrog (pre-setup-python)
+      uses: ./.github/actions/jfrog-pip-bootstrap
     - name: Configure python interpreter
       uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
       with:
         cache: 'pip' # caches dependencies for faster subsequent runs
         cache-dependency-path: '.ci-pip-cache-key'
         python-version: ${{ matrix.python }}
-    # Route pip + Maven through JFrog (OIDC) per go/hardened-gha policy.
-    # Must run after setup-java + setup-python so mvn + pip3 are on PATH for auto-detect.
+    # Route Maven through JFrog (OIDC) per go/hardened-gha policy. Pip was
+    # already configured by jfrog-pip-bootstrap above; this re-runs pip's
+    # netrc/pip.conf write idempotently with a fresh token, and configures
+    # Maven now that setup-java has put mvn + ~/.m2/settings.xml in place.
     # Caller job must declare `permissions: id-token: write`.
-    - name: Authenticate for JFrog (pip + Maven via OIDC)
+    - name: Authenticate for JFrog (Maven + pip refresh via OIDC)
       uses: ./.github/actions/jfrog-auth
     # Verify the PGP signature of every Maven dependency / plugin against
     # .maven-keys.list BEFORE any other mvn call resolves or compiles them.

.github/workflows/diag-pgpverify-pom-transit.yml

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+name: Diag — POM transit (db-maven vs Maven Central)
+# One-shot diagnostic for Category B PGP failures: "PGP Signature INVALID"
+# on .pom files only (never .jar). Hypothesis: db-maven JFrog mirror is
+# mutating POM bytes between Maven Central and the CI runner, breaking the
+# upstream signature.
+#
+# This workflow fetches each suspect .pom + .pom.asc from db-maven (the
+# exact path Maven uses in CI), computes sha256, and compares against a
+# known reference from Maven Central (recorded inside the script). Mismatch
+# = byte mutation confirmed; match = signature failure has a different
+# root cause.
+#
+# Manual trigger only (workflow_dispatch). Delete this workflow once the
+# investigation concludes. Run blocks contain no untrusted github.event.*
+# values — only static script invocations.
+
+on:
+  # workflow_dispatch is the long-term trigger, but it only works once the
+  # workflow file lands on the default branch. While this diagnostic lives
+  # only on ci-fix-jfrog, also fire on push that touches the diag script
+  # or the workflow itself so we can actually run it. Remove the push
+  # trigger once the diagnostic concludes and the workflow is deleted.
+  workflow_dispatch: {}
+  push:
+    branches:
+      - 'ci-fix-jfrog'
+    paths:
+      - 'scripts/security/diag-pgpverify-pom-transit'
+      - '.github/workflows/diag-pgpverify-pom-transit.yml'
+
+permissions:
+  contents: read
+
+jobs:
+  diag:
+    name: pom-transit-diff
+    runs-on:
+      group: databrickslabs-protected-runner-group
+      labels: linux-ubuntu-latest
+    environment: runtime
+    permissions:
+      contents: read
+      id-token: write
+    steps:
+      - name: Checkout
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          token: ${{ secrets.REPO_ACCESS_TOKEN || secrets.GITHUB_TOKEN }}
+      - name: Configure JDK
+        uses: actions/setup-java@be666c2fcd27ec809703dec50e508c2fdc7f6654 # v5.2.0
+        with:
+          java-version: '17'
+          distribution: 'zulu'
+      - name: Authenticate for JFrog (Maven via OIDC)
+        uses: ./.github/actions/jfrog-auth
+      - name: Install gpg (for standalone signature verification)
+        run: sudo apt-get -o DPkg::Lock::Timeout=-1 install -y gpg
+      - name: Run POM transit diagnostic
+        run: ./scripts/security/diag-pgpverify-pom-transit
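
The comparison described in the workflow header (hash the bytes served by the mirror, compare against a digest recorded from upstream) can be sketched as below. This is an illustrative sketch only: the real `scripts/security/diag-pgpverify-pom-transit` script is not shown in this diff, the helper names `sha256_hex` and `compare_transit` are hypothetical, and the POM bytes are made up so the example runs offline without touching db-maven or Maven Central.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def compare_transit(mirror_bytes: bytes, reference_sha256: str) -> dict:
    """Compare bytes served by a mirror against a recorded upstream digest.

    Hypothetical helper: also counts CRLF line endings, since an LF -> CRLF
    rewrite is one way a text artifact can change in transit.
    """
    actual = sha256_hex(mirror_bytes)
    return {
        "actual_sha256": actual,
        "matches_reference": actual == reference_sha256.lower(),
        "crlf_line_endings": mirror_bytes.count(b"\r\n"),
    }


# Offline demonstration with made-up bytes (not a real POM from any repository).
upstream_pom = b"<project>\n  <modelVersion>4.0.0</modelVersion>\n</project>\n"
mutated_pom = upstream_pom.replace(b"\n", b"\r\n")  # simulate LF -> CRLF rewrite
reference = sha256_hex(upstream_pom)  # stands in for the recorded reference digest

clean = compare_transit(upstream_pom, reference)
mutated = compare_transit(mutated_pom, reference)

print("clean matches reference:  ", clean["matches_reference"])
print("mutated matches reference:", mutated["matches_reference"])
print("size delta in bytes:      ", len(mutated_pom) - len(upstream_pom))
```

A detached signature is computed over the exact upstream bytes, so any digest mismatch on the .pom while the .pom.asc digest still matches points at the artifact bytes rather than the signature.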

.maven-keys.list

Lines changed: 89 additions & 4 deletions
@@ -390,11 +390,8 @@ org.tukaani:xz = 0x369
 antlr:antlr:jar:2.7.2 = noSig
 antlr:antlr:pom:2.7.2 = noSig
 com.fasterxml.jackson.core:jackson-annotations:jar:2.18.3 = noSig
-com.fasterxml.jackson.core:jackson-annotations:pom:2.18.3 = noSig
 com.fasterxml.jackson.core:jackson-core:jar:2.18.3 = noSig
-com.fasterxml.jackson.core:jackson-core:pom:2.18.3 = noSig
 com.fasterxml.jackson.core:jackson-databind:jar:2.18.3 = noSig
-com.fasterxml.jackson.core:jackson-databind:pom:2.18.3 = noSig
 com.github.luben:zstd-jni:jar:1.5.7-6 = noSig
 com.github.luben:zstd-jni:pom:1.5.7-6 = noSig
 com.google.code.findbugs:jsr305:jar:2.0.1 = noSig
@@ -498,7 +495,6 @@ junit:junit:pom:4.13.2
 log4j:log4j:jar:1.2.12 = noSig
 log4j:log4j:pom:1.2.12 = noSig
 net.alchim31.maven:scala-maven-plugin:jar:4.9.9 = noSig
-net.alchim31.maven:scala-maven-plugin:pom:4.9.9 = noSig
 net.openhft:zero-allocation-hashing:jar:0.16 = noSig
 net.openhft:zero-allocation-hashing:pom:0.16 = noSig
 org.apache-extras.beanshell:bsh:jar:2.0b6 = noSig
org.apache-extras.beanshell:bsh:jar:2.0b6 = noSig
@@ -1053,3 +1049,92 @@ xml-apis:xml-apis:jar:1.0.b2
 xml-apis:xml-apis:jar:1.3.04 = noSig
 xml-apis:xml-apis:pom:1.0.b2 = noSig
 xml-apis:xml-apis:pom:1.3.04 = noSig
+
+# --- Version-specific keyed entries surfaced by post-pin closure ----------
+#
+# These are lifecycle-plugin versions that pom.xml now pins explicitly
+# (maven-compiler-plugin, maven-install-plugin, maven-deploy-plugin) plus
+# their transitive plugin-dependencies (plexus-compiler-*, file-management).
+# Newer versions are signed by Apache committers whose keys aren't on the
+# Apache Maven KEYS file but ARE on keyserver.ubuntu.com with self-signed
+# UIDs at apache.org addresses.
+#
+# Trust verification (2026-05-18):
+#
+#   0x84789D24DF77A32433CE1F079EB80E92EB2135B1
+#     uid: Slawomir Jaranowski <sjaranowski@apache.org>
+#     Apache Maven committer; lead developer of pgpverify-maven-plugin
+#     itself. Key created 2021-12-22, RSA 4096, active.
+#     https://keyserver.ubuntu.com/pks/lookup?op=vindex&fingerprint=on&search=0x84789D24DF77A32433CE1F079EB80E92EB2135B1
+#
+#   0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+#     uid: Sylwester Lachiewicz <slachiewicz@apache.org>
+#     Apache committer; Mojo Codehaus / Maven plugins maintainer.
+#     Key created 2020-05-09, RSA 4096, active.
+#     https://keyserver.ubuntu.com/pks/lookup?op=vindex&fingerprint=on&search=0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+#
+# Versions are pinned in pom.xml; bump them in lockstep with new entries here.
+
+org.apache.maven.plugins:maven-install-plugin:jar:3.1.4 = 0x84789D24DF77A32433CE1F079EB80E92EB2135B1
+org.apache.maven.plugins:maven-install-plugin:pom:3.1.4 = 0x84789D24DF77A32433CE1F079EB80E92EB2135B1
+org.apache.maven.plugins:maven-deploy-plugin:jar:3.1.4 = 0x84789D24DF77A32433CE1F079EB80E92EB2135B1
+org.apache.maven.plugins:maven-deploy-plugin:pom:3.1.4 = 0x84789D24DF77A32433CE1F079EB80E92EB2135B1
+org.apache.maven.shared:file-management:jar:3.2.0 = 0x84789D24DF77A32433CE1F079EB80E92EB2135B1
+org.apache.maven.shared:file-management:pom:3.2.0 = 0x84789D24DF77A32433CE1F079EB80E92EB2135B1
+org.apache.maven.plugins:maven-compiler-plugin:jar:3.15.0 = 0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+org.apache.maven.plugins:maven-compiler-plugin:pom:3.15.0 = 0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+org.codehaus.plexus:plexus-compiler-api:jar:2.16.2 = 0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+org.codehaus.plexus:plexus-compiler-api:pom:2.16.2 = 0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+org.codehaus.plexus:plexus-compiler-javac:jar:2.16.2 = 0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+org.codehaus.plexus:plexus-compiler-javac:pom:2.16.2 = 0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+org.codehaus.plexus:plexus-compiler-manager:jar:2.16.2 = 0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+org.codehaus.plexus:plexus-compiler-manager:pom:2.16.2 = 0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA
+
+# --- badSig overrides for POMs byte-mutated by db-maven JFrog mirror -----
+#
+# These six .pom files fail pgpverify-maven-plugin with "PGP Signature
+# INVALID" — cryptographic verification of the .asc against the bytes
+# fails. The corresponding .jar files for the same artifacts verify
+# OK (they're binary and pass through JFrog untouched).
+#
+# Root cause confirmed 2026-05-18 via the diag-pgpverify-pom-transit
+# workflow (run 26056343517 on branch ci-fix-jfrog):
+#
+#   * .pom sha256 from db-maven DIFFERS from Maven Central (size delta
+#     -348 to +170 bytes per file)
+#   * .pom.asc sha256 from db-maven MATCHES Maven Central (signatures
+#     untouched)
+#   * Hex dump shows LF → CRLF conversion at every line end, plus
+#     additional content normalization (whitespace / XML formatting)
+#   * Standalone `gpg --verify` on the db-maven bytes produces "Signature
+#     made … using RSA key … Can't check signature" — the .asc was
+#     computed against the original Maven Central bytes, not the
+#     JFrog-mutated ones.
+#
+# JFrog db-maven applies some form of text-resource transformation to
+# .pom files on the mirror side, which breaks the upstream PGP signature
+# chain. The .jar signatures remain trustworthy (binary, untouched), so
+# code-execution integrity is still gated. POM-declared dependency
+# coordinates inherit JFrog-as-trust-boundary status (an attacker who
+# compromises db-maven could regress to a previously-trusted-but-now-
+# vulnerable signed JAR via POM rewrite — risk is real but bounded
+# to artifacts already in the mirror's allowlist).
+#
+# `badSig` here means "tolerate that the .pom signature does not
+# cryptographically verify" — narrower than `noSig` ("tolerate no
+# signature at all"). The artifact must still HAVE a .asc on the
+# server; this only suppresses the crypto-failure error.
+#
+# Action item: file a JFrog admin ticket asking db-maven to disable
+# text-artifact transformations so POM bytes pass through verbatim, then
+# delete this block (POMs will verify again).
+#
+# Versions are pinned at the exact artifact:packaging:version level so
+# future releases inherit strict verification rather than the override.
+
+com.fasterxml.jackson.core:jackson-annotations:pom:2.18.3 = badSig
+com.fasterxml.jackson.core:jackson-core:pom:2.18.3 = badSig
+com.fasterxml.jackson.core:jackson-databind:pom:2.18.3 = badSig
+javax.servlet:javax.servlet-api:pom:3.1.0 = badSig
+net.alchim31.maven:scala-maven-plugin:pom:4.9.9 = badSig
+org.iq80.snappy:snappy:pom:0.4 = badSig
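
The pinning policy stated in the keysmap comments (overrides apply only at the exact artifact:packaging:version level) could be checked mechanically with something like the sketch below. It assumes a simplified reading of the keysmap format: one `pattern = value[, value]` entry per line with `#` comments. The real pgpverify keysmap grammar is richer, and the names `parse_keysmap`, `is_exact_pin`, and `unpinned_overrides` are hypothetical, as are the `org.example` coordinates and the fingerprint in the sample.

```python
from typing import List, NamedTuple, Tuple

# Special keysmap values that relax verification instead of naming a key.
OVERRIDE_TOKENS = {"noSig", "badSig", "noKey"}


class Entry(NamedTuple):
    pattern: str
    values: Tuple[str, ...]


def parse_keysmap(text: str) -> List[Entry]:
    """Parse 'pattern = value[, value]' lines, skipping blanks and # comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        pattern, _, value = line.partition("=")
        values = tuple(v.strip() for v in value.split(",") if v.strip())
        entries.append(Entry(pattern.strip(), values))
    return entries


def is_exact_pin(pattern: str) -> bool:
    """True for group:artifact:packaging:version with no wildcard or range."""
    parts = pattern.split(":")
    if len(parts) != 4:
        return False
    return all(p and not any(c in p for c in "*[](),") for p in parts)


def unpinned_overrides(entries: List[Entry]) -> List[Entry]:
    """Return override entries that are not pinned to one exact version."""
    return [
        e for e in entries
        if OVERRIDE_TOKENS & set(e.values) and not is_exact_pin(e.pattern)
    ]


# Sample input: the jackson-core line is from the diff above; the other two
# entries are invented for illustration.
sample = """
# comment line
org.example:lib = 0x0123456789ABCDEF0123456789ABCDEF01234567
com.fasterxml.jackson.core:jackson-core:pom:2.18.3 = badSig
org.example:demo = noSig
"""

flagged = unpinned_overrides(parse_keysmap(sample))
for entry in flagged:
    print("override without exact version pin:", entry.pattern)
```

Run against the full .maven-keys.list, a check like this would surface any `badSig` or `noSig` entry that a future release could silently inherit.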
