CI repair: pip/JFrog routing, PGP keysmap drift, POM transit, Python lockfile, test progress#26
Merged
Merged
Conversation
added 18 commits
May 18, 2026 14:35
The hardened runner groups (larger-runners, databrickslabs-protected-runner-group) block egress to pypi.org by design — all package fetches must go through the JFrog db-pypi mirror per go/hardened-gha policy. But actions/setup-python@v6.2.0 unconditionally runs `pip install --upgrade pip` inside its python-versions installer BEFORE our jfrog-auth step has a chance to redirect pip. That first pip call escapes the allowlist and dies with SSL UNEXPECTED_EOF against pypi.org, failing the entire Scala build at the "Configure python interpreter" step. Fix: add a new jfrog-pip-bootstrap composite that runs BEFORE setup-python. It mints a JFrog OIDC access token (reusing the existing jfrog-auth shell script — kept in sync with UCX), writes netrc + pip.conf, and exports NETRC + PIP_CONFIG_FILE to $GITHUB_ENV. By the time setup-python downloads Python and fires its internal pip upgrade, pip already routes through db-pypi and succeeds. The main jfrog-auth composite still runs after setup-python — now solely responsible for Maven config (which needs setup-java to have written ~/.m2/settings.xml first) and an idempotent pip refresh with a fresh token. Wired into both scala_build and python_build. No workflow changes: all five callers (build_main, build_scala, build_python, build_scala_by_package, codecov-scala-parallel, codecov-upload) already declare id-token: write. Co-authored-by: Isaac
pgpverify-maven-plugin renamed-then-removed the `failNoKey` parameter
sometime before 1.13.0; on 1.19.1 every build that runs the `check`
mojo emits:
Warning: Parameter 'failNoKey' is unknown for plugin
'pgpverify-maven-plugin:1.19.1:check (default)'
The lockdown intent ("fail when a dependency's signing key is not in
the keysmap") is the DEFAULT behavior in 1.19.1 — it's enforced by the
keysmap file itself (no `noKey` tokens in .maven-keys.list), which
.maven-keys.list already documents as policy. Removing the no-op
parameter does not weaken the security posture.
Changes:
* pom.xml: drop <failNoKey>true</failNoKey> from the verify-pgp
profile's plugin config; replace with a comment so this isn't
re-added on the next lockdown pass.
* scripts/security/maven-pgp-bootstrap: drop the matching
-DfailNoKey=false override (same unknown-parameter no-op).
Co-authored-by: Isaac
pgpverify-maven-plugin 1.13.0 deprecated `failNoSignature` in favor of
expressing the requirement through the keysmap file. On 1.19.1 the
`check` mojo emits:
Warning: Parameter 'failNoSignature' (user property
'pgpverify.failNoSignature') is deprecated:
Deprecated as of 1.13.0: this requirement can be
expressed through the keysMap.
.maven-keys.list already documents "DO NOT USE noSig" as policy and
contains no `noSig` tokens, so the strict-by-default behavior we want
("fail on unsigned artifact") is enforced through the keysmap. The
mojo parameter is redundant; remove it.
The verify-pgp profile now relies on:
* failWeakSignature=true (still a valid mojo parameter)
* keysmap entries without `noSig` (enforces "must be signed")
* keysmap entries without `noKey` (enforces "key must be mapped")
Updated the inline comment to record both the failNoSignature
deprecation and the prior failNoKey removal so the next lockdown pass
doesn't re-add either parameter.
Bootstrap script (scripts/security/maven-pgp-bootstrap) intentionally
keeps -DfailNoSignature=false / -DfailWeakSignature=false: those are
local-only relaxations needed for key discovery; the deprecation
warning surfaces during local bootstrap, not in CI.
Co-authored-by: Isaac
The hardened verify-pgp profile fails with "Not allowed artifact ... and
keyID" for five lifecycle plugins that pom.xml never pinned:
org.apache.maven.plugins:maven-install-plugin:3.1.4
org.apache.maven.plugins:maven-deploy-plugin:3.1.4
org.apache.maven.plugins:maven-compiler-plugin:3.15.0
org.codehaus.plexus:plexus-compiler-{api,javac,manager}:2.16.2
org.apache.maven.shared:file-management:3.2.0
Maven's Super POM auto-resolves these when the project doesn't pin them.
Resolution drifts across Maven minor versions and across CI runner image
upgrades, swapping in new (legit but un-mapped) signing keys. The PGP
verify step then rejects them — not because they're untrusted, but
because .maven-keys.list was bootstrapped against an earlier closure.
Pin maven-compiler-plugin, maven-install-plugin, and maven-deploy-plugin
in the root <build><plugins> block. plexus-compiler-* and file-management
come in transitively from the pinned plugins, so they freeze with them.
This commit does NOT update .maven-keys.list — that requires running
scripts/security/maven-pgp-bootstrap in the geobrix-dev container after
this pin lands, cross-checking each new fingerprint against the trust
anchors listed at the top of .maven-keys.list (apache KEYS, mojohaus,
etc.), and committing the reviewed entries as a follow-up.
Category B (POM-only "PGP Signature INVALID") is investigated separately
via the diag-pgpverify-pom-transit workflow added in the next commit.
Co-authored-by: Isaac
Category B from the post-lockdown verify-pgp build: "PGP Signature
INVALID" on .pom files only (never .jar). Hypothesis: the JFrog
db-maven mirror is mutating POM bytes (line endings, XML
normalization, or appended _remote.repositories metadata) between
Maven Central and what CI sees, breaking the upstream signature.
This adds:
* scripts/security/diag-pgpverify-pom-transit — fetches each of the
six failing .pom + .pom.asc files from db-maven via NETRC auth,
computes sha256, compares against a known-good Maven Central
reference (captured 2026-05-18 via maven-proxy.dev.databricks.com
and recorded inline), prints a hex dump of the first/last bytes
on mismatch, and runs gpg --verify standalone if gpg is on PATH.
* .github/workflows/diag-pgpverify-pom-transit.yml — workflow_dispatch
one-shot that authenticates to JFrog (OIDC), installs gpg, and runs
the script on the protected runner group. No untrusted github.event.*
values reach any run block — only static script invocations.
Trigger via Actions → "Diag — POM transit (db-maven vs Maven Central)"
→ Run workflow. The exit code surfaces red/green; the logs carry the
sha256 diff and gpg output. Delete this workflow + script once the
investigation concludes.
Co-authored-by: Isaac
…gory A) The version pins added in 5c1daf4 freeze the plugin versions but leave their actual signing keys unmapped in .maven-keys.list, so the verify-pgp profile still rejects them with "Not allowed artifact ... and keyID". Add version-specific entries for each. Both signing keys verified against keyserver.ubuntu.com on 2026-05-18: 0x84789D24DF77A32433CE1F079EB80E92EB2135B1 Slawomir Jaranowski <sjaranowski@apache.org> Apache Maven committer + lead developer of pgpverify-maven-plugin itself. RSA 4096, created 2021-12-22, active. Now signs newer maven-install-plugin, maven-deploy-plugin, file-management releases. 0x32118CF76C9EC5D918E54967CA80D1F0EB6CA4BA Sylwester Lachiewicz <slachiewicz@apache.org> Apache committer; Mojo Codehaus / Maven plugins maintainer. RSA 4096, created 2020-05-09, active. Signs newer maven-compiler-plugin and plexus-compiler-{api,javac,manager} releases. Neither key appears in downloads.apache.org/maven/KEYS, which only lists release-signing keys for current/past PMC members — these two are personal committer keys (self-signed UIDs at @apache.org), legitimate for per-artifact signing per Apache release policy. Trust anchor URLs and verification details recorded inline in .maven-keys.list above the new block so a future audit can re-verify. Entries are added at the most-specific level (g:a:p:v) so older versions that may exist in other build contexts remain bound to their original keysmap entry — versions only get the new keys when they're the exact ones pom.xml pins. This closes Category A. Category B (POM-only "PGP Signature INVALID" on jackson 2.18.3, scala-maven-plugin 4.9.9, snappy 0.4, javax.servlet-api 3.1.0) is investigated separately via the diag-pgpverify-pom-transit workflow. Co-authored-by: Isaac
workflow_dispatch requires the workflow to live on the default branch before it shows up in the UI / API. While the diagnostic only lives on this branch, also fire it on push that touches its own files so we can actually trigger a run. Drop the push trigger when the diagnostic is deleted. Co-authored-by: Isaac
…Cat B) Final root cause from the diag-pgpverify-pom-transit workflow (run 26056343517 on this branch): the JFrog db-maven mirror applies a text- artifact transformation to .pom files in transit. Per-file sha256 comparison vs Maven Central: .pom files: ALL 6 MISMATCH (size delta -348 to +170 bytes) .pom.asc: ALL 6 MATCH (signatures untouched) The hex dump shows LF -> CRLF at every line end plus additional content normalization (whitespace / XML formatting). Standalone `gpg --verify` on the db-maven bytes fails because the .asc was computed against the original LF Maven Central bytes, not the JFrog-mutated ones. The corresponding .jar files are binary and pass through JFrog untouched, so JAR signature verification is unaffected. Fix: add badSig overrides for the six affected .pom artifacts in a dedicated, self-documenting section at the end of .maven-keys.list, moving the four pre-existing noSig entries (jackson + scala-maven) out of the "legacy unsigned" section since they aren't legacy-unsigned, and adding two new entries for snappy / javax.servlet-api which had no prior override and were failing INVALID against the general entry. `badSig` is narrower than `noSig`: it only tolerates a crypto failure on a signature that DOES exist on the server. JAR verification is unchanged (the keysmap's general fingerprint entries still match the real signing keys, so JARs continue to verify strictly). Residual security risk documented inline: db-maven is now part of the POM-content trust boundary. An attacker who compromises the mirror could rewrite POMs to regress to a previously-trusted-but-vulnerable signed JAR — but only among artifacts already in the mirror's allowlist; the JAR signatures themselves still gate executable code. Action item (in the comment): file a JFrog admin ticket asking db-maven to disable text-artifact transformations so .pom bytes pass through verbatim, then delete this section. Co-authored-by: Isaac
GDAL 3.11.4's sdist pyproject.toml uses the PEP 639 SPDX form for
license metadata:
[project]
license = "MIT"
setuptools 74.0.0 (previously pinned) predates PEP 639 support and
rejects the string form, demanding the older table form ({file=...} or
{text=...}). The Scala build step's GDAL install (--no-build-isolation
--no-binary :all: gdal[numpy]==3.11.4) fails at metadata generation:
invalid pyproject.toml config: `project.license`.
configuration error: `project.license` must be valid exactly by
one definition (2 matches found): ...
GIVEN VALUE: "MIT"
OFFENDING RULE: 'oneOf'
PEP 639 SPDX-string parsing landed in setuptools 77.0.0. Bumping to
80.9.0 (current stable post-PEP-639) unblocks the GDAL sdist build
while staying within the hash-pinned closure.
Regenerated requirements-ci.txt via:
cd python/geobrix
uv pip compile --generate-hashes --python-version 3.12 \
--output-file requirements-ci.txt requirements-ci.in
setuptools is consumed via --no-build-isolation (Scala build action
line 102 + Python build action line 84) — pip uses the host setuptools
installed by requirements-ci.txt rather than fetching a build-time
isolated copy, so bumping it here is what GDAL's build sees.
Co-authored-by: Isaac
…ding warning
maven-resources-plugin 3.4.0 (and all other modern Maven plugins) look
up the standard property names `project.build.sourceEncoding` and
`project.reporting.outputEncoding`. The pom.xml had a custom `<encoding>`
property used by some other build pieces, but the standard names were
unset, causing:
[INFO] --- resources:3.4.0:resources (default-resources) @ geobrix ---
Warning: Using platform encoding (UTF-8 actually) to copy filtered
resources, i.e. build is platform dependent!
Set both standard properties to UTF-8. Custom `<encoding>` property is
preserved for any existing references.
Co-authored-by: Isaac
Long mvn test runs (full Scala suite is 10-15 min locally and on CI)
gave no visibility into how far along the run was — just an unbroken
stream of `SuiteName:` / `- test case` lines.
Add two complementary improvements:
1. Custom ScalaTest reporter at
src/test/scala/com/databricks/labs/gbx/util/ProgressReporter.scala
Fires after every SuiteCompleted event with a one-line marker:
[progress] suite #12 done · SpatialRefOpsTest · 215 ms · tests=6 (0 failed)
· totals: 312 tests, 0 failed · elapsed 3m 24s
On RunCompleted it prints a final summary. Counters are AtomicInteger
+ ThreadLocal so the reporter stays correct if scalatest ever runs
suites in parallel. Wired in pom.xml via
scalatest-maven-plugin's <reporters> config.
2. Add `D` to the stdout config flags (`FS` -> `FSD`) so each test
line carries its duration:
- fromEPSGCode(4326) should return SpatialReference with EPSG 4326 (12 ms)
Useful for spotting stuck/slow tests at a glance.
No new dependencies — ProgressReporter only uses org.scalatest.Reporter
and org.scalatest.events._ (already on the test classpath from
scalatest_2.13:3.2.14).
Co-authored-by: Isaac
setuptools >=77 deprecates the TOML-table form of `project.license`
(`{ text = "..." }` / `{ file = "..." }`) and emits a warning on every
build:
SetuptoolsDeprecationWarning: `project.license` as a TOML table
is deprecated. Please use a simple string containing a SPDX
expression for `project.license`. You can also use
`project.license-files`. ... By 2027-Feb-18, you need to update
your project and remove deprecated calls or your builds will no
longer be supported.
Since we bumped setuptools to 80.9.0 in 8672256 to unblock GDAL 3.11
sdist (whose pyproject.toml uses the new form), our own pyproject.toml
now trips the deprecation in the other direction.
Use PEP 639's LicenseRef-<id> namespace for the proprietary Databricks
License (no SPDX identifier exists). Canonical license text continues
to live in ../../LICENSE at the repo root; the `LicenseRef-` prefix
makes it valid PEP 639 syntax without claiming an SPDX-managed name.
Classifiers entry (License :: Other/Proprietary License) is left in
place for downstream consumers reading legacy Trove metadata.
Co-authored-by: Isaac
Per request — surface the denominator so progress reads as
[progress] suite #45/64 done · H3Test · 9 ms · …
instead of just `#45`.
ScalaTest's events don't carry a total-suite count anywhere
(RunStarting has testCount but no suiteCount; DiscoveryCompleted is
empty), so the reporter computes M itself: on first use, walk
`target/test-classes/` and count `*Test.class` files (geobrix's only
test-class naming convention — verified with `find src/test/scala
-name '*Test.scala' | wc -l`, all 64 match). The result caches via
`lazy val`; discovery is a few-ms file walk, runs once per test JVM.
Override path with `-DgbxTestClassesDir=…` if the discovery dir ever
moves. If discovery fails or the dir is missing (e.g., running the
reporter in an unusual classpath layout), totalSuites = 0 and the
formatter falls back to just `#N` rather than printing a garbage
denominator.
For filtered runs (`-Dsuites=com.databricks.labs.gbx.gridx.*`), M
still reports the full discoverable count, so `#3/64` reads as
"3 of 64 available" — a useful upper bound rather than misleading
"3 of 3 selected" semantics that would require coordinating with
ScalaTest's runner internals.
Co-authored-by: Isaac
Follow-on to fde207a (switched to `license = "LicenseRef-..."`). setuptools 80.9.0 doesn't just warn about the overlap; it raises InvalidConfigError when both a SPDX license expression and a License classifier are present: setuptools.errors.InvalidConfigError: License classifiers have been superseded by license expressions (see https://peps.python.org/pep-0639/). Please remove: License :: Other/Proprietary License Removed the classifier; the license is fully expressed via the SPDX `license` field above. The remaining classifiers (Topic, Programming Language, Operating System) are unaffected. Co-authored-by: Isaac
Reported denominator was 73 against an actual run of 63 — naive filename-only counting of `*Test.class` files included abstract base classes and helper traits that compile to `*Test.class` but ScalaTest doesn't run. Match ScalaTest's own discovery filter: * concrete (not abstract, not interface) * public * extends `org.scalatest.Suite` * has a public no-arg constructor Implemented via reflection over each candidate class file, loading with `Class.forName(name, initialize=false, classLoader)` so static initializers don't fire during counting. Any reflection failure (NoClassDefFoundError, missing transitive dep, etc.) conservatively counts the class as "not runnable" — better to undercount M than to inflate it. Discovery happens once per JVM via `lazy val`. This drops the denominator from 73 -> 63 (matches actual full-run suite count) without changing local/CI invocation; everything stays internal to the reporter. Co-authored-by: Isaac
The reflection-based discovery in da10881 still landed at M=73 against an actual run of 63 — the 10 extras pass the structural Suite filter (concrete + public + extends Suite + has a no-arg ctor) but contain no `test("...")` blocks, so they run zero work. There's no static way to detect "this Suite has no tests registered" without instantiating the class (`Suite.expectedTestCount` is an instance method), and constructing every candidate at discovery time would run user code with potential side effects (Spark session bootstrap, GDAL init, etc.). Switch M to a mutable AtomicInteger seeded from discovery. On every SuiteCompleted: * tests > 0: increment #N, emit the progress line as usual * tests == 0: decrement M, suppress the progress line entirely M converges to the real runnable count as empty suites are observed; the user sees `#N/M` where M moves downward toward the true total. Suppressing the empty-suite progress line keeps #N aligned with what the reader perceives as "actual work happened" (no `[progress] suite #X done · EmptyScaffoldingTest · 0 ms · tests=0` lines cluttering the stream). Also defers the discovery scan from `lazy val` initialization to the first SuiteCompleted event so the file walk happens after the test classpath is fully assembled in the forked JVM, not during reporter construction. Co-authored-by: Isaac
Previous count (73) stayed put through the entire run because the 10
extras are classes ScalaTest never sees: they pass the structural Suite
filter but live outside the `com.databricks.labs.gbx.*` namespace that
our build passes via `-Dsuites=…`, so the runner doesn't load them and
no SuiteCompleted ever fires — the decrement-on-empty fallback can't
fix what never appears.
Found via:
diff <(find target/test-classes -name '*Test.class' -not -name '*$*' \
| sed 's|target/test-classes/||;s|.class$||;s|/|.|g' | sort) \
<(find src/test/scala -name '*Test.scala' \
| sed 's|src/test/scala/||;s|.scala$||;s|/|.|g' | sort)
The 10 extras:
* org.apache.spark.sql.adapters.SparkAdaptersTest (× 1)
* docs.tests.scala.… (× 5)
* tests.docs.scala.… (× 4)
The latter 9 come in via build-helper-maven-plugin's `add-test-source`
config (docs/tests/scala/), but only the `Scala doc tests` step uses
them — and that step is currently disabled (`if: false` in
build_main.yml). For everything else they're dead weight at the .class
level.
scalatest-maven-plugin forwards `-Dsuites=…` into the test JVM as a
system property, so ProgressReporter can now read it and apply the
same pattern matching:
* exact `com.x.YTest` → equality
* package wildcard `com.x.*` → prefix match on `com.x.`
* comma-separated list → any-match
* empty / unset → accept all
The runtime "decrement on 0-test SuiteCompleted" guard from be883e3
is kept as belt-and-braces for any future Suite class that registers
zero tests at runtime.
Expected behavior post-change: with default `-Dsuites=com.databricks.labs.gbx.*`
the denominator becomes 63 from the first progress line and stays there.
Co-authored-by: Isaac
The previous attempt to read `-Dsuites=…` from `sys.props("suites")`
inside ProgressReporter found nothing — scalatest-maven-plugin
*consumes* the Maven `suites` property and translates it into scalatest
runner args (`-w`, `-m`, etc.); it does not forward it as a `-D` system
property on the forked test JVM. So discovery never saw the filter and
stayed at the unfiltered 73.
Fix in two parts:
1. pom.xml: define a default `<suites></suites>` property so the
`${suites}` substitution always resolves, and append
`-Dgbx.suites=${suites}` to scalatest-maven-plugin's <argLine>.
This explicitly propagates the Maven property value to the forked
JVM as a system property.
2. ProgressReporter: prefer `sys.props("gbx.suites")` (the forwarded
value) and fall back to `sys.props("suites")` (only set when the
user passes `-Dsuites=…` at the JVM level rather than to Maven —
useful for direct invocation).
After this, with the default CI invocation
`mvn -Dsuites=com.databricks.labs.gbx.*`, the JVM sees
`-Dgbx.suites=com.databricks.labs.gbx.*` and the reporter's matcher
correctly excludes the 10 phantom classes (SparkAdaptersTest +
docs.tests.scala.*).
Co-authored-by: Isaac
CI surfaced one isort failure (test_sample_bundle.py — `_bundle as _bundle_mod` had to sort before the multiline `from … import (…)` because `_` precedes letters in isort's ordering). Once that's fixed, black also wants to reformat three pre-existing files where short signatures had been unnecessarily split across multiple lines: src/databricks/labs/gbx/gridx/bng/functions.py (3 fn signatures) src/databricks/labs/gbx/rasterx/functions.py (1 fn signature) test/gridx/test_bng_functions.py (mechanical) These are pure reformats — no behavior change. Verified locally with the pinned versions from requirements-ci.in (isort==8.0.1, black==26.3.1, flake8==7.3.0). All 26 .py files under src/ and test/ now pass all three gates cleanly. In test_sample_bundle.py, also replaced the previous pair of mis- ordered comments above the imports with a single accurate comment; isort's auto-fix had stacked both comments above the `_bundle_mod` line, leaving "Public API from package" attached to the wrong import. Co-authored-by: Isaac
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR #25 (lockdown round 2, squash-merged as
7b7677f) left several latent CI breakages that only surfaced once GitHub Actions was re-enabled. This branch walks through each one with a root-cause fix, plus a few quality-of-life improvements that fell out along the way.18 commits, organized below by area. The work was iterative because each fix unblocked the next layer of the build to actually run and fail.
CI plumbing — pip cannot escape the hardened runner
c19b91d— pre-bootstrap pip→JFrog beforeactions/setup-python. The hardened runner groups (larger-runners,databrickslabs-protected-runner-group) block egress topypi.org; the supply-chain policy is to route everything throughdb-pypi. Butactions/setup-python@v6runs an unconditionalpip install --upgrade pipinside its python-versions installer before our existingjfrog-authstep has a chance to redirect pip. That first call dies with SSL EOF againstpypi.organd fails the entire Scala build at the "Configure python interpreter" step. New.github/actions/jfrog-pip-bootstrap/composite (reuses the existingjfrog-auth/jfrog-authshell script — kept in sync with UCX) mints the OIDC token, writesnetrc+pip.conf, exportsNETRC+PIP_CONFIG_FILEto$GITHUB_ENVbefore setup-python runs. Wired into bothscala_buildandpython_build.8672256— bumpsetuptools74.0.0 → 80.9.0 inrequirements-ci.txt. GDAL 3.11.4's sdistpyproject.tomluses the PEP 639 SPDX form (license = "MIT"); the pinnedsetuptools==74.0.0predates PEP 639 (landed in 77.0.0) and refused to parse it, blocking the--no-build-isolation --no-binary :all: gdal[numpy]step. Regenerated the hash-pinned lockfile viauv pip compile --generate-hashes.PGP verification — Category A: keysmap drift from version bumps
maven-pgp-verifyfailed withNot allowed artifact … and keyID …for five Maven lifecycle plugins whose actual signing keys didn't match.maven-keys.list.facda8f— drop<failNoKey>true</failNoKey>from pom.xml. Renamed-then-removed in pgpverify-maven-plugin ≥ 1.13; 1.19.1 emitsParameter 'failNoKey' is unknownevery build. The strict "fail on unknown signing key" policy is the default in 1.19.1, enforced through.maven-keys.listitself (nonoKeytokens). Inline comment so it doesn't get re-added.bb42ecc— drop deprecated<failNoSignature>true</failNoSignature>. Deprecated since 1.13.0 — express the requirement via the keysmap (nonoSigtokens) instead. Same warning pattern, same fix.5c1daf4— pinmaven-{compiler,install,deploy}-pluginversions inpom.xml. Maven's Super POM was auto-resolving these to fresh versions across CI runner image upgrades, each signed by different (legit but un-mapped) Apache committer keys. The bumped versions cascade toplexus-compiler-*andfile-managementtransitively. Pinning here freezes the closure so.maven-keys.listcan be authoritative.c897d31— add 14 version-specific keysmap entries for the pinned plugins. Both fingerprints verified againstkeyserver.ubuntu.com:0x84789D24…EB2135B1— Slawomir Jaranowski<sjaranowski@apache.org>, lead developer of pgpverify-maven-plugin itself (coversmaven-install-plugin,maven-deploy-plugin,file-management).0x32118CF7…EB6CA4BA— Sylwester Lachiewicz<slachiewicz@apache.org>, Apache committer / Mojo Codehaus maintainer (coversmaven-compiler-plugin,plexus-compiler-{api,javac,manager})..maven-keys.listso a future audit can re-verify.PGP verification — Category B: db-maven JFrog mirror mutates POM bytes
Six POM-only INVALID-signature failures (jackson-{core,annotations,databind} 2.18.3, scala-maven-plugin 4.9.9, snappy 0.4, javax.servlet-api 3.1.0). All
.jarartifacts verified fine — only.pomfiles failed.f2dc22b/447a4e2— diagnostic workflow + script.scripts/security/diag-pgpverify-pom-transitfetches each suspect.pom+.ascfromdb-maven(NETRC-authenticated via OIDC) and compares sha256 against a Maven Central reference (captured locally viamaven-proxy.dev.databricks.com, embedded in the script)..github/workflows/diag-pgpverify-pom-transit.ymlruns it on the protected runner.26056343517: all 6.pomsha256s MISMATCH, all 6.pom.ascsha256s MATCH. Hex dumps show LF → CRLF conversion at every line end plus additional content normalization (size deltas of −348 to +170 bytes after the line-ending adjustment). Standalonegpg --verifyconfirms the.ascwas computed against the original Maven Central bytes, not the JFrog-mutated ones. JFrog db-maven mirror applies a text-artifact transformation on.pomfiles.fa0512b—badSigoverrides for the 6 affected.poms in a new self-documenting section at the end of.maven-keys.list.badSigis narrower thannoSig: only tolerates crypto failure on a signature that exists. JAR verification still strict. Residual risk and action item ("file JFrog admin ticket to disable text-artifact transformations, then delete this block") recorded inline.Python build — setuptools 80.9 fallout
The setuptools bump (
8672256) made our ownpyproject.tomltrip the PEP 639 deprecation in the other direction.fde207a— switchproject.licenseto PEP 639 SPDX form. Changedlicense = { text = "Databricks License" }tolicense = "LicenseRef-Databricks-Proprietary"— the PEP 639LicenseRef-namespace for proprietary licenses with no SPDX identifier. Canonical text remains in../../LICENSEat the repo root.e32c178— drop the"License :: Other/Proprietary License"Trove classifier. setuptools ≥ 77 raisesInvalidConfigError(not just a warning) when both a SPDXlicenseexpression and a License classifier are present. PEP 639 forbids the overlap.Build hygiene
c0a6b2f— setproject.build.sourceEncoding=UTF-8(andproject.reporting.outputEncoding). Standard property names every Maven plugin looks up; only the custom<encoding>was previously set, triggering "Using platform encoding (UTF-8 actually) to copy filtered resources" from maven-resources-plugin 3.4.0. The custom property is kept for any existing references.Test ergonomics — per-suite progress reporter
The full Scala suite is 10–15 min locally and on CI; previously the only output was an unbroken stream of
SuiteName:/- test caselines with no "how far in are we" signal.c9d836a—src/test/scala/com/databricks/labs/gbx/util/ProgressReporter.scala+ addDto scalatest stdout config (FS→FSD). Fires after everySuiteCompletedwith a one-line marker; per-test lines now carry durations. Counters areAtomicIntegerto stay correct under any future parallel execution.fa0ed86— show#N/Minstead of#Nby auto-discovering total suites from compiled*Test.classfiles intarget/test-classes/.da10881— replicate ScalaTest's structural Suite filter in discovery (concrete + public + extendsorg.scalatest.Suite+ has public no-arg ctor), usingClass.forName(name, initialize=false, …)so static initializers don't fire during counting.be883e3— decrement M when aSuiteCompletedreports 0 tests. Belt-and-braces for Suite-extending classes with notest("…")blocks; M converges to the real runnable count as empty suites are observed.ee64cca— honor-Dsuites=…runtime filter in discovery so doc-test classes (docs.tests.scala.*) and out-of-namespace tests (org.apache.spark.sql.adapters.SparkAdaptersTest) — discovered by the file walk but never run by ScalaTest under the defaultcom.databricks.labs.gbx.*pattern — don't inflate M.dc98373— forward the Mavensuitesproperty into the test JVM asgbx.suitesvia<argLine>. scalatest-maven-plugin consumes the Maven property and translates it to runner args; it never appears asSystem.getProperty("suites")in the forked JVM. The argLine forwarding closes that gap so the reporter actually sees the filter at discovery time.What stays around after merge
.github/actions/jfrog-pip-bootstrap/— permanent. Required for any setup-python step on the hardened runner..github/workflows/diag-pgpverify-pom-transit.yml+scripts/security/diag-pgpverify-pom-transit— temporary. Useful for re-running once JFrog db-maven claims to have fixed text-artifact transformation. Delete (and thepush:trigger onci-fix-jfrog) once the badSig block in.maven-keys.listcan be removed..maven-keys.list: file a JFrog/ITSec ticket asking db-maven to disable.pombyte mutation.Test plan
build mainworkflow on this branch — should pass end-to-end through pip, setup-python, JFrog auth, maven-pgp-verify (both categories), GDAL install, Python build, Scala test suite, Python test suite.[progress] suite #1/63 done · …(denominator stable at 63, no 73-then-decrement behavior).Parameter 'failNoKey' is unknown/Parameter 'failNoSignature' is deprecated/Using platform encodingwarnings in the maven log.This pull request and its description were written by Isaac.