
spark-runtime: publish sources and javadoc-Jar with contents of its subprojects #16878

Open
schabch wants to merge 75 commits into apache:main from schabch:main

Conversation

schabch commented Jun 19, 2026

This PR addresses the already closed issue #15824. (The comments on that issue don't address the problem stated in its description.)

In my view, the build of the spark-runtime subproject should not publish empty sources and javadoc jars.
It should publish either no sources and javadoc jars at all, or sources and javadoc jars that contain the sources and javadoc of its subprojects.

This PR adds the sources and javadoc of the subprojects to the sources and javadoc jars of spark-runtime.

schabch changed the title from "Publish sources and javadoc-Jar of shadow jar spark-runtime.jar with contents of its subprojects" to "spark-runtime: publish sources and javadoc-Jar with contents of its subprojects" on Jun 19, 2026
Fezinvirtual and others added 24 commits June 19, 2026 05:53
The jackson.databind dependency appears twice in the dependencies closure. This change removes the duplicated line. It does not introduce any functional or behavioral changes to the Kafka Connect connector.
* Flink: Add equality delete conversion committer

Adds EqualityConvertCommitter, a two input operator for the equality delete
conversion task. It buffers the DVWriteResults from the parallel
EqualityConvertDVWriter instances and the EqualityConvertPlan from the planner,
then commits once both the plan and its done-timestamp watermark have
arrived.
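
The "commit once both have arrived" rule could be sketched roughly as follows. This is a simplified illustration, not the actual operator: the class `CommitGateSketch`, its method names, and the use of plain strings for the plan and the write results are all assumptions made for the example, and no Flink API is involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the Iceberg or Flink code base.
final class CommitGateSketch {
  static final class Gate {
    private final List<String> bufferedResults = new ArrayList<>();
    private final List<String> commits = new ArrayList<>();
    private String plan; // null until the planner's plan arrives
    private long planDoneTimestamp;
    private long watermark = Long.MIN_VALUE;

    // Input 1: results from the parallel writers are only buffered.
    void onWriteResult(String result) {
      bufferedResults.add(result);
    }

    // Input 2: the plan, together with the timestamp that marks it as done.
    void onPlan(String newPlan, long doneTimestamp) {
      plan = newPlan;
      planDoneTimestamp = doneTimestamp;
      maybeCommit();
    }

    void onWatermark(long timestamp) {
      watermark = Math.max(watermark, timestamp);
      maybeCommit();
    }

    // Commits only when the plan is present AND the watermark has reached its done-timestamp.
    private void maybeCommit() {
      if (plan != null && watermark >= planDoneTimestamp) {
        commits.add(plan + ":" + bufferedResults);
        bufferedResults.clear();
        plan = null;
      }
    }

    List<String> commits() {
      return commits;
    }
  }

  public static void main(String[] args) {
    Gate gate = new Gate();
    gate.onWriteResult("dv-a");
    gate.onWriteResult("dv-b");
    gate.onPlan("plan-1", 100L);
    System.out.println("commits before watermark: " + gate.commits().size());
    gate.onWatermark(100L);
    System.out.println("commits after watermark: " + gate.commits().size());
  }
}
```

The point of the gate is that neither input alone can trigger a commit, so late writer results are not lost as long as they arrive before the done-timestamp watermark.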

The committer adds both new staging data files and the writer's merged DVs,
removes the superseded DVs, and carries over the remaining staging deletes. It
validates against the planner's main snapshot, so external changes on the target
branch fail the commit. The commit summary records the processed staging
snapshot id, which the planner reads on restart to skip already-committed
snapshots, to ensure idempotency.

The committer is responsible for sending Trigger records downstream to the
TaskResultAggregator of the surrounding table maintenance framework. It emits
one after every cycle, including no-op and failed cycles, so the task always completes.
On an upstream abort or a commit failure it deletes the DVs written this cycle
so the failure doesn't leak Puffin files.

* fixup! Rename stale DV resolver/merger references to DV writer

* fixup! Reword inaccurate watermark-absorption javadoc on the committer

* fixup! Add shared-branch DV-merge regression test for the committer
…#16870)

Override newInputFile(String, long) so OSSFileIO uses the known length
instead of falling back to the default that drops it. Without the override,
getLength() triggers a HEAD request (getSimplifiedObjectMeta) even when the
caller already knows the size. S3FileIO, GCSFileIO, and ADLSFileIO already
override this method.
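
To illustrate why the override matters, here is a minimal sketch using stand-in types. `SketchFileIO`, `SketchInputFile`, and the request counter are hypothetical and exist only for this example; they are not the Iceberg `FileIO` interface or the OSS client, and the "remote lookup" is simulated.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for illustration; these are not the Iceberg or OSS SDK types.
final class KnownLengthSketch {
  // Counts simulated metadata (HEAD-style) requests.
  static final AtomicInteger METADATA_REQUESTS = new AtomicInteger();

  static final class SketchInputFile {
    private final String location;
    private Long length; // null means "not known yet"

    SketchInputFile(String location, Long length) {
      this.location = location;
      this.length = length;
    }

    long getLength() {
      if (length == null) {
        // Simulated remote lookup; a real client would issue a metadata request here.
        METADATA_REQUESTS.incrementAndGet();
        length = (long) location.length(); // placeholder value only
      }
      return length;
    }
  }

  interface SketchFileIO {
    SketchInputFile newInputFile(String location);

    // Default behaviour: the caller-supplied length is dropped.
    default SketchInputFile newInputFile(String location, long length) {
      return newInputFile(location);
    }
  }

  // Relies on the default, so getLength() has to look the size up.
  static final class DefaultIO implements SketchFileIO {
    @Override
    public SketchInputFile newInputFile(String location) {
      return new SketchInputFile(location, null);
    }
  }

  // Overrides the two-argument method and keeps the known length.
  static final class OverridingIO implements SketchFileIO {
    @Override
    public SketchInputFile newInputFile(String location) {
      return new SketchInputFile(location, null);
    }

    @Override
    public SketchInputFile newInputFile(String location, long length) {
      return new SketchInputFile(location, length);
    }
  }

  public static void main(String[] args) {
    new DefaultIO().newInputFile("oss://bucket/key", 42L).getLength();
    System.out.println("requests with default: " + METADATA_REQUESTS.get());
    new OverridingIO().newInputFile("oss://bucket/key", 42L).getLength();
    System.out.println("requests after override: " + METADATA_REQUESTS.get());
  }
}
```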

Generated-by: Claude Code (claude-opus-4-8)
…ache#16585)

* Parquet: Fix variant metrics crash when value column has no stats

* Fixup after 16327
…#16904)

Bumps [com.google.cloud.gcs.analytics:gcs-analytics-core](https://github.com/GoogleCloudPlatform/gcs-analytics-core) from 1.3.0 to 1.3.1.
- [Release notes](https://github.com/GoogleCloudPlatform/gcs-analytics-core/releases)
- [Changelog](https://github.com/GoogleCloudPlatform/gcs-analytics-core/blob/main/CHANGELOG.md)
- [Commits](GoogleCloudPlatform/gcs-analytics-core@v1.3.0...v1.3.1)

---
updated-dependencies:
- dependency-name: com.google.cloud.gcs.analytics:gcs-analytics-core
  dependency-version: 1.3.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…16903)

Bumps [org.openapitools:openapi-generator-gradle-plugin](https://github.com/OpenAPITools/openapi-generator) from 7.22.0 to 7.23.0.
- [Release notes](https://github.com/OpenAPITools/openapi-generator/releases)
- [Changelog](https://github.com/OpenAPITools/openapi-generator/blob/master/docs/release-summary.md)
- [Commits](OpenAPITools/openapi-generator@v7.22.0...v7.23.0)

---
updated-dependencies:
- dependency-name: org.openapitools:openapi-generator-gradle-plugin
  dependency-version: 7.23.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…6902)

Bumps [io.grpc:grpc-netty-shaded](https://github.com/grpc/grpc-java) from 1.81.0 to 1.82.0.
- [Release notes](https://github.com/grpc/grpc-java/releases)
- [Commits](grpc/grpc-java@v1.81.0...v1.82.0)

---
updated-dependencies:
- dependency-name: io.grpc:grpc-netty-shaded
  dependency-version: 1.82.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#16901)

Bumps software.amazon.awssdk:bom from 2.46.5 to 2.46.10.

---
updated-dependencies:
- dependency-name: software.amazon.awssdk:bom
  dependency-version: 2.46.10
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [com.google.errorprone:error_prone_annotations](https://github.com/google/error-prone) from 2.49.0 to 2.50.0.
- [Release notes](https://github.com/google/error-prone/releases)
- [Commits](google/error-prone@v2.49.0...v2.50.0)

---
updated-dependencies:
- dependency-name: com.google.errorprone:error_prone_annotations
  dependency-version: 2.50.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [gradle/actions](https://github.com/gradle/actions) from 5.0.2 to 6.2.0.
- [Release notes](https://github.com/gradle/actions/releases)
- [Commits](gradle/actions@0723195...3f131e8)

---
updated-dependencies:
- dependency-name: gradle/actions
  dependency-version: 6.2.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps `nessie` from 0.107.9 to 0.108.0.

Updates `org.projectnessie.nessie:nessie-client` from 0.107.9 to 0.108.0
- [Release notes](https://github.com/projectnessie/nessie/releases)
- [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md)
- [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0)

Updates `org.projectnessie.nessie:nessie-jaxrs-testextension` from 0.107.9 to 0.108.0
- [Release notes](https://github.com/projectnessie/nessie/releases)
- [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md)
- [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0)

Updates `org.projectnessie.nessie:nessie-versioned-storage-inmemory-tests` from 0.107.9 to 0.108.0
- [Release notes](https://github.com/projectnessie/nessie/releases)
- [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md)
- [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0)

Updates `org.projectnessie.nessie:nessie-versioned-storage-testextension` from 0.107.9 to 0.108.0
- [Release notes](https://github.com/projectnessie/nessie/releases)
- [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md)
- [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0)

---
updated-dependencies:
- dependency-name: org.projectnessie.nessie:nessie-client
  dependency-version: 0.108.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.projectnessie.nessie:nessie-jaxrs-testextension
  dependency-version: 0.108.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.projectnessie.nessie:nessie-versioned-storage-inmemory-tests
  dependency-version: 0.108.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.projectnessie.nessie:nessie-versioned-storage-testextension
  dependency-version: 0.108.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [pymarkdownlnt](https://github.com/jackdewinter/pymarkdown) from 0.9.37 to 0.9.38.
- [Release notes](https://github.com/jackdewinter/pymarkdown/releases)
- [Changelog](https://github.com/jackdewinter/pymarkdown/blob/main/changelog.md)
- [Commits](jackdewinter/pymarkdown@v0.9.37...v0.9.38)

---
updated-dependencies:
- dependency-name: pymarkdownlnt
  dependency-version: 0.9.38
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
)

* Flink, GCP, Kafka, Spark: Update runtime-deps.txt

* Build: Bump datamodel-code-generator from 0.60.0 to 0.63.0

Bumps [datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator) from 0.60.0 to 0.63.0.
- [Release notes](https://github.com/koxudaxi/datamodel-code-generator/releases)
- [Changelog](https://github.com/koxudaxi/datamodel-code-generator/blob/main/CHANGELOG.md)
- [Commits](koxudaxi/datamodel-code-generator@0.60.0...0.63.0)

---
updated-dependencies:
- dependency-name: datamodel-code-generator
  dependency-version: 0.63.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…pache#16924)

SparkValueConverter.convertToSpark returned a new
UnsupportedOperationException for STRUCT, LIST, and MAP instead of
throwing it, so the exception object would be passed downstream as a
value. Throw it to match the method's own default branch.
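
The difference between returning and throwing the exception can be shown with a small, self-contained sketch. `ReturnVsThrowSketch` and its `Kind` enum are hypothetical and only mimic the shape of the bug; this is not the `SparkValueConverter` code.

```java
// Hypothetical sketch of the bug's shape; not the actual SparkValueConverter.
final class ReturnVsThrowSketch {
  enum Kind { STRING, STRUCT }

  // Buggy shape: the exception object is created and handed back as if it were a converted value.
  static Object convertBuggy(Kind kind, Object value) {
    switch (kind) {
      case STRING:
        return value;
      default:
        return new UnsupportedOperationException("Unsupported type: " + kind);
    }
  }

  // Fixed shape: the exception is thrown, so the caller can never mistake it for data.
  static Object convertFixed(Kind kind, Object value) {
    switch (kind) {
      case STRING:
        return value;
      default:
        throw new UnsupportedOperationException("Unsupported type: " + kind);
    }
  }

  public static void main(String[] args) {
    Object leaked = convertBuggy(Kind.STRUCT, "row");
    System.out.println("buggy result is an exception object: " + (leaked instanceof RuntimeException));
    try {
      convertFixed(Kind.STRUCT, "row");
    } catch (UnsupportedOperationException e) {
      System.out.println("fixed version threw: " + e.getMessage());
    }
  }
}
```

Because the return type is `Object`, the compiler accepts the buggy version without complaint, which is presumably why this kind of mistake can go unnoticed.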

Remove path filters from the ASF allowlist workflow so it runs for all pull requests and pushes to main.

This surfaces upstream approved-pattern drift as a visible check failure even when a pull request does not edit workflow files.

Fixes apache#16914

Generated-by: Codex

Co-authored-by: Codex <codex@openai.com>
…value is null (apache#16826)

* kafka-connect: evolve table schema when record schema is updated but value is null

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* kafka connect: fix nested schema evolution when parent evolves, add map key evolution

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* Fix style

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* kafka connect: improve docs and testing for evolve schema when value is null

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* kafka connect: defer nested null value evolution when parent evolves, drop map key recursion

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

---------

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>
dependabot Bot and others added 16 commits June 29, 2026 09:51
…e#16989)

Bumps software.amazon.awssdk:bom from 2.46.10 to 2.46.15.

---
updated-dependencies:
- dependency-name: software.amazon.awssdk:bom
  dependency-version: 2.46.15
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The current spec defines this object as a single enum constant, not a comma-separated list of enum constants.
* CVE scan: improve PR failure reporting

Keep the Trivy scan step green so GitHub opens the reporting step that prints the actionable CVE findings, then fail PR runs from that reporting step.

Also remove the Spark 3.5 Jackson CVE ignore entries on this test branch so the PR run exercises the failure UI.

Generated-by: GPT-5 Codex

* CVE scan: format Trivy findings as a table

Parse Trivy SARIF messages into a compact table and keep the PR annotation concise so the failed reporting step is easier to read.

Generated-by: GPT-5 Codex

* CVE scan: simplify Trivy report step

Extract SARIF parsing and report rendering into small shell helpers so the report step is easier to read without changing behavior.

Generated-by: GPT-5 Codex

* CVE scan: restore Spark 3.5 Trivy ignore

Restore the Spark 3.5 Jackson CVE ignore entries that were removed only to exercise the PR failure UI.

Generated-by: GPT-5 Codex

* CVE scan: document PR and push behavior

Clarify that Trivy always writes SARIF, PRs fail from the reporting step on findings, push runs keep findings informational, and missing or unparseable SARIF is still an error.

Generated-by: GPT-5 Codex
…e#17015)

Include snapshot row-lineage fields and manifest list key IDs in BaseSnapshot equals/hashCode and toString. This keeps comparisons and hashes aligned with metadata that changes row ID assignment or manifest-list encryption key selection, and makes those values visible in debug output.
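
As a rough illustration of the idea, the sketch below includes extra fields in `equals`, `hashCode`, and `toString`. The class and the field names (`firstRowId`, `addedRows`, `keyId`) are assumptions chosen for the example and are not taken from `BaseSnapshot`.

```java
import java.util.Objects;

// Hypothetical sketch; field names are assumptions, not the BaseSnapshot implementation.
final class SnapshotEqualitySketch {
  static final class SketchSnapshot {
    private final long snapshotId;
    private final Long firstRowId; // assumed row-lineage field
    private final Long addedRows;  // assumed row-lineage field
    private final String keyId;    // assumed manifest-list key ID

    SketchSnapshot(long snapshotId, Long firstRowId, Long addedRows, String keyId) {
      this.snapshotId = snapshotId;
      this.firstRowId = firstRowId;
      this.addedRows = addedRows;
      this.keyId = keyId;
    }

    @Override
    public boolean equals(Object other) {
      if (this == other) {
        return true;
      }
      if (!(other instanceof SketchSnapshot)) {
        return false;
      }
      SketchSnapshot that = (SketchSnapshot) other;
      // Every field that changes behaviour takes part in the comparison.
      return snapshotId == that.snapshotId
          && Objects.equals(firstRowId, that.firstRowId)
          && Objects.equals(addedRows, that.addedRows)
          && Objects.equals(keyId, that.keyId);
    }

    @Override
    public int hashCode() {
      // Uses the same fields as equals so that equal objects hash alike.
      return Objects.hash(snapshotId, firstRowId, addedRows, keyId);
    }

    @Override
    public String toString() {
      return "SketchSnapshot{id=" + snapshotId + ", firstRowId=" + firstRowId
          + ", addedRows=" + addedRows + ", keyId=" + keyId + "}";
    }
  }

  public static void main(String[] args) {
    SketchSnapshot a = new SketchSnapshot(1L, 0L, 10L, "key-1");
    SketchSnapshot b = new SketchSnapshot(1L, 0L, 10L, "key-2");
    System.out.println(a + " equals " + b + ": " + a.equals(b));
  }
}
```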

Co-authored-by: Codex <codex@openai.com>
…apache#17038)

This PR fixes the CI test flakiness seen in:

```
TestConvertEqualityDeletesE2E > testConvertEqualityDeletesE2E(String) > [1] staging FAILED
    org.opentest4j.AssertionFailedError:
    expected: 2L
     but was: 1L
        at app//org.apache.iceberg.flink.maintenance.api.TestConvertEqualityDeletesE2E.lambda$testConvertEqualityDeletesE2E$1(TestConvertEqualityDeletesE2E.java:127)
```
e.g.:
https://github.com/apache/iceberg/actions/runs/28499724055/job/84473933155?pr=16293

Flink's table maintenance framework holds a lock to prevent concurrent
execution of maintenance tasks. The component responsible for removing the
lock (LockRemover) releases the task lock once it receives a watermark later
than the task's start timestamp.

EqualityConvertPlanner emits phase watermarks in the middle of its execution,
and the committer forwarded them, so the lock was released before the run
completed. The maintenance framework then started the next cycle concurrently,
and both cycles re-processed the same uncommitted staging snapshot. The
overlapping commits conflicted, and one advanced its commit marker while
dropping its deletion vector, which caused the test flakiness.

The solution is to forward watermarks from the committer only after it finishes
the conversion cycle, which ensures mutually exclusive execution of the
maintenance tasks.
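
The deferral can be pictured with a small sketch that holds back watermarks while a cycle is running. This is a simplified model under stated assumptions: `DeferredWatermarkSketch`, `startCycle`, and `finishCycle` are hypothetical names, and the real committer is a Flink operator whose API is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the fix; not the actual Flink operator code.
final class DeferredWatermarkSketch {
  static final class SketchCommitter {
    private final List<Long> forwarded = new ArrayList<>();
    private boolean cycleInProgress;
    private boolean hasPending;
    private long pendingWatermark;

    void startCycle() {
      cycleInProgress = true;
    }

    void onWatermark(long timestamp) {
      if (cycleInProgress) {
        // Hold the watermark back so the downstream lock remover cannot release the lock yet.
        pendingWatermark = hasPending ? Math.max(pendingWatermark, timestamp) : timestamp;
        hasPending = true;
      } else {
        forwarded.add(timestamp);
      }
    }

    void finishCycle() {
      cycleInProgress = false;
      if (hasPending) {
        // Only now does the highest held-back watermark travel downstream.
        forwarded.add(pendingWatermark);
        hasPending = false;
      }
    }

    List<Long> forwarded() {
      return forwarded;
    }
  }

  public static void main(String[] args) {
    SketchCommitter committer = new SketchCommitter();
    committer.startCycle();
    committer.onWatermark(10L); // phase watermark emitted mid-cycle
    committer.onWatermark(20L);
    System.out.println("forwarded during cycle: " + committer.forwarded());
    committer.finishCycle();
    System.out.println("forwarded after cycle: " + committer.forwarded());
  }
}
```

In this model only the largest held-back watermark is forwarded, which is sufficient here because a later watermark subsumes earlier ones.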
…ime-4.0_2.13 into its source and javadoc-jar