spark-runtime: publish sources and javadoc-Jar with contents of its subprojects by schabch · Pull Request #16878 · apache/iceberg

schabch · 2026-06-19T09:56:47Z

This PR addresses the already closed issue #15824. (The comments on that issue don't address the problem stated in its description.)

In my view, the build of the spark-runtime subproject should not publish empty sources and javadoc jars.
It should either publish no sources and javadoc jars at all, or publish sources and javadoc jars that contain the sources and javadoc of its subprojects.

This PR adds the sources and javadoc of the subprojects to the source and javadoc jar of spark-runtime.

* Flink: Add equality delete conversion committer

  Adds EqualityConvertCommitter, a two-input operator for the equality delete conversion task. It buffers the DVWriteResults from the parallel EqualityConvertDVWriter instances and the EqualityConvertPlan from the planner, then commits once both the plan and its done-timestamp watermark have arrived.

  The committer adds both new staging data files and the writer's merged DVs, removes the superseded DVs, and carries over the remaining staging deletes. It validates against the planner's main snapshot, so external changes on the target branch fail the commit. The commit summary records the processed staging snapshot id, which the planner reads on restart to skip already-committed snapshots, to ensure idempotency.

  The committer is responsible for sending Trigger records downstream to the TaskResultAggregator of the surrounding table maintenance framework. It emits one after every cycle, including no-op and error, so the task always completes. On an upstream abort or a commit failure it deletes the DVs written this cycle so the failure doesn't leak Puffin files.

* fixup! Rename stale DV resolver/merger references to DV writer
* fixup! Reword inaccurate watermark-absorption javadoc on the committer
* fixup! Add shared-branch DV-merge regression test for the committer

…#16870) Override newInputFile(String, long) so OSSFileIO uses the known length instead of falling back to the default that drops it. Without the override, getLength() triggers a HEAD request (getSimplifiedObjectMeta) even when the caller already knows the size. S3FileIO, GCSFileIO, and ADLSFileIO already override this method. Generated-by: Claude Code (claude-opus-4-8)
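The effect described in this commit message can be illustrated with a minimal, self-contained sketch. It does not use the real OSSFileIO or the Aliyun SDK: `KnownLengthSketch`, `SketchInputFile`, and the `metadataCalls` counter are hypothetical names, and the counter merely stands in for the HEAD request.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a FileIO; not the actual OSSFileIO implementation.
final class KnownLengthSketch {
  // Counts simulated metadata round trips (a stand-in for a HEAD request).
  static final AtomicInteger metadataCalls = new AtomicInteger();

  static final class SketchInputFile {
    private final String location;
    private Long length; // null until the length is known

    SketchInputFile(String location, Long length) {
      this.location = location;
      this.length = length;
    }

    String location() {
      return location;
    }

    long getLength() {
      if (length == null) {
        // Length unknown: simulate a remote metadata lookup.
        metadataCalls.incrementAndGet();
        length = 42L; // made-up size returned by the simulated lookup
      }
      return length;
    }
  }

  // Without a known length, getLength() has to look it up.
  static SketchInputFile newInputFile(String location) {
    return new SketchInputFile(location, null);
  }

  // With the length passed through, no lookup is needed.
  static SketchInputFile newInputFile(String location, long length) {
    return new SketchInputFile(location, length);
  }
}
```

In this sketch, calling `getLength()` on a file created with the two-argument factory never touches the counter, while the one-argument factory triggers exactly one simulated lookup.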

…#16904) Bumps [com.google.cloud.gcs.analytics:gcs-analytics-core](https://github.com/GoogleCloudPlatform/gcs-analytics-core) from 1.3.0 to 1.3.1. - [Release notes](https://github.com/GoogleCloudPlatform/gcs-analytics-core/releases) - [Changelog](https://github.com/GoogleCloudPlatform/gcs-analytics-core/blob/main/CHANGELOG.md) - [Commits](GoogleCloudPlatform/gcs-analytics-core@v1.3.0...v1.3.1) --- updated-dependencies: - dependency-name: com.google.cloud.gcs.analytics:gcs-analytics-core dependency-version: 1.3.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…16903) Bumps [org.openapitools:openapi-generator-gradle-plugin](https://github.com/OpenAPITools/openapi-generator) from 7.22.0 to 7.23.0. - [Release notes](https://github.com/OpenAPITools/openapi-generator/releases) - [Changelog](https://github.com/OpenAPITools/openapi-generator/blob/master/docs/release-summary.md) - [Commits](OpenAPITools/openapi-generator@v7.22.0...v7.23.0) --- updated-dependencies: - dependency-name: org.openapitools:openapi-generator-gradle-plugin dependency-version: 7.23.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…6902) Bumps [io.grpc:grpc-netty-shaded](https://github.com/grpc/grpc-java) from 1.81.0 to 1.82.0. - [Release notes](https://github.com/grpc/grpc-java/releases) - [Commits](grpc/grpc-java@v1.81.0...v1.82.0) --- updated-dependencies: - dependency-name: io.grpc:grpc-netty-shaded dependency-version: 1.82.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…#16901) Bumps software.amazon.awssdk:bom from 2.46.5 to 2.46.10. --- updated-dependencies: - dependency-name: software.amazon.awssdk:bom dependency-version: 2.46.10 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [com.google.errorprone:error_prone_annotations](https://github.com/google/error-prone) from 2.49.0 to 2.50.0. - [Release notes](https://github.com/google/error-prone/releases) - [Commits](google/error-prone@v2.49.0...v2.50.0) --- updated-dependencies: - dependency-name: com.google.errorprone:error_prone_annotations dependency-version: 2.50.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [gradle/actions](https://github.com/gradle/actions) from 5.0.2 to 6.2.0. - [Release notes](https://github.com/gradle/actions/releases) - [Commits](gradle/actions@0723195...3f131e8) --- updated-dependencies: - dependency-name: gradle/actions dependency-version: 6.2.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps `nessie` from 0.107.9 to 0.108.0. Updates `org.projectnessie.nessie:nessie-client` from 0.107.9 to 0.108.0 - [Release notes](https://github.com/projectnessie/nessie/releases) - [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md) - [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0) Updates `org.projectnessie.nessie:nessie-jaxrs-testextension` from 0.107.9 to 0.108.0 - [Release notes](https://github.com/projectnessie/nessie/releases) - [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md) - [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0) Updates `org.projectnessie.nessie:nessie-versioned-storage-inmemory-tests` from 0.107.9 to 0.108.0 - [Release notes](https://github.com/projectnessie/nessie/releases) - [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md) - [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0) Updates `org.projectnessie.nessie:nessie-versioned-storage-testextension` from 0.107.9 to 0.108.0 - [Release notes](https://github.com/projectnessie/nessie/releases) - [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md) - [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0) --- updated-dependencies: - dependency-name: org.projectnessie.nessie:nessie-client dependency-version: 0.108.0 dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.projectnessie.nessie:nessie-jaxrs-testextension dependency-version: 0.108.0 dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.projectnessie.nessie:nessie-versioned-storage-inmemory-tests dependency-version: 0.108.0 dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.projectnessie.nessie:nessie-versioned-storage-testextension dependency-version: 0.108.0 dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [pymarkdownlnt](https://github.com/jackdewinter/pymarkdown) from 0.9.37 to 0.9.38. - [Release notes](https://github.com/jackdewinter/pymarkdown/releases) - [Changelog](https://github.com/jackdewinter/pymarkdown/blob/main/changelog.md) - [Commits](jackdewinter/pymarkdown@v0.9.37...v0.9.38) --- updated-dependencies: - dependency-name: pymarkdownlnt dependency-version: 0.9.38 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

) * Flink, GCP, Kafka, Spark: Update runtime-deps.txt * Build: Bump datamodel-code-generator from 0.60.0 to 0.63.0 Bumps [datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator) from 0.60.0 to 0.63.0. - [Release notes](https://github.com/koxudaxi/datamodel-code-generator/releases) - [Changelog](https://github.com/koxudaxi/datamodel-code-generator/blob/main/CHANGELOG.md) - [Commits](koxudaxi/datamodel-code-generator@0.60.0...0.63.0) --- updated-dependencies: - dependency-name: datamodel-code-generator dependency-version: 0.63.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…pache#16924) SparkValueConverter.convertToSpark returned a new UnsupportedOperationException for STRUCT, LIST, and MAP instead of throwing it, so the exception object would be passed downstream as a value. Throw it to match the method's own default branch.
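The difference between returning and throwing the exception can be shown with a small sketch. This is not the actual SparkValueConverter code: `ConvertSketch`, its `Kind` enum, and both methods are hypothetical and only mirror the shape of the bug described above.

```java
// Hypothetical stand-in for the converter; names and types are illustrative only.
final class ConvertSketch {
  enum Kind { INT, STRUCT, LIST, MAP }

  // Buggy shape: the exception is constructed and returned like any other value,
  // so the caller receives an exception object instead of a failure.
  static Object convertReturning(Kind kind, Object value) {
    switch (kind) {
      case INT:
        return value;
      case STRUCT:
      case LIST:
      case MAP:
        return new UnsupportedOperationException("Complex types not supported: " + kind);
      default:
        throw new UnsupportedOperationException("Unsupported type: " + kind);
    }
  }

  // Fixed shape: the complex-type branches throw, matching the default branch.
  static Object convertThrowing(Kind kind, Object value) {
    switch (kind) {
      case INT:
        return value;
      case STRUCT:
      case LIST:
      case MAP:
        throw new UnsupportedOperationException("Complex types not supported: " + kind);
      default:
        throw new UnsupportedOperationException("Unsupported type: " + kind);
    }
  }
}
```

Both variants compile because the declared return type is `Object`, which is why a mistake of this kind is easy to miss without a test that asserts on the thrown exception.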

Remove path filters from the ASF allowlist workflow so it runs for all pull requests and pushes to main. This surfaces upstream approved-pattern drift as a visible check failure even when a pull request does not edit workflow files. Fixes apache#16914 Generated-by: Codex Co-authored-by: Codex <codex@openai.com>

…value is null (apache#16826)

* kafka-connect: evolve table schema when record schema is updated but value is null
* kafka connect: fix nested schema evolution when parent evolves, add map key evolution
* Fix style
* kafka connect: improve docs and testing for evolve schema when value is null
* kafka connect: defer nested null value evolution when parent evolves, drop map key recursion

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

…e#16989) Bumps software.amazon.awssdk:bom from 2.46.10 to 2.46.15. --- updated-dependencies: - dependency-name: software.amazon.awssdk:bom dependency-version: 2.46.15 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* CVE scan: improve PR failure reporting

  Keep the Trivy scan step green so GitHub opens the reporting step that prints the actionable CVE findings, then fail PR runs from that reporting step. Also remove the Spark 3.5 Jackson CVE ignore entries on this test branch so the PR run exercises the failure UI. Generated-by: GPT-5 Codex

* CVE scan: format Trivy findings as a table

  Parse Trivy SARIF messages into a compact table and keep the PR annotation concise so the failed reporting step is easier to read. Generated-by: GPT-5 Codex

* CVE scan: simplify Trivy report step

  Extract SARIF parsing and report rendering into small shell helpers so the report step is easier to read without changing behavior. Generated-by: GPT-5 Codex

* CVE scan: restore Spark 3.5 Trivy ignore

  Restore the Spark 3.5 Jackson CVE ignore entries that were removed only to exercise the PR failure UI. Generated-by: GPT-5 Codex

* CVE scan: document PR and push behavior

  Clarify that Trivy always writes SARIF, PRs fail from the reporting step on findings, push runs keep findings informational, and missing or unparseable SARIF is still an error. Generated-by: GPT-5 Codex

…e#17015) Include snapshot row-lineage fields and manifest list key IDs in BaseSnapshot equals/hashCode and toString. This keeps comparisons and hashes aligned with metadata that changes row ID assignment or manifest-list encryption key selection, and makes those values visible in debug output. Co-authored-by: Codex <codex@openai.com>
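As a rough illustration of why such fields belong in `equals`, `hashCode`, and `toString`, consider the following sketch. It is not the real BaseSnapshot: `SnapshotSketch` and its fields `firstRowId` and `keyId` are hypothetical names chosen to stand in for a row-lineage field and a manifest list key ID.

```java
import java.util.Objects;

// Hypothetical stand-in for a snapshot class; field names are assumptions.
final class SnapshotSketch {
  private final long snapshotId;
  private final Long firstRowId; // stand-in for a row-lineage field
  private final String keyId;    // stand-in for a manifest list key ID

  SnapshotSketch(long snapshotId, Long firstRowId, String keyId) {
    this.snapshotId = snapshotId;
    this.firstRowId = firstRowId;
    this.keyId = keyId;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof SnapshotSketch)) {
      return false;
    }
    SnapshotSketch that = (SnapshotSketch) other;
    // All three fields take part, so snapshots that differ only in
    // lineage or key metadata are no longer considered equal.
    return snapshotId == that.snapshotId
        && Objects.equals(firstRowId, that.firstRowId)
        && Objects.equals(keyId, that.keyId);
  }

  @Override
  public int hashCode() {
    // Must use the same fields as equals to keep the contract consistent.
    return Objects.hash(snapshotId, firstRowId, keyId);
  }

  @Override
  public String toString() {
    return "SnapshotSketch{id=" + snapshotId
        + ", firstRowId=" + firstRowId
        + ", keyId=" + keyId + "}";
  }
}
```

Two instances with the same `snapshotId` but different `firstRowId` values compare as unequal here, and the differing value shows up in `toString` output.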

…apache#17038) This PR fixes the CI test flakiness seen in:

```
TestConvertEqualityDeletesE2E > testConvertEqualityDeletesE2E(String) > [1] staging FAILED
    org.opentest4j.AssertionFailedError: expected: 2L but was: 1L
        at app//org.apache.iceberg.flink.maintenance.api.TestConvertEqualityDeletesE2E.lambda$testConvertEqualityDeletesE2E$1(TestConvertEqualityDeletesE2E.java:127)
```

e.g.: https://github.com/apache/iceberg/actions/runs/28499724055/job/84473933155?pr=16293

Flink's table maintenance framework maintains a lock to prevent concurrent execution of maintenance tasks. The component responsible for removing the lock (LockRemover) releases the task lock once a watermark past the task's start timestamp reaches it. EqualityConvertPlanner emits phase watermarks in the middle of its execution, and the committer forwarded them, so the lock was released before the run completed. The maintenance framework then started the next cycle concurrently, and both cycles re-processed the same uncommitted staging snapshot. The overlapping commits conflicted, and one advanced its commit marker while dropping its deletion vector, causing the test flakiness.

The solution is to forward watermarks from the committer only after it finishes the conversion cycle, to ensure mutually exclusive execution of the maintenance tasks.
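The hold-back idea described above can be sketched without any Flink dependency. `WatermarkHoldBackSketch` and its methods are hypothetical names, and a plain list stands in for the downstream output; the actual committer operator works differently in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Flink-free model of holding back watermarks during a cycle.
final class WatermarkHoldBackSketch {
  private final List<Long> emitted = new ArrayList<>(); // stand-in for downstream output
  private long pending = Long.MIN_VALUE;                // highest watermark seen mid-cycle
  private boolean cycleRunning = false;

  void startCycle() {
    cycleRunning = true;
  }

  void onWatermark(long timestamp) {
    if (cycleRunning) {
      // Mid-cycle watermarks are remembered but not forwarded,
      // so a downstream lock remover cannot release the lock early.
      pending = Math.max(pending, timestamp);
    } else {
      emitted.add(timestamp);
    }
  }

  void finishCycle() {
    cycleRunning = false;
    if (pending != Long.MIN_VALUE) {
      // Forward the held-back watermark only once the cycle is done.
      emitted.add(pending);
      pending = Long.MIN_VALUE;
    }
  }

  List<Long> emitted() {
    return emitted;
  }
}
```

In this model nothing reaches the output between `startCycle()` and `finishCycle()`, which corresponds to keeping the task lock held until the conversion cycle has completed.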

schabch and others added 2 commits June 19, 2026 11:40

apache#15824 Include sources and Javadoc of subprojects of spark-runt…

f26e71c

…ime-4.0_2.13 into its source and javadoc-jar

Merge branch 'apache:main' into main

8299feb

github-actions Bot added spark INFRA build labels Jun 19, 2026

schabch changed the title ~~Publish sources and javadoc-Jar of shadow jar spark-runtime.jar with contents of its subprojects~~ spark-runtime: publish sources and javadoc-Jar with contents of its subprojects Jun 19, 2026

Fezinvirtual and others added 24 commits June 19, 2026 05:53

Build: Remove duplicate jackson.databind declaration (apache#16879)

40175bf

The jackson.databind dependency appears twice in the dependencies closure. This change removes the duplicated line. It does not introduce any functional or behavioral changes to the Kafka Connect connector.

Flink: Backport: Add equality delete conversion DV resolution and wri…

722a073

…ting (apache#16858) (apache#16869)

Docs: Clarify geography type serialization (apache#16799)

41c8ee4

Flink: Backport: Add equality delete conversion committer (apache#16874…

2a2d5c0

…) (apache#16888)

Parquet: Fix variant metrics crash when value column has no stats (ap…

56b1e19

…ache#16585) * Parquet: Fix variant metrics crash when value column has no stats * Fixup after 16327

Docs: Clarify test method naming guidance (apache#16866)

0b30919

Spark 3.5, 4.0: Throw on unsupported complex types in SparkValueConve…

fb6bb97

…rter (apache#16925)

Revert "Spark: Spark tests cache rewrite input (apache#16740)" (apach…

902ed1b

…e#16912) This reverts commit e5151f3.

Core: Introduce builder for TrackedFile (apache#16769)

7c13104

ORC: Fix lower/upper bounds for timestamp_ns columns in OrcMetrics (a…

ef07755

…pache#16922)

dependabot Bot and others added 16 commits June 29, 2026 09:51

Spec: Add spec for expressions (apache#16652)

a2fb64b

Core, Spark: Migrate Spark table properties to Spark module (apache#1…

8c1ee9d

…5875)

REST: Fix schema of data-access object in REST spec (apache#16594)

be27af4

The current spec defines this object as a single enum constant, not a comma-separated list of enum constants.

API: Add indexStatsNames to create field names for content stats (apa…

e407470

…che#17010)

Core: Remove unused TestTrackedFileStruct.CONTENT_STATS_ORDINAL (apac…

41113b1

…he#17031)

Spec: Add optional specific-name to UDF definition model (apache#16727)

d6aa0bf

Docs: Document nightly snapshots (apache#16544)

da8ff44

Parquet: Read and write geometry and geography WKB values (apache#16982)

744e811

Spark 4.1: Map geo Spark types (apache#16851)

035fc1e

Flink: Backport: Hold back equality delete converter watermark until …

11706a2

…completion (apache#17038) (apache#17067)

apache#15824 Include sources and Javadoc of subprojects of spark-runt…

8af1339

…ime-4.0_2.13 into its source and javadoc-jar

Merge remote-tracking branch 'origin/main'

b2a5aa4

github-actions Bot added API parquet core data flink ORC docs AWS Specification ALIYUN GCP OPENAPI AZURE KAFKACONNECT labels Jul 3, 2026

spark-runtime: publish sources and javadoc-Jar with contents of its subprojects #16878
schabch wants to merge 75 commits into apache:main from schabch:main

20 participants
