spark-runtime: publish sources and javadoc-Jar with contents of its subprojects#16878
Open
schabch wants to merge 75 commits into
…ime-4.0_2.13 into its source and javadoc-jar
The jackson.databind dependency appears twice in the dependencies closure. This change removes the duplicated line. It does not introduce any functional or behavioral changes to the Kafka Connect connector.
* Flink: Add equality delete conversion committer

Adds EqualityConvertCommitter, a two input operator for the equality delete conversion task. It buffers the DVWriteResults from the parallel EqualityConvertDVWriter instances and the EqualityConvertPlan from the planner, then commits once both the plan and its done-timestamp watermark have arrived.

The committer adds both new staging data files and the writer's merged DVs, removes the superseded DVs, and carries over the remaining staging deletes. It validates against the planner's main snapshot, so external changes on the target branch fail the commit. The commit summary records the processed staging snapshot id, which the planner reads on restart to skip already-committed snapshots, to ensure idempotency.

The committer is responsible for sending Trigger records downstream to the TaskResultAggregator of the surrounding table maintenance framework. It emits one after every cycle, including no-op and error, so the task always completes. On an upstream abort or a commit failure it deletes the DVs written this cycle so the failure doesn't leak Puffin files.

* fixup! Rename stale DV resolver/merger references to DV writer
* fixup! Reword inaccurate watermark-absorption javadoc on the committer
* fixup! Add shared-branch DV-merge regression test for the committer
…#16870) Override newInputFile(String, long) so OSSFileIO uses the known length instead of falling back to the default that drops it. Without the override, getLength() triggers a HEAD request (getSimplifiedObjectMeta) even when the caller already knows the size. S3FileIO, GCSFileIO, and ADLSFileIO already override this method. Generated-by: Claude Code (claude-opus-4-8)
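The effect of the override described above can be sketched with a small, self-contained model. This is not the OSSFileIO code: `FileIO`, `InputFile`, `RemoteStore`, and the two IO classes below are hypothetical stand-ins, and the interface default that drops the length is modeled on the commit message's description. The counter plays the role of the HEAD request.

```java
public class KnownLengthSketch {
  /** Hypothetical stand-in for an input file handle. */
  public interface InputFile {
    long getLength();
  }

  /** Hypothetical stand-in for a file IO interface whose default drops the length. */
  public interface FileIO {
    InputFile newInputFile(String path);

    default InputFile newInputFile(String path, long length) {
      return newInputFile(path); // the known length is discarded here
    }
  }

  /** Fake object store that counts metadata (HEAD-like) requests. */
  public static class RemoteStore {
    public int metadataRequests = 0;

    long fetchLength(String path) {
      metadataRequests++;
      return 1024L;
    }
  }

  static class RemoteInputFile implements InputFile {
    private final RemoteStore store;
    private final String path;
    private Long length; // null until known

    RemoteInputFile(RemoteStore store, String path, Long length) {
      this.store = store;
      this.path = path;
      this.length = length;
    }

    @Override
    public long getLength() {
      if (length == null) {
        length = store.fetchLength(path); // only reached when the caller gave no length
      }
      return length;
    }
  }

  /** Relies on the interface default, so the length is always re-fetched. */
  public static class DefaultLengthIO implements FileIO {
    protected final RemoteStore store;

    public DefaultLengthIO(RemoteStore store) {
      this.store = store;
    }

    @Override
    public InputFile newInputFile(String path) {
      return new RemoteInputFile(store, path, null);
    }
  }

  /** Overrides the two-argument method so the known length is kept. */
  public static class KnownLengthIO extends DefaultLengthIO {
    public KnownLengthIO(RemoteStore store) {
      super(store);
    }

    @Override
    public InputFile newInputFile(String path, long length) {
      return new RemoteInputFile(store, path, length);
    }
  }

  public static void main(String[] args) {
    RemoteStore a = new RemoteStore();
    new DefaultLengthIO(a).newInputFile("oss://bucket/file", 1024L).getLength();
    RemoteStore b = new RemoteStore();
    new KnownLengthIO(b).newInputFile("oss://bucket/file", 1024L).getLength();
    System.out.println(a.metadataRequests + " vs " + b.metadataRequests); // prints "1 vs 0"
  }
}
```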
…ache#16585) * Parquet: Fix variant metrics crash when value column has no stats * Fixup after 16327
…#16904) Bumps [com.google.cloud.gcs.analytics:gcs-analytics-core](https://github.com/GoogleCloudPlatform/gcs-analytics-core) from 1.3.0 to 1.3.1. - [Release notes](https://github.com/GoogleCloudPlatform/gcs-analytics-core/releases) - [Changelog](https://github.com/GoogleCloudPlatform/gcs-analytics-core/blob/main/CHANGELOG.md) - [Commits](GoogleCloudPlatform/gcs-analytics-core@v1.3.0...v1.3.1) --- updated-dependencies: - dependency-name: com.google.cloud.gcs.analytics:gcs-analytics-core dependency-version: 1.3.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…16903) Bumps [org.openapitools:openapi-generator-gradle-plugin](https://github.com/OpenAPITools/openapi-generator) from 7.22.0 to 7.23.0. - [Release notes](https://github.com/OpenAPITools/openapi-generator/releases) - [Changelog](https://github.com/OpenAPITools/openapi-generator/blob/master/docs/release-summary.md) - [Commits](OpenAPITools/openapi-generator@v7.22.0...v7.23.0) --- updated-dependencies: - dependency-name: org.openapitools:openapi-generator-gradle-plugin dependency-version: 7.23.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…6902) Bumps [io.grpc:grpc-netty-shaded](https://github.com/grpc/grpc-java) from 1.81.0 to 1.82.0. - [Release notes](https://github.com/grpc/grpc-java/releases) - [Commits](grpc/grpc-java@v1.81.0...v1.82.0) --- updated-dependencies: - dependency-name: io.grpc:grpc-netty-shaded dependency-version: 1.82.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#16901) Bumps software.amazon.awssdk:bom from 2.46.5 to 2.46.10. --- updated-dependencies: - dependency-name: software.amazon.awssdk:bom dependency-version: 2.46.10 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [com.google.errorprone:error_prone_annotations](https://github.com/google/error-prone) from 2.49.0 to 2.50.0. - [Release notes](https://github.com/google/error-prone/releases) - [Commits](google/error-prone@v2.49.0...v2.50.0) --- updated-dependencies: - dependency-name: com.google.errorprone:error_prone_annotations dependency-version: 2.50.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [gradle/actions](https://github.com/gradle/actions) from 5.0.2 to 6.2.0. - [Release notes](https://github.com/gradle/actions/releases) - [Commits](gradle/actions@0723195...3f131e8) --- updated-dependencies: - dependency-name: gradle/actions dependency-version: 6.2.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps `nessie` from 0.107.9 to 0.108.0. Updates `org.projectnessie.nessie:nessie-client` from 0.107.9 to 0.108.0 - [Release notes](https://github.com/projectnessie/nessie/releases) - [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md) - [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0) Updates `org.projectnessie.nessie:nessie-jaxrs-testextension` from 0.107.9 to 0.108.0 - [Release notes](https://github.com/projectnessie/nessie/releases) - [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md) - [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0) Updates `org.projectnessie.nessie:nessie-versioned-storage-inmemory-tests` from 0.107.9 to 0.108.0 - [Release notes](https://github.com/projectnessie/nessie/releases) - [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md) - [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0) Updates `org.projectnessie.nessie:nessie-versioned-storage-testextension` from 0.107.9 to 0.108.0 - [Release notes](https://github.com/projectnessie/nessie/releases) - [Changelog](https://github.com/projectnessie/nessie/blob/main/CHANGELOG.md) - [Commits](projectnessie/nessie@nessie-0.107.9...nessie-0.108.0) --- updated-dependencies: - dependency-name: org.projectnessie.nessie:nessie-client dependency-version: 0.108.0 dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.projectnessie.nessie:nessie-jaxrs-testextension dependency-version: 0.108.0 dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.projectnessie.nessie:nessie-versioned-storage-inmemory-tests dependency-version: 0.108.0 dependency-type: direct:production update-type: version-update:semver-minor - dependency-name: org.projectnessie.nessie:nessie-versioned-storage-testextension dependency-version: 0.108.0 dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [pymarkdownlnt](https://github.com/jackdewinter/pymarkdown) from 0.9.37 to 0.9.38. - [Release notes](https://github.com/jackdewinter/pymarkdown/releases) - [Changelog](https://github.com/jackdewinter/pymarkdown/blob/main/changelog.md) - [Commits](jackdewinter/pymarkdown@v0.9.37...v0.9.38) --- updated-dependencies: - dependency-name: pymarkdownlnt dependency-version: 0.9.38 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
) * Flink, GCP, Kafka, Spark: Update runtime-deps.txt * Build: Bump datamodel-code-generator from 0.60.0 to 0.63.0 Bumps [datamodel-code-generator](https://github.com/koxudaxi/datamodel-code-generator) from 0.60.0 to 0.63.0. - [Release notes](https://github.com/koxudaxi/datamodel-code-generator/releases) - [Changelog](https://github.com/koxudaxi/datamodel-code-generator/blob/main/CHANGELOG.md) - [Commits](koxudaxi/datamodel-code-generator@0.60.0...0.63.0) --- updated-dependencies: - dependency-name: datamodel-code-generator dependency-version: 0.63.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…pache#16924) SparkValueConverter.convertToSpark returned a new UnsupportedOperationException for STRUCT, LIST, and MAP instead of throwing it, so the exception object would be passed downstream as a value. Throw it to match the method's own default branch.
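The difference between returning and throwing the exception can be illustrated with a minimal sketch. The `Kind` enum and both `convert` methods are hypothetical and only mirror the shape of the bug described above; they are not the SparkValueConverter code.

```java
public class ReturnVsThrowSketch {
  /** Hypothetical type tags; the real converter switches on Iceberg type IDs. */
  public enum Kind { INT, STRUCT, UNKNOWN }

  /** Buggy shape: the exception object is handed back as if it were a converted value. */
  public static Object convertBuggy(Kind kind, Object value) {
    switch (kind) {
      case INT:
        return value;
      case STRUCT:
        return new UnsupportedOperationException("Complex types are not supported");
      default:
        throw new UnsupportedOperationException("Unsupported type: " + kind);
    }
  }

  /** Fixed shape: the unsupported branch throws, matching the default branch. */
  public static Object convertFixed(Kind kind, Object value) {
    switch (kind) {
      case INT:
        return value;
      case STRUCT:
        throw new UnsupportedOperationException("Complex types are not supported");
      default:
        throw new UnsupportedOperationException("Unsupported type: " + kind);
    }
  }

  public static void main(String[] args) {
    // The buggy variant silently produces an exception instance as data.
    Object leaked = convertBuggy(Kind.STRUCT, "row");
    System.out.println(leaked instanceof UnsupportedOperationException); // prints "true"

    // The fixed variant fails at the call site instead.
    try {
      convertFixed(Kind.STRUCT, "row");
    } catch (UnsupportedOperationException e) {
      System.out.println("thrown: " + e.getMessage());
    }
  }
}
```

Because the buggy variant still compiles when the return type is `Object`, this kind of mistake is easy to miss without a test that exercises the unsupported branch.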
Remove path filters from the ASF allowlist workflow so it runs for all pull requests and pushes to main. This surfaces upstream approved-pattern drift as a visible check failure even when a pull request does not edit workflow files. Fixes apache#16914 Generated-by: Codex Co-authored-by: Codex <codex@openai.com>
…value is null (apache#16826)

* kafka-connect: evolve table schema when record schema is updated but value is null

  Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* kafka connect: fix nested schema evolution when parent evolves, add map key evolution

  Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* Fix style

  Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* kafka connect: improve docs and testing for evolve schema when value is null

  Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

* kafka connect: defer nested null value evolution when parent evolves, drop map key recursion

  Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>

---------

Signed-off-by: Thomas Thornton <thomaswilliamthornton@gmail.com>
…e#16989) Bumps software.amazon.awssdk:bom from 2.46.10 to 2.46.15. --- updated-dependencies: - dependency-name: software.amazon.awssdk:bom dependency-version: 2.46.15 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The current spec defines this object as a single enum constant, not a comma-separated list of enum constants.
* CVE scan: improve PR failure reporting

  Keep the Trivy scan step green so GitHub opens the reporting step that prints the actionable CVE findings, then fail PR runs from that reporting step. Also remove the Spark 3.5 Jackson CVE ignore entries on this test branch so the PR run exercises the failure UI.

  Generated-by: GPT-5 Codex

* CVE scan: format Trivy findings as a table

  Parse Trivy SARIF messages into a compact table and keep the PR annotation concise so the failed reporting step is easier to read.

  Generated-by: GPT-5 Codex

* CVE scan: simplify Trivy report step

  Extract SARIF parsing and report rendering into small shell helpers so the report step is easier to read without changing behavior.

  Generated-by: GPT-5 Codex

* CVE scan: restore Spark 3.5 Trivy ignore

  Restore the Spark 3.5 Jackson CVE ignore entries that were removed only to exercise the PR failure UI.

  Generated-by: GPT-5 Codex

* CVE scan: document PR and push behavior

  Clarify that Trivy always writes SARIF, PRs fail from the reporting step on findings, push runs keep findings informational, and missing or unparseable SARIF is still an error.

  Generated-by: GPT-5 Codex
…e#17015) Include snapshot row-lineage fields and manifest list key IDs in BaseSnapshot equals/hashCode and toString. This keeps comparisons and hashes aligned with metadata that changes row ID assignment or manifest-list encryption key selection, and makes those values visible in debug output. Co-authored-by: Codex <codex@openai.com>
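As a rough illustration of why these fields belong in `equals`/`hashCode`, the sketch below uses a hypothetical `Snap` class. The field names `firstRowId`, `addedRows`, and `keyId` are assumptions chosen for the example and are not taken from BaseSnapshot.

```java
import java.util.Objects;

public class SnapshotEqualitySketch {
  /** Hypothetical, heavily reduced snapshot; the real class carries many more fields. */
  public static final class Snap {
    private final long snapshotId;
    private final Long firstRowId; // assumed row-lineage field
    private final Long addedRows;  // assumed row-lineage field
    private final String keyId;    // assumed manifest list key ID

    public Snap(long snapshotId, Long firstRowId, Long addedRows, String keyId) {
      this.snapshotId = snapshotId;
      this.firstRowId = firstRowId;
      this.addedRows = addedRows;
      this.keyId = keyId;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Snap)) {
        return false;
      }
      Snap other = (Snap) o;
      // Every field that affects row ID assignment or key selection takes part.
      return snapshotId == other.snapshotId
          && Objects.equals(firstRowId, other.firstRowId)
          && Objects.equals(addedRows, other.addedRows)
          && Objects.equals(keyId, other.keyId);
    }

    @Override
    public int hashCode() {
      return Objects.hash(snapshotId, firstRowId, addedRows, keyId);
    }

    @Override
    public String toString() {
      return "Snap(id=" + snapshotId + ", firstRowId=" + firstRowId
          + ", addedRows=" + addedRows + ", keyId=" + keyId + ")";
    }
  }

  public static void main(String[] args) {
    Snap a = new Snap(1L, 0L, 10L, "key-1");
    Snap b = new Snap(1L, 100L, 10L, "key-1"); // same ID, different first row ID
    System.out.println(a.equals(b)); // prints "false"
    System.out.println(a);
  }
}
```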
…apache#17038)

This PR fixes the CI test flakiness seen in:

```
TestConvertEqualityDeletesE2E > testConvertEqualityDeletesE2E(String) > [1] staging FAILED
    org.opentest4j.AssertionFailedError: expected: 2L but was: 1L
        at app//org.apache.iceberg.flink.maintenance.api.TestConvertEqualityDeletesE2E.lambda$testConvertEqualityDeletesE2E$1(TestConvertEqualityDeletesE2E.java:127)
```

e.g.: https://github.com/apache/iceberg/actions/runs/28499724055/job/84473933155?pr=16293

Flink's table maintenance framework maintains a lock to prevent concurrent execution of maintenance tasks. The component responsible for removing the lock (LockRemover) releases the task lock once a watermark past the task's start timestamp reaches it. EqualityConvertPlanner emits phase watermarks in the middle of its execution, and the committer forwarded them, so the lock was released before the run completed. The maintenance framework then started the next cycle concurrently, and both cycles re-processed the same uncommitted staging snapshot. The overlapping commits conflicted, and one advanced its commit marker while dropping its deletion vector, causing the test flakiness.

The solution is to forward watermarks from the committer only after it finishes the conversion cycle, which ensures mutually exclusive execution of the maintenance tasks.
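The idea of holding watermarks back until the cycle is complete can be sketched without any Flink dependency. The class below is a hypothetical model, not the committer's code: `onWatermark`, `finishCycle`, and the `forwarded` list are assumed names that stand in for the operator's watermark handling and its downstream output.

```java
import java.util.ArrayList;
import java.util.List;

public class DeferredWatermarkSketch {
  // Stands in for what a downstream lock remover would observe.
  private final List<Long> forwarded = new ArrayList<>();
  private long pending = Long.MIN_VALUE;
  private boolean hasPending = false;

  /** Called for every incoming watermark; nothing is emitted downstream yet. */
  public void onWatermark(long timestamp) {
    pending = Math.max(pending, timestamp);
    hasPending = true;
  }

  /** Called once the cycle (commit, no-op, or error handling) is done. */
  public void finishCycle() {
    if (hasPending) {
      forwarded.add(pending); // only now can the lock be released downstream
      hasPending = false;
    }
  }

  public List<Long> forwarded() {
    return forwarded;
  }

  public static void main(String[] args) {
    DeferredWatermarkSketch committer = new DeferredWatermarkSketch();
    committer.onWatermark(10L); // mid-run phase watermark
    committer.onWatermark(20L); // done-timestamp watermark
    System.out.println(committer.forwarded()); // prints "[]"
    committer.finishCycle();
    System.out.println(committer.forwarded()); // prints "[20]"
  }
}
```

In this model a downstream consumer never sees a watermark while a cycle is still running, which is the property the fix relies on.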
…ime-4.0_2.13 into its source and javadoc-jar
This PR addresses the already closed issue #15824. (The comments on that issue do not address the problem stated in its description.)
In my view, the build of the spark-runtime subproject should not publish empty sources and javadoc jars.
It should either publish no sources and javadoc jars at all, or publish sources and javadoc jars that contain the sources and javadoc of its subprojects.
This PR adds the sources and javadoc of the subprojects to the sources and javadoc jars of spark-runtime.