ci: scope Spark SQL trigger paths to per-version shims and diff #4415

Open
andygrove wants to merge 1 commit into
apache:main from
andygrove:ci/tighten-spark-sql-triggers

@andygrove
Member

Summary

Tighten the path allow-list on each spark_sql_test_<v>.yml so a change confined to one Spark version no longer fans out to the others.

Each workflow currently triggers on spark/src/main/** and dev/diffs/**. Today that means:

  • editing spark/src/main/spark-4.1/ runs the 3.5 and 4.0 workflows for no reason
  • editing dev/diffs/3.4.3.diff runs the 4.0 workflow for no reason
  • editing dev/diffs/iceberg/*.diff runs all four Spark SQL workflows for no reason
  • if we ever add a spark/src/main/spark-4.3/ shim it will run the 3.x workflows for no reason

The change is purely to the paths: filters; job logic is untouched.

Per-version mapping (from pom.xml)

| Profile | Shim dirs that apply | Diff |
|---|---|---|
| spark-3.4 | spark-3.4, spark-3.x | dev/diffs/3.4.3.diff |
| spark-3.5 | spark-3.5, spark-3.x | dev/diffs/3.5.8.diff |
| spark-4.0 | spark-4.0, spark-4.x | dev/diffs/4.0.2.diff |
| spark-4.1 | spark-4.1, spark-4.x | dev/diffs/4.1.1.diff |

Each workflow keeps spark/src/main/** (so unrelated Java/Scala/resources still trigger) but adds !-exclusions for the shim dirs that don't apply, and replaces dev/diffs/** with the single applicable diff file.

Both push: and pull_request: filters are updated where present (the 3.4 and 4.1 workflows only have push: paths, since their PR trigger is label-based).
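
For illustration, here is a minimal sketch of what the tightened filters could look like for the spark-3.5 workflow, based only on the mapping table above. The exact patterns in the PR may differ (for example, it may use a glob rather than listing each directory), and the real workflow likely lists other trigger paths that are omitted here. It relies on the standard GitHub Actions behaviour that a `!` pattern must come after the positive pattern it narrows.

```yaml
# Sketch only: assumed shape of the 3.5 workflow's trigger filters.
# Shim dirs that apply to spark-3.5 (spark-3.5, spark-3.x) are NOT excluded.
on:
  push:
    paths:
      - "spark/src/main/**"              # shared Java/Scala/resources still trigger
      - "!spark/src/main/spark-3.4/**"   # shims for other versions are excluded
      - "!spark/src/main/spark-4.0/**"
      - "!spark/src/main/spark-4.1/**"
      - "!spark/src/main/spark-4.2/**"
      - "!spark/src/main/spark-4.x/**"
      - "dev/diffs/3.5.8.diff"           # replaces dev/diffs/**
  pull_request:
    paths:
      - "spark/src/main/**"
      - "!spark/src/main/spark-3.4/**"
      - "!spark/src/main/spark-4.0/**"
      - "!spark/src/main/spark-4.1/**"
      - "!spark/src/main/spark-4.2/**"
      - "!spark/src/main/spark-4.x/**"
      - "dev/diffs/3.5.8.diff"
```

With filters like these, a PR that touches only spark/src/main/spark-4.1/ or dev/diffs/iceberg/ would match no remaining positive pattern and the 3.5 workflow would be skipped, while a change to both a 4.1 shim and shared code would still trigger it.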

Test plan

  • actionlint passes on all four workflow files
  • After merge, edit only spark/src/main/spark-4.1/... in a follow-up PR and confirm only the 4.1 workflow is queued
  • After merge, edit only dev/diffs/3.5.8.diff and confirm only the 3.5 workflow is queued
  • After merge, edit only dev/diffs/iceberg/1.10.0.diff and confirm none of the Spark SQL workflows are queued

Each spark_sql_test_<v>.yml currently triggers on `spark/src/main/**`
and `dev/diffs/**`, so a change confined to one Spark version's shim
or diff still fans out and runs the other versions' jobs.

Tighten the path allow-list per version:
- exclude unrelated `spark/src/main/spark-3.4/`, `spark-3.5/`, `spark-3.x/`,
  `spark-4.0/`, `spark-4.1/`, `spark-4.2/`, `spark-4.x/` directories so a
  3.4-only shim edit never fires the 4.x workflows, a 4.1-only shim edit
  never fires the 3.x workflows, and a future `spark-4.3/` shim won't
  trigger any 3.x workflow either
- replace `dev/diffs/**` with the single `dev/diffs/<full-version>.diff`
  the workflow actually applies, which also stops `dev/diffs/iceberg/`
  edits from triggering the Spark SQL test workflows