| 1 | +# GitHub Workflows |
| 2 | + |
| 3 | +GitHub Actions only loads `*.yml` / `*.yaml` files in this directory as |
| 4 | +workflows. This README is ignored by the runner. |
| 5 | + |
| 6 | +## Pipeline overview |
| 7 | + |
| 8 | +A single umbrella workflow (`ci.yml`) orchestrates everything that runs on |
| 9 | +pull requests, pushes to `main`, and manual `workflow_dispatch` runs. It runs |
| 10 | +cheap **preflight** checks first, computes which heavy jobs are relevant to |
| 11 | +the change, and only then fans out to the long-running test/build workflows. |
| 12 | +Each of those is a reusable workflow (`on: workflow_call`) invoked via `uses:`. |
| 13 | + |
| 14 | +``` |
| 15 | + pull_request | push to main | workflow_dispatch |
| 16 | + | |
| 17 | + v |
| 18 | + +-----------------------+ |
| 19 | + | preflight | ubuntu-slim |
| 20 | + | (RAT, prettier, | |
| 21 | + | missing-suites, | |
| 22 | + | actionlint) | |
| 23 | + +-----------+-----------+ |
| 24 | + | on success |
| 25 | + v |
| 26 | + +-----------------------+ |
| 27 | + | changes | ubuntu-slim |
| 28 | + | (dorny/paths-filter: | |
| 29 | + | one boolean per | |
| 30 | + | heavy job) | |
| 31 | + +-----------+-----------+ |
| 32 | + | |
| 33 | + +-----------+-----------+-----------+-----------+-----------+-----------+ |
| 34 | + | | | | | | | |
| 35 | + v v v v v v v |
| 36 | + pr_build_ pr_build_ pr_benchmark_ docs spark_3_5 spark_4_0 iceberg_1_10 |
| 37 | + linux macos check (push) (PR+push) (PR+push) (PR+push) |
| 38 | + (PR+push) (PR+push) (PR+push) |
| 39 | + | | | |
| 40 | + v v v |
| 41 | + spark_3_4 / spark_4_1 iceberg_1_8 / 1_9 |
| 42 | + (push or PR + label) (push only) |
| 43 | + |
| 44 | + reusable workflows invoked via `uses:`: |
| 45 | + pr_build_linux.yml spark_sql_test_reusable.yml |
| 46 | + pr_build_macos.yml iceberg_spark_test_reusable.yml |
| 47 | + pr_benchmark_check.yml |
| 48 | + docs.yaml |
| 49 | +``` |
| 50 | + |
| 51 | +## What runs when |
| 52 | + |
| 53 | +| Job in `ci.yml` | Triggered by | Path filter source | |
| 54 | +|---------------------|-----------------------------------------|------------------------------------------| |
| 55 | +| `preflight` | every PR / push to main / dispatch | none (always runs) | |
| 56 | +| `changes` | every PR / push to main / dispatch | runs `dorny/paths-filter@v3` | |
| 57 | +| `pr_build_linux` | PR or push, paths matched | inlined in `changes` job | |
| 58 | +| `pr_build_macos` | PR or push, paths matched | inlined in `changes` job | |
| 59 | +| `pr_benchmark_check`| PR or push, paths matched | benchmark sources only | |
| 60 | +| `docs` | push to main, paths matched | `.asf.yaml`, `docs/**`, `ci.yml`, `docs.yaml` | |
| 61 | +| `spark_3_5` | PR or push, paths matched | Spark 3.5 sources | |
| 62 | +| `spark_4_0` | PR or push, paths matched | Spark 4.0 sources | |
| 63 | +| `spark_3_4` | push, **or** PR with `run-spark-3.4-tests` label | Spark 3.4 sources | |
| 64 | +| `spark_4_1` | push, **or** PR with `run-spark-4.1-tests` label | Spark 4.1 sources | |
| 65 | +| `iceberg_1_10` | PR or push, paths matched | Iceberg sources | |
| 66 | +| `iceberg_1_8` | push only | Iceberg sources | |
| 67 | +| `iceberg_1_9` | push only | Iceberg sources | |
| 68 | + |
| 69 | +A heavy job appears in the PR's checks list as a `skipped` entry whenever |
| 70 | +its path filter or event criteria don't match. Skipped checks count as |
| 71 | +passing for branch protection. |
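|  | + |
|  | +The event and label criteria in the table could be expressed as job-level |
|  | +`if:` conditions. The fragment below is a minimal sketch, not a copy of |
|  | +`ci.yml`: the output name `spark_3_5` and the exact expressions are |
|  | +assumptions, and any path-filter output would be combined with `&&`. |
|  | + |
|  | +```yaml |
|  | +# Hypothetical fragment of ci.yml; reusable-workflow inputs are omitted. |
|  | +jobs: |
|  | +  spark_3_5:            # PR or push, paths matched |
|  | +    needs: changes |
|  | +    if: needs.changes.outputs.spark_3_5 == 'true' |
|  | +    uses: ./.github/workflows/spark_sql_test_reusable.yml |
|  | + |
|  | +  spark_3_4:            # push, or PR carrying the opt-in label |
|  | +    needs: changes |
|  | +    if: >- |
|  | +      github.event_name == 'push' || |
|  | +      contains(github.event.pull_request.labels.*.name, 'run-spark-3.4-tests') |
|  | +    uses: ./.github/workflows/spark_sql_test_reusable.yml |
|  | + |
|  | +  iceberg_1_8:          # push only |
|  | +    needs: changes |
|  | +    if: github.event_name == 'push' |
|  | +    uses: ./.github/workflows/iceberg_spark_test_reusable.yml |
|  | +``` |
|  | + |
|  | +A job whose `if:` evaluates to false is the one reported as `skipped`. |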
| 72 | + |
| 73 | +## Standalone workflows (not under the umbrella) |
| 74 | + |
| 75 | +These workflows have their own triggers because they fire on events the |
| 76 | +umbrella doesn't watch, or operate independently of the rest of CI: |
| 77 | + |
| 78 | +| File | Why standalone | |
| 79 | +|-------------------------------|-------------------------------------------------| |
| 80 | +| `pr_title_check.yml` | Fires on `pull_request.types: [edited]` so it re-runs when a PR title is edited without a code push. | |
| 81 | +| `codeql.yml` | Security scanner; runs on a weekly schedule and on every push/PR. | |
| 82 | +| `miri.yml` | Nightly Miri safety checks. | |
| 83 | +| `stale.yml` | Daily stale-PR closer. | |
| 84 | +| `take.yml` | Issue-comment trigger for `take` / `untake`. | |
| 85 | +| `label_new_issues.yml` | Issue trigger that applies the `requires-triage` label to new issues. | |
| 86 | + |
| 87 | +## Reusable workflows (called by `ci.yml`) |
| 88 | + |
| 89 | +| File | Called from `ci.yml` job(s) | |
| 90 | +|-------------------------------------|------------------------------------------------| |
| 91 | +| `pr_build_linux.yml` | `pr_build_linux` | |
| 92 | +| `pr_build_macos.yml` | `pr_build_macos` | |
| 93 | +| `pr_benchmark_check.yml` | `pr_benchmark_check` | |
| 94 | +| `docs.yaml` | `docs` | |
| 95 | +| `spark_sql_test_reusable.yml` | `spark_3_4`, `spark_3_5`, `spark_4_0`, `spark_4_1` | |
| 96 | +| `iceberg_spark_test_reusable.yml` | `iceberg_1_8`, `iceberg_1_9`, `iceberg_1_10` | |
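|  | + |
|  | +One reusable file can serve several umbrella jobs when each caller passes |
|  | +different inputs. The sketch below illustrates the mechanism only: the |
|  | +`spark-version` input and the step contents are hypothetical, not taken |
|  | +from the real workflow files. |
|  | + |
|  | +```yaml |
|  | +# Callee: spark_sql_test_reusable.yml (input name is an assumption) |
|  | +on: |
|  | +  workflow_call: |
|  | +    inputs: |
|  | +      spark-version: |
|  | +        required: true |
|  | +        type: string |
|  | + |
|  | +jobs: |
|  | +  linux-test: |
|  | +    runs-on: ubuntu-latest |
|  | +    steps: |
|  | +      - uses: actions/checkout@v4 |
|  | +      - run: echo "Running SQL tests against Spark ${{ inputs.spark-version }}" |
|  | +--- |
|  | +# Caller: ci.yml, one job per Spark version |
|  | +jobs: |
|  | +  spark_3_5: |
|  | +    uses: ./.github/workflows/spark_sql_test_reusable.yml |
|  | +    with: |
|  | +      spark-version: "3.5" |
|  | +  spark_4_0: |
|  | +    uses: ./.github/workflows/spark_sql_test_reusable.yml |
|  | +    with: |
|  | +      spark-version: "4.0" |
|  | +``` |
|  | + |
|  | +Checks from a called workflow are named after the caller job and the callee |
|  | +job, which is where names such as `CI / spark_3_5 / linux-test (...)` come from. |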
| 97 | + |
| 98 | +## Modifying path filters |
| 99 | + |
| 100 | +Each long workflow's "what files trigger me" rules live in the `changes` |
| 101 | +job inside `ci.yml` (in the `dorny/paths-filter` block). When adding a new |
| 102 | +test suite or moving sources, update the filter for the affected job there; |
| 103 | +the gate `if:` on each job consumes `needs.changes.outputs.<name>`. |
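|  | + |
|  | +As an illustration, a filter and its gate for the `docs` job might look |
|  | +like the fragment below. It is a sketch under assumptions: the step id |
|  | +`filter` and the output name `docs` are hypothetical, and the globs are |
|  | +the ones listed for `docs` in the table above, written with full paths. |
|  | + |
|  | +```yaml |
|  | +# Hypothetical fragment of ci.yml; the preflight job is not shown. |
|  | +jobs: |
|  | +  changes: |
|  | +    needs: preflight |
|  | +    runs-on: ubuntu-slim |
|  | +    outputs: |
|  | +      docs: ${{ steps.filter.outputs.docs }} |
|  | +    steps: |
|  | +      - uses: actions/checkout@v4   # paths-filter needs a checkout on push events |
|  | +      - uses: dorny/paths-filter@v3 |
|  | +        id: filter |
|  | +        with: |
|  | +          filters: | |
|  | +            docs: |
|  | +              - '.asf.yaml' |
|  | +              - 'docs/**' |
|  | +              - '.github/workflows/ci.yml' |
|  | +              - '.github/workflows/docs.yaml' |
|  | + |
|  | +  docs: |
|  | +    needs: changes |
|  | +    # Filter outputs are the strings 'true' / 'false', hence the comparison. |
|  | +    if: github.event_name == 'push' && needs.changes.outputs.docs == 'true' |
|  | +    uses: ./.github/workflows/docs.yaml |
|  | +``` |
|  | + |
|  | +Adding a new heavy job would mean adding one filter entry, one `outputs:` |
|  | +line on `changes`, and one gated job that reads that output. |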
| 104 | + |
| 105 | +## Branch protection |
| 106 | + |
| 107 | +Required-check names changed when these workflows were consolidated. The |
| 108 | +umbrella exposes per-job names like `CI / pr_build_linux / Lint`, |
| 109 | +`CI / spark_3_5 / linux-test (...)`, etc. Update repository branch |
| 110 | +protection rules to point at the new names; the old standalone workflow |
| 111 | +names (`Spark SQL Tests (Spark 3.5)`, `PR Build (Linux)`, ...) no longer |
| 112 | +exist as top-level workflows. |