Commit 0b66e08

Gate long-running jobs behind ubuntu-slim jobs.
1 parent a08cb4e commit 0b66e08

17 files changed

Lines changed: 608 additions & 831 deletions

.github/workflows/README.md

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
# GitHub Workflows

GitHub Actions only loads `*.yml` / `*.yaml` files in this directory as
workflows. This README is ignored by the runner.

## Pipeline overview

A single umbrella workflow (`ci.yml`) orchestrates everything that runs on
pull requests and pushes to `main`. The umbrella runs cheap **preflight**
checks first, computes which heavy jobs are relevant to the change, and only
then fans out to the long-running test/build workflows. Each long workflow
is a reusable workflow (`on: workflow_call`) invoked from the umbrella.

```
                       pull_request | push to main | workflow_dispatch
                                              |
                                              v
                                  +-----------------------+
                                  |       preflight       |   ubuntu-slim
                                  |   (RAT, prettier,     |
                                  |    missing-suites,    |
                                  |    actionlint)        |
                                  +-----------+-----------+
                                              | on success
                                              v
                                  +-----------------------+
                                  |        changes        |   ubuntu-slim
                                  |  (dorny/paths-filter: |
                                  |   one boolean per     |
                                  |   heavy job)          |
                                  +-----------+-----------+
                                              |
    +-------------+-------------+-------------+-------------+-------------+-------------+
    |             |             |             |             |             |             |
    v             v             v             v             v             v             v
pr_build_     pr_build_   pr_benchmark_     docs        spark_3_5     spark_4_0   iceberg_1_10
  linux         macos         check        (push)       (PR+push)     (PR+push)     (PR+push)
(PR+push)     (PR+push)     (PR+push)
                                                            |             |             |
                                                            v             v             v
                                                         spark_3_4 / spark_4_1  iceberg_1_8 / 1_9
                                                         (push or PR + label)   (push only)

reusable workflows invoked via `uses:`:
  pr_build_linux.yml          spark_sql_test_reusable.yml
  pr_build_macos.yml          iceberg_spark_test_reusable.yml
  pr_benchmark_check.yml
  docs.yaml
```

## What runs when

| Job in `ci.yml`      | Triggered by                                     | Path filter source                            |
|----------------------|--------------------------------------------------|-----------------------------------------------|
| `preflight`          | every PR / push to main / dispatch               | none (always runs)                            |
| `changes`            | every PR / push to main / dispatch               | runs `dorny/paths-filter@v3`                  |
| `pr_build_linux`     | PR or push, paths matched                        | inlined in `changes` job                      |
| `pr_build_macos`     | PR or push, paths matched                        | inlined in `changes` job                      |
| `pr_benchmark_check` | PR or push, paths matched                        | benchmark sources only                        |
| `docs`               | push to main, paths matched                      | `.asf.yaml`, `docs/**`, `ci.yml`, `docs.yaml` |
| `spark_3_5`          | PR or push, paths matched                        | Spark 3.5 sources                             |
| `spark_4_0`          | PR or push, paths matched                        | Spark 4.0 sources                             |
| `spark_3_4`          | push, **or** PR with `run-spark-3.4-tests` label | Spark 3.4 sources                             |
| `spark_4_1`          | push, **or** PR with `run-spark-4.1-tests` label | Spark 4.1 sources                             |
| `iceberg_1_10`       | PR or push, paths matched                        | Iceberg sources                               |
| `iceberg_1_8`        | push only                                        | Iceberg sources                               |
| `iceberg_1_9`        | push only                                        | Iceberg sources                               |

A heavy job appears in the PR's checks list as a `skipped` entry whenever
its path filter or event criteria don't match. Skipped checks count as
passing for branch protection.
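
The event and label conditions in the table map naturally onto a job-level `if:`. The fragment below is only a sketch of how such gates might be written, not the actual contents of `ci.yml`. The output names on the `changes` job (`spark_3_4`, `iceberg_1_8`) and the choice to apply the path filter on pushes as well as PRs are assumptions.

```yaml
# Hypothetical fragment of ci.yml: event- and label-based gates.
jobs:
  spark_3_4:
    needs: changes
    # Run on pushes, or on PRs that carry the opt-in label.
    # Assumption: the path filter result is also required in both cases.
    if: >-
      needs.changes.outputs.spark_3_4 == 'true' &&
      (github.event_name == 'push' ||
       contains(github.event.pull_request.labels.*.name, 'run-spark-3.4-tests'))
    uses: ./.github/workflows/spark_sql_test_reusable.yml

  iceberg_1_8:
    needs: changes
    # "push only": never runs for pull_request events.
    if: >-
      needs.changes.outputs.iceberg_1_8 == 'true' &&
      github.event_name == 'push'
    uses: ./.github/workflows/iceberg_spark_test_reusable.yml
```

When the `if:` evaluates to false, the job is reported as skipped rather than failed, which is what produces the `skipped` entries described above.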

## Standalone workflows (not under the umbrella)

These workflows have their own triggers because they fire on events the
umbrella doesn't watch, or operate independently of the rest of CI:

| File                   | Why standalone |
|------------------------|----------------|
| `pr_title_check.yml`   | Fires on `pull_request.types: [edited]` so it re-runs when a PR title is edited without a code push. |
| `codeql.yml`           | Security scanner; runs on a weekly schedule and on every push/PR. |
| `miri.yml`             | Nightly Miri safety checks. |
| `stale.yml`            | Daily stale-PR closer. |
| `take.yml`             | Issue-comment trigger for `take` / `untake`. |
| `label_new_issues.yml` | Issue trigger to apply `requires-triage`. |

## Reusable workflows (called by `ci.yml`)

| File                              | Called from `ci.yml` job(s)                        |
|-----------------------------------|----------------------------------------------------|
| `pr_build_linux.yml`              | `pr_build_linux`                                   |
| `pr_build_macos.yml`              | `pr_build_macos`                                   |
| `pr_benchmark_check.yml`          | `pr_benchmark_check`                               |
| `docs.yaml`                       | `docs`                                             |
| `spark_sql_test_reusable.yml`     | `spark_3_4`, `spark_3_5`, `spark_4_0`, `spark_4_1` |
| `iceberg_spark_test_reusable.yml` | `iceberg_1_8`, `iceberg_1_9`, `iceberg_1_10`       |
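
One reusable file can serve several `ci.yml` jobs because the caller can pass inputs through `with:`. As a rough illustration of the callee side, a file such as `spark_sql_test_reusable.yml` might look like the sketch below. The input name `spark-version` and the step contents are hypothetical; only the `workflow_call` trigger and the `linux-test` job name are taken from this README.

```yaml
# Hypothetical sketch of a reusable workflow (callee side).
on:
  workflow_call:
    inputs:
      spark-version:            # assumed input name, not confirmed by the source
        description: Spark line to test, for example "3.5"
        required: true
        type: string

jobs:
  linux-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Placeholder step; the real test commands are not shown in this README.
      - run: echo "Running Spark SQL tests for Spark ${{ inputs.spark-version }}"
```

Under this assumption, each `spark_*` job in `ci.yml` would reference the same file with `uses:` and differ only in the value it passes for the input.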

## Modifying path filters

The rules for which files trigger each long workflow live in the `changes`
job inside `ci.yml` (in the `dorny/paths-filter` block). When adding a new
test suite or moving sources, update the filter for the affected job there;
the `if:` gate on each job reads `needs.changes.outputs.<name>`.
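
To make the wiring between the filter and the gate concrete, here is a minimal sketch of one filter and the job that consumes it. It is an illustration under stated assumptions rather than a copy of `ci.yml`: the step id `filter` and the path patterns are hypothetical, and only one of the heavy jobs is shown.

```yaml
# Hypothetical fragment of ci.yml: one path filter and the job it gates.
jobs:
  changes:
    needs: preflight
    runs-on: ubuntu-slim
    outputs:
      # Re-export the filter result so downstream jobs can read it.
      spark_3_5: ${{ steps.filter.outputs.spark_3_5 }}
    steps:
      - uses: actions/checkout@v4
      - id: filter
        uses: dorny/paths-filter@v3
        with:
          filters: |
            spark_3_5:
              - 'spark/**'                    # hypothetical source path
              - '.github/workflows/ci.yml'    # hypothetical: re-run on CI changes

  spark_3_5:
    needs: changes
    # paths-filter emits the strings 'true' / 'false', hence the string comparison.
    if: needs.changes.outputs.spark_3_5 == 'true'
    uses: ./.github/workflows/spark_sql_test_reusable.yml
```

Adding a new suite under this layout means adding a filter key, exporting it in `outputs:`, and referencing the same name in the new job's `if:`.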

## Branch protection

Required-check names changed when these workflows were consolidated. The
umbrella exposes per-job names like `CI / pr_build_linux / Lint`,
`CI / spark_3_5 / linux-test (...)`, etc. Update repository branch
protection rules to point at the new names; the old standalone workflow
names (`Spark SQL Tests (Spark 3.5)`, `PR Build (Linux)`, ...) no longer
exist as top-level workflows.
