Add actions-audit.py script for auditing Apache repo security tooling
Adds a new utility script that audits apache/ GitHub repositories for
baseline Actions security configurations (dependabot, CodeQL, zizmor,
allowlist-check) and can create PRs to add missing ones.
Key features:
- Uses GraphQL to batch-fetch workflow file contents per repo
- Dry-run mode shows detailed preview of what PRs would contain
- Prints zizmor findings so users can see issues before creating PRs
- Skips secrets-outside-env zizmor rule (too noisy for initial rollout)
- Includes zizmor error output in PR body when workflows are commented out
- [Manual Version Addition](#manual-addition-of-specific-versions)
- [Removing a Version](#removing-a-version-manually)
- [Auditing Repositories for Actions Security Tooling](#auditing-repositories-for-actions-security-tooling)

## Submitting an Action
@@ -250,3 +251,107 @@ existing/action:
The infrastructure team will prioritize these removal requests and may take additional steps to notify affected projects if necessary.

For 'regular' removals (not security responses), you can use `./utils/action-usage.sh someorg/theaction` to see if/how an action is still used anywhere in the ASF, and create a 'regular' PR removing it from `actions.yml` (or adding an expiration date) when it is no longer used.

## Auditing Repositories for Actions Security Tooling

Recent security breaches have shown that GitHub Actions can fail silently, leaving repositories vulnerable without any visible indication. The `actions-audit.py` script helps ensure that all Apache repositories using GitHub Actions have a baseline set of security tooling in place.

### Why This Matters

GitHub Actions workflows can introduce security risks in several ways:
- **Unpinned or unreviewed action versions** may contain malicious code or vulnerabilities
- **Missing static analysis** means workflow misconfigurations (secret exposure, injection vulnerabilities) go undetected
- **No dependabot** means action versions never get updated, accumulating known vulnerabilities over time

The audit script checks each repository for four security configurations and can automatically open PRs to add any that are missing:

| Check | What it does |
|-------|-------------|
| **Dependabot** | Keeps GitHub Actions dependencies up to date with a 4-day cooldown to avoid overwhelming reviewers |
| **CodeQL** | Runs static analysis on workflow files to detect security issues in Actions syntax |
| **Zizmor** | Scans workflow files for security issues and uploads the results as SARIF |
| **ASF Allowlist Check** | Ensures every action used is on the ASF Infrastructure approved allowlist |

### Prerequisites

- **Python 3.11+** and [**uv**](https://docs.astral.sh/uv/) **>= 0.9.17** (dependencies are managed inline via PEP 723). Make sure your uv is up to date: depending on how you installed it, run `uv self update`, `pip install --upgrade uv`, `pipx upgrade uv`, or `brew upgrade uv`.
- **`gh`** (GitHub CLI, authenticated via `gh auth login`). Alternatively, provide a token with `repo` scope via `--github-token` and pass `--no-gh`.
- **`zizmor`** ([install instructions](https://docs.zizmor.dev/installation/)), used by the pre-check in PR creation mode; not needed for `--dry-run`. If it is missing, the zizmor pre-check is skipped with a warning.

### Usage

Always start with `--dry-run` to see what the script would do without making any changes:

```bash
# Audit all repos for a specific PMC (prefix before first '-' in repo name)
uv run utils/actions-audit.py --dry-run --pmc spark --max-num 10

# Audit multiple PMCs
uv run utils/actions-audit.py --dry-run --pmc kafka --pmc flink

# Audit the first 50 repos (no PMC filter)
uv run utils/actions-audit.py --dry-run --max-num 50

# Increase GraphQL page size for fewer API round-trips
uv run utils/actions-audit.py --dry-run --max-num 200 --batch-size 100
```

When satisfied with the dry-run output, remove `--dry-run` to create PRs:

```bash
# Create PRs for spark repos missing security tooling
uv run utils/actions-audit.py --pmc spark --max-num 10
```

#### Options

| Flag | Description |
|------|-------------|
| `--pmc PMC` | Filter by PMC prefix (repeatable). The prefix is the text before the first `-` in the repo name, e.g. `spark` matches `spark`, `spark-connect-go`, `spark-docker`. |
| `--dry-run` | Report findings without creating PRs or branches. |
| `--max-num N` | Maximum number of repositories to check (0 = unlimited, default). |
| `--batch-size N` | Number of repos to fetch per GraphQL request (default: 50, max: 100). |
| `--github-token TOKEN` | GitHub token. Defaults to `GH_TOKEN` or `GITHUB_TOKEN` environment variable. |
| `--no-gh` | Use Python `requests` instead of the `gh` CLI for all API calls. Requires `--github-token` or a token env var. |

#### How PMC Filtering Works

The `--pmc` flag matches repos by prefix: the text before the first hyphen in the repository name. For example, `--pmc spark` matches `apache/spark`, `apache/spark-connect-go`, and `apache/spark-docker`. If the repo name has no hyphen, the full name is used as the prefix.

The script downloads the list of known PMCs from `whimsy.apache.org` on first run and caches it locally (`~/.cache/asf-actions-audit/pmc-list.json`) for 24 hours. If a `--pmc` value doesn't match any known PMC, a warning is printed but it is still used as a prefix filter.
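The prefix rule described above can be illustrated with a minimal sketch. The helper names `pmc_prefix` and `filter_by_pmc` are hypothetical and are not taken from `actions-audit.py`; the sketch only mirrors the documented behaviour (text before the first hyphen, or the full name when there is no hyphen).

```python
def pmc_prefix(repo_name: str) -> str:
    """Return the text before the first hyphen, or the whole name if there is none."""
    return repo_name.split("-", 1)[0]


def filter_by_pmc(repo_names: list[str], pmcs: set[str]) -> list[str]:
    """Keep only the repos whose prefix equals one of the requested PMC values."""
    return [name for name in repo_names if pmc_prefix(name) in pmcs]


# Hypothetical repo names, used only to show the matching rule.
repos = ["spark", "spark-connect-go", "spark-docker", "sparkling", "kafka-site"]
print(filter_by_pmc(repos, {"spark"}))
# -> ['spark', 'spark-connect-go', 'spark-docker']
```

Note that the comparison is an exact match on the prefix, so a name such as `sparkling` (no hyphen, so the prefix is the full name) would not be selected by `--pmc spark` under this reading.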

#### What the PRs Contain

For each repository that is missing one or more checks, the script creates a single PR on a branch named `asf-actions-security-audit` containing only the missing files:

- `.github/dependabot.yml`: created or updated to include the `github-actions` ecosystem with a 4-day cooldown
- `.github/workflows/codeql-analysis.yml`: CodeQL scanning for the `actions` language
- `.github/workflows/zizmor.yml`: Zizmor scanning with SARIF upload
- `.github/workflows/allowlist-check.yml`: ASF allowlist verification on workflow changes

#### Zizmor Pre-Check

Before creating a PR, the script runs `zizmor` against the repository's existing workflow files. If zizmor finds errors, the **CodeQL and Zizmor workflow files are added but commented out**, with instructions explaining:
- That zizmor found existing issues in the workflows
- How to auto-fix common issues (`zizmor --fix .github/workflows/`)
- That the PMC should uncomment the workflows and fix remaining issues in a follow-up PR

This avoids creating PRs that would immediately fail CI due to pre-existing problems.
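The "added but commented out" step can be pictured with the following sketch. The function name `comment_out_workflow` and the wording of the header comments are hypothetical; only the general idea (prefix every line with `#` and prepend instructions) comes from the description above.

```python
def comment_out_workflow(yaml_text: str, findings_summary: str) -> str:
    """Return the workflow with every line commented out and a header explaining why."""
    header = [
        "# This workflow is disabled because zizmor found issues in existing workflows.",
        f"# {findings_summary}",
        "# Common issues can be auto-fixed with: zizmor --fix .github/workflows/",
        "# Fix the remaining issues, then uncomment the lines below in a follow-up PR.",
        "#",
    ]
    # Blank lines become a bare '#' so the result has no trailing whitespace.
    body = [f"# {line}" if line.strip() else "#" for line in yaml_text.splitlines()]
    return "\n".join(header + body) + "\n"


# Hypothetical workflow content, used only to show the transformation.
sample = "name: zizmor\non: [push]\n\njobs: {}\n"
print(comment_out_workflow(sample, "zizmor reported findings in 2 workflow files"))
```

Because the original content is preserved line by line, re-enabling the workflow is a matter of removing the header and the leading `# ` from each line.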

#### Interactive Confirmation

When not in `--dry-run` mode, the script prompts for confirmation before creating each PR:

```
Create PR for apache/spark?
Will add: dependabot, codeql, zizmor, allowlist-check
Proceed? [yes/no/quit] (yes):
```

- **yes** (default): create the PR
- **no**: skip this repository and continue to the next
- **quit**: stop processing entirely and print the summary

#### Idempotency

The script is safe to re-run. Before creating a PR for a repository, it checks whether a PR from the branch `asf-actions-security-audit` already exists (open, closed, or merged) and skips the repo if so.
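One way to perform this kind of check with the `gh` CLI is sketched below. This is an assumption about how it could be done, not a copy of the script's logic; the helper names are hypothetical, and the `--no-gh` path (plain HTTP requests) is not covered.

```python
import json
import subprocess

BRANCH = "asf-actions-security-audit"


def pr_list_command(repo: str, branch: str = BRANCH) -> list[str]:
    """Build a `gh pr list` call that returns PRs from `branch` in any state."""
    return [
        "gh", "pr", "list",
        "--repo", repo,
        "--head", branch,
        "--state", "all",  # open, closed, and merged
        "--json", "number,state",
    ]


def has_existing_pr(gh_json_output: str) -> bool:
    """`gh ... --json` prints a JSON array; any element means a PR already exists."""
    return len(json.loads(gh_json_output)) > 0


def should_skip(repo: str) -> bool:
    """Run gh (must be installed and authenticated) and decide whether to skip the repo."""
    result = subprocess.run(
        pr_list_command(repo), capture_output=True, text=True, check=True
    )
    return has_existing_pr(result.stdout)
```

Querying with `--state all` is what makes a re-run skip repositories where an earlier PR was closed without merging, matching the behaviour described above.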