entrypoint: skip recursive chown over /actions-runner/{bin,externals} #583

Merged
myoung34 merged 1 commit into myoung34:master from bitranox:reduce-actions-runner-chown
Apr 15, 2026

@bitranox
Contributor

What

Replace the recursive `chown -R runner "${_RUNNER_WORKDIR}" /actions-runner` on line 293 of entrypoint.sh with a targeted variant that:

  • non-recursively chowns /actions-runner itself and ${_RUNNER_WORKDIR}, and
  • runs `find /actions-runner -mindepth 1 -maxdepth 1 ! -name bin ! -name externals -exec chown -R runner {} +` to cover every other top-level entry under /actions-runner.

bin/ (~50 MB) and externals/ (~330 MB) are skipped because the image already ships them runner:runner.
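
To make the selection concrete, here is a minimal sketch that runs the same find filter against a scratch directory instead of /actions-runner and only prints the entries it would hand to `chown -R`. The function name `select_chown_targets`, its parameter, and the sample file names are illustrative assumptions, not part of entrypoint.sh.

```shell
#!/bin/sh
# Sketch only: print the top-level entries the targeted find would pass to
# `chown -R`. The scratch tree stands in for /actions-runner.
select_chown_targets() {
    # $1: directory to inspect (hypothetical parameter, not in entrypoint.sh)
    find "$1" -mindepth 1 -maxdepth 1 ! -name bin ! -name externals \
        -exec basename {} \; | LC_ALL=C sort
}

# Build a tiny stand-in tree: two "heavy" dirs plus config-style entries.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/bin" "$demo_root/externals/node" "$demo_root/_diag"
touch "$demo_root/.runner" "$demo_root/.env" "$demo_root/externals/node/lib.so"

select_chown_targets "$demo_root"
```

On this scratch tree the function lists `.env`, `.runner` and `_diag`; `bin` and `externals` never appear, and because of `-mindepth 1` neither does the root directory itself.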

Why

The image already ships everything under /actions-runner owned by runner. Verify on a pristine image:

    docker run --rm --entrypoint sh myoung34/github-runner:ubuntu-noble \
        -c 'find /actions-runner -not -user runner'
    # => prints nothing

Yet the current recursive chown walks 9 100+ files per container start. On overlayfs each chown syscall triggers copy-up even when ownership does not change, so the walk costs real disk I/O to flip exactly nothing. Under parallel starts the storage-driver contention dominates time-to-healthy.

The only genuinely root-owned files at chown time are the ones config.sh writes earlier in the same entrypoint.sh (.runner, .credentials*, .env, .path, svc.sh, and eventually _diag/). The find-based blacklist keeps chowning all of them (plus anything new config.sh might add in the future) while skipping the two known-heavy read-only dirs.
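
As a rough illustration of why the narrowing matters, the sketch below counts how many paths each strategy would visit on a small scratch tree. The helper names and the tree layout are assumptions made for the example; the real counts on the image are the ones quoted in this description.

```shell
#!/bin/sh
# Sketch only: compare how many paths each strategy visits on a scratch tree.
count_recursive() {
    # Every path a plain `chown -R <root>` would touch.
    find "$1" | grep -c ''
}

count_targeted() {
    # The root itself (non-recursive) plus the non-skipped top-level subtrees.
    {
        printf '%s\n' "$1"
        find "$1" -mindepth 1 -maxdepth 1 ! -name bin ! -name externals \
            -exec find {} +
    } | grep -c ''
}

# Stand-in tree: bin/ and externals/ hold most of the files.
tree=$(mktemp -d)
mkdir -p "$tree/bin" "$tree/externals" "$tree/_diag"
for i in 1 2 3; do touch "$tree/bin/lib$i"; done
for i in 1 2 3 4 5; do touch "$tree/externals/mod$i"; done
touch "$tree/.runner" "$tree/_diag/runner.log"

echo "recursive: $(count_recursive "$tree")"
echo "targeted:  $(count_targeted "$tree")"
```

On this toy tree the recursive walk visits 14 paths and the targeted one 4. The gap grows with the size of the skipped directories.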

Measured impact

Host: LXC on Proxmox (ZFS-backed), 12 runner containers started in parallel via docker compose.

| | before | after |
|---|---|---|
| all-12-containers time-to-healthy | ~5 min | ~25 s |
| per-container `docker compose up -d` return | 1 s nominal, but 11 peers racing for copy-up I/O | ~1 s, no contention |

Single-container ephemeral starts on fast storage also benefit proportionally to the size of externals/ (~9 000 files).
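
For anyone who wants to reproduce a time-to-healthy number, one possible way to measure it is sketched below. This is not the procedure used for the figures above. It assumes each container defines a healthcheck, so that `docker inspect` reports `.State.Health.Status`. The function names and the `TIMEOUT_SECONDS` variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: wall-clock seconds until every named container is "healthy".
container_status() {
    # Assumes the container defines a healthcheck.
    docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null
}

time_to_all_healthy() {
    # $1: command that prints a status for a container name; rest: names.
    status_cmd=$1; shift
    timeout=${TIMEOUT_SECONDS:-600}   # hypothetical knob, seconds
    start=$(date +%s)
    while :; do
        pending=0
        for name in "$@"; do
            [ "$("$status_cmd" "$name")" = "healthy" ] || pending=$((pending + 1))
        done
        elapsed=$(( $(date +%s) - start ))
        if [ "$pending" -eq 0 ]; then
            echo "$elapsed"
            return 0
        fi
        # Give up once the timeout has passed.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
    done
}
```

A call such as `time_to_all_healthy container_status runner-1 runner-2` (container names are placeholders) would be started right after `docker compose up -d`. Passing the status command as an argument keeps the polling loop testable without Docker.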

Scope

  • Changed: only the one recursive chown line.
  • Unchanged: _CONFIGURED_ACTIONS_RUNNER_FILES_DIR chown (line 292), toolcache flat-chown (line 295), the RUN_AS_ROOT=true and non-root branches, _DEBUG_ONLY / _DEBUG_OUTPUT handling.

Prior art

Same spirit as #268, which narrowed the /opt/hostedtoolcache chown for the same reason (the cache consists of files that are already correctly owned). This PR does the analogous narrowing for /actions-runner, which was not part of that change. Also relates to #239 and #267 (the /opt/hostedtoolcache half has been fixed; the /actions-runner half survives).

Notes / open questions

Happy to split further if a smaller diff is preferred, e.g. keep only the find-blacklist behind an opt-in env var. Also happy to add a similar narrowing in the Dockerfile (recursive chown runner after tarball extract) so derived images don't depend on the tarball happening to be runner-owned.
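
If the opt-in route were preferred, it might look roughly like the sketch below. The variable name `SKIP_HEAVY_CHOWN` and the wrapper function are purely hypothetical: no such option exists in the image, and the merged change applies the targeted chown unconditionally.

```shell
#!/bin/sh
# Sketch only: SKIP_HEAVY_CHOWN is a hypothetical variable name, not an
# existing option of the image.
chown_actions_runner() {
    # $1: owner, $2: runner directory, $3: work directory
    if [ "${SKIP_HEAVY_CHOWN:-false}" = "true" ]; then
        # Targeted variant: roots non-recursively, then every top-level
        # entry except bin/ and externals/.
        chown "$1" "$2" "$3"
        find "$2" -mindepth 1 -maxdepth 1 ! -name bin ! -name externals \
            -exec chown -R "$1" {} +
    else
        # Previous behaviour: walk everything.
        chown -R "$1" "$3" "$2"
    fi
}
```

In entrypoint.sh this would be invoked as something like `chown_actions_runner runner /actions-runner "${_RUNNER_WORKDIR}"`. Defaulting the variable to false would keep existing behaviour for users who do not set it.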

The Dockerfile (and the actions-runner tarball it extracts) ships
/actions-runner/ fully runner-owned, including ~50 MB of bin/ and
~330 MB of externals/ that contain node / .NET runtime libs used by
actions like setup-node and setup-python. Verify on a pristine image:

    docker run --rm --entrypoint sh myoung34/github-runner:ubuntu-noble \
        -c 'find /actions-runner -not -user runner'
    # => prints nothing

Yet `chown -R runner "${_RUNNER_WORKDIR}" /actions-runner` in
entrypoint.sh walks 9100+ files on every start. On overlayfs each chown
triggers copy-up regardless of whether ownership actually changes, so
the walk costs real disk I/O to flip exactly nothing. Under parallel
starts (e.g. 12 containers on one host) the resulting storage-driver
contention dominates time-to-healthy.

The files that do need flipping are the ones config.sh writes as root
earlier in this same entrypoint (.runner, .credentials,
.credentials_rsaparams, .env, .path, svc.sh, and eventually _diag/).
Enumerating them is fragile if config.sh ever adds an output, so instead
blacklist the two known-heavy dirs and chown everything else under
/actions-runner at depth 1:

- chown runner /actions-runner "${_RUNNER_WORKDIR}"  (non-recursive)
- find /actions-runner -mindepth 1 -maxdepth 1 \
      ! -name bin ! -name externals -exec chown -R runner {} +

This catches every top-level config-written file/dir (plus anything new
that may appear), skips the two bulk runtime dirs, and leaves -R on the
small subtrees that may legitimately need it (e.g. _diag/).

Unchanged:
- _CONFIGURED_ACTIONS_RUNNER_FILES_DIR chown on the preceding line
- toolcache flat-chown on the following line
- the RUN_AS_ROOT=true and non-root branches

Observed impact on a host running 12 parallel runners (ZFS-backed LXC
on Proxmox): time-to-all-healthy dropped from ~5 minutes to ~25 seconds;
per-container `docker compose up -d` returns in ~1 s instead of racing
11 peers for overlay copy-up I/O.

@myoung34 merged commit 9c1fd73 into myoung34:master on Apr 15, 2026
11 checks passed
@bitranox deleted the reduce-actions-runner-chown branch on April 15, 2026 20:27