Skip to content

Commit a094874

Browse files
dschoGit for Windows Build Agent
authored andcommitted
ci(dockerized): reduce the PID limit for private repositories
Every once in a while I need to verify that Microsoft Git's test suite passes for changes that are not yet meant for public consumption, and since it was (made) too difficult to keep up a working Azure Pipeline definition, I have to use GitHub Actions in a private GitHub repository for that purpose. In these tests, basically all Dockerized CI jobs fail consistently. The symptom is something like: error: cannot create async thread: Resource temporarily unavailable in the middle of a test, typically in the t5xxx-t6xxx range. The first such error is immediately followed by plenty more of these errors, and not a single test succeeds afterwards. At first, I thought that maybe the massive parallelism I enjoy there is the problem, and I thought that the cgroups limits might be shared between the many containers that run on essentially the same physical machine. But even reducing the matrix to just a single of those Dockerized jobs runs into the very same problems. The underlying reason seems to be a substantial difference in the hosted runners that execute these Dockerized jobs: forcing the PID limit of the container to a high number lets the jobs pass, even when running the complete matrix of all 13 Dockerized jobs concurrently. But that's not the only difference: The jobs seem to take a lot longer in these containers than, say, in the containers made available to https://github.com/git/git. When forcing a PID limit of 64k in that private repository, the jobs completed successfully, but they also took a lot longer, between 2x to 2.5x longer, i.e. painfully much longer. Reducing the PID limit to 16k, the CI jobs still passed, but took an equally long amount of time. Reducing the PID limit to 8k caused the errors to reappear. Here are the numbers from three example runs, the first one forcing the PID and nproc limit to 65536, the second one to 16384, the third run is from the public git/git repository: Job | 64k | 16k | reference ------------------------------|---------|---------|--------- almalinux-8 | 19m 3s | 16m 0s | 9m 36s debian-11 | 20m 31s | 20m 3s | 8m 5s fedora-breaking-changes-meson | 16m 29s | 19m 19s | 9m 40s linux-asan-ubsan | 1h 10m | 1h 11m | 34m 36s linux-breaking-changes | 25m 39s | 25m 58s | 13m 15s linux-leaks | 1h 9m | 1h 10m | 33m 30s linux-meson | 28m 9s | 27m 4s | 13m 45s linux-musl-meson | 16m 32s | 13m 39s | 8m 6s linux-reftable-leaks | 1h 13m | 1h 13m | 34m 34s linux-reftable | 26m 2s | 25m 48s | 13m 31s linux-sha256 | 26m 12s | 26m 3s | 12m 36s linux-TEST-vars | 26m 5s | 25m 21s | 13m 25s linux32 | 21m 16s | 19m 57s | 10m 44s It does not look as if the PID limit is the reason for the longer runtime, seeing as the 64k vs 16k timings deviate no more than as is usual with GitHub workflows. So let's go for 16k. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
1 parent 7c81b18 commit a094874

1 file changed

Lines changed: 3 additions & 1 deletion

File tree

.github/workflows/main.yml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -420,7 +420,9 @@ jobs:
420420
CI_JOB_IMAGE: ${{matrix.vector.image}}
421421
CUSTOM_PATH: /custom
422422
runs-on: ubuntu-latest
423-
container: ${{matrix.vector.image}}
423+
container:
424+
image: ${{ matrix.vector.image }}
425+
options: ${{ github.repository_visibility == 'private' && '--pids-limit 16384 --ulimit nproc=16384:16384 --ulimit nofile=32768:32768' || '' }}
424426
steps:
425427
- name: prepare libc6 for actions
426428
if: matrix.vector.jobname == 'linux32'

0 commit comments

Comments
 (0)