Commit 2644a56

docs: parallelize man page generation with make -j
The man_pages Bazel action invoked 'make bazel-manpages' serially, running ~10,000 pandoc/nroff invocations one at a time (~177s on a 64-core host).

Split the single make call into two phases: 'preprocess' (serial) generates the md/man*/*.md sources, then 'cat web' fan out pandoc/nroff in parallel with -j. They cannot share one -j invocation because cat/web read the md files that preprocess produces and there is no dependency edge forcing preprocess first. Running 'cat web' as a second invocation also re-parses the Makefile so its $(wildcard md/man*/*.md) picks up the freshly generated sources.

Job count uses nproc with a sysctl fallback (then 4) so the action also works on macOS, where nproc is not available by default.

A resource_set reserves the host CPUs for this action so the parallel make does not oversubscribe other concurrently scheduled Bazel actions; Bazel clamps the request to the cores actually available.

Regenerating the full doc set drops from ~177s to ~6s (~30x).

Signed-off-by: Matt Liberty <mliberty@precisioninno.com>
1 parent b506257 commit 2644a56
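
The re-parse point in the commit message can be reproduced with a toy Makefile. This is a minimal sketch, not the project's docs/Makefile (which this commit does not show): the `SRCS` variable, the `count` target, and the `md/*.md` layout are hypothetical, and GNU make is assumed. It shows that a `$(wildcard ...)` evaluated while the Makefile is read cannot see files that a target creates later in the same invocation.

```shell
set -eu
tmp="$(mktemp -d)"

# Toy Makefile (hypothetical names). Recipes use the "target: ; cmd" form so
# no literal tabs are needed. SRCS is expanded once, while make reads the file.
cat > "$tmp/Makefile" <<'EOF'
.PHONY: preprocess count
SRCS := $(wildcard md/*.md)
preprocess: ; mkdir -p md && touch md/a.md md/b.md
count: ; @echo $(words $(SRCS))
EOF

# One invocation: md/ did not exist when the Makefile was parsed, so the
# source list stays empty even though preprocess runs before count.
one_shot="$(make --no-print-directory -s -C "$tmp" preprocess count)"

# A second invocation re-reads the Makefile and now finds the generated files.
second="$(make --no-print-directory -s -C "$tmp" count)"

echo "same invocation: $one_shot source(s); fresh invocation: $second source(s)"
rm -rf "$tmp"
```

On a host with GNU make this should report 0 sources for the combined invocation and 2 for the fresh one, which is the same reason the action runs 'cat web' as its own make call.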

1 file changed

Lines changed: 20 additions & 1 deletion

bazel/man_pages.bzl

@@ -10,6 +10,14 @@ the real work is delegated to the `bazel-manpages` Makefile target.
 Host requirements: pandoc, nroff (groff), col (bsdextrautils), python3>=3.10.
 """
 
+def _man_pages_resource_set(_os, _num_inputs):
+    # The 'cat web' make below fans out with -j$(nproc), so this action uses
+    # the whole host. Reserve all local CPUs to keep Bazel from co-scheduling
+    # other heavy actions alongside it and oversubscribing the machine. Bazel
+    # clamps the request to the cores actually available, so this is safe on
+    # small CI hosts too.
+    return {"cpu": 512.0}
+
 def _man_pages_impl(ctx):
     cat_dir = ctx.actions.declare_directory("cat")
     html_dir = ctx.actions.declare_directory("html")
@@ -18,14 +26,25 @@ def _man_pages_impl(ctx):
     set -euo pipefail
     CAT_OUT="$PWD/{cat_out}"
     HTML_OUT="$PWD/{html_out}"
-    make --no-print-directory -C docs -f Makefile bazel-manpages \\
+    # Two phases: 'preprocess' (serial) generates the md/man*/*.md sources, then
+    # 'cat web' fan out pandoc/nroff in parallel. They cannot share one -j make
+    # invocation: cat/web read the md files preprocess produces, and a parallel
+    # build has no dependency edge forcing preprocess to finish first. Running
+    # 'cat web' as a second invocation also re-parses the Makefile so its
+    # $(wildcard md/man*/*.md) picks up the freshly generated sources.
+    # nproc is GNU coreutils (absent on stock macOS); fall back to sysctl, then 4.
+    JOBS="$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4)"
+    make --no-print-directory -C docs -f Makefile preprocess \\
+        CAT_ROOT_DIR="$CAT_OUT" HTML_ROOT_DIR="$HTML_OUT"
+    make --no-print-directory -j"$JOBS" -C docs -f Makefile cat web \\
         CAT_ROOT_DIR="$CAT_OUT" HTML_ROOT_DIR="$HTML_OUT"
     """.format(
         cat_out = cat_dir.path,
         html_out = html_dir.path,
     )
 
     ctx.actions.run_shell(
+        resource_set = _man_pages_resource_set,
         outputs = [cat_dir, html_dir],
         inputs = ctx.files.docs_srcs + ctx.files.scripts + ctx.files.readmes + ctx.files.messages,
         command = command,
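
To check which job count a given host would end up with, the fallback chain from the JOBS assignment can be run on its own outside Bazel. The sketch below wraps the same three-step chain in a helper; the name `detect_jobs` is hypothetical and is not part of this commit.

```shell
# Hypothetical helper mirroring the action's JOBS assignment: try nproc
# (GNU coreutils), then sysctl (macOS/BSD), then fall back to a fixed 4.
detect_jobs() {
  nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4
}

JOBS="$(detect_jobs)"
echo "parallel phase would run: make -j$JOBS cat web"
```

Each step only runs if the previous command is missing or exits non-zero, so a host that has neither tool still gets a usable value.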
