Skip to content

build/github: quiet bazel output in unit_tests CI job#170521

Merged
trunk-io[bot] merged 5 commits into
cockroachdb:masterfrom
tbg:ci-quieter-unit-tests
May 26, 2026
Merged

build/github: quiet bazel output in unit_tests CI job#170521
trunk-io[bot] merged 5 commits into
cockroachdb:masterfrom
tbg:ci-quieter-unit-tests

Conversation

@tbg
Copy link
Copy Markdown
Member

@tbg tbg commented May 19, 2026

Cuts the unit_tests GitHub Actions log from ~2,800 lines on a green run to ~400 (–85%) by removing several orthogonal sources of bulk. The actual failure signal (test name, target, EngFlow invocation link) is preserved or made more prominent.

Measured impact

Metric Before After Δ
Total log lines 2,781 409 –85%
Per-target PASSED roster 919 0 –100%
Metamorphic/GoMockReflect blocks many 0 –100%
Bazel build progress [N / M] 910 14 –98%
Failed test surfacing inline test.log dump summary + EngFlow link clearer

Sources removed

  1. Bazel build progress ([N / M] ..., ~33% of baseline) — suppressed via --noshow_progress.
  2. Per-test summary roster (//pkg/...:foo_test (cached) PASSED in 0.1s for ~900 targets, ~33% of baseline) — suppressed via --test_summary=terse. The "Executed N out of M" line stays so the overall run shape is still visible.
  3. Inline per-test failure dump (full captured stderr/stdout of any failing target, often ~99% noise per failure) — --test_output=summary replaces it with one FAILED line plus the EngFlow invocation link. summarize-build.sh continues to publish per-test failure annotations on the PR.
  4. gomock reflect helper preamblepkg/util/metamorphic.init() was running in every Go binary built with the crdb_test tag, including the rules_go reflect helpers that link a package being mocked. Their stderr was captured by bazel as INFO: From GoMockReflectExecOnlyGen ...: and emitted to bazel's stdout (so neither --test_output=* nor any test-level setting could suppress it). Fixed at the source by gating metamorphicEligible() on testing.Testing() so only actual Go test binaries initialize.

Note on --remote_download_outputs warning

The bazel warning about --remote_download_minimal conflicting with --remote_download_toplevel (from --config=engflowpublic) still appears but is harmless. Bazel gives explicit command-line flags precedence over config expansions, so minimal is in effect as intended. A code comment documents this.

Verified

A throwaway commit deliberately breaking TestAnnotateCtxTags was pushed to confirm the failure surfaces clearly under the new configuration. Output:

FAIL: //pkg/util/log:log_test (see .../test.log)
INFO: Build completed, 1 test FAILED
//pkg/util/log:log_test  FAILED in 6.8s
| `//pkg/util/log:log_test` | `TestAnnotateCtxTags` | `FAILURE` | [Link](https://mesolite.cluster.engflow.com/...) |

Epic: none

The unit_tests job streams two high-volume, low-signal categories to the
GitHub Actions log:

  * a per-test PASSED line for every cached/passing target (~920 lines
    on a green run, ~33% of total output), emitted by the default
    bazel --test_output=summary;
  * incremental build-progress lines of the form `[N / M] ...` (~910
    lines, another ~33%), emitted by bazel's UI progress reporter.

Neither helps when triaging a failure: on green runs they bury the
EngFlow invocation link; on red runs the failure summary table and the
per-test logs (on EngFlow) are what matter, not which actions were
cached.

Set --test_output=errors so output of failing tests is still printed
(passing tests stay silent) and --noshow_progress to drop the build
progress ticker. Bazel still emits ERROR: lines on build failure and
the final `Executed N out of M` summary, so failure signal is intact.

Release note: None

Epic: none
@tbg tbg self-assigned this May 19, 2026
@trunk-io
Copy link
Copy Markdown
Contributor

trunk-io Bot commented May 19, 2026

😎 Merged successfully - details.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

tbg added 2 commits May 19, 2026 10:06
The crdb_test build tag is inherited by any binary built from a package
that transitively imports this one. That includes the gomock reflect
helpers that rules_go links and executes at build time to generate mock
sources. Those processes have no reason to randomize constants, and
their stderr is captured into bazel's stdout, producing many

  INFO: From GoMockReflectExecOnlyGen pkg/.../...:
  metamorphic: use COCKROACH_RANDOM_SEED=... for reproduction
  initialized metamorphic constant "..." with value ...

blocks during the build phase of CI runs. These are pure noise: the
helpers don't run tests, the values they pick are never used by anyone,
and the lines are emitted before any test target executes.

Gate metamorphicEligible() on testing.Testing(), which is set via a
linker-injected var in `go test` binaries and reads as a constant from
init(). Build-time helpers see it as false and skip the random
initialization (and its logging) entirely.

Precedent for importing testing into a non-test source file under a
crdb_test gate: pkg/sql/inspect/check_helpers.go:491.

Release note: None
Epic: none
--test_output=errors dumps the full captured stderr/stdout of any
failed test target into the GitHub Actions log. For a target like
pkg/util/log:log_test containing dozens of test cases, that means the
entire per-test trail (setup banners, t.Log noise, RUN/PASS markers if
-test.v is on) lands inline whenever any single test in the binary
fails — typically ~99% of the dump is unrelated to the failure.

The full test.log is always available on EngFlow at the per-target
invocation link that bazel prints in summary mode, so dumping it inline
in the GHA log is pure redundancy.

Switch to --test_output=summary. The GHA log then contains one
FAILED/TIMEOUT line per failing target plus the EngFlow invocation
link; users click through for full test output. The bazel-github-helper
postprocessor still publishes per-test annotations via summarize-build,
so failed test names remain discoverable from the PR UI without
clicking.

Release note: None
Epic: none
@tbg tbg force-pushed the ci-quieter-unit-tests branch 4 times, most recently from b982852 to 54b8552 Compare May 19, 2026 11:01
@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented May 19, 2026

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@tbg tbg marked this pull request as ready for review May 19, 2026 11:02
@tbg tbg requested a review from a team as a code owner May 19, 2026 11:02
@tbg tbg removed their assignment May 19, 2026
@blathers-crl
Copy link
Copy Markdown

blathers-crl Bot commented May 19, 2026

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

Comment thread build/github/unit-tests.sh
@tbg tbg requested a review from rickystewart May 21, 2026 08:27
Comment thread build/github/unit-tests.sh
Copy link
Copy Markdown
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct - the removal was not a no-op. Here's the analysis:

Key Finding

Bazel's precedence is explicit command-line flags > config expansions, regardless of argument order. This means --remote_download_minimal on the command line overrides --remote_download_toplevel from --config=engflowpublic, even when the config appears later.

Evidence

1. Empirical testing with bazel canonicalize-flags:

# Config AFTER explicit flag (the actual CI order)
$ bazel canonicalize-flags -- --remote_download_minimal --config engflowpublic
--remote_download_outputs=minimal

# Just the config (no explicit flag)
$ bazel canonicalize-flags -- --config engflowpublic
(none - uses default toplevel)

2. Verification from actual CI logs (PR #170724):

The command runs with both flags and shows the warning, but minimal was actually in effect.

3. Bazel documentation:

From bazelrc docs: "Options on the command line always take precedence over those in rc files."

Behavior Change

  • Before (with flag): --remote_download_outputs=minimal → downloads only final outputs
  • After (without flag): --remote_download_outputs=toplevel → downloads top-level outputs (more data)

The warning was real, but the explicit flag was winning. I'll revert this change.

tbg added 2 commits May 22, 2026 10:51
After --test_output=summary, bazel still emits its end-of-run summary
roster listing every test target with PASSED/FAILED status. For a CI
job that runs ~900 test targets, that's ~900 lines of PASSED noise
trailing the actual failure line, with each cached target taking a row
like:

  //pkg/build/bazel:bazel_test          (cached) PASSED in 0.1s

--test_summary=terse drops the PASSED rows and keeps only the failed
targets (plus the FLAKY/TIMEOUT categories). The "Executed N out of M"
line stays, so the overall run shape is still visible at a glance.

Release note: None
Epic: none
Add comment explaining that --remote_download_minimal is intentionally
kept despite the bazel warning about conflict with --remote_download_toplevel
from --config=engflowpublic. Bazel gives explicit command-line flags
precedence over config expansions, so the warning can be ignored.

Release note: None
Epic: none
@tbg tbg force-pushed the ci-quieter-unit-tests branch from 54b8552 to 15ce15b Compare May 22, 2026 08:52
@tbg tbg requested a review from rickystewart May 22, 2026 08:53
@trunk-io trunk-io Bot merged commit 45687f8 into cockroachdb:master May 26, 2026
37 of 38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants