Surface untested source files at 0% coverage in reports #9
Draft
olivembo wants to merge 6 commits into
Introduces a reusable coverage toolchain based on llvm-cov:
- coverage/merger.py: per-test profraw -> profdata + object file packaging
- coverage/reporter.py: cross-test aggregation, HTML/LCOV/text reports
- coverage/effective_coverage.py: justification overlay + effective metrics
- coverage/justify.py: justification manifest resolution
- coverage/defs.bzl: score_coverage_reporter macro for consumer wiring
- coverage/coverage.bazelrc: shared coverage flags
- coverage/filter_regexes.txt: baseline source exclusions
- coverage/generate_coverage_html.sh: convenience entry point
Adds an end-to-end example under tests/coverage exercising the pipeline with a small instrumented library, test, justification file, and consumer filter regexes.
Known limitation: source files not linked into any cc_test are not yet included in the report (no instrumented object file -> invisible to llvm-cov).
llvm-cov only reports files linked into at least one test binary.
Sources that exist in the workspace but that no test pulls in silently
disappear from the report, causing coverage to appear higher than
it actually is.
Three mechanisms work together to fix this:
Bazel aspect + manifest rule (coverage/defs.bzl):
_collect_sources_aspect walks the dependency graph of all configured
targets and collects C/C++ source files. score_instrumented_sources_manifest
writes a workspace-relative path-per-line manifest.
score_coverage_reporter gains an optional instrumented_sources_manifest
parameter that passes the manifest to the reporter via
--instrumented_sources_manifest.
Reporter augmentation (coverage/reporter.py):
After llvm-cov export, the reporter compares the manifest against
covered sources from the LCOV output. For each missing file a
synthetic 0%-coverage LCOV record is appended (SF + DA per
non-blank line + LF/LH). The llvm-cov text summary TOTALS line is
updated in-place using re.finditer to preserve fixed-width column
alignment. Per-file HTML pages and a "Not Linked Into Tests" index
section are generated for visibility.
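The synthetic record described above can be sketched as follows. This is a minimal illustration, not the code in coverage/reporter.py: the function name `zero_coverage_record` and its signature are hypothetical, and it follows the "DA per non-blank line" rule stated in this commit.

```python
def zero_coverage_record(display_path: str, source_text: str) -> str:
    """Build one LCOV record that marks every non-blank line as unexecuted.

    Hypothetical helper: the name and signature are assumptions for
    illustration, not the reporter's actual API.
    """
    da_lines = [
        f"DA:{lineno},0"  # DA:<line>,<hit count>; a count of 0 means never executed
        for lineno, text in enumerate(source_text.splitlines(), start=1)
        if text.strip()
    ]
    record = [
        f"SF:{display_path}",    # source file path
        *da_lines,
        f"LF:{len(da_lines)}",   # lines found (instrumented)
        "LH:0",                  # lines hit
        "end_of_record",
    ]
    return "\n".join(record) + "\n"
```

Appending the returned string to the existing LCOV output is enough for tools that consume LCOV to list the file at 0%.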
Correctness and security fixes applied during review:
- Use str.replace() instead of str.format() for HTML template
rendering so that { and } in C++ source bodies do not crash the
reporter with KeyError/ValueError.
- Separate stderr from stdout for run_llvm_cov_export and
run_llvm_cov_report (separate_stderr=True) so that llvm-cov
warning messages are not mixed into LCOV/summary output.
- Validate that resolved manifest paths stay within workspace_root
via Path.is_relative_to() before reading files.
- Extend _escape_html to cover single quotes (') and double quotes (") so that
file paths with apostrophes do not break HTML attributes.
- Count only non-blank lines for LF in synthetic LCOV records
to avoid inflating the denominator in aggregate metrics.
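The workspace-bounds check from the list above might look roughly like this. It is a sketch under assumptions: `resolve_within_workspace` is a hypothetical name, and the real reporter may structure the check differently. `Path.is_relative_to()` requires Python 3.9 or newer.

```python
from pathlib import Path
from typing import Optional


def resolve_within_workspace(workspace_root: Path, manifest_entry: str) -> Optional[Path]:
    """Resolve a manifest entry and reject anything outside the workspace.

    Hypothetical helper for illustration. Resolving both sides first means
    that `..` segments and symlinks are taken into account before comparing.
    """
    root = workspace_root.resolve()
    candidate = (root / manifest_entry).resolve()
    if candidate.is_relative_to(root):
        return candidate
    return None  # the caller skips (or logs) entries that escape the workspace
```

Entries such as `../outside.cpp` or an absolute path like `/etc/passwd` resolve to locations outside the root and are therefore rejected before any file is read.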
Test fixture (tests/coverage/uncovered.cpp, uncovered.h, BUILD.bazel):
cc_library intentionally not linked into any cc_test. Verifies that
the reporter surfaces the file at 0% coverage rather than omitting it.
Docs (coverage/README.md): new section 5a with usage example.
- Use heuristic to identify instrumentable lines instead of counting all non-blank lines. Filters comments, preprocessor directives, lone braces, namespace declarations, and access specifiers to avoid inflating LF values in synthetic LCOV records.
- Augment summary.txt and console output with untested file line counts so the visible TOTALS reflect the true coverage including 0%-files.
- Parse the llvm-cov column header to determine the Lines group index dynamically instead of hardcoding position 1.
- Add workspace-bounds check after resolve() in _find_untested_sources to prevent path traversal via symlinks.
- Escape single and double quotes in _escape_html to prevent attribute breakout in generated HTML pages.
- Narrow _NON_EXECUTABLE_RE block-comment pattern from `|\*.*` to `|\*(?:[/\s].*)?` so that pointer dereferences (`*ptr = value;`) are correctly classified as executable.
- Add py_test with unit tests for all reporter augmentation helpers: _is_likely_executable, _count_instrumentable_lines, _covered_sources_from_lcov, _find_untested_sources, _append_zero_coverage_lcov, _augment_text_summary, _escape_html. Includes a path-traversal rejection test for _find_untested_sources.
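The effect of the narrowed pattern can be checked in isolation. The full _NON_EXECUTABLE_RE is not shown in this PR text, so the two expressions below are reduced stand-ins: only the alternation quoted above is taken from the commit, while the `^\s*` prefix, the `$` anchor, and the names `OLD_BLOCK_COMMENT` and `NEW_BLOCK_COMMENT` are assumptions.

```python
import re

# Reduced stand-ins for the changed alternation (anchors are assumed).
OLD_BLOCK_COMMENT = re.compile(r"^\s*\*.*$")
NEW_BLOCK_COMMENT = re.compile(r"^\s*\*(?:[/\s].*)?$")


def is_comment_continuation(line: str, pattern: re.Pattern) -> bool:
    """Return True when the pattern treats the line as part of a block comment."""
    return pattern.match(line) is not None


samples = [
    " * continuation of a block comment",
    " */",
    " *",
    "*ptr = value;",
]
for line in samples:
    print(repr(line),
          is_comment_continuation(line, OLD_BLOCK_COMMENT),
          is_comment_continuation(line, NEW_BLOCK_COMMENT))
```

The narrowed form still accepts a bare `*`, `*/`, and `*` followed by whitespace, but no longer matches a `*` followed directly by an identifier. A dereference written with a space after the star (`* ptr = value;`) would still be treated as a comment line, which is a limit of any line-based heuristic.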
- Add py_library target for reporter so unit tests can import it.
- Remove coverable_test from instrumented_sources manifest targets to avoid testonly dependency violation (test sources don't need to appear in the manifest; they're tested by definition).
The heuristic line count (_count_instrumentable_lines) cannot replicate what llvm-cov would report for actually-instrumented objects. Rewriting TOTALS with approximate numbers gives false precision.
- _augment_text_summary: no longer rewrites the TOTALS line; appends a clearly-labelled WARNING banner with ~N estimated lines instead.
- _inject_untested_section_into_index: injects a prominent banner right after <body> so it is the first thing reviewers see. Detail table uses ~N notation and includes a disclaimer about the heuristic.
- _append_zero_coverage_lcov docstring documents the approximation.
- Tests updated to match banner-only behavior.
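The banner-only behavior can be illustrated with a short sketch. The function name `append_untested_banner`, its signature, and the banner wording are all assumptions; only the idea (leave the llvm-cov summary untouched, append a labelled estimate) comes from the commit message above.

```python
def append_untested_banner(summary: str, estimates: dict) -> str:
    """Append a WARNING banner listing untested files; never edit existing lines.

    `estimates` maps a workspace-relative path to its heuristic line count.
    Hypothetical helper: names and banner text are illustrative only.
    """
    if not estimates:
        return summary  # nothing untested, so the summary is returned unchanged
    total = sum(estimates.values())
    banner = [
        f"WARNING: {len(estimates)} source file(s) are not linked into any test.",
        f"They add ~{total} lines at 0% coverage (estimated via heuristic).",
        "The totals above do NOT include these files.",
    ]
    banner += [f"  ~{count:>6}  {path}" for path, count in sorted(estimates.items())]
    return summary.rstrip("\n") + "\n\n" + "\n".join(banner) + "\n"
```

Because the original text is only appended to, fixed-width column alignment in the llvm-cov output cannot be disturbed, and the approximate figures stay visibly separate from the measured ones.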
Summary
llvm-cov only reports files whose object files are linked into at least one test. Source files that exist in the workspace but no cc_test pulls in silently disappear from the coverage report.

This PR adds a Bazel aspect that walks the dependency graph to collect all C/C++ sources, compares them against what llvm-cov actually reported, and augments the LCOV, HTML, and text outputs with synthetic 0%-coverage entries for the missing files.

Approach

Why an aspect + manifest instead of patching --instrumentation_filter?

instrumentation_filter controls which targets get built with coverage instrumentation, but llvm-cov still only reports files whose .o is linked into a test binary. There's no Bazel-native way to surface the gap. The aspect walks deps/srcs/implementation_deps transitively and writes a manifest of all reachable .cpp/.cc/.cxx/.c files. The reporter then diffs manifest vs. LCOV output.

Why heuristic line counts instead of exact numbers?

Without running llvm-cov against actual instrumented object files (which don't exist for untested sources), we can't get exact instrumentable line counts. The heuristic (_count_instrumentable_lines) filters blank lines, comments, preprocessor directives, lone braces, and namespace declarations. All outputs explicitly label these as estimates (~N, "estimated via heuristic") to avoid false precision. The TOTALS line in summary.txt is intentionally left untouched; only a WARNING banner is appended.

What changed

- coverage/defs.bzl: _collect_sources_aspect, score_instrumented_sources_manifest rule, instrumented_sources_manifest parameter on score_coverage_reporter macro
- coverage/reporter.py: LCOV augmentation, HTML augmentation (top-banner + per-file pages + detail table), text summary banner, workspace-bounds check, HTML escaping including quotes
- coverage/BUILD.bazel: reporter_lib py_library for testability
- tests/coverage/: uncovered.cpp/.h fixture (library with no test), reporter_test.py with unit tests for all augmentation helpers including path-traversal rejection

Consumer usage