| 1 | +# Branch Coverage MVP via Static Branch-Point Detection |
| 2 | + |
| 3 | +* Status: accepted |
| 4 | +* Date: 2026-05-04 |
| 5 | + |
| 6 | +## Context and Problem Statement |
| 7 | + |
| 8 | +Coverage today reports line-level execution only. Standard tooling (genhtml, Codecov, Coveralls) consumes branch records via the LCOV `BRDA`/`BRF`/`BRH` fields, which let reviewers see whether `else`/`elif` arms and individual `case` patterns were exercised. Adding true branch coverage to a Bash framework is non-trivial because: |
| 9 | + |
| 10 | +1. Bash exposes no native instrumentation comparable to gcov branch counters. |
| 11 | +2. The DEBUG trap fires on commands, not on branch decisions. |
| 12 | +3. Inside the DEBUG trap, `BASH_COMMAND` holds the command that is about to run, not the boolean outcome of a conditional. |
| 13 | + |
| 14 | +We need a path that yields useful, mostly-correct branch metrics in LCOV reports without breaking Bash 3.0+ compatibility or the cost profile of the existing line tracker. |
| 15 | + |
| 16 | +## Decision Drivers |
| 17 | + |
| 18 | +* Bash 3.0+ compatibility (no associative arrays, no `[[`, no Bash 4-only features). |
| 19 | +* Reuse existing line-hit data; do not double the runtime cost of coverage. |
| 20 | +* LCOV output must be consumable by genhtml, Codecov and Coveralls without custom processing. |
| 21 | +* Implementation must fit in `src/coverage.sh` and remain testable with the existing unit-test patterns. |
| 22 | +* Behavior must be predictable enough to pin in tests; "best-effort heuristic" outputs are not acceptable. |
| 23 | + |
| 24 | +## Considered Options |
| 25 | + |
| 26 | +1. **Static branch-point detection plus line-hit inference** — parse the source file for branch-introducing constructs (`if`/`elif`/`else`, `case` patterns), compute the line range owned by each outcome, then mark the outcome as "taken" iff any line inside its range was hit. |
| 27 | +2. **Runtime decision tracing via `BASH_COMMAND`** — record the actual command being executed in the DEBUG trap and reconstruct decisions taken (`if X` followed by execution of either then-block or else-block). |
| 28 | +3. **Patch-based instrumentation** — preprocess source files to insert hit recorders inside each branch arm, run tests against the instrumented copy, post-process the data file. |
| 29 | + |
| 30 | +## Decision Outcome |
| 31 | + |
| 32 | +Chosen option: **Option 1 (static branch-point detection plus line-hit inference)**. |
| 33 | + |
| 34 | +It reuses the existing line-hit data file with no DEBUG-trap changes. Bash 3.0+ compatibility is preserved because the parser is a single pass over the source with brace counting, identical in shape to the existing `extract_functions` walker. The output maps cleanly to LCOV `BRDA` records, and the contract ("an arm is taken iff any executable line inside it was hit") is precise enough to write unit tests against. |
| 35 | + |
| 36 | +### Positive Consequences |
| 37 | + |
| 38 | +* Zero runtime cost beyond the existing line tracker. Branch records are computed during report generation, not during test execution. |
| 39 | +* Reuses `is_executable_line` and `get_all_line_hits`, which already tolerate Bash 3.0 limitations. |
| 40 | +* LCOV output remains a single file, consumed unchanged by downstream tools. |
| 41 | + |
| 42 | +### Negative Consequences |
| 43 | + |
| 44 | +* Branch detection is line-presence based, not outcome based. An arm whose range contains no executable lines (for example, one holding only a comment) registers as not taken even if its condition fired. This is documented as a known limitation. |
| 45 | +* When an `if`/`elif` chain has no explicit `else`, only the explicit arms are reported. The synthetic "fall-through" (implicit-else) outcome is omitted from this MVP and may be added in a follow-up. |
| 46 | +* Compound conditionals (`if A && B`) are reported as a single binary decision, not per sub-expression. |
| 47 | + |
| 48 | +## Pros and Cons of the Options |
| 49 | + |
| 50 | +### Option 1: Static + line-hit inference (chosen) |
| 51 | + |
| 52 | +* Good, because reuses existing data and code paths. |
| 53 | +* Good, because matches the implementation pattern of `extract_functions` already shipping in the codebase. |
| 54 | +* Good, because output is deterministic and easy to test. |
| 55 | +* Bad, because cannot distinguish "arm executed but contains no executable lines" from "arm not executed". |
| 56 | + |
| 57 | +### Option 2: Runtime DEBUG-trap decision tracing |
| 58 | + |
| 59 | +* Good, because reflects actual runtime behavior. |
| 60 | +* Bad, because `BASH_COMMAND` semantics across Bash 3.x and 5.x diverge for `((...))`, `[[...]]` and pipelines, requiring per-version logic. |
| 61 | +* Bad, because increases per-line overhead; the existing tracker already has measurable cost. |
| 62 | +* Bad, because subshell context loss (already documented for line coverage) extends to branches taken inside `$(...)`. |
| 63 | + |
| 64 | +### Option 3: Source-rewrite instrumentation |
| 65 | + |
| 66 | +* Good, because most accurate signal possible. |
| 67 | +* Bad, because requires either running tests against a rewritten source tree or hooking `source` to redirect to instrumented copies — both invasive and brittle. |
| 68 | +* Bad, because debugging stack traces and line numbers no longer match the user's source. |
| 69 | +* Bad, because doubles the code surface and breaks the "DEBUG-trap only" simplicity model. |
| 70 | + |
| 71 | +## Scope of MVP |
| 72 | + |
| 73 | +Included: |
| 74 | + |
| 75 | +* `if`/`elif`/`else` chains: each arm is one outcome. |
| 76 | +* `case` statements: each pattern is one outcome. |
| 77 | +* LCOV `BRDA:<line>,<block>,<branch>,<taken>` lines. |
| 78 | +* `BRF:<count>` and `BRH:<count>` per file. |
| 79 | + |
| 80 | +Deferred (potential follow-ups): |
| 81 | + |
| 82 | +* Synthetic "implicit-else" outcomes for `if/elif` chains without an explicit `else`. |
| 83 | +* Per-sub-expression decisions inside `if A && B`. |
| 84 | +* `&&` / `||` short-circuit branches outside `if`. |
| 85 | +* Loop-entry decisions (`while`/`until`). |
| 86 | + |
| 87 | +## Links |
| 88 | + |
| 89 | +* Builds on the function extractor introduced in `src/coverage.sh` (see `bashunit::coverage::extract_functions`). |
| 90 | +* LCOV format reference: <https://manpages.debian.org/unstable/lcov/geninfo.1.en.html> |
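
The contract from the Decision Outcome ("an arm is taken iff any executable line inside it was hit") can be illustrated with a minimal sketch. The function names `sketch_arm_taken` and `sketch_emit_branches` are hypothetical and are not part of `src/coverage.sh`. The sketch assumes hit lines arrive as a space-separated list of line numbers, that branch keywords start their own line, and that only outermost `if`/`elif`/`else` chains are reported. It does not model `is_executable_line`, `get_all_line_hits`, or `case` patterns, and writing `<taken>` as `1` or `0` is an assumption about the output rather than a statement of what the implementation emits.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not the real src/coverage.sh API.

# Echo 1 if any line number in [first, last] appears in the
# space-separated hit list, else 0. An empty range yields 0.
sketch_arm_taken() {
  local first=$1 last=$2 hits=" $3 " n=$1
  while [ "$n" -le "$last" ]; do
    case "$hits" in *" $n "*) echo 1; return 0 ;; esac
    n=$((n + 1))
  done
  echo 0
}

# Print BRDA/BRF/BRH records for the outermost if/elif/else chains in a file.
# Assumption: keywords begin their own line; nested chains are not reported,
# but hits on their lines still count toward the enclosing outer arm.
sketch_emit_branches() {
  local file=$1 hits=$2
  local lineno=0 depth=0 block=-1 branch=0 decision=0 arm_start=0
  local found=0 hit=0 line trimmed taken
  while IFS= read -r line || [ -n "$line" ]; do
    lineno=$((lineno + 1))
    trimmed="${line#"${line%%[![:space:]]*}"}"   # strip leading whitespace
    case "$trimmed" in
      "if "*"; fi") ;;                            # one-line if: no arm lines to own
      "if "*)
        depth=$((depth + 1))
        if [ "$depth" -eq 1 ]; then
          block=$((block + 1)); branch=0
          decision=$lineno; arm_start=$((lineno + 1))
        fi
        ;;
      "elif "*|else|"else "*|fi|"fi "*|"fi;"*)
        if [ "$depth" -eq 1 ]; then
          # Close the arm that ends just before this keyword line.
          taken=$(sketch_arm_taken "$arm_start" "$((lineno - 1))" "$hits")
          echo "BRDA:$decision,$block,$branch,$taken"
          found=$((found + 1)); hit=$((hit + taken))
          branch=$((branch + 1)); arm_start=$((lineno + 1))
        fi
        case "$trimmed" in fi*) depth=$((depth - 1)) ;; esac
        ;;
    esac
  done < "$file"
  echo "BRF:$found"
  echo "BRH:$hit"
}

# Usage: a three-arm chain where only lines 2 and 3 were hit.
fixture=$(mktemp)
cat > "$fixture" <<'EOF'
check() {
  if [ "$1" = "a" ]; then
    echo "is a"
  elif [ "$1" = "b" ]; then
    echo "is b"
  else
    echo "other"
  fi
}
EOF
sketch_emit_branches "$fixture" "2 3"
# Prints:
# BRDA:2,0,0,1
# BRDA:2,0,1,0
# BRDA:2,0,2,0
# BRF:3
# BRH:1
rm -f "$fixture"
```

The keyword line itself is excluded from each arm's range because the `if` line is hit whenever the chain is evaluated, regardless of which arm runs. The sketch also shows the documented limitation directly: an arm with an empty range always yields `0`.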