Skip to content

Commit e7ca88e

Browse files
committed
refactor(contracts): make the coverage-sharding scripts maintainable
Address review feedback that the test-sharding / coverage scripts were hard to follow, with no change to results (parallel coverage in CI and locally, identical merged coverage). Rewrite mergeLcov.ts into a documented, unit-tested pure core (output byte-identical to before) and add its unit test; extract named helpers in runParallelCoverage.ts; document why runShardCoverage.ts must bypass the shell (the npx/HH308 brace-expansion, verified) and why compile.ts gates the coverage compile-skip on solidity-coverage's __SOLIDITY_COVERAGE_RUNNING flag; add a README explaining the whole shard pipeline. Verified locally: the mergeLcov unit test, parallel coverage at 461 files / 96.459% lines / 97.160% functions, and the solhint ratchet unchanged at 181. Signed-off-by: Miguel_LZPF <miguel.carpena@io.builders>
1 parent 10943f1 commit e7ca88e

11 files changed

Lines changed: 432 additions & 177 deletions

File tree

Lines changed: 95 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,95 @@
1+
# Test sharding & parallel coverage
2+
3+
This directory + a few test-side files implement two things on top of the ATS mega-asset test suite:
4+
5+
1. **Parallel local test runs** (`npm run test:parallel:ats`) — split the mega-asset core across mocha workers.
6+
2. **Sharded coverage** — run `solidity-coverage` in parallel and merge into one report. Both in CI (a 4-way
7+
GitHub Actions matrix) and locally (`npm run test:coverage:parallel`, one git worktree per shard).
8+
9+
Net effect: CI coverage ~14m → ~7m, local coverage ~11m → ~5m, with the **same** merged numbers as a serial run.
10+
11+
## The constraint that shapes everything
12+
13+
The IAsset suites export a `…Tests(getCtx)` function and **do not self-register** a `describe`. The only file
14+
that executes them is `ats.test.ts`. Consequences:
15+
16+
- Coverage **cannot** be sharded by directory glob (`--testfiles 'layer_1/**'` would run none of them).
17+
- `mocha --parallel` shards by **file**, so the whole core in one file = one worker = no speedup.
18+
19+
So a single piece of infrastructure feeds both goals: a discovery + balancing core, driven two ways.
20+
21+
## The pieces
22+
23+
| File | Role |
24+
| ------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
25+
| `test/contracts/integration/suiteDiscovery.ts` | Discover the IAsset suite files; `suiteWeight` (an `it()` count) and `assignBalancedShards` (greedy longest-processing-time bin-packing) to split them into evenly-weighted shards. |
26+
| `test/contracts/integration/atsShardRunner.ts` | `runAtsShard(index, total)` — deploy the mega-asset **once** (`loadFixture`) and run only this shard's balanced slice. |
27+
| `test/contracts/integration/ats.test.ts` | The entry. Reads `ATS_MEGA_SHARD_INDEX/TOTAL` (default `0/1` = every suite, one deploy = the serial CI run). |
28+
| `test/contracts/integration/shards/ats.shard.{1..8}.test.ts` | Physical entries for **local** `mocha --parallel` (one file → one worker → `runAtsShard(k-1, 8)`). |
29+
| `scripts/tools/coverage-shard/planShard.ts` | A coverage shard's file list: the mega entry (in every shard) + a balanced slice of the standalone suites. |
30+
| `scripts/tools/coverage-shard/runShardCoverage.ts` | Run `hardhat coverage` for ONE shard (CI calls per runner; the local runner calls per worktree). |
31+
| `scripts/tools/coverage-shard/mergeLcov.ts` | Union the per-shard lcovs into one report (preserving line/function/branch). |
32+
| `scripts/tools/coverage-shard/runParallelCoverage.ts` | Local orchestrator: a git worktree per shard, run them concurrently, merge, tear down. |
33+
| `tasks/compile.ts` | Skips the redundant second compile `hardhat coverage` would trigger (see below). |
34+
| `.github/workflows/100-flow-ats-test.yaml` | The CI matrix (`coverage-ats`, 4 shards) + the `merge-coverage` job. |
35+
36+
## Two parallelism mechanisms — and why both exist
37+
38+
| | Parallelism unit | How a shard is selected |
39+
| ------------------------------------------ | -------------------------------- | -------------------------------------------------------- |
40+
| **Local tests** (`test:parallel:ats`) | mocha workers (one per **file**) | the 8 physical `shards/ats.shard.N.test.ts` files |
41+
| **Coverage** (CI matrix & local worktrees) | separate runners / worktrees | one `ats.test.ts` parametrised by `ATS_MEGA_SHARD_*` env |
42+
43+
mocha can only parallelise across _files_, so local parallel tests need physical shard files. Coverage gets its
44+
isolation from separate runners/worktrees, so it needs no extra files — the shard count is just an env var.
45+
46+
## Why worktrees for local parallel coverage
47+
48+
A coverage run **rewrites generated source in place** — the compile hook regenerates
49+
`contracts/infrastructure/utils/EvmAccessors.sol` and `scripts/domain/atsRegistry.generated.ts`, and
50+
solidity-coverage instruments the build. N concurrent shards in one checkout would race and corrupt each other.
51+
A git worktree per shard gives each an isolated working tree (the same isolation CI gets from separate runners);
52+
`node_modules` is symlinked, not reinstalled. Shard lcovs have worktree-absolute `SF:` paths, so they are
53+
normalised to package-relative before merging (otherwise the merge can't union them).
54+
55+
## Why an in-repo lcov merger (not `lcov` / `lcov-result-merger`)
56+
57+
- `lcov-result-merger` (npm) silently **drops function coverage**.
58+
- The system `lcov` CLI reproduces the numbers only with a large `--rc`/`--ignore-errors` flag soup, emits
59+
thousands of "inconsistent" warnings on solidity-coverage's output, and rewrites the file in a newer
60+
function-record format (`FNL`/`FNA`) whose Codecov support is unverified.
61+
62+
`mergeLcov.ts` is ~100 lines, exact, dependency-free, and unit-tested
63+
(`test/scripts/unit/tools/mergeLcov.test.ts`).
64+
65+
## Why the compile-skip in `tasks/compile.ts`
66+
67+
`hardhat coverage` compiles the instrumented sources itself and then runs `TASK_TEST`, which compiles **again**.
68+
That second pass is a wasted ~60s/shard. The override forces `noCompile` while a coverage run is in progress,
69+
detected via solidity-coverage's own `__SOLIDITY_COVERAGE_RUNNING` HRE flag (set only during `hardhat coverage`).
70+
71+
## `runShardCoverage` must not go through a shell
72+
73+
`solidity-coverage --testfiles` takes one glob; for >1 file we pass a brace list `{a,b,c}`. We run Hardhat's CLI
74+
directly under `node` (not `npx`), because a shell brace-expands `{a,b,c}` into separate arguments, which Hardhat
75+
then rejects with `HH308: Unrecognized positional argument …` (verified).
76+
77+
## Running it
78+
79+
```bash
80+
npm run test # serial, all suites — the CI gate (ats.test.ts at 0/1)
81+
npm run test:parallel:ats # local parallel tests across the 8 shard files
82+
npm run test:coverage # serial single-run coverage (SHARD_TOTAL=1)
83+
npm run test:coverage:parallel # local parallel coverage (4 worktrees) — needs `lcov`? no, uses mergeLcov.ts
84+
SHARD_INDEX=2 SHARD_TOTAL=4 npm run test:coverage:shard # reproduce one CI coverage shard locally
85+
```
86+
87+
## Maintenance notes
88+
89+
- **Changing the CI shard count:** edit the matrix `shard: [...]` list AND `SHARD_TOTAL` in
90+
`100-flow-ats-test.yaml` (keep them equal). `planShard`/`mergeLcov` handle any count. 4 is the measured
91+
sweet spot — beyond it the per-shard ~2m45s instrument+compile floor dominates and wall-clock stops improving.
92+
- **The 8 local shard files hardcode `N=8`** (`runAtsShard(k-1, 8)`). Changing local parallel granularity means
93+
adding/removing files and keeping the indices in sync; `MIN_EXPECTED_SUITES` in `suiteDiscovery` fails loudly
94+
if discovery ever drops suites.
95+
- Adding an IAsset suite needs no changes here — discovery picks up any `…Tests(getCtx)` file automatically.
Lines changed: 109 additions & 70 deletions
Original file line numberDiff line numberDiff line change
@@ -1,71 +1,77 @@
11
// SPDX-License-Identifier: Apache-2.0
22
//
3-
// Merge several LCOV reports (one per coverage shard) into a single LCOV report.
3+
// Merge per-shard LCOV reports into one, summing coverage across shards.
44
//
5-
// Each shard instruments the SAME contract set but exercises a disjoint slice of the tests, so the
6-
// union of their hit counts is the full-suite coverage. This sums per-line (DA), per-function
7-
// (FN/FNDA) and per-branch (BRDA) hits across shards and recomputes the LF/LH/FNF/FNH/BRF/BRH
8-
// summary lines — preserving ALL three metrics. (The off-the-shelf `lcov-result-merger` drops FN
9-
// records, silently losing function coverage; `lcov` is not always installed. A small in-repo
10-
// merger avoids both the lost metric and a runtime-fetched dependency.)
5+
// Each coverage shard instruments the SAME contracts but runs a DISJOINT slice of the tests, so the
6+
// full-suite coverage is the union: a line/function/branch counts as "hit" if ANY shard hit it.
7+
// This merger parses the handful of LCOV record types solidity-coverage emits, sums the hit counts
8+
// per source file, and re-emits one report. See ./README.md for the whole shard→merge pipeline.
119
//
12-
// Usage: npx tsx scripts/tools/coverage-shard/mergeLcov.ts <output.info> <input...>
13-
// where each <input> is an lcov file or a directory searched recursively for `*.info`.
10+
// Why an in-repo merger and not a standard tool:
11+
// - `lcov-result-merger` (npm) silently DROPS function records → function coverage lost.
12+
// - `lcov` 2.x reproduces the numbers only with a large --rc/--ignore-errors flag soup, emits
13+
// thousands of "inconsistent" warnings on solidity-coverage's output, and rewrites the file in a
14+
// newer function format (FNL/FNA) whose Codecov support is unverified.
15+
// This stays exact, dependency-free, and unit-tested (test/scripts/unit/tools/mergeLcov.test.ts).
16+
//
17+
// LCOV record types, one block per source file (see `man geninfo`):
18+
// TN:<test name> SF:<source file path>
19+
// FN:<line>,<name> function <name> is defined at <line>
20+
// FNDA:<hits>,<name> function <name> was called <hits> times
21+
// DA:<line>,<hits> line <line> executed <hits> times
22+
// BRDA:<line>,<block>,<branch>,<taken> branch outcome; <taken> is a count, or "-" if never reached
23+
// FNF/FNH · BRF/BRH · LF/LH found/hit summary counters (RECOMPUTED here on emit)
24+
// end_of_record
25+
//
26+
// CLI: npx tsx scripts/tools/coverage-shard/mergeLcov.ts <output.info> <input.info|dir ...>
1427

1528
import { readdirSync, statSync, readFileSync, writeFileSync, mkdirSync } from "fs";
1629
import { dirname, join } from "path";
1730

31+
/** Accumulated coverage for ONE source file. Maps are keyed so records across shards sum. */
1832
interface FileCoverage {
1933
testName: string;
20-
functionLine: Map<string, number>; // fn name -> definition line (FN)
21-
functionHits: Map<string, number>; // fn name -> summed hits (FNDA)
22-
lineHits: Map<number, number>; // line -> summed hits (DA)
23-
branchTaken: Map<string, number | "-">; // "line,block,branch" -> summed taken, or "-" if never evaluated
34+
functionDefLine: Map<string, number>; // FN: function name -> definition line
35+
functionHits: Map<string, number>; // FNDA: function name -> summed call count
36+
lineHits: Map<number, number>; // DA: line number -> summed execution count
37+
branchTaken: Map<string, number | "-">; // BRDA: "line,block,branch" -> summed taken, or "-" if unreached
2438
}
2539

26-
function collectInputs(paths: string[]): string[] {
27-
const files: string[] = [];
28-
for (const p of paths) {
29-
if (statSync(p).isDirectory()) {
30-
for (const entry of readdirSync(p)) files.push(...collectInputs([join(p, entry)]));
31-
} else if (p.endsWith(".info")) {
32-
files.push(p);
33-
}
34-
}
35-
return files;
40+
type CoverageByFile = Map<string, FileCoverage>;
41+
42+
function newFileCoverage(testName: string): FileCoverage {
43+
return {
44+
testName,
45+
functionDefLine: new Map(),
46+
functionHits: new Map(),
47+
lineHits: new Map(),
48+
branchTaken: new Map(),
49+
};
3650
}
3751

38-
function mergeInto(coverage: Map<string, FileCoverage>, lcov: string): void {
52+
/** Parse one LCOV report's text and sum its records into `acc` (the running union across shards). */
53+
function accumulate(acc: CoverageByFile, lcovText: string): void {
54+
// `TN:` precedes `SF:`; hold its value until SF: names the file the block belongs to.
55+
let pendingTestName = "";
3956
let current: FileCoverage | undefined;
40-
for (const raw of lcov.split("\n")) {
57+
58+
for (const raw of lcovText.split("\n")) {
4159
const line = raw.trim();
60+
4261
if (line.startsWith("TN:")) {
43-
// Deferred until SF: associates the record with a file.
44-
current = {
45-
testName: line.slice(3),
46-
functionLine: new Map(),
47-
functionHits: new Map(),
48-
lineHits: new Map(),
49-
branchTaken: new Map(),
50-
} as FileCoverage;
62+
pendingTestName = line.slice(3);
5163
} else if (line.startsWith("SF:")) {
52-
const sf = line.slice(3);
53-
if (!coverage.has(sf)) {
54-
coverage.set(sf, {
55-
testName: current?.testName ?? "",
56-
functionLine: new Map(),
57-
functionHits: new Map(),
58-
lineHits: new Map(),
59-
branchTaken: new Map(),
60-
});
64+
const sourceFile = line.slice(3);
65+
current = acc.get(sourceFile);
66+
if (!current) {
67+
current = newFileCoverage(pendingTestName);
68+
acc.set(sourceFile, current);
6169
}
62-
current = coverage.get(sf);
6370
} else if (!current) {
64-
continue;
71+
continue; // records before the first SF: have no file to attach to
6572
} else if (line.startsWith("FN:")) {
6673
const [defLine, ...nameParts] = line.slice(3).split(",");
67-
const name = nameParts.join(",");
68-
current.functionLine.set(name, Number(defLine));
74+
current.functionDefLine.set(nameParts.join(","), Number(defLine));
6975
} else if (line.startsWith("FNDA:")) {
7076
const [hits, ...nameParts] = line.slice(5).split(",");
7177
const name = nameParts.join(",");
@@ -77,57 +83,90 @@ function mergeInto(coverage: Map<string, FileCoverage>, lcov: string): void {
7783
} else if (line.startsWith("BRDA:")) {
7884
const [ln, block, branch, taken] = line.slice(5).split(",");
7985
const key = `${ln},${block},${branch}`;
80-
const prev = current.branchTaken.get(key);
86+
const previous = current.branchTaken.get(key);
8187
if (taken === "-") {
82-
if (prev === undefined) current.branchTaken.set(key, "-");
88+
// Only record "unreached" if no shard ever reached this branch.
89+
if (previous === undefined) current.branchTaken.set(key, "-");
8390
} else {
84-
const base = typeof prev === "number" ? prev : 0;
91+
const base = typeof previous === "number" ? previous : 0;
8592
current.branchTaken.set(key, base + Number(taken));
8693
}
8794
}
8895
}
8996
}
9097

91-
function emit(coverage: Map<string, FileCoverage>): string {
98+
/** Render merged coverage back to LCOV text, recomputing the found/hit (`*F`/`*H`) counters. */
99+
function render(acc: CoverageByFile): string {
92100
const out: string[] = [];
93-
for (const [sf, fc] of coverage) {
101+
for (const [sourceFile, fc] of acc) {
94102
out.push(`TN:${fc.testName}`);
95-
out.push(`SF:${sf}`);
96-
for (const [name, defLine] of fc.functionLine) out.push(`FN:${defLine},${name}`);
103+
out.push(`SF:${sourceFile}`);
104+
105+
for (const [name, defLine] of fc.functionDefLine) out.push(`FN:${defLine},${name}`);
97106
for (const [name, hits] of fc.functionHits) out.push(`FNDA:${hits},${name}`);
98-
const fnHit = [...fc.functionHits.values()].filter((h) => h > 0).length;
99-
out.push(`FNF:${fc.functionLine.size}`);
100-
out.push(`FNH:${fnHit}`);
101-
for (const [key, taken] of fc.branchTaken) out.push(`BRDA:${key},${taken === "-" ? "-" : taken}`);
102-
const brFound = fc.branchTaken.size;
103-
const brHit = [...fc.branchTaken.values()].filter((t) => typeof t === "number" && t > 0).length;
104-
out.push(`BRF:${brFound}`);
105-
out.push(`BRH:${brHit}`);
107+
out.push(`FNF:${fc.functionDefLine.size}`);
108+
out.push(`FNH:${countHit(fc.functionHits.values())}`);
109+
110+
for (const [key, taken] of fc.branchTaken) out.push(`BRDA:${key},${taken}`);
111+
out.push(`BRF:${fc.branchTaken.size}`);
112+
out.push(`BRH:${[...fc.branchTaken.values()].filter((t) => typeof t === "number" && t > 0).length}`);
113+
106114
for (const [ln, hits] of [...fc.lineHits.entries()].sort((a, b) => a[0] - b[0])) out.push(`DA:${ln},${hits}`);
107-
const lHit = [...fc.lineHits.values()].filter((h) => h > 0).length;
108115
out.push(`LF:${fc.lineHits.size}`);
109-
out.push(`LH:${lHit}`);
116+
out.push(`LH:${countHit(fc.lineHits.values())}`);
117+
110118
out.push("end_of_record");
111119
}
112120
return out.join("\n") + "\n";
113121
}
114122

123+
/** Number of entries with a positive hit count (the `*H` half of an LCOV found/hit pair). */
124+
function countHit(hits: Iterable<number>): number {
125+
let n = 0;
126+
for (const h of hits) if (h > 0) n++;
127+
return n;
128+
}
129+
130+
/**
131+
* Merge several LCOV report CONTENTS (not file paths) into one LCOV text. Pure and order-independent
132+
* over the union — the unit-test entry point.
133+
*/
134+
export function mergeLcovReports(reports: string[]): string {
135+
const acc: CoverageByFile = new Map();
136+
for (const text of reports) accumulate(acc, text);
137+
return render(acc);
138+
}
139+
140+
/** Recursively collect `*.info` files under the given paths (a path may be a file or a directory). */
141+
function collectInfoFiles(paths: string[]): string[] {
142+
const files: string[] = [];
143+
for (const p of paths) {
144+
if (statSync(p).isDirectory()) {
145+
for (const entry of readdirSync(p)) files.push(...collectInfoFiles([join(p, entry)]));
146+
} else if (p.endsWith(".info")) {
147+
files.push(p);
148+
}
149+
}
150+
return files;
151+
}
152+
115153
function main(): void {
116154
const [output, ...inputs] = process.argv.slice(2);
117155
if (!output || inputs.length === 0) {
118-
process.stderr.write("usage: mergeLcov.ts <output.info> <input.info|dir...>\n");
156+
process.stderr.write("usage: mergeLcov.ts <output.info> <input.info|dir ...>\n");
119157
process.exit(1);
120158
}
121-
const files = collectInputs(inputs);
159+
const files = collectInfoFiles(inputs);
122160
if (files.length === 0) {
123161
process.stderr.write(`mergeLcov: no .info files found under ${inputs.join(", ")}\n`);
124162
process.exit(1);
125163
}
126-
const coverage = new Map<string, FileCoverage>();
127-
for (const file of files) mergeInto(coverage, readFileSync(file, "utf8"));
164+
const merged = mergeLcovReports(files.map((f) => readFileSync(f, "utf8")));
128165
mkdirSync(dirname(output), { recursive: true });
129-
writeFileSync(output, emit(coverage));
130-
process.stdout.write(`mergeLcov: merged ${files.length} reports → ${output} (${coverage.size} files)\n`);
166+
writeFileSync(output, merged);
167+
const fileCount = (merged.match(/^SF:/gm) || []).length;
168+
process.stdout.write(`mergeLcov: merged ${files.length} reports → ${output} (${fileCount} files)\n`);
131169
}
132170

133-
main();
171+
// Run only when invoked as a CLI; importing the module (e.g. from the unit test) does not run main.
172+
if (require.main === module) main();

0 commit comments

Comments
 (0)