
Commit b28fe68

perf(harness): downgrade sample ok when a batch step reports ok:false
Defensive belt-and-suspenders for the Codex review note: stop-only batch already surfaces a failed step as a top-level failure (caught by invokeCli), but if an on-error=continue mode ever keeps the batch ok while a step fails, don't silently count that step as a successful sample — derive ok from the step's own result.ok.
1 parent 9049561 commit b28fe68

1 file changed

Lines changed: 9 additions & 1 deletion


scripts/perf/cli.ts

@@ -61,7 +61,15 @@ export function invokeCli(args: string[], baseFlags: string[]): CliResult {

 // Wrap a single command in its own `batch` invocation to read per-step durationMs.
 export function invokeBatchStep(spec: BatchStepSpec, baseFlags: string[]): CliResult {
-  return invokeCli(['batch', '--steps', JSON.stringify([spec])], baseFlags);
+  const result = invokeCli(['batch', '--steps', JSON.stringify([spec])], baseFlags);
+  // Defensive: today's stop-only batch surfaces a failed step as a top-level non-zero/ok:false
+  // (already caught by invokeCli). But if a future on-error mode keeps the batch ok while a step
+  // fails, don't silently count that step as a success — downgrade ok from the step's own ok.
+  const stepOk = firstBatchResult(result.json)?.ok;
+  if (result.ok && stepOk === false) {
+    return { ...result, ok: false };
+  }
+  return result;
 }

 function firstBatchResult(json: unknown): Record<string, unknown> | undefined {
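
For readers who want to try the downgrade rule in isolation, here is a minimal standalone sketch. It is not the repository's code: the `CliResult` shape, the `results` array layout read by `firstBatchResult`, and the `downgradeOnStepFailure` helper name are all assumptions made for illustration, since the diff shows only the signature of `firstBatchResult` and not its body.

```typescript
// Hypothetical shape: the real CliResult in scripts/perf/cli.ts may carry more fields.
interface CliResult {
  ok: boolean;
  json: unknown;
}

// Assumed batch JSON layout: { results: [{ ok: boolean, ... }, ...] }.
// The real helper's body is not shown in the diff and may read a different path.
function firstBatchResult(json: unknown): Record<string, unknown> | undefined {
  if (typeof json !== 'object' || json === null) return undefined;
  const results = (json as Record<string, unknown>).results;
  if (!Array.isArray(results)) return undefined;
  const first: unknown = results[0];
  return typeof first === 'object' && first !== null
    ? (first as Record<string, unknown>)
    : undefined;
}

// Same rule as the diff: only an explicit step-level ok:false downgrades an ok batch.
// A missing or non-boolean step ok leaves the top-level result untouched.
function downgradeOnStepFailure(result: CliResult): CliResult {
  const stepOk = firstBatchResult(result.json)?.ok;
  if (result.ok && stepOk === false) {
    return { ...result, ok: false };
  }
  return result;
}

const failedStep = downgradeOnStepFailure({ ok: true, json: { results: [{ ok: false }] } });
const okStep = downgradeOnStepFailure({ ok: true, json: { results: [{ ok: true }] } });
const noStepInfo = downgradeOnStepFailure({ ok: true, json: {} });

console.log(failedStep.ok, okStep.ok, noStepInfo.ok); // prints: false true true
```

The strict `stepOk === false` comparison is the point of the design: a batch whose step result is absent or malformed is not downgraded, so the check can only turn a false success into a failure and never the reverse.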
