Commit adba28a
fix(ce-codex-reviewer): NDJSON output and schema-valid line numbers
Addresses both Codex findings on a8a5ed2 (April 2026):
P1 — Emit schema-valid line numbers for file-level findings.
The findings schema requires `line >= 1` (verified in
references/findings-schema.json). The previous prompt told codex to
emit `LINE=0` for file-level issues. The merge validator would have
dropped those findings as malformed before synthesis, silently
losing every file-level Codex finding.
Fix: the codex prompt now requires `line` to be a positive integer
and adds a `file_level` boolean. When `file_level: true`, codex sets
`line: 1` and the agent prepends a "file-level finding (no specific
line applies)" string to the evidence array so synthesis and
downstream surfaces can still distinguish "line 1 was the issue"
from "this is a whole-file concern." The `file_level` signal is
preserved in evidence rather than as a separate field because the
findings schema doesn't expose a top-level file_level flag, and
inventing one would fail the strict-validator path.
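The convention above can be sketched roughly as follows. This is an illustration only, not the agent's actual code: the `RawCodexFinding` and `SchemaFinding` shapes, the `file` and `title` fields, and the `toSchemaFinding` helper are hypothetical names chosen for the example. Only `line`, `file_level`, `evidence`, and the marker string come from the commit message.

```typescript
// Hypothetical shape of one object emitted by codex (assumption for this sketch).
interface RawCodexFinding {
  file: string;
  line: number;        // must be a positive integer
  file_level: boolean; // true for whole-file concerns
  title: string;
  evidence: string[];
}

// Hypothetical schema-shaped finding: note there is no top-level file_level flag.
interface SchemaFinding {
  file: string;
  line: number;
  title: string;
  evidence: string[];
}

const FILE_LEVEL_MARKER = "file-level finding (no specific line applies)";

// Returns null for findings the schema would reject (line < 1 or non-integer).
function toSchemaFinding(raw: RawCodexFinding): SchemaFinding | null {
  if (!Number.isInteger(raw.line) || raw.line < 1) return null;
  // Carry the file_level signal inside evidence instead of a new field.
  const evidence = raw.file_level
    ? [FILE_LEVEL_MARKER, ...raw.evidence]
    : [...raw.evidence];
  return { file: raw.file, line: raw.line, title: raw.title, evidence };
}
```

Under these assumptions, a downstream surface can tell "line 1 was the issue" from "whole-file concern" by checking whether the first evidence entry equals the marker string.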
P2 — Use structured output so evidence cannot break parsing.
The previous pipe-delimited contract dropped any row that wasn't
exactly five `|`-separated fields. EVIDENCE is a raw code snippet
that can legitimately contain `|` (bitwise OR / union types, shell
pipes, markdown tables, regex alternation). When that happened,
valid findings disappeared silently.
Fix: switch the codex prompt to NDJSON — one JSON object per line.
The agent JSON-parses each line independently; embedded pipes in
evidence are no longer a parsing hazard because JSON quoting handles
them. Lines that fail to parse are skipped (no retry, no inference).
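A minimal sketch of the per-line parsing described above, assuming the agent receives codex output as a single string. `parseNdjson` and the sample rows are hypothetical; the real agent may structure this differently.

```typescript
// Parse NDJSON: one JSON object per line, each line parsed independently.
// Lines that fail to parse are skipped (no retry, no inference).
function parseNdjson(output: string): Record<string, unknown>[] {
  const rows: Record<string, unknown>[] = [];
  for (const rawLine of output.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue; // ignore blank lines
    try {
      const value: unknown = JSON.parse(line);
      // Keep only JSON objects; bare strings, numbers, and arrays are not findings.
      if (typeof value === "object" && value !== null && !Array.isArray(value)) {
        rows.push(value as Record<string, unknown>);
      }
    } catch {
      // Malformed line: skip it and keep going.
    }
  }
  return rows;
}

// Hypothetical sample output: evidence contains pipes in two rows.
const sampleOutput = [
  '{"file":"a.ts","line":12,"file_level":false,"evidence":["type Id = string | number"]}',
  "this line is not JSON",
  '{"file":"b.sh","line":1,"file_level":true,"evidence":["cat log | grep err"]}',
].join("\n");

const parsedFindings = parseNdjson(sampleOutput);

// For contrast, a hypothetical five-field pipe-delimited row with the same
// evidence splits into six fields, so the old contract would have dropped it.
const oldRowFieldCount = "a.ts|12|P2|Union type|type Id = string | number".split("|").length;
```

Here `parsedFindings` holds the two valid objects with their pipes intact, and `oldRowFieldCount` is 6 rather than 5.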
Test coverage:
- New contract test "uses NDJSON output contract so evidence can
carry pipes safely" guards the prompt format and asserts the
pipe-delimited shape stays gone.
- New contract test "emits schema-valid line numbers for file-level
findings" guards the line=1 + file_level=true convention, asserts
the old "0 means file-level" wording is gone, and reads the
findings schema to verify minimum=1 still holds (so a future
schema relaxation triggers a deliberate test update rather than a
silent drift).
Both tests cite their originating Codex finding inline so future
regressions get a clear pointer to PR #356.
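The kind of check these contract tests perform might look like the sketch below. It is not the code in tests/review-skill-contract.test.ts: `contractViolations`, the `LineSchema` shape, and the literal strings matched against the prompt are assumptions made for illustration.

```typescript
// Hypothetical slice of the findings schema: only the `line` minimum matters here.
interface LineSchema {
  properties: { line: { minimum?: number } };
}

// Returns a list of human-readable violations; an empty list means the
// prompt and schema still agree with the conventions in this commit.
function contractViolations(prompt: string, schema: LineSchema): string[] {
  const violations: string[] = [];
  if (!prompt.includes("NDJSON")) {
    violations.push("prompt no longer names the NDJSON output contract");
  }
  if (!prompt.includes("file_level")) {
    violations.push("prompt no longer mentions the file_level boolean");
  }
  if (prompt.includes("LINE=0")) {
    violations.push("old zero-means-file-level wording is back");
  }
  if (schema.properties.line.minimum !== 1) {
    // A schema relaxation should force a deliberate update, not a silent drift.
    violations.push("findings schema no longer requires line >= 1");
  }
  return violations;
}
```

A test runner would then assert that the returned list is empty for the current prompt and schema.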
bun test tests/review-skill-contract.test.ts: 30 pass (was 28)
bun run release:validate: clean
1 parent a8a5ed2 commit adba28a
2 files changed
Lines changed: 57 additions & 11 deletions
File tree
- plugins/compound-engineering/agents
- tests
Lines changed: 17 additions & 11 deletions