You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: skills/ce-code-review/references/findings-schema.json
+3-3Lines changed: 3 additions & 3 deletions
Original file line number
Diff line number
Diff line change
@@ -76,7 +76,7 @@
76
76
},
77
77
"evidence": {
78
78
"type": "array",
79
-
"description": "Code-grounded evidence: snippets, line references, or pattern descriptions. At least 1 item.",
79
+
"description": "Code-grounded evidence: snippets, line references, or pattern descriptions. At least 1 item. For any finding at confidence anchor 75 or 100, the first item MUST be the verbatim motivating line(s) with file:line -- the exact code text that makes the finding true (the quote-the-line gate). A finding whose triggering line cannot be quoted must step down to anchor 50.",
80
80
"items": { "type": "string" },
81
81
"minItems": 1
82
82
},
@@ -130,8 +130,8 @@
130
130
},
131
131
"return_tiers": {
132
132
"description": "Finding fields are split into two tiers. The full schema (with all required fields) applies to the artifact file on disk. The compact return to the orchestrator omits detail-tier fields. Both are valid uses of this schema in different contexts.",
133
-
"merge_tier": "Returned to orchestrator: title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing, suggested_fix (optional). Plus top-level reviewer, residual_risks, testing_gaps.",
134
-
"detail_tier": "Required in artifact file, omitted from compact return: why_it_matters, evidence. The artifact file must pass full schema validation including all required fields. Headless output depends on why_it_matters and evidence being present in the artifact."
133
+
"merge_tier": "Returned to orchestrator: title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing, suggested_fix (optional), first_evidence (required for anchor 75/100; the verbatim first evidence line, used to enforce the quote-the-line gate in-band). Plus top-level reviewer, residual_risks, testing_gaps.",
134
+
"detail_tier": "Required in artifact file, omitted from compact return: why_it_matters, and the full evidence array (the compact return carries only first_evidence). The artifact file must pass full schema validation including all required fields. Headless output depends on why_it_matters and evidence being present in the artifact."
Do NOT include why_it_matters or the full evidence array in the returned JSON.
32
+
`first_evidence` is the ONE exception to "no evidence in the compact return": it is the verbatim motivating line with `file:line` (the same string you put first in the `evidence` array). It is **REQUIRED for every finding at anchor 75 or 100** — the orchestrator enforces the quote-the-line gate from this field, and a 75/100 finding without it is demoted to anchor 50 at merge. Omit it only for anchor-50 findings. Keep it to that single line; the rest of `evidence` stays in the artifact file.
32
33
Include reviewer, residual_risks, and testing_gaps at the top level.
33
34
34
35
The full file preserves detail for downstream consumers (agent-mode output, debugging).
35
36
The compact return keeps the orchestrator's context lean for merge and synthesis.
36
37
37
-
The schema below describes the **full artifact file format** (all fields required). For the compact return, follow the field list above -- omit why_it_matters and evidence even though the schema marks them as required.
38
+
The schema below describes the **full artifact file format** (all fields required). For the compact return, follow the field list above -- omit why_it_matters and the full evidence array (but include `first_evidence`) even though the schema marks evidence as required.
38
39
39
40
{schema}
40
41
@@ -43,7 +44,7 @@ The schema below describes the **full artifact file format** (all fields require
43
44
- `severity`: one of `"P0"`, `"P1"`, `"P2"`, `"P3"` — use these exact strings. Do NOT use `"high"`, `"medium"`, `"low"`, `"critical"`, or any other vocabulary, even if your persona's prose discusses priorities in those terms conceptually.
44
45
- `autofix_class`: one of `"gated_auto"`, `"manual"`, `"advisory"`.
45
46
- `owner`: one of `"downstream-resolver"`, `"human"`, `"release"`.
46
-
- `evidence`: an ARRAY of strings with at least one element. A single string value is a validation failure — wrap every quote in `["..."]` even when there is only one.
47
+
- `evidence`: an ARRAY of strings with at least one element. A single string value is a validation failure — wrap every quote in `["..."]` even when there is only one. **For any finding at anchor `75` or `100`, the first evidence item MUST be the verbatim motivating line(s) with `file:line`** — the exact code text that makes the finding true (see "Quote-the-line gate" below).
47
48
- `pre_existing`: boolean, never null.
48
49
- `requires_verification`: boolean, never null.
49
50
- `confidence`: one of exactly `0`, `25`, `50`, `75`, or `100` — a discrete anchor, NOT a continuous number. Any other value (e.g., `72`, `0.85`, `"high"`) is a validation failure. Pick the anchor whose behavioral criterion you can honestly self-apply to this finding (see "Confidence rubric" below).
@@ -62,6 +63,15 @@ If your persona description uses severity vocabulary like "high-priority" or "cr
62
63
63
64
Anchor and severity are independent axes. A P2 finding can be anchor `100` if the evidence is airtight; a P0 finding can be anchor `50` if it is an important concern you could not fully verify. Anchor gates where the finding surfaces (drop / soft bucket / actionable); severity orders it within the actionable surface.
64
65
66
+
**Quote-the-line gate (kills the "field/symbol doesn't exist" false-positive class).** Before you anchor a finding at `75` or `100`, quote the verbatim line(s) that make it true, with `file:line`, as the first `evidence` item:
67
+
68
+
- "field X doesn't exist on model Y" → quote the class/`Meta`/migration where X would be defined.
69
+
- "`dict.get()` may return None" → quote the dict's initialization.
70
+
- "race between A and B" → quote both A and B.
71
+
- "swapped argument / wrong return" → quote the call site and the signature.
72
+
73
+
**If you cannot quote the motivating line, you cannot claim `75`+ — step down to `50` (suppressed from primary findings).** When the symbol is generated by a framework metaclass, ORM `Meta`, decorator, or migration history (Rails `has_many`/`scope`, Django `Meta`, SQLAlchemy `Column`/`relationship`, Prisma client, TypeORM/Sequelize decorators), quote the meta-construct that creates it — reading the source that generates the symbol satisfies the gate; a failed `grep` for the literal name does not.
74
+
65
75
Synthesis suppresses anchors `0` and `25` silently. Anchor `50` is dropped from primary findings unless the severity is P0 (P0+50 survives) or synthesis routes it to a soft bucket (testing_gaps, residual_risks, advisory) per mode-aware demotion. Anchors `75` and `100` enter the actionable tier.
66
76
67
77
Example of a schema-valid finding (all required fields, correct enum values, correct array shape):
Copy file name to clipboardExpand all lines: skills/ce-work/SKILL.md
+6-10Lines changed: 6 additions & 10 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -333,17 +333,13 @@ Determine how to proceed based on what was provided in `<input_document>`.
333
333
334
334
When all Phase 2 tasks are complete and execution transitions to quality check, you must read `references/shipping-workflow.md` for the full shipping workflow. Do not skip this.
335
335
336
-
**Code review tiers:**Tier 1 when the harness has built-in review. Tier 2 only when escalation criteria in `shipping-workflow.md` match — not because Tier 1 is missing.
336
+
**Code review: one portable path.**Review with `ce-code-review`, which self-sizes (lite roster for small low-risk code-only diffs, full roster otherwise). No harness-native review detection and no escalation tiers — the size/sensitive-surface judgment lives inside `ce-code-review`. Skip dedicated review only for a purely mechanical diff (formatting, dep-bumps, lint-only, generated). Full rules (autonomous Residual Gate, infra fallback) in `shipping-workflow.md`.
337
337
338
-
**Tier 2 is two steps — review, then fix.**`ce-code-review` is review-only. It returns findings (markdown or `mode:agent` JSON); it never edits the checkout, commits, or applies fixes.
338
+
**Review is two steps — review, then fix.**`ce-code-review` is review-only. It returns findings (markdown or `mode:agent` JSON); it never edits the checkout, commits, or applies fixes.
339
339
340
-
When Tier 2 applies:
341
-
342
-
1.**Review** — Invoke the `ce-code-review` skill (invocation command in `references/review-findings-followup.md` § Fallback). Use `mode:agent` in orchestrated workflows; pass `plan:<path>` when you have a plan and `base:<ref>` when the merge base is already known.
340
+
1.**Review** — Invoke the `ce-code-review` skill (invocation command in `references/review-findings-followup.md` § Fallback). Use `mode:agent` in orchestrated workflows; pass `plan:<path>` when you have a plan, `base:<ref>` when the merge base is known, and `depth:full` when a deep/thorough review was explicitly requested.
343
341
2.**Apply fixes** — Load `references/review-findings-followup.md`. Filter eligibility on JSON only, **batch applicable findings by file**, dispatch fix subagents (parallel when file sets are disjoint). The orchestrator merges diffs, runs tests, and commits — it does not pre-investigate findings.
344
-
3.**Residual Work Gate** — Only after followup; unresolved actionable findings go through the gate in `shipping-workflow.md`.
345
-
346
-
Tier 1 harness-native review may still fix inline; Tier 2 always separates review from apply.
342
+
3.**Residual Work Gate** — Only after followup; unresolved actionable findings go through the gate in `shipping-workflow.md` (autonomous sessions auto-accept + record residuals; interactive sessions ask).
347
343
348
344
## Key Principles
349
345
@@ -367,7 +363,7 @@ Tier 1 harness-native review may still fix inline; Tier 2 always separates revie
367
363
368
364
### Quality is Built In
369
365
370
-
- Review when Tier 1 is available or Tier 2 criteria match (see `shipping-workflow.md`)
366
+
- Review every non-mechanical diff with `ce-code-review` (it self-sizes; see `shipping-workflow.md`)
371
367
372
368
### Ship Complete Features
373
369
@@ -383,5 +379,5 @@ Tier 1 harness-native review may still fix inline; Tier 2 always separates revie
383
379
-**Testing at the end** - Test continuously or suffer later
384
380
-**Forgetting to track progress** - Update task status as you go or lose track of what's done
385
381
-**80% done syndrome** - Finish the feature, don't move on early
386
-
-**Skipping review without reason** — Use Tier 1 when available; escalate to Tier 2 only on criteria in `shipping-workflow.md`; document when both are skipped
382
+
-**Skipping review without reason** — review every non-mechanical diff with `ce-code-review`; skip only for a purely mechanical diff or when it is genuinely unavailable, and document the skip reason
387
383
-**Re-scoping the plan into human-time phases** - The plan's Implementation Units define the scope of execution. Do not estimate human-hours per unit, propose multi-day breakdowns, or ask the user to pick a subset of units for "this session". Agents execute at agent speed, and context-window pressure is addressed by subagent dispatch (Phase 1 Step 4), not by phased sessions. If a plan-file input is genuinely too large for a single execution, say so plainly and suggest the user return to `/ce-plan` to reduce scope — don't invent session phases as a workaround. For bare-prompt input, Phase 0's Large routing already handles oversized work
Load this reference when Tier 2 `ce-code-review` has finished and **ce-work** (or another caller) should apply fixes before the Residual Work Gate.
3
+
Load this reference when `ce-code-review` has finished and **ce-work** (or another caller) should apply fixes before the Residual Work Gate.
4
4
5
5
`ce-code-review` is invoked here with `mode:agent`, so it is **review-only** in this context — it reports findings and writes artifacts and does not mutate the checkout, commit, push, or file tickets. **The caller owns apply/fix policy.** (In its own default/interactive mode the review applies safe fixes itself; that path does not apply here.)
6
6
7
7
## Consume the completed review (do not re-run it)
8
8
9
-
This reference loads **after** review has run. In the ce-work Tier 2 path, step 2a already invoked `ce-code-review`; this apply step **consumes that output** — do not start a second review, which would waste reviewer dispatches and risk overwriting the artifact the Residual Work Gate reconciles.
9
+
This reference loads **after** review has run. In the ce-work shipping flow, step 3a already invoked `ce-code-review`; this apply step **consumes that output** — do not start a second review, which would waste reviewer dispatches and risk overwriting the artifact the Residual Work Gate reconciles.
10
10
11
11
Reuse the review output already in hand:
12
12
@@ -17,7 +17,7 @@ If `status` is `failed`, stop shipping and surface `reason`. If `degraded`, note
17
17
18
18
### Fallback — invoke review only for cold callers
19
19
20
-
Only when the caller reached this file **without** already running review (no review output in hand): invoke `ce-code-review` once, then proceed to apply. Do not invoke when the caller already ran review (e.g., ce-work Tier 2 step 2a).
20
+
Only when the caller reached this file **without** already running review (no review output in hand): invoke `ce-code-review` once, then proceed to apply. Do not invoke when the caller already ran review (e.g., ce-work shipping step 3a).
21
21
22
22
Invoke the skill explicitly — do not treat a casual "review my changes" prompt as a substitute unless the harness routed it to `ce-code-review`.
0 commit comments