EveryInc
diff --git a/skills/ce-code-review/SKILL.md b/skills/ce-code-review/SKILL.md
Lines changed: 93 additions & 9 deletions
diff --git a/skills/ce-code-review/references/findings-schema.json b/skills/ce-code-review/references/findings-schema.json
Lines changed: 3 additions & 3 deletions
diff --git a/skills/ce-code-review/references/subagent-template.md b/skills/ce-code-review/references/subagent-template.md
Lines changed: 14 additions & 4 deletions
diff --git a/skills/ce-work/SKILL.md b/skills/ce-work/SKILL.md
Lines changed: 6 additions & 10 deletions
diff --git a/skills/ce-work/references/review-findings-followup.md b/skills/ce-work/references/review-findings-followup.md
Lines changed: 3 additions & 3 deletions
@@ -76,7 +76,7 @@
           },
           "evidence": {
             "type": "array",
-            "description": "Code-grounded evidence: snippets, line references, or pattern descriptions. At least 1 item.",
+            "description": "Code-grounded evidence: snippets, line references, or pattern descriptions. At least 1 item. For any finding at confidence anchor 75 or 100, the first item MUST be the verbatim motivating line(s) with file:line -- the exact code text that makes the finding true (the quote-the-line gate). A finding whose triggering line cannot be quoted must step down to anchor 50.",
             "items": { "type": "string" },
             "minItems": 1
           },
@@ -130,8 +130,8 @@
     },
     "return_tiers": {
       "description": "Finding fields are split into two tiers. The full schema (with all required fields) applies to the artifact file on disk. The compact return to the orchestrator omits detail-tier fields. Both are valid uses of this schema in different contexts.",
-      "merge_tier": "Returned to orchestrator: title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing, suggested_fix (optional). Plus top-level reviewer, residual_risks, testing_gaps.",
-      "detail_tier": "Required in artifact file, omitted from compact return: why_it_matters, evidence. The artifact file must pass full schema validation including all required fields. Headless output depends on why_it_matters and evidence being present in the artifact."
+      "merge_tier": "Returned to orchestrator: title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing, suggested_fix (optional), first_evidence (required for anchor 75/100; the verbatim first evidence line, used to enforce the quote-the-line gate in-band). Plus top-level reviewer, residual_risks, testing_gaps.",
+      "detail_tier": "Required in artifact file, omitted from compact return: why_it_matters, and the full evidence array (the compact return carries only first_evidence). The artifact file must pass full schema validation including all required fields. Headless output depends on why_it_matters and evidence being present in the artifact."
     }
   }
 }
@@ -27,14 +27,15 @@ You produce up to two outputs depending on whether a run ID was provided:
    If no Run ID is provided (the field is empty or absent), skip this step entirely -- do not attempt any file write.
 
 2. **Compact return (always).** RETURN compact JSON to the parent with ONLY merge-tier fields per finding:
-   title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing, suggested_fix.
-   Do NOT include why_it_matters or evidence in the returned JSON.
+   title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing, suggested_fix, first_evidence.
+   Do NOT include why_it_matters or the full evidence array in the returned JSON.
+   `first_evidence` is the ONE exception to "no evidence in the compact return": it is the verbatim motivating line with `file:line` (the same string you put first in the `evidence` array). It is **REQUIRED for every finding at anchor 75 or 100** — the orchestrator enforces the quote-the-line gate from this field, and a 75/100 finding without it is demoted to anchor 50 at merge. Omit it only for anchor-50 findings. Keep it to that single line; the rest of `evidence` stays in the artifact file.
    Include reviewer, residual_risks, and testing_gaps at the top level.
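
The compact return in step 2 can be sketched roughly as below, assuming findings are plain dicts. The function names and the top-level `findings` key are hypothetical (the template does not name them); the field list and the `first_evidence` rule are taken from the text above.

```python
from typing import Any

# Merge-tier fields named in the template; suggested_fix is optional.
MERGE_TIER_FIELDS = (
    "title", "severity", "file", "line", "confidence", "autofix_class",
    "owner", "requires_verification", "pre_existing", "suggested_fix",
)


def compact_finding(full: dict[str, Any]) -> dict[str, Any]:
    """Project a full artifact finding onto the compact, merge-tier shape."""
    compact = {key: full[key] for key in MERGE_TIER_FIELDS if key in full}
    # first_evidence is the one evidence string allowed in the compact return.
    # It is required at anchors 75 and 100; this sketch omits it otherwise.
    if full.get("confidence") in (75, 100):
        compact["first_evidence"] = full["evidence"][0]
    return compact


def compact_return(reviewer, findings, residual_risks, testing_gaps):
    """Assemble the compact payload; the "findings" key name is an assumption."""
    return {
        "reviewer": reviewer,
        "findings": [compact_finding(f) for f in findings],
        "residual_risks": residual_risks,
        "testing_gaps": testing_gaps,
    }
```

Because `first_evidence` is copied from `evidence[0]`, the compact return and the artifact file cannot drift apart on the quoted line.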
 
 The full file preserves detail for downstream consumers (agent-mode output, debugging).
 The compact return keeps the orchestrator's context lean for merge and synthesis.
 
-The schema below describes the **full artifact file format** (all fields required). For the compact return, follow the field list above -- omit why_it_matters and evidence even though the schema marks them as required.
+The schema below describes the **full artifact file format** (all fields required). For the compact return, follow the field list above -- omit why_it_matters and the full evidence array (but include `first_evidence`) even though the schema marks evidence as required.
 
 {schema}
 
@@ -43,7 +44,7 @@ The schema below describes the **full artifact file format** (all fields require
 - `severity`: one of `"P0"`, `"P1"`, `"P2"`, `"P3"` — use these exact strings. Do NOT use `"high"`, `"medium"`, `"low"`, `"critical"`, or any other vocabulary, even if your persona's prose discusses priorities in those terms conceptually.
 - `autofix_class`: one of `"gated_auto"`, `"manual"`, `"advisory"`.
 - `owner`: one of `"downstream-resolver"`, `"human"`, `"release"`.
-- `evidence`: an ARRAY of strings with at least one element. A single string value is a validation failure — wrap every quote in `["..."]` even when there is only one.
+- `evidence`: an ARRAY of strings with at least one element. A single string value is a validation failure — wrap every quote in `["..."]` even when there is only one. **For any finding at anchor `75` or `100`, the first evidence item MUST be the verbatim motivating line(s) with `file:line`** — the exact code text that makes the finding true (see "Quote-the-line gate" below).
 - `pre_existing`: boolean, never null.
 - `requires_verification`: boolean, never null.
 - `confidence`: one of exactly `0`, `25`, `50`, `75`, or `100` — a discrete anchor, NOT a continuous number. Any other value (e.g., `72`, `0.85`, `"high"`) is a validation failure. Pick the anchor whose behavioral criterion you can honestly self-apply to this finding (see "Confidence rubric" below).
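
A minimal sketch of the value rules in this list, assuming a finding is a plain dict. `field_errors` is a hypothetical helper, not part of the skill, and it checks only the fields visible in this hunk.

```python
from typing import Any

SEVERITIES = ("P0", "P1", "P2", "P3")
AUTOFIX_CLASSES = ("gated_auto", "manual", "advisory")
OWNERS = ("downstream-resolver", "human", "release")
ANCHORS = (0, 25, 50, 75, 100)


def field_errors(finding: dict[str, Any]) -> list[str]:
    """Return the names of fields that break the value rules listed above."""
    errors = []
    if finding.get("severity") not in SEVERITIES:
        errors.append("severity")
    if finding.get("autofix_class") not in AUTOFIX_CLASSES:
        errors.append("autofix_class")
    if finding.get("owner") not in OWNERS:
        errors.append("owner")
    evidence = finding.get("evidence")
    # evidence must be a non-empty list of strings, never a bare string.
    if not (isinstance(evidence, list) and evidence
            and all(isinstance(item, str) for item in evidence)):
        errors.append("evidence")
    for name in ("pre_existing", "requires_verification"):
        if not isinstance(finding.get(name), bool):
            errors.append(name)
    confidence = finding.get("confidence")
    # type(...) is int rejects 0.85, 75.0, "high", and booleans.
    if type(confidence) is not int or confidence not in ANCHORS:
        errors.append("confidence")
    return errors
```

The strict `type(...) is int` check matters in Python because `False == 0` and `75.0 == 75`, so a plain membership test would accept both.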
@@ -62,6 +63,15 @@ If your persona description uses severity vocabulary like "high-priority" or "cr
 
 Anchor and severity are independent axes. A P2 finding can be anchor `100` if the evidence is airtight; a P0 finding can be anchor `50` if it is an important concern you could not fully verify. Anchor gates where the finding surfaces (drop / soft bucket / actionable); severity orders it within the actionable surface.
 
+**Quote-the-line gate (kills the "field/symbol doesn't exist" false-positive class).** Before you anchor a finding at `75` or `100`, quote the verbatim line(s) that make it true, with `file:line`, as the first `evidence` item:
+
+- "field X doesn't exist on model Y" → quote the class/`Meta`/migration where X would be defined.
+- "`dict.get()` may return None" → quote the dict's initialization.
+- "race between A and B" → quote both A and B.
+- "swapped argument / wrong return" → quote the call site and the signature.
+
+**If you cannot quote the motivating line, you cannot claim `75`+ — step down to `50` (suppressed from primary findings).** When the symbol is generated by a framework metaclass, ORM `Meta`, decorator, or migration history (Rails `has_many`/`scope`, Django `Meta`, SQLAlchemy `Column`/`relationship`, Prisma client, TypeORM/Sequelize decorators), quote the meta-construct that creates it — reading the source that generates the symbol satisfies the gate; a failed `grep` for the literal name does not.
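
One way the orchestrator side of this gate might look, as a sketch under stated assumptions: findings are dicts in the compact shape, `enforce_quote_gate` is a hypothetical name, and the `file:line` pattern is a guess at the reference format. The text only says that a 75/100 finding without `first_evidence` is demoted to 50; the `file:line` check here is a stricter reading.

```python
import re
from typing import Any

# Assumption: a file:line reference looks like "path/to/file.py:42".
FILE_LINE = re.compile(r"[^\s:]+:\d+")


def enforce_quote_gate(finding: dict[str, Any]) -> dict[str, Any]:
    """Demote a 75/100 finding to anchor 50 when its quote is missing."""
    if finding.get("confidence") not in (75, 100):
        return finding  # the gate applies only to anchors 75 and 100
    quote = finding.get("first_evidence")
    if isinstance(quote, str) and FILE_LINE.search(quote):
        return finding
    # Return a copy so the caller's original finding is left untouched.
    return {**finding, "confidence": 50}
```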
+
 Synthesis suppresses anchors `0` and `25` silently. Anchor `50` is dropped from primary findings unless the severity is P0 (P0+50 survives) or synthesis routes it to a soft bucket (testing_gaps, residual_risks, advisory) per mode-aware demotion. Anchors `75` and `100` enter the actionable tier.
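
The routing in the paragraph above can be sketched as a small function. The return labels and the `soft_bucket` flag are hypothetical: the flag stands in for the mode-aware demotion decision, which this hunk does not specify.

```python
def route(confidence: int, severity: str, soft_bucket: bool = False) -> str:
    """Return where a finding surfaces; the labels are hypothetical."""
    if confidence in (0, 25):
        return "suppressed"
    if confidence == 50:
        if severity == "P0":
            return "primary"  # P0+50 survives in primary findings
        return "soft_bucket" if soft_bucket else "dropped"
    if confidence in (75, 100):
        return "actionable"
    raise ValueError(f"not a valid anchor: {confidence!r}")


def order_actionable(findings: list[dict]) -> list[dict]:
    """Order actionable findings by severity; "P0" sorts before "P3"."""
    return sorted(findings, key=lambda f: f["severity"])
```

Anchor decides the bucket and severity only orders findings inside it, which keeps the two axes independent as the template describes.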
 
 Example of a schema-valid finding (all required fields, correct enum values, correct array shape):
 
@@ -333,17 +333,13 @@ Determine how to proceed based on what was provided in `<input_document>`.
 
 When all Phase 2 tasks are complete and execution transitions to quality check, you must read `references/shipping-workflow.md` for the full shipping workflow. Do not skip this.
 
-**Code review tiers:** Tier 1 when the harness has built-in review. Tier 2 only when escalation criteria in `shipping-workflow.md` match — not because Tier 1 is missing.
+**Code review: one portable path.** Review with `ce-code-review`, which self-sizes (lite roster for small low-risk code-only diffs, full roster otherwise). No harness-native review detection and no escalation tiers — the size/sensitive-surface judgment lives inside `ce-code-review`. Skip dedicated review only for a purely mechanical diff (formatting, dep-bumps, lint-only, generated). Full rules (autonomous Residual Gate, infra fallback) in `shipping-workflow.md`.
 
-**Tier 2 is two steps — review, then fix.** `ce-code-review` is review-only. It returns findings (markdown or `mode:agent` JSON); it never edits the checkout, commits, or applies fixes.
+**Review is two steps — review, then fix.** `ce-code-review` is review-only. It returns findings (markdown or `mode:agent` JSON); it never edits the checkout, commits, or applies fixes.
 
-When Tier 2 applies:
-
-1. **Review** — Invoke the `ce-code-review` skill (invocation command in `references/review-findings-followup.md` § Fallback). Use `mode:agent` in orchestrated workflows; pass `plan:<path>` when you have a plan and `base:<ref>` when the merge base is already known.
+1. **Review** — Invoke the `ce-code-review` skill (invocation command in `references/review-findings-followup.md` § Fallback). Use `mode:agent` in orchestrated workflows; pass `plan:<path>` when you have a plan, `base:<ref>` when the merge base is known, and `depth:full` when a deep/thorough review was explicitly requested.
 2. **Apply fixes** — Load `references/review-findings-followup.md`. Filter eligibility on JSON only, **batch applicable findings by file**, dispatch fix subagents (parallel when file sets are disjoint). The orchestrator merges diffs, runs tests, and commits — it does not pre-investigate findings.
-3. **Residual Work Gate** — Only after followup; unresolved actionable findings go through the gate in `shipping-workflow.md`.
-
-Tier 1 harness-native review may still fix inline; Tier 2 always separates review from apply.
+3. **Residual Work Gate** — Only after followup; unresolved actionable findings go through the gate in `shipping-workflow.md` (autonomous sessions auto-accept + record residuals; interactive sessions ask).
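
For step 2, batching by file and the disjointness check might look like the sketch below. Both helper names are hypothetical, and the eligibility filter is left out because its rules live in `references/review-findings-followup.md`, which this hunk does not show.

```python
from collections import defaultdict
from itertools import combinations


def batch_by_file(findings: list[dict]) -> dict[str, list[dict]]:
    """Group findings by their "file" field, one batch per file."""
    batches = defaultdict(list)
    for finding in findings:
        batches[finding["file"]].append(finding)
    return dict(batches)


def file_sets_disjoint(file_sets: list[set[str]]) -> bool:
    """True when no two fix subagents would touch the same file."""
    return all(a.isdisjoint(b) for a, b in combinations(file_sets, 2))
```

Batches keyed on a single file are disjoint by construction; the second helper covers the case where one fix spans several files.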
 
 ## Key Principles
 
@@ -367,7 +363,7 @@ Tier 1 harness-native review may still fix inline; Tier 2 always separates revie
 
 ### Quality is Built In
 
-- Review when Tier 1 is available or Tier 2 criteria match (see `shipping-workflow.md`)
+- Review every non-mechanical diff with `ce-code-review` (it self-sizes; see `shipping-workflow.md`)
 
 ### Ship Complete Features
 
@@ -383,5 +379,5 @@ Tier 1 harness-native review may still fix inline; Tier 2 always separates revie
 - **Testing at the end** - Test continuously or suffer later
 - **Forgetting to track progress** - Update task status as you go or lose track of what's done
 - **80% done syndrome** - Finish the feature, don't move on early
-- **Skipping review without reason** — Use Tier 1 when available; escalate to Tier 2 only on criteria in `shipping-workflow.md`; document when both are skipped
+- **Skipping review without reason** — review every non-mechanical diff with `ce-code-review`; skip only for a purely mechanical diff or when it is genuinely unavailable, and document the skip reason
 - **Re-scoping the plan into human-time phases** - The plan's Implementation Units define the scope of execution. Do not estimate human-hours per unit, propose multi-day breakdowns, or ask the user to pick a subset of units for "this session". Agents execute at agent speed, and context-window pressure is addressed by subagent dispatch (Phase 1 Step 4), not by phased sessions. If a plan-file input is genuinely too large for a single execution, say so plainly and suggest the user return to `/ce-plan` to reduce scope — don't invent session phases as a workaround. For bare-prompt input, Phase 0's Large routing already handles oversized work
@@ -1,12 +1,12 @@
 # Apply Code Review Findings (after `ce-code-review`)
 
-Load this reference when Tier 2 `ce-code-review` has finished and **ce-work** (or another caller) should apply fixes before the Residual Work Gate.
+Load this reference when `ce-code-review` has finished and **ce-work** (or another caller) should apply fixes before the Residual Work Gate.
 
 `ce-code-review` is invoked here with `mode:agent`, so it is **review-only** in this context — it reports findings and writes artifacts and does not mutate the checkout, commit, push, or file tickets. **The caller owns apply/fix policy.** (In its own default/interactive mode the review applies safe fixes itself; that path does not apply here.)
 
 ## Consume the completed review (do not re-run it)
 
-This reference loads **after** review has run. In the ce-work Tier 2 path, step 2a already invoked `ce-code-review`; this apply step **consumes that output** — do not start a second review, which would waste reviewer dispatches and risk overwriting the artifact the Residual Work Gate reconciles.
+This reference loads **after** review has run. In the ce-work shipping flow, step 3a already invoked `ce-code-review`; this apply step **consumes that output** — do not start a second review, which would waste reviewer dispatches and risk overwriting the artifact the Residual Work Gate reconciles.
 
 Reuse the review output already in hand:
 
@@ -17,7 +17,7 @@ If `status` is `failed`, stop shipping and surface `reason`. If `degraded`, note
 
 ### Fallback — invoke review only for cold callers
 
-Only when the caller reached this file **without** already running review (no review output in hand): invoke `ce-code-review` once, then proceed to apply. Do not invoke when the caller already ran review (e.g., ce-work Tier 2 step 2a).
+Only when the caller reached this file **without** already running review (no review output in hand): invoke `ce-code-review` once, then proceed to apply. Do not invoke when the caller already ran review (e.g., ce-work shipping step 3a).
 
 Invoke the skill explicitly — do not treat a casual "review my changes" prompt as a substitute unless the harness routed it to `ce-code-review`.