feat(lfg): add dogfood step and riffrec/video/screenshot input

kieranklaassen · claude · tmchow · commit 36db308280cb · 2026-06-12T13:18:32.000-07:00
Bring the linear /lfg skill and the lfg workflow to parity and extend
both:

- Add a Dogfood step to the skill, inlining ce-dogfood-beta's diff-scoped
  behavior (it is disable-model-invocation and can't be invoked from the
  pipeline), mirroring the workflow's Dogfood phase.
- Accept a Riffrec bundle, video, audio, or screenshots as input and
  analyze it before planning, on both surfaces.
- Reproduce bugs locally with synthetic, anonymized state before writing
  the fix.
- Capture demos non-interactively via ce-demo-reel for observable
  changes, and emit a fixed PR template for feedback-sourced runs.

Co-Authored-By: Claude Opus 4.8 &lt;noreply@anthropic.com&gt;
diff --git a/.claude/workflows/lfg.js b/.claude/workflows/lfg.js
@@ -53,7 +53,7 @@
 export const meta = {
   name: 'lfg',
   description: 'Compounding autonomous engineering pipeline named in compound-engineering steps: Worktree, Riffrec, Research, Ideate, Plan, Doc Review, Work, Code Review, Autofix, Re-review, Simplify, Test, Dogfood, Commit & PR, Compound, Cleanup. Runs in an isolated git worktree, recalls prior learnings in and captures a durable learning out, watching CI to green.',
-  whenToUse: 'Hands-off execution of a software task when you want the full compound-engineering loop (institutional recall in, durable learning out, CI watched to green) rather than a single linear pass. Runs in an isolated git worktree so your checkout is untouched. Pass the feature description (or a Riffrec bundle path) as args. Add { dryRun: true } to stop before Commit & PR / CI / Compound.',
+  whenToUse: 'Hands-off execution of a software task when you want the full compound-engineering loop (institutional recall in, durable learning out, CI watched to green) rather than a single linear pass. Runs in an isolated git worktree so your checkout is untouched. Pass the feature description — or a Riffrec bundle, video, audio, or screenshot path — as args. Add { dryRun: true } to stop before Commit & PR / CI / Compound.',
   phases: [
     { title: 'Worktree', detail: 'Create an isolated git worktree so the run never touches the user checkout' },
     { title: 'Riffrec', detail: 'ce-riffrec-feedback-analysis — if a recording is present, find + analyze it and fold its findings into the task' },
@@ -419,7 +419,7 @@ if (wtSetup && wtSetup.created && wtSetup.path) {
 
 phase('Riffrec')
 const riffrec = await agent(
-  `Determine whether a Riffrec product-feedback recording is available for this task, then act.\n\nTask:\n${task}\n\nLook for a \`riffrec-*.zip\` or a bundle containing session.json + events.json + recording.webm + voice.webm — in any path named in the task, the repo root, the current directory, or ~/Downloads.\n- If you find one, invoke the ce-riffrec-feedback-analysis skill to analyze it, and return found=true, the path, and the structured product feedback (bugs, UX issues, repro steps, the user's spoken intent).\n- If none exists, return found=false and stop — do not fabricate feedback. ${NON_INTERACTIVE}`,
+  `Determine whether product-feedback input is available for this task, then act.\n\nTask:\n${task}\n\nLook for any of these — in any path named in the task, the repo root, the current directory, or ~/Downloads:\n- a \`riffrec-*.zip\` or a bundle containing session.json + events.json + recording.webm + voice.webm;\n- a video or audio recording (.mp4 / .mov / .webm / .m4a / .mp3 / .wav);\n- one or more screenshots (image files) showing the problem.\n\n- For a Riffrec bundle, video, or audio: invoke the ce-riffrec-feedback-analysis skill on it and return found=true, the path, and the structured product feedback (bugs, UX issues, repro steps, the user's spoken intent).\n- For screenshots: view them, derive what is broken or requested, and return found=true, the path, and that as the feedback.\n- If none exists, return found=false and stop — do not fabricate feedback. ${NON_INTERACTIVE}`,
   { label: 'riffrec', phase: 'Riffrec', schema: RIFFREC_SCHEMA }
 )
 const riffrecFeedback = (riffrec && riffrec.found && riffrec.feedback) ? riffrec.feedback : ''
@@ -540,7 +540,7 @@ if (planConcerns.length) log(`Doc Review raised ${planConcerns.length} non-fatal
 
 phase('Work')
 const build = await agent(
-  `${wt}Invoke the ce-work skill to execute the plan at ${planPath}. Implement the full feature, following the codebase's existing patterns and the conventions surfaced in research. Address these Doc Review concerns as you go:\n${JSON.stringify(planConcerns)}\n${NON_INTERACTIVE}\n\nWhen done, report whether files actually changed, a concise summary, the files touched, and which observable surfaces (web-ui / ios / cli / api / library / docs) the change affects.`,
+  `${wt}Invoke the ce-work skill to execute the plan at ${planPath}. Implement the full feature, following the codebase's existing patterns and the conventions surfaced in research. If this is a bug fix (including any Riffrec/feedback-sourced run), reproduce the bug locally FIRST — minimal synthetic state, anonymize any production data, never commit throwaway repro data — then fix the confirmed root cause. Address these Doc Review concerns as you go:\n${JSON.stringify(planConcerns)}\n${NON_INTERACTIVE}\n\nWhen done, report whether files actually changed, a concise summary, the files touched, and which observable surfaces (web-ui / ios / cli / api / library / docs) the change affects.`,
   { label: 'work', phase: 'Work', schema: BUILD_SCHEMA }
 )
 if (!build || !build.changed) {
@@ -698,6 +698,15 @@ if (DRY_RUN) {
     (observable
       ? `This change has an observable surface (${surfaceList}) — also invoke the ce-demo-reel skill to capture visual/CLI proof and include its markdown in the PR body.\n\n`
       : '') +
+    (riffrecFeedback
+      ? `This is a feedback-sourced run. Write the PR body from this exact template so the issue is understandable WITHOUT watching the recording:\n` +
+        `## What the user reported\n<one or two plain-language sentences>\n> <verbatim narration excerpt from the analysis, as a Markdown block quote>\n<any "before" frames from the analysis>\n\n` +
+        `## The problem\n<technical root cause>\n\n` +
+        `## How we reproduced it\n<minimal local state + exact steps>\n\n` +
+        `## The fix\n<what changed and why, kept tight>\n\n` +
+        `## Demo\n<after evidence from ce-demo-reel proving it works>\n\n` +
+        `## Testing\n<regression test covering this bug + other checks run>\n\n`
+      : '') +
     `In the PR body, add a "## Residual Findings" section listing these unresolved items verbatim (omit the section if the list is empty):\n${JSON.stringify(residualForPr, null, 2)}\n\nReturn the PR number, URL, and branch.`,
     { label: 'commit-pr', phase: 'Commit & PR', schema: SHIP_SCHEMA }
   )
diff --git a/plugins/compound-engineering/skills/lfg/SKILL.md b/plugins/compound-engineering/skills/lfg/SKILL.md
@@ -1,19 +1,29 @@
 ---
 name: lfg
-description: Run the full autonomous engineering pipeline end-to-end (plan, work, code review, test, commit, push, open PR, watch CI, fix CI failures until green). Use only when the user explicitly requests hands-off execution of a software task and provides a feature description; do not auto-route casual conversation here.
-argument-hint: "[feature description]"
+description: Run the full autonomous engineering pipeline end-to-end (plan, work, code review, test, dogfood, commit, push, open PR, watch CI, fix CI failures until green). Use only when the user explicitly requests hands-off execution of a software task and provides a feature description, Riffrec recording, video, or screenshots; do not auto-route casual conversation here.
+argument-hint: "[feature description | riffrec zip | video | screenshots]"
 ---
 
 CRITICAL: You MUST execute every step below IN ORDER. Do NOT skip any required step. Do NOT jump ahead to coding or implementation. The plan phase (step 1) MUST be completed and verified BEFORE any work begins. Violating this order produces bad output.
 
 When invoking any skill referenced below, resolve its name against the available-skills list the host platform provides and use that exact entry. Some platforms list skills under a plugin namespace (e.g., `compound-engineering:ce-plan`); others list the bare name. Invoking a short-form guess that isn't in the list will fail — always match a listed entry verbatim before calling the Skill/Task tool.
 
-1. Invoke the `ce-plan` skill with `$ARGUMENTS`.
+**Input resolution (do this before step 1).** `$ARGUMENTS` may be a plain feature description, a path to a Riffrec bundle (`riffrec-*.zip`, or a folder with `session.json` + `events.json` + `recording.webm` + `voice.webm`), a video/audio recording, or one or more screenshots. Resolve it into a concrete task before planning:
 
-   GATE: STOP. If ce-plan reported the task is non-software and cannot be processed in pipeline mode, stop the pipeline and inform the user that LFG requires software tasks. Otherwise, verify that the `ce-plan` workflow produced a plan file in `docs/plans/`. If no plan file was created, invoke `ce-plan` again with `$ARGUMENTS`. Do NOT proceed to step 2 until a written plan exists. **Record the plan file path** — it will be passed to ce-code-review in step 3.
+- Riffrec bundle, video, or audio: invoke the `ce-riffrec-feedback-analysis` skill on the path to extract structured product feedback — the bugs, UX issues, repro steps, and the user's spoken intent. Use that feedback as the task.
+- Screenshot image(s): view them and derive what is broken or requested. Use that as the task.
+- Plain text: use it verbatim.
+
+Call the result the **resolved task**, and note whether it came from a recording, video, or screenshots — a **feedback-sourced** run, which changes the PR body in step 8. Pass the resolved task (not bare `$ARGUMENTS`) to every step below.
+
+1. Invoke the `ce-plan` skill with the resolved task.
+
+   GATE: STOP. If ce-plan reported the task is non-software and cannot be processed in pipeline mode, stop the pipeline and inform the user that LFG requires software tasks. Otherwise, verify that the `ce-plan` workflow produced a plan file in `docs/plans/`. If no plan file was created, invoke `ce-plan` again with the resolved task. Do NOT proceed to step 2 until a written plan exists. **Record the plan file path** — it will be passed to ce-code-review in step 3.
 
 2. Invoke the `ce-work` skill.
 
+   If the task is a bug fix (especially a feedback-sourced run), reproduce the bug locally FIRST — set up the minimal local state and trigger the failing behavior — before writing any fix, so the fix targets the confirmed root cause. Prefer building that state synthetically; only pull production data as a last resort, and always anonymize names, emails, and other PII. Do not commit throwaway repro data (e.g., to seed files) unless it has lasting value for the team.
+
    GATE: STOP. Verify that implementation work was performed - files were created or modified beyond the plan. Do NOT proceed to step 3 if no code changes were made.
 
 3. Invoke the `ce-code-review` skill with `mode:agent plan:<plan-path-from-step-1>`.
@@ -52,13 +62,37 @@ When invoking any skill referenced below, resolve its name against the available
 
 6. Invoke the `ce-test-browser` skill with `mode:pipeline`.
 
-7. Invoke the `ce-commit-push-pr` skill.
+7. **Dogfood the change as a real user** (ALWAYS run; do not skip)
+
+   `ce-dogfood-beta` is `disable-model-invocation` and cannot be invoked from this pipeline, so perform its diff-scoped dogfooding behavior directly. This is hands-on exercise of the changed journeys, distinct from the automated browser tests in step 6.
+
+   Determine which observable surfaces the branch diff touches (web-ui, ios, cli, api, library, docs), then exercise the CHANGED user journeys the way a real user would — not just the happy path; include bad input, edge states, and back/forward navigation:
+
+   - web-ui: start or attach to the running app and drive the affected pages in a real browser (e.g., agent-browser).
+   - ios: drive the changed flows on the simulator.
+   - cli: run the changed commands as a user would, including bad input and unusual flag combinations.
+   - api / library: call the changed entrypoints as a consumer would, including misuse.
+
+   Fix any UX or behavior breakage found at its root cause, add a regression test for each fix, and re-check until the changed journeys work. Leave the fixes uncommitted in the working tree — step 8 commits and pushes them. If there is genuinely nothing runnable to exercise (e.g., a pure docs change), state that explicitly and continue.
+
+8. Invoke the `ce-commit-push-pr` skill.
+
+   This commits any remaining changes (including dogfood fixes from step 7), pushes the branch, and opens a pull request. If step 5 already opened a PR (check with `gh pr view --json number,url,state 2>/dev/null`), skip PR creation but still commit and push any uncommitted changes.
+
+   For observable changes (UI, CLI output, API behavior, generated artifacts), capture a demo NON-INTERACTIVELY via the `ce-demo-reel` skill and splice its markdown into the PR body. This run is autonomous — do not wait for an interactive "capture evidence?" prompt; decide to capture for observable changes, and skip only when there is genuinely nothing runnable to show.
+
+   If this is a feedback-sourced run (input resolution), write the PR body from this fixed template so the issue is understandable without watching the original recording:
 
-   This commits any remaining changes, pushes the branch, and opens a pull request. If step 5 already opened a PR (check with `gh pr view --json number,url,state 2>/dev/null`), skip PR creation but still commit and push any uncommitted changes.
+   - `## What the user reported` — one or two plain-language sentences, then the user's narration as a verbatim Markdown block quote (`>`), and any "before" frames from the analysis.
+   - `## The problem` — the technical root cause.
+   - `## How we reproduced it` — the minimal local state and exact steps.
+   - `## The fix` — what changed and why, kept tight.
+   - `## Demo` — the after evidence from `ce-demo-reel` proving it works.
+   - `## Testing` — the regression test covering this bug, plus any other checks run.
 
-8. **CI watch and autofix loop** (only when an open PR exists for the current branch)
+9. **CI watch and autofix loop** (only when an open PR exists for the current branch)
 
-   Detect the PR; if none exists or `gh` is unavailable, skip this step entirely and proceed to step 9.
+   Detect the PR; if none exists or `gh` is unavailable, skip this step entirely and proceed to step 10.
 
    ```bash
    gh pr view --json number,url,state
@@ -72,7 +106,7 @@ When invoking any skill referenced below, resolve its name against the available
       gh pr checks --watch
       ```
 
-      If the command exits 0, all checks passed. Break out of the loop and proceed to step 9.
+      If the command exits 0, all checks passed. Break out of the loop and proceed to step 10.
 
       If it exits non-zero, one or more checks failed. Continue to (2).
 
@@ -105,8 +139,8 @@ When invoking any skill referenced below, resolve its name against the available
      gh pr edit PR_NUMBER --body-file BODY_FILE
      ```
 
-   - Do NOT continue looping. The autopilot contract is "make residuals durable, then exit." Proceed to step 9.
+   - Do NOT continue looping. The autopilot contract is "make residuals durable, then exit." Proceed to step 10.
 
-9. Output `<promise>DONE</promise>` when complete
+10. Output `<promise>DONE</promise>` when complete
 
 Start with step 1 now. Remember: plan FIRST, then work. Never skip the plan.