Implements docs/helix/02-design/plan-2026-05-24-helix-self-improvement.md in
full across the three buckets (docs/ specs, workflows/, skill). No git commit
performed, per instructions.
git diff --check: CLEAN.- New/edited YAML frontmatter and edited
meta.ymlfiles: parse OK. skills/helix/SKILL.md: contains nocompatibilitytoken and no runtime-specific assertions (tool_use,ddx …,claude -p,codex exec); allvalidate-skills.shSKILL.md assertions pass (the suite reaches the later beads block, i.e. it got past every SKILL.md assertion).- ddx review graph: cycles I introduced via bidirectional
[[ID]]links were removed (see Assumption 4).ddx doc validateshows only pre-existing warnings (thedata-architecturetemplate cycle;ADR-001/ADR-003/TP-XXXduplicate-id warnings across example/fixture dirs;FEAT-004→FEAT-002/SD-002→SD-001missing-dep warnings). just test: three failures, all proven pre-existing by runningjust teston a pristineHEADworktree (sametest-skillsfailure point, no edits of mine). See "Pre-existing test failures" below.
| File | Why |
|---|---|
docs/helix/01-frame/features/FEAT-011-slider-autonomy.md |
L2 — restore the slider-autonomy feature (resolves dangling refs in input.md, CONTRACT-001/002). Reuses low/medium/high semantics; extends with concern-inference at high autonomy (FR-3), resolution precedence (FR-2), hard-stop invariant (FR-4), never-collapse-loop invariant (FR-5). Re-scoped to content-only methodology (no CLI). |
docs/helix/02-design/technical-designs/TD-011-slider-autonomy-implementation.md |
L2 — implementation design realizing FEAT-011 through action prompts + skill + runtime-neutral artifacts. Replaces the pre-collapse CLI/slider-config design. |
docs/helix/02-design/adr/ADR-003-autonomy-spectrum.md |
L2 — ADR recording the three-position autonomy decision, precedence chain, and the two invariants. Cited by FEAT-011 and the skill. |
docs/helix/01-frame/features/FEAT-016-artifact-honesty.md |
L1 — new feature for claims-vs-reality / ASSERTED_UNBACKED + zero-floor phantom-claim ratchet. Global artifact honesty; cross-referenced by FEAT-008 + FEAT-014. |
| File | Change |
|---|---|
docs/helix/01-frame/features/FEAT-006-...md |
L3 — FR-14 (concern selection is a required Frame step + high-autonomy inference) and FR-15 (propagation is a gate owned by check/polish, not selection). Re-stamped. |
docs/helix/01-frame/features/FEAT-008-...md |
L6 — FR-6 (AC↔test traceability: stable US-<n>-AC<m> IDs, story-level matrix, test-plan aggregation + layer allocation) + AC#5; cross-ref to FEAT-016. Re-stamped. |
docs/helix/01-frame/features/FEAT-014-workflow-coverage.md |
L4/L5 — new "Self-Validation Meta-Gates and Convergence" section (MG-01 mode-gates, MG-02 convergence = "verified + finding-classes folded into gates", MG-03 cross-ref FEAT-016) + two AC rows. |
docs/helix/02-design/contracts/CONTRACT-002-...md |
Re-stamped so the recorded FEAT-011 dependency hash matches the restored FEAT-011 (the specific "dependency changed: FEAT-011" staleness is resolved). |
docs/install/claude-code.md |
L7a — "Interactive vs headless activation" note: /helix slash is interactive-only; headless claude -p runs activate by skill name/description. Cross-ref FEAT-013. |
docs/install/codex.md |
L7a — "Activation phrasing (by name, not slash)" note for codex exec. Cross-ref FEAT-013. |
docs/helix/01-frame/prd.md |
Pre-existing reword at :97 (autonomy spectrum) / :98 (no loop flattening) / :90 (zero runtime-specific commands metric) — verified present; not edited this session. |
| File | Change |
|---|---|
workflows/actions/input.md |
Autonomy semantics extended with resolution precedence (FR-2), all-levels invariants (hard stop + never-collapse-loop), and high-autonomy concern inference. |
workflows/actions/frame.md |
L3 — STEP 0a makes concern selection a required step with low/medium interactive vs high inferred paths; STEP 6 measure gate now requires concern selection to have happened. |
workflows/references/concern-resolution.md |
L3/L7b — selection-vs-propagation-gate distinction; high-autonomy inferred-selection path; "Composed-concern runtime friction" (typescript-bun + react-nextjs / bun:sqlite under next build). Re-stamped. |
workflows/actions/implementation.md |
L4 — Step 7 self-validation mode-gate (acceptance SATISFIED + claims-vs-reality clean / zero ASSERTED_UNBACKED), framed as a verify-activity gate not a literal validate/align/check command list. |
workflows/actions/evolve.md |
L4/L5 — "Evolve Default and Convergence" section (progressive evolve over re-splat; convergence ≠ reviewer SHIP) + STEP 8 self-validation mode-gate. |
workflows/actions/measure.md |
L4 — STEP 2.5 Claims-vs-Reality check (ASSERTED_UNBACKED, zero floor). |
workflows/actions/report.md |
L4 — phantom-claim follow-on category. |
workflows/actions/fresh-eyes-review.md |
L5 — "Convergence" section (clean verdict necessary-not-sufficient; finding-classes folded into gates; progressive evolve). |
workflows/actions/reconcile-alignment.md |
L5 — convergence criterion in STEP 10. (The L1 ASSERTED_UNBACKED classification at Step 3 was already present pre-session and is kept.) |
workflows/activities/01-frame/artifacts/user-stories/{template,prompt,example}.md, meta.yml |
L6 — stable US-<n>-AC<m> AC IDs in Given/When/Then; example uses US-001-AC1..3; meta adds a blocking check + US-[0-9]+-AC[0-9]+ pattern check; test-scenario tables gain an AC-ID column. |
workflows/activities/03-test/artifacts/story-test-plan/{template,prompt,example}.md, meta.yml |
L6 — AC↔test matrix re-keyed on the stable AC ID (new AC ID column); meta adds an AC-ID pattern check; prompt/example state STP owns the matrix and TP aggregates. Example re-stamped. |
workflows/activities/03-test/artifacts/test-plan/{template,prompt}.md |
L6 — "Acceptance Criteria Layer Allocation" section: aggregates strategy + allocates AC classes to layers without duplicating the per-AC story matrix. |
workflows/concerns/typescript-bun/practices.md |
L7b — "Composed-Concern Friction": bun:* built-ins (incl. bun:sqlite) need the Bun runtime; next build runs under Node → use bun --bun or isolate; record as project override. |
workflows/concerns/react-nextjs/practices.md |
L7b — friction with typescript-bun: run Next.js under bun --bun so bun:sqlite resolves, or isolate the data layer; record as project override. |
workflows/ratchets.md |
Pre-existing L1 phantom-claim zero-floor sub-ratchet — kept (not edited this session). |
| File | Change |
|---|---|
skills/helix/SKILL.md |
New ## Autonomy section (low/medium/high + precedence + hard-stop + never-collapse-loop, runtime-neutral); Frame route makes concern selection a prominent required step with high-autonomy inference and adds the stable-AC-ID step; Evolve/Review/Build routes reference the self-validation/convergence gates (claims-vs-reality, finding-classes folded into gates, progressive evolve). No runtime-specific assertions; no compatibility token. |
Re-stamped (ddx doc stamp) the documents I content-edited that carry a review
block, in dependency order, plus CONTRACT-002:
FEAT-006, FEAT-008, concern-resolution.md,
user-stories/example.md, story-test-plan/example.md, CONTRACT-002.
The new draft features/design/ADR are left without their own review block
(matching the existing FEAT-014 precedent); the repo's review graph is broadly
unblessed at the root (helix.prd carries no review block), so a full re-bless
was intentionally not cascaded.
Proven by running just test on a pristine HEAD git worktree — it fails at the
identical point with none of my edits present:
test-skills(validate-skills.shexecution-ready beads block): the installedddx v0.6.2-alpha103CLI now returns the ready epichx-ready-epicfrombead ready --execution, whilescripts/validate_execution_ready_beads.pyexcludes epics. CLI-vs-validator version drift. The failing files (validate-skills.sh, the validator, the fixture) are unmodified by me.test-context-digests: beadhelix-e3c7d0e4(integration-test epic) is missing a<context-digest>— present in committed HEAD.ddx/beads.jsonl. I did not touch.ddx/beads.jsonl.test-demo-fixtures:docs/demos/helix-concerns/demo.shis absent (the dir predates this session). I did not touchdocs/demos/.
Per the project guidance "do not add workaround flags for others' bugs," I did not patch the validators or fixtures to mask these.
- FEAT-011 / TD-011 were rewritten, not byte-restored. The pre-collapse
versions governed a removed CLI surface (
helix run,execute-loop,.helix/slider-config.yaml) and were marked superseded; restoring them verbatim would contradict the current PRD. I reused thelow/medium/highsemantics frominput.mdand extended per the plan, scoped to the content-only methodology. - ADR number = ADR-003 (next free in
docs/helix/02-design/adr/after ADR-002; ADR-001 was deleted in the collapse). The duplicate-id warning vsdocs/examples/crm/.../ADR-003-database-orm.mdis the same tolerated class as the existingADR-001duplicates across example/fixture dirs. - Graph cycles avoided. Bidirectional
[[ID]]links create ddx graph cycles. To honor the plan's cross-ref direction (FEAT-008/014 → FEAT-016; autonomy deps point up), I trimmedFEAT-016depends_ontohelix.prdand made the back-references plain text:FEAT-016→FEAT-008/014, FEAT-011→ADR-003, FEAT-006→FEAT-011. Forward[[ ]]cross-refs (FEAT-008/014 → FEAT-016) are kept. - Website mirrors untouched.
website/content/artifacts/...is generated byscripts/publish-artifacts.py; the new FEAT-011/016, TD-011, ADR-003 were not hand-mirrored there. - L7a scope = headless runtimes. Added the by-name-activation note to
claude-code.mdandcodex.md(where slash inertness in-p/execruns matters). Copilot (repo-instructions file) and Genie (browser-only) do not have the same slash surface, so they were left unchanged. reconcile-alignment.mdandratchets.mdalready carried the L1ASSERTED_UNBACKEDratchet (Phase 0, uncommitted before this session). I kept them and only added the L5 convergence note toreconcile-alignment.md.