Implementation Notes — HELIX self-improvement (plan rev 2, 2026-05-24)

Implements docs/helix/02-design/plan-2026-05-24-helix-self-improvement.md in full across the three buckets (docs/ specs, workflows/, skill). No git commit performed, per instructions.

Verification summary

git diff --check: CLEAN.
New/edited YAML frontmatter and edited meta.yml files: parse OK.
skills/helix/SKILL.md: contains no compatibility token and no runtime-specific assertions (tool_use, ddx …, claude -p, codex exec); all validate-skills.sh SKILL.md assertions pass (the suite reaches the later beads block, i.e. it got past every SKILL.md assertion).
ddx review graph: cycles I introduced via bidirectional [[ID]] links were removed (see Assumption 4). ddx doc validate shows only pre-existing warnings (the data-architecture template cycle; ADR-001/ADR-003/TP-XXX duplicate-id warnings across example/fixture dirs; FEAT-004→FEAT-002 / SD-002→SD-001 missing-dep warnings).
just test: three failures, all proven pre-existing by running just test on a pristine HEAD worktree (same test-skills failure point, no edits of mine). See "Pre-existing test failures" below.

Files created

File	Why
`docs/helix/01-frame/features/FEAT-011-slider-autonomy.md`	L2 — restore the slider-autonomy feature (resolves dangling refs in `input.md`, CONTRACT-001/002). Reuses `low/medium/high` semantics; extends with concern-inference at high autonomy (FR-3), resolution precedence (FR-2), hard-stop invariant (FR-4), never-collapse-loop invariant (FR-5). Re-scoped to content-only methodology (no CLI).
`docs/helix/02-design/technical-designs/TD-011-slider-autonomy-implementation.md`	L2 — implementation design realizing FEAT-011 through action prompts + skill + runtime-neutral artifacts. Replaces the pre-collapse CLI/slider-config design.
`docs/helix/02-design/adr/ADR-003-autonomy-spectrum.md`	L2 — ADR recording the three-position autonomy decision, precedence chain, and the two invariants. Cited by FEAT-011 and the skill.
`docs/helix/01-frame/features/FEAT-016-artifact-honesty.md`	L1 — new feature for claims-vs-reality / `ASSERTED_UNBACKED` + zero-floor phantom-claim ratchet. Global artifact honesty; cross-referenced by FEAT-008 + FEAT-014.

Files modified

Bucket 1 — docs/ specs

File	Change
`docs/helix/01-frame/features/FEAT-006-...md`	L3 — FR-14 (concern selection is a required Frame step + high-autonomy inference) and FR-15 (propagation is a gate owned by check/polish, not selection). Re-stamped.
`docs/helix/01-frame/features/FEAT-008-...md`	L6 — FR-6 (AC↔test traceability: stable `US-<n>-AC<m>` IDs, story-level matrix, test-plan aggregation + layer allocation) + AC#5; cross-ref to FEAT-016. Re-stamped.
`docs/helix/01-frame/features/FEAT-014-workflow-coverage.md`	L4/L5 — new "Self-Validation Meta-Gates and Convergence" section (MG-01 mode-gates, MG-02 convergence = "verified + finding-classes folded into gates", MG-03 cross-ref FEAT-016) + two AC rows.
`docs/helix/02-design/contracts/CONTRACT-002-...md`	Re-stamped so the recorded `FEAT-011` dependency hash matches the restored FEAT-011 (the specific "dependency changed: FEAT-011" staleness is resolved).
`docs/install/claude-code.md`	L7a — "Interactive vs headless activation" note: `/helix` slash is interactive-only; headless `claude -p` runs activate by skill name/description. Cross-ref FEAT-013.
`docs/install/codex.md`	L7a — "Activation phrasing (by name, not slash)" note for `codex exec`. Cross-ref FEAT-013.
`docs/helix/01-frame/prd.md`	Pre-existing reword at `:97` (autonomy spectrum) / `:98` (no loop flattening) / `:90` (zero runtime-specific commands metric) — verified present; not edited this session.

Bucket 2 — workflows/

File	Change
`workflows/actions/input.md`	Autonomy semantics extended with resolution precedence (FR-2), all-levels invariants (hard stop + never-collapse-loop), and high-autonomy concern inference.
`workflows/actions/frame.md`	L3 — STEP 0a makes concern selection a required step with `low/medium` interactive vs `high` inferred paths; STEP 6 measure gate now requires concern selection to have happened.
`workflows/references/concern-resolution.md`	L3/L7b — selection-vs-propagation-gate distinction; high-autonomy inferred-selection path; "Composed-concern runtime friction" (typescript-bun + react-nextjs / `bun:sqlite` under `next build`). Re-stamped.
`workflows/actions/implementation.md`	L4 — Step 7 self-validation mode-gate (acceptance SATISFIED + claims-vs-reality clean / zero `ASSERTED_UNBACKED`), framed as a verify-activity gate not a literal validate/align/check command list.
`workflows/actions/evolve.md`	L4/L5 — "Evolve Default and Convergence" section (progressive evolve over re-splat; convergence ≠ reviewer SHIP) + STEP 8 self-validation mode-gate.
`workflows/actions/measure.md`	L4 — STEP 2.5 Claims-vs-Reality check (`ASSERTED_UNBACKED`, zero floor).
`workflows/actions/report.md`	L4 — `phantom-claim` follow-on category.
`workflows/actions/fresh-eyes-review.md`	L5 — "Convergence" section (clean verdict necessary-not-sufficient; finding-classes folded into gates; progressive evolve).
`workflows/actions/reconcile-alignment.md`	L5 — convergence criterion in STEP 10. (The L1 `ASSERTED_UNBACKED` classification at Step 3 was already present pre-session and is kept.)
`workflows/activities/01-frame/artifacts/user-stories/{template,prompt,example}.md`, `meta.yml`	L6 — stable `US-<n>-AC<m>` AC IDs in Given/When/Then; example uses `US-001-AC1..3`; meta adds a blocking check + `US-[0-9]+-AC[0-9]+` pattern check; test-scenario tables gain an AC-ID column.
`workflows/activities/03-test/artifacts/story-test-plan/{template,prompt,example}.md`, `meta.yml`	L6 — AC↔test matrix re-keyed on the stable AC ID (new `AC ID` column); meta adds an AC-ID pattern check; prompt/example state STP owns the matrix and TP aggregates. Example re-stamped.
`workflows/activities/03-test/artifacts/test-plan/{template,prompt}.md`	L6 — "Acceptance Criteria Layer Allocation" section: aggregates strategy + allocates AC classes to layers without duplicating the per-AC story matrix.
`workflows/concerns/typescript-bun/practices.md`	L7b — "Composed-Concern Friction": `bun:*` built-ins (incl. `bun:sqlite`) need the Bun runtime; `next build` runs under Node → use `bun --bun` or isolate; record as project override.
`workflows/concerns/react-nextjs/practices.md`	L7b — friction with typescript-bun: run Next.js under `bun --bun` so `bun:sqlite` resolves, or isolate the data layer; record as project override.
`workflows/ratchets.md`	Pre-existing L1 phantom-claim zero-floor sub-ratchet — kept (not edited this session).

Bucket 3 — skill

File Change

skills/helix/SKILL.md New ## Autonomy section (low/medium/high + precedence + hard-stop + never-collapse-loop, runtime-neutral); Frame route makes concern selection a prominent required step with high-autonomy inference and adds the stable-AC-ID step; Evolve/Review/Build routes reference the self-validation/convergence gates (claims-vs-reality, finding-classes folded into gates, progressive evolve). No runtime-specific assertions; no compatibility token.

File	Change
`skills/helix/SKILL.md`	New `## Autonomy` section (low/medium/high + precedence + hard-stop + never-collapse-loop, runtime-neutral); Frame route makes concern selection a prominent required step with high-autonomy inference and adds the stable-AC-ID step; Evolve/Review/Build routes reference the self-validation/convergence gates (claims-vs-reality, finding-classes folded into gates, progressive evolve). No runtime-specific assertions; no `compatibility` token.

ddx re-blessing

Re-stamped (ddx doc stamp) the documents I content-edited that carry a review block, in dependency order, plus CONTRACT-002: FEAT-006, FEAT-008, concern-resolution.md, user-stories/example.md, story-test-plan/example.md, CONTRACT-002. The new draft features/design/ADR are left without their own review block (matching the existing FEAT-014 precedent); the repo's review graph is broadly unblessed at the root (helix.prd carries no review block), so a full re-bless was intentionally not cascaded.

Pre-existing test failures (NOT caused by this work)

Proven by running just test on a pristine HEAD git worktree — it fails at the identical point with none of my edits present:

test-skills (validate-skills.sh execution-ready beads block): the installed ddx v0.6.2-alpha103 CLI now returns the ready epic hx-ready-epic from bead ready --execution, while scripts/validate_execution_ready_beads.py excludes epics. CLI-vs-validator version drift. The failing files (validate-skills.sh, the validator, the fixture) are unmodified by me.
test-context-digests: bead helix-e3c7d0e4 (integration-test epic) is missing a <context-digest> — present in committed HEAD .ddx/beads.jsonl. I did not touch .ddx/beads.jsonl.
test-demo-fixtures: docs/demos/helix-concerns/demo.sh is absent (the dir predates this session). I did not touch docs/demos/.

Per the project guidance "do not add workaround flags for others' bugs," I did not patch the validators or fixtures to mask these.

Assumptions recorded

FEAT-011 / TD-011 were rewritten, not byte-restored. The pre-collapse versions governed a removed CLI surface (helix run, execute-loop, .helix/slider-config.yaml) and were marked superseded; restoring them verbatim would contradict the current PRD. I reused the low/medium/high semantics from input.md and extended per the plan, scoped to the content-only methodology.
ADR number = ADR-003 (next free in docs/helix/02-design/adr/ after ADR-002; ADR-001 was deleted in the collapse). The duplicate-id warning vs docs/examples/crm/.../ADR-003-database-orm.md is the same tolerated class as the existing ADR-001 duplicates across example/fixture dirs.
Graph cycles avoided. Bidirectional [[ID]] links create ddx graph cycles. To honor the plan's cross-ref direction (FEAT-008/014 → FEAT-016; autonomy deps point up), I trimmed FEAT-016 depends_on to helix.prd and made the back-references plain text: FEAT-016→FEAT-008/014, FEAT-011→ADR-003, FEAT-006→FEAT-011. Forward [[ ]] cross-refs (FEAT-008/014 → FEAT-016) are kept.
Website mirrors untouched. website/content/artifacts/... is generated by scripts/publish-artifacts.py; the new FEAT-011/016, TD-011, ADR-003 were not hand-mirrored there.
L7a scope = headless runtimes. Added the by-name-activation note to claude-code.md and codex.md (where slash inertness in -p/exec runs matters). Copilot (repo-instructions file) and Genie (browser-only) do not have the same slash surface, so they were left unchanged.
reconcile-alignment.md and ratchets.md already carried the L1 ASSERTED_UNBACKED ratchet (Phase 0, uncommitted before this session). I kept them and only added the L5 convergence note to reconcile-alignment.md.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implementation Notes — HELIX self-improvement (plan rev 2, 2026-05-24)

Verification summary

Files created

Files modified

Bucket 1 — docs/ specs

Bucket 2 — workflows/

Bucket 3 — skill

ddx re-blessing

Pre-existing test failures (NOT caused by this work)

Assumptions recorded

FilesExpand file tree

IMPLEMENTATION-NOTES.md

Latest commit

History

IMPLEMENTATION-NOTES.md

File metadata and controls

Implementation Notes — HELIX self-improvement (plan rev 2, 2026-05-24)

Verification summary

Files created

Files modified

Bucket 1 — docs/ specs

Bucket 2 — workflows/

Bucket 3 — skill

ddx re-blessing

Pre-existing test failures (NOT caused by this work)

Assumptions recorded