| author | agents |
|---|---|
| status | active roadmap |
| last_reviewed | 2026-03-05 |
| source_of_truth_scope | next steps to move from partial checks to fully enforced contributor harnesses |
Purpose: track the next steps to move from "documented guidance + partial checks" to "fully enforced, measurable, and continuously maintained" contributor harnesses.
Implemented:
- Root AGENTS is compact and acts as a TOC.
- `pnpm agents:check` validates AGENTS/doc harness integrity.
- CI job `agent-harness` enforces these checks on PRs/pushes.
Still pending:
- Many architecture constraints are still prose-only or warning-only, not blocking.
- No scenario-based eval harness with pass-rate trend tracking.
- No scheduled drift-maintenance automation that opens maintenance PRs.
Goal: convert boundary guidance into blocking checks with safe rollout.
- Add a baseline-backed guardrail for `common/** -> modules/**` imports.
- Promote selected warning-level boundary rules to error-level where safe.
- Encode critical module dependency direction rules as machine checks.
- Add `tools/scripts/common-modules-guardrail.mjs` with baseline support.
- Add `tools/baselines/common-to-modules-baseline.txt`.
- Add scripts:
  - `common-modules:check`
  - `common-modules:baseline:update`
- Wire `common-modules:check` into `agents:check`.
- Incrementally tighten lint rules in `eslint.config.js`:
  - start with narrow, high-confidence denies
  - keep remediation text actionable
- New `common -> modules` violations fail CI.
- Selected boundary rules promoted from warning to error with no broad breakage.
- Legacy violations visible via baseline files and trendable counts.
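
A minimal sketch of how the baseline comparison in such a guardrail might work. It assumes import edges have already been extracted as repo-relative `{ from, to }` pairs and that the baseline file holds one `from -> to` key per line; the function names, the key format, and the example paths are all hypothetical, not the actual contents of `common-modules-guardrail.mjs`.

```javascript
// Hypothetical core of a baseline-backed guardrail.
// Assumption: `edges` is a list of { from, to } repo-relative import edges
// produced by some separate extraction step (not shown here).

// Returns sorted, de-duplicated "from -> to" keys for imports that
// cross from common/ into modules/.
function findViolations(edges) {
  const keys = edges
    .filter((e) => e.from.startsWith("common/") && e.to.startsWith("modules/"))
    .map((e) => `${e.from} -> ${e.to}`);
  return [...new Set(keys)].sort();
}

// Parses baseline text: one violation key per line.
// Blank lines and lines starting with "#" are ignored (assumed format).
function parseBaseline(text) {
  return new Set(
    text
      .split(/\r?\n/)
      .map((line) => line.trim())
      .filter((line) => line !== "" && !line.startsWith("#"))
  );
}

// Compares current violations with the baseline.
// `added` should fail the check; `resolved` entries can be dropped from the baseline.
function checkAgainstBaseline(violations, baseline) {
  const current = new Set(violations);
  const added = violations.filter((v) => !baseline.has(v));
  const resolved = [...baseline].filter((v) => !current.has(v)).sort();
  return {
    ok: added.length === 0,
    added,
    resolved,
    legacyCount: violations.length - added.length,
  };
}

// Example with made-up paths.
const edges = [
  { from: "common/ui/button.ts", to: "modules/billing/api.ts" }, // legacy, in baseline
  { from: "common/utils/date.ts", to: "modules/reports/format.ts" }, // new violation
  { from: "modules/billing/view.ts", to: "common/ui/button.ts" }, // allowed direction
];
const baseline = parseBaseline(
  "# legacy violations\ncommon/ui/button.ts -> modules/billing/api.ts\n"
);
const result = checkAgainstBaseline(findViolations(edges), baseline);
console.log(JSON.stringify(result, null, 2));
```

Keeping the comparison separate from file I/O means the same logic could back both a check script (exit non-zero when `ok` is false) and a baseline update script (rewrite the baseline from the current violations), while `legacyCount` gives a number to trend over time.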
Goal: measure agent/code-change quality with repeatable tasks and score trends.
- Create executable scenario suite for representative repo tasks.
- Add pass/fail assertions and metrics output.
- Run smoke evals on PR and full evals on schedule.
- Add `tools/harness/scenarios/` with 10-20 scoped tasks:
  - boundary fix
  - data fetching pattern slice
  - typed bugfix
  - small refactor with tests
- Add `tools/harness/run.mjs`:
  - executes scenarios
  - emits JSON report with pass rate and regressions
- Add CI integration:
  - PR: smoke subset
  - nightly: full suite
- Store historical reports as workflow artifacts or committed snapshots.
- Harness pass-rate and regressions are visible over time.
- Failing scenarios are actionable and link to exact checks.
- Scenario set is maintained as architecture/rules evolve.
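
One possible shape for the report step, sketched under assumptions: each scenario yields an `{ id, passed }` result, and a regression is a scenario that passed in the previous report but fails now. The report fields and scenario ids below are illustrative and not a defined format for `tools/harness/run.mjs`.

```javascript
// Hypothetical report builder for a scenario harness.
// results: [{ id: string, passed: boolean }]
// previous: an earlier report from this same function, or null on the first run.
function buildReport(results, previous = null) {
  const total = results.length;
  const passedCount = results.filter((r) => r.passed).length;

  // Ids that passed last time; empty when there is no previous report.
  const previouslyPassing = new Set(
    previous ? previous.scenarios.filter((s) => s.passed).map((s) => s.id) : []
  );

  // A regression is a scenario that passed before and fails now.
  const regressions = results
    .filter((r) => !r.passed && previouslyPassing.has(r.id))
    .map((r) => r.id);

  return {
    total,
    passed: passedCount,
    passRate: total === 0 ? 0 : passedCount / total,
    regressions,
    scenarios: results.map((r) => ({ id: r.id, passed: r.passed })),
  };
}

// Example with made-up scenario ids.
const previousReport = buildReport([
  { id: "boundary-fix", passed: true },
  { id: "typed-bugfix", passed: true },
]);
const currentReport = buildReport(
  [
    { id: "boundary-fix", passed: true },
    { id: "typed-bugfix", passed: false }, // passed before, fails now
    { id: "small-refactor", passed: false }, // new scenario, not a regression
  ],
  previousReport
);
console.log(JSON.stringify(currentReport, null, 2));
```

Treating newly added failing scenarios as plain failures rather than regressions keeps the regression list focused on behavior that actually got worse between runs.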
Goal: pay down harness/codebase drift continuously with small automated maintenance loops.
- Add scheduled workflow that runs harness and drift scans.
- Auto-open maintenance PRs for safe updates.
- Keep quality grades and plan status fresh.
- Add scheduled workflow (weekly):
  - run `pnpm agents:check`
  - run architecture drift scripts
  - scan stale `.plans/active` entries
- Generate/update:
  - `docs/QUALITY.md` freshness section
  - baseline drift summaries
- Auto-open a maintenance PR with constrained file scope.
- Require normal CI + owner review for merge.
- Weekly maintenance PR is generated reliably.
- Quality ledger and baselines stay current with minimal manual effort.
- Drift backlog no longer accumulates silently.
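
The stale-plan scan could be as small as the sketch below. It assumes each plan entry arrives with a path and a last-updated timestamp (how that timestamp is obtained, for example from version control history, is left out) and uses a 30-day threshold purely as a placeholder, since this roadmap does not define what counts as stale. The function name and example file names are hypothetical.

```javascript
// Hypothetical stale-plan scan for a scheduled maintenance job.
const DAY_MS = 24 * 60 * 60 * 1000;

// entries: [{ path: string, lastUpdated: ISO-8601 string }]
// now: Date used as the reference point (passed in so runs are reproducible)
// maxAgeDays: assumed staleness threshold; 30 is a placeholder, not a project rule
function findStalePlans(entries, now, maxAgeDays = 30) {
  return entries
    .map((e) => ({
      path: e.path,
      ageDays: Math.floor(
        (now.getTime() - new Date(e.lastUpdated).getTime()) / DAY_MS
      ),
    }))
    .filter((e) => e.ageDays > maxAgeDays)
    .sort((a, b) => b.ageDays - a.ageDays); // oldest first
}

// Example with made-up plan files.
const now = new Date("2026-03-05T00:00:00Z");
const stale = findStalePlans(
  [
    { path: ".plans/active/example-a.md", lastUpdated: "2026-01-01T00:00:00Z" },
    { path: ".plans/active/example-b.md", lastUpdated: "2026-02-20T00:00:00Z" },
  ],
  now
);
console.log(stale);
```

The resulting list could feed the body of the weekly maintenance PR, so stale plans surface for owner review instead of being edited automatically.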
Recommended order:
- Track 1 (highest risk-reduction and immediate enforcement value)
- Track 2 (adds measurable quality signal)
- Track 3 (sustains long-term health with lower manual overhead)
- Update this roadmap when milestone status changes.
- Mirror key progress in `docs/QUALITY.md`.
- For active execution, create/maintain a concrete plan under `.plans/active/`.