.agents/skills/validate-user-surface/SKILL.md (+8 -2)
@@ -18,6 +18,7 @@ Treat the repository like a release candidate. Prefer live execution against bui
 
 - Run live checks for user-facing behavior. Do not count unit tests or code reading as release validation.
 - Use isolated temp homes and temp repos. Do not reuse the operator's real `~/.config/devagent`.
+- Environment variables are not the only credential source. The live harness can seed non-expired credentials from the local DevAgent `CredentialStore` into isolated homes; run `bun run validate:live:provider-smoke` before marking providers blocked just because API-key env vars are unset.
 - Prefer the publish bundle for install and packaging checks. Validate the developer CLI separately only when comparing dev-versus-publish behavior.
 - Treat missing provider credentials or missing external dependencies as validation gaps, not silent skips.
 - Do not publish to npm unless the user explicitly asks.
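The seeding rule in this hunk (copy only non-expired credentials into isolated homes) can be sketched roughly as follows. This is a minimal illustration, not the harness's actual code: the `StoredCredential` shape, the `expiresAt` field, and the `selectSeedCredentials` helper are all hypothetical, since the diff does not show the real `CredentialStore` API.

```typescript
// Hypothetical record shape; the real CredentialStore entries may differ.
interface StoredCredential {
  provider: string;
  token: string;
  expiresAt?: number; // epoch milliseconds; undefined is assumed to mean "no expiry"
}

// Keep only credentials that belong to a required provider and have not expired.
function selectSeedCredentials(
  stored: StoredCredential[],
  requiredProviders: string[],
  now: number = Date.now(),
): StoredCredential[] {
  const required = new Set(requiredProviders);
  return stored.filter(
    (cred) =>
      required.has(cred.provider) &&
      (cred.expiresAt === undefined || cred.expiresAt > now),
  );
}
```

Filtering by provider as well as by expiry keeps unrelated secrets out of the disposable home, which seems to be the intent of "copy only the required" credentials.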
@@ -39,7 +40,7 @@ bun run test:live-validation
 bun run validate:live:full
 ```
 
-2. Create isolated homes and disposable workspaces for each install, auth, TUI, and query-flow pass.
+2. Create isolated homes and disposable workspaces for each install, auth, TUI, and query-flow pass. When provider credentials exist in the local DevAgent credential store, copy only the required non-expired credentials into those isolated homes rather than running against the operator's real HOME.
 3. Use `cd dist && npm pack` to create a publishable tarball, then validate install and launch paths from that artifact.
 4. Exercise documented install and launch paths live: tarball install, `npx`, `bunx`, bundled bootstrap, and linked local CLI when helpful.
 5. Cover the provider matrix from `references/release-matrix.md`. Prefer every documented provider. If full coverage is impossible, call out each unvalidated provider explicitly.
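One way to picture an "isolated home" for step 2 is as an environment map handed to each child process. The sketch below is an assumption-laden illustration: `buildIsolatedEnv` and `IsolatedEnv` are hypothetical names, the `.config` and `.cache` subdirectories are a guess at a sensible layout, and POSIX-style paths are assumed. It also sets `XDG_CONFIG_HOME` and `XDG_CACHE_HOME`, which the release matrix asks to isolate alongside `HOME`.

```typescript
// Environment for one validation pass, rooted in a disposable directory.
interface IsolatedEnv {
  HOME: string;
  XDG_CONFIG_HOME: string;
  XDG_CACHE_HOME: string;
  PATH: string;
}

// Pure helper: derives the isolated variables from a temp root.
// It does not create directories; the caller is assumed to do that.
function buildIsolatedEnv(tempRoot: string, inheritedPath: string): IsolatedEnv {
  const root = tempRoot.replace(/\/+$/, ""); // drop trailing slashes
  return {
    HOME: root,
    XDG_CONFIG_HOME: `${root}/.config`,
    XDG_CACHE_HOME: `${root}/.cache`,
    PATH: inheritedPath, // keep PATH so node, npm, and bun still resolve
  };
}
```

In practice the root would likely come from something like `fs.mkdtempSync(path.join(os.tmpdir(), "devagent-"))`, with one fresh root per install, auth, TUI, or query-flow pass.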
@@ -49,13 +50,18 @@ bun run validate:live:full
 
 ## Mandatory Surfaces
 
-- Packaging and install: `bun run build:publish`, `bun run test:bundle-smoke`, `npm pack` from `dist/`, tarball install, uninstall and reinstall, Node 20 bootstrap, and upgrade behavior.
+- Packaging and install: `bun run build:publish`, `bun run test:bundle-smoke`, `npm pack` from `dist/`, tarball install, uninstall and reinstall, Node 20 bootstrap help, installed-runtime session startup, and upgrade behavior.
 - Auth: `devagent auth login/status/logout` for API-key providers and device-code providers in isolated homes.
 - Query execution: interactive TUI, single-shot query execution, quiet and non-TTY behavior, `devagent review`, and `devagent execute`.
 - Provider coverage: Anthropic, OpenAI, Devagent API, DeepSeek, OpenRouter, Ollama, ChatGPT, and GitHub Copilot when credentials or local services are available.
 
+## Credential And Bootstrap Notes
+
+- Use `bun run validate:live:provider-smoke` to discover locally stored credentials and local services; it seeds isolated homes from `CredentialStore` and reports per-provider pass/block status.
+- For raw bundle checks, `node dist/bootstrap.js --help` is valid. Validate `sessions` from a staged or installed publish runtime, because raw `dist/` does not include installed native dependencies such as `better-sqlite3`.
+
 ## Reporting
 
 - Summarize by surface: packaging, install, docs, CLI, TUI, auth, review, execute, and provider matrix.
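The skill asks that unvalidated providers be called out explicitly rather than skipped silently. A small sketch of that reporting step, assuming a hypothetical per-provider result map (the real `provider-smoke` output format is not shown in this diff, and `findValidationGaps` is an illustrative name):

```typescript
type ProviderStatus = "pass" | "fail" | "blocked";

// Providers named in the Mandatory Surfaces list.
const DOCUMENTED_PROVIDERS = [
  "Anthropic",
  "OpenAI",
  "Devagent API",
  "DeepSeek",
  "OpenRouter",
  "Ollama",
  "ChatGPT",
  "GitHub Copilot",
] as const;

// Returns one report line per provider that was never actually validated:
// either blocked (for example, no credentials) or missing from the results.
function findValidationGaps(
  results: Partial<Record<string, ProviderStatus>>,
): string[] {
  const gaps: string[] = [];
  for (const provider of DOCUMENTED_PROVIDERS) {
    const status = results[provider];
    if (status === "pass" || status === "fail") continue; // it ran, so not a gap
    gaps.push(`${provider}: ${status ?? "not run"}`);
  }
  return gaps;
}
```

A failing provider is deliberately not treated as a gap here: it was exercised and belongs in the defect list, whereas blocked or missing providers belong in the coverage-gap list.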
.agents/skills/validate-user-surface/references/release-matrix.md (+4 -1)
@@ -23,6 +23,7 @@ Use this file when planning coverage or writing the final report.
 - Validate on real Node 20+ because the publish bootstrap targets Node, not Bun's Node shim.
 - Validate Bun-backed developer flows when the README or local contributor workflow depends on Bun.
 - Use isolated `HOME`, `XDG_CONFIG_HOME`, and `XDG_CACHE_HOME` for every install and auth pass.
+- Do not infer provider credentials only from environment variables. The live validation harness can copy non-expired credentials from the local DevAgent `CredentialStore` into isolated homes; run provider smoke before marking credential-backed providers blocked.
 - Keep one clean temp repo for install and help checks and separate temp repos for query and mutation scenarios.
 
 ## Packaging And Install Matrix
@@ -32,7 +33,8 @@ Use this file when planning coverage or writing the final report.
 - Run `cd dist && npm pack`.
 - Install the tarball into a temp prefix and verify `devagent help`, `devagent version`, and `devagent doctor`.
 - Remove that install and repeat to catch stale-file issues.
-- Validate `node dist/bootstrap.js --help` and `node dist/bootstrap.js sessions`.
+- Validate `node dist/bootstrap.js --help` directly from raw `dist/`.
+- Validate `sessions` from the staged or installed publish runtime, because raw `dist/` intentionally lacks installed native dependencies such as `better-sqlite3`.
 - Validate `npx` and `bunx` invocation paths against a prerelease tag when one exists.
 - If no prerelease tag exists, validate the closest local equivalent and mark registry-backed `npx` or `bunx` as still pending.
 - Compare `dist/package.json` and copied `dist/README.md` against the root contract.
@@ -133,6 +135,7 @@ For each provider:
 - Use `bun run scripts/live-validation.ts --list-scenarios` to inventory current scenarios.
 - Run the full suite before adding bespoke manual checks.
 - Inspect `summary.json` and `summary.md` from the generated output directory.
+- Run `bun run validate:live:provider-smoke` before declaring provider coverage blocked; it verifies stored local credentials and Ollama service availability from isolated homes.
 - Use `bun run validate:live:execute-deep` when you need one ordered `execute` packet with prereqs, canonical staged flow, continuity checks, remainder coverage, and per-scenario review notes.
 - Use `bun run validate:live:execute-deep --only canonical|continuity|remainder --skip-prereqs` for focused local reruns after a broader packet establishes the baseline.
 - Use `bun run validate:live:execute-chain` when you need one disposable-worktree run that carries real stage artifacts forward into `implement`, `review`, and `repair`.
packages/executor/src/index.ts (+3 -1)
@@ -787,7 +787,9 @@ export function buildTaskQuery(
       );
       break;
     case "review":
-      sections.push("Review the current workspace changes and produce a report with either `No defects found.` or one section per defect using the format `Severity: <low|medium|high|critical>` plus a concrete fix recommendation.");
+      sections.push("Workspace is review-only for this stage. No file changes are allowed.");
+      sections.push("Do not use update_plan for this stage. Inspect the current workspace changes as needed, then return the final review artifact directly.");
+      sections.push("Produce a direct review report with either exactly `No defects found.` or one section per defect using the format `Severity: <low|medium|high|critical>` plus a concrete fix recommendation.");
       break;
     case "repair":
       sections.push("Apply repairs for the current issue, address the review findings, and summarize fixes applied plus remaining concerns.");
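The tightened review prompt implies a report with two legal shapes: exactly `No defects found.`, or one section per defect carrying a `Severity:` value. A downstream check for that contract might look like the sketch below. It is only an illustration: `classifyReviewReport` and `ReviewVerdict` are hypothetical names that do not appear in this diff, and the code assumes each defect section puts its `Severity: ...` marker on a line of its own.

```typescript
type ReviewVerdict =
  | { kind: "clean" }
  | { kind: "defects"; severities: string[] }
  | { kind: "malformed" };

// Matches a standalone line such as "Severity: high" (case-insensitive).
const SEVERITY_LINE = /^\s*Severity:\s*(low|medium|high|critical)\s*$/i;

function classifyReviewReport(report: string): ReviewVerdict {
  const text = report.trim();
  // The prompt asks for exactly this sentence when nothing is wrong.
  if (text === "No defects found.") return { kind: "clean" };

  // Otherwise collect one severity per defect section.
  const severities: string[] = [];
  for (const line of text.split("\n")) {
    const match = SEVERITY_LINE.exec(line);
    if (match) severities.push(match[1].toLowerCase());
  }
  return severities.length > 0
    ? { kind: "defects", severities }
    : { kind: "malformed" };
}
```

Treating anything else as `malformed` mirrors the word "exactly" in the new prompt: a report that hedges around `No defects found.` would be flagged instead of silently accepted.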