Fix CLI to accept positional URI arguments and support init command
- Remove Required:true from URI flags in schemas, records, namespaces, watch, import/export commands to allow positional argument parsing via getURI()
- Skip RPC connection in root Before hook for init and daemon commands since they manage configuration before the daemon is running
- Add ArgsUsage hints to schema commands for better help text
- Add Before hook to init command to prevent connection attempt during initialization
Run the xdb end-to-end suite. `$ARGUMENTS` (may be empty) is an optional scenario filter.

## Hard rules: for you AND every sub-agent you spawn

1. **No edits to anything in the repo.** Not source, not `scenarios.yaml`, not `RUNBOOK.md`, not commit history. The only mutable state is `$T` (created with `T=$(mktemp -d)`) per sub-agent.
2. **Sub-agents are black-box runners.** If a scenario fails because the CLI is broken, that is the *finding*. Do not patch the code. Do not modify the YAML to make the test pass.
3. **You (the orchestrator) do not patch either.** If multiple sub-agents report the same failure, surface it once and stop. Do not spawn a "fixer" agent.
4. **Build is the only allowed write.** `make build` (or `go build -o bin/xdb ./cmd/xdb` from the cmd module if `make build` doesn't produce a binary). Nothing else.
5. **Do not skip [tests/e2e/RUNBOOK.md](tests/e2e/RUNBOOK.md).** It contains the same hard rules in agent-facing form. Spawn sub-agents with it as the prompt verbatim; do not paraphrase or shorten.

## Steps

1. **Build if missing.** If `bin/xdb` does not exist, build it. If it exists, skip and tell the user.

2. **Spawn one sub-agent per scenario, in parallel** (one Agent tool call per scenario, all in a single message). Each agent owns its own `HOME=$T` so daemons cannot collide.

   Read `tests/e2e/scenarios.yaml` first to enumerate scenario names. For each scenario matching `$ARGUMENTS` (empty = all):
   - `prompt`: the **full contents** of `tests/e2e/RUNBOOK.md` followed by:

     ```
     Repo root: <absolute path>
     xdb binary: <absolute path>/bin/xdb
     Scenario filter: <single-scenario-name>

     REMINDER: black-box runner only. No edits to any file in the repo. CLI bugs are FAILs to report, not problems to solve. Violating this is itself a test failure.
     ```

3. **Single-scenario invocation.** If `$ARGUMENTS` names exactly one scenario, spawn just that one sub-agent (no fan-out).

4. **Cleanup verification.** After all sub-agents finish:
   - Run `git status --short`. If anything beyond untracked temp files appears (especially under `cmd/`, `api/`, `rpc/`, `core/`, `store/`, `tests/e2e/scenarios.yaml`, `tests/e2e/RUNBOOK.md`), a sub-agent overstepped: report which files and stop. **Do not auto-revert** without telling the user.
   - Run `pgrep -f "/tmp.*xdb.*daemon|xdb-test.*daemon"` (`pgrep` patterns are extended regular expressions, so alternation is a bare `|`). If any stray daemon is still running, list it and ask the user before killing.

5. **Report.** Merge all sub-agent reports into one markdown table. Print only:
   - the table
   - the summary line (e.g. `8 scenarios: 5 PASS, 1 SKIP, 2 FAIL`)
   - any cleanup-verification anomalies from step 4
   - final `**suite passed**` or `**suite failed**`

Suppress build chatter and per-agent reasoning. Do not propose fixes for failing scenarios in this message; that is a separate task the user can request.

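The `git status --short` check in the cleanup-verification step could be sketched as a small parser. This is only an illustration and not part of the repo: the names `flag_overstep` and `PROTECTED` are hypothetical, and treating every tracked change plus any entry under a protected path as an overstep is one possible reading of "anything beyond untracked temp files". Quoted paths (names with special characters) are not handled.

```python
# Hypothetical helper: flags entries from `git status --short` output that
# suggest a sub-agent modified the repo. The path list comes from the text above.
PROTECTED = (
    "cmd/", "api/", "rpc/", "core/", "store/",
    "tests/e2e/scenarios.yaml", "tests/e2e/RUNBOOK.md",
)


def flag_overstep(status_short: str) -> list[str]:
    """Return paths that are tracked changes or live under a protected path."""
    flagged = []
    for line in status_short.splitlines():
        if len(line) < 4:
            continue  # not a "XY path" entry
        code, path = line[:2], line[3:]
        if " -> " in path:
            # Renames are printed as "old -> new"; keep the new path.
            path = path.split(" -> ", 1)[1]
        # "??" marks untracked files; those are tolerated unless protected.
        if code != "??" or path.startswith(PROTECTED):
            flagged.append(path)
    return flagged
```

Feed it the raw, unstripped command output, since the leading space in entries such as ` M file` is part of the two-character status code.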
You are running the xdb end-to-end suite. **You are a black-box test runner. Your only job is to execute the spec and report what happened, never to fix what you find.**

## Hard rules (violating any of these is itself a test failure)

1. **No file edits anywhere in the repo.** Not source code, not `scenarios.yaml`, not even comments. Tools allowed: `Bash`, `Read`, `Grep`, `Glob`. **Do not** call `Edit`, `Write`, or any tool that mutates the working tree.
2. **No `git` mutations.** No commits, checkouts, stashes, resets. `git status` / `git diff` for diagnostics only.
3. **If a step fails because the CLI itself is broken** (wrong flag accepted, missing command, wrong error code), that is a **scenario FAIL**, not a problem for you to solve. Record it and move on.
4. **If you cannot even start** (binary missing, `xdb init` exits non-zero, daemon won't bind), abort the entire run and emit `SUITE_FAILED` with the startup error in the report. Do not try to repair the binary, the config, or the daemon.
5. **All mutable state lives under `$T` (your `mktemp -d`).** Never write outside it.

If you are tempted to "just fix this small thing to make the test pass", stop. The test failing is the whole point.

All state lives in a temp dir you control. Use only Bash, Read, Grep, Glob.

## Inputs
- `tests/e2e/scenarios.yaml`: the spec (read it).
- Optional scenario filter passed in the prompt: a single `name`, a comma-separated list, or empty (run all).
- Skip any scenario with `skip_if_unimplemented: true` and report it as SKIP.

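The filter and skip rules above can be sketched as follows. This is a minimal illustration under assumptions: the function name `select_scenarios` is hypothetical, and the scenarios are assumed to be already parsed from `tests/e2e/scenarios.yaml` into a list of dicts carrying a `name` key and an optional `skip_if_unimplemented` key (the real schema is not shown here).

```python
def select_scenarios(scenarios: list[dict], filter_arg: str) -> tuple[list[str], list[str]]:
    """Split scenarios into (names to run, names to report as SKIP).

    filter_arg is a single name, a comma-separated list, or empty (run all).
    """
    wanted = {name.strip() for name in filter_arg.split(",") if name.strip()}
    to_run, skipped = [], []
    for scenario in scenarios:
        name = scenario["name"]
        if wanted and name not in wanted:
            continue  # filtered out: neither run nor reported
        if scenario.get("skip_if_unimplemented"):
            skipped.append(name)
        else:
            to_run.append(name)
    return to_run, skipped
```

Whether a filtered-out scenario should be omitted from the report entirely, as it is here, is a design choice the text above leaves open.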
## Setup (once)
1. Create temp root: `T=$(mktemp -d)` and `mkdir -p "$T/home"`.
2. Export an isolated env for every command you run:
   ```
   HOME="$T/home"
   PATH="<repo>/bin:$PATH"   # so `xdb` resolves to the freshly-built binary
   ```
3. Run `xdb init`. It must exit 0 and start the daemon. If exit ≠ 0, abort the run, print the stderr, and report.
4. Verify with `xdb daemon status` (exit 0).
5. Capture `NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)`.

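For a harness that drives these steps from a script rather than from an interactive shell, the isolated environment and the `NOW` timestamp might be built as below. This is a sketch under assumptions: `make_isolated_env` and `utc_now` are hypothetical names, and the block deliberately does not invoke `xdb` itself.

```python
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def make_isolated_env(repo_root: str) -> tuple[Path, dict[str, str]]:
    """Create the temp root T with T/home and return (T, env for child processes)."""
    t = Path(tempfile.mkdtemp())          # counterpart of T=$(mktemp -d)
    home = t / "home"
    home.mkdir(parents=True)              # counterpart of mkdir -p "$T/home"
    env = dict(os.environ)
    env["HOME"] = str(home)
    # Put <repo>/bin first so `xdb` resolves to the freshly built binary.
    env["PATH"] = f"{repo_root}/bin{os.pathsep}{env.get('PATH', '')}"
    return t, env


def utc_now() -> str:
    """Same shape as `date -u +%Y-%m-%dT%H:%M:%SZ`."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

The returned `env` would then be passed to every command, for example `subprocess.run(["xdb", "init"], env=env, capture_output=True, text=True)`.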
## Per scenario (run sequentially)
1. Generate a unique namespace: `NS="e2e-<scenario-slug>-$(openssl rand -hex 2)"`.
2. For each `step` in order:
   - Substitute `$NS` and `$NOW` in `run`.
   - Execute via Bash with the isolated env from setup. Capture stdout, stderr, exit code.
   - Apply each assertion in `expect`:
     - `exit: N`: exit code must equal N.
     - `exit_nonzero: true`: exit code must be ≠ 0.
     - `stdout_contains: S`: S must appear in stdout.
     - `stderr_contains: S`: S must appear in stderr.
     - `stdout_empty: true`: stdout must be empty (whitespace-only counts as empty).
     - `json: {...}`: parse stdout as JSON; every key/value in the assertion must be present (deep partial match).
     - `ndjson_count: N`: stdout must contain exactly N non-empty lines, each parseable as JSON.
     - `ndjson_ids: [...]`: the set of `_id` values across NDJSON lines must equal this set (unordered).
     - `error: {...}`: parse stdout (or stderr if stdout is empty) as JSON envelope; partial-match on `code`, `resource`, `action`.
   - On the **first failing assertion** in a scenario: mark the scenario FAIL with `{step name, assertion, expected, actual (truncated to 400 chars)}`. Stop that scenario, move to the next.

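The `json`, `ndjson_count`, and `ndjson_ids` assertions above could be checked roughly like this. It is a sketch, not the suite's actual matcher: all function names are hypothetical. Two behaviours are assumptions, since the text does not specify them: non-dict values (including arrays) are compared by plain equality, and a line without an `_id` key makes `ndjson_ids` fail.

```python
import json


def partial_match(expected, actual) -> bool:
    """Deep partial match: every key/value in `expected` must be present in `actual`."""
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and partial_match(value, actual[key])
            for key, value in expected.items()
        )
    return expected == actual  # assumption: scalars and lists compare exactly


def ndjson_records(stdout: str) -> list:
    """Parse every non-empty line as JSON; raises json.JSONDecodeError on a bad line."""
    return [json.loads(line) for line in stdout.splitlines() if line.strip()]


def check_ndjson_count(stdout: str, n: int) -> bool:
    try:
        return len(ndjson_records(stdout)) == n
    except json.JSONDecodeError:
        return False


def check_ndjson_ids(stdout: str, ids: list) -> bool:
    try:
        return {record["_id"] for record in ndjson_records(stdout)} == set(ids)
    except (json.JSONDecodeError, KeyError, TypeError):
        return False
```

For a `json: {...}` assertion, the call would be `partial_match(assertion, json.loads(stdout))`, with a parse error counted as a failed assertion.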
If any scenario is FAIL, end your message with the literal line `SUITE_FAILED`. Otherwise end with `SUITE_PASSED`. The parent uses these markers to decide the overall result.
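The marker rule can be sketched in a few lines. The function name `summarize` is hypothetical, and the summary string simply mirrors the `8 scenarios: 5 PASS, 1 SKIP, 2 FAIL` example from the orchestrator's report step. Printing zero counts rather than omitting them is an assumption.

```python
from collections import Counter


def summarize(results: dict[str, str]) -> tuple[str, str]:
    """Map {scenario name: "PASS" | "SKIP" | "FAIL"} to (summary line, final marker)."""
    counts = Counter(results.values())
    summary = (
        f"{len(results)} scenarios: "
        f"{counts['PASS']} PASS, {counts['SKIP']} SKIP, {counts['FAIL']} FAIL"
    )
    # Any FAIL fails the suite; SKIP alone does not.
    marker = "SUITE_FAILED" if counts["FAIL"] else "SUITE_PASSED"
    return summary, marker
```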