Commit 2c0667c

Merge remote-tracking branch 'origin/main'
# Conflicts:
#   docs/ops/SCRIPT_INDEX.md
#   scripts/registry.json
2 parents 4ee6af9 + 300163f commit 2c0667c

5 files changed: +362 -0 lines changed
.gitignore

Lines changed: 2 additions & 0 deletions
@@ -57,6 +57,8 @@ scripts/plot_csb_mcp_blog_figures.py
 ralph/
 ralph-*/
 reports/
+!reports/nightly/
+!reports/nightly/**
 eval_reports/
 tmp/
 *.log

AGENTS.md

Lines changed: 39 additions & 0 deletions
@@ -51,6 +51,45 @@ curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/insta
 - `docs/reference/README.md` - stable specs and reference docs
 - `docs/explanations/README.md` - rationale and context docs

+## Common Gotchas (from session history)
+
+### Documentation Generation
+- **NEVER edit root `CLAUDE.md` or `AGENTS.md` directly.** Edit canonical sources under `docs/ops/` and regenerate. Direct edits cause `agent_guides_drift` failures in `repo_health.py`.
+- After removing directories from the repo, also clean references from `scripts/sync_agent_guides.py` (`LOCAL_SOURCES`) and `scripts/docs_consistency_check.py` (`LOCAL_AGENT_TARGET_DIRS`).
+
+### Daytona / Harbor
+- Daytona builds images from Dockerfiles at sandbox creation time (`Image.from_dockerfile()`). Dockerfile fixes pushed to `main` take effect on the next run -- **no manual image rebuild needed**. Exception: pre-built GHCR base images must be rebuilt separately.
+- Harbor+Daytona (`harbor run --environment-type daytona`) is the recommended production approach. The standalone `scripts/daytona_runner.py` is for quick validation only.
+- Use `BASELINE_MCP_TYPE` env var to control MCP configuration: `none`, `sourcegraph`, `deepsearch`.
+- Daytona SDK (`daytona_sdk`) over CLI for sandbox interaction -- the CLI is interactive-only for SSH.
+- GHCR packages default to **private** for personal accounts and visibility cannot be changed via API. Use the GitHub web UI or push to an org.
+
+### Docker / Build
+- `uv tool install` segfaults on ARM64/QEMU emulation. Use `pip install` instead, or switch to Daytona (native x86_64).
+- Build-push-clean pattern when building Docker images with limited disk (~45GB): build one image, push, then clean locally before the next.
+- Colons in agent names (e.g., `module:ClassName`) break Docker volume mounts. Sanitize paths: replace `:` with `__`.
+
+### MCP Configuration (inside sandboxes)
+- `.mcp.json` must be placed at `$CLAUDE_CONFIG_DIR` (typically `/logs/agent/sessions/`), not `/app/` or `/root/`.
+- Claude Code requires the `--mcp-config` CLI flag to load MCP config -- it does not auto-detect.
+- Inject MCP usage instructions into the task prompt. Agents won't use MCP tools just because they're available.
+- Set `NODE_TLS_REJECT_UNAUTHORIZED=0` for Node.js SSL in Docker containers (curl working does not mean Node.js fetch will work).
+
+### Harbor Result Format
+- Timing fields (`started_at`, `finished_at`) live at the **top level** of `result.json`, not nested under `timing`.
+- `trajectory.json` is generated by Harbor's `_convert_events_to_trajectory()` post-processing, NOT by Claude Code CLI directly.
+- SWE-bench `test.sh` redirects stdout to a temp file -- Harbor never sees the parser's `START_TEST_OUTPUT`/`END_TEST_OUTPUT` markers via its normal capture.
+
+### Validation / Scoring
+- `validators.py` is duplicated across `ccb_build` tasks. Changes must be applied to **all copies** (verify with `sha256sum`).
+- Install scripts that print "INSTALL_SUCCESS" regardless of actual outcome are common. Always verify the binary exists and is executable.
+- Agent completing in **<2 seconds** = agent never installed/ran (smoke test heuristic).
+
+### Git / Auth
+- `gh auth refresh` without `-s <scope>` is a no-op for adding scopes. Must use `gh auth refresh -h github.com -s write:packages` explicitly.
+- Environment variables must be **explicitly exported** for Harbor subprocesses. Use `set -a` before sourcing `.env.local`.
+- GitHub push protection blocks synthetic/fake API keys in test data. Use `git reset --soft origin/main` to squash intermediate commits that contained fake credentials.
+
 ## Maintenance
 - Root and local `AGENTS.md` / `CLAUDE.md` files are generated from sources in `docs/ops/`.
 - `docs/START_HERE_BY_TASK.md` is generated from `docs/ops/task_routes.json`.
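
Reviewer note on the Docker / Build gotcha in the hunk above: the colon rule can be sketched in a few lines of Python. This is only an illustration, not the repository's code. The function name `sanitize_agent_name` and the example host path are hypothetical; the only detail taken from the guide is the replacement of `:` with `__`.

```python
def sanitize_agent_name(agent_name: str) -> str:
    """Return a form of the agent name that is safe inside a Docker volume spec.

    A volume spec such as `host_path:container_path` is split on colons, so a
    name like `module:ClassName` embedded in the host path would split the
    spec in the wrong place. Following the guide, each `:` becomes `__`.
    """
    return agent_name.replace(":", "__")


# Hypothetical usage: derive a host log directory from an agent name.
host_dir = f"/tmp/agent_logs/{sanitize_agent_name('agents.claude:ClaudeAgent')}"
print(host_dir)
```

Sanitizing at the point where the path is built, rather than where the mount is declared, keeps every consumer of the path consistent.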

CLAUDE.md

Lines changed: 39 additions & 0 deletions
@@ -51,6 +51,45 @@ curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/insta
 - `docs/reference/README.md` - stable specs and reference docs
 - `docs/explanations/README.md` - rationale and context docs

+## Common Gotchas (from session history)
+
+### Documentation Generation
+- **NEVER edit root `CLAUDE.md` or `AGENTS.md` directly.** Edit canonical sources under `docs/ops/` and regenerate. Direct edits cause `agent_guides_drift` failures in `repo_health.py`.
+- After removing directories from the repo, also clean references from `scripts/sync_agent_guides.py` (`LOCAL_SOURCES`) and `scripts/docs_consistency_check.py` (`LOCAL_AGENT_TARGET_DIRS`).
+
+### Daytona / Harbor
+- Daytona builds images from Dockerfiles at sandbox creation time (`Image.from_dockerfile()`). Dockerfile fixes pushed to `main` take effect on the next run -- **no manual image rebuild needed**. Exception: pre-built GHCR base images must be rebuilt separately.
+- Harbor+Daytona (`harbor run --environment-type daytona`) is the recommended production approach. The standalone `scripts/daytona_runner.py` is for quick validation only.
+- Use `BASELINE_MCP_TYPE` env var to control MCP configuration: `none`, `sourcegraph`, `deepsearch`.
+- Daytona SDK (`daytona_sdk`) over CLI for sandbox interaction -- the CLI is interactive-only for SSH.
+- GHCR packages default to **private** for personal accounts and visibility cannot be changed via API. Use the GitHub web UI or push to an org.
+
+### Docker / Build
+- `uv tool install` segfaults on ARM64/QEMU emulation. Use `pip install` instead, or switch to Daytona (native x86_64).
+- Build-push-clean pattern when building Docker images with limited disk (~45GB): build one image, push, then clean locally before the next.
+- Colons in agent names (e.g., `module:ClassName`) break Docker volume mounts. Sanitize paths: replace `:` with `__`.
+
+### MCP Configuration (inside sandboxes)
+- `.mcp.json` must be placed at `$CLAUDE_CONFIG_DIR` (typically `/logs/agent/sessions/`), not `/app/` or `/root/`.
+- Claude Code requires the `--mcp-config` CLI flag to load MCP config -- it does not auto-detect.
+- Inject MCP usage instructions into the task prompt. Agents won't use MCP tools just because they're available.
+- Set `NODE_TLS_REJECT_UNAUTHORIZED=0` for Node.js SSL in Docker containers (curl working does not mean Node.js fetch will work).
+
+### Harbor Result Format
+- Timing fields (`started_at`, `finished_at`) live at the **top level** of `result.json`, not nested under `timing`.
+- `trajectory.json` is generated by Harbor's `_convert_events_to_trajectory()` post-processing, NOT by Claude Code CLI directly.
+- SWE-bench `test.sh` redirects stdout to a temp file -- Harbor never sees the parser's `START_TEST_OUTPUT`/`END_TEST_OUTPUT` markers via its normal capture.
+
+### Validation / Scoring
+- `validators.py` is duplicated across `ccb_build` tasks. Changes must be applied to **all copies** (verify with `sha256sum`).
+- Install scripts that print "INSTALL_SUCCESS" regardless of actual outcome are common. Always verify the binary exists and is executable.
+- Agent completing in **<2 seconds** = agent never installed/ran (smoke test heuristic).
+
+### Git / Auth
+- `gh auth refresh` without `-s <scope>` is a no-op for adding scopes. Must use `gh auth refresh -h github.com -s write:packages` explicitly.
+- Environment variables must be **explicitly exported** for Harbor subprocesses. Use `set -a` before sourcing `.env.local`.
+- GitHub push protection blocks synthetic/fake API keys in test data. Use `git reset --soft origin/main` to squash intermediate commits that contained fake credentials.
+
 ## Maintenance
 - Root and local `AGENTS.md` / `CLAUDE.md` files are generated from sources in `docs/ops/`.
 - `docs/START_HERE_BY_TASK.md` is generated from `docs/ops/task_routes.json`.
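
Reviewer note on the Harbor Result Format and Validation / Scoring gotchas in the hunk above: the two rules combine naturally into a small check, sketched below. Treat it as an assumption-laden illustration. The guide states only that `started_at` and `finished_at` sit at the top level of `result.json`; the ISO 8601 timestamp format, the helper names, and the sample payload are all hypothetical.

```python
import json
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    # Assumption: timestamps are ISO 8601 strings. A trailing "Z" is rewritten
    # because datetime.fromisoformat() rejects it before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def run_duration_seconds(result: dict) -> float:
    # Read the timing fields from the top level, not from result["timing"].
    started = parse_timestamp(result["started_at"])
    finished = parse_timestamp(result["finished_at"])
    return (finished - started).total_seconds()


def looks_like_agent_never_ran(result: dict, threshold_seconds: float = 2.0) -> bool:
    # Smoke-test heuristic from the guide: a run shorter than about two
    # seconds suggests the agent never installed or started.
    return run_duration_seconds(result) < threshold_seconds


# Hypothetical result.json payload, for illustration only.
sample = json.loads(
    '{"started_at": "2025-01-01T12:00:00Z", "finished_at": "2025-01-01T12:00:01Z"}'
)
print(run_duration_seconds(sample), looks_like_agent_never_ran(sample))
```

Indexing with `result["started_at"]` rather than `result.get(...)` makes a result file that nests the fields elsewhere fail loudly with a `KeyError` instead of silently producing a bogus duration.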

docs/ops/ROOT_AGENT_GUIDE.md

Lines changed: 39 additions & 0 deletions
@@ -51,6 +51,45 @@ curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/insta
 - `docs/reference/README.md` - stable specs and reference docs
 - `docs/explanations/README.md` - rationale and context docs

+## Common Gotchas (from session history)
+
+### Documentation Generation
+- **NEVER edit root `CLAUDE.md` or `AGENTS.md` directly.** Edit canonical sources under `docs/ops/` and regenerate. Direct edits cause `agent_guides_drift` failures in `repo_health.py`.
+- After removing directories from the repo, also clean references from `scripts/sync_agent_guides.py` (`LOCAL_SOURCES`) and `scripts/docs_consistency_check.py` (`LOCAL_AGENT_TARGET_DIRS`).
+
+### Daytona / Harbor
+- Daytona builds images from Dockerfiles at sandbox creation time (`Image.from_dockerfile()`). Dockerfile fixes pushed to `main` take effect on the next run -- **no manual image rebuild needed**. Exception: pre-built GHCR base images must be rebuilt separately.
+- Harbor+Daytona (`harbor run --environment-type daytona`) is the recommended production approach. The standalone `scripts/daytona_runner.py` is for quick validation only.
+- Use `BASELINE_MCP_TYPE` env var to control MCP configuration: `none`, `sourcegraph`, `deepsearch`.
+- Daytona SDK (`daytona_sdk`) over CLI for sandbox interaction -- the CLI is interactive-only for SSH.
+- GHCR packages default to **private** for personal accounts and visibility cannot be changed via API. Use the GitHub web UI or push to an org.
+
+### Docker / Build
+- `uv tool install` segfaults on ARM64/QEMU emulation. Use `pip install` instead, or switch to Daytona (native x86_64).
+- Build-push-clean pattern when building Docker images with limited disk (~45GB): build one image, push, then clean locally before the next.
+- Colons in agent names (e.g., `module:ClassName`) break Docker volume mounts. Sanitize paths: replace `:` with `__`.
+
+### MCP Configuration (inside sandboxes)
+- `.mcp.json` must be placed at `$CLAUDE_CONFIG_DIR` (typically `/logs/agent/sessions/`), not `/app/` or `/root/`.
+- Claude Code requires the `--mcp-config` CLI flag to load MCP config -- it does not auto-detect.
+- Inject MCP usage instructions into the task prompt. Agents won't use MCP tools just because they're available.
+- Set `NODE_TLS_REJECT_UNAUTHORIZED=0` for Node.js SSL in Docker containers (curl working does not mean Node.js fetch will work).
+
+### Harbor Result Format
+- Timing fields (`started_at`, `finished_at`) live at the **top level** of `result.json`, not nested under `timing`.
+- `trajectory.json` is generated by Harbor's `_convert_events_to_trajectory()` post-processing, NOT by Claude Code CLI directly.
+- SWE-bench `test.sh` redirects stdout to a temp file -- Harbor never sees the parser's `START_TEST_OUTPUT`/`END_TEST_OUTPUT` markers via its normal capture.
+
+### Validation / Scoring
+- `validators.py` is duplicated across `ccb_build` tasks. Changes must be applied to **all copies** (verify with `sha256sum`).
+- Install scripts that print "INSTALL_SUCCESS" regardless of actual outcome are common. Always verify the binary exists and is executable.
+- Agent completing in **<2 seconds** = agent never installed/ran (smoke test heuristic).
+
+### Git / Auth
+- `gh auth refresh` without `-s <scope>` is a no-op for adding scopes. Must use `gh auth refresh -h github.com -s write:packages` explicitly.
+- Environment variables must be **explicitly exported** for Harbor subprocesses. Use `set -a` before sourcing `.env.local`.
+- GitHub push protection blocks synthetic/fake API keys in test data. Use `git reset --soft origin/main` to squash intermediate commits that contained fake credentials.
+
 ## Maintenance
 - Root and local `AGENTS.md` / `CLAUDE.md` files are generated from sources in `docs/ops/`.
 - `docs/START_HERE_BY_TASK.md` is generated from `docs/ops/task_routes.json`.
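
Reviewer note on the Validation / Scoring gotcha in the hunk above: the guide suggests `sha256sum` for confirming that every `validators.py` copy matches. A rough Python equivalent is sketched below for cases where the check should run inside a script. The function names are hypothetical, and discovering copies by file name under a root directory is an assumption, since the actual layout of the `ccb_build` tasks is not shown in this commit.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def group_by_digest(paths):
    """Map each SHA-256 hex digest to the list of files that hash to it."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(Path(path))
    return dict(groups)


def copies_in_sync(paths) -> bool:
    """True when all files share one digest (False for an empty list)."""
    return len(group_by_digest(paths)) == 1


def find_validator_copies(root):
    # Assumption: copies are located by file name under a benchmark root.
    return sorted(Path(root).rglob("validators.py"))


# Demonstration on throwaway files: two identical copies, then one edited.
with tempfile.TemporaryDirectory() as root:
    for task in ("task_a", "task_b"):
        task_dir = Path(root) / task
        task_dir.mkdir()
        (task_dir / "validators.py").write_text("def validate():\n    return True\n")
    copies = find_validator_copies(root)
    in_sync_before = copies_in_sync(copies)
    copies[0].write_text("def validate():\n    return False\n")  # edit one copy only
    in_sync_after = copies_in_sync(copies)

print(in_sync_before, in_sync_after)
```

When the copies drift, `group_by_digest` shows which files belong to which version, which is usually more useful than a bare pass/fail.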
