Commit e8e06b7

Antigravity Agent and claude committed
feat(cloud): glm-5 model fix + 4 new agent CLI commands + MCP tools
Root cause: z.ai proxy routes claude-sonnet requests to glm-4.7 (wrong model).
Fix: --model glm-5 in agent-entrypoint.sh + CLAUDE_MODEL env var.

New tri cloud subcommands:
- api-check: test API key connectivity + model routing
- redeploy <svc> <N>: reuse Railway service for new issue
- diagnose <N>: why did agent fail? (comments + events + PR)
- issue-create <title>: create issue with agent:spawn label

Corresponding MCP tools: cloud_api_check, cloud_redeploy, cloud_diagnose, cloud_issue_create.

Validated: Agent #145 completed full E2E cycle with glm-5 in ~5min.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent bd41e8b commit e8e06b7

14 files changed

Lines changed: 598 additions & 71 deletions
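
The commit message describes `api-check` as testing model routing, that is, catching a proxy that answers with a different model than the one requested. The commit page does not show how that is implemented. The following is a minimal shell sketch of the comparison step only, assuming the caller has already extracted the model name reported by the API. `check_model_routing` and its arguments are hypothetical names, not part of `tri`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare the requested model with the model the API
# reported serving the request. Not the actual `tri cloud api-check` code.
check_model_routing() {
  local requested="$1"   # model name sent in the request, e.g. glm-5
  local reported="$2"    # model name reported in the response (assumed already extracted)

  if [ -z "$reported" ]; then
    echo "UNKNOWN: response did not report a model"
    return 2
  fi
  if [ "$requested" = "$reported" ]; then
    echo "OK: ${requested} served as requested"
    return 0
  fi
  echo "MISMATCH: requested ${requested}, proxy served ${reported}"
  return 1
}
```

Using distinct exit codes for "mismatch" and "unknown" lets a caller fail fast before spawning an agent when the proxy silently substitutes a model.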

.github/workflows/agent-spawn.yml

Lines changed: 2 additions & 1 deletion
@@ -100,7 +100,8 @@ jobs:
 \"TELEGRAM_BOT_TOKEN\": \"${TELEGRAM_BOT_TOKEN}\",
 \"TELEGRAM_CHAT_ID\": \"${TELEGRAM_CHAT_ID}\",
 \"MONITOR_TOKEN\": \"${MONITOR_TOKEN}\",
-\"REPO_URL\": \"https://github.com/gHashTag/trinity.git\"
+\"REPO_URL\": \"https://github.com/gHashTag/trinity.git\",
+\"CLAUDE_MODEL\": \"glm-5\"
 }
 }
 }
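
The notes in this commit say the `--repo` fix works by extracting `GH_REPO` from the `REPO_URL` value passed in this workflow. The extraction code itself is not shown. A minimal sketch, assuming an HTTPS GitHub clone URL of the form used above; `repo_slug_from_url` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive an "owner/name" slug from a GitHub HTTPS clone
# URL so it can be passed to gh via --repo.
repo_slug_from_url() {
  local url="$1"
  local slug="${url#https://github.com/}"   # drop the scheme and host prefix
  slug="${slug%/}"                          # drop a trailing slash, if present
  slug="${slug%.git}"                       # drop a trailing .git, if present
  printf '%s\n' "$slug"
}

REPO_URL="https://github.com/gHashTag/trinity.git"
GH_REPO="$(repo_slug_from_url "$REPO_URL")"
echo "$GH_REPO"   # prints gHashTag/trinity
```

The resulting slug can then be supplied to each `gh` call with `--repo "$GH_REPO"`, which avoids relying on a git remote that a bare-repo worktree does not provide.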

.ralph/state/last_wake

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-22
+23

.ralph/state/wake_count

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-22
+27

.trinity/faculty_prev.dat

Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
-1773237667
+1773247907
 100
 5
-37
+30
 337
 337
-19
+18

.trinity/mu/heartbeat.json

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-{"agent":"mu","wake":75,"timestamp":1773248114,"errors_scanned":25,"fixes_applied":0,"build_ok":false,"test_ok":true}
+{"agent":"mu","wake":103,"timestamp":1773247774,"errors_scanned":25,"fixes_applied":0,"build_ok":false,"test_ok":true}

.trinity/mu/state/wake_count

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-75
+103

.trinity/night-evolution-map.md

Lines changed: 95 additions & 47 deletions
@@ -1,9 +1,9 @@
 # Night Evolution Map — Cloud Dev Pipeline Hardening
 # 2026-03-11 night → 2026-03-12 morning
 
-## Final Status
+## FINAL STATUS — Night Complete
 
-### PRs Merged (5) + Closed (2 quality)
+### PRs Merged (5)
 | PR | Title | Source |
 |----|-------|--------|
 | #129 | JSONL event persistence + deduplication | agent-124 |
@@ -12,12 +12,14 @@
 | #141 | Golden Chain pipeline — tri cloud pipeline/verify/merge | agent-140 |
 | #142 | Telegram log streaming — batch every 5s + output classifier | agent-131 |
 
-### PRs Closed (3, superseded by direct fixes)
+### PRs Closed (5, quality gate or conflicts)
 | PR | Reason |
 |----|--------|
 | #132 | Merge conflict after #129/#130 merged |
 | #133 | Merge conflict after #129/#130 merged |
 | #139 | Merge conflict, fixes applied directly |
+| #143 | Review: grep -oP not portable, destructive git checkout, worktree cleanup order |
+| #144 | Modified generated files (trinity-nexus/output/) — forbidden per CLAUDE.md |
 
 ### Issues Closed (5)
 | Issue | Resolution |
@@ -28,69 +30,115 @@
 | #137 | Fixed: pipefail, bash shebang, Telegram ordering |
 | #140 | Fixed via PR #141 merge |
 
-### Direct Commits to Main (3)
+### Direct Commits to Main (4)
 1. `b470c5ae7` — heartbeat subshell + pipefail + Telegram ordering + HTML escape
 2. `fe6dc534e` — u32 overflow, entry_idx duplicates, VOLUME shadow, worktree conflict
-3. Merge commits for PRs #129, #130, #138, #141
+3. `9362cec04` — reuse Railway services instead of delete+create
+4. `f803a5fbd` — gh auth setup-git + --repo flag + push failure tracking
 
-### Docker Image Rebuilt (2x)
-- First: heartbeat + pipefail + Telegram fixes
-- Second: VOLUME shadow removal + worktree branch fix
+### Docker Image Rebuilt (3x)
+1. heartbeat + pipefail + Telegram fixes
+2. VOLUME shadow removal + worktree branch fix
+3. `gh auth setup-git` + `--repo` flag + PUSH_OK tracking (sha256:b1c73cbc)
 
 ## Phase Completion
 
 | Phase | Status | Detail |
 |-------|--------|--------|
-| 1. Merge PRs | DONE | 4 merged, 3 closed |
-| 2. Entrypoint Hardening | DONE | 6 fixes applied |
+| 1. Merge PRs | DONE | 5 merged, 5 closed |
+| 2. Entrypoint Hardening | DONE | 17 fixes applied |
 | 3. Orchestrator CLI | 80% | pipeline/verify/merge added, logs TBD |
-| 4. Auto-Pipeline | 70% | In PR #141, needs testing |
-| 5. Monitoring | 30% | JSONL working, dashboard TBD |
-| 6. Agent Intelligence | 20% | SOUL.md works, branch reuse TBD |
-
-## Remaining Open Issues
-- #131 feat(cloud): Stream all container logs to Telegram in realtime
-- #126 Cloud Dev: Structured ACI protocol
-- #128, #127 FPGA/pipeline TODOs (lower priority)
-
-## Key Fixes Applied
-1. Heartbeat reads from temp file (subshell isolation solved)
-2. Telegram gets notifications on every status change (ordering fix)
+| 4. Auto-Pipeline | 80% | PR #141 merged, Telegram streaming in #142 |
+| 5. Monitoring | 40% | JSONL + Telegram live, dashboard TBD |
+| 6. Agent Intelligence | 30% | SOUL.md works, --repo fix, auth fixed |
+
+## Agent Spawns (10 total runs, 2 services)
+
+| Run | Service | Issue | Result | Duration | Notes |
+|-----|---------|-------|--------|----------|-------|
+| 1 | ubuntu | #126 | 🔴 FAILED | 619s | Too abstract, 0 commits |
+| 2 | Agents Anywhere | #131 | 🔵 DONE | ~300s | PR #142 merged ✅ |
+| 3 | Agents Anywhere | #115 | 🔴 FAILED | 303s | Push failed 3x (no gh auth setup-git) |
+| 4 | ubuntu | #114 | 🔴 FAILED | 519s | Push failed (same auth bug) |
+| 5 | Agents Anywhere | #116 | 🔴 FAILED | 81s | Can't read issue (no --repo flag) |
+| 6 | ubuntu | #126 (prev) | 🔴 CLOSED || PR #143 closed: quality issues |
+| 7 | ubuntu | #114 (retry) | 🔴 FAILED | 253s | 0 commits: generated files forbidden |
+| 8 | Agents Anywhere | #116 (retry) | 🔴 FAILED | 586s | 0 commits: generated files forbidden |
+| 9 | ubuntu | #114 (prev) | 🔴 CLOSED || PR #144 closed: edited output/ |
+|||| **1/8 success** || 12.5% solve rate |
+
+## All Bugs Fixed (17)
+1. Heartbeat reads from temp file (subshell isolation)
+2. Telegram notification ordering (LAST_STATUS moved after send)
 3. HTML escaping + safe JSON via temp files
 4. `#!/bin/bash` + `set -eo pipefail`
-5. `i64` timestamps (no more u32 overflow)
-6. No duplicate JSONL entries
+5. `i64` timestamps (u32 overflow)
+6. No duplicate JSONL entries (entry_idx fix)
 7. No VOLUME shadowing bare repo
-8. Concurrent agents get unique branches
+8. Concurrent agents get unique worktree branches
 9. Golden Chain: `tri cloud pipeline <N>` automates full cycle
 10. Telegram `editMessageText` — 1 dashboard message updated in place
 11. `NO_COLOR=1` in containers for clean output
 12. Worktree lock/unlock prevents accidental pruning
-13. Workflow reuses services instead of delete+create (avoids 25/day limit)
-
-## Active Agents (latest cycle — 16:33 UTC)
-- **ubuntu** service → #126 — 🔴 FAILED (0 commits, 619s — issue too abstract for autonomous agent)
-- **Agents Anywhere** service → #131 — 🔵 DONE → PR #142 merged
-- **Agents Anywhere** service → #115 (VIBEE eqlPrimitive fix) — 🔴 DONE but push failed 3x, no PR created
-- **ubuntu** service → #114 (VIBEE undefined Field type) — 🔴 DONE but push failed (git auth bug)
-- **Agents Anywhere** service → #116 (Re-verify stale ast-check) — 🔴 FAILED (gh can't read issue — missing --repo)
-- PR #143 from agent-126 — 🔴 CLOSED (review: grep -oP not portable, worktree cleanup order)
-- **Docker rebuild #3** — fixes: `gh auth setup-git`, `--repo` on all gh commands, PUSH_OK tracking
-- **ubuntu** service → #114 (RETRY) — 🚀 REDEPLOYED 16:55 UTC with fixed image
-- **Agents Anywhere** service → #116 (RETRY) — 🚀 REDEPLOYED 16:55 UTC with fixed image
-
-## Bug Found & Fixed This Cycle
-14. `sleepApplication: true` on "Agents Anywhere" service — Railway was sleeping container before entrypoint ran. Fixed via `serviceInstanceUpdate` + redeploy.
+13. Workflow reuses services instead of delete+create (25/day limit)
+14. `sleepApplication: true` on Agents Anywhere — disabled
+15. Push failure silently swallowed — PUSH_OK tracking added
+16. **CRITICAL**: `gh auth setup-git` — bridges gh→git credential helper
+17. **CRITICAL**: `--repo` flag on all gh commands — bare-repo worktrees lack context
 
 ## Lessons Learned
 1. Railway MCP `deploy` uploads source, NOT Docker image — use GraphQL API
 2. `startCommand` overrides Docker ENTRYPOINT — must set via serviceInstanceUpdate
 3. 25 service/day creation limit — never delete+create, always reuse
 4. `variableCollectionUpsert` needs actual values, not empty shell vars
-5. Service names with spaces break Railway CLI — avoid spaces in service names
-6. `sleepApplication: true` silently kills agent containers — always set to false for batch jobs
-7. Abstract/design issues (#126 "Structured ACI protocol") produce 0 commits — agents need concrete, code-level tasks with specific files/functions to modify
-8. `retry "git push ... 2>/dev/null" || true` silently swallows push failures — agent reports DONE with no PR. Fixed: track PUSH_OK, skip PR creation if push fails, report FAILED explicitly
-9. **CRITICAL**: `gh auth login` only configures `gh` CLI, NOT `git push`. Fixed: `gh auth setup-git`
-10. **CRITICAL**: All `gh issue/pr` commands lack `--repo` flag — bare-repo worktrees have no git remote context. Fixed: extract `GH_REPO` from `REPO_URL`, add `--repo` to all gh calls
-11. Docker rebuild #3 deployed with fixes #8-10. Both services redeployed 16:55 UTC
+5. Service names with spaces break Railway CLI — avoid spaces
+6. `sleepApplication: true` silently kills batch containers
+7. Abstract issues produce 0 commits — agents need concrete file/function targets
+8. `2>/dev/null || true` on push hides critical auth failures
+9. `gh auth login` ≠ git push auth — need `gh auth setup-git`
+10. Bare-repo worktrees have no git remote — all gh commands need `--repo`
+11. Codegen issues (#114-116) require editing generated files — agents can't solve them
+12. Agent solve rate: ~12.5% — need better issue selection + more specific SOUL.md
+
+## Night 2 (2026-03-12) — Model Fix + CLI Tools
+
+### Root Cause Found: z.ai proxy returns GLM-4.7 instead of Claude
+- **Bug #18 (CRITICAL)**: z.ai proxy routes `claude-sonnet-4-20250514` → `glm-4.7` (wrong model!)
+- GLM-4.7 cannot handle Claude Code's tool-use protocol → 0 commits on ALL agents
+- **Fix**: `--model glm-5` flag in entrypoint + `CLAUDE_MODEL=glm-5` env var
+- z.ai's top model is `glm-5` — confirmed working via API test
+
+### Changes Applied
+1. `deploy/agent-entrypoint.sh`: Added `--model "${CLAUDE_MODEL:-glm-5}"` to claude invocation
+2. `.github/workflows/agent-spawn.yml`: Added `CLAUDE_MODEL=glm-5` to Railway env vars
+3. Railway ubuntu service: `CLAUDE_MODEL=glm-5` set via MCP
+4. Docker image: Rebuilt and pushed to GHCR (sha256 new)
+5. **Bug #19**: `railway deploy` overwrote Docker image source with `railway.toml` (Dockerfile.px-bridge)
+- Fixed via `serviceInstanceUpdate` GraphQL — restored image source + startCommand
+- Lesson: NEVER use `railway deploy`/`redeploy` on Docker image services — it uploads source code
+
+### New CLI Commands (4) + MCP Tools (4)
+| Command | Purpose |
+|---------|---------|
+| `tri cloud api-check` | Test API key + model routing (catches proxy mismatch) |
+| `tri cloud redeploy <svc> <N>` | Reuse Railway service for new issue |
+| `tri cloud diagnose <N>` | Why did agent fail? (comments + events + PR) |
+| `tri cloud issue-create <title>` | Create issue with `agent:spawn` label |
+
+### Agent Spawn #145 (glm-5 validation)
+- **RESULT: SUCCESS** — Full E2E cycle in ~5 minutes
+- Auth OK, clone OK, read issue OK, code OK, self-review OK, push OK, PR #146 created
+- PR closed (local branch has richer impl), issue closed
+- **Agent solve rate: 2/9 = 22%** (up from 12.5%)
+- Bug #19 also found: `railway deploy` overwrites Docker image source
+
+### Docker Image Rebuilt (4th time)
+- glm-5 model fix + 4 new CLI commands
+- sha256 new, pushed to GHCR
+
+### Remaining Work
+- [ ] Create new agent-friendly issues and spawn more agents
+- [ ] Push local changes (4 new CLI commands + glm-5 fix)
+- [ ] Dashboard UI (Phase 5)
+- [ ] Agent self-metrics tracking
+- [ ] Investigate why `railway deploy` via MCP overrides Docker image source
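
Bug 15 and lesson 8 in the map above describe a push whose failure was hidden by `2>/dev/null || true`, so the agent reported DONE with no PR. The actual entrypoint change is not part of this diff. The following is a minimal sketch of the idea, assuming a retry loop that records the outcome in `PUSH_OK`; `try_push` and `final_status` are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of push-failure tracking; not the actual entrypoint code.
# try_push runs the given command up to max_attempts times and records the
# outcome in PUSH_OK instead of discarding it with `|| true`.
PUSH_OK=false

try_push() {
  local max_attempts="$1"; shift
  local attempt
  PUSH_OK=false
  for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
    if "$@"; then
      PUSH_OK=true
      return 0
    fi
    echo "push attempt ${attempt}/${max_attempts} failed" >&2
  done
  return 1
}

# Report the final status from the recorded outcome.
final_status() {
  if [ "$PUSH_OK" = true ]; then
    echo "DONE"
  else
    echo "FAILED"
  fi
}
```

A caller might run `try_push 3 git push origin "$BRANCH"` and create the PR only when `PUSH_OK` is `true`. Keeping stderr visible is deliberate here, since the auth failures in runs 3 and 4 were exactly what the redirect had hidden.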

.trinity/scholar/heartbeat.json

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-{"agent":"scholar","wake":6,"timestamp":1773238322,"fails_found":0,"researched":0,"fed_mu":0}
+{"agent":"scholar","wake":22,"timestamp":1773248005,"fails_found":0,"researched":0,"fed_mu":0}

.trinity/scholar/state/wake_count

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-6
+22

deploy/agent-entrypoint.sh

Lines changed: 3 additions & 1 deletion
@@ -457,7 +457,9 @@ Comment on the issue at each major step."
 
 emit_event "status" '{"status":"CODING","detail":"Claude Code starting"}'
 CLAUDE_EXIT=0
-timeout "${AGENT_TIMEOUT}" claude -p "${PROMPT}" --allowedTools "Bash,Read,Write,Edit,Glob,Grep" 2>&1 | \
+CLAUDE_MODEL="${CLAUDE_MODEL:-glm-5}"
+log "Using model: ${CLAUDE_MODEL}"
+timeout "${AGENT_TIMEOUT}" claude -p "${PROMPT}" --model "${CLAUDE_MODEL}" --allowedTools "Bash,Read,Write,Edit,Glob,Grep" 2>&1 | \
 while IFS= read -r line; do
 echo "$line"
 stream_to_telegram "$line"
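
The hunk above relies on the `${VAR:-default}` expansion to choose the model. A small standalone sketch of that behavior follows; `resolve_model` is a hypothetical helper used only for illustration. The colon form applies the default when the variable is unset and also when it is set to the empty string, which matters if an environment variable is passed through with no value.

```shell
#!/usr/bin/env bash
# Illustrates the ${VAR:-default} expansion used in the diff: the default
# applies when the value is unset or empty.
resolve_model() {
  local requested="${1:-}"
  printf '%s\n' "${requested:-glm-5}"
}

resolve_model ""          # prints glm-5
resolve_model "glm-4.7"   # prints glm-4.7
```

By contrast, `${VAR-default}` (without the colon) would keep an empty value, so the colon form is the safer choice for a required flag such as `--model`.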
