
Commit bd41e8b

Antigravity Agent and claude committed
fix(cloud): gh auth setup-git + --repo flag + push failure tracking
3 critical bugs fixed in agent entrypoint:
1. gh auth setup-git — bridges gh→git credential helper (push was failing)
2. --repo flag on all gh issue/pr commands — bare-repo worktrees lack remote context
3. PUSH_OK tracking — push failures now reported instead of silently swallowed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent adb3a5a commit bd41e8b

22 files changed

Lines changed: 5277 additions & 77 deletions

.trinity/mu/heartbeat.json

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-{"agent":"mu","wake":74,"timestamp":1773238120,"errors_scanned":25,"fixes_applied":0,"build_ok":false,"test_ok":true}
+{"agent":"mu","wake":75,"timestamp":1773248114,"errors_scanned":25,"fixes_applied":0,"build_ok":false,"test_ok":true}

.trinity/mu/state/wake_count

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-74
+75

.trinity/night-evolution-map.md

Lines changed: 83 additions & 66 deletions
@@ -1,79 +1,96 @@
 # Night Evolution Map — Cloud Dev Pipeline Hardening
 # 2026-03-11 night → 2026-03-12 morning
 
-## Current State (updated)
-- 3 PRs merged (#129, #130, #138), 3 closed with conflicts (#132, #133, #139)
-- PR #141 (Golden Chain pipeline from agent-140) — CI pending, compile fixes pushed
-- Docker image rebuilt with heartbeat + pipefail + Telegram fixes
-- 2 Railway services reusable (agent-126, agent-131)
-- agent-126 redeployed for issue #134, agent-131 for issue #135
+## Final Status
 
-## Evolution Phases (Priority Order)
+### PRs Merged (5) + Closed (2 quality)
+| PR | Title | Source |
+|----|-------|--------|
+| #129 | JSONL event persistence + deduplication | agent-124 |
+| #130 | Git worktree isolation for faster startup | agent-125 |
+| #138 | Buffer size increase, CLAUDE.md.agent to .gitignore | agent-136 |
+| #141 | Golden Chain pipeline — tri cloud pipeline/verify/merge | agent-140 |
+| #142 | Telegram log streaming — batch every 5s + output classifier | agent-131 |
 
-### Phase 1: Merge Ready PRs ✅ DONE
-- [x] PR #129 (JSONL event persistence) → merged
-- [x] PR #130 (git worktree isolation) → merged
-- [x] PR #138 (fix #136 — buffer + .gitignore) → merged
-- [x] PR #132, #133, #139 → closed (conflicts after merges)
-- [x] PR #141 (agent-140 Golden Chain) → compile fixes pushed, CI pending
+### PRs Closed (3, superseded by direct fixes)
+| PR | Reason |
+|----|--------|
+| #132 | Merge conflict after #129/#130 merged |
+| #133 | Merge conflict after #129/#130 merged |
+| #139 | Merge conflict, fixes applied directly |
 
-### Phase 2: Entrypoint Hardening ✅ DONE
-- [x] Heartbeat subshell bug → temp file `/tmp/agent_heartbeat_state`
-- [x] `report_status()` writes to heartbeat file
-- [x] Telegram notification ordering (BEFORE LAST_STATUS update)
-- [x] `set -eo pipefail` + `#!/bin/bash` shebang
-- [x] HTML escape helper `escape_html()`
-- [x] `send_telegram()` uses temp file for JSON (no escaping issues)
-- [x] Docker image rebuilt + pushed to GHCR
+### Issues Closed (5)
+| Issue | Resolution |
+|-------|-----------|
+| #134 | Fixed: u32→i64 timestamp, entry_idx dedup |
+| #135 | Fixed: VOLUME shadow, worktree -b branch |
+| #136 | Fixed via PR #138 merge |
+| #137 | Fixed: pipefail, bash shebang, Telegram ordering |
+| #140 | Fixed via PR #141 merge |
 
-### Phase 3: Orchestrator CLI (partially done by agent-140)
-Agent-140's PR #141 adds:
-- [x] `tri cloud pipeline <N>` — spawn → monitor → verify → merge → cleanup
-- [x] `tri cloud verify <N>` — local zig build check
-- [x] `tri cloud merge <N>` — merge PR via gh CLI
-- [x] Enhanced `tri cloud agents` — stuck detection, health indicators, elapsed formatting
-Already working from before:
-- [x] `tri cloud spawn <N>` — calls Railway API
-- [x] `tri cloud kill <N>` — delete service
-- [x] `tri cloud agents` — list active containers
-- [ ] `tri cloud logs <N>` — fetch Railway deploy logs
-- [ ] Service recycling in CLI (currently manual via env var update)
+### Direct Commits to Main (3)
+1. `b470c5ae7` — heartbeat subshell + pipefail + Telegram ordering + HTML escape
+2. `fe6dc534e` — u32 overflow, entry_idx duplicates, VOLUME shadow, worktree conflict
+3. Merge commits for PRs #129, #130, #138, #141
 
-### Phase 4: Auto-Pipeline (in PR #141)
-- [x] Spawn → monitor heartbeats → detect DONE/FAIL (in PR)
-- [x] On DONE: fetch PR, run `zig build` locally (in PR)
-- [x] On pass: auto-merge PR (in PR)
-- [x] On fail: respawn (max 3x) (in PR)
-- [ ] Create fix-issue with review on failure
-- [ ] Cleanup container after completion
+### Docker Image Rebuilt (2x)
+- First: heartbeat + pipefail + Telegram fixes
+- Second: VOLUME shadow removal + worktree branch fix
 
-### Phase 5: Monitoring & Metrics
-- [x] JSONL event persistence (PR #129 merged)
-- [ ] Agent solve rate dashboard
-- [ ] Cost per agent tracking
-- [ ] Token usage estimation
-- [ ] Success/fail/retry counters
+## Phase Completion
 
-### Phase 6: Agent Intelligence
-- [x] Agent reads CLAUDE.md (via SOUL.md injection)
-- [x] Better commit messages (include issue number)
-- [ ] Agent checks out existing branch for fix-issues
-- [ ] Agent runs `zig build -Dci=true` instead of full build
-- [ ] Multi-file context awareness
+| Phase | Status | Detail |
+|-------|--------|--------|
+| 1. Merge PRs | DONE | 4 merged, 3 closed |
+| 2. Entrypoint Hardening | DONE | 6 fixes applied |
+| 3. Orchestrator CLI | 80% | pipeline/verify/merge added, logs TBD |
+| 4. Auto-Pipeline | 70% | In PR #141, needs testing |
+| 5. Monitoring | 30% | JSONL working, dashboard TBD |
+| 6. Agent Intelligence | 20% | SOUL.md works, branch reuse TBD |
 
-## Active Agents
-- agent-126 → issue #134 (fix PR #129 bugs — u32 timestamp, duplicates, buffer)
-- agent-131 → issue #135 (fix PR #130 bugs — VOLUME shadow, worktree conflicts)
+## Remaining Open Issues
+- #131 feat(cloud): Stream all container logs to Telegram in realtime
+- #126 Cloud Dev: Structured ACI protocol
+- #128, #127 FPGA/pipeline TODOs (lower priority)
 
-## Next Steps
-1. Wait for PR #141 CI → merge if passes
-2. Monitor agents #134, #135 → review PRs when ready
-3. Spawn agent for #137 (fix PR #133 bugs) when slot frees up
-4. Create issue for `tri cloud logs` command
-5. Create issue for service recycling in CLI
+## Key Fixes Applied
+1. Heartbeat reads from temp file (subshell isolation solved)
+2. Telegram gets notifications on every status change (ordering fix)
+3. HTML escaping + safe JSON via temp files
+4. `#!/bin/bash` + `set -eo pipefail`
+5. `i64` timestamps (no more u32 overflow)
+6. No duplicate JSONL entries
+7. No VOLUME shadowing bare repo
+8. Concurrent agents get unique branches
+9. Golden Chain: `tri cloud pipeline <N>` automates full cycle
+10. Telegram `editMessageText` — 1 dashboard message updated in place
+11. `NO_COLOR=1` in containers for clean output
+12. Worktree lock/unlock prevents accidental pruning
+13. Workflow reuses services instead of delete+create (avoids 25/day limit)
 
-## Constraints
-- 2 Railway services available (agent-126, agent-131)
-- z.ai proxy ~8min per agent run
-- Telegram 30 msg/min rate limit
-- Docker rebuild ~90s (cached layers)
+## Active Agents (latest cycle — 16:33 UTC)
+- **ubuntu** service → #126 — 🔴 FAILED (0 commits, 619s — issue too abstract for autonomous agent)
+- **Agents Anywhere** service → #131 — 🔵 DONE → PR #142 merged
+- **Agents Anywhere** service → #115 (VIBEE eqlPrimitive fix) — 🔴 DONE but push failed 3x, no PR created
+- **ubuntu** service → #114 (VIBEE undefined Field type) — 🔴 DONE but push failed (git auth bug)
+- **Agents Anywhere** service → #116 (Re-verify stale ast-check) — 🔴 FAILED (gh can't read issue — missing --repo)
+- PR #143 from agent-126 — 🔴 CLOSED (review: grep -oP not portable, worktree cleanup order)
+- **Docker rebuild #3** — fixes: `gh auth setup-git`, `--repo` on all gh commands, PUSH_OK tracking
+- **ubuntu** service → #114 (RETRY) — 🚀 REDEPLOYED 16:55 UTC with fixed image
+- **Agents Anywhere** service → #116 (RETRY) — 🚀 REDEPLOYED 16:55 UTC with fixed image
+
+## Bug Found & Fixed This Cycle
+14. `sleepApplication: true` on "Agents Anywhere" service — Railway was sleeping container before entrypoint ran. Fixed via `serviceInstanceUpdate` + redeploy.
+
+## Lessons Learned
+1. Railway MCP `deploy` uploads source, NOT Docker image — use GraphQL API
+2. `startCommand` overrides Docker ENTRYPOINT — must set via serviceInstanceUpdate
+3. 25 service/day creation limit — never delete+create, always reuse
+4. `variableCollectionUpsert` needs actual values, not empty shell vars
+5. Service names with spaces break Railway CLI — avoid spaces in service names
+6. `sleepApplication: true` silently kills agent containers — always set to false for batch jobs
+7. Abstract/design issues (#126 "Structured ACI protocol") produce 0 commits — agents need concrete, code-level tasks with specific files/functions to modify
+8. `retry "git push ... 2>/dev/null" || true` silently swallows push failures — agent reports DONE with no PR. Fixed: track PUSH_OK, skip PR creation if push fails, report FAILED explicitly
+9. **CRITICAL**: `gh auth login` only configures `gh` CLI, NOT `git push`. Fixed: `gh auth setup-git`
+10. **CRITICAL**: All `gh issue/pr` commands lack `--repo` flag — bare-repo worktrees have no git remote context. Fixed: extract `GH_REPO` from `REPO_URL`, add `--repo` to all gh calls
+11. Docker rebuild #3 deployed with fixes #8-10. Both services redeployed 16:55 UTC
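
Lesson 10 derives the `--repo` argument from `REPO_URL`. The following is a minimal sketch of that extraction, reusing the `sed` expression that this commit adds to `deploy/agent-entrypoint.sh`. The function name `repo_slug` and the SSH-form sample URL are illustrative assumptions, not part of the commit.

```shell
#!/bin/bash
# Hypothetical helper: derive "owner/repo" from a GitHub remote URL.
# The sed program strips everything up to "github.com:" or "github.com/"
# and then removes a trailing ".git" if present.
repo_slug() {
    printf '%s\n' "$1" | sed 's|.*github.com[:/]||; s|\.git$||'
}

repo_slug "https://github.com/gHashTag/trinity.git"   # HTTPS form (default REPO_URL)
repo_slug "git@github.com:gHashTag/trinity.git"       # SSH form (assumed example)
```

Both calls print `gHashTag/trinity`, which is the shape `gh ... --repo` expects.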

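Lesson 9 concerns the gap between `gh auth login` and `git push`. The entrypoint change in this commit runs `gh auth setup-git 2>/dev/null || true` and then logs success unconditionally. As a sketch under assumptions, a wrapper could log the actual outcome instead. The function name `setup_git_auth` and its injectable first argument are hypothetical; only the `gh auth setup-git` subcommand comes from the commit.

```shell
#!/bin/bash
# Hypothetical wrapper around `gh auth setup-git` that reports the real result.
# $1 optionally names the command to run in place of `gh`, which allows the
# flow to be exercised with a stub when the gh CLI is not installed.
setup_git_auth() {
    local gh_bin="${1:-gh}"
    if "${gh_bin}" auth setup-git 2>/dev/null; then
        echo "gh auth setup-git done"
        return 0
    fi
    echo "Warning: gh auth setup-git failed; git push may not authenticate" >&2
    return 1
}
```

Called with no argument, the function would use the real `gh` binary. A caller running under `set -e` would need `setup_git_auth || true` to keep the original non-fatal behaviour.
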
deploy/agent-entrypoint.sh

Lines changed: 26 additions & 9 deletions
@@ -8,6 +8,8 @@
 set -eo pipefail
 
 REPO_URL="${REPO_URL:-https://github.com/gHashTag/trinity.git}"
+# Extract owner/repo for gh --repo flag (bare-repo worktrees lack git remote context)
+GH_REPO=$(echo "${REPO_URL}" | sed 's|.*github.com[:/]||; s|\.git$||')
 ISSUE="${ISSUE_NUMBER:?ISSUE_NUMBER is required}"
 AGENT_TIMEOUT="${AGENT_TIMEOUT:-3600}" # 1 hour default
 HEARTBEAT_INTERVAL="${HEARTBEAT_INTERVAL:-30}"
@@ -76,7 +78,7 @@ ${CURRENT_DETAIL} (${ELAPSED}s)"
 
 # 3. GitHub issue comment on status change (skip duplicates)
 if [ "${CURRENT_STATUS}" != "${LAST_STATUS}" ]; then
-    gh issue comment "${ISSUE}" --body "${EMOJI} **Trinity Agent** | ${TIMESTAMP}
+    gh issue comment "${ISSUE}" --repo "${GH_REPO}" --body "${EMOJI} **Trinity Agent** | ${TIMESTAMP}
 📋 **Step**: ${STEP_NUM}/${TOTAL_STEPS}${CURRENT_DETAIL}
 🔄 **Status**: ${CURRENT_STATUS}
 ⏱️ **Elapsed**: ${ELAPSED}s" 2>/dev/null || log "Warning: GitHub comment failed"
@@ -96,7 +98,7 @@ ${CURRENT_DETAIL} (${ELAPSED}s)"
 | **Updated** | ${TIMESTAMP} |"
 
 if [ -z "${DASHBOARD_COMMENT_ID}" ]; then
-    DASHBOARD_COMMENT_ID=$(gh issue comment "${ISSUE}" --body "${DASHBOARD_BODY}" 2>/dev/null | grep -o '/[0-9]*$' | tr -d '/' || true)
+    DASHBOARD_COMMENT_ID=$(gh issue comment "${ISSUE}" --repo "${GH_REPO}" --body "${DASHBOARD_BODY}" 2>/dev/null | grep -o '/[0-9]*$' | tr -d '/' || true)
     if [ -z "${DASHBOARD_COMMENT_ID}" ]; then
         DASHBOARD_COMMENT_ID=$(gh api "repos/{owner}/{repo}/issues/${ISSUE}/comments" --jq '.[-1].id' 2>/dev/null || true)
     fi
@@ -378,6 +380,10 @@ log "gh auth status: ${GH_STATUS}"
 git config --global user.name "Trinity Agent"
 git config --global user.email "trinity-agent@users.noreply.github.com"
 
+# Configure git to use gh as credential helper (fixes push auth)
+gh auth setup-git 2>/dev/null || true
+log "gh auth setup-git done — git push will use GITHUB_TOKEN"
+
 # === 2. Setup worktree from shared bare repo ===
 report_status "AWAKENING" "Creating worktree from bare repository"
 
@@ -422,7 +428,7 @@ sed "s/{ISSUE_NUMBER}/${ISSUE}/g" /etc/trinity/SOUL.md > "${WORKTREE_PATH}/CLAUD
 
 # === 4. Read issue ===
 report_status "READING" "Reading issue #${ISSUE}"
-ISSUE_BODY=$(gh issue view "${ISSUE}" --json title,body,labels --jq '.' 2>/dev/null || echo '{"title":"Unknown","body":"Failed to fetch issue"}')
+ISSUE_BODY=$(gh issue view "${ISSUE}" --repo "${GH_REPO}" --json title,body,labels --jq '.' 2>/dev/null || echo '{"title":"Unknown","body":"Failed to fetch issue"}')
 ISSUE_TITLE=$(echo "${ISSUE_BODY}" | grep -oP '"title"\s*:\s*"[^"]*"' | head -1 | sed 's/"title"\s*:\s*"//;s/"$//' || echo "issue #${ISSUE}")
 log "Issue title: ${ISSUE_TITLE}"
 send_telegram "📖 <b>Agent #${ISSUE}</b> читает задачу:
@@ -468,7 +474,7 @@ emit_event "command" "{\"cmd\":\"claude\",\"exit_code\":${CLAUDE_EXIT},\"timeout
 
 if [ "${CLAUDE_EXIT}" -eq 124 ]; then
     report_status "STUCK" "Timeout after ${AGENT_TIMEOUT}s"
-    gh issue comment "${ISSUE}" --body "⏰ **Trinity Agent**: Timed out after ${AGENT_TIMEOUT}s. Manual intervention needed." 2>/dev/null || true
+    gh issue comment "${ISSUE}" --repo "${GH_REPO}" --body "⏰ **Trinity Agent**: Timed out after ${AGENT_TIMEOUT}s. Manual intervention needed." 2>/dev/null || true
 elif [ "${CLAUDE_EXIT}" -ne 0 ]; then
     report_status "ERROR" "Claude Code exited with code ${CLAUDE_EXIT}"
 fi
@@ -521,7 +527,7 @@ fi
 # === 8. Push and create PR if not already done ===
 report_status "TESTING" "Checking/creating PR"
 stream_to_telegram "Checking for existing PR..."
-EXISTING_PR=$(gh pr list --head "feat/issue-${ISSUE}" --json number --jq '.[0].number' 2>/dev/null || echo "")
+EXISTING_PR=$(gh pr list --repo "${GH_REPO}" --head "feat/issue-${ISSUE}" --json number --jq '.[0].number' 2>/dev/null || echo "")
 
 if [ -z "${EXISTING_PR}" ]; then
     # Check if there are actually commits to push
@@ -530,12 +536,22 @@ if [ -z "${EXISTING_PR}" ]; then
     if [ "${COMMIT_COUNT}" -gt 0 ]; then
         log "Pushing ${COMMIT_COUNT} commit(s)..."
         stream_to_telegram "Pushing ${COMMIT_COUNT} commit(s) to origin..."
-        retry "git push -u origin 'feat/issue-${ISSUE}' 2>/dev/null" || true
+        PUSH_OK=0
+        retry "git push -u origin 'feat/issue-${ISSUE}'" && PUSH_OK=1 || true
         stream_to_telegram "Push completed."
 
+        if [ "${PUSH_OK}" -eq 0 ]; then
+            log "Push failed after 3 retries — cannot create PR"
+            report_status "FAILED" "Push failed after 3 retries"
+            send_telegram "❌ Agent #${ISSUE}: Push failed after 3 retries — code ready but cannot push"
+            # Still try to report what happened
+            gh issue comment "${ISSUE}" --repo "${GH_REPO}" --body "❌ **Trinity Agent**: Code committed locally but push to origin failed 3 times. Branch: feat/issue-${ISSUE}" 2>/dev/null || true
+        fi
+
+        if [ "${PUSH_OK}" -eq 1 ]; then
         log "Creating PR..."
         stream_to_telegram "Creating pull request..."
-        PR_URL=$(gh pr create \
+        PR_URL=$(gh pr create --repo "${GH_REPO}" \
             --title "feat: solve issue #${ISSUE}" \
             --body "Closes #${ISSUE}
 
@@ -553,7 +569,7 @@ Commits: ${COMMIT_COUNT}" \
             DIFF_STAT=$(git diff --stat main..HEAD 2>/dev/null || echo "N/A")
             FINAL_ELAPSED=$(( $(date +%s) - START_TIME ))
             stream_to_telegram "Posting final summary..."
-            gh issue comment "${ISSUE}" --body "🚀 **Trinity Agent — Summary**
+            gh issue comment "${ISSUE}" --repo "${GH_REPO}" --body "🚀 **Trinity Agent — Summary**
 
 | Field | Value |
 |-------|-------|
@@ -577,10 +593,11 @@ ${DIFF_STAT}
         else
             stream_to_telegram "Failed to create PR."
         fi
+        fi
     else
         stream_to_telegram "No commits produced — agent could not solve issue."
         report_status "FAILED" "No commits produced — agent could not solve issue"
-        gh issue comment "${ISSUE}" --body "❌ **Trinity Agent**: No solution produced. Issue may need manual attention." 2>/dev/null || true
+        gh issue comment "${ISSUE}" --repo "${GH_REPO}" --body "❌ **Trinity Agent**: No solution produced. Issue may need manual attention." 2>/dev/null || true
     fi
 else
     stream_to_telegram "PR already exists: #${EXISTING_PR}"
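
The `PUSH_OK` change in the last hunks replaces `|| true` swallowing with an explicit success flag. The control flow can be sketched in isolation as follows. The `retry` function here is a hypothetical stand-in (the real helper is not shown in this diff), and `push_and_report` with its `push_cmd` parameter is an illustrative name so the flow can run without a git remote.

```shell
#!/bin/bash
# Hypothetical stand-in for the entrypoint's retry helper: evaluate a command
# string up to 3 times and return 0 on the first success, 1 otherwise.
retry() {
    local attempt
    for attempt in 1 2 3; do
        if eval "$1"; then
            return 0
        fi
    done
    return 1
}

# Record the push outcome in PUSH_OK instead of discarding it with "|| true",
# then branch on the flag the way the entrypoint gates PR creation.
push_and_report() {
    local push_cmd="$1"
    PUSH_OK=0
    if retry "${push_cmd}"; then
        PUSH_OK=1
    fi
    if [ "${PUSH_OK}" -eq 1 ]; then
        echo "push ok: safe to create PR"
    else
        echo "push failed: skip PR creation and report FAILED"
    fi
}

push_and_report "true"    # simulated successful push
push_and_report "false"   # simulated push that fails on every attempt
```

The `if retry ...; then` form behaves like the commit's `retry ... && PUSH_OK=1 || true` under `set -e`, since a command tested by `if` does not abort the script.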

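The unchanged `ISSUE_TITLE` line in this file still relies on `grep -oP` and on `\s` inside `sed`, both GNU extensions. The evolution map in this commit records that `grep -oP` was flagged as not portable in the review of PR #143. One possible POSIX-only alternative is sketched below. The function name `json_title` and the sample JSON are hypothetical, and like the original expression it assumes the title contains no escaped double quotes.

```shell
#!/bin/bash
# Hypothetical helper: print the "title" string value from JSON on stdin.
# Uses only POSIX sed features: bracket classes and basic-regex groups.
# head keeps the first matching line when the JSON is pretty-printed.
json_title() {
    sed -n 's/.*"title"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

printf '%s\n' '{"title":"Example issue","body":"details"}' | json_title
```

Where `gh` is already fetching the issue, requesting `--json title --jq '.title'` would return the title directly and avoid text parsing altogether.
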
fpga/openxc7-synth/Makefile.200t

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+# =============================================================================
+# HSLM on Artix-7 XC7A200T — Scaled-up Ternary Transformer
+# =============================================================================
+# 365 BRAM36 (2.7x more than XC7A100T)
+# Allows: 11 TrinityBlocks + Embedding + LM Head + Attention
+#
+# Board options:
+# - Digilent Nexys A7 200T (~$300)
+# - AliExpress XC7A200T boards (~$50-80)
+# - QMTECH 200T variant (if available)
+#
+# Just change PART to synthesize for 200T
+# =============================================================================
+
+FAMILY = artix7
+PART = xc7a200tfbg484-1
+PROJECT = trinity_200t
+TOP_MODULE = hslm_full_top
+XDC = trinity_200t.xdc
+
+# Paths for openXC7 tools (snap-based)
+NEXTPNR_XILINX_DIR ?= /snap/openxc7/current/opt/nextpnr-xilinx
+NEXTPNR_XILINX_PYTHON_DIR ?= $(NEXTPNR_XILINX_DIR)/python
+PRJXRAY_DB_DIR ?= $(NEXTPNR_XILINX_DIR)/external/prjxray-db
+CHIPDB ?= ./chipdb/
+
+DBPART = $(shell echo $(PART) | sed -e 's/-[0-9]//g')
+
+PYPY3 ?= pypy3
+
+# Source files for 11-block design
+VERILOG_SRCS = hslm_full_top.v trinity_block.v ternary_matvec_bram.v \
+               ternary_activation.v ternary_rmsnorm.v embedding_lookup.v \
+               lm_head_matvec.v argmax_unit.v
+
+.PHONY: all clean info
+
+all: $(PROJECT).bit
+
+info:
+	@echo "=== Trinity 200T Build ==="
+	@echo "PART: $(PART)"
+	@echo "BRAM36: 365 (vs 135 on 100T)"
+	@echo "LUT: 134,600 (vs 63,400)"
+	@echo "DSP48: 740 (vs 240)"
+	@echo "Potential: 11 TrinityBlocks (~1.9M weights)"
+	@echo ""
+
+$(PROJECT).json: $(VERILOG_SRCS)
+	yosys -p "synth_xilinx -flatten -abc9 -arch xc7 -top $(TOP_MODULE); write_json $(PROJECT).json" $(VERILOG_SRCS)
+
+$(CHIPDB)/$(DBPART).bin:
+	mkdir -p $(CHIPDB)
+	$(PYPY3) $(NEXTPNR_XILINX_PYTHON_DIR)/bbaexport.py --device $(PART) --bba $(DBPART).bba
+	bbasm -l $(DBPART).bba $(CHIPDB)/$(DBPART).bin
+	rm -f $(DBPART).bba
+
+$(PROJECT).fasm: $(PROJECT).json $(CHIPDB)/$(DBPART).bin $(XDC)
+	nextpnr-xilinx --chipdb $(CHIPDB)/$(DBPART).bin --xdc $(XDC) --json $(PROJECT).json --fasm $@
+
+$(PROJECT).frames: $(PROJECT).fasm
+	fasm2frames --part $(PART) --db-root $(PRJXRAY_DB_DIR)/$(FAMILY) $< > $@
+
+$(PROJECT).bit: $(PROJECT).frames
+	xc7frames2bit --part_file $(PRJXRAY_DB_DIR)/$(FAMILY)/$(PART)/part.yaml --part_name $(PART) --frm_file $< --output_file $@
+
+clean:
+	rm -f *.bit *.frames *.fasm *.json
+	rm -rf chipdb/
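
The `DBPART` variable strips the speed-grade suffix from `PART`, and the result names the chip database file `$(CHIPDB)/$(DBPART).bin`. The same `sed` expression can be tried on its own as a shell sketch. The function name `dbpart` is an illustrative assumption.

```shell
#!/bin/bash
# Mirror of the Makefile's DBPART expression: remove every "-<digit>" pair
# from a part name, which drops the trailing speed grade such as "-1".
dbpart() {
    printf '%s\n' "$1" | sed -e 's/-[0-9]//g'
}

dbpart "xc7a200tfbg484-1"
```

For the `PART` value in this Makefile the call prints `xc7a200tfbg484`, so the chip database target resolves to `./chipdb//xc7a200tfbg484.bin` (the doubled slash comes from the trailing `/` in `CHIPDB` and is harmless on POSIX file systems).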
