Commit 1f2cb45

TUI visual clarity: 60% strip density improvement + richer status bar (#26)
* autoresearch: compact idle/dead pills to 4-char names in strip

  Passive (idle/dead) unselected session pills now show icon + first 4 chars of name instead of full 20-char name, saving ~16 chars per passive pill. At 5 idle + 5 running sessions, this recovers ~80 chars of strip space, fitting far more sessions in a typical 80-char terminal.

* autoresearch: attention-first ordering in unified strip

  Sessions are now sorted by state priority before rendering: pending-tool sessions first, then running, waiting, idle, dead. The selected session's index is remapped after sorting so selection tracking remains correct. This ensures urgent sessions are always visible even when +N overflow hides items at the end.

* autoresearch: filter terminal PRs, show compact done count

  When active (non-merged, non-closed) PRs exist, merged/closed PRs are hidden from the strip and replaced with a compact "(+N done)" indicator. This reduces PR clutter significantly — 3 merged PRs take 0 pill slots instead of 3, replaced by a 10-char "(+3 done)" label.

* autoresearch: badge-style pending alert in status bar

  Replace dim "⚡ 2 pending" text with orange background badge "⚡ 2 PENDING" that stands out against the status line and catches attention even when the user's focus is elsewhere.

* autoresearch: remove background from passive pills for visual hierarchy

  Idle/dead unselected pills now render without background — just dim foreground text. Active (running/waiting) pills retain their colored dim background, creating a clear visual hierarchy: colored boxes draw attention, plain text recedes. Also removes 2 padding chars per passive pill (6 chars vs 8 chars).

* autoresearch: mini fleet map line above session zoom

  When 3+ sessions exist, show a 1-line glyph overview above the session zoom panel: "Sessions: [▶]▶▶⏸⏸✔●". Current session is bracketed, pending-tool sessions appear in orange. Gives full fleet context without leaving the zoom view.

* autoresearch: badge-style failing PR alert in status bar

  Apply same badge treatment to failing PR count as pending tools: red background "✗ N FAILING" badge for visual parity. Both urgent alerts now use consistent badge style for instant recognition.

* autoresearch: merge readiness summary line at top of PR zoom

  Add a quick-scan summary line before the checks section: "✓ approved ✓ checks (12/12) ✓ mergeable ⎇ squash" or "✗ changes requested ✗ checks (9/12) ✗ conflicts ⎇ unset". Lets user assess merge readiness in under 1 second.

* autoresearch: show session index/total in fleet map line

  Fleet map now reads "Session 3/10: [▶]▶▶⏸⏸✔●" — the bold N/total position indicator tells the operator exactly where they are in the fleet while still showing all session states.

* autoresearch: compact PR pills — title only for critical/selected state

  Non-critical, non-selected PR pills now show just icon + number (e.g. "⏳ #42" = 7 chars) instead of icon + number + truncated title (~23 chars). Title is shown only for: checks_failing, approved (need action), or when the PR is selected. Saves ~15 chars per non-critical PR pill.

* autoresearch: group queue tools by safety level (destructive vs safe)

  In the approval queue panel, split pending tools per session into "⚠ Destructive:" and "✓ Safe:" groups. Destructive tools listed first so operators see what needs careful scrutiny immediately. Labels only appear when a session has both types.

* autoresearch: show oldest-pending age next to pending badge

  After the pending badge, show "Xs ago" indicating how long the oldest pending approval has been waiting. Helps operator assess urgency at a glance: "⚡ 2 PENDING 45s ago" vs "⚡ 1 PENDING 5m ago".

* autoresearch: done-state treatment for merged/closed PR zoom

  Merged/closed PRs now show "✔ Merged — no further action required" (or ● Closed) at the top of the body, and skip the merge readiness summary which is irrelevant for done PRs. Reduces visual clutter.

* autoresearch: enrich overflow indicator with active-session state breakdown

  Overflow indicators now show state breakdown of hidden active sessions: "+4(▶2⏸1)" instead of just "+4" when hidden pills include running/waiting sessions. Plain "+N" is used when only idle/dead/PR pills are hidden. Directly solves the "+N confusion" problem — user can always see if critical sessions are out of view.

* autoresearch: PR state breakdown in status bar (3✓ 1✗ 1⏳)

  After the PR count, show compact per-state counts: passing(✓), failing(✗), running(⏳), merged(✔) — each colored and only shown when count > 0. Mirrors session state breakdown for full fleet overview in the status bar.

* autoresearch: bold+tinted background for critical unselected PR pills

  checks_failing PRs get bold + red dim background. approved PRs get bold + green dim background. Non-critical (checks_running, checks_passing) remain plain text. Creates clear urgency gradient: plain < tinted < selected-border.

* autoresearch: agent-c findings and log (10 cycles)

* autoresearch: collapse dead sessions to compact count at 8+ session load

  When 8+ sessions are present, dead sessions without pending tools or active selection are removed from the pill list and replaced with a compact "(●N)" count indicator. This eliminates ~6 chars per dead pill (3 dead sessions → 18 chars replaced by 5-char "(●3)"). Saves ~13 chars and removes truly-done sessions from the active view.

* autoresearch: visual clarity sweep complete (30 cycles, 3 agents)

  Strip: 295→118 chars at 10+5 load (60% reduction)
  Status bar: state breakdowns, badge alerts, fleet map
  Zoom: merge readiness summary, safety grouping, dead-state treatment

* fix: remove duplicate mergeable/method from PR zoom line 2, show repo#N in compact pills

* chore: gitignore .autoresearch/, remove from tracking

* Streaming agent output with live timeline updates, cleanup, and tests

  - Agent commands use --output-format stream-json --verbose for real-time output
  - Parse STATUS: lines from agent text and push as ⚙ timeline events
  - Prefix timeline entries with agent type (fix-CI:, review:, fix-review:)
  - Write full stream log to /tmp/csm-agent-*-stream.log for debugging
  - Accumulate agent cost from result events
  - Remove fleet map (redundant with strip state summary)
  - Remove duplicate mergeable/method from PR zoom header line 2
  - Add 8 regression tests (stream-json flags, agentLabel, writeAgentLog, PR zoom dedup)

* fix: CI test failure — disable review spawn in TestPoll_MultiplePRs

  The streaming agent code spawns goroutines that race with TempDir cleanup in CI where claude binary doesn't exist. Set ReviewState="clean" to skip agent spawning since this test is about polling, not agent execution.
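The attention-first ordering described in the commit message (sort by state priority, then remap the selected index) can be sketched as follows. This is a minimal illustration, not the TUI code from this commit: the `Session` type, its field names, the state strings, and `sortByAttention` are all hypothetical.

```go
package main

import "sort"

// Session is a hypothetical stand-in for the TUI's session model.
type Session struct {
	Name        string
	State       string // assumed values: "running", "waiting", "idle", "dead"
	PendingTool bool
}

// priority ranks a session; lower values render first. The order follows the
// commit message: pending-tool, running, waiting, idle, dead.
func priority(s Session) int {
	if s.PendingTool {
		return 0
	}
	switch s.State {
	case "running":
		return 1
	case "waiting":
		return 2
	case "idle":
		return 3
	default:
		return 4
	}
}

// sortByAttention returns a sorted copy of sessions plus the new index of the
// session that was at position selected before sorting (-1 if out of range).
func sortByAttention(sessions []Session, selected int) ([]Session, int) {
	// Sort a permutation of indices so the original positions stay known.
	order := make([]int, len(sessions))
	for i := range order {
		order[i] = i
	}
	sort.SliceStable(order, func(a, b int) bool {
		return priority(sessions[order[a]]) < priority(sessions[order[b]])
	})

	sorted := make([]Session, len(sessions))
	newSelected := -1
	for pos, orig := range order {
		sorted[pos] = sessions[orig]
		if orig == selected {
			newSelected = pos
		}
	}
	return sorted, newSelected
}
```

Sorting a permutation of indices, rather than the slice itself, makes the remap a single lookup. The stable sort keeps sessions with equal priority in their original relative order.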
1 parent 947821e commit 1f2cb45

11 files changed

Lines changed: 873 additions & 61 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -20,3 +20,4 @@ node_modules/
 # Logs
 *.log
 .claude/worktrees/
+.autoresearch/

daemon/internal/pr/agent.go

Lines changed: 183 additions & 7 deletions
@@ -1,6 +1,8 @@
 package pr
 
 import (
+	"bufio"
+	"bytes"
 	"context"
 	"encoding/json"
 	"fmt"
@@ -57,6 +59,11 @@ func cloneForAgent(owner, repo, branch string) (string, error) {
 	return tmpDir, nil
 }
 
+// statusInstruction is appended to all agent prompts to get live status updates.
+const statusInstruction = "\n\nIMPORTANT: At each major step, print a short status line starting with " +
+	"\"STATUS: \" (e.g. \"STATUS: reading CI logs\", \"STATUS: found root cause in foo.go\", " +
+	"\"STATUS: running tests\", \"STATUS: pushing fix\"). These are shown in a live dashboard."
+
 // --- command builders ---
 
 func buildFixCICmd(pr *TrackedPR, workDir string) *exec.Cmd {
@@ -82,10 +89,11 @@ func buildFixCICmd(pr *TrackedPR, workDir string) *exec.Cmd {
 			"Do not change test expectations unless the test itself is wrong.",
 		pr.Number, pr.Owner, pr.Repo, pr.HeadBranch,
 		strings.Join(failing, "\n"),
-	)
+	) + statusInstruction
 
 	args := []string{
 		"-p", prompt,
+		"--output-format", "stream-json", "--verbose",
 		"--no-session-persistence",
 		"--max-budget-usd", "5",
 		"--model", "sonnet",
@@ -117,10 +125,11 @@ func buildCodeReviewCmd(pr *TrackedPR, workDir string) *exec.Cmd {
 			"If the code is clean, output: []\n"+
 			"Output the JSON array and nothing else.",
 		pr.HeadBranch, pr.BaseBranch, pr.BaseBranch,
-	)
+	) + statusInstruction
 
 	args := []string{
 		"-p", prompt,
+		"--output-format", "stream-json", "--verbose",
 		"--no-session-persistence",
 		"--max-budget-usd", "3",
 		"--model", "sonnet",
@@ -151,10 +160,11 @@ func buildFixReviewCmd(pr *TrackedPR, workDir string) *exec.Cmd {
 			"1. Run tests to verify nothing is broken\n"+
 			"2. Commit and push to the current branch",
 		strings.Join(issues, "\n"),
-	)
+	) + statusInstruction
 
 	args := []string{
 		"-p", prompt,
+		"--output-format", "stream-json", "--verbose",
 		"--no-session-persistence",
 		"--max-budget-usd", "5",
 		"--model", "sonnet",
@@ -175,6 +185,147 @@ func buildFixReviewCmd(pr *TrackedPR, workDir string) *exec.Cmd {
 	return cmd
 }
 
+// --- streaming agent runner ---
+
+// streamEvent is the minimal structure for parsing claude stream-json events.
+type streamEvent struct {
+	Type string `json:"type"`
+	Message struct {
+		Content []struct {
+			Type string `json:"type"`
+			Text string `json:"text"`
+		} `json:"content"`
+	} `json:"message"`
+	Result string `json:"result"`
+	CostUSD float64 `json:"total_cost_usd"`
+	Duration float64 `json:"duration_ms"`
+}
+
+// runStreamingAgent runs a claude -p command with stream-json output,
+// parsing STATUS: lines and forwarding them to the PR timeline in real-time.
+// Returns the final result text and accumulated full output for logging.
+func (p *Poller) runStreamingAgent(ctx context.Context, cmd *exec.Cmd, key, agentType string) (result string, allOutput []byte, err error) {
+	stdout, pipeErr := cmd.StdoutPipe()
+	if pipeErr != nil {
+		return "", nil, fmt.Errorf("stdout pipe: %w", pipeErr)
+	}
+	// Capture stderr separately for diagnostics.
+	var stderrBuf bytes.Buffer
+	cmd.Stderr = &stderrBuf
+
+	if startErr := cmd.Start(); startErr != nil {
+		return "", nil, fmt.Errorf("start: %w", startErr)
+	}
+
+	var fullOutput bytes.Buffer
+	scanner := bufio.NewScanner(stdout)
+	// stream-json can have long lines (tool results with file contents).
+	scanner.Buffer(make([]byte, 0, 256*1024), 1024*1024)
+
+	// Live stream log — write every event to a file for debugging.
+	safe := strings.NewReplacer("/", "-", "#", "-").Replace(key)
+	streamLogPath := fmt.Sprintf("/tmp/csm-agent-%s-%s-stream.log", safe, agentType)
+	streamLog, _ := os.Create(streamLogPath)
+	defer func() {
+		if streamLog != nil {
+			streamLog.Close()
+		}
+	}()
+	log.Printf("pr: agent %s stream log: %s", agentType, streamLogPath)
+
+	for scanner.Scan() {
+		line := scanner.Bytes()
+		fullOutput.Write(line)
+		fullOutput.WriteByte('\n')
+		if streamLog != nil {
+			streamLog.Write(line)
+			streamLog.WriteString("\n")
+			streamLog.Sync()
+		}
+
+		var ev streamEvent
+		if json.Unmarshal(line, &ev) != nil {
+			continue
+		}
+
+		switch ev.Type {
+		case "assistant":
+			// Look for STATUS: lines in assistant text content.
+			for _, block := range ev.Message.Content {
+				if block.Type != "text" {
+					continue
+				}
+				for _, textLine := range strings.Split(block.Text, "\n") {
+					trimmed := strings.TrimSpace(textLine)
+					if strings.HasPrefix(trimmed, "STATUS:") {
+						status := strings.TrimSpace(strings.TrimPrefix(trimmed, "STATUS:"))
+						if status != "" {
+							p.agentProgress(key, agentType, status)
+						}
+					}
+				}
+			}
+		case "result":
+			result = ev.Result
+			if ev.CostUSD > 0 {
+				p.agentCostUpdate(key, ev.CostUSD)
+			}
+		}
+	}
+
+	waitErr := cmd.Wait()
+
+	// Append stderr to output for logging.
+	if stderrBuf.Len() > 0 {
+		fullOutput.WriteString("\n--- stderr ---\n")
+		fullOutput.Write(stderrBuf.Bytes())
+	}
+
+	return result, fullOutput.Bytes(), waitErr
+}
+
+// agentLabel returns a human-friendly label for a timeline prefix.
+func agentLabel(agentType string) string {
+	switch agentType {
+	case "fix_ci":
+		return "fix-CI"
+	case "review":
+		return "review"
+	case "fix_review":
+		return "fix-review"
+	default:
+		return agentType
+	}
+}
+
+// agentProgress adds a status update to the PR timeline from a running agent.
+func (p *Poller) agentProgress(key, agentType, status string) {
+	p.mu.Lock()
+	pr, ok := p.tracked[key]
+	if ok {
+		pr.Timeline = append(pr.Timeline, PREvent{
+			Time: time.Now(), Icon: "⚙",
+			Message: agentLabel(agentType) + ": " + status,
+		})
+		p.save()
+	}
+	p.mu.Unlock()
+
+	if ok && p.onChange != nil {
+		p.onChange()
+	}
+}
+
+// agentCostUpdate records the agent cost on the PR.
+func (p *Poller) agentCostUpdate(key string, costUSD float64) {
+	p.mu.Lock()
+	pr, ok := p.tracked[key]
+	if ok {
+		pr.AgentCostUSD += costUSD
+	}
+	p.mu.Unlock()
+}
+
 // --- spawn functions ---
 
 const agentTimeout = 15 * time.Minute
@@ -198,7 +349,7 @@ func (p *Poller) spawnFixCI(pr *TrackedPR) {
 		cmd = exec.CommandContext(ctx, cmd.Path, cmd.Args[1:]...)
 		cmd.Dir = workDir
 
-		output, err := cmd.CombinedOutput()
+		_, output, err := p.runStreamingAgent(ctx, cmd, key, "fix_ci")
 		p.agentComplete(key, "fix_ci", err, output)
 	}()
 }
@@ -222,7 +373,12 @@ func (p *Poller) spawnCodeReview(pr *TrackedPR) {
 		cmd = exec.CommandContext(ctx, cmd.Path, cmd.Args[1:]...)
 		cmd.Dir = workDir
 
-		output, err := cmd.CombinedOutput()
+		result, output, err := p.runStreamingAgent(ctx, cmd, key, "review")
+		// For review, the result field contains the final text output.
+		// Pass it as output for parseReviewOutput.
+		if err == nil && result != "" {
+			output = []byte(result)
+		}
 		p.agentComplete(key, "review", err, output)
 	}()
 }
@@ -246,11 +402,30 @@ func (p *Poller) spawnFixReview(pr *TrackedPR) {
 		cmd = exec.CommandContext(ctx, cmd.Path, cmd.Args[1:]...)
 		cmd.Dir = workDir
 
-		output, err := cmd.CombinedOutput()
+		_, output, err := p.runStreamingAgent(ctx, cmd, key, "fix_review")
 		p.agentComplete(key, "fix_review", err, output)
 	}()
 }
 
+// writeAgentLog writes agent output + error to /tmp/csm-agent-<key>-<type>.log.
+// Returns the log path for use in the daemon log line.
+func writeAgentLog(key, agentType string, output []byte, runErr error) string {
+	// Sanitize key for use in filename (replace / and # with -).
+	safe := strings.NewReplacer("/", "-", "#", "-").Replace(key)
+	path := fmt.Sprintf("/tmp/csm-agent-%s-%s.log", safe, agentType)
+	var buf strings.Builder
+	buf.WriteString(fmt.Sprintf("=== CSM agent log: %s %s ===\n", key, agentType))
+	buf.WriteString(fmt.Sprintf("error: %v\n", runErr))
+	buf.WriteString("--- output ---\n")
+	if len(output) > 0 {
+		buf.Write(output)
+	} else {
+		buf.WriteString("(no output)\n")
+	}
+	_ = os.WriteFile(path, []byte(buf.String()), 0o644)
+	return path
+}
+
 // --- completion callback ---
 
 func (p *Poller) agentComplete(key, agentType string, err error, output []byte) {
@@ -268,7 +443,8 @@ func (p *Poller) agentComplete(key, agentType string, err error, output []byte)
 			Time: time.Now(), Icon: "✗",
 			Message: fmt.Sprintf("Agent %s failed: %v", agentType, err),
 		})
-		log.Printf("pr: agent %s for %s failed: %v", agentType, key, err)
+		logFile := writeAgentLog(key, agentType, output, err)
+		log.Printf("pr: agent %s for %s failed: %v (log: %s)", agentType, key, err, logFile)
 	} else {
 		msg := fmt.Sprintf("Agent %s completed", agentType)

daemon/internal/pr/agent_test.go

Lines changed: 101 additions & 0 deletions
@@ -1,6 +1,7 @@
 package pr
 
 import (
+	"fmt"
 	"os"
 	"strings"
 	"testing"
@@ -173,6 +174,106 @@ func TestParseReviewOutput_Empty(t *testing.T) {
 	}
 }
 
+// === stream-json flags ===
+
+func TestBuildFixCICmd_StreamJSON(t *testing.T) {
+	pr := &TrackedPR{
+		Owner: "test", Repo: "repo", Number: 1,
+		HeadBranch: "fix", AutopilotMode: PRAuto,
+		Checks: []Check{{Name: "ci", Conclusion: "FAILURE"}},
+	}
+	args := strings.Join(buildFixCICmd(pr, "/tmp").Args, " ")
+	if !strings.Contains(args, "--output-format stream-json") {
+		t.Error("fix_ci should use stream-json output")
+	}
+	if !strings.Contains(args, "--verbose") {
+		t.Error("stream-json requires --verbose")
+	}
+	if !strings.Contains(args, "STATUS:") {
+		t.Error("prompt should contain STATUS instruction")
+	}
+}
+
+func TestBuildCodeReviewCmd_StreamJSON(t *testing.T) {
+	pr := &TrackedPR{
+		Owner: "test", Repo: "repo", Number: 1,
+		HeadBranch: "feat", BaseBranch: "main",
+	}
+	args := strings.Join(buildCodeReviewCmd(pr, "/tmp").Args, " ")
+	if !strings.Contains(args, "--output-format stream-json") {
+		t.Error("review should use stream-json output")
+	}
+	if !strings.Contains(args, "--verbose") {
+		t.Error("stream-json requires --verbose")
+	}
+}
+
+func TestBuildFixReviewCmd_StreamJSON(t *testing.T) {
+	pr := &TrackedPR{
+		Owner: "test", Repo: "repo", Number: 1,
+		HeadBranch: "fix", AutopilotMode: PRAuto,
+		ReviewFindings: []ReviewFinding{
+			{Severity: SeverityCritical, File: "a.go", Message: "bug"},
+		},
+	}
+	args := strings.Join(buildFixReviewCmd(pr, "/tmp").Args, " ")
+	if !strings.Contains(args, "--output-format stream-json") {
+		t.Error("fix_review should use stream-json output")
+	}
+	if !strings.Contains(args, "--verbose") {
+		t.Error("stream-json requires --verbose")
+	}
+}
+
+// === agentLabel ===
+
+func TestAgentLabel(t *testing.T) {
+	cases := []struct{ in, want string }{
+		{"fix_ci", "fix-CI"},
+		{"review", "review"},
+		{"fix_review", "fix-review"},
+		{"unknown", "unknown"},
+	}
+	for _, c := range cases {
+		if got := agentLabel(c.in); got != c.want {
+			t.Errorf("agentLabel(%q) = %q, want %q", c.in, got, c.want)
+		}
+	}
+}
+
+// === writeAgentLog ===
+
+func TestWriteAgentLog_CreatesFile(t *testing.T) {
+	path := writeAgentLog("test/repo#1", "fix_ci", []byte("some output"), nil)
+	defer os.Remove(path)
+
+	data, err := os.ReadFile(path)
+	if err != nil {
+		t.Fatalf("failed to read log: %v", err)
+	}
+	s := string(data)
+	if !strings.Contains(s, "test/repo#1") {
+		t.Error("log should contain PR key")
+	}
+	if !strings.Contains(s, "some output") {
+		t.Error("log should contain output")
+	}
+}
+
+func TestWriteAgentLog_NoOutput(t *testing.T) {
+	path := writeAgentLog("test/repo#2", "review", nil, fmt.Errorf("signal: killed"))
+	defer os.Remove(path)
+
+	data, _ := os.ReadFile(path)
+	s := string(data)
+	if !strings.Contains(s, "signal: killed") {
+		t.Error("log should contain error")
+	}
+	if !strings.Contains(s, "(no output)") {
+		t.Error("log should indicate no output")
+	}
+}
+
 // === cloneForAgent (mock test) ===
 
 func TestCloneForAgent_BadRepo(t *testing.T) {
