Commit 5af7ca7

Add memory UX improvements and session filtering
Container:
- Update devcontainer.json configuration
- Expand claude-code-headless skill documentation

Dashboard (Parser/Server):
- Add memory-sync service for background synchronization
- Update parser queries, types, and DB layer for memory filtering
- Extend API routes with memory and session endpoints

Dashboard (UI Components):
- Add ApproveModal, MaintenanceModal, and ConfirmModal components
- Enhance memory views: MemoriesPage, MemoriesTab, ObservationsTab, RunDetail, RunsTab with filtering and approval workflows
- Update session views: SessionList, SessionDetail, AgentsView
- Improve dashboard charts: ActivityHeatmap, DurationDistribution, HourlyHeatmap, ModelDistribution, OverviewCards, ToolUsage
- Update Sidebar, ProjectDetail, and global styles

Dashboard (Stores/Routes):
- Extend memory, sessions, and SSE stores for new data flows
- Update page routes for memories and session detail views
1 parent 0e2fe37 commit 5af7ca7

36 files changed: +2555 -330 lines changed

container/.devcontainer/devcontainer.json

Lines changed: 3 additions & 1 deletion
@@ -205,7 +205,9 @@
205205
"projectManager.git.maxDepthRecursion": 2,
206206
"projectManager.showProjectNameInStatusBar": true,
207207
"projectManager.openInNewWindowWhenClickingInStatusBar": false,
208-
"projectManager.projectsLocation": "/workspaces/.config/project-manager"
208+
"projectManager.projectsLocation": "/workspaces/.config/project-manager",
209+
"git.autofetch": false,
210+
"git.autorefresh": false
209211
},
210212
"extensions": [
211213
"wenbopan.vscode-terminal-osc-notifier",

container/.devcontainer/plugins/devs-marketplace/plugins/skill-engine/skills/claude-code-headless/SKILL.md

Lines changed: 115 additions & 9 deletions
@@ -8,7 +8,7 @@ description: >-
88
programmatically", "set up permissions for scripts", or works with
99
--output-format stream-json, --permission-mode, --json-schema, --resume.
1010
DO NOT USE for the TypeScript SDK API — use claude-agent-sdk instead.
11-
version: 0.2.0
11+
version: 0.3.0
1212
---
1313

1414
# Claude Code Headless
@@ -49,11 +49,15 @@ claude -p "Review this code for security issues" < src/auth.py
4949
| `--max-turns` | Limit agentic turns; exits with error at limit | `--max-turns 10` |
5050
| `--max-budget-usd` | Maximum dollar spend for the invocation | `--max-budget-usd 5.00` |
5151
| `--output-format` | Output format: `text`, `json`, `stream-json` | `--output-format json` |
52-
| `--verbose` | Full turn-by-turn output to stderr | `--verbose` |
52+
| `--verbose` | **Required with `stream-json`**. Full turn-by-turn JSON to stdout | `--verbose` |
5353
| `--allowedTools` | Auto-approve specific tools (glob patterns) | `--allowedTools "Read" "Bash(git *)"` |
5454
| `--disallowedTools` | Remove tools from model context entirely | `--disallowedTools "Write" "Edit"` |
5555
| `--permission-mode` | Permission behavior preset | `--permission-mode acceptEdits` |
5656

57+
> **CRITICAL — `--verbose` is mandatory with `--output-format stream-json`.**
58+
> Without it, the CLI immediately errors: `Error: When using --print, --output-format=stream-json requires --verbose`.
59+
> This applies to both CLI invocations and programmatic `Bun.spawn`/`child_process.spawn` usage.
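
One way to avoid forgetting the flag in programmatic usage is to build the argument list in a single place. The sketch below is only an illustration: `buildClaudeArgs` and `HeadlessOptions` are hypothetical names, not part of the CLI or any SDK, and it uses only the `-p`, `--output-format`, and `--verbose` flags described above.

```typescript
// Hypothetical helper (not part of Claude Code): builds the argv for a
// headless invocation and adds --verbose whenever stream-json is requested.
type OutputFormat = "text" | "json" | "stream-json";

interface HeadlessOptions {
  prompt: string;
  outputFormat: OutputFormat;
  verbose?: boolean; // optional for text/json, forced on for stream-json
}

function buildClaudeArgs(opts: HeadlessOptions): string[] {
  const args = ["claude", "-p", opts.prompt, "--output-format", opts.outputFormat];
  if (opts.verbose === true || opts.outputFormat === "stream-json") {
    args.push("--verbose");
  }
  return args;
}

const exampleArgs = buildClaudeArgs({ prompt: "query", outputFormat: "stream-json" });
console.log(exampleArgs.join(" "));
```

The resulting array can be passed as the command array to `Bun.spawn`, or split into command and arguments for `child_process.spawn`.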
60+
5761
### System Prompt Flags
5862

5963
| Flag | Behavior | Recommendation |
@@ -100,29 +104,64 @@ system (init) → assistant/user messages (interleaved) → result (final)
100104
```
101105

102106
```bash
103-
claude -p "Refactor the database module" --output-format stream-json | while IFS= read -r line; do
107+
claude -p "Refactor the database module" --verbose --output-format stream-json | while IFS= read -r line; do
104108
type=$(echo "$line" | jq -r '.type')
105109
case "$type" in
106-
system) echo "Session: $(echo "$line" | jq -r '.session_id')" ;;
110+
system) ;; # Hook events — skip
107111
assistant) echo "$line" | jq -r '.message.content[]? | select(.type == "text") | .text' ;;
108112
result) echo "Cost: $(echo "$line" | jq -r '.total_cost_usd')" ;;
109113
esac
110114
done
111115
```
112116

117+
### stream-json Gotchas
118+
119+
**1. `--verbose` is mandatory.** See the Key Flags table above. Without it, the process errors immediately.
120+
121+
**2. Event field nesting.** The `assistant` event nests everything under `event.message`:
122+
- Model: `event.message.model` (NOT `event.model`)
123+
- Usage: `event.message.usage.input_tokens` / `.output_tokens` (NOT top-level)
124+
- Content: `event.message.content[]`
125+
126+
The `result` event has top-level fields: `event.total_cost_usd`, `event.structured_output`, `event.subtype`, `event.num_turns`.
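
The nesting described above can be sketched as a pair of TypeScript types and a small reducer. The interfaces are assumptions reduced to the fields named in this section (real events carry more), `summarizeRun` is a hypothetical helper, the sample values are invented for illustration, and summing per-event usage is an assumption about how token counts should be aggregated.

```typescript
// Assumed, minimal event shapes: only the fields named in this section.
interface AssistantEvent {
  type: "assistant";
  message: {
    model: string;
    usage: { input_tokens: number; output_tokens: number };
    content: Array<{ type: string; text?: string }>;
  };
}

interface ResultEvent {
  type: "result";
  subtype: string;
  total_cost_usd: number;
  num_turns: number;
  structured_output?: unknown;
}

type StreamEvent = AssistantEvent | ResultEvent | { type: "system" | "user" };

interface RunSummary {
  model: string | null;
  inputTokens: number;
  outputTokens: number;
  text: string;
  costUsd: number | null;
  turns: number | null;
}

// Hypothetical reducer: reads assistant fields from event.message and
// result fields from the top level of the event.
function summarizeRun(events: StreamEvent[]): RunSummary {
  const summary: RunSummary = {
    model: null, inputTokens: 0, outputTokens: 0, text: "", costUsd: null, turns: null,
  };
  for (const event of events) {
    if (event.type === "assistant") {
      summary.model = event.message.model; // NOT event.model
      summary.inputTokens += event.message.usage.input_tokens;
      summary.outputTokens += event.message.usage.output_tokens;
      for (const block of event.message.content) {
        if (block.type === "text" && block.text !== undefined) summary.text += block.text;
      }
    } else if (event.type === "result") {
      summary.costUsd = event.total_cost_usd; // top-level on result
      summary.turns = event.num_turns;
    }
  }
  return summary;
}

// Invented sample events, for illustration only.
const sampleEvents: StreamEvent[] = [
  { type: "system" },
  {
    type: "assistant",
    message: {
      model: "example-model",
      usage: { input_tokens: 10, output_tokens: 5 },
      content: [{ type: "text", text: "Done." }],
    },
  },
  { type: "result", subtype: "success", total_cost_usd: 0.01, num_turns: 1 },
];
console.log(summarizeRun(sampleEvents));
```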
127+
128+
**3. `ReadableStream.getReader()` is unreliable for subprocess stdout in long-lived Bun.serve processes.** When spawning Claude as a subprocess inside a Bun HTTP server, `proc.stdout.getReader()` may silently close after reading only a few events — even though the process runs to completion and produces all output. This is a Bun runtime issue with streaming readers in server contexts.
129+
130+
**Fix:** Collect all stdout after the process exits instead of streaming:
131+
```typescript
132+
// BROKEN inside Bun.serve — reader closes prematurely
133+
const reader = proc.stdout.getReader();
134+
while (true) {
135+
const { done, value } = await reader.read(); // ← may return done=true early
136+
if (done) break;
137+
}
138+
139+
// WORKS — collect after exit
140+
const exitCode = await proc.exited;
141+
const stdout = await new Response(proc.stdout).text();
142+
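for (const line of stdout.split("\n").filter((l) => l.trim() !== "")) { // skip blank lines so JSON.parse does not throw on the trailing newline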
143+
const event = JSON.parse(line.trim());
144+
// process event
145+
}
146+
```
147+
148+
This trades real-time streaming for reliability. If you need real-time progress, emit SSE events based on the process being alive rather than individual stream events.
149+
150+
**4. `system` events from hooks.** With `--verbose`, the stream includes `system` events for hooks (SessionStart, PreToolUse, etc.) before the first `assistant` event. Filter by `event.type === "assistant"` or `event.type === "result"` to skip them.
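
A minimal sketch of that filter, applied to stdout that has already been collected as a string. `parseRelevantEvents` and `MinimalEvent` are hypothetical names; skipping blank and non-JSON lines is a defensive assumption rather than documented CLI behavior.

```typescript
// Assumed minimal shape: every event is a JSON object with a string `type`.
interface MinimalEvent {
  type: string;
  [key: string]: unknown;
}

// Hypothetical helper: parses newline-delimited JSON and keeps only
// `assistant` and `result` events, dropping hook-generated `system` events.
function parseRelevantEvents(stdout: string): MinimalEvent[] {
  const events: MinimalEvent[] = [];
  for (const raw of stdout.split("\n")) {
    const line = raw.trim();
    if (line === "") continue; // blank lines, e.g. the trailing newline
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // not JSON: skip rather than abort the whole run
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const type = (parsed as { type?: unknown }).type;
    if (type === "assistant" || type === "result") {
      events.push(parsed as MinimalEvent);
    }
  }
  return events;
}

// Invented sample output, for illustration only.
const sampleStdout = [
  '{"type":"system","subtype":"hook"}',
  '{"type":"assistant","message":{"content":[]}}',
  "",
  '{"type":"result","num_turns":1}',
].join("\n");
console.log(parseRelevantEvents(sampleStdout).map((e) => e.type));
```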
151+
113152
### jq Filtering Recipes
114153

115154
```bash
116155
# Extract only text output from assistant messages
117-
claude -p "query" --output-format stream-json \
156+
claude -p "query" --verbose --output-format stream-json \
118157
| jq -r 'select(.type == "assistant") | .message.content[]? | select(.type == "text") | .text'
119158

120159
# Extract tool usage
121-
claude -p "query" --output-format stream-json \
160+
claude -p "query" --verbose --output-format stream-json \
122161
| jq -r 'select(.type == "assistant") | .message.content[]? | select(.type == "tool_use") | "\(.name): \(.input | tostring)"'
123162

124163
# Get final cost
125-
claude -p "query" --output-format stream-json \
164+
claude -p "query" --verbose --output-format stream-json \
126165
| jq -r 'select(.type == "result") | "Cost: $\(.total_cost_usd) | Turns: \(.num_turns)"'
127166
```
128167

@@ -365,12 +404,79 @@ Check `is_error` and `subtype` to determine whether the invocation completed suc
365404

366405
---
367406

407+
## Data Quality for LLM Analyzers
408+
409+
When using Claude Code headless (`-p`) to build an agent that **analyzes other Claude Code sessions** (conversation logs, behavioral patterns, usage metrics), the quality of analysis depends entirely on how data is presented to the subprocess. These rules prevent the most common failure modes.
410+
411+
### Rule 1: Give Tools, Not Data Dumps
412+
413+
Instead of pre-loading all data into the prompt, provide query commands the agent can call via Bash:
414+
415+
```bash
416+
# BAD — dumps everything into the prompt, wastes tokens, overwhelms the model
417+
claude -p "Here are 500 messages: $(cat messages.json). Analyze them." \
418+
--allowedTools "Read"
419+
420+
# GOOD — gives the agent tools to explore on its own
421+
claude -p "Use these query commands to explore the session data..." \
422+
--allowedTools "Bash(bun run scripts/query-db.ts *)" "Read" "Glob" "Grep"
423+
```
424+
425+
Benefits:
426+
- The agent decides what's relevant, not you
427+
- Keeps prompt tokens low
428+
- Scales to any data size
429+
- The agent can drill into interesting areas
430+
431+
### Rule 2: Explain the Data Taxonomy
432+
433+
When an LLM agent processes conversation data, it **MUST** understand the role taxonomy. Claude Code sessions have multiple message types that look similar but carry very different signal:
434+
435+
| Message type | Role | Signal value |
436+
|-------------|------|-------------|
437+
| `user` (string, no system tags) | Human input | **PRIMARY** — preferences, corrections, instructions |
438+
| `user` (string, with `<system-reminder>`) | System injection | **NOISE** — hooks, workspace reminders |
439+
| `user` (array, text blocks only) | Human input | **PRIMARY** — if no system tags in text |
440+
| `user` (array, tool_result only) | Tool plumbing | **NOISE** — tool execution results |
441+
| `assistant` (text blocks) | Claude's response | **CONTEXT** — what user reacted to |
442+
| `assistant` (tool_use blocks) | Claude's actions | **CONTEXT** — workflow patterns |
443+
| `assistant` (thinking blocks) | Internal reasoning | **NOISE** — skip entirely |
444+
| `system` | System hooks | **NOISE** — skip |
445+
| `progress` | Tool execution | **NOISE** — skip |
446+
447+
**Critical distinction:** In a typical 264-message session, only **7 messages** (~3%) are actual human input. The rest is tool plumbing, system injections, and assistant actions. Without taxonomy guidance, the analyzer treats all 264 messages as equally important and produces garbage.
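
The taxonomy table could be applied as a pre-filter before data reaches the analyzer. The following is a sketch under stated assumptions: the `SessionMessage` and `Block` shapes are simplified guesses at the log format, `classifyMessage` is a hypothetical helper, only the `<system-reminder>` tag is checked, and user arrays that mix text with other block types are treated as noise because the table does not cover them.

```typescript
type Signal = "PRIMARY" | "CONTEXT" | "NOISE";

// Assumed, simplified message shapes; real session logs carry more fields.
interface Block {
  type: string; // e.g. "text", "tool_use", "tool_result", "thinking"
  text?: string;
}

interface SessionMessage {
  type: string; // "user", "assistant", "system", "progress"
  content?: string | Block[];
}

const SYSTEM_TAG = "<system-reminder>";

// Hypothetical classifier mirroring the rows of the taxonomy table.
function classifyMessage(msg: SessionMessage): Signal {
  if (msg.type === "user") {
    if (typeof msg.content === "string") {
      return msg.content.includes(SYSTEM_TAG) ? "NOISE" : "PRIMARY";
    }
    const blocks: Block[] = msg.content ?? [];
    // PRIMARY only when every block is untagged text; tool_result-only and
    // mixed arrays fall through to NOISE (the mixed case is an assumption).
    const humanOnly =
      blocks.length > 0 &&
      blocks.every((b) => b.type === "text" && !(b.text ?? "").includes(SYSTEM_TAG));
    return humanOnly ? "PRIMARY" : "NOISE";
  }
  if (msg.type === "assistant") {
    const blocks: Block[] =
      typeof msg.content === "string"
        ? [{ type: "text", text: msg.content }]
        : msg.content ?? [];
    // Text and tool_use are context; thinking-only messages are noise.
    return blocks.some((b) => b.type === "text" || b.type === "tool_use") ? "CONTEXT" : "NOISE";
  }
  return "NOISE"; // system, progress, and anything unrecognized
}

// Invented sample messages, for illustration only.
const sampleMessages: SessionMessage[] = [
  { type: "user", content: "Please use tabs, not spaces." },
  { type: "user", content: "<system-reminder>workspace hook</system-reminder>" },
  { type: "user", content: [{ type: "tool_result" }] },
  { type: "assistant", content: [{ type: "thinking" }] },
];
console.log(sampleMessages.map(classifyMessage));
```

Passing only `PRIMARY` messages, with adjacent `CONTEXT` messages for reference, keeps the analyzer focused on the small share of messages that are actual human input.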
448+
449+
### Rule 3: Distinguish AI-Generated from Human-Authored Content
450+
451+
Plans, specs, and proposals submitted by the user were typically **AI-generated in a previous session**. The user approved them, but didn't write them:
452+
453+
```
454+
# This is a user message, but the PLAN CONTENT is AI-generated:
455+
"Implement the following plan:\n\n# Feature X\n\n## Context\n..."
456+
```
457+
458+
The behavioral signal is: "user uses a plan-first workflow" — NOT the plan's content, structure, or technical decisions. Never attribute AI writing style to the user.
459+
460+
### Rule 4: Comprehensive Prompts Beat Ambiguous Ones
461+
462+
For analysis tasks, a detailed prompt with explicit guidance produces far better results than a terse one. Include:
463+
464+
- **What good output looks like** — concrete examples of correct analysis
465+
- **What bad output looks like** — explicit anti-patterns with explanations
466+
- **Category definitions** — with concrete signals to look for in each
467+
- **Evidence requirements** — what counts as evidence, what doesn't
468+
- **Quality gates** — when to return empty results vs forcing observations
469+
470+
A 3000-token prompt that produces 5 high-quality observations is better than a 500-token prompt that produces 20 garbage observations.
471+
472+
---
473+
368474
## Ambiguity Policy
369475

370476
These defaults apply when the user does not specify a preference. State the assumption when applying a default:
371477

372478
- **System prompt:** `--append-system-prompt` over `--system-prompt` to preserve built-in behaviors
373-
- **Output format:** `--output-format json` for scripts; `stream-json` when real-time display is needed
479+
- **Output format:** `--output-format json` for scripts; `--verbose --output-format stream-json` when event-level access is needed (both flags required)
374480
- **Model:** `sonnet` for automation tasks (balanced cost and capability)
375481
- **Permission mode:** `--permission-mode acceptEdits` for trusted pipelines; `plan` for read-only analysis
376482
- **Session persistence:** `--no-session-persistence` in CI/CD unless session continuation is required
@@ -382,5 +488,5 @@ These defaults apply when the user does not specify a preference. State the assu
382488

383489
| File | Contents |
384490
|------|----------|
385-
| `references/cli-flags-and-output.md` | Complete flag reference, `stream-json` event type catalog with TypeScript types, JSON output schema, jq recipes, `--input-format stream-json` chaining, environment variables, `--verbose` and `--debug` flags |
491+
| `references/cli-flags-and-output.md` | Complete flag reference, `stream-json` event type catalog with TypeScript types, JSON output schema, jq recipes, `--input-format stream-json` chaining, environment variables, `--verbose` requirement for `stream-json`, and `--debug` flags |
386492
| `references/sdk-and-mcp.md` | TypeScript and Python SDK full options, `canUseTool` callback, `ClaudeSDKClient` multi-turn, `createSdkMcpServer` custom tools, `--mcp-config`, `--permission-prompt-tool`, `--agents` subagent definitions, session continuation patterns |
