feat: LLM-powered trajectory compaction #17
Conversation
- workflows/llm-compaction.ts: replaces mechanical compaction with LLM intelligence
- .gitignore: exclude .agent-relay/ metadata
Replaces mechanical keyword-based compaction with intelligent LLM summarization.

New compact/ module:
- provider.ts: OpenAI + Anthropic providers (raw fetch, no deps)
- serializer.ts: trajectory → LLM-readable text with token budgeting
- prompts.ts: system + user prompts for compaction
- parser.ts: parse LLM JSON output with fallbacks
- markdown.ts: generate readable .md summaries
- config.ts: env vars or .trajectories/config.json

CLI updates:
- trail compact now uses LLM by default (if API key present)
- --mechanical flag for old behavior
- --focus <areas> for targeted summaries
- --markdown flag (default: true) for .md output
- Dry-run shows prompt + cost estimate

Output includes:
- Narrative summary (what happened, how)
- Key decisions with reasoning and impact
- Extracted conventions/patterns for future work
- Synthesized lessons from challenges
- Open questions / unresolved issues

Backwards compatible: falls back to mechanical if no LLM provider.
```typescript
if (typeof value !== "string") {
  return undefined;
}

const parsed = Number(value);
return Number.isFinite(parsed) ? parsed : undefined;
```
🔴 readNumber treats empty/whitespace-only env vars as 0 instead of unset
readNumber("") returns 0 because Number("") === 0 and Number.isFinite(0) === true. This is inconsistent with the sibling readString function at src/compact/config.ts:96-103, which correctly returns undefined for empty strings. When a user clears an env var (e.g., export TRAJECTORIES_LLM_MAX_INPUT_TOKENS=), readNumberEnv returns 0 instead of undefined. Because 0 ?? DEFAULT evaluates to 0 (nullish coalescing does not trigger for 0), this silently overrides the default.

The most severe downstream effect: maxInputTokens = 0 causes serializeForLLM to produce an empty string (truncateText(document, 0) returns "" at src/compact/serializer.ts:61), so the LLM receives no trajectory data and generates a useless compaction. Similarly, maxOutputTokens = 0 would be sent as max_tokens: 0 to the LLM API, likely causing an API error.
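A minimal sketch of the corrected helper, mirroring readString's empty-string handling as the review suggests (the exact signature in src/compact/config.ts is an assumption; demo values are illustrative):

```typescript
// Corrected readNumber: empty/whitespace-only values count as unset, mirroring
// readString, so `value ?? DEFAULT` falls back to the default as intended.
function readNumber(value: string | undefined): number | undefined {
  if (typeof value !== "string" || value.trim() === "") {
    return undefined; // Number("") === 0 would otherwise leak through
  }
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : undefined;
}

// Old behavior: readNumber("") returned 0, and `0 ?? 200000` stays 0,
// because ?? only falls back for null/undefined.
console.log(readNumber(""));       // undefined
console.log(readNumber("128000")); // 128000
```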
When no API keys (OPENAI_API_KEY / ANTHROPIC_API_KEY) are available, resolveProvider now falls back to locally installed CLI tools (claude, codex) detected via @agent-relay/sdk's resolveCli(). This removes the hard requirement for API keys when users already have a CLI installed. Resolution order: explicit API keys → CLI detection → mechanical fallback. Users can also force CLI with TRAJECTORIES_LLM_PROVIDER=cli. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… workflow
- compact.ts: execSync → execFileSync to prevent shell injection (HIGH)
- compact.ts: restore process.env immediately after FileStorage constructor (HIGH)
- compact.ts: extend shared CompactedTrajectoryMetadata type from parser.ts (MEDIUM)
- compact.ts: only pass jsonMode for OpenAIProvider (MEDIUM)
- provider.ts: warn on non-default base URLs to mitigate SSRF (MEDIUM)
- provider.ts: trim API keys and use || instead of ?? (MEDIUM)
- provider.ts: throw on empty Anthropic conversation instead of fabricating (MEDIUM)
- provider.ts: add AbortController with 300s timeout to fetch calls (MEDIUM)
- provider.ts: update Anthropic API version to 2024-10-22 (MEDIUM)
- provider.ts: clarify Message type relationship with prompts.ts (MEDIUM)
- provider.ts: use stdin pipe via spawnWithStdin to avoid arg length limits (MEDIUM)
- provider.ts: remove explicit env spread from spawn (MEDIUM)
- provider.ts: redact response text from parseJson error messages (LOW)
- workflows/llm-compaction.ts: use process.cwd() instead of hardcoded path (MEDIUM)
- config.ts: document merge precedence in loadFileConfig (MEDIUM)
- package.json: move @agent-relay/sdk to optionalDependencies (MEDIUM)
- tests: add full LLM pipeline test with mocked provider (LOW)

Co-Authored-By: My Senior Dev <dev@myseniordev.com>
Cast the dynamic import result to an explicit type inline so TypeScript doesn't require module declarations for @agent-relay/sdk. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use a top-level static import of resolveCli from @agent-relay/sdk instead of a dynamic import(). The SDK ships type declarations so TypeScript resolves it correctly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bundler (esbuild via tsup) tried to resolve and inline the SDK. Mark it as external so the import is left as-is for runtime resolution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The entire SDK was too heavy just for resolveCli(). Replaced with an inlined findBinary() that does a `which` lookup + fallback to well-known install directories (~/.local/bin, ~/.claude/local, /usr/local/bin, /opt/homebrew/bin). Zero new dependencies. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
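A sketch of what such an inlined lookup might look like; the helper name and the fallback directories come from the commit message, everything else (signature, `which` strategy details) is an assumption:

```typescript
import { execFileSync } from "node:child_process";
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical reimplementation: try `which` first, then well-known install dirs.
function findBinary(name: string): string | undefined {
  try {
    const resolved = execFileSync("which", [name], { encoding: "utf8" }).trim();
    if (resolved) return resolved;
  } catch {
    // `which` exits non-zero when the binary is not on PATH; fall through.
  }
  const fallbackDirs = [
    join(homedir(), ".local", "bin"),
    join(homedir(), ".claude", "local"),
    "/usr/local/bin",
    "/opt/homebrew/bin",
  ];
  for (const dir of fallbackDirs) {
    const candidate = join(dir, name);
    if (existsSync(candidate)) return candidate;
  }
  return undefined;
}
```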
…low)
Compaction stays in one place — the local trail CLI. The SDK only tags
trajectories and shells out. No API key is ever required: the CLI
provider (claude, codex, gemini, or opencode — whichever is installed
and authenticated) is the default, with API providers only used on
explicit opt-in via TRAJECTORIES_LLM_PROVIDER=openai|anthropic.
Core changes:
- Trajectory gains an optional workflowId field
- TrajectoryClient.start() stamps workflowId from TRAJECTORIES_WORKFLOW_ID
env var, or an explicit option (explicit takes precedence)
- New SDK helper compactWorkflow() spawns trail compact --workflow <id>
- trail compact --workflow <id> filter selects trajectories by run
- Output: .trajectories/compacted/workflow-<id>.{json,md}
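The compactWorkflow() helper described above might be shaped roughly like this (a sketch only; the real SDK signature, option names, and error handling may differ):

```typescript
import { spawnSync } from "node:child_process";

// Build the trail CLI invocation; markdown output is on by default per the PR.
function buildCompactArgs(workflowId: string, opts: { markdown?: boolean } = {}): string[] {
  const args = ["compact", "--workflow", workflowId];
  if (opts.markdown !== false) args.push("--markdown");
  return args;
}

// Hypothetical shape of the SDK helper: shell out to the local trail CLI and
// surface failures as exceptions.
function compactWorkflow(workflowId: string, opts: { markdown?: boolean } = {}): void {
  const result = spawnSync("trail", buildCompactArgs(workflowId, opts), { stdio: "inherit" });
  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(`trail compact exited with code ${result.status}`);
  }
}
```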
Provider defaults (src/compact/provider.ts):
- resolveProvider() prefers the CLI provider in auto mode, so no
OPENAI/ANTHROPIC key is needed when any supported CLI is on PATH
- SUPPORTED_CLIS expanded to claude, codex, gemini, opencode
- buildCliArgs() has one-shot invocations for each
- TRAJECTORIES_LLM_CLI env var pins which CLI to use when multiple
are installed
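The resolution order above can be sketched as follows. This is a simplification: the real resolveProvider() returns provider instances rather than kind strings, and the API-key fallback when no CLI is installed is an assumption on my part:

```typescript
type ProviderKind = "openai" | "anthropic" | "cli" | "mechanical";

const SUPPORTED_CLIS = ["claude", "codex", "gemini", "opencode"];

// Auto-mode resolution: an explicit TRAJECTORIES_LLM_PROVIDER always wins;
// otherwise any supported installed CLI is preferred even when API keys are
// present; mechanical compaction is the last resort.
function resolveProviderKind(
  env: Record<string, string | undefined>,
  installedClis: string[],
): ProviderKind {
  const forced = env.TRAJECTORIES_LLM_PROVIDER;
  if (forced === "openai" || forced === "anthropic" || forced === "cli") {
    return forced;
  }
  if (installedClis.some((cli) => SUPPORTED_CLIS.includes(cli))) {
    return "cli";
  }
  if (env.OPENAI_API_KEY) return "openai";       // assumption: keys still used
  if (env.ANTHROPIC_API_KEY) return "anthropic"; // when no CLI is installed
  return "mechanical";
}
```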
Schema hardening (src/core/schema.ts):
- TrajectoryEventTypeSchema is now a permissive union: trajectories
emitted by other tools (notably agent-relay's completion-evidence
and completion-marker event types) parse successfully instead of
being entirely rejected at load time
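The permissive-union idea can be illustrated with a hand-rolled validator (purely illustrative; the real schema lives in src/core/schema.ts and may use a schema library, and the "known" type names below are made up):

```typescript
// Illustrative set of this tool's own event types; real names will differ.
const KNOWN_EVENT_TYPES = new Set(["chapter", "decision", "note"]);

// Permissive union: unknown but well-formed types (e.g. agent-relay's
// "completion-evidence" or "completion-marker") pass through with known=false
// instead of causing the entire trajectory to be rejected at load time.
function parseEventType(value: unknown): { type: string; known: boolean } {
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`invalid event type: ${String(value)}`);
  }
  return { type: value, known: KNOWN_EVENT_TYPES.has(value) };
}
```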
Harness:
- scripts/benchmark-compaction.ts — reproducible noisy fixture that
records a multi-chapter trajectory with dozens of low-significance
noise events; caller cd's into an isolated dir for sandboxing
- tests/sdk/workflow-compact.test.ts — 5 vitest cases covering env
tagging, CLI filter, compactWorkflow() helper, and schema leniency
- workflows/sdk-workflow-autocompact.ts — 80 -> 100 relay workflow
that runs the full before/after validation, self-review, peer
review, and commit + push pipeline
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous test asserted the old behavior: API providers before CLI when API keys are present. That behavior was intentionally reversed in commit 21de705 so users never need to set an API key — auto mode now prefers local CLIs (claude / codex / gemini / opencode) even when an API key is in the environment.

- Rename + rewrite the "API before CLI" test to lock in CLI preference
- Add a new test covering the explicit opt-in path: setting TRAJECTORIES_LLM_PROVIDER=openai still routes to the OpenAI provider even when a CLI is installed

213/213 vitest suite passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…late

Closes the last gap in workflow-id plumbing: the CLI path (trail start) now reads TRAJECTORIES_WORKFLOW_ID from the environment (and accepts an explicit --workflow flag) and stamps workflowId on the new trajectory. Previously only the SDK path (TrajectoryClient.start) honored it, so agents invoking trail directly would produce untagged trajectories even when running under a relay workflow.

- src/cli/commands/start.ts: read TRAJECTORIES_WORKFLOW_ID, accept -w/--workflow <id>, spread onto the created trajectory before save
- tests/sdk/workflow-compact.test.ts: two new cases exercising the real CLI via spawnSync (env var + explicit flag)
- workflows/compact-on-workflow-run.ts: new template workflow showing the canonical auto-compact pattern — assign one workflow id at the top of the file (child processes inherit it via env), let agents record trajectories normally, and have a final deterministic step run trail compact --workflow <id> --markdown to collate them into a single tight artifact. Copy-pasteable for any workflow that wants auto-compact.

215/215 vitest suite passing. agent-relay dry-run clean.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
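The flag-over-env precedence can be sketched as follows (the helper name and minimal Trajectory shape are hypothetical; the real code lives in src/cli/commands/start.ts):

```typescript
// Minimal illustrative trajectory shape.
interface Trajectory {
  id: string;
  workflowId?: string;
}

// Workflow-id stamping for `trail start`: an explicit --workflow flag takes
// precedence over the TRAJECTORIES_WORKFLOW_ID env var; with neither set,
// the trajectory stays untagged.
function stampWorkflowId(
  trajectory: Trajectory,
  env: Record<string, string | undefined>,
  flagValue?: string,
): Trajectory {
  const workflowId = flagValue ?? env.TRAJECTORIES_WORKFLOW_ID;
  return workflowId ? { ...trajectory, workflowId } : trajectory;
}
```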
…rkflow on complete
When TrajectoryClient is constructed with autoCompact: true (or an
options object with mechanical/markdown overrides) and the trajectory
has a workflowId stamped, session.complete() and session.done() will
automatically shell out to trail compact --workflow <id> after saving
the raw trajectory. The compacted artifact appears at
.trajectories/compacted/workflow-<id>.{json,md}.
This removes the need for a separate compact step in any SDK consumer
running under a relay workflow — just set TRAJECTORIES_WORKFLOW_ID in
the environment and construct the client with autoCompact: true, and
complete() produces the tight artifact as a side effect.
- autoCompact is opt-in: default behavior unchanged
- Compaction failures are logged but do NOT fail complete() — the raw
trajectory is always saved first
- Backed by a BEFORE/AFTER validation workflow under workflows/
- Tests cover all four permutations plus graceful failure
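The save-first, never-fail semantics described above can be distilled into a sketch (a hypothetical standalone function, not the SDK's actual method):

```typescript
// complete()/done() semantics with autoCompact: save the raw trajectory first,
// then attempt compaction; a compaction failure is logged, never rethrown.
async function completeWithAutoCompact(
  saveRaw: () => Promise<void>,
  compact: () => Promise<void>,
  autoCompact: boolean,
  log: (msg: string) => void,
): Promise<void> {
  await saveRaw(); // the raw trajectory is always persisted
  if (!autoCompact) return;
  try {
    await compact(); // the real SDK shells out to `trail compact --workflow <id>`
  } catch (err) {
    log(`auto-compact failed, raw trajectory kept: ${String(err)}`);
  }
}
```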
sdk-autocompact-option — validated end-to-end. BEFORE/AFTER gate PASSED. The feature is a genuine behavior change, not a no-op.
Three fixes surfaced by PR #17 review:

1. Lint (biome lint/style/useTemplate): scripts/autocompact-probe.mts:42 — use a template literal instead of string concatenation. Local biome ran clean because the pre-commit hook linted a different code path than the newly added file; CI caught it.

2. CI test failures (Node 20 + Node 22): Two vitest cases spawn the built trail CLI via child_process and require dist/cli/index.js on disk. The CI test job was running `npm run test:run` without building first, so those tests threw "dist CLI missing". Added a `Build` step to the test job in .github/workflows/ci.yml before `Run tests`. The pre-existing `compactWorkflow SDK helper` test on line 260 was already affected, but silently — this also fixes it.

3. Devin review (src/compact/config.ts:111-122): readNumber(""): Number("") === 0, and 0 ?? DEFAULT evaluates to 0 because nullish coalescing doesn't fall back for 0. That meant clearing TRAJECTORIES_LLM_MAX_INPUT_TOKENS silently set it to 0, truncating serialized trajectories to an empty string and sending max_tokens: 0 to API providers. Now mirrors readString: empty or whitespace-only strings return undefined so defaults apply.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```typescript
child.stdin.write(input);
child.stdin.end();
```
🔴 Missing error handler on child.stdin in spawnWithStdin causes unhandled EPIPE crash
In spawnWithStdin (src/compact/provider.ts:313-346), stdin data is written to the child process at lines 343-344, but there is no error handler on child.stdin. If the spawned CLI process exits before consuming all stdin data (e.g., it immediately fails due to misconfigured args, missing auth, or a crash during startup), Node.js emits an EPIPE error on the stdin writable stream. Without a child.stdin.on('error', ...) handler, this becomes an unhandled error event that crashes the Node.js process with ERR_UNHANDLED_ERROR: Unhandled error. (Error: write EPIPE).

The child.on('error', reject) handler at line 332 only catches spawn-level errors (e.g., ENOENT), not errors on individual stdio streams. The child.on('close') handler would correctly reject the promise for the non-zero exit code, but the unhandled stdin error event fires independently and crashes the process before the promise rejection can propagate.
```diff
- child.stdin.write(input);
- child.stdin.end();
+ child.stdin.on("error", () => {}); // Swallow EPIPE — the close handler covers non-zero exits
+ child.stdin.write(input);
+ child.stdin.end();
```
Replaces mechanical keyword-based compaction with intelligent LLM summarization.

New src/compact/ module:
- provider.ts: OpenAI + Anthropic providers (raw fetch, no deps)
- serializer.ts: trajectory → LLM-readable text with token budgeting
- prompts.ts: system + user prompts for compaction
- parser.ts: parse LLM JSON output with fallbacks
- markdown.ts: generate readable .md summaries
- config.ts: env vars or .trajectories/config.json

CLI updates:
- trail compact uses LLM by default (if API key present)
- --mechanical flag for old behavior
- --focus <areas> for targeted summaries
- --markdown flag for .md output alongside JSON
- --dry-run shows prompt + cost estimate

Output includes:
- Narrative summary (what happened, how)
- Key decisions with reasoning and impact
- Extracted conventions/patterns for future work
- Synthesized lessons from challenges
- Open questions / unresolved issues

Backwards compatible: falls back to mechanical compaction if no LLM provider configured.
No new npm dependencies.