
feat: LLM-powered trajectory compaction#17

Merged
khaliqgant merged 14 commits into main from feat/llm-compaction on Apr 12, 2026

Conversation

@khaliqgant
Member

@khaliqgant khaliqgant commented Mar 28, 2026

Replaces mechanical keyword-based compaction with intelligent LLM summarization.

New src/compact/ module

  • provider.ts — OpenAI + Anthropic via raw fetch (no new deps), auto-detect from env
  • serializer.ts — Trajectories → LLM-readable text with token budgeting
  • prompts.ts — System + user prompts for compaction
  • parser.ts — Parse LLM JSON with fallbacks (JSON → code blocks → regex)
  • markdown.ts — Generate readable .md summaries
  • config.ts — Env vars or .trajectories/config.json
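
The parser's fallback chain (JSON → code blocks → regex) can be sketched roughly as below. This is a hypothetical reconstruction of the approach the bullet describes, not the actual parser.ts:

```typescript
// Illustrative sketch of the JSON → code-block → regex fallback chain.
// Function name and regexes are assumptions, not the committed code.
function parseLlmJson(raw: string): unknown {
  // 1. Try the whole response as JSON.
  try {
    return JSON.parse(raw);
  } catch {}

  // 2. Try a fenced ```json ... ``` code block.
  const fence = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
  if (fence) {
    try {
      return JSON.parse(fence[1]);
    } catch {}
  }

  // 3. Last resort: grab the first {...} span with a greedy regex.
  const braces = raw.match(/\{[\s\S]*\}/);
  if (braces) {
    try {
      return JSON.parse(braces[0]);
    } catch {}
  }
  return undefined;
}
```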

CLI updates

  • trail compact uses LLM by default (if API key present)
  • --mechanical flag for old behavior
  • --focus <areas> for targeted summaries
  • --markdown flag for .md output alongside JSON
  • --dry-run shows prompt + cost estimate

Output includes

  • Narrative summary (what happened, how, why)
  • Key decisions with reasoning and impact analysis
  • Extracted conventions/patterns for future work
  • Synthesized lessons from challenges
  • Open questions / unresolved issues

Backwards compatible: falls back to mechanical compaction if no LLM provider configured.

No new npm dependencies.



- workflows/llm-compaction.ts: replaces mechanical compaction with LLM intelligence
- .gitignore: exclude .agent-relay/ metadata

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 2 potential issues.

View 5 additional findings in Devin Review.


Comment on lines +110 to +115
if (typeof value !== "string") {
return undefined;
}

const parsed = Number(value);
return Number.isFinite(parsed) ? parsed : undefined;

@devin-ai-integration devin-ai-integration bot Mar 28, 2026


🔴 readNumber treats empty/whitespace-only env vars as 0 instead of unset

readNumber("") returns 0 because Number("") === 0 and Number.isFinite(0) === true. This is inconsistent with the sibling readString function at src/compact/config.ts:96-103 which correctly returns undefined for empty strings. When a user clears an env var (e.g., export TRAJECTORIES_LLM_MAX_INPUT_TOKENS=), readNumberEnv returns 0 instead of undefined. Because 0 ?? DEFAULT evaluates to 0 (nullish coalescing does not trigger for 0), this silently overrides the default. The most severe downstream effect: maxInputTokens = 0 causes serializeForLLM to produce an empty string (truncateText(document, 0) returns "" at src/compact/serializer.ts:61), so the LLM receives no trajectory data and generates a useless compaction. Similarly, maxOutputTokens = 0 would be sent as max_tokens: 0 to the LLM API, likely causing an API error.
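
The fix the comment implies can be sketched as follows, mirroring readString's empty-string handling. This is a reconstruction, not the literal config.ts code:

```typescript
// Sketch of the fix: treat empty/whitespace-only values as unset so
// `?? DEFAULT` fallbacks still apply. Not the actual config.ts.
function readNumber(value: unknown): number | undefined {
  if (typeof value !== "string" || value.trim() === "") {
    return undefined; // "" would otherwise coerce to 0 via Number("")
  }
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : undefined;
}
```

With this shape, a cleared env var yields `undefined`, so `readNumber(env.TRAJECTORIES_LLM_MAX_INPUT_TOKENS) ?? DEFAULT` falls back to the default instead of silently becoming 0.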


khaliqgant and others added 11 commits March 28, 2026 11:26
When no API keys (OPENAI_API_KEY / ANTHROPIC_API_KEY) are available,
resolveProvider now falls back to locally installed CLI tools (claude,
codex) detected via @agent-relay/sdk's resolveCli(). This removes the
hard requirement for API keys when users already have a CLI installed.

Resolution order: explicit API keys → CLI detection → mechanical fallback.
Users can also force CLI with TRAJECTORIES_LLM_PROVIDER=cli.
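
The resolution order described above might look roughly like this. Names and signature are illustrative only, and note that a later commit in this PR reverses the preference so CLIs win even when API keys are present:

```typescript
// Hypothetical sketch of the resolution order at this commit:
// explicit API keys → CLI detection → mechanical fallback.
type Provider = "openai" | "anthropic" | "cli" | "mechanical";

function resolveProvider(
  env: Record<string, string | undefined>,
  cliAvailable: boolean,
): Provider {
  if (env.TRAJECTORIES_LLM_PROVIDER === "cli") return "cli"; // forced
  if (env.OPENAI_API_KEY) return "openai";
  if (env.ANTHROPIC_API_KEY) return "anthropic";
  if (cliAvailable) return "cli"; // locally installed claude/codex
  return "mechanical"; // no LLM available at all
}
```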

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… workflow

compact.ts: execSync → execFileSync to prevent shell injection (HIGH)
compact.ts: restore process.env immediately after FileStorage constructor (HIGH)
compact.ts: extend shared CompactedTrajectoryMetadata type from parser.ts (MEDIUM)
compact.ts: only pass jsonMode for OpenAIProvider (MEDIUM)
provider.ts: warn on non-default base URLs to mitigate SSRF (MEDIUM)
provider.ts: trim API keys and use || instead of ?? (MEDIUM)
provider.ts: throw on empty Anthropic conversation instead of fabricating (MEDIUM)
provider.ts: add AbortController with 300s timeout to fetch calls (MEDIUM)
provider.ts: update Anthropic API version to 2024-10-22 (MEDIUM)
provider.ts: clarify Message type relationship with prompts.ts (MEDIUM)
provider.ts: use stdin pipe via spawnWithStdin to avoid arg length limits (MEDIUM)
provider.ts: remove explicit env spread from spawn (MEDIUM)
provider.ts: redact response text from parseJson error messages (LOW)
workflows/llm-compaction.ts: use process.cwd() instead of hardcoded path (MEDIUM)
config.ts: document merge precedence in loadFileConfig (MEDIUM)
package.json: move @agent-relay/sdk to optionalDependencies (MEDIUM)
tests: add full LLM pipeline test with mocked provider (LOW)

Co-Authored-By: My Senior Dev <dev@myseniordev.com>
Cast the dynamic import result to an explicit type inline so TypeScript
doesn't require module declarations for @agent-relay/sdk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use a top-level static import of resolveCli from @agent-relay/sdk
instead of a dynamic import(). The SDK ships type declarations so
TypeScript resolves it correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bundler (esbuild via tsup) tried to resolve and inline the SDK.
Mark it as external so the import is left as-is for runtime resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The entire SDK was too heavy just for resolveCli(). Replaced with an
inlined findBinary() that does a `which` lookup + fallback to well-known
install directories (~/.local/bin, ~/.claude/local, /usr/local/bin,
/opt/homebrew/bin). Zero new dependencies.
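
A rough sketch of such an inlined findBinary(), assuming the `which` lookup plus the fallback directories listed above; this is not the committed code:

```typescript
// Hypothetical findBinary(): `which` lookup, then well-known install
// directories. Zero dependencies beyond the Node standard library.
import { execFileSync } from "node:child_process";
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

function findBinary(name: string): string | undefined {
  try {
    const found = execFileSync("which", [name], { encoding: "utf8" }).trim();
    if (found) return found;
  } catch {} // `which` exits non-zero when the binary is not on PATH

  const fallbackDirs = [
    join(homedir(), ".local", "bin"),
    join(homedir(), ".claude", "local"),
    "/usr/local/bin",
    "/opt/homebrew/bin",
  ];
  for (const dir of fallbackDirs) {
    const candidate = join(dir, name);
    if (existsSync(candidate)) return candidate;
  }
  return undefined;
}
```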

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…low)

Compaction stays in one place — the local trail CLI. The SDK only tags
trajectories and shells out. No API key is ever required: the CLI
provider (claude, codex, gemini, or opencode — whichever is installed
and authenticated) is the default, with API providers only used on
explicit opt-in via TRAJECTORIES_LLM_PROVIDER=openai|anthropic.

Core changes:
- Trajectory gains an optional workflowId field
- TrajectoryClient.start() stamps workflowId from TRAJECTORIES_WORKFLOW_ID
  env var, or an explicit option (explicit takes precedence)
- New SDK helper compactWorkflow() spawns trail compact --workflow <id>
- trail compact --workflow <id> filter selects trajectories by run
- Output: .trajectories/compacted/workflow-<id>.{json,md}
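
The compactWorkflow() helper described above amounts to shelling out to the local trail CLI with the workflow filter; a minimal sketch (argument building and exit handling are assumptions):

```typescript
// Illustrative sketch: the SDK only tags trajectories and shells out;
// compaction itself stays in the local trail CLI.
import { spawnSync } from "node:child_process";

function buildCompactArgs(workflowId: string, markdown = true): string[] {
  const args = ["compact", "--workflow", workflowId];
  if (markdown) args.push("--markdown");
  return args;
}

function compactWorkflow(workflowId: string): number {
  // Produces .trajectories/compacted/workflow-<id>.{json,md}
  const result = spawnSync("trail", buildCompactArgs(workflowId), {
    stdio: "inherit",
  });
  return result.status ?? 1;
}
```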

Provider defaults (src/compact/provider.ts):
- resolveProvider() prefers the CLI provider in auto mode, so no
  OPENAI/ANTHROPIC key is needed when any supported CLI is on PATH
- SUPPORTED_CLIS expanded to claude, codex, gemini, opencode
- buildCliArgs() has one-shot invocations for each
- TRAJECTORIES_LLM_CLI env var pins which CLI to use when multiple
  are installed

Schema hardening (src/core/schema.ts):
- TrajectoryEventTypeSchema is now a permissive union: trajectories
  emitted by other tools (notably agent-relay's completion-evidence
  and completion-marker event types) parse successfully instead of
  being entirely rejected at load time

Harness:
- scripts/benchmark-compaction.ts — reproducible noisy fixture that
  records a multi-chapter trajectory with dozens of low-significance
  noise events; caller cd's into an isolated dir for sandboxing
- tests/sdk/workflow-compact.test.ts — 5 vitest cases covering env
  tagging, CLI filter, compactWorkflow() helper, and schema leniency
- workflows/sdk-workflow-autocompact.ts — 80 -> 100 relay workflow
  that runs the full before/after validation, self-review, peer
  review, and commit + push pipeline

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous test asserted the old behavior: API providers before CLI
when API keys are present. That behavior was intentionally reversed in
commit 21de705 so users never need to set an API key — auto mode now
prefers local CLIs (claude / codex / gemini / opencode) even when an
API key is in the environment.

- Rename + rewrite the "API before CLI" test to lock in CLI preference
- Add a new test covering the explicit opt-in path: setting
  TRAJECTORIES_LLM_PROVIDER=openai still routes to the OpenAI provider
  even when a CLI is installed

213/213 vitest suite passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…late

Closes the last gap in workflow-id plumbing: the CLI path (trail start)
now reads TRAJECTORIES_WORKFLOW_ID from the environment (and accepts an
explicit --workflow flag) and stamps workflowId on the new trajectory.
Previously only the SDK path (TrajectoryClient.start) honored it, so
agents invoking trail directly would produce untagged trajectories even
when running under a relay workflow.

- src/cli/commands/start.ts: read TRAJECTORIES_WORKFLOW_ID, accept
  -w/--workflow <id>, spread onto the created trajectory before save
- tests/sdk/workflow-compact.test.ts: two new cases exercising the real
  CLI via spawnSync (env var + explicit flag)

workflows/compact-on-workflow-run.ts: new template workflow showing
the canonical auto-compact pattern — assign one workflow id at the top
of the file (child processes inherit it via env), let agents record
trajectories normally, final deterministic step runs trail compact
--workflow <id> --markdown to collate them into a single tight
artifact. Copy-pasteable for any workflow that wants auto-compact.
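
The canonical shape of that pattern is only a few lines of workflow code; this is a hedged sketch, not the committed template:

```typescript
// Sketch of the auto-compact pattern: one workflow id assigned at the
// top of the file, inherited by child processes via the environment,
// collated by a final deterministic compact step. Illustrative only.
import { spawnSync } from "node:child_process";
import { randomUUID } from "node:crypto";

const workflowId = randomUUID();
process.env.TRAJECTORIES_WORKFLOW_ID = workflowId; // children inherit this

// ... agents run here and record trajectories normally ...

// Final step: collate every trajectory tagged with this run's id.
spawnSync("trail", ["compact", "--workflow", workflowId, "--markdown"], {
  stdio: "inherit",
});
```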

215/215 vitest suite passing. agent-relay dry-run clean.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rkflow on complete

When TrajectoryClient is constructed with autoCompact: true (or an
options object with mechanical/markdown overrides) and the trajectory
has a workflowId stamped, session.complete() and session.done() will
automatically shell out to trail compact --workflow <id> after saving
the raw trajectory. The compacted artifact appears at
.trajectories/compacted/workflow-<id>.{json,md}.

This removes the need for a separate compact step in any SDK consumer
running under a relay workflow — just set TRAJECTORIES_WORKFLOW_ID in
the environment and construct the client with autoCompact: true, and
complete() produces the tight artifact as a side effect.

- autoCompact is opt-in: default behavior unchanged
- Compaction failures are logged but do NOT fail complete() — the raw
  trajectory is always saved first
- Backed by a BEFORE/AFTER validation workflow under workflows/
- Tests cover all four permutations plus graceful failure
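
The failure-isolation guarantee (raw save first, compaction errors swallowed) can be illustrated with a stand-in class; TrajectorySession here is hypothetical, not the real SDK type:

```typescript
// Stand-in illustrating the described behavior: complete() saves the
// raw trajectory first, then attempts compaction only when opted in,
// and never lets a compaction failure propagate.
class TrajectorySession {
  saved = false;

  constructor(
    private opts: { autoCompact?: boolean },
    private compact: () => void,
  ) {}

  complete(): void {
    this.saved = true; // the raw trajectory is always saved first
    if (!this.opts.autoCompact) return; // opt-in: default unchanged
    try {
      this.compact();
    } catch (err) {
      // logged but does NOT fail complete()
      console.warn("auto-compact failed:", err);
    }
  }
}
```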
@khaliqgant
Member Author

sdk-autocompact-option — validated end-to-end

BEFORE/AFTER gate PASSED. The feature is a genuine behavior change, not a no-op:

  • BEFORE: new TrajectoryClient() + session.done() produces NO compacted file (baseline locked).
  • AFTER: new TrajectoryClient({ autoCompact: { mechanical: true } }) + session.done() with TRAJECTORIES_WORKFLOW_ID set automatically produces .trajectories/compacted/workflow-<id>.{json,md}.

Ran via agent-relay run workflows/sdk-autocompact-option.ts with codex impl + claude tests + claude peer review + codex self-review.

Three fixes surfaced by PR #17 review:

1. Lint (biome lint/style/useTemplate):
   scripts/autocompact-probe.mts:42 — use a template literal instead of
   string concatenation. Local biome ran clean because the file was
   added on a code path the pre-commit hook didn't cover; CI caught it.

2. CI test failures (Node 20 + Node 22):
   Two vitest cases spawn the built trail CLI via child_process and
   require dist/cli/index.js on disk. The CI test job was running
   `npm run test:run` without building first, so those tests threw
   "dist CLI missing". Added a `Build` step to the test job in
   .github/workflows/ci.yml before `Run tests`. The pre-existing
   `compactWorkflow SDK helper` test on line 260 was already affected
   but silently — this also fixes it.

3. Devin review (src/compact/config.ts:111-122):
   readNumber(""): Number("") === 0, and 0 ?? DEFAULT evaluates to 0
   because nullish coalescing doesn't fall back for 0. That meant
   clearing TRAJECTORIES_LLM_MAX_INPUT_TOKENS silently set it to 0,
   truncating serialized trajectories to an empty string and sending
   max_tokens: 0 to API providers. Now mirrors readString: empty or
   whitespace-only strings return undefined so defaults apply.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@khaliqgant khaliqgant merged commit 4f3dba9 into main Apr 12, 2026
6 checks passed
@khaliqgant khaliqgant deleted the feat/llm-compaction branch April 12, 2026 20:13

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 potential issue.

View 7 additional findings in Devin Review.


Comment on lines +343 to +344
child.stdin.write(input);
child.stdin.end();

🔴 Missing error handler on child.stdin in spawnWithStdin causes unhandled EPIPE crash

In spawnWithStdin (src/compact/provider.ts:313-346), stdin data is written to the child process at lines 343-344 but there is no error handler on child.stdin. If the spawned CLI process exits before consuming all stdin data (e.g., immediately fails due to misconfigured args, missing auth, or a crash during startup), Node.js emits an EPIPE error on the stdin writable stream. Without a child.stdin.on('error', ...) handler, this becomes an unhandled error event that crashes the Node.js process with ERR_UNHANDLED_ERROR: Unhandled error. (Error: write EPIPE). The child.on('error', reject) handler at line 332 only catches spawn-level errors (e.g., ENOENT), not errors on individual stdio streams. The child.on('close') handler would correctly reject the promise for the non-zero exit code, but the unhandled stdin error event fires independently and crashes the process before the promise rejection can propagate.

Suggested change:

child.stdin.on("error", () => {}); // Swallow EPIPE — the close handler covers non-zero exits
child.stdin.write(input);
child.stdin.end();
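
A fuller, self-contained sketch of spawnWithStdin with the suggested handler applied; the signature and error messages are illustrative, not the real provider.ts:

```typescript
// Sketch: pipe input over stdin (avoiding arg-length limits) while
// guarding against EPIPE if the child exits before draining stdin.
import { spawn } from "node:child_process";

function spawnWithStdin(
  command: string,
  args: string[],
  input: string,
): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let stdout = "";
    child.stdout.on("data", (chunk) => (stdout += chunk));
    child.on("error", reject); // spawn-level errors only (e.g. ENOENT)
    child.on("close", (code) =>
      code === 0 ? resolve(stdout) : reject(new Error(`exit code ${code}`)),
    );
    // Swallow EPIPE if the child exits early; the close handler above
    // still rejects the promise on the non-zero exit code.
    child.stdin.on("error", () => {});
    child.stdin.write(input);
    child.stdin.end();
  });
}
```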
