diff --git a/.agentsroom/.gitignore b/.agentsroom/.gitignore new file mode 100644 index 000000000..1acd1a387 --- /dev/null +++ b/.agentsroom/.gitignore @@ -0,0 +1,4 @@ +# AgentsRoom: personal files (not committed to git) +*-personal.json +agents-local.json +sessions/ diff --git a/.agentsroom/agents.json b/.agentsroom/agents.json new file mode 100644 index 000000000..e9a83e418 --- /dev/null +++ b/.agentsroom/agents.json @@ -0,0 +1,10 @@ +[ + { + "role": "fullstack", + "model": "opus", + "customName": "Full-Stack Developer", + "isPersonal": false, + "id": "agent-1776361243376-3sekdc", + "claudeSessionId": "96773a93-be2a-45a9-a732-ceb224d3d0e5" + } +] \ No newline at end of file diff --git a/.agentsroom/prompts.json b/.agentsroom/prompts.json new file mode 100644 index 000000000..f4455d843 --- /dev/null +++ b/.agentsroom/prompts.json @@ -0,0 +1,4 @@ +{ + "folders": [], + "prompts": [] +} \ No newline at end of file diff --git a/.changeset/ai-claude-code-initial.md b/.changeset/ai-claude-code-initial.md new file mode 100644 index 000000000..4cca73f0f --- /dev/null +++ b/.changeset/ai-claude-code-initial.md @@ -0,0 +1,5 @@ +--- +'@tanstack/ai-claude-code': minor +--- + +New `@tanstack/ai-claude-code` package: a Claude Code harness adapter that runs `@anthropic-ai/claude-agent-sdk` as a TanStack AI chat backend. Claude Code owns the agent loop and executes its built-in tools (bash, file edits, search) server-side; their activity streams back as resolved tool-call events. TanStack `toolDefinition()` server tools are bridged into the harness via an in-process MCP server, sessions are resumable via `modelOptions.sessionId` (surfaced through a `claude-code.session-id` custom event), and structured output uses the harness's native JSON-schema output format. 
diff --git a/.changeset/ai-client-noop-bridge-mount.md b/.changeset/ai-client-noop-bridge-mount.md new file mode 100644 index 000000000..46f22450d --- /dev/null +++ b/.changeset/ai-client-noop-bridge-mount.md @@ -0,0 +1,7 @@ +--- +'@tanstack/ai-client': patch +--- + +Fix `NoOpChatDevtoolsBridge` missing `mountWithTools`, `notifyToolsChanged`, and `recordStreamId` — the first call to `ChatClient.sendMessage` (with the default no-op devtools factory) threw `this.devtoolsBridge.mountWithTools is not a function` and silently rejected. `mountDevtools()` sets `devtoolsMounted = true` _before_ invoking `mountWithTools`, so the failure was non-obvious: the first send died inside the bridge call, while every subsequent send short-circuited past the broken line and worked normally. + +Also fix the structural-parity check that was supposed to prevent this drift. `const x: Missing = undefined as never` always typechecks (because `never` is assignable to anything), so the original check was a no-op. Replaced with `type _AssertBridgeParity = T`, which now fails the build the next time the real bridge grows a public method the no-op doesn't stub. diff --git a/.changeset/ai-codex-initial.md b/.changeset/ai-codex-initial.md new file mode 100644 index 000000000..4034a88b4 --- /dev/null +++ b/.changeset/ai-codex-initial.md @@ -0,0 +1,5 @@ +--- +'@tanstack/ai-codex': minor +--- + +New `@tanstack/ai-codex` package: a Codex harness adapter that runs `@openai/codex-sdk` as a TanStack AI chat backend. Codex owns the agent loop and executes its built-in tools (shell commands, file changes, web search, todo lists) server-side inside its sandbox; their activity streams back as resolved tool-call events. TanStack `toolDefinition()` server tools are bridged into the harness via a localhost Streamable-HTTP MCP server, threads are resumable via `modelOptions.sessionId` (surfaced through a `codex.session-id` custom event), and structured output uses the harness's native `outputSchema` support. 
Note: the Codex SDK reports assistant text only as completed messages — tool activity streams live, text arrives message-at-a-time. diff --git a/.changeset/ai-gemini-cli-initial.md b/.changeset/ai-gemini-cli-initial.md new file mode 100644 index 000000000..e20180e86 --- /dev/null +++ b/.changeset/ai-gemini-cli-initial.md @@ -0,0 +1,5 @@ +--- +'@tanstack/ai-gemini-cli': minor +--- + +New `@tanstack/ai-gemini-cli` package: a Gemini CLI harness adapter that drives `gemini --acp` (Agent Client Protocol) as a TanStack AI chat backend. Gemini CLI owns the agent loop and executes its built-in tools (shell, file edits, search) server-side; assistant text and thinking stream as true token-level deltas, and tool activity streams back as resolved tool-call events. TanStack `toolDefinition()` server tools are bridged into the harness via a localhost Streamable-HTTP MCP server, sessions are resumable via `modelOptions.sessionId` (surfaced through a `gemini-cli.session-id` custom event, with graceful fallback to transcript replay when the CLI can't load the session), and ACP permission requests are answered by a configurable never-hanging policy (`default` / `acceptEdits` / `bypassPermissions` or a custom handler). For headless hosts, the auth method is selectable up front via `authMethodId` (e.g. `'oauth-personal'`, `'gemini-api-key'`) — the adapter performs the ACP `authenticate` handshake before opening the session so a run never stalls on an interactive auth picker. Requires the `gemini` CLI to be installed and authenticated on the host. diff --git a/.changeset/ai-opencode-initial.md b/.changeset/ai-opencode-initial.md new file mode 100644 index 000000000..66cd96e33 --- /dev/null +++ b/.changeset/ai-opencode-initial.md @@ -0,0 +1,5 @@ +--- +'@tanstack/ai-opencode': minor +--- + +New `@tanstack/ai-opencode` package: an OpenCode harness adapter that drives [OpenCode](https://opencode.ai) (via `@opencode-ai/sdk`) as a TanStack AI chat backend. 
OpenCode owns the agent loop and executes its built-in tools (shell, file edits, search) locally; assistant text and thinking stream as token-level deltas, and tool activity streams back as resolved tool-call events. TanStack `toolDefinition()` server tools are bridged into the harness via a localhost MCP server, sessions are stateful and resumable, and OpenCode permission requests are answered by a configurable `permissionMode` (`default` / `acceptEdits` / `bypassPermissions` or a custom handler). Server-only (Node); requires the `opencode` CLI to be installed and authenticated on the host. diff --git a/.gitignore b/.gitignore index b261f62d1..5ea4db3e8 100644 --- a/.gitignore +++ b/.gitignore @@ -79,3 +79,7 @@ solo.yml .agent/gap-analysis/ .agent/triage/ .agent/research/ + +/OpenCode.md +.agentsroom/ +.opencode/ diff --git a/docs/adapters/claude-code.md b/docs/adapters/claude-code.md new file mode 100644 index 000000000..3f0f6dfbc --- /dev/null +++ b/docs/adapters/claude-code.md @@ -0,0 +1,181 @@ +--- +title: Claude Code +id: claude-code-adapter +order: 11 +description: "Use Claude Code as a chat backend in TanStack AI — agent harness with local tool execution, stateful coding sessions, and tool bridging via @tanstack/ai-claude-code." +keywords: + - tanstack ai + - claude code + - claude agent sdk + - anthropic + - harness + - agent + - coding agent + - adapter +--- + +The Claude Code adapter runs [Claude Code](https://docs.anthropic.com/en/docs/claude-code) (via the `@anthropic-ai/claude-agent-sdk`) as a chat backend. Unlike HTTP provider adapters, this is a **harness adapter**: Claude Code runs its own agent loop and executes its own tools — bash, file reads and edits, glob/grep search, web search — locally on your server. Each `chat()` call runs one full harness turn; the harness's tool activity streams back as already-resolved tool-call events your UI can render. 
+ +> **Server-only.** The harness spawns the Claude Code runtime as a subprocess, so this adapter only works in a Node.js server environment — never in the browser. Treat it like giving Claude a shell on the machine it runs on, and configure permissions accordingly. + +## Installation + +```bash +npm install @tanstack/ai-claude-code +``` + +A runnable demo lives at [`examples/ts-react-coding-agent`](https://github.com/TanStack/ai/tree/main/examples/ts-react-coding-agent) — session resume, the harness tool timeline, permission modes, and tool bridging, wired into a React app. + +## Authentication + +The harness resolves credentials the same way Claude Code does: + +- `ANTHROPIC_API_KEY` in the server's environment (or the `apiKey` config option), or +- an existing Claude subscription login on the machine (`claude login`). + +## Basic Usage + +```typescript +import { chat } from "@tanstack/ai"; +import { claudeCodeText } from "@tanstack/ai-claude-code"; + +const stream = chat({ + adapter: claudeCodeText("claude-opus-4-8", { + cwd: "/path/to/project", + permissionMode: "acceptEdits", + }), + messages: [{ role: "user", content: "Fix the failing test in utils.test.ts" }], +}); +``` + +## Configuration + +| Option | Description | +| ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- | +| `cwd` | Working directory for the harness session. Defaults to `process.cwd()`. | +| `permissionMode` | Claude Code permission mode (`'default'`, `'acceptEdits'`, `'bypassPermissions'`, `'plan'`, `'dontAsk'`, `'auto'`). See the permissions note below. | +| `allowedTools` | Built-in tools the harness may use without prompting (e.g. `['Read', 'Grep', 'Bash(npm test:*)']`). | +| `disallowedTools` | Built-in tools removed from the harness entirely. | +| `maxTurns` | Maximum harness-internal turns per run. 
| +| `systemPromptMode` | `'append'` (default) keeps Claude Code's preset system prompt and appends your `systemPrompts`; `'replace'` sends yours as the entire prompt. | +| `mcpServers` | Extra MCP servers passed through to the harness untouched. | +| `apiKey` | Anthropic API key for the harness subprocess. | +| `env` | Extra environment variables for the harness subprocess. | +| `pathToClaudeCodeExecutable` | Use a specific Claude Code executable instead of the SDK's bundled one. | +| `streamPartials` | Emit true token-level text deltas (default `true`). | +| `canUseTool` | Custom permission handler; replaces the adapter's default handler. | +| `settingSources` | Claude Code settings tiers to load. Default `['project']`: the `cwd`'s CLAUDE.md and project settings apply, but user-level config on the host (`~/.claude` plugins, hooks, skills) is ignored. Pass `['user', 'project', 'local']` for CLI-equivalent behavior, or `[]` for full isolation. | + +**Permissions on headless servers.** Without an explicit `permissionMode` or `canUseTool`, the adapter installs a safe default handler: bridged TanStack tools always run, and any built-in tool call that would normally prompt a human is denied with guidance instead of hanging the request. To let the harness edit files or run commands, set `permissionMode: 'acceptEdits'` / `'bypassPermissions'`, or enumerate `allowedTools`. + +## Stateful Sessions + +Claude Code sessions are stateful — the harness keeps the full working context (files read, commands run, conclusions reached) between turns. The adapter surfaces the session id of every run as a custom stream event named `claude-code.session-id`; thread it back via `modelOptions.sessionId` to resume the session. When resuming, only the latest user message is sent — the harness already holds the prior context. 
+ +Server endpoint: + +```typescript +import { + chat, + chatParamsFromRequest, + toServerSentEventsResponse, +} from "@tanstack/ai"; +import { claudeCodeText } from "@tanstack/ai-claude-code"; + +export async function POST(request: Request) { + const params = await chatParamsFromRequest(request); + + // Extra fields the client puts in the connection `body` arrive here. + const sessionId = + typeof params.forwardedProps.sessionId === "string" + ? params.forwardedProps.sessionId + : undefined; + + const stream = chat({ + adapter: claudeCodeText("claude-opus-4-8", { + cwd: "/path/to/project", + permissionMode: "acceptEdits", + }), + messages: params.messages, + modelOptions: { sessionId }, + }); + + return toServerSentEventsResponse(stream); +} +``` + +Client (React) — capture the session id from the custom event and send it back on subsequent requests: + +```typescript +import { useState } from "react"; +import { useChat } from "@tanstack/ai-react"; +import { fetchServerSentEvents } from "@tanstack/ai-client"; + +function CodingAssistant() { + const [sessionId, setSessionId] = useState<string | undefined>(undefined); + + const { messages, sendMessage } = useChat({ + connection: fetchServerSentEvents("/api/chat", () => ({ + body: { sessionId }, + })), + onCustomEvent: (name, value) => { + if ( + name === "claude-code.session-id" && + typeof value === "object" && + value !== null && + "sessionId" in value && + typeof value.sessionId === "string" + ) { + setSessionId(value.sessionId); + } + }, + }); + + // ... render messages; harness tool activity (Bash, Edit, Read, ...) + // arrives as regular tool-call parts with their results attached. +} +``` + +Sessions are stored on the machine that ran them (`~/.claude/projects/`), so resuming only works on the same server instance. Pass `modelOptions: { forkSession: true }` alongside `sessionId` to branch a session instead of continuing it. + +## Tools + +Two kinds of tools flow through this adapter: + +1. 
**Built-in harness tools** (`Bash`, `Read`, `Write`, `Edit`, `Glob`, `Grep`, `WebSearch`, ...) are executed by Claude Code itself. Their activity streams back as tool-call events with results already attached, so `useChat` UIs render them with no extra wiring — but your code never executes them. + +2. **Your TanStack tools** are bridged *into* the harness as an in-process MCP server. Define them as usual with `toolDefinition().server()`; the model sees them under the `mcp__tanstack__` prefix, and the adapter strips that prefix on the way back out, so events match the names you registered. + +```typescript +import { z } from "zod"; +import { chat, toolDefinition } from "@tanstack/ai"; +import { claudeCodeText } from "@tanstack/ai-claude-code"; + +const lookupTicket = toolDefinition({ + name: "lookup_ticket", + description: "Look up an issue ticket by id", + inputSchema: z.object({ ticketId: z.string() }), +}).server(async ({ ticketId }) => { + return { ticketId, status: "open", title: "Crash on startup" }; +}); + +const stream = chat({ + adapter: claudeCodeText("claude-opus-4-8"), + messages: [{ role: "user", content: "What's the status of ticket T-123?" }], + tools: [lookupTicket], +}); +``` + +**Client-side and approval-gated tools are not supported.** The harness executes tools inside a live subprocess, which cannot pause across HTTP requests to wait for a browser round-trip or a human approval. Passing a tool without a server `execute()` implementation — or one marked `needsApproval` — fails fast with a descriptive error. Run those tools outside the harness with a regular provider adapter. + +## Structured Output + +`structuredOutput()` uses the harness's native JSON-schema output format in a one-shot run (single turn, no tools). It works for finalization after a chat, but a plain provider adapter (e.g. `@tanstack/ai-anthropic`) is the better choice when structured extraction is the primary job — it's faster and doesn't spawn a subprocess. 
+ +## Limitations + +- **Server-only (Node).** The harness spawns a subprocess; Windows support is untested. +- **The harness owns the agent loop.** TanStack's agent-loop strategies and per-iteration middleware don't apply inside a harness turn; `maxTurns` is the equivalent control. +- **No sampling controls.** `temperature`-style options don't exist here. +- **Sessions are machine-local.** Resume requires hitting the same server instance. +- **Cold starts.** Each call spawns a harness turn; expect higher first-token latency than HTTP adapters. diff --git a/docs/adapters/codex.md b/docs/adapters/codex.md new file mode 100644 index 000000000..199cffe7f --- /dev/null +++ b/docs/adapters/codex.md @@ -0,0 +1,182 @@ +--- +title: Codex +id: codex-adapter +order: 12 +description: "Use OpenAI Codex as a chat backend in TanStack AI — agent harness with local tool execution, stateful coding sessions, and tool bridging via @tanstack/ai-codex." +keywords: + - tanstack ai + - codex + - codex sdk + - openai + - harness + - agent + - coding agent + - adapter +--- + +The Codex adapter runs [OpenAI Codex](https://developers.openai.com/codex) (via the `@openai/codex-sdk`) as a chat backend. Unlike HTTP provider adapters, this is a **harness adapter**: Codex runs its own agent loop and executes its own tools — shell commands, file changes, web search — locally on your server, inside its sandbox. Each `chat()` call runs one full harness turn; the harness's tool activity streams back as already-resolved tool-call events your UI can render. + +> **Server-only.** The harness spawns the Codex runtime (bundled with the SDK) as a subprocess, so this adapter only works in a Node.js server environment — never in the browser. The sandbox mode is the safety boundary; configure it deliberately. 
+ +## Installation + +```bash +npm install @tanstack/ai-codex +``` + +A runnable demo lives at [`examples/ts-react-coding-agent`](https://github.com/TanStack/ai/tree/main/examples/ts-react-coding-agent) — session resume, the harness tool timeline, sandbox modes, and tool bridging, wired into a React app. + +## Authentication + +The harness resolves credentials the same way the Codex CLI does: + +- the `apiKey` config option (exported to the subprocess as `CODEX_API_KEY`; usage-based billing), or +- an existing ChatGPT login on the machine (`codex login`). + +## Basic Usage + +```typescript +import { chat } from "@tanstack/ai"; +import { codexText } from "@tanstack/ai-codex"; + +const stream = chat({ + adapter: codexText("gpt-5.1-codex", { + cwd: "/path/to/project", + sandboxMode: "workspace-write", + }), + messages: [{ role: "user", content: "Fix the failing test in utils.test.ts" }], +}); +``` + +## Configuration + +| Option | Description | +| ---------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | +| `cwd` | Working directory for the harness session. Defaults to `process.cwd()`. | +| `sandboxMode` | Codex sandbox: `'read-only'` (harness default), `'workspace-write'`, or `'danger-full-access'`. This is the safety boundary on a server. | +| `approvalPolicy` | Codex approval policy. Defaults to `'never'` — headless runs have no approval UI, so anything else can stall a turn. | +| `modelReasoningEffort` | `'minimal'` \| `'low'` \| `'medium'` \| `'high'` \| `'xhigh'`. | +| `skipGitRepoCheck` | Skip the harness's git-repo safety check. Defaults to `true` (server adapters routinely point at scratch directories). | +| `networkAccessEnabled` | Allow network access inside the `workspace-write` sandbox. | +| `webSearchMode` | `'disabled'` \| `'cached'` \| `'live'`. | +| `additionalDirectories`| Extra writable directories beyond `cwd`. 
| +| `apiKey` | OpenAI API key for the harness subprocess. | +| `baseUrl` | Override the Codex backend base URL. | +| `codexPathOverride` | Use a specific codex executable instead of the SDK's bundled binary. | +| `env` | Environment variables for the subprocess. When set, `process.env` is **not** inherited (Codex SDK semantics). | +| `config` | Extra `--config key=value` overrides passed to the Codex CLI (e.g. additional `mcp_servers` entries). | + +Per-call overrides — `sessionId`, `sandboxMode`, `approvalPolicy`, `modelReasoningEffort`, `workingDirectory`, `skipGitRepoCheck` — go through `modelOptions`. + +## Stateful Sessions + +Codex threads are stateful — the harness keeps the full working context (files read, commands run, conclusions reached) between turns. The adapter surfaces the thread id of every fresh run as a custom stream event named `codex.session-id`; thread it back via `modelOptions.sessionId` to resume. When resuming, only the latest user message is sent — the harness already holds the prior context. + +Server endpoint: + +```typescript +import { + chat, + chatParamsFromRequest, + toServerSentEventsResponse, +} from "@tanstack/ai"; +import { codexText } from "@tanstack/ai-codex"; + +export async function POST(request: Request) { + const params = await chatParamsFromRequest(request); + + // Extra fields the client puts in the connection `body` arrive here. + const sessionId = + typeof params.forwardedProps.sessionId === "string" + ? 
params.forwardedProps.sessionId + : undefined; + + const stream = chat({ + adapter: codexText("gpt-5.1-codex", { + cwd: "/path/to/project", + sandboxMode: "workspace-write", + }), + messages: params.messages, + modelOptions: { sessionId }, + }); + + return toServerSentEventsResponse(stream); +} +``` + +Client (React) — capture the session id from the custom event and send it back on subsequent requests: + +```typescript +import { useState } from "react"; +import { useChat } from "@tanstack/ai-react"; +import { fetchServerSentEvents } from "@tanstack/ai-client"; + +function CodingAssistant() { + const [sessionId, setSessionId] = useState<string | undefined>(undefined); + + const { messages, sendMessage } = useChat({ + connection: fetchServerSentEvents("/api/chat", () => ({ + body: { sessionId }, + })), + onCustomEvent: (name, value) => { + if ( + name === "codex.session-id" && + typeof value === "object" && + value !== null && + "sessionId" in value && + typeof value.sessionId === "string" + ) { + setSessionId(value.sessionId); + } + }, + }); + + // ... render messages; harness tool activity (command_execution, + // file_change, ...) arrives as regular tool-call parts with results. +} +``` + +Sessions are stored on the machine that ran them (`~/.codex/sessions/`), so resuming only works on the same server instance. + +## Tools + +Two kinds of tools flow through this adapter: + +1. **Built-in harness tools** are executed by Codex itself and stream back as tool-call events with results already attached: `command_execution` (shell), `file_change` (patches), `web_search`, and `todo_list` (the agent's running plan). Your code never executes them. + +2. **Your TanStack tools** are bridged *into* the harness: the adapter starts a short-lived Streamable-HTTP MCP server on `127.0.0.1` for the duration of the turn and points Codex at it. Define tools as usual with `toolDefinition().server()`; tool-call events come back under the names you registered. 
+ +```typescript +import { z } from "zod"; +import { chat, toolDefinition } from "@tanstack/ai"; +import { codexText } from "@tanstack/ai-codex"; + +const lookupTicket = toolDefinition({ + name: "lookup_ticket", + description: "Look up an issue ticket by id", + inputSchema: z.object({ ticketId: z.string() }), +}).server(async ({ ticketId }) => { + return { ticketId, status: "open", title: "Crash on startup" }; +}); + +const stream = chat({ + adapter: codexText("gpt-5.1-codex"), + messages: [{ role: "user", content: "What's the status of ticket T-123?" }], + tools: [lookupTicket], +}); +``` + +**Client-side and approval-gated tools are not supported.** The harness executes tools inside a live subprocess, which cannot pause across HTTP requests to wait for a browser round-trip or a human approval. Passing a tool without a server `execute()` implementation — or one marked `needsApproval` — fails fast with a descriptive error. Run those tools outside the harness with a regular provider adapter. + +## Structured Output + +`structuredOutput()` uses Codex's native `outputSchema` support in a fresh, read-only, one-shot thread whose final message is a JSON string conforming to your schema. It works for finalization after a chat, but a plain provider adapter (e.g. `@tanstack/ai-openai`) is the better choice when structured extraction is the primary job — it's faster and doesn't spawn a subprocess. + +## Limitations + +- **No token-level text streaming.** The Codex SDK reports assistant text and reasoning only as completed items, so text arrives message-at-a-time. Tool activity (commands starting/finishing) still streams live, which keeps the UI feeling alive during long turns. +- **Server-only (Node).** The harness spawns a subprocess. +- **The harness owns the agent loop.** TanStack's agent-loop strategies and per-iteration middleware don't apply inside a harness turn. +- **No sampling controls.** `temperature`-style options don't exist here. 
+- **Sessions are machine-local.** Resume requires hitting the same server instance. +- **Cold starts.** Each call spawns a harness turn; expect higher first-token latency than HTTP adapters. diff --git a/docs/adapters/gemini-cli.md b/docs/adapters/gemini-cli.md new file mode 100644 index 000000000..9822c1298 --- /dev/null +++ b/docs/adapters/gemini-cli.md @@ -0,0 +1,205 @@ +--- +title: Gemini CLI +id: gemini-cli-adapter +order: 13 +description: "Use Gemini CLI as a chat backend in TanStack AI — agent harness with local tool execution, stateful coding sessions, and tool bridging via @tanstack/ai-gemini-cli." +keywords: + - tanstack ai + - gemini cli + - agent client protocol + - acp + - google + - harness + - agent + - coding agent + - adapter +--- + +The Gemini CLI adapter runs [Gemini CLI](https://github.com/google-gemini/gemini-cli) as a chat backend, driving it over the [Agent Client Protocol](https://agentclientprotocol.com) (`gemini --acp`) — the same interface editors like Zed use to embed it. Unlike HTTP provider adapters, this is a **harness adapter**: Gemini CLI runs its own agent loop and executes its own tools — shell commands, file reads and edits, search — locally on your server. Each `chat()` call runs one full harness turn; assistant text and thinking stream as true token-level deltas, and the harness's tool activity streams back as already-resolved tool-call events your UI can render. + +> **Server-only.** The adapter spawns the `gemini` CLI as a subprocess, so it only works in a Node.js server environment — never in the browser. Treat it like giving Gemini a shell on the machine it runs on, and configure permissions accordingly. 
+ +## Installation + +```bash +npm install @tanstack/ai-gemini-cli +``` + +The `gemini` CLI itself is a prerequisite — it is **not** bundled: + +```bash +npm install -g @google/gemini-cli +``` + +A runnable demo lives at [`examples/ts-react-coding-agent`](https://github.com/TanStack/ai/tree/main/examples/ts-react-coding-agent) — session resume, the harness tool timeline, permission modes, and tool bridging, wired into a React app. + +## Authentication + +The harness resolves credentials the same way Gemini CLI does: + +- an existing Google login on the machine (run `gemini` once interactively), or +- `GEMINI_API_KEY` in the server's environment (pass it via the `env` config option if needed). + +**Headless ACP auth.** When driven over ACP, Gemini CLI can't pop an +interactive auth picker, so it needs to be told which method to use. Set +`authMethodId` to one of the methods the CLI advertises — commonly +`'oauth-personal'` (Log in with Google), `'gemini-api-key'`, or `'vertex-ai'`. +The adapter selects it (via the ACP `authenticate` call) before opening the +session, and fails fast with the list of available methods if the one you +asked for isn't offered. Some setups also require trusting the working +directory in headless mode — set `GEMINI_CLI_TRUST_WORKSPACE=true` (or pass +`--skip-trust` via `extraArgs`) when the CLI refuses an untrusted folder. 
+ +```typescript +import { geminiCliText } from "@tanstack/ai-gemini-cli"; + +const adapter = geminiCliText("gemini-3-pro-preview", { + cwd: "/path/to/project", + authMethodId: "oauth-personal", // reuse the machine's Google login +}); +``` + +## Basic Usage + +```typescript +import { chat } from "@tanstack/ai"; +import { geminiCliText } from "@tanstack/ai-gemini-cli"; + +const stream = chat({ + adapter: geminiCliText("gemini-3-pro-preview", { + cwd: "/path/to/project", + permissionMode: "acceptEdits", + }), + messages: [{ role: "user", content: "Fix the failing test in utils.test.ts" }], +}); +``` + +## Configuration + +| Option | Description | +| --------------------- | --------------------------------------------------------------------------------------------------------------------- | +| `cwd` | Working directory for the harness session. Defaults to `process.cwd()`. | +| `executablePath` | Path to the Gemini CLI executable. Defaults to `gemini` on `PATH`. | +| `extraArgs` | Extra CLI arguments appended after `--acp` (e.g. `['--sandbox']`). | +| `env` | Extra environment variables merged over `process.env` for the subprocess. | +| `permissionMode` | `'default'`, `'acceptEdits'`, or `'bypassPermissions'`. See the permissions note below. | +| `onPermissionRequest` | Custom permission handler; replaces the adapter's default policy. | +| `authMethodId` | ACP auth method to select before the session starts, e.g. `'oauth-personal'`, `'gemini-api-key'`, `'vertex-ai'`. See Authentication. | + +Per-call overrides — `sessionId`, `permissionMode`, `cwd`, `authMethodId` — go through `modelOptions`. + +**Permissions on headless servers.** ACP routes the harness's tool-approval questions back to the embedding application. 
Without a custom `onPermissionRequest`, the adapter installs a safe default policy that always answers immediately: bridged TanStack tools are approved, `'acceptEdits'` additionally approves file-mutation tools (edit / move / delete kinds), `'bypassPermissions'` approves everything, and anything else is rejected — a headless server must never hang on a question only an interactive user could answer. + +## Stateful Sessions + +Gemini CLI sessions are stateful — the harness keeps the full working context between turns. The adapter surfaces the session id of every run as a custom stream event named `gemini-cli.session-id`; thread it back via `modelOptions.sessionId` to resume the session. When resuming, only the latest user message is sent — the harness already holds the prior context. If the installed CLI can't load the session (older CLI, different machine), the adapter transparently falls back to a fresh session seeded with the flattened transcript, and the new session id is emitted so the client can re-pin it. + +Server endpoint: + +```typescript +import { + chat, + chatParamsFromRequest, + toServerSentEventsResponse, +} from "@tanstack/ai"; +import { geminiCliText } from "@tanstack/ai-gemini-cli"; + +export async function POST(request: Request) { + const params = await chatParamsFromRequest(request); + + // Extra fields the client puts in the connection `body` arrive here. + const sessionId = + typeof params.forwardedProps.sessionId === "string" + ? 
params.forwardedProps.sessionId + : undefined; + + const stream = chat({ + adapter: geminiCliText("gemini-3-pro-preview", { + cwd: "/path/to/project", + permissionMode: "acceptEdits", + }), + messages: params.messages, + modelOptions: { sessionId }, + }); + + return toServerSentEventsResponse(stream); +} +``` + +Client (React) — capture the session id from the custom event and send it back on subsequent requests: + +```typescript +import { useState } from "react"; +import { useChat } from "@tanstack/ai-react"; +import { fetchServerSentEvents } from "@tanstack/ai-client"; + +function CodingAssistant() { + const [sessionId, setSessionId] = useState<string | undefined>(undefined); + + const { messages, sendMessage } = useChat({ + connection: fetchServerSentEvents("/api/chat", () => ({ + body: { sessionId }, + })), + onCustomEvent: (name, value) => { + if ( + name === "gemini-cli.session-id" && + typeof value === "object" && + value !== null && + "sessionId" in value && + typeof value.sessionId === "string" + ) { + setSessionId(value.sessionId); + } + }, + }); + + // ... render messages; harness tool activity (execute, edit, read, ...) + // arrives as regular tool-call parts with their results attached. +} +``` + +Sessions are stored on the machine that ran them (under `~/.gemini/tmp/`), so resuming only works on the same server instance. + +## Tools + +Two kinds of tools flow through this adapter: + +1. **Built-in harness tools** (shell, file edits, reads, search, web fetch, ...) are executed by Gemini CLI itself. Their activity streams back as tool-call events — named by their ACP tool kind (`execute`, `edit`, `read`, `search`, ...), with the human-readable title in the arguments — and results attached, so `useChat` UIs render them with no extra wiring. Your code never executes them. The harness's running plan is surfaced as a CUSTOM `gemini-cli.plan` event. + +2. 
**Your TanStack tools** are bridged *into* the harness: the adapter starts a short-lived Streamable-HTTP MCP server on `127.0.0.1` for the duration of the turn and registers it with the ACP session. Define tools as usual with `toolDefinition().server()`; tool-call events come back under the names you registered, and the default permission policy auto-approves them. + +```typescript +import { z } from "zod"; +import { chat, toolDefinition } from "@tanstack/ai"; +import { geminiCliText } from "@tanstack/ai-gemini-cli"; + +const lookupTicket = toolDefinition({ + name: "lookup_ticket", + description: "Look up an issue ticket by id", + inputSchema: z.object({ ticketId: z.string() }), +}).server(async ({ ticketId }) => { + return { ticketId, status: "open", title: "Crash on startup" }; +}); + +const stream = chat({ + adapter: geminiCliText("gemini-3-pro-preview"), + messages: [{ role: "user", content: "What's the status of ticket T-123?" }], + tools: [lookupTicket], +}); +``` + +**Client-side and approval-gated tools are not supported.** The harness executes tools inside a live subprocess, which cannot pause across HTTP requests to wait for a browser round-trip or a human approval. Passing a tool without a server `execute()` implementation — or one marked `needsApproval` — fails fast with a descriptive error. Run those tools outside the harness with a regular provider adapter. + +## Structured Output + +ACP has no native JSON-schema output channel, so `structuredOutput()` is best-effort: the schema is embedded as a prompt instruction in a fresh one-shot session and the final text is parsed (markdown fences are stripped when present). For production structured extraction, use a plain provider adapter (e.g. `@tanstack/ai-gemini`) — it's faster, schema-enforced, and doesn't spawn a subprocess. + +## Limitations + +- **Server-only (Node)**, and the `gemini` CLI must be installed and authenticated on the host. 
+- **Token usage is usually unavailable.** ACP only recently added usage reporting; when the CLI doesn't report it, `RUN_FINISHED` carries no usage. +- **The harness owns the agent loop.** TanStack's agent-loop strategies and per-iteration middleware don't apply inside a harness turn. +- **No sampling controls.** `temperature`-style options don't exist here. +- **Sessions are machine-local.** Resume requires hitting the same server instance (with graceful fallback to a fresh transcript-seeded session). +- **Cold starts.** Each call spawns the CLI; expect higher first-token latency than HTTP adapters. +- **ACP is young.** Gemini CLI's ACP mode is still stabilizing; pin a known-good CLI version in production. diff --git a/docs/adapters/opencode.md b/docs/adapters/opencode.md new file mode 100644 index 000000000..ff2fa70e6 --- /dev/null +++ b/docs/adapters/opencode.md @@ -0,0 +1,186 @@ +--- +title: OpenCode +id: opencode-adapter +order: 14 +description: "Use OpenCode as a chat backend in TanStack AI — agent harness with local tool execution, token-level streaming, stateful sessions, and tool bridging via @tanstack/ai-opencode." +keywords: + - tanstack ai + - opencode + - opencode sdk + - harness + - agent + - coding agent + - adapter +--- + +The OpenCode adapter runs [OpenCode](https://opencode.ai) as a chat backend, driving it over its local HTTP server (`@opencode-ai/sdk`). Unlike HTTP provider adapters, this is a **harness adapter**: OpenCode runs its own agent loop and executes its own tools — shell commands, file reads and edits, search — locally on your server. Each `chat()` call runs one full harness turn; assistant text and reasoning stream as true token-level deltas, and the harness's tool activity streams back as already-resolved tool-call events your UI can render. + +> **Server-only.** The adapter spawns (or attaches to) an `opencode serve` process, so it only works in a Node.js server environment — never in the browser. 
Treat it like giving OpenCode a shell on the machine it runs on, and configure permissions accordingly. + +## Installation + +```bash +npm install @tanstack/ai-opencode +``` + +The `opencode` CLI must be installed and its providers authenticated on the host: + +```bash +npm install -g opencode-ai +opencode auth login +``` + +A runnable demo lives at [`examples/ts-react-coding-agent`](https://github.com/TanStack/ai/tree/main/examples/ts-react-coding-agent) — session resume, the harness tool timeline, permission modes, and tool bridging, wired into a React app. + +## Models + +OpenCode is provider-agnostic: it resolves any `provider/model` id its configured providers support. Address models as `provider/model` (the adapter splits on the first `/`): + +```typescript +import { chat } from "@tanstack/ai"; +import { opencodeText } from "@tanstack/ai-opencode"; + +const stream = chat({ + adapter: opencodeText("anthropic/claude-sonnet-4-5", { + directory: "/path/to/project", + permissionMode: "acceptEdits", + }), + messages: [{ role: "user", content: "Fix the failing test in utils.test.ts" }], +}); +``` + +## Configuration + +| Option | Description | +| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| `directory` | Working directory for the harness session. Defaults to `process.cwd()`. | +| `baseUrl` | Attach to an already-running `opencode serve` (e.g. `http://127.0.0.1:4096`) instead of spawning a new server per turn. | +| `hostname` | Hostname for the spawned server. Defaults to the SDK default (`127.0.0.1`). | +| `port` | Port for the spawned server. Defaults to the SDK default (`4096`). | +| `permissionMode` | `'default'` (bridged tools run, everything else that prompts is rejected), `'acceptEdits'` (also auto-approves file edits), or `'bypassPermissions'` (allow all). 
| +| `onPermissionRequest` | Custom permission handler; replaces the default policy entirely. | +| `config` | Extra OpenCode config merged with the adapter's MCP and permission config. | + +Per-call overrides — `sessionId`, `permissionMode`, `directory` — go through `modelOptions`. + +## Permissions + +OpenCode asks for permission before mutating files or running commands. A headless server has no one to answer those prompts, so the adapter applies a policy automatically — it never hangs a turn: + +- **`'default'`** — bridged TanStack tools run; anything else that would prompt (edits, shell, web fetch) is rejected. +- **`'acceptEdits'`** — additionally auto-approves file-mutation requests (edit / write / patch). +- **`'bypassPermissions'`** — approves everything. Only use this against a sandbox or scratch directory. + +Provide `onPermissionRequest` to implement your own policy (e.g. allow-list specific commands). + +## Stateful Sessions + +OpenCode sessions are stateful — the harness keeps the full working context (files read, commands run, conclusions reached) between turns. The adapter surfaces the session id of every fresh run as a custom stream event named `opencode.session-id`; thread it back via `modelOptions.sessionId` to resume. When resuming, only the latest user message is sent — the harness already holds the prior context. + +Server endpoint: + +```typescript +import { + chat, + chatParamsFromRequest, + toServerSentEventsResponse, +} from "@tanstack/ai"; +import { opencodeText } from "@tanstack/ai-opencode"; + +export async function POST(request: Request) { + const params = await chatParamsFromRequest(request); + + // Extra fields the client puts in the connection `body` arrive here. + const sessionId = + typeof params.forwardedProps.sessionId === "string" + ? 
params.forwardedProps.sessionId + : undefined; + + const stream = chat({ + adapter: opencodeText("anthropic/claude-sonnet-4-5", { + directory: "/path/to/project", + permissionMode: "acceptEdits", + }), + messages: params.messages, + modelOptions: { sessionId }, + }); + + return toServerSentEventsResponse(stream); +} +``` + +Client (React): capture the session id from the custom event and send it back on subsequent requests: + +```typescript +import { useState } from "react"; +import { useChat } from "@tanstack/ai-react"; +import { fetchServerSentEvents } from "@tanstack/ai-client"; + +function CodingAssistant() { + const [sessionId, setSessionId] = useState<string | undefined>(undefined); + + const { messages, sendMessage } = useChat({ + connection: fetchServerSentEvents("/api/chat", () => ({ + body: { sessionId }, + })), + onCustomEvent: (name, value) => { + if ( + name === "opencode.session-id" && + typeof value === "object" && + value !== null && + "sessionId" in value && + typeof value.sessionId === "string" + ) { + setSessionId(value.sessionId); + } + }, + }); + + // ... render messages; harness tool activity (bash, edit, read, ...) + // arrives as regular tool-call parts with their results attached. +} +``` + +Sessions live on the server that ran them, so resuming only works against the same server instance (or a shared `baseUrl`). + +## Tools + +Two kinds of tools flow through this adapter: + +1. **Built-in harness tools** are executed by OpenCode itself and stream back as tool-call events with results already attached: `bash`, `edit`, `write`, `read`, `grep`, and the agent's running todo plan (surfaced as an `opencode.todo` custom event). Your code never executes them. + +2. **Your TanStack tools** are bridged *into* the harness: the adapter starts a short-lived Streamable-HTTP MCP server on `127.0.0.1` for the duration of the turn and registers it with OpenCode. 
Define tools as usual with `toolDefinition().server()`; tool-call events come back under the names you registered (OpenCode prefixes MCP tools `tanstack_…` internally, which the adapter strips). + +```typescript +import { z } from "zod"; +import { chat, toolDefinition } from "@tanstack/ai"; +import { opencodeText } from "@tanstack/ai-opencode"; + +const lookupTicket = toolDefinition({ + name: "lookup_ticket", + description: "Look up an issue ticket by id", + inputSchema: z.object({ ticketId: z.string() }), +}).server(async ({ ticketId }) => { + return { ticketId, status: "open", title: "Crash on startup" }; +}); + +const stream = chat({ + adapter: opencodeText("anthropic/claude-sonnet-4-5"), + messages: [{ role: "user", content: "What's the status of ticket T-123?" }], + tools: [lookupTicket], +}); +``` + +**Client-side and approval-gated tools are not supported.** The harness executes tools inside a live process, which cannot pause across HTTP requests to wait for a browser round-trip or a human approval. Passing a tool without a server `execute()` implementation (or one marked `needsApproval`) fails fast with a descriptive error. Run those tools outside the harness with a regular provider adapter. + +## Structured Output + +`structuredOutput()` is best-effort: OpenCode's prompt API has no native JSON-schema channel, so the schema is embedded as a prompt instruction in a fresh, one-shot session and the final text is parsed (markdown fences are stripped when present). It works for finalization after a chat, but a plain provider adapter (e.g. `@tanstack/ai-openai`) is the better choice when structured extraction is the primary job: it's faster, schema-enforced, and doesn't spawn a harness. + +## Limitations + +- **Server-only (Node).** The adapter spawns or attaches to an `opencode serve` process. +- **The harness owns the agent loop.** TanStack's agent-loop strategies and per-iteration middleware don't apply inside a harness turn. 
+- **No sampling controls.** `temperature`-style options don't exist here. +- **Sessions are server-local.** Resume requires hitting the same server instance (or a shared `baseUrl`). +- **Cold starts.** Spawning a server per turn adds first-token latency; point the adapter at a long-lived `baseUrl` to avoid it. diff --git a/docs/config.json b/docs/config.json index 966b75108..de0eda84f 100644 --- a/docs/config.json +++ b/docs/config.json @@ -461,6 +461,26 @@ "to": "adapters/openai-compatible", "addedAt": "2026-06-01", "updatedAt": "2026-06-20" + }, + { + "label": "Claude Code", + "to": "adapters/claude-code", + "addedAt": "2026-06-12" + }, + { + "label": "Codex", + "to": "adapters/codex", + "addedAt": "2026-06-12" + }, + { + "label": "Gemini CLI", + "to": "adapters/gemini-cli", + "addedAt": "2026-06-12" + }, + { + "label": "OpenCode", + "to": "adapters/opencode", + "addedAt": "2026-06-12" } ] }, diff --git a/examples/coco/README.md b/examples/coco/README.md new file mode 100644 index 000000000..883741703 --- /dev/null +++ b/examples/coco/README.md @@ -0,0 +1,201 @@ +# Coco — CLI dev-overlay coding agent + +Coco is a drop-in command-line tool that wraps your project's dev server with +an in-page AI coding-agent chat panel. Run `coco` inside any web project, open +the URL it prints, and a floating chat appears on top of your running app. The +chat drives a real coding agent — [Claude Code], [Codex], [Gemini CLI], or +[OpenCode] — pointed at the project's working directory, so the agent edits +the live code and the dev server's HMR reloads the page. + +[Claude Code]: https://docs.anthropic.com/en/docs/claude-code +[Codex]: https://developers.openai.com/codex +[Gemini CLI]: https://github.com/google-gemini/gemini-cli +[OpenCode]: https://opencode.ai + +Coco is framework-agnostic: it reverse-proxies the dev server and injects a +small `' + +/** + * Insert Coco's panel `