
Commit e380851

lowyelling and claude authored
docs(integrations): rewrite Vercel AI SDK guide as cookbook style (DEV-1485) (plastic-labs#635)
* docs(integrations): add @honcho-ai/vercel-ai-sdk guide

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(integrations): rewrite Vercel AI SDK guide as cookbook style (DEV-1485)

  Reshapes the guide to cookbook formula, adds Full Script section, fixes maxSteps → stopWhen for ai-sdk v5, renames package, and prunes stale notes. See PR for full decision log.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(integrations): lead Vercel AI SDK verification with direct-inspection check

  - Restructure Verifying section: direct inspection (token delta + dashboard) is now step 1 so readers isolate Honcho's contribution before grading model behavior
  - Behavioral tests (first turn, multi-turn, cross-session, tool calling) follow as steps 2-5
  - Note `result.toolCalls` as the way to confirm which Honcho tool fired (tool names don't appear in `result.text`)
  - Signpost the Full Script from Complete Example so the two snippets read as a staircase, not a duplicate

  Addresses review comments on PR plastic-labs#635.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): satisfy basedpyright in test_representation_manager

  The save-representation tests added in plastic-labs#615 were structurally correct but failed strict typing in two places. Static Analysis has been red on main since the merge.

  - `mock_save.await_args` is `_Call | None`; assert it's not None before reading `.kwargs` / `.args` so basedpyright can narrow the type
  - `SimpleNamespace(...)` passed as `message_level_configuration` is an intentional duck-typed mock (only `.dream.enabled` is read by `save_representation`), so opt out at the call site with `# pyright: ignore[reportArgumentType]` rather than constructing a full `ResolvedConfiguration` (matches the existing `reportPrivateUsage` ignore pattern in this file)

  No runtime behavior changes; `uv run basedpyright` is now clean project-wide.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): pad timestamp windows in test_messages for clock skew

  Three timestamp tests captured `before_request` / `after_request` with `datetime.now(UTC)` on the host and asserted the server's `created_at` fell within. Under Docker, the Postgres container's clock can skew tens of ms from the macOS host, flipping the assertion intermittently under parallel pytest load.

  Pad each window by 1 second on both sides — wide enough to absorb realistic skew, narrow enough that the test still proves the timestamp is server-current.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(integrations): tighten Verifying section after end-to-end smoke

  Smoke-tested all five verification steps against a fresh Sonnet 4.6 + Honcho integration. Three findings, all reflected here:

  - Cross-session recall (#4): added Note about DERIVER_REPRESENTATION_BATCH_MAX_TOKENS=1024 — short warmups don't accumulate enough content to flush observations, so cross-session recall returns empty even on a working integration.
  - Tool calling prompt (#5): replaced the honcho_chat patterns prompt with a verbatim-retrieval honcho_search prompt. Sonnet skips honcho_chat when middleware-injected context already answers; verbatim retrieval forces a fire.
  - Tool inspection (#5): replaced result.toolCalls reference with result.steps[i].toolCalls + flatMap snippet. Top-level toolCalls is empty in multi-step calls (stopWhen: stepCountIs(N)) — the fires are nested inside steps.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(integrations): make Step 4 cross-session test durable via honcho_search

  Replace the prose-recall test ("Based on what we've talked about, what do you know about me?") with a forced honcho_search call. Prose recall depended on the model getting deriver-built representation/peer-card in its system prompt, which is gated behind DERIVER_REPRESENTATION_BATCH_MAX_TOKENS=1024 — short tutorial-length conversations don't trigger it, producing false negatives on a working integration.

  honcho_search hits message embeddings, which are computed synchronously at message persist time (src/crud/message.py:262-276), so peer-scoped retrieval works regardless of how short the prior session was.

  Also folds the result.steps[i].toolCalls inspection snippet from the old Step 5 into Step 4 — same prompt, no need for two sections. Drops Step 5 entirely.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent f37338b · commit e380851

4 files changed

Lines changed: 404 additions & 14 deletions

File tree

docs/docs.json

Lines changed: 1 addition & 0 deletions
@@ -104,6 +104,7 @@
      "pages": [
        "v3/guides/integrations/claude-code",
        "v3/guides/integrations/opencode",
+       "v3/guides/integrations/vercel-ai-sdk",
        "v3/guides/integrations/crewai",
        "v3/guides/integrations/langgraph",
        "v3/guides/integrations/mcp",
Lines changed: 377 additions & 0 deletions
@@ -0,0 +1,377 @@
---
title: "Vercel AI SDK"
icon: "triangle"
iconType: "solid"
description: "Add persistent user memory and reasoning to any Vercel AI SDK app with Honcho"
sidebarTitle: "Vercel AI SDK"
---

Integrate Honcho with the Vercel AI SDK to build AI apps that remember users across sessions. The [Vercel AI SDK](https://sdk.vercel.ai) is an open-source TypeScript toolkit for building AI-powered apps with a unified API across providers. This guide shows you how to wrap any `generateText` or `streamText` call with Honcho's memory middleware and reasoning tools.

<Note>
The full package source and examples are available on [GitHub](https://github.com/plastic-labs/vercel-ai-sdk-package).
</Note>

## What We're Building

We'll wire Honcho into a Vercel AI SDK app so the model receives context from past conversations and can query what it knows about the user mid-generation. Here's how the pieces fit together:

- **Vercel AI SDK** handles model calls and streaming
- **Honcho** stores messages and retrieves user context before each generation
- **Your model provider** can be Anthropic, OpenAI, Google, etc.

The key benefit: you don't manually manage conversation history across sessions. Honcho handles persistence and context injection — the model always has a rich picture of who it's talking to. (New to Honcho's primitives? See [peers and sessions](/v3/documentation/core-concepts/architecture).)

## Setup

Install the package:

<CodeGroup>
```bash npm
npm install @honcho-ai/vercel-ai-sdk
```

```bash pnpm
pnpm add @honcho-ai/vercel-ai-sdk
```

```bash yarn
yarn add @honcho-ai/vercel-ai-sdk
```

```bash bun
bun add @honcho-ai/vercel-ai-sdk
```
</CodeGroup>

Get your API key at [app.honcho.dev](https://app.honcho.dev), then set these environment variables:

```bash
HONCHO_API_KEY=your-api-key
HONCHO_WORKSPACE_ID=your-workspace-id
```

## Create a Provider Instance

`createHoncho()` is the entry point. It reads your API key and workspace from environment variables and returns a provider object with `middleware()`, `tools()`, and `send()`.

```typescript
import { createHoncho } from '@honcho-ai/vercel-ai-sdk';

const honcho = createHoncho();
```

You can set a stable `defaultAssistantId` on the provider to identify the AI peer across all calls:

```typescript
const honcho = createHoncho({
  defaultAssistantId: 'my-assistant',
});
```

## Add Middleware

`honcho.middleware()` is compatible with `wrapLanguageModel`. Two things happen on each call:

1. **Before generation** — Honcho fetches the user's representation, peer card, session summary, and recent messages and injects them into the system prompt
2. **After generation** — the user message and assistant response are stored back in Honcho with correct peer attribution

```typescript
import { createHoncho } from '@honcho-ai/vercel-ai-sdk';
import { wrapLanguageModel, generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const honcho = createHoncho();

const model = wrapLanguageModel({
  model: anthropic('claude-sonnet-4-6'),
  middleware: honcho.middleware({
    userId: 'user-abc',
    sessionId: 'session-123',
  }),
});

const { text } = await generateText({
  model,
  prompt: 'What should I focus on today?',
});
```

Pass `userId` and `sessionId` per request — no session handles to construct. Both default to lazily generated IDs if omitted, which is fine for local scripts but not for multi-user server traffic.
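
For multi-user servers, create the provider once and construct the middleware inside each request handler so every call binds to the authenticated user. Here's a minimal sketch, assuming a hypothetical Express route and an `x-user-id` header set by your auth layer (adapt the wiring to your framework):

```typescript
import express from 'express';
import { createHoncho } from '@honcho-ai/vercel-ai-sdk';
import { wrapLanguageModel, generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const honcho = createHoncho(); // one provider instance, reused across requests
const app = express();
app.use(express.json());

app.post('/chat', async (req, res) => {
  // Hypothetical wiring: userId from your auth layer, sessionId from the client.
  const userId = req.header('x-user-id');
  const { sessionId, prompt } = req.body as { sessionId: string; prompt: string };
  if (!userId) {
    res.status(401).json({ error: 'missing user' });
    return;
  }

  // Per-request middleware keeps each user/session pair isolated.
  const model = wrapLanguageModel({
    model: anthropic('claude-sonnet-4-6'),
    middleware: honcho.middleware({ userId, sessionId }),
  });

  const { text } = await generateText({ model, prompt });
  res.json({ text });
});

app.listen(3000);
```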

## Add Tools

`honcho.tools()` gives the model six tools it can call mid-generation to query or update what it knows about the user:

| Tool | What it does |
| --- | --- |
| `honcho_chat` | Dialectic reasoning — ask natural-language questions about the user; answers synthesized from full interaction history |
| `honcho_context` | Short summary of recent context within the session |
| `honcho_search` | Semantic search over stored conversation messages |
| `honcho_search_conclusions` | Query derived conclusions: personality traits, preferences, behavioral patterns |
| `honcho_get_representation` | Full synthesized profile of the user |
| `honcho_save_conclusion` | Persist an observation about the user for future sessions |

Pass the same `userId` and `sessionId` to `honcho.tools()` so tool calls bind to the same peers as the middleware:

```typescript
import { generateText, stepCountIs } from 'ai';

const { text } = await generateText({
  model,
  tools: honcho.tools({
    userId: 'user-abc',
    sessionId: 'session-123',
  }),
  stopWhen: stepCountIs(3),
  prompt: 'Based on our conversations, what do I care about most?',
});
```

## Complete Example

Here's a full working example combining middleware and tools. Want a runnable end-to-end version? See the [Full Script](#full-script).

```typescript
import { createHoncho } from '@honcho-ai/vercel-ai-sdk';
import { wrapLanguageModel, generateText, stepCountIs } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const honcho = createHoncho({
  defaultAssistantId: 'assistant',
});

const userId = 'user-abc';
const sessionId = 'session-123';

const model = wrapLanguageModel({
  model: anthropic('claude-sonnet-4-6'),
  middleware: honcho.middleware({ userId, sessionId }),
});

const { text } = await generateText({
  model,
  tools: honcho.tools({ userId, sessionId }),
  stopWhen: stepCountIs(3),
  prompt: 'What should we work on today?',
});

console.log(text);
```

## Streaming

`streamText` works the same way — middleware handles persistence after the stream completes:

```typescript
import { createHoncho } from '@honcho-ai/vercel-ai-sdk';
import { wrapLanguageModel, streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

const honcho = createHoncho();

const userId = 'user-abc';
const sessionId = 'session-456';

const model = wrapLanguageModel({
  model: openai('gpt-4o'),
  middleware: honcho.middleware({ userId, sessionId }),
});

const result = streamText({
  model,
  tools: honcho.tools({ userId, sessionId }),
  prompt: 'What should we work on today?',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```

## Using with `messages`

If your app already manages conversation history and passes a `messages` array directly, set `injectHistory: false` to prevent Honcho from prepending duplicate history:

```typescript
honcho.middleware({
  userId,
  sessionId,
  injectHistory: false, // don't prepend history — we're passing messages directly
})
```

Honcho still injects the user's representation and peer card into the system prompt, and still persists messages after generation. With `injectHistory: false` you must pass a `messages` array — without either `messages` or `prompt`, the Vercel AI SDK throws `Invalid prompt: prompt or messages must be defined`.
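
For instance, here's a minimal sketch of the combined call. It reuses the wrapped-middleware pattern from earlier; `ModelMessage` is the ai-sdk v5 message type, and the hard-coded `history` array stands in for whatever store your app already uses:

```typescript
import { generateText, wrapLanguageModel, type ModelMessage } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

// Stand-in for your app's own history store.
const history: ModelMessage[] = [
  { role: 'user', content: 'I prefer concise answers.' },
  { role: 'assistant', content: 'Noted. Short answers from here on.' },
  { role: 'user', content: 'What did I just ask you for?' },
];

const { text } = await generateText({
  model: wrapLanguageModel({
    model: anthropic('claude-sonnet-4-6'),
    middleware: honcho.middleware({ userId, sessionId, injectHistory: false }),
  }),
  messages: history, // required when injectHistory is false
});
```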

## Verifying the Integration

### 1. Isolate Honcho's Contribution

Let's confirm the memory is actually coming from Honcho and not your app's existing conversation history. There are two ways to check: a token-delta comparison in code, and the Honcho dashboard.

**Token delta (developer check).** On a session with a few prior turns, run the same prompt twice — once with `injectHistory: false` and once without.

Compare `result.usage.inputTokens`:

```typescript
const baseline = await generateText({
  model: wrapLanguageModel({
    model: anthropic('claude-sonnet-4-6'),
    middleware: honcho.middleware({ userId, sessionId, injectHistory: false }),
  }),
  prompt: 'What do you know about my preferences?',
});

const injected = await generateText({
  model: wrapLanguageModel({
    model: anthropic('claude-sonnet-4-6'),
    middleware: honcho.middleware({ userId, sessionId }),
  }),
  prompt: 'What do you know about my preferences?',
});

console.log(injected.usage.inputTokens - baseline.usage.inputTokens);
```

A positive delta is Honcho's representation, peer card, and session summary being injected into the system prompt. Expect ~0 on a fresh peer — the deriver runs asynchronously after messages persist, so injected context only populates after a few prior turns.

**Dashboard (UI check).** Open [app.honcho.dev/explore](https://app.honcho.dev/explore), select your workspace, and confirm your peer and session appear under the Peers and Sessions tables.

With Honcho's contribution isolated, the rest of this section shows what the integration feels like in practice.

### 2. First turn

Send any message. The model responds normally — nothing has been stored yet, so context injection returns empty on this first turn. The exchange is persisted after generation.

### 3. Build memory across turns

Have a multi-turn conversation and share something about yourself:

```text
I prefer concise answers and I mostly work in TypeScript.
```

After a few turns, ask:

```text
What do you know about my preferences?
```

If the model references TypeScript and concise answers without being told again in this session, memory is working.
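
If you'd rather script this check than chat interactively, here's a minimal sketch that reuses the wrapped `model` from the Complete Example (two sequential calls, same `userId` and `sessionId`):

```typescript
// Turn 1: share a preference. The middleware persists it after generation.
await generateText({
  model,
  prompt: 'I prefer concise answers and I mostly work in TypeScript.',
});

// Turn 2: the stored turn is now part of the session, so it gets injected.
const { text } = await generateText({
  model,
  prompt: 'What do you know about my preferences?',
});
console.log(text); // look for mentions of TypeScript and concise answers
```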

### 4. Cross-session recall

Start a new session (new `sessionId`) with the same `userId`. Ask:

```text
Call your honcho_search tool with the query 'TypeScript' and quote the exact verbatim message that contained TypeScript. Do not paraphrase.
```

If the search returns a message from the prior session word-for-word, peer-scoped retrieval is crossing session boundaries. `honcho_search` queries the user's messages across all their sessions and doesn't depend on the deriver, so it works regardless of how short the prior session was.

To confirm the tool actually fired, inspect `result.steps[i].toolCalls`:

```typescript
const toolFires = result.steps?.flatMap((step, i) =>
  (step.toolCalls ?? []).map((tc) => ({ step: i, tool: tc.toolName, input: tc.input }))
) ?? [];
console.log(toolFires);
// [{ step: 0, tool: "honcho_search", input: { query: "TypeScript", limit: 10 } }]
```

When the model takes more than one step (call a tool, see the result, then answer), the top-level `result.toolCalls` is empty — check inside each `step`.

## Full Script

<Accordion title="honcho_vercel_chat.ts">
```typescript
/**
 * Multi-turn chat with Honcho memory + Vercel AI SDK.
 *
 * Prerequisites:
 * 1. Install dependencies:
 *    npm install @honcho-ai/vercel-ai-sdk ai @ai-sdk/anthropic dotenv
 * 2. Set environment variables in `.env`:
 *    HONCHO_API_KEY=your-honcho-api-key
 *    HONCHO_WORKSPACE_ID=your-workspace-id
 *    ANTHROPIC_API_KEY=your-anthropic-api-key
 * 3. Run with: npx tsx honcho_vercel_chat.ts
 *
 * Pass a stable userId from your auth system and a sessionId for the conversation
 * thread; Honcho handles persistence and context injection on every turn.
 */

import 'dotenv/config';
import { createHoncho } from '@honcho-ai/vercel-ai-sdk';
import { wrapLanguageModel, generateText, stepCountIs } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import * as readline from 'node:readline/promises';
import { stdin as input, stdout as output } from 'node:process';

const honcho = createHoncho({
  defaultAssistantId: 'assistant',
});

const userId = process.env.USER_ID ?? 'demo-user';
const sessionId = process.env.SESSION_ID ?? `session-${Date.now()}`;

const model = wrapLanguageModel({
  model: anthropic('claude-sonnet-4-6'),
  middleware: honcho.middleware({ userId, sessionId }),
});

async function chat(prompt: string): Promise<string> {
  const { text } = await generateText({
    model,
    tools: honcho.tools({ userId, sessionId }),
    stopWhen: stepCountIs(3),
    prompt,
  });
  return text;
}

async function main() {
  const rl = readline.createInterface({ input, output });
  console.log(`Honcho session: ${sessionId} (user: ${userId})`);
  console.log('Type a message, or "exit" to quit.\n');

  while (true) {
    const userMessage = (await rl.question('you > ')).trim();
    if (!userMessage || userMessage === 'exit') break;
    const reply = await chat(userMessage);
    console.log(`bot > ${reply}\n`);
  }

  rl.close();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```
</Accordion>

## Next Steps

<CardGroup cols={2}>
  <Card title="GitHub Repository" icon="github" href="https://github.com/plastic-labs/vercel-ai-sdk-package">
    Source, tests, and full API reference for @honcho-ai/vercel-ai-sdk.
  </Card>

  <Card title="Honcho Architecture" icon="sitemap" href="/v3/documentation/core-concepts/architecture">
    Learn about peers, sessions, and dialectic reasoning.
  </Card>

  <Card title="Self-Hosting Guide" icon="server" href="/v3/contributing/self-hosting">
    Run Honcho locally with your Vercel AI SDK app.
  </Card>

  <Card title="Vercel AI SDK Docs" icon="book" href="https://sdk.vercel.ai">
    wrapLanguageModel, middleware, and tool use reference.
  </Card>
</CardGroup>
