Commit 4fa6147

grypez, claude, and sirtimid authored
feat(kernel-language-model-service): Add language model client (#876)
## Summary

Adds a client constructor to `@ocap/kernel-language-model-service`, so that Ollama or any OpenAI-compatible language model server can be wired into the kernel as a first-class kernel service, and extends `@ocap/kernel-agents` with a chat-completion-based agent that uses the standard tool-calling interface. Verified end-to-end with integration and e2e tests in `@ocap/kernel-test-local`.

### `feat(kernel-language-model-service): Add /v1 API support and language model client`

**Language model service supported backends**:

- Open /v1: The OpenAI `/v1/chat/completions` API is the de facto standard across the industry, as evidenced by spaghetti compatibility across LiteLLM, Ollama, llama.cpp, and Anthropic.
- Ollama: Finer detail is possible with the Ollama raw completions API, which generates raw token streams instead of structured chat objects. The REPL strategies depend on such support.

**Vat clients** (`src/client.ts`): `makeChatClient` returns an OpenAI-SDK-shaped object (`client.chat.completions.create(...)`) that routes calls through a CapTP `ChatService` reference via `E()`. `makeSampleClient` is the equivalent for raw token sampling.

**Test utilities** (`src/test-utils/`): Replaces the previous queue-based mock model with `makeMockOpenV1Fetch`, a deterministic SSE mock that emits a configured sequence of token strings, keeping integration tests CI-safe and fast. `makeMockSample` provides the same for the sample path.

Also updates `kernel-test` to use the new chat/sample API: replaces `lms-user-vat` / `lms-queue-vat` with `lms-chat-vat` / `lms-sample-vat` and the corresponding test files.

### `feat(kernel-agents): chat-completion-based agent with tool-calling`

Adds `makeChatAgent` to `@ocap/kernel-agents` (exported as `@ocap/kernel-agents/chat`), a capability-augmented agent built on the chat completions API. Capabilities are passed as `tools`, responses arrive via `tool_calls`, and results are returned as `role: "tool"` messages.

### `test(kernel-test-local): kernel-LMS integration tests`

Adds two test variants that share a common `runLmsChatKernelTest` helper:

- **`lms-chat.test.ts`**: injects `makeMockOpenV1Fetch`; no network, runs under `test:dev`.
- **`lms-chat.e2e.test.ts`** (local): uses real `fetch` against a local Ollama instance, run via `test:e2e:local`.

Both launch a subcluster with `lms-chat-vat` through the kernel, register the LMS kernel service, and assert that the vat logs the model's response.

Also adds `agents.e2e.test.ts`, which exercises the chat/json/repl agents end-to-end, and `test/suite.test.ts`, a language-model-service pre-flight check that applies whichever backend is in use (ollama, llama-cpp, lite-llm, etc.).

The package layout now conforms to `kernel-test`: all vats, helpers, and test files live under `src/`; `test/` holds only the pre-flight suite.

## Test plan

- [ ] `yarn workspace @ocap/kernel-test-local test:dev:quiet`: unit + integration tests pass (no Ollama required)
- [ ] `yarn workspace @ocap/kernel-language-model-service test:dev:quiet`: all klms unit tests pass
- [ ] `yarn workspace @ocap/kernel-agents test:dev:quiet`: all kernel-agents unit tests pass
- [ ] With Ollama running and `llama3.2:3b` pulled: `yarn workspace @ocap/kernel-test-local test:e2e:local`
- [ ] Or, with llama.cpp on localhost:8080 and `glm-4.7-flash` pulled and served: `yarn workspace @ocap/kernel-test-local test:e2e:local:llama-cpp`

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Moderate risk due to introducing new public exports and wiring for chat/tool-calling plus new Open /v1 and expanded Ollama service paths, which could affect downstream integrations and streaming behavior.
>
> **Overview**
> Adds a first-class chat-completions client layer to `@ocap/kernel-language-model-service` (`makeChatClient`/`makeSampleClient`) plus a kernel-service wrapper (`makeKernelLanguageModelService`) so vats can call LMS backends via a stable `/v1`-style API.
>
> Introduces an OpenAI-compatible `/v1` Node.js backend with parameter/response validation, JSON stripping, and SSE streaming support, and expands the Ollama backend to support `chat` and both streaming and non-streaming raw `sample` requests (including tool-call argument parsing).
>
> Extends `@ocap/kernel-agents` with a new `makeChatAgent` strategy (exported as `@ocap/kernel-agents/chat`) that drives tool-calling loops, validates tool args via new `kernel-utils` JSON-schema→Superstruct helpers, and records chat turns as experiences; updates local and kernel tests to exercise the new service/client paths and replaces the old queue-based LMS test utils with deterministic `makeMockOpenV1Fetch`/`makeMockSample`.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 6b23244. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>

<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Dimitris Marlagkoutsos <info@sirtimid.com>
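The commit message describes `makeMockOpenV1Fetch` as a deterministic SSE mock that emits a configured sequence of token strings. The mock itself is not shown in this excerpt, so the following is only a sketch of the wire framing such a mock might produce, assuming the usual OpenAI streaming layout (`data: <json>` events separated by blank lines and terminated by `data: [DONE]`). The names `ChatChunk`, `encodeSse`, and `decodeSse` are hypothetical and do not come from the commit.

```typescript
// Hypothetical helpers for illustration; not part of the commit.
type ChatChunk = {
  id: string;
  object: 'chat.completion.chunk';
  model: string;
  choices: {
    index: number;
    delta: { content?: string };
    finish_reason: 'stop' | null;
  }[];
};

/** Encode a fixed token sequence as an OpenAI-style SSE response body. */
function encodeSse(tokens: string[], model = 'mock'): string {
  const chunk = (
    delta: { content?: string },
    finishReason: 'stop' | null,
  ): ChatChunk => ({
    id: 'mock-0',
    object: 'chat.completion.chunk',
    model,
    choices: [{ index: 0, delta, finish_reason: finishReason }],
  });
  const events = [
    // One event per configured token, in order.
    ...tokens.map((token) => chunk({ content: token }, null)),
    // A final empty delta that carries the finish reason.
    chunk({}, 'stop'),
  ].map((event) => `data: ${JSON.stringify(event)}\n\n`);
  return `${events.join('')}data: [DONE]\n\n`;
}

/** Decode an SSE body back into the concatenated assistant text. */
function decodeSse(body: string): string {
  let text = '';
  for (const event of body.split('\n\n')) {
    if (!event.startsWith('data: ')) {
      continue; // skips the empty string left after the last separator
    }
    const payload = event.slice('data: '.length);
    if (payload === '[DONE]') {
      break;
    }
    const parsed = JSON.parse(payload) as ChatChunk;
    text += parsed.choices[0]?.delta.content ?? '';
  }
  return text;
}

const sseBody = encodeSse(['Hello', ', ', 'world!']);
console.log(decodeSse(sseBody)); // prints "Hello, world!"
```

Because the token sequence is fixed up front, a test built on this kind of framing needs no network and always sees the same stream, which is the property the commit message highlights.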
1 parent 7fe14e2 commit 4fa6147

72 files changed

Lines changed: 3837 additions & 1231 deletions

packages/kernel-agents/package.json

Lines changed: 11 additions & 0 deletions
@@ -33,6 +33,16 @@
         "default": "./dist/strategies/json-agent.cjs"
       }
     },
+    "./chat": {
+      "import": {
+        "types": "./dist/strategies/chat-agent.d.mts",
+        "default": "./dist/strategies/chat-agent.mjs"
+      },
+      "require": {
+        "types": "./dist/strategies/chat-agent.d.cts",
+        "default": "./dist/strategies/chat-agent.cjs"
+      }
+    },
     "./agent": {
       "import": {
         "types": "./dist/agent.d.mts",
@@ -186,6 +196,7 @@
     "@metamask/kernel-errors": "workspace:^",
     "@metamask/kernel-utils": "workspace:^",
     "@metamask/logger": "workspace:^",
+    "@metamask/superstruct": "^3.2.1",
     "@ocap/kernel-language-model-service": "workspace:^",
     "partial-json": "^0.1.7",
     "ses": "^1.14.0"
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+import { describe, expect, it } from 'vitest';
+
+import { validateCapabilityArgs } from './validate-capability-args.ts';
+
+describe('validateCapabilityArgs', () => {
+  it('accepts values matching primitive arg schemas', () => {
+    expect(() =>
+      validateCapabilityArgs(
+        { a: 1, b: 2 },
+        {
+          description: 'add',
+          args: {
+            a: { type: 'number' },
+            b: { type: 'number' },
+          },
+        },
+      ),
+    ).not.toThrow();
+  });
+
+  it('throws when a required argument is missing', () => {
+    expect(() =>
+      validateCapabilityArgs(
+        { a: 1 },
+        {
+          description: 'add',
+          args: {
+            a: { type: 'number' },
+            b: { type: 'number' },
+          },
+        },
+      ),
+    ).toThrow(/At path: b -- Expected a number/u);
+  });
+
+  it('throws when a value does not match the schema', () => {
+    expect(() =>
+      validateCapabilityArgs(
+        { a: 'not-a-number' },
+        {
+          description: 'x',
+          args: { a: { type: 'number' } },
+        },
+      ),
+    ).toThrow(/Expected a number/u);
+  });
+
+  it('does nothing when there are no declared arguments', () => {
+    expect(() =>
+      validateCapabilityArgs(
+        { extra: 1 },
+        {
+          description: 'ping',
+          args: {},
+        },
+      ),
+    ).not.toThrow();
+  });
+});
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+import { methodArgsToStruct } from '@metamask/kernel-utils/json-schema-to-struct';
+import { assert } from '@metamask/superstruct';
+
+import type { CapabilitySchema } from '../types.ts';
+
+/**
+ * Assert `values` match the capability's declared argument schemas using Superstruct.
+ *
+ * @param values - Parsed tool arguments (a plain object).
+ * @param capabilitySchema - {@link CapabilitySchema} for this capability.
+ */
+export function validateCapabilityArgs(
+  values: Record<string, unknown>,
+  capabilitySchema: CapabilitySchema<string>,
+): void {
+  assert(values, methodArgsToStruct(capabilitySchema.args));
+}
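`validateCapabilityArgs` delegates all of the checking to `methodArgsToStruct` and Superstruct's `assert`, neither of which is shown in this excerpt. As a rough, dependency-free illustration of the behavior the accompanying tests exercise, the sketch below uses a hypothetical stand-in named `checkArgs`. It is not the commit's implementation: it handles only primitive `type` schemas, treats every declared argument as required, ignores undeclared keys, and only approximates the error wording that the tests match against.

```typescript
// Stand-in for illustration only. The commit itself uses
// methodArgsToStruct + Superstruct's assert; nothing here is that API.
type PrimitiveSchema = { type: 'string' | 'number' | 'boolean' };
type ArgSchemas = Record<string, PrimitiveSchema>;

/**
 * Throw if any declared argument is missing or has the wrong primitive type.
 * Keys in `values` that are not declared in `args` are not inspected.
 */
function checkArgs(values: Record<string, unknown>, args: ArgSchemas): void {
  for (const [name, schema] of Object.entries(args)) {
    const value = values[name];
    // A missing key reads as undefined, so it fails the typeof comparison too.
    if (typeof value !== schema.type) {
      throw new Error(
        `At path: ${name} -- Expected a ${schema.type}, ` +
          `but received: ${JSON.stringify(value)}`,
      );
    }
  }
}

// Both declared arguments are present and numeric, so this does not throw.
checkArgs({ a: 1, b: 2 }, { a: { type: 'number' }, b: { type: 'number' } });

// No declared arguments, so the extra key is never examined.
checkArgs({ extra: 1 }, {});
```

Calling `checkArgs({ a: 1 }, { a: { type: 'number' }, b: { type: 'number' } })` throws with a message beginning `At path: b -- Expected a number`, mirroring the missing-argument test case.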
Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
+import '@ocap/repo-tools/test-utils/mock-endoify';
+
+import type {
+  ChatMessage,
+  ChatResult,
+  ToolCall,
+} from '@ocap/kernel-language-model-service';
+import { describe, expect, it, vi } from 'vitest';
+
+import { makeChatAgent } from './chat-agent.ts';
+import type { BoundChat } from './chat-agent.ts';
+import { capability } from '../capabilities/capability.ts';
+
+const makeToolCall = (
+  id: string,
+  name: string,
+  args: Record<string, unknown>,
+): ToolCall => ({
+  id,
+  type: 'function',
+  function: { name, arguments: JSON.stringify(args) },
+});
+
+const makeTextResponse = (content: string): ChatResult => ({
+  id: '0',
+  model: 'test',
+  choices: [
+    {
+      message: { role: 'assistant', content },
+      index: 0,
+      finish_reason: 'stop',
+    },
+  ],
+  usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
+});
+
+const makeToolCallResponse = (
+  id: string,
+  toolCalls: ToolCall[],
+): ChatResult => ({
+  id,
+  model: 'test',
+  choices: [
+    {
+      message: { role: 'assistant', content: '', tool_calls: toolCalls },
+      index: 0,
+      finish_reason: 'tool_calls',
+    },
+  ],
+  usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
+});
+
+const noCapabilities = {};
+
+describe('makeChatAgent', () => {
+  it('returns plain text response when model does not invoke a tool', async () => {
+    const chat: BoundChat = async () => makeTextResponse('Hello, world!');
+    const agent = makeChatAgent({ chat, capabilities: noCapabilities });
+
+    const result = await agent.task('say hello');
+    expect(result).toBe('Hello, world!');
+  });
+
+  it('dispatches a tool call and returns final text answer', async () => {
+    const add = vi.fn(async ({ a, b }: { a: number; b: number }) => a + b);
+    const addCap = capability(add, {
+      description: 'Add two numbers',
+      args: {
+        a: { type: 'number' },
+        b: { type: 'number' },
+      },
+      returns: { type: 'number' },
+    });
+
+    let call = 0;
+    const chat: BoundChat = async () => {
+      call += 1;
+      if (call === 1) {
+        return makeToolCallResponse('0', [
+          makeToolCall('c1', 'add', { a: 3, b: 4 }),
+        ]);
+      }
+      return makeTextResponse('7');
+    };
+
+    const agent = makeChatAgent({ chat, capabilities: { add: addCap } });
+
+    const result = await agent.task('add 3 and 4');
+    expect(add).toHaveBeenCalledWith({ a: 3, b: 4 });
+    expect(result).toBe('7');
+  });
+
+  it('injects tool result message before next turn', async () => {
+    const recorded: ChatMessage[][] = [];
+    const ping = capability(async () => 'pong', {
+      description: 'Ping',
+      args: {},
+      returns: { type: 'string' },
+    });
+
+    let call = 0;
+    const chat: BoundChat = async ({ messages }) => {
+      recorded.push([...messages]);
+      call += 1;
+      if (call === 1) {
+        return makeToolCallResponse('0', [makeToolCall('c1', 'ping', {})]);
+      }
+      return makeTextResponse('done');
+    };
+
+    const agent = makeChatAgent({ chat, capabilities: { ping } });
+    await agent.task('ping');
+
+    // Second turn must include the tool result message
+    const secondTurn = recorded[1] ?? [];
+    expect(
+      secondTurn.some(
+        (message) => message.role === 'tool' && message.tool_call_id === 'c1',
+      ),
+    ).toBe(true);
+    expect(secondTurn.some((message) => message.content === '"pong"')).toBe(
+      true,
+    );
+  });
+
+  it('injects error message for unknown tool and continues', async () => {
+    const recorded: ChatMessage[][] = [];
+    let call = 0;
+    const chat: BoundChat = async ({ messages }) => {
+      recorded.push([...messages]);
+      call += 1;
+      if (call === 1) {
+        return makeToolCallResponse('0', [
+          makeToolCall('c1', 'nonexistent', {}),
+        ]);
+      }
+      return makeTextResponse('recovered');
+    };
+
+    const agent = makeChatAgent({ chat, capabilities: noCapabilities });
+    const result = await agent.task('do something');
+
+    expect(result).toBe('recovered');
+    const secondTurn = recorded[1] ?? [];
+    expect(
+      secondTurn.some(
+        (message) =>
+          message.role === 'tool' &&
+          message.content.includes('Unknown capability'),
+      ),
+    ).toBe(true);
+  });
+
+  it('throws when invocation budget is exceeded', async () => {
+    const ping = capability(async () => 'pong', {
+      description: 'Ping',
+      args: {},
+    });
+    const chat: BoundChat = async () =>
+      makeToolCallResponse('0', [makeToolCall('c1', 'ping', {})]);
+
+    const agent = makeChatAgent({ chat, capabilities: { ping } });
+
+    await expect(
+      agent.task('go', undefined, { invocationBudget: 3 }),
+    ).rejects.toThrow('Invocation budget exceeded');
+  });
+
+  it('applies judgment to final answer', async () => {
+    const chat: BoundChat = async () => makeTextResponse('hello');
+    const agent = makeChatAgent({ chat, capabilities: noCapabilities });
+
+    const isNumber = (result: unknown): result is number =>
+      typeof result === 'number';
+    await expect(agent.task('go', isNumber)).rejects.toThrow('Invalid result');
+  });
+
+  it('passes tools to the chat function', async () => {
+    const recordedTools: unknown[] = [];
+    const ping = capability(async () => 'pong', {
+      description: 'Ping the server',
+      args: {},
+      returns: { type: 'string' },
+    });
+
+    const chat: BoundChat = async ({ tools }) => {
+      recordedTools.push(tools);
+      return makeTextResponse('done');
+    };
+
+    const agent = makeChatAgent({ chat, capabilities: { ping } });
+    await agent.task('go');
+
+    expect(recordedTools[0]).toStrictEqual([
+      {
+        type: 'function',
+        function: {
+          name: 'ping',
+          description: 'Ping the server',
+          parameters: { type: 'object', properties: {}, required: [] },
+        },
+      },
+    ]);
+  });
+
+  it('passes undefined tools when there are no capabilities', async () => {
+    let recordedTools: unknown = 'not-set';
+    const chat: BoundChat = async ({ tools }) => {
+      recordedTools = tools;
+      return makeTextResponse('done');
+    };
+
+    const agent = makeChatAgent({ chat, capabilities: noCapabilities });
+    await agent.task('go');
+
+    expect(recordedTools).toBeUndefined();
+  });
+
+  it('accumulates experiences across tasks', async () => {
+    let call = 0;
+    const responses = ['hello', 'world'];
+    const chat: BoundChat = async () => {
+      const response = makeTextResponse(responses[call] ?? '');
+      call += 1;
+      return response;
+    };
+    const agent = makeChatAgent({ chat, capabilities: noCapabilities });
+
+    await agent.task('first');
+    await agent.task('second');
+
+    const exps = [];
+    for await (const exp of agent.experiences) {
+      exps.push(exp);
+    }
+    expect(exps).toHaveLength(2);
+    expect(exps[0]?.objective.intent).toBe('first');
+    expect(exps[1]?.objective.intent).toBe('second');
+  });
+});
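The tests above pin down the observable behavior of the tool-calling loop, but `chat-agent.ts` itself is not part of this excerpt. The sketch below is one way such a loop could be written to satisfy the same observations: tools derived from capabilities (or `undefined` when there are none), JSON-encoded results returned as `role: 'tool'` messages, an error message for unknown tools, and a budget on chat invocations. Every name here (`runTask`, `SketchCapability`, `SketchChat`, and so on) is hypothetical, and treating all declared args as `required` and counting the budget per chat call are assumptions, not facts about the commit.

```typescript
// Illustrative sketch only; none of these names come from chat-agent.ts.
type SketchToolCall = {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
};
type AssistantMessage = {
  role: 'assistant';
  content: string;
  tool_calls?: SketchToolCall[];
};
type SketchMessage =
  | { role: 'user'; content: string }
  | AssistantMessage
  | { role: 'tool'; content: string; tool_call_id: string };
type SketchTool = {
  type: 'function';
  function: {
    name: string;
    description: string;
    parameters: {
      type: 'object';
      properties: Record<string, unknown>;
      required: string[];
    };
  };
};
type SketchChat = (request: {
  messages: SketchMessage[];
  tools?: SketchTool[] | undefined;
}) => Promise<{ choices: { message: AssistantMessage }[] }>;
type SketchCapability = {
  description: string;
  args: Record<string, { type: string }>;
  func: (args: Record<string, unknown>) => Promise<unknown>;
};

async function runTask(
  chat: SketchChat,
  capabilities: Record<string, SketchCapability>,
  intent: string,
  invocationBudget = 10,
): Promise<string> {
  const entries = Object.entries(capabilities);
  // With no capabilities, omit tools entirely rather than sending [].
  const tools: SketchTool[] | undefined =
    entries.length === 0
      ? undefined
      : entries.map(([name, cap]) => ({
          type: 'function' as const,
          function: {
            name,
            description: cap.description,
            parameters: {
              type: 'object' as const,
              properties: cap.args,
              required: Object.keys(cap.args), // assumption: all args required
            },
          },
        }));
  const messages: SketchMessage[] = [{ role: 'user', content: intent }];

  for (let turn = 0; turn < invocationBudget; turn += 1) {
    const result = await chat({ messages, tools });
    const reply = result.choices[0]?.message;
    if (reply === undefined) {
      throw new Error('No choices returned');
    }
    messages.push(reply);
    const calls = reply.tool_calls ?? [];
    if (calls.length === 0) {
      return reply.content; // plain text means the model is done
    }
    for (const call of calls) {
      const cap = capabilities[call.function.name];
      let content: string;
      if (cap === undefined) {
        content = `Error: Unknown capability ${call.function.name}`;
      } else {
        try {
          const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
          content = JSON.stringify(await cap.func(args));
        } catch (error) {
          content = `Error: ${String(error)}`;
        }
      }
      // Each result is tied back to its request via tool_call_id.
      messages.push({ role: 'tool', content, tool_call_id: call.id });
    }
  }
  throw new Error('Invocation budget exceeded');
}
```

Feeding errors back to the model as tool messages, instead of throwing, is what lets a scripted model "recover" on the next turn as in the unknown-tool test; only exhausting the budget ends the task with an exception in this sketch.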
