
Commit 09c8190

List a connection's tools from a full OpenAPI spec on Workers without re-parsing it (#1084)
* openapi: stream compile + serve from content-addressed bindings (no re-parse)

  The serve path (resolveOpenApiBackedTools, the 2nd OOM site) re-parsed the whole spec on every tools/list. For the full Microsoft Graph spec (37MB / 16.5k ops) that rebuilt a ~300MB document tree and OOM'd the 128MB Workers isolate even after the add-path parser swap.

  Add a memory-safe compile + serve path that never re-parses:

  - streamOperationBindings (extract.ts): two-pass streaming compile. Pass 1 plans tool paths from schema-free metadata (planToolPaths, extracted from compileToolDefinitions); pass 2 builds invocation bindings in bounded chunks, flushing each before the next so a huge spec's bindings are never all co-resident with the parsed tree.
  - Content-addressed defs blob (store.ts: putDefs/getDefs, keyed by specHash): the normalized #/$defs schemas, built once per spec at add time and shared across tenants. buildDefsJson serializes them one schema at a time so the normalized tree never co-resides with the parsed document.
  - Persist a resolved description per operation so the serve path rebuilds each tool def from the binding alone.
  - resolveOpenApiBackedTools fast path: getDefs(specHash) + listOperations + toolDefFromStoredOperation, falling back to a spec re-parse for legacy rows or a missing/corrupt blob.
  - Route the openapi, microsoft (full-graph), and google add/update paths through the streaming persist + putDefs.

* e2e: guard full Graph catalog add + serve at real scale

  Drives the public API to add every Microsoft Graph workload (the full ~37MB / 16.5k-operation spec) and then list the served catalog. Pins both former OOM sites at real scale: the streaming add persists one binding per operation, and tools/list rebuilds >5000 tools from the persisted bindings plus the content-addressed defs blob, never re-parsing the spec. Proven green on cloud and selfhost.

* openapi: keep the file-emit hint on the fast serve path

  The content-addressed serve path rebuilds a tool def from the persisted binding instead of re-parsing the spec. The re-parse path (kept as the fallback) appends the ToolFile emit contract to a file-returning tool's description at the projection step; the fast path bypassed that step and served the bare description, so a file-returning operation lost the contract whenever it was served from the binding (the common case once a spec is persisted).

  Apply the hint at the same projection step in the fast path, sourced from the binding's response fileHint (already inspected for the output schema), so a file tool carries the contract identically whether served fast or via the fallback. No persist-path or schema change: existing rows pick it up on the next list. Covered by the tool-descriptions e2e scenario.
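
The fast-path/fallback split and the shared projection step described in the message can be pictured with a small sketch. This is not the code from the commit: `StoredOperation`, `ToolDef`, `projectToolDef`, `isUsableBlob`, `resolveTools`, and the hint wording are all hypothetical stand-ins, and the real serve path is effectful rather than synchronous. It only illustrates why routing both paths through one projection function keeps a file tool's description identical either way.

```typescript
// Hypothetical shapes for illustration; the real binding and tool types are not shown on this page.
type StoredOperation = {
  readonly toolName: string;
  readonly description: string;
  readonly returnsFile: boolean; // stands in for the binding's response fileHint
};

type ToolDef = { readonly name: string; readonly description: string };

// Placeholder wording; the actual ToolFile emit contract text is an assumption.
const FILE_EMIT_HINT = "This tool returns a file; emit the result as a ToolFile.";

// The single projection step. Both serve paths call it, so neither can skip the hint.
const projectToolDef = (op: StoredOperation): ToolDef => ({
  name: op.toolName,
  description: op.returnsFile ? `${op.description}\n\n${FILE_EMIT_HINT}` : op.description,
});

// True only when the blob exists and parses as JSON; a missing or corrupt blob forces the fallback.
const isUsableBlob = (blob: string | null): boolean => {
  if (blob === null) return false;
  try {
    JSON.parse(blob);
    return true;
  } catch {
    return false;
  }
};

// Fast path: serve from persisted operations. Fallback: re-derive them (simulated by a callback).
const resolveTools = (
  defsBlob: string | null,
  stored: readonly StoredOperation[],
  reparse: () => readonly StoredOperation[],
): { readonly path: "fast" | "fallback"; readonly tools: readonly ToolDef[] } => {
  const fast = isUsableBlob(defsBlob) && stored.length > 0;
  const ops = fast ? stored : reparse();
  return { path: fast ? "fast" : "fallback", tools: ops.map(projectToolDef) };
};
```

Because the hint is applied inside `projectToolDef` rather than in either branch, a regression like the one fixed here would need to bypass the shared function, which is easier to catch in review.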
1 parent 395ba97 commit 09c8190

11 files changed

Lines changed: 641 additions & 78 deletions

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
+import { randomBytes } from "node:crypto";
+
+import { expect } from "@effect/vitest";
+import { Effect } from "effect";
+import { composePluginApi } from "@executor-js/api/server";
+import {
+  MICROSOFT_AUTH_TEMPLATE_SLUG,
+  MICROSOFT_GRAPH_ALL_PRESET_IDS,
+  MICROSOFT_GRAPH_DELEGATED_DEFAULT_SCOPES,
+} from "@executor-js/plugin-microsoft";
+import { microsoftHttpPlugin } from "@executor-js/plugin-microsoft/api";
+import { AuthTemplateSlug, ConnectionName, IntegrationSlug } from "@executor-js/sdk/shared";
+
+import { scenario } from "../src/scenario";
+import { Api, Target } from "../src/services";
+
+const api = composePluginApi([microsoftHttpPlugin()] as const);
+
+type ToolView = {
+  readonly name: string;
+};
+
+const unique = (prefix: string) => `${prefix}_${randomBytes(4).toString("hex")}`;
+
+// Adding *every* Graph workload pulls the full Microsoft Graph OpenAPI document
+// (~37MB, ~16.5k operations) and persists a binding per operation. That whole-
+// document path used to 503 on the Cloudflare worker: parsing the spec, and
+// then re-parsing it on every tools/list, each rebuilt a ~300MB JS tree that
+// blew the 128MB isolate. This scenario is the regression guard for both sites
+// at real scale: the add streams the compile + persist, and tools/list serves
+// the catalog back from the persisted bindings (+ the content-addressed defs
+// blob) without ever re-parsing the spec. It drives only the public API, so a
+// green run is evidence the full catalog lands and serves end to end.
+scenario(
+  "Microsoft Graph: the full catalog adds and serves without re-parsing the spec",
+  { timeout: 300_000 },
+  Effect.gen(function* () {
+    const target = yield* Target;
+    const { client: makeApiClient } = yield* Api;
+    const identity = yield* target.newIdentity();
+    const client = yield* makeApiClient(api, identity);
+
+    const integration = unique("msgraph_full");
+    const connection = ConnectionName.make("main");
+
+    yield* Effect.ensuring(
+      Effect.gen(function* () {
+        // Add path (1st former OOM site): the full spec is fetched and
+        // stream-compiled into one persisted binding per operation.
+        const added = yield* client.microsoft.addGraph({
+          payload: {
+            presetIds: [...MICROSOFT_GRAPH_ALL_PRESET_IDS],
+            customScopes: [],
+            slug: integration,
+            name: "Microsoft Graph (full)",
+          },
+        });
+        expect(added.slug, "the full Graph source keeps the requested slug").toBe(integration);
+        expect(
+          added.toolCount,
+          "adding every Graph workload extracts the whole catalog (thousands of operations)",
+        ).toBeGreaterThan(5_000);
+
+        const config = yield* client.microsoft.getConfig({ params: { slug: integration } });
+        expect(config?.microsoftGraphPresetIds, "every Graph workload preset is persisted").toEqual(
+          [...MICROSOFT_GRAPH_ALL_PRESET_IDS],
+        );
+        expect(
+          config?.microsoftGraphCoversFullGraph,
+          "selecting every workload is recognized as full Graph",
+        ).toBe(true);
+        expect(
+          config?.microsoftGraphScopes,
+          "full Graph delegates the app-registration default scope set",
+        ).toEqual([...MICROSOFT_GRAPH_DELEGATED_DEFAULT_SCOPES]);
+
+        yield* client.connections.create({
+          payload: {
+            owner: "org",
+            name: connection,
+            integration: IntegrationSlug.make(integration),
+            template: AuthTemplateSlug.make(MICROSOFT_AUTH_TEMPLATE_SLUG),
+            value: "token-xyz",
+          },
+        });
+
+        // Serve path (2nd former OOM site): tools/list rebuilds the catalog from
+        // the persisted bindings. The whole catalog must come back, with real
+        // descriptions, and without re-parsing the 37MB spec.
+        const tools = yield* client.tools.list({
+          query: { integration: IntegrationSlug.make(integration), connection },
+        });
+        expect(
+          tools.length,
+          "the served catalog returns the whole set of operations, not a re-parse failure",
+        ).toBeGreaterThan(5_000);
+
+        const names = tools.map((tool: ToolView) => tool.name);
+        const messageTools = names.filter((name) => name.toLowerCase().includes("message"));
+        const siteTools = names.filter((name) => name.toLowerCase().includes("site"));
+        const userTools = names.filter((name) => name.toLowerCase().includes("user"));
+        expect(messageTools, "the served catalog spans mail operations").not.toEqual([]);
+        expect(siteTools, "the served catalog spans SharePoint site operations").not.toEqual([]);
+        expect(userTools, "the served catalog spans directory user operations").not.toEqual([]);
+      }),
+      Effect.gen(function* () {
+        yield* client.connections
+          .remove({
+            params: {
+              owner: "org",
+              integration: IntegrationSlug.make(integration),
+              name: connection,
+            },
+          })
+          .pipe(Effect.ignore);
+        yield* client.microsoft
+          .removeGraph({ params: { slug: IntegrationSlug.make(integration) } })
+          .pipe(Effect.ignore);
+      }),
+    );
+  }),
+);
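
The add path this scenario guards is described in the commit message as a two-pass streaming compile with bounded chunks. The sketch below is a simplified, synchronous illustration of that idea under stated assumptions: `OperationMeta`, `Binding`, `planToolNames`, `persistInChunks`, and the collision-suffix rule are hypothetical, and the real `streamOperationBindings` in extract.ts is not reproduced here.

```typescript
// Hypothetical, simplified shapes; the real extract.ts types are not shown on this page.
type OperationMeta = { readonly operationId: string; readonly method: string; readonly path: string };
type Binding = { readonly toolName: string; readonly method: string; readonly path: string };

// Pass 1: plan one unique tool name per operation from lightweight metadata only.
// The suffixing rule is an assumption, chosen just to make collisions visible.
const planToolNames = (ops: readonly OperationMeta[]): string[] => {
  const seen = new Map<string, number>();
  return ops.map((op) => {
    const count = (seen.get(op.operationId) ?? 0) + 1;
    seen.set(op.operationId, count);
    return count === 1 ? op.operationId : `${op.operationId}_${count}`;
  });
};

// Pass 2: build bindings in bounded chunks and flush each chunk before starting
// the next, so at most `chunkSize` bindings are held in memory at any time.
const persistInChunks = (
  ops: readonly OperationMeta[],
  chunkSize: number,
  flush: (chunk: readonly Binding[]) => void,
): number => {
  const names = planToolNames(ops);
  let chunk: Binding[] = [];
  let total = 0;
  ops.forEach((op, index) => {
    chunk.push({ toolName: names[index] ?? op.operationId, method: op.method, path: op.path });
    if (chunk.length >= chunkSize) {
      flush(chunk);
      total += chunk.length;
      chunk = [];
    }
  });
  // Flush the final partial chunk, if any.
  if (chunk.length > 0) {
    flush(chunk);
    total += chunk.length;
  }
  return total;
};
```

In this sketch the returned count plays the role of the `toolCount` the scenario asserts on; the peak number of live bindings is bounded by `chunkSize` regardless of how many operations the spec contains.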

packages/plugins/google/src/sdk/plugin.ts

Lines changed: 4 additions & 1 deletion
@@ -143,6 +143,7 @@ const makeGooglePluginExtension = (
       };

       yield* ctx.storage.putSpec(specHash, conversion.specText);
+      yield* ctx.storage.putDefs(specHash, JSON.stringify(compiled.hoistedDefs));

       yield* ctx.transaction(
         Effect.gen(function* () {
@@ -183,6 +184,7 @@ const makeGooglePluginExtension = (

       const specHash = yield* sha256Hex(conversion.specText);
       yield* ctx.storage.putSpec(specHash, conversion.specText);
+      yield* ctx.storage.putDefs(specHash, JSON.stringify(compiled.hoistedDefs));

       const nextConfig: GoogleIntegrationConfig = {
         ...current,
@@ -298,7 +300,8 @@ export const googlePlugin = definePlugin((options?: GooglePluginOptions) => ({
   describeAuthMethods: describeGoogleAuthMethods,
   describeIntegrationDisplay: describeGoogleIntegrationDisplay,

-  resolveTools: ({ config, storage }) => resolveOpenApiBackedTools({ config, storage }),
+  resolveTools: ({ integration, config, storage }) =>
+    resolveOpenApiBackedTools({ integration, config, storage }),

   invokeTool: ({ ctx, toolRow, credential, args }) => {
     const httpClientLayer = options?.httpClientLayer ?? ctx.httpClientLayer;

packages/plugins/microsoft/src/sdk/plugin.test.ts

Lines changed: 40 additions & 0 deletions
@@ -33,6 +33,7 @@ paths:
   /me:
     get:
       operationId: me.GetUser
+      summary: Get the signed-in user profile
       security:
         - azureAdDelegated:
             - User.Read
@@ -341,6 +342,45 @@ describe("Microsoft Graph provider", () => {
         expect(delegated?.kind === "oauth2" ? delegated.scopes : undefined).toEqual([
           ...MICROSOFT_GRAPH_DELEGATED_DEFAULT_SCOPES,
         ]);
+
+        // Full-graph add routes through the streaming `parsedDocument` persist
+        // branch (the path the real 37MB spec takes): it persists each op's
+        // binding plus a `description` and writes the content-addressed defs
+        // blob, never re-parsing on serve. Read the operations back through the
+        // live serve path to prove they landed in storage AND that the serve
+        // fast path rebuilds tools from the persisted bindings (no spec parse).
+        yield* executor.connections.create({
+          owner: "org",
+          name: ConnectionName.make("full"),
+          integration: IntegrationSlug.make("microsoft_graph_full"),
+          template: AuthTemplateSlug.make(MICROSOFT_AUTH_TEMPLATE_SLUG),
+          value: "token-xyz",
+        });
+
+        const tools = yield* executor.tools.list();
+        const toolNames = tools.map((tool) => String(tool.name));
+        expect(toolNames).toContain("me.getUser");
+        expect(toolNames).toContain("me.messagesListMessages");
+        expect(toolNames).toContain("sites.listSites");
+
+        // The serve fast path must rebuild every tool's description from the
+        // persisted operation, not drop it. Each graph tool carries a non-empty
+        // description.
+        for (const tool of tools) {
+          expect(tool.description.length).toBeGreaterThan(0);
+        }
+
+        // `me.getUser`'s spec summary survives the add -> persist -> serve
+        // round-trip. The bare `${METHOD} ${path}` fallback inside the serve
+        // path would be "GET /me", so matching the summary proves the persisted
+        // `description` field is what's served.
+        const getUser = tools.find((tool) => String(tool.name) === "me.getUser");
+        expect(getUser?.description).toBe("Get the signed-in user profile");
+
+        // An op without a spec summary falls back to `${METHOD} ${path}`, also
+        // sourced from the persisted binding on the serve fast path.
+        const listSites = tools.find((tool) => String(tool.name) === "sites.listSites");
+        expect(listSites?.description).toBe("GET /sites");
       }),
     ),
   );

packages/plugins/microsoft/src/sdk/plugin.ts

Lines changed: 38 additions & 26 deletions
@@ -15,21 +15,25 @@ import {
   type IntegrationConfig,
   type IntegrationRecord,
   type PluginCtx,
+  type StorageFailure,
 } from "@executor-js/sdk/core";
 import { describeApiKeyAuthMethod } from "@executor-js/sdk/http-auth";
 import {
-  compileOpenApiDocument,
-  compileOpenApiSpec,
+  compileAndPersistOpenApiOperations,
+  compileAndPersistOpenApiSpec,
   decodeOpenApiIntegrationConfig,
   invokeOpenApiBackedTool,
   makeDefaultOpenapiStore,
   normalizeOpenApiAuthInputs,
-  openApiStoredOperationsFromCompiled,
+  OpenApiExtractionError,
+  OpenApiParseError,
   resolveOpenApiBackedAnnotations,
   resolveOpenApiBackedTools,
   type Authentication,
   type AuthenticationInput,
+  type OpenApiPersistResult,
   type OpenapiStore,
+  type ParsedDocument,
 } from "@executor-js/plugin-openapi";

 import {
@@ -122,13 +126,31 @@ const makeMicrosoftPluginExtension = (
   ctx: PluginCtx<OpenapiStore>,
   httpClientLayer: Layer.Layer<HttpClient.HttpClient, never, never>,
 ) => {
+  const persistGraphOperations = (
+    graph: { readonly parsedDocument?: ParsedDocument; readonly specText: string },
+    integration: string,
+    specHash: string,
+  ): Effect.Effect<
+    OpenApiPersistResult,
+    OpenApiExtractionError | OpenApiParseError | StorageFailure
+  > =>
+    graph.parsedDocument !== undefined
+      ? compileAndPersistOpenApiOperations({
+          doc: graph.parsedDocument,
+          integration,
+          storage: ctx.storage,
+          specHash,
+        })
+      : compileAndPersistOpenApiSpec({
+          specText: graph.specText,
+          integration,
+          storage: ctx.storage,
+          specHash,
+        });
+
   const addGraph = (config: MicrosoftGraphConfig) =>
     Effect.gen(function* () {
       const graph = yield* buildMicrosoftGraphOpenApiSpec(config, httpClientLayer);
-      const compiled =
-        graph.parsedDocument !== undefined
-          ? yield* compileOpenApiDocument(graph.parsedDocument)
-          : yield* compileOpenApiSpec(graph.specText);
       const slug = IntegrationSlug.make(config.slug?.trim() || DEFAULT_MICROSOFT_SLUG);

       const existing = yield* ctx.core.integrations.get(slug);
@@ -156,7 +178,7 @@ const makeMicrosoftPluginExtension = (

       yield* ctx.storage.putSpec(specHash, graph.specText);

-      yield* ctx.transaction(
+      const persisted = yield* ctx.transaction(
         Effect.gen(function* () {
           yield* ctx.core.integrations.register({
             slug,
@@ -167,14 +189,11 @@ const makeMicrosoftPluginExtension = (
             canRemove: true,
             canRefresh: true,
           });
-          yield* ctx.storage.putOperations(
-            String(slug),
-            openApiStoredOperationsFromCompiled(String(slug), compiled),
-          );
+          return yield* persistGraphOperations(graph, String(slug), specHash);
         }),
       );

-      return { slug, toolCount: compiled.definitions.length };
+      return { slug, toolCount: persisted.toolCount };
     });

   const updateGraph = (rawSlug: string, input?: MicrosoftUpdateInput) =>
@@ -199,14 +218,8 @@ const makeMicrosoftPluginExtension = (
         },
         httpClientLayer,
       );
-      const compiled =
-        graph.parsedDocument !== undefined
-          ? yield* compileOpenApiDocument(graph.parsedDocument)
-          : yield* compileOpenApiSpec(graph.specText);
-
       const previousOperations = yield* ctx.storage.listOperations(rawSlug);
       const previousNames = new Set(previousOperations.map((op) => op.toolName));
-      const nextNames = new Set(compiled.definitions.map((def) => def.toolPath));

       const specHash = yield* sha256Hex(graph.specText);
       yield* ctx.storage.putSpec(specHash, graph.specText);
@@ -229,17 +242,15 @@ const makeMicrosoftPluginExtension = (
         ...(input?.baseUrl ? { baseUrl: input.baseUrl } : {}),
       };

-      yield* ctx.transaction(
+      const persisted = yield* ctx.transaction(
         Effect.gen(function* () {
           yield* ctx.core.integrations.update(slug, {
             config: nextConfig satisfies MicrosoftGraphIntegrationConfig as IntegrationConfig,
           });
-          yield* ctx.storage.putOperations(
-            rawSlug,
-            openApiStoredOperationsFromCompiled(rawSlug, compiled),
-          );
+          return yield* persistGraphOperations(graph, rawSlug, specHash);
         }),
       );
+      const nextNames = new Set(persisted.toolNames);

       const connections = yield* ctx.connections.list({ integration: slug });
       yield* Effect.forEach(
@@ -257,7 +268,7 @@ const makeMicrosoftPluginExtension = (

       return {
         slug,
-        toolCount: compiled.definitions.length,
+        toolCount: persisted.toolCount,
         addedTools: [...nextNames].filter((name) => !previousNames.has(name)).sort(),
         removedTools: [...previousNames].filter((name) => !nextNames.has(name)).sort(),
       };
@@ -340,7 +351,8 @@ export const microsoftPlugin = definePlugin((options?: MicrosoftPluginOptions) =
   describeAuthMethods: describeMicrosoftAuthMethods,
   describeIntegrationDisplay: describeMicrosoftIntegrationDisplay,

-  resolveTools: ({ config, storage }) => resolveOpenApiBackedTools({ config, storage }),
+  resolveTools: ({ integration, config, storage }) =>
+    resolveOpenApiBackedTools({ integration, config, storage }),

   invokeTool: ({ ctx, toolRow, credential, args }) => {
     const httpClientLayer = options?.httpClientLayer ?? ctx.httpClientLayer;
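
The update path in the diff above reports `addedTools` and `removedTools` as two set differences between the previously stored tool names and the names returned by the persist step. A standalone sketch of that computation follows; `diffToolNames` is a hypothetical helper name, and the inputs are plain string arrays rather than the stored operation rows and persist result the plugin actually uses.

```typescript
// Hypothetical helper mirroring the set-difference logic in updateGraph.
const diffToolNames = (
  previous: readonly string[],
  next: readonly string[],
): { readonly addedTools: string[]; readonly removedTools: string[] } => {
  const previousNames = new Set(previous);
  const nextNames = new Set(next);
  return {
    // Present after the update but not before.
    addedTools: [...nextNames].filter((name) => !previousNames.has(name)).sort(),
    // Present before the update but not after.
    removedTools: [...previousNames].filter((name) => !nextNames.has(name)).sort(),
  };
};
```

Sorting both lists keeps the result stable across runs, which matters when the names come back from storage in an unspecified order.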
