feat: agentmask + service discovery infrastructure#952
Conversation
Add the @ocap/agentmask package with an OpenClaw plugin that lets an LLM agent request and use wallet capabilities from a MetaMask capability vendor via the OCAP kernel daemon. The plugin provides three tools: - metamask_request_capability: redeem OCAP URL and request capabilities - metamask_call_capability: call methods on obtained capabilities - metamask_list_capabilities: list capabilities obtained in the session Also includes docs migrated from the extension repo (capability-vendor, demo-two-way-comms, kernel-guide) and updates demo-two-way-comms with parallel Manual CLI / OpenClaw agent instructions for the away side. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…edemption The agent can now ask the user for their OCAP URL during conversation and redeem it on the fly via the new metamask_obtain_vendor tool, rather than requiring the URL to be pre-configured in .env or plugin config. - Add metamask_obtain_vendor tool (accepts URL, redeems, stores vendor kref) - Move ocapUrl into mutable PluginState (seeded from config, overridable) - ensureVendor() directs the agent to metamask_obtain_vendor when no URL set - Update SKILL.md workflow and demo doc OpenClaw section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ion__ After obtaining a capability from the vendor, request_capability now calls __getDescription__ on it to retrieve method schemas (name, args, return type). The full schema is shown in the tool response and stored in plugin state. list_capabilities also shows method names. Discovery is best-effort — if the capability is not discoverable, the tool still works without method listings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
|
Warning MetaMask internal reviewing guidelines:
|
Pre-redemption of the configured matcher URL was fire-and-forget, so a tool call landing during the redemption window would see a misleading 'no matcher connection' error from requireMatcher. Park the pending promise in state.matcherPending and have requireMatcher await it, surfacing the underlying failure when redemption rejects.
… discovery/daemon.ts Both OpenClaw plugins kept a near-identical copy of the daemon caller; the discovery version added optional ocapHome support, so a future fix in one copy would drift from the other. Sync the two files (metamask now accepts the same optional ocapHome param, even though it does not use it today) and add a header note instructing future editors to keep them in sync. Plugins remain self-contained so 'openclaw plugins install -l' still works on either directory in isolation.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit e046f8d. Configure here.
| pluginValue: pluginConfig?.timeoutMs, | ||
| envVar: 'OCAP_TIMEOUT_MS', | ||
| parse: Number, | ||
| }) ?? DEFAULT_TIMEOUT_MS; |
There was a problem hiding this comment.
NaN timeout from non-numeric env var bypasses default
Medium Severity
When OCAP_TIMEOUT_MS contains a non-numeric string (e.g. "abc"), parse: Number produces NaN. The nullish coalescing ?? DEFAULT_TIMEOUT_MS doesn't catch NaN (only null/undefined), so timeoutMs becomes NaN. setTimeout(fn, NaN) fires immediately (0 ms), causing every daemon call to be instantly killed with SIGKILL and a confusing "timed out after NaNms" error.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit e046f8d. Configure here.
| // eslint-disable-next-line @typescript-eslint/no-explicit-any | ||
| function tersePeer(peer: any): string { | ||
| const peerId = peer.toString(); | ||
| if (tersePeers.has(peerId)) { | ||
| return tersePeers.get(peerId) as string; | ||
| } | ||
| tersePeerIdx += 1; | ||
| const terse = `<PEER-${tersePeerIdx}>`; | ||
| tersePeers.set(peerId, terse); | ||
| logger.log(`[PEER] ${terse} = ${peerId}`); | ||
| return terse; | ||
| } | ||
|
|
||
| /** | ||
| * Produce a more legible multiaddr string | ||
| * | ||
| * @param multiaddr - The multiaddr we care about. | ||
| * @returns a terser form of `multiaddr`. | ||
| */ | ||
| // eslint-disable-next-line @typescript-eslint/no-explicit-any | ||
| function terseAddr(multiaddr: any): string { | ||
| const raw = multiaddr.toString(); | ||
| const slash = raw.lastIndexOf('/') + 1; | ||
| if (slash <= 0) { | ||
| return raw; | ||
| } | ||
| const prefix = raw.substring(0, slash); | ||
| const suffix = raw.substring(slash); | ||
| return `${prefix}${tersePeer(suffix)}`; | ||
| } |
There was a problem hiding this comment.
| // eslint-disable-next-line @typescript-eslint/no-explicit-any | |
| function tersePeer(peer: any): string { | |
| const peerId = peer.toString(); | |
| if (tersePeers.has(peerId)) { | |
| return tersePeers.get(peerId) as string; | |
| } | |
| tersePeerIdx += 1; | |
| const terse = `<PEER-${tersePeerIdx}>`; | |
| tersePeers.set(peerId, terse); | |
| logger.log(`[PEER] ${terse} = ${peerId}`); | |
| return terse; | |
| } | |
| /** | |
| * Produce a more legible multiaddr string | |
| * | |
| * @param multiaddr - The multiaddr we care about. | |
| * @returns a terser form of `multiaddr`. | |
| */ | |
| // eslint-disable-next-line @typescript-eslint/no-explicit-any | |
| function terseAddr(multiaddr: any): string { | |
| const raw = multiaddr.toString(); | |
| const slash = raw.lastIndexOf('/') + 1; | |
| if (slash <= 0) { | |
| return raw; | |
| } | |
| const prefix = raw.substring(0, slash); | |
| const suffix = raw.substring(slash); | |
| return `${prefix}${tersePeer(suffix)}`; | |
| } | |
| function tersePeer(peer: PeerId): string { | |
| const peerId = peer.toString(); | |
| if (tersePeers.has(peerId)) { | |
| return tersePeers.get(peerId) as string; | |
| } | |
| tersePeerIdx += 1; | |
| const terse = `<PEER-${tersePeerIdx}>`; | |
| tersePeers.set(peerId, terse); | |
| logger.log(`[PEER] ${terse} = ${peerId}`); | |
| return terse; | |
| } | |
| /** | |
| * Produce a more legible multiaddr string | |
| * | |
| * @param multiaddr - The multiaddr we care about. | |
| * @returns a terser form of `multiaddr`. | |
| */ | |
| function terseAddr(multiaddr: Multiaddr): string { | |
| const raw = multiaddr.toString(); | |
| const slash = raw.lastIndexOf('/') + 1; | |
| if (slash <= 0) { | |
| return raw; | |
| } | |
| const prefix = raw.substring(0, slash); | |
| const suffix = raw.substring(slash); | |
| if (!isPeerId(suffix)) { | |
| return raw; | |
| } | |
| return `${prefix}${tersePeer(suffix)}`; | |
| } |
So we can avoid the any here
| export type ContactResponse = | ||
| | PublicContactResponse | ||
| | PermissionedContactResponse | ||
| | ValidatedClientContactResponse; |
There was a problem hiding this comment.
Suggestion: make ContactResponse a discriminated union.
As written, this is a non-discriminated union: the PublicContactResponse = ServicePoint arm collides syntactically with the two object-shaped arms (both have [key]: ...), so consumers must structurally probe ('credentialSubmissionPoint' in response) to narrow. A ServicePoint that happens to define a method named credentialSpec would be misclassified.
Cheap fix now while in-tree consumers can be counted on one hand:
export type ContactResponse =
| { kind: 'public'; service: ServicePoint }
| ({ kind: 'permissioned' } & PermissionedContactResponse)
| ({ kind: 'validatedClient' } & ValidatedClientContactResponse);This matches the pattern already used by TypeSpec in service-description.ts and by Request/Reply in llm-bridge/src/protocol.ts, and lets consumers switch (response.kind) exhaustively. The downstream change is ~2 lines in sample-services/src/vat-lib/contact-endpoint.ts (return { kind: 'public', service }) and matching destructuring in the discovery plugin's initiate-contact.ts.
| export type PluginState = { | ||
| matcher: MatcherEntry | undefined; | ||
| /** | ||
| * Pending pre-redemption of a configured matcher URL. Set during | ||
| * `register()` when `matcherUrl` was supplied via config or env; cleared | ||
| * on either resolution or rejection. `requireMatcher` awaits it so a | ||
| * tool call issued before the pre-redemption settles does not see a | ||
| * misleading "no matcher connection" error. | ||
| */ | ||
| matcherPending: Promise<MatcherEntry> | undefined; | ||
| contacts: Map<string, ContactEntry>; | ||
| services: Map<string, ServiceEntry>; | ||
| }; |
There was a problem hiding this comment.
Suggestion: model the matcher slot as a discriminated 3-state union.
matcher: MatcherEntry | undefined + matcherPending: Promise<MatcherEntry> | undefined permits the contradictory state where both are set — a tool could in principle observe a resolved matcher and a still-pending pre-redemption. The matcher slot is naturally a 3-state value:
export type MatcherSlot =
| { status: 'absent' }
| { status: 'pending'; promise: Promise<MatcherEntry> }
| { status: 'resolved'; entry: MatcherEntry };
export type PluginState = {
matcher: MatcherSlot;
contacts: Map<string, ContactEntry>;
services: Map<string, ServiceEntry>;
};requireMatcher then becomes a clean switch on state.matcher.status. Three or four call sites to update. Particularly valuable here because the e359bf49b fix ("await pre-redemption before reporting no matcher") was exactly about racing on these two fields.
| async ingest(request: IngestRequest): Promise<void> { | ||
| persistent.push({ role: 'user', content: formatIngest(request) }); | ||
| const reply = await client.chat(persistent); | ||
| persistent.push({ role: 'assistant', content: reply }); | ||
| }, |
There was a problem hiding this comment.
Bug: persistent history corrupted on ingest failure.
If client.chat rejects (gateway 5xx, token error, transient network), the user turn is left in persistent with no paired assistant — every subsequent ingest will send a malformed conversation (multi-user-turns-in-a-row) that violates Anthropic/OpenAI conversational conventions and may degrade ranker behavior.
The matcher already rolls back its own registry entry on bridge ingest failure (matcher-vat/index.ts commitRegistration); the bridge's persistent log should match. Two options:
async ingest(request: IngestRequest): Promise<void> {
const before = persistent.length;
persistent.push({ role: 'user', content: formatIngest(request) });
try {
const reply = await client.chat(persistent);
persistent.push({ role: 'assistant', content: reply });
} catch (error) {
persistent.length = before;
throw error;
}
},Or — even simpler — build the user message locally, only push both after the reply resolves:
const userMsg = { role: 'user' as const, content: formatIngest(request) };
const reply = await client.chat([...persistent, userMsg]);
persistent.push(userMsg, { role: 'assistant', content: reply });| `Use 'ocap daemon stop' first.`, | ||
| ); | ||
| } | ||
| // Stale socket file from a previous run — clean up. |
There was a problem hiding this comment.
Suggestion (applies to the pre-existing try { await unlink(socketPath); } catch {} just below): narrow the catch to ENOENT.
The comment says "file may not exist" but the bare catch {} covers EPERM, EACCES, EBUSY, and EISDIR (someone replaced the socket with a directory) too. When any of these hit, the subsequent server.listen(socketPath) will fail with an opaque EADDRINUSE, leaving the operator wondering why the brand-new daemon-liveness interlock didn't help.
try {
await unlink(socketPath);
} catch (error) {
if ((error as NodeJS.ErrnoException).code !== 'ENOENT') {
throw error;
}
}Weakening this catch is what lets the rest of the new isSocketLive interlock land safely.
| for (const entry of ranked) { | ||
| const registered = registry.get(entry.id); | ||
| if (!registered) { | ||
| // The bridge cited an id we no longer have — could happen if | ||
| // the LLM hallucinates one, or if a service was unregistered | ||
| // between ingest and query. Skip and log; don't bubble up. | ||
| log(`bridge cited unknown id ${entry.id}; skipping`); | ||
| continue; | ||
| } | ||
| matches.push( | ||
| harden({ | ||
| description: registered.description, | ||
| rationale: entry.rationale, | ||
| }), | ||
| ); | ||
| } | ||
| log( | ||
| `findServices("${query.description.slice(0, 80)}") → ${matches.length} match(es)`, | ||
| ); | ||
| return matches; |
There was a problem hiding this comment.
Suggestion: distinguish "all-cited-ids-unknown" from "no matches".
A common failure mode of LLM rankers is to hallucinate plausible-looking IDs — especially after a context-reset or prompt-injection bug. If every id in the bridge reply is unknown, this loop currently returns [], which the consumer (discovery_find_services tool) reports to the agent as "No matching services found." indistinguishable from a legitimate zero-match query.
Track the drop count and surface it when it's suspicious:
let dropped = 0;
for (const entry of ranked) {
const registered = registry.get(entry.id);
if (!registered) {
dropped += 1;
log(`bridge cited unknown id ${entry.id}; skipping`);
continue;
}
matches.push(harden({ description: registered.description, rationale: entry.rationale }));
}
if (ranked.length > 0 && matches.length === 0) {
throw new Error(
`matcher: LLM bridge cited only unknown ids (${dropped}/${ranked.length}); ` +
'registry may be out of sync or ranker is hallucinating.',
);
}This turns a silent correctness bug (the exact class the matcher is most exposed to) into a loud one.
| export const ServiceQueryStruct: Struct<ServiceQuery> = object({ | ||
| description: string(), | ||
| }); | ||
|
|
||
| /** | ||
| * A candidate service returned by a matcher query. | ||
| */ | ||
| export type ServiceMatch = { | ||
| description: ServiceDescription; | ||
| /** Matcher's natural-language rationale for this match. */ | ||
| rationale?: string; | ||
| }; | ||
|
|
||
| export const ServiceMatchStruct: Struct<ServiceMatch> = object({ | ||
| description: ServiceDescriptionStruct, | ||
| rationale: exactOptional(string()), | ||
| }); | ||
|
|
||
| export const ServiceMatchListStruct: Struct<ServiceMatch[]> = | ||
| array(ServiceMatchStruct); |
There was a problem hiding this comment.
Suggestion: add _AssertSameShape drift guards for ServiceQuery and ServiceMatch.
Struct<ServiceQuery> = object({...}) enforces "struct fits type" but not the reverse — if someone adds a field to the type and forgets the struct (or vice-versa), the drift goes unnoticed at compile time. service-description.ts already establishes the two-way drift trap; copy the pattern here for 5 lines of zero-runtime-cost insurance:
type _AssertSameShape<A, B> = [A, B] extends [B, A] ? true : never;
type _AssertServiceQuery = _AssertSameShape<Infer<typeof ServiceQueryStruct>, ServiceQuery>;
type _AssertServiceMatch = _AssertSameShape<Infer<typeof ServiceMatchStruct>, ServiceMatch>;ServiceMatchList is already Infer-derived so it's safe.
| /** | ||
| * The vat-facing shape of the `ocapURLIssuerService` kernel-service | ||
| * endowment: accepts a remotable and returns an OCAP URL that resolves to | ||
| * that remotable when redeemed elsewhere. | ||
| */ | ||
| export type OcapURLIssuerService = { | ||
| issue: (obj: unknown) => Promise<string>; | ||
| }; | ||
|
|
||
| /** | ||
| * The vat-facing shape of the `ocapURLRedemptionService` kernel-service | ||
| * endowment: accepts an OCAP URL and returns the referenced remotable. | ||
| */ | ||
| export type OcapURLRedemptionService = { | ||
| redeem: (url: string) => Promise<unknown>; | ||
| }; |
There was a problem hiding this comment.
Suggestion: also use these new types as the return types of getIssuerService / getRedemptionService / getServices.
The newly-exported types here aren't actually used by the manager's public API — getIssuerService(): object and getRedemptionService(): object still return bare object, and getServices() types both services as { name: string; service: object } (pre-existing code, not in this diff). Two-line tweak that propagates the named types out:
getServices(): {
issuerService: { name: string; service: OcapURLIssuerService };
redemptionService: { name: string; service: OcapURLRedemptionService };
} { ... }
getIssuerService(): OcapURLIssuerService {
return this.#ocapURLIssuerService as OcapURLIssuerService;
}
getRedemptionService(): OcapURLRedemptionService {
return this.#ocapURLRedemptionService as OcapURLRedemptionService;
}The #-private fields stay typed as object (because Far(...) returns unknown-ish), but the accessor surface gets the named contract — which seems to be the point of adding these types in the first place.
| 'resetState', | ||
| ]); | ||
|
|
||
| const configSchema: PluginConfigSchema = { |
There was a problem hiding this comment.
Suggestion: replace hand-rolled safeParse with a superstruct wrapper (both plugins).
The project's CLAUDE.md says "Use @metamask/superstruct for runtime type checking and to define object types," but both plugin index.ts files implement configSchema.safeParse by hand: typeof-checking each known field and walking KNOWN_KEYS manually. The external PluginConfigSchema shape ({ safeParse, jsonSchema }) is what the OpenClaw SDK needs — the implementation can be a thin wrapper:
import { object, string, exactOptional, number, is, type Struct } from '@metamask/superstruct';
const PluginConfigStruct = object({
cliPath: exactOptional(string()),
matcherUrl: exactOptional(string()),
ocapHome: exactOptional(string()),
timeoutMs: exactOptional(number()),
});
const configSchema: PluginConfigSchema = {
safeParse(value: unknown) {
return is(value, PluginConfigStruct)
? { success: true, data: value }
: { success: false, error: 'invalid plugin config' };
},
jsonSchema: { /* unchanged */ },
};Halves the code, eliminates the KNOWN_KEYS allow-list drift hazard, and matches the rest of the codebase. Same pattern applies in openclaw-plugin-metamask/index.ts:41.


Summary
rekm/agentmaskbranch as base (agentmask vats, OpenClaw MetaMask vendor plugin, capability schema discovery) ontomainImportant caveat: this requires using the updated version of the MetaMask browser extension (Erik's updates to put the ocap kernel into it and then some further stuff I needed to add on top of that) in the
chip/agentmaskbranch of themetamask-extensionrepo. While the changes this PR represents are being proposed for merging into ourmainbranch, we don't really have that option for the browser extension since it's not our repo to mess with. (I suppose we could merge it with some negotiation with the MetaMake team, but since it's highly experimental I'm pretty sure they wouldn't be too keen on that idea. I believe Erik's plan was to eventually have it in under a flag, but I don't thing that's yet an option.)Test plan
packages/agentmask/openclaw-plugin-discovery/VALIDATION.mdend-to-end (matcher daemon and LLM service consumer on VPS, sample services + MetaMask browser extension (providing MetaMask services) on laptop, OpenClaw agent driving discovery and contact)Note
Medium Risk
Medium risk because it changes
ocapCLI process management (new global--home, daemon start interlocks, relay address announcements), which can affect developer workflows and daemon lifecycle behavior. Most other additions are new plugins/docs with limited blast radius.Overview
Introduces a new
@ocap/agentmaskpackage that ships two installable OpenClaw plugins: MetaMask capability vending/usage (metamask_*tools) and service discovery via a matcher (discovery_*+service_*tools), including shared daemon-CLI invocation helpers, session state tracking, skills, and Vitest coverage.Hardens and extends the
ocapCLI: adds a global--homeflag (overriding$OCAP_HOMEper invocation), addsrelay start --public-ip(and$LIBP2P_RELAY_PUBLIC_IP) to announce a reachable VPS address, and prevents accidental daemon orphaning by checkingdaemon.pid/liveness before spawning or starting a new daemon; also updates config/sandbox defaults (.yarnrc.ymlproxy wiring, Claude sandbox allowances) and adds/updates changelogs and operational docs.Reviewed by Cursor Bugbot for commit e046f8d. Bugbot is set up for automated code reviews on this repo. Configure here.