[bot] Cursor SDK (@cursor/sdk) not instrumented — add wrapper and auto-instrumentation for Cursor agents #1919

@AbhiPrasad

Description

Summary

Cursor's TypeScript SDK (@cursor/sdk, public beta, latest npm version 1.0.9) exposes Cursor's coding agent runtime programmatically from TypeScript. This repository currently has no dedicated Cursor SDK support: no wrapper, diagnostics channels, plugin, auto-instrumentation config, vendored SDK types, examples, or e2e coverage.

The SDK is a close parallel to our existing Claude Agent SDK support: users create or resume an agent, send prompts, stream normalized agent events, receive tool-call lifecycle events, and inspect run results. Users who build production workflows on @cursor/sdk should get Braintrust traces for agent runs, model output, tool calls, status transitions, token usage, and final results.

What instrumentation is missing

The Cursor SDK exposes these important surfaces:

| SDK surface | Description |
| --- | --- |
| Agent.create(options) | Create a local, Cursor-hosted cloud, or self-hosted cloud agent |
| Agent.prompt(message, options) | One-shot create/send/wait/dispose flow |
| Agent.resume(agentId, options) | Reattach to an existing local or cloud agent |
| agent.send(message, options) | Start a new run; supports per-run model, MCP servers, onDelta, onStep, and local.force |
| run.stream() | Async stream of normalized SDKMessage events (assistant, thinking, tool_call, status, task, request, etc.) |
| run.wait() | Wait for terminal run result (finished, error, cancelled) |
| run.cancel() | Cancel a running agent task |
| run.conversation() | Structured per-turn conversation with assistant, thinking, tool, and shell turns |
| Agent.getRun() / Agent.listRuns() / Agent.get() / Agent.list() | Inspect historical agents and runs |
| Cursor.me() / Cursor.models.list() / Cursor.repositories.list() | Account/catalog APIs |
| agent.listArtifacts() / agent.downloadArtifact() | Cloud artifacts |

None of these surfaces is covered by any Braintrust instrumentation layer:

  • No wrapper function (e.g. wrapCursorSDK())
  • No diagnostics channels for Cursor SDK calls
  • No plugin handler in js/src/instrumentation/plugins/
  • No auto-instrumentation config in js/src/auto-instrumentations/configs/
  • No vendor SDK types in js/src/vendor-sdk-types/
  • No examples in js/examples/
  • No e2e scenarios for @cursor/sdk

A grep for @cursor/sdk and Cursor SDK concepts across the repo returns no SDK-specific matches.

Desired experience

Suggested API, modeled after wrapClaudeAgentSDK():

import * as cursorSDK from "@cursor/sdk";
import { wrapCursorSDK } from "braintrust";

const { Agent } = wrapCursorSDK(cursorSDK);

const agent = await Agent.create({
  apiKey: process.env.CURSOR_API_KEY!,
  model: { id: "composer-2" },
  local: { cwd: process.cwd() },
});

const run = await agent.send("Summarize what this repository does");
for await (const event of run.stream()) {
  console.log(event);
}

Auto-instrumentation should also work for direct imports:

import { Agent } from "@cursor/sdk";

const result = await Agent.prompt("What does this repo do?", {
  apiKey: process.env.CURSOR_API_KEY!,
  model: { id: "composer-2" },
  local: { cwd: process.cwd() },
});

Trace shape to consider

Top-level span per run, likely named Cursor Agent, with metadata such as:

  • cursor_sdk.agent_id
  • cursor_sdk.run_id
  • cursor_sdk.runtime (local, cloud, maybe self-hosted variants from cloud.env)
  • cursor_sdk.model / Braintrust standard model
  • cursor_sdk.status
  • cursor_sdk.duration_ms
  • cursor_sdk.cloud.repos, cursor_sdk.cloud.auto_create_pr, PR/branch URLs from run.git
  • entry point: whether the call came from Agent.prompt, agent.send, or a resumed-agent flow

Inputs/outputs:

  • Input: prompt string or SDKUserMessage (text, image metadata; avoid logging raw image bytes by default)
  • Output: final result.result / assistant text
  • Metrics: usage from turn-ended deltas (inputTokens, outputTokens, cacheReadTokens, cacheWriteTokens) and/or aggregate token deltas
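
One way to derive the metrics above is to sum the per-turn counters as turn-ended deltas arrive. The sketch below assumes a hypothetical `TurnUsage` shape that carries the four counters named above; how @cursor/sdk actually nests them in a delta may differ. The metric keys in `toMetrics` are also an assumption and should be checked against what the existing Claude Agent SDK plugin emits.

```typescript
// Hypothetical shape: only the four counter names come from the issue text;
// how Cursor nests them inside a turn-ended delta is an assumption.
interface TurnUsage {
  inputTokens?: number;
  outputTokens?: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

type UsageTotals = Required<TurnUsage>;

function emptyTotals(): UsageTotals {
  return { inputTokens: 0, outputTokens: 0, cacheReadTokens: 0, cacheWriteTokens: 0 };
}

// Add one turn's usage to the running totals, treating missing counters as 0.
function addUsage(totals: UsageTotals, usage?: TurnUsage): UsageTotals {
  return {
    inputTokens: totals.inputTokens + (usage?.inputTokens ?? 0),
    outputTokens: totals.outputTokens + (usage?.outputTokens ?? 0),
    cacheReadTokens: totals.cacheReadTokens + (usage?.cacheReadTokens ?? 0),
    cacheWriteTokens: totals.cacheWriteTokens + (usage?.cacheWriteTokens ?? 0),
  };
}

// Assumed mapping onto span metrics; the metric keys are illustrative.
function toMetrics(totals: UsageTotals): Record<string, number> {
  return {
    prompt_tokens: totals.inputTokens,
    completion_tokens: totals.outputTokens,
    tokens: totals.inputTokens + totals.outputTokens,
    prompt_cached_tokens: totals.cacheReadTokens,
    prompt_cache_creation_tokens: totals.cacheWriteTokens,
  };
}
```

Keeping the accumulator pure (each call returns a new totals object) makes it easy to log partial metrics if a run is cancelled before a terminal result arrives.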

Child spans/events:

  • Assistant text and thinking messages from run.stream()
  • Tool call lifecycle from SDKToolUseMessage and/or raw InteractionUpdate:
    • tool_call start/completion/error
    • stable fields: call_id, name, status
    • capture args / result defensively, because Cursor documents tool payload schemas as unstable
  • Shell/tool steps from onStep / run.conversation() where available
  • Task/status/request events for cloud lifecycle and human approval/request pauses
  • Subagent/tool delegation via Cursor's Agent tool, if surfaced in stream/conversation data
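
The tool-call lifecycle above could be tracked by keying open calls on call_id. The following is a rough sketch, not the SDK's real types: `ToolCallEvent`, its status values, and `createToolCallTracker` are hypothetical names chosen for illustration. Only call_id, name, and status are treated as reliable; args and result are snapshotted defensively.

```typescript
// Hypothetical event shape: call_id, name, and status are the fields the
// issue calls stable; the status values and payload fields are assumptions.
interface ToolCallEvent {
  call_id: string;
  name: string;
  status: "running" | "completed" | "error";
  args?: unknown;
  result?: unknown;
}

interface ToolCallRecord {
  call_id: string;
  name: string;
  status: ToolCallEvent["status"];
  startedAt: number;
  endedAt?: number;
  args?: unknown;
  result?: unknown;
}

// Copy an unstable payload without letting odd values (cycles, BigInt) throw.
function safeSnapshot(value: unknown): unknown {
  try {
    return JSON.parse(JSON.stringify(value));
  } catch {
    return String(value);
  }
}

function createToolCallTracker(now: () => number = Date.now) {
  const open = new Map<string, ToolCallRecord>();
  const finished: ToolCallRecord[] = [];

  function handle(event: ToolCallEvent): void {
    let record = open.get(event.call_id);
    if (!record) {
      // First sighting of this call_id opens a record, even if the start
      // event was missed and only the terminal event is seen.
      record = { call_id: event.call_id, name: event.name, status: event.status, startedAt: now() };
      open.set(event.call_id, record);
    }
    record.status = event.status;
    if (event.args !== undefined) record.args = safeSnapshot(event.args);
    if (event.result !== undefined) record.result = safeSnapshot(event.result);
    if (event.status !== "running") {
      // Terminal status: close the record so it can become a child span.
      record.endedAt = now();
      open.delete(event.call_id);
      finished.push(record);
    }
  }

  return { handle, finished, openCount: () => open.size };
}
```

In a real plugin each finished record would map to a child span under the run span; records still open when the run ends would need to be closed with an error or cancelled status.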

The implementation should preserve user-provided onDelta and onStep callbacks, wrapping them without changing backpressure semantics (Cursor awaits callbacks before processing the next update).
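
A minimal sketch of such a callback wrapper follows. `wrapCallback` and `UpdateCallback` are hypothetical names, and the update type is left generic because the real onDelta / onStep payload types are not reproduced here. The key property is that the returned promise settles only after the user's callback settles, so a caller that awaits callbacks sees the same pacing as before.

```typescript
type UpdateCallback<T> = (update: T) => void | Promise<void>;

// Wrap a user callback so an observer sees each update first. The wrapper
// awaits the user's callback, so backpressure is unchanged for the caller.
function wrapCallback<T>(
  userCallback: UpdateCallback<T> | undefined,
  observe: (update: T) => void,
): UpdateCallback<T> {
  return async (update: T): Promise<void> => {
    try {
      observe(update);
    } catch {
      // Instrumentation errors must not break the user's run.
    }
    if (userCallback) {
      // Errors thrown by the user's callback still propagate to the SDK.
      await userCallback(update);
    }
  };
}
```

The observer is kept synchronous on purpose: any async work in the tracing path would add latency between updates, which would change the timing the user's callback observes.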

Parallels with Claude Agent SDK support

Cursor SDK support can likely reuse the same broad architecture as Claude Agent SDK instrumentation:

  • wrapper publishes diagnostics-channel events rather than owning span lifecycle directly
  • plugin owns span lifecycle and stream processing
  • vendor lightweight SDK types rather than taking a hard dependency
  • auto-instrumentation config patches @cursor/sdk exports
  • tests should cover property forwarding, callback preservation, async streaming, errors/cancellation, and cleanup/disposal
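
The wrapper/plugin split in the first two bullets might look roughly like this with Node's built-in `node:diagnostics_channel` module. The channel name, the `SendEvent` payload, and the `traceSend` / `onSend` helpers are all hypothetical; none of them exist in the repo today, and the real implementation would presumably follow whatever conventions the Claude Agent SDK channels already use.

```typescript
import { channel, subscribe, unsubscribe } from "node:diagnostics_channel";

// Hypothetical channel name and payload, chosen only for illustration.
const SEND_CHANNEL = "braintrust:cursor-sdk:agent.send";

interface SendEvent {
  phase: "start" | "end" | "error";
  message: unknown;
  error?: unknown;
}

const sendChannel = channel(SEND_CHANNEL);

function publish(event: SendEvent): void {
  // Skip the publish entirely when no plugin is listening.
  if (sendChannel.hasSubscribers) sendChannel.publish(event);
}

// Wrapper side: publish lifecycle events around the original call and pass
// `this`, arguments, results, and errors through untouched.
function traceSend<A extends unknown[], R>(
  send: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  return async function (this: unknown, ...args: A): Promise<R> {
    publish({ phase: "start", message: args[0] });
    try {
      const result = await send.apply(this, args);
      publish({ phase: "end", message: args[0] });
      return result;
    } catch (error) {
      publish({ phase: "error", message: args[0], error });
      throw error;
    }
  };
}

// Plugin side: subscribe and own the span lifecycle; returns an unsubscriber.
function onSend(handler: (event: SendEvent) => void): () => void {
  const listener = (message: unknown): void => handler(message as SendEvent);
  subscribe(SEND_CHANNEL, listener);
  return () => {
    unsubscribe(SEND_CHANNEL, listener);
  };
}
```

Because the wrapper only publishes, it stays cheap when tracing is disabled, and the same channel can be fed by either wrapCursorSDK() or the auto-instrumentation patch.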

Differences to account for:

  • Cursor has explicit Agent static methods and SDKAgent / Run handles rather than a single query() generator
  • Agent.prompt() is a one-shot convenience that should still produce a run span
  • Local and cloud runtimes share one interface but expose different metadata and operation support
  • Cursor exposes both normalized stream events and lower-level raw deltas (onDelta) that may be useful for token accounting and tool-call span timing

Testing / coverage suggestions

  • Unit tests with a mocked @cursor/sdk module covering:
    • Agent.create().send().stream()
    • Agent.prompt()
    • Agent.resume().send().wait()
    • callback wrapping for onDelta and onStep
    • tool-call start/completion/error events
    • cancellation and error propagation
    • await using / [Symbol.asyncDispose] behavior is preserved
  • Auto-instrumentation tests for ESM import of @cursor/sdk
  • Optional external tests gated on CURSOR_API_KEY
  • E2E scenario with a mocked Cursor SDK/package if real Cursor API calls are unsuitable for CI

Upstream references

Local files inspected

  • js/src/wrappers/ — no Cursor SDK wrapper
  • js/src/instrumentation/plugins/ — no Cursor SDK channels or plugin
  • js/src/auto-instrumentations/configs/ — no Cursor SDK config
  • js/src/vendor-sdk-types/ — no Cursor SDK types
  • js/examples/ — no Cursor SDK example
  • e2e/scenarios/ — no Cursor SDK scenario
