AI Agents Module

The AI Agents module is library-agnostic. The SDK automatically instruments AI agents built with certain frameworks or libraries (at the time of writing, openai-agents in Python and Vercel AI in JavaScript). For other libraries, you may need to annotate spans manually.

Span Conventions

For your AI agents data to show up in Sentry AI Agents Insights, at least one of the AI spans described below must be created with a well-defined name and data attributes. If required data (marked with MUST or required) is missing, the data will not show up in the Agents dashboard.

We try to follow v1.36.0 of the OpenTelemetry Semantic Conventions for Generative AI as closely as possible. Full compatibility is not yet possible because OpenTelemetry stores the input to and the output from an AI model in "Span Events", which Sentry does not support. Instead, we add this data to span attributes as a stringified list.

The [Sentry Conventions](https://getsentry.github.io/sentry-conventions/generated/attributes/) have all the detailed specifications for `"gen_ai.*"` span attributes.

Sentry Conventions is the single source of truth.

Create Agent Span

Describes GenAI agent creation and is usually applicable when working with remote agent services.

  • Span op SHOULD be "gen_ai.create_agent".
  • Span name SHOULD be "create_agent {gen_ai.agent.name}". (e.g. "create_agent Weather Agent")
  • Attribute gen_ai.operation.name MUST be "create_agent".
  • Attribute gen_ai.agent.name SHOULD be set to the agent's name. (e.g. "Weather Agent")
  • If provided, the attribute gen_ai.request.model MUST be the agent's default request model. (e.g. "gpt-4o")
  • If relevant, the gen_ai.pipeline.name attribute SHOULD be set to the name of the AI workflow, pipeline or chain within which the agent operates. (e.g. "weather-pipeline")
  • All Common Span Attributes SHOULD be set (all required common attributes MUST be set).
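The rules above can be sketched as a small helper that assembles the op, name, and attributes of a Create Agent span. This is only an illustration: the function name `create_agent_span_fields` and the returned dictionary layout are hypothetical and not part of any SDK. Optional attributes are left out when no value is provided.

```python
from typing import Optional


def create_agent_span_fields(
    agent_name: str,
    request_model: Optional[str] = None,
    pipeline_name: Optional[str] = None,
) -> dict:
    """Build op, name, and attributes for a create_agent span (hypothetical helper)."""
    attributes = {
        "gen_ai.operation.name": "create_agent",  # MUST
        "gen_ai.agent.name": agent_name,  # SHOULD
    }
    # Only set the optional attributes when a value is actually available.
    if request_model is not None:
        attributes["gen_ai.request.model"] = request_model
    if pipeline_name is not None:
        attributes["gen_ai.pipeline.name"] = pipeline_name
    return {
        "op": "gen_ai.create_agent",
        "name": f"create_agent {agent_name}",
        "attributes": attributes,
    }


fields = create_agent_span_fields("Weather Agent", request_model="gpt-4o")
print(fields["name"])  # create_agent Weather Agent
```

The resulting values would then be passed to whatever span-creation and attribute-setting calls your SDK provides.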

Invoke Agent Span

Describes AI agent invocation. Agent invocations represent operations that can include multiple model calls, or some auxiliary work that goes beyond transforming the model input and output.

  • Span op SHOULD be "gen_ai.invoke_agent".
  • Span name SHOULD be "invoke_agent {gen_ai.agent.name}". (e.g. "invoke_agent Weather Agent") [8]
  • Attribute gen_ai.operation.name MUST be "invoke_agent".
  • Attribute gen_ai.agent.name SHOULD be set to the agent's name. (e.g. "Weather Agent")
  • If provided, the attribute gen_ai.request.model MUST be the agent's default request model. (e.g. "gpt-4o")
  • If relevant, the gen_ai.pipeline.name attribute SHOULD be set to the name of the AI workflow, pipeline or chain within which the agent operates. (e.g. "weather-pipeline")
  • All Common Span Attributes SHOULD be set (all required common attributes MUST be set).
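To make the difference between MUST and SHOULD concrete, the following sketch checks a manually annotated Invoke Agent span against the rules above. The checker `check_invoke_agent_span` is a hypothetical helper for illustration; it treats MUST violations as errors and SHOULD violations as warnings, which is an assumption about how you might want to surface them.

```python
from typing import Dict, List, Tuple


def check_invoke_agent_span(
    op: str, name: str, attributes: Dict[str, object]
) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for an invoke_agent span (hypothetical helper)."""
    errors: List[str] = []
    warnings: List[str] = []

    # MUST rules: without these the span will not show up in the dashboard.
    if attributes.get("gen_ai.operation.name") != "invoke_agent":
        errors.append('gen_ai.operation.name MUST be "invoke_agent"')

    # SHOULD rules: recommended, but not strictly required.
    if op != "gen_ai.invoke_agent":
        warnings.append('span op SHOULD be "gen_ai.invoke_agent"')
    agent_name = attributes.get("gen_ai.agent.name")
    if not agent_name:
        warnings.append("gen_ai.agent.name SHOULD be set")
    elif name != f"invoke_agent {agent_name}":
        warnings.append('span name SHOULD be "invoke_agent {gen_ai.agent.name}"')
    return errors, warnings


errors, warnings = check_invoke_agent_span(
    op="gen_ai.invoke_agent",
    name="invoke_agent Weather Agent",
    attributes={
        "gen_ai.operation.name": "invoke_agent",
        "gen_ai.agent.name": "Weather Agent",
    },
)
print(errors, warnings)  # [] []
```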

Additional attributes on the span:

Request Data

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.input.messages | string | optional | List of dictionaries describing the messages (prompts) given to the agent. [0], [1], [6], [7], [9] | '[{"role": "user", "parts": [{"type": "text", "content": "..."}]}]' |
| gen_ai.tool.definitions | string | optional | List of dictionaries describing the available tools. [0] | '[{"name": "random_number", "description": "..."}, ...]' |
| gen_ai.system_instructions | string | optional | The system instructions passed to the model. | "You are a helpful assistant." |
| gen_ai.request.max_tokens | int | optional | Model configuration parameter. | 500 |
| gen_ai.request.seed | string | optional | Seed for reproducible outputs. | "12345" |
| gen_ai.request.frequency_penalty | float | optional | Model configuration parameter. | 0.5 |
| gen_ai.request.presence_penalty | float | optional | Model configuration parameter. | 0.5 |
| gen_ai.request.temperature | float | optional | Model configuration parameter. | 0.1 |
| gen_ai.request.top_p | float | optional | Model configuration parameter. | 0.7 |
| gen_ai.request.top_k | int | optional | Limits model to K most likely next tokens. | 40 |
| gen_ai.request.messages | string | optional | Deprecated. Use gen_ai.input.messages instead. List of dictionaries describing the messages (prompts) given to the agent. [0] | '[{"role": "system", "content": "..."}, ...]' |
| gen_ai.request.available_tools | string | optional | Deprecated. Use gen_ai.tool.definitions instead. List of dictionaries describing the available tools. [0] | '[{"name": "random_number", "description": "..."}, ...]' |

Response Data

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.output.messages | string | optional | Stringified array of message objects representing the model's output. [0], [1] | '[{"role": "assistant", "parts": [{"type": "text", "content": "..."}]}]' |
| gen_ai.response.streaming | boolean | optional | Whether response was streamed asynchronously. | true |
| gen_ai.response.text | string | optional | Deprecated. Use gen_ai.output.messages instead. The text representation of the agent's response. | "The weather in Paris is rainy" |
| gen_ai.response.tool_calls | string | optional | Deprecated. Use gen_ai.output.messages instead. The tool calls in the model's response. [0] | '[{"name": "random_number", "type": "function_call", "arguments": "..."}]' |

Token Usage Data

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.usage.input_tokens | int | optional | The number of tokens used in the AI input (prompt), including cached tokens. [2] | 60 |
| gen_ai.usage.input_tokens.cached | int | optional | The number of cached tokens used in the AI input (prompt). | 50 |
| gen_ai.usage.input_tokens.cache_write | int | optional | Tokens written to cache when processing input. | 20 |
| gen_ai.usage.output_tokens | int | optional | The number of tokens used in the AI output, including reasoning tokens. [3] | 130 |
| gen_ai.usage.output_tokens.reasoning | int | optional | The number of tokens used for reasoning. | 30 |
| gen_ai.usage.total_tokens | int | optional | The sum of gen_ai.usage.input_tokens and gen_ai.usage.output_tokens. | 190 |

Cost Data

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.cost.input_tokens | double | optional | Cost of input tokens in USD (without cached). | 0.005 |
| gen_ai.cost.output_tokens | double | optional | Cost of output tokens in USD (without reasoning). | 0.015 |
| gen_ai.cost.total_tokens | double | optional | Total cost for tokens used. | 0.020 |

  • [0]: Span attributes only allow primitive data types (like int, float, boolean, string). This means you need to use a stringified version of a list of dictionaries. Do NOT set the object/array [{"foo": "bar"}] but rather the string '[{"foo": "bar"}]' (must be parsable JSON).
  • [1]: Messages use the format {role, parts} where parts is an array of typed objects: [{"role": "user", "parts": [{"type": "text", "content": "..."}]}]. The role must be "user", "assistant", "tool", or "system". For backwards compatibility, the legacy format {role, content} (e.g. [{"role": "user", "content": "..."}]) is also accepted.
  • [2]: Cached tokens are a subset of input tokens; gen_ai.usage.input_tokens includes gen_ai.usage.input_tokens.cached.
  • [3]: Reasoning tokens are a subset of output tokens; gen_ai.usage.output_tokens includes gen_ai.usage.output_tokens.reasoning.
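Notes [0] and [1] can be illustrated with a short sketch that validates message roles and serializes the list to a JSON string before it is set as a span attribute. The helper names are hypothetical. Converting a legacy {role, content} message by wrapping its content in a single text part is an assumption made here for illustration; the legacy format is accepted as-is, so this conversion is optional.

```python
import json

ALLOWED_ROLES = {"user", "assistant", "tool", "system"}


def normalize_message(message: dict) -> dict:
    """Return a message in the {role, parts} format (hypothetical helper)."""
    role = message["role"]
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    if "parts" in message:
        return {"role": role, "parts": message["parts"]}
    # Assumption: legacy {role, content} content becomes a single text part.
    return {"role": role, "parts": [{"type": "text", "content": message["content"]}]}


def serialize_messages(messages: list) -> str:
    """Serialize messages to a JSON string, since attributes must be primitives."""
    return json.dumps([normalize_message(m) for m in messages])


attribute_value = serialize_messages(
    [{"role": "user", "content": "What is the weather in Paris?"}]
)
print(type(attribute_value).__name__)  # str
print(attribute_value)
```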

AI Client Span

This span represents a request to an AI model or service that generates a response or requests a tool call based on the input prompt.

  • Span op SHOULD be "gen_ai.{gen_ai.operation.name}". (e.g. "gen_ai.chat")
  • Span name SHOULD be "{gen_ai.operation.name} {gen_ai.request.model}". (e.g. "chat o3-mini")
  • Attribute gen_ai.operation.name MUST be "chat", "embeddings", "generate_content" or "text_completion". [4]
  • Attribute gen_ai.request.model MUST be the requested model. (e.g. "gpt-4o")
  • Attribute gen_ai.response.model MUST be the concrete response model. (e.g. "gpt-4o-2024-08-06")
  • If the request originates from an agent, the gen_ai.agent.name attribute SHOULD be set to the name of the agent. (e.g. "Weather Agent")
  • If relevant, the gen_ai.pipeline.name attribute SHOULD be set to the name of the AI workflow, pipeline or chain within which the agent operates. (e.g. "weather-pipeline")
  • All Common Span Attributes SHOULD be set (all required common attributes MUST be set).
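As a sketch of how the op and name of an AI Client span are derived from the operation name and the requested model, consider the helper below. `ai_client_span_fields` is a hypothetical function, and passing the response model as an argument is a simplification: in practice it is usually only known once the response has arrived.

```python
from typing import Optional

# Operation names allowed on an AI Client span, as listed above.
CLIENT_OPERATIONS = {"chat", "embeddings", "generate_content", "text_completion"}


def ai_client_span_fields(
    operation: str,
    request_model: str,
    response_model: str,
    agent_name: Optional[str] = None,
) -> dict:
    """Build op, name, and attributes for an AI Client span (hypothetical helper)."""
    if operation not in CLIENT_OPERATIONS:
        raise ValueError(f"unsupported gen_ai.operation.name: {operation!r}")
    attributes = {
        "gen_ai.operation.name": operation,  # MUST
        "gen_ai.request.model": request_model,  # MUST
        "gen_ai.response.model": response_model,  # MUST
    }
    if agent_name is not None:
        attributes["gen_ai.agent.name"] = agent_name  # SHOULD, if from an agent
    return {
        "op": f"gen_ai.{operation}",
        "name": f"{operation} {request_model}",
        "attributes": attributes,
    }


client_fields = ai_client_span_fields("chat", "gpt-4o", "gpt-4o-2024-08-06")
print(client_fields["op"], "|", client_fields["name"])  # gen_ai.chat | chat gpt-4o
```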

Additional attributes on the span:

Request Data

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.input.messages | string | optional | List of dictionaries describing the messages (prompts) sent to the LLM. [0], [1], [6], [7], [9] | '[{"role": "user", "parts": [{"type": "text", "content": "..."}]}]' |
| gen_ai.tool.definitions | string | optional | List of dictionaries describing the available tools. [0] | '[{"name": "random_number", "description": "..."}, ...]' |
| gen_ai.system_instructions | string | optional | The system instructions passed to the model. | "You are a helpful assistant." |
| gen_ai.request.max_tokens | int | optional | Model configuration parameter. | 500 |
| gen_ai.request.seed | string | optional | Seed for reproducible outputs. | "12345" |
| gen_ai.request.frequency_penalty | float | optional | Model configuration parameter. | 0.5 |
| gen_ai.request.presence_penalty | float | optional | Model configuration parameter. | 0.5 |
| gen_ai.request.temperature | float | optional | Model configuration parameter. | 0.1 |
| gen_ai.request.top_p | float | optional | Model configuration parameter. | 0.7 |
| gen_ai.request.top_k | int | optional | Limits model to K most likely next tokens. | 40 |
| gen_ai.request.messages | string | optional | Deprecated. Use gen_ai.input.messages instead. List of dictionaries describing the messages (prompts) sent to the LLM. [0] | '[{"role": "system", "content": "..."}, ...]' |
| gen_ai.request.available_tools | string | optional | Deprecated. Use gen_ai.tool.definitions instead. List of dictionaries describing the available tools. [0] | '[{"name": "random_number", "description": "..."}, ...]' |

Response Data

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.output.messages | string | optional | Stringified array of message objects representing the model's output. [0], [1] | '[{"role": "assistant", "parts": [{"type": "text", "content": "..."}]}]' |
| gen_ai.response.finish_reasons | string | optional | The reason why the model stopped generating. | "stop" |
| gen_ai.response.id | string | optional | Unique identifier for the completion. | "chatcmpl-abc123" |
| gen_ai.response.streaming | boolean | optional | Whether response was streamed asynchronously. | true |
| gen_ai.response.time_to_first_token | double | optional | Seconds until first response chunk in streaming. | 0.5 |
| gen_ai.response.tokens_per_second | double | optional | Output tokens per second throughput. | 50.0 |
| gen_ai.response.text | string | optional | Deprecated. Use gen_ai.output.messages instead. The text representation of the model's response. [0] | "The weather in Paris is rainy" |
| gen_ai.response.tool_calls | string | optional | Deprecated. Use gen_ai.output.messages instead. The tool calls in the model's response. [0] | '[{"name": "random_number", "type": "function_call", "arguments": "..."}]' |

Token Usage Data

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.usage.input_tokens | int | optional | The number of tokens used in the AI input (prompt), including cached tokens. [2] | 60 |
| gen_ai.usage.input_tokens.cached | int | optional | The number of cached tokens used in the AI input (prompt). | 50 |
| gen_ai.usage.input_tokens.cache_write | int | optional | Tokens written to cache when processing input. | 20 |
| gen_ai.usage.output_tokens | int | optional | The number of tokens used in the AI output, including reasoning tokens. [3] | 130 |
| gen_ai.usage.output_tokens.reasoning | int | optional | The number of tokens used for reasoning. | 30 |
| gen_ai.usage.total_tokens | int | optional | The sum of gen_ai.usage.input_tokens and gen_ai.usage.output_tokens. | 190 |

Cost Data

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.cost.input_tokens | double | optional | Cost of input tokens in USD (without cached). | 0.005 |
| gen_ai.cost.output_tokens | double | optional | Cost of output tokens in USD (without reasoning). | 0.015 |
| gen_ai.cost.total_tokens | double | optional | Total cost for tokens used. | 0.020 |

  • [0]: Span attributes only allow primitive data types (like int, float, boolean, string). This means you need to use a stringified version of a list of dictionaries. Do NOT set the object/array [{"foo": "bar"}] but rather the string '[{"foo": "bar"}]' (must be parsable JSON).
  • [1]: Messages use the format {role, parts} where parts is an array of typed objects: [{"role": "user", "parts": [{"type": "text", "content": "..."}]}]. The role must be "user", "assistant", "tool", or "system". For backwards compatibility, the legacy format {role, content} (e.g. [{"role": "user", "content": "..."}]) is also accepted.
  • [2]: Cached tokens are a subset of input tokens; gen_ai.usage.input_tokens includes gen_ai.usage.input_tokens.cached.
  • [3]: Reasoning tokens are a subset of output tokens; gen_ai.usage.output_tokens includes gen_ai.usage.output_tokens.reasoning.
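Notes [2] and [3] matter when a provider reports cached or reasoning tokens separately from the rest. The sketch below assumes such a provider and shows how the counts would be folded into the inclusive totals expected here. The function and its parameter names are hypothetical; if your provider already reports inclusive totals, no addition is needed.

```python
def usage_attributes(
    uncached_input: int,
    cached_input: int,
    non_reasoning_output: int,
    reasoning_output: int,
) -> dict:
    """Build token usage attributes from separately reported counts (hypothetical helper)."""
    # Cached tokens are a subset of input tokens, so they are included in the total.
    input_tokens = uncached_input + cached_input
    # Reasoning tokens are a subset of output tokens, so they are included as well.
    output_tokens = non_reasoning_output + reasoning_output
    return {
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.input_tokens.cached": cached_input,
        "gen_ai.usage.output_tokens": output_tokens,
        "gen_ai.usage.output_tokens.reasoning": reasoning_output,
        "gen_ai.usage.total_tokens": input_tokens + output_tokens,
    }


usage = usage_attributes(
    uncached_input=10, cached_input=50, non_reasoning_output=100, reasoning_output=30
)
print(usage["gen_ai.usage.total_tokens"])  # 190
```

With these inputs the result matches the example values in the Token Usage Data table: 60 input tokens (50 of them cached), 130 output tokens (30 of them reasoning), and 190 total.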

Execute Tool Span

Describes a tool execution.

  • Span op SHOULD be "gen_ai.execute_tool".
  • Span name SHOULD be "execute_tool {gen_ai.tool.name}". (e.g. "execute_tool query_database")
  • Attribute gen_ai.operation.name MUST be "execute_tool".
  • Attribute gen_ai.tool.name SHOULD be set to the name of the tool. (e.g. "query_database")
  • Attribute gen_ai.agent.name SHOULD be set to the name of the agent that invoked the tool. (e.g. "Weather Agent")
  • If relevant, the gen_ai.pipeline.name attribute SHOULD be set to the name of the AI workflow, pipeline or chain within which the agent operates. (e.g. "weather-pipeline")
  • All Common Span Attributes SHOULD be set (all required common attributes MUST be set).

Additional attributes on the span:

Tool Data

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.tool.name | string | optional | Name of the tool executed. | "random_number" |
| gen_ai.tool.description | string | optional | Description of the tool executed. | "Tool returning a random number" |
| gen_ai.tool.type | string | optional | The type of the tool. | "function"; "extension"; "datastore" |
| gen_ai.tool.call.arguments | string | optional | Arguments of the tool call (stringified). | '{"max":10}' |
| gen_ai.tool.call.result | string | optional | Result of the tool call (stringified). | "7" |
| gen_ai.tool.message | string | optional | Response from a tool/function call passed to model. | "The random number is 7" |
| gen_ai.tool.input | string | optional | Deprecated. Use gen_ai.tool.call.arguments instead. Input that was given to the executed tool as string. | '{"max":10}' |
| gen_ai.tool.output | string | optional | Deprecated. Use gen_ai.tool.call.result instead. The output from the tool. | "7" |
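A minimal sketch of assembling an Execute Tool span, including the stringified arguments and result, might look as follows. `execute_tool_span_fields` and `stringify` are hypothetical helpers. Passing strings through unchanged and JSON-encoding everything else is an assumption about how to stringify, not a rule stated in this document.

```python
import json
from typing import Any, Optional


def stringify(value: Any) -> str:
    """Turn a value into a string attribute (assumption: strings pass through unchanged)."""
    if isinstance(value, str):
        return value
    return json.dumps(value, separators=(",", ":"))


def execute_tool_span_fields(
    tool_name: str,
    arguments: Any,
    result: Any,
    agent_name: Optional[str] = None,
) -> dict:
    """Build op, name, and attributes for an execute_tool span (hypothetical helper)."""
    attributes = {
        "gen_ai.operation.name": "execute_tool",  # MUST
        "gen_ai.tool.name": tool_name,  # SHOULD
        "gen_ai.tool.call.arguments": stringify(arguments),
        "gen_ai.tool.call.result": stringify(result),
    }
    if agent_name is not None:
        attributes["gen_ai.agent.name"] = agent_name  # SHOULD
    return {
        "op": "gen_ai.execute_tool",
        "name": f"execute_tool {tool_name}",
        "attributes": attributes,
    }


tool_fields = execute_tool_span_fields("random_number", {"max": 10}, 7)
print(tool_fields["attributes"]["gen_ai.tool.call.arguments"])  # {"max":10}
```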

Handoff Span

A span that describes the handoff from one agent to another agent.

  • Span op SHOULD be "gen_ai.handoff".
  • Span name SHOULD be "handoff from {from_agent} to {to_agent}".
  • Attribute gen_ai.operation.name MUST be "handoff".
  • All Common Span Attributes SHOULD be set (all required common attributes MUST be set).

Common Span Attributes

Some attributes are common to all AI Agents spans:

| Attribute | Type | Requirement Level | Description | Example |
| --- | --- | --- | --- | --- |
| gen_ai.operation.name | string | required | The name of the operation being performed. [4] | "chat" |
| gen_ai.system | string | optional | The Generative AI product as identified by the client or server instrumentation. [5] | "openai" |

[4] Well defined values for data attribute gen_ai.operation.name:

| Value | Description |
| --- | --- |
| "chat" | Chat completion operation (e.g. OpenAI Chat API) |
| "create_agent" | Create GenAI agent |
| "embeddings" | Embeddings operation (e.g. OpenAI Create Embeddings API) |
| "execute_tool" | Execute a tool |
| "generate_content" | Multimodal content generation (e.g. Gemini Generate Content) |
| "invoke_agent" | Invoke GenAI agent |
| "text_completion" | Text completion operation |

[5] Well defined values for data attribute gen_ai.system:

| Value | Description |
| --- | --- |
| "anthropic" | Anthropic |
| "aws.bedrock" | AWS Bedrock |
| "az.ai.inference" | Azure AI Inference |
| "az.ai.openai" | Azure OpenAI |
| "cohere" | Cohere |
| "deepseek" | DeepSeek |
| "gcp.gemini" | Gemini |
| "gcp.gen_ai" | Any Google generative AI endpoint |
| "gcp.vertex_ai" | Vertex AI |
| "groq" | Groq |
| "ibm.watsonx.ai" | IBM Watsonx AI |
| "mistral_ai" | Mistral AI |
| "openai" | OpenAI |
| "perplexity" | Perplexity |
| "xai" | xAI |

[6]

The input list should include the most recent messages up to and including the most recent previous model response. The previous model response is identified with an "assistant" or "model" role in common frameworks. If there is no previous model response in the input list, then all input items which are not system instructions should be included. System instructions must be added in gen_ai.system_instructions, and are not included in the gen_ai.input.messages list.
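One possible reading of this rule, sketched in code: system items are separated out, and the remaining items are kept from the most recent model response onwards (or kept in full when there is no previous model response). The function `split_input` is hypothetical, and the interpretation that the kept window starts at the last "assistant" or "model" item is an assumption.

```python
from typing import List, Tuple

MODEL_ROLES = ("assistant", "model")


def split_input(items: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (system_items, input_messages) under the reading described above."""
    system_items = [m for m in items if m.get("role") == "system"]
    non_system = [m for m in items if m.get("role") != "system"]

    # Find the index of the most recent previous model response, if any.
    last_response_index = None
    for index, message in enumerate(non_system):
        if message.get("role") in MODEL_ROLES:
            last_response_index = index

    if last_response_index is None:
        # No previous model response: keep every non-system item.
        return system_items, non_system
    # Assumption: keep the last model response and everything after it.
    return system_items, non_system[last_response_index:]


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello! How can I help?"},
    {"role": "user", "content": "What is the weather in Paris?"},
]
system_items, input_messages = split_input(history)
print(len(system_items), len(input_messages))  # 1 2
```

The system items would go into gen_ai.system_instructions and the remaining messages, once stringified, into gen_ai.input.messages.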

[7]

Binary blobs in the input list should be replaced with the string "[Blob substitute]" in positions where binary data is expected in a given schema. Only binary blobs in positions where binary data is explicitly expected must be redacted. For example, in OpenAI Completions schema, only binary blobs in content blocks with type image_url, input_audio or file should be redacted.
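The idea of redacting only positions where binary data is explicitly expected can be sketched as below. The mapping `BLOB_FIELDS` is an assumption: it guesses that input_audio blocks carry their payload in `input_audio.data` and file blocks in `file.file_data`, so check the field names against the schema you actually use. Image URLs are left out because data URLs are covered by [9]. `redact_blobs` is a hypothetical helper.

```python
import copy

BLOB_SUBSTITUTE = "[Blob substitute]"

# Assumption: block type -> (container key, field holding the binary payload).
BLOB_FIELDS = {
    "input_audio": ("input_audio", "data"),
    "file": ("file", "file_data"),
}


def redact_blobs(message: dict) -> dict:
    """Return a copy of the message with known blob positions replaced."""
    redacted = copy.deepcopy(message)  # never mutate the caller's data
    content = redacted.get("content")
    if not isinstance(content, list):
        return redacted  # plain string content has no typed blocks
    for block in content:
        if not isinstance(block, dict):
            continue
        path = BLOB_FIELDS.get(block.get("type"))
        if path is None:
            continue  # e.g. text blocks are never touched
        container_key, field = path
        container = block.get(container_key)
        if isinstance(container, dict) and field in container:
            container[field] = BLOB_SUBSTITUTE
    return redacted


original = {
    "role": "user",
    "content": [
        {"type": "text", "content": "aGVsbG8gd29ybGQ="},
        {"type": "input_audio", "input_audio": {"data": "UklGRg==", "format": "wav"}},
    ],
}
cleaned = redact_blobs(original)
print(cleaned["content"][1]["input_audio"]["data"])  # [Blob substitute]
```

Note that the base64-looking string inside the text block is left alone, since binary data is not expected in that position.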

[8]

In some agent libraries, the agent name is optional, and some do not provide the user an option to name their agents. In these cases, the span name SHOULD be "invoke_agent {call_id}", where call_id is some user-provided identifier for the agent invocation. For example, functionId in Vercel AI.
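The fallback described in [8] can be sketched as a small naming function. `invoke_agent_span_name` is a hypothetical helper, and returning the bare "invoke_agent" when neither an agent name nor a call identifier is available is an assumption, since that case is not covered above.

```python
from typing import Optional


def invoke_agent_span_name(
    agent_name: Optional[str] = None, call_id: Optional[str] = None
) -> str:
    """Pick the span name for an invoke_agent span (hypothetical helper)."""
    if agent_name:
        return f"invoke_agent {agent_name}"
    if call_id:
        # Fall back to a user-provided identifier, e.g. functionId in Vercel AI.
        return f"invoke_agent {call_id}"
    # Assumption: with no identifier at all, use the operation name alone.
    return "invoke_agent"


print(invoke_agent_span_name(agent_name="Weather Agent"))  # invoke_agent Weather Agent
print(invoke_agent_span_name(call_id="weather-lookup"))  # invoke_agent weather-lookup
```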

[9]

Image URLs in the data URL format in the input list should be replaced with the string "[Blob substitute]" in positions where binary data is expected. For example, data URLs like data:image/png;base64 will be redacted, but HTTP URLs like example.com/data?<a-base64-string> will not be.

See here for the regex used.
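The regex referred to above is not reproduced in this document. As a rough illustration of the distinction in [9], the sketch below uses a hypothetical pattern that matches base64 data URLs and leaves ordinary HTTP URLs untouched. Both `DATA_URL_PATTERN` and `redact_image_url` are assumptions and should not be taken as the pattern Sentry actually uses.

```python
import re

BLOB_SUBSTITUTE = "[Blob substitute]"

# Hypothetical pattern: "data:<type>/<subtype>[;param...];base64," at the start.
DATA_URL_PATTERN = re.compile(
    r"^data:[\w.+-]+/[\w.+-]+(;[\w=.+-]+)*;base64,", re.IGNORECASE
)


def redact_image_url(url: str) -> str:
    """Replace base64 data URLs with the substitute string; keep other URLs."""
    if DATA_URL_PATTERN.match(url):
        return BLOB_SUBSTITUTE
    return url


print(redact_image_url("data:image/png;base64,iVBORw0KGgo="))  # [Blob substitute]
print(redact_image_url("https://example.com/data?aGVsbG8="))  # unchanged
```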