Area(s)
area:gen-ai
What's missing?
Two primitives that would let frameworks express structure without requiring new span types:
Sub-trace grouping - there is no way to say "these spans belong to the same logical unit" (a round, a task, a step) without creating a parent span solely for grouping. Without this, backends must infer grouping from span order, which is brittle and breaks when retries, internal framework spans, or nested agents appear between the expected spans (as demonstrated in the discussion of "Adding ReAct Iterations Spans in Reasoning-Acting Agents", semantic-conventions-genai#81).
Typed relationships - there is no way to express, for example, "this tool execution was triggered by that LLM output" or "this agent delegates to that agent". Without this, backends cannot trace causality through an agent's decision chain: they can see that a tool was called, but not which LLM output caused it. OTel span links already exist in the spec (links-between-spans) and could serve this purpose, but they are not used anywhere in the GenAI conventions today.
Current State
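GenAI semantic conventions (as of v1.40.0+) provide a solid foundation:
create_agent, invoke_agent
invoke_workflow
inference, embeddings, retrieval
execute_tool
gen_ai.evaluation.result event
gen_ai.tool.call.id, gen_ai.agent.id, gen_ai.conversation.id
The SIG is currently receiving many proposals, each introducing new span types to address specific GenAI patterns:
gen_ai.workflow + gen_ai.task, gen_ai.task.*, orchestrate_tools, gen_ai.skill, gen_ai.agent.skills.*
Each of these proposals is well motivated, but taken together they represent an N+1 span type problem: every new GenAI pattern (rounds, tasks, skills, orchestration, guardrails, memory) gets its own span type with its own attributes.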
This creates several risks:
Instrumentation burden: libraries must implement an ever-growing set of span types
Backend fragmentation: each backend must understand each span type to render it meaningfully
Cross-framework inconsistency: what LangChain calls a "round", CrewAI calls a "task", and DSPy calls a "module step"
What already works well (pattern to follow)
OTel GenAI conventions already use generic primitives successfully:
gen_ai.operation.name - an extensible enum (chat, invoke_agent, execute_tool, etc.) that lets frameworks declare what a span represents without needing a separate span definition per operation
session.id - a generic attribute in the OTel registry (not GenAI-specific) that any domain can use for session grouping
gen_ai.tool.type - a generic field (function, extension, datastore) rather than separate span types per tool kind.
These are "box of shapes" primitives: OTel defines the shape vocabulary, frameworks decide which shapes to use.
Describe the solution you'd like
1. Generic Grouping Attributes
Two new attributes in the gen_ai registry:
| Attribute | Type | Description | Example |
|---|---|---|---|
| gen_ai.group.id | string | Identifier for a logical group of spans within a trace | "round-3", "task-research", "step-plan" |
| gen_ai.group.type | string (open enum) | Type of the logical group | "react_round", "task", "planning_step", "skill" |
How this subsumes existing proposals:
The idea is straightforward: instead of defining a new span type for each concept, instrumentations add gen_ai.group.id and gen_ai.group.type as attributes on the spans they already emit today (e.g., chat, execute_tool, invoke_agent). Grouping is expressed by giving related spans the same gen_ai.group.id value; no new span definitions or parent spans are required.
Concrete example of grouping ReAct rounds:
Consider a ReAct agent that takes 2 rounds to answer a question. Today an instrumentation already emits these spans:
invoke_agent research_agent
├── chat gpt-4
├── execute_tool web_search
├── chat gpt-4
├── execute_tool summarize
└── chat gpt-4
Without a grouping primitive, a backend must guess which spans belong to which round by pattern-matching on span order, which breaks when retries, internal framework spans, or nested agents appear (as discussed in open-telemetry/semantic-conventions-genai#81).
With the grouping primitive, the instrumentation simply tags each span with a shared group ID:
invoke_agent research_agent
├── chat gpt-4 {gen_ai.group.id: "round-1", gen_ai.group.type: "react_round"}
├── execute_tool web_search {gen_ai.group.id: "round-1", gen_ai.group.type: "react_round"}
├── chat gpt-4 {gen_ai.group.id: "round-2", gen_ai.group.type: "react_round"}
├── execute_tool summarize {gen_ai.group.id: "round-2", gen_ai.group.type: "react_round"}
└── chat gpt-4 (no group: this is the final answer, not part of a ReAct cycle)
The spans are the same ones the instrumentation already produces today. No new span types or wrapper spans are introduced. The only addition is two attributes (gen_ai.group.id and gen_ai.group.type) on those spans. Now any backend can:
Count rounds by counting distinct gen_ai.group.id values where gen_ai.group.type is "react_round" (2 in this case)
Group spans by round for visualization
Alert on agents exceeding N rounds
No new span type was defined and no wrapper span was created. The structure is explicit and not inferred.
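To make the backend side concrete, here is a minimal sketch of how the three operations above could be computed from the example trace. It assumes spans have already been exported into plain dictionaries with a `name` and an `attributes` mapping; that shape, and the names `group_spans` and `max_rounds`, are assumptions for illustration rather than any backend's actual data model.

```python
from collections import defaultdict

ROUND = {"gen_ai.group.type": "react_round"}

# The child spans of the example trace, in emission order.
spans = [
    {"name": "chat gpt-4", "attributes": {"gen_ai.group.id": "round-1", **ROUND}},
    {"name": "execute_tool web_search", "attributes": {"gen_ai.group.id": "round-1", **ROUND}},
    {"name": "chat gpt-4", "attributes": {"gen_ai.group.id": "round-2", **ROUND}},
    {"name": "execute_tool summarize", "attributes": {"gen_ai.group.id": "round-2", **ROUND}},
    {"name": "chat gpt-4", "attributes": {}},  # final answer, no group
]


def group_spans(spans, group_type):
    """Map each gen_ai.group.id of the given type to its member span names."""
    groups = defaultdict(list)
    for span in spans:
        attrs = span["attributes"]
        group_id = attrs.get("gen_ai.group.id")
        if group_id is not None and attrs.get("gen_ai.group.type") == group_type:
            groups[group_id].append(span["name"])
    return dict(groups)


rounds = group_spans(spans, "react_round")
round_count = len(rounds)  # number of distinct group IDs

max_rounds = 5  # hypothetical alert threshold
too_many_rounds = round_count > max_rounds
```

Because membership is read from attributes rather than from span order, inserting a retry or an internal framework span between the tagged spans would not change the result.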
By "add" we mean: include gen_ai.group.id and gen_ai.group.type as referenced attributes in the existing span definitions in spans.yaml (e.g., under span.gen_ai.invoke_agent.client and span.gen_ai.execute_tool.internal), the same way attributes like gen_ai.agent.name or gen_ai.tool.call.id are referenced today. Instrumentations would then set these attributes when creating those spans.
| Proposal | Current approach (new span type) | With grouping primitive (attributes on spans already being emitted) |
|---|---|---|
| #3419 ReAct Iterations Spans | Define a new react_iteration span that wraps each LLM+tool cycle | Add gen_ai.group.type = "react_round" and a shared gen_ai.group.id to the chat and execute_tool spans that the instrumentation already produces |
| #2912 Add Tasks | Define a new gen_ai.task span type | Add gen_ai.group.type = "task" and a shared gen_ai.group.id to the invoke_agent / execute_tool spans that already represent the work |
| #2993 Add Tool orchestration | Define a new orchestrate_tools span | Add gen_ai.group.type = "tool_cycle" and a shared gen_ai.group.id to the chat and execute_tool spans within the cycle |
| #3540 Add skill | Define a new gen_ai.skill span type | Add gen_ai.group.type = "skill" and a shared gen_ai.group.id to the spans emitted during skill execution |
Key design decisions:
gen_ai.group.type is an open enum: frameworks define their own values, and OTel does not prescribe what a "round" or "task" means
Multiple group memberships can be expressed by recording the attribute on multiple spans with the same gen_ai.group.id
Requirement level: recommended on invoke_agent, execute_tool, and invoke_workflow spans
This does not prevent frameworks from also creating parent spans for visualization hierarchy; it just provides an alternative that doesn't require them
2. Typed Span Links for GenAI Relationships
OTel span links already support expressing non-parent/child relationships between spans. The GenAI conventions should document and recommend their use.
Span links are not unknown to the GenAI SIG; they have come up in a few specific discussions.
However, in each case span links were proposed as a solution for one narrow use case. They have not been considered as a general-purpose relationship primitive across the GenAI conventions. This proposal suggests elevating them to that role.
Proposed guidance:
Instrumentations should consider using span links to express causal or semantic relationships between GenAI spans that are not captured by the parent-child hierarchy. When creating a span link, instrumentations should set an attribute on the link describing the relationship type.
Suggested link relationship types:
| Relationship | Description | Example |
|---|---|---|
| triggered_by | The span was triggered by the linked span's output | Tool execution triggered by LLM tool_call response |
| delegates_to | This agent delegates work to the linked agent | Parent agent invoking a sub-agent |
| evaluates | Evaluation targets the linked span | Evaluation event referencing the span it scores |
Note: open-telemetry/semantic-conventions-genai#33 proposes span links for evaluation->target binding, which validates this approach; this proposal generalizes that pattern.
Example:
invoke_agent main_agent
├── chat gpt-4 (span_id: A)
│ └── response includes a tool_call: "search"
├── execute_tool search (span_id: B, link: {span_id: A, type: "triggered_by"})
└── gen_ai.evaluation.result (link: {span_id: B, type: "evaluates"})
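As a rough illustration of what typed links would enable, the sketch below models the example trace as dictionaries and walks the links to recover causality. The data layout, the link key `type` (taken from the shorthand in the example; the actual attribute name is not defined by this proposal), the ID "C" for the evaluation result, and the function names are all assumptions made for the example.

```python
# Spans keyed by span ID; each link carries the target span ID and a relationship type.
spans = {
    "A": {"name": "chat gpt-4", "links": []},
    "B": {
        "name": "execute_tool search",
        "links": [{"span_id": "A", "type": "triggered_by"}],
    },
    # "C" is a hypothetical ID; the example does not give the evaluation one.
    "C": {
        "name": "gen_ai.evaluation.result",
        "links": [{"span_id": "B", "type": "evaluates"}],
    },
}


def linked_targets(spans, span_id, relationship):
    """Return the IDs a span links to with the given relationship type."""
    return [
        link["span_id"]
        for link in spans[span_id]["links"]
        if link["type"] == relationship
    ]


def causal_chain(spans, span_id):
    """Follow triggered_by links backwards from a span to its root cause."""
    chain = [span_id]
    while True:
        targets = linked_targets(spans, chain[-1], "triggered_by")
        # Stop at the root, on a dangling link, or if a cycle is detected.
        if not targets or targets[0] not in spans or targets[0] in chain:
            return chain
        chain.append(targets[0])


tool_cause = causal_chain(spans, "B")               # tool span back to the LLM call
evaluated = linked_targets(spans, "C", "evaluates")  # span scored by the evaluation
```

A backend could use the first result to show which LLM output caused a tool call, and the second to attach an evaluation score to the span it refers to, neither of which is recoverable from the parent-child hierarchy alone.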
Relationship to Existing Proposals
This proposal is complementary to, not a replacement for, the existing proposals. It offers a design principle:
For proposals that need relationships (#2626 evaluation spans): typed span links provide the mechanism