Commit 68f517a

Merge branch 'DOC-1901-document-feature-adp-execution-logs-ui-package-1' into adp-pkg1
2 parents a72b6b8 + cc149b9 commit 68f517a

4 files changed

Lines changed: 161 additions & 32 deletions

modules/ROOT/nav.adoc

Lines changed: 3 additions & 1 deletion
@@ -92,7 +92,9 @@
9292
**** xref:ai-agents:mcp/local/overview.adoc[Overview]
9393
**** xref:ai-agents:mcp/local/quickstart.adoc[Quickstart]
9494
**** xref:ai-agents:mcp/local/configuration.adoc[Configure]
95-
** xref:ai-agents:observability/concepts.adoc[Transcripts]
95+
** xref:ai-agents:observability/index.adoc[Transcripts]
96+
*** xref:ai-agents:observability/concepts.adoc[Concepts]
97+
*** xref:ai-agents:observability/view-transcripts.adoc[View Transcripts]
9698
9799
* xref:develop:connect/about.adoc[Redpanda Connect]
98100
** xref:develop:connect/connect-quickstart.adoc[Quickstart]

modules/ai-agents/pages/observability/concepts.adoc

Lines changed: 32 additions & 31 deletions
@@ -1,12 +1,12 @@
11
= Transcripts and AI Observability
2-
:description: Understand how Redpanda captures execution traces for agents and MCP servers using OpenTelemetry.
2+
:description: Understand how Redpanda captures execution transcripts for agents and MCP servers using OpenTelemetry.
33
:page-topic-type: concepts
44
:personas: agent_developer, platform_admin, data_engineer
5-
:learning-objective-1: Explain how traces and spans capture execution flow
6-
:learning-objective-2: Interpret trace structure for debugging and monitoring
7-
:learning-objective-3: Distinguish between observability traces and audit logs
5+
:learning-objective-1: Explain how transcripts and spans capture execution flow
6+
:learning-objective-2: Interpret transcript structure for debugging and monitoring
7+
:learning-objective-3: Distinguish between transcripts and audit logs
88

9-
Redpanda automatically captures execution traces for both AI agents and MCP servers, providing complete observability into how your agentic systems operate.
9+
Redpanda automatically captures execution transcripts for both AI agents and MCP servers, providing complete observability into how your agentic systems operate.
1010

1111
After reading this page, you will be able to:
1212

@@ -27,7 +27,7 @@ Transcripts capture:
2727
* Error conditions
2828
* Performance metrics
2929

30-
With 100% sampling, every operation is captured, creating complete transcripts that you can use for debugging, monitoring, and performance analysis.
30+
With 100% sampling, every operation is captured, enabling comprehensive debugging, monitoring, and performance analysis.
3131

3232
== Traces and spans
3333

@@ -37,13 +37,13 @@ OpenTelemetry traces provide a complete picture of how a request flows through y
3737
* A _span_ represents a single unit of work within that trace (such as a data processing operation or an external API call).
3838
* A trace contains one or more spans organized hierarchically, showing how operations relate to each other.
3939

40-
== Agent trace hierarchy
40+
== Agent transcript hierarchy
4141

4242
Agent executions create a hierarchy of spans that reflect how agents process requests. Understanding this hierarchy helps you interpret agent behavior and identify where issues occur.
4343

4444
=== Agent span types
4545

46-
Agent traces contain these span types:
46+
Agent transcripts contain these span types:
4747

4848
[cols="2,3,3", options="header"]
4949
|===
@@ -90,13 +90,13 @@ This shows:
9090

9191
Examine span durations to identify where time is spent and optimize accordingly.
9292

93-
== MCP server trace hierarchy
93+
== MCP server transcript hierarchy
9494

95-
MCP server executions create a different hierarchy that reflects tool invocations and internal processing. Understanding this hierarchy helps you debug tool execution and identify performance bottlenecks.
95+
MCP server tool invocations produce a different span hierarchy focused on tool execution and internal processing. This structure reveals performance bottlenecks and helps debug tool-specific issues.
9696

9797
=== MCP server span types
9898

99-
MCP server traces contain these span types:
99+
MCP server transcripts contain these span types:
100100

101101
[cols="2,3,3", options="header"]
102102
|===
@@ -143,13 +143,13 @@ This shows:
143143

144144
The majority of time (4+ seconds) is spent in tool execution, while internal processing (mapping) takes only microseconds. This indicates the tool itself (likely making external API calls or database queries) is the bottleneck, not Redpanda Connect's internal processing.
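As an illustration of the duration analysis described above, the following sketch ranks spans by elapsed time. It assumes each span is a dictionary carrying the standard OTLP `startTimeUnixNano` and `endTimeUnixNano` fields (encoded as strings in OTLP JSON); the span names and timestamps are hypothetical sample values, not real output.

```python
from typing import Any


def span_duration_ms(span: dict[str, Any]) -> float:
    """Return a span's duration in milliseconds.

    OTLP JSON encodes nanosecond timestamps as strings, so convert first.
    """
    start_ns = int(span["startTimeUnixNano"])
    end_ns = int(span["endTimeUnixNano"])
    return (end_ns - start_ns) / 1_000_000


def slowest_spans(spans: list[dict[str, Any]], limit: int = 3) -> list[tuple[str, float]]:
    """Return (name, duration_ms) pairs for the longest-running spans."""
    ranked = sorted(spans, key=span_duration_ms, reverse=True)
    return [(s["name"], span_duration_ms(s)) for s in ranked[:limit]]


# Hypothetical spans for illustration only.
example_spans = [
    {"name": "mapping", "startTimeUnixNano": "1000000000", "endTimeUnixNano": "1000050000"},
    {"name": "weather_lookup", "startTimeUnixNano": "1000000000", "endTimeUnixNano": "5200000000"},
]

print(slowest_spans(example_spans))
```

A span that dominates the ranking, as the hypothetical tool span does here, is the first place to look for a bottleneck.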
145145

146-
== Trace layers and scope
146+
== Transcript layers and scope
147147

148-
Traces contain multiple layers of instrumentation, from HTTP transport through application logic to external service calls. The `scope.name` field in each span identifies which layer of instrumentation created that span.
148+
Transcripts contain multiple layers of instrumentation, from HTTP transport through application logic to external service calls. The `scope.name` field in each span identifies which instrumentation layer created that span.
149149

150150
=== Instrumentation layers
151151

152-
A complete agent trace includes these layers:
152+
A complete agent transcript includes these layers:
153153

154154
[cols="2,2,4", options="header"]
155155
|===
@@ -178,7 +178,7 @@ A complete agent trace includes these layers:
178178

179179
=== How layers connect
180180

181-
Layers connect through parent-child relationships in a single trace:
181+
Layers connect through parent-child relationships in a single transcript:
182182

183183
----
184184
ai-agent-http-server (HTTP Server layer)
@@ -191,7 +191,7 @@ ai-agent-http-server (HTTP Server layer)
191191
└── chat gpt-5-nano (AI SDK layer, LLM call 2)
192192
----
193193

194-
This shows:
194+
The request flow demonstrates:
195195

196196
1. HTTP request arrives at agent
197197
2. Agent invokes sub-agent
@@ -201,9 +201,9 @@ This shows:
201201
6. Agent makes second LLM call with tool results
202202
7. Response returns through HTTP layer
203203

204-
=== Cross-service traces
204+
=== Cross-service transcripts
205205

206-
When agents call MCP tools, the trace spans multiple services. Each service has a different `service.name` in the resource attributes:
206+
When agents call MCP tools, the transcript spans multiple services. Each service has a different `service.name` in the resource attributes:
207207

208208
* Agent spans: `"service.name": "ai-agent"`
209209
* MCP server spans: `"service.name": "mcp-{server-id}"`
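One way to spot transcripts that cross service boundaries is to group spans by `traceId` and collect the distinct `service.name` values. The sketch below assumes spans have already been decoded into dictionaries with resource attributes flattened into a plain mapping; that shape, the trace IDs, and the server ID `abc123` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Any


def services_by_trace(spans: list[dict[str, Any]]) -> dict[str, set[str]]:
    """Map each traceId to the set of service names that emitted spans for it."""
    services: dict[str, set[str]] = defaultdict(set)
    for span in spans:
        # Assumed shape: resource attributes flattened into a plain dict.
        service = span["resource"]["attributes"]["service.name"]
        services[span["traceId"]].add(service)
    return dict(services)


def cross_service_traces(spans: list[dict[str, Any]]) -> list[str]:
    """Return the sorted traceIds whose spans come from more than one service."""
    grouped = services_by_trace(spans)
    return sorted(trace_id for trace_id, names in grouped.items() if len(names) > 1)


# Hypothetical spans; "abc123" stands in for a real MCP server ID.
example_spans = [
    {"traceId": "t1", "resource": {"attributes": {"service.name": "ai-agent"}}},
    {"traceId": "t1", "resource": {"attributes": {"service.name": "mcp-abc123"}}},
    {"traceId": "t2", "resource": {"attributes": {"service.name": "ai-agent"}}},
]

print(cross_service_traces(example_spans))
```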
@@ -240,9 +240,9 @@ Redpanda Connect layer:
240240

241241
- Component-specific attributes from your tool configuration
242242

243-
Use `scope.name` to filter spans by layer when analyzing traces.
243+
Use `scope.name` to filter spans by layer when analyzing transcripts.
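A minimal sketch of filtering by layer, assuming each decoded span exposes its instrumentation scope as a nested `scope` mapping with a `name` key. The `rpcn-mcp` scope is the one this documentation names for MCP tool spans; `example-http-layer` and the span names are hypothetical placeholders.

```python
from typing import Any


def spans_in_scope(spans: list[dict[str, Any]], scope_name: str) -> list[dict[str, Any]]:
    """Return only the spans created by the given instrumentation scope."""
    # Assumed shape: span["scope"]["name"]; spans without a scope are skipped.
    return [s for s in spans if s.get("scope", {}).get("name") == scope_name]


# Hypothetical spans for illustration only.
example_spans = [
    {"name": "weather_lookup", "scope": {"name": "rpcn-mcp"}},
    {"name": "POST /invoke", "scope": {"name": "example-http-layer"}},
    {"name": "unscoped span"},
]

mcp_spans = spans_in_scope(example_spans, "rpcn-mcp")
print([s["name"] for s in mcp_spans])
```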
244244

245-
== Understand the trace structure
245+
== Understand the transcript structure
246246

247247
Each span captures a unit of work. Here's what a typical MCP tool invocation looks like:
248248

@@ -273,7 +273,7 @@ Key elements to understand:
273273

274274
=== Parent-child relationships
275275

276-
Traces show how operations relate. A tool invocation (parent) may trigger internal operations (children):
276+
Transcripts show how operations relate. A tool invocation (parent) may trigger internal operations (children):
277277

278278
[,json]
279279
----
@@ -289,9 +289,9 @@ Traces show how operations relate. A tool invocation (parent) may trigger intern
289289

290290
The `parentSpanId` links this child span to the parent tool invocation. Both share the same `traceId` so you can reconstruct the complete operation.
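The reconstruction described here can be sketched as a small tree builder. It assumes all spans passed in belong to a single `traceId` and carry the standard OTLP `spanId`, `parentSpanId`, `name`, and `startTimeUnixNano` fields; the IDs and timestamps below are hypothetical.

```python
from collections import defaultdict
from typing import Any, Optional


def render_tree(spans: list[dict[str, Any]]) -> list[str]:
    """Return indented span names showing parent-child nesting for one trace."""
    children: dict[Optional[str], list[dict[str, Any]]] = defaultdict(list)
    for span in spans:
        # Root spans have no parentSpanId (missing or empty string).
        children[span.get("parentSpanId") or None].append(span)

    lines: list[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        # Visit siblings in start-time order so the output follows execution order.
        siblings = sorted(children[parent_id], key=lambda s: int(s["startTimeUnixNano"]))
        for span in siblings:
            lines.append("  " * depth + span["name"])
            walk(span["spanId"], depth + 1)

    walk(None, 0)
    return lines


# Hypothetical spans sharing one traceId.
example_spans = [
    {"spanId": "c", "parentSpanId": "b", "name": "chat gpt-5-nano", "startTimeUnixNano": "30"},
    {"spanId": "a", "parentSpanId": "", "name": "ai-agent-http-server", "startTimeUnixNano": "10"},
    {"spanId": "b", "parentSpanId": "a", "name": "invoke_agent", "startTimeUnixNano": "20"},
]

print("\n".join(render_tree(example_spans)))
```

Spans whose parent is not in the input are not reachable from a root and are omitted by this sketch, which may matter when only part of a trace has been collected.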
291291

292-
== Error events in traces
292+
== Error events in transcripts
293293

294-
When something goes wrong, traces capture error details:
294+
When something goes wrong, transcripts capture error details:
295295

296296
[,json]
297297
----
@@ -314,32 +314,32 @@ When something goes wrong, traces capture error details:
314314
The `events` array captures what happened and when. Use `timeUnixNano` to see exactly when the error occurred within the operation.
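To see when an event occurred relative to the start of its operation, subtract the span's `startTimeUnixNano` from the event's `timeUnixNano`. The sketch below assumes a decoded span dictionary with those standard OTLP fields; the span name, event name, and timestamps are hypothetical.

```python
from typing import Any


def event_offsets_ms(span: dict[str, Any]) -> list[tuple[str, float]]:
    """Return (event_name, milliseconds_after_span_start) for each span event."""
    # OTLP JSON encodes nanosecond timestamps as strings, so convert first.
    start_ns = int(span["startTimeUnixNano"])
    return [
        (event["name"], (int(event["timeUnixNano"]) - start_ns) / 1_000_000)
        for event in span.get("events", [])
    ]


# Hypothetical span with one error event 250 ms after it started.
example_span = {
    "name": "weather_lookup",
    "startTimeUnixNano": "1000000000",
    "endTimeUnixNano": "1300000000",
    "events": [{"name": "exception", "timeUnixNano": "1250000000"}],
}

print(event_offsets_ms(example_span))
```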
315315

316316
[[opentelemetry-traces-topic]]
317-
== How Redpanda stores traces
317+
== How Redpanda stores trace data
318318

319-
The `redpanda.otel_traces` topic stores OpenTelemetry spans in JSON format, following the https://opentelemetry.io/docs/specs/otel/protocol/[OpenTelemetry Protocol (OTLP)^] specification. A Protobuf schema named `redpanda.otel_traces-value` is also automatically registered with the topic, enabling clients to deserialize trace data correctly.
319+
The `redpanda.otel_traces` topic stores OpenTelemetry spans using Redpanda's Schema Registry wire format with a custom Protobuf schema named `redpanda.otel_traces-value` that closely follows the https://opentelemetry.io/docs/specs/otel/protocol/[OpenTelemetry Protocol (OTLP)^] specification. This schema is automatically registered in the Schema Registry with the topic, enabling clients to deserialize trace data correctly.
320320

321-
The `redpanda.otel_traces` topic and its schema are managed automatically by Redpanda. If you delete either the topic or the schema, they are recreated automatically. However, deleting the topic permanently deletes all trace data, and the topic comes back empty. Do not produce your own data to this topic. It is reserved for OpenTelemetry traces.
321+
Redpanda manages both the `redpanda.otel_traces` topic and its schema automatically. If you delete either the topic or the schema, they are recreated automatically. However, deleting the topic permanently deletes all trace data, and the topic comes back empty. Do not produce your own data to this topic. It is reserved for OpenTelemetry traces.
322322

323323
=== Topic configuration and lifecycle
324324

325325
The `redpanda.otel_traces` topic has a predefined retention policy. Configuration changes to this topic are not supported. If you modify settings, Redpanda reverts them to the default values.
326326

327327
The topic persists in your cluster even after all agents and MCP servers are deleted, allowing you to retain historical trace data for analysis.
328328

329-
Trace data may contain sensitive information from your tool inputs and outputs. Consider implementing appropriate glossterm:ACL[access control lists (ACLs)] for the `redpanda.otel_traces` topic, and review the data in traces before sharing or exporting to external systems.
329+
Transcripts may contain sensitive information from your tool inputs and outputs. Consider implementing appropriate glossterm:ACL[access control lists (ACLs)] for the `redpanda.otel_traces` topic, and review the data in transcripts before sharing or exporting to external systems.
330330

331-
== Traces compared to audit logs
331+
== Transcripts compared to audit logs
332332

333-
OpenTelemetry traces are designed for observability and debugging, not audit logging or compliance.
333+
Transcripts are designed for observability and debugging, not audit logging or compliance.
334334

335-
Traces provide:
335+
Transcripts provide:
336336

337337
* Hierarchical view of request flow through your system (parent-child span relationships)
338338
* Detailed timing information for performance analysis
339339
* Ability to reconstruct execution paths and identify bottlenecks
340340
* Insights into how operations flow through distributed systems
341341

342-
Traces are not:
342+
Transcripts are not:
343343

344344
* Immutable audit records for compliance purposes
345345
* Designed for "who did what" accountability tracking
@@ -348,5 +348,6 @@ For compliance and audit requirements, use the session and task topics for agent
348348

349349
== Next steps
350350

351+
* xref:ai-agents:observability/view-transcripts.adoc[]
351352
* xref:ai-agents:agents/monitor-agents.adoc[]
352353
* xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[]
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
= Transcripts
2+
:page-layout: index
3+
:description: Monitor agent and MCP server execution using complete OpenTelemetry traces captured by Redpanda.
4+
5+
{description}
6+
Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
1+
= View Transcripts
2+
:description: Learn how to filter, search, and navigate the Transcripts interface to investigate agent execution traces using multiple detail views and interactive timeline navigation.
3+
:page-topic-type: how-to
4+
:personas: agent_developer, platform_admin
5+
:learning-objective-1: Filter and search transcripts to find specific execution traces
6+
:learning-objective-2: Navigate between detail views to inspect span information at different levels
7+
:learning-objective-3: Use the timeline interactively to navigate to specific time periods
8+
9+
The Transcripts view provides filtering, searching, and navigation capabilities for investigating agent and MCP server execution transcripts. Use these features to efficiently locate specific operations, analyze performance patterns, and debug issues across tool invocations, LLM calls, and agent reasoning steps.
10+
11+
After reading this page, you will be able to:
12+
13+
* [ ] {learning-objective-1}
14+
* [ ] {learning-objective-2}
15+
* [ ] {learning-objective-3}
16+
17+
For basic orientation on agent and MCP server monitoring, see xref:ai-agents:agents/monitor-agents.adoc[] or xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[]. For conceptual background on what transcripts capture and how spans are organized hierarchically, see xref:ai-agents:observability/concepts.adoc[].
18+
19+
== Prerequisites
20+
21+
* xref:ai-agents:agents/create-agent.adoc[Running agent] or xref:ai-agents:mcp/remote/quickstart.adoc[MCP server] with at least one execution
22+
* Access to the Transcripts view (requires appropriate permissions to read the `redpanda.otel_traces` topic)
23+
24+
== Navigate the Transcripts interface
25+
26+
=== Use the interactive timeline
27+
28+
Use the timeline visualization to quickly identify when errors began or patterns changed, and navigate directly to transcripts from particular timestamps.
29+
30+
When viewing time periods with many transcripts (hundreds or thousands), the timeline displays a subset of the data to maintain performance and usability. The timeline bar indicates the actual time range of currently visible data, which may be narrower than your selected range.
31+
32+
TIP: See xref:ai-agents:agents/monitor-agents.adoc[] and xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[] to learn basic execution patterns and health indicators to investigate.
33+
34+
=== Search and filter for transcripts
35+
36+
Use search and filters together to narrow down transcripts and quickly locate specific executions.
37+
38+
==== Search for specific transcripts
39+
40+
The search functionality helps you find transcripts by operation names, span types, or identifiers:
41+
42+
* Search by span names to find specific xref:ai-agents:observability/concepts.adoc#agent-span-types[agent operations] like `invoke_agent`, or xref:ai-agents:mcp/remote/create-tool.adoc[MCP tools]
43+
* Search by xref:ai-agents:observability/concepts.adoc#instrumentation-layers[scope] to filter by layer (for example, `rpcn-mcp` for MCP tool spans)
44+
* Search by trace IDs (`traceId`) when correlating with external systems or troubleshooting specific requests
45+
46+
==== Filter by service
47+
48+
Service filtering shows only transcripts from specific agents or MCP servers using the `service.name` resource attribute. See xref:ai-agents:observability/concepts.adoc#cross-service-transcripts[Cross-service transcripts] to understand how transcripts span multiple services.
49+
50+
* View executions from a single agent when multiple are running (service name: `ai-agent`)
51+
* Isolate MCP server activity from agent activity (service name: `mcp-{server-id}`)
52+
* Compare behavior across different service instances
53+
54+
==== Filter by execution status
55+
56+
Status filtering shows transcripts based on their execution outcome:
57+
58+
* Show successful executions for health checks
59+
* Show only failed executions for error investigation
60+
* Toggle between success and error views to compare and analyze patterns
61+
62+
==== Adjust time range
63+
64+
Use the time range selector to focus on specific time periods (from the last five minutes up to the last 24 hours):
65+
66+
* View recent executions (for example, over the last hour) to monitor real-time activity
67+
* Expand to longer periods for trend analysis over the last day
68+
* Narrow to specific time windows when investigating issues that occurred at known times
69+
70+
TIP: Apply broad filters first (time range, service) to reduce the transcript set, then use search to narrow to specific operations.
71+
72+
== Inspect span details
73+
74+
Each row in the transcript table represents a high-level agent or MCP server request flow. Expand each parent span to see the xref:ai-agents:observability/concepts.adoc#agent-transcript-hierarchy[hierarchical structure] of nested operations, including tool calls, LLM interactions, and internal processing steps. Parent-child spans show how operations relate: for example, an agent invocation (parent) triggers LLM calls and tool executions (children).
75+
76+
When agents invoke remote MCP servers, transcripts fold together across service boundaries to provide a unified view of the complete operation. The trace ID originates at the initial request touchpoint and propagates across all involved services, linking spans from both the agent and MCP server under a single transcript. Use the tree view to follow the trace flow across multiple services and understand the complete request lifecycle.
77+
78+
If you use external agents that directly invoke MCP servers in the Redpanda Agentic Data Plane, you may only see MCP-level parent transcripts, unless you have configured the agents to also emit traces to the Redpanda OTEL ingestion pipeline.
79+
80+
Selected spans display detailed information at multiple levels, from high-level summaries to complete raw data:
81+
82+
* Start with summary view for quick assessment
83+
* Inspect attributes for detailed investigation
84+
* Use raw data when you need complete information
85+
86+
=== Summary view
87+
88+
The summary panel provides high-level span information:
89+
90+
* Total nested operations (span count) and execution time
91+
* Token usage for LLM operations
92+
* Counts of LLM calls and tool calls
93+
94+
Click on an individual span to drill down into the execution context:
95+
96+
* View the full conversation history saved for that session, including user prompts, configured xref:ai-agents:agents/create-agent.adoc#write-the-system-prompt[system prompts] to guide agent behavior, and LLM outputs
97+
* Inspect individual tool calls made by the agent and any of its sub-agents, including request arguments and responses
98+
99+
TIP: Expand the summary panel to full view to easily read long conversations.
100+
101+
=== Detailed attributes view
102+
103+
The attributes view shows structured metadata for each transcript span. Use this view to quickly locate an attribute value such as conversation ID, then paste it into the search box to find all operations from that conversation session. See xref:ai-agents:observability/concepts.adoc#key-attributes-by-layer[Transcripts and AI Observability] for details on standard attributes by instrumentation layer.
104+
105+
=== Raw data view
106+
107+
The raw data view provides the complete span structure:
108+
109+
* Full OpenTelemetry span in JSON format
110+
* All fields including those not displayed in summary or attributes views
111+
* Structured data suitable for export or programmatic access
112+
113+
You can also view the raw transcript data in the `redpanda.otel_traces` topic.
114+
115+
== Next steps
116+
117+
* xref:ai-agents:agents/monitor-agents.adoc[]
118+
* xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[]
119+
* xref:ai-agents:observability/concepts.adoc[]
120+
* xref:ai-agents:agents/troubleshooting.adoc[]
