
Commit 1f126af

Merge branch 'DOC-1901' into adp-pkg1
# Conflicts:
#   modules/ROOT/nav.adoc
#   modules/ai-agents/pages/observability/concepts.adoc
#   modules/ai-agents/pages/observability/index.adoc
#   modules/ai-agents/pages/observability/ingest-custom-traces.adoc
2 parents 7990323 + c258c0c commit 1f126af

5 files changed

Lines changed: 197 additions & 6 deletions


modules/ROOT/nav.adoc

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@
6060
**** xref:ai-agents:agents/a2a-concepts.adoc[A2A Protocol]
6161
** xref:ai-agents:observability/index.adoc[Transcripts]
6262
*** xref:ai-agents:observability/concepts.adoc[Concepts]
63-
*** xref:ai-agents:observability/view-transcripts.adoc[View Transcripts]
63+
*** xref:ai-agents:observability/transcripts.adoc[View Transcripts]
6464
*** xref:ai-agents:observability/ingest-custom-traces.adoc[Ingest Traces from Custom Agents]
6565
** xref:ai-agents:ai-gateway/index.adoc[AI Gateway]
6666
*** xref:ai-agents:ai-gateway/what-is-ai-gateway.adoc[Overview]

modules/ai-agents/pages/observability/concepts.adoc

Lines changed: 4 additions & 2 deletions
@@ -6,7 +6,7 @@
66
:learning-objective-2: Interpret transcript structure for debugging and monitoring
77
:learning-objective-3: Distinguish between transcripts and audit logs
88

9-
Redpanda automatically captures execution transcripts for both AI agents and MCP servers, providing complete observability into how your agentic systems operate.
9+
Redpanda automatically captures transcripts (also referred to as execution logs or traces) for both AI agents and MCP servers, providing complete observability into how your agentic systems operate.
1010

1111
After reading this page, you will be able to:
1212

@@ -330,6 +330,8 @@ Transcripts may contain sensitive information from your tool inputs and outputs.
330330

331331
== Transcripts compared to audit logs
332332

333+
// TODO: Ask SME to review and confirm whether we want to rephrase or change
334+
// "not designed for audit logging or compliance"
333335
Transcripts are designed for observability and debugging, not audit logging or compliance.
334336

335337
Transcripts provide:
@@ -348,6 +350,6 @@ For compliance and audit requirements, use the session and task topics for agent
348350

349351
== Next steps
350352

351-
* xref:ai-agents:observability/view-transcripts.adoc[]
353+
* xref:ai-agents:observability/transcripts.adoc[]
352354
* xref:ai-agents:agents/monitor-agents.adoc[]
353355
* xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[]

modules/ai-agents/pages/observability/index.adoc

Lines changed: 0 additions & 1 deletion
@@ -3,4 +3,3 @@
33
:description: Monitor agent and MCP server execution using complete OpenTelemetry traces captured by Redpanda.
44

55
{description}
6-

modules/ai-agents/pages/observability/ingest-custom-traces.adoc

Lines changed: 2 additions & 2 deletions
@@ -407,7 +407,7 @@ Your custom agent transcripts display with:
407407
* **Agent name** in span details (from the `gen_ai.agent.name` attribute)
408408
* **Operation names** like `"invoke_agent my-assistant"` indicating agent executions
409409

410-
For detailed instructions on filtering, searching, and navigating transcripts in the UI, see xref:ai-agents:observability/view-transcripts.adoc[View Transcripts].
410+
For detailed instructions on filtering, searching, and navigating transcripts in the UI, see xref:ai-agents:observability/transcripts.adoc[View Transcripts].
411411

412412
==== Token usage tracking
413413

@@ -451,7 +451,7 @@ If requests succeed but traces do not appear in `redpanda.otel_traces`:
451451

452452
== Next steps
453453

454-
* xref:ai-agents:observability/view-transcripts.adoc[]
454+
* xref:ai-agents:observability/transcripts.adoc[]
455455
* xref:ai-agents:agents/monitor-agents.adoc[Observability for declarative agents]
456456
* https://docs.redpanda.com/redpanda-connect/components/inputs/otlp_http/[OTLP HTTP input reference^] - Complete configuration options for the `otlp_http` component
457457
* https://docs.redpanda.com/redpanda-connect/components/inputs/otlp_grpc/[OTLP gRPC input reference^] - Alternative gRPC-based trace ingestion
Lines changed: 190 additions & 0 deletions
@@ -0,0 +1,190 @@
1+
= View Transcripts
2+
:description: Learn how to filter and navigate the Transcripts interface to investigate agent execution traces using multiple detail views and interactive timeline navigation.
3+
:page-topic-type: how-to
4+
:personas: agent_developer, platform_admin
5+
:learning-objective-1: Filter transcripts to find specific execution traces
6+
:learning-objective-2: Use the timeline interactively to navigate to specific time periods
7+
:learning-objective-3: Navigate between detail views to inspect span information at different levels
8+
9+
The Transcripts view provides filtering and navigation capabilities for investigating agent, MCP server, and AI Gateway execution glossterm:transcript[transcripts]. Use this view to quickly locate specific operations, analyze performance patterns, and debug issues across glossterm:tool[] invocations, LLM calls, and glossterm:agent[] reasoning steps.
10+
11+
After reading this page, you will be able to:
12+
13+
* [ ] {learning-objective-1}
14+
* [ ] {learning-objective-2}
15+
* [ ] {learning-objective-3}
16+
17+
For basic orientation on monitoring each Redpanda Agentic Data Plane (ADP) component, see:
18+
19+
* xref:ai-agents:ai-gateway/observability-metrics.adoc[]
20+
* xref:ai-agents:agents/monitor-agents.adoc[]
21+
* xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[]
22+
23+
For conceptual background on what transcripts capture, glossterm:span[] types, and how they are organized hierarchically, see xref:ai-agents:observability/concepts.adoc[].
24+
25+
== Prerequisites
26+
27+
* A xref:ai-agents:agents/create-agent.adoc[running agent] or xref:ai-agents:mcp/remote/quickstart.adoc[MCP server] with at least one execution
28+
* Access to the Transcripts view (requires appropriate permissions to read the `redpanda.otel_traces` topic)
29+
30+
== Navigate the Transcripts interface
31+
32+
=== Filter transcripts
33+
34+
Use filters to narrow down transcripts and quickly locate specific executions. When you use any of the filters, the transcript list updates to show only matching results.
35+
36+
The Transcripts view provides several quick-filter buttons:
37+
38+
* *Service*: Isolate operations from a particular component in your agentic data plane (agents, MCP servers, or AI Gateway)
39+
* *LLM Calls*: Inspect large language model (LLM) invocations, including chat completions and embeddings
40+
* *Tool Calls*: View tool executions by agents
41+
* *Agent Spans*: Inspect agent invocation and reasoning
42+
* *Errors Only*: Filter for failed operations or errors
43+
* *Slow (>5s)*: Isolate operations that exceeded five seconds in duration, useful for performance investigation
44+
45+
You can combine multiple filters to narrow results further. For example, use *Tool Calls* and *Errors Only* together to investigate failed tool executions.
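
Conceptually, combining quick filters behaves like a logical AND over span predicates. The following minimal Python sketch illustrates that idea on in-memory data. The span fields (`kind`, `status`, `duration_s`) and the filter names are hypothetical and chosen for illustration; they do not reflect the actual span schema or the implementation of the Transcripts view.

```python
from typing import Callable

Span = dict  # hypothetical, simplified span record

# Hypothetical predicates that mirror the quick-filter buttons.
FILTERS: dict[str, Callable[[Span], bool]] = {
    "tool_calls": lambda s: s["kind"] == "tool",
    "llm_calls": lambda s: s["kind"] == "llm",
    "errors_only": lambda s: s["status"] == "error",
    "slow": lambda s: s["duration_s"] > 5.0,
}

def apply_filters(spans: list[Span], active: list[str]) -> list[Span]:
    """Keep only spans that satisfy every active filter (logical AND)."""
    return [s for s in spans if all(FILTERS[name](s) for name in active)]

# Illustrative sample data, not real transcript output.
spans = [
    {"name": "search_orders", "kind": "tool", "status": "error", "duration_s": 0.4},
    {"name": "search_orders", "kind": "tool", "status": "ok", "duration_s": 6.2},
    {"name": "chat", "kind": "llm", "status": "error", "duration_s": 1.1},
]

# Equivalent of enabling "Tool Calls" and "Errors Only" together.
failed_tools = apply_filters(spans, ["tool_calls", "errors_only"])
print([s["name"] for s in failed_tools])
```

In this sketch, only the first span passes both predicates, which matches the intent of narrowing results to failed tool executions.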
46+
47+
Toggle *Full traces* on to see the complete execution context, in grayed-out text, for the filtered transcripts.
48+
49+
==== Filter by attribute
50+
51+
Click the *Attribute* button to query exact matches on specific span metadata such as the following:
52+
53+
* Agent names
54+
* LLM model names, for example, `gemini-3-flash-preview`
55+
* Tool names
56+
* Span and trace IDs
57+
58+
You can add multiple attribute filters to refine results.
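
To illustrate what exact matching on several attributes implies, here is a small Python sketch over in-memory spans. The `gen_ai.agent.name` attribute appears in Redpanda transcripts; using `gen_ai.request.model` as the model-name key is an assumption based on the OpenTelemetry GenAI semantic conventions, and the span layout and the `other-model` value are hypothetical.

```python
def matches_attributes(span: dict, wanted: dict) -> bool:
    """Return True only if every wanted key equals the span's value exactly."""
    attrs = span.get("attributes", {})
    return all(attrs.get(key) == value for key, value in wanted.items())

# Illustrative sample data; attribute keys other than gen_ai.agent.name are assumed.
spans = [
    {"name": "chat", "attributes": {
        "gen_ai.agent.name": "my-assistant",
        "gen_ai.request.model": "gemini-3-flash-preview",
    }},
    {"name": "chat", "attributes": {
        "gen_ai.agent.name": "my-assistant",
        "gen_ai.request.model": "other-model",  # hypothetical model name
    }},
]

# Two attribute filters applied together: both must match exactly.
wanted = {
    "gen_ai.agent.name": "my-assistant",
    "gen_ai.request.model": "gemini-3-flash-preview",
}
hits = [s for s in spans if matches_attributes(s, wanted)]
print(len(hits))
```

Because matching is exact, a partial value such as `gemini-3` would match nothing in this sketch.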
59+
60+
==== Adjust time range
61+
62+
Use the time range selector to focus on specific time periods (from the last five minutes up to the last 24 hours):
63+
64+
* View recent executions, for example, over the last hour, to monitor real-time activity
65+
* Expand to longer periods for trend analysis over the last day
66+
67+
=== Use the interactive timeline
68+
69+
Use the timeline visualization to quickly identify when errors began or patterns changed, and to navigate directly to transcripts from specific time windows when investigating issues that occurred at known times.
70+
71+
The timeline displays transcript volume as a bar chart. Each bar represents a time bucket that recalibrates dynamically based on your <<adjust-time-range,selected time range>>, with color-coded indicators:
72+
73+
* Green: Successful operations
74+
* Red: Operations with errors
75+
76+
Click on any bar in the timeline to zoom into transcripts from that specific time period. The transcript table automatically scrolls to show operations from the time bucket in view.
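
As a rough mental model of the bar chart, the following Python sketch divides a selected time range into equal-width buckets and counts successful and failed operations in each one. It is only an illustration: the field names (`start_s`, `status`), the fixed bucket count, and the bucketing rule are assumptions, not a description of how the Transcripts view computes its timeline.

```python
from collections import Counter

def bucket_counts(spans, range_start, range_end, n_buckets):
    """Count (successful, failed) spans per equal-width time bucket."""
    width = (range_end - range_start) / n_buckets
    ok, err = Counter(), Counter()
    for s in spans:
        t = s["start_s"]
        if not (range_start <= t < range_end):
            continue  # outside the selected time range
        # Clamp to the last bucket to guard against float rounding.
        idx = min(int((t - range_start) // width), n_buckets - 1)
        (err if s["status"] == "error" else ok)[idx] += 1
    return [(ok[i], err[i]) for i in range(n_buckets)]

# Illustrative data: start times in seconds relative to the range start.
spans = [
    {"start_s": 10, "status": "ok"},
    {"start_s": 200, "status": "error"},
    {"start_s": 310, "status": "ok"},
    {"start_s": 3599, "status": "error"},
    {"start_s": 4000, "status": "ok"},  # outside a one-hour range
]

# One hour split into 12 buckets of five minutes each.
counts = bucket_counts(spans, 0, 3600, 12)
for i, (n_ok, n_err) in enumerate(counts):
    if n_ok or n_err:
        print(f"bucket {i}: {n_ok} ok, {n_err} errors")
```

Changing the selected range changes the bucket width, which mirrors how the bars recalibrate when you adjust the time range.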
77+
78+
[NOTE]
79+
====
80+
When viewing time ranges with many transcripts (hundreds or thousands), the table displays a subset of the data to maintain performance and usability. The timeline bar indicates the actual time range of currently loaded data, which may be narrower than your selected time range.
81+
82+
Refer to the timeline header to check the exact range and count of visible transcripts, for example, "Showing 100 of 299 transcripts from 13:17 to 15:16".
83+
====
84+
85+
== Inspect span details
86+
87+
The transcript table displays the following:
88+
89+
* Time: Timestamp when the span started (sortable)
90+
* Span: Span type indicator and span name, with hierarchical tree structure
91+
* Duration: Total duration, or duration of child spans relative to the parent span, represented as visual bars
92+
93+
Each top-level row in the transcript table represents a service-level request flow in an ADP component. Expand each parent span to see the hierarchical structure of nested operations, including internal processing steps, LLM interactions, and tool calls. xref:ai-agents:observability/concepts.adoc#parent-child-relationships[Parent-child spans] show how operations relate: for example, an agent invocation (parent) triggers LLM calls and tool executions (children). Use the *Collapse all* option to quickly fold all expanded spans.
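
The parent-child hierarchy can be reconstructed from span and parent span IDs. The following Python sketch shows one way to do that for a flat list of spans. The field names (`span_id`, `parent_span_id`, `start_s`) are simplified assumptions, and the child span names are hypothetical; only the `invoke_agent my-assistant` operation name is taken from the documentation.

```python
from collections import defaultdict

def build_tree(spans):
    """Group spans by parent ID; root spans are grouped under None."""
    children = defaultdict(list)
    for s in spans:
        children[s.get("parent_span_id")].append(s)
    return children

def render(children, parent_id=None, depth=0):
    """Return indented span names, ordered by start time at each level."""
    lines = []
    for s in sorted(children.get(parent_id, []), key=lambda s: s["start_s"]):
        lines.append("  " * depth + s["name"])
        lines.extend(render(children, s["span_id"], depth + 1))
    return lines

# Illustrative flat span list, deliberately out of order.
spans = [
    {"span_id": "c", "parent_span_id": "a", "name": "execute_tool search_orders", "start_s": 2.0},
    {"span_id": "a", "parent_span_id": None, "name": "invoke_agent my-assistant", "start_s": 0.0},
    {"span_id": "b", "parent_span_id": "a", "name": "chat", "start_s": 0.5},
]

tree_lines = render(build_tree(spans))
print("\n".join(tree_lines))
```

The agent invocation is printed first, with the LLM call and tool execution indented beneath it as children.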
94+
95+
// TODO: Clarify MCP trace structure
96+
When agents invoke remote MCP servers, transcripts fold together under a tree structure to provide a unified view of the complete operation across service boundaries. The glossterm:trace ID[] originates at the initial request touchpoint and propagates across all involved services, linking spans from both the agent and MCP server under a single transcript. Use the tree view to follow the trace flow across multiple services and understand the complete request lifecycle.
97+
98+
// TODO: Confirm how transcripts from external agents appear
99+
If you use external agents that directly invoke MCP servers in the Redpanda Agentic Data Plane, you may only see MCP-level parent transcripts, unless you have configured the agents to also emit traces to the Redpanda glossterm:OpenTelemetry[OTEL] ingestion pipeline.
100+
101+
// TODO: Confirm how gateway traces appear
102+
103+
Selected spans display detailed information at multiple levels, from high-level summaries to complete raw data:
104+
105+
* Start with the summary tab for a quick assessment
106+
* Inspect attributes for detailed investigation using structured metadata
107+
* Use raw data when you need complete information
108+
109+
[NOTE]
110+
====
111+
Rows labeled "awaiting root — waiting for parent span" indicate incomplete transcripts where child spans have been received but the parent span is missing or hasn't arrived yet. This can occur due to network latency between services, processing delays in the OpenTelemetry pipeline, or lost parent spans from service failures.
112+
Consistently seeing awaiting root entries suggests instrumentation or trace collection issues that you should investigate.
113+
====
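
If you export span data and want to check for this condition yourself, the following Python sketch flags spans whose parent ID does not match any received span. The field names (`span_id`, `parent_span_id`) and the sample data are hypothetical; how the Transcripts view detects awaiting root rows internally is not documented here.

```python
def find_orphans(spans):
    """Return spans that reference a parent span that has not been received."""
    known_ids = {s["span_id"] for s in spans}
    return [
        s for s in spans
        if s.get("parent_span_id") is not None
        and s["parent_span_id"] not in known_ids
    ]

# Illustrative data: the span with ID "root-2" never arrived.
spans = [
    {"span_id": "root-1", "parent_span_id": None, "name": "invoke_agent my-assistant"},
    {"span_id": "child-1", "parent_span_id": "root-1", "name": "chat"},
    {"span_id": "child-2", "parent_span_id": "root-2", "name": "chat"},
]

orphans = find_orphans(spans)
print([s["span_id"] for s in orphans])
```

A check like this is most useful when run over a window that ended some time ago, because spans that are merely delayed have usually arrived by then.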
114+
115+
=== Summary tab
116+
117+
Click on any span in the transcript table to open the detail panel on the right side of the interface. The first tab displays a context-specific summary based on the span type.
118+
119+
For example, for tool call spans, the summary shows:
120+
121+
* *Description*: The purpose and context of the tool call
122+
* *Arguments*: JSON showing the parameters passed to the tool
123+
* *Response*: JSON showing the tool's output or result
124+
125+
The summary panel for other span types provides high-level information such as:
126+
127+
* Total nested operations (span count) and execution time
128+
* Token usage for LLM operations
129+
* Counts of LLM calls and tool calls
130+
* Full conversation history for agent spans, including user prompts, configured xref:ai-agents:agents/create-agent.adoc#write-the-system-prompt[system prompts], and LLM outputs
131+
132+
TIP: Expand the summary panel view to easily read long conversations and complex JSON structures.
133+
134+
=== Attributes tab
135+
136+
The attributes view shows structured metadata for each transcript span. Use this view to inspect span attributes and understand the context of each operation. See xref:ai-agents:observability/concepts.adoc#key-attributes-by-layer[Transcripts and AI Observability] for details on standard attributes by instrumentation layer.
137+
138+
=== Raw data tab
139+
140+
The raw data view provides the complete span structure:
141+
142+
* Full OpenTelemetry span in JSON format
143+
* All fields including those not displayed in summary or attributes views
144+
* Structured data suitable for export or programmatic access
145+
146+
You can also view the raw transcript data in the `redpanda.otel_traces` topic.
147+
148+
== Investigate and analyze operations
149+
150+
The following patterns demonstrate how to use the Transcripts view for understanding and troubleshooting your agentic systems.
151+
152+
=== Debug errors
153+
154+
. Use *Errors Only* to filter for failed operations, or review the timeline to identify and zoom in to when errors began occurring.
155+
. Expand error spans to examine the failure context.
156+
. Check preceding tool call arguments and LLM responses for the root cause.
157+
158+
=== Investigate performance issues
159+
160+
. Use the *Slow (>5s)* filter to identify operations with high latency.
161+
. Expand slow spans to identify bottlenecks in the execution tree.
162+
. Compare duration bars across similar operations to spot anomalies.
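
The bottleneck search in these steps can be approximated on exported span data by repeatedly following the longest-running child from a slow parent span. The Python sketch below assumes simplified, hypothetical fields (`span_id`, `parent_span_id`, `duration_s`) and hypothetical span names, and it ignores parallel child spans, so treat the result as a starting point rather than a precise attribution.

```python
def bottleneck_path(spans, root_id):
    """Follow the longest-duration child from the root down to a leaf."""
    by_id = {s["span_id"]: s for s in spans}
    by_parent = {}
    for s in spans:
        by_parent.setdefault(s.get("parent_span_id"), []).append(s)

    path = [by_id[root_id]]
    while by_parent.get(path[-1]["span_id"]):
        kids = by_parent[path[-1]["span_id"]]
        path.append(max(kids, key=lambda s: s["duration_s"]))
    return [(s["name"], s["duration_s"]) for s in path]

# Illustrative data: a slow agent invocation dominated by one tool call.
spans = [
    {"span_id": "a", "parent_span_id": None, "name": "invoke_agent my-assistant", "duration_s": 7.5},
    {"span_id": "b", "parent_span_id": "a", "name": "chat", "duration_s": 1.2},
    {"span_id": "c", "parent_span_id": "a", "name": "execute_tool search_orders", "duration_s": 6.0},
    {"span_id": "d", "parent_span_id": "c", "name": "http_request", "duration_s": 5.8},
]

path = bottleneck_path(spans, "a")
for name, duration in path:
    print(f"{duration:>5.1f}s  {name}")
```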
163+
164+
=== Analyze tool usage
165+
166+
. Apply the *Tool Calls* filter and optionally use the *Attribute* filter to focus on a specific tool.
167+
. Review tool execution frequency in the timeline.
168+
. Click individual tool call spans to inspect arguments and responses.
169+
.. Check the Description field to understand tool invocation context.
170+
.. Use the Arguments field to verify correct parameter passing.
171+
172+
=== Monitor LLM interactions
173+
174+
. Click *LLM Calls* to focus on model invocations and optionally filter by model name and provider using the *Attribute* filter.
175+
. Review token usage patterns across different time periods.
176+
. Examine conversation history to understand model behavior.
177+
. Spot unexpected model calls or token consumption spikes.
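
For token usage review outside the UI, exported spans can be aggregated by model. The following Python sketch assumes the spans carry the OpenTelemetry GenAI semantic convention attributes `gen_ai.request.model`, `gen_ai.usage.input_tokens`, and `gen_ai.usage.output_tokens`; whether your spans include these exact keys is an assumption to verify in the attributes view. The span layout and the `other-model` value are hypothetical.

```python
from collections import defaultdict

def tokens_by_model(spans):
    """Sum input and output tokens per model across LLM spans."""
    totals = defaultdict(lambda: {"input": 0, "output": 0})
    for s in spans:
        attrs = s.get("attributes", {})
        model = attrs.get("gen_ai.request.model")
        if model is None:
            continue  # not an LLM span, or the attribute is absent
        totals[model]["input"] += attrs.get("gen_ai.usage.input_tokens", 0)
        totals[model]["output"] += attrs.get("gen_ai.usage.output_tokens", 0)
    return dict(totals)

# Illustrative sample data, not real transcript output.
spans = [
    {"attributes": {"gen_ai.request.model": "gemini-3-flash-preview",
                    "gen_ai.usage.input_tokens": 120, "gen_ai.usage.output_tokens": 30}},
    {"attributes": {"gen_ai.request.model": "gemini-3-flash-preview",
                    "gen_ai.usage.input_tokens": 80, "gen_ai.usage.output_tokens": 20}},
    {"attributes": {"gen_ai.request.model": "other-model",  # hypothetical model name
                    "gen_ai.usage.input_tokens": 10}},
    {"attributes": {}},  # for example, a tool call span
]

usage = tokens_by_model(spans)
for model, totals in usage.items():
    print(model, totals)
```

Running the same aggregation over different time windows makes consumption spikes or unexpected model names easier to spot.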
178+
179+
=== Trace multi-service operations
180+
181+
. Locate the parent agent or gateway span in the transcript table.
182+
. Use the *Attribute* filter to follow the trace ID through agent and MCP server boundaries.
183+
. Expand the transcript tree to reveal child spans across services.
184+
. Review durations to understand where latency occurs in distributed calls.
185+
186+
== Next steps
187+
188+
* xref:ai-agents:agents/monitor-agents.adoc[]
189+
* xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[]
190+
* xref:ai-agents:agents/troubleshooting.adoc[]
