Merge branch 'DOC-1901-document-feature-adp-execution-logs-ui-package-1' into adp-pkg1

paulohtb6 · paulohtb6 · commit 68f517a002be · 2026-01-28T21:30:31.000-03:00
diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc
@@ -92,7 +92,9 @@
 **** xref:ai-agents:mcp/local/overview.adoc[Overview]
 **** xref:ai-agents:mcp/local/quickstart.adoc[Quickstart]
 **** xref:ai-agents:mcp/local/configuration.adoc[Configure]
-** xref:ai-agents:observability/concepts.adoc[Transcripts]
+** xref:ai-agents:observability/index.adoc[Transcripts]
+*** xref:ai-agents:observability/concepts.adoc[Concepts]
+*** xref:ai-agents:observability/view-transcripts.adoc[View Transcripts]
 
 * xref:develop:connect/about.adoc[Redpanda Connect]
 ** xref:develop:connect/connect-quickstart.adoc[Quickstart]
diff --git a/modules/ai-agents/pages/observability/concepts.adoc b/modules/ai-agents/pages/observability/concepts.adoc
@@ -1,12 +1,12 @@
 = Transcripts and AI Observability
-:description: Understand how Redpanda captures execution traces for agents and MCP servers using OpenTelemetry.
+:description: Understand how Redpanda captures execution transcripts for agents and MCP servers using OpenTelemetry.
 :page-topic-type: concepts
 :personas: agent_developer, platform_admin, data_engineer
-:learning-objective-1: Explain how traces and spans capture execution flow
-:learning-objective-2: Interpret trace structure for debugging and monitoring
-:learning-objective-3: Distinguish between observability traces and audit logs
+:learning-objective-1: Explain how transcripts and spans capture execution flow
+:learning-objective-2: Interpret transcript structure for debugging and monitoring
+:learning-objective-3: Distinguish between transcripts and audit logs
 
-Redpanda automatically captures execution traces for both AI agents and MCP servers, providing complete observability into how your agentic systems operate.
+Redpanda automatically captures execution transcripts for both AI agents and MCP servers, providing complete observability into how your agentic systems operate.
 
 After reading this page, you will be able to:
 
@@ -27,7 +27,7 @@ Transcripts capture:
 * Error conditions
 * Performance metrics
 
-With 100% sampling, every operation is captured, creating complete transcripts that you can use for debugging, monitoring, and performance analysis.
+With 100% sampling, every operation is captured, enabling comprehensive debugging, monitoring, and performance analysis.
 
 == Traces and spans
 
@@ -37,13 +37,13 @@ OpenTelemetry traces provide a complete picture of how a request flows through y
 * A _span_ represents a single unit of work within that trace (such as a data processing operation or an external API call).
 * A trace contains one or more spans organized hierarchically, showing how operations relate to each other.
 
-== Agent trace hierarchy
+== Agent transcript hierarchy
 
 Agent executions create a hierarchy of spans that reflect how agents process requests. Understanding this hierarchy helps you interpret agent behavior and identify where issues occur.
 
 === Agent span types
 
-Agent traces contain these span types:
+Agent transcripts contain these span types:
 
 [cols="2,3,3", options="header"]
 |===
@@ -90,13 +90,13 @@ This shows:
 
 Examine span durations to identify where time is spent and optimize accordingly.
 
-== MCP server trace hierarchy
+== MCP server transcript hierarchy
 
-MCP server executions create a different hierarchy that reflects tool invocations and internal processing. Understanding this hierarchy helps you debug tool execution and identify performance bottlenecks.
+MCP server tool invocations produce a different span hierarchy focused on tool execution and internal processing. This structure reveals performance bottlenecks and helps you debug tool-specific issues.
 
 === MCP server span types
 
-MCP server traces contain these span types:
+MCP server transcripts contain these span types:
 
 [cols="2,3,3", options="header"]
 |===
@@ -143,13 +143,13 @@ This shows:
 
 The majority of time (4+ seconds) is spent in tool execution, while internal processing (mapping) takes only microseconds. This indicates the tool itself (likely making external API calls or database queries) is the bottleneck, not Redpanda Connect's internal processing.
 
-== Trace layers and scope
+== Transcript layers and scope
 
-Traces contain multiple layers of instrumentation, from HTTP transport through application logic to external service calls. The `scope.name` field in each span identifies which layer of instrumentation created that span.
+Transcripts contain multiple layers of instrumentation, from HTTP transport through application logic to external service calls. The `scope.name` field in each span identifies which instrumentation layer created that span.
 
 === Instrumentation layers
 
-A complete agent trace includes these layers:
+A complete agent transcript includes these layers:
 
 [cols="2,2,4", options="header"]
 |===
@@ -178,7 +178,7 @@ A complete agent trace includes these layers:
 
 === How layers connect
 
-Layers connect through parent-child relationships in a single trace:
+Layers connect through parent-child relationships in a single transcript:
 
 ----
 ai-agent-http-server (HTTP Server layer)
@@ -191,7 +191,7 @@ ai-agent-http-server (HTTP Server layer)
     └── chat gpt-5-nano (AI SDK layer, LLM call 2)
 ----
 
-This shows:
+The request flows through these steps:
 
 1. HTTP request arrives at agent
 2. Agent invokes sub-agent
@@ -201,9 +201,9 @@ This shows:
 6. Agent makes second LLM call with tool results
 7. Response returns through HTTP layer
 
-=== Cross-service traces
+=== Cross-service transcripts
 
-When agents call MCP tools, the trace spans multiple services. Each service has a different `service.name` in the resource attributes:
+When agents call MCP tools, the transcript spans multiple services. Each service has a different `service.name` in the resource attributes:
 
 * Agent spans: `"service.name": "ai-agent"`
 * MCP server spans: `"service.name": "mcp-{server-id}"`
@@ -240,9 +240,9 @@ Redpanda Connect layer:
 
 - Component-specific attributes from your tool configuration
 
-Use `scope.name` to filter spans by layer when analyzing traces.
+Use `scope.name` to filter spans by layer when analyzing transcripts.
 
-== Understand the trace structure
+== Understand the transcript structure
 
 Each span captures a unit of work. Here's what a typical MCP tool invocation looks like:
 
@@ -273,7 +273,7 @@ Key elements to understand:
 
 === Parent-child relationships
 
-Traces show how operations relate. A tool invocation (parent) may trigger internal operations (children):
+Transcripts show how operations relate. A tool invocation (parent) may trigger internal operations (children):
 
 [,json]
 ----
@@ -289,9 +289,9 @@ Traces show how operations relate. A tool invocation (parent) may trigger intern
 
 The `parentSpanId` links this child span to the parent tool invocation. Both share the same `traceId` so you can reconstruct the complete operation.
 
-== Error events in traces
+== Error events in transcripts
 
-When something goes wrong, traces capture error details:
+When something goes wrong, transcripts capture error details:
 
 [,json]
 ----
@@ -314,32 +314,32 @@ When something goes wrong, traces capture error details:
 The `events` array captures what happened and when. Use `timeUnixNano` to see exactly when the error occurred within the operation.
 
 [[opentelemetry-traces-topic]]
-== How Redpanda stores traces
+== How Redpanda stores trace data
 
-The `redpanda.otel_traces` topic stores OpenTelemetry spans in JSON format, following the https://opentelemetry.io/docs/specs/otel/protocol/[OpenTelemetry Protocol (OTLP)^] specification. A Protobuf schema named `redpanda.otel_traces-value` is also automatically registered with the topic, enabling clients to deserialize trace data correctly.
+The `redpanda.otel_traces` topic stores OpenTelemetry spans in Redpanda's Schema Registry wire format. Spans are encoded with a custom Protobuf schema named `redpanda.otel_traces-value`, which closely follows the https://opentelemetry.io/docs/specs/otel/protocol/[OpenTelemetry Protocol (OTLP)^] specification. Redpanda automatically registers this schema in the Schema Registry alongside the topic, enabling clients to deserialize trace data correctly.
 
-The `redpanda.otel_traces` topic and its schema are managed automatically by Redpanda. If you delete either the topic or the schema, they are recreated automatically. However, deleting the topic permanently deletes all trace data, and the topic comes back empty. Do not produce your own data to this topic. It is reserved for OpenTelemetry traces.
+Redpanda manages both the `redpanda.otel_traces` topic and its schema automatically. If you delete either one, Redpanda recreates it. However, deleting the topic permanently deletes all trace data, and the recreated topic starts empty. Do not produce your own data to this topic. It is reserved for OpenTelemetry traces.
 
 === Topic configuration and lifecycle
 
 The `redpanda.otel_traces` topic has a predefined retention policy. Configuration changes to this topic are not supported. If you modify settings, Redpanda reverts them to the default values.
 
 The topic persists in your cluster even after all agents and MCP servers are deleted, allowing you to retain historical trace data for analysis.
 
-Trace data may contain sensitive information from your tool inputs and outputs. Consider implementing appropriate glossterm:ACL[access control lists (ACLs)] for the `redpanda.otel_traces` topic, and review the data in traces before sharing or exporting to external systems.
+Transcripts may contain sensitive information from your tool inputs and outputs. Consider implementing appropriate glossterm:ACL[access control lists (ACLs)] for the `redpanda.otel_traces` topic, and review the data in transcripts before sharing or exporting to external systems.
 
-== Traces compared to audit logs
+== Transcripts compared to audit logs
 
-OpenTelemetry traces are designed for observability and debugging, not audit logging or compliance.
+Transcripts are designed for observability and debugging, not audit logging or compliance.
 
-Traces provide:
+Transcripts provide:
 
 * Hierarchical view of request flow through your system (parent-child span relationships)
 * Detailed timing information for performance analysis
 * Ability to reconstruct execution paths and identify bottlenecks
 * Insights into how operations flow through distributed systems
 
-Traces are not:
+Transcripts are not:
 
 * Immutable audit records for compliance purposes
 * Designed for "who did what" accountability tracking
@@ -348,5 +348,6 @@ For compliance and audit requirements, use the session and task topics for agent
 
 == Next steps
 
+* xref:ai-agents:observability/view-transcripts.adoc[]
 * xref:ai-agents:agents/monitor-agents.adoc[]
 * xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[]
diff --git a/modules/ai-agents/pages/observability/index.adoc b/modules/ai-agents/pages/observability/index.adoc
@@ -0,0 +1,6 @@
+= Transcripts
+:page-layout: index
+:description: Monitor agent and MCP server execution using complete OpenTelemetry traces captured by Redpanda.
+
+{description}
+
diff --git a/modules/ai-agents/pages/observability/view-transcripts.adoc b/modules/ai-agents/pages/observability/view-transcripts.adoc
@@ -0,0 +1,120 @@
+= View Transcripts
+:description: Learn how to filter, search, and navigate the Transcripts interface to investigate agent execution traces using multiple detail views and interactive timeline navigation.
+:page-topic-type: how-to
+:personas: agent_developer, platform_admin
+:learning-objective-1: Filter and search transcripts to find specific execution traces
+:learning-objective-2: Navigate between detail views to inspect span information at different levels
+:learning-objective-3: Use the timeline interactively to navigate to specific time periods
+
+The Transcripts view provides filtering, searching, and navigation capabilities for investigating agent and MCP server execution transcripts. Use these features to efficiently locate specific operations, analyze performance patterns, and debug issues across tool invocations, LLM calls, and agent reasoning steps.
+
+After reading this page, you will be able to:
+
+* [ ] {learning-objective-1}
+* [ ] {learning-objective-2}
+* [ ] {learning-objective-3}
+
+For basic orientation on agent and MCP server monitoring, see xref:ai-agents:agents/monitor-agents.adoc[] or xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[]. For conceptual background on what transcripts capture and how spans are organized hierarchically, see xref:ai-agents:observability/concepts.adoc[].
+
+== Prerequisites
+
+* xref:ai-agents:agents/create-agent.adoc[Running agent] or xref:ai-agents:mcp/remote/quickstart.adoc[MCP server] with at least one execution
+* Access to the Transcripts view (requires appropriate permissions to read the `redpanda.otel_traces` topic)
+
+== Navigate the Transcripts interface
+
+=== Use the interactive timeline
+
+Use the timeline visualization to quickly identify when errors began or patterns changed, and navigate directly to transcripts from particular timestamps.
+
+When viewing time periods with many transcripts (hundreds or thousands), the timeline displays a subset of the data to maintain performance and usability. The timeline bar indicates the actual time range of currently visible data, which may be narrower than your selected range.
+
+TIP: See xref:ai-agents:agents/monitor-agents.adoc[] and xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[] to learn which basic execution patterns and health indicators to investigate.
+
+=== Search and filter for transcripts
+
+Use search and filters together to narrow down transcripts and quickly locate specific executions.
+
+==== Search for specific transcripts
+
+The search functionality helps you find transcripts by operation names, span types, or identifiers:
+
+* Search by span names to find specific xref:ai-agents:observability/concepts.adoc#agent-span-types[agent operations] like `invoke_agent`, or xref:ai-agents:mcp/remote/create-tool.adoc[MCP tools]
+* Search by xref:ai-agents:observability/concepts.adoc#instrumentation-layers[scope] to filter by layer (for example, `rpcn-mcp` for MCP tool spans)
+* Search by trace IDs (`traceId`) when correlating with external systems or troubleshooting specific requests
+
+==== Filter by service
+
+Service filtering shows only transcripts from specific agents or MCP servers using the `service.name` resource attribute. See xref:ai-agents:observability/concepts.adoc#cross-service-transcripts[Cross-service transcripts] to understand how transcripts span multiple services.
+
+* View executions from a single agent when multiple are running (service name: `ai-agent`)
+* Isolate MCP server activity from agent activity (service name: `mcp-{server-id}`)
+* Compare behavior across different service instances
+
+==== Filter by execution status
+
+Status filtering shows transcripts based on their execution outcome:
+
+* Show successful executions for health checks
+* Show only failed executions for error investigation
+* Toggle between success and error views to compare and analyze patterns
+
+==== Adjust time range
+
+Use the time range selector to focus on specific time periods (from the last five minutes up to the last 24 hours):
+
+* View recent executions (for example, over the last hour) to monitor real-time activity
+* Expand to longer periods for trend analysis over the last day
+* Narrow to specific time windows when investigating issues that occurred at known times
+
+TIP: Apply broad filters first (time range, service) to reduce the transcript set, then use search to narrow to specific operations.
+
+== Inspect span details
+
+Each row in the transcript table represents a high-level agent or MCP server request flow. Expand each parent span to see the xref:ai-agents:observability/concepts.adoc#agent-transcript-hierarchy[hierarchical structure] of nested operations, including tool calls, LLM interactions, and internal processing steps. Parent-child spans show how operations relate: for example, an agent invocation (parent) triggers LLM calls and tool executions (children).
+
+When agents invoke remote MCP servers, transcripts fold together across service boundaries to provide a unified view of the complete operation. The trace ID originates at the initial request touchpoint and propagates across all involved services, linking spans from both the agent and MCP server under a single transcript. Use the tree view to follow the trace flow across multiple services and understand the complete request lifecycle.
+
+If you use external agents that directly invoke MCP servers in the Redpanda Agentic Data Plane, you may see only MCP-level parent transcripts unless you have configured the agents to also emit traces to the Redpanda OTEL ingestion pipeline.
+
+Selected spans display detailed information at multiple levels, from high-level summaries to complete raw data:
+
+* Start with summary view for quick assessment
+* Inspect attributes for detailed investigation
+* Use raw data when you need complete information
+
+=== Summary view
+
+The summary panel provides high-level span information:
+
+* Total nested operations (span count) and execution time
+* Token usage for LLM operations
+* Counts of LLM calls and tool calls
+
+Click on an individual span to drill down into the execution context:
+
+* View the full conversation history saved for that session, including user prompts, the xref:ai-agents:agents/create-agent.adoc#write-the-system-prompt[system prompts] configured to guide agent behavior, and LLM outputs
+* Inspect individual tool calls made by the agent and any of its sub-agents, including request arguments and responses
+
+TIP: Expand the summary panel to full view to easily read long conversations.
+
+=== Detailed attributes view
+
+The attributes view shows structured metadata for each transcript span. Use this view to quickly locate an attribute value such as conversation ID, then paste it into the search box to find all operations from that conversation session. See xref:ai-agents:observability/concepts.adoc#key-attributes-by-layer[Transcripts and AI Observability] for details on standard attributes by instrumentation layer.
+
+=== Raw data view
+
+The raw data view provides the complete span structure:
+
+* Full OpenTelemetry span in JSON format
+* All fields including those not displayed in summary or attributes views
+* Structured data suitable for export or programmatic access
+
+You can also view the raw transcript data in the `redpanda.otel_traces` topic.
+
+== Next steps
+
+* xref:ai-agents:agents/monitor-agents.adoc[]
+* xref:ai-agents:mcp/remote/monitor-mcp-servers.adoc[]
+* xref:ai-agents:observability/concepts.adoc[]
+* xref:ai-agents:agents/troubleshooting.adoc[]
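
The raw data view and the `redpanda.otel_traces` topic expose spans as structured data, so the hierarchy shown in the tree view can also be rebuilt programmatically. The following is a minimal Python sketch, assuming the spans have already been decoded into dictionaries. `traceId` and `parentSpanId` appear in the docs above; `spanId`, the flattened `scopeName` key, every sample scope name except `rpcn-mcp`, and both helper functions are hypothetical and used only for illustration.

```python
from collections import defaultdict


def render_tree(spans):
    """Return indented lines showing the parent-child hierarchy of one transcript."""
    known_ids = {s["spanId"] for s in spans}
    children = defaultdict(list)
    for s in spans:
        parent = s.get("parentSpanId") or None
        # A span whose parent is not in this set is treated as a root.
        if parent not in known_ids:
            parent = None
        children[parent].append(s)

    lines = []

    def walk(parent_id, depth):
        for s in children[parent_id]:
            lines.append("  " * depth + f'{s["name"]} [{s["scopeName"]}]')
            walk(s["spanId"], depth + 1)

    walk(None, 0)
    return lines


def filter_by_scope(spans, scope_name):
    """Keep only spans created by one instrumentation layer."""
    return [s for s in spans if s["scopeName"] == scope_name]


# Hypothetical sample data: IDs and most scope names are made up.
SAMPLE_SPANS = [
    {"traceId": "trace-1", "spanId": "s1", "parentSpanId": "",
     "name": "invoke_agent", "scopeName": "example-agent-scope"},
    {"traceId": "trace-1", "spanId": "s2", "parentSpanId": "s1",
     "name": "example_tool", "scopeName": "rpcn-mcp"},
    {"traceId": "trace-1", "spanId": "s3", "parentSpanId": "s2",
     "name": "mapping", "scopeName": "example-connect-scope"},
]

if __name__ == "__main__":
    print("\n".join(render_tree(SAMPLE_SPANS)))
```

Grouping children by `parentSpanId` mirrors how the Transcripts tree view nests operations, and `filter_by_scope` corresponds to searching by scope to isolate one layer. Real records may nest the scope name differently, so adapt the key lookups to the decoded structure you actually receive.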