modules/ai-agents/pages/agents/concepts.adoc
Lines changed: 8 additions & 17 deletions
@@ -22,11 +22,11 @@ Every agent request follows a reasoning loop. The agent doesn't execute all tool
When an agent receives a request:
- . **LLM receives context**: System prompt, conversation history, user request, and previous tool results
- . **LLM decides action**: Choose to invoke a tool, request more information, or respond to user
- . **Tool executes** (if chosen): Tool runs and returns results.
- . **Context grows**: Tool results added to conversation history
- . **Loop repeats**: LLM reasons again with expanded context.
+ . The LLM receives the context, including the system prompt, conversation history, user request, and previous tool results.
+ . The LLM chooses to invoke a tool, request more information, or respond to the user.
+ . If a tool is invoked, it runs and returns results.
+ . The tool's results are added to the conversation history.
+ . The LLM reasons again with the expanded context, as sketched below.
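As a rough illustration of the loop these steps describe (not the actual Redpanda implementation), the flow could be sketched as follows; `call_llm` and `run_tool` are hypothetical helpers standing in for the LLM and tool calls:

[source,python]
----
# Illustrative sketch only: the reasoning loop described above.
# `call_llm` and `run_tool` are hypothetical helpers, not a real API.
def run_agent(system_prompt, user_request, max_iterations=10):
    context = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_request},
    ]
    for _ in range(max_iterations):
        decision = call_llm(context)           # LLM reasons over the full context
        if decision.tool_call is None:
            return decision.message            # no tool needed: respond and stop
        result = run_tool(decision.tool_call)  # tool executes and returns results
        context.append({"role": "tool", "content": result})  # context grows
    return "Iteration limit reached"           # loop ends at the configured limit
----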
The loop continues until one of these conditions is met:
@@ -42,10 +42,9 @@ Each iteration includes three phases:
. **Tool invocation**: If the agent decides to call a tool, execution happens and waits for results.
. **Context expansion**: Tool results are added to the conversation history for the next iteration.
- This creates a cost/capability/latency triangle:
+ With higher iteration limits, agents can complete complex tasks, but requests cost more and take longer.
- * **Higher iteration limits**: Agent can complete complex tasks but costs more and takes longer.
- * **Lower iteration limits**: Faster, cheaper responses but agent may fail on complex requests.
+ With lower iteration limits, agents respond faster and at lower cost, but may fail on complex requests.
==== Cost calculation
@@ -131,9 +130,7 @@ Agents handle two types of information: conversation context (what's been discus
The agent's context includes the system prompt (always present), user messages, agent responses, tool invocation requests, and tool results.
- Agents persist conversation context within a session. When you use the *Inspector* tab in the Redpanda Cloud Console, it automatically maintains session state across multiple requests. The context ID is displayed at the top of the *Inspector* tab.
- For programmatic access, applications must pass the context ID to maintain conversation continuity across requests. The context ID links to the session record in the agent's sessions topic.
+ As the conversation progresses, context grows. Each tool result adds tokens to the context window, which the LLM uses for reasoning in subsequent iterations.
=== Context window limits
@@ -143,12 +140,6 @@ When context exceeds the limit, the oldest tool results get truncated, the agent
Design workflows to complete within context limits. Avoid unbounded tool chaining.
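As a minimal sketch of the truncation behavior described above, where the oldest tool results are dropped first once the context exceeds the window limit, the idea could look like this (assuming a hypothetical `count_tokens` helper such as a tokenizer call):

[source,python]
----
# Minimal sketch: drop the oldest tool results until the context fits the window.
# `count_tokens` is a hypothetical helper, not part of any Redpanda API.
def fit_to_window(context, max_tokens):
    trimmed = list(context)
    total = sum(count_tokens(m["content"]) for m in trimmed)
    while total > max_tokens:
        # Find the oldest tool result still in the context.
        idx = next((i for i, m in enumerate(trimmed) if m["role"] == "tool"), None)
        if idx is None:
            break  # nothing left to truncate
        total -= count_tokens(trimmed[idx]["content"])
        trimmed.pop(idx)
    return trimmed
----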
- === State across conversations
- Redpanda Cloud automatically persists conversation history. The *Inspector* tab automatically manages sessions for you. For programmatic integration, pass a session ID in API requests to maintain conversation continuity.
- If you need additional state management beyond conversation history, create tools that read/write to custom state stores, or pass relevant context in each request.
You can use this URL to call your agent programmatically or integrate it with external systems.
- The *Inspector* tab in the Cloud Console automatically uses this URL to connect to your agent for testing.
- For programmatic access or external agent integration, see xref:ai-agents:agents/integration-overview.adoc[].
- == Configure A2A discovery metadata (optional)
+ === Configure A2A discovery metadata
After creating your agent, configure discovery metadata for external integrations. For detailed agent card design guidance, see link:https://agent2agent.info/docs/guides/create-agent-card/[Create an Agent Card^].
@@ -237,6 +211,32 @@ Skills describe what your agent can do for capability-based discovery. External
The updated metadata appears immediately at `\https://your-agent-url/.well-known/agent-card.json`. For more about what these fields mean and how they're used, see xref:ai-agents:agents/a2a-concepts.adoc#agent-card-metadata[Agent card metadata].
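As a quick way to confirm the card is being served, you could fetch it from the well-known path shown above (a sketch using Python's `requests` library; replace the placeholder with your agent's URL):

[source,python]
----
# Sketch: fetch the published agent card from the well-known path.
import requests

agent_url = "https://your-agent-url"  # placeholder: use your agent's URL
card = requests.get(f"{agent_url}/.well-known/agent-card.json", timeout=10)
card.raise_for_status()
print(card.json())  # discovery metadata, including the skills you configured
----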
modules/ai-agents/pages/agents/monitor-agents.adoc
Lines changed: 18 additions & 52 deletions
@@ -1,10 +1,10 @@
= Monitor Agent Activity
- :description: Monitor agent execution, analyze conversation history, track token usage, and debug issues using inspector, transcripts, and agent data topics.
+ :description: Monitor agent execution, analyze conversation history, track token usage, and debug issues using Inspector, Transcripts, and agent data topics.
:page-topic-type: how-to
:personas: agent_developer, platform_admin
:learning-objective-1: pass:q[Verify agent behavior using the *Inspector* tab]
:learning-objective-2: Track token usage and performance metrics
- :learning-objective-3: Debug agent execution using transcripts
+ :learning-objective-3: pass:q[Debug agent execution using *Transcripts*]
Use monitoring to track agent performance, analyze conversation patterns, debug execution issues, and optimize token costs.
@@ -20,94 +20,64 @@ For conceptual background on traces and observability, see xref:ai-agents:observ
You must have a running agent. If you do not have one, see xref:ai-agents:agents/quickstart.adoc[].
- == Debug agent execution with transcripts
+ == Debug agent execution with Transcripts
- The transcripts view shows execution traces with detailed timing, errors, and performance metrics. Use this view to debug issues, verify agent behavior, and monitor performance in real-time.
+ The *Transcripts* view shows execution traces with detailed timing, errors, and performance metrics. Use this view to debug issues, verify agent behavior, and monitor performance in real time.
- Use the transcripts view to verify your agent is healthy:
+ Use the *Transcripts* view to verify your agent is healthy. Look for consistent green bars in the timeline, which indicate successful executions. Duration should stay within your expected range, while token usage remains stable without unexpected growth.
- Healthy agent indicators:
+ Several warning signs indicate problems. Red bars in the timeline mean errors or failures that need investigation. When duration increases over time, your context window may be growing or tool calls could be slowing down. Many LLM calls for simple requests often signal that the agent is stuck in loops or making unnecessary iterations. If you see missing transcripts, the agent may be stopped or encountering deployment issues.
- * Timeline shows consistent green bars (successful executions)
- * Duration stays within expected range (check summary panel)
- * Token usage is stable (not growing unexpectedly)
- * LLM calls match expected patterns (1-3 calls for simple queries)
- * No error bars in timeline
+ Pay attention to patterns across multiple executions. When all recent transcripts show errors, start by checking agent status, MCP server connectivity, and system prompt configuration. A spiky timeline that alternates between success and error typically points to intermittent tool failures or external API issues. If duration increases steadily over a session, your context window is likely filling up. Clear the conversation history to reset it. High token usage combined with relatively few LLM calls usually means tool results are large or your system prompts are verbose.
- Warning signs:
+ === Debug with Transcripts
- * Red bars in timeline: Errors or failures, click to investigate
- * Increasing duration: May indicate context window growth or slow tool calls
- * High token usage: Check if conversation history is too long
- * Many LLM calls: Agent may be stuck in loops or making unnecessary iterations
- * Missing transcripts: Agent may be stopped or encountering deployment issues
+ Use *Transcripts* to diagnose specific issues:
- Common patterns to investigate:
- * All recent transcripts show errors: Check agent status, MCP server connectivity, or system prompt
- * Duration increasing over session: Context window filling up, consider clearing conversation history
- * Spiky timeline (alternating success/error): Intermittent tool failures or external API issues
- * High token usage with few LLM calls: Large tool results or verbose system prompts
- === Debug with transcripts
- Use transcripts to diagnose specific issues:
- Agent not responding
+ If the agent is not responding:
. Check the timeline for recent transcripts. If none appear, the agent may be stopped.
. Verify agent status in the main *AI Agents* view.
. Look for error transcripts with deployment or initialization failures.
- Tool execution errors
+ If the agent fails during execution:
. Select the failed transcript (red bar in timeline).
. Expand the trace hierarchy to find the tool invocation span.
. Check the span details for error messages.
. Cross-reference with MCP server status.
- Slow performance
+ If performance is slow:
. Compare duration across multiple transcripts in the summary panel.
. Look for specific spans with long durations (wide bars in trace list).
. Check if LLM calls are taking longer than expected.
. Verify tool execution time by examining nested spans.
- Unexpected behavior
- . Select the transcript for the problematic request.
- . Expand the full trace hierarchy to see all operations.
- . Check LLM call count: excessive calls may indicate loops.
=== Track token usage and costs
- View token consumption in the Summary panel when you select a transcript:
- * Input tokens: Tokens sent to the LLM (system prompt + conversation history + tool results)
- * Output tokens: Tokens generated by the LLM (agent responses)
- * Total tokens: Sum of input and output
+ View token consumption in the *Summary* panel when you select a transcript. The breakdown shows input tokens (everything sent to the LLM including system prompt, conversation history, and tool results), output tokens (what the LLM generates in agent responses), and total tokens as the sum of both.
Cost = (input_tokens x input_price) + (output_tokens x output_price)
----
Example: GPT-5.2 with 4,302 input tokens and 1,340 output tokens at $0.00000175 per input token and $0.000014 per output token costs $0.026 per request.
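The example figures can be checked directly against the formula (a quick sketch using the per-token prices above):

[source,python]
----
# Cost = (input_tokens x input_price) + (output_tokens x output_price)
input_tokens, output_tokens = 4302, 1340
input_price, output_price = 0.00000175, 0.000014  # dollars per token

cost = input_tokens * input_price + output_tokens * output_price
print(f"${cost:.3f} per request")  # prints $0.026
----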
For cost optimization strategies, see xref:ai-agents:agents/concepts.adoc#cost-calculation[Cost calculation].
- == Test agent behavior with the inspector
+ == Test agent behavior with Inspector
The *Inspector* tab provides real-time conversation testing. Use it to test agent responses interactively and verify behavior before deploying changes.
- === Access the inspector
+ === Access Inspector
. Navigate to *Agentic AI* > *AI Agents* in the Redpanda Cloud Console.
. Click your agent name.
@@ -118,13 +88,9 @@ The *Inspector* tab provides real-time conversation testing. Use it to test agen
=== Testing best practices
- Test your agents systematically with these scenarios:
+ Test your agents systematically by exploring edge cases and potential failure scenarios. Begin with boundary testing. Requests at the edge of agent capabilities verify that scope enforcement works correctly. Error handling becomes clear when you request unavailable data and observe whether the agent degrades gracefully or fabricates information.
- * Boundary cases: Test requests at the edge of agent capabilities to verify scope enforcement.
- * Error handling: Request unavailable data to verify graceful degradation.
- * Iteration count: Monitor how many iterations complex requests require.
- * Ambiguous input: Send vague queries to verify clarification behavior.
- * Token usage: Track tokens per request to estimate costs.
+ Monitor iteration counts during complex requests to ensure they complete within your configured limits. Ambiguous or vague queries reveal whether the agent asks clarifying questions or makes risky assumptions. Throughout testing, track token usage per request to estimate costs and identify which query patterns consume the most resources.