= Agent Concepts
:description: Understand how agents execute, manage context, invoke tools, and handle errors.
:page-topic-type: concepts
:personas: agent_developer, streaming_developer, data_engineer
:learning-objective-1: Explain how agents execute reasoning loops and make tool invocation decisions
:learning-objective-2: Describe how agents manage context and state across interactions
:learning-objective-3: Identify error handling strategies for agent failures

Agents execute through a reasoning loop where the LLM analyzes context, decides which tools to invoke, processes results, and repeats until the task completes. Understanding this execution model helps you design reliable agent systems.

After reading this page, you will be able to:

* [ ] {learning-objective-1}
* [ ] {learning-objective-2}
* [ ] {learning-objective-3}

== Agent execution model

Every agent request follows a reasoning loop. The agent doesn't execute all tool calls at once. Instead, it makes decisions iteratively.

=== The reasoning loop

When an agent receives a request:

. The LLM receives the context, including the system prompt, conversation history, user request, and previous tool results.
. The LLM chooses to invoke a tool, request more information, or respond to the user.
. If a tool is invoked, it runs and returns results.
. The tool's results are added to the conversation history.
. The LLM reasons again with the expanded context.

The loop continues until one of these conditions is met:

* The agent completes the task and responds to the user
* The agent reaches the max iterations limit
* The agent encounters an unrecoverable error
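
The loop and its stop conditions can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation; `call_llm` and `run_tool` are hypothetical stand-ins for a real model client and MCP tool runner.

```python
# Minimal sketch of the agent reasoning loop. `call_llm` and `run_tool`
# are hypothetical stand-ins for a real model client and MCP tool runner.
def run_agent(context, call_llm, run_tool, max_iterations=30):
    for _ in range(max_iterations):
        decision = call_llm(context)            # LLM reasons over the full context
        if decision["action"] == "respond":     # Task complete: answer the user
            return decision["text"]
        result = run_tool(decision["tool"], decision["args"])   # Tool executes
        context.append({"role": "tool", "content": result})     # Context expands
    raise RuntimeError("max iterations reached")  # Safety stop condition
```

Each pass through the loop is one iteration: reasoning, an optional tool call, and context growth.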

=== Why iterations matter

Each iteration includes three phases:

. **LLM reasoning**: The model processes the growing context to decide the next action.
. **Tool invocation**: If the agent decides to call a tool, execution happens and waits for results.
. **Context expansion**: Tool results are added to the conversation history for the next iteration.

With higher iteration limits, agents can complete complex tasks, but requests cost more and take longer.

With lower iteration limits, agents respond faster and cost less, but may fail on complex requests.

==== Cost calculation

Calculate the approximate cost per request by estimating the average context tokens per iteration:

----
Cost per request = iterations x average context tokens x model price per token
----

Example with 30 iterations at $0.000002 per token, sampling three iterations as the context grows:

----
Iteration 1:  500 tokens x $0.000002 = $0.001
Iteration 15: 2000 tokens x $0.000002 = $0.004
Iteration 30: 4000 tokens x $0.000002 = $0.008

Sum of sampled iterations: ~$0.013
----
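
The full request cost sums every iteration, not just the three samples. A rough estimator, assuming (for illustration only) that context grows linearly from 500 to 4,000 tokens across the 30 iterations:

```python
def estimate_cost(iterations=30, start_tokens=500, end_tokens=4000,
                  price_per_token=0.000002):
    """Approximate request cost, assuming linear context growth per iteration."""
    step = (end_tokens - start_tokens) / (iterations - 1)
    total_tokens = sum(start_tokens + i * step for i in range(iterations))
    return total_tokens * price_per_token

# Under these assumptions a full 30-iteration request costs roughly $0.135,
# about an order of magnitude more than the three sampled iterations alone.
```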

Actual costs vary based on:

* Tool result sizes (large results increase context)
* Model pricing (varies by provider and model tier)
* Task complexity (determines iteration count)

Setting max iterations creates a cost/capability trade-off:

[cols="1,1,2,1", options="header"]
|===
|Limit |Range |Use case |Cost

|Low
|10-20
|Simple queries, single tool calls
|Cost-effective

|Medium
|30-50
|Multi-step workflows, tool chaining
|Balanced

|High
|50-100
|Complex analysis, exploratory tasks
|Higher
|===

Iteration limits prevent runaway costs when agents encounter complex or ambiguous requests.

== MCP tool invocation patterns

MCP tools extend agent capabilities beyond text generation. Understanding when and how tools execute helps you design effective tool sets.

=== Synchronous tool execution

In Redpanda Cloud, tool calls block the agent. When the agent decides to invoke a tool, it pauses and waits while the tool executes (querying a database, calling an API, or processing data). When the tool returns its result, the agent resumes reasoning.

This synchronous model means:

* Latency adds up across multiple tool calls
* The agent sees tool results sequentially rather than in parallel
* Long-running tools can delay or fail agent requests due to timeouts
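
Because each call blocks, total tool latency is additive. A toy illustration, where `time.sleep` stands in for real tool latency:

```python
import time

def call_tool_blocking(latency_s):
    """Stand-in for a synchronous MCP tool call: blocks until the tool returns."""
    time.sleep(latency_s)
    return "result"

start = time.monotonic()
for latency in (0.02, 0.02, 0.02):   # three sequential tool calls
    call_tool_blocking(latency)
elapsed = time.monotonic() - start   # latencies sum; they never overlap
```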
106+
107+
=== Tool selection decisions
108+
109+
The LLM decides which tool to invoke based on system prompt guidance (such as "Use get_orders when customer asks about history"), tool descriptions from the MCP schema that define parameters and purpose, and conversation context where previous tool results influence the next tool choice. Agents can invoke the same tool multiple times with different parameters if the task requires it.
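
The descriptions the LLM selects from follow the MCP tool shape: a name, a description, and a JSON Schema for parameters. A hypothetical `get_orders` definition, shown here as a Python dict:

```python
# Hypothetical MCP tool definition. The `description` and `inputSchema`
# are what the LLM reads when deciding whether to invoke this tool.
get_orders_tool = {
    "name": "get_orders",
    "description": "Retrieve a customer's order history. "
                   "Use when the customer asks about past orders.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string"},
            "limit": {"type": "integer", "default": 10},
        },
        "required": ["customer_id"],
    },
}
```

A vague description here directly degrades tool selection, because it is the only signal the model has about when the tool applies.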

=== Tool chaining

Agents chain tools when one tool's output feeds another tool's input. For example, an agent might first call `get_customer_info(customer_id)` to retrieve details, then use that data to call `get_order_history(customer_email)`.

Tool chaining requires a sufficient max iterations limit because each step in the chain consumes one iteration.
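
The chain above can be sketched with stub tools. Both functions and their return fields are hypothetical:

```python
# Stub tools standing in for real MCP tools; each call consumes one iteration.
def get_customer_info(customer_id):
    return {"customer_id": customer_id, "customer_email": "jane@example.com"}

def get_order_history(customer_email):
    return [{"order_id": "A-100", "email": customer_email}]

# Iteration 1: fetch the customer record.
info = get_customer_info("cust-42")
# Iteration 2: one tool's output feeds the next tool's input.
orders = get_order_history(info["customer_email"])
```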
116+
117+
=== Tool granularity considerations
118+
119+
Tool design affects agent behavior. Coarse-grained tools that do many things result in fewer tool calls but less flexibility and more complex implementation. Fine-grained tools that each do one thing require more tool calls but offer higher composability and simpler implementation.
120+
121+
Choose granularity based on how often you'll reuse tool logic across workflows, whether intermediate results help with debugging, and how much control you want over tool invocation order.
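
The trade-off shows up directly in tool signatures. A hypothetical sketch of both styles:

```python
# Coarse-grained: one tool, one call, behavior fixed inside the implementation.
def resolve_customer_issue(customer_id, issue_text):
    ...  # looks up the customer, fetches orders, drafts a reply internally

# Fine-grained: three tools the agent can compose in any order,
# at the cost of one iteration per call.
def get_customer(customer_id): ...
def get_orders(customer_id): ...
def draft_reply(context): ...
```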
122+
123+
For tool design guidance, see xref:ai-agents:mcp/remote/best-practices.adoc[].
124+
125+
== Context and state management
126+
127+
Agents handle two types of information: conversation context (what's been discussed) and state (persistent data across sessions).
128+
129+
=== Conversation context
130+
131+
The agent's context includes the system prompt (always present), user messages, agent responses, tool invocation requests, and tool results.
132+
133+
As the conversation progresses, context grows. Each tool result adds tokens to the context window, which the LLM uses for reasoning in subsequent iterations.
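
One common way to represent this context is an ordered message list. A hypothetical snapshot after a single tool call:

```python
# Hypothetical context after one tool invocation. Each entry adds tokens,
# and the list grows on every iteration.
context = [
    {"role": "system", "content": "You are a support agent..."},  # always present
    {"role": "user", "content": "Where is my order?"},
    {"role": "assistant", "tool_call": {"name": "get_orders",
                                        "args": {"customer_id": "cust-42"}}},
    {"role": "tool", "content": '[{"order_id": "A-100", "status": "shipped"}]'},
]
```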

=== Context window limits

LLM context windows limit how much history fits. Small models support 8K-32K tokens, medium models support 32K-128K tokens, and large models support 128K-1M+ tokens.

When the context exceeds the limit, the oldest tool results get truncated, the agent loses access to early conversation details, and it may ask for information it already retrieved.

Design workflows to complete within context limits. Avoid unbounded tool chaining.
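
A minimal sketch of the truncation behavior, assuming a token budget and a per-message token counter (both hypothetical):

```python
def truncate_context(messages, budget, count_tokens):
    """Drop the oldest tool results until the context fits the token budget.
    The system prompt and other non-tool messages are kept."""
    kept = list(messages)
    while sum(count_tokens(m) for m in kept) > budget:
        oldest_tool = next((m for m in kept if m["role"] == "tool"), None)
        if oldest_tool is None:
            break  # nothing left to drop; the request itself is too large
        kept.remove(oldest_tool)
    return kept
```

After truncation, any fact that lived only in a dropped tool result is gone, which is why the agent may re-request information it already had.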

== Next steps

* xref:ai-agents:agents/architecture-patterns.adoc[]
* xref:ai-agents:agents/quickstart.adoc[]
* xref:ai-agents:agents/prompt-best-practices.adoc[]
* xref:ai-agents:mcp/remote/best-practices.adoc[]
