
Commit a60abf2

Da-Mikey and Hermes Evolution authored
feat(loop-guard): mono-tool spiral detection with escalated interrupt (Closes #432) (#436)
Add mutating/idempotent tool-aware thresholding to the loop guard, so mutating tools (terminal, write_file, execute_code, etc.) trigger spiral detection at about half the threshold of read-only tools.

=== Changes ===

agent/loop_guard.py:
- Add _MUTATING_TOOLS and _IDEMPOTENT_TOOLS frozensets with category threshold constants: mutating repeat=4/fail=2/escalate=8 vs idempotent repeat=8/fail=4/escalate=15
- Add _tool_category() and _tool_spiral_score() helper functions
- Update maybe_nudge() to auto-select thresholds based on tool type
- Add ESCALATED INTERRUPT level: when a spiral reaches the escalate threshold, the nudge becomes a directive requiring the agent to summarize progress before continuing
- Include spiral-intensity score in high-count nudges so the model sees the evidence of fixation
- Unknown tools (MCP, plugins) default to the safer mutating thresholds
- Fix _EXIT_CODE_RE regex to correctly use \s (whitespace) and \d (digit) character classes

tests/agent/test_loop_guard.py:
- Split mutating/idempotent threshold coverage across both tool types
- Add TestEscalatedInterrupt class (8 tests): verify escalated interrupt fires at correct thresholds for mutating, idempotent, and unknown tools
- Test spiral-intensity annotations appear at high counts
- Update existing tests to reflect new mutating thresholds
- Added 11 new test cases (15 -> 26 total)

agent/conversation_loop.py:
- Log a warning when an ESCALATED INTERRUPT fires, so operators and log aggregators can detect deep spiral patterns

=== Testing ===

26/26 loop_guard tests pass, 9/9 guardrail runtime tests pass (35 total)

Co-authored-by: Hermes Evolution <evolution@hermes-agent.nousresearch.com>
1 parent ab84fa4 commit a60abf2
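
The `_EXIT_CODE_RE` fix named in the commit message does not appear in the hunks shown here, so the actual pattern is unknown. As a minimal sketch of the kind of bug the message describes (a pattern in which `\s` and `\d` are not interpreted as the whitespace and digit classes), the example below uses a hypothetical pattern; both `BROKEN` and `FIXED` are assumptions for illustration, not the project's regex.

```python
import re

# Hypothetical patterns for illustration only; the real _EXIT_CODE_RE is not
# shown in this diff.
# BROKEN: doubled backslashes in a raw string match a literal backslash
# followed by the letters "s" / "d", not whitespace or digits.
BROKEN = re.compile(r"exit code:\\s*(\\d+)")
# FIXED: \s is the whitespace class, \d is the digit class.
FIXED = re.compile(r"exit code:\s*(\d+)")

sample = "command failed, exit code: 127"

broken_match = BROKEN.search(sample)
fixed_match = FIXED.search(sample)

print(broken_match)          # no match on ordinary tool output
print(fixed_match.group(1))  # the captured exit code as a string
```

A pattern like `BROKEN` fails silently: it compiles, but never matches real output, so a failure classifier built on it would quietly treat every result as a non-failure.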

3 files changed

Lines changed: 262 additions & 38 deletions


agent/conversation_loop.py

Lines changed: 6 additions & 0 deletions
@@ -712,6 +712,12 @@ def _run_conversation_impl(
         if _lg_nudge:
             messages.append({"role": "user", "content": _lg_nudge})
             agent._loop_guard_nudged = (_lg_tool, _lg_count)
+            if "ESCALATED INTERRUPT" in _lg_nudge:
+                logger.warning(
+                    "loop_guard: ESCALATED INTERRUPT for %s (%d calls) — "
+                    "deep mono-tool spiral detected (#432)",
+                    _lg_tool, _lg_count,
+                )
             if not agent.quiet_mode:
                 agent._safe_print("\n🌀 loop-guard: nudging a strategy change")
     except Exception as _lg_err:  # never let the guard break the loop
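
For operators who want to act on this warning, one option is a `logging.Handler` that collects records carrying the `ESCALATED INTERRUPT` marker. The sketch below is an assumption about how that could be wired up: the logger name `agent.conversation_loop` and the `EscalationCounter` class are hypothetical, and only the message format and arguments are taken from the diff.

```python
import logging
from typing import List


class EscalationCounter(logging.Handler):
    """Hypothetical handler that collects loop-guard escalation warnings."""

    def __init__(self) -> None:
        super().__init__(level=logging.WARNING)
        self.hits: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        # getMessage() applies the %-style arguments to the format string.
        message = record.getMessage()
        if "ESCALATED INTERRUPT" in message:
            self.hits.append(message)


# Assumed logger name; the real module may use a different one.
logger = logging.getLogger("agent.conversation_loop")
counter = EscalationCounter()
logger.addHandler(counter)

# Same format string and argument order as the warning added in this commit.
logger.warning(
    "loop_guard: ESCALATED INTERRUPT for %s (%d calls) — "
    "deep mono-tool spiral detected (#432)",
    "terminal", 8,
)
logger.warning("unrelated warning")  # ignored by the counter

print(len(counter.hits))
```

Matching on the fixed marker string keeps the detector independent of the tool name and call count, which vary per incident.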

agent/loop_guard.py

Lines changed: 159 additions & 17 deletions
@@ -8,6 +8,7 @@
   * hard limits / access denials retried instead of routed around (#175)
   * an unreachable MCP server looped on health checks (#176)
   * spirals that eventually hit the max-iteration abort (#143)
+  * mono-tool spirals where the agent fixates on ONE tool category (#432)

 Mechanism (deliberately conservative — advisory, never blocking):
 inspect the most recent CONSECUTIVE assistant tool-call turns. If the SAME tool
@@ -17,7 +18,17 @@
 to stop, re-check the goal, and change strategy. A real loop is broken; a rare
 false positive costs one advisory message.

-Pure functions over the `messages` list → fully unit-testable, no agent state
+Tools are split into two categories for thresholding:
+  - Mutating tools (terminal, write_file, patch, execute_code, etc.) get LOWER
+    thresholds because a fixation on these is more costly and the model should
+    be stopped sooner (#432).
+  - Idempotent tools (read_file, search_files, web_search, etc.) use the default
+    higher thresholds since re-reading data is less harmful and sometimes needed.
+
+At higher call counts, the nudge escalates from advisory to a DIRECTIVE that
+requires the model to explain progress before continuing (#432).
+
+Pure functions over the ``messages`` list -> fully unit-testable, no agent state
 required (the caller tracks "already nudged this run" to avoid spamming).
 """
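
The two categories described in this docstring map to threshold triples defined further down in the same file (mutating 4/2/8, idempotent 8/4/15, with unknown tools treated as mutating). A minimal sketch of that lookup, assuming an abbreviated copy of the idempotent set and a hypothetical helper `thresholds_for` that is not part of the commit:

```python
from typing import NamedTuple


class Thresholds(NamedTuple):
    repeat: int
    fail: int
    escalate: int


# Abbreviated copy of the idempotent set in the diff, for illustration only.
IDEMPOTENT_TOOLS = frozenset({"read_file", "search_files", "web_search"})

# Values taken from the constants added in this commit.
MUTATING = Thresholds(repeat=4, fail=2, escalate=8)
IDEMPOTENT = Thresholds(repeat=8, fail=4, escalate=15)


def thresholds_for(tool_name: str) -> Thresholds:
    """Hypothetical helper: only known idempotent tools get the lenient
    triple; mutating and unknown tools both get the strict one."""
    if tool_name in IDEMPOTENT_TOOLS:
        return IDEMPOTENT
    return MUTATING


for name in ("terminal", "read_file", "some_unlisted_plugin_tool"):
    print(name, thresholds_for(name))
```

Defaulting unknown names to the strict triple means a newly added MCP or plugin tool is guarded conservatively until someone classifies it explicitly.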

@@ -58,6 +69,59 @@
 _NON_RETRYABLE = frozenset({"timeout", "permission", "missing_command", "limit"})
 _NONRETRY_THRESHOLD = 2

+# Mutating tools get LOWER thresholds than idempotent tools because a fixation
+# on mutating operations (writing files, running commands) is more costly and
+# indicates a deeper strategy problem (#432).
+_IDEMPOTENT_TOOLS = frozenset(
+    {
+        "read_file",
+        "search_files",
+        "web_search",
+        "web_extract",
+        "session_search",
+        "browser_snapshot",
+        "browser_console",
+        "browser_get_images",
+        "mcp_filesystem_read_file",
+        "mcp_filesystem_read_text_file",
+        "mcp_filesystem_read_multiple_files",
+        "mcp_filesystem_list_directory",
+        "mcp_filesystem_list_directory_with_sizes",
+        "mcp_filesystem_directory_tree",
+        "mcp_filesystem_get_file_info",
+        "mcp_filesystem_search_files",
+    }
+)
+_MUTATING_TOOLS = frozenset(
+    {
+        "terminal",
+        "execute_code",
+        "write_file",
+        "patch",
+        "todo",
+        "memory",
+        "skill_manage",
+        "browser_click",
+        "browser_type",
+        "browser_press",
+        "browser_scroll",
+        "browser_navigate",
+        "send_message",
+        "cronjob",
+        "delegate_task",
+        "process",
+    }
+)
+# Default thresholds: lower for mutating tools, higher for idempotent (#432).
+# Mutating: repeat at 4, fail at 2, escalate at 8
+# Idempotent: repeat at 8, fail at 4, escalate at 15
+_MUTATING_REPEAT_THRESHOLD = 4
+_IDEMPOTENT_REPEAT_THRESHOLD = 8
+_MUTATING_FAIL_THRESHOLD = 2
+_IDEMPOTENT_FAIL_THRESHOLD = 4
+_MUTATING_ESCALATE_THRESHOLD = 8
+_IDEMPOTENT_ESCALATE_THRESHOLD = 15
+

 def _failure_category(content: Any) -> Optional[str]:
     """The tool_diagnostics failure class of a result, or None if not a failure.
@@ -126,22 +190,72 @@ def _recent_tool_runs(messages: List[Dict[str, Any]]) -> List[Tuple[str, bool, O
     return runs


+def _tool_category(tool_name: str) -> str:
+    """Return 'mutating', 'idempotent', or 'unknown' for a tool name."""
+    if tool_name in _MUTATING_TOOLS:
+        return "mutating"
+    if tool_name in _IDEMPOTENT_TOOLS:
+        return "idempotent"
+    return "unknown"
+
+
+def _tool_spiral_score(tool_name: str, count: int, base: int) -> Optional[str]:
+    """Compute a spiral-intensity annotation for the nudge message.
+
+    Returns a one-line annotation like 'spiral-intensity: 3 of 5' when the
+    number of consecutive calls is meaningfully above the base threshold
+    (at least 4 calls over it), or None for shorter runs.
+    """
+    if count <= base:
+        return None
+    excess = count - base
+    intensity = min(excess // 2, 5)  # cap at 5 for readability
+    if intensity >= 2:
+        return f"spiral-intensity: {intensity} of 5"
+    return None
+
+
 def maybe_nudge(
     messages: List[Dict[str, Any]],
     *,
-    repeat_threshold: int = 6,
-    fail_threshold: int = 3,
+    repeat_threshold: Optional[int] = None,
+    fail_threshold: Optional[int] = None,
 ) -> Optional[str]:
     """Return a nudge string if the trailing single-tool run is stuck, else None.

-    Two triggers (failure takes precedence — it's the higher-signal one):
-      * the same tool's last `fail_threshold` results all look like failures
-      * the same tool was called `repeat_threshold`+ times in a row
+    Four trigger levels (each is lower for mutating tools than idempotent):
+      1. Non-retryable failure class repeated twice (highest priority, #231)
+      2. Generic failures >= fail_threshold
+      3. Same tool called >= repeat_threshold times in a row
+      4. Escalated interrupt at higher counts (#432)
+
+    Returns None when the agent is making varied progress (not stuck).
     """
     runs = _recent_tool_runs(messages)
     if not runs:
         return None
     tool = runs[0][0]
+
+    # Pick thresholds based on tool category (#432).
+    # Unknown tools get mutating thresholds as the safer default.
+    cat = _tool_category(tool)
+    is_mutating = cat == "mutating"
+    is_unknown = cat == "unknown"
+    if repeat_threshold is None:
+        repeat_threshold = (
+            _MUTATING_REPEAT_THRESHOLD if (is_mutating or is_unknown)
+            else _IDEMPOTENT_REPEAT_THRESHOLD
+        )
+    if fail_threshold is None:
+        fail_threshold = (
+            _MUTATING_FAIL_THRESHOLD if (is_mutating or is_unknown)
+            else _IDEMPOTENT_FAIL_THRESHOLD
+        )
+    escalate_threshold = (
+        _MUTATING_ESCALATE_THRESHOLD if (is_mutating or is_unknown)
+        else _IDEMPOTENT_ESCALATE_THRESHOLD
+    )
+
     # All entries in `runs` share the same tool (run breaks on tool change),
     # but guard anyway:
     same = [r for r in runs if r[0] == tool]
@@ -165,6 +279,14 @@ def maybe_nudge(
         else:
             counting_nonretry = False

+    # Category label for nudge messages.
+    if is_mutating:
+        cat_label = "mutating"
+    elif is_unknown:
+        cat_label = "unknown"
+    else:
+        cat_label = "idempotent"
+
     # Highest-priority: a DETERMINISTIC failure repeated even once (#231). These
     # reproduce on a near-identical retry, so the generic 3-strike threshold is
     # too lenient — two in a row is already a spiral (terminal timeouts, denied
@@ -181,21 +303,41 @@ def maybe_nudge(

     if consec_fail >= fail_threshold:
         return (
-            f"[loop-guard] The `{tool}` tool has failed {consec_fail} times in a "
-            f"row with the same approach. STOP repeating it. Diagnose the actual "
-            f"blocker first (check prerequisites / environment / the exact error "
-            f"class), then either switch to a different tool or strategy, or — if "
-            f"the blocker can't be resolved — report it concisely instead of "
-            f"retrying. Do not call `{tool}` again the same way."
+            f"[loop-guard] The `{tool}` tool ({cat_label}) has failed "
+            f"{consec_fail} times in a row with the same approach. STOP repeating "
+            f"it. Diagnose the actual blocker first (check prerequisites / "
+            f"environment / the exact error class), then either switch to a "
+            f"different tool or strategy, or — if the blocker can't be resolved "
+            f"— report it concisely instead of retrying. Do not call `{tool}` "
+            f"again the same way."
         )
+
     if count >= repeat_threshold:
+        # Build diversity score for the nudge.
+        score = _tool_spiral_score(tool, count, repeat_threshold)
+        score_line = f"\n{score}" if score else ""
+
+        if count >= escalate_threshold:
+            return (
+                f"[loop-guard] You have called `{tool}` ({cat_label}) {count} "
+                f"times in a row without resolving the task.{score_line}\n"
+                f"⚠️ ESCALATED INTERRUPT: This is a deep mono-tool spiral. "
+                f"PAUSE and summarize in one paragraph the concrete progress "
+                f"these {count} calls have made toward the goal. If no measurable "
+                f"progress exists, state the actual blocker explicitly and "
+                f"propose a fundamentally different strategy — do NOT call "
+                f"`{tool}` again until you have provided this summary."
+            )
+
         return (
-            f"[loop-guard] You have called `{tool}` {count} times in a row without "
-            f"resolving the task. Pause and re-read the goal: what concrete "
-            f"progress have these calls made? Check your plan/success criterion, "
-            f"then either change strategy, move to the next step, or report the "
-            f"blocker. Avoid another near-identical `{tool}` call."
+            f"[loop-guard] You have called `{tool}` ({cat_label}) {count} times "
+            f"in a row without resolving the task.{score_line} Pause and re-read "
+            f"the goal: what concrete progress have these calls made? Check your "
+            f"plan/success criterion, then either change strategy, move to the "
+            f"next step, or report the blocker. Avoid another near-identical "
+            f"`{tool}` call."
         )
+
     return None
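
To see how the repeat threshold, the spiral-intensity annotation, and the escalated interrupt interact, the sketch below replays a few consecutive-call counts using the mutating defaults (repeat 4, escalate 8). `spiral_score` mirrors the arithmetic of `_tool_spiral_score` from this diff, minus the unused tool-name argument; `nudge_level` is a hypothetical simplification of the count-based branch of `maybe_nudge` and ignores the failure triggers entirely.

```python
from typing import Optional


def spiral_score(count: int, base: int) -> Optional[str]:
    """Same arithmetic as _tool_spiral_score in the diff."""
    if count <= base:
        return None
    intensity = min((count - base) // 2, 5)  # capped at 5
    if intensity >= 2:
        return f"spiral-intensity: {intensity} of 5"
    return None


def nudge_level(count: int, repeat: int, escalate: int) -> str:
    """Hypothetical simplification: which count-based nudge would fire."""
    if count < repeat:
        return "none"
    if count >= escalate:
        return "escalated"
    return "advisory"


REPEAT, ESCALATE = 4, 8  # mutating defaults from the commit
for count in (3, 4, 7, 8, 14):
    print(count, nudge_level(count, REPEAT, ESCALATE), spiral_score(count, REPEAT))
```

Under these defaults the annotation needs at least four calls beyond the repeat threshold, so for mutating tools it first appears at count 8, the same point at which the escalated interrupt takes over. With the idempotent defaults (8 and 15) the advisory nudge would carry the annotation from count 12 through 14.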
