Commit 743b7f2
[argus] loop_detector + prompt: nudge toward observation when stuck
Two related changes addressing a failure mode observed on Qwen3-27B:
the model has full multimodal vision but tends to skip past observing
tool output and pattern-match the bug from priors. When iteration
doesn't change the visible result, a single-line "stop calling tools"
warning amplifies the wrong instinct — the model concludes "I should
do this all at once" and rewrites the file from scratch, usually
introducing new bugs.
1. loop_detection_middleware: rewrite the soft-warning text.
Old: "[LOOP DETECTED] You are repeating the same tool calls. Stop
calling tools and produce your final answer now."
New: "[REPEAT TOOL CALL DETECTED] You have just made the same tool
call several times in a row. If the observable result is not
changing between calls, your model of the bug is likely
wrong... (a) describe what you actually observe in the latest
result, (b) note explicitly what is different from what you
expected, (c) instrument or pick a clearly different angle.
Do not 'rewrite from scratch' as a debugging strategy."
Hard-stop messages (>= hard_limit) are unchanged: at that point
producing a final answer is the right outcome. The soft-warning
threshold is where a strategy switch helps, so the new text
points at strategy, not at termination.
Test assertions updated from "LOOP DETECTED" to
"REPEAT TOOL CALL DETECTED" — 8 sites in
test_loop_detection_middleware.py.
2. lead_agent prompt: add an "Observe before you diagnose" paragraph
to <debugging_when_stuck>.
Tells the model to describe what it actually sees in tool output
(visible elements, missing elements, error messages and warnings
verbatim) BEFORE proposing a fix. This forces the next fix to be
grounded in observation rather than priors, breaking the
pattern-match-from-prior loop that produces consecutive same-area
blind fixes.
Companion to a render-and-verify SKILL.md update on the Argus side
that turns this into an explicit numbered flow.
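For illustration only, the soft-warning versus hard-stop split described in change 1 could be sketched as below. This is not the code in loop_detection_middleware: the class name RepeatCallDetector, the soft_limit parameter, both default thresholds, and the hard-stop placeholder text are assumptions made for the example. Only hard_limit and the soft-warning wording (with its "..." kept as elided above) come from this commit message.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional

# Soft-warning text as quoted in the commit message; the "..." is the
# elision from the message itself, not filled in here.
SOFT_WARNING = (
    "[REPEAT TOOL CALL DETECTED] You have just made the same tool call "
    "several times in a row. If the observable result is not changing "
    "between calls, your model of the bug is likely wrong... "
    "(a) describe what you actually observe in the latest result, "
    "(b) note explicitly what is different from what you expected, "
    "(c) instrument or pick a clearly different angle. "
    "Do not 'rewrite from scratch' as a debugging strategy."
)

# Hypothetical placeholder: the real hard-stop text is not shown in the commit.
HARD_STOP = "[HARD STOP PLACEHOLDER] Produce your final answer now."


@dataclass
class RepeatCallDetector:
    """Hypothetical detector: counts consecutive identical tool calls."""

    soft_limit: int = 3  # assumed default, not from the source
    hard_limit: int = 5  # name from the commit message; value assumed
    _last_key: Optional[str] = None
    _streak: int = 0

    def observe(self, tool_name: str, args: dict) -> Optional[str]:
        """Record one tool call and return a message to inject, if any."""
        payload = json.dumps([tool_name, args], sort_keys=True)
        key = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if key == self._last_key:
            self._streak += 1
        else:
            self._last_key = key
            self._streak = 1
        # Hard stop takes priority: termination is the right outcome there.
        if self._streak >= self.hard_limit:
            return HARD_STOP
        # Below the hard limit, nudge toward a strategy switch instead.
        if self._streak >= self.soft_limit:
            return SOFT_WARNING
        return None
```

A test in the style of the updated assertions would then check for the substring "REPEAT TOOL CALL DETECTED" in the soft-warning message rather than "LOOP DETECTED".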
PR-candidate: maybe (loop detector text), maybe (prompt block)
Reason: Both changes target a Qwen-shaped failure mode that we observed
on a specific minion-render thread; the principles generalise
but upstream may have different priors on what soft-warning
text should say.
1 parent ef623b5 commit 743b7f2
4 files changed
Lines changed: 27 additions & 11 deletions
File tree
- backend
- packages/harness/deerflow/agents
- lead_agent
- middlewares
- tests
Lines changed: 4 additions & 0 deletions
Lines changed: 13 additions & 2 deletions