Commit 089fdc5
[argus] lead_agent/prompt: add <debugging_when_stuck> block
Smaller models (e.g. Qwen3-27B) tend to apply progressively weaker
fixes to the same suspect area when their initial mental model of a
bug is wrong, eventually rewriting from scratch and introducing new
bugs. Frontier models tend to step back and instrument when the
observable result doesn't change between fixes; smaller models often
don't have that instinct.
The new <debugging_when_stuck> block sits next to <file_editing>
inside <working_directory> and gives an explicit decision rule:
1. Two failed fixes that didn't change the observable result are a
signal that the model of the bug is wrong, not a reason to keep
guessing.
2. Instrument first, fix second: add logging and inspect compile/link
status, intermediate values, and draw counts.
3. If instrumentation doesn't pinpoint it, reduce the test surface
to the simplest version that should still fail and add
complexity back one piece at a time. Do not rewrite from scratch
as a debugging strategy.
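
The three steps above can be read as a small state machine. The following is a minimal sketch of that reading, not code from this commit: the `Action` enum, the `next_action` function, and its parameters are all hypothetical names introduced for illustration, and "observable result" is modelled as a plain string.

```python
from enum import Enum


class Action(Enum):
    FIX = "attempt a fix"
    INSTRUMENT = "instrument before fixing"
    REDUCE = "reduce the test surface"


def next_action(results, instrumented, pinpointed):
    """Pick the next debugging step (hypothetical helper, for illustration).

    results      -- observable results, oldest first; results[0] is the
                    baseline and each later entry follows one fix attempt
    instrumented -- True once logging/inspection has been added
    pinpointed   -- True once instrumentation has located the cause
    """
    # Step 1: two fix attempts in a row left the observable result unchanged.
    stuck = len(results) >= 3 and results[-1] == results[-2] == results[-3]

    if pinpointed:
        return Action.FIX          # the cause is known, so a fix is justified
    if instrumented:
        return Action.REDUCE       # step 3: instrumentation did not pinpoint it
    if stuck:
        return Action.INSTRUMENT   # step 2: instrument first, fix second
    return Action.FIX              # not stuck yet; keep iterating normally
```

Note that the enum has no "rewrite from scratch" member, which mirrors the last sentence of step 3. For example, `next_action(["black screen"] * 3, False, False)` returns `Action.INSTRUMENT`.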
Adds a test asserting block presence, contents, and placement after
<file_editing> within <working_directory>.
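
A check of this kind might look like the sketch below. It is an assumption-laden illustration, not the test added by this commit: `SAMPLE_PROMPT` is a made-up stand-in for the rendered prompt, and `block_span` and `check_debugging_block` are hypothetical helpers.

```python
import re

# Hypothetical stand-in for the rendered prompt; the real wording is not shown here.
SAMPLE_PROMPT = """\
<working_directory>
<file_editing>
Prefer targeted edits over whole-file rewrites.
</file_editing>
<debugging_when_stuck>
Two failed fixes with no change in the observable result mean the model
of the bug is wrong. Instrument first, fix second. Reduce to the simplest
failing case. Do not rewrite from scratch as a debugging strategy.
</debugging_when_stuck>
</working_directory>
"""


def block_span(prompt, tag):
    """Return (start, end) offsets of <tag>...</tag>, or None if absent."""
    name = re.escape(tag)
    match = re.search(rf"<{name}>.*?</{name}>", prompt, flags=re.DOTALL)
    return match.span() if match else None


def check_debugging_block(prompt):
    """Assert presence, contents, and placement of the debugging block."""
    outer = block_span(prompt, "working_directory")
    editing = block_span(prompt, "file_editing")
    debugging = block_span(prompt, "debugging_when_stuck")

    # Presence: all three blocks must exist.
    assert outer and editing and debugging, "missing block"
    # Placement: after <file_editing> and inside <working_directory>.
    assert editing[1] <= debugging[0], "block must follow <file_editing>"
    assert outer[0] < debugging[0] and debugging[1] < outer[1], "block must be nested"
    # Contents: key phrases of the decision rule (illustrative choices).
    body = prompt[debugging[0]:debugging[1]]
    for phrase in ("Instrument first", "rewrite from scratch"):
        assert phrase in body, f"missing phrase: {phrase}"
    return True
```

Comparing character offsets rather than exact text keeps the placement check stable when the wording of either block is edited later.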
PR-candidate: maybe
Upstream-issue: none
Reason: Like the <file_editing> block, this is opinionated phrasing
targeting a specific observed failure mode. Worth proposing
upstream after we have data on whether it actually changes
agent behaviour on a similar reproduction.
1 parent 184d3c9 commit 089fdc5
2 files changed
Lines changed: 47 additions & 0 deletions
Lines changed: 12 additions & 0 deletions
File 1 of 2: 12 lines added as new lines 481-492, following original line 480 (diff content not captured).
File 2 of 2: 35 lines added as new lines 258-292, following original line 257 (diff content not captured).