Skip to content

Commit 9e6c23f

Browse files
author
bgagent
committed
fix(agent): cancel hook short-circuits nudge consumption (krokoko review aws-samples#3)
Pre-fix, the between-turns-hooks dispatcher at ``agent/src/hooks.py`` ran EVERY registered hook in the list before checking ``ctx["_cancel_requested"]``. The registered order is cancel-hook first, nudge-hook second — so when cancel fired, the nudge hook had already run and: - Read ``TaskNudgesTable`` via ``read_pending``. - Marked the nudge rows ``status=consumed`` via ``mark_consumed``. - Added them to the in-memory ``_INJECTED_NUDGES`` dedup set. But the dispatcher's return value discarded everything when cancel was detected, so the nudge was silently lost. The user's ``bgagent nudge`` had returned 202 Accepted, the DDB row was now ``consumed``, yet the agent never saw the nudge content. A cancelled task leaks a nudge on every invocation. ## Fix — belt-and-braces Two independent guards so the invariant is preserved even if a future refactor reorders the hook registry: 1. **Loop-level break (primary).** The between-turns-hooks dispatcher checks ``ctx.get("_cancel_requested")`` immediately after each hook returns and breaks out of the loop. Cancel-hook runs first by convention, flips the flag, and the nudge-hook never runs. No DDB reads, no status mutation. 2. **Internal early-return in the nudge hook (secondary).** ``_nudge_between_turns_hook`` checks ``ctx.get("_cancel_requested")`` up front (right after the empty-task-id guard, BEFORE any DDB call and BEFORE ``_emit_nudge_milestone`` would write a spurious ``nudge_acknowledged``). Also guards against the pre-loop-cancel case (cancel flag already set when the dispatcher enters). The two guards are independent: if the registry ordering breaks (e.g. Phase 3 prepends an approval hook that flips cancel after the registered cancel hook), the internal early-return still protects against data loss. Hardened the registry-declaration comment at ``agent/src/hooks.py`` to state the ordering is load-bearing ("cancel MUST come before nudge") and documented how future hooks should be appended. Did NOT add a runtime sort — the list literal preserves insertion order deterministically, and forcing an implicit sort would hide real bugs from anyone intentionally reordering. ## Tests +5 regression tests in a new ``agent/tests/test_cancel_hook.py``: - ``test_nudge_hook_not_invoked_when_cancel_fires_first`` — spy hook verifies invocation count is 0 after cancel; dedup set untouched. - ``test_real_nudge_reader_not_touched_on_cancel`` — end-to-end with mocked DDB; asserts neither ``table.query`` nor ``update_item`` is called when cancel fires first. - ``test_preloop_cancel_skips_all_hooks_via_internal_guard`` — the nudge hook invoked directly with ``_cancel_requested=True``; no DDB I/O, no dedup mutation. - ``test_nudge_hook_internal_guard_fires_even_if_registry_reordered`` — asserts ``write_agent_milestone`` is NOT called when cancel is set (no spurious ack milestone for a cancelled turn). - ``test_running_task_nudge_still_consumed_normally`` — negative-control regression: RUNNING tasks still flow through cancel→nudge→inject path with DDB calls firing exactly once. Agent suite: 473 passing (was 468). Refs: krokoko code review on PR aws-samples#52 (finding 3)
1 parent d6ad9a5 commit 9e6c23f

2 files changed

Lines changed: 284 additions & 3 deletions

File tree

agent/src/hooks.py

Lines changed: 45 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -248,6 +248,21 @@ def _nudge_between_turns_hook(ctx: dict) -> list[str]:
248248
if not task_id:
249249
return []
250250

251+
# Belt-and-braces second guard against the "cancel consumes nudges" hazard
252+
# (krokoko PR #52 review finding #3). The primary guard is the loop-level
253+
# break in :func:`stop_hook` which short-circuits the dispatcher as soon as
254+
# any earlier hook sets ``_cancel_requested``. That assumes
255+
# ``_cancel_between_turns_hook`` runs BEFORE this hook — true for the
256+
# module-level ``between_turns_hooks`` registry today (line 340), but a
257+
# future reorder (or a test that rebinds the list without preserving
258+
# order) would silently reintroduce the bug: ``read_pending`` +
259+
# ``mark_consumed`` would flip the DDB rows to consumed and stamp
260+
# ``_INJECTED_NUDGES`` for a dying agent that will never see the text.
261+
# Early-returning here makes the invariant structural — no nudges are
262+
# ever consumed once cancel is flagged, regardless of hook ordering.
263+
if ctx.get("_cancel_requested"):
264+
return []
265+
251266
try:
252267
pending = nudge_reader.read_pending(task_id)
253268
except Exception as exc:
@@ -334,9 +349,15 @@ def _cancel_between_turns_hook(ctx: dict) -> list[str]:
334349
return []
335350

336351

337-
# Global list of between-turns hooks. Cancel runs first so it can short-circuit
338-
# nudges on cancelled tasks (no point injecting nudges into a dying agent).
339-
# Phase 3 (approval gates) will ``append`` additional hooks here.
352+
# Global list of between-turns hooks. Cancel MUST run first so it can
353+
# short-circuit nudges on cancelled tasks (no point injecting nudges into a
354+
# dying agent — worse, the nudge reader mutates DDB state that the agent will
355+
# never act on; see krokoko PR #52 review finding #3). The :func:`stop_hook`
356+
# dispatcher breaks out of the loop as soon as ``_cancel_requested`` is set,
357+
# and :func:`_nudge_between_turns_hook` early-returns when the flag is already
358+
# present — belt-and-braces in case a future ``append`` reorders this list.
359+
# Phase 3 (approval gates) should ``append`` additional hooks AFTER the
360+
# nudge reader to preserve cancel-wins semantics.
340361
between_turns_hooks: list[BetweenTurnsHook] = [
341362
_cancel_between_turns_hook,
342363
_nudge_between_turns_hook,
@@ -371,6 +392,20 @@ async def stop_hook(
371392
"progress": progress,
372393
}
373394

395+
# Cancel-before-nudge short-circuit (krokoko PR #52 review finding #3).
396+
# Previously the loop ran ALL hooks before checking ``_cancel_requested``,
397+
# which meant the nudge hook's ``read_pending`` + ``mark_consumed`` path
398+
# executed even on cancelled tasks — flipping the DDB rows to consumed
399+
# and stamping ``_INJECTED_NUDGES`` for a dying agent. The user saw a
400+
# 202 Accepted for their nudge but the injection was discarded when we
401+
# returned ``continue_=False`` below. Breaking out of the loop as soon
402+
# as any hook sets ``_cancel_requested`` guarantees subsequent hooks
403+
# (notably the nudge reader) never run, so DDB state is never mutated
404+
# for work the agent will never do. The registry at line 340 keeps
405+
# ``_cancel_between_turns_hook`` first so this break fires before the
406+
# nudge hook gets a chance. ``_nudge_between_turns_hook`` also carries
407+
# an internal cancel-check as belt-and-braces in case a future refactor
408+
# reorders the registry.
374409
chunks: list[str] = []
375410
for hook in between_turns_hooks:
376411
try:
@@ -383,6 +418,13 @@ async def stop_hook(
383418
continue
384419
if produced:
385420
chunks.extend(produced)
421+
if ctx.get("_cancel_requested"):
422+
# Any text produced by earlier hooks in this same loop iteration
423+
# is discarded below — the ``_cancel_requested`` branch returns
424+
# ``continue_=False`` and never reads ``chunks``. This is
425+
# intentional: cancel wins, and we would rather drop a
426+
# simultaneous nudge than inject into a dying agent.
427+
break
386428

387429
# Cancel takes precedence over nudge injection. ``continue_: False`` tells
388430
# the SDK to end the turn loop and return control to the caller, which

agent/tests/test_cancel_hook.py

Lines changed: 239 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@
1818
import pytest
1919

2020
import hooks as hooks_mod
21+
import nudge_reader
2122
import task_state
2223

2324

@@ -29,8 +30,12 @@ def _run(coro):
2930
def _reset():
3031
# Restore the default registry after each test.
3132
original = list(hooks_mod.between_turns_hooks)
33+
nudge_reader._reset_cache_for_tests()
34+
hooks_mod._reset_injected_nudges_for_tests()
3235
yield
3336
hooks_mod.between_turns_hooks[:] = original
37+
nudge_reader._reset_cache_for_tests()
38+
hooks_mod._reset_injected_nudges_for_tests()
3439

3540

3641
class TestCancelBetweenTurnsHook:
@@ -180,3 +185,237 @@ def test_milestone_emitted_on_cancel_detect(self, monkeypatch):
180185
progress.write_agent_milestone.assert_called_once()
181186
call_kwargs = progress.write_agent_milestone.call_args.kwargs
182187
assert call_kwargs["milestone"] == "cancel_detected"
188+
189+
190+
class TestCancelShortCircuitsNudgeConsumption:
191+
"""Regression for krokoko PR #52 review finding #3.
192+
193+
Before the fix, :func:`stop_hook` iterated ALL between-turns hooks BEFORE
194+
checking ``_cancel_requested`` — so when cancel fired, the nudge hook had
195+
already run, mutated DDB (``mark_consumed`` + stamped ``_INJECTED_NUDGES``),
196+
and had its return value silently discarded by the cancel branch. Users
197+
saw a 202 Accepted for their nudge but the instruction was never injected
198+
into the (dying) agent.
199+
200+
The fix is two-layered:
201+
1. ``stop_hook`` breaks out of the dispatcher loop as soon as any hook
202+
sets ``_cancel_requested``, so the nudge hook never runs on a
203+
cancelled turn.
204+
2. ``_nudge_between_turns_hook`` itself early-returns when
205+
``_cancel_requested`` is already present, as belt-and-braces in
206+
case a future refactor reorders the registry.
207+
"""
208+
209+
def test_nudge_hook_not_invoked_when_cancel_fires_first(self, monkeypatch):
210+
"""Happy-path regression: cancel hook flips sentinel → nudge hook is
211+
never called → DDB query never issued → injected-nudges set untouched.
212+
"""
213+
monkeypatch.setattr(task_state, "get_task", lambda _tid: {"status": "CANCELLED"})
214+
215+
nudge_calls = {"count": 0}
216+
217+
def _spy_nudge(_ctx):
218+
nudge_calls["count"] += 1
219+
return ["<user_nudge>should never be injected</user_nudge>"]
220+
221+
hooks_mod.between_turns_hooks[:] = [
222+
hooks_mod._cancel_between_turns_hook,
223+
_spy_nudge,
224+
]
225+
226+
result = _run(
227+
hooks_mod.stop_hook(
228+
hook_input={},
229+
tool_use_id=None,
230+
hook_context=None,
231+
task_id="t-cancel-nudge-race",
232+
progress=MagicMock(),
233+
)
234+
)
235+
236+
# Cancel-wins semantics unchanged.
237+
assert result == {
238+
"continue_": False,
239+
"stopReason": "Task cancelled by user",
240+
}
241+
# Critical invariant: the nudge hook was NEVER called. Before the
242+
# fix, ``nudge_calls["count"]`` would have been 1 and the pending
243+
# DDB row would have been marked consumed.
244+
assert nudge_calls["count"] == 0
245+
# In-process dedup set must be untouched — the "task set" should not
246+
# have been created because the nudge hook never ran.
247+
assert "t-cancel-nudge-race" not in hooks_mod._INJECTED_NUDGES
248+
249+
def test_real_nudge_reader_not_touched_on_cancel(self, monkeypatch):
250+
"""End-to-end regression: with the ACTUAL ``_nudge_between_turns_hook``
251+
registered alongside the cancel hook, a pending DDB row MUST NOT be
252+
read or marked consumed when cancel fires in the same turn.
253+
254+
This is the scenario the review was concerned about — a user submits
255+
a nudge, then immediately cancels, and the nudge disappears silently
256+
because it was consumed but never injected.
257+
"""
258+
monkeypatch.setattr(task_state, "get_task", lambda _tid: {"status": "CANCELLED"})
259+
260+
table = MagicMock()
261+
# If the nudge hook runs, it would see this pending row.
262+
table.query.return_value = {
263+
"Items": [
264+
{
265+
"task_id": "t-cancel-real",
266+
"nudge_id": "01NUDGE",
267+
"message": "please add logging",
268+
"created_at": "2026-05-05T12:00:00Z",
269+
"consumed": False,
270+
}
271+
]
272+
}
273+
table.update_item.return_value = {}
274+
nudge_reader._TABLE_CACHE = table
275+
276+
# Default registry order: cancel first, nudge second.
277+
hooks_mod.between_turns_hooks[:] = [
278+
hooks_mod._cancel_between_turns_hook,
279+
hooks_mod._nudge_between_turns_hook,
280+
]
281+
282+
result = _run(
283+
hooks_mod.stop_hook(
284+
hook_input={},
285+
tool_use_id=None,
286+
hook_context=None,
287+
task_id="t-cancel-real",
288+
progress=MagicMock(),
289+
)
290+
)
291+
292+
assert result["continue_"] is False
293+
# DDB must not have been queried — the nudge hook never ran.
294+
table.query.assert_not_called()
295+
# And therefore no ``mark_consumed`` call either.
296+
table.update_item.assert_not_called()
297+
298+
def test_preloop_cancel_skips_all_hooks_via_internal_guard(self, monkeypatch):
299+
"""If cancel is already flagged on ``ctx`` entering the dispatcher
300+
(e.g. a Phase 3 hook prepended to the registry sets it, or a future
301+
code path stamps the flag before hook dispatch), the nudge hook's
302+
own early-return covers it.
303+
304+
Today ``stop_hook`` builds ``ctx`` fresh each call so the pre-loop
305+
case is not reachable from the normal SDK entry point, but the
306+
nudge hook's internal guard is tested here directly to document the
307+
second line of defence.
308+
"""
309+
table = MagicMock()
310+
table.query.return_value = {
311+
"Items": [
312+
{
313+
"task_id": "t-preloop",
314+
"nudge_id": "01PRELOOP",
315+
"message": "should not be consumed",
316+
"created_at": "2026-05-05T12:00:00Z",
317+
"consumed": False,
318+
}
319+
]
320+
}
321+
table.update_item.return_value = {}
322+
nudge_reader._TABLE_CACHE = table
323+
324+
# Cancel sentinel already set on ctx entering the nudge hook.
325+
ctx = {"task_id": "t-preloop", "_cancel_requested": True}
326+
result = hooks_mod._nudge_between_turns_hook(ctx)
327+
328+
assert result == []
329+
# Belt-and-braces check: the nudge hook returned before any DDB I/O.
330+
table.query.assert_not_called()
331+
table.update_item.assert_not_called()
332+
# And the in-process dedup set was not stamped.
333+
assert "t-preloop" not in hooks_mod._INJECTED_NUDGES
334+
335+
def test_nudge_hook_internal_guard_fires_even_if_registry_reordered(
336+
self, monkeypatch
337+
):
338+
"""If a future refactor accidentally puts nudge before cancel in the
339+
registry, the loop-level break no longer helps — but the nudge
340+
hook's own ``_cancel_requested`` check still has to short-circuit.
341+
342+
Simulate this by registering a synthetic "early cancel" hook that
343+
flips the sentinel BEFORE the nudge hook, but keeping nudge second
344+
as usual. The loop will break after the cancel hook (finding
345+
already covered); here we verify the nudge hook's internal guard
346+
by driving it directly with cancel already set in ctx and an
347+
attached progress writer.
348+
"""
349+
table = MagicMock()
350+
table.query.return_value = {
351+
"Items": [
352+
{
353+
"task_id": "t-guard",
354+
"nudge_id": "01GUARD",
355+
"message": "must not inject",
356+
"created_at": "ts",
357+
"consumed": False,
358+
}
359+
]
360+
}
361+
table.update_item.return_value = {}
362+
nudge_reader._TABLE_CACHE = table
363+
364+
progress = MagicMock()
365+
ctx = {
366+
"task_id": "t-guard",
367+
"progress": progress,
368+
"_cancel_requested": True,
369+
}
370+
result = hooks_mod._nudge_between_turns_hook(ctx)
371+
372+
assert result == []
373+
# The early-return happens before ``_emit_nudge_milestone`` — no
374+
# ``nudge_acknowledged`` event should be written for a cancelled task.
375+
progress.write_agent_milestone.assert_not_called()
376+
table.query.assert_not_called()
377+
table.update_item.assert_not_called()
378+
379+
def test_running_task_nudge_still_consumed_normally(self, monkeypatch):
380+
"""Negative control: the guard must not regress the happy path.
381+
382+
A RUNNING task with a pending nudge should still flow through:
383+
cancel hook returns [] without setting the sentinel, nudge hook
384+
reads + consumes + injects as before.
385+
"""
386+
monkeypatch.setattr(task_state, "get_task", lambda _tid: {"status": "RUNNING"})
387+
388+
table = MagicMock()
389+
table.query.return_value = {
390+
"Items": [
391+
{
392+
"task_id": "t-live",
393+
"nudge_id": "01LIVE",
394+
"message": "add docs",
395+
"created_at": "ts",
396+
"consumed": False,
397+
}
398+
]
399+
}
400+
table.update_item.return_value = {}
401+
nudge_reader._TABLE_CACHE = table
402+
403+
hooks_mod.between_turns_hooks[:] = [
404+
hooks_mod._cancel_between_turns_hook,
405+
hooks_mod._nudge_between_turns_hook,
406+
]
407+
408+
result = _run(
409+
hooks_mod.stop_hook(
410+
hook_input={},
411+
tool_use_id=None,
412+
hook_context=None,
413+
task_id="t-live",
414+
progress=MagicMock(),
415+
)
416+
)
417+
418+
assert result["decision"] == "block"
419+
assert "add docs" in result["reason"]
420+
table.query.assert_called_once()
421+
table.update_item.assert_called_once()

0 commit comments

Comments
 (0)