fix(safe_message_handler): use state field for ClusterShutdown instead of local variable#458
Conversation
ClusterShutdown in ClusterManagerState was checked by update handlers to reject work after shutdown, but was never set — only a local shouldShutdown variable was set. This made the shutdown guard dead code, allowing updates to slip through during the AllHandlersFinished drain. Replace shouldShutdown with cm.state.ClusterShutdown as the single source of truth. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Fixes shutdown guarding in the safe_message_handler example workflow by making ClusterManagerState.ClusterShutdown the single source of truth, ensuring update handlers correctly reject work after shutdown.
Changes:
- Set
cm.state.ClusterShutdown = truewhen the shutdown signal is received. - Use
cm.state.ClusterShutdown(instead of a per-iteration local variable) as the run-loop break condition.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| selector.AddReceive(cm.shutdownCh, func(c workflow.ReceiveChannel, _ bool) { | ||
| c.Receive(ctx, nil) | ||
| shouldShutdown = true | ||
| cm.state.ClusterShutdown = true | ||
| }) | ||
| selector.AddFuture(workflow.NewTimer(ctx, cm.sleepInterval), func(f workflow.Future) { | ||
| cm.performHealthCheck(ctx) | ||
| }) | ||
| selector.Select(ctx) | ||
| if shouldShutdown { | ||
| if cm.state.ClusterShutdown { | ||
| break |
There was a problem hiding this comment.
Consider adding a regression test to ensure updates are rejected after ShutdownCluster is signaled. This change makes cm.state.ClusterShutdown the source of truth, but there’s currently no unit test asserting that an AssignNodesToJobs/DeleteJob update attempted after shutdown returns the expected error (especially while the workflow is draining via workflow.AllHandlersFinished).
Code reviewNo issues found. Checked for bugs and CLAUDE.md compliance. 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
Summary
ClusterManagerState.ClusterShutdownwas checked in bothAssignNodesToJobsandDeleteJobhandlers to reject work after shutdown, but was never actually set totrueshouldShutdownvariable was being set in the shutdown signal handler, making the guard in the update handlers dead codeAllHandlersFinisheddrain at shutdown could slip through and do real workshouldShutdownwithcm.state.ClusterShutdownas the single source of truth for both the loop break condition and the update handler guardsTest plan
go run safe_message_handler/worker/main.goandgo run safe_message_handler/starter/main.goand verify clean shutdownShutdownClustersignal is sent🤖 Generated with Claude Code