Skip to content

feat: pause conductor heartbeats after inactivity#839

Open
yaroshevych wants to merge 2 commits into
asheshgoplani:mainfrom
yaroshevych:conductor-heartbeat-idle-pause
Open

feat: pause conductor heartbeats after inactivity#839
yaroshevych wants to merge 2 commits into
asheshgoplani:mainfrom
yaroshevych:conductor-heartbeat-idle-pause

Conversation

@yaroshevych
Copy link
Copy Markdown
Contributor

@yaroshevych yaroshevych commented Apr 30, 2026

Summary

If agent-deck is left running for a long period of time - e.g. overnight, it keeps pinging conductor, burning tokens. New feature adds --heartbeat-idle-minutes flag, which pauses heartbeats after period of inactivity.

  • add per-conductor heartbeat idle threshold configuration
  • expose heartbeat idle threshold and last activity in conductor status JSON
  • centralize heartbeat gating in the Go session send path for shell and bridge heartbeats
  • use persistent hook lifecycle timestamps for heartbeat activity instead of raw tmux window activity

Tests

Tested manually:

  1. Created new conductor with agent-deck conductor setup test --heartbeat-idle-minutes 3
  2. Started new Claude session, sent a message
  3. Conductor received 3 heartbeats, then nothing
  4. In Claude session, sent one more message
  5. Conductor started receiving heartbeats

Unit tests:

  1. go test ./internal/session -run 'TestGetConductorLastActivity|TestRenderConductorHeartbeatScript|TestConductorHeartbeatScript|TestConductorSessionTitle|TestIsConductorHeartbeatMessage|TestConductorStatusJSON'
  2. go test ./cmd/agent-deck -run TestShouldSkipConductorHeartbeatSend

Copy link
Copy Markdown
Owner

@asheshgoplani asheshgoplani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yaroshevych — thank you for the thorough work, the heartbeat-pause feature is genuinely useful and the implementation is mostly clean (variadic backwards-compat, proper hook-ledger usage instead of raw tmux activity, additive JSON, default-off). Tests pass, race-condition is benign, no regressions on existing conductors.

One concern from the dual-review (Claude + Codex agreed on this) that I'd like resolved before merge:

Silent-failure mode on zero lastActivity. shouldSkipConductorHeartbeatSend at cmd/agent-deck/session_cmd.go:1859 treats lastActivity.IsZero() as "skip heartbeat." In the happy path this means "no managed sessions yet, no need to ping." But it also fires if:

  • The hook ledger files are missing (filesystem cleanup, fresh installs of the runtime)
  • The ledger is corrupt (write race, disk full mid-write, partial flush after crash)
  • readHookStatusFile returns zero for any reason while sessions still exist

In those cases, heartbeats stop and stay stopped until the user manually messages the conductor — which they likely won't notice for hours or days. That's the same class of silent failure we've been fixing across the codebase this week (notifier dedup, SQLite leak, busy-retry termination), so I'd rather not ship another silent-failure mode.

The test docstring on TestGetConductorLastActivity_NoManagedSessions even says "must not suppress heartbeats in this case" but the implementation does suppress — the assertion would mask a future regression that someone "fixes" toward the docstring.

Suggested fix: in shouldSkipConductorHeartbeatSend, treat lastActivity.IsZero() as "send heartbeat" (the conservative default) rather than "skip." If activity is genuinely never recorded, the user keeps getting their normal cadence — which they can disable explicitly via --no-heartbeat. Suppression should require positive evidence of recent activity.

If you'd rather take a different approach (e.g. a separate "activity-required" flag), happy to discuss. Either way the test docstring should be aligned with the implementation.

Everything else is good to go — once that's resolved, this lands in v1.7.79.

Committed by Ashesh Goplani

@yaroshevych yaroshevych force-pushed the conductor-heartbeat-idle-pause branch from 0231055 to 1aa5cdb Compare May 2, 2026 20:37
@yaroshevych
Copy link
Copy Markdown
Contributor Author

Good catch. Updated the behaviour.

@yaroshevych yaroshevych requested a review from asheshgoplani May 2, 2026 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants