bug(tasks): Switching to PostHog Code Inbox during a live local task run silently kills all in-session background Monitors #3035

Description

@edblankenship

Repro

With a local agent task session actively running, switch to the PostHog Code Inbox view (or navigate away) and come back. The session "continues" in a fresh window ("Continue from where you left off" appears and the run UUID changes), and every long-running background Monitor / run_in_background Bash process the task agent spawned is dead, with no warning. Sub-agents and other AI agent worker sessions that were started outside PostHog Code's harness survive, for example when a PostHog Code task session acts as an Orchestrator and invokes another harness's MCP tools to start other sessions. Only the PostHog Code harness's in-process Monitor children die.

Root cause

Remounting the task view drives reconcileLocalConnection down the resume-existing path → agent.reconnect → getOrCreateSession(isReconnect=true), which kills the prior task run's whole process group via killProcessTree (packages/workspace-server/src/services/agent/agent.ts:716-723 → process-tracking.ts:183). resumeSession then spawns a fresh task agent process rehydrated from the JSONL log: conversation only, no live children. The Monitor/background children are untracked OS descendants of the killed agent process (the "child" process category referenced in the kill loop is never registered anywhere; see process-tracking/schemas.ts:3).
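To make the failure mode concrete, here is a minimal in-memory model of the kill+resume path described above. It does not touch real processes and is not the workspace-server code: `ProcessTable`, `Session`, and `reconnectByKillAndResume` are hypothetical names, and the model assumes that a tree kill reaches every descendant whether or not it was ever registered.

```typescript
// Simplified model only. All names are illustrative assumptions,
// not identifiers from packages/workspace-server.

type Proc = { pid: number; parent: number | null; alive: boolean; label: string };

class ProcessTable {
  private procs = new Map<number, Proc>();
  private nextPid = 100;

  // Records a new live process, optionally as the child of another one.
  spawn(label: string, parent: number | null = null): number {
    const pid = this.nextPid++;
    this.procs.set(pid, { pid, parent, alive: true, label });
    return pid;
  }

  // Marks a process and all of its descendants as dead, the way a
  // process-tree kill would, regardless of whether anyone tracked them.
  killTree(root: number): void {
    const target = this.procs.get(root);
    if (!target) return;
    target.alive = false;
    this.procs.forEach((p) => {
      if (p.parent === root && p.alive) this.killTree(p.pid);
    });
  }

  isAlive(pid: number): boolean {
    const p = this.procs.get(pid);
    return p !== undefined && p.alive;
  }
}

interface Session {
  taskRunId: string;
  agentPid: number;
}

// Models the reconnect path: kill the prior agent's tree, then start a
// fresh agent that only has the replayed conversation, no children.
function reconnectByKillAndResume(
  table: ProcessTable,
  sessions: Map<string, Session>,
  taskRunId: string,
): Session {
  const prior = sessions.get(taskRunId);
  if (prior) table.killTree(prior.agentPid);
  const resumed: Session = { taskRunId, agentPid: table.spawn("agent (resumed)") };
  sessions.set(taskRunId, resumed);
  return resumed;
}

const table = new ProcessTable();
const sessions = new Map<string, Session>();

const agentPid = table.spawn("agent");
sessions.set("run-1", { taskRunId: "run-1", agentPid });
const monitorPid = table.spawn("Monitor", agentPid);           // in-process child
const backgroundBashPid = table.spawn("run_in_background", agentPid);
const externalWorkerPid = table.spawn("external worker");      // not a descendant

const resumed = reconnectByKillAndResume(table, sessions, "run-1");

console.log("monitor alive:", table.isAlive(monitorPid));
console.log("background bash alive:", table.isAlive(backgroundBashPid));
console.log("external worker alive:", table.isAlive(externalWorkerPid));
console.log("resumed agent alive:", table.isAlive(resumed.agentPid));
```

In this model the Monitor and background Bash entries end up dead while the external worker survives, which matches the behavior reported in the repro.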

Note

The underlying trigger is the task session dropping out of the connected status. Navigation causes that via the remount, but in principle an SSE-channel close, a heartbeat timeout, or an idle period could flip it too. The root issue may not be limited to navigation changes.

Impact

Real-time monitoring is lost silently the moment a user checks the PostHog Code Inbox while live tasks are still running. This is exactly the missed-signal failure that real-time monitoring exists to prevent: the watches stop with no warning.

Proposed fix

When the task run is still the current live task run with a tracked agent process, re-attach to the existing session (the sessions.get(taskRunId) reuse fast-path already exists at agent.ts:710-714) and re-subscribe to the event channel instead of kill+resume.
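The decision proposed above could look roughly like the sketch below. It is an illustration under assumptions, not a patch: `LiveSession`, `EventChannel`, `reconnect`, and the `killAndResume` callback are hypothetical names, and the liveness check is reduced to a boolean flag where the real code would consult its process tracking.

```typescript
// Sketch of "re-attach if the agent is still alive, otherwise kill+resume".
// All names are illustrative assumptions, not the workspace-server API.

type Listener = (event: string) => void;

class EventChannel {
  private listeners: Listener[] = [];

  // Adds a listener and returns a function that removes it again.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  emit(event: string): void {
    this.listeners.forEach((l) => l(event));
  }
}

interface LiveSession {
  taskRunId: string;
  agentAlive: boolean; // stand-in for "has a tracked, running agent process"
  channel: EventChannel;
}

interface ReconnectResult {
  mode: "reattached" | "resumed";
  session: LiveSession;
  unsubscribe: () => void;
}

function reconnect(
  liveSessions: Map<string, LiveSession>,
  taskRunId: string,
  onEvent: Listener,
  killAndResume: (taskRunId: string) => LiveSession,
): ReconnectResult {
  const existing = liveSessions.get(taskRunId);

  // Fast path: the agent is still running, so keep it (and its children)
  // and only re-subscribe the remounted view to the event channel.
  if (existing && existing.agentAlive) {
    return {
      mode: "reattached",
      session: existing,
      unsubscribe: existing.channel.subscribe(onEvent),
    };
  }

  // Slow path: nothing live to attach to, so fall back to kill+resume.
  const fresh = killAndResume(taskRunId);
  liveSessions.set(taskRunId, fresh);
  return { mode: "resumed", session: fresh, unsubscribe: fresh.channel.subscribe(onEvent) };
}

const liveSessions = new Map<string, LiveSession>();
const liveChannel = new EventChannel();
liveSessions.set("run-1", { taskRunId: "run-1", agentAlive: true, channel: liveChannel });

let resumeCalls = 0;
const received: string[] = [];
const fakeKillAndResume = (id: string): LiveSession => {
  resumeCalls += 1;
  return { taskRunId: id, agentAlive: true, channel: new EventChannel() };
};

// Simulates the view remounting while the agent is still running.
const reattach = reconnect(liveSessions, "run-1", (e) => { received.push(e); }, fakeKillAndResume);
liveChannel.emit("monitor: build finished");
console.log(reattach.mode, resumeCalls, received);

// Simulates a genuine resume: the agent process is gone.
liveSessions.set("run-2", { taskRunId: "run-2", agentAlive: false, channel: new EventChannel() });
const resumedResult = reconnect(liveSessions, "run-2", () => {}, fakeKillAndResume);
console.log(resumedResult.mode, resumeCalls);
```

Returning the unsubscribe function lets the view drop its listener on unmount without affecting the session, so a later remount can attach again.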

Longer term, register background children under the existing "child" category and re-establish them after a genuine resume.
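One possible shape for the longer-term idea is sketched below. Only the "child" category name comes from this issue; the "agent" category, the `ProcessRegistry` class, the stored `command` field, and `reestablishChildren` are assumptions made for illustration. The sketch also assumes that a background command can simply be started again, which may not hold for every Monitor.

```typescript
// Hypothetical registry sketch. Not the process-tracking module's API.

type ProcessCategory = "agent" | "child";

interface TrackedProcess {
  pid: number;
  category: ProcessCategory;
  taskRunId: string;
  command: string; // kept so the child can be started again after a resume
}

class ProcessRegistry {
  private entries: TrackedProcess[] = [];

  register(entry: TrackedProcess): void {
    this.entries.push(entry);
  }

  unregister(pid: number): void {
    this.entries = this.entries.filter((e) => e.pid !== pid);
  }

  // Background children registered for one task run.
  childrenOf(taskRunId: string): TrackedProcess[] {
    return this.entries.filter((e) => e.category === "child" && e.taskRunId === taskRunId);
  }
}

// After a genuine resume, restart each previously registered child and
// swap its stale registry entry for one carrying the new pid.
function reestablishChildren(
  registry: ProcessRegistry,
  taskRunId: string,
  respawn: (command: string) => number,
): number[] {
  const newPids: number[] = [];
  registry.childrenOf(taskRunId).forEach((child) => {
    registry.unregister(child.pid);
    const pid = respawn(child.command);
    registry.register({ pid, category: child.category, taskRunId: child.taskRunId, command: child.command });
    newPids.push(pid);
  });
  return newPids;
}

const registry = new ProcessRegistry();
registry.register({ pid: 1, category: "agent", taskRunId: "run-1", command: "agent" });
registry.register({ pid: 2, category: "child", taskRunId: "run-1", command: "tail -f build.log" });
registry.register({ pid: 3, category: "child", taskRunId: "run-2", command: "tail -f other.log" });

let nextFakePid = 500;
const restartedPids = reestablishChildren(registry, "run-1", () => nextFakePid++);

console.log(restartedPids, registry.childrenOf("run-1"), registry.childrenOf("run-2"));
```

Because the registry is keyed by task run, re-establishing children for one run leaves other runs' entries untouched.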

Labels: Bug