Sub-Agent task_complete Bug: Race Condition Causes Response Loss
Summary
When sub-agents (spawned via the task tool) are instructed to use the task_complete tool, their responses are not captured by the parent agent, resulting in "General-purpose agent did not produce a response." This is a race condition in the sub-agent framework that affects all Claude models (Opus 4.6, Sonnet 4, Sonnet 4.5, Haiku 4.5), not just Opus 4.6.
Root Cause
The Copilot CLI sub-agent framework extracts responses from the sub-agent's text output stream, not from the task_complete summary. When a sub-agent calls task_complete, it terminates the response stream. This creates a race condition:
Happy path (works):
- Sub-agent emits text response
- Sub-agent calls
task_complete
- Parent captures text before stream terminates
- ✅ Response returned successfully
Bug path (fails):
- Sub-agent calls
task_complete as first/only action
- Response stream terminates immediately
- Parent receives empty/null response
- ❌ "General-purpose agent did not produce a response"
Evidence
Experimental Results
Spawned sub-agents with identical tasks, varying only whether they're told to use task_complete:
| Model |
With task_complete |
Without task_complete |
| claude-opus-4.6 |
6/13 success (46%) |
2/2 success (100%) |
| claude-sonnet-4.5 |
0/1 fail |
— |
| claude-sonnet-4 |
0/1 fail |
1/1 success |
| claude-haiku-4.5 |
0/1 fail |
— |
| claude-opus-4.5 |
1/1 success |
— |
| gpt-5.1 |
1/1 success |
— |
Key findings:
- Bug affects all Claude models as sub-agents, not just Opus 4.6
- Failure is non-deterministic (46% success rate for Opus 4.6)
- GPT models appear unaffected
- Without task_complete instruction: 100% success rate
Non-Deterministic Behavior
Prompt intensity matters:
- Emphatic prompts ("CRITICAL MUST use task_complete ONLY") → 100% failure
- Moderate prompts ("Do X and use task_complete") → ~46% success for Opus 4.6
Why? The more the prompt emphasizes task_complete, the more likely the model calls it before or instead of emitting text, triggering the race condition.
Reproduction
Guaranteed Failure (100%)
task(
agent_type="general-purpose",
prompt="""
Your ONLY job is to call task_complete with summary "Test".
CRITICAL: Use task_complete tool. Do not write any other response.
""",
mode="sync",
model="claude-opus-4.6"
)
# Result: "General-purpose agent did not produce a response"
Non-Deterministic Failure (~54%)
task(
agent_type="general-purpose",
prompt="""
Count files in scripts/ directory.
IMPORTANT: You MUST use task_complete when done.
""",
mode="sync",
model="claude-opus-4.6"
)
# Result: Sometimes works, sometimes "did not produce a response"
Guaranteed Success (100%)
task(
agent_type="general-purpose",
prompt="""
Count files in scripts/ directory.
DO NOT USE task_complete. Return your findings directly.
""",
mode="sync",
model="claude-opus-4.6"
)
# Result: Always returns response
Impact
Affected:
- All Claude model sub-agents (Opus 4.6, Sonnet 4/4.5, Haiku 4.5)
- Both sync and background modes
- Parent agents cannot reliably use Claude sub-agents if prompts mention task_complete
Not affected:
- GPT models (gpt-5.1, gpt-5.2-codex)
- Claude Opus 4.5 (appears more reliable but not tested extensively)
- Direct agent usage (non-sub-agents)
Workaround
For sub-agent prompts: Never instruct sub-agents to use task_complete. The parent framework already handles completion signaling. Instruct them to return findings directly:
task(
agent_type="general-purpose",
prompt="""
Analyze X and report findings.
DO NOT USE task_complete tool. Return your response text directly.
""",
mode="sync",
model="claude-opus-4.6"
)
Success rate: 100% with this workaround
Recommended Fix
The framework should:
- Buffer text output before allowing
task_complete to terminate the stream
- Ensure response serialization completes before terminating agent context
- Consider making
task_complete a soft signal rather than immediate termination
- Or: Auto-suppress
task_complete for sub-agents since parent framework already tracks completion
The fundamental issue is that task_complete terminates execution before ensuring the response text is captured, creating an avoidable race condition.
Environment
- Copilot CLI version: 0.0.405
- OS: macOS (Darwin)
- Affected models: All Claude models (tested: Opus 4.6, Sonnet 4, Sonnet 4.5, Haiku 4.5)
- Working models: GPT-5.1, GPT-5.2-Codex
Related
This issue was initially reported as Opus 4.6-specific, but debugging revealed it affects all Claude models when used as sub-agents. The non-deterministic nature made it appear model-specific initially.
Sub-Agent task_complete Bug: Race Condition Causes Response Loss
Summary
When sub-agents (spawned via the
tasktool) are instructed to use thetask_completetool, their responses are not captured by the parent agent, resulting in "General-purpose agent did not produce a response." This is a race condition in the sub-agent framework that affects all Claude models (Opus 4.6, Sonnet 4, Sonnet 4.5, Haiku 4.5), not just Opus 4.6.Root Cause
The Copilot CLI sub-agent framework extracts responses from the sub-agent's text output stream, not from the
task_completesummary. When a sub-agent callstask_complete, it terminates the response stream. This creates a race condition:Happy path (works):
task_completeBug path (fails):
task_completeas first/only actionEvidence
Experimental Results
Spawned sub-agents with identical tasks, varying only whether they're told to use
task_complete:Key findings:
Non-Deterministic Behavior
Prompt intensity matters:
Why? The more the prompt emphasizes
task_complete, the more likely the model calls it before or instead of emitting text, triggering the race condition.Reproduction
Guaranteed Failure (100%)
Non-Deterministic Failure (~54%)
Guaranteed Success (100%)
Impact
Affected:
Not affected:
Workaround
For sub-agent prompts: Never instruct sub-agents to use
task_complete. The parent framework already handles completion signaling. Instruct them to return findings directly:Success rate: 100% with this workaround
Recommended Fix
The framework should:
task_completeto terminate the streamtask_completea soft signal rather than immediate terminationtask_completefor sub-agents since parent framework already tracks completionThe fundamental issue is that
task_completeterminates execution before ensuring the response text is captured, creating an avoidable race condition.Environment
Related
This issue was initially reported as Opus 4.6-specific, but debugging revealed it affects all Claude models when used as sub-agents. The non-deterministic nature made it appear model-specific initially.