fix(ai): preserve parallel tool-call identity in transcript replay #1068

Open
dylan-savage wants to merge 1 commit into develop from fix/deepagent-parallel-toolcall-replay

dylan-savage (Collaborator) commented Jun 2, 2026

Problem

langchain_messages_to_transcript (ai.common.utils.agent_tools) renders prior conversation history into the plain-text transcript that is replayed to the LLM on every turn.

When an assistant turn issued parallel tool calls, the function:

  • flattened them into separate single-call tool_call lines, and
  • dropped both the per-call id and the ToolMessage.tool_call_id.

So on the next turn the replayed transcript had lost the call→result pairing. The model could mis-attribute results — most acutely when several parallel calls hit the same tool (e.g. task for agent_deepagent subagent fan-out, where every result line read identically as tool[task]: ...).
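To make the ambiguity concrete, here is a minimal sketch of the lossy rendering described above. It is a reconstruction under stated assumptions, not the code that was in `agent_tools.py`: the helper name `render_flattened`, the `assistant: ` line prefix, the dict-shaped inputs, and the `subagent` argument are all hypothetical.

```python
import json


def render_flattened(tool_calls, tool_results):
    """Render calls and results the lossy way: no ids on either side (hypothetical helper)."""
    lines = []
    for call in tool_calls:
        # One singular envelope per call; the per-call id is dropped.
        envelope = {"type": "tool_call", "name": call["name"], "args": call["args"]}
        lines.append("assistant: " + json.dumps(envelope))
    for result in tool_results:
        # The label carries only the tool name, not the tool_call_id.
        lines.append(f"tool[{result['name']}]: {result['content']}")
    return lines


# Two parallel calls to the same tool; results arrive in a different order.
calls = [
    {"id": "call_a", "name": "task", "args": {"subagent": "researcher"}},
    {"id": "call_b", "name": "task", "args": {"subagent": "writer"}},
]
results = [
    {"tool_call_id": "call_b", "name": "task", "content": "draft ready"},
    {"tool_call_id": "call_a", "name": "task", "content": "sources found"},
]

lines = render_flattened(calls, results)
labels = [line.split(":", 1)[0] for line in lines if line.startswith("tool[")]
print(labels)  # ['tool[task]', 'tool[task]']
```

Both result labels are identical, so nothing in the replayed text says which `task` call produced which result.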

This addresses the CodeRabbit finding (severity: Major) flagged as a follow-up to the concurrent fan-out work in #990. The first-turn execution path was never affected, because it runs on live message objects that still carry their ids; the drift appeared only on multi-turn replay, which is why #990's tests passed.

Fix

In langchain_messages_to_transcript:

  • ToolMessage now renders tool[name#tool_call_id], pinning each result to the call that produced it. Degrades to tool[name] when no tool_call_id is present (no crash, no dangling #).
  • AIMessage preserves the original grouping and per-call ids:
    • a single call emits the singular {"type":"tool_call","id":...,"name":...,"args":...} envelope;
    • multiple parallel calls emit one plural {"type":"tool_calls","calls":[...]} envelope — the same shape the model emitted — instead of being flattened into separate single-call lines.
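The two rules above can be sketched as follows. This is an illustrative approximation, not the implementation in `langchain_messages_to_transcript`: the `AIMsg` and `ToolMsg` dataclasses stand in for the LangChain message types, and the `assistant: ` line prefix, the helper names, and the exact fields inside each entry of `calls` are assumptions.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AIMsg:  # hypothetical stand-in for an assistant message
    content: str = ""
    tool_calls: list = field(default_factory=list)


@dataclass
class ToolMsg:  # hypothetical stand-in for a tool result message
    name: str
    content: str
    tool_call_id: Optional[str] = None


def tool_label(msg: ToolMsg) -> str:
    """Pin the result to its call; fall back to tool[name] when no id is present."""
    if msg.tool_call_id:
        return f"tool[{msg.name}#{msg.tool_call_id}]"
    return f"tool[{msg.name}]"


def tool_call_envelope(tool_calls: list) -> str:
    """Emit one envelope per assistant turn, keeping grouping and per-call ids."""
    calls = [
        {"id": c.get("id"), "name": c["name"], "args": c.get("args", {})}
        for c in tool_calls
    ]
    if len(calls) == 1:
        return json.dumps({"type": "tool_call", **calls[0]})
    return json.dumps({"type": "tool_calls", "calls": calls})


def render(messages: list) -> str:
    lines = []
    for msg in messages:
        if isinstance(msg, AIMsg):
            if msg.content:
                lines.append("assistant: " + msg.content)
            if msg.tool_calls:
                lines.append("assistant: " + tool_call_envelope(msg.tool_calls))
        elif isinstance(msg, ToolMsg):
            lines.append(f"{tool_label(msg)}: {msg.content}")
    return "\n".join(lines)


transcript = render([
    AIMsg(tool_calls=[
        {"id": "call_a", "name": "task", "args": {"q": "alpha"}},
        {"id": "call_b", "name": "task", "args": {"q": "beta"}},
    ]),
    ToolMsg(name="task", content="alpha done", tool_call_id="call_a"),
    ToolMsg(name="task", content="beta done", tool_call_id="call_b"),
    ToolMsg(name="search", content="ok"),
])
print(transcript)
```

In this sketch the two `task` results render as `tool[task#call_a]` and `tool[task#call_b]`, and the id-less `search` result falls back to `tool[search]: ok`.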

Tests

Adds 3 regression tests to TestLangchainMessagesToTranscript in packages/ai/tests/ai/common/utils/test_agent_tools.py:

  • test_single_tool_call_round_trips_with_id — singular envelope carries its id; result pinned to tool[search#call_solo].
  • test_parallel_calls_preserve_grouping_and_per_call_identity — two parallel calls to the same tool keep distinct ids on both call and result; exactly one plural envelope, zero flattened single-call lines.
  • test_missing_tool_call_id_degrades_gracefully — renders tool[search]: ok.
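The kind of assertions these tests make can be illustrated against a hand-written transcript. The sketch below is not taken from `test_agent_tools.py`: the `summarize` helper, the `assistant: ` prefix, the regular expression, and the sample ids and arguments are assumptions chosen to match the formats described in this PR.

```python
import json
import re

# Matches tool[name]: ... and tool[name#call_id]: ... result lines.
TOOL_LINE = re.compile(
    r"^tool\[(?P<name>[^\]#]+)(?:#(?P<call_id>[^\]]+))?\]: (?P<content>.*)$"
)


def summarize(transcript: str):
    """Collect envelope types, call ids, and result ids from a rendered transcript."""
    envelope_types, call_ids, result_ids = [], [], []
    for line in transcript.splitlines():
        if line.startswith("assistant: {"):
            envelope = json.loads(line[len("assistant: "):])
            envelope_types.append(envelope["type"])
            # A plural envelope nests its calls; a singular one is the call itself.
            calls = envelope["calls"] if envelope["type"] == "tool_calls" else [envelope]
            call_ids.extend(call.get("id") for call in calls)
            continue
        match = TOOL_LINE.match(line)
        if match:
            result_ids.append(match.group("call_id"))
    return envelope_types, call_ids, result_ids


SAMPLE = "\n".join([
    'assistant: {"type": "tool_calls", "calls": ['
    '{"id": "call_a", "name": "task", "args": {"q": "alpha"}}, '
    '{"id": "call_b", "name": "task", "args": {"q": "beta"}}]}',
    "tool[task#call_a]: alpha done",
    "tool[task#call_b]: beta done",
])

types, call_ids, result_ids = summarize(SAMPLE)
assert types == ["tool_calls"]          # exactly one plural envelope
assert types.count("tool_call") == 0    # no flattened single-call lines
assert call_ids == ["call_a", "call_b"]  # distinct per-call ids survive
assert result_ids == call_ids           # each result is pinned to its call
```

Parsing the transcript back, rather than matching substrings, checks grouping and pairing together, which is the property the regression tests are meant to guard.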

RED→GREEN verified: with the source fix stashed, the two identity tests fail against develop (the failure output shows the two flattened tool_call lines with no ids); restoring the fix makes all pass.

Verification

  • packages/ai/tests/ai/common/utils/test_agent_tools.py — 27 passed
  • packages/ai/tests/ai/common/utils/ — 178 passed
  • packages/ai/tests/ — 1181 passed, 120 skipped (one pre-existing collection error in test_task_scheduler.py from a missing time_machine dependency, unrelated to this change)
  • ruff check + ruff format clean on both files

Notes

The fix lives in the shared util now (develop moved these helpers out of deepagent.py into ai.common.utils.agent_tools after #990 merged), so every consumer of the transcript renderer benefits, not just agent_deepagent.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Tool call results now properly associate with their source calls through unique identifiers, improving accuracy in multi-tool scenarios.
    • Enhanced formatting of parallel tool calls using structured JSON envelopes while preserving call grouping.
  • Tests

    • Added regression tests verifying tool result pairing, parallel call handling, and graceful degradation when identifiers are missing.

`langchain_messages_to_transcript` (ai.common.utils.agent_tools) renders
prior conversation history into the plain-text transcript replayed to the
LLM on every turn. When an assistant turn issued *parallel* tool calls it
flattened them into separate single-call `tool_call` lines and dropped both
the per-call `id` and the `ToolMessage.tool_call_id`, so the replayed turn
lost the call->result pairing. The model could then mis-attribute results,
especially when several parallel calls hit the *same* tool (e.g. `task` for
agent_deepagent subagent fan-out).

Fix:
- ToolMessage now renders `tool[name#tool_call_id]`, pinning each result to
  the call that produced it (degrades to `tool[name]` when no id).
- AIMessage preserves the original grouping and per-call ids: a single call
  emits the singular envelope; multiple parallel calls emit ONE plural
  `tool_calls` envelope (the shape the model emitted) instead of being
  flattened into separate single-call lines.

Adds 3 regression tests to TestLangchainMessagesToTranscript covering
single-call round-trip, parallel same-tool identity, and graceful degrade
on a missing tool_call_id. Follow-up to the concurrent fan-out work in #990
(CodeRabbit Major).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai Bot (Contributor) commented Jun 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e3c825a7-b7b0-42ae-a642-be2154966ea8

📥 Commits

Reviewing files that changed from the base of the PR and between eae941c and f514a8e.

📒 Files selected for processing (2)
  • packages/ai/src/ai/common/utils/agent_tools.py
  • packages/ai/tests/ai/common/utils/test_agent_tools.py

📝 Walkthrough

This PR enhances LangChain message transcript rendering to preserve tool call identity. ToolMessage now incorporates tool_call_id into role labels for result pairing, and AIMessage tool calls are serialized as grouped JSON envelopes (tool_call for one call, tool_calls with array for multiple) instead of separate lines.

Changes

Tool Call Identity in Message Transcripts

| Layer / File(s) | Summary |
| --- | --- |
| Tool message identity extraction (`packages/ai/src/ai/common/utils/agent_tools.py`) | ToolMessage handling extracts `tool_call_id` to construct labeled role identifiers such as `tool[name#call_id]`, ensuring tool results map to their originating calls. |
| Tool call grouped envelope serialization (`packages/ai/src/ai/common/utils/agent_tools.py`) | AIMessage tool calls now render as a single JSON envelope: `{"type":"tool_call","id":...,"name":...,"args":...}` for one call or `{"type":"tool_calls","calls":[...]}` for multiple parallel calls, with each call including `id`, `name`, and `args`. |
| Mock fixture and transcript identity tests (`packages/ai/tests/ai/common/utils/test_agent_tools.py`) | ToolMessage mock fixture extended to accept and store `tool_call_id`; three regression tests verify single-call ID round-tripping, parallel call grouping with per-call identity preservation, and graceful handling of a missing `tool_call_id`. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Tool calls now wear their IDs with pride,
In envelopes grouped, no longer split wide,
One call per label, or many in arrays,
Each matched to results in transcript displays! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 62.50%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main fix: preserving parallel tool-call identity in transcript replay, which directly aligns with the core change of incorporating tool_call_id into ToolMessage roles and preserving AIMessage tool-call grouping. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


github-actions Bot added the module:ai (AI/ML modules) label Jun 2, 2026

github-actions Bot commented Jun 2, 2026

No description provided.
