fix(ai): preserve parallel tool-call identity in transcript replay#1068
fix(ai): preserve parallel tool-call identity in transcript replay#1068dylan-savage wants to merge 1 commit into
Conversation
`langchain_messages_to_transcript` (ai.common.utils.agent_tools) renders prior conversation history into the plain-text transcript replayed to the LLM on every turn. When an assistant turn issued *parallel* tool calls it flattened them into separate single-call `tool_call` lines and dropped both the per-call `id` and the `ToolMessage.tool_call_id`, so the replayed turn lost the call->result pairing. The model could then mis-attribute results, especially when several parallel calls hit the *same* tool (e.g. `task` for agent_deepagent subagent fan-out). Fix: - ToolMessage now renders `tool[name#tool_call_id]`, pinning each result to the call that produced it (degrades to `tool[name]` when no id). - AIMessage preserves the original grouping and per-call ids: a single call emits the singular envelope; multiple parallel calls emit ONE plural `tool_calls` envelope (the shape the model emitted) instead of being flattened into separate single-call lines. Adds 3 regression tests to TestLangchainMessagesToTranscript covering single-call round-trip, parallel same-tool identity, and graceful degrade on a missing tool_call_id. Follow-up to the concurrent fan-out work in #990 (CodeRabbit Major). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR enhances LangChain message transcript rendering to preserve tool call identity. ChangesTool Call Identity in Message Transcripts
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
No description provided. |
Problem
langchain_messages_to_transcript(ai.common.utils.agent_tools) renders prior conversation history into the plain-text transcript that is replayed to the LLM on every turn.When an assistant turn issued parallel tool calls, the function:
tool_calllines, andidand theToolMessage.tool_call_id.So on the next turn the replayed transcript had lost the call→result pairing. The model could mis-attribute results — most acutely when several parallel calls hit the same tool (e.g.
taskforagent_deepagentsubagent fan-out, where every result line read identically astool[task]: ...).This is the CodeRabbit Major flagged as a follow-up to the concurrent fan-out work in #990. The first-turn execution path was never affected (it runs on live id-bearing message objects); the drift only appeared on multi-turn replay, which is why #990's tests passed.
Fix
In
langchain_messages_to_transcript:tool[name#tool_call_id], pinning each result to the call that produced it. Degrades totool[name]when notool_call_idis present (no crash, no dangling#).{"type":"tool_call","id":...,"name":...,"args":...}envelope;{"type":"tool_calls","calls":[...]}envelope — the same shape the model emitted — instead of being flattened into separate single-call lines.Tests
Adds 3 regression tests to
TestLangchainMessagesToTranscriptinpackages/ai/tests/ai/common/utils/test_agent_tools.py:test_single_tool_call_round_trips_with_id— singular envelope carries itsid; result pinned totool[search#call_solo].test_parallel_calls_preserve_grouping_and_per_call_identity— two parallel calls to the same tool keep distinct ids on both call and result; exactly one plural envelope, zero flattened single-call lines.test_missing_tool_call_id_degrades_gracefully— renderstool[search]: ok.RED→GREEN verified: with the source fix stashed, the two identity tests fail against develop (the failure output shows the two flattened
tool_calllines with no ids); restoring the fix makes all pass.Verification
packages/ai/tests/ai/common/utils/test_agent_tools.py— 27 passedpackages/ai/tests/ai/common/utils/— 178 passedpackages/ai/tests/— 1181 passed, 120 skipped (one pre-existing collection error intest_task_scheduler.pyfrom a missingtime_machinedependency, unrelated to this change)ruff check+ruff formatclean on both filesNotes
The fix lives in the shared util now (develop moved these helpers out of
deepagent.pyintoai.common.utils.agent_toolsafter #990 merged), so every consumer of the transcript renderer benefits, not justagent_deepagent.🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests