Problem Statement
After the recent ai.common.utils consolidation, three duplicates remain in the node-side code.
_tool_call_protocol_prompt is byte-identical between nodes/src/nodes/agent_langchain/langchain.py:322 and nodes/src/nodes/agent_deepagent/deepagent.py:472.
_parse_tool_call_envelope is a near-duplicate between the same two drivers (langchain.py:338, deepagent.py:495). The deepagent version is strictly better — it uses _extract_first_json_object to tolerate trailing prose, fenced markdown, and stacked JSON objects.
_build_highlight_config is byte-identical in two files of a single node: nodes/src/nodes/index_search/opensearch_client.py:50 and elasticsearch_store.py:52.
None of the four helpers involved (the three duplicated functions above plus _extract_first_json_object) has direct unit tests today.
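To illustrate what "tolerating trailing prose, fenced markdown, and stacked JSON objects" involves, here is a minimal sketch of a balanced-brace walker. It is not the code from deepagent.py, whose body is not shown in this issue. The return type (the matched substring, or None) and the retry-on-invalid-candidate behaviour are assumptions made for illustration.

```python
import json
from typing import Optional


def extract_first_json_object(text: str) -> Optional[str]:
    """Return the first balanced {...} substring that parses as JSON, or None.

    Illustrative sketch only; the real helper may differ in signature and details.
    """
    start = text.find("{")
    while start != -1:
        depth = 0
        in_string = False
        escaped = False
        for i in range(start, len(text)):
            ch = text[i]
            if in_string:
                # Inside a JSON string, braces are data, not structure.
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    candidate = text[start : i + 1]
                    try:
                        json.loads(candidate)
                        return candidate
                    except json.JSONDecodeError:
                        break  # balanced but not JSON; try the next "{"
        start = text.find("{", start + 1)
    return None


if __name__ == "__main__":
    noisy = 'Sure! {"tool": "search", "args": {"q": "cats"}} Hope that helps.'
    try:
        json.loads(noisy)  # strict parsing rejects the surrounding prose
    except json.JSONDecodeError as exc:
        print("strict json.loads failed:", exc.msg)
    print("walker found:", extract_first_json_object(noisy))
```

A strict json.loads call on the same noisy string raises, which is the case where the current langchain.py parser is described as returning None.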
Proposed Solution
Move _tool_call_protocol_prompt, _parse_tool_call_envelope, and _extract_first_json_object into a new packages/ai/src/ai/common/agent/_internal/protocol.py. Public names: build_tool_call_protocol_prompt, parse_tool_call_envelope, extract_first_json_object. Keep the trio together — they describe one protocol (LLM emits a JSON envelope; we parse it).
Move _build_highlight_config into a new ai.common.index namespace at packages/ai/src/ai/common/index/highlight.py, with public name build_highlight_config. The new namespace anticipates future index-related shared helpers.
Add unit tests next to each consolidated helper. For the envelope parser, include cases for trailing prose, fenced markdown, stacked JSON, escaped quotes, malformed input, and the LangChain-compat additional_kwargs fallback path. For the walker (extract_first_json_object), add direct tests of balanced-brace edge cases.
Behaviour changes accepted for agent_langchain: the unified envelope parser uses the smarter walker (parses noisy LLM output that previously returned None), and tool-call IDs use uuid.uuid4().hex[:12] instead of id(obj). Call IDs are opaque tokens; the format change is invisible to the model.
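The accepted behaviour changes can be sketched roughly as follows. The envelope shape used here ({"tool": ..., "args": {...}}), the returned dictionary keys, and the helper names first_json_object and new_call_id are hypothetical, since the issue does not spell them out. The sketch also swaps in the standard library's json.JSONDecoder.raw_decode as a compact stand-in for the balanced-brace walker, and it does not cover the additional_kwargs fallback path.

```python
import json
import uuid
from typing import Any, Dict, Optional

_DECODER = json.JSONDecoder()


def first_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Stand-in for the walker: decode the first JSON object found in text."""
    idx = text.find("{")
    while idx != -1:
        try:
            # raw_decode parses one JSON value starting at idx and ignores
            # whatever follows it (trailing prose, a second object, a fence).
            obj, _end = _DECODER.raw_decode(text, idx)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        idx = text.find("{", idx + 1)
    return None


def new_call_id() -> str:
    """Opaque 12-character call ID, as proposed in place of id(obj)."""
    return uuid.uuid4().hex[:12]


def parse_tool_call_envelope(text: str) -> Optional[Dict[str, Any]]:
    """Parse a hypothetical {"tool": ..., "args": {...}} envelope from LLM output."""
    payload = first_json_object(text)
    if payload is None or not isinstance(payload.get("tool"), str):
        return None
    args = payload.get("args", {})
    if not isinstance(args, dict):
        return None
    return {"id": new_call_id(), "name": payload["tool"], "args": args}


if __name__ == "__main__":
    reply = 'I will look that up. {"tool": "search", "args": {"q": "cats"}} Done.'
    print(parse_tool_call_envelope(reply))
    print(parse_tool_call_envelope("no tool call here"))
```

Because the ID is random rather than derived from an object's memory address, two parses of the same reply yield different IDs, which is consistent with treating call IDs as opaque tokens.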
Alternatives Considered
Placing all three in ai.common.utils.agent_tools (the namespace established by the parent PR). Rejected — the agent-protocol trio is conceptually part of the agent module, not generic utilities.
Splitting the parser and the walker into different modules. Rejected — the parser depends on the walker and they share one purpose.
Keeping LangChain's strict json.loads parsing and id(obj) call IDs via a strict_json / id_format parameter. Rejected as over-engineering — the deepagent behaviour is strictly better.
Putting build_highlight_config inside index_search/ (the only current user). Rejected in anticipation of future Elasticsearch/OpenSearch-related nodes that may need the same helper.
Affected Modules