Summary
_tool_call_regex in mlx_lm/tool_parsers/gemma4.py (line 11) fails to match, or truncates, tool calls when <|"|>-delimited string arguments contain unbalanced { or }. This is common in code snippets, partial HTML, CSS, and JS templates passed as tool arguments.
Current behavior
import regex as re
pat = re.compile(r"call:(\w+)(\{(?:[^{}]|(?2))*\})", re.DOTALL)
# Unbalanced opening brace in string → NO MATCH (tool call silently lost)
pat.search('call:write_file{content:<|"|>if (x) { console.log(y)<|"|>}')
# → None
# Unbalanced closing brace in string → TRUNCATED
m = pat.search('call:write_file{content:<|"|>test } only<|"|>}')
# → m.group(2) == '{content:<|"|>test }' (value cut at first })
Expected behavior
Both should match the full tool call. { and } inside <|"|>...<|"|> are string content, not structural braces.
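One way to make this expectation concrete is a small hand-written scanner that skips over <|"|>...<|"|> spans while counting braces. The sketch below is only an illustration of the expected behavior, not the mlx_lm implementation: find_tool_call, DELIM, and _CALL_START are hypothetical names, and it assumes a string always ends at the next <|"|> delimiter.

```python
import re

DELIM = '<|"|>'  # string delimiter used in the examples above
_CALL_START = re.compile(r"call:(\w+)\{")  # hypothetical helper pattern


def find_tool_call(text: str):
    """Return (name, args) for the first complete call, or None.

    Braces inside DELIM-delimited strings are treated as content.
    """
    for m in _CALL_START.finditer(text):
        depth = 1
        i = m.end()  # first character after the opening brace
        while i < len(text):
            if text.startswith(DELIM, i):
                # Jump to just past the closing delimiter.
                end = text.find(DELIM, i + len(DELIM))
                if end == -1:
                    break  # unterminated string: candidate is incomplete
                i = end + len(DELIM)
                continue
            ch = text[i]
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    # Include the outer braces, like group(2) of the regex.
                    return m.group(1), text[m.end() - 1 : i + 1]
            i += 1
    return None


open_case = 'call:write_file{content:<|"|>if (x) { console.log(y)<|"|>}'
close_case = 'call:write_file{content:<|"|>test } only<|"|>}'
print(find_tool_call(open_case))
print(find_tool_call(close_case))
```

Both inputs from "Current behavior" come back whole with this approach. An alternative that stays inside the regex would be a branch that consumes an entire <|"|>...<|"|> span ahead of the [^{}] branch; that variant is not shown or tested here.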
Root cause
Line 11:
_tool_call_regex = re.compile(r"call:(\w+)(\{(?:[^{}]|(?2))*\})", re.DOTALL)
[^{}] excludes all braces, and (?2) recursion handles balanced pairs. But <|"|>...<|"|> spans are not recognized — braces inside string delimiters are treated structurally. Balanced braces (e.g., CSS { color: red; }) pass by coincidence; unbalanced ones fail.
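The "pass by coincidence" point can be checked with a plain brace counter. The following sketch uses a hypothetical final_depth helper (not part of mlx_lm) that counts braces either naively, as the current regex effectively does, or while skipping <|"|>...<|"|> spans.

```python
DELIM = '<|"|>'  # string delimiter used in the examples above


def final_depth(args: str, skip_strings: bool) -> int:
    """Return the brace depth after scanning args; 0 means balanced."""
    depth = 0
    i = 0
    while i < len(args):
        if skip_strings and args.startswith(DELIM, i):
            # Jump past the closing delimiter so string content is not counted.
            end = args.find(DELIM, i + len(DELIM))
            if end == -1:
                break  # unterminated string: stop scanning
            i = end + len(DELIM)
            continue
        if args[i] == "{":
            depth += 1
        elif args[i] == "}":
            depth -= 1
        i += 1
    return depth


css = '{c:<|"|>.a { color: red; }<|"|>}'       # braces in string are balanced
js = '{c:<|"|>if (x) { console.log(y)<|"|>}'   # unbalanced opening brace
cut = '{c:<|"|>test } only<|"|>}'              # unbalanced closing brace

for sample in (css, js, cut):
    print(final_depth(sample, skip_strings=False),
          final_depth(sample, skip_strings=True))
```

Counted naively, only the CSS sample ends at depth 0; the other two end at 1 and -1. With string spans skipped, all three end at 0.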
Note: _gemma4_args_to_json() correctly handles <|"|> strings (lines 14-34), but it runs only on text the regex has already matched, so when the regex fails to match, _gemma4_args_to_json() never gets a chance to run.
Reproduction
import regex as re
pat = re.compile(r"call:(\w+)(\{(?:[^{}]|(?2))*\})", re.DOTALL)
# PASSES (balanced by coincidence):
assert pat.search('call:f{c:<|"|><style>.a { color: red; }</style><|"|>}')
# FAILS (unbalanced open brace):
assert pat.search('call:f{c:<|"|>if (x) { console.log(y)<|"|>}') is not None
# FAILS (unbalanced close brace — truncated):
m = pat.search('call:f{c:<|"|>test } only<|"|>}')
assert m and '<|"|>test } only<|"|>' in m.group(2)
Environment
regex module