Summary
The client loop in DeepAnalyzeVLLM.generate() breaks whenever the model response contains no <Code> block, even when it also contains no <Answer>. If the model returns only an <Analyze> block, which can happen, the client stops without executing any code.
This is a client-side loop-termination bug or design flaw, not a server/vLLM issue.
Where
File: deepanalyze.py
# Buggy termination condition: lines 118–120.
code_match = re.search(r"<Code>(.*?)</Code>", ans, re.DOTALL)
if not code_match or "<Answer>" in ans:
    break
Why this is a bug
- “No <Code>” does not mean “done.” <Analyze>-only responses are valid intermediate steps.
- Current logic can terminate before execution and can end without <Answer>.
Minimal reproduction
- Call generate() with a prompt expecting code execution.
- If the model returns only:
  <Analyze>
  I will proceed step-by-step.
  </Analyze>
  (no "<Code>", no "<Answer>", only "<Analyze>"),
- The loop exits immediately due to if not code_match ... break.
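The reproduction above can be checked without a running model by isolating the termination condition. This is a minimal sketch: `old_should_break` is a hypothetical helper name that wraps the condition quoted in this issue, and the `<Code>` sample string is an illustrative assumption rather than captured model output.

```python
import re


def old_should_break(ans: str) -> bool:
    """Hypothetical wrapper around the termination condition quoted in this issue."""
    code_match = re.search(r"<Code>(.*?)</Code>", ans, re.DOTALL)
    return (not code_match) or ("<Answer>" in ans)


# An <Analyze>-only response: no <Code>, no <Answer>.
analyze_only = "<Analyze>\nI will proceed step-by-step.\n</Analyze>"

# A response that carries code to execute (illustrative sample).
with_code = "<Analyze>\nLoad the data.\n</Analyze>\n<Code>\nprint(1)\n</Code>"

print(old_should_break(analyze_only))  # True: the loop stops before any execution
print(old_should_break(with_code))     # False: the loop continues
```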
Actual behavior
- Stops early when code_match is None
- No <Execute> output
- Can end without <Answer>
Expected behavior
- Stop only when <Answer> is present (done).
- Execute when <Code> is present.
- If neither <Answer> nor <Code> is present, do not terminate; retry / nudge for <Code>.
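The three expected outcomes can be written as one small decision function, which makes the intended precedence explicit (`<Answer>` first, then `<Code>`, otherwise nudge). This is only a sketch: `Action` and `next_action` are hypothetical names and do not exist in deepanalyze.py.

```python
import re
from enum import Enum

CODE_RE = re.compile(r"<Code>(.*?)</Code>", re.DOTALL)


class Action(Enum):
    """Hypothetical labels for what the client loop should do next."""
    STOP = "stop"        # model produced <Answer>: done
    EXECUTE = "execute"  # model produced <Code>: run it
    NUDGE = "nudge"      # neither: ask the model for <Code> and keep looping


def next_action(ans: str) -> Action:
    # <Answer> takes precedence, matching the condition in the reported code.
    if "<Answer>" in ans:
        return Action.STOP
    if CODE_RE.search(ans):
        return Action.EXECUTE
    return Action.NUDGE
```

With this split, an `<Analyze>`-only response maps to `Action.NUDGE` instead of ending the loop.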
Suggested minimal fix
- Break only on <Answer>.
- If <Code> is missing, append the assistant message and nudge the model to output <Code>, then continue.
Example patch:
if "<Answer>" in ans:
    break
code_match = re.search(r"<Code>(.*?)</Code>", ans, re.DOTALL)
if not code_match:
    messages.append({"role": "assistant", "content": ans})
    messages.append({
        "role": "user",
        "content": "Please provide a <Code> block to execute. Your last message contained no <Code>."
    })
    continue
Optional improvements
- Add "</Answer>" to stop list: stop=["</Code>", "</Answer>"]
- Always append ans to messages each round (even when no <Code>) for continuity/debugging.
- Add a small retry budget for malformed outputs to avoid infinite loops.
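One possible shape for the retry budget is sketched below. Everything outside the quoted regex and nudge text is an assumption: `run_loop`, `call_model`, `execute`, `max_rounds`, and `max_nudges` are hypothetical names, the model and executor are passed in as plain callables so the sketch runs without vLLM, and feeding execution output back as a user message wrapped in `<Execute>` is a guess at the protocol, not the behavior of deepanalyze.py.

```python
import re
from typing import Callable, Dict, List

Message = Dict[str, str]
NUDGE = "Please provide a <Code> block to execute. Your last message contained no <Code>."


def run_loop(
    call_model: Callable[[List[Message]], str],  # hypothetical: returns the model's reply
    execute: Callable[[str], str],               # hypothetical: runs code, returns output
    messages: List[Message],
    max_rounds: int = 10,
    max_nudges: int = 2,
) -> str:
    """Loop until <Answer>, allowing at most max_nudges consecutive replies without <Code>."""
    nudges = 0
    ans = ""
    for _ in range(max_rounds):
        ans = call_model(messages)
        # Always record the reply, even when it contains no <Code>.
        messages.append({"role": "assistant", "content": ans})
        if "<Answer>" in ans:
            break
        code_match = re.search(r"<Code>(.*?)</Code>", ans, re.DOTALL)
        if code_match:
            nudges = 0  # a well-formed round resets the budget
            output = execute(code_match.group(1))
            # Assumed feedback format; the real client may differ.
            messages.append({"role": "user", "content": f"<Execute>\n{output}\n</Execute>"})
            continue
        if nudges >= max_nudges:
            break  # budget spent: give up instead of looping forever
        nudges += 1
        messages.append({"role": "user", "content": NUDGE})
    return ans
```

The budget counts consecutive replies without `<Code>`, so a model that recovers after one nudge gets the full allowance again; `max_rounds` still bounds the total number of model calls.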
Diff
--- a/deepanalyze.py
+++ b/deepanalyze.py
@@ class DeepAnalyzeVLLM:
-            # Check for <Code> block
-            code_match = re.search(r"<Code>(.*?)</Code>", ans, re.DOTALL)
-            if not code_match or "<Answer>" in ans:
-                break
+            # Stop only when the model explicitly answers
+            if "<Answer>" in ans:
+                break
+
+            # Check for <Code> block (do NOT terminate if missing)
+            code_match = re.search(r"<Code>(.*?)</Code>", ans, re.DOTALL)
+            if not code_match:
+                # Model returned <Analyze>-only or malformed output.
+                # Preserve context and nudge it to emit a <Code> block.
+                messages.append({"role": "assistant", "content": ans})
+                messages.append({
+                    "role": "user",
+                    "content": "Please provide a <Code> block to execute. Your last message contained no <Code>."
+                })
+                continue