Commit 3ad308b

ehsan6sha and claude committed
runtime: tool responses use role=user (not role=tool) to match set_chat_template
Lab test after temperature drop showed the first turn works beautifully (XML tool_call emitted, diag/summary ran). Second turn (sending tool response back) returned rkllm_run=-1.

Cause: our session-level set_chat_template configured only the user prefix (<|im_start|>user\n) + postfix (<|im_end|>\n<|im_start|>assistant\n). When we sent role='tool', the runtime had no per-role template for 'tool' and rejected the call with return code -1.

Fix: send tool results as role='user' content=<tool_response>{name,result}</tool_response>. The runtime applies the user-role template (matching set_chat_template's prefix/postfix), and the <tool_response> wrapping matches the training-data format so the model semantically recognises it as the tool turn.

KV cache preserved via keep_history=1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
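The fix described in the message can be sketched as a small formatting helper. This is an illustration only, not the code from the commit: the function names are hypothetical, the JSON layout inside `<tool_response>` (keys `name` and `result`) is an assumption based on the `{name,result}` shorthand above, and the prefix/postfix strings are copied from the message with the escaped newlines read as real newlines.

```python
import json

# User-role template strings as quoted in the commit message (assumed to be
# ChatML-style, with "\n" meaning a real newline).
USER_PREFIX = "<|im_start|>user\n"
USER_POSTFIX = "<|im_end|>\n<|im_start|>assistant\n"


def wrap_tool_response(name, result):
    """Wrap one tool result in <tool_response> tags.

    The JSON layout (keys "name" and "result") is an assumption.
    """
    payload = json.dumps({"name": name, "result": result}, sort_keys=True)
    return f"<tool_response>{payload}</tool_response>"


def build_tool_turn(results):
    """Return (role, content) for the next generate() call.

    The role is "user" rather than "tool" because only the user role has a
    template configured in this session.
    """
    wrapped = [wrap_tool_response(name, result) for name, result in results]
    return "user", "\n".join(wrapped)


def render_user_turn(content):
    """Show what the prompt looks like once the user template is applied."""
    return f"{USER_PREFIX}{content}{USER_POSTFIX}"
```

Calling `build_tool_turn([("diag/summary", {"ok": True})])` yields a `"user"` role and a single wrapped payload; `render_user_turn` is only there to make the resulting prompt text visible, since in the real runtime the template is applied internally.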
1 parent d857c4d commit 3ad308b

1 file changed

Lines changed: 8 additions & 8 deletions

src/runtime/rkllm_runtime.py

@@ -1284,15 +1284,15 @@ async def run_troubleshoot(
 if ok and tc["tool"] == "diag/summary" and isinstance(result, dict):
     last_summary_payload = result

-# Queue the tool responses as the NEXT generate() call's
-# role="tool" content. The v1.2.3 runtime appends to the
-# existing KV cache (keep_history=1 set above) so the
-# model sees the prior assistant turn + this tool result.
-# Multiple tool responses concatenated with newlines —
-# the runtime templates the whole blob as one tool turn.
-next_role = "tool"
+# Queue the tool responses as the NEXT generate() call.
+# Use role="user" (NOT "tool") because our session's
+# set_chat_template configured user prefix/postfix only —
+# using role="tool" returns -1 from the runtime. The
+# <tool_response>{...}</tool_response> wrapping matches
+# the training format so the model sees it as the tool
+# turn semantically. KV cache preserved via keep_history=1.
+next_role = "user"
 next_content = "\n".join(tool_responses_for_context)
-# next_keep_history already set to 1 after first turn.
 history.append({
     "role": "tool",
     "content": next_content,

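To make the failure mode in this hunk concrete, here is a toy model of a runtime that only accepts roles with a registered template. `StubRuntime` and its methods are hypothetical stand-ins written for this illustration; they are not the RKLLM API, and the real runtime's internals may differ. The sketch only mirrors the behaviour the commit describes: an untemplated role returns -1, and `keep_history=1` keeps earlier turns in the cache.

```python
class StubRuntime:
    """Hypothetical stand-in for a chat runtime; NOT the RKLLM API."""

    def __init__(self):
        self.templates = {}  # role -> (prefix, postfix)
        self.kv_cache = []   # rendered turns kept across calls

    def set_chat_template(self, role, prefix, postfix):
        # Register a template for a single role.
        self.templates[role] = (prefix, postfix)

    def run(self, role, content, keep_history):
        # Reject roles with no template, as the commit reports for "tool".
        if role not in self.templates:
            return -1
        # keep_history=0 starts from an empty cache; 1 appends to it.
        if not keep_history:
            self.kv_cache.clear()
        prefix, postfix = self.templates[role]
        self.kv_cache.append(prefix + content + postfix)
        return 0


rt = StubRuntime()
rt.set_chat_template(
    "user", "<|im_start|>user\n", "<|im_end|>\n<|im_start|>assistant\n"
)

first = rt.run("user", "Why is the device offline?", keep_history=0)
rejected = rt.run("tool", "<tool_response>{}</tool_response>", keep_history=1)
accepted = rt.run("user", "<tool_response>{}</tool_response>", keep_history=1)
```

In this model `rejected` is -1 and leaves the cache untouched, while `accepted` is 0 and the cache holds both the original question and the tool result, which is the outcome the switch to `next_role = "user"` is aiming for.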