Commit 529fdd3

feat(router): openai-harmony bypass for gpt-oss tool calls
Closes #444 #455 #468 #480 #513. Replaces wire-text reconstruction of gpt-oss tool calls with vLLM/SGLang's structured-passthrough architecture: HarmonyStreamingRouter delegates to openai-harmony's StreamableParser, emits RouterEvent.tool_call as structured (name, arguments), engine surfaces GenerationOutput.tool_calls, routes bypass regex-based parsing via _parse_tool_calls_with_parser(structured_tool_calls=...). Streaming fast-path enforces tool_calls[*].index monotonicity across router chunks + parallel_tool_calls=false cap. 16 rounds of codex review, final round MERGE-SAFE: 4484 unit / 553 targeted / 3×3 stress matrix green.
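A minimal sketch of the index-monotonicity rule and the parallel_tool_calls=false cap described in the message above. The commit's actual streaming fast-path is not shown in this diff, so `ToolCallIndexer` and its delta shape are hypothetical names for illustration, modeled on the OpenAI-style streaming `tool_calls[*].index` field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCallIndexer:
    """Hypothetical helper: hands out strictly increasing tool-call indices.

    One instance is assumed to live for the whole response, so indices keep
    increasing across router chunks instead of restarting at 0 per chunk.
    """

    parallel_tool_calls: bool = True
    next_index: int = 0

    def assign(self, name: str, arguments: str) -> Optional[dict]:
        # With parallel_tool_calls=False, only the first call is surfaced;
        # any later structured call is dropped (the "cap").
        if not self.parallel_tool_calls and self.next_index >= 1:
            return None
        delta = {
            "index": self.next_index,
            "type": "function",
            "function": {"name": name, "arguments": arguments},
        }
        self.next_index += 1
        return delta
```

Keeping the counter on a per-response object, rather than recomputing it per chunk, is one simple way to guarantee that two tool calls arriving in different chunks never share an index.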
1 parent 7f45f0a commit 529fdd3

14 files changed

Lines changed: 2657 additions & 79 deletions

pyproject.toml

Lines changed: 7 additions & 0 deletions
@@ -50,6 +50,13 @@ dependencies = [
     # rapidserver Worker as the reverse-tunnel transport. Pure Python,
     # ~200 KB. Replaces the prior frpc Go binary download.
     "websockets>=12.0",
+    # openai-harmony — official harmony-protocol streaming parser. Used
+    # by ``HarmonyStreamingRouter`` (output_router_harmony.py) to route
+    # gpt-oss tool calls correctly (issue #513 / cluster #444 #455
+    # #468 #480). Same library vLLM and SGLang delegate to for gpt-oss
+    # tool calling. Soft-imported at runtime; the legacy custom state
+    # machine remains the fallback if the dep is unavailable.
+    "openai-harmony>=0.0.6",
 ]
 
 [project.optional-dependencies]
