xinference v2.7.0+vllm0.18.1: sequential tool-call chain is interrupted

### System Info

python 3.10

### Running Xinference with Docker?

- [ ] docker
- [x] pip install
- [ ] installation from source

### Version info

xinference 2.7.0 + vllm 0.18.1, deploying Qwen3.6-35B-A3B-FP8

### The command used to start Xinference

Qwen3.6-35B-A3B-FP8 was launched from the xinference web UI, with the virtual environment option set to False.

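### Reproduction

Tool calling itself works, but our scenario, generating a PPT from an outline, is slightly unusual, and one problem shows up at runtime:
The outline-to-PPT scenario uses four MCP tools. It calls `think` first (which specifies that `initialize_design` runs next), then `initialize_design` (which specifies that `insert_page` runs next), then `insert_page` in a loop (each call specifies either `insert_page` or `finalize` as the next tool) to generate the PPT pages, and finally `finalize` to end the whole run.
With the same code flow, qwen3.5 deployed with vllm runs correctly. With qwen3.5 deployed with xinference, the run goes wrong: only one PPT page is generated, then the `finalize` tool is called and PPT generation ends.
The image below shows the result of my investigation: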
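The flow described above can be written down as a small transition table, which makes the difference between the two runs easier to state. This is only a sketch: `ALLOWED_NEXT`, `summarize_run`, and both traces are illustrative assumptions, not code or logs taken from the actual application.

```python
# Hypothetical model of the tool flow; the keys mirror the four MCP tools.
ALLOWED_NEXT = {
    "think": {"initialize_design"},
    "initialize_design": {"insert_page"},
    "insert_page": {"insert_page", "finalize"},
    "finalize": set(),
}


def summarize_run(calls):
    """Return (follows_flow, pages) for a recorded list of tool-call names."""
    # A run must start with think and end with finalize.
    follows_flow = bool(calls) and calls[0] == "think" and calls[-1] == "finalize"
    # Every consecutive pair must be an allowed transition.
    for current, nxt in zip(calls, calls[1:]):
        if nxt not in ALLOWED_NEXT.get(current, set()):
            follows_flow = False
    pages = calls.count("insert_page")
    return follows_flow, pages


# Illustrative traces (not captured logs): a three-page run and the early-finalize run.
expected = ["think", "initialize_design", "insert_page", "insert_page", "insert_page", "finalize"]
observed = ["think", "initialize_design", "insert_page", "finalize"]

print(summarize_run(expected))
print(summarize_run(observed))
```

Both traces are legal under these transitions and differ only in page count, so the failing run is an early `finalize` rather than an out-of-order call.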
<img width="867" height="232" alt="Image" src="https://github.com/user-attachments/assets/df18f143-51cb-4627-a545-f214708468b6" />

Does xinference apply any special handling in this situation?

### Expected behavior

The behavior should match vllm.
