fix(openai-compatible): preserve non-stream reasoning content by Epochex · Pull Request #3099 · langgenius/dify-official-plugins

Epochex · 2026-05-13T19:31:10Z

Summary

Related to #2945.

The OpenAI-compatible streaming path already normalizes both delta.reasoning and delta.reasoning_content into Dify's <think>...</think> format. The non-streaming chat response path still only used message.content, so vLLM/SGLang-style responses that put the reasoning trace in message.reasoning dropped that trace before Dify could render or filter it.
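For context, the streaming-side normalization described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: `normalize_stream_deltas` is a hypothetical name, deltas are assumed to be plain dicts, and the exact whitespace around the tags is a guess.

```python
from typing import Any, Iterable, Iterator, Mapping


def normalize_stream_deltas(deltas: Iterable[Mapping[str, Any]]) -> Iterator[str]:
    """Yield text chunks with reasoning deltas wrapped in <think>...</think>.

    Hypothetical helper: accepts either `reasoning` or `reasoning_content`
    on each delta and closes the block when answer content starts.
    """
    in_reasoning = False
    for delta in deltas:
        reasoning = delta.get("reasoning_content") or delta.get("reasoning")
        content = delta.get("content")
        if reasoning:
            if not in_reasoning:
                # First reasoning chunk opens the block.
                in_reasoning = True
                yield "<think>\n"
            yield reasoning
        if content:
            if in_reasoning:
                # First answer chunk closes the block.
                in_reasoning = False
                yield "\n</think>\n"
            yield content
    if in_reasoning:
        # Stream ended while still reasoning: close the block anyway.
        yield "\n</think>\n"
```

Joining the yielded chunks gives a single string in which the reasoning trace precedes the answer, which is the shape the non-streaming path was missing.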

This PR applies the same normalization to non-streaming chat responses:

- wraps message.reasoning and message.reasoning_content before the final answer
- leaves content that already starts with <think> unchanged to avoid double wrapping
- keeps existing tool call extraction and usage handling unchanged
- keeps existing thinking-disabled filtering behavior working after the response is wrapped
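The wrapping rules in this list could look something like the sketch below. It is an illustration only: the function name `wrap_non_stream_reasoning`, the dict-shaped `message`, the preference for `reasoning_content` over `reasoning` when both are present, and the newline layout are all assumptions rather than the code in this PR.

```python
from typing import Any, Mapping

THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def wrap_non_stream_reasoning(message: Mapping[str, Any]) -> str:
    """Return message content with any reasoning trace prepended in <think> tags.

    Hypothetical helper operating on one `choices[].message` dict.
    """
    content = message.get("content") or ""
    # Accept either field name (assumed precedence: reasoning_content first).
    reasoning = message.get("reasoning_content") or message.get("reasoning") or ""
    if not isinstance(reasoning, str) or not reasoning.strip():
        # No reasoning trace: return the content untouched.
        return content
    if content.lstrip().startswith(THINK_OPEN):
        # Already wrapped upstream: avoid double wrapping.
        return content
    return f"{THINK_OPEN}\n{reasoning}\n{THINK_CLOSE}\n{content}"
```

For example, a message of `{"content": "42", "reasoning": "6 * 7"}` would come back as a `<think>` block containing `6 * 7` followed by `42`. Tool calls and usage are deliberately not touched by this helper, mirroring the list above.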

This is separate from the earlier streaming fix in #2741 and the extensions/openai_compatible endpoint fix in #2676; it covers the model plugin's non-streaming choices[].message handler.

Change Type

Documentation / non-plugin change
Non-LLM plugin (tools, extensions, datasource, etc.)
LLM plugin

Screenshots / Videos

N/A. This is a response-normalization fix covered by unit and package tests.

| Before | After |
| --- | --- |
| Non-streaming responses with `message.reasoning` only surfaced `message.content`. | Reasoning is preserved as `<think>...</think>` before the final answer, matching the streaming path. |

LLM Plugin Checklist

Areas affected by this change (check all that apply)

Message flow (system messages, user -> assistant turn-taking)
Tool interaction flow (multi-round usage, Agent App and Agent Node)
Multimodal input (images, PDFs, audio, video, etc.)
Multimodal output (images, audio, video, etc.)
Structured output (JSON, XML, etc.)
Token consumption metrics
Other LLM functionality (reasoning, grounding, prompt caching, etc.)
New models / model parameter fixes

Version

Bumped top-level version in manifest.yaml (not the one under meta)
dify_plugin>=0.5.0 is declared in pyproject.toml and locked in uv.lock

Note: this PR bumps openai_api_compatible to 0.0.51 because open PRs #3091 and #3092 already use the pending 0.0.50 bump for the same plugin.

Testing

uv run --project models/openai_api_compatible --frozen --with pytest pytest models/openai_api_compatible/tests
uv run --project models/openai_api_compatible --frozen python -m py_compile models/openai_api_compatible/models/llm/llm.py models/openai_api_compatible/tests/test_handle_response.py
uv run --with requests --with dify_plugin python .scripts/toolkit/uploader/upload-package.py -d models/openai_api_compatible -t dummy-token --plugin-daemon-path .scripts/dify-plugin-windows-amd64.exe -u https://marketplace.dify.ai -f --test
.\.scripts\dify-plugin-windows-amd64.exe plugin package models/openai_api_compatible --output_path test-openai-api-compatible-package.difypkg
Unpacked the generated package and ran uv run --frozen --with pytest pytest
Local deployment - Dify version:
SaaS (cloud.dify.ai)

Local result: 28 passed, 2 warnings from existing dependencies.

gemini-code-assist

Code Review

This pull request adds support for handling reasoning traces in non-streaming responses for OpenAI-compatible models, such as those from vLLM or SGLang. A new method, _wrap_non_stream_reasoning_content, was implemented to extract reasoning data from the reasoning or reasoning_content fields and wrap it in <think> tags if not already present in the main content. Additionally, the version was bumped to 0.0.51, and several unit tests were added to ensure correct handling of reasoning content alongside tool calls and existing thinking blocks. I have no feedback to provide.

dosubot Bot added the size:S label (this PR changes 10-29 lines, ignoring generated files) May 13, 2026

Epochex temporarily deployed to models/openai_api_compatible May 13, 2026 19:32 — with GitHub Actions Inactive

gemini-code-assist Bot reviewed May 13, 2026

fix(openai-compatible): preserve non-stream reasoning content

2d06469

Epochex force-pushed the fix/openai-compatible-nonstream-reasoning branch from 2c15474 to 2d06469 Compare May 15, 2026 12:11

Epochex temporarily deployed to models/openai_api_compatible May 15, 2026 12:12 — with GitHub Actions Inactive

Epochex wants to merge 1 commit into langgenius:main from Epochex:fix/openai-compatible-nonstream-reasoning
