fix: preserve images in Claude tool_result when converting to OpenAI chat completions by playeriv65 · Pull Request #4873 · QuantumNous/new-api

playeriv65 · 2026-05-15T02:22:04Z

变更描述

修复 Claude tool_result 中的图片在转换为 OpenAI Chat Completions 格式时丢失的问题。

根因：service/convert.go 的 ClaudeToOpenAIRequest 处理 tool_result 时，把非字符串内容直接 Marshal 成 JSON 字符串塞进 tool message 的 content 字段。上游收到的 tool 消息里 content 是一段 JSON 文本，模型拿不到图片输入，只能靠文本猜测，产生幻觉。

修复：按 OpenAI Chat Completions 规范拆分：

tool_result 的文本内容 -> role:tool，字符串 content
tool_result 的图片内容 -> 拆成 role:tool（文本说明）+ role:user（image_url content part）

OpenAI 规范里 role:tool 只支持 text content part，图片必须放在 user message 中。

变更类型

Bug 修复

提交前检查项

确认：已亲自整理描述，未直接粘贴 AI 输出
已搜索 Issues 和 PRs，确认非重复提交
Bug fix 说明：已通过 curl 实验和端到端复现验证
变更范围仅限 service/convert.go 的 tool_result 分支
无无关代码改动
本地测试 8 个用例全部通过，curl 最小复现和 Claude Code 端到端均验证通过
代码中无凭据泄露

运行证明

修复前 dump（outbound JSON）：

role=tool, content=[{"type":"image","source":{"type":"base64",...}}]
模型回答：\u0022像素风格樱桃/水果图标\u0022

修复后 dump：

role=tool, content=\u0022[Tool result contained image content...]\u0022
role=user, type=image_url, len=24234
模型回答：\u0022Windows Recycle Bin icon\u0022

测试命令：go test ./service -run TestClaudeToOpenAIRequest_ToolResultImages -v

Summary by CodeRabbit

New Features
- Improved conversion of tool-result content: non-text media are parsed so text remains in tool messages while images are emitted as separate follow-up user messages.
- Supports both direct image URLs and base64 image data (converted to data: URIs), preserves text alongside images, and skips empty image data.
- Unknown block types are serialized into tool message content.
Tests
- Added comprehensive tests covering mixed text/image tool results, base64 and URL images, and edge cases.

…chat completions Previously, when converting Claude tool_result content blocks to OpenAI format, non-string content was marshaled into a JSON string. This caused image blocks returned by tools to be lost as vision input, since the image data was treated as plain text. This change converts tool_result text blocks to role:tool string content, and image blocks to role:user content[].image_url, matching OpenAI Chat Completions schema. Images must be in user messages per the spec, not in tool messages.

…chat completions Previously, when converting Claude tool_result content blocks to OpenAI format, non-string content was marshaled into a JSON string. This caused image blocks returned by tools to be lost as vision input, since the image data was treated as plain text. This change converts tool_result text blocks to role:tool string content, and image blocks to role:user content[].image_url, matching OpenAI Chat Completions schema. Images must be in user messages per the spec, not in tool messages. Tests added (8 cases): - pure string tool_result (unchanged) - text block tool_result -> tool text - image block tool_result -> tool text notice + user image_url - mixed text + image -> both preserved - multiple images in single tool_result - image with URL source instead of base64 data - empty data image is skipped - unknown block type serialized as text

coderabbitai · 2026-05-15T02:22:18Z

Walkthrough

ClaudeToOpenAIRequest now handles Claude tool_result messages containing images by parsing media blocks, extracting text and image content, creating a tool message with text content and an optional user message with image media when images are present.

Changes

Claude tool_result to OpenAI conversion with image handling

Layer / File(s)	Summary
Tool message initialization and media parsing `service/convert.go`	Tool message is initialized by resolving the tool name via `SearchToolNameByToolCallId`, then non-string tool results are parsed to extract textual content (including unknown block types serialized to JSON) and image blocks (converted to OpenAI media with URL or base64 `data:` URI).
Image content extraction and user message creation `service/convert.go`	When image content is detected in the parsed media, an additional OpenAI `"user"` message is appended containing the image media content, prefixed with a notice indicating the tool returned images.
Test coverage for tool result image conversion `service/convert_test.go`	`TestClaudeToOpenAIRequest_ToolResultImages` validates conversion across variants: text-only, text+image, image-only, multiple images, base64 vs. URL images, empty base64 skipping, and unknown block serialization.

Sequence Diagram

sequenceDiagram
  participant Claude as Claude tool_result
  participant Parser as Media Parser
  participant TextExtract as Text Extractor
  participant ImageExtract as Image Extractor
  participant ToolMsg as OpenAI tool message
  participant UserMsg as OpenAI user message
  
  Claude->>Parser: tool_result with media blocks
  Parser->>TextExtract: parse media content for text blocks
  TextExtract->>ToolMsg: provide extracted text (+ image notice if images present)
  Parser->>ImageExtract: parse image blocks (url/base64)
  ImageExtract->>ImageExtract: convert base64 to data: URI (default image/png if missing)
  ImageExtract->>UserMsg: emit image_url media blocks with notice referencing tool call id

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

QuantumNous/new-api#1531: Both PRs modify service/convert.go's ClaudeToOpenAIRequest handling of Claude tool_result messages, extending tool name resolution and media conversion logic.

Suggested reviewers

seefs001

Poem

🐰 A rabbit hops through images bright,
Converting Claude's tools to OpenAI's light,
With media parsed and images spread,
User messages sent where images tread,
Tests confirm each byte and thread.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately reflects the main change: preserving images in Claude tool_result when converting to OpenAI format, which is the core fix described in the PR objectives.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

service/convert_test.go (1)

250-267: ⚡ Quick win

Consider using a named constant for ChannelType.

The magic number 1 at line 262 reduces code clarity. If a named constant exists (e.g., ChannelTypeOpenAI), use it to make the test setup more self-documenting.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@service/convert_test.go` around lines 250 - 267, Replace the magic number
used for ChannelMeta.ChannelType in the test setup with the appropriate named
constant (e.g., ChannelTypeOpenAI) to improve clarity: locate the test where
RelayInfo and ChannelMeta are constructed (the variable info with
&common.RelayInfo{ ChannelMeta: &common.ChannelMeta{ ChannelType: 1 }} ) and
change ChannelType to the existing constant defined in your codebase (or add one
if missing) so ClaudeToOpenAIRequest receives a self-documenting channel type.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@service/convert_test.go`:
- Around line 295-311: The test currently only captures the last image URL
because the loop in the assertion overwrites foundURL; update the test to either
(A) validate all images by changing the test case field tc.expectImageURL ->
tc.expectImageURLs []string, collect every image URL from
imgUserMsg.ParseContent() into a slice (checking block.Type ==
dto.ContentTypeImageURL and extracting from *dto.MessageImageUrl or
map[string]any), then assert the collected slice contains exactly (or at least)
the expected URLs for the "multiple images in tool_result" case, or (B) if only
the last image is intended, add a clear comment and change the test
name/expectation to reflect that only the final image is preserved; adjust
assertions accordingly around imgUserMsg.ParseContent() and tc fields.

---

Nitpick comments:
In `@service/convert_test.go`:
- Around line 250-267: Replace the magic number used for ChannelMeta.ChannelType
in the test setup with the appropriate named constant (e.g., ChannelTypeOpenAI)
to improve clarity: locate the test where RelayInfo and ChannelMeta are
constructed (the variable info with &common.RelayInfo{ ChannelMeta:
&common.ChannelMeta{ ChannelType: 1 }} ) and change ChannelType to the existing
constant defined in your codebase (or add one if missing) so
ClaudeToOpenAIRequest receives a self-documenting channel type.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7b89c640-f36d-4be3-8755-dfbe25c92939

📥 Commits

Reviewing files that changed from the base of the PR and between 8789891 and a23b352.

📒 Files selected for processing (2)

service/convert.go
service/convert_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

service/convert.go

coderabbitai · 2026-05-15T03:23:17Z

+			if tc.expectUserImageMsg {
+				require.NotNil(t, imgUserMsg, "user message with image_url should exist")
+				content := imgUserMsg.ParseContent()
+				var foundURL string
+				for _, block := range content {
+					if block.Type == dto.ContentTypeImageURL && block.ImageUrl != nil {
+						if url, ok := block.ImageUrl.(*dto.MessageImageUrl); ok {
+							foundURL = url.Url
+						} else if urlMap, ok := block.ImageUrl.(map[string]any); ok {
+							foundURL, _ = urlMap["url"].(string)
+						}
+					}
+				}
+				require.Equal(t, tc.expectImageURL, foundURL)
+			} else {
+				require.Nil(t, imgUserMsg, "user message with image_url should not exist for text-only")
+			}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

The multiple-images test case only validates the last image URL.

The assertion loop (lines 299-307) overwrites foundURL on each iteration, so the test case "multiple images in tool_result" (line 126) only verifies that the last image (image2) is present. If the conversion should preserve all images in the user message, this test would not detect if earlier images were dropped.

Consider iterating through all expected images and asserting that each is present in the user message content, or add a comment clarifying that only the last image is expected by design.

🔍 Suggested enhancement to validate all images

If all images should be preserved, replace the single expectImageURL field with expectImageURLs []string in the test case struct and update the assertion logic:

if tc.expectUserImageMsg { require.NotNil(t, imgUserMsg, "user message with image_url should exist") content := imgUserMsg.ParseContent() - var foundURL string + var foundURLs []string for _, block := range content { if block.Type == dto.ContentTypeImageURL && block.ImageUrl != nil { if url, ok := block.ImageUrl.(*dto.MessageImageUrl); ok { - foundURL = url.Url + foundURLs = append(foundURLs, url.Url) } else if urlMap, ok := block.ImageUrl.(map[string]any); ok { - foundURL, _ = urlMap["url"].(string) + if url, ok := urlMap["url"].(string); ok { + foundURLs = append(foundURLs, url) + } } } } - require.Equal(t, tc.expectImageURL, foundURL) + require.ElementsMatch(t, tc.expectImageURLs, foundURLs) } else {

Then update the "multiple images" test case (line 161):

-expectImageURL: "data:image/jpeg;base64,image2", +expectImageURLs: []string{"data:image/png;base64,image1", "data:image/jpeg;base64,image2"},

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@service/convert_test.go` around lines 295 - 311, The test currently only captures the last image URL because the loop in the assertion overwrites foundURL; update the test to either (A) validate all images by changing the test case field tc.expectImageURL -> tc.expectImageURLs []string, collect every image URL from imgUserMsg.ParseContent() into a slice (checking block.Type == dto.ContentTypeImageURL and extracting from *dto.MessageImageUrl or map[string]any), then assert the collected slice contains exactly (or at least) the expected URLs for the "multiple images in tool_result" case, or (B) if only the last image is intended, add a clear comment and change the test name/expectation to reflect that only the final image is preserved; adjust assertions accordingly around imgUserMsg.ParseContent() and tc fields.

playeriv65 added 2 commits May 14, 2026 15:20

docs: add docstrings to ClaudeToOpenAIRequest and test function

a23b352

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: preserve images in Claude tool_result when converting to OpenAI chat completions#4873

fix: preserve images in Claude tool_result when converting to OpenAI chat completions#4873
playeriv65 wants to merge 3 commits into
QuantumNous:mainfrom
playeriv65:feat/tool-result-image-fix

playeriv65 commented May 15, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 15, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

playeriv65 commented May 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

变更描述

变更类型

提交前检查项

运行证明

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

playeriv65 commented May 15, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 15, 2026 •

edited

Loading