fix: preserve images in Claude tool_result when converting to OpenAI chat completions#4873
fix: preserve images in Claude tool_result when converting to OpenAI chat completions#4873playeriv65 wants to merge 3 commits into
Conversation
…chat completions Previously, when converting Claude tool_result content blocks to OpenAI format, non-string content was marshaled into a JSON string. This caused image blocks returned by tools to be lost as vision input, since the image data was treated as plain text. This change converts tool_result text blocks to role:tool string content, and image blocks to role:user content[].image_url, matching OpenAI Chat Completions schema. Images must be in user messages per the spec, not in tool messages.
…chat completions Previously, when converting Claude tool_result content blocks to OpenAI format, non-string content was marshaled into a JSON string. This caused image blocks returned by tools to be lost as vision input, since the image data was treated as plain text. This change converts tool_result text blocks to role:tool string content, and image blocks to role:user content[].image_url, matching OpenAI Chat Completions schema. Images must be in user messages per the spec, not in tool messages. Tests added (8 cases): - pure string tool_result (unchanged) - text block tool_result -> tool text - image block tool_result -> tool text notice + user image_url - mixed text + image -> both preserved - multiple images in single tool_result - image with URL source instead of base64 data - empty data image is skipped - unknown block type serialized as text
WalkthroughClaudeToOpenAIRequest now handles Claude ChangesClaude tool_result to OpenAI conversion with image handling
Sequence DiagramsequenceDiagram
participant Claude as Claude tool_result
participant Parser as Media Parser
participant TextExtract as Text Extractor
participant ImageExtract as Image Extractor
participant ToolMsg as OpenAI tool message
participant UserMsg as OpenAI user message
Claude->>Parser: tool_result with media blocks
Parser->>TextExtract: parse media content for text blocks
TextExtract->>ToolMsg: provide extracted text (+ image notice if images present)
Parser->>ImageExtract: parse image blocks (url/base64)
ImageExtract->>ImageExtract: convert base64 to data: URI (default image/png if missing)
ImageExtract->>UserMsg: emit image_url media blocks with notice referencing tool call id
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
service/convert_test.go (1)
250-267: ⚡ Quick winConsider using a named constant for
ChannelType.The magic number
1at line 262 reduces code clarity. If a named constant exists (e.g.,ChannelTypeOpenAI), use it to make the test setup more self-documenting.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@service/convert_test.go` around lines 250 - 267, Replace the magic number used for ChannelMeta.ChannelType in the test setup with the appropriate named constant (e.g., ChannelTypeOpenAI) to improve clarity: locate the test where RelayInfo and ChannelMeta are constructed (the variable info with &common.RelayInfo{ ChannelMeta: &common.ChannelMeta{ ChannelType: 1 }} ) and change ChannelType to the existing constant defined in your codebase (or add one if missing) so ClaudeToOpenAIRequest receives a self-documenting channel type.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@service/convert_test.go`:
- Around line 295-311: The test currently only captures the last image URL
because the loop in the assertion overwrites foundURL; update the test to either
(A) validate all images by changing the test case field tc.expectImageURL ->
tc.expectImageURLs []string, collect every image URL from
imgUserMsg.ParseContent() into a slice (checking block.Type ==
dto.ContentTypeImageURL and extracting from *dto.MessageImageUrl or
map[string]any), then assert the collected slice contains exactly (or at least)
the expected URLs for the "multiple images in tool_result" case, or (B) if only
the last image is intended, add a clear comment and change the test
name/expectation to reflect that only the final image is preserved; adjust
assertions accordingly around imgUserMsg.ParseContent() and tc fields.
---
Nitpick comments:
In `@service/convert_test.go`:
- Around line 250-267: Replace the magic number used for ChannelMeta.ChannelType
in the test setup with the appropriate named constant (e.g., ChannelTypeOpenAI)
to improve clarity: locate the test where RelayInfo and ChannelMeta are
constructed (the variable info with &common.RelayInfo{ ChannelMeta:
&common.ChannelMeta{ ChannelType: 1 }} ) and change ChannelType to the existing
constant defined in your codebase (or add one if missing) so
ClaudeToOpenAIRequest receives a self-documenting channel type.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7b89c640-f36d-4be3-8755-dfbe25c92939
📒 Files selected for processing (2)
service/convert.goservice/convert_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- service/convert.go
| if tc.expectUserImageMsg { | ||
| require.NotNil(t, imgUserMsg, "user message with image_url should exist") | ||
| content := imgUserMsg.ParseContent() | ||
| var foundURL string | ||
| for _, block := range content { | ||
| if block.Type == dto.ContentTypeImageURL && block.ImageUrl != nil { | ||
| if url, ok := block.ImageUrl.(*dto.MessageImageUrl); ok { | ||
| foundURL = url.Url | ||
| } else if urlMap, ok := block.ImageUrl.(map[string]any); ok { | ||
| foundURL, _ = urlMap["url"].(string) | ||
| } | ||
| } | ||
| } | ||
| require.Equal(t, tc.expectImageURL, foundURL) | ||
| } else { | ||
| require.Nil(t, imgUserMsg, "user message with image_url should not exist for text-only") | ||
| } |
There was a problem hiding this comment.
The multiple-images test case only validates the last image URL.
The assertion loop (lines 299-307) overwrites foundURL on each iteration, so the test case "multiple images in tool_result" (line 126) only verifies that the last image (image2) is present. If the conversion should preserve all images in the user message, this test would not detect if earlier images were dropped.
Consider iterating through all expected images and asserting that each is present in the user message content, or add a comment clarifying that only the last image is expected by design.
🔍 Suggested enhancement to validate all images
If all images should be preserved, replace the single expectImageURL field with expectImageURLs []string in the test case struct and update the assertion logic:
if tc.expectUserImageMsg {
require.NotNil(t, imgUserMsg, "user message with image_url should exist")
content := imgUserMsg.ParseContent()
- var foundURL string
+ var foundURLs []string
for _, block := range content {
if block.Type == dto.ContentTypeImageURL && block.ImageUrl != nil {
if url, ok := block.ImageUrl.(*dto.MessageImageUrl); ok {
- foundURL = url.Url
+ foundURLs = append(foundURLs, url.Url)
} else if urlMap, ok := block.ImageUrl.(map[string]any); ok {
- foundURL, _ = urlMap["url"].(string)
+ if url, ok := urlMap["url"].(string); ok {
+ foundURLs = append(foundURLs, url)
+ }
}
}
}
- require.Equal(t, tc.expectImageURL, foundURL)
+ require.ElementsMatch(t, tc.expectImageURLs, foundURLs)
} else {Then update the "multiple images" test case (line 161):
-expectImageURL: "data:image/jpeg;base64,image2",
+expectImageURLs: []string{"data:image/png;base64,image1", "data:image/jpeg;base64,image2"},🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@service/convert_test.go` around lines 295 - 311, The test currently only
captures the last image URL because the loop in the assertion overwrites
foundURL; update the test to either (A) validate all images by changing the test
case field tc.expectImageURL -> tc.expectImageURLs []string, collect every image
URL from imgUserMsg.ParseContent() into a slice (checking block.Type ==
dto.ContentTypeImageURL and extracting from *dto.MessageImageUrl or
map[string]any), then assert the collected slice contains exactly (or at least)
the expected URLs for the "multiple images in tool_result" case, or (B) if only
the last image is intended, add a clear comment and change the test
name/expectation to reflect that only the final image is preserved; adjust
assertions accordingly around imgUserMsg.ParseContent() and tc fields.
变更描述
修复 Claude tool_result 中的图片在转换为 OpenAI Chat Completions 格式时丢失的问题。
根因:service/convert.go 的 ClaudeToOpenAIRequest 处理 tool_result 时,把非字符串内容直接 Marshal 成 JSON 字符串塞进 tool message 的 content 字段。上游收到的 tool 消息里 content 是一段 JSON 文本,模型拿不到图片输入,只能靠文本猜测,产生幻觉。
修复:按 OpenAI Chat Completions 规范拆分:
OpenAI 规范里 role:tool 只支持 text content part,图片必须放在 user message 中。
变更类型
提交前检查项
运行证明
修复前 dump(outbound JSON):
修复后 dump:
测试命令:go test ./service -run TestClaudeToOpenAIRequest_ToolResultImages -v
Summary by CodeRabbit
New Features
Tests