feat: handle audio and image output models from OpenRouter #9
Merged
OpenRouter exposes models that return audio transcripts (message.audio) and generated images (message.images) instead of text in message.content. Without this fix those models return an empty response in Open WebUI.

- Add _format_image_output() to render image URLs as markdown image tags
- In _non_stream_response: fall back to audio.transcript when content is empty/None, and append any images as markdown after the text content
- In _stream_response: use delta.audio.transcript as the streamed text when delta.content is absent (audio-streaming models)
- Add 23 unit tests covering all new paths (section 34 in test_pipe.py)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
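As a rough illustration of the non-streaming fallback described in this commit message, the sketch below shows one way the logic could look. It is not the code from openrouter_pipe.py: the helper name `extract_text`, the `image_url.url` shape of each `images` entry, the `image` alt text, and the blank-line separator are all assumptions made for the example.

```python
from typing import Any, Dict, List


def extract_text(message: Dict[str, Any]) -> str:
    """Hypothetical helper: build display text from a non-streaming message."""
    parts: List[str] = []

    content = message.get("content")
    if content:
        parts.append(content)
    else:
        # Audio-output models: content is empty/None, text lives in audio.transcript.
        transcript = (message.get("audio") or {}).get("transcript")
        if transcript:
            parts.append(transcript)

    # Image-output models: assumed entry shape is {"image_url": {"url": ...}}.
    image_tags: List[str] = []
    for item in message.get("images") or []:
        url = (item.get("image_url") or {}).get("url")
        if url:
            image_tags.append(f"![image]({url})")
    if image_tags:
        parts.append("\n".join(image_tags))

    # Join text and images with a blank line so they render as separate blocks.
    return "\n\n".join(parts)
```

With this shape, a message carrying only `audio.transcript` yields the transcript, and a message with both text and images yields the text followed by the image tags.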
Pull request overview
Adds support for OpenRouter models that return outputs via message.audio.transcript (audio-output models) and message.images (image-output models), so Open WebUI doesn’t show blank responses for these model types.
Changes:
- Add `_format_image_output()` to render `message.images` items as markdown image tags.
- Update non-streaming and streaming response formatting to fall back to audio transcripts when `content` is empty/None, and to append generated images after text.
- Add unit tests covering image formatting and audio transcript fallbacks for both non-streaming and streaming paths.
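The streaming side of the change can be pictured with a small sketch like the one below. It assumes chunks arrive as already-parsed dicts in the usual `choices[].delta` layout; the generator name `stream_text` is hypothetical and is not the pipe's actual `_stream_response`.

```python
from typing import Any, Dict, Iterable, Iterator


def stream_text(chunks: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Hypothetical generator: yield displayable text from streamed chunks."""
    for chunk in chunks:
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            text = delta.get("content")
            if not text:
                # Audio-streaming models: use the transcript fragment instead.
                text = (delta.get("audio") or {}).get("transcript")
            if text:
                yield text
```

For example, `"".join(stream_text(chunks))` over a stream whose deltas carry only `audio.transcript` fragments would reassemble the spoken text.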
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `openrouter_pipe.py` | Adds image-output formatting helper; updates `_non_stream_response` and `_stream_response` to handle audio transcripts and image outputs. |
| `test_pipe.py` | Adds a new test section validating `_format_image_output` plus audio/image handling in non-streaming and streaming responses. |
Address Copilot review comments on PR #9:

- _format_image_output: reject unsafe URL schemes (only http/https and data:image/* are allowed) and percent-encode ')' to avoid breaking markdown link syntax, consistent with _insert_citations behaviour
- _non_stream_response: prefix image_md with '\n\n' when final_parts is non-empty so the image tag is never glued directly to preceding text
- Add 6 new unit tests covering unsafe scheme rejection, ')' encoding, blank-line separator, and image-only (no leading newlines) case

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
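The URL hardening described in this commit could look roughly like the following sketch. The function names `is_safe_image_url` and `image_markdown` and the default alt text are hypothetical; only the rules themselves (allow http/https and data:image/*, percent-encode `)`) come from the commit message.

```python
from typing import Optional
from urllib.parse import urlparse


def is_safe_image_url(url: str) -> bool:
    """Allow only http/https URLs and data:image/* URIs."""
    lowered = url.strip().lower()
    if lowered.startswith("data:"):
        return lowered.startswith("data:image/")
    return urlparse(lowered).scheme in ("http", "https")


def image_markdown(url: str, alt: str = "Generated image") -> Optional[str]:
    """Return a markdown image tag, or None when the URL scheme is rejected."""
    if not is_safe_image_url(url):
        return None
    # A raw ')' would close the markdown link early, so percent-encode it.
    safe_url = url.strip().replace(")", "%29")
    return f"![{alt}]({safe_url})"
```

Returning `None` for rejected URLs lets the caller skip the entry silently; whether the real helper skips, logs, or substitutes a placeholder is not stated in the source.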
Summary
- Add `_format_image_output()` helper that converts `message.images` entries into markdown image tags
- `_non_stream_response`: fall back to `audio.transcript` when `content` is empty/None (audio-only models), and append generated images as markdown after text content
- `_stream_response`: use `delta.audio.transcript` as streamed text when `delta.content` is absent (audio-streaming models)
- Unit tests in `test_pipe.py` covering all new code paths

Motivation
OpenRouter exposes 5 audio-output models and 7 image-output models. These models return an empty `content` field and put their output in `message.audio` or `message.images`. Without this fix, using them through the pipe produces a blank response in Open WebUI.

Test plan
- `python test_pipe.py` → `Total: 425 | ✓ Passed: 425 | ✗ Failed: 0`
- `openai/gpt-4o-audio-preview` or similar audio model (requires live API key)
- `openai/gpt-image-1` or similar image model (requires live API key)

🤖 Generated with Claude Code