Skip to content

Support declaring image input capability for custom openai-compatibility models #3976

Description

@a137460387

Problem

When using openai-compatibility with a custom model (e.g. mimo-v2.5-pro from Xiaomi API), the proxy returns:

{"error":{"code":"404","message":"No endpoints found that support image input"}}

The upstream API (api.xiaomimimo.com) supports both text and image input for mimo-v2.5-pro, but the proxy has no way to know this since the model is not in the built-in models list (models.router-for.me/models.json).

Current config

openai-compatibility:
  - name: mimo-0mm4
    base-url: https://api.xiaomimimo.com/v1
    api-key-entries:
      - api-key: sk-xxx
    models:
      - name: mimo-v2.5-pro
        alias: ""

Steps to reproduce

curl -s -X POST http://127.0.0.1:8317/v1/responses \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d "{\"model\":\"mimo-v2.5-pro\",\"input\":[{\"role\":\"user\",\"content\":[{\"type\":\"input_text\",\"text\":\"describe this\"},{\"type\":\"input_image\",\"image_url\":\"data:image/png;base64,...\"}]}],\"max_output_tokens\":50}"

Response:

{"error":{"code":"404","message":"No endpoints found that support image input"}}

Expected behavior

A way to declare model capabilities in the config, e.g.:

models:
  - name: mimo-v2.5-pro
    alias: ""
    supports_image: true
    input_modalities: [text, image]

Or alternatively, allow a local models.json override that supplements the remote model list.

Use case

Third-party APIs (like Xiaomi mimo) that are OpenAI-compatible but not in the standard model registry need a way to declare their capabilities (image input, vision, etc.) so that clients like Codex Desktop can use them fully.

CPA Version

v7.2.28

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions