Commit fb84521

fix: CoPilot review pass on PR #44
- audio/video symmetry in the substring fallback of _looks_like_content_rejection
- explicit isinstance(block, ImageBlock) guard in _block_to_wire to surface added union variants as a TypeError instead of an AttributeError on .source
- clarify ImageBlock.media_type docstring: permitted but redundant on URL sources (the URL payload carries content-type), provider implementations MAY consume it as a hint
- reword CHANGELOG qualifier '(proposal X, spec vY.Z)' → '(proposal X, introduced in spec vY.Z)' on the 0015 and 0016 entries so it doesn't read like a per-entry submodule pin change
1 parent 027ae56 commit fb84521

3 files changed

Lines changed: 16 additions & 10 deletions

CHANGELOG.md

Lines changed: 2 additions & 2 deletions
@@ -8,10 +8,10 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The

 ### Added

-- **Image content blocks for user messages (proposal 0015, spec v0.13.0).** `UserMessage.content` now accepts `str | list[ContentBlock]`. The block surface introduces `TextBlock`, `ImageBlock`, `ImageSourceURL`, `ImageSourceInline`, and the `ContentBlock` / `ImageSource` discriminated unions over the block / source `type` field. `ImageBlock` carries a `media_type` (required for inline sources; ignored for URL sources; typed as `str | None` so callers MAY pass any `image/*` type the bound model supports) and an optional `detail` hint (`"auto"` / `"low"` / `"high"`; `None` default omits the field from the wire so providers apply their own default). System, assistant, and tool messages stay text-string-only; image inputs are user-only in v1.
+- **Image content blocks for user messages (proposal 0015, introduced in spec v0.13.0).** `UserMessage.content` now accepts `str | list[ContentBlock]`. The block surface introduces `TextBlock`, `ImageBlock`, `ImageSourceURL`, `ImageSourceInline`, and the `ContentBlock` / `ImageSource` discriminated unions over the block / source `type` field. `ImageBlock` carries a `media_type` (required for inline sources; ignored for URL sources; typed as `str | None` so callers MAY pass any `image/*` type the bound model supports) and an optional `detail` hint (`"auto"` / `"low"` / `"high"`; `None` default omits the field from the wire so providers apply their own default). System, assistant, and tool messages stay text-string-only; image inputs are user-only in v1.
 - **`OpenAIProvider` content-array wire mapping.** When `UserMessage.content` is a content-block sequence, the wire body uses OpenAI's `content` array per §8.1.1. `TextBlock → {type: "text", text}`. `ImageBlock` with a URL source maps to `{type: "image_url", image_url: {url, detail?}}`. `ImageBlock` with an inline source constructs an RFC 2397 `data:<media_type>;base64,<base64_data>` URI and goes through the same `image_url` entry shape. Inline bytes pass through unchanged — no inspection, transcoding, or re-encoding.
 - **New error category `ProviderUnsupportedContentBlock` (non-transient).** Raised when the bound model rejects a content block type / media variant. Distinct from `ProviderInvalidRequest` (which covers spec-shape malformation): this category surfaces a *capability* mismatch, letting callers route differently (e.g., fall back to a multimodal-capable provider) without overloading the malformed-request category. Carries `block_type` ("image" / "audio" / "video") and `reason` (provider's human-readable message) when those are recoverable from the rejection. `OpenAIProvider` detects content rejection via HTTP 400 bodies — heuristic on `error.code` (known set: `image_content_not_supported`, `unsupported_image_media_type`, `audio_content_not_supported`, etc.), `error.type` (`image_parse_error`), and `error.message` ("does not support" + image/audio/video).
-- **Structured output (proposal 0016, spec v0.14.0).** `Provider.complete()` now accepts an optional `response_schema` parameter — either a JSON Schema dict or a Pydantic `BaseModel` subclass. When supplied, the provider constrains the model's output to the schema and populates `Response.parsed` with the validated value (`dict` for dict-schema input, a `BaseModel` instance for class input). New `StructuredOutputInvalid` error category (non-transient by default) raises on JSON parse failure or schema validation failure; carries the requested schema, the raw response content, and a failure description.
+- **Structured output (proposal 0016, introduced in spec v0.14.0).** `Provider.complete()` now accepts an optional `response_schema` parameter — either a JSON Schema dict or a Pydantic `BaseModel` subclass. When supplied, the provider constrains the model's output to the schema and populates `Response.parsed` with the validated value (`dict` for dict-schema input, a `BaseModel` instance for class input). New `StructuredOutputInvalid` error category (non-transient by default) raises on JSON parse failure or schema validation failure; carries the requested schema, the raw response content, and a failure description.
 - **`OpenAIProvider` native response_format wire path.** When `response_schema` is supplied, the chat-completions request body carries `response_format: { type: "json_schema", json_schema: { name, schema, strict } }`. The `strict` flag is determined by a deep recursive walk over the schema (object-property required-coverage rule across `anyOf` / `oneOf` / `allOf` and `$ref` targets, with cycle protection); unresolvable refs fall through to `strict: false`. The `name` field uses `schema.title` when present, otherwise a deterministic sha256-prefix hash.
 - **`OpenAIProvider` prompt-augmentation fallback.** Constructor flag `force_prompt_augmentation_fallback: bool` (default `False`) and read-only inspect property `uses_prompt_augmentation_fallback: bool`. When the flag is on, structured-output calls build a fresh message list with a system directive containing the serialized schema, omit `response_format` from the wire, and validate the response post-receive. The caller's original `messages` list is never mutated. Use for OpenAI-compatible servers (older vLLM, some LM Studio releases, llama.cpp variants) that reject or silently ignore `response_format`.
 - **Provider-agnostic schema helpers.** `openarmature.llm.validate_response_schema(schema)` (raises `ProviderInvalidRequest` when the schema is not a dict with a top-level `type: "object"`) and `openarmature.llm.strict_mode_supported(schema)` (the deep-tree strict-mode constraint check) are exported for reuse by future Anthropic/Gemini providers.
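The content-array mapping described in the `OpenAIProvider` entry above can be sketched as follows. This is a minimal illustration, not the library's code: the dataclasses are hypothetical stand-ins for the Pydantic models, only the fields named in the changelog are modeled, and `block_to_wire` is an assumed name for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


# Hypothetical stand-ins for the library's Pydantic models; only the
# fields named in the changelog entries are modeled here.
@dataclass
class TextBlock:
    text: str


@dataclass
class ImageSourceURL:
    url: str


@dataclass
class ImageSourceInline:
    base64_data: str


@dataclass
class ImageBlock:
    source: Union[ImageSourceURL, ImageSourceInline]
    media_type: Optional[str] = None
    detail: Optional[str] = None


def block_to_wire(block: Any) -> dict[str, Any]:
    """Map one content block to an OpenAI-style content-array entry."""
    if isinstance(block, TextBlock):
        return {"type": "text", "text": block.text}
    if not isinstance(block, ImageBlock):
        # Unknown variants fail with a TypeError rather than tripping on `.source`.
        raise TypeError(f"unhandled content block type: {type(block).__name__}")
    if isinstance(block.source, ImageSourceInline):
        # RFC 2397 data URI; the base64 payload is passed through untouched.
        url = f"data:{block.media_type};base64,{block.source.base64_data}"
    else:
        url = block.source.url
    image_url: dict[str, Any] = {"url": url}
    if block.detail is not None:
        # A `None` detail omits the field so the provider applies its own default.
        image_url["detail"] = block.detail
    return {"type": "image_url", "image_url": image_url}
```

In this sketch, an inline block with `media_type="image/png"` and `base64_data="QUJD"` yields the URL `data:image/png;base64,QUJD`, and a URL-sourced block passes its URL through unchanged.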

src/openarmature/llm/messages.py

Lines changed: 8 additions & 5 deletions
@@ -174,11 +174,14 @@ class ImageBlock(BaseModel):
     Attributes:
         type: The discriminator literal ``"image"``.
         source: One of ``ImageSourceURL`` or ``ImageSourceInline``.
-        media_type: IANA media type. Required when source is inline;
-            ignored when source is a URL. Providers MUST accept
-            ``image/png``, ``image/jpeg``, ``image/webp`` at minimum
-            and MAY accept additional ``image/*`` types they document
-            support for.
+        media_type: IANA media type. Required when source is inline.
+            Permitted but redundant when source is a URL (the URL
+            payload carries the content-type); the OpenAI wire path
+            currently does not surface it for URL sources, but
+            provider implementations MAY consume it as a hint.
+            Providers MUST accept ``image/png``, ``image/jpeg``,
+            ``image/webp`` at minimum and MAY accept additional
+            ``image/*`` types they document support for.
         detail: Image-processing fidelity hint. One of ``"auto"``,
             ``"low"``, ``"high"``. ``None`` (the default) omits the
             field from the wire.
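One way to read the `media_type` contract in the docstring above is as a small check. The function below is purely illustrative: `media_type_problem`, `REQUIRED_MEDIA_TYPES`, and the `extra_supported` parameter are hypothetical names, and the source does not say whether or where the library performs such a check.

```python
from typing import Optional

# The three types the docstring says providers MUST accept.
REQUIRED_MEDIA_TYPES = frozenset({"image/png", "image/jpeg", "image/webp"})


def media_type_problem(
    source_is_inline: bool,
    media_type: Optional[str],
    extra_supported: frozenset = frozenset(),
) -> Optional[str]:
    """Return a description of the problem, or None if the value is acceptable."""
    if media_type is None:
        # Required for inline sources, optional for URL sources.
        return "media_type is required for inline sources" if source_is_inline else None
    if not source_is_inline:
        # Permitted but redundant on URL sources; at most a hint.
        return None
    if media_type in REQUIRED_MEDIA_TYPES or media_type in extra_supported:
        return None
    return f"unsupported media type: {media_type}"
```

Here `extra_supported` models the additional `image/*` types a provider documents support for; a provider that accepts GIF would pass `frozenset({"image/gif"})`.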

src/openarmature/llm/providers/openai.py

Lines changed: 6 additions & 3 deletions
@@ -75,6 +75,7 @@
 from ..messages import (
     AssistantMessage,
     ContentBlock,
+    ImageBlock,
     ImageSourceInline,
     Message,
     SystemMessage,
@@ -694,7 +695,8 @@ def _message_to_wire(msg: Message) -> dict[str, Any]:
 def _block_to_wire(block: ContentBlock) -> dict[str, Any]:
     if isinstance(block, TextBlock):
         return {"type": "text", "text": block.text}
-    # ImageBlock
+    if not isinstance(block, ImageBlock):  # pyright: ignore[reportUnnecessaryIsInstance]
+        raise TypeError(f"unhandled content block type: {type(block).__name__}")
     if isinstance(block.source, ImageSourceInline):
         url = f"data:{block.media_type};base64,{block.source.base64_data}"
     else:
@@ -854,8 +856,9 @@ def _looks_like_content_rejection(
     if error_code in _CONTENT_REJECTION_ERROR_CODES:
         return True
     lower_code = error_code.lower()
-    if "image" in lower_code and ("not_supported" in lower_code or "unsupported" in lower_code):
-        return True
+    for block_type in ("image", "audio", "video"):
+        if block_type in lower_code and ("not_supported" in lower_code or "unsupported" in lower_code):
+            return True
     if isinstance(error_type, str) and error_type.lower() in {
         "image_parse_error",
         "image_content_not_supported",

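The audio/video symmetry added to the substring fallback can be exercised in isolation. The sketch below covers only the `error.code` branch: `code_looks_like_content_rejection` and `KNOWN_REJECTION_CODES` are hypothetical names, and the set holds only the three codes listed in the changelog, not the provider module's full set.

```python
from typing import Optional

# Hypothetical subset: only the codes named in the changelog entry.
KNOWN_REJECTION_CODES = frozenset({
    "image_content_not_supported",
    "unsupported_image_media_type",
    "audio_content_not_supported",
})
BLOCK_TYPES = ("image", "audio", "video")


def code_looks_like_content_rejection(error_code: Optional[str]) -> bool:
    """Exact match on known codes first, then a substring fallback."""
    if not isinstance(error_code, str):
        return False
    if error_code in KNOWN_REJECTION_CODES:
        return True
    lower_code = error_code.lower()
    unsupported = "not_supported" in lower_code or "unsupported" in lower_code
    # Same rule for every block type, so audio and video codes that are
    # absent from the known set are still recognized.
    return unsupported and any(bt in lower_code for bt in BLOCK_TYPES)
```

Under this sketch a code such as `video_input_not_supported` (an invented example, not a documented OpenAI code) is classified as a content rejection, while `rate_limit_exceeded` is not.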