fix: third Copilot review pass on PR #42

chris-colinsky · chris-colinsky · commit 1b5fbb0c54b9 · 2026-05-15T13:37:50.000-07:00
Addresses 5 remaining review threads (3 substantive, 2 stale on already-fixed code):

- LlmProviderResponseAssertion (the typed assertion model in harness/expectations.py) now lists `parsed: Any | None`. The runtime assertion in test_llm_provider.py already handled it, but the typed parser had it under extra="forbid" and would have rejected any future case-shape LLM fixture using `parsed`. The 021-028 fixtures slip past today through the `calls:` form's permissive `LlmCallSpec.expected: dict[str, Any]`; this change aligns the two paths.
- docs/model-providers/authoring.md skeleton comment tightened: removed the "ignore it and return free-form text" option from the response_schema guidance. A provider that silently drops the parameter violates the Protocol contract; callers expect either Response.parsed populated or StructuredOutputInvalid raised. Only two valid options are now surfaced: raise ProviderInvalidRequest until implemented, or wire it through.
- docs/concepts/llms.md softened the static-typing claim in the Pydantic-class form section. Response.parsed is `dict[str, Any] | BaseModel | None`, so a type checker won't narrow from `response_schema=Classification` alone. The page now separates the runtime guarantee (validated instance) from static access (requires cast/isinstance/typed assignment); generic Response[T] is flagged as a follow-up.

The two stale threads (examples/00-hello-world/main.py provider cleanup, test_structured_output.py provider cleanup) were already fixed in commit 8ed334c; replies sent and threads resolved without code changes.
diff --git a/docs/concepts/llms.md b/docs/concepts/llms.md
@@ -97,11 +97,18 @@ async def classify(state):
     return {"classification": response.parsed}
 ```
 
-`Response.parsed` is a validated `Classification` instance. Field
-access is statically typed (`response.parsed.intent` returns
-`Literal["research", "summarize"]`); the framework calls
-`.model_json_schema()` under the hood to derive the wire body and
-`.model_validate()` to deserialize the response.
+`Response.parsed` is a validated `Classification` instance at
+runtime; the framework calls `.model_json_schema()` under the hood
+to derive the wire body and `.model_validate()` to deserialize the
+response.
+
+Static typing is shallower. `Response.parsed` is annotated as
+`dict[str, Any] | BaseModel | None`, so a type checker won't narrow
+to `Classification` from the `response_schema=Classification`
+argument alone. Callers that want static field access either
+`cast(Classification, response.parsed)`, narrow with `isinstance`,
+or assign the value into a typed local. Generic `Response[T]` is on
+the table as a follow-up.
 
 ### JSON Schema dict form
 
diff --git a/docs/model-providers/authoring.md b/docs/model-providers/authoring.md
@@ -67,11 +67,14 @@ class MyProvider:
         config: RuntimeConfig | None = None,
         response_schema: dict[str, Any] | type[BaseModel] | None = None,
     ) -> Response:
-        # response_schema support is an optional capability; a skeleton
-        # provider can raise ProviderInvalidRequest when it's set, or
-        # ignore it and return free-form text. A production provider
-        # would wire it through to native response_format support or
-        # the prompt-augmentation fallback. See ``openarmature.llm.OpenAIProvider``.
+        # response_schema is part of the Protocol; a skeleton provider
+        # MUST NOT silently ignore it — callers expect either
+        # Response.parsed populated or a StructuredOutputInvalid raise.
+        # Until the wire path is implemented, raise
+        # ProviderInvalidRequest when response_schema is set. A
+        # production provider wires it through to native response_format
+        # support or the prompt-augmentation fallback; see
+        # ``openarmature.llm.OpenAIProvider``.
         validate_message_list(messages)
         validate_tools(tools)
 
diff --git a/tests/conformance/harness/expectations.py b/tests/conformance/harness/expectations.py
@@ -71,6 +71,12 @@ class LlmProviderResponseAssertion(_ForbidExtras):
     finish_reason: str | None = None
     usage: dict[str, Any] | None = None
     raw_check: dict[str, Any] | None = None
+    # `parsed` was introduced by proposal 0016 — the runtime asserts
+    # equality against ``Response.parsed``. Typed as Any | None because
+    # the fixture-side value can be a dict (dict-schema input form),
+    # a model_dump-equivalent dict (class-schema form), or None
+    # (tool-call response or no-schema call).
+    parsed: Any | None = None
 
 
 class LlmProviderRaisesAssertion(BaseModel):