Guard OpenAI backend against malformed compatible responses by ElvesYuki · Pull Request #677 · plastic-labs/honcho

ElvesYuki · 2026-05-13T12:11:52Z

Summary

Fixes #676.

This makes the OpenAI backend a little less trusting when it is talking to OpenAI-compatible providers. A successful HTTP response can still come back with an odd shape from proxies or gateways, so the backend now validates the first choice/message before normalizing it and treats malformed responses as controlled ValidationExceptions instead of raw attribute/index crashes.

It also expands the structured-output fallback path. If parse(response_model) fails or returns an unusable response, Honcho now tries create(json_schema), then create(json_object), then plain create() before giving up.

What changed

Added safe first-choice/message extraction for OpenAI chat responses
Made missing usage fall back to zero token counts
Guarded structured response parsing against malformed response shapes
Added structured fallback attempts through json_schema, json_object, and plain create
Added regression tests for missing usage, empty choices, missing messages, and fallback progression

Tests

uv run ruff check src/llm/backends/openai.py tests/llm/test_backends/test_openai.py
uv run pytest tests/llm/test_backends/test_openai.py

Summary by CodeRabbit

Bug Fixes
- More robust handling of missing or malformed API response fields, including defaulting token counts to 0 when usage info is absent.
- Improved structured-output fallback: tries JSON Schema → JSON object → plain formats more reliably and recovers from parse/validation errors.
Tests
- Added tests covering missing/None usage, malformed responses, and the structured-response fallback sequences.

coderabbitai · 2026-05-13T12:12:05Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0bdd4aaa-c86f-4ca3-9968-cbfdc1d2efb1

📥 Commits

Reviewing files that changed from the base of the PR and between b5a132b and 0879089.

📒 Files selected for processing (2)

src/llm/backends/openai.py
tests/llm/test_backends/test_openai.py

🚧 Files skipped from review as they are similar to previous changes (2)

src/llm/backends/openai.py
tests/llm/test_backends/test_openai.py

Walkthrough

This PR hardens the OpenAI backend against malformed provider responses by introducing defensive helpers for choice/message access, implementing a multi-stage fallback loop for structured output, and using getattr()-based field extraction to avoid crashes when responses lack expected attributes or are empty.

Changes

OpenAI Response Robustness & Fallback Enhancement

Layer / File(s)	Summary
Response Access Helpers `src/llm/backends/openai.py`	New `_STRUCTURED_FALLBACK_ERRORS` exception tuple identifies when to trigger fallbacks. Helper functions `_first_openai_choice()` and `_first_openai_message()` safely extract the first choice and message, raising `ValidationException` if `choices` is empty/missing or `message` is absent.
Structured Output Fallback Mechanism `src/llm/backends/openai.py`	Private request builders `_create_json_object_response()` and `_create_plain_response()` create requests with alternative response formats. `_fallback_structured_response()` iterates over fallback options in a single loop (json_schema → json_object → plain), and `_parse_or_repair_structured_content()` uses helper-based message access and `getattr()` for safe content extraction.
Defensive Response Normalization `src/llm/backends/openai.py`	`_normalize_response()` now derives choice, message, usage, and finish_reason via helpers and `getattr()` instead of direct indexing. `CompletionResult` construction defensively reads `message.content` and token counts with `getattr(..., default=0)` to handle missing fields.
Complete Path Integration `src/llm/backends/openai.py`	Structured completion path refactored to use helper-based message retrieval, add dedicated fallback path triggered by `_STRUCTURED_FALLBACK_ERRORS`, and consolidate parsing/refusal/content extraction around helper access. `LengthFinishReasonError` handling and refusal extraction switch from direct response indexing to helper-provided message.
Test Coverage for Robustness & Fallback `tests/llm/test_backends/test_openai.py`	New `StructuredAnswer` Pydantic model and four test functions verify missing-`usage` defaults to zero, malformed responses raise `ValidationException`, and structured fallback sequences (`json_schema` → `json_object`, `json_object` → `plain`) are attempted in order by inspecting `client.chat.completions.create()` call payloads.

Sequence Diagram

sequenceDiagram
  participant Caller as complete()
  participant Helpers as _first_openai_message/_first_openai_choice
  participant Parse as parse/repair
  participant Fallback as _fallback_structured_response
  participant Provider as chat.completions.create
  Caller->>Helpers: extract first message/choice
  Helpers-->>Caller: message or ValidationException
  Caller->>Parse: attempt parse using message.content
  Parse-->>Caller: parsed result or raise structured parse errors
  Caller->>Fallback: on structured parse/fallback errors
  Fallback->>Provider: create(request with json_schema/json_object/plain)
  Provider-->>Fallback: response used for next parse attempt

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through messages, sniffing empty slots,
Helpers in paw to patch the clumsy plots.
Schema tries once, then JSON tries again,
If all else fails, plain text brings the zen.
Safe hops, small fixes — stability hops on! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 5.26% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: adding defensive guards to the OpenAI backend against malformed responses.
Linked Issues check	✅ Passed	The PR fully addresses issue `#676` requirements: defensive response handling with ValidationException, safe choice/message extraction helpers, guarded parsing, missing usage handling, and expanded structured fallback chain (parse → json_schema → json_object → plain).
Out of Scope Changes check	✅ Passed	All changes are within scope: defensive modifications to OpenAI backend response handling, new validation helpers, structured fallback routing, and comprehensive regression tests covering malformed responses and fallback progression.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/llm/backends/openai.py (1)

214-228: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unusable-content case skips the fallback chain.

When parse() succeeds but returns parsed=None, no raw_content, and no refusal, line 228 raises ValidationException immediately. The PR objective states fallback should engage when parse(response_model) "fails or returns unusable content". The other "no content at all" paths route through _fallback_structured_response via the _STRUCTURED_FALLBACK_ERRORS catch on line 204; this empty-but-well-shaped response does not. Consider routing this case through the fallback chain too, so json_object/plain retries get a chance before surfacing the failure.

♻️ Suggested change

             if parsed is None:
                 refusal = getattr(message, "refusal", None)
                 if refusal:
                     return self._normalize_response(
                         response,
                         content_override=refusal,
                     )
-                raise ValidationException("No parsed content in structured response")
+                fallback_response, content = await self._fallback_structured_response(
+                    params=params,
+                    response_format=response_format,
+                    model=model,
+                )
+                return self._normalize_response(
+                    fallback_response,
+                    content_override=content,
+                )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/llm/backends/openai.py` around lines 214 - 228, The code currently raises
ValidationException when parse() returns parsed is None with no raw_content and
no refusal, skipping the structured fallback chain; change that branch so
instead of raising directly it calls the structured fallback handler (use the
existing _fallback_structured_response logic) to let json_object/plain retries
run: locate the block checking parsed is None, raw_content, and message.refusal
in the method that calls parse(), and replace the final raise
ValidationException with a call to self._fallback_structured_response(response,
response_format, model) (or the appropriate signature used elsewhere) so the
response is routed through the same fallback path guarded by
_STRUCTURED_FALLBACK_ERRORS.

🧹 Nitpick comments (1)

src/llm/backends/openai.py (1)
471-475: 💤 Low value

Consider logger.warning for fallback failures.

Each fallback failure here corresponds to a provider response Honcho could not consume in its preferred format. INFO makes these effectively invisible at typical log levels, even though they indicate provider compatibility/quality issues that an operator likely wants to see (and they precede a possible final ValidationException). WARNING would aid debugging without being noisy, since the loop only logs on actual failure.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/llm/backends/openai.py` around lines 471 - 475, Change the OpenAI
structured output fallback log call from logger.info to logger.warning so
fallback failures are visible at typical log levels; update the call in the code
handling structured output fallbacks (the block referencing fallback_name and
exc.__class__.__name__) to use logger.warning with the same message format and
arguments so it logs "OpenAI structured output fallback %s failed: %s" along
with fallback_name and exc.__class__.__name__.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/llm/backends/openai.py`:
- Around line 397-398: The token counts pulled from the OpenAI `usage` object
may exist but be None, so change the expressions that set input_tokens and
output_tokens to coalesce None to 0 by wrapping the getattr results with an `or
0` fallback (e.g., for `usage` use `(getattr(usage, "prompt_tokens", 0) or 0)`
and `(getattr(usage, "completion_tokens", 0) or 0)`) so `CompletionResult` (or
whatever construct uses those fields) always receives numeric zeros instead of
None.

---

Outside diff comments:
In `@src/llm/backends/openai.py`:
- Around line 214-228: The code currently raises ValidationException when
parse() returns parsed is None with no raw_content and no refusal, skipping the
structured fallback chain; change that branch so instead of raising directly it
calls the structured fallback handler (use the existing
_fallback_structured_response logic) to let json_object/plain retries run:
locate the block checking parsed is None, raw_content, and message.refusal in
the method that calls parse(), and replace the final raise ValidationException
with a call to self._fallback_structured_response(response, response_format,
model) (or the appropriate signature used elsewhere) so the response is routed
through the same fallback path guarded by _STRUCTURED_FALLBACK_ERRORS.

---

Nitpick comments:
In `@src/llm/backends/openai.py`:
- Around line 471-475: Change the OpenAI structured output fallback log call
from logger.info to logger.warning so fallback failures are visible at typical
log levels; update the call in the code handling structured output fallbacks
(the block referencing fallback_name and exc.__class__.__name__) to use
logger.warning with the same message format and arguments so it logs "OpenAI
structured output fallback %s failed: %s" along with fallback_name and
exc.__class__.__name__.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1c1279a1-fc3d-4c76-9698-fe86e40106ac

📥 Commits

Reviewing files that changed from the base of the PR and between a420264 and b5a132b.

📒 Files selected for processing (2)

src/llm/backends/openai.py
tests/llm/test_backends/test_openai.py

Guard OpenAI backend response handling

b5a132b

coderabbitai Bot reviewed May 13, 2026

View reviewed changes

Comment thread src/llm/backends/openai.py Outdated

Address OpenAI fallback review feedback

0879089

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Guard OpenAI backend against malformed compatible responses#677

Guard OpenAI backend against malformed compatible responses#677
ElvesYuki wants to merge 2 commits into
plastic-labs:mainfrom
ElvesYuki:ElvesYuki/openai-response-guardrails

ElvesYuki commented May 13, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 13, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ElvesYuki commented May 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What changed

Tests

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ElvesYuki commented May 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 13, 2026 •

edited

Loading