Skip to content

Guard OpenAI backend against malformed compatible responses#677

Open
ElvesYuki wants to merge 2 commits into
plastic-labs:mainfrom
ElvesYuki:ElvesYuki/openai-response-guardrails
Open

Guard OpenAI backend against malformed compatible responses#677
ElvesYuki wants to merge 2 commits into
plastic-labs:mainfrom
ElvesYuki:ElvesYuki/openai-response-guardrails

Conversation

@ElvesYuki
Copy link
Copy Markdown

@ElvesYuki ElvesYuki commented May 13, 2026

Summary

Fixes #676.

This makes the OpenAI backend a little less trusting when it is talking to OpenAI-compatible providers. A successful HTTP response can still come back with an odd shape from proxies or gateways, so the backend now validates the first choice/message before normalizing it and treats malformed responses as controlled ValidationExceptions instead of raw attribute/index crashes.

It also expands the structured-output fallback path. If parse(response_model) fails or returns an unusable response, Honcho now tries create(json_schema), then create(json_object), then plain create() before giving up.

What changed

  • Added safe first-choice/message extraction for OpenAI chat responses
  • Made missing usage fall back to zero token counts
  • Guarded structured response parsing against malformed response shapes
  • Added structured fallback attempts through json_schema, json_object, and plain create
  • Added regression tests for missing usage, empty choices, missing messages, and fallback progression

Tests

  • uv run ruff check src/llm/backends/openai.py tests/llm/test_backends/test_openai.py
  • uv run pytest tests/llm/test_backends/test_openai.py

Summary by CodeRabbit

  • Bug Fixes

    • More robust handling of missing or malformed API response fields, including defaulting token counts to 0 when usage info is absent.
    • Improved structured-output fallback: tries JSON Schema → JSON object → plain formats more reliably and recovers from parse/validation errors.
  • Tests

    • Added tests covering missing/None usage, malformed responses, and the structured-response fallback sequences.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0bdd4aaa-c86f-4ca3-9968-cbfdc1d2efb1

📥 Commits

Reviewing files that changed from the base of the PR and between b5a132b and 0879089.

📒 Files selected for processing (2)
  • src/llm/backends/openai.py
  • tests/llm/test_backends/test_openai.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/llm/backends/openai.py
  • tests/llm/test_backends/test_openai.py

Walkthrough

This PR hardens the OpenAI backend against malformed provider responses by introducing defensive helpers for choice/message access, implementing a multi-stage fallback loop for structured output, and using getattr()-based field extraction to avoid crashes when responses lack expected attributes or are empty.

Changes

OpenAI Response Robustness & Fallback Enhancement

Layer / File(s) Summary
Response Access Helpers
src/llm/backends/openai.py
New _STRUCTURED_FALLBACK_ERRORS exception tuple identifies when to trigger fallbacks. Helper functions _first_openai_choice() and _first_openai_message() safely extract the first choice and message, raising ValidationException if choices is empty/missing or message is absent.
Structured Output Fallback Mechanism
src/llm/backends/openai.py
Private request builders _create_json_object_response() and _create_plain_response() create requests with alternative response formats. _fallback_structured_response() iterates over fallback options in a single loop (json_schema → json_object → plain), and _parse_or_repair_structured_content() uses helper-based message access and getattr() for safe content extraction.
Defensive Response Normalization
src/llm/backends/openai.py
_normalize_response() now derives choice, message, usage, and finish_reason via helpers and getattr() instead of direct indexing. CompletionResult construction defensively reads message.content and token counts with getattr(..., default=0) to handle missing fields.
Complete Path Integration
src/llm/backends/openai.py
Structured completion path refactored to use helper-based message retrieval, add dedicated fallback path triggered by _STRUCTURED_FALLBACK_ERRORS, and consolidate parsing/refusal/content extraction around helper access. LengthFinishReasonError handling and refusal extraction switch from direct response indexing to helper-provided message.
Test Coverage for Robustness & Fallback
tests/llm/test_backends/test_openai.py
New StructuredAnswer Pydantic model and four test functions verify missing-usage defaults to zero, malformed responses raise ValidationException, and structured fallback sequences (json_schemajson_object, json_objectplain) are attempted in order by inspecting client.chat.completions.create() call payloads.

Sequence Diagram

sequenceDiagram
  participant Caller as complete()
  participant Helpers as _first_openai_message/_first_openai_choice
  participant Parse as parse/repair
  participant Fallback as _fallback_structured_response
  participant Provider as chat.completions.create
  Caller->>Helpers: extract first message/choice
  Helpers-->>Caller: message or ValidationException
  Caller->>Parse: attempt parse using message.content
  Parse-->>Caller: parsed result or raise structured parse errors
  Caller->>Fallback: on structured parse/fallback errors
  Fallback->>Provider: create(request with json_schema/json_object/plain)
  Provider-->>Fallback: response used for next parse attempt
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through messages, sniffing empty slots,
Helpers in paw to patch the clumsy plots.
Schema tries once, then JSON tries again,
If all else fails, plain text brings the zen.
Safe hops, small fixes — stability hops on! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 5.26% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding defensive guards to the OpenAI backend against malformed responses.
Linked Issues check ✅ Passed The PR fully addresses issue #676 requirements: defensive response handling with ValidationException, safe choice/message extraction helpers, guarded parsing, missing usage handling, and expanded structured fallback chain (parse → json_schema → json_object → plain).
Out of Scope Changes check ✅ Passed All changes are within scope: defensive modifications to OpenAI backend response handling, new validation helpers, structured fallback routing, and comprehensive regression tests covering malformed responses and fallback progression.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/llm/backends/openai.py (1)

214-228: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unusable-content case skips the fallback chain.

When parse() succeeds but returns parsed=None, no raw_content, and no refusal, line 228 raises ValidationException immediately. The PR objective states fallback should engage when parse(response_model) "fails or returns unusable content". The other "no content at all" paths route through _fallback_structured_response via the _STRUCTURED_FALLBACK_ERRORS catch on line 204; this empty-but-well-shaped response does not. Consider routing this case through the fallback chain too, so json_object/plain retries get a chance before surfacing the failure.

♻️ Suggested change
             if parsed is None:
                 refusal = getattr(message, "refusal", None)
                 if refusal:
                     return self._normalize_response(
                         response,
                         content_override=refusal,
                     )
-                raise ValidationException("No parsed content in structured response")
+                fallback_response, content = await self._fallback_structured_response(
+                    params=params,
+                    response_format=response_format,
+                    model=model,
+                )
+                return self._normalize_response(
+                    fallback_response,
+                    content_override=content,
+                )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/llm/backends/openai.py` around lines 214 - 228, The code currently raises
ValidationException when parse() returns parsed is None with no raw_content and
no refusal, skipping the structured fallback chain; change that branch so
instead of raising directly it calls the structured fallback handler (use the
existing _fallback_structured_response logic) to let json_object/plain retries
run: locate the block checking parsed is None, raw_content, and message.refusal
in the method that calls parse(), and replace the final raise
ValidationException with a call to self._fallback_structured_response(response,
response_format, model) (or the appropriate signature used elsewhere) so the
response is routed through the same fallback path guarded by
_STRUCTURED_FALLBACK_ERRORS.
🧹 Nitpick comments (1)
src/llm/backends/openai.py (1)

471-475: 💤 Low value

Consider logger.warning for fallback failures.

Each fallback failure here corresponds to a provider response Honcho could not consume in its preferred format. INFO makes these effectively invisible at typical log levels, even though they indicate provider compatibility/quality issues that an operator likely wants to see (and they precede a possible final ValidationException). WARNING would aid debugging without being noisy, since the loop only logs on actual failure.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/llm/backends/openai.py` around lines 471 - 475, Change the OpenAI
structured output fallback log call from logger.info to logger.warning so
fallback failures are visible at typical log levels; update the call in the code
handling structured output fallbacks (the block referencing fallback_name and
exc.__class__.__name__) to use logger.warning with the same message format and
arguments so it logs "OpenAI structured output fallback %s failed: %s" along
with fallback_name and exc.__class__.__name__.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/llm/backends/openai.py`:
- Around line 397-398: The token counts pulled from the OpenAI `usage` object
may exist but be None, so change the expressions that set input_tokens and
output_tokens to coalesce None to 0 by wrapping the getattr results with an `or
0` fallback (e.g., for `usage` use `(getattr(usage, "prompt_tokens", 0) or 0)`
and `(getattr(usage, "completion_tokens", 0) or 0)`) so `CompletionResult` (or
whatever construct uses those fields) always receives numeric zeros instead of
None.

---

Outside diff comments:
In `@src/llm/backends/openai.py`:
- Around line 214-228: The code currently raises ValidationException when
parse() returns parsed is None with no raw_content and no refusal, skipping the
structured fallback chain; change that branch so instead of raising directly it
calls the structured fallback handler (use the existing
_fallback_structured_response logic) to let json_object/plain retries run:
locate the block checking parsed is None, raw_content, and message.refusal in
the method that calls parse(), and replace the final raise ValidationException
with a call to self._fallback_structured_response(response, response_format,
model) (or the appropriate signature used elsewhere) so the response is routed
through the same fallback path guarded by _STRUCTURED_FALLBACK_ERRORS.

---

Nitpick comments:
In `@src/llm/backends/openai.py`:
- Around line 471-475: Change the OpenAI structured output fallback log call
from logger.info to logger.warning so fallback failures are visible at typical
log levels; update the call in the code handling structured output fallbacks
(the block referencing fallback_name and exc.__class__.__name__) to use
logger.warning with the same message format and arguments so it logs "OpenAI
structured output fallback %s failed: %s" along with fallback_name and
exc.__class__.__name__.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1c1279a1-fc3d-4c76-9698-fe86e40106ac

📥 Commits

Reviewing files that changed from the base of the PR and between a420264 and b5a132b.

📒 Files selected for processing (2)
  • src/llm/backends/openai.py
  • tests/llm/test_backends/test_openai.py

Comment thread src/llm/backends/openai.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] OpenAI-compatible chat responses can crash backend when choices or usage are missing

1 participant