Guard OpenAI backend against malformed compatible responses#677
Guard OpenAI backend against malformed compatible responses#677ElvesYuki wants to merge 2 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughThis PR hardens the OpenAI backend against malformed provider responses by introducing defensive helpers for choice/message access, implementing a multi-stage fallback loop for structured output, and using ChangesOpenAI Response Robustness & Fallback Enhancement
Sequence DiagramsequenceDiagram
participant Caller as complete()
participant Helpers as _first_openai_message/_first_openai_choice
participant Parse as parse/repair
participant Fallback as _fallback_structured_response
participant Provider as chat.completions.create
Caller->>Helpers: extract first message/choice
Helpers-->>Caller: message or ValidationException
Caller->>Parse: attempt parse using message.content
Parse-->>Caller: parsed result or raise structured parse errors
Caller->>Fallback: on structured parse/fallback errors
Fallback->>Provider: create(request with json_schema/json_object/plain)
Provider-->>Fallback: response used for next parse attempt
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/llm/backends/openai.py (1)
214-228:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUnusable-content case skips the fallback chain.
When
parse()succeeds but returnsparsed=None, noraw_content, and norefusal, line 228 raisesValidationExceptionimmediately. The PR objective states fallback should engage whenparse(response_model)"fails or returns unusable content". The other "no content at all" paths route through_fallback_structured_responsevia the_STRUCTURED_FALLBACK_ERRORScatch on line 204; this empty-but-well-shaped response does not. Consider routing this case through the fallback chain too, sojson_object/plainretries get a chance before surfacing the failure.♻️ Suggested change
if parsed is None: refusal = getattr(message, "refusal", None) if refusal: return self._normalize_response( response, content_override=refusal, ) - raise ValidationException("No parsed content in structured response") + fallback_response, content = await self._fallback_structured_response( + params=params, + response_format=response_format, + model=model, + ) + return self._normalize_response( + fallback_response, + content_override=content, + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llm/backends/openai.py` around lines 214 - 228, The code currently raises ValidationException when parse() returns parsed is None with no raw_content and no refusal, skipping the structured fallback chain; change that branch so instead of raising directly it calls the structured fallback handler (use the existing _fallback_structured_response logic) to let json_object/plain retries run: locate the block checking parsed is None, raw_content, and message.refusal in the method that calls parse(), and replace the final raise ValidationException with a call to self._fallback_structured_response(response, response_format, model) (or the appropriate signature used elsewhere) so the response is routed through the same fallback path guarded by _STRUCTURED_FALLBACK_ERRORS.
🧹 Nitpick comments (1)
src/llm/backends/openai.py (1)
471-475: 💤 Low valueConsider
logger.warningfor fallback failures.Each fallback failure here corresponds to a provider response Honcho could not consume in its preferred format. INFO makes these effectively invisible at typical log levels, even though they indicate provider compatibility/quality issues that an operator likely wants to see (and they precede a possible final
ValidationException). WARNING would aid debugging without being noisy, since the loop only logs on actual failure.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llm/backends/openai.py` around lines 471 - 475, Change the OpenAI structured output fallback log call from logger.info to logger.warning so fallback failures are visible at typical log levels; update the call in the code handling structured output fallbacks (the block referencing fallback_name and exc.__class__.__name__) to use logger.warning with the same message format and arguments so it logs "OpenAI structured output fallback %s failed: %s" along with fallback_name and exc.__class__.__name__.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/llm/backends/openai.py`:
- Around line 397-398: The token counts pulled from the OpenAI `usage` object
may exist but be None, so change the expressions that set input_tokens and
output_tokens to coalesce None to 0 by wrapping the getattr results with an `or
0` fallback (e.g., for `usage` use `(getattr(usage, "prompt_tokens", 0) or 0)`
and `(getattr(usage, "completion_tokens", 0) or 0)`) so `CompletionResult` (or
whatever construct uses those fields) always receives numeric zeros instead of
None.
---
Outside diff comments:
In `@src/llm/backends/openai.py`:
- Around line 214-228: The code currently raises ValidationException when
parse() returns parsed is None with no raw_content and no refusal, skipping the
structured fallback chain; change that branch so instead of raising directly it
calls the structured fallback handler (use the existing
_fallback_structured_response logic) to let json_object/plain retries run:
locate the block checking parsed is None, raw_content, and message.refusal in
the method that calls parse(), and replace the final raise ValidationException
with a call to self._fallback_structured_response(response, response_format,
model) (or the appropriate signature used elsewhere) so the response is routed
through the same fallback path guarded by _STRUCTURED_FALLBACK_ERRORS.
---
Nitpick comments:
In `@src/llm/backends/openai.py`:
- Around line 471-475: Change the OpenAI structured output fallback log call
from logger.info to logger.warning so fallback failures are visible at typical
log levels; update the call in the code handling structured output fallbacks
(the block referencing fallback_name and exc.__class__.__name__) to use
logger.warning with the same message format and arguments so it logs "OpenAI
structured output fallback %s failed: %s" along with fallback_name and
exc.__class__.__name__.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1c1279a1-fc3d-4c76-9698-fe86e40106ac
📒 Files selected for processing (2)
src/llm/backends/openai.pytests/llm/test_backends/test_openai.py
Summary
Fixes #676.
This makes the OpenAI backend a little less trusting when it is talking to OpenAI-compatible providers. A successful HTTP response can still come back with an odd shape from proxies or gateways, so the backend now validates the first choice/message before normalizing it and treats malformed responses as controlled
ValidationExceptions instead of raw attribute/index crashes.It also expands the structured-output fallback path. If
parse(response_model)fails or returns an unusable response, Honcho now triescreate(json_schema), thencreate(json_object), then plaincreate()before giving up.What changed
usagefall back to zero token countsjson_schema,json_object, and plaincreateTests
uv run ruff check src/llm/backends/openai.py tests/llm/test_backends/test_openai.pyuv run pytest tests/llm/test_backends/test_openai.pySummary by CodeRabbit
Bug Fixes
Tests