fix(hybrid): activate --hybrid-fallback on server-absent path (PDFDLOSP-21)#511
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (5)
WalkthroughAdds client messages that suggest ChangesHybrid fallback and tests
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@java/opendataloader-pdf-cli/src/test/java/org/opendataloader/pdf/cli/CLIMainTest.java`:
- Around line 406-409: The test CLIMainTest currently checks absence of stack
traces using stdoutHolder[0]; extend it to also assert the user-facing fallback
guidance appears by verifying the captured output (out) contains the phrase
"backend unavailable" (or the exact message used by the CLI) and a hint about
using "--hybrid-fallback" (or the exact flag text), while keeping the existing
assertFalse that checks for "\tat " intact; update the assertions to use the
same variable out and make the checks case-sensitive to match the CLI message
text.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: af988605-0846-41d0-bba3-8edb3c5b1bcc
📒 Files selected for processing (5)
java/opendataloader-pdf-cli/src/test/java/org/opendataloader/pdf/cli/CLIMainTest.javajava/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/hybrid/DoclingFastServerClient.javajava/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/hybrid/HancomClient.javajava/opendataloader-pdf-core/src/main/java/org/opendataloader/pdf/processors/HybridDocumentProcessor.javajava/opendataloader-pdf-core/src/test/java/org/opendataloader/pdf/HybridBackendFailureIntegrationTest.java
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
…LOSP-21) CodeRabbit nit on PR opendataloader-project#511: the regression guarded against stack-trace leakage but didn't lock in the friendly message contract this PR promises. CLIMain emits the hybrid unavailability message via LOGGER.SEVERE, which JUL routes to stderr — out of reach of the existing stdout-only capture helper. Attach a LogRecord handler scoped to the CLIMain logger so the test can assert that the captured SEVERE message contains both 'Hybrid server is not available' and the '--hybrid-fallback' hint, while keeping the existing stdout stack-trace guard intact. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…SP-21) Phase 0 health check in HybridDocumentProcessor invoked checkAvailability() outside the fallback try/catch, so a connection refused IOException propagated past every isFallbackToJava() check and the option was a no-op in the server-absent path. Wrap the call and route through a new processAllPagesAsJavaFallback helper when fallback is enabled. Hint --hybrid-fallback in the unavailability message and add regression coverage for fallback ON/OFF/timeout combos. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…LOSP-21) CodeRabbit nit on PR opendataloader-project#511: the regression guarded against stack-trace leakage but didn't lock in the friendly message contract this PR promises. CLIMain emits the hybrid unavailability message via LOGGER.SEVERE, which JUL routes to stderr — out of reach of the existing stdout-only capture helper. Attach a LogRecord handler scoped to the CLIMain logger so the test can assert that the captured SEVERE message contains both 'Hybrid server is not available' and the '--hybrid-fallback' hint, while keeping the existing stdout stack-trace guard intact. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fcdeb3a to
7379db8
Compare
Summary
HybridDocumentProcessor.processDocument) now respects--hybrid-fallback— connection refused / health checkIOExceptionroutes through the Java-only path instead of aborting the run--hybrid-fallbackas an alternative escape hatch (was only suggesting--hybridremoval)Root cause
HybridDocumentProcessor.java:187invokedgetClient(config).checkAvailability()outside the fallback try/catch at lines 240-252. When the hybrid server was absent, the resultingIOExceptionpropagated past everyisFallbackToJava()check (lines 246/259/306) and reachedCLIMain.processFile, so the option was effectively a no-op in the server-absent path. The timeout path remained healthy because chunk-levelIOException(line 546) was caught inside the protected region.This explains the H-6/H-7 equivalence reported in PDFDLOSP-21 (same 23 PDFs failed with identical exit codes regardless of
--hybrid-timeout).Changes
processors/HybridDocumentProcessor.javacheckAvailability()in try/catch; onIOExceptionwith fallback enabled, route through newprocessAllPagesAsJavaFallbackhelper (filterAllPages → processJavaPath → mergeResults → postProcess)hybrid/DoclingFastServerClient.java--hybrid-fallbackhint to the unavailability messagehybrid/HancomClient.javatest/HybridBackendFailureIntegrationTest.javareserveClosedPort()helpercli/CLIMainTest.java\tatgrep, mirrors PDFDLOSP-14 pattern)Test plan
Out of scope (separate PRs)
Summary by CodeRabbit
New Features
Bug Fixes
Tests