Commit 1f14e49

bundolee and claude committed
fix: add install and setup instructions to hybrid error message and CLI help
Problem: Users and AI agents encountering --hybrid for the first time cannot self-resolve. The CLI description says "Hybrid backend for AI processing" with no hint that a server is required. The connection error message mentions the server command but omits the pip install step and --hybrid-url for custom servers.

Solution:
- CLI --hybrid description: state "requires a running server" with quick start commands and --hybrid-url for remote servers
- Error message: add step-by-step install/start instructions and --hybrid-url hint for remote/custom port setups
- Run npm run sync to regenerate Python/Node.js bindings and docs

Verification:
- DoclingFastServerClientTest (10/10): new assertions verify install instructions and server start command are present in error output
- Subagent test: a fresh agent with no README/CLAUDE.md context resolved hybrid setup using only --help and error messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
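As a rough illustration of the error-message change described in the commit message, the sketch below builds a connection error that carries the install step, the server start command, and the --hybrid-url hint. The function name `hybridConnectionError`, the message wording, and the `http://localhost:5002` address are assumptions for illustration only; the real message is produced by the project's own client and may read differently. Only the two quoted commands and the `--hybrid-url` flag come from the commit itself.

```typescript
// Hypothetical helper, not the project's actual implementation.
// It shows the shape of a self-explanatory error: what failed,
// how to install, how to start, and how to point at another server.
function hybridConnectionError(url: string): string {
  return [
    `Could not connect to the hybrid server at ${url}.`,
    "To set up the hybrid backend:",
    '  1. Install: pip install "opendataloader-pdf[hybrid]"',
    "  2. Start the server: opendataloader-pdf-hybrid --port 5002",
    "If the server runs on another host or port, pass its address with --hybrid-url.",
  ].join("\n");
}

// Assumed local address; port 5002 matches the quick start command.
console.log(hybridConnectionError("http://localhost:5002"));
```

Checking for the install and start commands as plain substrings mirrors the kind of assertion the commit says was added to DoclingFastServerClientTest.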
1 parent 753434b commit 1f14e49

11 files changed

Lines changed: 98 additions & 89 deletions

content/docs/_generated/node-convert-options.mdx

Lines changed: 27 additions & 27 deletions
@@ -7,30 +7,30 @@ description: Options for the Node.js convert function
 {/* Run `npm run generate-options` to regenerate */}


-| Option | Type | Default | Description |
-|-------------------------|----------------------|--------------|------------------------------------------------------------------------------------------------------------------------------------|
-| `outputDir` | `string` | - | Directory where output files are written. Default: input file directory |
-| `password` | `string` | - | Password for encrypted PDF files |
-| `format` | `string \| string[]` | - | Output formats (comma-separated). Values: json, text, html, pdf, markdown, markdown-with-html, markdown-with-images. Default: json |
-| `quiet` | `boolean` | `false` | Suppress console logging output |
-| `contentSafetyOff` | `string \| string[]` | - | Disable content safety filters. Values: all, hidden-text, off-page, tiny, hidden-ocg |
-| `sanitize` | `boolean` | `false` | Enable sensitive data sanitization. Replaces emails, phone numbers, IPs, credit cards, and URLs with placeholders |
-| `keepLineBreaks` | `boolean` | `false` | Preserve original line breaks in extracted text |
-| `replaceInvalidChars` | `string` | `" "` | Replacement character for invalid/unrecognized characters. Default: space |
-| `useStructTree` | `boolean` | `false` | Use PDF structure tree (tagged PDF) for reading order and semantic structure |
-| `tableMethod` | `string` | `"default"` | Table detection method. Values: default (border-based), cluster (border + cluster). Default: default |
-| `readingOrder` | `string` | `"xycut"` | Reading order algorithm. Values: off, xycut. Default: xycut |
-| `markdownPageSeparator` | `string` | - | Separator between pages in Markdown output. Use %page-number% for page numbers. Default: none |
-| `textPageSeparator` | `string` | - | Separator between pages in text output. Use %page-number% for page numbers. Default: none |
-| `htmlPageSeparator` | `string` | - | Separator between pages in HTML output. Use %page-number% for page numbers. Default: none |
-| `imageOutput` | `string` | `"external"` | Image output mode. Values: off (no images), embedded (Base64 data URIs), external (file references). Default: external |
-| `imageFormat` | `string` | `"png"` | Output format for extracted images. Values: png, jpeg. Default: png |
-| `imageDir` | `string` | - | Directory for extracted images |
-| `pages` | `string` | - | Pages to extract (e.g., "1,3,5-7"). Default: all pages |
-| `includeHeaderFooter` | `boolean` | `false` | Include page headers and footers in output |
-| `detectStrikethrough` | `boolean` | `false` | Detect strikethrough text and wrap with ~~ in Markdown output (experimental) |
-| `hybrid` | `string` | `"off"` | Hybrid backend for AI processing. Values: off (default), docling-fast |
-| `hybridMode` | `string` | `"auto"` | Hybrid triage mode. Values: auto (default, dynamic triage), full (skip triage, all pages to backend) |
-| `hybridUrl` | `string` | - | Hybrid backend server URL (overrides default) |
-| `hybridTimeout` | `string` | `"0"` | Hybrid backend request timeout in milliseconds (0 = no timeout). Default: 0 |
-| `hybridFallback` | `boolean` | `false` | Opt in to Java fallback on hybrid backend error (default: disabled) |
+| Option | Type | Default | Description |
+|-------------------------|----------------------|--------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `outputDir` | `string` | - | Directory where output files are written. Default: input file directory |
+| `password` | `string` | - | Password for encrypted PDF files |
+| `format` | `string \| string[]` | - | Output formats (comma-separated). Values: json, text, html, pdf, markdown, markdown-with-html, markdown-with-images. Default: json |
+| `quiet` | `boolean` | `false` | Suppress console logging output |
+| `contentSafetyOff` | `string \| string[]` | - | Disable content safety filters. Values: all, hidden-text, off-page, tiny, hidden-ocg |
+| `sanitize` | `boolean` | `false` | Enable sensitive data sanitization. Replaces emails, phone numbers, IPs, credit cards, and URLs with placeholders |
+| `keepLineBreaks` | `boolean` | `false` | Preserve original line breaks in extracted text |
+| `replaceInvalidChars` | `string` | `" "` | Replacement character for invalid/unrecognized characters. Default: space |
+| `useStructTree` | `boolean` | `false` | Use PDF structure tree (tagged PDF) for reading order and semantic structure |
+| `tableMethod` | `string` | `"default"` | Table detection method. Values: default (border-based), cluster (border + cluster). Default: default |
+| `readingOrder` | `string` | `"xycut"` | Reading order algorithm. Values: off, xycut. Default: xycut |
+| `markdownPageSeparator` | `string` | - | Separator between pages in Markdown output. Use %page-number% for page numbers. Default: none |
+| `textPageSeparator` | `string` | - | Separator between pages in text output. Use %page-number% for page numbers. Default: none |
+| `htmlPageSeparator` | `string` | - | Separator between pages in HTML output. Use %page-number% for page numbers. Default: none |
+| `imageOutput` | `string` | `"external"` | Image output mode. Values: off (no images), embedded (Base64 data URIs), external (file references). Default: external |
+| `imageFormat` | `string` | `"png"` | Output format for extracted images. Values: png, jpeg. Default: png |
+| `imageDir` | `string` | - | Directory for extracted images |
+| `pages` | `string` | - | Pages to extract (e.g., "1,3,5-7"). Default: all pages |
+| `includeHeaderFooter` | `boolean` | `false` | Include page headers and footers in output |
+| `detectStrikethrough` | `boolean` | `false` | Detect strikethrough text and wrap with ~~ in Markdown output (experimental) |
+| `hybrid` | `string` | `"off"` | Hybrid backend (requires a running server). Quick start: pip install "opendataloader-pdf[hybrid]" && opendataloader-pdf-hybrid --port 5002. For remote servers use --hybrid-url. Values: off (default), docling-fast |
+| `hybridMode` | `string` | `"auto"` | Hybrid triage mode. Values: auto (default, dynamic triage), full (skip triage, all pages to backend) |
+| `hybridUrl` | `string` | - | Hybrid backend server URL (overrides default) |
+| `hybridTimeout` | `string` | `"0"` | Hybrid backend request timeout in milliseconds (0 = no timeout). Default: 0 |
+| `hybridFallback` | `boolean` | `false` | Opt in to Java fallback on hybrid backend error (default: disabled) |
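
To make the hybrid rows of the options table concrete, here is a minimal sketch of how a caller might combine them for a remote server. The `HybridOptions` interface and the `withHybridDefaults` helper are hypothetical and written only for this example; they are not the package's exported types or API. The option names, types, and defaults are taken from the table, and the server address is an assumed placeholder.

```typescript
// Hypothetical subset of the documented options; types follow the "Type" column.
interface HybridOptions {
  hybrid?: string;          // "off" (default) or "docling-fast"
  hybridMode?: string;      // "auto" (default) or "full"
  hybridUrl?: string;       // overrides the default server URL
  hybridTimeout?: string;   // milliseconds, "0" means no timeout
  hybridFallback?: boolean; // opt in to Java fallback on backend error
}

// Defaults as listed in the "Default" column (hybridUrl has none).
const HYBRID_DEFAULTS = {
  hybrid: "off",
  hybridMode: "auto",
  hybridTimeout: "0",
  hybridFallback: false,
};

// Hypothetical helper: caller-supplied values win over the documented defaults.
function withHybridDefaults(opts: HybridOptions) {
  return { ...HYBRID_DEFAULTS, ...opts };
}

// Enable the docling-fast backend on a server that is not the local default.
const remote = withHybridDefaults({
  hybrid: "docling-fast",
  hybridUrl: "http://pdf-backend.internal:5002", // assumed example address
});
console.log(remote);
```

Because `hybrid` defaults to `"off"`, setting `hybridUrl` alone would presumably have no effect; the backend has to be selected explicitly, and the server has to be running before the conversion starts.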
