Picture classification and description silently skipped for DOCX/PPTX/XLSX/HTML — format_options dict only includes PDF and IMAGE

## Summary

When submitting a DOCX (or PPTX/XLSX/HTML) to docling-serve with `do_picture_description=true`
and `do_picture_classification=true`, the flags are silently ignored. The resulting JSON has
`meta=null`, `annotations=[]`, and `captions=[]` on every picture. No error, no warning.

Root cause is in `docling-jobkit`: the `DoclingConverterManager` only registers
`format_options` entries for `InputFormat.PDF` and `InputFormat.IMAGE`. All other formats
fall back to docling's bare `WordFormatOption()` / `PowerpointFormatOption()` / etc., which
default to `do_picture_description=False` and `do_picture_classification=False`.

Since docling 2.52.0 (PR docling-project/docling#2251, 2025-09-11), `ConvertPipeline` and
`BaseItemAndImageEnrichmentModel.prepare_element` explicitly support enrichment for
documents without page images (DOCX/HTML). That support landed in docling 8 months ago,
but docling-jobkit has never passed the request flags through to the office `FormatOption`s.

## Versions

- docling-serve 1.13.1
- docling-jobkit 1.11.0 (also reproduced on `main` at b6b2e02)
- docling 2.74.0
- docling-core 2.65.2

## Reproduction

POST a DOCX containing embedded images to `/v1/convert/file/async`:

```bash
curl -X POST https://<docling-serve>/v1/convert/file/async \
  -H "X-Api-Key: ..." \
  -F "files=@with-images.docx;type=application/vnd.openxmlformats-officedocument.wordprocessingml.document" \
  -F "to_formats=json" -F "to_formats=md" \
  -F "do_picture_classification=true" \
  -F "do_picture_description=true" \
  -F 'picture_description_api={"url":"https://your-vlm/v1/chat/completions","params":{"model":"..."},"headers":{"Authorization":"Bearer ..."},"prompt":"Describe the image."}'
```

Expected: pictures with `meta.classification.predictions` and `meta.description` set,
plus `annotations` array populated.

Actual: every picture has `meta=null`, `annotations=[]`, processing completes in seconds
(no VLM call made).
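
To check a result quickly, a small helper along these lines can count how many pictures carry enrichment output. It is a minimal sketch: `summarize_picture_enrichment` is a hypothetical name, and it assumes you pass it the exported document dict (the object that holds the top-level `pictures` list), wherever that sits in your response payload.

```python
from typing import Any


def summarize_picture_enrichment(doc: dict[str, Any]) -> dict[str, int]:
    """Count pictures in an exported document dict that carry enrichment output."""
    pictures = doc.get("pictures") or []
    summary = {"total": len(pictures), "with_meta": 0, "with_annotations": 0}
    for picture in pictures:
        # `meta` is null and `annotations` is empty when enrichment was skipped.
        if picture.get("meta"):
            summary["with_meta"] += 1
        if picture.get("annotations"):
            summary["with_annotations"] += 1
    return summary


# Hand-written sample shaped like the failing DOCX output described above.
skipped = {"pictures": [{"meta": None, "annotations": [], "captions": []}]}
print(summarize_picture_enrichment(skipped))
# → {'total': 1, 'with_meta': 0, 'with_annotations': 0}
```

If `with_meta` and `with_annotations` are both 0 while `total` is non-zero, the document shows the symptom reported here.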

## Control

The same request with a PDF works correctly: the classifier fills 18 class predictions per
picture and the VLM is called for each picture whose area exceeds the threshold. This confirms
that the request flags are parsed and that docling-core's enrichment is functional. The only
difference is the input format and which `FormatOption` it routes through.

## Bug location

[`docling_jobkit/convert/manager.py`](https://github.com/docling-project/docling-jobkit/blob/main/docling_jobkit/convert/manager.py) (lines 473–478 on main):

```python
format_options: dict[InputFormat, FormatOption] = {
    InputFormat.PDF: pdf_format_option,
    InputFormat.IMAGE: image_format_option,
}
return DocumentConverter(format_options=format_options)
```

DOCX/PPTX/XLSX/HTML never get a `FormatOption` carrying the request's pipeline options.
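
The effect of the missing keys can be illustrated with a toy model that does not import docling. Everything in it (`ToyPipelineOptions`, `resolve_format_options`, the string format names) is a hypothetical stand-in and not docling's or docling-jobkit's code. It assumes only the behavior described above: a format with no registered entry gets a default-constructed option object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPipelineOptions:
    # Hypothetical stand-in for a pipeline options object; both flags default to off.
    do_picture_classification: bool = False
    do_picture_description: bool = False


def resolve_format_options(
    supported: list[str], overrides: dict[str, ToyPipelineOptions]
) -> dict[str, ToyPipelineOptions]:
    """Use the caller's entry where one exists, else a default-constructed one."""
    return {fmt: overrides.get(fmt, ToyPipelineOptions()) for fmt in supported}


# Options built from a request that asked for both enrichments.
requested = ToyPipelineOptions(do_picture_classification=True, do_picture_description=True)
resolved = resolve_format_options(
    ["pdf", "image", "docx", "pptx", "xlsx", "html"],
    {"pdf": requested, "image": requested},  # mirrors the two keys registered today
)
for fmt, opts in resolved.items():
    print(fmt, opts.do_picture_description)
```

In this model only `pdf` and `image` print `True`. The other formats silently receive the defaults, which matches the absence of any error or warning in the real service.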

## Suggested fix

Build a `ConvertPipelineOptions` from the request flags and register entries for all
non-PDF formats docling supports as input. Sketch:

```python
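from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import ConvertPipelineOptions
from docling.document_converter import ExcelFormatOption, HTMLFormatOption, PowerpointFormatOption, WordFormatOption

# Carry the request's enrichment flags into the pipeline used for non-PDF inputs.
# `request` and `picture_description_options` come from the surrounding manager code.
convert_pipeline_options = ConvertPipelineOptions(
    do_picture_classification=request.do_picture_classification,
    do_picture_description=request.do_picture_description,
    picture_description_options=picture_description_options,
    do_chart_extraction=request.do_chart_extraction,
)
# Register each office/HTML format so it no longer falls back to default options.
format_options[InputFormat.DOCX] = WordFormatOption(pipeline_options=convert_pipeline_options)
format_options[InputFormat.PPTX] = PowerpointFormatOption(pipeline_options=convert_pipeline_options)
format_options[InputFormat.XLSX] = ExcelFormatOption(pipeline_options=convert_pipeline_options)
format_options[InputFormat.HTML] = HTMLFormatOption(pipeline_options=convert_pipeline_options)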
```

Happy to send a PR if the maintainers confirm this is the right approach.

## Related

- docling 2.52.0 PR docling-project/docling#2251 — `ConvertPipeline` adds enrichment for
  DOCX/HTML
- docling 2.55.x issue docling-project/docling#2401 — same fix extended to PPTX/XLSX
- docling-serve issue docling-project/docling-serve#298 — closed without maintainer
  response, asks the same question

