Image Extraction Improvements for Issue #511 by ljluestc · Pull Request #1136 · run-llama/llama_cloud_services

ljluestc · 2026-03-27T15:15:49Z

Image Extraction Improvements for Issue #511

Problem Statement

Users reported that images in uploaded PDFs are not being recognized or extracted. Specifically, a user uploaded a Chinese microwave oven instruction manual (weibolu.pdf) containing text, images, and tables, but the output did not contain the images.

Job ID: 6a79d5d9-ce02-4103-9055-db03be7e7613

Root Cause Analysis

- The default parsing mode does not prioritize image extraction; users must opt into premium or agent-based parsing.
- Fast mode explicitly skips OCR and image extraction without a clear warning.
- No diagnostic tooling existed to help users understand why images were not extracted.
- There was no convenient API to get both text and images in a single call.
- Chinese documents require language='zh' for optimal OCR.
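The root causes above mostly come down to which options are passed to the parser. As a minimal sketch, the helper below collects the options named in this PR into one keyword dictionary. The helper `image_friendly_options` is hypothetical, and the option names (`language`, `premium_mode`, `fast_mode`, `take_screenshot`) are taken from this PR's description; whether a given library version accepts all of them is an assumption.

```python
from typing import Any, Dict


def image_friendly_options(language: str = "en", premium: bool = True) -> Dict[str, Any]:
    """Build parser keyword arguments that favour image extraction.

    Hypothetical helper: the option names come from the PR description and
    should be checked against the installed library version.
    """
    return {
        "language": language,     # e.g. "zh" for the Chinese manual in the report
        "premium_mode": premium,  # opt in to the heavier parsing mode
        "fast_mode": False,       # fast mode skips OCR and image extraction
        "take_screenshot": True,  # option suggested by the new warning text
    }


options = image_friendly_options(language="zh")
print(options)
# Assumed usage: pass the dictionary to the parser constructor as **options.
```

Keeping the options in one place makes it easier to see at a glance that fast mode is off whenever images are expected.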

Solution Overview

This PR adds comprehensive image extraction diagnostics, warnings, and convenience methods to help users successfully extract images from PDFs.

Migration Guide

For users currently using `load_data()`:

# Before
documents = parser.load_data("document.pdf")

# After (if you need images)
text_documents, image_documents = parser.load_data_with_images("document.pdf")
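For code that has to run against both older and newer versions of the parser, one possible approach is a small wrapper that prefers `load_data_with_images` and falls back to `load_data`. This is only a sketch: `load_text_and_images` and the two stand-in parser classes are hypothetical, and the tuple return shape is assumed from the migration snippet above.

```python
from typing import Any, List, Tuple


def load_text_and_images(parser: Any, path: str) -> Tuple[List[Any], List[Any]]:
    """Return (text_documents, image_documents) for `path`.

    Uses `load_data_with_images` when the parser provides it and falls
    back to `load_data` with an empty image list otherwise.
    """
    combined = getattr(parser, "load_data_with_images", None)
    if callable(combined):
        text_docs, image_docs = combined(path)
        return list(text_docs), list(image_docs)
    return list(parser.load_data(path)), []


class _OldParser:
    """Stand-in for a parser without the new method (hypothetical)."""

    def load_data(self, path: str) -> List[str]:
        return [f"text of {path}"]


class _NewParser(_OldParser):
    """Stand-in for a parser with the new method (hypothetical)."""

    def load_data_with_images(self, path: str) -> Tuple[List[str], List[str]]:
        return self.load_data(path), [f"image from {path}"]


old_result = load_text_and_images(_OldParser(), "document.pdf")
new_result = load_text_and_images(_NewParser(), "document.pdf")
print(old_result)
print(new_result)
```

Callers always receive a two-element tuple, so downstream code does not need to branch on the library version.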

For users currently using `parse()` (shown here in its async form, `aparse()`):

# Before
result = await parser.aparse("document.pdf")

# After (with diagnostics)
result = await parser.aparse("document.pdf")
if not result.has_images():
    result.print_image_extraction_report()
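The diagnostic pattern above can be wrapped in a reusable coroutine. The sketch below uses stand-in classes so it runs without network access: `FakeResult` and `FakeParser` are hypothetical and only mimic the `aparse()`, `has_images()`, and `print_image_extraction_report()` names used in this PR, and the report text is invented for illustration.

```python
import asyncio
from typing import List


class FakeResult:
    """Hypothetical stand-in for JobResult with the two helpers used above."""

    def __init__(self, images: List[str]) -> None:
        self.images = images
        self.report_printed = False

    def has_images(self) -> bool:
        return len(self.images) > 0

    def print_image_extraction_report(self) -> None:
        # Illustrative text only; the real report format is not shown in the PR.
        self.report_printed = True
        print("No images extracted; review language, premium_mode, take_screenshot.")


class FakeParser:
    """Hypothetical stand-in for a parser exposing an async aparse()."""

    def __init__(self, images: List[str]) -> None:
        self._images = images

    async def aparse(self, path: str) -> FakeResult:
        return FakeResult(self._images)


async def parse_with_diagnostics(parser: FakeParser, path: str) -> FakeResult:
    """Parse `path` and print the diagnostic report when no images came back."""
    result = await parser.aparse(path)
    if not result.has_images():
        result.print_image_extraction_report()
    return result


empty = asyncio.run(parse_with_diagnostics(FakeParser([]), "document.pdf"))
full = asyncio.run(parse_with_diagnostics(FakeParser(["page_1.png"]), "document.pdf"))
```

The report is printed only for the empty result, so successful parses stay quiet.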

Related Issues

- Closes #511: Images in uploaded PDFs are not recognized
- Related: Chinese document OCR improvements
- Related: Multimodal RAG documentation

- Add load_data_with_images / aload_data_with_images convenience methods to LlamaParse (Python) that return both text/markdown documents and ImageDocument objects in a single call
- Add loadDataWithImages to LlamaParseReader (TypeScript) that returns documents and image metadata together
- Add JobResult diagnostic helpers: has_images(), get_image_extraction_summary(), get_image_extraction_troubleshooting(), print_image_extraction_report()
- Emit a helpful warning when parse() detects no images were extracted, suggesting language, premium_mode, and take_screenshot options
- Clarify fast_mode description to warn it skips image extraction
- Improve docstrings on load_data / loadData to point users toward image-aware methods
- Add 17 unit tests for the new JobResult methods

Closes run-llama#511
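The warning described in the commit message could be emitted along the following lines. This is a sketch of the general technique using Python's standard `warnings` module, not the code in this PR: the function `warn_if_no_images` and the message wording are hypothetical.

```python
import warnings
from typing import Sequence


def warn_if_no_images(image_names: Sequence[str], fast_mode: bool = False) -> bool:
    """Emit a UserWarning when no images were extracted; return True if warned."""
    if image_names:
        return False
    hint = (
        "No images were extracted. "
        "Consider setting language, premium_mode, or take_screenshot."
    )
    if fast_mode:
        hint += " Note that fast_mode skips image extraction."
    # stacklevel=2 attributes the warning to the caller rather than this helper.
    warnings.warn(hint, UserWarning, stacklevel=2)
    return True


# Capture warnings so the example can inspect what was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warned = warn_if_no_images([], fast_mode=True)

print(warned, [str(w.message) for w in caught])
```

Using `warnings.warn` rather than `print` lets callers silence or escalate the message with the standard warning filters.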

changeset-bot · 2026-03-27T15:15:54Z

⚠️ No Changeset found

Latest commit: cce0830

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

ljluestc changed the title ~~# Image Extraction Improvements for Issue #511~~ Image Extraction Improvements for Issue #511 Mar 27, 2026

ljluestc wants to merge 1 commit into run-llama:main from ljluestc:fix/pdf-image-extraction-511
