Description of the bug
Hi Team,
I have a page where:
- page.get_image_info() returns an empty list, whereas pymupdf4llm.to_markdown() gives 2 page blocks with class 'picture'.
- page.get_text() returns an empty result.
Please find the PDF file attached.
Thanks for the support.
bug_page1.pdf
How to reproduce the bug
Run the Python code below:

import pymupdf
import pymupdf4llm

input_path = "<path_to_file>"
doc = pymupdf.open(input_path)

# Per-page check: plain text and image info as reported by PyMuPDF
for page in doc:
    print(f"text {page.get_text()}")
    image_infos = page.get_image_info(xrefs=True)
    print(f"page number {page.number} image_infos : {image_infos}")

# Whole-document check: page chunks as reported by pymupdf4llm
documentText = pymupdf4llm.to_markdown(doc, page_chunks=True)
print(f"document text : {documentText}")
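To help narrow this down, the reproduction above could be extended with a small per-page summary. This is only a sketch under an unconfirmed assumption: that the regions pymupdf4llm labels as 'picture' might be vector graphics rather than embedded images, and that the page might have no text layer. The helper name summarize_page is hypothetical; it only calls page.get_text(), page.get_images(), page.get_image_info() and page.get_drawings().

```python
import sys


def summarize_page(page):
    """Count what PyMuPDF itself reports for a single page.

    `page` is expected to be a pymupdf.Page; only its get_text,
    get_images, get_image_info and get_drawings methods are used.
    """
    text = page.get_text() or ""  # tolerate an empty or missing result
    return {
        "text_chars": len(text.strip()),
        "embedded_images": len(page.get_images(full=True)),
        "image_infos": len(page.get_image_info(xrefs=True)),
        "vector_paths": len(page.get_drawings()),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    import pymupdf  # imported here so the helper itself has no hard dependency

    with pymupdf.open(sys.argv[1]) as doc:
        for page in doc:
            print(f"page number {page.number}: {summarize_page(page)}")
```

If vector_paths is large while embedded_images and text_chars are both zero, that would be consistent with the assumption above, but it would not confirm it.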
versions:
PyMuPDF 1.27.2.2
pymupdf-layout 1.27.2.2
pymupdf4llm 1.27.2.2
PyMuPDF version
1.27.2.2
Operating system
macOS
Python version
3.14