From 7d29fc27293c777310ccb91755298bab7acc80d8 Mon Sep 17 00:00:00 2001 From: Jamie Lemon Date: Tue, 19 Aug 2025 14:02:29 +0100 Subject: [PATCH 1/3] Fixes docs indentation for argument. --- docs/shape.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/shape.rst b/docs/shape.rst index 582b12197..c3a3cb585 100644 --- a/docs/shape.rst +++ b/docs/shape.rst @@ -345,7 +345,7 @@ Several draw methods can be executed in a row and each one of them will contribu :arg float lineheight: a factor to override the line height calculated from font properties. If not `None`, a line height of `fontsize * lineheight` will be used. - :arg int expandtabs: controls handling of tab characters ``\t`` using the `string.expandtabs()` method **per each line**. + :arg int expandtabs: controls handling of tab characters ``\t`` using the `string.expandtabs()` method **per each line**. :arg float stroke_opacity: *(new in v1.18.1)* set transparency for stroke colors. Negative values and values > 1 will be ignored. Default is 1 (intransparent). :arg float fill_opacity: *(new in v1.18.1)* set transparency for fill colors. Default is 1 (intransparent). Use this value to control transparency of the text color. Stroke opacity **only** affects the border line of characters. From 26078e47b876377f381c684d593e320ae70c5bfb Mon Sep 17 00:00:00 2001 From: Jamie Lemon Date: Tue, 19 Aug 2025 14:05:14 +0100 Subject: [PATCH 2/3] Amends description for pno parameter for insert_page & new_page. --- docs/document.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/document.rst b/docs/document.rst index b8791a61c..a3229753a 100644 --- a/docs/document.rst +++ b/docs/document.rst @@ -1448,7 +1448,7 @@ For details on **embedded files** refer to Appendix 3. PDF only: Insert an empty page. - :arg int pno: page number in front of which the new page should be inserted. Must be in `1 < pno <= page_count`. Special values -1 and *doc.page_count* insert **after** the last page. 
+ :arg int pno: zero-based page index at which to insert the page. Special values -1 and *doc.page_count* insert **after** the last page. :arg float width: page width. :arg float height: page height. @@ -1468,7 +1468,7 @@ For details on **embedded files** refer to Appendix 3. PDF only: Insert a new page and insert some text. Convenience function which combines :meth:`Document.new_page` and (parts of) :meth:`Page.insert_text`. - :arg int pno: page number (0-based) **in front of which** to insert. Must be in `range(-1, doc.page_count + 1)`. Special values -1 and `doc.page_count` insert **after** the last page. + :arg int pno: zero-based page index at which to insert the page. Special values -1 and `doc.page_count` insert **after** the last page. Changed in v1.14.12 This is now a positional parameter From bcf22a2d64b3f6673300dc24f031f25097a7e2a5 Mon Sep 17 00:00:00 2001 From: Jamie Lemon Date: Fri, 22 Aug 2025 15:44:08 +0100 Subject: [PATCH 3/3] Updates docs to include section for converting files. --- docs/converting-files.rst | 97 +++++++++++++++++++++++++++++++++++++ docs/how-to-open-a-file.rst | 37 ++++++++++++++ docs/recipes.rst | 8 +-- 3 files changed, 139 insertions(+), 3 deletions(-) create mode 100644 docs/converting-files.rst diff --git a/docs/converting-files.rst b/docs/converting-files.rst new file mode 100644 index 000000000..d27da3679 --- /dev/null +++ b/docs/converting-files.rst @@ -0,0 +1,97 @@ +.. include:: header.rst + +.. _ConvertingFiles: + +============================== +Converting Files +============================== + + + +Files to PDF +~~~~~~~~~~~~~~~~~~ + +:ref:`Document types supported by PyMuPDF` can be converted to |PDF| with the :meth:`Document.convert_to_pdf` method. This method returns the PDF data as a bytes object, which |PyMuPDF| can then open as a new |PDF|. + + + +**Example** + +.. 
code-block:: python + + import pymupdf + + xps = pymupdf.open("input.xps") + pdfbytes = xps.convert_to_pdf() + pdf = pymupdf.open("pdf", pdfbytes) + pdf.save("output.pdf") + + + +PDF to SVG +~~~~~~~~~~~~~~~~~~ + +Because an SVG file cannot contain multiple pages, each page must be exported as a separate SVG. + +To get an SVG representation of a page, use the :meth:`Page.get_svg_image` method. + +**Example** + +.. code-block:: python + + import pymupdf + + doc = pymupdf.open("input.pdf") + page = doc[0] + + # Convert page to SVG + svg_content = page.get_svg_image() + + # Save to file + with open("output.svg", "w", encoding="utf-8") as f: + f.write(svg_content) + + doc.close() + + +PDF to Markdown +~~~~~~~~~~~~~~~~~ + +By utilizing the :doc:`PyMuPDF4LLM API ` we are able to convert PDF to a Markdown representation. + +**Example** + +.. code-block:: python + + import pymupdf4llm + import pathlib + + md_text = pymupdf4llm.to_markdown("test.pdf") + print(md_text) + + pathlib.Path("4llm-output.md").write_bytes(md_text.encode()) + + +PDF to DOCX +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Use the pdf2docx_ library, which uses |PyMuPDF| to convert documents from |PDF| to **DOCX** format. + + + +**Example** + +.. code-block:: python + + from pdf2docx import Converter + + pdf_file = 'input.pdf' + docx_file = 'output.docx' + + # convert pdf to docx + cv = Converter(pdf_file) + cv.convert(docx_file) # all pages by default + cv.close() + + +.. include:: footer.rst diff --git a/docs/how-to-open-a-file.rst b/docs/how-to-open-a-file.rst index 8899066af..1afd8e503 100644 --- a/docs/how-to-open-a-file.rst +++ b/docs/how-to-open-a-file.rst @@ -11,9 +11,15 @@ Opening Files .. _Supported_File_Types: + Supported File Types ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +| + +PyMuPDF +""""""""" + |PyMuPDF| can open files other than just |PDF|. The following file types are supported: @@ -21,6 +27,37 @@ The following file types are supported: .. 
include:: supported-files-table.rst +---- + + +PyMuPDF Pro +""""""""""""""" + +|PyMuPDF Pro| can open Office files. + +The following file types are supported: + +.. list-table:: + :header-rows: 1 + + * - **DOC/DOCX** + - **XLS/XLSX** + - **PPT/PPTX** + - **HWP/HWPX** + * - .. image:: images/icons/icon-docx.svg + :width: 40 + :height: 40 + - .. image:: images/icons/icon-xlsx.svg + :width: 40 + :height: 40 + - .. image:: images/icons/icon-pptx.svg + :width: 40 + :height: 40 + - .. image:: images/icons/icon-hangul.svg + :width: 40 + :height: 40 + + How to Open a File ~~~~~~~~~~~~~~~~~~~~~ diff --git a/docs/recipes.rst b/docs/recipes.rst index b775111ca..c0125d50b 100644 --- a/docs/recipes.rst +++ b/docs/recipes.rst @@ -10,6 +10,11 @@ how-to-open-a-file.rst +---- + +.. toctree:: + + converting-files.rst ---- @@ -18,21 +23,18 @@ recipes-text.rst - ---- .. toctree:: recipes-images.rst - ---- .. toctree:: recipes-annotations.rst - ---- .. toctree::