Commit 42ec4af

Documentation updates for version 1.28.0
1 parent 610a665 commit 42ec4af

12 files changed

Lines changed: 281 additions & 54 deletions

docs/about-feature-matrix.rst

Lines changed: 10 additions & 0 deletions
@@ -55,6 +55,10 @@
5555
:width: 0
5656
:height: 0
5757

58+
.. image:: images/icons/icon-md.svg
59+
:width: 0
60+
:height: 0
61+
5862
.. raw:: html
5963

6064

@@ -181,6 +185,11 @@
181185
background-size: 40px 40px;
182186
}
183187
188+
#feature-matrix .icon.md {
189+
background: url("_images/icon-md.svg") 0 0 transparent no-repeat;
190+
background-size: 40px 40px;
191+
}
192+
184193
</style>
185194

186195

@@ -207,6 +216,7 @@
207216
<span class="icon cbz"><cite>CBZ</cite></span>
208217
<span class="icon svg"><cite>SVG</cite></span>
209218
<span class="icon txt"><cite>TXT</cite></span>
219+
<span class="icon md"><cite>MD</cite></span>
210220
<span class="icon image"><cite id="transFM3">Image</cite></span>
211221
<hr/>
212222
<span class="icon docx"><cite>DOCX</cite></span>

docs/about.rst

Lines changed: 1 addition & 1 deletion
@@ -97,7 +97,7 @@ The following table illustrates what features the products offer:
9797
- PyMuPDF Pro
9898
- PyMuPDF4LLM
9999
* - **Input Documents**
100-
- `PDF`, `XPS`, `EPUB`, `CBZ`, `MOBI`, `FB2`, `SVG`, `TXT`, Images (*standard document types*)
100+
- `PDF`, `XPS`, `EPUB`, `CBZ`, `MOBI`, `FB2`, `SVG`, `TXT`, `MD`, Images (*standard document types*)
101101
- *as PyMuPDF* and:
102102
`DOC`/`DOCX`, `XLS`/`XLSX`, `PPT`/`PPTX`, `HWP`/`HWPX`
103103
- *as PyMuPDF*

docs/app3.rst

Lines changed: 82 additions & 0 deletions
@@ -421,6 +421,88 @@ Typical document page sizes are **ISO A4** and **Letter**. A **Letter** page has
421421

422422

423423

424+
425+
.. _CSS_Support:
426+
427+
CSS Support
428+
--------------------------------------------
429+
430+
For now, only a subset of CSS properties is supported.
431+
432+
The underlying C library MuPDF supports a subset of HTML4 and CSS2. The primary goal of the HTML/CSS support is to serve as a popular and convenient way to style text — not to faithfully reproduce websites in PDF.
433+
434+
What Works
435+
~~~~~~~~~~~~~
436+
437+
The following list shows the supported properties, grouped by category.
438+
439+
Box Model & Layout
440+
""""""""""""""""""
441+
442+
``margin``, ``margin-top``, ``margin-right``, ``margin-bottom``, ``margin-left``, ``padding``, ``padding-top``, ``padding-right``, ``padding-bottom``, ``padding-left``, ``width``, ``height``, ``display``, ``position``, ``top``, ``right``, ``bottom``, ``left``, ``inset``, ``overflow-wrap``, ``columns``
443+
444+
.. note::
445+
446+
The properties ``position`` & ``display`` are supported in a very limited way. Only the values ``position: relative`` and ``display: block`` are supported.
447+
448+
449+
Border
450+
""""""""""""""""""
451+
452+
``border``, ``border-top``, ``border-right``, ``border-bottom``, ``border-left``, ``border-color``, ``border-style``, ``border-width``, ``border-spacing``, ``border-collapse``, ``border-top-color``, ``border-right-color``, ``border-bottom-color``, ``border-left-color``, ``border-top-style``, ``border-right-style``, ``border-bottom-style``, ``border-left-style``, ``border-top-width``, ``border-right-width``, ``border-bottom-width``, ``border-left-width``
453+
454+
Background
455+
""""""""""""""""""
456+
457+
``background``, ``background-color``
458+
459+
.. note::
460+
461+
Background images are not supported, but the ``background`` property can be used to set a background color for a text block, which is then rendered as a filled rectangle behind the text.
462+
463+
Font
464+
""""""""""""""""""
465+
466+
``font``, ``font-family``, ``font-size``, ``font-style``, ``font-variant``, ``font-weight``
467+
468+
Text
469+
""""""""""""""""""
470+
471+
``color``, ``letter-spacing``, ``line-height``, ``text-align``, ``text-decoration``, ``text-indent``, ``text-transform``, ``word-spacing``, ``white-space``, ``vertical-align``, ``direction``, ``hyphens``
472+
473+
List
474+
""""""""""""""""""
475+
476+
``list-style``, ``list-style-image``, ``list-style-position``, ``list-style-type``
477+
478+
Page
479+
""""""""""""""""""
480+
481+
``page-break-before``, ``page-break-after``, ``orphans``, ``widows``
482+
483+
Visibility
484+
""""""""""""""""""""""""""""""""""""
485+
486+
``visibility``
487+
488+
MuPDF-specific / WebKit extensions
489+
""""""""""""""""""""""""""""""""""""
490+
491+
``-mupdf-leading``, ``-webkit-text-fill-color``, ``-webkit-text-stroke-color``, ``-webkit-text-stroke-width``
492+
493+
Other
494+
""""""""""""""""""
495+
496+
``src`` (in ``@font-face`` rules)
497+
498+
499+
500+
501+
What Doesn't Work
502+
~~~~~~~~~~~~~~~~~~~~~~~~~~
503+
504+
Modern CSS (CSS3+) is not supported: no flexbox, grid, custom properties (``--vars``), ``calc()``, transitions or animations. Some older layout features are also unavailable, including ``position: absolute`` / ``fixed``, ``float`` and ``clear``.
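
The property lists above can be turned into a quick pre-flight check for a stylesheet. The following is only a sketch: the helper name ``unsupported_properties`` is hypothetical and not part of PyMuPDF, the parsing is deliberately naive (plain ``{...}`` blocks split on ``;``), and it checks property names only, not values such as ``display: flex``.

```python
import re

# Property names copied from the "What Works" lists in this section.
SUPPORTED = set("""
    margin padding width height display position inset overflow-wrap columns
    top right bottom left
    border border-color border-style border-width border-spacing border-collapse
    background background-color
    font font-family font-size font-style font-variant font-weight
    color letter-spacing line-height text-align text-decoration text-indent
    text-transform word-spacing white-space vertical-align direction hyphens
    list-style list-style-image list-style-position list-style-type
    page-break-before page-break-after orphans widows visibility
    -mupdf-leading -webkit-text-fill-color -webkit-text-stroke-color
    -webkit-text-stroke-width src
""".split())

# Add the per-side variants (margin-top, border-left-width, ...).
for side in ("top", "right", "bottom", "left"):
    SUPPORTED.update({
        f"margin-{side}", f"padding-{side}", f"border-{side}",
        f"border-{side}-color", f"border-{side}-style", f"border-{side}-width",
    })


def unsupported_properties(css: str) -> list[str]:
    """Return property names used in `css` that are not in SUPPORTED.

    Names are reported once each, in order of first appearance.
    """
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # drop CSS comments
    found = []
    for body in re.findall(r"\{([^{}]*)\}", css):  # innermost {...} blocks
        for declaration in body.split(";"):
            name, colon, _value = declaration.partition(":")
            name = name.strip().lower()
            if colon and name and name not in SUPPORTED and name not in found:
                found.append(name)
    return found


print(unsupported_properties("h1 {color: red; float: left;}"))
```

A check like this only flags names that are missing from the list. Whether a listed property accepts a particular value still has to be tested against the actual output.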
505+
424506
.. rubric:: Footnotes
425507

426508
.. [#f1] MuPDF supports "deep-copying" objects between PDF documents. To avoid duplicate data in the target, it uses so-called "graftmaps", like a form of scratchpad: for each object to be copied, its :data:`xref` number is looked up in the graftmap. If found, copying is skipped. Otherwise, the new :data:`xref` is recorded and the copy takes place. PyMuPDF makes use of this technique in two places so far: :meth:`Document.insert_pdf` and :meth:`Page.show_pdf_page`. This process is fast and very efficient, because it prevents multiple copies of typically large and frequently referenced data, like images and fonts. However, you may still want to consider using garbage collection (option 4) in any of the following cases:

docs/archive-class.rst

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ Archive
1010

1111
This class represents a generalization of file folders and container files like ZIP and TAR archives. Archives allow accessing arbitrary collections of file folders, ZIP / TAR files and single binary data elements as if they all were part of one hierarchical tree of folders.
1212

13-
In PyMuPDF, archives are currently only used by :ref:`Story` objects to specify where to look for fonts, images and other resources.
13+
In PyMuPDF, archives are currently only used by :ref:`Story` objects and as an :ref:`option when opening files <Full_Options_for_Opening_a_File>` to specify where to look for fonts, images and other resources.
1414

1515
================================ ===================================================
1616
**Method / Attribute** **Short Description**

docs/converting-files.rst

Lines changed: 102 additions & 18 deletions
@@ -11,7 +11,7 @@ Converting Files
1111
Files to PDF
1212
~~~~~~~~~~~~~~~~~~
1313

14-
:ref:`Document types supported by PyMuPDF<HowToOpenAFile>` can easily be converted to |PDF| by using the :meth:`Document.convert_to_pdf` method. This method returns a buffer of data which can then be utilized by |PyMuPDF| to create a new |PDF|.
14+
:ref:`Document types supported by PyMuPDF <HowToOpenAFile>` can easily be converted to |PDF| by using the :meth:`Document.convert_to_pdf` method. This method returns a buffer of data which can then be utilized by |PyMuPDF| to create a new |PDF|.
1515

1616

1717

@@ -20,38 +20,97 @@ Files to PDF
2020
.. code-block:: python
2121
2222
import pymupdf
23-
23+
24+
# Convert Markdown to PDF
25+
md_doc = pymupdf.open("example.md")
26+
pdfdata = md_doc.convert_to_pdf()
27+
pdf_doc = pymupdf.open(stream=pdfdata)
28+
pdf_doc.save("example.pdf")
29+
30+
# Convert XPS to PDF
2431
xps = pymupdf.open("input.xps")
25-
pdfbytes = xps.convert_to_pdf()
26-
pdf = pymupdf.open("pdf", pdfbytes)
32+
pdfdata = xps.convert_to_pdf()
33+
pdf = pymupdf.open(stream=pdfdata)
2734
pdf.save("output.pdf")
2835
36+
.. _Markdown_to_PDF:
2937

38+
Markdown to PDF
39+
~~~~~~~~~~~~~~~~~
3040

31-
PDF to SVG
32-
~~~~~~~~~~~~~~~~~~
41+
Because Markdown is a supported input format, Markdown files can easily be converted to PDF using the :meth:`Document.convert_to_pdf` method.
3342

34-
Technically, as SVG files cannot be multipage, we must export each page as an SVG.
43+
In the simplest case you can just open the Markdown file and call the method to get a PDF representation of the content.
3544

36-
To get an SVG representation of a page use the :meth:`Page.get_svg_image` method.
3745

38-
**Example**
46+
Defining paper size
47+
"""""""""""""""""""
48+
49+
The default paper size is a 400 x 600 :doc:`rect`. To use a custom paper size, pass the `rect` parameter when opening the file, for example:
3950

4051
.. code-block:: python
4152
42-
import pymupdf
53+
md_doc = pymupdf.open("example.md", rect=pymupdf.paper_rect("A4")) # A4 size
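
To get a feel for these numbers, it can help to convert between millimetres and page units. The sketch below assumes that page rectangles are measured in points (1/72 inch), as is usual for PDF. The helper names are illustrative and not part of PyMuPDF.

```python
POINTS_PER_INCH = 72   # assumption: page units are PDF points
MM_PER_INCH = 25.4


def mm_to_points(mm: float) -> float:
    """Convert a length in millimetres to points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def paper_size_points(width_mm: float, height_mm: float) -> tuple[int, int]:
    """Return (width, height) in whole points for a paper size given in mm."""
    return round(mm_to_points(width_mm)), round(mm_to_points(height_mm))


# ISO A4 is 210 mm x 297 mm.
print(paper_size_points(210, 297))  # (595, 842)
```

Under that assumption, the 400 x 600 default is roughly 5.6 x 8.3 inches, noticeably smaller than A4.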
4354
44-
doc = pymupdf.open("input.pdf")
45-
page = doc[0]
4655
47-
# Convert page to SVG
48-
svg_content = page.get_svg_image()
56+
Defining CSS
57+
""""""""""""
58+
59+
By default, the Markdown content is converted to PDF using a default CSS stylesheet. To customize the appearance of the resulting PDF, define your own CSS as a string and apply it to the opened document before converting.
60+
61+
For example, to make all ``h1`` headers (a single ``#`` in Markdown) red, you could do the following:
62+
63+
.. code-block:: python
64+
65+
md_doc = pymupdf.open( # open the Markdown document in A4 size
66+
"example.md",
67+
rect=pymupdf.paper_rect("A4")
68+
)
69+
70+
css = "h1 {color:red;}"
71+
md_doc.apply_css(css)
72+
73+
pdf_doc = pymupdf.open(stream=md_doc.convert_to_pdf())
74+
pdf_doc.ez_save("red-colored-header.pdf")
75+
76+
.. note::
77+
78+
The :ref:`support for CSS <CSS_Support>` is currently limited.
79+
80+
81+
Defining Fonts
82+
"""""""""""""""""
83+
84+
Fonts can be defined by using the `archive` parameter to provide a custom :ref:`Archive` containing the font files.
85+
86+
The font files must exist in that archive, which is passed via the `archive` parameter when opening the Markdown file. The CSS can then refer to these fonts by their names as defined in the archive.
87+
88+
For example, assuming you have access to the "Comic Sans" font files, you could use that font for all text as follows:
89+
90+
.. code-block:: python
91+
92+
# Global CSS instructions to use the "Comic Sans" font for all text. The font files must be provided in the archive.
93+
css = """
94+
@font-face {font-family: sans-serif; src: url(comic.ttf);}
95+
@font-face {font-family: sans-serif; src: url(comicbd.ttf); font-weight: bold;}
96+
@font-face {font-family: sans-serif; src: url(comicz.ttf); font-weight: bold; font-style: italic;}
97+
@font-face {font-family: sans-serif; src: url(comici.ttf); font-style: italic;}
98+
"""
99+
100+
archive = pymupdf.Archive("C:/Windows/Fonts") # the fonts are here
101+
archive.add(".") # also search the current folder for resources (e.g. images)
102+
103+
md_file = "sample.md"
104+
md_doc = pymupdf.open( # open the Markdown document
105+
md_file,
106+
archive=archive, # where to look for resources (fonts, images)
107+
rect=pymupdf.paper_rect("A4"), # page dimension ISO A4
108+
)
109+
110+
md_doc.apply_css(css)
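
The ``@font-face`` rules above follow a regular pattern, so they can also be generated from a list of font files. This is a minimal sketch using plain string formatting. The helper ``font_face_css`` is hypothetical, and the file names are simply the ones from the example above.

```python
def font_face_css(family: str, faces: list[tuple[str, bool, bool]]) -> str:
    """Build @font-face rules for one font family.

    `faces` holds (filename, bold, italic) tuples. Each file name must be
    resolvable in the archive passed when opening the document.
    """
    rules = []
    for filename, bold, italic in faces:
        parts = [f"font-family: {family}", f"src: url({filename})"]
        if bold:
            parts.append("font-weight: bold")
        if italic:
            parts.append("font-style: italic")
        rules.append("@font-face {" + "; ".join(parts) + ";}")
    return "\n".join(rules)


css = font_face_css(
    "sans-serif",
    [
        ("comic.ttf", False, False),   # regular
        ("comicbd.ttf", True, False),  # bold
        ("comicz.ttf", True, True),    # bold italic
        ("comici.ttf", False, True),   # italic
    ],
)
print(css)
```

The resulting string can be applied in the same way as the hand-written ``css`` value in the example above.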
111+
49112
50-
# Save to file
51-
with open("output.svg", "w", encoding="utf-8") as f:
52-
f.write(svg_content)
53113
54-
doc.close()
55114
56115
57116
PDF to Markdown
@@ -72,6 +131,31 @@ By utilizing the :doc:`PyMuPDF4LLM API <pymupdf4llm/api>` we are able to conver
72131
pathlib.Path("4llm-output.md").write_bytes(md_text.encode())
73132
74133
134+
PDF to SVG
135+
~~~~~~~~~~~~~~~~~~
136+
137+
Because an SVG file cannot contain multiple pages, each page must be exported as a separate SVG.
138+
139+
To get an SVG representation of a page use the :meth:`Page.get_svg_image` method.
140+
141+
**Example**
142+
143+
.. code-block:: python
144+
145+
import pymupdf
146+
147+
doc = pymupdf.open("input.pdf")
148+
page = doc[0]
149+
150+
# Convert page to SVG
151+
svg_content = page.get_svg_image()
152+
153+
# Save to file
154+
with open("output.svg", "w", encoding="utf-8") as f:
155+
f.write(svg_content)
156+
157+
doc.close()
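
The example above exports only the first page. To export a whole document, one SVG file per page, a loop along the following lines could be used. This is a sketch: ``export_pages_as_svg`` is a hypothetical helper, and it only relies on the document being iterable page by page and on each page offering ``get_svg_image()``, as used in the example above.

```python
from pathlib import Path


def export_pages_as_svg(doc, out_dir, stem="page"):
    """Write one SVG file per page of `doc` into `out_dir`.

    `doc` may be any iterable of page objects that provide
    `get_svg_image()` returning the SVG source as a string.
    Returns the list of written paths, in page order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, page in enumerate(doc):
        target = out_dir / f"{stem}-{index + 1:03d}.svg"  # page-001.svg, ...
        target.write_text(page.get_svg_image(), encoding="utf-8")
        written.append(target)
    return written


# With PyMuPDF installed, usage might look like this:
#
#   import pymupdf
#   doc = pymupdf.open("input.pdf")
#   export_pages_as_svg(doc, "svg-output")
#   doc.close()
```

Zero-padded file names keep the exported pages in order when a folder is sorted alphabetically.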
158+
75159
PDF to DOCX
76160
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
77161
