docs/app3.rst
54 additions & 0 deletions
@@ -421,6 +421,60 @@ Typical document page sizes are **ISO A4** and **Letter**. A **Letter** page has
421
421
422
422
423
423
424
+
425
+
.. _CSS_Support:
426
+
427
+
CSS Support
428
+
--------------------------------------------
429
+
430
+
For now, only a subset of CSS properties is supported.
431
+
432
+
The underlying C library MuPDF supports a subset of HTML4 and CSS2. The goal of this HTML/CSS support is to offer a familiar and convenient way to style text, not to faithfully reproduce websites in PDF.
433
+
434
+
What Works
435
+
~~~~~~~~~~~~~
436
+
437
+
The following lists show the supported properties. They are not exhaustive, but they give an idea of what to expect.
438
+
439
+
Text styling
440
+
""""""""""""""
441
+
442
+
443
+
``color``
444
+
``font-family``
445
+
``font-size``
446
+
``font-weight`` (bold)
447
+
``font-style`` (italic)
448
+
``text-align``
449
+
``line-height``
450
+
``letter-spacing``
451
+
``text-decoration`` (underline etc.)
452
+
453
+
Box model (basic)
454
+
""""""""""""""""""""""""""""
455
+
456
+
``margin``
457
+
``padding``
458
+
``border``
459
+
``background-color`` (applies to the text's occupied sub-rectangle, not the full box)
460
+
461
+
Fonts
462
+
""""""""""""""
463
+
464
+
``@font-face`` for loading custom fonts via an Archive
465
+
Standard variants (regular, bold, italic, bold-italic) via ``font-weight`` and ``font-style``
466
+
467
+
Layout
468
+
""""""""""""""
469
+
470
+
Only relative (flow) layout is available: text is laid out in lines and paragraphs, and the lines are laid out in blocks. There is no ``position: absolute``, no ``flexbox``, no ``grid``, no ``float`` and no ``clear``.
471
+
472
+
473
+
What Doesn't Work
474
+
~~~~~~~~~~~~~~~~~~~~~~~~~~
475
+
476
+
Modern CSS (CSS3+): no ``flexbox``, ``grid``, ``custom properties`` (--vars), ``calc()``, ``transitions``, ``animations``, ``position: absolute`` / ``fixed``, ``float``, ``clear`` and so on.
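
As a rough aid for working within these limits, the following sketch splits a set of CSS declarations into those named in the lists above and those that are not. The set of names is copied from this section, which states that the lists are not exhaustive, so "unlisted" only means "not mentioned here". The helper name ``partition_declarations`` is hypothetical and not part of PyMuPDF.

```python
# Property names copied from the "What Works" lists in this section.
# The section says the lists are not exhaustive, so a property missing
# here is "unlisted", not necessarily unsupported.
LISTED_PROPERTIES = {
    "color", "font-family", "font-size", "font-weight", "font-style",
    "text-align", "line-height", "letter-spacing", "text-decoration",
    "margin", "padding", "border", "background-color",
}


def partition_declarations(declarations):
    """Split {property: value} into (listed, unlisted) dictionaries.

    Hypothetical helper for illustration; not part of PyMuPDF.
    """
    listed = {}
    unlisted = {}
    for prop, value in declarations.items():
        # CSS property names are case-insensitive, so normalize them.
        target = listed if prop.strip().lower() in LISTED_PROPERTIES else unlisted
        target[prop] = value
    return listed, unlisted


listed, unlisted = partition_declarations(
    {"color": "red", "float": "left", "font-size": "12px"}
)
print("listed:", listed)
print("review before use:", unlisted)
```

Declarations that end up in the second dictionary are the ones worth checking against the rendered output before relying on them.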
477
+
424
478
.. rubric:: Footnotes
425
479
426
480
.. [#f1] MuPDF supports "deep-copying" objects between PDF documents. To avoid duplicate data in the target, it uses so-called "graftmaps", like a form of scratchpad: for each object to be copied, its :data:`xref` number is looked up in the graftmap. If found, copying is skipped. Otherwise, the new :data:`xref` is recorded and the copy takes place. PyMuPDF makes use of this technique in two places so far: :meth:`Document.insert_pdf` and :meth:`Page.show_pdf_page`. This process is fast and very efficient, because it prevents multiple copies of typically large and frequently referenced data, like images and fonts. However, you may still want to consider using garbage collection (option 4) in any of the following cases:
docs/archive-class.rst
1 addition & 1 deletion
@@ -10,7 +10,7 @@ Archive
10
10
11
11
This class represents a generalization of file folders and container files like ZIP and TAR archives. Archives allow accessing arbitrary collections of file folders, ZIP / TAR files and single binary data elements as if they all were part of one hierarchical tree of folders.
12
12
13
-
In PyMuPDF, archives are currently only used by :ref:`Story` objects to specify where to look for fonts, images and other resources.
13
+
In PyMuPDF, archives are currently only used by :ref:`Story` objects and as an :ref:`option when opening files <Full_Options_for_Opening_a_File>` to specify where to look for fonts, images and other resources.
:ref:`Document types supported by PyMuPDF<HowToOpenAFile>` can easily be converted to |PDF| by using the :meth:`Document.convert_to_pdf` method. This method returns a buffer of data which can then be utilized by |PyMuPDF| to create a new |PDF|.
14
+
:ref:`Document types supported by PyMuPDF<HowToOpenAFile>` can easily be converted to |PDF| by using the :meth:`Document.convert_to_pdf` method. This method returns a buffer of data which can then be utilized by |PyMuPDF| to create a new |PDF|.
15
15
16
16
17
17
@@ -20,38 +20,97 @@ Files to PDF
20
20
.. code-block:: python
21
21
22
22
import pymupdf
23
-
23
+
24
+
# Convert Markdown to PDF
25
+
md_doc = pymupdf.open("example.md")
26
+
pdfdata = md_doc.convert_to_pdf()
27
+
pdf_doc = pymupdf.open(stream=pdfdata)
28
+
pdf_doc.save("example.pdf")
29
+
30
+
# Convert XPS to PDF
24
31
xps = pymupdf.open("input.xps")
25
-
pdfbytes= xps.convert_to_pdf()
26
-
pdf = pymupdf.open("pdf", pdfbytes)
32
+
pdfdata = xps.convert_to_pdf()
33
+
pdf = pymupdf.open(stream=pdfdata)
27
34
pdf.save("output.pdf")
28
35
36
+
.. _Markdown_to_PDF:
29
37
38
+
Markdown to PDF
39
+
~~~~~~~~~~~~~~~~~
30
40
31
-
PDF to SVG
32
-
~~~~~~~~~~~~~~~~~~
41
+
Because Markdown files are a supported input type, they can easily be converted to PDF using the :meth:`Document.convert_to_pdf` method.
33
42
34
-
Technically, as SVG files cannot be multipage, we must export each page as an SVG.
43
+
In the simplest case you can just open the Markdown file and call the method to get a PDF representation of the content.
35
44
36
-
To get an SVG representation of a page use the :meth:`Page.get_svg_image` method.
37
45
38
-
**Example**
46
+
Defining paper size
47
+
"""""""""""""""""""
48
+
49
+
The default paper size is a 400 x 600 :doc:`rect`. To use a custom paper size, pass the desired rectangle via the `rect` parameter when opening the file.
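
A minimal sketch of how such a rectangle might be computed, assuming the usual PDF unit of points (1/72 inch) and a rectangle whose top-left corner is (0, 0). The helper ``page_rect_from_mm`` is hypothetical and not part of PyMuPDF. The commented-out ``open`` call assumes a file named ``example.md`` exists.

```python
import math

POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def page_rect_from_mm(width_mm, height_mm):
    """Return (x0, y0, x1, y1) in points, with the top-left corner at (0, 0).

    Hypothetical helper for illustration; not part of PyMuPDF.
    """
    for value in (width_mm, height_mm):
        # A page rectangle must be non-empty and finite.
        if not math.isfinite(value) or value <= 0:
            raise ValueError("page dimensions must be positive, finite numbers")
    width_pt = width_mm / MM_PER_INCH * POINTS_PER_INCH
    height_pt = height_mm / MM_PER_INCH * POINTS_PER_INCH
    return (0.0, 0.0, width_pt, height_pt)


a4_rect = page_rect_from_mm(210, 297)  # ISO A4 is 210 mm x 297 mm
print(a4_rect)

# Assumed usage, with a Markdown file named "example.md":
# md_doc = pymupdf.open("example.md", rect=a4_rect)
```

Computing the rectangle from physical dimensions keeps the intent visible in the code. For standard formats, hard-coding the point values works equally well.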
By default, the Markdown content will be converted to PDF using a default CSS stylesheet. However, you can specify your own CSS stylesheet to customize the appearance of the resulting PDF. To do this, define your `css` and apply it.
60
+
61
+
For example, to make all ``h1`` headers red (a heading marked with a single ``#`` in Markdown), you could do the following:
62
+
63
+
.. code-block:: python
64
+
65
+
md_doc = pymupdf.open( # open the Markdown document in A4 size
The :ref:`support for CSS <CSS_Support>` is currently limited.
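
The CSS passed to the document is plain text, so it can be assembled programmatically. The sketch below builds the red ``h1`` rule from a dictionary. The helper ``build_stylesheet`` is hypothetical and not part of PyMuPDF. The commented-out call uses ``apply_css(css, append=True)`` as documented for the Document class and assumes ``md_doc`` is an opened Markdown document.

```python
def build_stylesheet(rules):
    """Render {selector: {property: value}} as CSS text, one rule per line.

    Hypothetical helper for illustration; not part of PyMuPDF.
    """
    lines = []
    for selector, declarations in rules.items():
        body = " ".join(f"{prop}: {value};" for prop, value in declarations.items())
        lines.append(f"{selector} {{{body}}}")
    return "\n".join(lines)


css = build_stylesheet({"h1": {"color": "red"}})
print(css)

# Assumed usage on an already opened Markdown document:
# md_doc.apply_css(css)
```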
79
+
80
+
81
+
Defining Fonts
82
+
"""""""""""""""""
83
+
84
+
Fonts can be defined by using the `archive` parameter to provide a custom :ref:`Archive` containing the font files.
85
+
86
+
The fonts must exist in an archive which is provided to the `archive` parameter when opening the Markdown file. The CSS can then refer to these fonts by their names as defined in the archive.
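
As an illustration of how the CSS might name such fonts, the sketch below generates ``@font-face`` rules that map a family name to font files. The file names ``comic.ttf`` and ``comicbd.ttf`` and the helper ``font_face_rule`` are assumptions made for this example; the actual names must match the entries in the archive you provide.

```python
def font_face_rule(family, filename, bold=False, italic=False):
    """Return one @font-face rule binding a family name to a font file.

    Hypothetical helper for illustration; not part of PyMuPDF.
    """
    parts = [f"font-family: {family};", f"src: url({filename});"]
    if bold:
        parts.append("font-weight: bold;")
    if italic:
        parts.append("font-style: italic;")
    return "@font-face {" + " ".join(parts) + "}"


# File names are placeholders; use the names present in your archive.
css = "\n".join(
    [
        font_face_rule("comic", "comic.ttf"),
        font_face_rule("comic", "comicbd.ttf", bold=True),
        "* {font-family: comic;}",  # use the family for all text
    ]
)
print(css)
```

Declaring the bold variant separately lets ``font-weight: bold`` select the matching file instead of relying on a synthesized bold.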
87
+
88
+
For example, assuming you have access to the font files for "Comic Sans", you could use that font for all text as follows:
89
+
90
+
.. code-block:: python
91
+
92
+
# Global CSS instructions to use the "Comic Sans" font for all text. The font files must be provided in the archive.
@@ -183,11 +184,13 @@ For details on **embedded files** refer to Appendix 3.
183
184
184
185
:arg str filetype: A string specifying the type of document. This is only ever needed when file content inspection fails. Text types like "txt", "html", "xml" etc. cannot be disambiguated by their content. When such files are provided in memory or being provided with the wrong file extension, this parameter **must** be used.
185
186
186
-
:arg rect_like rect: a rectangle specifying the desired page size. This parameter is only meaningful for documents with a variable page layout ("reflowable" documents), like e-books or HTML, and ignored otherwise. If specified, it must be a non-empty, finite rectangle with top-left coordinates (0, 0). Together with parameter :data:`fontsize`, each page will be accordingly laid out and hence also determine the number of pages.
187
+
:arg Archive archive: An optional :ref:`Archive` object to use as a source for resources like fonts and images.
187
188
188
-
:arg float width: may used together with ``height`` as an alternative to ``rect`` to specify layout information.
189
+
:arg rect_like rect: A rectangle specifying the desired page size. This parameter is only meaningful for documents with a variable page layout ("reflowable" documents), like e-books, Markdown or HTML, and ignored otherwise. If specified, it must be a non-empty, finite rectangle with top-left coordinates (0, 0). Together with parameter :data:`fontsize`, it determines how each page is laid out and hence also the number of pages.
189
190
190
-
:arg float height: may used together with ``width`` as an alternative to ``rect`` to specify layout information.
191
+
:arg float width: May be used together with ``height`` as an alternative to ``rect`` to specify layout information.
192
+
193
+
:arg float height: May be used together with ``width`` as an alternative to ``rect`` to specify layout information.
191
194
192
195
:arg float fontsize: the default :data:`fontsize` for reflowable document types. This parameter is ignored if none of the parameters ``rect`` or ``width`` and ``height`` are specified. Will be used to calculate the page layout.
193
196
@@ -201,24 +204,29 @@ For details on **embedded files** refer to Appendix 3.
201
204
202
205
In case of problems you can see more detail in the internal messages store: `print(pymupdf.TOOLS.mupdf_warnings())` (which will be emptied by this call, but you can also prevent this -- consult :meth:`Tools.mupdf_warnings`).
203
206
204
-
Overview of possible forms, note: `open` is a synonym of `Document`::
205
207
206
-
>>> # from a file
207
-
>>> doc = pymupdf.open("some.xps")
208
-
>>> # handle wrong extension
209
-
>>> doc = pymupdf.open("some.file", filetype="xps") # assert expected type
210
-
>>> doc = pymupdf.open("some.file", filetype="txt") # treat as plain text
211
-
>>>
212
-
>>> # from memory
213
-
>>> doc = pymupdf.open(stream=mem_area) # works for any supported type
214
-
>>> doc = pymupdf.open(stream=unknown-type, filetype="txt") # treat as plain text
215
-
>>>
216
-
>>> # new empty PDF
217
-
>>> doc = pymupdf.open()
218
-
>>> doc = pymupdf.open(None)
219
-
>>> doc = pymupdf.open("")
208
+
Overview of possible forms (note that :meth:`open` is a synonym of :meth:`Document`)::
209
+
210
+
# from a file
211
+
doc = pymupdf.open("some.xps")
212
+
# handle wrong extension
213
+
doc = pymupdf.open("some.file", filetype="xps") # assert expected type
214
+
doc = pymupdf.open("some.file", filetype="txt") # treat as plain text
215
+
216
+
# from memory
217
+
doc = pymupdf.open(stream=mem_area) # works for any supported type
218
+
doc = pymupdf.open(stream=unknown_type, filetype="txt") # treat as plain text
219
+
220
+
# new empty PDF
221
+
doc = pymupdf.open()
222
+
doc = pymupdf.open(None)
223
+
doc = pymupdf.open("")
220
224
221
-
.. note:: Raster images with a wrong (but supported) file extension **are no problem**. MuPDF will determine the correct image type when file **content** is actually accessed and will process it without complaint.
225
+
.. note::
226
+
227
+
Raster images with a wrong (but supported) file extension **are no problem**. MuPDF will determine the correct image type when file **content** is actually accessed and will process it without complaint.
228
+
229
+
See :ref:`supported file types <Supported_File_Types>` for more information.
222
230
223
231
The Document class can also be used as a **context manager**. Exiting the context manager will close the document automatically.
224
232
@@ -2030,6 +2038,20 @@ For details on **embedded files** refer to Appendix 3.
2030
2038
This is a normal PDF document with no usage restrictions whatsoever. If it is not being changed in any way, it can be used together with its journal to undo / redo operations or continue updating.
2031
2039
2032
2040
2041
+
.. method:: apply_css(css, append=True)
2042
+
2043
+
* New in v1.28.0
2044
+
2045
+
Apply CSS styles to the document. This is a global operation, which means that the styles will be applied to all pages and all elements of the document. The CSS syntax is the same as for HTML documents, but only a subset of CSS properties is supported.
2046
+
2047
+
:arg str css: a string containing the CSS styles to be applied.
2048
+
:arg bool append: whether to append the new styles to existing ones (if any) or to replace them.
2049
+
2050
+
.. note:: This method is primarily intended for use with :ref:`Markdown documents <Markdown_to_PDF>`.
2051
+
2052
+
2053
+
2054
+
2033
2055
.. attribute:: outline
2034
2056
2035
2057
Contains the first :ref:`Outline` entry of the document (or `None`). Can be used as a starting point to walk through all outline items. Accessing this property for encrypted, not authenticated documents will raise an *AttributeError*.