docs/pymupdf4llm/api.rst
Lines changed: 8 additions & 8 deletions
@@ -146,10 +146,10 @@ The |PyMuPDF4LLM| API
 - **"page_boxes"** - |PyMuPDFLayoutMode_Valid| a list of dictionaries representing the layout boundary boxes. Each dictionary has the following structure::

 {
-"index": 0-based integer index of the box in reading sequence
-"class": str, # one of "text", "picture", "table", etc.
+"index": int, # 0-based integer index of the box in reading sequence
+"class": str, # one of "text", "picture", "table", etc.
:arg float page_height: specify a desired page height. For relevance see the `page_width` parameter. If using the default `None`, the document will appear as one large page with a width of `page_width`. Consequently in this case, no markdown page separators will occur (except the final one), respectively only one page chunk will be returned.
@@ -216,10 +216,10 @@ The |PyMuPDF4LLM| API
 - **"page_boxes"** - a list of dictionaries representing the layout boundary boxes. Each dictionary has the following structure::

 {
-"index": 0-based integer index of the box in reading sequence
-"class": str, # one of "text", "picture", "table", etc.
+"index": int, # 0-based integer index of the box in reading sequence
+"class": str, # one of "text", "picture", "table", etc.
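As an illustration of the `page_boxes` structure documented in these hunks, the sketch below groups boxes by their `"class"` and lists them in reading order by `"index"`. The sample list and the helper name `boxes_by_class` are hypothetical; only the `"index"` and `"class"` keys come from the documentation text shown here, and real entries may carry further fields that this diff does not show.

```python
from collections import defaultdict

# Hypothetical sample data shaped like the documented "page_boxes" entries.
# Only the "index" and "class" keys are taken from the documentation.
sample_page_boxes = [
    {"index": 2, "class": "table"},
    {"index": 0, "class": "text"},
    {"index": 1, "class": "picture"},
    {"index": 3, "class": "text"},
]


def boxes_by_class(page_boxes):
    """Map each box class to the indices of its boxes, in reading order."""
    grouped = defaultdict(list)
    # Sort by the 0-based reading-sequence index before grouping.
    for box in sorted(page_boxes, key=lambda b: b["index"]):
        grouped[box["class"]].append(box["index"])
    return dict(grouped)


grouped = boxes_by_class(sample_page_boxes)
```

Sorting on `"index"` first means the result does not depend on the order in which the boxes happen to be stored.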
docs/pymupdf4llm/index.rst
Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ Functionality
 - Standard text and tables are detected, brought in the right reading sequence and then together converted to **GitHub**-compatible **Markdown** text. Tables in plain text output mode are rendered using the `tabulate <https://pypi.org/project/tabulate/>`_ package.

-- Header lines are identified via the font size and appropriately prefixed with one or more `#` tags. When using the package together with :ref:`PyMuPDF Layout <https://pypi.org/project/pymupdf-layout/>`_, titels, section headers and page headers and footers are detected.
+- Header lines are identified via the font size and appropriately prefixed with one or more `#` tags. When using the package together with :ref:`PyMuPDF Layout <https://pypi.org/project/pymupdf-layout/>`_, titles, section headers and page headers and footers are detected.

 - Bold, italic, mono-spaced text and code blocks are detected and formatted accordingly. Similar applies to ordered and unordered lists.
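The font-size rule mentioned in the header bullet can be sketched roughly as follows. This is not the package's implementation: the function name `header_prefix`, the `body_size` argument and the ranking scheme are assumptions made purely for illustration.

```python
def header_prefix(font_size, all_sizes, body_size, max_levels=6):
    """Return a Markdown header prefix ('# ', '## ', ...) or '' for body text.

    Assumed scheme: sizes larger than body_size are ranked from largest to
    smallest; the largest becomes level 1, the next level 2, and so on.
    """
    # Distinct sizes above the body size, largest first, capped at max_levels.
    header_sizes = sorted({s for s in all_sizes if s > body_size}, reverse=True)
    header_sizes = header_sizes[:max_levels]
    if font_size not in header_sizes:
        return ""
    level = header_sizes.index(font_size) + 1
    return "#" * level + " "


# Hypothetical font sizes collected from a document.
sizes = [24.0, 18.0, 11.0, 11.0, 14.0, 11.0]
prefixes = [header_prefix(s, sizes, body_size=11.0) for s in (24.0, 18.0, 14.0, 11.0)]
```

Capping at six levels mirrors the number of heading levels Markdown supports; any size that is not a ranked header size is treated as body text.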