From d79b950fc0bdbf96639f262b09ac024bce47cb4b Mon Sep 17 00:00:00 2001
From: Jamie Lemon
Date: Tue, 2 Dec 2025 17:48:39 +0000
Subject: [PATCH] Docs: Fixes typos in changes.txt file.

---
 changes.txt | 52 ++++++++++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/changes.txt b/changes.txt
index e4b28d4ef..032212fc2 100644
--- a/changes.txt
+++ b/changes.txt
@@ -542,13 +542,13 @@ Other:
  * **Fixed** `3281 `_: Preparing metadata (pyproject.toml) did not run successfully
  * **Fixed** `3279 `_: PyMuPDF no longer builds in Alpine Linux
- * **Fixed** `3257 `_: apply_redactions() deleting text outside of annoted box
+ * **Fixed** `3257 `_: apply_redactions() deleting text outside of annotated box
  * **Fixed** `3216 `_: AttributeError: 'Annot' object has no attribute '__del__'
  * **Fixed** `3207 `_: get_drawings's items is missing line from h path operator
  * **Fixed** `3201 `_: Memory leaks when merging PDFs
  * **Fixed** `3197 `_: page.get_text() returns hexadecimal text for some characters
  * **Fixed** `3196 `_: Remove text not working in 1.23.25 version vs 1.20.2
- * **Fixed** `3172 `_: PDF's 45º lines dissapearing in png conversion
+ * **Fixed** `3172 `_: PDF's 45º lines disappearing in png conversion
  * **Fixed** `3135 `_: Do not log warnings to stdout
  * **Fixed** `3125 `_: get_pixmap method stuck on one page and runs forever
  * **Fixed** `2964 `_: There is an issue with the image generated by the page.get_pixmap() function
@@ -965,7 +965,7 @@ Other:
  * Bug fixes:
- * **Fixed** `2556 `_: Segmentation fault at caling get_cdrawings(extended=True)
+ * **Fixed** `2556 `_: Segmentation fault at calling get_cdrawings(extended=True)
  * **Fixed** `2637 `_: Page.insert_textbox incorrectly handles the last word if it starts a new line
  * **Fixed** `2683 `_: Windows sdist build failure - non-quoting of path and using UNIX which command
  * **Fixed** `2691 `_: Page.get_textpage_ocr() bug in rebased fitz_new version
@@ -1245,7 +1245,7 @@ Other:
  * Improve ``insert_file()`` documentation.
- * ``get_bboxlog()``: aded optional ``layers`` to ``get_bboxlog()``.
+ * ``get_bboxlog()``: added optional ``layers`` to ``get_bboxlog()``.
  * ``Page.get_texttrace()``: add new dictionary key ``layer``, name of Optional Content Group.
  * Mention use of Python venv in installation documentation.
@@ -1425,7 +1425,7 @@ Changes to build/release process:
 * **Added** new constants defining the default text extraction flags for more comfortable handling. Their naming convention is like :data:`TEXTFLAGS_WORDS` for ``page.get_text("words")``. See :ref:`text_extraction_flags`.
-* **Changed** :meth:`Page.annots` and :meth:`Page.widgets` to detect and prevent reloading the page (illegally) inside the iterator loops via :meth:`Document.reload_page`. Doing this brings down the interpretor. Documented clean ways to do annotation and widget mass updates within properly designed loops.
+* **Changed** :meth:`Page.annots` and :meth:`Page.widgets` to detect and prevent reloading the page (illegally) inside the iterator loops via :meth:`Document.reload_page`. Doing this brings down the interpreter. Documented clean ways to do annotation and widget mass updates within properly designed loops.
 * **Changed** several internal utility functions to become standalone ("SWIG inline") as opposed to be part of the :ref:`Tools` class. This, among other things, increases the performance of geometry object creation.
@@ -1486,11 +1486,11 @@ This patch version implements minor improvements for :ref:`Pixmap` and also some
 * **Fixed** `#1351 `_. Reverted code that introduced the memory growth in v1.18.15.
-* **Fixed** `#1417 `_. Developped circumvention for growth of open file handles using :meth:`Document.insert_pdf`.
+* **Fixed** `#1417 `_. Developed circumvention for growth of open file handles using :meth:`Document.insert_pdf`.
-* **Fixed** `#1418 `_. Developped circumvention for memory growth using :meth:`Document.insert_pdf`.
+* **Fixed** `#1418 `_. Developed circumvention for memory growth using :meth:`Document.insert_pdf`.
-* **Fixed** `#1430 `_. Developped circumvention for mass pixmap generations of document pages.
+* **Fixed** `#1430 `_. Developed circumvention for mass pixmap generations of document pages.
 * **Fixed** `#1433 `_. Solves a bbox error for some Type 3 font in PyMuPDF text processing.
@@ -1498,7 +1498,7 @@ This patch version implements minor improvements for :ref:`Pixmap` and also some
 * **Added** :meth:`Pixmap.warp` which makes a new pixmap from a given arbitrary convex quad inside the pixmap.
-* **Added** :attr:`Annot.irt_xref` and :meth:`Annot.set_irt_xref` to inquire or set the `/IRT` ("In Responde To") property of an annotation. Implements `#1450 `_.
+* **Added** :attr:`Annot.irt_xref` and :meth:`Annot.set_irt_xref` to inquire or set the `/IRT` ("In Response To") property of an annotation. Implements `#1450 `_.
 * **Added** :meth:`Rect.torect` and :meth:`IRect.torect` which compute a matrix that transforms to a given other rectangle.
@@ -1562,7 +1562,7 @@ A new MuPDF feature is **journalling PDF updates**, which is also supported by t
 A third feature (unrelated to the new MuPDF version) includes the ability to detect when page **objects cover or hide each other**. It is now e.g. possible to see that text is covered by a drawing or an image.
-* **Changed** terminology and meaning of important geometry concepts: Rectangles are now characterized as *finite*, *valid* or *empty*, while the definitions of these terms have also changed. Rectangles specifically are now thought of being "open": not all corners and sides are considered part of the retangle. Please do read the :ref:`Rect` section for details.
+* **Changed** terminology and meaning of important geometry concepts: Rectangles are now characterized as *finite*, *valid* or *empty*, while the definitions of these terms have also changed. Rectangles specifically are now thought of being "open": not all corners and sides are considered part of the rectangle. Please do read the :ref:`Rect` section for details.
 * **Added** new parameter `"no_new_id"` to :meth:`Document.save` / :meth:`Document.tobytes` methods. Use it to suppress updating the second item of the document ``/ID`` which in PDF indicates that the original file has been updated. If the PDF has no ``/ID`` at all yet, then no new one will be created either.
@@ -1692,7 +1692,7 @@ Focus of this version are major performance improvements of selected functions.
 * **Added** documentation for handling transparent image insertions, :meth:`Page.insert_image`.
 * **Added** :meth:`Page.get_image_rects`, an improved version of :meth:`Page.get_image_bbox`.
 * **Changed** :meth:`Document.delete_pages` to support various ways of specifying pages to delete. Implements `#1042 `_.
-* **Changed** :meth:`Page.insert_image` to also accept the xref of an existing image in the file. This allows "copying" images between pages, and extremely fast mutiple insertions.
+* **Changed** :meth:`Page.insert_image` to also accept the xref of an existing image in the file. This allows "copying" images between pages, and extremely fast multiple insertions.
 * **Changed** :meth:`Page.insert_image` to also accept the integer parameter ``alpha``. To be used for performance improvements.
 * **Changed** :meth:`Pixmap.set_alpha` to support new parameters for pre-multiplying colors with their alpha values and setting a specific color to fully transparent (e.g. white).
 * **Changed** :meth:`Document.embfile_add` to automatically set creation and modification date-time. Correspondingly, :meth:`Document.embfile_upd` automatically maintains modification date-time (``/ModDate`` PDF key), and :meth:`Document.embfile_info` correspondingly reports these data. In addition, the embedded file's associated "collection item" is included via its :data:`xref`. This supports the development of PDF portfolio applications.
@@ -1730,7 +1730,7 @@ Focus of this version are major performance improvements of selected functions.
 * **Fixed** issue `#895 `_.
 * **Fixed** issue `#896 `_. Since v1.17.6 PyMuPDF suppresses the font subset tags and only reports the base fontname in text extraction outputs "dict" / "json" / "rawdict" / "rawjson". Now a new global parameter can request the old behaviour, :meth:`Tools.set_subset_fontnames`.
 * **Fixed** issue `#885 `_. Pixmap creation now also works with filenames given as ``pathlib.Paths``.
-* **Changed** :meth:`Document.subset_fonts`: Text is **not rewritten** any more and should therefore **retain all its origial properties** -- like being hidden or being controlled by Optional Content mechanisms.
+* **Changed** :meth:`Document.subset_fonts`: Text is **not rewritten** any more and should therefore **retain all its original properties** -- like being hidden or being controlled by Optional Content mechanisms.
 * **Changed** :ref:`TextWriter` output to also accept text in right to left mode (Arabian, Hebrew): :meth:`TextWriter.fill_textbox`, :meth:`TextWriter.append`. These methods now accept a new boolean parameter `right_to_left`, which is *False* by default. Implements `#897 `_.
 * **Changed** :meth:`TextWriter.fill_textbox` to return all lines of text, that did not fit in the given rectangle. Also changed the default of the ``warn`` parameter to no longer print a warning message in overflow situations.
 * **Added** a utility function :meth:`recover_quad`, which computes the quadrilateral of a span. This function can be used for correctly marking text extracted with the "dict" or "rawdict" options of :meth:`Page.get_text`.
@@ -1906,8 +1906,8 @@ This is the first PyMuPDF version supporting MuPDF v1.18. The focus here is on e
 * **Fixed** issue `#651 `_. An upstream bug causing interpreter crashes in corner case redaction processings was fixed by backporting MuPDF changes from their development repo.
 * **Fixed** issue `#645 `_. Pixmap top-left coordinates can be set (again) by their own method, :meth:`Pixmap.set_origin`.
 * **Fixed** issue `#622 `_. :meth:`Page.insertImage` again accepts a :data:`rect_like` parameter.
-* **Added** severeal new methods to improve and speed-up table of contents (TOC) handling. Among other things, TOC items can now changed or deleted individually -- without always replacing the complete TOC. Furthermore, access to some PDF page attributes is now possible without first **loading** the page. This has a very significant impact on the performance of TOC manipulation.
-* **Added** an option to :meth:`Document.insert_pdf` which allows displaying progress messages. Adresses `#640 `_.
+* **Added** several new methods to improve and speed-up table of contents (TOC) handling. Among other things, TOC items can now changed or deleted individually -- without always replacing the complete TOC. Furthermore, access to some PDF page attributes is now possible without first **loading** the page. This has a very significant impact on the performance of TOC manipulation.
+* **Added** an option to :meth:`Document.insert_pdf` which allows displaying progress messages. Addresses `#640 `_.
 * **Added** :meth:`Page.getTextbox` which extracts text contained in a rectangle. In many cases, this should obsolete writing your own script for this type of thing.
 * **Added** new ``clip`` parameter to :meth:`Page.getText` to simplify and speed up text extraction of page sub areas.
 * **Added** :meth:`TextWriter.appendv` to add text in **vertical write mode**. Addresses issue `#653 `_
@@ -1986,9 +1986,9 @@ This version is based on MuPDF v1.17. Following are highlights of new and change
 * **Added** extended language support for annotations and widgets: a mixture of Latin, Greece, Russian, Chinese, Japanese and Korean characters can now be used in 'FreeText' annotations and text widgets. No special arrangement is required to use it.
-* Faster page access is implemented for documents supporting a "chapter" structure. This applies to EPUB documents currently. This comes with several new :ref:`Document` methods and changes for :meth:`Document.loadPage` and the "indexed" page access *doc[n]*: In addition to specifying a page number as before, a tuple *(chaper, pno)* can be specified to identify the desired page.
+* Faster page access is implemented for documents supporting a "chapter" structure. This applies to EPUB documents currently. This comes with several new :ref:`Document` methods and changes for :meth:`Document.loadPage` and the "indexed" page access *doc[n]*: In addition to specifying a page number as before, a tuple *(chapter, pno)* can be specified to identify the desired page.
-* **Changed:** Improved support of redaction annotations: images overlapped by redactions are **permanantly modified** by erasing the overlap areas. Also links are removed if overlapped by redactions. This is now fully in sync with PDF specifications.
+* **Changed:** Improved support of redaction annotations: images overlapped by redactions are **permanently modified** by erasing the overlap areas. Also links are removed if overlapped by redactions. This is now fully in sync with PDF specifications.
 Other changes:
@@ -2012,7 +2012,7 @@ Potential code breaking changes:
 This version introduces several new features around PDF text output. The motivation is to simplify this task, while at the same time offering extending features.
-One major achievement is using MuPDF's capabilities to dynamically choosing fallback fonts whenever a character cannot be found in the current one. This seemlessly works for Base-14 fonts in combination with CJK fonts (China, Japan, Korea). So a text may contain **any combination of characters** from the Latin, Greek, Russian, Chinese, Japanese and Korean languages.
+One major achievement is using MuPDF's capabilities to dynamically choosing fallback fonts whenever a character cannot be found in the current one. This seamlessly works for Base-14 fonts in combination with CJK fonts (China, Japan, Korea). So a text may contain **any combination of characters** from the Latin, Greek, Russian, Chinese, Japanese and Korean languages.
 * **Fixed** issue `#493 `_. ``Pixmap(doc, xref)`` should now again correctly resemble the loaded image object.
 * **Fixed** issue `#488 `_. Widget names are now modifiable.
@@ -2181,7 +2181,7 @@ Minor changes compared to version 1.16.2. The code of the "dict" and "rawdict" v
 * **Changed** text extraction methods of :ref:`Page` to allow detail control of the amount of extracted data.
 * **Added** :meth:`planish_line` which maps a given line (defined as a pair of points) to the x-axis.
-* **Fixed** an issue (w/o Github number) which brought down the interpreter when encountering certain non-UTF-8 encodable characters while using :meth:`Page.getText` with te "dict" option.
+* **Fixed** an issue (w/o Github number) which brought down the interpreter when encountering certain non-UTF-8 encodable characters while using :meth:`Page.getText` with the "dict" option.
 * **Fixed** issue #362 ("Memory Leak with getText('rawDICT')").
 ------
@@ -2309,7 +2309,7 @@ List of change details:
 **Changes in Version 1.14.10**
-* **Changed** :meth:`Page.show_pdf_page` to support rotation of the source rectangle. Fixes #261 ("Cannot rotate insterted pages").
+* **Changed** :meth:`Page.show_pdf_page` to support rotation of the source rectangle. Fixes #261 ("Cannot rotate inserted pages").
 * **Fixed** a bug in :meth:`Page.insertImage` which prevented insertion of multiple images provided as streams.
@@ -2431,7 +2431,7 @@ This version contains some technical / performance improvements and bug fixes.
 **Changes in Version 1.13.17**
 * **Fixed** an error that intermittently caused an exception in :meth:`Page.show_pdf_page`, when pages from many different source PDFs were shown.
-* **Changed** method :meth:`Document.extractImage` to now return more meta information about the extracted imgage. Also, its performance has been greatly improved. Several demo scripts have been changed to make use of this method.
+* **Changed** method :meth:`Document.extractImage` to now return more meta information about the extracted image. Also, its performance has been greatly improved. Several demo scripts have been changed to make use of this method.
 * **Changed** method :meth:`Document._getXrefStream` to now return *None* if the object is no stream and no longer raise an exception if otherwise.
 * **Added** method :meth:`Document._deleteObject` which deletes a PDF object identified by its :data:`xref`. Only to be used by the experienced PDF expert.
 * **Added** a method :meth:`paper_rect` which returns a :ref:`Rect` for a supplied paper format string. Example: *fitz.paper_rect("letter") = fitz.Rect(0.0, 0.0, 612.0, 792.0)*.
@@ -2492,9 +2492,9 @@ This patch version contains several improvements for embedded files and file att
 **Changes in Version 1.13.11**
-While the preceeding patch subversions only contained various fixes, this version again introduces major new features:
+While the preceding patch subversions only contained various fixes, this version again introduces major new features:
-* **Added** basic support for PDF widget annotations. You can now add PDF form fields of types Text, CheckBox, ListBox and ComboBox. Where necessary, the PDF is tranformed to a Form PDF with the first added widget.
+* **Added** basic support for PDF widget annotations. You can now add PDF form fields of types Text, CheckBox, ListBox and ComboBox. Where necessary, the PDF is transformed to a Form PDF with the first added widget.
 * **Fixed** issues #176 ("wrong file embedding"), #177 ("segment fault when invoking page.getText()")and #179 ("Segmentation fault using page.getLinks() on encrypted PDF").
@@ -2550,7 +2550,7 @@ The major enhancement is PDF form field support. Form fields are annotations of
 **Changes in Version 1.13.1**
-* :meth:`TextPage.extractDICT` is a new method to extract the contents of a document page (text and images). All document types are supported as with the other :ref:`TextPage` *extract*()* methods. The returned object is a dictionary of nested lists and other dictionaries, and **exactly equal** to the JSON-deserialization of the old :meth:`TextPage.extractJSON`. The difference is that the result is created directly -- no JSON module is used. Because the user needs no JSON module to interpet the information, it should be easier to use, and also have a better performance, because it contains images in their original **binary format** -- they need not be base64-decoded.
+* :meth:`TextPage.extractDICT` is a new method to extract the contents of a document page (text and images). All document types are supported as with the other :ref:`TextPage` *extract*()* methods. The returned object is a dictionary of nested lists and other dictionaries, and **exactly equal** to the JSON-deserialization of the old :meth:`TextPage.extractJSON`. The difference is that the result is created directly -- no JSON module is used. Because the user needs no JSON module to interpret the information, it should be easier to use, and also have a better performance, because it contains images in their original **binary format** -- they need not be base64-decoded.
 * :meth:`Page.getText` correspondingly supports the new parameter value *"dict"* to invoke the above method.
 * :meth:`TextPage.extractJSON` (resp. *Page.getText("json")*) is still supported for convenience, but its use is expected to decline.
@@ -2722,15 +2722,15 @@ Though MuPDF has declared it as being mostly a bug fix version, one major new fe
 MuPDF version 1.10 has a significant impact on our bindings. Some of the changes also affect the API -- in other words, **you** as a PyMuPDF user.
-* Link destination information has been reduced. Several properties of the *linkDest* class no longer contain valuable information. In fact, this class as a whole has been deleted from MuPDF's library and we in PyMuPDF only maintain it to provide compatibilty to existing code.
+* Link destination information has been reduced. Several properties of the *linkDest* class no longer contain valuable information. In fact, this class as a whole has been deleted from MuPDF's library and we in PyMuPDF only maintain it to provide compatibility to existing code.
 * In an effort to minimize memory requirements, several improvements have been built into MuPDF v1.10:
  - A new *config.h* file can be used to de-select unwanted features in the C base code. Using this feature we have been able to reduce the size of our binary *_fitz.o* / *_fitz.pyd* by about 50% (from 9 MB to 4.5 MB). When UPX-ing this, the size goes even further down to a very handy 2.3 MB.
- - The alpha (transparency) channel for pixmaps is now optional. Letting alpha default to *False* significantly reduces pixmap sizes (by 20% -- CMYK, 25% -- RGB, 50% -- GRAY). Many *Pixmap* constructors therefore now accept an *alpha* boolean to control inclusion of this channel. Other pixmap constructors (e.g. those for file and image input) create pixmaps with no alpha alltogether. On the downside, save methods for pixmaps no longer accept a *savealpha* option: this channel will always be saved when present. To minimize code breaks, we have left this parameter in the call patterns -- it will just be ignored.
+ - The alpha (transparency) channel for pixmaps is now optional. Letting alpha default to *False* significantly reduces pixmap sizes (by 20% -- CMYK, 25% -- RGB, 50% -- GRAY). Many *Pixmap* constructors therefore now accept an *alpha* boolean to control inclusion of this channel. Other pixmap constructors (e.g. those for file and image input) create pixmaps with no alpha altogether. On the downside, save methods for pixmaps no longer accept a *savealpha* option: this channel will always be saved when present. To minimize code breaks, we have left this parameter in the call patterns -- it will just be ignored.
-* *DisplayList* and *TextPage* class constructors now **require the mediabox** of the page they are referring to (i.e. the *page.bound()* rectangle). There is no way to construct this information from other sources, therefore a source code change cannot be avoided in these cases. We assume however, that not many users are actually employing these rather low level classes explixitely. So the impact of that change should be minor.
+* *DisplayList* and *TextPage* class constructors now **require the mediabox** of the page they are referring to (i.e. the *page.bound()* rectangle). There is no way to construct this information from other sources, therefore a source code change cannot be avoided in these cases. We assume however, that not many users are actually employing these rather low level classes explicitly. So the impact of that change should be minor.
 **Other Changes compared to Version 1.9.3**