This primarily drops support for Python 3.9, adds support for Python 3.13, and updates the parser to comply with Commonmark 0.31.2 and Markdown-It v14.1.0.
- ⬆️ Drop support for Python 3.9 in #360
- ⬆️ Comply with Commonmark 0.31.2 in #362
- 👌 Improve performance of "text" inline rule in #347
- 👌 Use `str.removesuffix` in #348
- 👌 limit the number of autocompleted cells in a table in #364
- 👌 fix quadratic complexity in reference parser in #367
- 🐛 Fix emphasis inside raw links bugs in #320
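As a rough illustration of the `str.removesuffix` item above (a generic sketch, not the code touched in #348; the helper names and file names are made up for this example):

```python
def strip_md_suffix(name: str) -> str:
    """Drop a trailing ".md" if present; otherwise return the name unchanged."""
    return name.removesuffix(".md")


def strip_md_suffix_manual(name: str) -> str:
    """Equivalent logic written with endswith() and slicing."""
    return name[: -len(".md")] if name.endswith(".md") else name


print(strip_md_suffix("README.md"))
print(strip_md_suffix("notes.txt"))
```

`str.removesuffix` is available from Python 3.9 onwards.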
Full Changelog: https://github.com/executablebooks/markdown-it-py/compare/v3.0.0...v4.0.0
Full Changelog: https://github.com/executablebooks/markdown-it-py/compare/v2.2.0...v3.0.0
Also add testing for Python 3.11
A key change is the addition of a new Token type, text_special, which is used to represent HTML entities and backslash escaped characters.
This ensures that (core) typographic transformation rules are not incorrectly applied to these texts.
The final core rule is now the new text_join rule, which joins adjacent text/text_special tokens,
and so no text_special tokens should be present in the final token stream.
Any custom typographic rules should be inserted before text_join.
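The effect of `text_join` can be pictured with a small stand-alone model. This is a simplified sketch, not the library's implementation: the `Tok` class and the `join_text` function are hypothetical stand-ins for the real `Token` class and core rule.

```python
from dataclasses import dataclass


@dataclass
class Tok:
    """Hypothetical, stripped-down stand-in for a markdown-it-py token."""

    type: str
    content: str = ""


def join_text(tokens: list[Tok]) -> list[Tok]:
    """Merge runs of adjacent text/text_special tokens into single text tokens."""
    joined: list[Tok] = []
    for tok in tokens:
        # Once typographic rules have run, text_special is treated as plain text.
        kind = "text" if tok.type == "text_special" else tok.type
        if kind == "text" and joined and joined[-1].type == "text":
            joined[-1] = Tok("text", joined[-1].content + tok.content)
        else:
            joined.append(Tok(kind, tok.content))
    return joined


stream = [
    Tok("text", "AT"),
    Tok("text_special", "&"),  # e.g. produced from an HTML entity
    Tok("text", "T"),
    Tok("softbreak"),
    Tok("text", "next line"),
]
result = join_text(stream)
```

In this model, a custom typographic rule would operate on `stream` (where the entity is still marked as `text_special`), not on `result`.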
A new linkify rule has also been added to the inline chain, which will linkify full URLs (e.g. https://example.com),
and fixes collision of emphasis and linkifier (so http://example.org/foo._bar_-_baz is now a single link, not emphasized).
Emails and fuzzy links are not affected by this.
- ♻️ Refactor backslash escape logic, add `text_special` #276
- ♻️ Parse entities to `text_special` token #280
- ♻️ Refactor: Add linkifier rule to inline chain for full links #279
- ‼️ Remove `(p)` => `§` replacement in typographer #281
- ‼️ Remove unused `silent` arg in `ParserBlock.tokenize` #284
- 🐛 FIX: numeric character reference passing #272
- 🐛 Fix: tab preventing paragraph continuation in lists #274
- 👌 Improve nested emphasis parsing #273
- 👌 fix possible ReDOS in newline rule #275
- 👌 Improve performance of `skipSpaces`/`skipChars` #271
- 👌 Show text of `text_special` in `tree.pretty` #282
The use of StateBase.srcCharCode is deprecated (with backward-compatibility), and all core uses are replaced by StateBase.src.
Conversion of source string characters to an integer representing the Unicode character is prevalent in the upstream JavaScript implementation, to improve performance.
However, it is unnecessary in Python and leads to harder-to-read code and degraded performance (during the conversion in the StateBase initialisation).
See #270, thanks to @hukkinj1.
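The difference between the two styles can be sketched in plain Python. This is an illustrative comparison only; the variable names are made up, and the tuple merely approximates what `srcCharCode` held.

```python
src = "# Heading"

# JavaScript-style: convert every character to its code point up front,
# then compare integers.
src_char_codes = tuple(ord(ch) for ch in src)
starts_with_hash_codes = src_char_codes[0] == 0x23  # 0x23 is "#"

# Python-style: compare characters of the source string directly,
# with no up-front conversion pass.
starts_with_hash_str = src[0] == "#"
```

Both checks give the same answer, but the second avoids building a second sequence as long as the input.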
For CommonMark, the presence of indented code blocks prevents any other block element from having an indent of 4 or more spaces.
Certain Markdown flavors and derivatives, such as mdx and djot, disable these code blocks though, since it is more common to use code fences and/or arbitrary indenting is desirable.
Previously, disabling code blocks did not remove the indent limitation, since most block elements had the 3 space limitation hard-coded.
This change centralises the logic of applying this limitation (in StateBlock.is_code_block), and only applies it when indented code blocks are enabled.
This allows for e.g.
<div>
<div>
I can indent as much as I want here.
<div>
<div>
See #260
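The centralised check can be pictured roughly as follows. This is a hypothetical free function written for illustration; the real `StateBlock.is_code_block` works on parser state, and its exact signature is not reproduced here.

```python
def is_code_block(indent: int, code_enabled: bool) -> bool:
    """Return True if a line with this indent should be read as indented code.

    Assumption: the limit only applies while the indented-code rule is enabled.
    """
    return code_enabled and indent >= 4


# With indented code blocks enabled, 4+ spaces of indent means "code".
print(is_code_block(4, code_enabled=True))
# With them disabled, any indent is allowed for other block elements.
print(is_code_block(8, code_enabled=False))
```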
Strict type annotation checking has been applied to the whole code base, ruff is now used for linting, and fuzzing tests have been added to the CI, to integrate with Google OSS-Fuzz testing, thanks to @DavidKorczynski.
- 🔧 MAINTAIN: Make type checking strict #
- 🔧 Add typing of rule functions #283
- 🔧 Move linting from flake8 to ruff #268
- 🧪 CI: Add fuzzing workflow for PRs #262
- 🔧 Add tox env for fuzz testcase run #263
- 🧪 Add OSS-Fuzz set up by @DavidKorczynski in #255
- 🧪 Fix fuzzing test failures #254
- ⬆️ UPGRADE: Allow linkify-it-py v2 by @hukkin in #218
- 🐛 FIX: CVE-2023-26303 by @chrisjsewell in #246
- 🐛 FIX: CLI crash on non-utf8 character by @chrisjsewell in #247
- 📚 DOCS: Update the example by @redstoneleo in #229
- 📚 DOCS: Add section about markdown renderer by @holamgadol in #227
- 🔧 Create SECURITY.md by @chrisjsewell in #248
- 🔧 MAINTAIN: Update mypy's additional dependencies by @hukkin in #217
- Fix typo by @jwilk in #230
- 🔧 Bump GH actions by @chrisjsewell in #244
- 🔧 Update benchmark pkg versions by @chrisjsewell in #245
Thanks to 🎉
- @jwilk made their first contribution in #230
- @holamgadol made their first contribution in #227
- @redstoneleo made their first contribution in #229
Full Changelog: https://github.com/executablebooks/markdown-it-py/compare/v2.1.0...v2.2.0
This release primarily replaces the attrs package dependency
with the built-in Python dataclasses module.
This should not be a breaking change for most use cases.
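For readers unfamiliar with `dataclasses`, here is a minimal sketch of what such a class looks like. `MiniToken` is a hypothetical, cut-down class for illustration, not the library's actual `Token`; code that previously called attrs helpers on these classes may need the `dataclasses` equivalents shown here.

```python
import dataclasses


@dataclasses.dataclass
class MiniToken:
    """Hypothetical, cut-down token; not the library's actual Token class."""

    type: str
    tag: str
    nesting: int
    # Mutable defaults need a factory so instances do not share one dict.
    attrs: dict = dataclasses.field(default_factory=dict)


open_tok = MiniToken("paragraph_open", "p", 1)

# dataclasses.replace() creates a modified copy.
close_tok = dataclasses.replace(open_tok, type="paragraph_close", nesting=-1)

# dataclasses.asdict() converts an instance to a plain dictionary.
as_dict = dataclasses.asdict(open_tok)
```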
- ⬆️ UPGRADE: Drop support for EOL Python 3.6 (#194)
- ♻️ REFACTOR: Move `Rule`/`Delimiter` classes from `attrs` to `dataclass` (#211)
- ♻️ REFACTOR: Move `Token` class from `attrs` to `dataclass` (#211)
- ‼️ Remove deprecated `NestedTokens` and `nest_tokens`
- ✨ NEW: Save ordered list numbering (#192)
- 🐛 FIX: Combination of blockquotes, list and newlines causes `IndexError` (#207)
- 🐛 FIX: Crash when file ends with empty blockquote line.
- ✨ NEW: Add `inline_definitions` option. This option allows for `definition` tokens to be inserted into the token stream, at the point where the definition is located in the source text. It is useful for cases where one wishes to capture a "lossless" syntax tree of the parsed Markdown (in conjunction with the `store_labels` option).
- ⬆️ Update: Sync with markdown-it v12.1.0 and CommonMark v0.30
- ♻️ REFACTOR: Port `mdurl` and `punycode` for URL normalisation (thanks to @hukkin!). This port fixes the outstanding CommonMark compliance tests.
- ♻️ REFACTOR: Remove `AttrDict`. This is no longer used in core or mdit-py-plugins; instead, standard dictionaries are used.
- 👌 IMPROVE: Use `__all__` to signal re-exports
⬆️ UPGRADE: attrs -> v21 (#165)
This release has no breaking changes (see: https://github.com/python-attrs/attrs/blob/main/CHANGELOG.rst)
The first stable release of markdown-it-py 🎉
See the changes in the beta releases below, thanks to all the contributors in the last year!
- 👌 IMPROVE: Add `RendererProtocol` type, for typing renderers (thanks to @hukkinj1)
- 🔧 MAINTAIN: `None` is no longer allowed as a valid `src` input for `StateBase` subclasses
Move `mdit-py-plugins` out of the core install requirements and into a `plugins` extra.
Synchronised code with the upstream Markdown-It v12.0.6:
- 🐛 FIX: Raise HTML blocks priority to resolve conflict with headings
- 🐛 FIX: Newline not rendered in image alt attribute
This is the first beta release of the stable v1.x series.
There are four notable (and breaking) changes:
- The code has been synchronised with the upstream Markdown-It `v12.0.4`. In particular, this update alters the parsing of tables to be consistent with the GFM specification: https://github.github.com/gfm/#tables-extension- . A number of parsing performance and validation improvements are also included.
- `Token.attrs` are now stored as dictionaries, rather than a list of lists. This is a departure from upstream Markdown-It, allowed by Python's guarantee of ordered dictionaries (see #142), and is the more natural representation. Note that the `attrGet`, `attrSet`, `attrPush` and `attrJoin` methods remain identical to those upstream, and `Token.as_dict(as_upstream=True)` will convert the token back to a directly comparable dict.
- The use of `AttrDict` has been replaced: For `env`, any Python mutable mapping is now allowed, and so attribute access to keys is not supported (differing from the JavaScript dictionary). `MarkdownIt.options` is now set as an `OptionsDict`, which is a dictionary sub-class, with attribute access only for core MarkdownIt configuration keys.
- Introduction of the `SyntaxTreeNode`. This is a more comprehensive replacement for `nest_tokens` and `NestedTokens` (which are now deprecated). It allows for the `Token` stream to be converted to/from a nested tree structure, with opening/closing tokens collapsed into a single `SyntaxTreeNode` and the intermediate tokens set as children. See Creating a syntax tree documentation for details.
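The relationship between the two attribute representations can be sketched with plain Python objects. This is a stand-alone illustration that does not use the library's `Token` class; the variable names and example values are assumptions.

```python
# Upstream (JavaScript) representation: a list of [name, value] pairs.
upstream_attrs = [["href", "https://example.com"], ["title", "Example"]]

# Dictionary representation: insertion order is preserved (Python 3.7+),
# so no ordering information is lost in the conversion.
attrs = dict(upstream_attrs)

# Lookup by name becomes a direct dictionary access.
href = attrs["href"]

# Converting back yields a structure directly comparable to upstream.
as_upstream = [[name, value] for name, value in attrs.items()]
```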
- Fix exception due to empty lines after blockquote+footnote
- Fix linkify link nesting levels
- Fix the use of `Ruler.at` for plugins
- Avoid fenced token mutations during rendering
- Fix CLI version info and correct return of exit codes
This release brings Markdown-It-Py in line with Markdown-It v11.0.1 (2020-09-14), applying two fixes:
Thanks to @hukkinj1!
This release provides some improvements to the code base:
- 🐛 FIX: Do not resolve backslash escapes inside auto-links
- 🐛 FIX: Add content to image tokens
- 👌 IMPROVE: Add more type annotations, thanks to @hukkinj1
🗑 DEPRECATE: Move plugins to mdit_py_plugins
Plugins (in markdown_it.extensions) have now been moved to executablebooks/mdit-py-plugins.
This will allow for their maintenance to occur on a different cycle to the core code, facilitating the release of a v1.0.0 for this package.
🔧 MAINTAIN: Add mypy type-checking, thanks to @hukkinj1.
✨ NEW: Add linkify, thanks to @tsutsu3.
This extension uses linkify-it-py to identify URL links within text:
- `github.com` -> `<a href="http://github.com">github.com</a>`
Important: To use this extension you must install linkify-it-py; pip install markdown-it-py[linkify]
It can then be activated by:
from markdown_it import MarkdownIt
md = MarkdownIt().enable("linkify")
md.options["linkify"] = True
✨ NEW: Add smartquotes, thanks to @tsutsu3.
This extension will convert basic quote marks to their opening and closing variants:
- 'single quotes' -> ‘single quotes’
- "double quotes" -> “double quotes”
It can be activated by:
from markdown_it import MarkdownIt
md = MarkdownIt().enable("smartquotes")
md.options["typographer"] = True
✨ NEW: Add markdown-it-task-lists plugin, thanks to @wna-se.
This is a port of the JS markdown-it-task-lists,
for building task/todo lists out of markdown lists with items starting with [ ] or [x].
For example:
- [ ] An item that needs doing
- [x] An item that is complete
This plugin can be activated by:
from markdown_it import MarkdownIt
from markdown_it.extensions.tasklists import tasklists_plugin
md = MarkdownIt().use(tasklists_plugin)
🐛 Various bug fixes, thanks to @hukkinj1:
- Do not copy empty `env` arg in `MarkdownIt.render`
- `_Entities.__contains__` fix return data
- Parsing of unicode ordinals
- Handling of final character in `skipSpacesBack` and `skipCharsBack` methods
- Avoid exception when document ends in heading/blockquote marker
🧪 TESTS: Add CI for Python 3.9 and PyPy3
- ✨ NEW: Add simple typographic replacements, thanks to @tsutsu3: This allows you to add the `typographer` option to the parser, to replace particular text constructs:
  - `(c)`, `(C)` → ©
  - `(tm)`, `(TM)` → ™
  - `(r)`, `(R)` → ®
  - `(p)`, `(P)` → §
  - `+-` → ±
  - `...` → …
  - `?....` → ?..
  - `!....` → !..
  - `????????` → ???
  - `!!!!!` → !!!
  - `,,,` → ,
  - `--` → &ndash
  - `---` → &mdash

  md = MarkdownIt().enable("replacements")
  md.options["typographer"] = True
- 📚 DOCS: Improve documentation for CLI, thanks to @westurner
- 👌 IMPROVE: Use `re.sub()` instead of `re.subn()[0]`, thanks to @hukkinj1
- 🐛 FIX: An exception raised by having multiple blank lines at the end of some files
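To give a feel for the typographic replacements listed above, here is a simplified stand-alone sketch. It is not the library's rule: it covers only a hypothetical subset of the replacements, and the names `SCOPED`, `SCOPED_RE` and `replace_symbols` are made up for this example. It also shows why `re.sub()` is the more direct call when the substitution count is not needed.

```python
import re

# Hypothetical subset of the replacements listed above.
SCOPED = {"c": "©", "tm": "™", "r": "®"}
SCOPED_RE = re.compile(r"\((c|tm|r)\)", re.IGNORECASE)


def replace_symbols(text: str) -> str:
    """Replace (c), (tm), (r) in any letter case, and +-, with their symbols."""
    text = SCOPED_RE.sub(lambda m: SCOPED[m.group(1).lower()], text)
    text = re.sub(r"\+-", "±", text)
    return text


sample = "(c) 2020 Example(TM), accuracy +-5%"
replaced = replace_symbols(sample)

# re.subn() returns a (new_string, count) tuple; re.sub() returns only the string.
replaced_subn, count = re.subn(r"\+-", "±", "+-1 and +-2")
```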
👌 IMPROVE: Add store_labels option.
This allows for storage of original reference label in link/image token's metadata, which can be useful for renderers.
✨ NEW: Add anchors_plugin for headers, which can produce:
<h1 id="title-string">Title String <a class="header-anchor" href="#title-string">¶</a></h1>
🐛 Fixed an undefined variable in the reference block.
🐛 Fixed an IndexError in container_plugin, when there is no newline on the closing tag line.
⬆️ UPGRADE: attrs -> v20
This is not breaking, since it only deprecates Python 3.4 (see CHANGELOG.rst)
- Added `deflist` and `dollarmath` plugins (see plugins list).
- Added benchmarking tests and CI (see https://executablebooks.github.io/markdown-it-py/dev/bench/)
- Improved performance of computing ordinals (=> 10-15% parsing speed increase). Thanks to @sildar!
- Stopped empty lines at the end of the document, after certain list blocks, raising an exception (#36).
- Allow myst-role to accept names containing digits (0-9).
- Added `containers` plugin (see plugins list)
- Plugins and improved contributing section