
Commit a2e265c

📚 DOCS: Add GFM and GFM autolink plugins to documentation
1 parent 4582513 commit a2e265c

2 files changed

Lines changed: 39 additions & 1 deletion

docs/index.md

Lines changed: 12 additions & 0 deletions
@@ -43,6 +43,18 @@ html_string = md.render("some *Markdown*")
 .. autofunction:: mdit_py_plugins.front_matter.front_matter_plugin
 ```
 
+## GFM (GitHub Flavored Markdown)
+
+```{eval-rst}
+.. autofunction:: mdit_py_plugins.gfm.gfm_plugin
+```
+
+## GFM Autolinks
+
+```{eval-rst}
+.. autofunction:: mdit_py_plugins.gfm_autolink.gfm_autolink_plugin
+```
+
 ## Footnotes
 
 ```{eval-rst}

mdit_py_plugins/gfm_autolink/index.py

Lines changed: 27 additions & 1 deletion
@@ -2,12 +2,38 @@
 
 Three inline scanners are registered:
 
-- **gfm_autolink_www** (char ``w``): bare ``www.`` URLs
+- **gfm_autolink_www** (char ``w``): bare ``www.`` URLs.
+  Uses ``add_terminator_char("w")`` so the text scanner interrupts at ``w``.
 - **gfm_autolink_protocol** (char ``:``): ``http://``, ``https://``,
   ``mailto:``, ``xmpp:`` URLs via back-scanning ``pending``.
 - **gfm_autolink_email** (char ``@``): bare email addresses via
   back-scanning ``pending``.
 
+Since ``:`` and ``@`` are already default terminator characters in
+markdown-it-py, the protocol and email rules are invoked at every occurrence
+of those characters. They use a *back-scanning* approach: looking backwards
+through ``state.pending`` for a protocol prefix or email local-part that was
+accumulated by the text rule. This means every ``:`` and ``@`` in the
+document incurs a (cheap) regex check or character scan of pending text.
+
+The trade-off vs. a **core-rule** (post-processing) approach, which would
+walk the final token stream, find autolink patterns in text tokens, and
+split them, is:
+
+- **Inline approach** (current): simpler, integrates naturally with
+  ``state.linkLevel`` to suppress matching inside links, but relies on the
+  prefix being present in ``state.pending`` (if a prior inline rule consumed
+  part of the prefix, matching would fail; unlikely in practice).
+- **Core-rule approach**: guaranteed to find all autolinks regardless of
+  inline rule ordering, but requires token-stream surgery (splitting text
+  tokens and inserting link tokens) and cannot easily interact with nesting
+  guards like ``linkLevel``.
+
+The ``w`` terminator is the only *new* terminator added. It causes the text
+rule to interrupt at every ``w``, which is a minor performance cost for
+documents heavy in that letter, but necessary since ``www.`` must be matched
+from the start of the URL.
+
 Specification: https://github.github.com/gfm/#autolinks-extension-
 
 .. versionadded:: 0.5.0
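
Reviewer note: the back-scanning idea in the docstring can be illustrated with a small stdlib-only model. This is a sketch, not the plugin's code: `ToyState`, `protocol_rule`, and `tokenize` are hypothetical names, the URL pattern is deliberately simplified, and only `http`/`https` are handled (the plugin also covers `mailto:`, `xmpp:`, and bare emails). It only shows how a rule triggered at `:` can look backwards through pending text, retract the scheme, and emit a link, and how a link-level guard suppresses matching.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToyState:
    """Hypothetical stand-in for an inline parser state (not markdown-it-py's)."""
    src: str
    pos: int = 0
    pending: str = ""      # text accumulated so far, not yet emitted as a token
    link_level: int = 0    # > 0 means "currently inside a link"
    tokens: list = field(default_factory=list)

    def flush(self):
        # Emit accumulated text as a single text token.
        if self.pending:
            self.tokens.append(("text", self.pending))
            self.pending = ""


# Scheme must sit at the very end of pending and not be glued to a word.
SCHEME_AT_END = re.compile(r"(?<![A-Za-z0-9])(https?)$")
# Simplified URL tail (assumption): '://' followed by non-space, non-'<' chars.
URL_TAIL = re.compile(r"://[^\s<]+")


def protocol_rule(state):
    """Invoked when src[pos] == ':'; back-scans pending for a scheme."""
    if state.link_level > 0:
        return False                      # never autolink inside a link
    m = SCHEME_AT_END.search(state.pending)
    if m is None:
        return False
    tail = URL_TAIL.match(state.src, state.pos)
    if tail is None:
        return False
    scheme = m.group(1)
    state.pending = state.pending[: m.start(1)]   # retract the scheme
    state.flush()
    state.tokens.append(("autolink", scheme + tail.group(0)))
    state.pos = tail.end()
    return True


def tokenize(src, link_level=0):
    state = ToyState(src, link_level=link_level)
    while state.pos < len(src):
        ch = src[state.pos]
        if ch == ":" and protocol_rule(state):
            continue
        state.pending += ch               # default: behave like the text rule
        state.pos += 1
    state.flush()
    return state.tokens
```

Note that a `:` without a scheme behind it (as in `ratio 3:1`) costs one failed regex search and is then treated as ordinary text, which mirrors the "cheap check at every occurrence" cost the docstring describes.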

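
Reviewer note: for comparison, the core-rule alternative discussed in the docstring amounts to post-processing a finished token list. The following is a minimal stdlib-only sketch under stated assumptions: tokens are modeled as hypothetical `(kind, content)` tuples rather than real markdown-it-py tokens, `split_autolinks` is an illustrative name, and the `www.` pattern is far looser than the GFM specification.

```python
import re

# Simplified bare-www pattern (assumption): not preceded by an alphanumeric.
WWW_RE = re.compile(r"(?<![A-Za-z0-9])www\.[^\s<]+")


def split_autolinks(tokens):
    """Walk a flat token list and split text tokens around www. matches."""
    out = []
    for kind, content in tokens:
        if kind != "text":
            out.append((kind, content))   # leave code, links, etc. untouched
            continue
        last = 0
        for m in WWW_RE.finditer(content):
            if m.start() > last:
                out.append(("text", content[last:m.start()]))
            out.append(("autolink", m.group(0)))
            last = m.end()
        if last < len(content):
            out.append(("text", content[last:]))
    return out
```

In this flat model the pass has no notion of nesting, so a text token that sits inside a link would be split like any other. A real core rule would have to track link open/close tokens itself, which is the extra surgery the docstring refers to.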