Commit e843eaa

✨ NEW: add <img> tag parsing (#210)
1 parent 985bcc0 commit e843eaa

9 files changed

Lines changed: 193 additions & 14 deletions

docs/conf.py

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -67,6 +67,7 @@
6767

6868
myst_amsmath_enable = True
6969
myst_admonition_enable = True
70+
myst_html_img = True
7071

7172

7273
def run_apidoc(app):

docs/using/img/fun-fish.png

89.9 KB

docs/using/intro.md

Lines changed: 6 additions & 3 deletions
@@ -217,13 +217,16 @@ To do so, use the keywords beginning `myst_`.
217217
* - Option
218218
- Default
219219
- Description
220+
* - `myst_disable_syntax`
221+
- ()
222+
- List of markdown syntax elements to disable, see the [markdown-it parser guide](markdown_it:using).
220223
* - `myst_url_schemes`
221224
- `None`
222225
- [URI schemes](https://en.wikipedia.org/wiki/List_of_URI_schemes) that will be recognised as external URLs in `[](scheme:loc)` syntax, or set `None` to recognise all.
223226
Other links will be resolved as internal cross-references.
224-
* - `myst_disable_syntax`
225-
- ()
226-
- List of markdown syntax elements to disable, see the [markdown-it parser guide](markdown_it:using).
227+
* - `myst_html_img`
228+
- `False`
229+
- Convert HTML `<img>` elements to Sphinx image nodes; see the [image syntax](syntax/images) for details
227230
* - `myst_math_delimiters`
228231
- "dollars"
229232
- Delimiters for parsing math, see the [Math syntax](syntax/math) for details

docs/using/syntax.md

Lines changed: 60 additions & 9 deletions
@@ -232,22 +232,22 @@ In addition to these summaries of inline syntax, see {ref}`extra-markdown-syntax
232232
- Description
233233
- Example
234234
* - HTMLSpan
235-
- any valid HTML (rendered in HTML output only)
235+
- Any valid HTML (rendered in HTML output only)
236236
- ```html
237237
<p>some text</p>
238238
```
239239
* - EscapeSequence
240-
- escaped symbols (to avoid them being interpreted as other syntax elements)
240+
- Escaped symbols (to avoid them being interpreted as other syntax elements)
241241
- ```md
242242
\*
243243
```
244244
* - AutoLink
245-
- link that is shown in final output
245+
- Link that is shown in final output
246246
- ```md
247247
<http://www.google.com>
248248
```
249249
* - InlineCode
250-
- literal text
250+
- Literal text
251251
- ```md
252252
`a=1`
253253
```
@@ -257,7 +257,8 @@ In addition to these summaries of inline syntax, see {ref}`extra-markdown-syntax
257257
A hard break\
258258
```
259259
* - Image
260-
- link to an image
260+
- Link to an image.
261+
You can also use HTML syntax, for example to set the image size; [see here](syntax/images) for details.
261262
- ```md
262263
![alt](src "title")
263264
```
@@ -267,17 +268,17 @@ In addition to these summaries of inline syntax, see {ref}`extra-markdown-syntax
267268
[text](target "title") or [text][key]
268269
```
269270
* - Strong
270-
- bold text
271+
- Bold text
271272
- ```md
272273
**strong**
273274
```
274275
* - Emphasis
275-
- italic text
276+
- Italic text
276277
- ```md
277278
*emphasis*
278279
```
279280
* - RawText
280-
- any text
281+
- Any text
281282
- ```md
282283
any text
283284
```
@@ -894,8 +895,58 @@ leave the "text" section of the markdown link empty. For example, this
894895
markdown: `[](syntax.md)` will result in: [](syntax.md).
895896
```
896897
897-
(syntax/footnotes)=
898+
(syntax/images)=
899+
900+
### Images
901+
902+
MyST provides a few different syntaxes for including images in your documentation, as explained below.
903+
904+
The first is the standard Markdown syntax:
905+
906+
```md
907+
![fishy](img/fun-fish.png)
908+
```
909+
910+
![fishy](img/fun-fish.png)
911+
912+
This will correctly copy the image to the build folder and will render it in all output formats (HTML, TeX, etc).
913+
However, it is limited in the configuration that can be applied, for example setting a width.
914+
915+
As discussed [above](syntax/directives), MyST allows directives such as `image` and `figure` to be used (see {ref}`the sphinx documentation <sphinx:rst-primer>`):
916+
917+
````md
918+
```{image} img/fun-fish.png
919+
:alt: fishy
920+
:class: bg-primary
921+
:width: 200px
922+
:align: center
923+
```
924+
````
925+
926+
```{image} img/fun-fish.png
927+
:alt: fishy
928+
:class: bg-primary mb-1
929+
:width: 200px
930+
```
931+
932+
Additional options can now be set. However, in contrast to the Markdown syntax, this syntax will not show the image in common Markdown viewers (for example when the files are viewed on GitHub).
933+
934+
The final option is directly using HTML, which is also parsed by MyST.
935+
This is usually a bad option: the HTML is treated as raw text during the build process, so Sphinx will not recognise that the image file needs to be copied, and the HTML will not be output in non-HTML formats.
936+
937+
HTML parsing to the rescue!
938+
By setting `myst_html_img = True` in the sphinx `conf.py` configuration file, MyST-Parser will attempt to convert any isolated `img` tags (i.e. not wrapped in any other HTML) to the internal representation used in sphinx.
898939
940+
```md
941+
<img src="img/fun-fish.png" alt="fishy" class="bg-primary" width="200px">
942+
```
943+
944+
<img src="img/fun-fish.png" alt="fishy" class="bg-primary mb-1" width="200px">
945+
946+
Allowed attributes are equivalent to the `image` directive: src, alt, class, width, height, align and name.
947+
Any other attributes will be dropped.
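As a minimal sketch of the configuration described above, the option can be switched on in `conf.py` as follows. The `extensions` line is an assumption about how the project loads MyST-Parser and is not part of this commit; only `myst_html_img` is introduced here, and it defaults to `False`.

```python
# conf.py (sketch; the `extensions` entry is an assumed project setup)
extensions = ["myst_parser"]

# Off by default; when True, isolated <img> tags are converted to image nodes
myst_html_img = True
```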
948+
949+
(syntax/footnotes)=
899950
### Footnotes
900951
901952
Footnote labels **start with `^`** and can then be any alpha-numeric string (no spaces),

myst_parser/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ def setup_sphinx(app):
2929
app.add_config_value("myst_math_delimiters", "dollars", "env")
3030
app.add_config_value("myst_amsmath_enable", False, "env")
3131
app.add_config_value("myst_admonition_enable", False, "env")
32+
app.add_config_value("myst_html_img", False, "env")
3233

3334
app.connect("config-inited", validate_config)
3435

myst_parser/docutils_renderer.py

Lines changed: 7 additions & 1 deletion
@@ -33,6 +33,7 @@
3333
MockIncludeDirective,
3434
)
3535
from .parse_directives import parse_directive_text, DirectiveParsingError
36+
from .parse_html import HTMLImgParser
3637

3738

3839
def make_document(source_path="notset") -> nodes.document:
@@ -468,7 +469,12 @@ def render_html_inline(self, token):
468469
self.current_node.append(nodes.raw("", token.content, format="html"))
469470

470471
def render_html_block(self, token):
471-
self.current_node.append(nodes.raw("", token.content, format="html"))
472+
node = None
473+
if self.config.get("myst_html_img", False):
474+
node = HTMLImgParser().parse(token.content, self.document, token.map[0])
475+
if node is None:
476+
node = nodes.raw("", token.content, format="html")
477+
self.current_node.append(node)
472478

473479
def render_image(self, token):
474480
img_node = nodes.image()

myst_parser/parse_html.py

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
1+
from html.parser import HTMLParser
2+
from docutils import nodes
3+
4+
from docutils.parsers.rst import directives
5+
6+
7+
class ExitImageParse(Exception):
8+
pass
9+
10+
11+
def align(argument):
12+
return directives.choice(argument, ("left", "center", "right"))
13+
14+
15+
def make_error(document, error_msg, text, line_number):
16+
return document.reporter.error(
17+
"<img> conversion: {}".format(error_msg),
18+
nodes.literal_block(text, text),
19+
line=line_number,
20+
)
21+
22+
23+
class HTMLImgParser(HTMLParser):
24+
def handle_starttag(self, tag, attrs):
25+
if tag == "img":
26+
self._attrs = dict(attrs)
27+
raise ExitImageParse()
28+
29+
def parse(self, text: str, document: nodes.document, line_number: int):
30+
self.reset()
31+
self._attrs = None
32+
try:
33+
self.feed(text)
34+
except ExitImageParse:
35+
pass
36+
if self._attrs is None:
37+
return
38+
39+
# TODO check for preceding text?
40+
41+
if "src" not in self._attrs:
42+
return make_error(document, "missing src attribute", text, line_number)
43+
44+
options = {}
45+
for name, key, spec in [
46+
("src", "uri", directives.uri),
47+
("class", "classes", directives.class_option),
48+
("alt", "alt", directives.unchanged),
49+
("height", "height", directives.length_or_unitless),
50+
("width", "width", directives.length_or_percentage_or_unitless),
51+
("align", "align", align)
52+
# note: docutils also has scale and target
53+
]:
54+
if name in self._attrs:
55+
value = self._attrs[name]
56+
try:
57+
options[key] = spec(value)
58+
except (ValueError, TypeError) as error:
59+
error_msg = "Invalid attribute: (key: '{}'; value: {})\n{}".format(
60+
name, value, error
61+
)
62+
return make_error(document, error_msg, text, line_number)
63+
64+
node = nodes.image(text, **options)
65+
if "name" in self._attrs:
66+
name = nodes.fully_normalize_name(self._attrs["name"])
67+
node["names"].append(name)
68+
document.note_explicit_target(node, node)
69+
70+
return node
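The early-exit technique used by `HTMLImgParser` can be tried in isolation with only the standard library. The sketch below is not the code from this commit: the names `FirstTagImgAttrs`, `_StopParsing` and `first_img_attrs` are hypothetical, and it assumes that parsing aborts at the first start tag of any kind, which is one way to obtain the "isolated `img`" behaviour the documentation describes (indentation is not recoverable from this page, so the real control flow may differ).

```python
from html.parser import HTMLParser
from typing import Optional


class _StopParsing(Exception):
    """Raised to abort parsing as soon as the first start tag is seen."""


class FirstTagImgAttrs(HTMLParser):
    """Hypothetical helper: record attributes if the first tag is <img>."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        # Only an <img> that opens the markup is accepted (assumption).
        if tag == "img":
            self.attrs = dict(attrs)
        # Stop at the first start tag, whatever it is.
        raise _StopParsing()


def first_img_attrs(text: str) -> Optional[dict]:
    """Return the <img> attributes, or None if the first tag is not <img>."""
    parser = FirstTagImgAttrs()
    try:
        parser.feed(text)
    except _StopParsing:
        pass
    return parser.attrs


if __name__ == "__main__":
    print(first_img_attrs('<img src="img/fun-fish.png" width="200px">'))
    print(first_img_attrs('<div><img src="img/fun-fish.png"></div>'))
```

Raising an exception from the handler is a simple way to stop `HTMLParser.feed` early, since the class offers no explicit "stop" method. Note that, like the `TODO` in the diff suggests, this sketch does not check for text preceding the tag.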

myst_parser/sphinx_parser.py

Lines changed: 7 additions & 1 deletion
@@ -27,6 +27,7 @@ class MystParser(Parser):
2727
"myst_math_delimiters": "dollars",
2828
"myst_amsmath_enable": False,
2929
"myst_admonition_enable": False,
30+
"myst_html_img": False,
3031
}
3132

3233
# these specs are copied verbatim from the docutils RST parser
@@ -202,7 +203,12 @@ def get_markdown_parser(config: dict, renderer: str = "sphinx"):
202203
enable_amsmath=config["myst_amsmath_enable"],
203204
enable_admonitions=config["myst_admonition_enable"],
204205
)
205-
parser.options.update({"myst_url_schemes": config["myst_url_schemes"]})
206+
parser.options.update(
207+
{
208+
"myst_url_schemes": config["myst_url_schemes"],
209+
"myst_html_img": config["myst_html_img"],
210+
}
211+
)
206212
return parser
207213

208214

tests/test_html_parse.py

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
from unittest.mock import Mock
2+
import pytest
3+
4+
from myst_parser.parse_html import HTMLImgParser
5+
6+
7+
@pytest.mark.parametrize("text", ["", "abc", "<div></div>" "<div><img></div>"])
8+
def test_html_parse_none(text):
9+
document = Mock(reporter=Mock(error=Mock(return_value="error")))
10+
output = HTMLImgParser().parse(text, document, 0)
11+
assert output is None
12+
13+
14+
@pytest.mark.parametrize(
15+
"text", ["<img>", '<img src="a" width="x">', '<img src="a" height="x">']
16+
)
17+
def test_html_parse_error(text):
18+
document = Mock(reporter=Mock(error=Mock(return_value="error")))
19+
output = HTMLImgParser().parse(text, document, 0)
20+
assert output == "error"
21+
22+
23+
@pytest.mark.parametrize(
24+
"text,outcome",
25+
[
26+
('<img src="a">', '<image uri="a">'),
27+
('<img src="a" other="b">', '<image uri="a">'),
28+
(
29+
'<img src="a" height="200px" class="a b">',
30+
'<image classes="a b" height="200px" uri="a">',
31+
),
32+
(
33+
'<img src="a" name="b" align="left">',
34+
'<image align="left" names="b" uri="a">',
35+
),
36+
],
37+
)
38+
def test_html_parse_ok(text, outcome):
39+
document = Mock(reporter=Mock(error=Mock(return_value="error")))
40+
output = HTMLImgParser().parse(text, document, 0)
41+
assert output.pformat().strip() == outcome
