1 change: 1 addition & 0 deletions AUTHORS.rst
@@ -40,6 +40,7 @@ The AUTHORS/Contributors are (and/or have been):
* Edward Ross <gh: EdwardJRoss>
* Gregory Anders <gh: gpanders>
* Alex Vandiver <gh: alexmv>
* Anis <gh: assinscreedFC>

Maintainer:

6 changes: 6 additions & 0 deletions ChangeLog.rst
@@ -1,3 +1,9 @@
Unreleased
==========
----

* Fix #405: Don't insert a spurious space between a closing emphasis marker and following punctuation.

2025.4.15
=========
----
8 changes: 6 additions & 2 deletions html2text/__init__.py
@@ -881,11 +881,15 @@ def handle_data(self, data: str, entity_char: bool = False) -> None:
             self.preceding_stressed = True
         elif self.preceding_stressed:
             if (
-                re.match(r"[^][(){}\s.!?]", data[0])
+                re.match(r"\w", data[0])
                 and not hn(self.current_tag)
                 and self.current_tag not in ["a", "code", "pre"]
             ):
-                # should match a letter or common punctuation
+                # A following word character (letter, digit or underscore)
+                # attaches to the closing emphasis marker and stops Markdown
+                # from recognising it, so the separating space is needed only
+                # before word characters -- not before punctuation, which
+                # previously received a spurious space.
                 data = " " + data
             self.preceding_stressed = False
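
To make the effect of the regex swap in this hunk concrete, here is a minimal standalone sketch that compares the removed and added patterns on a few characters taken from the new test fixture. The helpers `needs_space` and `changed_characters` are hypothetical names introduced only for this illustration; they are not part of html2text, and they model only the `re.match` test, not the `hn(self.current_tag)` or tag-list checks that the real condition also applies.

```python
import re

# Patterns copied from the two sides of the hunk.
OLD_PATTERN = r"[^][(){}\s.!?]"  # removed line: anything except brackets, whitespace and . ! ?
NEW_PATTERN = r"\w"              # added line: letters, digits and underscore only


def needs_space(pattern: str, following: str) -> bool:
    """Hypothetical helper: does the first character of `following` match `pattern`?

    In the patched code a match means a space is inserted after the
    closing emphasis marker (the tag checks are ignored here).
    """
    return re.match(pattern, following[0]) is not None


def changed_characters(candidates: str) -> list:
    """Return the characters for which the old and new patterns disagree."""
    return [
        ch
        for ch in candidates
        if needs_space(OLD_PATTERN, ch) != needs_space(NEW_PATTERN, ch)
    ]


if __name__ == "__main__":
    for ch in ",:;\"'.b2_":
        print(repr(ch), needs_space(OLD_PATTERN, ch), needs_space(NEW_PATTERN, ch))
```

Under these assumptions, the two patterns disagree only on the comma, colon, semicolon and the two quote characters, which is the punctuation exercised by `test/emphasis_punctuation.html`. Letters, digits and the underscore still get the separating space, and `.` was already excluded by the old pattern.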

4 changes: 4 additions & 0 deletions test/emphasis_punctuation.html
@@ -0,0 +1,4 @@
<p>An <em>emphasized</em>, then <em>another</em>: also <em>more</em>; right?</p>
<p>A <strong>strong</strong>, and a quote <em>here</em>"end".</p>
<p><em>word</em>boundary keeps its space, as does a digit <em>v</em>2.</p>
<p>No space before an apostrophe: <em>cat</em>'s tail.</p>
8 changes: 8 additions & 0 deletions test/emphasis_punctuation.md
@@ -0,0 +1,8 @@
An _emphasized_, then _another_: also _more_; right?

A **strong**, and a quote _here_"end".

_word_ boundary keeps its space, as does a digit _v_ 2.

No space before an apostrophe: _cat_'s tail.