Commit 9456e79

Replace attribute_manager with a new rdoc-inline-format parser (#1559)
## RDoc::Markup::AttributeManager

RDoc parses block-level structure in `RDoc::Markup::Parser` and inline styling in `RDoc::Markup::AttributeManager`.

`RDoc::Markup::AttributeManager` is a string-replacing, macro-based parser. It is very complicated, and its mechanism is the root cause of many bugs. We need to remove and replace it.

It converts input into a flat stream of strings and style-changing operations. However, inline styles are structured data, and the output format (HTML) is structured data too, so there is an architecture mismatch.

Tidylink `{label}[url]` should be a syntax rule, but it is currently handled by a regexp-based macro called `regexp_handling`. As a result, parsing nested structures such as a styled tidylink label frequently fails, and the behavior is hard to control.

## Solution

- Eliminate `RDoc::Markup::AttributeManager`
- Create a parser that generates structured data
- Traverse the structured data to generate output instead of replacing strings
- Keep the regexp-handling macro controllable: apply it only to text nodes

## New inline styling syntax

### Tokens

- Word pairs `+word+` `*word*` `_word_` `` `word` `` and so on
- Standalone tags `<br>`, `<tt>code_text</tt>`, `<code>code_text</code>`
- Open and close tags `<i>` `</i>`, `<b>` `</b>`, `<em>` `</em>`, `<s>` `</s>`, `<del>` `</del>`
- Tidy link opening `{` and closing `}[url_part]`
- Simplified tidylink `word[url_part]`
- Text nodes

### Regexp handling macro

Matching against the CROSSREF, RDOCREF, and HYPERLINK regexps should be applied only to text nodes, after the parsing phase. Modifying the parsed tree instead of running gsub on text nodes is another option for implementing this.

### Error recovery

The RDoc format has no syntax errors. We need to keep the behavior as close as possible to what it was before.

- Closing tags and braces will invalidate unclosed tags, and unclosed tags are treated as plain text
  - `<a><b><c></a>` will be `<a>&lt;b&gt;&lt;c&gt;</a>`
  - `{<a><b>}[url]` will be a tidylink with label `&lt;a&gt;&lt;b&gt;`
- Unmatched closing tags and braces will be treated as plain text
  - `<a></b>}</a>` will be `<a>&lt;/b&gt;}</a>`
- Tidylink inside tidylink will invalidate outside tidylinks
  - `{{inner}[url]}[<b>]</b>` will be `"{" + inner_tidylink + "}[" + bold("]")`

### Simplified tidylink

RDoc was converting ``a*_`+<b>c[foo]`` to ``<a href="foo">a*_`+&lt;b&gt;c</a>``. This is terrible: it can't coexist with other syntaxes like `*word*` `_word_` `+word+`. We should restrict the allowed characters and recommend `{label}[url]`. In this pull request, only `Alphanumeric[url]` (which must start with a letter) is supported.
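
The tag-related error-recovery rules in this commit message can be sketched with a small stack-based parser. This is only an illustration under simplifying assumptions, not the parser added by this commit: `TOKEN`, `ESCAPES`, `parse_inline`, and `to_html` are hypothetical names, and only `<tag>`/`</tag>` pairs and plain text are handled (no word pairs, braces, or tidylinks).

```ruby
# Hypothetical token pattern: an open or close tag, a run of text, or a stray "<".
TOKEN = /<\/?[a-z]+>|[^<]+|</

# Minimal HTML escaping table used when rendering text nodes.
ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# Parses +src+ into a tree: Strings are text nodes, Hashes are styled nodes
# of the form { tag: "b", children: [...] }.
def parse_inline(src)
  # stack[0] is the root frame; every other frame is a still-unclosed tag.
  stack = [{ tag: nil, open_text: "", children: [] }]

  # Turns the innermost unclosed tag back into plain text inside its parent.
  demote = lambda do
    frame = stack.pop
    stack.last[:children].push(frame[:open_text], *frame[:children])
  end

  src.scan(TOKEN) do |token|
    if (m = token.match(/\A<([a-z]+)>\z/))
      stack.push({ tag: m[1], open_text: token, children: [] })
    elsif (m = token.match(/\A<\/([a-z]+)>\z/))
      index = stack.rindex { |frame| frame[:tag] == m[1] }
      if index
        # A closing tag invalidates every tag opened after its partner.
        demote.call while stack.size > index + 1
        node = stack.pop
        stack.last[:children] << { tag: node[:tag], children: node[:children] }
      else
        # An unmatched closing tag is plain text.
        stack.last[:children] << token
      end
    else
      stack.last[:children] << token
    end
  end

  # Tags still open at the end of input are plain text too.
  demote.call while stack.size > 1
  stack.first[:children]
end

# Renders the tree by traversal, escaping text nodes only.
def to_html(nodes)
  nodes.map do |node|
    if node.is_a?(String)
      node.gsub(/[&<>]/, ESCAPES)
    else
      "<#{node[:tag]}>#{to_html(node[:children])}</#{node[:tag]}>"
    end
  end.join
end

puts to_html(parse_inline("<a><b><c></a>")) # prints <a>&lt;b&gt;&lt;c&gt;</a>
puts to_html(parse_inline("<a></b>}</a>"))  # prints <a>&lt;/b&gt;}</a>
```

Because output is produced by walking the tree, text that was demoted from a tag is escaped exactly once and never re-parsed, which is the property the string-replacing approach could not guarantee.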
1 parent 6d3ac98 commit 9456e79

33 files changed, with 1335 additions and 1819 deletions
lib/rdoc/cross_reference.rb

Lines changed: 1 addition & 3 deletions
@@ -1,7 +1,5 @@
 # frozen_string_literal: true
 
-require_relative 'markup/attribute_manager' # for PROTECT_ATTR
-
 ##
 # RDoc::CrossReference is a reusable way to create cross references for names.
 
@@ -33,7 +31,7 @@ class RDoc::CrossReference
   # See CLASS_REGEXP_STR
 
   METHOD_REGEXP_STR = /(
-    (?!\d)[\w#{RDoc::Markup::AttributeManager::PROTECT_ATTR}]+[!?=]?|
+    (?!\d)[\w]+[!?=]?|
     %|=(?:==?|~)|![=~]|\[\]=?|<(?:<|=>?)?|>[>=]?|[-+!]@?|\*\*?|[\/%\`|&^~]
   )#{METHOD_ARGS_REGEXP_STR}/.source.delete("\n ").freeze
 
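
The `METHOD_REGEXP_STR` definition above writes a regexp over several lines and then flattens it with `.source.delete("\n ")`. The sketch below shows that idiom on a much smaller pattern. `ARGS_STR`, `NAME_STR`, and `NAME_RE` are hypothetical names, and `ARGS_STR` is only a stand-in for `METHOD_ARGS_REGEXP_STR`, whose definition is not part of this diff.

```ruby
# Hypothetical stand-in for METHOD_ARGS_REGEXP_STR: an optional "(...)" suffix.
ARGS_STR = /(?:\(.*?\))?/.source.freeze

# Without the /x flag, the newlines and indentation are literally part of the
# pattern, so they are removed from the source string afterwards.
NAME_STR = /(
  (?!\d)[\w]+[!?=]?|
  \[\]=?
)#{ARGS_STR}/.source.delete("\n ").freeze

# Anchored regexp built from the flattened source string.
NAME_RE = /\A#{NAME_STR}\z/

p NAME_RE.match?("save!(x)") # prints true
p NAME_RE.match?("1abc")     # prints false
```

Note that `delete("\n ")` removes every newline and space, so a pattern written this way cannot contain a literal space that is meant to match.
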
lib/rdoc/markup.rb

Lines changed: 8 additions & 30 deletions
@@ -79,7 +79,7 @@
 #
 #   class WikiHtml < RDoc::Markup::ToHtml
 #     def handle_regexp_WIKIWORD(target)
-#       "<font color=red>" + target.text + "</font>"
+#       "<font color=red>" + target + "</font>"
 #     end
 #   end
 #
@@ -110,10 +110,10 @@
 
 class RDoc::Markup
 
-  ##
-  # An AttributeManager which handles inline markup.
+  # Array of regexp handling pattern and its name. A regexp handling
+  # sequence is something like a WikiWord
 
-  attr_reader :attribute_manager
+  attr_reader :regexp_handlings
 
   ##
   # Parses +str+ into an RDoc::Markup::Document.
@@ -148,27 +148,11 @@ def self.parse(str)
   # structure (paragraphs, lists, and so on). Invoke an event handler as we
   # identify significant chunks.
 
-  def initialize(attribute_manager = nil)
-    @attribute_manager = attribute_manager || RDoc::Markup::AttributeManager.new
+  def initialize
+    @regexp_handlings = []
     @output = nil
   end
 
-  ##
-  # Add to the sequences used to add formatting to an individual word (such
-  # as *bold*). Matching entries will generate attributes that the output
-  # formatters can recognize by their +name+.
-
-  def add_word_pair(start, stop, name)
-    @attribute_manager.add_word_pair(start, stop, name)
-  end
-
-  ##
-  # Add to the sequences recognized as general markup.
-
-  def add_html(tag, name)
-    @attribute_manager.add_html(tag, name)
-  end
-
   ##
   # Add to other inline sequences. For example, we could add WikiWords using
   # something like:
@@ -178,7 +162,7 @@ def add_html(tag, name)
   # Each wiki word will be presented to the output formatter.
 
   def add_regexp_handling(pattern, name)
-    @attribute_manager.add_regexp_handling(pattern, name)
+    @regexp_handlings << [pattern, name]
   end
 
   ##
@@ -197,15 +181,9 @@ def convert(input, formatter)
   end
 
   autoload :Parser, "#{__dir__}/markup/parser"
+  autoload :InlineParser, "#{__dir__}/markup/inline_parser"
   autoload :PreProcess, "#{__dir__}/markup/pre_process"
 
-  # Inline markup classes
-  autoload :AttrChanger, "#{__dir__}/markup/attr_changer"
-  autoload :AttrSpan, "#{__dir__}/markup/attr_span"
-  autoload :Attributes, "#{__dir__}/markup/attributes"
-  autoload :AttributeManager, "#{__dir__}/markup/attribute_manager"
-  autoload :RegexpHandling, "#{__dir__}/markup/regexp_handling"
-
   # RDoc::Markup AST
   autoload :BlankLine, "#{__dir__}/markup/blank_line"
   autoload :BlockQuote, "#{__dir__}/markup/block_quote"
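
After this change `add_regexp_handling` only records `[pattern, name]` pairs, and the handler in the `WikiHtml` doc comment receives the matched String itself. The sketch below illustrates how a formatter could apply such pairs to text nodes only. `MiniMarkup`, `MiniFormatter`, `convert_text`, and the node shape (Strings for text, Hashes for styled nodes) are assumptions made for illustration; they are not the classes or data structures introduced by this commit, and HTML escaping is omitted for brevity.

```ruby
# Hypothetical stand-in for RDoc::Markup: it only stores [pattern, name] pairs.
class MiniMarkup
  attr_reader :regexp_handlings

  def initialize
    @regexp_handlings = []
  end

  def add_regexp_handling(pattern, name)
    @regexp_handlings << [pattern, name]
  end
end

# Hypothetical formatter that walks a tree of text and styled nodes.
class MiniFormatter
  def initialize(markup)
    @markup = markup
  end

  # The handler receives the matched String directly.
  def handle_regexp_WIKIWORD(target)
    "<font color=red>" + target + "</font>"
  end

  # Applies all registered patterns in a single pass over one text node, so
  # the markup produced by a handler is never scanned again.
  def convert_text(text)
    handlings = @markup.regexp_handlings
    return text if handlings.empty?

    text.gsub(Regexp.union(handlings.map(&:first))) do |matched|
      _, name = handlings.find { |pattern, _| matched.match?(/\A#{pattern}\z/) }
      send("handle_regexp_#{name}", matched)
    end
  end

  # Only String nodes go through convert_text; tags are emitted structurally.
  def convert(nodes)
    nodes.map do |node|
      if node.is_a?(String)
        convert_text(node)
      else
        "<#{node[:tag]}>#{convert(node[:children])}</#{node[:tag]}>"
      end
    end.join
  end
end

markup = MiniMarkup.new
markup.add_regexp_handling(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
tree = ["See WikiWord in ", { tag: "b", children: ["bold text"] }]
puts MiniFormatter.new(markup).convert(tree)
# prints See <font color=red>WikiWord</font> in <b>bold text</b>
```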

lib/rdoc/markup/attr_changer.rb

Lines changed: 0 additions & 22 deletions
This file was deleted.

lib/rdoc/markup/attr_span.rb

Lines changed: 0 additions & 35 deletions
This file was deleted.
