Skip to content

Commit 34322cc

Browse files
committed
Add __FILE__ bug before/after slide, fix rendering accuracy
- New slide showing old AttributeManager rendering __FILE__ as emphasis - Verified actual RDoc output: After shows plain text, not <code> - Slide 16: tompng rewrites now say "fixed 30+ documented bugs" - Research file for tompng's comment/formatting rewrites
1 parent 4430b1e commit 34322cc

2 files changed

Lines changed: 294 additions & 37 deletions

File tree

Lines changed: 234 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,234 @@
1+
# tompng's Comment Directive & Inline Formatting Rewrites
2+
3+
## Overview
4+
5+
tompng (tomoya ishida) made two major rewrites to RDoc:
6+
7+
1. **Prism Ruby parser compatibility** (PR #1284, merged as `eb85efb5`) -- 10 commits fixing bugs in `RDoc::Parser::PrismRuby` to reach parity with the old Ripper-based parser, then PR #1581 switched the default parser to PrismRuby
8+
2. **InlineParser replacing AttributeManager** (PR #1559, merged as `9456e795`) -- complete rewrite of inline formatting from a string-substitution/macro engine to a proper recursive-descent parser producing structured AST nodes
9+
10+
---
11+
12+
## Part 1: InlineParser (PR #1559)
13+
14+
### What was replaced
15+
16+
`RDoc::Markup::AttributeManager` -- a string-replacing/macro-based inline parser that used non-printing character substitution (NUL bytes, chars 0x00-0x08) to mark formatting boundaries in a flat string. It converted text into a "flow" (array of strings and `AttrChanger` objects), which formatters then walked linearly.
17+
18+
**Deleted files (7):**
19+
- `lib/rdoc/markup/attribute_manager.rb` (432 lines)
20+
- `lib/rdoc/markup/attributes.rb` (70 lines)
21+
- `lib/rdoc/markup/attr_changer.rb` (22 lines)
22+
- `lib/rdoc/markup/attr_span.rb` (35 lines)
23+
- `lib/rdoc/markup/regexp_handling.rb` (40 lines)
24+
- `test/rdoc/markup/attribute_manager_test.rb` (474 lines)
25+
- `test/rdoc/markup/attributes_test.rb` (39 lines)
26+
27+
**Created files (2):**
28+
- `lib/rdoc/markup/inline_parser.rb` (321 lines)
29+
- `test/rdoc/markup/inline_parser_test.rb` (265 lines)
30+
31+
Net: -1264 / +1827 across 31 files.
32+
33+
### Architectural change
34+
35+
Old (AttributeManager):
36+
- Flat string with sentinel characters marking format boundaries
37+
- Regexp-based TIDYLINK handling as a "regexp_handling" macro
38+
- `convert_flow` walked a flat array of strings + AttrChanger markers
39+
- Architecture mismatch: structured inline styles represented as flat string operations
40+
41+
New (InlineParser):
42+
- Recursive-descent parser producing nested AST: `{ type: :BOLD, children: [string_or_node, ...] }`
43+
- TIDYLINK is a first-class syntax construct with its own `{ type: :TIDYLINK, url: ..., children: [...] }`
44+
- Formatters traverse the tree via `handle_BOLD`, `handle_EM`, `handle_TT`, `handle_TIDYLINK`, etc.
45+
- Regexp handling (crossref, hyperlink) applied only to text nodes after parse, not to the raw string
46+
47+
### Bugs fixed / behavior changes in the inline formatting
48+
49+
#### 1. Tidylink nested formatting now works correctly
50+
51+
Old: `{Label with *bold* text}[url]` -- formatting inside tidylink labels frequently failed because the regexp-based TIDYLINK macro couldn't handle nested structure.
52+
53+
New: Works correctly; the parser treats `{...}[url]` as a containing node with children that can include `*bold*`, `_em_`, etc.
54+
55+
Test: `test/rdoc/markup/inline_parser_test.rb:173-179`
56+
57+
#### 2. Simplified tidylink restricted to alphanumeric labels
58+
59+
Old: ``a*_`+<b>c[foo]`` would turn the entire preceding text into a link: `<a href="foo">a*_`+&lt;b&gt;c</a>`. This made simplified tidylinks conflict with other syntaxes.
60+
61+
New: Only `Alphanumeric[url]` (must start with a letter) is treated as a simplified tidylink.
62+
63+
From PR description: "This is terrible, it can't coexist with other syntaxes like `*word*` `_word_` `+word+`. We should restrict characters and recommend `{label}[url]`."
64+
65+
Test: `test/rdoc/markup/inline_parser_test.rb:148-153`
66+
67+
#### 3. `C1.m[:sym]` no longer swallowed by tidylink
68+
69+
Old: `C1.m[:sym]` rendered as `<a href="C1.html#method-c-m"><code>C1.m[:sym]</code></a>` -- the `[:sym]` was consumed as a link URL.
70+
71+
New: `<a href="C1.html#method-c-m"><code>C1.m</code></a>[:sym]` -- only `C1.m` is the crossref; `[:sym]` remains plain text.
72+
73+
Test: `test/rdoc/markup/to_html_crossref_test.rb:243-248`
74+
75+
#### 4. Cross-references disabled in word-pair markup (`*word*`, `_word_`)
76+
77+
New behavior: `*C1*` renders as `<strong>C1</strong>` (no crossref link), while `<b>C1</b>` renders as `<strong><a href="C1.html"><code>C1</code></a></strong>` (crossref still active).
78+
79+
Rationale: Word-pair types (`BOLD_WORD`, `EM_WORD`, `TT`) are leaf nodes with no regexp handling; tag-based types (`BOLD`, `EM`) contain text nodes where regexp handling runs.
80+
81+
Test: `test/rdoc/markup/to_html_crossref_test.rb:255-259` (`test_crossref_disabled_in_word_pair`)
82+
83+
#### 5. Hyperlinks/rdoc-ref/rdoc-image suppressed inside tidylink labels
84+
85+
Old: `{See http://example.com}[README.txt]` -- the URL inside the label would be converted to a hyperlink, breaking the tidylink.
86+
87+
New: Regexp handling for hyperlinks is disabled inside tidylink labels. Only rdoc-image gets special treatment when it's the entire label.
88+
89+
Tests:
90+
- `test/rdoc/markup/to_html_test.rb:1052` (`test_convert_hyperlink_disabled_inside_tidylink`)
91+
- `test/rdoc/markup/to_html_test.rb:1057` (`test_convert_rdoc_image_inside_tidylink`)
92+
- `test/rdoc/markup/to_html_test.rb:1068` (`test_convert_rdoc_label_disabled_inside_tidylink`)
93+
94+
#### 6. Cross-references suppressed inside tidylink labels
95+
96+
New: `{rdoc-ref:C1.m http://example.com C1}[url]` renders as a single link to `url` with all of the label text, without internal crossref processing.
97+
98+
Test: `test/rdoc/markup/to_html_crossref_test.rb:250-252` (`test_suppress_link_inside_tidylink_label`)
99+
100+
#### 7. Proper error recovery for unclosed/mismatched tags
101+
102+
Old: Unclosed tags could corrupt surrounding formatting.
103+
104+
New: Explicit error recovery rules:
105+
- Closing tags invalidate unclosed tags; unclosed tags become plain text
106+
- Unmatched closing tags become plain text
107+
- Nested tidylinks invalidate outer tidylinks
108+
109+
Tests: `test/rdoc/markup/inline_parser_test.rb:223-268`
110+
111+
#### 8. `__FILE__` / `__send__` exceptions for underscore word pairs
112+
113+
`__FILE__`, `__LINE__`, `__send__` are NOT treated as em-word (`_FILE_` inside `__...__`); they remain plain text.
114+
115+
Code: `lib/rdoc/markup/inline_parser.rb:242-244`
116+
Test: `test/rdoc/markup/inline_parser_test.rb:84-85`
117+
118+
### Follow-up InlineParser fixes
119+
120+
**`306ca6e5` -- Handle in_tidylink_label check correctly, fix edge cases of link/rdoc-ref/rdoc-image in tidylink label**
121+
- Added tests for `hyperlink_disabled_inside_tidylink`, `rdoc_image_inside_tidylink`, `rdoc_label_disabled_inside_tidylink`
122+
- 4 files changed, +50/-13
123+
124+
**`cc9dcdf8` -- Simplify inline parser state**
125+
- Internal simplification, no behavior change
126+
127+
**`bb771be4` -- Move suppressed-crossref backslash removing logic to regexp_handling**
128+
- Moved backslash-removal for suppressed cross-references from the formatter to the regexp handling phase
129+
130+
**`1b7488d5` -- Use strscan in InlineParser token scan**
131+
- Performance improvement using StringScanner
132+
133+
---
134+
135+
## Part 2: Prism Ruby Parser Bugs Fixed (PR #1284 + follow-ups)
136+
137+
The `accept_legacy_bug?` test marker is defined in `test/rdoc/parser/prism_ruby_test.rb`:
138+
- Line 2382: `RDocParserPrismRubyTest` returns `false`
139+
- Line 2396: `RDocParserRipperRubyWithPrismRubyTestCasesTest` returns `true`
140+
141+
The Ripper-based test class only runs when `ENV['RDOC_USE_RIPPER_PARSER']` is set (line 2404). Every `accept_legacy_bug?` guard documents a bug in the old Ripper parser that the new Prism parser fixes.
142+
143+
### Catalog of `accept_legacy_bug?` guarded bugs
144+
145+
| Line(s) | Test method | Old (Ripper) bug | Fixed (Prism) behavior |
146+
|---------|-------------|------------------|----------------------|
147+
| 325 | `test_class_module_nodoc` | `:nodoc:` on `module MBaz::M` not properly applied; documentable list incorrect | Correctly filters to `['Baz', 'MBaz']` |
148+
| 603 | `test_method_definition` | `three(x)` arglists not parsed when `def three x` uses space-separated param | Correctly reports `three(x)` |
149+
| 720 | `test_define_method` | `define_method :baz` inside `class << self` not marked as singleton | Correctly sets `singleton = true` |
150+
| 757 | `test_method_definition_nested_inside_block` | `def` inside blocks (e.g., `included do`) incorrectly documented; `baz1` inside unknown block documented | Rejects `def` inside metaprogramming blocks; only documents `[:foo, :bar, :baz2, :baz3]` |
151+
| 882 | `test_singleton_class_nonself` | `class << ::Bar` and `class << Baz1` incorrectly created extra modules | Only `['A', 'A::Foo', 'Bar']` modules exist |
152+
| 920 | `test_singleton_class` | `def self.dummy2` inside `class << self` and `def self.dummy4` inside `class << Foo` incorrectly documented as regular methods | Correctly filtered out |
153+
| 953 | `test_singleton_class_meta_method` | `:method: m4` inside `class << self` not treated as singleton | Correctly sets `singleton = true` |
154+
| 1072 | `test_module_function` | `module_function def m4` -- old parser fails to make instance copy private and doesn't create singleton copy for `m4` | Correctly handles `module_function def m4` |
155+
| 1097 | `test_class_method_visibility` | `private_class_method`/`public_class_method` with `def` argument not handled | `['m1', 'm3', 'm5']` correctly public |
156+
| 1129 | `test_undocumentable_change_visibility` | Not implemented: old parser can't handle `private 42, :m` or `private def self.m1` edge cases | Correctly ignores non-standard visibility calls |
157+
| 1164 | `test_method_visibility_change_in_subclass` | Not implemented: `private :m1` / `public :m2` in subclass B not tracked | Creates visibility-changed method entries in subclass |
158+
| 1266 | `test_alias_method_visibility` | `alias_method` visibility not correct; `foo` should be `:private` after `private :foo`, `foo2` should stay `:public` | Correct visibility for aliased methods |
159+
| 1275 | `test_invalid_alias_method` | Old parser doesn't reject invalid `alias_method` calls (0 args, 3 args, non-symbol) | Only valid `alias_method :new, :old` documented |
160+
| 1321 | `test_attributes_stopdoc_nodoc` | `attr :attr1, :attr2` treated as single attribute; `rw` set to wrong value | Correctly creates 2 attributes per `attr`; 8 total |
161+
| 1393-1413 | `test_attributes_nodoc` / `test_attributes_nodoc_track` | `:nodoc:` on multi-arg `attr_reader`/`attr_writer`/`attr_accessor` not applied to all attrs | Correct count: 4 documented / 12 with `:nodoc` tracking |
162+
| 1504, 1559 | `test_meta_attributes` / `test_meta_attributes_named` | `:attr:` directive produces wrong `rw` value (not `'R'`) | Correctly sets `rw = 'R'` for `:attr:` |
163+
| 1595-1601 | `test_constant` | Constants in wrong namespaces; `::C` not placed in Object; line numbers wrong | Constants placed in correct classes with correct line numbers |
164+
| 1721 | `test_constant_with_singleton_class` | Constants inside `class <<Bar` mishandled | Correctly assigns constants in singleton class contexts |
165+
| 1778 | `test_true_false_nil_method` | `def nil.foo` creates `NilClass::foo` (class method separator) | Creates `NilClass#foo` (instance method on singleton) |
166+
| 1810 | `test_include_extend_to_singleton_class` | Not implemented: `include I` inside `class << self` not converted to `extend` | Correctly converts to extend |
167+
| 1828 | `test_include_with_module_nesting` | Not implemented: module nesting context not resolved for `include M` | Resolves `include M` to correct module via nesting |
168+
| 1861 | `test_various_argument_include` | Not implemented: multi-arg `include A, B` and invalid `include 42, C` not handled | Handles multi-arg; rejects non-constant args |
169+
| 2120 | `test_include_extend_suppressed_within_block` | `include M` / `extend N` inside metaprogramming blocks incorrectly applied to outer class | Suppressed inside blocks; only direct `include O`/`extend O` applied |
170+
| 2162 | `test_visibility_methods_suppressed_within_block` | `private_class_method :s_pri` inside `Module.new do` block incorrectly applied | Block-level visibility changes suppressed |
171+
| 2166 | `test_alias_method_suppressed_within_block` | `alias_method` inside blocks incorrectly applied to outer class | `alias_method` suppressed inside blocks |
172+
173+
### Key commits in PR #1284
174+
175+
| Commit | Description |
176+
|--------|-------------|
177+
| `7b3331b0` | Reduce the difference between PrismRuby and Ruby parsers (foundational) |
178+
| `14dc2e90` | Suppress include and extend inside method block |
179+
| `01cbb99d` | Fix constant defined in singleton class, class constant_path, module constant_path |
180+
| `89667183` | Strip prefix `::` from superclass name |
181+
| `69bb46de` | Reject documenting `def` inside block |
182+
| `c5d96a81` | Accept "new" and "initialize" method both documented |
183+
| `25f1df9b` | Fix wrong test for class defined inside singleton class |
184+
| `0299ae14` | Fix superclass override when class Object is documented |
185+
| `2973d9de` | Support `:nodoc:` on method definition parameter end line |
186+
| `de966075` | pend->omit test that RDoc::Parser::Ruby has a bug or does not support |
187+
188+
### Post-#1284 follow-ups
189+
190+
| PR | Commit | Description |
191+
|----|--------|-------------|
192+
| #1581 | `78ca6f50` | **Switch default parser to PrismRuby.** Fixes issues #398, #782, #816, #885, #1373, #1553, #1555 |
193+
| #1595 | `7e1157ae` | Ignore visibility method, attr definition, module_function within block (found in rails/rails) |
194+
195+
---
196+
197+
## Part 3: Other tompng Fixes (Not Parser or Inline)
198+
199+
| Commit | Description |
200+
|--------|-------------|
201+
| `906a72b0` | Document is not a Comment -- stops mixing `Document` with `Comment` in `CodeObject#comment`, fixing a class of potential bugs from Marshal round-trips |
202+
| `49551988` | Simplify section add/remove comment logic |
203+
| `f5a67bdc` | Fix `ToRdoc#accept_table` -- table alignment was broken; added `accept_table_align` tests for ANSI, BS, RDoc formatters |
204+
| `61f11a6c` | Simplify newline handling of comment token in `TokenStream#to_html` |
205+
| `acaf13dd` | Fix ri completion to always return candidates starting with a given name |
206+
| `b59ca2f9` | Delay DidYouMean until `NotFoundError#message` is called |
207+
| `40a66900` | Use ASCII character in HTML file |
208+
209+
---
210+
211+
## Summary of User-Facing Changes
212+
213+
### Things that now work correctly (InlineParser)
214+
215+
1. `{Bold *label*}[url]` -- formatting inside tidylink labels
216+
2. `C1.m[:sym]` -- no longer swallowed as a link URL
217+
3. `__FILE__` and `__send__` -- no longer mis-parsed as emphasis
218+
4. Unclosed `<b>` or mismatched `</i>` -- graceful degradation to plain text
219+
5. `{See http://example.com}[url]` -- hyperlinks inside tidylink labels not double-processed
220+
221+
### Things that now work correctly (Prism parser)
222+
223+
1. `:nodoc:` on multi-line method params (`def foo(arg1,\n arg2) # :nodoc:`)
224+
2. `define_method` inside `class << self` correctly marked singleton
225+
3. `include`/`extend`/`alias_method` inside metaprogramming blocks suppressed
226+
4. `module_function def m` correctly creates private instance + public singleton
227+
5. Constants in `class <<Bar` / `module ::Foo::Bar` placed in correct namespace
228+
6. `def nil.foo` creates `NilClass#foo` not `NilClass::foo`
229+
7. Multi-arg `attr_reader :a, :b # :nodoc:` applies nodoc to all attributes
230+
8. Visibility changes in subclasses tracked
231+
232+
### Intentional behavior change (InlineParser)
233+
234+
Simplified tidylink `word[url]` now restricted to identifiers starting with a letter. Old behavior where `a*_`+<b>c[foo]` became a link is no longer supported. Use `{label}[url]` instead.

0 commit comments

Comments
 (0)