Commit f68905a

feat(ie-html): implement core HTML tree builder state machine #57
WHATWG tree construction with basic insertion modes:
- TreeBuilder struct: open elements stack, head/form pointers, insertion mode tracking, pending text accumulation
- ParseResult: document + errors + style_elements + link_stylesheets
- Insertion modes: Initial, BeforeHtml, BeforeHead, InHead, InHeadNoscript, AfterHead, InBody, Text, AfterBody, AfterAfterBody
- InBody handles: block elements (div, p, ul, etc.), headings (h1-h6 with nesting fix), void elements (br, hr, img, input), inline elements (basic push/pop), form handling
- Core operations: insert_element with attributes, insert_character with text node coalescing, insert_comment, generate_implied_end_tags, scope checking (regular + button scope), close_p_element
- Tokenizer integration: state switching for script (ScriptData), style/noframes (RawText), title/textarea (RcData)
- Text mode: captures style content → style_elements, link hrefs → link_stylesheets
- Implicit element creation: html, head, body auto-inserted
- 12 unit tests covering document structure, implicit elements, void elements, style/link extraction, nesting, comments, doctype
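
For readers unfamiliar with the "scope checking (regular + button scope)" item, the following is a minimal sketch of how such a check is commonly written, not the code from this commit. It assumes the stack of open elements is modelled as a slice of tag names; the function names and the `SCOPE_MARKERS` constant are hypothetical, and the marker list covers only the HTML-namespace entries from the WHATWG list (MathML and SVG markers are omitted).

```rust
/// HTML-namespace scope markers (assumption: simplified from the WHATWG
/// "has an element in scope" list; MathML/SVG entries are left out).
const SCOPE_MARKERS: &[&str] = &[
    "applet", "caption", "html", "table", "td", "th", "marquee", "object", "template",
];

/// Walk the open elements from the current node (end of the slice) towards
/// the root. Finding `target` first means it is in scope; hitting a scope
/// marker first means it is not.
fn has_element_in_specific_scope(open_elements: &[&str], target: &str, extra: &[&str]) -> bool {
    for &name in open_elements.iter().rev() {
        if name == target {
            return true;
        }
        if SCOPE_MARKERS.contains(&name) || extra.contains(&name) {
            return false;
        }
    }
    false
}

/// Regular scope: only the base marker list terminates the walk.
fn has_element_in_scope(open_elements: &[&str], target: &str) -> bool {
    has_element_in_specific_scope(open_elements, target, &[])
}

/// Button scope: the base list plus `button` terminates the walk.
fn has_element_in_button_scope(open_elements: &[&str], target: &str) -> bool {
    has_element_in_specific_scope(open_elements, target, &["button"])
}
```

With open elements `html, body, p, button, span`, the `p` is in regular scope but not in button scope. That difference is why an operation like close_p_element is defined in terms of button scope.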
1 parent 3959213 commit f68905a

3 files changed

Lines changed: 1385 additions & 5 deletions

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum InsertionMode {
+    Initial,
+    BeforeHtml,
+    BeforeHead,
+    InHead,
+    InHeadNoscript,
+    AfterHead,
+    InBody,
+    Text,
+    InTable,
+    InTableText,
+    InCaption,
+    InColumnGroup,
+    InTableBody,
+    InRow,
+    InCell,
+    InSelect,
+    InSelectInTable,
+    InTemplate,
+    AfterBody,
+    InFrameset,
+    AfterFrameset,
+    AfterAfterBody,
+    AfterAfterFrameset,
+}
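
The commit message mentions that html, head and body are auto-inserted. As a rough illustration of how the insertion modes drive that, the sketch below follows the WHATWG "anything else" fallthrough from Initial to InBody. It is an assumption-laden sketch rather than the tree builder's code: the enum is a trimmed local copy containing only the modes needed here, and `fall_through` and `implied_elements` are hypothetical names.

```rust
/// Trimmed local copy of the enum, just the modes this sketch walks through.
#[derive(Debug, Clone, Copy, PartialEq)]
enum InsertionMode {
    Initial,
    BeforeHtml,
    BeforeHead,
    InHead,
    AfterHead,
    InBody,
}

/// "Anything else" handling for a token the current mode does not consume:
/// returns the element that gets implicitly inserted (if any) and the next mode.
fn fall_through(mode: InsertionMode) -> (Option<&'static str>, InsertionMode) {
    use InsertionMode::*;
    match mode {
        Initial => (None, BeforeHtml),
        BeforeHtml => (Some("html"), BeforeHead),
        BeforeHead => (Some("head"), InHead),
        // The head element is popped here; nothing new is inserted.
        InHead => (None, AfterHead),
        AfterHead => (Some("body"), InBody),
        InBody => (None, InBody),
    }
}

/// Collect the elements implied on the way from `start` to InBody.
fn implied_elements(start: InsertionMode) -> Vec<&'static str> {
    let mut mode = start;
    let mut inserted = Vec::new();
    while mode != InsertionMode::InBody {
        let (element, next) = fall_through(mode);
        inserted.extend(element); // an Option yields zero or one item
        mode = next;
    }
    inserted
}
```

Starting from `Initial`, a body-level token such as `<p>` would reach InBody only after html, head and body have been created, in that order.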

crates/ie-html/src/lib.rs

Lines changed: 2 additions & 1 deletion
@@ -4,10 +4,11 @@
 //! Targets latest spec only — no quirks mode, no legacy element support.
 
 pub mod entities;
+pub mod insertion_mode;
 pub mod token;
 pub mod tokenizer;
 pub mod tree_builder;
 
 pub use token::Token;
 pub use tokenizer::Tokenizer;
-pub use tree_builder::parse;
+pub use tree_builder::{ParseResult, parse};
