@@ -53,6 +53,7 @@ This adds the following PSLR-related grammar directives and integration points:
 - `%symbol-set` declares reusable sets of terminal tokens for PSLR lexical declarations
 - `%lex-tie` expands parser-state acceptable-token sets for tied terminals
 - `%lex-no-tie` records an explicit no-tie decision for terminals with overlapping token patterns
+- `YYLAYOUT*` token patterns are recognized in every parser state and discarded by PSLR-aware lexers
 - `%define pslr.max-states` and `%define pslr.max-state-ratio` are Lrama-specific safety guards for state growth
 - `%define api.pslr.state-member` names the parser-state field to be shared with the lexer when using the generated helper macros
 
@@ -73,9 +74,16 @@ Typical usage looks like this:
 %lex-prec RANGLE -s RSHIFT
7475```
 
-In this setup, `%token-pattern` lists the tokens that the PSLR scanner should consider, and `%lex-prec`
-resolves conflicts between overlapping matches. For example, `%lex-prec RANGLE -s RSHIFT` tells Lrama to
-prefer `RANGLE` over `RSHIFT` when the shorter token should win.
+In this setup, `%token-pattern` lists the tokens that the generated pseudo-scanner FSA should consider, and
+`%lex-prec` resolves conflicts between overlapping matches. For example, `%lex-prec RANGLE -s RSHIFT` tells
+Lrama to prefer `RANGLE` over `RSHIFT` when the shorter token should win.
+
+For normal parser-state scanner rows, unresolved pseudo-scanner conflicts are not resolved by token declaration
+order. They are reported as errors so the grammar can add an explicit `%lex-prec`, `%lex-tie`, or `%lex-no-tie`
+declaration. Lrama also emits a fallback scanner row for syntax error handling; only that fallback row uses
+traditional lexical fallback behavior, choosing the longest match and then token declaration order. If no token
+pattern matches at all, the PSLR helper consumes one byte and returns `YYUNDEF` as a character-token fallback, so
+error paths do not loop forever.
 
 `%lex-prec` uses ASCII spellings for the PSLR lexical precedence operators:
 
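The fallback-row behavior added in the hunk above (longest match, then token declaration order, then a one-byte `YYUNDEF` fallback) can be illustrated with a small sketch. This is not Lrama's generated code: the token table, the `fallback_scan` function, and the use of Python regular expressions in place of the pseudo-scanner FSA are all assumptions made for illustration, and the sketch assumes `pos` is inside the input.

```python
import re

# Hypothetical token table in declaration order. Names and patterns are
# illustrative only and do not come from a real grammar.
TOKEN_PATTERNS = [
    ("IF", re.compile(rb"if")),
    ("ID", re.compile(rb"[a-z]+")),
    ("RSHIFT", re.compile(rb">>")),
    ("RANGLE", re.compile(rb">")),
]

YYUNDEF = "YYUNDEF"


def fallback_scan(data, pos=0):
    """Return (token_name, lexeme) for the input starting at `pos`.

    Selection rule: longest match first, then declaration order.
    """
    best_name, best_len = None, 0
    for name, pattern in TOKEN_PATTERNS:
        match = pattern.match(data, pos)
        # Only a strictly longer match replaces the current best, so on
        # equal length the earlier declaration is kept.
        if match and match.end() - pos > best_len:
            best_name, best_len = name, match.end() - pos
    if best_name is None:
        # No pattern matched: consume exactly one byte so the caller
        # always makes progress, and report an undefined token.
        return YYUNDEF, data[pos:pos + 1]
    return best_name, data[pos:pos + best_len]
```

In this sketch `b"if"` scans as `IF` because both `IF` and `ID` match two bytes and `IF` is declared first, while `b"ifx"` scans as `ID` because the longer match wins. Per the text above, normal parser-state rows do not apply this rule and report the conflict instead.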
@@ -103,14 +111,24 @@ Here, `IF` can be considered when the parser state accepts `ID`, but `%lex-tie`
 The `%lex-prec ID <~ keywords` declaration resolves the `if` identity conflict in favor of `IF` while keeping
 longer identifiers such as `ifx` as `ID`.
 
+`%lex-no-tie` suppresses lexical tie candidate warnings; it does not break a final transitive tie closure. Generic
+declarations such as `%lex-no-tie yyall yyall` can suppress broad candidate reports, and a more specific `%lex-tie`
+can still tie the relevant token pair.
+
+Token patterns named `YYLAYOUT` or starting with `YYLAYOUT` are layout tokens. They are included in every
+parser-state scanner row and should be consumed and skipped by the PSLR-aware lexer instead of being returned to
+the parser. The generated helpers include `YYPSLR_TOKEN_IS_LAYOUT(Token)` and the structured
+`YYPSLR_PSEUDO_SCAN_RESULT(...)` API for this purpose.
+
 When the parser and lexer share a context through `%parse-param` / `%lex-param`, the generated header also
 provides helpers such as `YYPSLR_PSEUDO_SCAN(...)`, so the lexer can choose a token based on the current parser
-state.
+state. The paper-compatible scanning path needs the lexer to pass the unconsumed input prefix, not only an
+already-decided token fragment, so legacy external lexer bridges may still be limited by the text they provide.
 
-The implementation reports unresolved pseudo-scanner conflicts instead of silently resolving them by declaration
-order. PSLR support is still experimental. Scoped lexical declarations, lexical nonterminals, `%lex`,
-`%token-action`, LAC, fallback rows, character-token fallback, and full layout-token semantics are not implemented
-yet. If you find any bugs, please report them.
+PSLR parsers enable a lightweight LAC check in the generated parser so syntax errors caused by LR state merging,
+default reductions, or `%nonassoc` error actions are detected before user semantic actions are run for the bad
+lookahead. PSLR support is still experimental. Scoped lexical declarations, lexical nonterminals, `%lex`, and
+`%token-action` are not implemented yet. If you find any bugs, please report them.
 
 ## Lrama 0.7.1 (2025-12-24)
 
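The layout-token rule added in the hunk above (names equal to or starting with `YYLAYOUT` are scanned but never handed to the parser) can be sketched as follows. This is a hedged illustration only: `is_layout`, `tokens`, and the token table are hypothetical names, `is_layout` merely mirrors the naming rule and is not the generated `YYPSLR_TOKEN_IS_LAYOUT(Token)` macro, and the per-state acceptable-token filtering of a real PSLR scanner is left out.

```python
import re

# Hypothetical token table. Names and patterns are illustrative only.
TOKEN_PATTERNS = [
    ("YYLAYOUT_SPACE", re.compile(rb"[ \t\n]+")),
    ("YYLAYOUT_COMMENT", re.compile(rb"#[^\n]*")),
    ("ID", re.compile(rb"[a-z]+")),
]


def is_layout(name):
    # Naming rule from the text: `YYLAYOUT` itself or any name that
    # starts with `YYLAYOUT`.
    return name.startswith("YYLAYOUT")


def tokens(data):
    """Yield (name, lexeme) pairs, consuming but not yielding layout."""
    pos = 0
    while pos < len(data):
        best = None  # (name, end offset) of the longest non-empty match
        for name, pattern in TOKEN_PATTERNS:
            match = pattern.match(data, pos)
            if match and match.end() > pos:
                if best is None or match.end() > best[1]:
                    best = (name, match.end())
        if best is None:
            raise ValueError(f"no token pattern matches at offset {pos}")
        name, end = best
        if not is_layout(name):
            yield name, data[pos:end]
        # Layout tokens still advance the position; they are only
        # withheld from the caller.
        pos = end
```

For example, `list(tokens(b"foo  # note\nbar"))` yields only the two `ID` tokens. The whitespace and comment are matched and consumed, which is the point of including layout patterns in every scanner row.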