Commit 5d2d6d0

Add comprehensive Lrama manual documentation
1 parent c478fe9 commit 5d2d6d0

23 files changed

Lines changed: 1597 additions & 32 deletions

doc/Index.md

Lines changed: 51 additions & 32 deletions
Original file line number | Diff line number | Diff line change
@@ -1,58 +1,77 @@
11
# Lrama
22

3-
[![Gem Version](https://badge.fury.io/rb/lrama.svg)](https://badge.fury.io/rb/lrama)
4-
[![build](https://github.com/ruby/lrama/actions/workflows/test.yaml/badge.svg)](https://github.com/ruby/lrama/actions/workflows/test.yaml)
3+
Lrama is a Ruby implementation of a LALR(1) parser generator. It reads a Bison-style grammar file, builds parser tables, and emits a C parser. Its primary goal is to support CRuby parser development, including Bison-compatible grammar files, error-tolerant parsing work, parameterized grammar rules, inlining, syntax diagrams, and a toolchain that can run as part of Ruby's build.
54

5+
## Quick Start
66

7-
## Overview
8-
9-
Lrama is LALR (1) parser generator written by Ruby. The first goal of this project is providing error tolerant parser for CRuby with minimal changes on CRuby parse.y file.
10-
11-
## Installation
12-
13-
Lrama's installation is simple. You can install it via RubyGems.
7+
Install the released gem:
148

159
```shell
1610
$ gem install lrama
11+
$ lrama --version
1712
```
1813

19-
From source codes, you can install it as follows:
14+
From a checkout of this repository:
2015

2116
```shell
22-
$ cd "$(lrama root)"
2317
$ bundle install
24-
$ bundle exec rake install
25-
$ bundle exec lrama --version
26-
lrama 0.7.0
18+
$ bundle exec ruby exe/lrama --version
2719
```
28-
## Usage
2920

30-
Lrama is a command line tool. You can generate a parser from a grammar file by running `lrama` command.
21+
Generate and run the calculator sample:
3122

3223
```shell
33-
# "y.tab.c" and "y.tab.h" are generated
34-
$ lrama -d sample/parse.y
24+
$ bundle exec ruby exe/lrama -d sample/calc.y -o /tmp/calc.c
25+
$ gcc -Wall /tmp/calc.c -o /tmp/calc
26+
$ /tmp/calc
3527
```
36-
Specify the output file with `-o` option. The following example generates "calc.c" and "calc.h".
28+
29+
Generate a state report and a syntax diagram while developing a grammar:
3730

3831
```shell
39-
# "calc", "calc.c", and "calc.h" are generated
40-
$ lrama -d sample/calc.y -o calc.c && gcc -Wall calc.c -o calc && ./calc
41-
Enter the formula:
42-
1
43-
=> 1
44-
1+2*3
45-
=> 7
46-
(1+2)*3
47-
=> 9
32+
$ bundle exec ruby exe/lrama -v --report-file=/tmp/calc.output sample/calc.y
33+
$ bundle exec ruby exe/lrama --diagram=/tmp/calc.html sample/calc.y
4834
```
4935

50-
## Supported Ruby version
36+
## Manual
37+
38+
Start with the introduction and examples if you are new to Lrama. Use the grammar, invocation, directive, and option references when you are maintaining an existing grammar.
39+
40+
1. [Introduction](chapters/00-introduction.md)
41+
2. [Installation and Conditions](chapters/00-installation-and-conditions.md)
42+
3. [Concepts](chapters/01-concepts.md)
43+
4. [Examples](chapters/02-examples.md)
44+
5. [Grammar Files](chapters/03-grammar-files.md)
45+
6. [Parser C Interface](chapters/04-parser-interface.md)
46+
7. [Parser Algorithm](chapters/05-parser-algorithm.md)
47+
8. [Error Recovery and Error Tolerance](chapters/06-error-recovery.md)
48+
9. [Context Dependencies](chapters/07-context-dependencies.md)
49+
10. [Debugging Your Parser](chapters/08-debugging.md)
50+
11. [Invoking Lrama](chapters/09-invoking-lrama.md)
51+
12. [Generated Parser and Integration](chapters/10-generated-parser-and-integration.md)
52+
13. [History](chapters/11-history.md)
53+
14. [Version Compatibility](chapters/12-version-compatibility.md)
54+
15. [FAQ](chapters/13-faq.md)
55+
56+
## Appendices
57+
58+
- [Directive Reference](appendices/a-directive-reference.md)
59+
- [Command Line Option Reference](appendices/b-command-line-option-reference.md)
60+
- [Bison Compatibility](appendices/c-bison-compatibility.md)
61+
- [Standard Library](appendices/d-standard-library.md)
62+
- [Glossary](appendices/e-glossary.md)
63+
- [Troubleshooting](appendices/f-troubleshooting.md)
64+
- [License and Legal Notes](appendices/g-license-and-legal-notes.md)
65+
66+
## Development Documents
67+
68+
- [Profiling](development/profiling.md)
69+
- [Compressed state table](development/compressed_state_table/main.md)
5170

52-
Lrama is executed with BASERUBY when building ruby from source code. Therefore Lrama needs to support BASERUBY, currently 3.1, or later version.
71+
## Supported Ruby Version
5372

54-
This also requires Lrama to be able to run with only default gems because BASERUBY runs with `--disable=gems` option.
73+
Lrama is executed with BASERUBY when building Ruby from source. For that reason, Lrama must run on the BASERUBY version used by Ruby and must work with default gems only, because BASERUBY is executed with `--disable=gems`.
5574

5675
## License
5776

58-
See [LEGAL.md](https://github.com/ruby/lrama/blob/master/LEGAL.md) file.
77+
See [LEGAL.md](../LEGAL.md) for the authoritative legal notice for this repository.
Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
# Directive Reference
2+
3+
This table is based on the current grammar accepted by `parser.y`. "Recognized" means the parser accepts the syntax, but the current implementation has limited or no generated-code effect.
4+
5+
| Directive | Syntax | Status | Notes |
6+
| --- | --- | --- | --- |
7+
| Prologue | `%{ ... %}` | Supported | Copied into generated C output. |
8+
| `%require` | `%require "VERSION"` | Recognized | Accepted as grammar syntax. With warnings enabled, Lrama reports that it currently does nothing. |
9+
| `%expect` | `%expect INTEGER` | Supported | Sets the expected conflict count. |
10+
| `%define` | `%define name value`, `%define name { value }` | Supported | Stored in grammar definitions. Setting `lr.type` to `ielr` enables IELR computation; `parse.trace` enables debug output from Lrama's own grammar-file parser. |
11+
| `%param` | `%param {type name}` | Recognized | Accepted by `parser.y`; prefer `%lex-param` and `%parse-param` for generated parser parameters. |
12+
| `%lex-param` | `%lex-param {type name}` | Supported | Adds a parameter passed to `yylex`. |
13+
| `%parse-param` | `%parse-param {type name}` | Supported | Adds a parameter passed to `yyparse` and `yyerror`. |
14+
| `%code` | `%code ID { ... }` | Supported | Stores named code blocks for skeleton output. Common examples use `%code provides`. |
15+
| `%initial-action` | `%initial-action { ... }` | Supported | Stores initialization code for generated parser use. |
16+
| `%no-stdlib` | `%no-stdlib` | Supported | Prevents automatic loading of `lib/lrama/grammar/stdlib.y`. |
17+
| `%locations` | `%locations` | Supported | Enables location support. Location references also enable locations during grammar preparation. |
18+
| `%union` | `%union { ... }` | Supported | Defines `YYSTYPE` union members. |
19+
| `%destructor` | `%destructor { ... } symbols-or-tags` | Supported | Associates cleanup code with symbols or tags. |
20+
| `%printer` | `%printer { ... } symbols-or-tags` | Supported | Associates debug printing code with symbols or tags. |
21+
| `%error-token` | `%error-token { ... } symbols-or-tags` | Supported | Associates error-token code with symbols or tags. |
22+
| `%after-shift` | `%after-shift identifier` | Supported | Lrama-specific generated parser hook. |
23+
| `%before-reduce` | `%before-reduce identifier` | Supported | Lrama-specific generated parser hook. |
24+
| `%after-reduce` | `%after-reduce identifier` | Supported | Lrama-specific generated parser hook. |
25+
| `%after-shift-error-token` | `%after-shift-error-token identifier` | Supported | Lrama-specific generated parser hook. |
26+
| `%after-pop-stack` | `%after-pop-stack identifier` | Supported | Lrama-specific generated parser hook. |
27+
| `%token` | `%token [<tag>] NAME [NUMBER] ["alias"]` | Supported | Declares terminals. Multiple tag groups are accepted. |
28+
| `%type` | `%type <tag> symbol...` | Supported | Assigns semantic value tags. |
29+
| `%nterm` | `%nterm <tag> symbol...` | Supported | Assigns tags to nonterminals and errors if a terminal is redeclared as a nonterminal. |
30+
| `%left` | `%left [<tag>] token...` | Supported | Declares left associativity and precedence. |
31+
| `%right` | `%right [<tag>] token...` | Supported | Declares right associativity and precedence. |
32+
| `%nonassoc` | `%nonassoc [<tag>] token...` | Supported | Declares nonassociativity and precedence. |
33+
| `%precedence` | `%precedence [<tag>] token...` | Supported | Declares precedence without associativity. |
34+
| `%start` | `%start nonterminal` | Supported | Sets the start nonterminal. Multiple `%start` declarations are an error in Lrama. |
35+
| `%rule` | `%rule name(args): alternatives ;` | Supported | Defines a parameterized rule. |
36+
| `%rule %inline` | `%rule %inline name(args): alternatives ;` | Supported | Defines a parameterized rule expanded at use sites. |
37+
| Grammar separator | `%%` | Supported | Separates declarations, rules, and optional epilogue. |
38+
| `%empty` | `%empty` | Supported | Marks an empty alternative explicitly. |
39+
| `%prec` | `%prec symbol` | Supported | Overrides the rule precedence. Multiple `%prec` directives in one rule are an error. |
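
To see several of these directives working together, the following is a minimal sketch that writes a small grammar and, if `lrama` is on `PATH`, generates a parser from it. The grammar, the `/tmp/lrama_directives.*` paths, and the `program` and `expr` names are assumptions made for illustration; they are not taken from the repository.

```shell
# Write a small grammar that uses %expect, %union, %token, %type, %left, and %start.
# The file path and the grammar itself are illustrative only.
cat > /tmp/lrama_directives.y <<'EOF'
%expect 0

%union {
    int val;
}

%token <val> NUM
%type <val> expr
%left '+'
%left '*'
%start program

%%

program : expr { (void)$1; }
        ;

expr : NUM
     | expr '+' expr { $$ = $1 + $3; }
     | expr '*' expr { $$ = $1 * $3; }
     ;
EOF

# Generate the parser only when lrama is installed; -W enables warnings.
if command -v lrama >/dev/null 2>&1; then
  lrama -W -o /tmp/lrama_directives.c /tmp/lrama_directives.y
else
  echo "lrama not found; grammar written to /tmp/lrama_directives.y"
fi
```

The sketch only generates C output. Compiling it would additionally need `yylex` and `yyerror` definitions, which are omitted here.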
40+
41+
## Rule References
42+
43+
Semantic value references:
44+
45+
| Form | Meaning |
46+
| --- | --- |
47+
| `$$` | Value of the left-hand side. |
48+
| `$1`, `$2` | Positional right-hand side values. |
49+
| `$name` | Named reference. |
50+
| `$[name.with-punctuation]` | Bracketed named reference. |
51+
| `$<tag>1`, `$<tag>$` | Explicit tag override. |
52+
53+
Location references:
54+
55+
| Form | Meaning |
56+
| --- | --- |
57+
| `@$` | Location of the left-hand side. |
58+
| `@1`, `@2` | Positional right-hand side locations. |
59+
| `@name` | Named location reference. |
60+
| `@[name.with-punctuation]` | Bracketed named location reference. |
61+
62+
Index references:
63+
64+
| Form | Meaning |
65+
| --- | --- |
66+
| `$:1`, `$:name` | Parser stack index reference used by Lrama-generated code. |
67+
| `$:$` | Parsed as a reference form but not supported by code generation. |
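
The reference forms above can be combined in a single action. The sketch below writes a hypothetical grammar that uses positional values, named values, and a named location reference. The `left` and `right` aliases and the `/tmp/lrama_refs.*` paths are assumptions chosen for this example.

```shell
# Grammar showing $$, $1, $name, @$, and @name in actions (illustrative only).
cat > /tmp/lrama_refs.y <<'EOF'
%union {
    int val;
}

%token <val> NUM
%type <val> expr
%left '+'
%locations

%%

expr : NUM
         {
           $$ = $1;              /* positional value */
         }
     | expr[left] '+' expr[right]
         {
           $$ = $left + $right;  /* named values */
           @$ = @left;           /* illustrative: reuse the left operand's location */
         }
     ;
EOF

# Run lrama only when it is available.
if command -v lrama >/dev/null 2>&1; then
  lrama -o /tmp/lrama_refs.c /tmp/lrama_refs.y
else
  echo "lrama not found; grammar written to /tmp/lrama_refs.y"
fi
```

Named references tend to survive rule edits better than positional ones, because inserting a symbol into the right-hand side does not renumber them.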
68+
69+
## Parameterized Rule Forms
70+
71+
| Form | Expansion target |
72+
| --- | --- |
73+
| `symbol?` | `option(symbol)` |
74+
| `symbol*` | `list(symbol)` |
75+
| `symbol+` | `nonempty_list(symbol)` |
76+
| `name(arg1, arg2)` | User-defined or standard-library parameterized rule. |
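
As a rough illustration of these forms, the sketch below defines one user-level parameterized rule and uses the `*` and `?` suffixes, which expand to `list(NUM)` and `option(SEMI)` according to the table. The rule name `stmt`, the token names, and the `/tmp/lrama_param.*` paths are hypothetical.

```shell
# Grammar with a user-defined parameterized rule and suffix forms (illustrative only).
cat > /tmp/lrama_param.y <<'EOF'
%token NUM SEMI

%rule stmt(X): X SEMI
             ;

%%

program : stmt(NUM) NUM* SEMI?
        ;
EOF

# Run lrama only when it is available.
if command -v lrama >/dev/null 2>&1; then
  lrama -o /tmp/lrama_param.c /tmp/lrama_param.y
else
  echo "lrama not found; grammar written to /tmp/lrama_param.y"
fi
```

The suffix forms rely on the standard library rules, so this sketch assumes the grammar does not use `%no-stdlib`.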
Lines changed: 83 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,83 @@
1+
# Command Line Option Reference
2+
3+
This table is based on `lib/lrama/option_parser.rb`.
4+
5+
| Option | Argument | Description | Example |
6+
| --- | --- | --- | --- |
7+
| `-S`, `--skeleton=FILE` | File | Use a skeleton other than the default `bison/yacc.c`. | `lrama -S template/bison/yacc.c grammar.y` |
8+
| `-t`, `--debug` | None | Display debugging output from Lrama's internal grammar parser. Equivalent to `-Dparse.trace`. | `lrama -t grammar.y` |
9+
| `--locations` | None | Enable location support. | `lrama --locations grammar.y` |
10+
| `-D`, `--define=NAME[=VALUE]` | Name or name/value | Similar to a `%define` declaration. | `lrama -Dlr.type=ielr grammar.y` |
11+
| `-H`, `--header=[FILE]` | Optional file | Generate a header, optionally at `FILE`. | `lrama -Hparser.h grammar.y` |
12+
| `-d` | None | Generate a header with a derived path. | `lrama -d -o parser.c grammar.y` |
13+
| `-r`, `--report=REPORTS` | Comma-separated words | Generate automaton details. | `lrama --report=states,itemsets grammar.y` |
14+
| `--report-file=FILE` | File | Write report output to `FILE`. | `lrama --report=all --report-file=parser.output grammar.y` |
15+
| `-o`, `--output=FILE` | File | Write generated C output to `FILE`. | `lrama -o parser.c grammar.y` |
16+
| `--trace=TRACES` | Comma-separated words | Write generation traces to standard error. | `lrama --trace=rules,actions grammar.y` |
17+
| `--diagram=[FILE]` | Optional file | Generate an HTML syntax diagram. Defaults to `diagram.html`. | `lrama --diagram=/tmp/grammar.html grammar.y` |
18+
| `--profile=PROFILES` | Comma-separated words | Profile parser generation. | `lrama --profile=call-stack grammar.y` |
19+
| `-v`, `--verbose` | None | Same as adding the `states` report. | `lrama -v grammar.y` |
20+
| `-W`, `--warnings` | None | Enable warnings. | `lrama -W grammar.y` |
21+
| `-e` | None | Enable error recovery support in generated output. | `lrama -e grammar.y` |
22+
| `-V`, `--version` | None | Print version and exit. | `lrama --version` |
23+
| `-h`, `--help` | None | Print help and exit. | `lrama --help` |
24+
25+
## Report Keywords
26+
27+
| Keyword | Meaning |
28+
| --- | --- |
29+
| `states` | Describe parser states. |
30+
| `itemsets` | Include complete item-set closures. |
31+
| `lookaheads` | Show lookahead tokens for reduce items. |
32+
| `solved` | Describe solved shift/reduce conflicts. |
33+
| `counterexamples`, `cex` | Generate conflict counterexamples. |
34+
| `rules` | List unused rules. |
35+
| `terms` | List unused terminals. |
36+
| `verbose` | Include detailed internal state and analysis information. |
37+
| `all` | Enable all report keywords above. |
38+
| `none` | Disable reports. |
39+
40+
Internally, the default report option set includes the grammar report. A report file is written when `--report`, `-v`, or `--report-file` causes a report path to be set.
41+
42+
## Trace Keywords
43+
44+
| Keyword | Meaning |
45+
| --- | --- |
46+
| `automaton` | Trace states. |
47+
| `closure` | Trace item-set closure computation. |
48+
| `rules` | Trace grammar rules. |
49+
| `only-explicit-rules` | Trace only rules written explicitly in the grammar. |
50+
| `actions` | Trace grammar rules with actions. |
51+
| `time` | Trace generation time. |
52+
| `all` | Enable all supported trace keywords except `only-explicit-rules`. |
53+
| `none` | Disable traces. |
54+
55+
The validator knows additional Bison trace names, but only the keywords listed above are supported by the current Lrama tracer.
56+
57+
## Profile Keywords
58+
59+
| Keyword | Meaning |
60+
| --- | --- |
61+
| `call-stack` | Use the sampling call-stack profiler. |
62+
| `memory` | Use the memory profiler. |
63+
64+
## Defaults And Derived Paths
65+
66+
| Setting | Default |
67+
| --- | --- |
68+
| Output file | `y.tab.c` |
69+
| Skeleton | `bison/yacc.c` |
70+
| Diagram file | `diagram.html` |
71+
| Header path with `-d -o parser.c` | `parser.h` |
72+
| Header path with `-d grammar.y` | `grammar.h` |
73+
| Report path with `--report=all grammar.y` | `grammar.output` |
74+
75+
## STDIN Mode
76+
77+
Use:
78+
79+
```shell
80+
$ lrama [options] - FILE
81+
```
82+
83+
Lrama reads grammar text from standard input and uses `FILE` for diagnostics and derived paths.
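
One possible use of STDIN mode is to preprocess a grammar before handing it to Lrama. The sketch below pipes a `sed`-rewritten grammar into `lrama` while still naming the original file after `-`. The grammar contents, the `sed` substitution, and the `/tmp/lrama_stdin.*` paths are assumptions for illustration.

```shell
# Minimal grammar used only for this illustration.
printf '%s\n' '%token NUM' '%%' 'program : NUM ;' > /tmp/lrama_stdin.y

if command -v lrama >/dev/null 2>&1; then
  # "-" selects standard input; the trailing file name is used for
  # diagnostics and derived paths.
  sed 's/NUM/NUMBER/g' /tmp/lrama_stdin.y \
    | lrama -o /tmp/lrama_stdin.c - /tmp/lrama_stdin.y
else
  echo "lrama not found; skipping generation"
fi
```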
Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
# Bison Compatibility
2+
3+
Lrama is Bison-style, not Bison-complete. This matrix records practical compatibility for the current repository.
4+
5+
| Bison feature or area | Lrama status | Notes |
6+
| --- | --- | --- |
7+
| `.y` grammar layout | Supported | Prologue, declarations, rules, and epilogue are supported. |
8+
| C LALR(1) parser generation | Supported | This is Lrama's main output mode. |
9+
| `%token`, `%type`, `%nterm`, `%union` | Supported | See the directive reference for syntax details. |
10+
| Precedence and associativity | Supported | `%left`, `%right`, `%nonassoc`, `%precedence`, and `%prec` are accepted. |
11+
| `%start` | Supported with difference | Multiple `%start` declarations are an error in Lrama. |
12+
| Semantic actions | Supported | Includes positional, named, bracketed named, and explicit-tag references. |
13+
| Location references | Supported | `@` references enable location support during grammar preparation. |
14+
| `%destructor` and `%printer` | Supported | Used by generated C output and debug support. |
15+
| `%require` | Recognized only | Accepted, but warnings state that it currently has no effect. |
16+
| `%param` | Recognized with limited effect | Use `%lex-param` and `%parse-param` for generated parser parameters. |
17+
| `%lex-param`, `%parse-param` | Supported | Parameters are passed to generated parser functions. |
18+
| IELR | Supported when requested | Use `%define lr.type ielr`; covered by integration fixtures. |
19+
| GLR | Not currently provided | No GLR skeleton or parser mode is provided. |
20+
| Push parser | Not currently provided | The generated C skeleton is pull-parser oriented. |
21+
| LAC | Not currently provided | The README records the LAC branch as disabled in the compatibility assumptions. |
22+
| C++/D/Java backends | Not currently provided | Current repository templates target C output. |
23+
| Parameterized rules | Lrama extension | `%rule`, suffixes, and `stdlib.y` helpers are Lrama grammar features. |
24+
| `%inline` parameterized rules | Lrama extension | Rules are expanded before parser states are finalized. |
25+
| Syntax diagrams | Lrama extension | Generated with `--diagram`. |
26+
| Error-tolerant parser support | Lrama extension | Enabled with `-e` and related grammar support. |
27+
| Bison manual text | Not reused | Documentation should explain Lrama behavior in original wording. |
28+
29+
## Compatibility Guidance
30+
31+
When porting a Bison grammar:
32+
33+
1. Generate reports with both tools if possible.
34+
2. Check unsupported directives before changing parser behavior.
35+
3. Compile and run generated parser tests.
36+
4. Confirm conflict counts with `%expect`.
37+
5. Avoid documenting future or unmerged behavior as supported.
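
The first and fourth steps above could be sketched as follows. The script generates a report with each tool that is installed and then lists the lines that mention conflicts. The `grammar.y` path and the `/tmp/port_*` output paths are hypothetical, and the report formats of the two tools are not assumed to match line for line.

```shell
grammar=grammar.y   # hypothetical path to the grammar being ported

# Generate a full report with Bison, if it is installed.
if [ -f "$grammar" ] && command -v bison >/dev/null 2>&1; then
  bison --report=all --report-file=/tmp/port_bison.output \
        -o /tmp/port_bison.c "$grammar"
fi

# Generate a full report with Lrama, if it is installed.
if [ -f "$grammar" ] && command -v lrama >/dev/null 2>&1; then
  lrama --report=all --report-file=/tmp/port_lrama.output \
        -o /tmp/port_lrama.c "$grammar"
fi

# List conflict mentions in whichever reports were produced.
for report in /tmp/port_bison.output /tmp/port_lrama.output; do
  if [ -f "$report" ]; then
    echo "== $report"
    grep -n -i "conflict" "$report" || echo "no conflicts mentioned"
  fi
done
```

Comparing the two listings by eye is usually enough to decide whether an `%expect` value carried over from Bison still matches.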
38+
39+
## README Compatibility Assumptions
40+
41+
The README records several Bison template compatibility assumptions:
42+
43+
- `b4_locations_if` is always true.
44+
- `b4_pure_if` is always true.
45+
- `b4_pull_if` is always false.
46+
- `b4_lac_if` is always false.
47+
48+
These are implementation compatibility notes for Lrama's Bison-style template layer. They should not be read as a promise that every Bison command-line option or skeleton branch exists in Lrama.
