5 changes: 4 additions & 1 deletion .gitignore
@@ -3,4 +3,7 @@ dist/
client/out/
server/out/
*.vsix
.env
.env
tests/_*
docs/
.vscode/
1 change: 1 addition & 0 deletions .vscodeignore
@@ -24,3 +24,4 @@ package-lock.json
.gitattributes
.env
.env.*
docs/
18 changes: 17 additions & 1 deletion CHANGELOG.md
@@ -2,6 +2,22 @@

All notable changes to the **Bison/Flex Language Support** extension will be documented in this file.

## [1.5.1] - 2026-04-02

### Fixed

- **Flex — escaped quotes in quoted string patterns** (#30): patterns like `X"\'"` and `Y"\""` no longer trigger false `flex/invalid-pattern` errors. The validator now correctly handles `\"` and `\'` escape sequences inside Flex quoted strings.
- **Flex — abbreviation refs on rule lines with no inline action** (#31): `{ABBR}` used after a `^` BOL anchor or on a rule line whose action block appears on the following line was not recorded as an abbreviation reference, causing false `flex/unused-abbrev` warnings.
- **Flex — quoted strings with spaces in `rawPattern`** (audit-A): patterns like `"hello world"` were truncated at the space inside the quoted literal, causing false `flex/unreachable-rule` duplicates for distinct patterns sharing a common word prefix. `rawPattern()` now tracks quoted-string depth.
- **Flex — standalone `{` as multi-line action opener** (audit-B): a `{` appearing alone on the line after a rule pattern (valid Flex multi-line action syntax) was pushed as a spurious rule entry with pattern `{`, producing false `flex/unreachable-rule` diagnostics for every subsequent multi-line-action rule.
- **Flex — lowercase start condition names** (audit-C): all start-condition regex patterns used `[A-Z_][A-Z0-9_]*` (uppercase only). SC names that are valid C identifiers but lowercase (e.g. `%x comment`) were silently ignored, skipping `flex/undefined-sc` and `flex/unused-sc` diagnostics for them entirely.
- **Flex — single-tab action separator in abbreviation ref scan** (audit-D): the heuristic that separates the pattern from the action used `\s{2,}`, which did not match a single-tab separator. `{identifier}` tokens inside the C action body (e.g. compound literals) were falsely counted as abbreviation references, suppressing `flex/unused-abbrev`.
- **Cleanup**: removed two dead entries in the catch-all pattern set that contained a literal newline character and could never match a rule line.
- **Bison — lowercase/mixed-case tokens in precedence declarations** (audit-E): `%left`/`%right`/`%nonassoc` used an uppercase-only regex `[A-Z_][A-Z0-9_]*`, silently dropping tokens like `kPLUS` or `tTOKEN` from the precedence table. This caused false `bison/undeclared-token` warnings and incorrect shift/reduce heuristic results for such tokens.
- **Bison — `$N` references after nested sub-blocks in inline actions** (audit-F): the `extractDollarRefs` scanner used `/\{([^}]*)\}/` which stops at the first `}`, missing `$N` references that appear after a nested `{ … }` block inside the same action (e.g. `{ if (cond) { log(); } $$ = $5; }`). Replaced with a brace-depth scanner; the same fix was applied to `extractSymbols`, `getFirstSymbol`, and `extractRuleReferences` for consistency.
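
The case-related fixes (audit-C and audit-E) come down to the same character-class change. The following is a minimal sketch, not the extension's code: it uses the two regexes quoted in the entries above, and the input strings are made up for illustration.

```typescript
// Old and new identifier character classes, as quoted in the entries above.
const upperOnly = /[A-Z_][A-Z0-9_]*/g;
const anyCase = /[A-Za-z_][A-Za-z0-9_]*/g;

// Hypothetical inputs: the symbol list of `%left kPLUS tTOKEN MINUS`
// and the name list of `%x comment`.
const precedenceSymbols = "kPLUS tTOKEN MINUS";
const startConditions = "comment";

// String.prototype.match with a /g regex returns all matches, or null if none.
const oldTokens = precedenceSymbols.match(upperOnly) || [];
const newTokens = precedenceSymbols.match(anyCase) || [];
const oldConds = startConditions.match(upperOnly) || [];
const newConds = startConditions.match(anyCase) || [];

console.log(oldTokens, newTokens);
console.log(oldConds, newConds);
```

With these inputs the uppercase-only class yields the fragments `PLUS` and `TOKEN` rather than the mixed-case names, and it matches nothing in `comment`, which is consistent with the symptoms described above.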

---

## [1.5.0] - 2026-04-01

### Added
@@ -11,7 +27,7 @@ All notable changes to the **Bison/Flex Language Support** extension will be doc
- **`Bison/Flex: Show in Generated File`** — from a `.y` / `.l` source, locates the generated file (using `bisonFlex.buildDirectory` setting, CMake detection, Makefile detection, same-directory fallback, then workspace-wide search) and navigates to the matching line. A QuickPick is shown when multiple candidates are found.
- New setting `bisonFlex.buildDirectory`: optional path to the build output directory, used by **Show in Generated File** to locate generated files when they are not in the same directory as the source.

--
---

## [1.4.1] - 2026-03-31

42 changes: 42 additions & 0 deletions README.md
@@ -39,6 +39,44 @@ Real-time error detection as you type:
| Shift/reduce conflict heuristic | |
| Unknown/invalid directive | |

Every diagnostic carries a **source** field (`bison` / `flex`), a **code slug** (e.g. `bison/unused-token`), and where available a **link** to the GNU documentation — rendered as a clickable `[bison/unused-token]` link in the Problems panel. Unused symbols are rendered greyed-out via `DiagnosticTag.Unnecessary`.

### Fix-it Hints (Quick Fixes)

22 code actions available via the lightbulb (`Ctrl+.`) or directly from the Problems panel:

**Bison** (11 fixes):
- Insert missing `%%` separator
- Declare undeclared `%token`
- Insert `%empty` for empty production
- Remove unused token declaration
- Remove unknown directive
- Add rule stub for missing non-terminal
- Add `%type <todo>` declaration
- Remove invalid `%start` / Add `%start` directive
- Close unclosed `%{` block
- Migrate Yacc legacy directives (`%error-verbose` → `%define parse.error verbose`, `%name-prefix`, `%pure-parser`, `%binary`)

**Flex** (11 fixes):
- Insert missing `%%` separator
- Define abbreviation stub
- Remove unused abbreviation
- Remove unused start condition
- Remove unknown directive
- Declare `%x SC_NAME` for undefined start condition
- Remove unused `%option`
- Remove duplicate `<<EOF>>` rule
- Add `%option noyywrap`
- Close unclosed `%{` block
- Remove inaccessible rule

### Source ↔ Generated File Navigation

Jump between Bison/Flex grammar sources and their generated C files using `#line` directives:

- **Bison/Flex: Show in Source** — from a generated `.tab.c` / `lex.yy.c` file, reads the nearest `#line N "file.y"` directive above the cursor and opens the grammar source at the correct line. Appears in the context menu only when a generated file is detected.
- **Bison/Flex: Show in Generated File** — from a `.y` / `.l` source, locates the generated file and navigates to the matching line. Searches `bisonFlex.buildDirectory`, then CMake/Makefile detection, then the same directory, then a workspace-wide scan. A QuickPick is shown when multiple candidates are found.

### Autocompletion

Context-aware suggestions triggered as you type:
@@ -185,6 +223,10 @@ Then press `F5` in VS Code to launch the Extension Development Host.
| `bisonFlex.showInlayHints` | `boolean` | `true` | Show inline type hints for `$$`/`$1`/`@$` semantic values |
| `bisonFlex.enableCodeLens` | `boolean` | `true` | Show reference counts and entry-point badges above rules |
| `bisonFlex.enableCmakeDiagnostics` | `boolean` | `true` | Warn when a `.y`/`.l` file is not referenced in `CMakeLists.txt` |
| `bisonFlex.minVersionBison` | `string` | `""` | Suppress checks that require a newer Bison version (e.g. `"3.0"`). Fires `bison/feature-requires-version` when a `%define` feature exceeds this version. |
| `bisonFlex.minVersionFlex` | `string` | `""` | Same as above for Flex. |
| `bisonFlex.disabledChecks` | `array` | `[]` | Diagnostic code slugs to suppress entirely (e.g. `["bison/shift-reduce", "flex/missing-yywrap"]`). |
| `bisonFlex.buildDirectory` | `string` | `""` | Path to the build output directory. Used by **Show in Generated File** to locate `.tab.c` / `lex.yy.c` when they are not next to the source. |

---

4 changes: 2 additions & 2 deletions package-lock.json

2 changes: 1 addition & 1 deletion package.json
@@ -2,7 +2,7 @@
"name": "bison-flex-lang",
"displayName": "Bison/Flex Language Support",
"description": "Full-featured language support for GNU Bison (.y, .yy) and Flex/RE-flex (.l, .ll) — syntax highlighting with embedded C/C++, real-time diagnostics, intelligent autocompletion, and hover documentation for all directives.",
"version": "1.5.0",
"version": "1.5.1",
"publisher": "theodevelop",
"license": "MIT",
"repository": {
62 changes: 49 additions & 13 deletions server/src/parser/bisonParser.ts
@@ -166,7 +166,7 @@ export function parseBisonDocument(text: string): BisonDocument {
const precMatch = trimmed.match(/^%(left|right|nonassoc|precedence)\s+(.*)/);
if (precMatch) {
const kind = precMatch[1] as PrecedenceDeclaration['kind'];
const rawSymbols = precMatch[2].match(/[A-Z_][A-Z0-9_]*|"[^"]*"/g) || [];
const rawSymbols = precMatch[2].match(/[A-Za-z_][A-Za-z0-9_]*|"[^"]*"/g) || [];
const symbols: string[] = [];
const symbolRanges: Range[] = [];
for (const raw of rawSymbols) {
@@ -389,6 +389,33 @@ function replaceStringLiterals(text: string): string {
.replace(/'((?:[^'\\]|\\.)*)'/g, (_, content) => ` ${strLiteralPlaceholder(`'${content}'`)} `);
}

/**
* Remove all brace-balanced { ... } blocks from `text`, replacing each with `replacement`.
* Handles arbitrarily nested braces, unlike /\{[^}]*\}/ which stops at the first `}`.
* An unmatched `{` with no closing `}` (e.g. a multi-line action opener on its own line)
* is dropped from the result together with everything after it; the Phase 3 brace tracker handles such blocks separately.
*/
function removeBalancedBraces(text: string, replacement: string = ' '): string {
let result = '';
let depth = 0;
let pendingOpen = false; // true while inside a block that hasn't been closed yet
for (let i = 0; i < text.length; i++) {
if (text[i] === '{') {
if (depth === 0) pendingOpen = true;
depth++;
} else if (text[i] === '}') {
depth = Math.max(0, depth - 1);
if (depth === 0 && pendingOpen) {
result += replacement; // only emit placeholder when the block is fully closed
pendingOpen = false;
}
} else if (depth === 0) {
result += text[i];
}
}
return result;
}
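
To make the behavior concrete, here is a small self-contained sketch that repeats the helper above (so the snippet runs on its own) and compares it with the old `/\{[^}]*\}/g` replacement. The production strings are made up for illustration.

```typescript
// Copy of the helper from this diff, repeated so the snippet is runnable alone.
function removeBalancedBraces(text: string, replacement: string = ' '): string {
  let result = '';
  let depth = 0;
  let pendingOpen = false;
  for (let i = 0; i < text.length; i++) {
    if (text[i] === '{') {
      if (depth === 0) pendingOpen = true;
      depth++;
    } else if (text[i] === '}') {
      depth = Math.max(0, depth - 1);
      if (depth === 0 && pendingOpen) {
        result += replacement;
        pendingOpen = false;
      }
    } else if (depth === 0) {
      result += text[i];
    }
  }
  return result;
}

// Split on whitespace so the comparison does not depend on spacing.
const tokens = (s: string): string[] => s.trim().split(/\s+/);

// Hypothetical production with a nested block inside the action.
const rhs = "expr PLUS expr { if (x) { log(); } $$ = $1; } term";

const oldCleaned = tokens(rhs.replace(/\{[^}]*\}/g, ' __midaction__ '));
const newCleaned = tokens(removeBalancedBraces(rhs, ' __midaction__ '));

// An opener with no closing brace on the same line produces no placeholder.
const unclosed = tokens(removeBalancedBraces("stmt: IF expr {"));

console.log(oldCleaned);
console.log(newCleaned);
console.log(unclosed);
```

The old regex ends its match at the first `}`, so the tail of the action (`$$`, `=`, `$1;`, `}`) leaks into the cleaned text, whereas the depth-based helper leaves only the grammar symbols and a single placeholder.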

/**
* Extract all grammar symbols (identifiers) from a production RHS in order.
*
@@ -399,8 +426,7 @@ function replaceStringLiterals(text: string): string {
* `"("` apart from `"{"` (both have different placeholders).
*/
function extractSymbols(text: string): string[] {
const cleaned = replaceStringLiterals(text)
.replace(/\{[^}]*\}/g, ' __midaction__ ') // inline actions count as a symbol ($N position)
const cleaned = removeBalancedBraces(replaceStringLiterals(text), ' __midaction__ ') // inline actions count as a symbol ($N position)
.replace(/%prec\s+\S+/g, ' ') // remove %prec TOKEN
.replace(/%empty/g, ' ') // remove %empty
.replace(/\/\/.*$/g, ' ') // remove line comments
@@ -423,8 +449,7 @@ function extractSymbols(text: string): string[] {
* `__s` (not all-caps) and is therefore not confused with a real terminal.
*/
function getFirstSymbol(text: string): string | undefined {
const cleaned = replaceStringLiterals(text)
.replace(/\{[^}]*\}/g, ' ') // remove inline actions
const cleaned = removeBalancedBraces(replaceStringLiterals(text)) // remove inline actions
.replace(/%prec\s+\S+/g, ' ') // remove %prec TOKEN
.replace(/%empty/g, ' ') // remove %empty
.replace(/\/\/.*$/g, ' ') // remove line comments
@@ -518,15 +543,27 @@ function parseTokenNames(text: string, type: string | undefined, lineNum: number

/**
* Scan the inline action block(s) on a single line for $n references.
* Only handles single-line { ... } blocks; multi-line actions are not detected here.
* Uses a brace-depth scanner so that $n references appearing after a nested
* sub-block (e.g. `{ if (x) { foo(); } $$ = $1; }`) are not missed.
* Only handles single-line { ... } blocks; multi-line actions are tracked by
* the caller (Phase 3 loop in parseBisonDocument).
* $$ and $<type>n are deliberately skipped.
*/
function extractDollarRefs(text: string, lineNum: number, fullLine: string): DollarRef[] {
const refs: DollarRef[] = [];
const actionRegex = /\{([^}]*)\}/g;
let actionMatch: RegExpExecArray | null;
while ((actionMatch = actionRegex.exec(text)) !== null) {
const actionContent = actionMatch[1];
let i = 0;
while (i < text.length) {
if (text[i] !== '{') { i++; continue; }
// Found the opening brace of an action block — scan to the matching '}'
let depth = 1;
let j = i + 1;
while (j < text.length && depth > 0) {
if (text[j] === '{') depth++;
else if (text[j] === '}') depth--;
j++;
}
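// For a balanced block, text[i+1 .. j-2] is its full content. If the block is
// unclosed on this line (depth > 0), keep everything up to the end of the line
// instead of dropping the last character.
const actionContent = text.substring(i + 1, depth === 0 ? j - 1 : j);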
const dollarRegex = /\$(\d+)/g;
let m: RegExpExecArray | null;
while ((m = dollarRegex.exec(actionContent)) !== null) {
@@ -540,6 +577,7 @@ function extractDollarRefs(text: string, lineNum: number, fullLine: string): Dol
range: Range.create(lineNum, col >= 0 ? col : 0, lineNum, (col >= 0 ? col : 0) + fullMatch.length),
});
}
i = j; // advance past the entire balanced block
}
return refs;
}
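
The difference between the two scanning strategies can be reproduced in isolation. The sketch below is not the extension's `extractDollarRefs` (it returns plain numbers and no ranges); `dollarRefsRegex` and `dollarRefsDepth` are hypothetical names, and the input is the example from the changelog entry.

```typescript
// Push every $N index found in `content` onto `into`.
function collectDollarRefs(content: string, into: number[]): void {
  const dollar = /\$(\d+)/g;
  let m: RegExpExecArray | null;
  while ((m = dollar.exec(content)) !== null) into.push(Number(m[1]));
}

// Old strategy: the regex match ends at the first `}`.
function dollarRefsRegex(text: string): number[] {
  const refs: number[] = [];
  const actionRegex = /\{([^}]*)\}/g;
  let a: RegExpExecArray | null;
  while ((a = actionRegex.exec(text)) !== null) collectDollarRefs(a[1], refs);
  return refs;
}

// New strategy: track brace depth to find the matching `}`.
function dollarRefsDepth(text: string): number[] {
  const refs: number[] = [];
  let i = 0;
  while (i < text.length) {
    if (text[i] !== '{') { i++; continue; }
    let depth = 1;
    let j = i + 1;
    while (j < text.length && depth > 0) {
      if (text[j] === '{') depth++;
      else if (text[j] === '}') depth--;
      j++;
    }
    // Exclude the closing brace only if one was actually found.
    const end = depth === 0 ? j - 1 : j;
    collectDollarRefs(text.substring(i + 1, end), refs);
    i = j; // continue after the whole block
  }
  return refs;
}

const action = "{ if (cond) { log(); } $$ = $5; }";
console.log(dollarRefsRegex(action)); // misses $5
console.log(dollarRefsDepth(action)); // finds $5
```

`$$` is skipped by both versions because `\$(\d+)` requires a digit after the dollar sign.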
@@ -579,9 +617,7 @@ function extractRuleReferences(text: string, lineNum: number, fullLine: string,

// Find identifiers in rule bodies (potential token/nonterminal references)
// Skip: strings, actions (braces), %prec keyword (but keep its token), %empty, comments
const cleaned = text
.replace(/"(?:[^"\\]|\\.)*"/g, '') // remove strings
.replace(/\{[^}]*\}/g, '') // remove inline actions
const cleaned = removeBalancedBraces(text.replace(/"(?:[^"\\]|\\.)*"/g, '')) // remove strings, then inline actions
.replace(/%prec/g, '') // remove %prec keyword (keep the token name)
.replace(/%empty/g, '') // remove %empty
.replace(/\/\/.*$/g, ''); // remove line comments
31 changes: 22 additions & 9 deletions server/src/parser/flexParser.ts
@@ -234,7 +234,7 @@ export function parseFlexDocument(text: string): FlexDocument {
if (closeIdx >= 0) {
// Collect any additional SC names before the >
const before = trimmed.substring(0, closeIdx);
const moreConds = before.match(/[A-Z_][A-Z0-9_]*/g);
const moreConds = before.match(/[A-Za-z_][A-Za-z0-9_]*/g);
if (moreConds) pendingScHeader += ',' + moreConds.join(',');
const conds = pendingScHeader.replace(/^,+/, '').split(',').filter(s => s.length > 0);
pendingScHeader = null;
@@ -246,7 +246,7 @@ export function parseFlexDocument(text: string): FlexDocument {
}
} else {
// Still accumulating conditions from this line
const moreConds = trimmed.match(/[A-Z_][A-Z0-9_]*/g);
const moreConds = trimmed.match(/[A-Za-z_][A-Za-z0-9_]*/g);
if (moreConds) pendingScHeader += ',' + moreConds.join(',');
}
continue;
@@ -267,10 +267,19 @@ export function parseFlexDocument(text: string): FlexDocument {
continue;
}

// ── Multi-line action opener: bare `{` on its own line ────────────────────
// In Flex, the action brace may appear on the line after the pattern.
// Treat a standalone `{` as the opening of a C action block, not a rule.
if (trimmed === '{') {
actionDepth = 1;
continue;
}

// ── SC block opener: <SC1,SC2>{ ───────────────────────────────────────────
// Single-line header: <SC1,SC2>{ or <SC1,SC2> {
// SC names may be upper or lower case (any valid C identifier).
{
const scBlockMatch = trimmed.match(/^<([A-Z_][A-Z0-9_]*(?:,[A-Z_][A-Z0-9_]*)*)>\s*\{/);
const scBlockMatch = trimmed.match(/^<([A-Za-z_][A-Za-z0-9_]*(?:,[A-Za-z_][A-Za-z0-9_]*)*)>\s*\{/);
if (scBlockMatch) {
const conds = scBlockMatch[1].split(',');
scBlockStack.push(conds);
@@ -284,7 +293,7 @@ export function parseFlexDocument(text: string): FlexDocument {
continue;
}
// Multi-line header start: <SC1, (no closing > on this line)
const scMultiStart = trimmed.match(/^<([A-Z_][A-Z0-9_]*(?:,[A-Z_][A-Z0-9_]*)*,\s*)$/);
const scMultiStart = trimmed.match(/^<([A-Za-z_][A-Za-z0-9_]*(?:,[A-Za-z_][A-Za-z0-9_]*)*,\s*)$/);
if (scMultiStart) {
pendingScHeader = scMultiStart[1].replace(/,\s*$/, '');
continue;
@@ -293,7 +302,8 @@ export function parseFlexDocument(text: string): FlexDocument {

// ── Extract start condition references: <SC_NAME> or <SC1,SC2> ────────────
// Exclude <<EOF>> which is a special pattern, not a start condition
const scRefs = line.matchAll(/(?<!<)<([A-Z_][A-Z0-9_]*(?:,[A-Z_][A-Z0-9_]*)*)>(?!>)/g);
// SC names may be upper or lower case (any valid C identifier).
const scRefs = line.matchAll(/(?<!<)<([A-Za-z_][A-Za-z0-9_]*(?:,[A-Za-z_][A-Za-z0-9_]*)*)>(?!>)/g);
for (const m of scRefs) {
const conditions = m[1].split(',');
for (const cond of conditions) {
@@ -311,8 +321,11 @@ export function parseFlexDocument(text: string): FlexDocument {
const abbrRefs = line.matchAll(/\{([a-zA-Z_][a-zA-Z0-9_]*)\}/g);
for (const m of abbrRefs) {
const name = m[1];
// Only count as abbreviation ref if it appears before any action block on this line
const actionStart = line.indexOf('{', (line.match(/\s{2,}\{/) || { index: line.length }).index || line.length);
// Only count as abbreviation ref if it appears before any action block on this line.
// If there is no action { on this line (multi-line action), treat actionStart as line.length
// so all {name} refs on this line are counted.
const actionMatch = line.match(/\s+\{/);
const actionStart = actionMatch !== null ? line.indexOf('{', actionMatch.index!) : line.length;
if (m.index !== undefined && m.index < actionStart) {
const col = m.index;
const range = Range.create(i, col, i, col + m[0].length);
@@ -327,7 +340,7 @@ export function parseFlexDocument(text: string): FlexDocument {
// Start conditions: explicit <SC> prefix on this line PLUS any inherited from <SC>{ block
const inherited = scBlockStack.length > 0 ? scBlockStack[scBlockStack.length - 1] : [];
const startConditions: string[] = [...inherited];
const scMatch = trimmed.match(/^<([A-Z_][A-Z0-9_]*(?:,[A-Z_][A-Z0-9_]*)*)>/);
const scMatch = trimmed.match(/^<([A-Za-z_][A-Za-z0-9_]*(?:,[A-Za-z_][A-Za-z0-9_]*)*)>/);
if (scMatch) {
for (const c of scMatch[1].split(',')) {
if (!startConditions.includes(c)) startConditions.push(c);
@@ -375,7 +388,7 @@ function parseOptions(text: string, lineNum: number, fullLine: string, doc: Flex
}

function parseStartConditions(text: string, exclusive: boolean, lineNum: number, fullLine: string, doc: FlexDocument): void {
const names = text.match(/[A-Z_][A-Z0-9_]*/g);
const names = text.match(/[A-Za-z_][A-Za-z0-9_]*/g);
if (!names) return;
for (const name of names) {
const col = fullLine.indexOf(name);