Summary
The lexer's SCOPE_IDENTIFIER pattern does not allow [ or ], so scope names
produced by Verilog generate blocks — e.g. g_lane[0], g_lane[1] — are
truncated to g_lane. All array instances collapse to the same name, making it
impossible to distinguish them after parsing.
Root cause
File: src/VCDScanner.l, line 76
SCOPE_IDENTIFIER [a-zA-Z_][a-zA-Z_0-9\(\)]*
The character class allows letters, digits, underscores, and parentheses, but
not square brackets. When the lexer is in the IN_SCOPE state and
encounters a token like g_lane[0], the rule on line 213:
<IN_SCOPE>{SCOPE_IDENTIFIER} {
return VCDParser::parser::make_TOK_IDENTIFIER(std::string(yytext),loc);
}
matches only g_lane. The remaining [0] is then consumed by the catch-all
silent discard rule at line 383:
<*>.|\n {
// DO nothing!
}
No warning or error is emitted. The information is silently lost.
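The truncation can be reproduced outside the lexer. The following is a minimal sketch, assuming that std::regex with match_continuous is a fair stand-in for flex's anchored prefix match for this particular pattern. The names kScopeIdentifier, lex_prefix and unmatched_rest are hypothetical and exist only for illustration; they are not part of the parser.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for the SCOPE_IDENTIFIER definition on line 76.
static const std::regex kScopeIdentifier(R"re([a-zA-Z_][a-zA-Z_0-9\(\)]*)re");

// Returns the prefix of `text` that `pattern` matches at position 0,
// or an empty string if nothing matches there.
std::string lex_prefix(const std::string& text, const std::regex& pattern) {
    std::smatch m;
    if (std::regex_search(text, m, pattern,
                          std::regex_constants::match_continuous)) {
        return m.str(0);
    }
    return std::string();
}

// Returns the part of `text` that follows the matched prefix, i.e. the
// characters that would be left for other lexer rules to handle.
std::string unmatched_rest(const std::string& text, const std::regex& pattern) {
    return text.substr(lex_prefix(text, pattern).size());
}
```

Calling lex_prefix("g_lane[0]", kScopeIdentifier) yields g_lane, and unmatched_rest on the same input yields [0], which is the part the catch-all rule discards.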
Why this is spec non-conformant
IEEE Std 1800-2023, section 21.7.2.1 (page 688 of the standard), defines the
4-state VCD syntax as:
$scope [ scope_type scope_identifier ] $end
...
scope_identifier ::= { ASCII character }
The grammar specifies scope_identifier as any sequence of ASCII
characters — there is no restriction on which characters are allowed.
Square brackets are valid ASCII and must be accepted.
Contrast this with SIGNAL_REFERENCE on line 77 of the same file, which
does include square brackets:
SIGNAL_REFERENCE [a-zA-Z_][a-zA-Z_0-9\.\(\)\[\]]*
The omission in SCOPE_IDENTIFIER is inconsistent with both the spec and the
treatment of signal references in the same lexer.
Impact
Any VCD file generated from a design with generate loops will contain
scopes whose names include an array index, for example:
$scope module g_lane[0] $end
...
$scope module g_lane[1] $end
...
$scope module g_lane[2] $end
After parsing, all three scopes are stored with name == "g_lane". Any
downstream tool that relies on scope names to navigate or compare the
hierarchy (e.g. a VCD diff tool) will see duplicate names and cannot
correctly identify individual instances.
Suggested fix
Add \[\] to the SCOPE_IDENTIFIER character class in src/VCDScanner.l,
line 76:
-SCOPE_IDENTIFIER [a-zA-Z_][a-zA-Z_0-9\(\)]*
+SCOPE_IDENTIFIER [a-zA-Z_][a-zA-Z_0-9\(\)\[\]]*
This mirrors how SIGNAL_REFERENCE is already defined and makes the lexer
accept the bracketed scope names that IEEE 1800-2023 permits.
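A quick check of the proposed pattern could look like the sketch below. It assumes std::regex as a stand-in for flex, so it exercises only the character class, not the generated lexer. The names kFixedScopeIdentifier and distinct_scope_names are hypothetical.

```cpp
#include <regex>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the proposed SCOPE_IDENTIFIER definition.
static const std::regex kFixedScopeIdentifier(
    R"re([a-zA-Z_][a-zA-Z_0-9\(\)\[\]]*)re");

// Collects the distinct names obtained by matching each raw scope name
// against `pattern` at position 0. Names that do not match are skipped.
std::set<std::string> distinct_scope_names(
        const std::vector<std::string>& raw_names, const std::regex& pattern) {
    std::set<std::string> names;
    for (const std::string& raw : raw_names) {
        std::smatch m;
        if (std::regex_search(raw, m, pattern,
                              std::regex_constants::match_continuous)) {
            names.insert(m.str(0));
        }
    }
    return names;
}
```

With g_lane[0], g_lane[1] and g_lane[2] as input, the returned set should hold three entries, one per generate instance, rather than a single g_lane.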