This document describes the ANTLR4-based parser implementation for MDL (Mendix Definition Language) in the Go library.
The MDL parser translates SQL-like MDL syntax into executable operations against Mendix project files. It uses ANTLR4 for grammar definition, enabling cross-language grammar sharing with other implementations (TypeScript, Java, Python).
┌─────────────────────────────────────────────────────────────────────┐
│ MDL Input String │
│ "SHOW ENTITIES IN MyModule" │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ ANTLR4 Lexer (mdl_lexer.go) │
│ Generated from MDLLexer.g4 - Tokenizes input into SHOW, ENTITIES,│
│ IN, IDENTIFIER tokens │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ ANTLR4 Parser (mdl_parser.go) │
│ Generated from MDLParser.g4 - Builds parse tree according to grammar │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ ANTLR Listener (visitor/visitor.go) │
│ Walks parse tree and builds strongly-typed AST nodes │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ AST (ast/ast.go) │
│ *ast.ShowStmt{Type: "ENTITIES", Module: "MyModule"} │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ Executor (executor/executor.go) │
│ Executes AST against modelsdk-go API │
└─────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ modelsdk-go Library │
│ mpr.Writer, domainmodel.Entity, etc. │
└─────────────────────────────────────────────────────────────────────┘
mdl/
├── grammar/
│   ├── MDLLexer.g4                # ANTLR4 lexer grammar (tokens)
│   ├── MDLParser.g4               # ANTLR4 parser grammar (rules)
│   └── parser/                    # Generated parser code (DO NOT EDIT)
│       ├── mdl_lexer.go
│       ├── mdl_parser.go
│       ├── mdlparser_listener.go
│       └── mdlparser_base_listener.go
├── ast/
│   └── ast.go, ast_microflow.go, ast_expression.go, ast_datatype.go, ...
├── visitor/
│   └── visitor.go                 # ANTLR listener implementation
├── executor/
│   ├── executor.go                # AST execution logic
│   ├── cmd_microflows_builder.go  # Microflow builder (variable tracking)
│   └── validate_microflow.go      # AST-level semantic checks (mxcli check)
├── catalog/
│   └── catalog.go                 # SQLite-based project metadata catalog
├── linter/
│   ├── linter.go                  # Linting framework
│   └── rules/                     # Built-in lint rules (MDL001–MDL004)
└── repl/
    └── repl.go                    # Interactive REPL interface
cmd/mxcli/
└── main.go                        # Cobra CLI entry point
The grammar defines MDL syntax using ANTLR4's EBNF-like notation.
Key design patterns:
Uses fragment rules for case-insensitive matching:
// Keywords are case-insensitive
SHOW : S H O W ;
ENTITY : E N T I T Y ;
// Fragment rules for each letter
fragment S : [sS] ;
fragment H : [hH] ;
fragment O : [oO] ;
fragment W : [wW] ;
// ... etc
Parser rules use labeled alternatives for type-safe listener methods:
showStatement
: SHOW MODULES SEMI? # ShowModules
| SHOW ENTITIES (IN IDENTIFIER)? SEMI? # ShowEntities
| SHOW ENTITY qualifiedName SEMI? # ShowEntity
;
Each label generates a specific listener method (e.g., EnterShowModules, EnterShowEntities).
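Whitespace is discarded by the lexer with the skip command, so the parser never sees it (unlike -> channel(HIDDEN), which would keep the tokens on a hidden channel):
WS : [ \t\r\n]+ -> skip ;
ANTLR4 generates four files: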
| File | Purpose |
|---|---|
| mdl_lexer.go | Tokenizer - converts input to token stream |
| mdl_parser.go | Parser - builds parse tree from tokens |
| mdlparser_listener.go | Listener interface - callbacks for each rule |
| mdlparser_base_listener.go | Empty listener implementation for extension |
Regenerating the parser:
cd mdl/grammar
antlr4 -Dlanguage=Go -package parser -o parser MDLLexer.g4 MDLParser.g4
Or from the project root:
make grammar
Requirements:
- ANTLR4 tool (antlr4 command or Java JAR)
- Go target runtime (github.com/antlr4-go/antlr/v4)
Strongly-typed AST nodes representing MDL statements.
// Statement is the interface for all MDL statements
type Statement interface {
	statementNode()
}
// ShowStmt represents SHOW commands
type ShowStmt struct {
	Type   string        // MODULES, ENTITIES, ASSOCIATIONS, ENUMERATIONS
	Module string        // Optional: filter by module
	Name   QualifiedName // For SHOW ENTITY/ASSOCIATION
}
// CreateEntityStmt represents CREATE ENTITY
type CreateEntityStmt struct {
	Name       QualifiedName
	Persistent bool
	Attributes []Attribute
	Position   *Position
	Comment    string
	Doc        string
}
// QualifiedName represents Module.Name or just Name
type QualifiedName struct {
	Module string
	Name   string
}
The visitor walks the ANTLR parse tree and builds AST nodes.
Key patterns:
ANTLR generates a concrete context type for each labeled alternative and an interface type (such as IQualifiedNameContext) for each rule. Listener methods receive the concrete type; child rule contexts are returned as interfaces, so reaching type-specific methods may require a type assertion:
func (v *Visitor) EnterShowEntities(ctx *parser.ShowEntitiesContext) {
	stmt := &ast.ShowStmt{Type: "ENTITIES"}
	// Access IDENTIFIER token if present (IN clause)
	if id := ctx.IDENTIFIER(); id != nil {
		stmt.Module = id.GetText()
	}
	v.program.Statements = append(v.program.Statements, stmt)
}
Helper function for Module.Name parsing:
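func buildQualifiedName(ctx parser.IQualifiedNameContext) ast.QualifiedName {
	// Checked assertion: avoids a panic when ctx is nil or of an unexpected type.
	qn, ok := ctx.(*parser.QualifiedNameContext)
	if !ok || qn == nil {
		return ast.QualifiedName{}
	}
	ids := qn.AllIDENTIFIER()
	switch len(ids) {
	case 0:
		// Can happen on a partial parse tree after a syntax error.
		return ast.QualifiedName{}
	case 1:
		return ast.QualifiedName{Name: ids[0].GetText()}
	}
	return ast.QualifiedName{
		Module: ids[0].GetText(),
		Name:   ids[1].GetText(),
	}
}
Syntax errors are collected via a custom error listener: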
type ErrorListener struct {
	*antlr.DefaultErrorListener
	Errors []error
}
func (e *ErrorListener) SyntaxError(recognizer antlr.Recognizer, offendingSymbol interface{},
	line, column int, msg string, ex antlr.RecognitionException) {
	e.Errors = append(e.Errors, fmt.Errorf("line %d:%d %s", line, column, msg))
}
Executes AST statements against the modelsdk-go API.
type Executor struct {
	writer *mpr.Writer
	output io.Writer
}
func (e *Executor) Execute(stmt ast.Statement) error {
	switch s := stmt.(type) {
	case *ast.ConnectStmt:
		return e.executeConnect(s)
	case *ast.ShowStmt:
		return e.executeShow(s)
	case *ast.CreateEntityStmt:
		return e.executeCreateEntity(s)
	// ... other statement types
	default:
		// Every path must return; unknown statement types are reported as errors.
		return fmt.Errorf("unsupported statement type %T", stmt)
	}
}
Integration with modelsdk-go:
func (e *Executor) executeCreateEntity(stmt *ast.CreateEntityStmt) error {
	// Build domain model entity
	entity := &domainmodel.Entity{
		ID:   mpr.GenerateID(),
		Name: stmt.Name.Name,
		// ... other fields
	}
	// Get module and add entity
	module := e.getOrCreateModule(stmt.Name.Module)
	dm := module.DomainModel
	dm.Entities = append(dm.Entities, entity)
	return nil
}
Interactive read-eval-print loop for MDL commands.
type REPL struct {
	executor *executor.Executor
	input    io.Reader
	output   io.Writer
}
func (r *REPL) Run() error {
	scanner := bufio.NewScanner(r.input)
	for {
		fmt.Fprint(r.output, "mdl> ")
		if !scanner.Scan() {
			break
		}
		input := scanner.Text()
		prog, errs := visitor.Build(input)
		if len(errs) > 0 {
			// Report parse errors and prompt again
			fmt.Fprintf(r.output, "Parse errors: %v\n", errs)
			continue
		}
		for _, stmt := range prog.Statements {
			if err := r.executor.Execute(stmt); err != nil {
				fmt.Fprintf(r.output, "Error: %v\n", err)
			}
		}
	}
	// Err returns nil on a clean EOF and the read error otherwise.
	return scanner.Err()
}
Cobra-based command-line interface.
var rootCmd = &cobra.Command{
	Use:   "mxcli",
	Short: "Mendix CLI - Work with Mendix projects using MDL syntax",
	Run: func(cmd *cobra.Command, args []string) {
		commands, _ := cmd.Flags().GetString("command")
		if commands != "" {
			// Execute commands from -c flag
			exec := executor.New(os.Stdout)
			prog, errs := visitor.Build(commands)
			if len(errs) > 0 {
				// Do not execute a program that failed to parse
				fmt.Fprintf(os.Stderr, "Parse errors: %v\n", errs)
				os.Exit(1)
			}
			for _, stmt := range prog.Statements {
				if err := exec.Execute(stmt); err != nil {
					fmt.Fprintf(os.Stderr, "Error: %v\n", err)
					os.Exit(1)
				}
			}
		} else {
			// Start interactive REPL
			repl.New(os.Stdin, os.Stdout).Run()
		}
	},
}
| Consideration | ANTLR4 | Parser Combinators |
|---|---|---|
| Cross-language | ✅ Same grammar for Go, TS, Java | ❌ Rewrite per language |
| Grammar docs | ✅ EBNF-like, readable | |
| Error messages | ✅ Built-in recovery | |
| Performance | ✅ Optimized lexer/parser | ✅ Comparable |
| Tooling | ✅ ANTLR Lab, IDE plugins | |
- Listener: Callbacks fired during tree walk, simpler for AST building
- Visitor: Returns values from each node, better for expression evaluation
For MDL, statements are independent and don't need return value propagation, making the listener pattern more appropriate.
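The difference between the two patterns can be sketched without ANTLR at all. The example below is a minimal illustration, not the project's code: Node, StatementCollector, Walk, and Eval are hypothetical names invented for this sketch. The listener half accumulates statements as a side effect of a tree walk, while the visitor half returns a value from each node for its parent to combine.

```go
package main

import "fmt"

// Node is a hypothetical parse-tree node used only for this illustration.
type Node struct {
	Kind     string // e.g. "ShowEntities", "Add", "Num"
	Value    int    // only meaningful for "Num" nodes
	Children []*Node
}

// StatementCollector is a listener-style consumer: the walker calls Enter
// for every node and the collector accumulates state as a side effect.
type StatementCollector struct {
	Statements []string
}

// Enter records statement-level nodes and ignores everything else.
func (c *StatementCollector) Enter(n *Node) {
	if n.Kind == "ShowEntities" || n.Kind == "ShowModules" {
		c.Statements = append(c.Statements, n.Kind)
	}
}

// Walk performs a depth-first traversal and fires the callback on entry.
func Walk(n *Node, enter func(*Node)) {
	enter(n)
	for _, child := range n.Children {
		Walk(child, enter)
	}
}

// Eval is visitor-style: each call returns a value that the parent combines.
func Eval(n *Node) int {
	switch n.Kind {
	case "Num":
		return n.Value
	case "Add":
		sum := 0
		for _, child := range n.Children {
			sum += Eval(child)
		}
		return sum
	}
	return 0
}

func main() {
	program := &Node{Kind: "Program", Children: []*Node{
		{Kind: "ShowModules"},
		{Kind: "ShowEntities"},
	}}
	collector := &StatementCollector{}
	Walk(program, collector.Enter)
	fmt.Println(collector.Statements)

	expr := &Node{Kind: "Add", Children: []*Node{
		{Kind: "Num", Value: 2},
		{Kind: "Num", Value: 3},
	}}
	fmt.Println(Eval(expr))
}
```

Because MDL statements only need to be appended to a program in order, the callback form is enough; the value-returning form pays off when a parent node needs results from its children.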
MDL follows SQL conventions with case-insensitive keywords. ANTLR handles this via fragment rules:
SHOW : S H O W ;
fragment S : [sS] ;
This allows SHOW, show, Show, etc. to all match the same token.
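The observable effect of these rules can be mimicked with a few lines of plain Go. This is only a hand-written stand-in for illustration (the generated ANTLR lexer works differently internally), and the keywords table and Tokenize function are hypothetical names: keywords are recognized regardless of case, the original spelling is preserved in the token text, and anything else falls through to IDENTIFIER.

```go
package main

import (
	"fmt"
	"strings"
)

// keywords is a hypothetical subset of the MDL keyword set.
var keywords = map[string]bool{
	"SHOW": true, "ENTITIES": true, "ENTITY": true, "IN": true, "MODULES": true,
}

// Token pairs a token type with the original source text.
type Token struct {
	Type string
	Text string
}

// Tokenize splits the input on whitespace, matches keywords
// case-insensitively, and classifies every other word as IDENTIFIER.
func Tokenize(input string) []Token {
	var tokens []Token
	for _, word := range strings.Fields(input) {
		upper := strings.ToUpper(word)
		if keywords[upper] {
			tokens = append(tokens, Token{Type: upper, Text: word})
		} else {
			tokens = append(tokens, Token{Type: "IDENTIFIER", Text: word})
		}
	}
	return tokens
}

func main() {
	for _, tok := range Tokenize("show Entities IN MyModule") {
		fmt.Printf("%s(%q)\n", tok.Type, tok.Text)
	}
}
```

Note that the keyword check runs before the identifier fallback, which mirrors the requirement that keyword rules take priority over IDENTIFIER in the grammar.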
- Update grammar (MDLLexer.g4 for tokens, MDLParser.g4 for rules):
ddlStatement
: createStatement
| newStatement // Add new statement
;
newStatement
: NEW KEYWORD qualifiedName SEMI? # NewKeyword
;
NEW : N E W ;
KEYWORD : K E Y W O R D ;
- Regenerate parser:
make grammar
- Add AST type (ast/ast.go):
type NewKeywordStmt struct {
	Name QualifiedName
}
func (*NewKeywordStmt) statementNode() {}
- Update visitor (visitor/visitor.go):
func (v *Visitor) EnterNewKeyword(ctx *parser.NewKeywordContext) {
	stmt := &ast.NewKeywordStmt{
		Name: buildQualifiedName(ctx.QualifiedName()),
	}
	v.program.Statements = append(v.program.Statements, stmt)
}
- Update executor (executor/executor.go):
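func (e *Executor) Execute(stmt ast.Statement) error {
	switch s := stmt.(type) {
	// ... existing cases
	case *ast.NewKeywordStmt:
		return e.executeNewKeyword(s)
	default:
		// Every path must return; unknown statement types are reported as errors.
		return fmt.Errorf("unsupported statement type %T", stmt)
	}
}
func TestParseShowEntities(t *testing.T) {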
	prog, errs := visitor.Build("SHOW ENTITIES IN MyModule")
	if len(errs) > 0 {
		t.Fatalf("unexpected errors: %v", errs)
	}
	if len(prog.Statements) != 1 {
		t.Fatalf("expected 1 statement, got %d", len(prog.Statements))
	}
	show, ok := prog.Statements[0].(*ast.ShowStmt)
	if !ok {
		t.Fatalf("expected ShowStmt, got %T", prog.Statements[0])
	}
	if show.Type != "ENTITIES" || show.Module != "MyModule" {
		t.Errorf("unexpected statement: %+v", show)
	}
}
func TestExecuteShowEntities(t *testing.T) {
	// Create test MPR file
	// Connect executor
	// Execute SHOW ENTITIES
	// Verify output
}
Usually caused by:
- Typo in grammar keyword definition (e.g., ENTITIES : E N T I E S is missing letters)
- Lexer rule ordering issues (ANTLR prefers the longest match; when two rules match the same text, the rule defined first wins, so keyword rules must precede IDENTIFIER)
- Missing whitespace handling
When accessing ANTLR context methods, always use checked (comma-ok) type assertions:
// Wrong - will panic if ctx is nil or wrong type
ids := ctx.AllIDENTIFIER()
// Correct - check interface first
qn, ok := ctx.(*parser.QualifiedNameContext)
if !ok {
	return ast.QualifiedName{}
}
ids := qn.AllIDENTIFIER()
Check that:
- Keyword lexer rules are defined before the IDENTIFIER rule
- Keywords aren't being matched as identifiers
- Whitespace is properly skipped
Before execution, mxcli check runs AST-level semantic checks on microflow bodies via ValidateMicroflow(). These checks require no project connection; they operate purely on the parsed AST.
The microflowValidator struct walks the body and checks:
- Return value consistency: RETURN must provide a value when the microflow declares a return type; RETURN must not provide a value on void microflows (except RETURN empty).
- Return type plausibility: Scalar literals (string, integer, boolean, decimal) cannot be returned from entity-typed microflows.
- Return path coverage: All code paths must end with RETURN for non-void microflows. The bodyReturns() helper recursively checks whether the last statement in a body is a RETURN, or an IF/ELSE where both branches return.
- Variable scope: Variables declared inside IF/ELSE branches or ON ERROR bodies cannot be referenced after the branch ends. The checkBranchScoping() method collects variables declared inside branches and checks if subsequent statements reference them.
- Validation feedback: VALIDATION FEEDBACK must have a non-empty message template (CE0091).
This is separate from ValidateMicroflowBody() (in cmd_microflows_builder.go), which checks undeclared variable usage and runs during --references validation.
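The return path coverage rule is the most algorithmic of these checks, so a small sketch may help. The code below is an assumption-laden illustration rather than the actual validator: the Stmt type and its fields are hypothetical simplifications of the real AST, and only the recursive rule described above (last statement is a RETURN, or an IF/ELSE whose branches both return) is modeled.

```go
package main

import "fmt"

// Stmt is a hypothetical, simplified microflow statement for this sketch.
type Stmt struct {
	Kind string // "RETURN", "IF", or any other statement kind
	Then []Stmt // body of the IF branch
	Else []Stmt // body of the ELSE branch; empty when there is no ELSE
}

// bodyReturns reports whether every path through body ends in a RETURN.
// Only the last statement matters: it must be a RETURN, or an IF/ELSE
// in which both branches themselves return.
func bodyReturns(body []Stmt) bool {
	if len(body) == 0 {
		return false
	}
	last := body[len(body)-1]
	switch last.Kind {
	case "RETURN":
		return true
	case "IF":
		// A missing ELSE leaves a fall-through path, so it cannot cover all paths.
		return bodyReturns(last.Then) && bodyReturns(last.Else)
	}
	return false
}

func main() {
	ret := Stmt{Kind: "RETURN"}

	covered := []Stmt{{Kind: "IF", Then: []Stmt{ret}, Else: []Stmt{ret}}}
	fmt.Println("if/else both return:", bodyReturns(covered))

	uncovered := []Stmt{{Kind: "IF", Then: []Stmt{ret}}}
	fmt.Println("if without else:", bodyReturns(uncovered))
}
```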
The microflow builder (cmd_microflows_builder.go) converts MDL microflow AST nodes into Mendix microflow objects. A key aspect is variable type tracking.
The flowBuilder struct maintains a map[string]string called varTypes that tracks the type of each variable during microflow construction. This is essential for building qualified names in CHANGE statements.
Type Format:
- Single entity: "Module.Entity" (e.g., "MfTest.Product")
- List of entities: "List of Module.Entity" (e.g., "List of MfTest.Product")
Sources of Variable Types:
| Source | Registration | Type Format |
|---|---|---|
| Parameters (entity/list) | cmd_microflows_create.go | "Module.Entity" or "List of Module.Entity" |
| CREATE statement | addCreateObjectAction | "Module.Entity" (single) |
| RETRIEVE with LIMIT 1 | addRetrieveAction | "Module.Entity" (single) |
| RETRIEVE without LIMIT 1 | addRetrieveAction | "List of Module.Entity" (list) |
| FOREACH loop variable | addLoopStatement | Derived from list type |
FOREACH Loop Variable Derivation:
// If $ProductList is "List of MfTest.Product", then $Product is "MfTest.Product"
listType := fb.varTypes[s.ListVariable]
if strings.HasPrefix(listType, "List of ") {
	elementType := strings.TrimPrefix(listType, "List of ")
	fb.varTypes[s.LoopVariable] = elementType
}
Usage in CHANGE Statements:
The AttributeQualifiedName field in MemberChange is built by looking up the variable's entity type:
entityQN := fb.varTypes[s.Variable] // e.g., "MfTest.Product"
memberChange.AttributeQualifiedName = entityQN + "." + change.Attribute
// Result: "MfTest.Product.LastProcessedDate"
- RETRIEVE Type Depends on LIMIT: RETRIEVE with LIMIT 1 returns a single entity, otherwise it returns a list. The output variable must be registered accordingly. FOREACH loops require a list type to derive the element type.
- Variable Scope Sharing: The loopBuilder shares the same varTypes map with its parent, so loop variable registrations are visible to nested statements.
- Order of Operations: In addLoopStatement, the loop variable must be registered in varTypes before processing the loop body statements.
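The three points above can be tied together in a compact, standalone sketch. It is not the real builder: the flowBuilder here is a pared-down stand-in holding only varTypes, the method names registerRetrieve, registerLoopVar, and attributeQualifiedName are hypothetical, and variable names are stored without a leading $ purely as an assumption for the example.

```go
package main

import (
	"fmt"
	"strings"
)

const listPrefix = "List of "

// flowBuilder is a pared-down stand-in for the real builder (assumption).
type flowBuilder struct {
	varTypes map[string]string
}

// registerRetrieve records a RETRIEVE output variable: a single entity
// when LIMIT 1 is used, otherwise a list of that entity.
func (fb *flowBuilder) registerRetrieve(outputVar, entity string, limitOne bool) {
	if limitOne {
		fb.varTypes[outputVar] = entity
		return
	}
	fb.varTypes[outputVar] = listPrefix + entity
}

// registerLoopVar derives the loop variable type from the list type.
// It must run before the loop body is processed.
func (fb *flowBuilder) registerLoopVar(listVar, loopVar string) error {
	listType, ok := fb.varTypes[listVar]
	if !ok || !strings.HasPrefix(listType, listPrefix) {
		return fmt.Errorf("%s is not a known list variable", listVar)
	}
	fb.varTypes[loopVar] = strings.TrimPrefix(listType, listPrefix)
	return nil
}

// attributeQualifiedName builds "Module.Entity.Attribute" for a CHANGE.
func (fb *flowBuilder) attributeQualifiedName(variable, attribute string) (string, error) {
	entity, ok := fb.varTypes[variable]
	if !ok || strings.HasPrefix(entity, listPrefix) {
		return "", fmt.Errorf("%s is not a known single-entity variable", variable)
	}
	return entity + "." + attribute, nil
}

func main() {
	parent := &flowBuilder{varTypes: map[string]string{}}
	parent.registerRetrieve("ProductList", "MfTest.Product", false)

	if err := parent.registerLoopVar("ProductList", "Product"); err != nil {
		fmt.Println("error:", err)
		return
	}

	// The loop builder shares the parent's map, so it sees the loop variable.
	loop := &flowBuilder{varTypes: parent.varTypes}
	qn, err := loop.attributeQualifiedName("Product", "LastProcessedDate")
	fmt.Println(qn, err)
}
```

Sharing the map by reference rather than copying it is what makes the registration order matter: a loop body built before registerLoopVar runs would fail the lookup.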
CRITICAL: ANTLR parsers can return partial parse trees with nil nodes when there are syntax errors. Always check if grammar element getters return nil before calling methods on them.
When parsing malformed MDL like:
CREATE PERSISTENT ENTITY Test.Broken (
: String(100), -- missing attribute name
ValidAttr: Integer
);
The ANTLR parser creates an AttributeDefinitionContext for the malformed line, but AttributeName() returns nil because there's no valid identifier. Code like this will panic:
// DANGEROUS - will panic if AttributeName() returns nil
attr.Name = a.AttributeName().GetText()
Always add nil checks before accessing potentially-nil grammar elements:
// SAFE - check for nil first
if a.AttributeName() == nil {
	b.addErrorWithExample(
		"Invalid attribute: each attribute must have a name and type",
		` CREATE PERSISTENT ENTITY MyModule.Customer (
Name: String(100) NOT NULL,
Email: String(200),
Age: Integer
);`)
	continue
}
attr.Name = a.AttributeName().GetText()
Use Builder.addErrorWithExample() to provide helpful error messages that include example MDL syntax. This helps LLMs (and humans) understand the expected format:
func (b *Builder) addErrorWithExample(message, example string) {
	b.errors = append(b.errors, fmt.Errorf("%s\n\nExpected syntax:\n%s", message, example))
}
The error output will look like:
Invalid attribute: each attribute must have a name and type
Expected syntax:
CREATE PERSISTENT ENTITY MyModule.Customer (
Name: String(100) NOT NULL,
Email: String(200),
Age: Integer
);
Common ANTLR context methods that can return nil on parse errors:
- AttributeName() - missing attribute identifier
- EnumValueName() - missing enumeration value identifier
- QualifiedName() - missing or malformed qualified name
- DataType() - missing type specification
- Expression() - missing or malformed expression
Rule of thumb: Any grammar element that could be missing due to a syntax error should be checked for nil before use.
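One way to apply this rule consistently is to route every such getter through a small helper that returns an error instead of panicking. The sketch below is self-contained and deliberately avoids the ANTLR runtime: textNode, nameCtx, attrCtx, and attributeName are hypothetical stand-ins for the generated context types, and the example syntax in the error message is abbreviated.

```go
package main

import "fmt"

// textNode mimics the small part of an ANTLR context used here (hypothetical).
type textNode interface {
	GetText() string
}

// nameCtx stands in for a generated attribute-name context.
type nameCtx struct{ text string }

func (n *nameCtx) GetText() string { return n.text }

// attrCtx stands in for a generated attribute-definition context.
type attrCtx struct{ name *nameCtx }

// AttributeName returns nil when the parser could not match a name.
// It returns an untyped nil on purpose: wrapping a nil *nameCtx in the
// interface would make the caller's "== nil" check evaluate to false.
func (a *attrCtx) AttributeName() textNode {
	if a.name == nil {
		return nil
	}
	return a.name
}

// attributeName returns the attribute name, or an error that includes
// example syntax when the name is missing from the parse tree.
func attributeName(a *attrCtx) (string, error) {
	n := a.AttributeName()
	if n == nil {
		return "", fmt.Errorf("%s\n\nExpected syntax:\n%s",
			"Invalid attribute: each attribute must have a name and type",
			"Name: String(100) NOT NULL")
	}
	return n.GetText(), nil
}

func main() {
	attrs := []*attrCtx{
		{name: &nameCtx{text: "ValidAttr"}},
		{}, // simulates ": String(100)" with the name missing
	}
	for _, a := range attrs {
		name, err := attributeName(a)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("attribute:", name)
	}
}
```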