
Commit 85ad3e5

ajitpratap0, Ajit Pratap Singh, and claude authored
feat: implement GROUPING SETS, ROLLUP, CUBE, and MERGE statement support (SQL-99 T431 + SQL:2003 F312) (#119)
* feat: implement GROUPING SETS, ROLLUP, CUBE support (SQL-99 T431)

Implement SQL-99 advanced grouping operations for aggregate queries:

## New Features
- ROLLUP(col1, col2, ...) - hierarchical subtotals (SQL-99 syntax)
- CUBE(col1, col2, ...) - all possible subtotal combinations (SQL-99 syntax)
- GROUPING SETS((a,b), (a), ()) - explicit grouping set specification
- GROUP BY cols WITH ROLLUP/CUBE - MySQL syntax support

## Implementation Details

### AST Nodes (ast.go)
- RollupExpression: stores column list for ROLLUP operation
- CubeExpression: stores column list for CUBE operation
- GroupingSetsExpression: stores list of grouping sets (including empty sets)

### Parser (parser.go)
- parseGroupingExpressionList(): shared helper for ROLLUP/CUBE parsing
- parseRollup(): parses ROLLUP(columns) with validation
- parseCube(): parses CUBE(columns) with validation
- parseGroupingSets(): parses GROUPING SETS with nested sets support
- Updated parseGroupByClause() to detect and route to correct parser
- Added MySQL syntax support: GROUP BY cols WITH ROLLUP/CUBE

### Tokenizer (tokenizer.go)
- Added ROLLUP, CUBE, GROUPING, SETS as keyword token types
- Added "GROUPING SETS" compound keyword support

### Keywords (keywords.go, categories.go)
- Registered keywords in ADDITIONAL_KEYWORDS
- Added to DMLKeywords and CompoundKeywords maps

## Validation
- Empty ROLLUP() returns error: "ROLLUP requires at least one expression"
- Empty CUBE() returns error: "CUBE requires at least one expression"
- Empty set in GROUPING SETS(()) is valid (SQL-99 compliant for grand total)

## Tests
- 7 formal test cases in parser_coverage_test.go
- Tests cover valid syntax, empty validation, and mixed operations
- All integration tests pass including MySQL WITH ROLLUP syntax

Closes #67 (Phase 1: GROUPING SETS, ROLLUP, CUBE)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: implement MERGE statement support (SQL:2003 F312)

Add comprehensive MERGE statement parsing with full SQL:2003 compliance:

AST Nodes:
- MergeStatement: Target/source tables, ON condition, WHEN clauses
- MergeWhenClause: MATCHED, NOT MATCHED, NOT MATCHED BY SOURCE
- MergeAction: UPDATE (with SET clauses), INSERT (with VALUES), DELETE
- SetClause: Column assignment in UPDATE actions

Parser Features:
- MERGE INTO target USING source ON condition syntax
- Multiple WHEN clauses with AND conditions
- WHEN MATCHED THEN UPDATE SET / DELETE
- WHEN NOT MATCHED THEN INSERT (columns) VALUES (values)
- WHEN NOT MATCHED BY SOURCE THEN UPDATE/DELETE
- INSERT DEFAULT VALUES support
- Qualified column names in SET clauses (t.column = s.value)
- Case-insensitive keyword handling

Tokenizer Updates:
- Added DML keywords: INSERT, UPDATE, DELETE, INTO, VALUES, SET, DEFAULT
- Added MERGE keywords: MERGE, MATCHED, SOURCE, TARGET
- Updated PostgreSQL test expectations for new keyword types

Test Coverage:
- 12 comprehensive tests covering all MERGE scenarios
- Error case validation (INSERT in MATCHED, DELETE in NOT MATCHED)
- Benchmarks: 2.9M ops/sec (simple), 1.0M ops/sec (complex)

Performance validated with race detection - zero race conditions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Ajit Pratap Singh <ajitpratapsingh@Ajits-Mac-mini.local>
Co-authored-by: Claude <noreply@anthropic.com>
1 parent a50a055 commit 85ad3e5

8 files changed

Lines changed: 1429 additions & 18 deletions

pkg/sql/ast/ast.go

Lines changed: 131 additions & 0 deletions
@@ -230,6 +230,49 @@ func nodifyExpressions(exprs []Expression) []Node {
 	return nodes
 }
 
+// RollupExpression represents ROLLUP(col1, col2, ...) in GROUP BY clause
+// ROLLUP generates hierarchical grouping sets from right to left
+// Example: ROLLUP(a, b, c) generates grouping sets:
+//
+//	(a, b, c), (a, b), (a), ()
+type RollupExpression struct {
+	Expressions []Expression
+}
+
+func (r *RollupExpression) expressionNode() {}
+func (r RollupExpression) TokenLiteral() string { return "ROLLUP" }
+func (r RollupExpression) Children() []Node { return nodifyExpressions(r.Expressions) }
+
+// CubeExpression represents CUBE(col1, col2, ...) in GROUP BY clause
+// CUBE generates all possible combinations of grouping sets
+// Example: CUBE(a, b) generates grouping sets:
+//
+//	(a, b), (a), (b), ()
+type CubeExpression struct {
+	Expressions []Expression
+}
+
+func (c *CubeExpression) expressionNode() {}
+func (c CubeExpression) TokenLiteral() string { return "CUBE" }
+func (c CubeExpression) Children() []Node { return nodifyExpressions(c.Expressions) }
+
+// GroupingSetsExpression represents GROUPING SETS(...) in GROUP BY clause
+// Allows explicit specification of grouping sets
+// Example: GROUPING SETS((a, b), (a), ())
+type GroupingSetsExpression struct {
+	Sets [][]Expression // Each inner slice is one grouping set
+}
+
+func (g *GroupingSetsExpression) expressionNode() {}
+func (g GroupingSetsExpression) TokenLiteral() string { return "GROUPING SETS" }
+func (g GroupingSetsExpression) Children() []Node {
+	children := make([]Node, 0)
+	for _, set := range g.Sets {
+		children = append(children, nodifyExpressions(set)...)
+	}
+	return children
+}
+
 // Identifier represents a column or table name
 type Identifier struct {
 	Name string
@@ -845,6 +888,94 @@ func (i *IndexColumn) expressionNode() {}
 func (i IndexColumn) TokenLiteral() string { return i.Column }
 func (i IndexColumn) Children() []Node { return nil }
 
+// MergeStatement represents a MERGE statement (SQL:2003 F312)
+// Syntax: MERGE INTO target USING source ON condition
+//
+//	WHEN MATCHED THEN UPDATE/DELETE
+//	WHEN NOT MATCHED THEN INSERT
+//	WHEN NOT MATCHED BY SOURCE THEN UPDATE/DELETE
+type MergeStatement struct {
+	TargetTable TableReference // The table being merged into
+	TargetAlias string // Optional alias for target
+	SourceTable TableReference // The source table or subquery
+	SourceAlias string // Optional alias for source
+	OnCondition Expression // The join/match condition
+	WhenClauses []*MergeWhenClause // List of WHEN clauses
+}
+
+func (m *MergeStatement) statementNode() {}
+func (m MergeStatement) TokenLiteral() string { return "MERGE" }
+func (m MergeStatement) Children() []Node {
+	children := []Node{&m.TargetTable, &m.SourceTable}
+	if m.OnCondition != nil {
+		children = append(children, m.OnCondition)
+	}
+	for _, when := range m.WhenClauses {
+		children = append(children, when)
+	}
+	return children
+}
+
+// MergeWhenClause represents a WHEN clause in a MERGE statement
+// Types: MATCHED, NOT_MATCHED, NOT_MATCHED_BY_SOURCE
+type MergeWhenClause struct {
+	Type string // "MATCHED", "NOT_MATCHED", "NOT_MATCHED_BY_SOURCE"
+	Condition Expression // Optional AND condition
+	Action *MergeAction // The action to perform (UPDATE/INSERT/DELETE)
+}
+
+func (w *MergeWhenClause) expressionNode() {}
+func (w MergeWhenClause) TokenLiteral() string { return "WHEN " + w.Type }
+func (w MergeWhenClause) Children() []Node {
+	children := make([]Node, 0)
+	if w.Condition != nil {
+		children = append(children, w.Condition)
+	}
+	if w.Action != nil {
+		children = append(children, w.Action)
+	}
+	return children
+}
+
+// MergeAction represents the action in a WHEN clause
+// ActionType: UPDATE, INSERT, DELETE
+type MergeAction struct {
+	ActionType string // "UPDATE", "INSERT", "DELETE"
+	SetClauses []SetClause // For UPDATE: SET column = value pairs
+	Columns []string // For INSERT: column list
+	Values []Expression // For INSERT: value list
+	DefaultValues bool // For INSERT: use DEFAULT VALUES
+}
+
+func (a *MergeAction) expressionNode() {}
+func (a MergeAction) TokenLiteral() string { return a.ActionType }
+func (a MergeAction) Children() []Node {
+	children := make([]Node, 0)
+	for _, set := range a.SetClauses {
+		set := set // G601: Create local copy
+		children = append(children, &set)
+	}
+	for _, val := range a.Values {
+		children = append(children, val)
+	}
+	return children
+}
+
+// SetClause represents a SET clause in UPDATE (also used in MERGE UPDATE)
+type SetClause struct {
+	Column string
+	Value Expression
+}
+
+func (s *SetClause) expressionNode() {}
+func (s SetClause) TokenLiteral() string { return s.Column }
+func (s SetClause) Children() []Node {
+	if s.Value != nil {
+		return []Node{s.Value}
+	}
+	return nil
+}
+
 // AST represents the root of the Abstract Syntax Tree
 type AST struct {
 	Statements []Statement
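
The MergeStatement comment pairs each WHEN clause type with the actions it allows, and the commit message mentions error cases for INSERT in MATCHED and DELETE in NOT MATCHED. The following is a minimal sketch of such a check, assuming only the string tags documented on MergeWhenClause.Type and MergeAction.ActionType. The `whenClause` type and `validateWhen` function are hypothetical stand-ins; the real validation lives in parser.go, which is not shown in this diff.

```go
package main

import "fmt"

// whenClause is a stripped-down stand-in for MergeWhenClause that keeps
// only the two string tags (hypothetical type, for illustration).
type whenClause struct {
	Type       string // "MATCHED", "NOT_MATCHED", "NOT_MATCHED_BY_SOURCE"
	ActionType string // "UPDATE", "INSERT", "DELETE"
}

// allowedActions encodes the pairings listed in the MergeStatement comment.
var allowedActions = map[string]map[string]bool{
	"MATCHED":               {"UPDATE": true, "DELETE": true},
	"NOT_MATCHED":           {"INSERT": true},
	"NOT_MATCHED_BY_SOURCE": {"UPDATE": true, "DELETE": true},
}

// validateWhen reports an error when the action is not permitted for the
// clause type, or when the clause type itself is unknown.
func validateWhen(w whenClause) error {
	actions, ok := allowedActions[w.Type]
	if !ok {
		return fmt.Errorf("unknown WHEN clause type %q", w.Type)
	}
	if !actions[w.ActionType] {
		return fmt.Errorf("%s is not allowed in WHEN %s", w.ActionType, w.Type)
	}
	return nil
}

func main() {
	clauses := []whenClause{
		{Type: "MATCHED", ActionType: "UPDATE"},
		{Type: "MATCHED", ActionType: "INSERT"},
		{Type: "NOT_MATCHED", ActionType: "DELETE"},
	}
	for _, c := range clauses {
		fmt.Printf("WHEN %s THEN %s: %v\n", c.Type, c.ActionType, validateWhen(c))
	}
}
```

Because Type and ActionType are plain strings in the AST, a lookup table like this is one way to keep the allowed pairings in a single place; typed constants would be an alternative.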

pkg/sql/keywords/categories.go

Lines changed: 11 additions & 6 deletions
@@ -50,16 +50,21 @@ func (k *Keywords) initialize() {
 		"NULLS": models.TokenTypeKeyword,
 		"FIRST": models.TokenTypeKeyword,
 		"LAST": models.TokenTypeKeyword,
+		"ROLLUP": models.TokenTypeKeyword, // SQL-99 grouping operation
+		"CUBE": models.TokenTypeKeyword, // SQL-99 grouping operation
+		"GROUPING": models.TokenTypeKeyword, // SQL-99 GROUPING SETS
+		"SETS": models.TokenTypeKeyword, // SQL-99 GROUPING SETS
 	}
 
 	// Initialize compound keywords
 	k.CompoundKeywords = map[string]models.TokenType{
-		"FULL JOIN":    models.TokenTypeKeyword,
-		"CROSS JOIN":   models.TokenTypeKeyword,
-		"NATURAL JOIN": models.TokenTypeKeyword,
-		"GROUP BY":     models.TokenTypeKeyword,
-		"ORDER BY":     models.TokenTypeKeyword,
-		"LEFT JOIN":    models.TokenTypeKeyword,
+		"FULL JOIN":     models.TokenTypeKeyword,
+		"CROSS JOIN":    models.TokenTypeKeyword,
+		"NATURAL JOIN":  models.TokenTypeKeyword,
+		"GROUP BY":      models.TokenTypeKeyword,
+		"ORDER BY":      models.TokenTypeKeyword,
+		"LEFT JOIN":     models.TokenTypeKeyword,
+		"GROUPING SETS": models.TokenTypeKeyword, // SQL-99 grouping operation
 	}
 
 	// Add all keywords to the main keyword map

pkg/sql/keywords/keywords.go

Lines changed: 12 additions & 0 deletions
@@ -110,6 +110,17 @@ var ADDITIONAL_KEYWORDS = []Keyword{
 	{Word: "LEAD", Type: models.TokenTypeKeyword, Reserved: false, ReservedForTableAlias: false},
 	{Word: "FIRST_VALUE", Type: models.TokenTypeKeyword, Reserved: false, ReservedForTableAlias: false},
 	{Word: "LAST_VALUE", Type: models.TokenTypeKeyword, Reserved: false, ReservedForTableAlias: false},
+	// SQL-99 grouping operations
+	{Word: "ROLLUP", Type: models.TokenTypeKeyword, Reserved: true, ReservedForTableAlias: false},
+	{Word: "CUBE", Type: models.TokenTypeKeyword, Reserved: true, ReservedForTableAlias: false},
+	{Word: "GROUPING", Type: models.TokenTypeKeyword, Reserved: true, ReservedForTableAlias: false},
+	{Word: "SETS", Type: models.TokenTypeKeyword, Reserved: true, ReservedForTableAlias: false},
+	// MERGE statement keywords (SQL:2003 F312)
+	{Word: "MERGE", Type: models.TokenTypeKeyword, Reserved: true, ReservedForTableAlias: true},
+	{Word: "USING", Type: models.TokenTypeKeyword, Reserved: true, ReservedForTableAlias: true},
+	{Word: "MATCHED", Type: models.TokenTypeKeyword, Reserved: true, ReservedForTableAlias: false},
+	{Word: "SOURCE", Type: models.TokenTypeKeyword, Reserved: false, ReservedForTableAlias: false},
+	{Word: "TARGET", Type: models.TokenTypeKeyword, Reserved: false, ReservedForTableAlias: false},
 }
 
 // addKeywordsWithCategory is a helper method to add multiple keywords
@@ -130,6 +141,7 @@ func New(dialect SQLDialect, ignoreCase bool) *Keywords {
 	k.CompoundKeywords["FULL JOIN"] = models.TokenTypeKeyword
 	k.CompoundKeywords["CROSS JOIN"] = models.TokenTypeKeyword
 	k.CompoundKeywords["NATURAL JOIN"] = models.TokenTypeKeyword
+	k.CompoundKeywords["GROUPING SETS"] = models.TokenTypeKeyword // SQL-99 grouping operation
 
 	// Add standard keywords
 	k.addKeywordsWithCategory(RESERVED_FOR_TABLE_ALIAS)
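
The new MERGE keyword entries differ in their Reserved and ReservedForTableAlias flags. The sketch below assumes, from the field name alone, that a word flagged ReservedForTableAlias cannot be taken as an unquoted table alias; the diff does not show how the parser actually consults the flag. The `keyword` struct is a reduced, hypothetical copy of the Keyword entries above (the Type field is omitted), and `usableAsTableAlias` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// keyword is a reduced copy of the entries added to ADDITIONAL_KEYWORDS.
type keyword struct {
	Word                  string
	Reserved              bool
	ReservedForTableAlias bool
}

// mergeKeywords repeats the flag values from the MERGE block in the diff.
var mergeKeywords = []keyword{
	{Word: "MERGE", Reserved: true, ReservedForTableAlias: true},
	{Word: "USING", Reserved: true, ReservedForTableAlias: true},
	{Word: "MATCHED", Reserved: true, ReservedForTableAlias: false},
	{Word: "SOURCE", Reserved: false, ReservedForTableAlias: false},
	{Word: "TARGET", Reserved: false, ReservedForTableAlias: false},
}

// usableAsTableAlias reports whether word may follow a table name as an
// alias under the assumption stated above. Words that are not in the table
// are treated as ordinary identifiers.
func usableAsTableAlias(word string) bool {
	for _, k := range mergeKeywords {
		if strings.EqualFold(k.Word, word) {
			return !k.ReservedForTableAlias
		}
	}
	return true
}

func main() {
	for _, w := range []string{"using", "source", "target", "t"} {
		fmt.Printf("%-6s usable as alias: %v\n", w, usableAsTableAlias(w))
	}
}
```

Under this reading, flagging USING matters for `MERGE INTO target USING source`: if USING could be an alias, a parser that accepts an optional alias after the target table could swallow it as the alias instead of treating it as the start of the source clause.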
