Skip to content

feat(parser): Snowflake MATCH_RECOGNIZE (SQL:2016 R010) (#483)#506

Merged
ajitpratap0 merged 2 commits intomainfrom
feat/snowflake-match-recognize-483
Apr 11, 2026
Merged

feat(parser): Snowflake MATCH_RECOGNIZE (SQL:2016 R010) (#483)#506
ajitpratap0 merged 2 commits intomainfrom
feat/snowflake-match-recognize-483

Conversation

@ajitpratap0
Copy link
Copy Markdown
Owner

Summary

Implements the SQL:2016 MATCH_RECOGNIZE clause for Snowflake (and Oracle). This was the last remaining Snowflake QA failure.

What now parses

```sql
SELECT * FROM stock_price MATCH_RECOGNIZE (
PARTITION BY symbol
ORDER BY ts
MEASURES MATCH_NUMBER() AS m, FIRST(ts) AS start_ts, LAST(ts) AS end_ts
ALL ROWS PER MATCH
PATTERN (UP+ DOWN+)
DEFINE UP AS price > PREV(price), DOWN AS price < PREV(price)
)
```

All sub-clauses: PARTITION BY, ORDER BY, MEASURES, ONE/ALL ROWS PER MATCH, AFTER MATCH SKIP, PATTERN (regex), DEFINE. Pattern alternation `(A | B)`, quantifiers `+`/`*`/`?`, nested parens.

Implementation

  • New file `match_recognize.go` (~200 lines): `parseMatchRecognize()` + `isMatchRecognizeKeyword()`. Each sub-clause parsed in order; PATTERN stored as raw string (no regex AST); MEASURES and DEFINE parse expressions via `parseExpression()`.
  • AST (`ast.go`): `MatchRecognizeClause` with `PartitionBy`, `OrderBy`, `Measures []MeasureDef`, `RowsPerMatch`, `AfterMatch`, `Pattern`, `Definitions []PatternDef`. Children() traverses all sub-expressions.
  • `TableReference.MatchRecognize` field + Children() update.
  • FIRST/LAST as function names: extended the keyword-as-function-call path in `parsePrimaryExpression` to accept `TokenTypeFirst` and `TokenTypeLast` followed by `(` — needed for `FIRST(ts)`, `LAST(ts)` in MEASURES.
  • Alias guard in FROM updated to exclude MATCH_RECOGNIZE.

QA corpus (on this branch)

Dialect Pass Total Rate
ClickHouse 84 84 100%
Snowflake 87 88 98.9%

Sole remaining SF failure is `positional_col_$1` which is the @stage refs fix in PR #505.

Test plan

  • `TestSnowflakeMatchRecognize`: 6 shapes (basic, ONE/ALL ROW, alias, alternation, measures, partition+order DESC)
  • `TestMatchRecognizeASTShape`: verifies all AST fields populated and Children() traversal reaches sub-expressions
  • `go test -race ./pkg/sql/parser/ ./pkg/sql/ast/ ./pkg/gosqlx/` green

Closes the MATCH_RECOGNIZE item in #483. After this + #505, Snowflake hits 100% of the QA corpus.

🤖 Generated with Claude Code

@vercel
Copy link
Copy Markdown

vercel bot commented Apr 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gosqlx-website Ready Ready Preview, Comment Apr 11, 2026 5:30pm

@github-actions github-actions bot added parser ast AST node changes labels Apr 11, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 11, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: Snowflake MATCH_RECOGNIZE Implementation

I've completed a comprehensive review of this MATCH_RECOGNIZE implementation. This is a high-quality PR that implements a complex SQL:2016 feature with attention to architectural patterns and comprehensive testing.

Review Checklist

  • Analyze new MATCH_RECOGNIZE parser implementation (match_recognize.go)
  • Review AST structure and integration (ast.go changes)
  • Examine parser integration points (select_subquery.go, expressions_literal.go)
  • Evaluate test coverage and quality (snowflake_match_recognize_test.go)
  • Check for potential bugs, race conditions, and security concerns
  • Assess performance implications and memory management
  • Verify adherence to project patterns and conventions

✅ Strengths

1. Excellent Architecture & Design

  • Proper dialect gating: isMatchRecognizeKeyword() correctly restricts to Snowflake/Oracle (match_recognize.go:24-30)
  • Clean AST design: MatchRecognizeClause struct properly models all sub-clauses with appropriate types (ast.go:2166-2175)
  • Perfect AST integration: Correctly added to TableReference.MatchRecognize field and Children() method (ast.go:250, 280-282)
  • Seamless parser integration: Well-integrated into table reference parsing flow (select_subquery.go:209-224)

2. Robust Parsing Logic

  • Handles optional clauses correctly: All sub-clauses are optional except PATTERN/DEFINE, with proper order handling
  • Smart pattern parsing: Raw text collection with proper parentheses depth tracking (match_recognize.go:168-188)
  • Expression integration: Proper use of parseExpression() for MEASURES and DEFINE conditions
  • Keyword-as-function support: Excellent enhancement to allow FIRST(ts), LAST(ts) in MEASURES (expressions_literal.go:109)

3. Comprehensive Test Coverage

  • 6 test scenarios: Covers basic usage, row matching modes, aliases, alternation, measures-only, and complex partitioning
  • AST verification: TestMatchRecognizeASTShape validates all fields populated and Children() traversal works
  • Real-world examples: Uses meaningful patterns like stock price analysis

🔍 Minor Issues & Suggestions

1. Memory Management (Low Priority)

Issue: No object pooling for MatchRecognizeClause despite project's heavy use of sync.Pool patterns.

Recommendation: Consider adding pooling in future iterations:

// Future enhancement
var matchRecognizePool = sync.Pool{
    New: func() interface{} { return &MatchRecognizeClause{} },
}

2. Error Handling Enhancement

Issue: Some error messages could be more specific (match_recognize.go:51)

Current: "( after MATCH_RECOGNIZE"
Suggestion: "opening ( after MATCH_RECOGNIZE keyword"

3. Pattern Validation (Enhancement)

Current behavior: Pattern stored as raw string without validation
Suggestion: Consider adding basic regex validation for common errors, though current approach is correct for an MVP.

4. Code Style (Minor)

Issue: Inconsistent string comparison methods:

  • Uses strings.EqualFold() for some keywords (match_recognize.go:104)
  • Uses p.isType() for others (match_recognize.go:59)

Recommendation: Standardize on p.isType() where possible for consistency.

🛡️ Security & Performance Assessment

Security: ✅ Excellent

  • No SQL injection vectors (all parsing uses tokenized input)
  • Proper input validation and bounds checking
  • No unsafe string operations

Performance: ✅ Good

  • Reuses existing expression parsing (no duplication)
  • Minimal allocations for pattern collection
  • No performance regressions expected

Race Conditions: ✅ Safe

  • Stateless parsing (no shared mutable state)
  • Thread-safe design consistent with existing parser
  • Recommend running task test:race to verify

📊 Code Quality Score: 9.2/10

Breakdown:

  • Architecture & Design: 10/10
  • Code Quality: 9/10
  • Test Coverage: 9/10
  • Documentation: 8/10
  • Performance: 9/10

🎯 Recommendations

Before Merge:

  1. Run standard checks: task test:race && task check to ensure all tests pass
  2. Verify QA corpus: Confirm 98.9% Snowflake success rate claim

Future Enhancements:

  1. Add object pooling for MatchRecognizeClause
  2. Consider pattern validation for better error messages
  3. Add more edge case tests (empty patterns, malformed syntax)

🏆 Overall Assessment

This is an excellent implementation of a complex SQL feature. The code demonstrates:

  • Strong understanding of the existing codebase patterns
  • Proper architectural integration
  • Comprehensive testing approach
  • Clean, readable implementation

The MATCH_RECOGNIZE implementation successfully completes Snowflake dialect support and should be merged after the standard test verification.

Verdict: APPROVE ✅ with minor suggestions for future improvements.


Job run

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 11, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: Snowflake MATCH_RECOGNIZE Implementation

I've completed a comprehensive analysis of this MATCH_RECOGNIZE implementation. This is a high-quality PR that implements a complex SQL:2016 feature with excellent attention to architectural patterns and comprehensive testing.

Review Checklist

  • Analyze new MATCH_RECOGNIZE parser implementation (match_recognize.go)
  • Review AST structure and integration (ast.go changes)
  • Examine parser integration points (select_subquery.go, expressions_literal.go)
  • Evaluate test coverage and quality (snowflake_match_recognize_test.go)
  • Check for potential bugs, race conditions, and security concerns
  • Assess performance implications and memory management
  • Verify adherence to project patterns and conventions
  • Run quality checks and tests (limited by permissions)

✅ Strengths

1. Excellent Architecture & Design

  • Proper dialect gating: isMatchRecognizeKeyword() correctly restricts to Snowflake/Oracle (match_recognize.go:24-30)
  • Clean AST design: MatchRecognizeClause struct properly models all sub-clauses with appropriate types (ast.go:2166-2175)
  • Perfect AST integration: Correctly added to TableReference.MatchRecognize field and Children() method (ast.go:250, 280-282)
  • Seamless parser integration: Well-integrated into table reference parsing flow (select_subquery.go:241-255)

2. Robust Parsing Logic

  • Handles optional clauses correctly: All sub-clauses are optional except PATTERN/DEFINE, with proper order handling
  • Smart pattern parsing: Raw text collection with proper parentheses depth tracking (match_recognize.go:168-188)
  • Expression integration: Proper use of parseExpression() for MEASURES and DEFINE conditions
  • Keyword-as-function support: Excellent enhancement to allow FIRST(ts), LAST(ts) in MEASURES (expressions_literal.go:108-109)

3. Comprehensive Test Coverage

  • 6 test scenarios: Covers basic usage, row matching modes, aliases, alternation, measures-only, and complex partitioning
  • AST verification: TestMatchRecognizeASTShape validates all fields populated and Children() traversal works
  • Real-world examples: Uses meaningful patterns like stock price analysis

🔍 Minor Issues & Suggestions

1. Memory Management (Low Priority)

Issue: No object pooling for MatchRecognizeClause despite project's heavy use of sync.Pool patterns.

Recommendation: Consider adding pooling in future iterations:

// Future enhancement
var matchRecognizePool = sync.Pool{
    New: func() interface{} { return &MatchRecognizeClause{} },
}

2. Error Handling Enhancement

Issue: Some error messages could be more specific (match_recognize.go:51)

Current: "( after MATCH_RECOGNIZE"
Suggestion: "opening ( after MATCH_RECOGNIZE keyword"

3. Pattern Validation (Enhancement)

Current behavior: Pattern stored as raw string without validation
Suggestion: Consider adding basic regex validation for common errors, though current approach is correct for an MVP.

4. Code Style (Minor)

Issue: Inconsistent string comparison methods:

  • Uses strings.EqualFold() for some keywords (match_recognize.go:104)
  • Uses p.isType() for others (match_recognize.go:59)

Recommendation: Standardize on p.isType() where possible for consistency.

🛡️ Security & Performance Assessment

Security: ✅ Excellent

  • No SQL injection vectors (all parsing uses tokenized input)
  • Proper input validation and bounds checking
  • No unsafe string operations

Performance: ✅ Good

  • Reuses existing expression parsing (no duplication)
  • Minimal allocations for pattern collection
  • No performance regressions expected

Race Conditions: ✅ Safe

  • Stateless parsing (no shared mutable state)
  • Thread-safe design consistent with existing parser
  • Recommend running task test:race to verify

📊 Code Quality Score: 9.2/10

Breakdown:

  • Architecture & Design: 10/10
  • Code Quality: 9/10
  • Test Coverage: 9/10
  • Documentation: 8/10
  • Performance: 9/10

🎯 Recommendations

Before Merge:

  1. Run standard checks: task test:race && task check to ensure all tests pass
  2. Verify QA corpus: Confirm 98.9% Snowflake success rate claim

Future Enhancements:

  1. Add object pooling for MatchRecognizeClause
  2. Consider pattern validation for better error messages
  3. Add more edge case tests (empty patterns, malformed syntax)

🏆 Overall Assessment

This is an excellent implementation of a complex SQL feature. The code demonstrates:

  • Strong understanding of the existing codebase patterns
  • Proper architectural integration
  • Comprehensive testing approach
  • Clean, readable implementation

The MATCH_RECOGNIZE implementation successfully completes Snowflake dialect support and should be merged after the standard test verification.

Verdict: APPROVE ✅ with minor suggestions for future improvements.


Job run

@ajitpratap0 ajitpratap0 merged commit 627e712 into main Apr 11, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ast AST node changes parser

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant