Skip to content

feat(parser): Snowflake @stage references in FROM (#483)#505

Merged
ajitpratap0 merged 2 commits intomainfrom
feat/snowflake-stage-refs-483
Apr 11, 2026
Merged

feat(parser): Snowflake @stage references in FROM (#483)#505
ajitpratap0 merged 2 commits intomainfrom
feat/snowflake-stage-refs-483

Conversation

@ajitpratap0
Copy link
Copy Markdown
Owner

Snowflake @stage ref support. QA: CH 100%, SF 97.7%. Part of #483.

🤖 Generated with Claude Code

@vercel
Copy link
Copy Markdown

vercel bot commented Apr 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gosqlx-website Ready Ready Preview, Comment Apr 11, 2026 5:22pm

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 11, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: Snowflake @stage references in FROM clause

I've completed my analysis of the PR. This implementation adds support for Snowflake @stage references in FROM clauses with appropriate test coverage.

Review Tasks:

  • Read and analyze the modified parser file
  • Review the new test file for coverage
  • Check adherence to GoSQLX coding standards
  • Assess potential bugs and edge cases
  • Evaluate performance considerations
  • Check security implications
  • Provide comprehensive feedback

Code Quality Assessment

Strengths:

  1. Consistent Integration: The @stage parsing logic fits well within the existing parseFromTableReference() flow (select_subquery.go:74-102)
  2. Proper Token Handling: Correctly identifies @stage references via TokenTypePlaceholder with "@" prefix check
  3. Path Support: Handles optional path suffixes like @stage/data/file.csv by parsing subsequent / tokens
  4. Function Arguments: Supports (FILE_FORMAT => ...) syntax by reusing existing parseFunctionCall() infrastructure
  5. Clean Test Coverage: Test file covers key scenarios (bare stage, with path, with format options, COPY INTO)

⚠️ Areas of Concern:

1. Security Considerations

  • Input Validation: No validation on stage name format or path components
  • Path Traversal: The path concatenation logic (lines 81-88) could potentially accept dangerous sequences like ../
  • Recommendation: Add validation to ensure stage names follow Snowflake naming conventions

2. Potential Bugs & Edge Cases

Critical Issue - Dialect Independence (select_subquery.go:74):

} else if p.isType(models.TokenTypePlaceholder) && strings.HasPrefix(p.currentToken.Token.Value, "@") {
  • This parsing logic runs for all dialects, not just Snowflake
  • Could incorrectly parse @variable in other SQL dialects as stage references
  • Recommendation: Add dialect check: p.dialect == string(keywords.DialectSnowflake)

Path Parsing Issues (select_subquery.go:81-88):

for p.isType(models.TokenTypeDiv) {
    stageName += "/"
    p.advance()
    if p.isIdentifier() || p.isType(models.TokenTypeKeyword) {
        stageName += p.currentToken.Token.Value
        p.advance()
    }
}
  • Missing error handling for malformed paths
  • Could infinite loop if TokenTypeDiv is followed by unexpected tokens
  • No bounds checking on path length

Token Boundary Issues:

  • The condition p.isIdentifier() || p.isType(models.TokenTypeKeyword) may be too permissive
  • Could accidentally consume tokens that should belong to subsequent clauses

3. Performance Considerations

  • String Concatenation: Uses += which is acceptable for the expected short path lengths
  • Reuses Infrastructure: Leverages existing parseFunctionCall() for arguments
  • ⚠️ Token Lookahead: No unnecessary backtracking or complex lookahead patterns

4. Test Coverage Analysis

Covered Scenarios:

  • Basic stage reference: @mystage
  • Stage with path: @mystage/data
  • Stage with format: @mystage (FILE_FORMAT => 'myfmt')
  • COPY INTO usage: COPY INTO my_table FROM @mystage

Missing Test Cases:

  • Multi-level paths: @db.schema.stage/path/to/file.csv
  • Error conditions (malformed stage names, invalid paths)
  • Stage references in JOIN clauses
  • Stage aliases: SELECT * FROM @stage AS s
  • Other SQL dialects (should NOT parse @stage)

5. Architecture Compliance

  • Follows Parser Patterns: Consistent with existing table reference parsing
  • AST Integration: Properly sets TableReference.Name and TableReference.TableFunc
  • Error Handling: Uses standard p.expectedError() patterns

Recommendations

High Priority:

  1. Add Dialect Check: Wrap @stage parsing in if p.dialect == string(keywords.DialectSnowflake)
  2. Improve Path Validation: Add bounds checking and validate path components
  3. Add Error Cases to Tests: Test malformed stage references and wrong dialects

Medium Priority:

  1. Enhanced Test Coverage: Add tests for stage aliases, JOIN clauses, and complex paths
  2. Input Sanitization: Validate stage names against Snowflake naming rules

Low Priority:

  1. Performance: Consider using strings.Builder for longer paths (though current usage is fine)

Overall Assessment

This is a solid implementation that correctly adds Snowflake @stage support with good test coverage. The main concern is the lack of dialect restriction, which could cause parsing issues in non-Snowflake contexts. With the recommended fixes, this feature would be production-ready.

Score: 7.5/10 - Good implementation with one critical fix needed for dialect safety.


@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 11, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: Snowflake @stage references in FROM clause

I've completed my comprehensive review of this PR. The implementation adds Snowflake @stage references support with good test coverage and has addressed the critical dialect safety issue from the previous review.

Review Tasks:

  • Read and analyze the modified parser file
  • Review the new test file for coverage
  • Check adherence to GoSQLX coding standards
  • Assess potential bugs and edge cases
  • Evaluate performance considerations
  • Check security implications
  • Provide comprehensive feedback

Code Quality Assessment

Excellent Improvements from Previous Review:

  1. Critical Fix Applied: The dialect safety issue has been properly addressed in select_subquery.go:74-75:

    } else if p.dialect == string(keywords.DialectSnowflake) &&
        p.isType(models.TokenTypePlaceholder) && strings.HasPrefix(p.currentToken.Token.Value, "@") {

    This ensures @stage parsing only occurs in Snowflake dialect, preventing conflicts with other SQL dialects.

  2. Proper Test Coverage: Added TestStageRefsNotInOtherDialects which explicitly tests that @variables in PostgreSQL, MySQL, and SQL Server are NOT parsed as stage references.

Strong Implementation Qualities:

  1. Consistent Integration: The implementation fits well within the existing parseFromTableReference() flow and follows established patterns
  2. Proper Token Handling: Correctly identifies @stage references via TokenTypePlaceholder with "@" prefix check
  3. Path Support: Handles optional path suffixes like @stage/data/file.csv by parsing subsequent / tokens (lines 83-90)
  4. Function Arguments: Supports (FILE_FORMAT => ...) syntax by reusing existing parseFunctionCall() infrastructure (lines 98-104)
  5. Code Style: Follows GoSQLX conventions with proper comments and error handling
  6. AST Integration: Correctly populates TableReference.Name and TableReference.TableFunc fields

⚠️ Minor Areas for Consideration:

1. Path Parsing Edge Cases (select_subquery.go:83-90)

The current path parsing logic is robust but has some potential edge cases:

for p.isType(models.TokenTypeDiv) {
    stageName += "/"
    p.advance()
    if p.isIdentifier() || p.isType(models.TokenTypeKeyword) {
        stageName += p.currentToken.Token.Value
        p.advance()
    }
}

Potential Issues:

  • If / is followed by EOF or unexpected tokens, the path component would be empty
  • No validation that path components are valid identifiers
  • Could theoretically parse @stage/// as a valid path

Impact: Low - Snowflake's own validation would catch invalid paths, and the parser gracefully handles these cases.

2. Missing Test Scenarios

The test coverage is good but could be enhanced with:

  • Multi-level paths: @db.schema.stage/path/to/file.csv
  • Edge cases: @stage/ (trailing slash), @stage// (double slash)
  • Complex stage arguments with nested expressions
  • Stage references in JOIN clauses (currently only tests FROM clause)

3. Performance & Security Assessment

Performance: ✅ Excellent

  • String concatenation with += is appropriate for expected short paths
  • Leverages existing parseFunctionCall() infrastructure efficiently
  • No unnecessary token lookahead or backtracking
  • Dialect check happens early to avoid unnecessary processing

Security: ✅ Good

  • Properly gated to Snowflake dialect only
  • No SQL injection concerns (stage names are parsed, not executed)
  • Path parsing doesn't introduce buffer overflow risks
  • Input validation is delegated to Snowflake (appropriate design)

4. Adherence to GoSQLX Standards

Architecture Compliance: ✅ Excellent

  • Follows established parser patterns consistently
  • Uses proper error handling with p.expectedError()
  • Maintains clean separation of concerns
  • Integrates seamlessly with existing AST structure
  • Follows memory management patterns (no pool violations)

Code Quality: ✅ Good

  • Clear, descriptive comments explaining Snowflake-specific behavior
  • Consistent variable naming (stageName)
  • Proper token advancement patterns
  • No magic numbers or hardcoded strings

Overall Assessment

This is a high-quality implementation that successfully adds Snowflake @stage reference support. The code is well-integrated, properly tested, and follows GoSQLX conventions. The critical dialect safety issue from the previous review has been fixed.

Strengths:

  • ✅ Dialect-specific parsing prevents conflicts
  • ✅ Comprehensive test coverage including negative cases
  • ✅ Excellent integration with existing parser infrastructure
  • ✅ Good performance characteristics
  • ✅ Secure implementation with appropriate input handling

Minor Recommendations:

  1. Consider adding test cases for edge cases in path parsing
  2. Could add validation for malformed stage names (though not strictly necessary)

Final Score: 9/10 - Excellent implementation ready for production use.


@ajitpratap0 ajitpratap0 merged commit 387cca8 into main Apr 11, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant