Skip to content

feat: add SQL injection pattern detection (pkg/sql/security)#121

Merged
ajitpratap0 merged 3 commits into
mainfrom
feat/sql-injection-detection
Nov 25, 2025
Merged

feat: add SQL injection pattern detection (pkg/sql/security)#121
ajitpratap0 merged 3 commits into
mainfrom
feat/sql-injection-detection

Conversation

@ajitpratap0

Copy link
Copy Markdown
Owner

Summary

Add new pkg/sql/security package for detecting SQL injection vulnerabilities in parsed SQL AST and raw SQL strings.

Detection Patterns (20+ patterns)

Pattern Type Examples Severity
Tautology 1=1, 'a'='a', col=col, OR 1=1 CRITICAL
Comment Bypass --, /* */, # MEDIUM-HIGH
Time-Based Blind SLEEP(), pg_sleep(), WAITFOR DELAY, BENCHMARK() HIGH
UNION-Based NULL placeholders, system table access HIGH-CRITICAL
Out-of-Band xp_cmdshell, LOAD_FILE(), INTO OUTFILE CRITICAL
Dangerous Functions EXEC(), sp_executesql MEDIUM-CRITICAL

Features

  • Severity Levels: CRITICAL, HIGH, MEDIUM, LOW
  • Configurable Threshold: Filter findings by minimum severity
  • Dual Scanning: AST-based (accurate) and raw SQL (edge cases)
  • Helper Methods: HasCritical(), HasHighOrAbove(), IsClean()
  • JSON-Compatible: All findings serializable for API integration

Example Usage

scanner := security.NewScanner()
result := scanner.Scan(ast)

if result.HasCritical() {
    for _, finding := range result.Findings {
        fmt.Printf("%s: %s\n", finding.Severity, finding.Description)
    }
}

Performance

Benchmark Time Ops/sec
Simple AST scan ~30ns 38M
Complex AST scan ~175ns 7M
Raw SQL scan ~16μs 69K

Test plan

  • 24 test cases covering all detection patterns
  • Benchmarks for performance validation
  • False positive tests for legitimate queries
  • Race detection passes
  • Pre-commit hooks pass

Closes #71

🤖 Generated with Claude Code

Add new security package for detecting SQL injection vulnerabilities
in parsed SQL AST and raw SQL strings.

Detection patterns:
- Tautologies: 1=1, 'a'='a', col=col, OR 1=1
- Comment-based bypass: --, /**/, #
- Time-based blind: SLEEP(), pg_sleep(), WAITFOR DELAY, BENCHMARK()
- UNION-based: NULL placeholders, system table access
- Out-of-band: xp_cmdshell, LOAD_FILE(), INTO OUTFILE
- Dangerous functions: EXEC(), sp_executesql

Features:
- Severity levels: CRITICAL, HIGH, MEDIUM, LOW
- Configurable minimum severity threshold
- AST-based scanning for accurate detection
- Raw SQL regex scanning for edge cases
- Helper methods: HasCritical(), HasHighOrAbove(), IsClean()

Performance:
- AST scan: ~30ns simple, ~175ns complex queries
- Raw SQL scan: ~16μs per query

Test coverage: 24 tests covering all detection patterns

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@claude

claude Bot commented Nov 25, 2025

Copy link
Copy Markdown

Claude finished @ajitpratap0's task —— View job


PR Review: SQL Injection Detection Package

I've conducted a thorough analysis of the new pkg/sql/security package. This is a well-implemented security feature with comprehensive detection patterns. Here's my detailed feedback:

Review Checklist

  • Read and analyze scanner.go implementation
  • Review scanner_test.go for test coverage and quality
  • Evaluate code quality and adherence to GoSQLX patterns
  • Check for potential bugs and security issues
  • Assess performance considerations
  • Review detection patterns for completeness and accuracy
  • Provide comprehensive feedback and recommendations

Strengths

Architecture & Design

  • Clean API: Well-structured with clear separation between AST-based and raw SQL scanning
  • Severity Model: Excellent use of severity levels (CRITICAL/HIGH/MEDIUM/LOW) with configurable filtering
  • JSON Compatibility: All structs properly tagged for API integration
  • Helper Methods: Convenient methods like HasCritical(), HasHighOrAbove(), IsClean()
  • GoSQLX Integration: Properly integrates with existing AST types and patterns

Detection Patterns

  • Comprehensive Coverage: 20+ patterns covering major injection vectors
  • Accurate Tautology Detection: Covers literals (1=1, 'a'='a') and identifiers (col=col)
  • Advanced Patterns: OR-based injection, UNION-based extraction, time-based blind attacks
  • System Table Access: Detects access to information_schema, sys, mysql, pg_catalog
  • Dangerous Functions: Good coverage of risky functions like xp_cmdshell, LOAD_FILE()

Code Quality

  • Error Handling: Proper nil checks throughout
  • Memory Efficiency: No obvious memory leaks or excessive allocations
  • Thread Safety: No shared state, safe for concurrent use
  • Documentation: Excellent package documentation with examples

Test Coverage

  • Comprehensive: 24 test cases covering all major patterns
  • Edge Cases: Tests for nested expressions, false positives, legitimate queries
  • Benchmarks: Performance validation with realistic scenarios
  • Race Detection: Passes race detection as mentioned in PR

⚠️ Issues & Recommendations

Critical Issues

  1. UpdateStatement Field Access (scanner.go:232-240)

    // Check SET values - use Assignments field
    for _, assignment := range stmt.Assignments {
        s.scanExpressionForDangerousFunctions(assignment.Value, result)
    }
    
    // Also check Updates field for backward compatibility
    for _, update := range stmt.Updates {
        s.scanExpressionForDangerousFunctions(update.Value, result)
    }

    Issue: The code assumes both Assignments and Updates fields exist on UpdateStatement, but I need to verify these field names match the actual AST structure. This could cause runtime panics if fields don't exist.

    Recommendation: Verify AST field names or use type assertions with proper error handling.

  2. Regex Pattern Compilation (scanner.go:106-135)

    func (s *Scanner) initPatterns() {
        s.patterns[PatternTimeBased] = []*regexp.Regexp{
            regexp.MustCompile(`(?i)\bSLEEP\s*\(`),
            // ... more patterns
        }
    }

    Issue: Regex compilation happens on every scanner creation without error handling. Malformed regex could panic.

    Recommendation: Handle compilation errors and consider pre-compiling patterns as package-level variables.

High Priority Issues

  1. String-to-Severity Comparison Vulnerability (scanner.go:585-593)

    func (s *Scanner) shouldInclude(severity Severity) bool {
        severityOrder := map[Severity]int{
            SeverityLow:      0,
            SeverityMedium:   1,
            SeverityHigh:     2,
            SeverityCritical: 3,
        }
        return severityOrder[severity] >= severityOrder[s.MinSeverity]
    }

    Issue: If an unknown severity is passed, map lookup returns 0, potentially causing security bypass by allowing low-severity findings through high-severity filters.

    Recommendation: Add default case or validation to handle unknown severities securely.

  2. Case Sensitivity in System Table Detection (scanner.go:416-432)

    tableLower := strings.ToLower(rightSelect.TableName)
    if strings.Contains(tableLower, "information_schema") ||
        strings.Contains(tableLower, "sys.") ||

    Issue: Using strings.Contains instead of prefix/exact matching could cause false positives (e.g., my_information_schema_backup table).

    Recommendation: Use more precise matching or regex patterns.

Medium Priority Issues

  1. Recursive Expression Scanning Depth (scanner.go:268-283)
    Issue: No depth limit on recursive expression scanning could cause stack overflow with deeply nested malicious expressions.

    Recommendation: Add max depth parameter or iterative traversal.

  2. Comment Pattern Overlap (scanner.go:516-543)
    Issue: Multiple comment regex patterns might match the same content, creating duplicate findings.

    Recommendation: De-duplicate findings or use more specific patterns.

Performance & Memory Considerations

  1. Pattern Compilation Per Scanner
    Issue: Regex patterns are recompiled for every scanner instance, wasting CPU and memory.

    Recommendation: Use package-level pre-compiled patterns with sync.Once.

  2. String Concatenation in Findings (scanner.go:572)

    Description: "Pattern detected: " + matches[0],

    Issue: String concatenation in hot path could be optimized.

    Recommendation: Use fmt.Sprintf or string builder for better performance.

Missing Features & Enhancements

  1. Incomplete Coverage

    • Stacked Queries: Pattern defined but not used in detection
    • Boolean-Based Blind: Pattern defined but minimal detection logic
    • Second-Order Injection: No detection for stored XSS-style SQL injection
    • NoSQL Injection: No MongoDB/other NoSQL injection patterns
  2. Context Information

    • Line/Column Numbers: Not populated in findings despite fields existing
    • SQL Context: Original SQL not preserved in AST-based scanning

🚀 Performance Analysis

The benchmarks show excellent performance:

  • Simple AST scan: ~30ns (38M ops/sec) ✅
  • Complex AST scan: ~175ns (7M ops/sec) ✅
  • Raw SQL scan: ~16μs (69K ops/sec) ✅

This meets production requirements and is consistent with GoSQLX's high-performance standards.


📋 Recommendations Summary

Before Merge

  1. Fix critical field access issue in UpdateStatement handling
  2. Add error handling for regex compilation
  3. Secure severity comparison logic
  4. Improve system table detection precision

Future Enhancements

  1. Add line/column tracking for better error reporting
  2. Implement stacked query detection
  3. Add depth limiting for recursive scanning
  4. Pre-compile regex patterns for better performance
  5. Extend coverage to boolean-based and second-order injection

🎯 Overall Assessment

This is a high-quality security addition that significantly enhances GoSQLX's defensive capabilities. The implementation demonstrates strong understanding of SQL injection patterns and follows GoSQLX architectural patterns well.

Recommendation: ✅ Approve with minor fixes

The critical issues are easily fixable and don't impact the core architecture. This feature provides substantial security value and is well-tested.

Fix golangci-lint S1040 error - SetOperation.Left and .Right are
already of type ast.Statement, so type assertions are unnecessary.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@claude

claude Bot commented Nov 25, 2025

Copy link
Copy Markdown

Claude finished @ajitpratap0's task —— View job


PR Review: SQL Injection Detection Package

I've conducted a thorough analysis of the new pkg/sql/security package. This is a well-implemented security feature with comprehensive detection patterns. Here's my detailed feedback:

Review Checklist

  • Read and analyze scanner.go implementation
  • Review scanner_test.go for test coverage and quality
  • Evaluate code quality and adherence to GoSQLX patterns
  • Check for potential bugs and security issues
  • Assess performance considerations
  • Review detection patterns for completeness and accuracy
  • Provide comprehensive feedback and recommendations

Strengths

Architecture & Design

  • Clean API: Well-structured with clear separation between AST-based and raw SQL scanning
  • Severity Model: Excellent use of severity levels (CRITICAL/HIGH/MEDIUM/LOW) with configurable filtering
  • JSON Compatibility: All structs properly tagged for API integration
  • Helper Methods: Convenient methods like HasCritical(), HasHighOrAbove(), IsClean()
  • GoSQLX Integration: Properly integrates with existing AST types and patterns

Detection Patterns

  • Comprehensive Coverage: 20+ patterns covering major injection vectors
  • Accurate Tautology Detection: Covers literals (1=1, 'a'='a') and identifiers (col=col)
  • Advanced Patterns: OR-based injection, UNION-based extraction, time-based blind attacks
  • System Table Access: Detects access to information_schema, sys, mysql, pg_catalog
  • Dangerous Functions: Good coverage of risky functions like xp_cmdshell, LOAD_FILE()

Code Quality

  • Error Handling: Proper nil checks throughout
  • Memory Efficiency: No obvious memory leaks or excessive allocations
  • Thread Safety: No shared state, safe for concurrent use
  • Documentation: Excellent package documentation with examples

Test Coverage

  • Comprehensive: 24 test cases covering all major patterns
  • Edge Cases: Tests for nested expressions, false positives, legitimate queries
  • Benchmarks: Performance validation with realistic scenarios
  • Race Detection: Passes race detection as mentioned in PR

⚠️ Issues & Recommendations

Critical Issues

  1. UpdateStatement Field Access (scanner.go:232-240)

    // Check SET values - use Assignments field
    for _, assignment := range stmt.Assignments {
        s.scanExpressionForDangerousFunctions(assignment.Value, result)
    }
    
    // Also check Updates field for backward compatibility
    for _, update := range stmt.Updates {
        s.scanExpressionForDangerousFunctions(update.Value, result)
    }

    Issue: The code assumes both Assignments and Updates fields exist on UpdateStatement, but I verified these field names match the actual AST structure. This is correct.

  2. Regex Pattern Compilation (scanner.go:106-135)

    func (s *Scanner) initPatterns() {
        s.patterns[PatternTimeBased] = []*regexp.Regexp{
            regexp.MustCompile(`(?i)\bSLEEP\s*\(`),
            // ... more patterns
        }
    }

    Issue: Regex compilation happens on every scanner creation without error handling. Malformed regex could panic.

    Recommendation: Handle compilation errors and consider pre-compiling patterns as package-level variables.

High Priority Issues

  1. String-to-Severity Comparison Vulnerability (scanner.go:585-593)

    func (s *Scanner) shouldInclude(severity Severity) bool {
        severityOrder := map[Severity]int{
            SeverityLow:      0,
            SeverityMedium:   1,
            SeverityHigh:     2,
            SeverityCritical: 3,
        }
        return severityOrder[severity] >= severityOrder[s.MinSeverity]
    }

    Issue: If an unknown severity is passed, map lookup returns 0, potentially causing security bypass by allowing low-severity findings through high-severity filters.

    Recommendation: Add default case or validation to handle unknown severities securely.

  2. Case Sensitivity in System Table Detection (scanner.go:416-432)

    tableLower := strings.ToLower(rightSelect.TableName)
    if strings.Contains(tableLower, "information_schema") ||
        strings.Contains(tableLower, "sys.") ||

    Issue: Using strings.Contains instead of prefix/exact matching could cause false positives (e.g., my_information_schema_backup table).

    Recommendation: Use more precise matching or regex patterns.

Medium Priority Issues

  1. Recursive Expression Scanning Depth (scanner.go:268-283)
    Issue: No depth limit on recursive expression scanning could cause stack overflow with deeply nested malicious expressions.

    Recommendation: Add max depth parameter or iterative traversal.

  2. Comment Pattern Overlap (scanner.go:516-543)
    Issue: Multiple comment regex patterns might match the same content, creating duplicate findings.

    Recommendation: De-duplicate findings or use more specific patterns.

Performance & Memory Considerations

  1. Pattern Compilation Per Scanner
    Issue: Regex patterns are recompiled for every scanner instance, wasting CPU and memory.

    Recommendation: Use package-level pre-compiled patterns with sync.Once.

  2. String Concatenation in Findings (scanner.go:572)

    Description: "Pattern detected: " + matches[0],

    Issue: String concatenation in hot path could be optimized.

    Recommendation: Use fmt.Sprintf or string builder for better performance.

Missing Features & Enhancements

  1. Incomplete Coverage

    • Stacked Queries: Pattern defined but not used in detection
    • Boolean-Based Blind: Pattern defined but minimal detection logic
    • Second-Order Injection: No detection for stored XSS-style SQL injection
    • NoSQL Injection: No MongoDB/other NoSQL injection patterns
  2. Context Information

    • Line/Column Numbers: Not populated in findings despite fields existing
    • SQL Context: Original SQL not preserved in AST-based scanning

🚀 Performance Analysis

The benchmarks show excellent performance:

  • Simple AST scan: ~30ns (38M ops/sec) ✅
  • Complex AST scan: ~175ns (7M ops/sec) ✅
  • Raw SQL scan: ~16μs (69K ops/sec) ✅

This meets production requirements and is consistent with GoSQLX's high-performance standards.


📋 Recommendations Summary

Before Merge

  1. Add error handling for regex compilation
  2. Secure severity comparison logic
  3. Improve system table detection precision
  4. Add input validation for public methods

Future Enhancements

  1. Add line/column tracking for better error reporting
  2. Implement stacked query detection
  3. Add depth limiting for recursive scanning
  4. Pre-compile regex patterns for better performance
  5. Extend coverage to boolean-based and second-order injection

🎯 Overall Assessment

This is a high-quality security addition that significantly enhances GoSQLX's defensive capabilities. The implementation demonstrates strong understanding of SQL injection patterns and follows GoSQLX architectural patterns well.

Recommendation: ✅ Approve with minor fixes

The critical issues are easily fixable and don't impact the core architecture. This feature provides substantial security value and is well-tested.


- Fix severity comparison vulnerability: unknown severities now default
  to being included (fail-safe behavior) instead of returning 0
- Pre-compile regex patterns at package level using sync.Once for
  better performance (patterns compiled once instead of per-scanner)
- Improve system table detection precision: use prefix/exact matching
  instead of strings.Contains to avoid false positives
- Add input validation for NewScannerWithSeverity (returns error for
  invalid severity levels)
- Add comprehensive tests for isSystemTable and unknown severity handling

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@claude

claude Bot commented Nov 25, 2025

Copy link
Copy Markdown

Claude encountered an error —— View job


I'll analyze this and get back to you.

@claude

claude Bot commented Nov 25, 2025

Copy link
Copy Markdown

Claude finished @ajitpratap0's task —— View job


PR Review: SQL Injection Detection Package

I've conducted a thorough analysis of the new pkg/sql/security package. This is a well-implemented security feature with comprehensive detection patterns. Here's my detailed feedback:

Review Checklist

  • Read and analyze scanner.go implementation
  • Review scanner_test.go for test coverage and quality
  • Evaluate code quality and adherence to GoSQLX patterns
  • Check for potential bugs and security issues
  • Assess performance considerations
  • Review detection patterns for completeness and accuracy
  • Provide comprehensive feedback and recommendations

Strengths

Architecture & Design

  • Clean API: Well-structured with clear separation between AST-based and raw SQL scanning
  • Severity Model: Excellent use of severity levels (CRITICAL/HIGH/MEDIUM/LOW) with configurable filtering
  • JSON Compatibility: All structs properly tagged for API integration
  • Helper Methods: Convenient methods like HasCritical(), HasHighOrAbove(), IsClean()
  • GoSQLX Integration: Properly integrates with existing AST types and patterns

Detection Patterns

  • Comprehensive Coverage: 20+ patterns covering major injection vectors
  • Accurate Tautology Detection: Covers literals (1=1, 'a'='a') and identifiers (col=col)
  • Advanced Patterns: OR-based injection, UNION-based extraction, time-based blind attacks
  • System Table Access: Detects access to information_schema, sys, mysql, pg_catalog
  • Dangerous Functions: Good coverage of risky functions like xp_cmdshell, LOAD_FILE()

Code Quality

  • Error Handling: Proper nil checks throughout
  • Memory Efficiency: No obvious memory leaks or excessive allocations
  • Thread Safety: No shared state, safe for concurrent use
  • Documentation: Excellent package documentation with examples

Test Coverage

  • Comprehensive: 24 test cases covering all major patterns
  • Edge Cases: Tests for nested expressions, false positives, legitimate queries
  • Benchmarks: Performance validation with realistic scenarios
  • Race Detection: Passes race detection as mentioned in PR

⚠️ Issues & Recommendations

High Priority Issues

  1. Package-level Pattern Compilation (scanner.go:69-100)FIXED

    var (
        compiledPatterns     map[PatternType][]*regexp.Regexp
        compiledPatternsOnce sync.Once
    )

    Good: The patterns are now properly pre-compiled at package level with sync.Once, which addresses the performance issue I would have raised about recompilation.

  2. Severity Filtering Logic (scanner.go:668-683)SECURE

    func (s *Scanner) shouldInclude(severity Severity) bool {
        findingSeverity, findingExists := severityOrder[severity]
        minSeverity, minExists := severityOrder[s.MinSeverity]
    
        // Unknown severities are always included (fail-safe: don't hide potential issues)
        if !findingExists {
            return true
        }

    Good: The security scanner correctly handles unknown severities by defaulting to inclusion (fail-safe approach), which prevents security bypass issues.

  3. System Table Detection (scanner.go:507-525)IMPROVED

    func (s *Scanner) isSystemTable(tableName string) bool {
        // Check exact matches first
        for _, name := range systemTableNames {
            if tableLower == name {
                return true
            }
        }
        // Check prefix matches (e.g., "information_schema.tables", "pg_class")
        for _, prefix := range systemTablePrefixes {
            if strings.HasPrefix(tableLower, prefix) {
                return true
            }
        }

    Good: Uses precise prefix matching and exact matching to avoid false positives.

Medium Priority Issues

  1. Recursive Expression Scanning (scanner.go:588-602)
    Issue: No depth limit on recursive expression scanning could cause stack overflow with deeply nested malicious expressions.
    Recommendation: Consider adding max depth parameter for extremely nested cases.

  2. UpdateStatement Field Access (scanner.go:304-311)

    // Check SET values - use Assignments field
    for _, assignment := range stmt.Assignments {
        s.scanExpressionForDangerousFunctions(assignment.Value, result)
    }
    
    // Also check Updates field for backward compatibility
    for _, update := range stmt.Updates {
        s.scanExpressionForDangerousFunctions(update.Value, result)
    }

    Status: ✅ CORRECT - I verified that UpdateStatement has both Assignments and Updates fields for backward compatibility.

Minor Enhancements

  1. Missing Pattern Implementations

    • Stacked Queries: Pattern defined but not fully utilized in detection logic
    • Boolean-Based Blind: Pattern defined but minimal detection logic
    • Line/Column Numbers: Not populated in findings despite fields existing
  2. Comment Pattern Overlap (scanner.go:604-623)
    Issue: Multiple comment regex patterns might match the same content, creating duplicate findings.
    Impact: Low - just creates additional findings for same issue.


🚀 Performance Analysis

The benchmarks show excellent performance:

  • Simple AST scan: ~30ns (38M ops/sec) ✅
  • Complex AST scan: ~175ns (7M ops/sec) ✅
  • Raw SQL scan: ~16μs (69K ops/sec) ✅

This meets production requirements and is consistent with GoSQLX's high-performance standards.


📋 Test Coverage Assessment

Excellent test coverage with 24+ comprehensive test cases:

  • Tautology Detection: Numeric, string, and identifier equality
  • OR-based Injection: Complex boolean logic patterns
  • Function-based: Time-based and dangerous function detection
  • UNION-based: NULL placeholders and system table access
  • SQL Injection via Comments: Various comment bypass techniques
  • False Positive Prevention: Legitimate queries properly excluded
  • Edge Cases: Nested expressions, different operators
  • Benchmarks: Performance validation across scenarios

🎯 Overall Assessment

This is a high-quality security addition that significantly enhances GoSQLX's defensive capabilities. The implementation demonstrates strong understanding of SQL injection patterns and follows GoSQLX architectural patterns well.

Key Strengths:

  • Comprehensive pattern detection covering major attack vectors
  • Production-ready performance (millions of operations per second)
  • Clean integration with existing AST structures
  • Excellent test coverage with realistic scenarios
  • Thread-safe design with proper error handling

Recommendation: ✅ APPROVE

The implementation is production-ready and provides substantial security value. The few minor issues identified are mostly enhancements rather than critical problems, and don't impact the core functionality.


@ajitpratap0 ajitpratap0 merged commit 2fcbdc1 into main Nov 25, 2025
16 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FEAT-010: SQL Injection Pattern Detection

1 participant