89 changes: 57 additions & 32 deletions README.md
@@ -118,78 +118,103 @@ gosqlx analyze "SELECT COUNT(*) FROM orders GROUP BY status"
gosqlx parse -f json complex_query.sql
```

### Library Usage
### Library Usage - Simple API

GoSQLX provides a simple, high-level API that handles all complexity for you:

```go
package main

import (
"fmt"
"log"
"github.com/ajitpratap0/GoSQLX/pkg/sql/tokenizer"

"github.com/ajitpratap0/GoSQLX/pkg/gosqlx"
)

func main() {
// Get tokenizer from pool (always return it!)
tkz := tokenizer.GetTokenizer()
defer tokenizer.PutTokenizer(tkz)

// Tokenize SQL
sql := "SELECT id, name FROM users WHERE age > 18"
tokens, err := tkz.Tokenize([]byte(sql))
// Parse SQL in one line - that's it!
ast, err := gosqlx.Parse("SELECT * FROM users WHERE active = true")
if err != nil {
log.Fatal(err)
}

// Process tokens
fmt.Printf("Generated %d tokens\n", len(tokens))
for _, token := range tokens {
fmt.Printf(" %s (line %d, col %d)\n",
token.Token.Value,
token.Start.Line,
token.Start.Column)
}

fmt.Printf("Successfully parsed %d statement(s)\n", len(ast.Statements))
}
```

### Advanced Example with AST
**That's it!** Just 3 lines of code. No pool management, no manual cleanup - everything is handled for you.

### More Examples

```go
// Validate SQL without parsing
if err := gosqlx.Validate("SELECT * FROM users"); err != nil {
fmt.Println("Invalid SQL:", err)
}

// Parse multiple queries efficiently
queries := []string{
"SELECT * FROM users",
"SELECT * FROM orders",
}
asts, err := gosqlx.ParseMultiple(queries)

// Parse with timeout for long queries
ast, err := gosqlx.ParseWithTimeout(sql, 5*time.Second)

// Parse from byte slice (zero-copy)
ast, err := gosqlx.ParseBytes([]byte("SELECT * FROM users"))
```

### Advanced Usage - Low-Level API

For performance-critical code that needs fine-grained control, use the low-level API:

```go
package main

import (
"fmt"

"github.com/ajitpratap0/GoSQLX/pkg/sql/tokenizer"
"github.com/ajitpratap0/GoSQLX/pkg/sql/parser"
)

func AnalyzeSQL(sql string) error {
// Tokenize
func main() {
// Get tokenizer from pool (always return it!)
tkz := tokenizer.GetTokenizer()
defer tokenizer.PutTokenizer(tkz)


// Tokenize SQL
sql := "SELECT id, name FROM users WHERE age > 18"
tokens, err := tkz.Tokenize([]byte(sql))
if err != nil {
return fmt.Errorf("tokenization failed: %w", err)
panic(err)
}


// Convert tokens
converter := parser.NewTokenConverter()
result, err := converter.Convert(tokens)
if err != nil {
panic(err)
}

// Parse to AST
p := parser.NewParser()
defer p.Release()
ast, err := p.Parse(convertTokens(tokens))

ast, err := p.Parse(result.Tokens)
if err != nil {
return fmt.Errorf("parsing failed: %w", err)
panic(err)
}

// Analyze AST

fmt.Printf("Statement type: %T\n", ast)
return nil
}
```

> **Note:** The simple API adds less than 1% performance overhead compared to the low-level API. Use the simple API unless you need fine-grained control.

## 📚 Documentation

### 📖 Comprehensive Guides
61 changes: 57 additions & 4 deletions SECURITY.md
@@ -56,9 +56,23 @@ tokens, err := tokenizer.Tokenize([]byte(userSQL))
```

### 2. Resource Limits
Set appropriate limits for SQL parsing:
GoSQLX includes built-in DoS protection with the following limits:
- **Maximum Input Size**: 10MB (10 * 1024 * 1024 bytes)
- **Maximum Token Count**: 1,000,000 tokens per query

These limits are enforced automatically by the tokenizer:
```go
// Built-in protection - no additional code needed
tokens, err := tokenizer.Tokenize([]byte(sql))
if err != nil {
// Will return error if input exceeds 10MB or would generate >1M tokens
return fmt.Errorf("tokenization failed: %w", err)
}
```

For additional application-specific limits:
```go
const maxSQLLength = 1_000_000 // 1MB max
const maxSQLLength = 1_000_000 // 1MB max (custom limit)
if len(sql) > maxSQLLength {
return errors.New("SQL query too large")
}
@@ -77,6 +91,9 @@ Always return objects to pools to prevent resource exhaustion:
```go
tkz := tokenizer.GetTokenizer()
defer tokenizer.PutTokenizer(tkz) // Always defer return

astObj := ast.NewAST()
defer ast.ReleaseAST(astObj) // Always defer return
```

## Known Security Considerations
@@ -87,9 +104,45 @@ defer tokenizer.PutTokenizer(tkz) // Always defer return
- Use dedicated pools for security-sensitive contexts

### Denial of Service
- Large or complex SQL queries could cause high CPU/memory usage
- Implement rate limiting and resource quotas
GoSQLX includes built-in DoS protection:
- **Input Size Limit**: Maximum 10MB per query (automatically enforced)
- **Token Count Limit**: Maximum 1,000,000 tokens per query (automatically enforced)
- **Recursion Depth Limit**: Maximum 100 levels of nesting (automatically enforced)
- Queries exceeding these limits will fail fast with descriptive errors

Additional recommendations:
- Implement rate limiting at the application level
- Set timeout contexts for parsing operations
- Monitor resource usage in production
- Consider additional custom limits based on your use case
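
The "timeout contexts" recommendation above can be sketched roughly as follows. This is a minimal illustration, not GoSQLX code: `parseFunc`, `parseWithTimeout`, and `runDemo` are hypothetical names, and the parse function is a stand-in for whatever parsing call the application actually makes.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// parseFunc is a hypothetical stand-in for any parsing call.
// The string result is a placeholder for a real AST.
type parseFunc func(sql string) (string, error)

// parseWithTimeout runs parse in a goroutine and gives up once the
// timeout elapses. The goroutine itself is not interrupted; it simply
// finishes in the background and its result is discarded.
func parseWithTimeout(sql string, timeout time.Duration, parse parseFunc) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	type result struct {
		ast string
		err error
	}
	done := make(chan result, 1) // buffered so the goroutine never blocks on send

	go func() {
		ast, err := parse(sql)
		done <- result{ast, err}
	}()

	select {
	case r := <-done:
		return r.ast, r.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// runDemo exercises one fast and one artificially slow parse function.
func runDemo() (fastErr, slowErr error) {
	fast := func(sql string) (string, error) { return "ast for " + sql, nil }
	slow := func(sql string) (string, error) {
		time.Sleep(300 * time.Millisecond) // simulate an expensive parse
		return "ast for " + sql, nil
	}
	_, fastErr = parseWithTimeout("SELECT 1", 2*time.Second, fast)
	_, slowErr = parseWithTimeout("SELECT 1", 20*time.Millisecond, slow)
	return fastErr, slowErr
}

func main() {
	fastErr, slowErr := runDemo()
	fmt.Println("fast:", fastErr)
	fmt.Println("slow:", slowErr)
}
```

Because the abandoned goroutine keeps running until the parse returns, a wrapper like this bounds caller latency rather than CPU usage, which is why it complements the built-in size and depth limits instead of replacing them.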

### Stack Overflow Protection (QW-005)
GoSQLX implements recursion depth limits to prevent stack overflow attacks from deeply nested SQL expressions:

**Protection Features**:
- **Maximum Recursion Depth**: 100 levels (configurable via `MaxRecursionDepth` constant in `parser.go`)
- **Protected Operations**: Expression parsing, CTEs, nested function calls, window functions
- **Performance Impact**: <1% overhead (verified via benchmarks)
- **Error Handling**: Returns structured error with clear message when depth exceeded

**Example of Protected Attack**:
```go
// This malicious query with 1000+ nested functions is safely rejected:
// SELECT f(f(f(...f(x)...))) FROM t -- 1000 levels deep
// Error: "maximum recursion depth exceeded (100) - expression too deeply nested"

// The parser safely rejects this without stack overflow
tokens, _ := tokenizer.Tokenize([]byte(maliciousSQL))
_, err := parser.Parse(tokens)
// err != nil: "maximum recursion depth exceeded"
```

**Implementation Details**:
- Depth counter incremented on entry to recursive methods (`parseExpression`, `parseCommonTableExpr`)
- Automatic decrement on exit via `defer` ensures proper cleanup
- Depth reset between independent parse operations
- Thread-safe depth tracking per parser instance
- No performance degradation for normal queries (tested up to 50 levels of realistic nesting)
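
The depth-counter pattern described in these bullets can be sketched with a toy parser. This is an assumption-laden illustration, not the GoSQLX implementation: `depthParser`, `maxDepth`, `errTooDeep`, `parse`, and `nested` are hypothetical names, and the grammar only understands nested parentheses around a single `x`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxDepth mirrors the 100-level limit described above (hypothetical name).
const maxDepth = 100

// errTooDeep is an illustrative sentinel, not GoSQLX's actual error type.
var errTooDeep = errors.New("maximum recursion depth exceeded")

// depthParser is a toy recursive-descent parser with a depth counter.
type depthParser struct {
	input string
	pos   int
	depth int
}

// parseExpr accepts either "x" or "(" expr ")".
func (p *depthParser) parseExpr() error {
	p.depth++
	defer func() { p.depth-- }() // decrement on every exit path
	if p.depth > maxDepth {
		return errTooDeep
	}
	if p.pos >= len(p.input) {
		return errors.New("unexpected end of input")
	}
	switch p.input[p.pos] {
	case '(':
		p.pos++
		if err := p.parseExpr(); err != nil {
			return err
		}
		if p.pos >= len(p.input) || p.input[p.pos] != ')' {
			return errors.New("expected ')'")
		}
		p.pos++
		return nil
	case 'x':
		p.pos++
		return nil
	default:
		return fmt.Errorf("unexpected character %q", p.input[p.pos])
	}
}

// parse builds a fresh parser per call, so the depth always starts at zero
// and state is never shared between independent parse operations.
func parse(input string) error {
	p := &depthParser{input: input}
	if err := p.parseExpr(); err != nil {
		return err
	}
	if p.pos != len(p.input) {
		return errors.New("unexpected trailing input")
	}
	return nil
}

// nested returns "x" wrapped in n pairs of parentheses.
func nested(n int) string {
	return strings.Repeat("(", n) + "x" + strings.Repeat(")", n)
}

func main() {
	fmt.Println("50 levels:", parse(nested(50)))
	fmt.Println("1000 levels:", parse(nested(1000)))
}
```

Keeping the counter on the parser value rather than in a package-level variable is what makes per-instance tracking safe when several parsers run concurrently, provided each goroutine uses its own instance.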

### SQL Injection
- GoSQLX is a parser, not a query executor
11 changes: 6 additions & 5 deletions cmd/gosqlx/cmd/config.go
@@ -1,6 +1,7 @@
package cmd

import (
_ "embed"
"fmt"
"os"

@@ -10,6 +11,9 @@ import (
"github.com/ajitpratap0/GoSQLX/cmd/gosqlx/internal/config"
)

//go:embed config_template.yml
var configTemplate string

var (
configFile string
configPath string
@@ -79,11 +83,8 @@ func configInitRun(cmd *cobra.Command, args []string) error {
return fmt.Errorf("configuration file already exists at %s (use --force to overwrite)", path)
}

// Create default config
cfg := config.DefaultConfig()

// Save to file
if err := cfg.Save(path); err != nil {
// Write the template file with comments
if err := os.WriteFile(path, []byte(configTemplate), 0644); err != nil {
return fmt.Errorf("failed to create config file: %w", err)
}

100 changes: 100 additions & 0 deletions cmd/gosqlx/cmd/config_template.yml
@@ -0,0 +1,100 @@
# GoSQLX Configuration File
# This file allows you to set default options for all GoSQLX CLI commands.
# CLI flags always override these settings.

# Format settings - controls SQL formatting behavior
format:
# Number of spaces for indentation (0-8)
# Default: 2
indent: 2

# Convert SQL keywords to uppercase
# Default: true
uppercase_keywords: true

# Maximum line length for formatted SQL (0-500, 0 = unlimited)
# Default: 80
max_line_length: 80

# Use compact formatting with minimal whitespace
# Default: false
compact: false

# Validation settings - controls SQL validation behavior
validate:
# SQL dialect for validation
# Options: postgresql, mysql, sqlserver, oracle, sqlite, generic
# Default: postgresql
dialect: postgresql

# Enable strict validation mode (more rigorous checks)
# Default: false
strict_mode: false

# Recursively process directories
# Default: false
recursive: false

# File pattern for recursive processing
# Default: *.sql
pattern: "*.sql"

# Security settings - controls security limits
security:
# Maximum file size in bytes (10MB = 10485760 bytes)
# Default: 10485760 (10MB)
max_file_size: 10485760

# Output settings - controls how results are displayed
output:
# Output format for analysis results
# Options: json, yaml, table, tree, auto
# Default: auto
format: auto

# Enable verbose output
# Default: false
verbose: false

# Analyze settings - controls analysis features
analyze:
# Enable security analysis (SQL injection detection, etc.)
# Default: true
security: true

# Enable performance analysis (optimization suggestions)
# Default: true
performance: true

# Enable complexity analysis (metrics and scoring)
# Default: true
complexity: true

# Enable all analysis features
# Default: false
all: false

# Configuration Precedence:
# 1. CLI flags (highest priority)
# 2. .gosqlx.yml in current directory
# 3. ~/.gosqlx.yml in home directory
# 4. /etc/gosqlx.yml system-wide
# 5. Built-in defaults (lowest priority)

# Examples:
#
# Format with 4-space indentation:
# format:
# indent: 4
#
# Validate MySQL dialect:
# validate:
# dialect: mysql
#
# Enable all analysis by default:
# analyze:
# all: true
#
# Use JSON output format:
# output:
# format: json