| 1 | +# Performance Regression Testing |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +GoSQLX includes a performance regression test suite that guards against performance degradation over time. The suite compares key performance metrics against established baselines and alerts developers to regressions. |
| 6 | + |
| 7 | +## Running Performance Tests |
| 8 | + |
| 9 | +### Quick Test (Recommended for CI/CD) |
| 10 | + |
| 11 | +```bash |
| 12 | +go test -v ./pkg/sql/parser/ -run TestPerformanceRegression |
| 13 | +``` |
| 14 | + |
| 15 | +**Execution Time:** ~8 seconds |
| 16 | +**Coverage:** 5 critical query types |
| 17 | + |
| 18 | +### Baseline Benchmark (For Establishing New Baselines) |
| 19 | + |
| 20 | +```bash |
| 21 | +go test -bench=BenchmarkPerformanceBaseline -benchmem -count=5 ./pkg/sql/parser/ |
| 22 | +``` |
| 23 | + |
| 24 | +**Use Case:** Run this after significant parser changes or optimizations to establish new performance baselines. |
| 25 | + |
| 26 | +## Performance Baselines |
| 27 | + |
| 28 | +Current baselines are stored in `performance_baselines.json` at the project root. |
| 29 | + |
| 30 | +### Tracked Metrics |
| 31 | + |
| 32 | +1. **SimpleSelect** (280 ns/op baseline) |
| 33 | + - Basic SELECT query: `SELECT id, name FROM users` |
| 34 | + - Current: ~265 ns/op (9 allocs, 536 B/op) |
| 35 | + |
| 36 | +2. **ComplexQuery** (1100 ns/op baseline) |
| 37 | + - Complex SELECT with JOIN, WHERE, ORDER BY, LIMIT |
| 38 | + - Current: ~1020 ns/op (36 allocs, 1433 B/op) |
| 39 | + |
| 40 | +3. **WindowFunction** (450 ns/op baseline) |
| 41 | + - Window function: `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)` |
| 42 | + - Current: ~400 ns/op (14 allocs, 760 B/op) |
| 43 | + |
| 44 | +4. **CTE** (450 ns/op baseline) |
| 45 | + - Common Table Expression with WITH clause |
| 46 | + - Current: ~395 ns/op (14 allocs, 880 B/op) |
| 47 | + |
| 48 | +5. **INSERT** (350 ns/op baseline) |
| 49 | + - Simple INSERT statement |
| 50 | + - Current: ~310 ns/op (14 allocs, 536 B/op) |
| 51 | + |
| 52 | +### Tolerance Levels |
| 53 | + |
| 54 | +- **Failure Threshold:** 20% degradation from baseline |
| 55 | +- **Warning Threshold:** 10% degradation from baseline (half of the failure tolerance) |
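The two thresholds above amount to a simple percentage comparison. The following is a minimal sketch of that comparison, not the suite's actual code: the names `Status`, `percentSlower`, and `classify` are hypothetical, and treating a result exactly at a threshold as passing is an assumption.

```go
package main

import "fmt"

// Status is the outcome of comparing one measurement to its baseline.
// (Hypothetical type, introduced only for this illustration.)
type Status string

const (
	Pass Status = "pass"
	Warn Status = "warn"
	Fail Status = "fail"
)

// percentSlower returns how much slower actualNs is than baselineNs, in percent.
// A negative value means the measurement beat the baseline.
func percentSlower(actualNs, baselineNs float64) float64 {
	return (actualNs - baselineNs) / baselineNs * 100
}

// classify fails above tolerancePercent and warns above half of it.
// Assumption: a value exactly equal to a threshold does not trip it.
func classify(actualNs, baselineNs, tolerancePercent float64) Status {
	slower := percentSlower(actualNs, baselineNs)
	switch {
	case slower > tolerancePercent:
		return Fail
	case slower > tolerancePercent/2:
		return Warn
	default:
		return Pass
	}
}

func main() {
	// ComplexQuery figures from the "Regression Detected" sample output.
	fmt.Printf("%.1f%% slower -> %s\n", percentSlower(1381, 1100), classify(1381, 1100, 20))
	// SimpleSelect at its current ~265 ns/op against the 280 ns/op baseline.
	fmt.Println(classify(265, 280, 20))
}
```

With a 20% tolerance, a ComplexQuery run at 1381 ns/op against the 1100 ns/op baseline is about 25.5% slower and is classified as a failure, while SimpleSelect at 265 ns/op is faster than its baseline and passes.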
| 56 | + |
| 57 | +## Test Output |
| 58 | + |
| 59 | +### Successful Run |
| 60 | + |
| 61 | +``` |
| 62 | +================================================================================ |
| 63 | +PERFORMANCE REGRESSION TEST SUMMARY |
| 64 | +================================================================================ |
| 65 | +✓ All performance tests passed with no warnings |
| 66 | + |
| 67 | +Baseline Version: 1.4.0 |
| 68 | +Baseline Updated: 2025-01-17 |
| 69 | +Tests Run: 5 |
| 70 | +Failures: 0 |
| 71 | +Warnings: 0 |
| 72 | +================================================================================ |
| 73 | +``` |
| 74 | + |
| 75 | +### Regression Detected |
| 76 | + |
| 77 | +``` |
| 78 | +REGRESSIONS DETECTED: |
| 79 | + ✗ ComplexQuery: 25.5% slower (actual: 1381 ns/op, baseline: 1100 ns/op) |
| 80 | + |
| 81 | +WARNINGS (approaching threshold): |
| 82 | + ⚠ SimpleSelect: 12.3% slower (approaching threshold) |
| 83 | + |
| 84 | +Tests Run: 5 |
| 85 | +Failures: 1 |
| 86 | +Warnings: 1 |
| 87 | +``` |
| 88 | + |
| 89 | +## Updating Baselines |
| 90 | + |
| 91 | +### When to Update |
| 92 | + |
| 93 | +Update baselines when: |
| 94 | +- Intentional optimizations improve performance significantly |
| 95 | +- Parser architecture changes fundamentally alter performance characteristics |
| 96 | +- New SQL features are added that affect parsing speed |
| 97 | + |
| 98 | +### How to Update |
| 99 | + |
| 100 | +1. Run the baseline benchmark: |
| 101 | + ```bash |
| 102 | + go test -bench=BenchmarkPerformanceBaseline -benchmem -count=5 ./pkg/sql/parser/ |
| 103 | + ``` |
| 104 | + |
| 105 | +2. Calculate new conservative baselines (add a 10-15% buffer to the measured values) |
| 106 | + |
| 107 | +3. Update `performance_baselines.json`: |
| 108 | + ```json |
| 109 | + { |
| 110 | + "SimpleSelect": { |
| 111 | + "ns_per_op": <new_baseline>, |
| 112 | + "tolerance_percent": 20, |
| 113 | + "description": "...", |
| 114 | + "current_performance": "<measured_value> ns/op" |
| 115 | + } |
| 116 | + } |
| 117 | + ``` |
| 118 | + |
| 119 | +4. Update the `updated` timestamp in the JSON file |
| 120 | + |
| 121 | +5. Commit changes with a clear explanation of why baselines were updated |
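Step 2 above can be worked through in a few lines. This is a sketch under stated assumptions: taking the median of the `-count=5` runs and rounding up to a multiple of 10 ns are illustrative choices, not rules stated by this guide, and `median` and `conservativeBaseline` are hypothetical helper names. The sample values are made up.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// median returns the middle sample, or the mean of the two middle samples
// when the count is even. It returns 0 for an empty slice.
func median(samples []float64) float64 {
	s := append([]float64(nil), samples...) // copy so the caller's slice is not reordered
	sort.Float64s(s)
	n := len(s)
	if n == 0 {
		return 0
	}
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// conservativeBaseline adds bufferPercent of headroom to measuredNs and
// rounds the result up to the next multiple of step (assumed rounding rule).
func conservativeBaseline(measuredNs, bufferPercent, step float64) float64 {
	return math.Ceil(measuredNs*(1+bufferPercent/100)/step) * step
}

func main() {
	// Hypothetical ns/op readings from five benchmark runs.
	runs := []float64{402, 398, 400, 405, 399}
	measured := median(runs)
	fmt.Printf("measured: %.0f ns/op, new baseline: %.0f ns/op\n",
		measured, conservativeBaseline(measured, 12.5, 10))
}
```

A median is used here rather than a mean so that a single noisy run does not shift the baseline; with these sample values, a 400 ns/op measurement and a 12.5% buffer give a 450 ns/op baseline.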
| 122 | + |
| 123 | +## Integration with CI/CD |
| 124 | + |
| 125 | +### GitHub Actions Example |
| 126 | + |
| 127 | +```yaml |
| 128 | +- name: Performance Regression Tests |
| 129 | + run: | |
| 130 | + go test -v ./pkg/sql/parser/ -run TestPerformanceRegression |
| 131 | + timeout-minutes: 2 |
| 132 | +``` |
| 133 | + |
| 134 | +### Exit Codes |
| 135 | + |
| 136 | +- **0:** All tests passed |
| 137 | +- **1:** Performance regression detected (test failure) |
| 138 | + |
| 139 | +## Troubleshooting |
| 140 | + |
| 141 | +### Test Timing Variance |
| 142 | + |
| 143 | +Performance tests can show variance due to: |
| 144 | +- System load |
| 145 | +- CPU thermal throttling |
| 146 | +- Background processes |
| 147 | + |
| 148 | +**Solution:** Run the tests several times and average the results. The suite uses `testing.Benchmark`, which automatically adjusts the iteration count until the measurement is stable. |
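For readers unfamiliar with `testing.Benchmark`, the sketch below shows how it can be called directly to obtain ns/op, B/op, and allocs/op figures. It does not reproduce the suite's code: `parseStandIn` is a hypothetical placeholder for the real parser call, and `measure` is an illustrative helper.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// parseStandIn is a hypothetical placeholder for the real parser call;
// it only splits the query into whitespace-separated fields.
func parseStandIn(sql string) int {
	return len(strings.Fields(sql))
}

// measure runs fn under testing.Benchmark, which grows b.N until the
// run is long enough to give a stable per-operation timing.
func measure(fn func()) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			fn()
		}
	})
}

func main() {
	res := measure(func() { parseStandIn("SELECT id, name FROM users") })
	fmt.Printf("%d iterations: %d ns/op, %d B/op, %d allocs/op\n",
		res.N, res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
}
```

The reported numbers depend on the machine and its load, so repeated calls to `measure` will differ slightly; that spread is the variance described above.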
| 149 | + |
| 150 | +### False Positives |
| 151 | + |
| 152 | +If you see intermittent failures: |
| 153 | +1. Check system load during test execution |
| 154 | +2. Run the test 3-5 times to confirm consistency |
| 155 | +3. Consider increasing tolerance for that specific baseline |
| 156 | + |
| 157 | +### Baseline Drift |
| 158 | + |
| 159 | +Over time, minor optimizations may accumulate. If current performance is consistently better: |
| 160 | +1. Document the improvements |
| 161 | +2. Update baselines to reflect the new performance level |
| 162 | +3. Keep tolerance at 20% to catch future regressions |
| 163 | + |
| 164 | +## Performance Metrics Guide |
| 165 | + |
| 166 | +### ns/op (Nanoseconds per Operation) |
| 167 | +- Lower is better |
| 168 | +- Measures parsing speed for a single query |
| 169 | +- Most sensitive metric for detecting regressions |
| 170 | + |
| 171 | +### B/op (Bytes per Operation) |
| 172 | +- Memory allocated per parse operation |
| 173 | +- Tracked in benchmarks but not in regression tests |
| 174 | +- Useful for spotting growth in per-parse memory allocation |
| 175 | + |
| 176 | +### allocs/op (Allocations per Operation) |
| 177 | +- Number of heap allocations per parse |
| 178 | +- Lower indicates better object pool efficiency |
| 179 | +- Critical for GC pressure |
| 180 | + |
| 181 | +## Related Documentation |
| 182 | + |
| 183 | +- [Benchmark Guide](../CLAUDE.md#performance-testing-new-features) |
| 184 | +- [Development Workflow](../CLAUDE.md#common-development-workflows) |
| 185 | +- [Production Metrics](../pkg/metrics/README.md) |
| 186 | + |
| 187 | +## Version History |
| 188 | + |
| 189 | +- **v1.4.0** (2025-01-17): Initial performance regression suite |
| 190 | + - 5 baseline metrics established |
| 191 | + - 20% tolerance threshold |
| 192 | + - ~8 second execution time |