
Commit c00f80e

Add PostgreSQL quality engineering framework
Implement a three-layer quality strategy (automated gates, AI-assisted review, human judgment) tailored for PostgresAI's small team building infrastructure that touches production PostgreSQL instances.

New files:
- quality/QUALITY_ENGINEERING_GUIDE.md — living doc defining standards, processes, coverage requirements, and weekly rhythm
- quality/pr-review-prompt.md — PostgreSQL-specific AI PR review system prompt covering SQL safety, connection handling, transaction safety, lock awareness, and version compatibility
- quality/failure-modes.md — top 5 critical failure modes (data loss, incorrect diagnostics, silent monitoring failure, security exposure, performance regression) with required test coverage checklists
- quality/scripts/release-readiness.sh — automated release readiness checker (tests, build, schemas, secrets, coverage thresholds)
- quality/checklists/pr-review-checklist.md — PR review checklist
- quality/checklists/release-checklist.md — release checklist + weekly quality rhythm process
- quality/gitlab-ci-quality.yml — CI jobs for schema validation, SQL safety checks, connection safety, PG version matrix, perf benchmarks

Updated files:
- .gitlab-ci.yml — include quality CI jobs
- .pre-commit-config.yaml — add JSON/YAML validation, SQL injection detection, large file prevention, branch protection
- .claude/CLAUDE.md — reference quality framework docs

https://claude.ai/code/session_01TKKnEc2Yn2zM64bwCJ2UaX
1 parent 9a9180f commit c00f80e

10 files changed

Lines changed: 1448 additions & 0 deletions

.claude/CLAUDE.md

Lines changed: 16 additions & 0 deletions
@@ -17,6 +17,10 @@ git submodule update --remote .cursor
 
 - **README.md** — Project overview, features, and quick start
 - **CONTRIBUTING.md** — Local development workflow, Docker setup, debugging
+- **quality/QUALITY_ENGINEERING_GUIDE.md** — Quality standards, processes, automated gates
+- **quality/pr-review-prompt.md** — AI PR review system prompt (PostgreSQL-specific)
+- **quality/failure-modes.md** — Critical failure modes with required test coverage
+- **quality/checklists/** — PR review and release checklists
 
 ### Commands

@@ -44,3 +48,15 @@ postgresai mon local-install --demo
 | H004 | Redundant indexes |
 | F004 | Table bloat |
 | K003 | Top queries |
+
+## Quality
+
+```bash
+# Run release readiness check
+./quality/scripts/release-readiness.sh
+
+# Full check (includes integration tests)
+./quality/scripts/release-readiness.sh --full
+```
+
+See `quality/QUALITY_ENGINEERING_GUIDE.md` for the full quality framework.

.gitlab-ci.yml

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 include:
   - local: 'components/index_pilot/.gitlab-ci.yml'
+  - local: 'quality/gitlab-ci-quality.yml'
   - template: Security/SAST.gitlab-ci.yml
   - project: 'postgres-ai/infra'
     file: '/ci/templates/approval-check.yml'

.pre-commit-config.yaml

Lines changed: 39 additions & 0 deletions
@@ -3,3 +3,42 @@ repos:
     rev: v8.30.0
     hooks:
       - id: gitleaks
+
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v5.0.0
+    hooks:
+      # Prevent large files from being committed
+      - id: check-added-large-files
+        args: ['--maxkb=500']
+      # Ensure JSON files are valid
+      - id: check-json
+      # Ensure YAML files are valid
+      - id: check-yaml
+        args: ['--allow-multiple-documents']
+      # Prevent committing to main directly
+      - id: no-commit-to-branch
+        args: ['--branch', 'main']
+      # Ensure files end with newline
+      - id: end-of-file-fixer
+      # Remove trailing whitespace
+      - id: trailing-whitespace
+        args: ['--markdown-linebreak-ext=md']
+
+  - repo: local
+    hooks:
+      # Catch potential SQL injection patterns in TypeScript
+      - id: sql-injection-check
+        name: SQL injection check (TypeScript)
+        entry: bash -c 'grep -rn --include="*.ts" -E "(\`SELECT|\`INSERT|\`UPDATE|\`DELETE|\`DROP|\`ALTER|\`CREATE).*\$\{" "$@" && echo "ERROR: Possible SQL injection — use parameterized queries" && exit 1 || exit 0' --
+        language: system
+        files: '\.(ts|js)$'
+        exclude: '(test|spec|__tests__)'
+        pass_filenames: false
+
+      # Verify JSON schemas are valid
+      - id: validate-schemas
+        name: Validate JSON schemas
+        entry: python3 -c "import json, sys; [json.load(open(f)) for f in sys.argv[1:]]"
+        language: system
+        files: '\.schema\.json$'
+        types: [json]
quality/QUALITY_ENGINEERING_GUIDE.md (new file)

Lines changed: 305 additions & 0 deletions

# Quality Engineering Guide

> Living document — last updated: 2026-03-15
>
> This guide defines PostgresAI's quality standards, processes, and automated
> gates. It serves as the "constitution" that both AI agents and human engineers
> reference when building, reviewing, and releasing software.

## Core Philosophy: Quality as Code

PostgresAI products touch production PostgreSQL instances — mistakes can mean
data loss, incorrect diagnostics, or silent monitoring failures. Traditional QA
departments don't fit a small, distributed team. Instead, quality is embedded
into the development workflow itself:

| Layer | Purpose | Catches |
|-------|---------|---------|
| **1. Automated Gates** | CI/CD pipelines, pre-commit hooks, schema validation | ~80% of issues before any human sees them |
| **2. AI-Assisted Review** | PostgreSQL-specific PR review, test generation, spec gap analysis | Edge cases, combinatorial scenarios, domain-specific bugs |
| **3. Human Judgment** | Architecture decisions, customer scenarios, risk assessment | Design flaws, UX issues, safety-critical decisions |

---

## Layer 1: Automated Foundation

### 1.1 Pre-Commit Hooks

Every developer must have pre-commit hooks installed (`pre-commit install`).
Current hooks:

- **gitleaks** — Prevents secrets from being committed
- **TypeScript typecheck** — Catches type errors before push
- **pytest (unit)** — Runs fast unit tests on changed Python files
- **pre-commit-hooks** — Large-file prevention, JSON/YAML validity, no direct commits to `main`, end-of-file and trailing-whitespace fixes
- **sql-injection-check** — Flags SQL template literals with `${...}` interpolation in TypeScript/JavaScript
- **validate-schemas** — Ensures `*.schema.json` files parse as valid JSON
### 1.2 CI Pipeline Quality Gates

Every PR must pass these gates before merge:

| Gate | Tool | Blocks Merge |
|------|------|:------------:|
| Python unit + integration tests | pytest + pytest-postgresql | Yes |
| CLI unit tests + coverage | Bun test runner | Yes |
| CLI smoke tests | Node.js + built CLI | Yes |
| E2E monitoring stack tests | Docker-in-Docker | Yes |
| Helm/config validation | pytest + helm template | Yes |
| SAST security scanning | GitLab SAST | Yes |
| Secret detection | gitleaks | Yes |
| JSON schema validation | ajv / jsonschema | Yes |
| Performance regression check | Benchmark comparison | Warning |
### 1.3 PostgreSQL Version Matrix

Products must be tested against supported PostgreSQL versions:

| Version | Status | CI Coverage |
|---------|--------|:-----------:|
| 14 | Supported | Nightly |
| 15 | Supported | Every PR |
| 16 | Supported | Every PR |
| 17 | Supported | Nightly |
| 18 | Preview | Weekly |
### 1.4 Test Categories

Tests are organized by execution speed and infrastructure requirements:

```
pytest markers:
  unit              — Fast, mocked, no external services (~seconds)
  integration      — Requires PostgreSQL (~30s)
  requires_postgres — Alias for integration
  e2e               — Full monitoring stack (~minutes)
  enable_socket     — Allow network access

Bun test tags:
  *.test.ts             — Unit tests (default)
  *.integration.test.ts — Integration tests
```
### 1.5 Coverage Requirements

| Component | Minimum | Target |
|-----------|:-------:|:------:|
| Reporter (Python) | 70% | 85% |
| CLI (TypeScript) | 60% | 80% |
| New code (any) | 80% | 95% |

Coverage is reported automatically in CI and visible in MR/PR comments.
### 1.6 Schema Validation

All health check outputs must conform to JSON schemas in `reporter/schemas/`.
Schema compliance is enforced at two levels:

1. **Build time** — `test_report_schemas.py` validates all check outputs
2. **Runtime** — `checkup.ts` validates against embedded schemas before upload

When adding a new check:

1. Create `reporter/schemas/<CHECK_ID>.schema.json`
2. Add test cases in `tests/reporter/`
3. Add CLI implementation in `cli/lib/checkup.ts`
4. Validate output matches schema in both Python and TypeScript paths
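A minimal sketch of the build-time step, using the `jsonschema` package (the Python counterpart to ajv); the inline schema and field names below are illustrative stand-ins for a real `reporter/schemas/<CHECK_ID>.schema.json` file:

```python
import jsonschema

# Inline stand-in for a reporter/schemas/<CHECK_ID>.schema.json file (illustrative).
CHECK_SCHEMA = {
    "type": "object",
    "required": ["checkId", "results"],
    "properties": {
        "checkId": {"type": "string"},
        "results": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["object_name", "detail"],
                "properties": {
                    "object_name": {"type": "string"},
                    "detail": {"type": "string"},
                },
            },
        },
    },
}

def schema_violations(output: dict) -> list:
    """Return human-readable violations; an empty list means the output conforms."""
    validator = jsonschema.Draft7Validator(CHECK_SCHEMA)
    return [error.message for error in validator.iter_errors(output)]

good = {"checkId": "H004", "results": [{"object_name": "idx_a", "detail": "duplicate of idx_b"}]}
bad = {"results": [{"object_name": 42}]}
assert schema_violations(good) == []
assert schema_violations(bad) != []
```

Collecting all violations with `iter_errors` (instead of raising on the first) keeps CI failure output actionable.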
---
## Layer 2: AI-Assisted Quality

### 2.1 AI PR Review

Every PR is reviewed by an AI agent with the PostgreSQL-specific system prompt
defined in `quality/pr-review-prompt.md`. The review focuses on:

- **SQL safety** — Injection paths, raw string concatenation, missing parameterization
- **Connection handling** — Unclosed connections, missing timeouts, pool exhaustion
- **Transaction safety** — Incorrect isolation assumptions, long-running transactions
- **Resource leaks** — Unreleased advisory locks, unclosed cursors, temp table accumulation
- **PostgreSQL version compatibility** — Features not available in all supported versions
- **Error handling** — Missing error paths on database operations
- **Lock awareness** — DDL that acquires AccessExclusive locks, missing `CONCURRENTLY`

### 2.2 AI Test Generation

When implementing a new health check or analyzer, use AI to generate test
scaffolding:

1. Write the spec/implementation
2. Feed to AI with prompt: *"Generate test cases for this PostgreSQL analyzer.
   Cover: normal case, empty table, table with no indexes, partial indexes,
   expression indexes, concurrent DDL during analysis, permission errors,
   PostgreSQL version differences."*
3. Developer reviews, adjusts, and commits the tests

### 2.3 Spec Gap Analysis

Before implementation begins, feed the spec to AI for gap analysis:

- *"What failure modes aren't addressed in this spec?"*
- *"What PostgreSQL version-specific behaviors could affect this?"*
- *"What happens if this runs concurrently with vacuum/reindex/DDL?"*

### 2.4 Automated Issue Triage

When a bug report arrives:

1. AI agent classifies severity (P0-P3)
2. Identifies likely affected components (reporter, CLI, monitoring stack)
3. Searches for related past issues
4. Drafts initial investigation path
5. Human picks up with context already assembled

---
## Layer 3: Human Quality Decisions

### 3.1 Architecture Reviews

Required for:

- New health checks that modify database state
- Changes to the Analyst/Auditor/Actor pipeline
- New autonomous actions (anything that writes to production databases)
- Changes to connection pooling or authentication flows
- New PostgreSQL extension dependencies

### 3.2 Customer Scenario Testing

Before each release, one engineer walks through key customer workflows:

| Scenario | What to verify |
|----------|---------------|
| Express checkup on fresh PostgreSQL | All checks run, report is valid JSON, upload succeeds |
| Monitoring stack install (demo mode) | `local-install --demo` completes, Grafana accessible, metrics flowing |
| Add external target database | Target added, metrics collected, checkup runs against it |
| Large database checkup | No timeouts, memory stays bounded, results are accurate |
| Extension-heavy database | Common extensions (PostGIS, pg_partman, pg_stat_statements) don't cause failures |

### 3.3 Risk Classification for Autonomous Actions

Every autonomous action (current or future) must have a risk classification:

| Risk Level | Description | Gate |
|------------|-------------|------|
| **Read-only** | Queries, EXPLAIN, pg_stat views | Automated |
| **Advisory** | Recommendations shown to user | AI review + human spot-check |
| **Reversible write** | CREATE INDEX CONCURRENTLY, config changes with reload | Human approval required |
| **Irreversible write** | DROP, TRUNCATE, ALTER TABLE rewrite | Human approval + confirmation prompt |
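A hedged sketch of how this table could back a dispatch gate in code (the enum and function names are hypothetical, not PostgresAI's actual implementation):

```python
from enum import Enum

class Risk(Enum):
    READ_ONLY = "read_only"              # queries, EXPLAIN, pg_stat views
    ADVISORY = "advisory"                # recommendations shown to user
    REVERSIBLE_WRITE = "reversible"      # CREATE INDEX CONCURRENTLY, reloadable config
    IRREVERSIBLE_WRITE = "irreversible"  # DROP, TRUNCATE, ALTER TABLE rewrite

def required_gate(risk: Risk) -> str:
    """Map a risk level to the approval gate from the table above."""
    return {
        Risk.READ_ONLY: "automated",
        Risk.ADVISORY: "ai_review_plus_spot_check",
        Risk.REVERSIBLE_WRITE: "human_approval",
        Risk.IRREVERSIBLE_WRITE: "human_approval_plus_confirmation",
    }[risk]

def may_run_unattended(risk: Risk) -> bool:
    """Only read-only actions may run without a human in the loop."""
    return risk is Risk.READ_ONLY

assert may_run_unattended(Risk.READ_ONLY)
assert not may_run_unattended(Risk.IRREVERSIBLE_WRITE)
```

Making the gate an exhaustive mapping (rather than if/else defaults) means a new risk level fails loudly until someone classifies it.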
---
## PostgreSQL-Specific Quality Standards

### SQL Query Standards

- All queries generated by the product must be tested with `EXPLAIN ANALYZE`
- No sequential scans on tables expected to have >10k rows
- No queries that acquire `AccessExclusiveLock` without explicit documentation
- All SQL uses parameterized queries (`$1`, `$2`) — never string concatenation
- Queries must specify `statement_timeout` for safety
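For example, with a driver that uses `$1`/`$2` placeholders (such as asyncpg), values travel separately from the SQL text; the helper below is an illustrative sketch, not a real PostgresAI function:

```python
# Sketch: build a parameterized query for a driver with $1/$2 placeholders
# (e.g. asyncpg). Function and column choices here are illustrative.

def top_queries_sql(min_calls: int, limit: int):
    """Return (sql, params); user-supplied values never enter the SQL string."""
    sql = (
        "SELECT queryid, calls, total_exec_time "
        "FROM pg_stat_statements "
        "WHERE calls >= $1 "
        "ORDER BY total_exec_time DESC "
        "LIMIT $2"
    )
    return sql, (min_calls, limit)

# The forbidden counterpart would interpolate values into the text:
#   f"... WHERE calls >= {min_calls} ..."   # string concatenation, injectable

sql, params = top_queries_sql(min_calls=100, limit=20)
assert "$1" in sql and params == (100, 20)
```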
### Extension Compatibility

First-class CI coverage for these extensions (used by most customers):

| Extension | Why |
|-----------|-----|
| pg_stat_statements | Core dependency for K-series checks |
| pg_stat_kcache | CPU/IO metrics in D004 |
| auto_explain | Query plan analysis |
| pg_buffercache | Buffer analysis |
| PostGIS | Common in customer deployments |
| pg_partman | Partition management |
| pgvector | Growing adoption |
### Connection Handling Standards

- All connections must have a `statement_timeout` (default: 30s for checks)
- All connections must have a `connect_timeout` (default: 10s)
- Connections must be returned to pool or closed in `finally` blocks
- Connection errors must produce actionable error messages
- Maximum connection count must be configurable and bounded
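A sketch of these rules with psycopg2 (function names and defaults are illustrative; `statement_timeout` is set through the `options` startup parameter, which libpq and psycopg2 support, so it applies to every statement on the connection):

```python
def timeout_options(statement_timeout_ms: int = 30_000) -> str:
    """Server-side statement budget, passed as a startup parameter."""
    return f"-c statement_timeout={statement_timeout_ms}"

def open_check_connection(dsn: str, connect_timeout_s: int = 10):
    """Open a connection carrying both timeouts this guide mandates."""
    import psycopg2  # imported here so the sketch stays self-contained
    return psycopg2.connect(
        dsn,
        connect_timeout=connect_timeout_s,
        options=timeout_options(),
    )

def run_check(dsn: str, sql: str):
    conn = open_check_connection(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()
    finally:
        conn.close()  # always release, even on error (close in `finally`)
```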
### WAL and Replication Safety

- Features touching WAL or replication need tests for:
  - Replica lag scenarios
  - Failover during operation
  - WAL segment cleanup interaction
- Never hold connections across WAL switch boundaries unnecessarily

---
## Process: Feature Development Workflow

### For Every Feature

```
1. Spec written
   └─→ Spec reviewed by engineer + AI gap analysis

2. Implementation + tests
   └─→ Developer writes code
   └─→ AI generates test scaffolding from spec
   └─→ Developer refines tests

3. PR opened
   └─→ CI runs fast suite (unit + lint + typecheck)
   └─→ AI runs PostgreSQL-specific review
   └─→ Human reviewer focuses on design + correctness

4. Merge to main
   └─→ Nightly: full PostgreSQL version matrix
   └─→ Nightly: performance benchmarks vs baseline

5. Release candidate
   └─→ AI produces release readiness report
   └─→ Human does scenario walkthrough
   └─→ Go/no-go decision
```
### PR Review Checklist

Before approving any PR, verify:

- [ ] Tests cover the happy path AND at least 2 error paths
- [ ] New SQL queries are parameterized (no string concatenation)
- [ ] Database connections are properly closed/returned
- [ ] New checks have corresponding JSON schema
- [ ] Schema changes are backward-compatible
- [ ] No new dependencies without justification
- [ ] Error messages are actionable (not just "something went wrong")
- [ ] PostgreSQL version-specific behavior is handled
- [ ] No hardcoded credentials, tokens, or connection strings

---
## Quality Metrics

Track these metrics to measure quality system effectiveness:

| Metric | How to Measure | Target |
|--------|---------------|--------|
| Test coverage (Python) | `pytest --cov` in CI | >70% overall, >80% new code |
| Test coverage (CLI) | Bun coverage in CI | >60% overall, >80% new code |
| CI pipeline pass rate | GitLab CI analytics | >90% on main |
| Mean time bug-intro → detection | Git blame + issue timestamps | <1 sprint |
| Performance benchmark trend | Nightly benchmark results | No regression >5% |
| Schema validation failures | CI artifact count | 0 on main |
| Security findings (SAST) | GitLab security dashboard | 0 critical/high |
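For the bug-introduction-to-detection metric, the computation itself is simple once timestamps have been extracted from `git blame` and the issue tracker (a sketch; the function name is hypothetical, not an existing tool):

```python
from datetime import datetime

def mean_detection_days(pairs) -> float:
    """Mean latency in days between bug introduction and detection.

    `pairs` is a list of (introduced_at, detected_at) datetime tuples.
    """
    if not pairs:
        return 0.0
    total_seconds = sum(
        (detected - introduced).total_seconds() for introduced, detected in pairs
    )
    return total_seconds / len(pairs) / 86_400  # 86400 seconds per day

pairs = [
    (datetime(2026, 3, 1), datetime(2026, 3, 4)),  # detected after 3 days
    (datetime(2026, 3, 2), datetime(2026, 3, 7)),  # detected after 5 days
]
assert mean_detection_days(pairs) == 4.0
```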
---

## Weekly Quality Rhythm

| Day | Activity |
|-----|----------|
| **Monday** | Review nightly test failures, triage new issues |
| **Wednesday** | Mid-week check: any flaky tests? CI pipeline health? |
| **Friday** | Quality retro: what slipped through? New test needed? CI tightening? |

---
## What We Don't Do

- **Dedicated QA team** — Quality ownership stays with engineers, amplified by AI
- **Manual test plans in spreadsheets** — Everything is code
- **Separate staging that drifts** — Use monitoring stack's own Docker setup to mirror real environments
- **100% coverage targets** — Diminishing returns; focus on critical paths and failure modes
