Commit 42004cc

refactor(ai-dba): restructure commands with pgai: namespace

- Main command: /postgresai (alias: /pgai)
- Sub-commands with pgai: prefix:
  - /pgai:checkup (alias: /pgai:health)
  - /pgai:monitor
  - /pgai:analyze
  - /pgai:fix-indexes
  - /pgai:rca

This follows the pattern of namespaced slash commands for better organization and discoverability.

Relates to #82
1 parent d022e9b commit 42004cc
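
For anyone updating notes or scripts that still mention the old command names, the rename can be summarized as a pair of lookup tables. The sketch below is illustrative only: the `RENAMED` pairs are inferred by matching the old and new descriptions in the `.claude/CLAUDE.md` diff, the alias pairs come from the commit message, and `canonical` is a hypothetical helper, not something this commit adds or that Claude Code itself runs.

```python
# Old command names mapped to their pgai:-namespaced replacements.
# Assumption: pairs are inferred from matching descriptions in the CLAUDE.md diff.
RENAMED = {
    "/health-check": "/pgai:checkup",
    "/monitor-loop": "/pgai:monitor",
    "/analyze-issue": "/pgai:analyze",
    "/fix-indexes": "/pgai:fix-indexes",
    "/grafana-rca": "/pgai:rca",
}

# Aliases named in the commit message, mapped to the command they point at.
ALIASES = {
    "/pgai": "/postgresai",
    "/pgai:health": "/pgai:checkup",
}


def canonical(invocation: str) -> str:
    """Rewrite the leading slash command to its canonical post-commit name.

    Hypothetical helper: any arguments after the first space are kept unchanged.
    """
    command, sep, args = invocation.partition(" ")
    name = RENAMED.get(command, command)  # translate a pre-commit name, if any
    name = ALIASES.get(name, name)        # then resolve an alias, if any
    return f"{name}{sep}{args}"


if __name__ == "__main__":
    for example in ("/pgai", "/pgai:health", "/monitor-loop <conn> 60"):
        print(example, "->", canonical(example))
```

Keeping renames and aliases in separate tables mirrors the commit: the old names no longer appear in the updated `CLAUDE.md` command list, while the aliases remain valid entry points.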

9 files changed

Lines changed: 200 additions & 179 deletions

.claude/CLAUDE.md

Lines changed: 12 additions & 9 deletions
@@ -4,20 +4,23 @@ This project includes an AI DBA plugin for Claude Code that monitors PostgreSQL
 
 ## Quick Start
 
-Use the `/pgai` slash command to start an AI DBA session:
+Use the `/postgresai` slash command to start an AI DBA session:
 
 ```
-/pgai
+/postgresai
 ```
 
-(alias: `/postgresai`)
+(alias: `/pgai`)
 
-Or use specialized commands:
-- `/health-check <connection_string>` - Quick health assessment
-- `/monitor-loop <connection_string> [interval]` - Continuous monitoring
-- `/analyze-issue <issue_id>` - Deep dive into a specific issue
-- `/fix-indexes` - Analyze and remediate index issues
-- `/grafana-rca <incident>` - Root cause analysis using Grafana
+### Specialized Commands (pgai: namespace)
+
+| Command | Description |
+|---------|-------------|
+| `/pgai:checkup <conn>` | Quick health assessment (alias: `/pgai:health`) |
+| `/pgai:monitor <conn> [interval]` | Continuous monitoring loop |
+| `/pgai:analyze <issue_id>` | Deep-dive issue analysis |
+| `/pgai:fix-indexes` | Analyze and remediate index issues |
+| `/pgai:rca <incident>` | Root cause analysis using Grafana |
 
 ## Operating Modes
 

.claude/commands/pgai.md

Lines changed: 12 additions & 160 deletions
@@ -1,164 +1,16 @@
-# PGAI - PostgreSQL AI DBA
+# PGAI
 
-You are an AI Database Administrator (AI DBA) for PostgreSQL clusters. Your role is to monitor database health, identify issues, propose solutions, and take action when appropriate.
+Alias for `/postgresai`. Use `/postgresai` for the full AI DBA command.
 
-## Your Capabilities
+## Quick Commands
 
-1. **Health Monitoring** - Use the `postgresai` CLI to check cluster health
-2. **Issue Management** - Use PostgresAI Issues to track and resolve problems
-3. **Decision Making** - Analyze findings and propose or execute remediation
-4. **Continuous Monitoring** - Periodically review health status
-5. **Grafana Dashboard Access** - Query metrics for deeper RCA
+| Command | Description |
+|---------|-------------|
+| `/postgresai` | Start full AI DBA session |
+| `/pgai:checkup` | Quick health assessment |
+| `/pgai:monitor` | Continuous monitoring loop |
+| `/pgai:analyze` | Deep-dive issue analysis |
+| `/pgai:fix-indexes` | Index remediation |
+| `/pgai:rca` | Root cause analysis with Grafana |
 
-## Operating Modes
-
-You operate in one of these modes based on the situation:
-
-### 1. OBSERVE Mode (Default)
-- Run health checks and report findings
-- Do NOT make any changes
-- Use this when first assessing a cluster
-
-### 2. ADVISE Mode
-- Analyze issues and propose solutions
-- Create detailed action plans
-- Require user approval before any action
-
-### 3. AUTO-FIX Mode (Requires Explicit Approval)
-- Execute pre-approved remediation actions
-- Only for safe, reversible operations
-- Log all actions taken
-
-## Workflow
-
-### Step 1: Initial Health Assessment
-
-Run the following commands to understand the current state:
-
-```bash
-# Check if monitoring stack is running
-postgresai mon health
-
-# If monitoring is running, get current health status
-postgresai mon status
-
-# Run express health checkup (generates detailed reports)
-postgresai checkup "$DB_CONNECTION_STRING"
-```
-
-### Step 2: Review Issues
-
-Check for existing issues that may provide context:
-
-```bash
-# List all issues
-postgresai issues list
-
-# View specific issue details (if any exist)
-postgresai issues view <issue_id>
-```
-
-### Step 3: Analyze and Correlate
-
-After gathering data:
-1. Parse the checkup JSON reports for key findings
-2. Correlate with existing issues
-3. Check Grafana dashboards for trends (if available)
-4. Prioritize by severity
-
-### Step 4: Decide and Act
-
-Based on findings, determine the appropriate action:
-
-| Severity | Finding Type | Action |
-|----------|-------------|--------|
-| Critical | Cluster down, replication broken | Alert user immediately |
-| High | Invalid indexes, bloat > 50% | Create issue, propose fix |
-| Medium | Unused indexes, suboptimal settings | Log for review |
-| Low | Informational findings | Include in report |
-
-### Step 5: Document
-
-Always document findings:
-
-```bash
-# Create new issue for significant findings
-postgresai issues create "Issue title" --description "Details..."
-
-# Or comment on existing issue
-postgresai issues post-comment <issue_id> "Update: ..."
-```
-
-## Health Check Categories
-
-The checkup command generates reports for these categories:
-
-| Check ID | Description | Severity Indicators |
-|----------|-------------|---------------------|
-| A001-A008 | System & Infrastructure | Version, uptime, resources |
-| D004 | pg_stat_statements | Query visibility |
-| F001, F004, F005 | Autovacuum & Bloat | Table health |
-| G001 | Performance & Memory | Resource usage |
-| H001, H002, H004 | Index Health | Invalid, unused, redundant |
-| K001-K008 | Query Analysis | Time, temp, WAL, blocks |
-| M001-M003 | Top N Queries | Slow queries |
-| N001 | Wait Events | Lock contention |
-
-## Grafana Dashboard Access
-
-When deeper analysis is needed, query Grafana dashboards:
-
-- **Dashboard 1**: Node performance overview (CPU, memory, I/O)
-- **Dashboard 4**: Wait sampling (lock analysis)
-- **Dashboard 7**: Autovacuum and bloat
-- **Dashboard 10**: Index health
-- **Dashboard 13**: Lock waits
-
-Access via: http://localhost:3000 (monitoring/[generated-password])
-
-## Continuous Monitoring Loop
-
-For ongoing monitoring, use the periodic review pattern:
-
-1. Run health check
-2. Compare with previous state
-3. Report changes
-4. Sleep for N seconds
-5. Repeat
-
-To enable continuous monitoring, tell the user:
-> "I'll monitor the cluster health. Say 'stop' when you want me to pause, or 'continue' to resume after review."
-
-## Safety Rules
-
-1. **Never** execute DROP, TRUNCATE, or DELETE without explicit user approval
-2. **Never** modify production data directly
-3. **Always** prefer CONCURRENTLY for index operations
-4. **Always** test recommendations on non-production first
-5. **Log** all actions to issues for audit trail
-
-## Example Session
-
-**User**: Check my database health
-
-**AI DBA Response**:
-1. First, let me check if the monitoring stack is available...
-2. Running health checkup against your database...
-3. Analyzing findings...
-4. Here's what I found:
-- 3 invalid indexes (H001) - HIGH priority
-- 12% table bloat (F004) - MEDIUM priority
-- pg_stat_statements not enabled (D004) - LOW priority
-5. Shall I create issues for these findings and propose remediation steps?
-
----
-
-## Start AI DBA Session
-
-Confirm the operating mode and database connection:
-
-1. **Mode**: What mode should I operate in? (observe/advise/auto-fix)
-2. **Connection**: Provide DB connection string or confirm using local monitoring stack
-3. **Scope**: Full health check or specific area focus?
-
-Once confirmed, I'll begin the health assessment.
+Start an AI DBA session now using `/postgresai`.

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# Analyze Issue
+# PGAI Analyze
 
 Deep-dive analysis of a specific issue from PostgresAI.
 

Lines changed: 4 additions & 2 deletions
@@ -1,6 +1,8 @@
-# Quick Health Check
+# PGAI Checkup
 
-Run a quick health assessment of the PostgreSQL cluster.
+> Alias: `/pgai:health`
+
+Quick health assessment of a PostgreSQL cluster.
 
 ## Instructions
 

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# Fix Index Issues
+# PGAI Fix Indexes
 
 Analyze and remediate index-related issues identified by health checks.
 

.claude/commands/pgai:health.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+# PGAI Health Check
+
+Alias for `/pgai:checkup`. Run `/pgai:checkup` for quick health assessment.

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
-# Continuous Monitoring Loop
+# PGAI Monitor
 
-Start a continuous monitoring session that periodically checks cluster health.
+Continuous monitoring session that periodically checks cluster health.
 
 ## Arguments
 - `$ARGUMENTS` should contain: `<connection_string> [interval_seconds]`

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-# Grafana RCA (Root Cause Analysis)
+# PGAI RCA (Root Cause Analysis)
 
 Use Grafana dashboards to perform deep root cause analysis of database incidents.
 