AI Copilot for pgwatch - starting implementation outside GSoC #1386
The five-stage architecture is well thought out. A few observations from working in the NL-to-SQL space:

MetricSignal abstraction (Stage 1)
This is the highest-leverage design decision. Raw counter values are meaningless to an LLM; rates, baselines, and anomaly scores give it something it can actually reason about. One thing worth considering: include the units in the struct (e.g., …).

SQL safety (Stage 2)
The …

Schema context (Stage 3)
For the schema context builder, consider caching …

NL-to-SQL connection
The copilot will inevitably need to translate natural language questions ("why is this query slow?") into diagnostic SQL against … Disclosure: I work on ai2sql.io, a natural language to SQL tool.

Standalone vs. subcommand
The maintainer's suggestion of a standalone app is the right call. A shared Go module for sink access would give you clean separation without duplicating connection logic.
I've just set up the project, added config, and made the sink connection.
Hi everyone,
I applied for GSoC this year to build an AI Copilot for pgwatch, and although the project wasn't selected, I was encouraged to continue contributing outside the program. I'm planning to start the implementation now and wanted to open this discussion to align on scope and approach before writing code.
My plan is to build this in 5 stages:
1- Metric transformation engine
A Go package that reads the two most recent rows per metric series from the sink, scoped by sys_id, and computes rate-of-change, rolling baseline, and anomaly score. Output is a []MetricSignal slice, so the LLM never sees raw counter values.
2- Safety and validation layer
AST-based SQL validation using pg_query_go, read-only session enforcement via pgx RuntimeParams, and a pgmicro in-memory sandbox so LLM-generated queries never touch the production sink.
3- Schema and context builder
Read-only connection to the monitored instance for pg_stat_user_tables, pg_stat_statements, and optional pg_stat_plans / HypoPG, with graceful degradation when absent.
4- Prompt assembler and LLM interface
Pluggable LLMProvider interface with OpenAI and Anthropic adapters (possibly other providers too), streaming via SSE, and a configurable token window policy.
5- CLI integration
A pgwatch copilot ask "" subcommand with --sys-id, --provider, and --window flags.
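To make the Stage 2 gate concrete, here is a minimal sketch of its shape. The function and error names are made up, and the naive keyword/statement-count check below is only a stand-in for the real pg_query_go AST walk, which is strictly stronger (it also catches things like SELECT ... INTO or CTEs wrapping DML):

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrNotReadOnly is a hypothetical error for queries the copilot refuses to run.
var ErrNotReadOnly = errors.New("only a single SELECT statement is allowed")

// validateReadOnly is a naive placeholder for the AST-based check the
// plan describes: it rejects multi-statement input and anything that
// does not start with SELECT. A pg_query_go implementation would parse
// the SQL and inspect the statement nodes instead of the text.
func validateReadOnly(sql string) error {
	trimmed := strings.TrimSuffix(strings.TrimSpace(sql), ";")
	if strings.Contains(trimmed, ";") {
		return ErrNotReadOnly // more than one statement
	}
	if !strings.HasPrefix(strings.ToUpper(strings.TrimSpace(trimmed)), "SELECT") {
		return ErrNotReadOnly
	}
	return nil
}

func main() {
	for _, q := range []string{
		"SELECT * FROM pg_stat_user_tables;",
		"DROP TABLE metrics;",
		"SELECT 1; DELETE FROM t;",
	} {
		fmt.Println(q, "allowed:", validateReadOnly(q) == nil)
	}
}
```

Even with an AST gate, keeping the read-only session enforcement (and the sandbox) as independent layers is worthwhile: defense in depth means a validator bug alone cannot mutate the sink.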
Before I start, there are a few questions I'd like to confirm:
Should the Copilot live as a new subcommand inside the existing pgwatch binary, or as a separate binary that imports pgwatch as a library?
For the two-connection model (one to the sink and one to the monitored instance for schema context): is this the expected pattern for an internal pgwatch component?
If there are any other suggestions or ideas, I'll be happy to adjust the scope or approach based on feedback and guidance here before I start.
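On the two-connection question, one way to keep the two roles explicit is a small config type, with the monitored-instance DSN forced read-only at the session level. This is a sketch under assumptions: the type and helper are hypothetical, and it assumes libpq keyword/value DSNs; default_transaction_read_only is the real PostgreSQL setting involved.

```go
package main

import "fmt"

// CopilotConns is a hypothetical config holding the two DSNs described
// above: one for the metrics sink and one for the monitored instance.
type CopilotConns struct {
	SinkDSN   string // pgwatch measurements sink
	TargetDSN string // monitored instance, schema/stat views only
}

// withReadOnly appends the default_transaction_read_only GUC to a
// keyword/value DSN so every transaction on that connection starts
// read-only, independent of any SQL-level validation.
func withReadOnly(dsn string) string {
	return dsn + " options='-c default_transaction_read_only=on'"
}

func main() {
	c := CopilotConns{
		SinkDSN:   "host=sink dbname=measurements",
		TargetDSN: withReadOnly("host=prod dbname=app"),
	}
	fmt.Println(c.TargetDSN)
}
```

With pgx, setting RuntimeParams["default_transaction_read_only"] = "on" on the connection config achieves the same effect at the driver level, which lines up with the Stage 2 plan.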