Commit ea8e0c6

add "Monitoring a Prometheus Exporter" howto
1 parent 39ab05b commit ea8e0c6

4 files changed

Lines changed: 171 additions & 3 deletions

docs/howto/monitor_prometheus_exporter.md

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
---
title: Monitoring a Prometheus Exporter
---

pgwatch's native strength is collecting metrics from PostgreSQL via SQL queries. Some metrics
cannot be expressed as SQL: HA cluster state, node roles, replication lag as seen by the
cluster manager, or OS-level metrics like CPU, memory, and I/O that are not accessible to Postgres
without special extensions. For these, pgwatch can scrape any HTTP endpoint that exposes data in the
[Prometheus text exposition format](https://prometheus.io/docs/instrumenting/exposition_formats/).

A typical use case is [Patroni](https://patroni.readthedocs.io/), which exposes cluster health
metrics (node role, WAL position, DCS connectivity, …) through its `GET /metrics` endpoint. These
metrics complement the SQL-based metrics already collected from the PostgreSQL instances themselves.

## Quick Start

Add a source with `kind: prometheus`, point `conn_str` at Patroni's `/metrics` endpoint, and choose
the built-in `patroni` preset:

```yaml
- name: patroni-prod-node1
  kind: prometheus
  conn_str: "http://patroni-node1:8008/metrics"
  preset_metrics: patroni
  is_enabled: true
```

pgwatch periodically fetches the URL, parses the Prometheus text format, and forwards each metric
family to the configured sink.
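
To make the parsing step more concrete, here is a minimal Python sketch that groups samples by metric family. It is not pgwatch's implementation (pgwatch is written in Go). The function name `parse_exposition` and the sample payload, including the `scope` and `name` label values, are illustrative assumptions. The sketch skips `# HELP` and `# TYPE` lines, ignores optional timestamps, and does not unescape label values.

```python
import re
from collections import defaultdict

# Matches name="value" pairs; values may contain escaped characters.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Group samples from Prometheus text format by metric family name.

    Returns {family_name: [(labels_dict, float_value), ...]}.
    """
    families = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # blank, HELP, or TYPE line
            continue
        if "{" in line:
            name, _, tail = line.partition("{")
            label_part, _, rest = tail.rpartition("}")
            labels = dict(LABEL_RE.findall(label_part))
        else:
            name, _, rest = line.partition(" ")
            labels = {}
        value = float(rest.split()[0])  # anything after the value is a timestamp
        families[name.strip()].append((labels, value))
    return dict(families)


# Hypothetical payload shaped like a Patroni /metrics response.
SAMPLE = """\
# HELP patroni_primary Value is 1 if this node is the leader, 0 otherwise.
# TYPE patroni_primary gauge
patroni_primary{scope="demo",name="node1"} 1
# TYPE patroni_xlog_location counter
patroni_xlog_location{scope="demo",name="node1"} 67108864
"""

families = parse_exposition(SAMPLE)
```

Each key of `families` corresponds to one metric family, which is the unit that presets and `custom_metrics` refer to.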

## Connection String Format

The `conn_str` field is a plain HTTP or HTTPS URL. Optional query parameters control TLS
validation and are stripped before the request is sent to the exporter:

| Query parameter | Description |
|---|---|
| `tlsskipverify=true` | Disable TLS certificate verification (use with caution). |
| `tlsrootcert=<path>` | Absolute path to a PEM-encoded CA certificate file used to verify the server certificate. |

Basic Auth credentials can be embedded in the URL in the standard `user:password@host` form.

### Examples

```yaml
# Plain HTTP – no auth
conn_str: "http://patroni-node1:8008/metrics"

# HTTPS with a custom CA certificate
conn_str: "https://patroni-node1:8008/metrics?tlsrootcert=/etc/ssl/certs/my-ca.pem"

# HTTPS with Basic Auth and TLS certificate verification disabled
conn_str: "https://user:secret@patroni-node1:8008/metrics?tlsskipverify=true"
```
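
The way a `conn_str` breaks down into a request URL, credentials, and TLS options can be sketched in Python with the standard library. This is an illustration of the format described above, not pgwatch's code: the helper name `split_conn_str` is hypothetical, and two behaviours are assumptions (credentials are removed from the URL and returned separately, and query parameters other than the two TLS ones are passed through unchanged).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The two TLS-related query parameters documented for conn_str.
TLS_PARAMS = {"tlsskipverify", "tlsrootcert"}


def split_conn_str(conn_str):
    """Split a conn_str into (request_url, (user, password), tls_options)."""
    parts = urlsplit(conn_str)
    query = parse_qsl(parts.query, keep_blank_values=True)
    tls_options = {k: v for k, v in query if k in TLS_PARAMS}
    # Assumption: any other query parameter belongs to the exporter and is kept.
    remaining = urlencode([(k, v) for k, v in query if k not in TLS_PARAMS])
    # Assumption: credentials travel separately, so drop "user:password@".
    host = parts.netloc.rpartition("@")[2]
    request_url = urlunsplit((parts.scheme, host, parts.path, remaining, ""))
    return request_url, (parts.username, parts.password), tls_options


url, auth, tls = split_conn_str(
    "https://user:secret@patroni-node1:8008/metrics?tlsskipverify=true"
)
```

Keeping the TLS options out of the request URL means the exporter never sees parameters that were meant for the scraper.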

## Presets

### `patroni`

Covers the key metric families emitted by Patroni's `GET /metrics` endpoint (default port **8008**):

| Metric family | What it measures | Interval |
|---|---|---|
| `patroni_postgres_running` | 1 if Postgres is running | 30 s |
| `patroni_primary` | 1 if this node is the primary | 30 s |
| `patroni_replica` | 1 if this node is a replica | 30 s |
| `patroni_standby_leader` | 1 if this node is the standby leader | 30 s |
| `patroni_cluster_unlocked` | 1 if the cluster has no leader lock | 30 s |
| `patroni_xlog_location` | WAL location on the primary | 30 s |
| `patroni_xlog_received_location` | Received WAL location on a replica | 30 s |
| `patroni_xlog_replayed_location` | Replayed WAL location on a replica | 30 s |
| `patroni_dcs_last_seen` | Epoch timestamp of the last successful DCS contact | 30 s |
| `patroni_pending_restart` | 1 if the node needs a restart | 60 s |
| `patroni_is_paused` | 1 if auto-failover is disabled | 60 s |
| `patroni_postgres_timeline` | Postgres timeline number | 60 s |
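
The three `patroni_xlog_*` families are easier to read once converted. The following Python sketch assumes the values are absolute WAL byte positions that have already been converted to integers. The helper names `format_lsn` and `replay_backlog_bytes` are hypothetical and not part of pgwatch or Patroni.

```python
def format_lsn(position: int) -> str:
    """Render a 64-bit WAL byte position in PostgreSQL's 'hi/lo' hex LSN notation."""
    return f"{position >> 32:X}/{position & 0xFFFFFFFF:X}"


def replay_backlog_bytes(received: int, replayed: int) -> int:
    """Bytes a replica has received but not yet replayed (never negative)."""
    return max(received - replayed, 0)


# Example values, e.g. taken from patroni_xlog_received_location and
# patroni_xlog_replayed_location on the same replica.
received = 67109120
replayed = 67108864
backlog = replay_backlog_bytes(received, replayed)
```

Subtracting positions reported by different nodes is only an approximation, because the scrapes do not happen at the same instant.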

### `postgres-exporter-basic`

Covers the most important metric families emitted by
[postgres_exporter](https://github.com/prometheus-community/postgres_exporter):

| Metric family | Interval |
|---|---|
| `pg_stat_activity_count` | 30 s |
| `pg_stat_bgwriter_checkpoints_timed` | 60 s |
| `pg_stat_replication_pg_wal_lsn_diff` | 30 s |

## Custom Metrics

Instead of a preset, you can list individual metric families and their intervals (in seconds)
with `custom_metrics`:

```yaml
- name: patroni-prod-node1
  kind: prometheus
  conn_str: "http://patroni-node1:8008/metrics"
  custom_metrics:
    patroni_primary: 30
    patroni_cluster_unlocked: 30
    patroni_dcs_last_seen: 30
  is_enabled: true
```
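
One way to read a family-to-interval mapping like the one above is as a per-family emit schedule. The Python sketch below illustrates that reading only. How pgwatch schedules scrapes internally is not described here, and the function name `due_families` and the timestamps are hypothetical.

```python
def due_families(intervals, last_emitted, now):
    """Return the families whose interval has elapsed since their last emit.

    intervals:    {family: interval_seconds}, as in custom_metrics or a preset
    last_emitted: {family: timestamp of the previous emit}; a missing key
                  means the family has never been emitted
    now:          current timestamp in seconds
    """
    due = []
    for family, interval in intervals.items():
        last = last_emitted.get(family)
        if last is None or now - last >= interval:
            due.append(family)
    return due


intervals = {"patroni_primary": 30, "patroni_pending_restart": 60}
last_emitted = {"patroni_primary": 100, "patroni_pending_restart": 100}

after_30s = due_families(intervals, last_emitted, now=130)
after_60s = due_families(intervals, last_emitted, now=160)
```

With these inputs, only the 30-second family is due at the first check, and both are due at the second.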

## Custom Tags

Use `custom_tags` to attach arbitrary key-value pairs to every stored data point. This is
useful when multiple nodes feed the same sink and you need to distinguish them:

```yaml
- name: patroni-prod-node1
  kind: prometheus
  conn_str: "http://patroni-node1:8008/metrics"
  preset_metrics: patroni
  custom_tags:
    cluster: prod
    node: node1
  is_enabled: true
```
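
The effect of `custom_tags` on a single data point can be pictured as a dictionary merge. This Python sketch is only an illustration: the function name `tag_data_point` is hypothetical, and the rule that a custom tag overrides a scraped label with the same key is an assumption, since pgwatch's behaviour on a clash is not documented here.

```python
def tag_data_point(labels, custom_tags):
    """Return a new label set with the source's custom_tags added.

    Assumption: on a key clash the custom tag wins.
    """
    tagged = dict(labels)  # copy so the scraped labels stay untouched
    tagged.update(custom_tags)
    return tagged


scraped = {"scope": "demo"}
tagged = tag_data_point(scraped, {"cluster": "prod", "node": "node1"})
```

Two nodes that emit identical scraped labels end up with distinct label sets once each source adds its own `node` tag.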

## Full Example

The following snippet monitors all three nodes of a Patroni HA cluster. Each node's metrics are
tagged with the cluster name and node identifier so they can be queried independently:

```yaml
- name: patroni-prod-node1
  kind: prometheus
  conn_str: "http://patroni-node1:8008/metrics"
  preset_metrics: patroni
  custom_tags:
    cluster: prod
    node: node1
  is_enabled: true

- name: patroni-prod-node2
  kind: prometheus
  conn_str: "http://patroni-node2:8008/metrics"
  preset_metrics: patroni
  custom_tags:
    cluster: prod
    node: node2
  is_enabled: true

- name: patroni-prod-node3
  kind: prometheus
  conn_str: "http://patroni-node3:8008/metrics"
  preset_metrics: patroni
  custom_tags:
    cluster: prod
    node: node3
  is_enabled: true
```
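
Because the three entries differ only in the node name, the list can also be generated. The Python sketch below renders the same YAML as text using only the standard library. The function name `render_sources` and its parameters are hypothetical, and the host naming scheme `patroni-<node>` is taken from the example above rather than from any pgwatch convention.

```python
def render_sources(cluster, nodes, port=8008, preset="patroni"):
    """Render one prometheus source entry per node as YAML text."""
    blocks = []
    for node in nodes:
        blocks.append(
            f"- name: patroni-{cluster}-{node}\n"
            f"  kind: prometheus\n"
            f'  conn_str: "http://patroni-{node}:{port}/metrics"\n'
            f"  preset_metrics: {preset}\n"
            f"  custom_tags:\n"
            f"    cluster: {cluster}\n"
            f"    node: {node}\n"
            f"  is_enabled: true\n"
        )
    # A blank line between entries, as in the hand-written example.
    return "\n".join(blocks)


sources_yaml = render_sources("prod", ["node1", "node2", "node3"])
```

Writing `sources_yaml` to a file gives a starting point that can then be edited by hand.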

internal/metrics/metrics.yaml

Lines changed: 15 additions & 0 deletions
@@ -4486,6 +4486,21 @@ presets:
     # These presets map Prometheus metric family names (as emitted by exporters such as
     # postgres_exporter or node_exporter) to per-family emit intervals in seconds.
     # They are only meaningful for sources with kind: prometheus.
+    patroni:
+        description: "Core metrics from Patroni /metrics endpoint (port 8008)"
+        metrics:
+            patroni_postgres_running: 30
+            patroni_primary: 30
+            patroni_replica: 30
+            patroni_standby_leader: 30
+            patroni_cluster_unlocked: 30
+            patroni_xlog_location: 30
+            patroni_xlog_received_location: 30
+            patroni_xlog_replayed_location: 30
+            patroni_dcs_last_seen: 30
+            patroni_pending_restart: 60
+            patroni_is_paused: 60
+            patroni_postgres_timeline: 60
     postgres-exporter-basic:
         description: "Core metrics from postgres_exporter"
         metrics:

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@ nav:
     - Migration from pgwatch v2: howto/migrate_v2_to_v3.md
     - Implement Custom gRPC Sink: howto/implement_grpc_server.md
     - Reverse Proxy Setup: howto/reverse_proxy.md
+    - Monitoring a Prometheus Exporter: howto/monitor_prometheus_exporter.md
   - Reference:
     - CLI & Environment Variables: reference/cli_env.md
     - REST API: reference/rest.md

spec/tasks/prometheus-exporter-source.md

Lines changed: 3 additions & 3 deletions
@@ -243,9 +243,9 @@ Covers REQ-032, REQ-033, GUD-004.
 
 **Purpose**: Race-detector sweep, build verification, and documentation.
 
-- [ ] T051 Run `go test -race -failfast -p 1 -timeout=300s -parallel=1 ./...` and fix any detected races
-- [ ] T052 [P] Run `go build ./cmd/pgwatch/` and confirm binary builds without errors
-- [ ] T053 [P] Update `docs/reference/` or `docs/howto/` with a short section on configuring prometheus sources and the `postgres-exporter-basic` preset
+- [x] T051 Run `go test -race -failfast -p 1 -timeout=300s -parallel=1 ./...` and fix any detected races
+- [x] T052 [P] Run `go build ./cmd/pgwatch/` and confirm binary builds without errors
+- [x] T053 [P] Update `docs/reference/` or `docs/howto/` with a short section on configuring prometheus sources and the `postgres-exporter-basic` preset
 
 ---
 