Draft
27 commits
ba24a0c
add spec for prometheus sources
pashagolub Apr 26, 2026
0eecb70
update spec with a simpler metric definitions
pashagolub May 8, 2026
16ceca8
add spec to refactor SourceConn as interface
pashagolub May 8, 2026
1b8bae2
rebase specs on batch metric fetching
pashagolub May 8, 2026
f67df2e
add `NewFakeExporter()` to `testutil`
pashagolub May 8, 2026
bdd38b3
extract `SourceConn` interface and `DbConn` struct
pashagolub May 8, 2026
e7a2c40
add `prometheus` source kind
pashagolub May 11, 2026
59357be
implement `PromConn.Connect()`, `Ping()`, `FetchRuntimeInfo()`
pashagolub May 11, 2026
57373c7
use `Ping()` call in ping command
pashagolub May 11, 2026
01c515b
implement `ScrapeAll` and extend `MeasurementEnvelope` with `SourceKind`
pashagolub May 11, 2026
ff27861
make `Reaper` an interface and `reaper` a struct implementing it
pashagolub May 11, 2026
a69f44e
remove `sourceReapers`, use `cancelFuncs` only
pashagolub May 11, 2026
ec19b72
rename `SourceReaper` to `DbConnReaper`
pashagolub May 11, 2026
b26eaef
implement `PromReaper`
pashagolub May 11, 2026
f312d55
move `DbConnReaper` into `database.go`
pashagolub May 11, 2026
6d88585
rename `prom_reaper.go` to `prometheus.go`
pashagolub May 11, 2026
7d0a5cd
make `NewReaper` return an interface, while `newReaper` return struct
pashagolub May 11, 2026
ad99c89
support prom-sourced envelopes in `PrometheusWriter`
pashagolub May 13, 2026
39ab05b
add `postgres-exporter-basic` preset and prometheus source example
pashagolub May 13, 2026
ea8e0c6
add "Monitoring a Prometheus Exporter" howto
pashagolub May 13, 2026
12d0d76
add `prometheus` source kind to webui
pashagolub May 14, 2026
249cf74
add `prometheus` source kind migration for postgres sink schema
pashagolub May 14, 2026
00930c1
use `GET` request in `PromConn.Ping()`
pashagolub May 14, 2026
31753a1
fix `ResolveDatabases()` for Prometheus sources
pashagolub May 14, 2026
5c68a8a
add Prometheus sources to docker compose test instance
pashagolub May 14, 2026
4f468ec
add "Patroni Cluster Overview" dashboards
pashagolub May 14, 2026
392a91c
fix schema migration tests
pashagolub May 14, 2026
8 changes: 8 additions & 0 deletions contrib/sample.sources.yaml
@@ -1,3 +1,11 @@
+- name: postgres-exporter-prod
+  kind: prometheus
+  conn_str: "http://localhost:9187/metrics"
+  preset_metrics: postgres-exporter-basic # scrape cadence = min(30, 60) = 30 s
+  custom_tags:
+    env: production
+  is_enabled: true
+
 - name: test1 # An arbitrary unique name for the monitored source
   conn_str: postgresql://postgres@localhost/postgres
   kind: postgres # One of the:
17 changes: 10 additions & 7 deletions docker/scripts/add-test-db.sh
@@ -14,11 +14,14 @@ docker compose exec -T -i postgres psql -d pgwatch_metrics -v ON_ERROR_STOP=1
 docker compose exec -T postgres psql -d pgwatch -v ON_ERROR_STOP=1 -c \
 "TRUNCATE pgwatch.source CASCADE;
 INSERT INTO pgwatch.source
-(name, dbtype, preset_config, connstr)
+(name, dbtype, preset_config, connstr, custom_tags)
 VALUES
-('demo', 'postgres', 'debug', 'postgresql://pgwatch:pgwatchadmin@postgres/pgwatch'),
-('demo_metrics', 'postgres', 'full', 'postgresql://pgwatch:pgwatchadmin@postgres/pgwatch_metrics'),
-('demo_standby', 'postgres', 'full', 'postgresql://pgwatch:pgwatchadmin@postgres-standby/pgwatch'),
-('demo_patroni', 'patroni', 'basic', 'etcd://etcd1:2379,etcd2:2379,etcd3:2379/service/demo'),
-('demo_pgbouncer', 'pgbouncer', 'pgbouncer', 'postgresql://pgwatch:pgwatchadmin@pgbouncer/pgbouncer'),
-('demo_pgpool', 'pgpool', 'pgpool', 'postgresql://pgwatch:pgwatchadmin@pgpool/pgwatch');"
+('demo', 'postgres', 'debug', 'postgresql://pgwatch:pgwatchadmin@postgres/pgwatch', NULL),
+('demo_metrics', 'postgres', 'full', 'postgresql://pgwatch:pgwatchadmin@postgres/pgwatch_metrics', NULL),
+('demo_standby', 'postgres', 'full', 'postgresql://pgwatch:pgwatchadmin@postgres-standby/pgwatch', NULL),
+('demo_patroni', 'patroni', 'basic', 'etcd://etcd1:2379,etcd2:2379,etcd3:2379/service/demo', NULL),
+('demo_pgbouncer', 'pgbouncer', 'pgbouncer', 'postgresql://pgwatch:pgwatchadmin@pgbouncer/pgbouncer', NULL),
+('demo_pgpool', 'pgpool', 'pgpool', 'postgresql://pgwatch:pgwatchadmin@pgpool/pgwatch', NULL),
+('patroni1-prom', 'prometheus', 'patroni', 'http://patroni1:8008/metrics', '{\"cluster\": \"demo\", \"node\": \"patroni1\"}'),
+('patroni2-prom', 'prometheus', 'patroni', 'http://patroni2:8008/metrics', '{\"cluster\": \"demo\", \"node\": \"patroni2\"}'),
+('patroni3-prom', 'prometheus', 'patroni', 'http://patroni3:8008/metrics', '{\"cluster\": \"demo\", \"node\": \"patroni3\"}');"
152 changes: 152 additions & 0 deletions docs/howto/monitor_prometheus_exporter.md

@0xgouda check this out! :)

@@ -0,0 +1,152 @@
---
title: Monitoring a Prometheus Exporter
---

pgwatch's native strength is collecting metrics from PostgreSQL via SQL queries. For metrics that
cannot be expressed as SQL — such as HA cluster state, node roles, replication lag as seen by the
cluster manager, or OS-level metrics like CPU, memory, and I/O that are not accessible to Postgres
without special extensions — pgwatch can scrape any HTTP endpoint that exposes data in the
[Prometheus text exposition format](https://prometheus.io/docs/instrumenting/exposition_formats/).

A typical use case is [Patroni](https://patroni.readthedocs.io/), which exposes cluster health
metrics (node role, WAL position, DCS connectivity, …) through its `GET /metrics` endpoint. These
metrics complement the SQL-based metrics already collected from the PostgreSQL instances themselves.

## Quick Start

Add a source with `kind: prometheus`, point `conn_str` at Patroni's `/metrics` endpoint, and choose
the built-in `patroni` preset:

```yaml
- name: patroni-prod-node1
  kind: prometheus
  conn_str: "http://patroni-node1:8008/metrics"
  preset_metrics: patroni
  is_enabled: true
```

pgwatch periodically fetches the URL, parses the Prometheus text format, and forwards each metric
family to the configured sink.
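
To make the parsing step concrete, here is a minimal, standard-library-only sketch of reading the text exposition format. It is not pgwatch's implementation (this PR adds `github.com/prometheus/common` as a direct dependency, which presumably does the real parsing). The `sample` type, the `parseExposition` function, and the sample payload are hypothetical, and the sketch groups samples by metric name rather than by true metric family.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sample is one parsed exposition line (hypothetical type for this sketch).
type sample struct {
	Name   string
	Labels string // raw text between '{' and '}', left unparsed here
	Value  float64
}

// parseExposition groups samples by metric name. It skips blank lines and
// "# HELP" / "# TYPE" comments, and ignores the optional trailing timestamp.
func parseExposition(text string) (map[string][]sample, error) {
	byName := make(map[string][]sample)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		var s sample
		var rest string
		if open := strings.IndexByte(line, '{'); open >= 0 {
			end := strings.LastIndexByte(line, '}')
			if end < open {
				return nil, fmt.Errorf("unbalanced braces in %q", line)
			}
			s.Name, s.Labels, rest = line[:open], line[open+1:end], line[end+1:]
		} else {
			s.Name = strings.Fields(line)[0]
			rest = strings.TrimPrefix(line, s.Name)
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			return nil, fmt.Errorf("missing value in %q", line)
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", line, err)
		}
		s.Value = v
		byName[s.Name] = append(byName[s.Name], s)
	}
	return byName, sc.Err()
}

func main() {
	// Illustrative payload; the label names and values are assumptions.
	body := "# TYPE patroni_primary gauge\n" +
		"patroni_primary{scope=\"demo\",name=\"patroni1\"} 1\n" +
		"patroni_postgres_timeline{scope=\"demo\",name=\"patroni1\"} 3\n"
	metrics, err := parseExposition(body)
	if err != nil {
		panic(err)
	}
	for name, samples := range metrics {
		fmt.Println(name, samples[0].Labels, samples[0].Value)
	}
}
```

A production parser also has to handle escaped label values, histograms, and summaries, which this sketch deliberately leaves out.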

## Connection String Format

The `conn_str` field is a plain HTTP or HTTPS URL. Optional query parameters control TLS
validation and are stripped before the request is sent to the exporter:

| Query parameter | Description |
|---|---|
| `tlsskipverify=true` | Disable TLS certificate verification (use with caution). |
| `tlsrootcert=<path>` | Absolute path to a PEM-encoded CA certificate file used to verify the server certificate. |

Basic Auth credentials are embedded in the URL in the standard `user:password@host` form.

### Examples

```yaml
# Plain HTTP – no auth
conn_str: "http://patroni-node1:8008/metrics"

# HTTPS with a custom CA certificate
conn_str: "https://patroni-node1:8008/metrics?tlsrootcert=/etc/ssl/certs/my-ca.pem"

# HTTPS with Basic Auth and TLS certificate verification disabled
conn_str: "https://user:secret@patroni-node1:8008/metrics?tlsskipverify=true"
```

## Presets

### `patroni`

Covers the key metric families emitted by Patroni's `GET /metrics` endpoint (default port **8008**):

| Metric family | What it measures | Interval |
|---|---|---|
| `patroni_postgres_running` | 1 if Postgres is running | 30 s |
| `patroni_primary` | 1 if this node is the primary | 30 s |
| `patroni_replica` | 1 if this node is a replica | 30 s |
| `patroni_standby_leader` | 1 if this node is standby leader | 30 s |
| `patroni_cluster_unlocked` | 1 if the cluster has no leader lock | 30 s |
| `patroni_xlog_location` | WAL location on primary | 30 s |
| `patroni_xlog_received_location` | received WAL on replica | 30 s |
| `patroni_xlog_replayed_location` | replayed WAL on replica | 30 s |
| `patroni_dcs_last_seen` | Unix timestamp of the last successful DCS contact | 30 s |
| `patroni_pending_restart` | 1 if node needs a restart | 60 s |
| `patroni_is_paused` | 1 if auto-failover is disabled | 60 s |
| `patroni_postgres_timeline` | Postgres timeline | 60 s |

### `postgres-exporter-basic`

Covers the most important metric families emitted by
[postgres_exporter](https://github.com/prometheus-community/postgres_exporter):

| Metric family | Interval |
|---|---|
| `pg_stat_activity_count` | 30 s |
| `pg_stat_bgwriter_checkpoints_timed` | 60 s |
| `pg_stat_replication_pg_wal_lsn_diff` | 30 s |

## Custom Metrics

You can specify individual metric families and their intervals with `custom_metrics`:

```yaml
- name: patroni-prod-node1
  kind: prometheus
  conn_str: "http://patroni-node1:8008/metrics"
  custom_metrics:
    patroni_primary: 30
    patroni_cluster_unlocked: 30
    patroni_dcs_last_seen: 30
  is_enabled: true
```

## Custom Tags

Use `custom_tags` to attach arbitrary key-value pairs to every stored data point. This is
useful when multiple nodes feed the same sink and you need to distinguish them:

```yaml
- name: patroni-prod-node1
  kind: prometheus
  conn_str: "http://patroni-node1:8008/metrics"
  preset_metrics: patroni
  custom_tags:
    cluster: prod
    node: node1
  is_enabled: true
```

## Full Example

The following snippet monitors all three nodes of a Patroni HA cluster. Each node's metrics are
tagged with the cluster name and node identifier so they can be queried independently:

```yaml
- name: patroni-prod-node1
  kind: prometheus
  conn_str: "http://patroni-node1:8008/metrics"
  preset_metrics: patroni
  custom_tags:
    cluster: prod
    node: node1
  is_enabled: true

- name: patroni-prod-node2
  kind: prometheus
  conn_str: "http://patroni-node2:8008/metrics"
  preset_metrics: patroni
  custom_tags:
    cluster: prod
    node: node2
  is_enabled: true

- name: patroni-prod-node3
  kind: prometheus
  conn_str: "http://patroni-node3:8008/metrics"
  preset_metrics: patroni
  custom_tags:
    cluster: prod
    node: node3
  is_enabled: true
```

4 changes: 2 additions & 2 deletions go.mod
@@ -11,6 +11,8 @@ require (
 	github.com/json-iterator/go v1.1.12
 	github.com/pashagolub/pgxmock/v4 v4.9.0
 	github.com/prometheus/client_golang v1.23.2
+	github.com/prometheus/client_model v0.6.2
+	github.com/prometheus/common v0.67.5
 	github.com/rifflock/lfshook v0.0.0-20180920164130-b9218ef580f5
 	github.com/sethvargo/go-retry v0.3.0
 	github.com/shirou/gopsutil/v4 v4.26.4
@@ -76,8 +78,6 @@ require (
 	github.com/opencontainers/image-spec v1.1.1 // indirect
 	github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
 	github.com/power-devops/perfstat v0.0.0-20240221224432-82ca36839d55 // indirect
-	github.com/prometheus/client_model v0.6.2 // indirect
-	github.com/prometheus/common v0.67.5 // indirect
 	github.com/prometheus/procfs v0.20.1 // indirect
 	github.com/tklauser/go-sysconf v0.3.16 // indirect
 	github.com/tklauser/numcpus v0.11.0 // indirect