Commit 63328e7

feat: enhance ClickHouse Cloud integration with multi-service support and detailed configuration options
1 parent d430be5 commit 63328e7

5 files changed

Lines changed: 133 additions & 19 deletions

CHANGELOG.md

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
1+
# Changelog
2+
3+
All notable changes to this project will be documented in this file.
4+
5+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6+
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7+
8+
## [1.0.0] - 2026-04-03
9+
10+
### Added
11+
- Custom Datadog Agent check for ClickHouse Cloud log collection
12+
- Query log collection from `system.query_log` (completed queries and exceptions)
13+
- Text log collection from `system.text_log` (Error, Warning, Fatal levels)
14+
- OpenMetrics configuration for ClickHouse Cloud Prometheus endpoint
15+
- Cursor-based pagination with duplicate-delivery-over-loss semantics
16+
- Configurable batch size, slow-query threshold, backfill window, and query timeout
17+
- Input validation with bounds checking on all numeric config parameters
18+
- HTTP retry logic (2 retries with backoff on 502/503/504)
19+
- Configurable `cluster_name` for the Datadog `service` field
20+
- `ddsource: clickhouse` on all log entries for Datadog pipeline compatibility
21+
- Comprehensive test suite (53 tests)
22+
- CI workflow with ruff lint and pytest across Python 3.10/3.11/3.12
23+
- Dependabot configuration for pip and GitHub Actions
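
Note on the "HTTP retry logic" changelog entry: the sketch below shows one way to get "2 retries with backoff on 502/503/504". It is not the code from `checks/clickhouse_cloud.py`. The names `request_with_retry` and `send` are hypothetical, and the 0.5 s base delay and the doubling schedule are assumptions, since the changelog states neither.

```python
import time
from collections.abc import Callable

RETRYABLE_STATUSES = frozenset({502, 503, 504})
MAX_RETRIES = 2  # two retries after the initial attempt, per the changelog


def request_with_retry(
    send: Callable[[], tuple[int, str]],  # hypothetical: performs one HTTP call
    *,
    backoff_seconds: float = 0.5,  # assumed base delay, not stated in the source
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, str]:
    """Call send() and retry on 502/503/504, doubling the delay each time."""
    status, body = send()
    for attempt in range(MAX_RETRIES):
        if status not in RETRYABLE_STATUSES:
            break
        sleep(backoff_seconds * (2 ** attempt))
        status, body = send()
    # After the last retry the final response is returned as-is, even if it
    # is still a gateway error; the caller decides how to report it.
    return status, body
```

Passing `sleep` as a parameter keeps the delay schedule testable without waiting in real time.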

README.md

Lines changed: 30 additions & 0 deletions
@@ -66,6 +66,7 @@ instances:
6666
- service_id: "<your-service-uuid>"
6767
key_id: "<your-api-key-id>"
6868
key_secret: "<your-api-key-secret>"
69+
cluster_name: "<your-cluster-name>" # appears as "service" in Datadog Logs
6970

7071
collect_query_logs: true
7172
collect_text_logs: true
@@ -86,6 +87,35 @@ All three credential fields are required. Everything else has sensible defaults.
8687
| `initial_backfill_minutes` | 60 | 1 -- 1440 | How far back to look on first run (avoids flooding Datadog with history) |
8788
| `query_timeout_seconds` | 30 | 5 -- 300 | HTTP timeout for each ClickHouse Cloud API call |
8889

90+
### Monitoring multiple ClickHouse Cloud services
91+
92+
Add one entry per service under `instances` in the log check config. Each instance runs independently with its own cursor, credentials, and tags:
93+
94+
```yaml
95+
instances:
96+
# Production cluster
97+
- service_id: "abc-123-prod"
98+
key_id: "<prod-key-id>"
99+
key_secret: "<prod-key-secret>"
100+
cluster_name: "prod-analytics"
101+
tags:
102+
- "env:production"
103+
- "clickhouse_cluster:prod-analytics"
104+
105+
# Staging cluster
106+
- service_id: "def-456-staging"
107+
key_id: "<staging-key-id>"
108+
key_secret: "<staging-key-secret>"
109+
cluster_name: "staging-analytics"
110+
collect_text_logs: false # only query logs for staging
111+
log_batch_size: 500
112+
tags:
113+
- "env:staging"
114+
- "clickhouse_cluster:staging-analytics"
115+
```
116+
117+
For metrics, add a separate entry per service in the OpenMetrics config as well (each with its own org/endpoint). Use `cluster_name` and tags to distinguish services in Datadog dashboards and log facets.
118+
89119
### 4. Configure metrics
90120

91121
Edit `/etc/datadog-agent/conf.d/openmetrics.d/conf.yaml`:
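
Note on the README hunk above: it says each instance runs "with its own cursor", and the changelog describes "duplicate-delivery-over-loss semantics". The sketch below shows how those two properties can fit together. It is illustrative only: `run_once`, `fetch_rows`, the `event_time` field, and the in-memory `cursors` dict are hypothetical, and timestamps are assumed to be ISO 8601 strings so that string comparison matches time order.

```python
from collections.abc import Callable, Iterable


def run_once(
    cursors: dict[str, str],
    service_id: str,
    fetch_rows: Callable[[str], Iterable[dict]],  # hypothetical: rows with event_time >= cursor
    send_log: Callable[[dict], None],
    default_cursor: str,
) -> None:
    """Ship one batch for one service and advance only that service's cursor."""
    cursor = cursors.get(service_id, default_cursor)
    newest = cursor
    for row in fetch_rows(cursor):
        send_log(row)
        newest = max(newest, row["event_time"])
    # The cursor moves only after every row was handed off. If send_log
    # raises, the stored cursor is unchanged and the same rows are fetched
    # again on the next run: duplicates are possible, loss is not.
    cursors[service_id] = newest
```

Because the cursor is keyed by `service_id`, a failure while shipping logs for one service leaves the other services' cursors untouched.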

checks/clickhouse_cloud.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,8 @@
77

88
from __future__ import annotations
99

10+
__version__ = "1.0.0"
11+
1012
import json
1113
import time
1214
from collections.abc import Callable

conf.d/clickhouse_cloud.d/conf.yaml.example

Lines changed: 77 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1,30 +1,88 @@
1+
## All options defined here are available to all instances.
2+
#
13
init_config:
24

35
instances:
4-
- service_id: "<your-service-uuid>" # required – ClickHouse Cloud service UUID
5-
key_id: "<your-api-key-id>" # required – Cloud API key ID
6-
key_secret: "<your-api-key-secret>" # required – Cloud API key secret
76

8-
# Cluster / service identity
9-
cluster_name: "<your-cluster-name>" # used as the "service" field in Datadog Logs
10-
# (default: "clickhouse")
7+
## @param service_id - string - required
8+
## ClickHouse Cloud service UUID.
9+
## Found in the ClickHouse Cloud console under your service's connection details.
10+
#
11+
- service_id: "<your-service-uuid>"
1112

12-
# Log collection toggles
13-
collect_query_logs: true # default: true
14-
collect_text_logs: true # default: true
13+
## @param key_id - string - required
14+
## ClickHouse Cloud API key ID.
15+
## Create one at https://clickhouse.cloud/settings/api-keys with query access.
16+
#
17+
key_id: "<your-api-key-id>"
1518

16-
# Tuning
17-
log_batch_size: 1000 # rows per check run (1–10000)
18-
slow_query_threshold_ms: 5000 # marks query logs as warning (0–3600000)
19-
initial_backfill_minutes: 60 # lookback window on first run (1–1440)
20-
query_timeout_seconds: 30 # HTTP timeout per query (5–300)
19+
## @param key_secret - string - required
20+
## ClickHouse Cloud API key secret (paired with key_id above).
21+
#
22+
key_secret: "<your-api-key-secret>"
2123

22-
# Applied to all logs shipped
23-
tags:
24-
- "env:prod"
25-
- "clickhouse_cluster:<your-cluster-name>"
24+
## @param cluster_name - string - optional - default: clickhouse
25+
## Friendly name for this ClickHouse Cloud service.
26+
## Used as the "service" field on every Datadog log entry, making it easy
27+
## to filter logs by cluster in the Datadog Logs Explorer.
28+
#
29+
# cluster_name: clickhouse
2630

31+
## @param collect_query_logs - boolean - optional - default: true
32+
## Enable collection of query logs from system.query_log.
33+
## Captures completed queries and exceptions with duration, memory, and row stats.
34+
#
35+
# collect_query_logs: true
36+
37+
## @param collect_text_logs - boolean - optional - default: true
38+
## Enable collection of server logs from system.text_log.
39+
## Captures Error, Warning, and Fatal level server-side log entries.
40+
#
41+
# collect_text_logs: true
42+
43+
## @param log_batch_size - integer - optional - default: 1000
44+
## Maximum number of log rows fetched per check run, per table.
45+
## Range: 1–10000. Higher values reduce API calls but increase memory usage.
46+
#
47+
# log_batch_size: 1000
48+
49+
## @param slow_query_threshold_ms - integer - optional - default: 5000
50+
## Queries slower than this threshold (in milliseconds) are tagged with
51+
## status "warning" instead of "info" in Datadog Logs.
52+
## Range: 0–3600000. Set to 0 to disable slow-query highlighting.
53+
#
54+
# slow_query_threshold_ms: 5000
55+
56+
## @param initial_backfill_minutes - integer - optional - default: 60
57+
## How far back (in minutes) to look on the very first check run when
58+
## no cursor exists yet. Subsequent runs resume from the last cursor.
59+
## Range: 1–1440 (up to 24 hours).
60+
#
61+
# initial_backfill_minutes: 60
62+
63+
## @param query_timeout_seconds - integer - optional - default: 30
64+
## HTTP timeout in seconds for each ClickHouse Cloud Query API request.
65+
## Range: 5–300. Increase if you see timeout errors with large batch sizes.
66+
#
67+
# query_timeout_seconds: 30
68+
69+
## @param tags - list of strings - optional
70+
## List of tags to attach to every log entry emitted by this check.
71+
##
72+
## Learn more about tagging: https://docs.datadoghq.com/tagging/
73+
#
74+
# tags:
75+
# - "env:prod"
76+
# - "clickhouse_cluster:<your-cluster-name>"
77+
78+
## Log section (required for send_log() to work)
79+
##
80+
## The Datadog Agent's log launcher needs this block to open a file handle
81+
## for the integration. Without it, send_log() silently fails with:
82+
## "Failed to write log to file, file is nil for integration ID: clickhouse_cloud"
83+
##
84+
## Do not remove or comment out this section.
85+
#
2786
logs:
2887
- type: integration
2988
source: clickhouse
30-
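
Note on the numeric options documented above: the changelog mentions "input validation with bounds checking on all numeric config parameters". The sketch below shows what such a check could look like, using the defaults and ranges from `conf.yaml.example`. `NUMERIC_BOUNDS` and `read_int_option` are hypothetical names, and rejecting out-of-range values (rather than clamping them) is an assumption about the check's behavior.

```python
# (default, minimum, maximum), copied from the ranges in conf.yaml.example
NUMERIC_BOUNDS: dict[str, tuple[int, int, int]] = {
    "log_batch_size": (1000, 1, 10000),
    "slow_query_threshold_ms": (5000, 0, 3600000),
    "initial_backfill_minutes": (60, 1, 1440),
    "query_timeout_seconds": (30, 5, 300),
}


def read_int_option(instance: dict, name: str) -> int:
    """Return the configured value for name, or its default, after validation."""
    default, low, high = NUMERIC_BOUNDS[name]
    value = instance.get(name, default)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{name} must be an integer, got {value!r}")
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value
```

Failing at configuration time turns a typo such as `query_timeout_seconds: 3` into an explicit error instead of a check that quietly times out on every run.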

integrations-extras

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Subproject commit 683881dc728a7e8a595209c907dec18c0d2aa08a
