
Commit 1552a9c

netdatabot authored and github-actions[bot] committed
Ingest new documentation
1 parent d284aa3 commit 1552a9c

240 files changed

Lines changed: 24101 additions & 1766 deletions


docs/Alerts & Notifications/Alert Configuration Reference.mdx

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1038,7 +1038,7 @@ Several stock health configurations use host variables to reference dimensions f
10381038

10391039
##### Prometheus Collector Variables
10401040

1041-
For metrics collected by the go.d `prometheus` collector, each unique Prometheus label set usually produces a separate chart. The chart ID is built from the metric name followed by `-label=value` pairs for every label (e.g. `kubelet_volume_stats_used_bytes-persistentvolumeclaim=my-pvc`). In the Netdata chart registry, the prefix comes from the go.d job `FullName`: it is `prometheus.<metric_name>-<label_set>` only when the job name is literally `prometheus`; otherwise it is `prometheus_<job_name>.<metric_name>-<label_set>` (for example, `prometheus_local.<metric_name>-<label_set>` or `prometheus_kubelet.<metric_name>-<label_set>`). For summary and histogram metric families, the collector may also emit related chart IDs such as `<id>`, `<id>_sum`, and `<id>_count`, so verify the exact chart ID you want to reference.
1041+
For metrics collected by the go.d `prometheus` collector, each unique Prometheus label set usually produces a separate chart. The chart ID is built from the metric name followed by `-label=value` pairs for every label (e.g. `kubelet_volume_stats_used_bytes-persistentvolumeclaim=my-pvc`); characters in a label value that are not chart-ID-safe, such as `.`, are replaced with `_` in the chart ID, while the chart's label keeps the original value (so `addr="10.0.0.1"` yields `…-addr=10_0_0_1`). In the Netdata chart registry, the prefix comes from the go.d job `FullName`: it is `prometheus.<metric_name>-<label_set>` only when the job name is literally `prometheus`; otherwise it is `prometheus_<job_name>.<metric_name>-<label_set>` (for example, `prometheus_local.<metric_name>-<label_set>` or `prometheus_kubelet.<metric_name>-<label_set>`). Summary and histogram families also emit separate `_sum` and `_count` charts; the suffix is part of the metric name, so the IDs are `<metric_name>_sum-<label_set>` and `<metric_name>_count-<label_set>` (just `<metric_name>_sum` / `<metric_name>_count` when the series has no labels), while histogram buckets are dimensions of the base `<metric_name>` chart. Verify the exact chart ID you want to reference.
10421042

10431043
Because Prometheus chart IDs typically contain hyphens and `=` characters, use the `${...}` brace form to reference them in `calc`/`warn`/`crit` expressions — the unbraced `$var` form stops parsing at `-`. Apply the same rule for both the common `prometheus_<job_name>` prefix and the special-case plain `prometheus` prefix, including any `_sum` or `_count` chart variants.
10441044
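The chart ID rules described in this hunk can be sketched as follows. This is a minimal illustration, not the collector's code: the function name `chart_id`, the set of characters treated as chart-ID-safe (letters, digits, `_`, `-`), and the label ordering (insertion order) are assumptions made for the example.

```python
import re


def chart_id(job_name, metric_name, labels):
    """Build a chart ID as described above (illustrative sketch only)."""
    # The prefix is plain "prometheus" only when the job is literally named so.
    prefix = "prometheus" if job_name == "prometheus" else f"prometheus_{job_name}"
    parts = [metric_name]
    for name, value in labels.items():
        # Assumption: anything outside [A-Za-z0-9_-] is replaced with "_".
        # The text only confirms that "." is replaced.
        safe_value = re.sub(r"[^A-Za-z0-9_-]", "_", value)
        parts.append(f"{name}={safe_value}")
    return prefix + "." + "-".join(parts)


print(chart_id("local", "up", {"addr": "10.0.0.1"}))
```

Because the resulting ID contains `-` and `=`, it is the kind of name that needs the `${...}` brace form when referenced in an alert expression.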

docs/Alerts & Notifications/Alerts & Notifications.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@ custom_edit_url: "https://github.com/netdata/netdata/edit/master/src/health/READ
33
sidebar_label: "Alerts & Notifications"
44
learn_status: "Published"
55
learn_rel_path: "Alerts & Notifications"
6-
sidebar_position: "140"
6+
sidebar_position: "150"
77
learn_link: "https://learn.netdata.cloud/docs/alerts-&-notifications"
88
slug: "/alerts-&-notifications"
99
---

docs/Alerts & Notifications/Notifications/Agent Dispatched Notifications/Agent Notifications Reference.mdx

Lines changed: 48 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -100,6 +100,7 @@ Use the `edit-config` script to safely edit configuration files. It automaticall
100100
:::
101101

102102
1. Open the Agent's health notification config:
103+
103104
```bash
104105
sudo ./edit-config health_alarm_notify.conf
105106
```
@@ -109,6 +110,7 @@ Use the `edit-config` script to safely edit configuration files. It automaticall
109110
3. Define recipients per **role** (see below).
110111

111112
4. Restart the Agent for changes to take effect:
113+
112114
```bash
113115
sudo systemctl restart netdata
114116
```
@@ -286,7 +288,7 @@ role_recipients_email[sysadmin]="disabled"
286288
If left empty, the default recipient for that method is used.
287289
</details>
288290

289-
<details>
291+
<details id="alert-severity-filtering">
290292
<summary><strong>Alert Severity Filtering</strong></summary><br/>
291293

292294
You can limit certain recipients to only receive **critical** alerts:
@@ -298,11 +300,47 @@ role_recipients_email[sysadmin]="user1@example.com user2@example.com|critical"
298300
This setup:
299301

300302
- Sends all alerts to `user1@example.com`
301-
- Sends only critical-related alerts to `user2@example.com`
303+
- Sends notifications to `user2@example.com` only once the alarm reaches CRITICAL, then continues sending status changes (including WARNING and CLEAR) until the alarm is cleared.
302304

303305
Works for all supported methods: email, Slack, Telegram, Twilio, Discord, etc.
304306
</details>
305307

308+
<details>
309+
<summary><strong>Controlling Recovered (CLEAR) Notifications</strong></summary><br/>
310+
311+
When an alert returns to normal, Netdata sends a **CLEAR** (recovered) notification. You can control when and whether these are sent.
312+
313+
**Default behavior:** Netdata suppresses CLEAR notifications when the alert was never in a WARNING or CRITICAL state. If `old_status` was not WARNING or CRITICAL and the alert transitions to CLEAR, no notification is sent. This prevents noise from alerts that flap without ever reaching a problem state.
314+
315+
**Enable CLEAR for all transitions:** If your downstream system handles deduplication, set `clear_alarm_always` in `health_alarm_notify.conf` to override the default suppression and send a CLEAR notification regardless of the previous status:
316+
317+
```ini
318+
clear_alarm_always='YES'
319+
```
320+
321+
**Filter by CRITICAL history with the `|critical` modifier:** As described in [Alert Severity Filtering](#alert-severity-filtering) above, `|critical` forwards notifications only for alerts that have reached CRITICAL status. This affects both WARNING and CLEAR:
322+
323+
- **WARNING** notifications are suppressed unless the alarm has previously reached CRITICAL.
324+
- **CLEAR** notifications are only sent when the alert previously passed through CRITICAL. If the alert only went through WARNING → CLEAR, the CLEAR is not forwarded.
325+
326+
```ini
327+
role_recipients_email[sysadmin]="admin@example.com|critical"
328+
```
329+
330+
**Suppress all CLEAR notifications:** Use the `|noclear` modifier to completely block CLEAR notifications for a recipient while still receiving WARNING and CRITICAL alerts:
331+
332+
```ini
333+
role_recipients_email[sysadmin]="admin@example.com|noclear"
334+
```
335+
336+
You can combine modifiers. This example notifies only for alarms that have reached CRITICAL (WARNING is suppressed until then), and excludes CLEAR notifications entirely:
337+
338+
```ini
339+
role_recipients_email[sysadmin]="admin@example.com|critical|noclear"
340+
```
341+
342+
</details>
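The interaction of `clear_alarm_always`, `|critical`, and `|noclear` described above can be modeled with a small decision function. This is a sketch of the documented behavior only, not Netdata's implementation; the function name, its parameters, and the `reached_critical` flag (whether the alarm has hit CRITICAL since it was last clear) are assumptions made for the example.

```python
def should_notify(status, old_status, modifiers=(), reached_critical=False,
                  clear_alarm_always=False):
    """Return True if a recipient with the given modifiers would be notified."""
    if status == "CLEAR":
        # "|noclear" blocks every recovered notification for this recipient.
        if "noclear" in modifiers:
            return False
        # Default suppression: no CLEAR unless the alert was in a problem state.
        if old_status not in ("WARNING", "CRITICAL") and not clear_alarm_always:
            return False
    if "critical" in modifiers:
        # Only notify once the alarm has reached CRITICAL, then keep notifying.
        return status == "CRITICAL" or reached_critical
    return True


# WARNING -> CLEAR for a "|critical" recipient: the CLEAR is not forwarded.
print(should_notify("CLEAR", "WARNING", modifiers=("critical",)))
```

Tracking `reached_critical` per alarm is what lets a `|critical` recipient keep receiving WARNING and CLEAR updates after an escalation.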
343+
306344
<details>
307345
<summary><strong>Proxy Settings</strong></summary><br/>
308346

@@ -411,21 +449,25 @@ Here are solutions for common alert notification issues:
411449
### Email Notifications Not Working
412450

413451
1. Verify your email configuration:
452+
414453
```bash
415454
grep -E "SEND_EMAIL|DEFAULT_RECIPIENT_EMAIL" /etc/netdata/health_alarm_notify.conf
416455
```
417456

418457
2. Check if the system can send mail:
458+
419459
```bash
420460
echo "Test" | mail -s "Test Email" your@email.com
421461
```
422462

423463
3. Look for errors in the Netdata log:
464+
424465
```bash
425466
tail -f /var/log/netdata/error.log | grep "alarm notify"
426467
```
427468

428469
4. Test with debugging enabled:
470+
429471
```bash
430472
sudo su -s /bin/bash netdata
431473
export NETDATA_ALARM_NOTIFY_DEBUG=1
@@ -435,11 +477,13 @@ Here are solutions for common alert notification issues:
435477
### Slack Notifications Failing
436478

437479
1. Verify your webhook URL is correct:
480+
438481
```bash
439482
grep -E "SLACK_WEBHOOK_URL" /etc/netdata/health_alarm_notify.conf
440483
```
441484

442485
2. Check for network connectivity to Slack:
486+
443487
```bash
444488
curl -X POST -H "Content-type: application/json" --data '{"text":"Test"}' YOUR_WEBHOOK_URL
445489
```
@@ -449,11 +493,13 @@ Here are solutions for common alert notification issues:
449493
### PagerDuty Integration Issues
450494

451495
1. Verify your service key:
496+
452497
```bash
453498
grep -E "PAGERDUTY_SERVICE_KEY" /etc/netdata/health_alarm_notify.conf
454499
```
455500

456501
2. Test the PagerDuty API directly:
502+
457503
```bash
458504
curl -H "Content-Type: application/json" -X POST -d '{"service_key":"YOUR_SERVICE_KEY","event_type":"trigger","description":"Test"}' https://events.pagerduty.com/generic/2010-04-15/create_event.json
459505
```

docs/Alerts & Notifications/Notifications/Centralized Cloud Notifications/Webhook.mdx

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -105,13 +105,13 @@ By default, the following headers will be sent in the HTTP request
105105

106106
Netdata webhook integration supports 3 different authentication mechanisms:
107107

108-
##### Mutual TLS authentication (recommended)
108+
##### Mutual TLS authentication
109109

110-
In mutual Transport Layer Security (mTLS) authentication, the client and the server authenticate each other using X.509 certificates. This ensures that the client is connecting to the intended server, and that the server is only accepting connections from authorized clients.
110+
Netdata always sends a client certificate with every webhook request, regardless of which authentication method is selected in the UI. This means mTLS is available on all webhook integrations by default — no additional configuration is needed on the Netdata side to enable it.
111111

112-
This is the default authentication mechanism used if no other method is selected.
112+
The authentication method you select (no auth, basic, or bearer) controls only whether an Authorization header is included in the request. It does not affect the client certificate behavior.
113113

114-
To take advantage of mutual TLS, you can configure your server to verify Netdata's client certificate. In order to achieve this, the Netdata client sending the notification supports mutual TLS (mTLS) to identify itself with a client certificate that your server can validate.
114+
If you want to verify Netdata's client certificate on your end, configure your server to validate it using the Netdata CA certificate below.
115115

116116
The steps to perform this validation are as follows:
117117

docs/Collecting Metrics/Collectors configuration.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -58,7 +58,7 @@ You can modify how often collectors gather metrics to optimize CPU usage. This c
5858
1. Open `netdata.conf` using [`edit-config`](/docs/netdata-agent/configuration#edit-configuration-files).
5959
2. Set the `update every` value (default is `1`, meaning one-second intervals):
6060
```text
61-
[global]
61+
[db]
6262
update every = 2
6363
```
6464

docs/Collecting Metrics/Collectors/Applications/Alamos FE2 server.mdx

Lines changed: 104 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -94,11 +94,14 @@ The following options can be defined globally: update_every, autodetection_retry
9494
| | autodetection_retry | Autodetection retry interval (seconds). Set 0 to disable. | 0 | no |
9595
| **Target** | url | Target endpoint URL. | | yes |
9696
| | timeout | HTTP request timeout (seconds). | 10 | no |
97+
| | expected_prefix | If set, the job's check passes only when at least one scraped metric name starts with this prefix. Guards against scraping an unexpected endpoint. | | no |
98+
| **Customization** | app | Application name used as the app segment of chart contexts (`prometheus.<app>.<metric>`). When unset, it is taken from a matched profile, otherwise it falls back to the job name. | | no |
9799
| **Filters** | [selector](#option-filters-selector) | Time series selector (filter). | | no |
98100
| **Limits** | max_time_series | Global time series limit. If an endpoint returns more time series than this, the data is not processed. | 2000 | no |
99101
| | max_time_series_per_metric | Per-metric time series limit. Metrics with more time series than this are skipped. | 200 | no |
100102
| **Customization** | [fallback_type](#option-customization-fallback-type) | Fallback type rules for untyped metrics. | | no |
101-
| | label_prefix | Optional prefix added to all labels of all charts. Labels will be formatted as `prefix_name`. | | no |
103+
| | [relabeling](#option-customization-relabeling) | Prometheus-compatible metric relabeling, applied before charts are built. | | no |
104+
| | [profiles](#option-customization-profiles) | Curated, exporter-specific chart profiles. Disable with mode `none`. | auto | no |
102105
| **HTTP Auth** | username | Username for Basic HTTP authentication. | | no |
103106
| | password | Password for Basic HTTP authentication. | | no |
104107
| | bearer_token_file | Path to a file containing a bearer token (used for `Authorization: Bearer`). | | no |
@@ -155,6 +158,57 @@ fallback_type:
155158
```
156159
157160
161+
<a id="option-customization-relabeling"></a>
162+
##### relabeling
163+
164+
A list of relabeling blocks. Each block applies a list of Prometheus
165+
`metric_relabel_configs` rules to the metrics whose name matches `match`. See the
166+
[relabeling reference](https://github.com/netdata/netdata/blob/master/src/go/plugin/go.d/collector/prometheus/relabel/README.md)
167+
for the full action set and more examples.
168+
169+
- `match`: Netdata simple patterns matched against the full metric name — including
170+
any `_bucket`/`_sum`/`_count` suffix, so prefer globs like `app_lat*` over an exact
171+
`app_lat` (space-separated; `*` matches any sequence, `?` any character, a leading
172+
`!` negates). Use `*` to target every metric. Required.
173+
- `metric_relabel_configs`: Prometheus relabel rules (`source_labels`, `separator`,
174+
`regex`, `modulus`, `target_label`, `replacement`, `action`), applied in order to
175+
the scraped samples before charts are built.
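To see why a glob such as `app_lat*` is preferred over an exact `app_lat`, the `match` semantics can be approximated in a few lines. This is a rough sketch using Python's `fnmatch`, not Netdata's matcher: it assumes patterns are evaluated left to right with the first match (positive or negative) winning, and `fnmatch` additionally treats `[...]` as a character class, which Netdata simple patterns may not.

```python
from fnmatch import fnmatchcase


def simple_pattern_match(patterns, name):
    """Approximate a space-separated simple-pattern list (illustrative only)."""
    for pattern in patterns.split():
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(name, glob):
            # First matching pattern decides; a leading "!" turns it into a reject.
            return not negated
    return False


# The exact name misses the histogram components; the glob catches them.
print(simple_pattern_match("app_lat", "app_lat_bucket"))
print(simple_pattern_match("app_lat*", "app_lat_bucket"))
```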
176+
177+
Relabeling that would corrupt a histogram or summary — splitting it, dropping a
178+
component, mutating the `le`/`quantile` label, or merging two families — is rejected.
179+
180+
```yaml
181+
relabeling:
182+
- match: 'http_*'
183+
metric_relabel_configs:
184+
- source_labels: [code]
185+
regex: '(\d)\d\d'
186+
target_label: code_class
187+
replacement: '${1}xx'
188+
```
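What the rule above does to a single label set can be shown with an equivalent function. This is a sketch of the default `replace` action under its usual Prometheus semantics (the regex is anchored to the whole value, and a non-match leaves the labels unchanged); the function name `add_code_class` is hypothetical.

```python
import re


def add_code_class(labels):
    """Mimic the relabel rule: derive code_class from a three-digit code."""
    out = dict(labels)
    # A missing source label is treated as an empty string, which does not match.
    match = re.fullmatch(r"(\d)\d\d", out.get("code", ""), flags=re.ASCII)
    if match:
        # Equivalent of replacement '${1}xx'.
        out["code_class"] = match.group(1) + "xx"
    return out


print(add_code_class({"code": "404", "method": "GET"}))
```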
189+
190+
191+
<a id="option-customization-profiles"></a>
192+
##### profiles
193+
194+
Profiles ship curated charts for recognized exporters. `profiles.mode` selects them:
195+
196+
- `auto` (default): every profile whose `match` hits at least one scraped metric.
197+
- `exact`: only the profiles named in `mode_exact.entries` (each must match, or the job fails its check).
198+
- `combined`: `auto` plus the profiles named in `mode_combined.entries`.
199+
- `none`: no profiles — generic autogen charts only (the pre-profile behavior).
200+
201+
Only the block matching the selected mode (`mode_exact` or `mode_combined`) is read; entries under the other block are ignored. Metrics not covered by a selected profile keep their generic autogen charts.
202+
203+
```yaml
204+
profiles:
205+
mode: exact
206+
mode_exact:
207+
entries:
208+
- name: haproxy
209+
```
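The four modes can be summarized as a selection function. This is a sketch of the behavior described above, not the collector's code: the function name and arguments are hypothetical, and for `combined` it assumes the named profiles are added whether or not they match a scraped metric, which the text does not state.

```python
def select_profiles(mode, matching, exact_entries=(), combined_entries=()):
    """Pick profile names for a job given the profiles that matched its metrics."""
    if mode == "none":
        return []
    if mode == "auto":
        return sorted(matching)
    if mode == "exact":
        missing = [name for name in exact_entries if name not in matching]
        if missing:
            # In the collector this corresponds to the job failing its check.
            raise ValueError(f"profiles did not match: {missing}")
        return list(exact_entries)
    if mode == "combined":
        # Assumption: named entries are included even without a match.
        return sorted(set(matching) | set(combined_entries))
    raise ValueError(f"unknown mode: {mode}")


print(select_profiles("exact", {"haproxy", "nginx"}, exact_entries=["haproxy"]))
```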
210+
211+
158212

159213
</details>
160214

@@ -286,6 +340,55 @@ jobs:
286340
```
287341
</details>
288342

343+
###### Metric relabeling
344+
345+
Derive a `code_class` label (2xx, 4xx, ...) on metrics named `http_*`.
346+
347+
<details open>
348+
<summary>Config</summary>
349+
350+
```yaml
351+
jobs:
352+
- name: local
353+
url: http://127.0.0.1:9090/metrics
354+
relabeling:
355+
- match: 'http_*'
356+
metric_relabel_configs:
357+
- source_labels: [code]
358+
regex: '(\d)\d\d'
359+
target_label: code_class
360+
replacement: '${1}xx'
361+
362+
```
363+
</details>
364+
365+
###### Rename labels that collide with Netdata's reserved labels
366+
367+
When these metrics are re-exported in Prometheus format, Netdata adds its own `instance`,
368+
`family`, `chart`, and `dimension` labels. If the scraped endpoint already uses one of those
369+
names, the re-export emits a duplicate label and a downstream Prometheus rejects the scrape.
370+
Rename the colliding labels to avoid it (the use case the former `label_prefix` option served).
371+
372+
373+
<details open>
374+
<summary>Config</summary>
375+
376+
```yaml
377+
jobs:
378+
- name: coredns
379+
url: http://127.0.0.1:9153/metrics
380+
relabeling:
381+
- match: '*'
382+
metric_relabel_configs:
383+
- regex: '(instance|family)'
384+
action: labelmap
385+
replacement: 'coredns_$1'
386+
- regex: '(instance|family)'
387+
action: labeldrop
388+
389+
```
390+
</details>
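The effect of the `labelmap` and `labeldrop` pair on one label set can be sketched as below. It follows the usual Prometheus semantics (the regex is matched against whole label names; `labelmap` copies the value to the new name, `labeldrop` removes the original); the function name `rename_reserved` and the sample label values are made up for the example.

```python
import re


def rename_reserved(labels, prefix="coredns_"):
    """Copy instance/family to prefixed names, then drop the originals."""
    reserved = re.compile(r"(instance|family)")
    out = dict(labels)
    # labelmap: copy each matching label to prefix + captured name.
    for name, value in labels.items():
        match = reserved.fullmatch(name)
        if match:
            out[prefix + match.group(1)] = value
    # labeldrop: remove every label whose full name matches the regex.
    return {name: value for name, value in out.items()
            if not reserved.fullmatch(name)}


print(rename_reserved({"instance": "10.0.0.1:53", "family": "dns", "zone": "."}))
```

The two rules are needed together because `labelmap` only copies; without the `labeldrop` step the colliding `instance` and `family` labels would still be present.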
391+
289392

290393

291394
## Alerts
