Commit 203b2ef

feat(disk-usage, dhcp-scope-usage): add --brief switch to hide rows within thresholds
--brief is the row-level counterpart of the column-level --lengthy switch: --lengthy adds columns, --brief filters rows. They are orthogonal and both can be set at the same time.

Perfdata and alerting are strictly unaffected: every checked item still emits perfdata so Grafana trending stays complete, and every item still drives the overall check state.

When --brief is set and no item is in WARN/CRIT state, the plugin collapses to a single-line "Everything is ok. (thresholds)" summary without an empty table.

Admins on hosts with hundreds of disk mounts or thousands of DHCP scopes can now use --brief to get event-console-sized output while keeping the full perfdata for Grafana.

The convention is documented in CONTRIBUTING under "Verbosity parameter convention"; #1081 tracks the sweep to extend --brief to other plugins with unbounded table output.

Also picks up regenerated Icinga Director basket files for dhcp-scope-usage, disk-usage, docker-info, haproxy-status and podman-info (the latter three from recent --ignore work).

Closes #782
Closes #788
1 parent 345c26b commit 203b2ef

14 files changed: +398 -112 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -33,6 +33,8 @@ Monitoring Plugins:
 
 * by-ssh: add alerting on single numeric values
 * by-winrm: executes commands on remote Windows hosts by WinRM, supporting JEA (including the JEA endpoint via `--winrm-configuration-name`)
+* dhcp-scope-usage: add `--brief` parameter to hide scopes within the thresholds, useful on DHCP servers with thousands of scopes where the default output becomes unreadable. Inverse of `--lengthy`: `--brief` filters rows, `--lengthy` adds columns, both are orthogonal and can be combined. Perfdata and alerting are unaffected, so Grafana trending stays complete ([#788](https://github.com/Linuxfabrik/monitoring-plugins/issues/788))
+* disk-usage: add `--brief` parameter to hide filesystems within the thresholds, useful on hosts with hundreds of mounts where the default output becomes unreadable. Inverse of `--lengthy`: `--brief` filters rows, `--lengthy` adds columns, both are orthogonal and can be combined. Perfdata and alerting are unaffected, so Grafana trending stays complete ([#782](https://github.com/Linuxfabrik/monitoring-plugins/issues/782))
 * infomaniak-swiss-backup-devices: add `--ignore-customer`, `--ignore-name`, `--ignore-tag`, `--ignore-user` parameters to skip devices by regex
 * infomaniak-swiss-backup-products: add `--ignore-customer`, `--ignore-tag` parameters to skip products by regex
 * docker-info, podman-info: add `--ignore` parameter to filter stderr warnings and errors by regex, e.g. to suppress the `WARNING: No swap limit support` message from `docker info` on kernels without swap accounting, or the `WARNING: bridge-nf-call-iptables is disabled` message on Debian hosts that do not load the `br_netfilter` module. Accepts a Python regex, is case-sensitive by default (use `(?i)` for case-insensitive matching) and can be specified multiple times. The same treatment was applied to both plugins because they share the same stderr-parsing code path ([#834](https://github.com/Linuxfabrik/monitoring-plugins/issues/834))

CONTRIBUTING.md

Lines changed: 23 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -138,7 +138,7 @@ Checklist:
138138

139139
### Rules of Thumb
140140

141-
* Be brief by default. Report what needs to be reported to fix a problem. If there is more information that might help the admin, support a `--lengthy` parameter.
141+
* Be brief by default. Report what needs to be reported to fix a problem. If there is more information that might help the admin, support a `--lengthy` parameter. If the default output still grows unbounded on large systems (thousands of disk mounts, DHCP scopes, backends, services), also support a `--brief` parameter that hides rows within the thresholds. See "Verbosity parameter convention" below.
142142
* The plugin should be "self configuring" and/or using best practise defaults, so that it runs without parameters wherever possible.
143143
* Develop with a minimal Linux in mind.
144144
* Develop with Icinga2 in mind.
@@ -232,6 +232,7 @@ For all other options, use long parameters only. Separate words using a `-`. We
232232
--always-ok
233233
--argument
234234
--authtype
235+
--brief
235236
--cache-expire
236237
--command
237238
--community
@@ -262,8 +263,6 @@ For all other options, use long parameters only. Separate words using a `-`. We
262263
--icinga-username
263264
--idsite
264265
--ignore
265-
--ignore-pattern
266-
--ignore-regex
267266
--input
268267
--insecure
269268
--instance
@@ -600,6 +599,27 @@ If a plugin supports `-v`/`--verbose`, it should implement up to three verbosity
600599
Note: Most of our plugins use `--lengthy` instead of `-v` for extended output. The verbosity levels above apply if the plugin explicitly supports `--verbose`.
601600

602601

602+
### Verbosity parameter convention: `--lengthy` and `--brief`
603+
604+
`--lengthy` and `--brief` are the two verbosity knobs admins use to tune what a plugin prints. They are **orthogonal** (not mutually exclusive) and control different axes of the output:
605+
606+
| Parameter | Axis | Effect |
607+
|-----------|------|--------|
608+
| default | rows × columns | Show all checked items with the core columns. |
609+
| `--lengthy` | columns | **Add** extra columns to every row (e.g. full details, debug info). |
610+
| `--brief` | rows | **Hide** rows that are within the thresholds. Show only items in WARN/CRIT state. |
611+
| `--lengthy --brief` | rows × columns | Hide OK rows, show extra columns on the rows that remain. |
612+
613+
Rules:
614+
615+
* **Perfdata is always complete.** `--brief` and `--lengthy` only reshape the human-readable message. Every checked item still emits perfdata so Grafana can trend everything.
616+
* **Alerting is unaffected.** All items (including the ones `--brief` hides) still drive the overall check state. `--brief` is a display filter, not a threshold.
617+
* **When `--brief` hides everything**, the plugin prints only the summary header ("Everything is ok. (thresholds)"), not an empty table. Admins on a quiet system see one line.
618+
* **`--lengthy` and `--brief` are always combinable.** Do not mark them mutually exclusive in `argparse`.
619+
* **When to support `--brief`**: add it whenever the default output can grow unbounded on large systems (hundreds of disk mounts, thousands of DHCP scopes, hundreds of HAProxy backends, etc.). Reference implementations: `check-plugins/disk-usage` and `check-plugins/dhcp-scope-usage`.
620+
* **Help text** for `--brief` should describe the filter semantic and explicitly state that perfdata and alerting are unaffected, so the admin understands that `--brief` is safe to use on production without losing trending data.
621+
622+
603623
### Plugin Performance Data, Perfdata
604624

605625
"UOM" means "Unit of Measurement".

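The "always combinable" rule from the CONTRIBUTING hunk above can be illustrated with a minimal `argparse` sketch. This is not taken from the commit: the program name `example-plugin` and the `build_parser()` helper are hypothetical, and the `--lengthy` help text is invented for illustration. The only point is that both switches are registered with plain `add_argument()` calls rather than inside a mutually exclusive group.

```python
import argparse


def build_parser():
    """Return a parser with --brief and --lengthy as independent switches."""
    # 'example-plugin' is a hypothetical name, not a plugin from the repository.
    parser = argparse.ArgumentParser(prog='example-plugin')

    # Plain add_argument() calls, deliberately NOT wrapped in
    # parser.add_mutually_exclusive_group(), so both flags can be combined.
    parser.add_argument(
        '--brief',
        help='Hide rows within the thresholds and show only those in WARN/CRIT '
             'state. Perfdata and alerting are unaffected. Default: %(default)s',
        dest='BRIEF',
        action='store_true',
        default=False,
    )
    parser.add_argument(
        '--lengthy',
        help='Add extra columns to every row. Default: %(default)s',
        dest='LENGTHY',
        action='store_true',
        default=False,
    )
    return parser


# Both switches given at once parse without an error.
args = build_parser().parse_args(['--brief', '--lengthy'])
print(args.BRIEF, args.LENGTHY)
```

Had the two arguments been added through `add_mutually_exclusive_group()`, the same command line would make `argparse` exit with a usage error, which is what the convention rules out.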
check-plugins/dhcp-scope-usage/README.md

Lines changed: 13 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -34,22 +34,30 @@ Monitors IPv4 DHCP scope usage on a Windows DHCP server. Connects via WinRM and
3434
## Help
3535

3636
```text
37-
usage: dhcp-scope-usage [-h] [-V] [--always-ok] [-c CRIT] [-H HOSTNAME]
38-
[--test TEST] [-w WARN] [--winrm-domain WINRM_DOMAIN]
37+
usage: dhcp-scope-usage [-h] [-V] [--always-ok] [--brief] [-c CRIT]
38+
[-H HOSTNAME] [--test TEST] [-w WARN]
39+
[--winrm-domain WINRM_DOMAIN]
3940
--winrm-hostname WINRM_HOSTNAME
4041
--winrm-password WINRM_PASSWORD
4142
[--winrm-transport {basic,ntlm,kerberos,credssp,plaintext}]
4243
[--winrm-username WINRM_USERNAME]
4344
4445
Monitors IPv4 DHCP scope usage on a Windows DHCP server. Connects via WinRM
45-
and queries scope statistics using PowerShell. Alerts when the address pool
46-
usage of any scope exceeds the configured thresholds (default: WARN at 80%,
47-
CRIT at 90%).
46+
and queries scope statistics using PowerShell. On servers with thousands of
47+
scopes, --brief hides rows within the thresholds so the output only lists
48+
scopes in WARN/CRIT state. Alerts when the address pool usage of any scope
49+
exceeds the configured thresholds (default: WARN at 80%, CRIT at 90%).
4850
4951
options:
5052
-h, --help show this help message and exit
5153
-V, --version show program's version number and exit
5254
--always-ok Always returns OK.
55+
--brief Hide scopes within the thresholds and show only those
56+
in WARN/CRIT state. Inverse of `--lengthy` (which adds
57+
columns); `--brief` filters rows. The two are
58+
orthogonal and can be combined. Perfdata and alerting
59+
are unaffected: all scopes still emit perfdata and
60+
still drive the overall check state. Default: False
5361
-c, --critical CRIT CRIT threshold in percent. Default: >= 90
5462
-H, --hostname HOSTNAME
5563
DNS name, IPv4, or IPv6 address of the DHCP server.

check-plugins/dhcp-scope-usage/dhcp-scope-usage

Lines changed: 45 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -22,11 +22,13 @@ import lib.winrm
2222
from lib.globals import STATE_CRIT, STATE_OK, STATE_UNKNOWN, STATE_WARN
2323

2424
__author__ = 'Linuxfabrik GmbH, Zurich/Switzerland'
25-
__version__ = '2026040801'
25+
__version__ = '2026041405'
2626

2727
DESCRIPTION = """Monitors IPv4 DHCP scope usage on a Windows DHCP server. Connects via WinRM and
28-
queries scope statistics using PowerShell. Alerts when the address pool usage of any
29-
scope exceeds the configured thresholds (default: WARN at 80%, CRIT at 90%)."""
28+
queries scope statistics using PowerShell. On servers with thousands of scopes, --brief
29+
hides rows within the thresholds so the output only lists scopes in WARN/CRIT state.
30+
Alerts when the address pool usage of any scope exceeds the configured thresholds
31+
(default: WARN at 80%, CRIT at 90%)."""
3032

3133
DEFAULT_CRIT = 90
3234
DEFAULT_HOSTNAME = 'localhost'
@@ -55,6 +57,18 @@ def parse_args():
5557
default=False,
5658
)
5759

60+
parser.add_argument(
61+
'--brief',
62+
help='Hide scopes within the thresholds and show only those in WARN/CRIT state. '
63+
'Inverse of `--lengthy` (which adds columns); `--brief` filters rows. The two '
64+
'are orthogonal and can be combined. Perfdata and alerting are unaffected: all '
65+
'scopes still emit perfdata and still drive the overall check state. '
66+
'Default: %(default)s',
67+
dest='BRIEF',
68+
action='store_true',
69+
default=False,
70+
)
71+
5872
parser.add_argument(
5973
'-c',
6074
'--critical',
@@ -158,15 +172,15 @@ def main():
158172
lib.base.oao(stderr, STATE_WARN)
159173

160174
# init some vars
161-
msg = ''
162175
state = STATE_OK
163176
perfdata = ''
177+
rows = []
164178

165179
# ScopeId Free InUse PercentageInUse Reserved Pending SuperscopeName
166180
# ------- ---- ----- --------------- -------- ------- --------------
167181
# 192.0.2.120 3 0 0 0 0
168182

169-
# get state and build the message
183+
# get state and collect per-scope rows
170184
data_section = False
171185
for line in stdout.splitlines():
172186
# ignore all header lines including "----"
@@ -187,8 +201,8 @@ def main():
187201
scope_state = lib.base.get_state(scope_used, args.WARN, args.CRIT)
188202
state = lib.base.get_worst(scope_state, state)
189203

190-
# build the message
191-
msg += f'* {scope_id}: {scope_used}% used{lib.base.state2str(scope_state, prefix=" ")}\n'
204+
# Perfdata is emitted for every scope regardless of --brief so
205+
# Grafana sees the full picture and trend lines stay continuous.
192206
perfdata += lib.base.get_perfdata(
193207
f'scope_{scope_id}',
194208
scope_used,
@@ -198,14 +212,32 @@ def main():
198212
_min=0,
199213
_max=100,
200214
)
215+
rows.append((scope_id, scope_used, scope_state))
216+
217+
# Filter rows for --brief display. --brief is the inverse of
218+
# --lengthy: --lengthy adds columns, --brief filters rows. They
219+
# are orthogonal and both can be set at the same time. Perfdata
220+
# and alerting stay untouched above this point; --brief only
221+
# reshapes the human-readable output.
222+
display_rows = (
223+
[row for row in rows if row[2] != STATE_OK] if args.BRIEF else rows
224+
)
225+
226+
# build the message
227+
headers = {
228+
STATE_CRIT: 'There are one or more criticals.',
229+
STATE_WARN: 'There are one or more warnings.',
230+
STATE_OK: 'Everything is ok.',
231+
}
232+
header = headers.get(state, headers[STATE_OK])
233+
body = ''.join(
234+
f'* {scope_id}: {scope_used}% used'
235+
f'{lib.base.state2str(scope_state, prefix=" ")}\n'
236+
for scope_id, scope_used, scope_state in display_rows
237+
)
238+
msg = f'{header}\n\n{body}' if body else header
201239

202240
# over and out
203-
if state == STATE_CRIT:
204-
msg = 'There are one or more criticals.\n\n' + msg
205-
elif state == STATE_WARN:
206-
msg = 'There are one or more warnings.\n\n' + msg
207-
else:
208-
msg = 'Everything is ok.\n\n' + msg
209241
lib.base.oao(msg, state, perfdata, always_ok=args.ALWAYS_OK)
210242

211243

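To see the display filter from the hunk above in isolation, here is a small self-contained sketch that runs without the Linuxfabrik libraries. Everything outside the commit is an assumption made for illustration: the numeric `STATE_*` values stand in for `lib.globals`, the `LABELS` mapping stands in for `lib.base.state2str()`, `max()` stands in for `lib.base.get_worst()`, and the `render()` helper does not exist in the plugin.

```python
# Stand-ins for lib.globals (assumed values, ordered by severity).
STATE_OK, STATE_WARN, STATE_CRIT = 0, 1, 2

HEADERS = {
    STATE_CRIT: 'There are one or more criticals.',
    STATE_WARN: 'There are one or more warnings.',
    STATE_OK: 'Everything is ok.',
}

# Hypothetical replacement for lib.base.state2str(state, prefix=' ').
LABELS = {STATE_OK: '', STATE_WARN: ' [WARNING]', STATE_CRIT: ' [CRITICAL]'}


def render(rows, brief=False):
    """Build (message, state) from (scope_id, used_percent, state) tuples."""
    # The overall state is computed from ALL rows, so hidden rows still alert.
    # max() is a simplification of lib.base.get_worst() for these stand-ins.
    state = max((row_state for _, _, row_state in rows), default=STATE_OK)

    # --brief only filters what is displayed.
    display_rows = [row for row in rows if row[2] != STATE_OK] if brief else rows

    body = ''.join(
        f'* {scope_id}: {used}% used{LABELS[row_state]}\n'
        for scope_id, used, row_state in display_rows
    )
    header = HEADERS[state]
    # With nothing left to show, collapse to the header line only.
    msg = f'{header}\n\n{body}' if body else header
    return msg, state


rows = [('192.0.2.0', 12, STATE_OK), ('198.51.100.0', 85, STATE_WARN)]
print(render(rows, brief=True)[0])
```

With `brief=True` the OK scope disappears from the message while the returned state stays at WARN. A list containing only OK scopes collapses to the single header line.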