
Commit 991dd5f

feat(snmp): add column "skip output" to CSV definition for devices, add unit tests
1 parent f9f92b5 commit 991dd5f

7 files changed: 129 additions, 99 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ Monitoring Plugins:
1818
* kubectl-get-pods: checks the health and status of kubernetes pods by running `kubectl get pods` and parsing the results
1919
* redfish-sel: add support for Supermicro ([#866](https://github.com/Linuxfabrik/monitoring-plugins/issues/866))
2020
* systemd-unit: implement support for `systemctl --machine` and `--user`
21+
* snmp: add column "skip output" to CSV definition for devices, add unit tests
2122

2223

2324
### Fixed ("fix")

check-plugins/snmp/README.rst

Lines changed: 43 additions & 63 deletions
@@ -200,29 +200,29 @@ If needed, get any MIB files ready. Copy them to ``$HOME/.snmp/mibs`` or ``/usr/
200200
Create an OID list in ``/usr/lib64/nagios/plugins/device-oids/...`` using CSV format. For details, have a look at "Defining a Device" within this document.
201201

202202

203-
Defining a Device
204-
-----------------
203+
Defining a Device via CSV file
204+
------------------------------
205205

206206
If you want to define a device-specific list of OIDs, including any calculations, warning and critical thresholds, create a CSV file located at ``device-oids``, using ``,`` as delimiter and ``"`` as quoting character. A minimal example for nearly any device:
207207

208-
========================= ============= ========================== ============ ======================= ======================= ================== ================== ================== ==========================
209-
OID Name Re-Calc Unit Label WARN CRIT Show in 1st Line Report Change as Ignore in Perfdata Perfdata Alert Thresholds
210-
========================= ============= ========================== ============ ======================= ======================= ================== ================== ================== ==========================
208+
========================= ============= ========================== ============ ======================= ======================= ================== ================== ================== ========================== ===========
209+
OID Name Re-Calc Unit Label WARN CRIT Show in 1st Line Report Change as Ignore in Perfdata Perfdata Alert Thresholds Skip Output
210+
========================= ============= ========================== ============ ======================= ======================= ================== ================== ================== ========================== ===========
211211
SNMPv2-MIB::sysName.0 Name
212-
SNMPv2-MIB::sysLocation.0 Location WARN
212+
SNMPv2-MIB::sysLocation.0 Location WARN True
213213
SNMPv2-MIB::sysUpTime.0 Uptime int(value) / 100 * 24*3600 s value > 6*30 value > 2*365 True "3*30,None"
214-
========================= ============= ========================== ============ ======================= ======================= ================== ================== ================== ==========================
214+
========================= ============= ========================== ============ ======================= ======================= ================== ================== ================== ========================== ===========
215215

216216
The columns in detail:
217217

218-
* | Column 1: OID (String)
219-
| The Object-Identifier from any of your MIB files.
220-
* | Column 2: Name (String)
221-
| If provided, the check prints this instead of the OID.
222-
* | Column 3: Re-Calc (Python code, or empty)
223-
| Feel free to use any Python Code based on the variables ``value`` and ``values``, which contain the result (always a string) of the SNMPGET operation on the given OID.
224-
* | Column 4: Unit (String, or empty)
225-
| This is the "Unit of Measurement", case-insensitiv. One of:
218+
* | **OID**:
219+
| String. The Object-Identifier from any of your MIB files.
220+
* | **Name**:
221+
| String. If provided, the check prints this instead of the OID.
222+
* | **Re-Calc**:
223+
| Python code, or empty. Feel free to use any Python Code based on the variables ``value`` and ``values``, which contain the result (always a string) of the SNMPGET operation on the given OID.
224+
* | **Unit**:
225+
| String, or empty. This is the "Unit of Measurement", case-insensitiv. One of:
226226
227227
* s - seconds (also us, ms)
228228
* % - percentage
@@ -237,32 +237,20 @@ The columns in detail:
237237
* b - bytes
238238
* bps - bits per second
239239

240-
* | Column 5: WARN (Python condition, or empty)
241-
| The warning condition for the re-calculated or raw ``value``.
242-
* | Column 6: CRIT (Python condition, or empty)
243-
| The critical condition for the re-calculated or raw ``value``.
244-
* | Column 7: Show in first line (Bool, either "False", "True", or empty)
245-
| Should ``value`` be printed in the first line of the check output?
246-
* | Column 8: Report Change as (String, either "WARN", "CRIT", or empty)
247-
| Should a change of ``value`` be reported as ``WARN`` or ``CRIT``? The check stores the initial values on the first run in ``$TEMP/linuxfabrik-monitoring-plugins-snmp.db``.
248-
* | Column 9: Ignore in Perfdata (Bool, either "False", "True", or empty)
249-
| By default, all numeric values are automatically returned as perfdata objects. Set to ``True`` to exclude this item from the perfdata list.
250-
* | Column 10: Perfdata Alert Thresholds (Python tuple)
251-
| Add warning and critical thresholds to performance data by defining a valid Python tuple - first element for warning, second one for critical. Use double quotes around the tuple because the comma is the separator between the fields. Normally, the values of WARN and CRIT should be repeated here so that the actual thresholds used are written to the performance data.
252-
253-
The output would be something like this
254-
255-
.. code-block:: text
256-
257-
Uptime: 5M 1W
258-
259-
Key Value State
260-
--- ----- -----
261-
Name BRW38B1DB3B30F4 [OK]
262-
Location Office [OK]
263-
Contact The Printer Guy [OK]
264-
Description Brother NC-350w [OK]
265-
Uptime 5M 1W [WARNING]
240+
* | **WARN**:
241+
| Python condition, or empty. The warning condition for the re-calculated or raw ``value``.
242+
* | **CRIT**:
243+
| Python condition, or empty. The critical condition for the re-calculated or raw ``value``.
244+
* | **Show in first line**:
245+
| Bool, either "False", "True", or empty. Should ``value`` be printed in the first line of the check output?
246+
* | **Report Change as**:
247+
| String, either "WARN", "CRIT", or empty. Should a change of ``value`` be reported as ``WARN`` or ``CRIT``? The check stores the initial values on the first run in ``$TEMP/linuxfabrik-monitoring-plugins-snmp.db``.
248+
* | **Ignore in Perfdata**:
249+
| Bool, either "False", "True", or empty. By default, all numeric values are automatically returned as perfdata objects. Set to ``True`` to exclude this item from the perfdata list.
250+
* | **Perfdata Alert Thresholds**:
251+
| Python tuple. Add warning and critical thresholds to performance data by defining a valid Python tuple - first element for warning, second one for critical. Use double quotes around the tuple because the comma is the separator between the fields. Normally, the values of WARN and CRIT should be repeated here so that the actual thresholds used are written to the performance data.
252+
* | **Skip Output**:
253+
| Bool, either "False", "True", or empty. Should this row be included in the resulting table output? Set this to "True" if you only need the row for calculations.
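
A minimal sketch of how a device definition with the "Skip Output" column could be read, assuming Python's standard ``csv`` module with ``,`` as delimiter and ``"`` as quoting character, as the README describes. The rows are adapted from the README example; the placement of ``WARN`` and ``True`` in their columns, the ``COL_*`` names and the ``to_bool`` helper are assumptions made for illustration and are not the plugin's own code.

```python
import csv
import io

# Hypothetical device definition, adapted from the README example table.
DEVICE_CSV = '''OID,Name,Re-Calc,Unit Label,WARN,CRIT,Show in 1st Line,Report Change as,Ignore in Perfdata,Perfdata Alert Thresholds,Skip Output
SNMPv2-MIB::sysName.0,Name,,,,,,,,,
SNMPv2-MIB::sysLocation.0,Location,,,,,,WARN,,,True
SNMPv2-MIB::sysUpTime.0,Uptime,int(value) / 100 * 24*3600,s,value > 6*30,value > 2*365,True,,,"3*30,None",
'''

# Column indices used by this sketch (names are illustrative).
COL_NAME = 1
COL_PERF_THRESHOLDS = 9
COL_SKIP_OUTPUT = 10


def to_bool(text):
    """Simplified stand-in for a string-to-bool helper: only 'True' is truthy."""
    return text.strip().lower() == 'true'


def load_rows(text):
    """Parse the CSV text and return all rows except the header."""
    reader = csv.reader(io.StringIO(text), delimiter=',', quotechar='"')
    return list(reader)[1:]


rows = load_rows(DEVICE_CSV)

# Rows flagged with "Skip Output" stay available for calculations,
# but are left out of the table that gets printed.
visible = [row[COL_NAME] for row in rows if not to_bool(row[COL_SKIP_OUTPUT])]
print(visible)

# The quoted tuple survives as one field even though it contains a comma.
print(rows[2][COL_PERF_THRESHOLDS])
```

The quoting matters for the "Perfdata Alert Thresholds" column: without the double quotes, ``3*30,None`` would be split into two fields and shift every following column.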
266254
267255
The check divides the OID list automatically into blocks of 25 OIDs per SNMPGET request.
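
The splitting into blocks of 25 OIDs can be pictured with a short sketch. The ``chunk`` helper and the generated OID names are hypothetical and only illustrate the idea; the plugin's actual implementation is not shown in this diff.

```python
def chunk(items, size=25):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 60 hypothetical OIDs end up in three requests: 25 + 25 + 10.
oids = ['IF-MIB::ifInOctets.{}'.format(i) for i in range(1, 61)]
blocks = list(chunk(oids))
print([len(block) for block in blocks])  # [25, 25, 10]
```

Each block would then be passed to one SNMPGET request, which keeps a single response small enough to avoid a "tooBig" error from the agent.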
268256

@@ -291,8 +279,8 @@ IF-MIB::ifInOctets.1 NIC.1 rx int(value)
291279

292280

293281

294-
Parameter Mapping
295-
-----------------
282+
Parameter Mapping ``snmpget`` vs. this Plugin
283+
---------------------------------------------
296284

297285
================= ========================================================
298286
``snmpget`` This check
@@ -330,26 +318,6 @@ Example:
330318
10.80.32.141 NETGEAR-SWITCHING-MIB::agentInfoGroup
331319
332320
333-
Q & A
334-
-----
335-
336-
Q: **I get ``Too many object identifiers specified. Only 128 allowed in one request.``**
337-
338-
A: Probably your SNMP v3 parameters are incomplete or incorrect.
339-
340-
Q: **I get ``add_mibdir: strings scanned in from .snmp/mibs/.index are too large. count = ...``**
341-
342-
A: There seems to be a malformed, a duplicated MIB file or one with spaces in its filename within one of your MIB directories.
343-
344-
Q: **I get ``Error in packet. Reason: (tooBig) Response message would have been too large.``**
345-
346-
A: A "tooBig" response simply means that the SNMP agent tried to generate a response with all requested OID's, but the response grew too big for its buffer, resulting in this error message. To avoid this, we divide your OID list and send a maximum of 25 oids per request each.
347-
348-
Q: **Within Icinga, if I acknowledge a value change in WARN or CRIT state, does the plugin returns OK?**
349-
350-
A: If you acknowledge a value change in Icinga, the desired WARN or CRIT state remains - due to the fact that SNMP is mostly run against hardware, and you have to check what triggered the change. If everything is fine, delete ``$TEMP/linuxfabrik-monitoring-plugins-snmp.db``. On the next run of the plugin, it will recreate the inventory.
351-
352-
353321
States
354322
------
355323

@@ -373,6 +341,18 @@ Troubleshooting
373341
`IndexError: list index out of range`
374342
Something is wrong with your CSV file format. Try editing it in LibreOffice Calc, for example, to get the right amount of commas, quotes, etc.
375343

344+
Too many object identifiers specified. Only 128 allowed in one request.
345+
Probably your SNMP v3 parameters are incomplete or incorrect.
346+
347+
add_mibdir: strings scanned in from .snmp/mibs/.index are too large. count = ...
348+
There seems to be a malformed, a duplicated MIB file or one with spaces in its filename within one of your MIB directories.
349+
350+
Error in packet. Reason: (tooBig) Response message would have been too large.
351+
A "tooBig" response simply means that the SNMP agent tried to generate a response with all requested OID's, but the response grew too big for its buffer, resulting in this error message. To avoid this, we divide your OID list and send a maximum of 25 oids per request each.
352+
353+
Within Icinga, if I acknowledge a value change in WARN or CRIT state, does the plugin returns OK?
354+
If you acknowledge a value change in Icinga, the desired WARN or CRIT state remains - due to the fact that SNMP is mostly run against hardware, and you have to check what triggered the change. If everything is fine, delete ``$TEMP/linuxfabrik-monitoring-plugins-snmp.db``. On the next run of the plugin, it will recreate the inventory.
355+
376356

377357
Credits, License
378358
----------------

check-plugins/snmp/snmp

Lines changed: 49 additions & 33 deletions
@@ -26,7 +26,7 @@ from lib.globals import (STATE_CRIT, STATE_OK, # pylint: disable=C0413
2626
STATE_UNKNOWN, STATE_WARN)
2727

2828
__author__ = 'Linuxfabrik GmbH, Zurich/Switzerland'
29-
__version__ = '2025051902'
29+
__version__ = '2025052101'
3030

3131
DESCRIPTION = """This check is a SNMP application that uses the SNMP GET request to query for
3232
information on a network entity. The object identifiers (OIDs) of interest have
@@ -43,6 +43,8 @@ CSV_COL_SIFL = 6
4343
CSV_COL_RCA = 7
4444
CSV_COL_IGNPERF = 8 # added 2024052901
4545
CSV_COL_PERFTHRSHLD = 9 # added 2024052901
46+
CSV_COL_SKIPOUTPUT = 10 # added 2025052101
47+
# the last (non-existent) column contains the snmp result
4648

4749
DEFAULT_HIDE_TABLE = False
4850

@@ -348,6 +350,7 @@ def main():
348350

349351
# the last (non-existent) column should contain the result
350352
CSV_COL_VALUE = len(snmp_objects[0])
353+
351354
# evaluate results
352355
for snmp_object in snmp_objects[1:]:
353356
if len(snmp_object) <= 1:
@@ -386,19 +389,28 @@ def main():
386389
except:
387390
pass
388391

392+
# manage different CSV formats
393+
# v1: last column is CSV_COL_RCA
394+
# v2: last column is CSV_COL_PERFTHRSHLD, added 2024052901
395+
# v3: last column is CSV_COL_SKIPOUTPUT, added 2025052101
396+
skip_output = False
397+
ignore_perfdata = False
398+
perf_thresholds = False
399+
if CSV_COL_VALUE > CSV_COL_SKIPOUTPUT:
400+
# v3
401+
try:
402+
skip_output = lib.base.str2bool(snmp_object[CSV_COL_SKIPOUTPUT])
403+
except IndexError:
404+
# invalid csv definition
405+
pass
389406
if CSV_COL_VALUE > CSV_COL_RCA:
390-
# these two columns were added 2024052901
407+
# v2
391408
try:
392409
ignore_perfdata = lib.base.str2bool(snmp_object[CSV_COL_IGNPERF])
393410
perf_thresholds = snmp_object[CSV_COL_PERFTHRSHLD]
394411
except IndexError:
395-
# invalid csv definition, so ignore perfdata
396-
ignore_perfdata = False
397-
perf_thresholds = False
398-
else:
399-
# previous behaviour
400-
ignore_perfdata = False
401-
perf_thresholds = False
412+
# invalid csv definition
413+
pass
402414

403415
if recalc:
404416
# we got a formula
@@ -485,38 +497,41 @@ def main():
485497
lib.human.seconds2human(value),
486498
lib.base.state2str(value_state, prefix=' '),
487499
)
488-
if not args.HIDEOK or value_state:
489-
table_values.append({
490-
'name': name,
491-
'value': '{}{}'.format(lib.human.seconds2human(value), ''),
492-
'state': lib.base.state2str(value_state, empty_ok=False),
493-
})
500+
if not skip_output:
501+
if not args.HIDEOK or value_state:
502+
table_values.append({
503+
'name': name,
504+
'value': '{}{}'.format(lib.human.seconds2human(value), ''),
505+
'state': lib.base.state2str(value_state, empty_ok=False),
506+
})
494507
elif unit.lower() == 'b':
495508
if show_in_first_line:
496509
msg_header += '{}: {}{}, '.format(
497510
name,
498511
lib.human.bytes2human(value),
499512
lib.base.state2str(value_state, prefix=' '),
500513
)
501-
if not args.HIDEOK or value_state:
502-
table_values.append({
503-
'name': name,
504-
'value': '{}{}'.format(lib.human.bytes2human(value), ''),
505-
'state': lib.base.state2str(value_state, empty_ok=False),
506-
})
514+
if not skip_output:
515+
if not args.HIDEOK or value_state:
516+
table_values.append({
517+
'name': name,
518+
'value': '{}{}'.format(lib.human.bytes2human(value), ''),
519+
'state': lib.base.state2str(value_state, empty_ok=False),
520+
})
507521
elif unit.lower() == 'bps':
508522
if show_in_first_line:
509523
msg_header += '{}: {}{}, '.format(
510524
name,
511525
lib.human.bps2human(value),
512526
lib.base.state2str(value_state, prefix=' '),
513527
)
514-
if not args.HIDEOK or value_state:
515-
table_values.append({
516-
'name': name,
517-
'value': '{}{}'.format(lib.human.bps2human(value), ''),
518-
'state': lib.base.state2str(value_state, empty_ok=False),
519-
})
528+
if not skip_output:
529+
if not args.HIDEOK or value_state:
530+
table_values.append({
531+
'name': name,
532+
'value': '{}{}'.format(lib.human.bps2human(value), ''),
533+
'state': lib.base.state2str(value_state, empty_ok=False),
534+
})
520535
else:
521536
if show_in_first_line:
522537
msg_header += '{}: {}{}{}, '.format(
@@ -525,12 +540,13 @@ def main():
525540
unit,
526541
lib.base.state2str(value_state, prefix=' '),
527542
)
528-
if not args.HIDEOK or value_state:
529-
table_values.append({
530-
'name': name,
531-
'value': '{}{}'.format(value, unit),
532-
'state': lib.base.state2str(value_state, empty_ok=False),
533-
})
543+
if not skip_output:
544+
if not args.HIDEOK or value_state:
545+
table_values.append({
546+
'name': name,
547+
'value': '{}{}'.format(value, unit),
548+
'state': lib.base.state2str(value_state, empty_ok=False),
549+
})
534550

535551
# create perfdata for numeric values
536552
value_type = lib.base.guess_type(value)
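
The handling of the three CSV formats (v1, v2, v3) in the ``main()`` hunk above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the plugin's code: ``optional_columns`` and ``to_bool`` are hypothetical helpers, the rows do not carry an appended SNMP result, and the v2 columns are checked against ``CSV_COL_PERFTHRSHLD`` here, which is a choice made for this sketch.

```python
CSV_COL_IGNPERF = 8
CSV_COL_PERFTHRSHLD = 9
CSV_COL_SKIPOUTPUT = 10


def to_bool(text):
    """Simplified stand-in for a string-to-bool helper: only 'True' is truthy."""
    return str(text).strip().lower() == 'true'


def optional_columns(header, row):
    """Return (skip_output, ignore_perfdata, perf_thresholds) for one CSV row.

    Columns that the header does not define keep their defaults, so older
    device definitions continue to work unchanged.
    """
    column_count = len(header)
    skip_output = False
    ignore_perfdata = False
    perf_thresholds = False
    if column_count > CSV_COL_SKIPOUTPUT:
        # v3: the "Skip Output" column exists
        try:
            skip_output = to_bool(row[CSV_COL_SKIPOUTPUT])
        except IndexError:
            pass  # row is shorter than the header, keep the default
    if column_count > CSV_COL_PERFTHRSHLD:
        # v2 and later: the two perfdata columns exist
        try:
            ignore_perfdata = to_bool(row[CSV_COL_IGNPERF])
            perf_thresholds = row[CSV_COL_PERFTHRSHLD]
        except IndexError:
            pass  # row is shorter than the header, keep what was read so far
    return skip_output, ignore_perfdata, perf_thresholds


V1_HEADER = 'OID,Name,Re-Calc,Unit Label,WARN,CRIT,Show in 1st Line,Report Change as'.split(',')
V2_HEADER = V1_HEADER + ['Ignore in Perfdata', 'Perfdata Alert Thresholds']
V3_HEADER = V2_HEADER + ['Skip Output']

v1_row = ['SNMPv2-MIB::sysName.0', 'System Name', '', '', '', '', '', '']
v2_row = v1_row + ['True', '26,28']
v3_row = v2_row + ['True']

print(optional_columns(V1_HEADER, v1_row))
print(optional_columns(V2_HEADER, v2_row))
print(optional_columns(V3_HEADER, v3_row))
```

Deriving the format from the header length means a device file never has to declare its version explicitly; adding a column at the end is enough.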

check-plugins/snmp/unit-test/run

Lines changed: 26 additions & 3 deletions
@@ -63,9 +63,9 @@ class TestCheck(unittest.TestCase):
6363
self.assertEqual(stderr, '')
6464
self.assertEqual(retc, STATE_WARN)
6565

66-
# against EXAMPLE03.csv
67-
def test_if_check_runs_EXAMPLE03(self):
68-
stdout, stderr, retc = lib.base.coe(lib.shell.shell_exec(self.check + ' --hostname 127.0.0.3 --test=stdout/EXAMPLE03,,0'))
66+
# against EXAMPLE03.csv v1
67+
def test_if_check_runs_EXAMPLE03_v1(self):
68+
stdout, stderr, retc = lib.base.coe(lib.shell.shell_exec(self.check + ' --hostname 127.0.0.3 --test=stdout/EXAMPLE03,,0 --device=../unit-test/stdout/EXAMPLE03-v1.csv'))
6969
self.assertIn('Key ! Value ! State', stdout)
7070
self.assertIn('-------------------------+--------------+----------', stdout)
7171
self.assertIn('System Name ! My Appliance ! [OK]', stdout)
@@ -75,6 +75,29 @@ class TestCheck(unittest.TestCase):
7575
self.assertEqual(stderr, '')
7676
self.assertEqual(retc, STATE_CRIT)
7777

78+
# against EXAMPLE03.csv v2
79+
def test_if_check_runs_EXAMPLE03_v2(self):
80+
stdout, stderr, retc = lib.base.coe(lib.shell.shell_exec(self.check + ' --hostname 127.0.0.3 --test=stdout/EXAMPLE03,,0 --device=../unit-test/stdout/EXAMPLE03-v2.csv'))
81+
self.assertIn('Key ! Value ! State', stdout)
82+
self.assertIn('-------------------------+--------------+----------', stdout)
83+
self.assertIn('System Name ! My Appliance ! [OK]', stdout)
84+
self.assertIn('System Location ! My Location ! [OK]', stdout)
85+
self.assertIn('Supply Air Temperature ! 27.0 ! [WARNING]', stdout)
86+
self.assertIn('Rack Inlet Temperature 1 ! 36.0 ! [CRITICAL]', stdout)
87+
self.assertEqual(stderr, '')
88+
self.assertEqual(retc, STATE_CRIT)
89+
90+
# against EXAMPLE03.csv v3
91+
def test_if_check_runs_EXAMPLE03_v3(self):
92+
stdout, stderr, retc = lib.base.coe(lib.shell.shell_exec(self.check + ' --hostname 127.0.0.3 --test=stdout/EXAMPLE03,,0 --device=../unit-test/stdout/EXAMPLE03-v3.csv'))
93+
self.assertIn('Key ! Value ! State', stdout)
94+
self.assertIn('-------------------------+--------------+----------', stdout)
95+
self.assertIn('System Name ! My Appliance ! [OK]', stdout)
96+
self.assertIn('System Location ! My Location ! [OK]', stdout)
97+
self.assertIn('Rack Inlet Temperature 1 ! 36.0 ! [CRITICAL]', stdout)
98+
self.assertEqual(stderr, '')
99+
self.assertEqual(retc, STATE_CRIT)
100+
78101

79102
if __name__ == '__main__':
80103
unittest.main()
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
OID,Name,Re-Calc,Unit Label,WARN,CRIT,Show in 1st Line,Report Change as
2+
SNMPv2-MIB::sysName.0,System Name,,,,,,
3+
SNMPv2-MIB::sysLocation.0,System Location,,,,,,
4+
SNMPv2-SMI::enterprises.318.1.1.27.1.4.1.2.1.3.1.2,Supply Air Temperature,int(value) / 10,,value > 26,value > 28,,
5+
SNMPv2-SMI::enterprises.318.1.1.27.1.4.1.2.1.3.1.37,Rack Inlet Temperature 1,int(value) / 10,,value > 28,value > 30,,
File renamed without changes.
