check-plugins/disk-io/README.md
75 additions & 34 deletions
@@ -8,10 +8,22 @@ On Linux, the check plugin by default tries to find "important" disks automatica
Disk I/O always starts at 10 MiB/sec, but stores the highest measured bandwidth, so it adjusts the `RWmax/s` value accordingly. For this reason, this check takes some time to warm up its (cached) readings: the check will throw some warnings and criticals during the first major disk activities above 10 MiB/sec until the maximum bandwidth of the disk has been determined.
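The warm-up behavior can be illustrated with a minimal Python sketch. It is not the plugin's actual code: the names `INITIAL_MAX`, `update_max` and `load_percent` are hypothetical, and the sketch assumes that the stored maximum is updated before the percentage is computed.

```python
MIB = 1024 * 1024
INITIAL_MAX = 10 * MIB  # starting value for RWmax/s, as stated above


def update_max(stored_max, measured):
    """Return the highest bandwidth seen so far (bytes per second)."""
    return max(stored_max, measured)


def load_percent(measured, stored_max):
    """Express the current bandwidth as a percentage of the stored maximum."""
    return measured / stored_max * 100.0


# Hypothetical readings: two bursts while warming up, then a moderate load.
stored_max = INITIAL_MAX
history = []
for measured in (50 * MIB, 200 * MIB, 50 * MIB):
    stored_max = update_max(stored_max, measured)  # assumption: update first
    history.append(load_percent(measured, stored_max))
```

In this sketch the first two readings each sit at 100% of the maximum known so far and would count as threshold violations. Once the real ceiling of 200 MiB/sec is cached, the same 50 MiB/sec is only 25%.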
-Example: The (shortened) result of `./disk-io --count 5 --warning 80 --critical 90` could look like this:
+
+### iowait (Linux only)
+
+On Linux, the check also monitors the system-wide iowait percentage. iowait represents CPU time spent idle while waiting for I/O operations to complete. While technically a CPU metric, its diagnostic value is entirely in the disk I/O context, which is why it is part of this check rather than a separate one.
+
+The raw iowait value is normalized by multiplying it by the number of logical CPUs, so that 100% always means one CPU core is fully I/O-saturated, regardless of the total number of CPUs. Values above 100% indicate that more than one core is waiting for I/O. This normalization approach is inspired by [Glances](https://github.com/nicolargo/glances), which uses `100 / N` (where N = number of CPUs) as its critical threshold for raw iowait. The reason such thresholds appear low in Glances is that raw iowait is reported as a percentage of total CPU time across all cores: on a 4-core system, 25% raw iowait already means one entire core is doing nothing but waiting for I/O. By normalizing the value, the default thresholds (80/90%) work consistently across any hardware.
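The arithmetic behind the normalization can be sketched in a few lines of Python. The function names are hypothetical and only illustrate the formulas given in the text. How the plugin obtains the raw iowait value and the CPU count is not shown here.

```python
def normalize_iowait(raw_percent, logical_cpus):
    """Scale system-wide iowait so that 100% equals one fully I/O-bound core."""
    return raw_percent * logical_cpus


def glances_critical_threshold(logical_cpus):
    """Raw iowait critical threshold of 100 / N, as described for Glances."""
    return 100.0 / logical_cpus


# On a 4-core system, 25% raw iowait corresponds to one saturated core.
one_core = normalize_iowait(25.0, 4)
# 50% raw iowait on the same system means two cores are waiting for I/O.
two_cores = normalize_iowait(50.0, 4)
```

With this scaling, a normalized warning threshold of 80% corresponds to 20% raw iowait on 4 logical CPUs and to 5% raw iowait on 16.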
+
+Like bandwidth alerts, iowait alerts only trigger after `--count` consecutive threshold violations, suppressing short spikes.
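The effect of `--count` can be sketched as follows. This is a simplified illustration, not the plugin's implementation: the function `evaluate` is hypothetical, it assumes `count` is at least 1, and it assumes a violation means "strictly above the threshold".

```python
def evaluate(history, count, warning, critical):
    """Return 'CRIT', 'WARN' or 'OK' based on the last `count` values."""
    recent = history[-count:]
    if len(recent) < count:
        return 'OK'  # not enough samples collected yet
    if all(value > critical for value in recent):
        return 'CRIT'
    if all(value > warning for value in recent):
        return 'WARN'
    return 'OK'


# Five consecutive values above 90% lead to CRIT ...
sustained = evaluate([50, 95, 95, 95, 95, 95], 5, 80, 90)
# ... while a single value below the thresholds resets the alert.
spike = evaluate([95, 95, 95, 95, 50], 5, 80, 90)
```

A run in which every value exceeds the warning threshold but only some exceed the critical one yields WARN in this sketch.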
+
+
+### Example
+
+The (shortened) result of `./disk-io --count 5 --warning 80 --critical 90` could look like this:
```text
-/dev/dm-4: 0.0B/s read1, 48.7KiB/s write1, 48.7KiB/s total, 227.9MiB/s max
@@ -138,10 +172,17 @@ Top 5 processes that generate the most I/O traffic (r/w):
## States
* WARN or CRIT if the bandwidth over the last n measured values is above a certain percentage, compared to the all-time maximum bandwidth of this drive.
+* WARN or CRIT if iowait exceeds the threshold for `--count` consecutive runs (Linux only).