Commit 9a1a9d6

Add lparstats check (AIX only) (DataDog#23451)
* Add lparstats check (AIX only)

  Port the lparstats check from datadog-unix-agent to integrations-core, updated for Python 3 and datadog-checks-base. This check collects IBM POWER LPAR performance metrics on AIX via the `lparstat` command:

  - Memory statistics (system.lpar.memory.*)
  - Hypervisor call statistics (system.lpar.hypervisor.*)
  - I/O memory entitlements (system.lpar.memory.entitlement.*)
  - SPURR processor utilization (system.lpar.spurr.*)

  The manifest explicitly declares "Supported OS::AIX" as this check relies on lparstat which is exclusive to IBM AIX on POWER hardware.

* Fix validation issues: license headers, CHANGELOG, metadata sort, spec.yaml, labeler, ci

* Generate config models, sync conf.yaml.example, sync CI, fix lint

* Add missing [project.optional-dependencies] section to pyproject.toml

* Add basic unit tests with mocked lparstat output

* Fix test: remove dd_environment fixture (no Docker needed)

* Remove unused pytest import

* Fix test: remove assert_all_metrics_covered, check key metrics only

* Add dd_environment fixture and rebase onto master

* Fix manifest: remove invalid Supported OS::AIX tag, add auto_install and source_type_id

* Apply review nits: f-strings, x.split(), hypervisor guard, dev status 5, remove setup.cfg, skip e2e on non-AIX

* lparstats: declare SPURR .pct metrics as fraction not percent

  Values emitted by collect_spurr are in the [0,1] range (e.g. 0.015), not [0,100], so unit_name=percent was incorrect.

* lparstats: set curated_metric=core for system.lpar.memory.physb

  This is the manifest's primary check metric; marking it as core aligns with the convention used by other integrations.

* lparstats: always apply DEFAULT_TIMEOUT, not only under sudo

  A hung lparstat call can block the check regardless of whether sudo is in use; unconditionally setting the timeout is the safer default.

* lparstats: strip % from field names in collect_memory_entitlements

  collect_memory already strips % from its fields; apply the same treatment in collect_memory_entitlements so a %-suffixed column in lparstat -m -eR output produces a valid metric name.

* lparstats: add type hints to all callables

* lparstats: replace HYPERVISOR_IDX_METRIC_MAP dict with a tuple

  Contiguous integer keys 0..4 are just positional indices; a tuple is simpler and removes the need for a dict-lookup pattern.

* lparstats: patch subprocess.run in tests instead of private _run_cmd

  Coupling tests to the internal _run_cmd helper is fragile; patching the public subprocess.run interface is more stable and matches the project testing guidelines.

* lparstats: use dd_run_check fixture instead of check.check(instance)

* lparstats: add tests for hypervisor and memory-entitlements collectors

  Both collectors had zero coverage because the default instance fixture disables them. Add a dedicated test that enables both and asserts that expected metrics are emitted.

* lparstats: add tag assertions for all collectors

  Assert that memory/SPURR metrics carry no tags and that hypervisor (call:<name>) and entitlement (iompn:<name>) tags are present.

* lparstats: guard os.getuid() with hasattr check

  os.getuid() does not exist on Windows; wrap it so the module can be imported on non-Unix platforms without raising AttributeError.

* lparstats: check returncode in all collectors, add lparstats.can_collect service check

  Each collector now inspects the lparstat return code and skips metric emission (with a warning) on failure instead of silently parsing empty output. A new lparstats.can_collect service check is emitted OK when all enabled collectors succeed and CRITICAL if any lparstat invocation exits non-zero.

* lparstats: fix fragile SPURR actual-vs-normalized column split

  The previous split used idx > len(fields)/2, which silently breaks if lparstat -E ever changes its column count. Split at the freq column instead (it reliably separates actual from normalized), with a fallback warning if freq is absent.

* lparstats: extract _lparstat_rows helper to deduplicate parsing prefix

  All four collectors shared the same _run_cmd → splitlines → filter → slice pattern. _lparstat_rows(cmd, start_idx, ...) centralises it and returns (rows, stderr, returncode) so callers can still inspect the exit code.

* lparstats: fix curated_metric value for system.lpar.memory.physb

  Valid values are cpu and memory; core is not accepted by the validator.

* lparstats: disable e2e env, remove dd_environment fixture

  Set e2e-env = false in hatch.toml so CI does not try to spin up an e2e environment for an AIX-only check that can never run on Linux CI. Remove the now-redundant dd_environment fixture from conftest.py.

* lparstats: set owner to agent-integrations
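The commit message describes a `_lparstat_rows(cmd, start_idx, ...)` helper that returns `(rows, stderr, returncode)`, together with a timeout that is always applied. The following is a minimal sketch of that pattern, not the code from this commit: the function bodies, the timeout value, the `-1` sentinel return code, and the blank-line filter are all assumptions made for illustration.

```python
import subprocess
from typing import List, Sequence, Tuple

# Assumed value; the commit message names DEFAULT_TIMEOUT but does not state it.
DEFAULT_TIMEOUT = 10.0


def run_cmd(cmd: Sequence[str], timeout: float = DEFAULT_TIMEOUT) -> Tuple[str, str, int]:
    """Run a command with a timeout and return (stdout, stderr, returncode).

    Failures to start the process and timeouts are reported with a
    hypothetical sentinel return code of -1 instead of raising.
    """
    try:
        proc = subprocess.run(list(cmd), capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "", f"timed out after {timeout}s", -1
    except OSError as exc:  # for example, the binary does not exist
        return "", str(exc), -1
    return proc.stdout, proc.stderr, proc.returncode


def lparstat_rows(
    cmd: Sequence[str], start_idx: int, timeout: float = DEFAULT_TIMEOUT
) -> Tuple[List[List[str]], str, int]:
    """Run cmd, drop blank lines, skip start_idx header lines, split each row on whitespace."""
    stdout, stderr, returncode = run_cmd(cmd, timeout)
    lines = [line for line in stdout.splitlines() if line.strip()]
    rows = [line.split() for line in lines[start_idx:]]
    return rows, stderr, returncode
```

Returning the exit code alongside the rows lets each caller decide whether to skip metric emission, which is the behaviour the commit message attributes to the collectors.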
1 parent 6597bfd commit 9a1a9d6

23 files changed

Lines changed: 1019 additions & 0 deletions

.codecov.yml

Lines changed: 9 additions & 0 deletions
@@ -430,6 +430,10 @@ coverage:
         target: 75
         flags:
         - kyototycoon
+      LPARStats:
+        target: 75
+        flags:
+        - lparstats
       Lighttpd:
         target: 75
         flags:
@@ -1408,6 +1412,11 @@ flags:
     paths:
     - kyverno/datadog_checks/kyverno
     - kyverno/tests
+  lparstats:
+    carryforward: true
+    paths:
+    - lparstats/datadog_checks/lparstats
+    - lparstats/tests
   lighttpd:
     carryforward: true
     paths:

.github/workflows/config/labeler.yml

Lines changed: 4 additions & 0 deletions
@@ -890,6 +890,10 @@ integration/langchain:
 - changed-files:
   - any-glob-to-any-file:
     - langchain/**/*
+integration/lparstats:
+- changed-files:
+  - any-glob-to-any-file:
+    - lparstats/**/*
 integration/lastpass:
 - changed-files:
   - any-glob-to-any-file:

.github/workflows/test-all.yml

Lines changed: 20 additions & 0 deletions
@@ -2457,6 +2457,26 @@ jobs:
       minimum-base-package: ${{ inputs.minimum-base-package }}
       pytest-args: ${{ inputs.pytest-args }}
     secrets: inherit
+  jefcf576:
+    uses: ./.github/workflows/test-target.yml
+    with:
+      job-name: LPARStats
+      target: lparstats
+      platform: linux
+      runner: '["ubuntu-22.04"]'
+      repo: "${{ inputs.repo }}"
+      context: ${{ inputs.context }}
+      python-version: "${{ inputs.python-version }}"
+      latest: ${{ inputs.latest }}
+      agent-image: "${{ inputs.agent-image }}"
+      agent-image-py2: "${{ inputs.agent-image-py2 }}"
+      agent-image-windows: "${{ inputs.agent-image-windows }}"
+      agent-image-windows-py2: "${{ inputs.agent-image-windows-py2 }}"
+      test-py2: ${{ inputs.test-py2 }}
+      test-py3: ${{ inputs.test-py3 }}
+      minimum-base-package: ${{ inputs.minimum-base-package }}
+      pytest-args: ${{ inputs.pytest-args }}
+    secrets: inherit
   je63e92c:
     uses: ./.github/workflows/test-target.yml
     with:

lparstats/CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+# CHANGELOG - lparstats
+
+<!-- towncrier release notes start -->

lparstats/README.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+# LPARStats
+
+## Overview
+
+The LPARStats check collects performance metrics from IBM POWER Logical Partitions (LPARs)
+running AIX by parsing the output of the `lparstat` command.
+
+**This check is only supported on AIX.** It relies on the `lparstat` utility, which is
+exclusive to IBM AIX on POWER hardware.
+
+Metrics collected:
+
+- **Memory statistics** (`system.lpar.memory.*`): physical memory usage, page statistics,
+  I/O memory pool utilization.
+- **Hypervisor call statistics** (`system.lpar.hypervisor.*`): per-call counts and latency
+  for hypervisor calls. Requires root or sudo.
+- **I/O memory entitlements** (`system.lpar.memory.entitlement.*`): per-pool entitlement
+  and allocation data. Requires root or sudo.
+- **SPURR processor utilization** (`system.lpar.spurr.*`): actual and normalized physical
+  processor utilization rates.
+
+## Setup
+
+### Installation
+
+The LPARStats check is included in the [Datadog Agent][1] package for AIX. No additional
+installation is needed.
+
+### Configuration
+
+1. Edit the `lparstats.d/conf.yaml` file in your Agent's `conf.d/` directory.
+   See the [sample lparstats.d/conf.yaml][2] for all available configuration options.
+
+2. To collect hypervisor and memory entitlement metrics, the Agent must run as root, or
+   the `dd-agent` user must be granted sudo access to `lparstat`:
+
+   ```
+   dd-agent ALL=(root) NOPASSWD: /usr/bin/lparstat
+   ```
+
+3. [Restart the Agent][3].
+
+### Validation
+
+Run the [Agent's status subcommand][4] and look for `lparstats` under the Checks section.
+
+## Data Collected
+
+### Metrics
+
+See [metadata.csv][5] for a list of metrics provided by this check.
+
+### Service Checks
+
+`lparstats.can_collect`
+: Returns `CRITICAL` if any `lparstat` sub-command exits with a non-zero return code. Returns `OK` otherwise.
+
+### Events
+
+The LPARStats check does not include any events.
+
+## Support
+
+Need help? Contact [Datadog support][6].
+
+[1]: https://app.datadoghq.com/account/settings/agent/latest
+[2]: https://github.com/DataDog/integrations-core/blob/master/lparstats/datadog_checks/lparstats/data/conf.yaml.example
+[3]: https://docs.datadoghq.com/agent/guide/agent-commands/#start-stop-and-restart-the-agent
+[4]: https://docs.datadoghq.com/agent/guide/agent-commands/#agent-status-and-information
+[5]: https://github.com/DataDog/integrations-core/blob/master/lparstats/metadata.csv
+[6]: https://docs.datadoghq.com/help/
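The README lists SPURR utilization as "actual and normalized" rates, and the commit message says the two groups are separated at the `freq` column of `lparstat -E` output, with a warning if `freq` is absent. A minimal sketch of that split follows. It is not the check's implementation: the function name, the sample column names, and the choice to return empty results when `freq` is missing are assumptions.

```python
import logging
from typing import Dict, List, Tuple

log = logging.getLogger(__name__)


def split_spurr(fields: List[str], values: List[str]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Split one row of values into (actual, normalized) dicts around the freq column.

    Columns before freq are treated as actual, columns after it as normalized.
    The freq column itself is skipped because it is not a plain number.
    """
    actual: Dict[str, float] = {}
    normalized: Dict[str, float] = {}
    if "freq" not in fields:
        # Hypothetical fallback: warn and emit nothing rather than guess the split.
        log.warning("freq column not found in lparstat -E header: %s", fields)
        return actual, normalized

    freq_idx = fields.index("freq")
    for idx, (name, raw) in enumerate(zip(fields, values)):
        if idx == freq_idx:
            continue
        target = actual if idx < freq_idx else normalized
        target[name] = float(raw)
    return actual, normalized
```

Keying the split on a named column rather than on `len(fields) / 2` means an extra column on either side of `freq` does not shift values into the wrong group.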
Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+name: LPARStats
+files:
+- name: lparstats.yaml
+  options:
+  - template: init_config
+    options:
+    - template: init_config/default
+  - template: instances
+    options:
+    - name: name
+      description: A name for this instance.
+      required: false
+      value:
+        type: string
+        example: lparstats
+    - name: sudo
+      description: |
+        Run lparstat with sudo. Requires adding dd-agent to the sudoers file:
+
+          dd-agent ALL=(ALL) NOPASSWD: /usr/bin/lparstat
+
+        When running as root, sudo is not needed.
+      required: false
+      value:
+        type: boolean
+        example: false
+    - name: memory_stats
+      description: Collect physical memory and page statistics (lparstat -m).
+      required: false
+      value:
+        type: boolean
+        example: true
+    - name: page_stats
+      description: Include page-level statistics (-pw flag). Requires memory_stats to be true.
+      required: false
+      value:
+        type: boolean
+        example: true
+    - name: memory_entitlements
+      description: |
+        Collect per-I/O-memory-pool entitlement stats (lparstat -m -eR).
+        Requires root or sudo.
+      required: false
+      value:
+        type: boolean
+        example: true
+    - name: hypervisor
+      description: |
+        Collect hypervisor call statistics (lparstat -H).
+        Requires root or sudo.
+      required: false
+      value:
+        type: boolean
+        example: true
+    - name: spurr_utilization
+      description: Collect SPURR physical processor utilization (lparstat -E).
+      required: false
+      value:
+        type: boolean
+        example: true
+    - template: instances/default
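The option descriptions in the spec above name the `lparstat` flags each collector uses. As a rough illustration of how those options could map to command lines, here is a sketch. The function name and dictionary keys are hypothetical, the defaults are taken from the `example` values in the spec, and applying the `sudo` prefix to every command is an assumption; the real check may also pass additional arguments that the spec does not mention.

```python
from typing import Dict, List, Mapping


def build_commands(instance: Mapping[str, bool]) -> Dict[str, List[str]]:
    """Map instance options to lparstat argument lists, keyed by collector name."""
    # Assumption: when sudo is enabled it prefixes every lparstat invocation.
    prefix = ["sudo"] if instance.get("sudo", False) else []
    commands: Dict[str, List[str]] = {}

    if instance.get("memory_stats", True):
        flags = ["-m"]
        # page_stats only has an effect when memory_stats is enabled.
        if instance.get("page_stats", True):
            flags.append("-pw")
        commands["memory"] = prefix + ["lparstat"] + flags

    if instance.get("memory_entitlements", True):
        commands["memory_entitlements"] = prefix + ["lparstat", "-m", "-eR"]

    if instance.get("hypervisor", True):
        commands["hypervisor"] = prefix + ["lparstat", "-H"]

    if instance.get("spurr_utilization", True):
        commands["spurr"] = prefix + ["lparstat", "-E"]

    return commands
```

For example, `build_commands({"sudo": True, "memory_stats": False})` leaves out the memory collector and prefixes the remaining three commands with `sudo`.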
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+[
+    {
+        "agent_version": "7.0.0",
+        "integration": "lparstats",
+        "check": "lparstats.can_collect",
+        "statuses": ["ok", "critical"],
+        "groups": [],
+        "name": "LPARStats Can Collect",
+        "description": "Returns `CRITICAL` if any `lparstat` sub-command fails (non-zero exit code). Returns `OK` otherwise."
+    }
+]
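The service check definition above reduces to a single rule: `CRITICAL` if any `lparstat` invocation exits non-zero, `OK` otherwise. A minimal sketch of that rule follows. The function name and the shape of the input mapping are hypothetical; the numeric constants mirror the `OK` and `CRITICAL` status values used by the Agent's check base class.

```python
from typing import Mapping

# Same numeric values as the Agent's OK and CRITICAL service check statuses.
OK = 0
CRITICAL = 2


def can_collect_status(returncodes: Mapping[str, int]) -> int:
    """Return CRITICAL if any collector's lparstat exit code is non-zero, else OK.

    returncodes maps a collector name to the exit code of its lparstat call.
    An empty mapping (no collectors enabled) yields OK in this sketch.
    """
    return CRITICAL if any(rc != 0 for rc in returncodes.values()) else OK
```

Collecting the exit codes first and deciding once at the end keeps one failed sub-command from hiding the results of the others.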
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# (C) Datadog, Inc. 2026-present
+# All rights reserved
+# Licensed under a 3-clause BSD style license (see LICENSE)
+
+__version__ = '0.1.0'
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+# (C) Datadog, Inc. 2026-present
+# All rights reserved
+# Licensed under a 3-clause BSD style license (see LICENSE)
+
+from .__about__ import __version__
+from .lparstats import LPARStats
+
+__all__ = ['__version__', 'LPARStats']
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# (C) Datadog, Inc. 2026-present
+# All rights reserved
+# Licensed under a 3-clause BSD style license (see LICENSE)
+
+# This file is autogenerated.
+# To change this file you should edit assets/configuration/spec.yaml and then run the following commands:
+#     ddev -x validate config -s <INTEGRATION_NAME>
+#     ddev -x validate models -s <INTEGRATION_NAME>
+
+from .instance import InstanceConfig
+from .shared import SharedConfig
+
+
+class ConfigMixin:
+    _config_model_instance: InstanceConfig
+    _config_model_shared: SharedConfig
+
+    @property
+    def config(self) -> InstanceConfig:
+        return self._config_model_instance
+
+    @property
+    def shared_config(self) -> SharedConfig:
+        return self._config_model_shared
