Commit 57d05ec

[AI-6534] Add Dell Powerflex Integration (DataDog#23183)
* Scaffolding
* validate
* update labeler
* format
* update changelog
* Add can_connect metric
* lint
* collect system and system statistics
* Add API component
* Add support for volumes
* Simplify code
* Add support for storage pools
* simplify code
* Add support for protection domains
* Add sds support
* parameterize exception test
* Add sdc metrics
* Add device metrics
* Add common metrics
* remove resource prefix and tag by resource
* Add auth token config
* Add resource filters and events
* handle token workflow
* update event query and fix tests
* Add support for alerts
* change to last updated
* Add more storage pool metrics
* rename constants
* address comments and clean up tests
* Address more feedback
* simplify code more
* remove dead code and add test for unauthenticated mode
* Add overrides and sync models
* validate
* Add more debug logging and add collect_events_and_alerts function
* replace direct access with .get
* fix types
* fix tag fallback
* update constant naming and spacing, device statistics powered off by default
* Add test for device disabled by default
* Add severity remapping
* Allow multiple filters per resource
* match nutanix and vsphere filters better
* Add e2e
* Fix types
* validate config
* fill coverage gaps and more types
* Add comment for get_alerts
* Add count metric for each resource and fix resource filters bug
* use min collection interval to handle expiry and also handle api errors gracefully
* remove unneeded tests
* clean up tests
* more test cleanup
* remove unneeded test
* fix style
* apply user defined tags and fix metadata
* remove unneeded defaults
* validate ci
* lint
* ignore type
* Add thread pool
* lint
* add exception case and remove unneeded ensure auth
* ensure auth first, and update tests accordingly
* update readme and event source
* Address comments
* Add loop to simplify tests
* update metadata with suggestions
* Apply suggestions from code review

Co-authored-by: Alicia Thuerk <alicia.thuerk@datadoghq.com>

* update coverage
* Refactor constants and assert all metrics and tags in unit and e2e test
* Add scope
* Harden test
* Add another test case
* update config
* Fix metadata and fix filter fallback
* Add type
* Handle exception better
* Add catch all exception handling and fix types
* Add stack trace debug line
* fix e2e test
* remove unneeded ALL_METRICS
* Fix E2E test and clean up constants
* lint
* Fix types
* address comments

---------

Co-authored-by: Alicia Thuerk <alicia.thuerk@datadoghq.com>
1 parent 55a989b commit 57d05ec

58 files changed

Lines changed: 6836 additions & 1 deletion

.ddev/config.toml

Lines changed: 3 additions & 1 deletion
@@ -46,6 +46,7 @@ n8n = "n8n"
 hpe_aruba_edgeconnect = "HPE Aruba EdgeConnect"
 control_m = "Control-M"
 nifi = "Apache NiFi"
+dell_powerflex = "Dell Powerflex"
 
 [overrides.metrics-prefix]
 krakend = "krakend.api."
@@ -54,6 +55,7 @@ prefect = "prefect.server."
 n8n = "n8n."
 control_m = "control_m."
 nifi = "nifi."
+dell_powerflex = "dell_powerflex."
 hpe_aruba_edgeconnect = "hpe_aruba_edgeconnect."
 
 [overrides.ci.ddev]
@@ -204,7 +206,6 @@ kube_scheduler = "kube_scheduler"
 nifi = "nifi"
 nginx_ingress_controller = "nginx-ingress-controller"
 
-
 [overrides.dep.updates]
 exclude = [
     'pyasn1', # https://github.com/pyasn1/pyasn1/issues/52
@@ -273,5 +274,6 @@ prefect = ["linux", "windows", "mac_os"]
 n8n = ["linux", "windows", "mac_os"]
 control_m = ["linux", "windows", "mac_os"]
 nifi = ["linux", "windows", "mac_os"]
+dell_powerflex = ["linux", "windows", "mac_os"]
 hpe_aruba_edgeconnect = ["linux", "windows", "mac_os"]

.github/workflows/config/labeler.yml

Lines changed: 4 additions & 0 deletions
@@ -445,6 +445,10 @@ integration/delinea_secret_server:
 - changed-files:
   - any-glob-to-any-file:
     - delinea_secret_server/**/*
+integration/dell_powerflex:
+- changed-files:
+  - any-glob-to-any-file:
+    - dell_powerflex/**/*
 integration/directory:
 - changed-files:
   - any-glob-to-any-file:

.github/workflows/test-all.yml

Lines changed: 20 additions & 0 deletions
@@ -978,6 +978,26 @@ jobs:
       minimum-base-package: ${{ inputs.minimum-base-package }}
       pytest-args: ${{ inputs.pytest-args }}
     secrets: inherit
+  jc346754:
+    uses: ./.github/workflows/test-target.yml
+    with:
+      job-name: Dell Powerflex
+      target: dell_powerflex
+      platform: linux
+      runner: '["ubuntu-22.04"]'
+      repo: "${{ inputs.repo }}"
+      context: ${{ inputs.context }}
+      python-version: "${{ inputs.python-version }}"
+      latest: ${{ inputs.latest }}
+      agent-image: "${{ inputs.agent-image }}"
+      agent-image-py2: "${{ inputs.agent-image-py2 }}"
+      agent-image-windows: "${{ inputs.agent-image-windows }}"
+      agent-image-windows-py2: "${{ inputs.agent-image-windows-py2 }}"
+      test-py2: ${{ inputs.test-py2 }}
+      test-py3: ${{ inputs.test-py3 }}
+      minimum-base-package: ${{ inputs.minimum-base-package }}
+      pytest-args: ${{ inputs.pytest-args }}
+    secrets: inherit
   jc8f84c3:
     uses: ./.github/workflows/test-target.yml
     with:

code-coverage.datadog.yml

Lines changed: 3 additions & 0 deletions
@@ -226,6 +226,9 @@ services:
   - id: ddev
     paths:
       - ddev/src/ddev/
+  - id: dell_powerflex
+    paths:
+      - dell_powerflex/datadog_checks/dell_powerflex/
   - id: directory
     paths:
       - directory/datadog_checks/directory/

dell_powerflex/CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+# CHANGELOG - Dell PowerFlex
+
+<!-- towncrier release notes start -->

dell_powerflex/README.md

Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
+# Agent Check: Dell PowerFlex
+
+## Overview
+
+This check monitors [Dell PowerFlex][1] software-defined storage environments through the Datadog Agent. It collects metrics, events, and alerts from the PowerFlex Gateway REST API across the following resource types:
+
+- **Systems**: MDM cluster state, capacity, and I/O statistics
+- **Protection Domains**: capacity, rebuild, rebalance, and I/O metrics
+- **Storage Pools**: capacity utilization, usage ratios, and throughput
+- **Volumes**: per-volume I/O and SDC mappings
+- **SDS (Storage Data Servers)**: device counts, capacity, cache, and I/O
+- **SDC (Storage Data Clients)**: mapped volumes and user data I/O
+- **Devices**: read/write latency, capacity, and I/O bandwidth
+
+## Setup
+
+### Installation
+
+The Dell PowerFlex check is included in the [Datadog Agent][2] package. No additional installation is needed on your server.
+
+### Configuration
+
+1. Edit the `dell_powerflex.d/conf.yaml` file in the `conf.d/` folder at the root of your [Agent's configuration directory][3] to start collecting your Dell PowerFlex metrics. See the [sample dell_powerflex.d/conf.yaml][4] for all available configuration options.
+
+   ```yaml
+   instances:
+     - powerflex_gateway_url: https://<GATEWAY_HOST>:443
+       powerflex_username: <USERNAME>
+       powerflex_password: <PASSWORD>
+   ```
+
+2. [Restart the Agent][5].
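
Reviewer note: the configuration spec in this commit describes `powerflex_username`, `powerflex_password`, and `powerflex_client_id` as Keycloak/OAuth2 credentials, but the token exchange itself is not visible in this part of the diff. The sketch below shows one way such credentials could be packaged, assuming an OAuth2 password grant (RFC 6749, section 4.3). The function name `build_token_request` and the `token_path` value are hypothetical placeholders, not the integration's actual code or the Gateway's real endpoint.

```python
from urllib.parse import urlencode


def build_token_request(gateway_url, username, password,
                        client_id="powerflexUI",
                        token_path="/hypothetical/token"):
    """Build the URL, form body, and headers for an OAuth2 password-grant request.

    ASSUMPTION: the password grant and `token_path` are illustrative only;
    the real endpoint and grant type are not shown in this diff.
    """
    url = gateway_url.rstrip("/") + token_path
    # Standard OAuth2 password-grant form fields (RFC 6749, section 4.3.2).
    body = urlencode({
        "grant_type": "password",
        "client_id": client_id,
        "username": username,
        "password": password,
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return url, body, headers


url, body, headers = build_token_request("https://gateway.example:443/", "monitor", "p@ss")
```

The default `client_id` mirrors the `powerflexUI` default declared in the spec; everything else here is an assumption for illustration.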
+
+#### Optional: Event and alert collection
+
+To collect events and alerts from the PowerFlex Gateway, enable them in your configuration:
+
+```yaml
+instances:
+  - powerflex_gateway_url: https://<GATEWAY_HOST>:443
+    powerflex_username: <USERNAME>
+    powerflex_password: <PASSWORD>
+    collect_events: true
+    collect_alerts: true
+```
+
+#### Optional: Resource filtering
+
+Use `resource_filters` to control which resources are collected and whether per-resource statistics API calls are made. This is useful for large environments where you want to limit the number of API calls. Exclude filters take precedence over include filters. By default, all resources are collected with statistics enabled, except for devices which have statistics disabled by default.
+
+```yaml
+instances:
+  - powerflex_gateway_url: https://<GATEWAY_HOST>:443
+    powerflex_username: <USERNAME>
+    powerflex_password: <PASSWORD>
+    resource_filters:
+      - resource: storage_pool
+        property: name
+        patterns:
+          - "^prod-"
+      - resource: sds
+        property: name
+        type: exclude
+        patterns:
+          - "^standby-"
+      - resource: device
+        property: name
+        collect_statistics: false
+```
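
Reviewer note: the filtering rules described in the README (regex patterns, `include` as the default type, exclude taking precedence, statistics on by default except for devices) can be sketched as follows. This is a minimal illustration and not the integration's code: the helper names `is_collected` and `collects_statistics` are hypothetical, and using `re.search` on the stringified property value is an assumption about how patterns are applied.

```python
import re
from typing import Any

# Per the README: statistics are enabled by default for every resource
# type except devices.
DEFAULT_COLLECT_STATISTICS = {"device": False}


def is_collected(resource_type: str, resource: dict[str, Any],
                 filters: list[dict[str, Any]]) -> bool:
    """Return True if the resource passes the include/exclude filters."""
    # Only filters for this resource type that actually carry patterns matter.
    relevant = [f for f in filters
                if f["resource"] == resource_type and f.get("patterns")]
    includes = [f for f in relevant if f.get("type", "include") == "include"]
    excludes = [f for f in relevant if f.get("type", "include") == "exclude"]

    def matches(f: dict[str, Any]) -> bool:
        value = str(resource.get(f["property"], ""))
        # ASSUMPTION: unanchored regex search against the property value.
        return any(re.search(p, value) for p in f["patterns"])

    if any(matches(f) for f in excludes):
        return False  # exclude takes precedence over include
    if includes:
        return any(matches(f) for f in includes)
    return True  # no include filter configured: collect everything


def collects_statistics(resource_type: str, filters: list[dict[str, Any]]) -> bool:
    """Return the collect_statistics setting, falling back to the defaults."""
    for f in filters:
        if f["resource"] == resource_type and "collect_statistics" in f:
            return f["collect_statistics"]
    return DEFAULT_COLLECT_STATISTICS.get(resource_type, True)


# The filters from the README example above, as Python data.
FILTERS = [
    {"resource": "storage_pool", "property": "name", "patterns": ["^prod-"]},
    {"resource": "sds", "property": "name", "type": "exclude", "patterns": ["^standby-"]},
    {"resource": "device", "property": "name", "collect_statistics": False},
]
```

With these filters, a storage pool named `prod-a` is kept and `dev-a` is dropped, an SDS named `standby-1` is dropped, and resource types without a filter (such as volumes) are collected in full.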
+
+#### Log collection
+
+The Dell PowerFlex integration can collect logs from multiple PowerFlex components.
+
+1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your `datadog.yaml` file:
+
+   ```yaml
+   logs_enabled: true
+   ```
+
+2. Add this configuration block to your `dell_powerflex.d/conf.yaml` file to start collecting your Dell PowerFlex logs. Adjust the `path` and `service` values to match your environment:
+
+   ```yaml
+   logs:
+     - type: file
+       path: /opt/emc/scaleio/mdm/logs/eventLogger.log
+       source: dell_powerflex
+       service: <SERVICE>
+
+     - type: file
+       path: /opt/emc/scaleio/mdm/logs/trc.0
+       source: dell_powerflex
+       service: <SERVICE>
+
+     - type: file
+       path: /opt/emc/scaleio/sds/logs/trc.0
+       source: dell_powerflex
+       service: <SERVICE>
+
+     - type: file
+       path: /opt/emc/scaleio/lia/logs/trc.0
+       source: dell_powerflex
+       service: <SERVICE>
+
+     - type: file
+       path: /opt/emc/scaleio/activemq/data/activemq.log
+       source: dell_powerflex
+       service: <SERVICE>
+   ```
+
+   See the [sample dell_powerflex.d/conf.yaml][4] for all available configuration options.
+
+3. [Restart the Agent][5].
+
+### Validation
+
+[Run the Agent's status subcommand][6] and look for `dell_powerflex` under the Checks section.
+
+## Data Collected
+
+### Metrics
+
+See [metadata.csv][7] for a list of metrics provided by this check.
+
+### Events
+
+When `collect_events` is enabled, the Dell PowerFlex integration collects CRITICAL and MAJOR severity events from the PowerFlex Gateway. When `collect_alerts` is enabled, it collects alerts. Both are forwarded as Datadog events.
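
Reviewer note: the severity rule in the Events paragraph (only CRITICAL and MAJOR are forwarded) and the "severity remapping" mentioned in the commit message could look roughly like the sketch below. The raw event field names (`severity`, `name`, `description`) and the specific severity-to-`alert_type` mapping are assumptions for illustration; only the output keys (`msg_title`, `msg_text`, `alert_type`, `source_type_name`) follow the Datadog Agent event payload.

```python
# Severities the README says are collected as events.
COLLECTED_SEVERITIES = {"CRITICAL", "MAJOR"}

# ASSUMPTION: hypothetical remapping onto Datadog event alert types.
ALERT_TYPE_BY_SEVERITY = {"CRITICAL": "error", "MAJOR": "warning"}


def to_datadog_events(raw_events):
    """Keep CRITICAL/MAJOR events and convert them to Datadog event dicts."""
    events = []
    for raw in raw_events:
        severity = str(raw.get("severity", "")).upper()
        if severity not in COLLECTED_SEVERITIES:
            continue  # lower severities are not forwarded
        events.append({
            "msg_title": raw.get("name", "PowerFlex event"),
            "msg_text": raw.get("description", ""),
            "alert_type": ALERT_TYPE_BY_SEVERITY[severity],
            "source_type_name": "dell_powerflex",
        })
    return events


sample = [
    {"severity": "CRITICAL", "name": "MDM_FAILOVER"},
    {"severity": "MINOR", "name": "CONFIG_CHANGE"},
    {"severity": "major", "name": "SDS_DISCONNECTED"},
]
converted = to_datadog_events(sample)
```

In this sample, the MINOR event is dropped and the other two are converted, with the lowercase `major` normalized before the lookup.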
+
+### Service Checks
+
+The Dell PowerFlex integration does not include any service checks.
+
+## Troubleshooting
+
+Need help? Contact [Datadog support][8].
+
+[1]: https://www.dell.com/en-us/dt/storage/powerflex.htm
+[2]: https://app.datadoghq.com/account/settings/agent/latest
+[3]: https://docs.datadoghq.com/agent/guide/agent-configuration-files/#agent-configuration-directory
+[4]: https://github.com/DataDog/integrations-core/blob/master/dell_powerflex/datadog_checks/dell_powerflex/data/conf.yaml.example
+[5]: https://docs.datadoghq.com/agent/guide/agent-commands/#start-stop-and-restart-the-agent
+[6]: https://docs.datadoghq.com/agent/guide/agent-commands/#agent-status-and-information
+[7]: https://github.com/DataDog/integrations-core/blob/master/dell_powerflex/metadata.csv
+[8]: https://docs.datadoghq.com/help/
Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
+name: Dell PowerFlex
+files:
+- name: dell_powerflex.yaml
+  options:
+  - template: init_config
+    options:
+    - template: init_config/default
+  - template: instances
+    options:
+    - name: powerflex_gateway_url
+      required: true
+      display_priority: 3
+      description: URL of the PowerFlex Gateway REST API.
+      value:
+        example: https://localhost:443
+        type: string
+    - name: powerflex_username
+      display_priority: 2
+      description: Username for PowerFlex Gateway authentication via Keycloak.
+      value:
+        type: string
+    - name: powerflex_password
+      secret: true
+      display_priority: 1
+      description: Password for PowerFlex Gateway authentication via Keycloak.
+      value:
+        type: string
+        example: <PASSWORD>
+    - name: powerflex_client_id
+      description: OAuth2 client ID for Keycloak authentication.
+      value:
+        type: string
+        example: powerflexUI
+        default: powerflexUI
+    - name: collect_events
+      description: Enable collection of CRITICAL and MAJOR severity events from the PowerFlex Gateway.
+      value:
+        type: boolean
+        example: false
+        default: false
+    - name: collect_alerts
+      description: Enable collection of alerts from the PowerFlex Gateway.
+      value:
+        type: boolean
+        example: false
+        default: false
+    - name: max_workers
+      description: |
+        Maximum number of threads used to fetch per-resource statistics in parallel.
+        Increase this value for large environments with many resources.
+      value:
+        type: integer
+        example: 4
+        default: 4
+    - name: resource_filters
+      description: |
+        Filter resources by property regex patterns and control statistics
+        collection per resource type. Exclude takes precedence over include.
+        If no filter is configured for a resource type, all resources of
+        that type are collected with statistics enabled, except for devices
+        which have statistics disabled by default.
+
+        Each filter entry has the following fields:
+          resource: The resource type to filter.
+          property: The API property name to match against.
+          patterns: List of regex patterns to match against the property value.
+          type: Either 'include' (default) or 'exclude'. If a resource matches
+            both an include and exclude filter, it is excluded.
+          collect_statistics: When false, skip per-resource statistics API
+            calls to reduce load. Defaults to true.
+
+        Supported resource types and common filterable properties:
+          volume: name, id, volumeType, storagePoolId, ancestorVolumeId
+          storage_pool: name, id, mediaType, protectionDomainId
+          protection_domain: name, id, protectionDomainState
+          sds: name, id, protectionDomainId, sdsState, faultSetId
+          sdc: id, sdcGuid, sdcType, sdcIp
+          device: name, id, storagePoolId, sdsId, deviceCurrentPathName
+
+        Note: the system resource type is not filterable and will be ignored if specified.
+      value:
+        type: array
+        items:
+          type: object
+        example:
+        - resource: volume
+          property: name
+          patterns:
+          - ".*"
+        - resource: storage_pool
+          property: name
+          collect_statistics: true
+          patterns:
+          - "^prod-"
+        - resource: protection_domain
+          property: name
+          patterns:
+          - ".*"
+        - resource: sds
+          property: name
+          type: exclude
+          patterns:
+          - "^standby-"
+        - resource: sdc
+          property: sdcType
+          patterns:
+          - "^AppSdc$"
+        - resource: device
+          property: name
+          collect_statistics: false
+    - template: instances/http
+    - template: instances/default
+  - template: logs
+    example:
+    - type: file
+      path: /opt/emc/scaleio/mdm/logs/eventLogger.log
+      source: dell_powerflex
+      service: <SERVICE>
+    - type: file
+      path: /opt/emc/scaleio/mdm/logs/trc.0
+      source: dell_powerflex
+      service: <SERVICE>
+    - type: file
+      path: /opt/emc/scaleio/sds/logs/trc.0
+      source: dell_powerflex
+      service: <SERVICE>
+    - type: file
+      path: /opt/emc/scaleio/lia/logs/trc.0
+      source: dell_powerflex
+      service: <SERVICE>
+    - type: file
+      path: /opt/emc/scaleio/activemq/data/activemq.log
+      source: dell_powerflex
+      service: <SERVICE>
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Initial Release.
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# (C) Datadog, Inc. 2026-present
+# All rights reserved
+# Licensed under a 3-clause BSD style license (see LICENSE)
+__path__ = __import__('pkgutil').extend_path(__path__, __name__) # type: ignore
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+# (C) Datadog, Inc. 2026-present
+# All rights reserved
+# Licensed under a 3-clause BSD style license (see LICENSE)
+__version__ = '0.0.1'
