Move SNMP E2E device-tag assertions to the metadata payload by AAraKKe · Pull Request #23706 · DataDog/integrations-core

AAraKKe · 2026-05-14T14:24:18Z

What does this PR do?

Updates the SNMP integration's E2E test suite to align with the new SNMP corecheck tagging model introduced in DataDog/datadog-agent#49822. When collect_device_metadata is enabled (the default since the linked agent PR), per-device metrics are tagged with only dd.internal.resource:ndm_device:<ns>:<ip>; the backend joins the device, profile and metadata-derived tags onto metrics at ingest from the network-devices-metadata event-platform payload. The aggregator state visible to agent check --json (which is what dd_agent_check replays into the test stub) therefore no longer carries those tags on metric submissions.

Changes:

Adds common.DEVICE_METADATA_TAG_KEYS and common.filter_metric_tags() describing the keys now carried only on the metadata payload (identity tags, profile-level globals, common metadata.device.fields).
Updates assert_common_metrics, assert_common_check_run_metrics, assert_common_device_metrics in snmp/tests/common.py to filter device-level tags out of metric assertions when loader='core'.
Updates the assert_extend_* helpers in snmp/tests/test_e2e_core_profiles/utils.py to filter their common_tags argument before passing it to aggregator.assert_metric, and updates assert_all_profile_metrics_and_tags_covered to exclude device-metadata keys from its expected tag-key set.
Rewrites every direct aggregator.assert_metric call in snmp/tests/test_e2e_core.py and the 144 files under snmp/tests/test_e2e_core_profiles/ to compute a local metric_tags = common.filter_metric_tags(common_tags) and use it in place of common_tags for per-metric assertions. Metadata assertions (assert_device_metadata) continue to use the full device-level tag list.
Leaves snmp/tests/test_e2e_core_metadata.py untouched (it already validates the device tags via the network-devices-metadata payload and was unaffected by #49822).
Leaves snmp/tests/test_e2e_python.py untouched (Python loader is unaffected by the corecheck change).
Leaves snmp/tests/test_e2e_core_vs_python.py untouched (this comparator explicitly sets collect_device_metadata=false on the core-loader config it builds, so it exercises the legacy tagging path).
Defers snmp/tests/test_e2e_snmp_listener.py migration (the listener test is currently skipped in CI for unrelated reasons; will revisit when it is re-enabled).

No agent-shipped source files change; this is test-suite only.

Motivation

158 of 225 SNMP E2E tests have been failing deterministically on every master run since DataDog/datadog-agent#49822 landed at 2026-05-13 13:04 UTC. The previous mitigation (#23646) added a retry around agent check invocations, which targeted a different symptom (Error: no valid check found) and is not helping here — the agent invocation succeeds; the metrics just no longer carry device tags. The right fix is to update the test expectations to match the new agent contract.

Review checklist (to be filled by reviewers)

Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

The SNMP corecheck now strips device-level tags from per-device metrics when collect_device_metadata is true (the default), shipping only the dd.internal.resource:ndm_device:<id> tag instead. The backend joins those tags onto metrics from the network-devices-metadata event payload at ingest, so the agent-level data the E2E tests see no longer includes them. The change landed in DataDog/datadog-agent#49822 on 2026-05-13 and broke 158 of 225 SNMP E2E tests deterministically. This migration: - Adds common.DEVICE_METADATA_TAG_KEYS / common.filter_metric_tags() to describe the keys now carried only on the metadata payload. - Updates the shared assertion helpers (assert_common_metrics, assert_common_check_run_metrics, assert_common_device_metrics, and the assert_extend_* family in test_e2e_core_profiles/utils.py) to filter device-level tags out of metric assertions when the core loader is in use. - Rewrites every direct aggregator.assert_metric call across test_e2e_core.py and test_e2e_core_profiles/test_profile_*.py to use a filtered local metric_tags instead of the device-level list. - test_e2e_core_metadata.py already validates the device tags via the NDM metadata payload and is left untouched; test_e2e_python.py is on the Python loader and unaffected; test_e2e_core_vs_python.py pins collect_device_metadata=false and so still exercises the legacy tagging path. No agent-shipped source files change; this is test-suite only.

- Fix: 135 profile tests were wrongly setting `device['tags'] = metric_tags` in the metadata-assertion section. Metadata payloads still carry the full device-level tag set (that's the whole point of #49822), so the assertion must compare against `common_tags`, not the filtered list. Caught by both the functional and structure reviewers. - Drop the private `_metric_tags` wrapper in test_e2e_core_profiles/utils.py in favor of calling `common.filter_metric_tags` directly. Replace the parameter-shadowing pattern (`common_tags = _metric_tags(common_tags)`) with an explicit `metric_tags = common.filter_metric_tags(common_tags)` so the difference between the unfiltered argument and the filtered metric-assertion list is visible in the body. - Remove the redundant filter call in `assert_extend_generic_host_resources`; the two delegates it calls already filter their own arguments.

In test_e2e_core.py::test_e2e_core_cisco_csr and test_profile_generic_lldp.py the intermediate `metric_tags = common.filter_metric_tags(...)` binding was immediately shadowed by `metric_tags = metric_tags + [row_tags]` on the next line. Combine into a single statement so the intent reads top-to-bottom. Also reformats test_e2e_core.py with the repo-root pyproject.toml ruff config (this is what CI uses).

datadog-datadog-prod-us1 · 2026-05-21T12:13:00Z

Tests

🎉 All green!

🧪 All tests passed
❄️ No new flaky tests detected

🎯 Code Coverage (details)
• Patch Coverage: 50.00%
• Overall Coverage: 95.21% (+7.82%)

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 0651d3b | Docs | Datadog PR Page | Give us feedback!}

…nly tests filter_metric_tags is added by this PR (in common.py), so per AGENTS.md "new code" rules it should carry modern annotations. The signature is also load-bearing — callers pass either list[str] of device-level tags or None — so typing both ends advertises the contract at the signature. Drop the unused `metric_tags = common.filter_metric_tags(common_tags)` line from 19 profile tests that delegate entirely to assert_extend_* helpers (those helpers filter internally; the local binding was never read).

test_e2e_snmp_listener runs only in the SNMP_LISTENER_ENV=true env and only fires the network-profile core-loader path through the same agent contract that #49822 changed. Under that contract, per-device metric submissions ship with only the dd.internal.resource:* tag — the listener's user-defined `tags:` (here tag1/tag2 from conftest's datadog.yaml network config) get stripped alongside device-identity tags. Compute a metric_tags local for the two core-loader sections (network profile and SNMP v3) that strips both DEVICE_METADATA_TAG_KEYS and the two user-tag keys. Use it in every per-device metric assertion. Leave common_tags wired into the apc_ups section (Python loader, legacy behavior) and the ignored-IPs section (count=0, doesn't depend on tags).

The autodiscovery+rate path now submits each per-device COUNT/RATE metric on both check cycles (previously the first cycle stored only a baseline). With device-identity tags removed under the new contract, the aggregator sees both submissions as identical stubs. Switch the section-1 metric assertions to count=2 to match what the agent now emits.

This reverts commit 1870eb6.

This reverts commit 8b3620d.

This reverts commit 0008c38.

This reverts commit a14992a.

Under autodiscovery + rate, the number of submissions per (device, metric, tag-set) isn't fixed: overlapping subnets in conftest.py's listener config can cause multiple checks to cover the same device, and each check then runs both rate cycles. The exact count is a function of which subnets discover the device, which the test isn't asserting and shouldn't. Drop the count= constraint on the section-1 per-device assertions; rely on the default "at least one match" semantics, which captures the actual intent ("metric was emitted with these tags").

codecov · 2026-05-21T13:25:34Z

Codecov Report

❌ Patch coverage is 50.00000% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.06%. Comparing base (96797a3) to head (0651d3b).
⚠️ Report is 54 commits behind head on master.

Additional details and impacted files

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

The profile YAMLs declare per-profile global metric_tags whose values come from metadata.device.fields lookups (huawei_hw_entity_system_model, ups_ident_model, vertiv_product_title, …). Those follow the same agent contract as the device-identity tags — stripped from per-device metrics and joined back from the metadata payload at ingest — but they weren't covered by the filter, so the 38 profile tests that append one to common_tags failed their per-metric assertions while passing the metadata equality check. Add every such key observed across the profile suite to the constant so filter_metric_tags strips them uniformly.

…monitored test_e2e_core_discovery and test_e2e_regex_match both asserted on snmp.devices_monitored carrying autodiscovery_subnet / device-identity tags. Under the new agent contract, snmp.devices_monitored from per-device telemetry ships only with loader:core; autodiscovery_subnet rides on snmp.discovered_devices_count (the listener's own metric). Update the expected tag lists accordingly. Also drop the assert_all_profile_metrics_and_tags_covered call for the avaya media gateway profile test. The cmgVoipEngineTable declares row tags avaya_cmg_voip_admin_state and avaya_cmg_voip_current_ip_address that the snmpsim data for this profile leaves blank, so the per-metric coverage check would compare the YAML's full row-tag set against a collected metric that genuinely doesn't carry those two keys. The other per-row assertions in the same test still verify the rows that ARE populated.

snmp.upsAdvBatteryTemperature ships from the discovered device check with no device-level tags under the new contract, so the previous assertion's `autodiscovery_subnet` expectation never matched. Expect an empty tag set. The regex-match test's whole point was that the instance's regex-derived `metric_tags` (digits/remainder/letter1/letter2) get applied. Those values now ride on the metadata payload instead of the per-device telemetry metric, so the original assertion can't verify them. Reduce the assertion to "the check ran" (snmp.devices_monitored with loader:core). Also clean up the now-unused `tags` local in test_e2e_core_discovery.

dd-octo-sts · 2026-05-22T13:20:47Z

Validation Report

All 21 validations passed.

Show details

Validation	Description	Status
`agent-reqs`	Verify check versions match the Agent requirements file	✅
`ci`	Validate CI configuration and Codecov settings	✅
`codeowners`	Validate every integration has a CODEOWNERS entry	✅
`config`	Validate default configuration files against spec.yaml	✅
`dep`	Verify dependency pins are consistent and Agent-compatible	✅
`http`	Validate integrations use the HTTP wrapper correctly	✅
`imports`	Validate check imports do not use deprecated modules	✅
`integration-style`	Validate check code style conventions	✅
`jmx-metrics`	Validate JMX metrics definition files and config	✅
`labeler`	Validate PR labeler config matches integration directories	✅
`legacy-signature`	Validate no integration uses the legacy Agent check signature	✅
`license-headers`	Validate Python files have proper license headers	✅
`licenses`	Validate third-party license attribution list	✅
`metadata`	Validate metadata.csv metric definitions	✅
`models`	Validate configuration data models match spec.yaml	✅
`openmetrics`	Validate OpenMetrics integrations disable the metric limit	✅
`package`	Validate Python package metadata and naming	✅
`qa-label`	Validate the pull request declares whether it needs QA for the next Agent release	✅
`readmes`	Validate README files have required sections	✅
`saved-views`	Validate saved view JSON file structure and fields	✅
`version`	Validate version consistency between package and changelog	✅

View full run

FlorianVeaux

Superseded by #23808

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0651d3b44e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-22T13:23:26Z

    ]
+    # Per the new agent contract (#49822), only `autodiscovery_subnet` survives on
+    # per-device metrics under the core loader; device-identity tags and the
+    # listener's user-defined `tags:` (here tag1/tag2) are joined from the NDM


Filter the listener APC assertions too

When SNMP_LISTENER_ENV=true and this listener test is re-enabled, the update only applies the new core-loader tag filtering to the first generic-device and SNMPv3 sections; the immediately following apc_ups listener section still calls assert_common_metrics(..., is_e2e=True) without loader='core' and still asserts profile/device tags such as snmp_profile, model, and serial_num directly on metric submissions. Under the same #49822 contract described here, those tags are now on metadata, so the APC listener assertions will continue to fail unless that section is migrated to use filtered metric_tags as well.

Useful? React with 👍 / 👎.

AAraKKe · 2026-05-22T13:29:32Z

Thasnks a lot @FlorianVeaux ! I was about to message you guys, this pr has been in the works for over a week and I was not sure this made a lot of sense xD

AAraKKe added the qa/skip-qa Automatically skip this PR for the next QA label May 14, 2026

dd-octo-sts Bot added integration/snmp and removed qa/skip-qa Automatically skip this PR for the next QA labels May 14, 2026

ian28223 mentioned this pull request May 21, 2026

SNMP: extend Cisco serial_number coverage to Catalyst (incl. 1300/9200), cisco-sb, and add Catalyst 1300/9200 sysobjectids #23498

Closed

3 tasks

AAraKKe added the qa/skip-qa Automatically skip this PR for the next QA label May 21, 2026

AAraKKe added 10 commits May 21, 2026 14:19

Revert "Bump listener-test rate-cycle counts to 2"

a14992a

This reverts commit 1870eb6.

Revert "Filter device + user tags from listener-test metric assertions"

0008c38

This reverts commit 8b3620d.

Reapply "Filter device + user tags from listener-test metric assertions"

ec79145

This reverts commit 0008c38.

Reapply "Bump listener-test rate-cycle counts to 2"

1d7f5dc

This reverts commit a14992a.

Reformat listener-test assertion after count removal

f388479

Drop count=2 from listener section-3 IF_SCALAR_GAUGE assertion

d923cf1

AAraKKe added 3 commits May 21, 2026 15:59

AAraKKe marked this pull request as ready for review May 22, 2026 13:18

AAraKKe requested review from a team as code owners May 22, 2026 13:18

AAraKKe requested review from Pierre-L42 and removed request for a team May 22, 2026 13:18

dd-octo-sts Bot added the team/agent-integrations label May 22, 2026

FlorianVeaux requested changes May 22, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed May 22, 2026

View reviewed changes

AAraKKe closed this May 22, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Move SNMP E2E device-tag assertions to the metadata payload#23706

Move SNMP E2E device-tag assertions to the metadata payload#23706
AAraKKe wants to merge 16 commits into
masterfrom
aarakke/snmp-e2e-metadata-tag-assertions

AAraKKe commented May 14, 2026

Uh oh!

datadog-datadog-prod-us1 Bot commented May 21, 2026 •

edited

Loading

Uh oh!

codecov Bot commented May 21, 2026 •

edited

Loading

Uh oh!

dd-octo-sts Bot commented May 22, 2026

Uh oh!

FlorianVeaux left a comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 22, 2026

Uh oh!

AAraKKe commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

AAraKKe commented May 14, 2026

What does this PR do?

Motivation

Review checklist (to be filled by reviewers)

Uh oh!

datadog-datadog-prod-us1 Bot commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

dd-octo-sts Bot commented May 22, 2026

Validation Report

Uh oh!

FlorianVeaux left a comment

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 22, 2026

Choose a reason for hiding this comment

Uh oh!

AAraKKe commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

datadog-datadog-prod-us1 Bot commented May 21, 2026 •

edited

Loading

codecov Bot commented May 21, 2026 •

edited

Loading