[pull] master from DataDog:master #610
Merged
* Add initial Kueue OpenMetrics integration scaffold.
Start with a basic OpenMetrics V2 check that forwards endpoint metrics under the kueue namespace to enable early endpoint validation before adding curated mappings.
* Add basic Kueue OpenMetrics scraping
* Add curated Kueue OpenMetrics mapping
* Add Kueue resource metric suffixing
* Document Kueue as a cluster check
* Implement Kueue integration
* Fix missing code coverage
* Update README
* Fix manifest
* Update metadata
* Add owners
* Fix Kueue CI validation
* Address Kueue review feedback.
* Add memory
* Pin kind networking and node image for Kueue E2E
Use non-default service/pod subnets so the kind cluster's API service IP
does not collide with the host environment's Kubernetes networking, which
hijacked in-cluster traffic and broke Kueue's webhook cert bootstrap. Also
scope the LocalQueue readiness wait to the default namespace.
* Fix Kueue go info tag validation.
Rename the generic Go version label before submission so E2E metrics pass tag validation.
* Make Kueue E2E test pass against live cluster
Relax metric tag assertions to match the actual tag set emitted by the
controller (endpoint, replica_role, cohort tags) instead of pinning an
exact subset, and add the missing assets/service_checks.json (with its
manifest reference) that assert_service_checks requires.
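One way to read the relaxed assertion is as a containment check: the test requires certain tags and tolerates any extra ones the controller emits. The sketch below is an illustration only; `assert_tags_superset` and the tag values are hypothetical and are not taken from the integration's test suite.

```python
def assert_tags_superset(emitted_tags, required_tags):
    """Pass when every required tag is present; extra emitted tags are allowed."""
    missing = set(required_tags) - set(emitted_tags)
    if missing:
        raise AssertionError(f"missing tags: {sorted(missing)}")


# Hypothetical tag values, for illustration only.
emitted = [
    "endpoint:http://localhost:8080/metrics",
    "replica_role:leader",
    "kueue_cluster_queue:cluster-queue",
    "cohort:team-a",
]

# Passes even though the controller emits more tags than the test requires.
assert_tags_superset(emitted, ["kueue_cluster_queue:cluster-queue"])
```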
* Wait for Kueue webhook before applying queue manifests
The controller deployment can report `Available` before its webhook server
is actually serving, causing intermittent `connection refused` failures when
applying ResourceFlavor/ClusterQueue. Wait for the webhook service endpoints
and retry the apply to absorb the brief cert-propagation window.
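The retry half of this fix can be sketched as a small helper that repeats an action until it stops raising. This is a minimal sketch under assumptions: `retry` and its parameters are hypothetical names, and the actual E2E setup may implement the wait differently.

```python
import time


def retry(action, attempts=5, delay=2.0, retry_on=(Exception,), sleep=time.sleep):
    """Call action() until it succeeds or attempts run out, then re-raise the last error."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay)  # give the webhook time to start serving
    raise last_error


# Hypothetical usage in an E2E setup (the manifest path is a placeholder):
# retry(
#     lambda: subprocess.run(["kubectl", "apply", "-f", "queues.yaml"], check=True),
#     retry_on=(subprocess.CalledProcessError,),
# )
```

Passing `sleep` as a parameter keeps the helper testable without real delays.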
* Use remapped tag names in Kueue metric descriptions
Metric descriptions referenced the raw Prometheus labels ('cluster_queue',
'local_queue'/'localQueue') instead of the tags Datadog actually emits after
remapping ('kueue_cluster_queue', 'kueue_local_queue').
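The relationship between raw labels and emitted tags can be illustrated with a plain dictionary lookup. The mapping below is inferred from the names in this commit message and is an assumption; `LABEL_TO_TAG` and `remap_labels` are hypothetical names, not the check's actual configuration.

```python
# Assumed mapping, inferred from the label and tag names in the commit message.
LABEL_TO_TAG = {
    "cluster_queue": "kueue_cluster_queue",
    "local_queue": "kueue_local_queue",
    "localQueue": "kueue_local_queue",
}


def remap_labels(labels):
    """Return sorted 'key:value' tags with raw Prometheus label names remapped."""
    return sorted(f"{LABEL_TO_TAG.get(name, name)}:{value}" for name, value in labels.items())


# Labels without a remapping entry pass through unchanged.
tags = remap_labels({"cluster_queue": "cq", "status": "active"})
```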
* Rename cluster_queue.pending_workloads to pending_workloads
The raw kueue_pending_workloads metric has no cluster_queue in its name, so
the cluster_queue. prefix was inconsistent with every other cluster-queue-
indexed metric (which keep bare names and just carry the kueue_cluster_queue
tag). Drop the prefix to match the source name and the rest of the convention.
* Sync Kueue configuration example
* Update codeowners
* Apply suggestions from code review
* Apply Kueue review cleanup
* Use metadata assertion for Kueue metric coverage
* Remove Kueue service check metadata
* Configure Kueue manifestless metadata
* Refactor Kueue tag assertions
* Assert Kueue e2e metrics from metadata
* Assert idle Kueue e2e metrics
* Rename Kueue flavor label
* Document Kueue GC summary metrics
* Remove checks
* Add more e2e metrics
* Fix e2e setup
* Fix Kueue e2e controller rollout wait.
* Change codeowners
* Unify Kueue unit and e2e metric expectations.
Share EXPECTED_METRIC_TAGS between tests, align the OpenMetrics fixture with
e2e queue labels, expand unit coverage, and pin go_version to go1.26.3 for
kueue.go.info to match the controller toolchain.
* Drop black as a direct dependency, use ruff format for generated config models
- Remove explicit `black==23.12.1` from `datadog_checks_dev`
- Replace `apply_black` calls in the model consumer with `ruff format -`,
using the repo's centralized `[tool.ruff]` configuration
- Drop the now-unused `code_formatter` plumbing through `ModelConsumer` /
`build_model_file` / `validate models`
- Drop the [tool.black] block from `ddev/pyproject.toml` and the matching
python-version-bump logic in `update_py_config.py` (with test fixture)
- Update README badge from black to ruff
The root `[tool.black]` section is kept (with an explanatory comment)
because `datamodel-code-generator` reads it transitively through its own
internal formatter, and removing it changes line-length to 88 which
breaks our list[...] -> tuple[..., ...] line-by-line transform.
* Add changelog entries
* Make ruff a hard dep of datadog_checks_dev[cli], invoke via python -m ruff
CI installs ddev with `pip install -e ./datadog_checks_dev[cli]` and never
adds ruff to PATH. The previous helper called `shutil.which('ruff')` and
silently returned the input unchanged when ruff was missing — leaving long
lines and missing wraps in 401 generated config-model files, surfacing as
"not in sync" in the validate workflow.
- Declare `ruff>=0.11` in `datadog_checks_dev[cli]` so the package is
always installed alongside the model generator.
- Switch the helper to `sys.executable -m ruff` so the in-venv package is
used regardless of PATH.
- Raise loudly on missing ruff or non-zero ruff exit instead of silently
degrading, so any future regression fails the workflow with a clear error.
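A minimal sketch of the pattern described above: run the formatter as a module of the current interpreter and raise on any failure instead of returning the input unchanged. `format_with_module` is a hypothetical name, not the helper from this PR, and the stderr check for a missing module is an assumption about how the failure surfaces.

```python
import subprocess
import sys


def format_with_module(source, module="ruff", args=("format", "-")):
    """Pipe source through `python -m <module> <args>` and return its stdout.

    Raises RuntimeError instead of silently returning the input unchanged.
    """
    argv = [sys.executable, "-m", module, *args]
    result = subprocess.run(argv, input=source, capture_output=True, text=True)
    if result.returncode != 0:
        # The interpreter reports a missing module on stderr; quotes vary by case.
        if f"No module named {module}" in result.stderr.replace("'", ""):
            raise RuntimeError(f"{module} is not installed in this environment")
        raise RuntimeError(f"{argv} exited with {result.returncode}: {result.stderr}")
    return result.stdout
```

Because the command starts with `sys.executable`, the module is looked up in the same environment that runs the generator, regardless of what is on PATH.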
* Pin ruff to 0.11.10 to match ddev's hatch lint env
* Remove [tool.black] from root pyproject.toml and update _fix_types
Make `_fix_types` operate on the joined document (as UTF-8 bytes) instead
of line by line, so the bracket-tracking pass works regardless of how
datamodel-code-generator's internal formatter wrapped `list[...]`. Place
the `, ...` sentinel right after the last non-whitespace byte before the
closing `]`, so output stays on the previous content line even when the
parser pre-wrapped the closing bracket onto its own line.
With those changes the generator no longer relies on `[tool.black]`
existing in the repo, so the section and its accompanying comment are
removed from `pyproject.toml`. The black-related comment near the
config_models lint exclusion and the black badge in `README.md` go too.
Four config_models files (kafka_actions, win32_event_log, yarn x2)
regenerate with different — but semantically identical — wrapping. They
were the only ones whose pre-wrapped form was sensitive to the change in
upstream line-length default; future regens are stable.
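The whole-document bracket-tracking idea can be sketched as follows. This is an illustration under assumptions, not the PR's `_fix_types`: it works on `str` rather than UTF-8 bytes, handles only `list[...]`, ignores brackets inside string literals, and `list_to_variadic_tuple` is a hypothetical name. The trailing-comma branch is an addition of this sketch.

```python
def list_to_variadic_tuple(text):
    """Rewrite every `list[X]` in text as `tuple[X, ...]`, including nested ones."""
    out = []    # output chunks; ordinary characters are appended one at a time
    stack = []  # one flag per open '[': True when it came from `list[`
    i = 0
    while i < len(text):
        prev_is_word = i > 0 and (text[i - 1].isalnum() or text[i - 1] == "_")
        if text.startswith("list[", i) and not prev_is_word:
            out.append("tuple[")
            stack.append(True)
            i += len("list[")
        elif text[i] == "[":
            out.append("[")
            stack.append(False)
            i += 1
        elif text[i] == "]" and stack:
            if stack.pop():
                # Set aside trailing whitespace so the sentinel lands right after
                # the last non-whitespace character, even across line breaks.
                trailing = []
                while out and out[-1].isspace():
                    trailing.append(out.pop())
                out.append(" ..." if out and out[-1] == "," else ", ...")
                out.extend(reversed(trailing))
            out.append("]")
            i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


wrapped = "v: list[\n    Literal['a', 'b']\n]"
converted = list_to_variadic_tuple(wrapped)
```

Since the pass never looks at line boundaries, the closing bracket can sit on its own line without affecting where the sentinel goes.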
* Add changelogs for regenerated config_models in kafka_actions, win32_event_log, yarn
* Address PR review: docs, error-path hint, focused _fix_types tests
- Replace remaining "code style - black" references in developer docs
(`docs/developer/index.md` badge, `docs/developer/guidelines/style.md`
style section, link reference) with ruff equivalents.
- Update the stale `ddev test postgres -l` example output in
`docs/developer/testing.md` to drop `black==22.12.0` and reflect the
current lint env contents (`ruff==0.11.10`, `pydantic==2.11.5`).
- Move the "ruff is not installed" install hint in `format_with_ruff`
from the `FileNotFoundError` branch to the `CalledProcessError` branch
and gate it on `"No module named 'ruff'"` in stderr — the previous
layout was effectively dead code because `sys.executable` always
resolves, so missing-ruff surfaces as a non-zero exit, not a missing
binary.
- Add `tests/tooling/configuration/consumers/model/test_fix_types.py`
with focused coverage for the `_fix_types` post-processing pass: the
multiline-wrapped `list[Literal[...]]` regression case the PR was
written to fix, dict and nested-list translations, unicode in
descriptions, and verbatim pass-through when no `list[`/`dict[`
is present.
* Address PR review (round 2): code_formatter robustness + direct tests
code_formatter.py:
- Guard `_resolve_ruff_config` with `if root_str:` so an unset
`get_root()` (returns '') doesn't fall into the `Path('').is_dir()`
branch (which is True — it resolves to CWD), and unit tests actually
walk back to the repo pyproject.toml as the docstring claims.
- Replace the loose `'[tool.ruff' in text` substring with a line-anchored
scan that only matches actual TOML table headers (`[tool.ruff]` or
`[tool.ruff.…]`), so a comment or string value can't false-positive.
- Surface argv (via `shlex.join`), stderr, and stdout in the error
message for non-missing-package failures, so a future ruff config
change emitting actionable output is debuggable from the message alone.
Tests:
- New `test_code_formatter.py` (17 tests): direct coverage for
`format_with_ruff` (line wrapping, quote-style preservation, short
passthrough, missing-ruff hint, full-context error on other failures)
and `_resolve_ruff_config` (root path success, fallback walk on empty
root, fallback walk when root has no `[tool.ruff]`, returns None when
nothing is found, parametrized header recognition for `_has_ruff_section`).
- `test_update_py_config.py`: add explicit content assertions on the
rewritten `ddev/pyproject.toml` — no `[tool.black]` block survives,
`[tool.ruff].target-version` is updated to the new pinned version, the
old token is gone. Captures the actual contract instead of relying on
the success-counter being 9.
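Two of the code_formatter.py points above are easy to demonstrate in isolation: an empty path resolves to the current directory, and a line-anchored pattern avoids matching `[tool.ruff` inside comments or string values. The regex and the name `has_ruff_section` below are assumptions for illustration; the PR's `_has_ruff_section` may be written differently.

```python
import re
from pathlib import Path

# Matches `[tool.ruff]` or `[tool.ruff.<sub-table>]` as a table header on its own
# line, allowing leading indentation and a trailing comment.
_RUFF_HEADER = re.compile(r"^[ \t]*\[tool\.ruff(\.[^\]]+)?\][ \t]*(#.*)?$", re.MULTILINE)


def has_ruff_section(text):
    """Return True only when text contains an actual [tool.ruff...] table header."""
    return _RUFF_HEADER.search(text) is not None


# Path('') is equivalent to Path('.'), so is_dir() is True for an empty root.
empty_root_is_dir = Path("").is_dir()

in_header = has_ruff_section("[tool.ruff]\nline-length = 120\n")
in_comment = has_ruff_section("# see [tool.ruff] below\n")
in_string = has_ruff_section('note = "[tool.ruff]"\n')
```

A plain substring test would report a match for all three inputs; the anchored pattern accepts only the first.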
* Relax ddev's datadog-checks-dev pin to span the dcd 39 release gap
This PR bumps `datadog-checks-dev` to 39 (`black` is dropped from `[cli]`
extras, so per semver it's a major). The current `~=38.0` constraint in
ddev's `pyproject.toml` would block any GitHub Action that installs both
packages from the local repo — `pip install -e ./ddev` would fail to
resolve the local dcd 39 against ddev's pin.
Relax the pin to `>=38.0,<40` for the duration of the gap between this
PR landing and the next ddev release. The release PR for ddev MUST
tighten this back to `~=39.0`.
* Apply ruff format to new helper and tests
* Restore relaxed datadog-checks-dev pin for transition period
* Add ai block to ddev config model for Anthropic API key and flow dirs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add changelog for ddev AI config
* Cover ddev AI config display
* Fix changelog filename to match PR #23894.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Move get_anthropic_api_key to top of model.py and add precedence test
Move get_anthropic_api_key() next to get_github_token() so all module-level
env-var helpers are grouped before any class definitions. Add a test
verifying config-file value takes precedence over DD_ANTHROPIC_API_KEY env var.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(ddev/config): replace hardcoded scrub_config with glob-based _scrub_path helper
- Extract _scrub_path to scrub arbitrary nested config paths using dot-notation globs
- Replace per-field scrub_config logic with a SCRUBBED_GLOBS-driven loop
Rationale: makes adding new sensitive fields trivial without touching scrubbing logic
This commit made by [/dd:git:commit:quick](https://github.com/DataDog/claude-marketplace/tree/main/dd/commands/git/commit/quick.md)
* Add unit tests for scrub_config traversal semantics
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Use app.config.ai.anthropic_api_key in dynamicd command
* Mention DD_ANTHROPIC_API_KEY in missing-key error message
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
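The glob-driven scrubbing described in the refactor commit above can be sketched as below. This is a hedged illustration: the contents of `SCRUBBED_GLOBS`, the mask value, and the segment-by-segment matching rule are assumptions, and this is not the ddev implementation of `_scrub_path`.

```python
from copy import deepcopy
from fnmatch import fnmatchcase

# Assumed glob list; the real SCRUBBED_GLOBS contents are not shown in the source.
SCRUBBED_GLOBS = ("github.token", "ai.anthropic_api_key", "orgs.*.api_key")
MASK = "*****"


def _matches(path, glob):
    """Match a key path against a dot-notation glob, one segment at a time."""
    parts = glob.split(".")
    return len(parts) == len(path) and all(fnmatchcase(p, g) for p, g in zip(path, parts))


def _scrub(node, path, globs):
    for key, value in list(node.items()):
        key_path = path + (str(key),)
        if isinstance(value, dict):
            _scrub(value, key_path, globs)
        elif any(_matches(key_path, glob) for glob in globs):
            node[key] = MASK


def scrub_config(config, globs=SCRUBBED_GLOBS):
    """Return a copy of config with every value matching a glob replaced by MASK."""
    scrubbed = deepcopy(config)
    _scrub(scrubbed, (), globs)
    return scrubbed


config = {
    "github": {"user": "someone", "token": "secret-token"},
    "ai": {"anthropic_api_key": "secret-key"},
    "orgs": {"default": {"api_key": "secret-api-key", "site": "datadoghq.com"}},
}
scrubbed = scrub_config(config)
```

With this shape, marking a new field as sensitive means adding one glob to the tuple; the traversal code stays untouched.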
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)