[pull] master from DataDog:master#530

Merged
pull[bot] merged 6 commits into ConnectionMaster:master from DataDog:master
May 11, 2026

@pull pull Bot commented May 11, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

clamoriniere and others added 6 commits May 11, 2026 15:09
* add(datadog_cluster_agent): add autoscaling metrics

Added new metrics related to autoscaling conditions and constraints for the DatadogPodAutoscaler.
These metrics were introduced in the cluster-agent by DataDog/datadog-agent#47138.

* Apply suggestion from @clamoriniere
* ci(release): gate release-trigger on the release environment

The prepare job in release-dispatch.yml creates tags before reaching the
environment: release gate on the dispatch job. Adding environment: release
to the calling dispatch job in release-trigger.yml ensures GitHub's
deployment protection runs before the reusable workflow's jobs start,
so tagging requires manual approval.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* ci(release): gate release-trigger on the release environment

Add environment: release to the dispatch job that calls the reusable
release-dispatch.yml workflow. GitHub's deployment protection now runs
before any of the reusable workflow's jobs start, so the prepare step
(which creates tags) requires manual approval.

The inner environment: release on release-dispatch.yml's dispatch job
is removed in integrations-core — a single gate at the trigger level
is sufficient.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* ci(release): gate release-trigger via intermediate approve job

environment: release cannot be used on a job that calls a reusable
workflow (uses:). Instead, add an explicit approve job that holds the
environment gate; the dispatch job depends on it, so the reusable
workflow's prepare step (which creates tags) cannot run until a
reviewer approves the deployment.

Remove the previously-added environment: release from the dispatch
job (invalid) and the inner environment: release from release-dispatch.yml
(redundant — a single gate is sufficient).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
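
The approve-job pattern described in this commit can be checked mechanically. The following is a minimal sketch, not code from the PR: it assumes the workflow's `jobs` mapping has already been parsed into a plain dict, and the function name `gate_violations`, the job names, and the `./.github/workflows/` path prefix are all hypothetical. It flags any job that calls a reusable workflow (`uses:`) while also declaring `environment:`, and any such job that does not depend on a job holding the environment gate.

```python
from typing import Any, Dict, List


def _env_name(job: Dict[str, Any]) -> Any:
    """Return the environment name, whether given as a string or a mapping."""
    env = job.get("environment")
    return env.get("name") if isinstance(env, dict) else env


def gate_violations(jobs: Dict[str, Dict[str, Any]], gate_env: str = "release") -> List[str]:
    """List problems with how reusable-workflow jobs are gated (hypothetical helper)."""
    # Ordinary jobs (no `uses:`) that hold the environment gate.
    gated = {
        name for name, job in jobs.items()
        if "uses" not in job and _env_name(job) == gate_env
    }
    problems = []
    for name, job in jobs.items():
        if "uses" not in job:
            continue
        if "environment" in job:
            problems.append(f"{name}: 'environment' is not allowed on a job that calls a reusable workflow")
        needs = job.get("needs", [])
        if isinstance(needs, str):
            needs = [needs]
        if not gated.intersection(needs):
            problems.append(f"{name}: does not depend on a job gated by environment '{gate_env}'")
    return problems


# Shape of the fix: an explicit approve job holds the gate, dispatch depends on it.
fixed_jobs = {
    "approve": {"runs-on": "ubuntu-latest", "environment": "release", "steps": [{"run": "true"}]},
    "dispatch": {"needs": "approve", "uses": "./.github/workflows/release-dispatch.yml"},
}
# Shape of the earlier, invalid attempt: the gate sits on the calling job itself.
invalid_jobs = {
    "dispatch": {"environment": "release", "uses": "./.github/workflows/release-dispatch.yml"},
}
```

Under these assumptions `gate_violations(fixed_jobs)` is empty, while `invalid_jobs` yields two findings: the misplaced `environment` key and the missing dependency on a gated job.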

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…isolate its failures (#23580)

* [kafka_consumer] Use AdminClient.list_offsets for earliest fetch and isolate its failures

Replace the LOW_WATERMARK fetch in `_collect_topic_metadata` (which went
through `Consumer.offsets_for_times(timestamp=0)` and forced the broker
to walk `.timeindex` segment files) with `AdminClient.list_offsets` and
`OffsetSpec.earliest()`. The broker serves this from the in-memory
`logStartOffset` instead, eliminating the time-index scan that was
timing out on clusters with many segments and multi-broker fan-out.
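
For readers unfamiliar with the call, here is a rough sketch of an earliest-offset fetch through `list_offsets`. It is not the code from this PR: the helper name `fetch_earliest_offsets` and its parameters are hypothetical, and the client is passed in duck-typed so the sketch runs without a broker. With confluent-kafka the arguments would presumably be a `confluent_kafka.admin.AdminClient`, `confluent_kafka.TopicPartition` keys, and `OffsetSpec.earliest()`.

```python
from typing import Any, Dict, Hashable, Iterable


def fetch_earliest_offsets(
    admin_client: Any,
    partitions: Iterable[Hashable],
    earliest_spec: Any,
    timeout_s: float = 10.0,
) -> Dict[Hashable, int]:
    """Return {partition: earliest offset}; partitions whose lookup failed are omitted.

    Hypothetical helper. Assumed real-world arguments:
      admin_client  -> confluent_kafka.admin.AdminClient({...})
      partitions    -> confluent_kafka.TopicPartition(topic, partition) objects
      earliest_spec -> confluent_kafka.admin.OffsetSpec.earliest()
    """
    # One request covering every partition; the client returns one future per partition.
    request = {tp: earliest_spec for tp in partitions}
    futures = admin_client.list_offsets(request, request_timeout=timeout_s)

    offsets: Dict[Hashable, int] = {}
    for tp, future in futures.items():
        try:
            # Each future resolves to a result object exposing `.offset`.
            offsets[tp] = future.result().offset
        except Exception:
            # A per-partition error only drops that partition from the result.
            continue
    return offsets
```

Errors raised by the `list_offsets` call itself are deliberately left to propagate here, so the caller decides how to isolate them.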

Wrap the call so a failure no longer aborts the entire topic-metadata
collection. When the earliest fetch fails (or returns errors per
partition), only the metrics that genuinely depend on it are skipped:
`partition.beginning_offset`, `partition.size`, and `topic.size`.
Everything else - `topic.message_rate`, `topic.partitions`,
`partition.isr/replicas/under_replicated/offline`, `topic.config.*`,
and all consumer-side metrics - keeps emitting.
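
The isolation described above might look roughly like the sketch below. It is illustrative only: `build_samples`, the `fetch_earliest` callable, and the tuple sample format are hypothetical, and computing `partition.size` as highwater minus earliest offset is an assumption made for the example, not something this commit message states.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

PartitionKey = Tuple[str, int]
Sample = Tuple[str, str, Optional[int], int]


def build_samples(
    highwater: Dict[PartitionKey, int],
    fetch_earliest: Callable[[], Dict[PartitionKey, int]],
) -> List[Sample]:
    """Emit samples, skipping only earliest-dependent ones when the fetch fails."""
    try:
        earliest = fetch_earliest()
    except Exception:
        # Isolate the failure: continue with no earliest offsets instead of aborting.
        earliest = {}

    by_topic: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for (topic, partition), high in sorted(highwater.items()):
        by_topic[topic].append((partition, high))

    samples: List[Sample] = []
    for topic, parts in by_topic.items():
        # Independent of the earliest fetch, so always emitted.
        samples.append(("topic.partitions", topic, None, len(parts)))
        sizes = []
        for partition, high in parts:
            low = earliest.get((topic, partition))
            if low is None:
                continue  # earliest offset unavailable for this partition
            samples.append(("partition.beginning_offset", topic, partition, low))
            # ASSUMPTION: size approximated as an offset difference.
            samples.append(("partition.size", topic, partition, high - low))
            sizes.append(high - low)
        # Emit a topic total only when every partition contributed a size.
        if len(sizes) == len(parts):
            samples.append(("topic.size", topic, None, sum(sizes)))
    return samples
```

When `fetch_earliest` raises, the result contains only the samples that do not depend on the earliest offsets; when it succeeds, the beginning-offset and size samples are added alongside them.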

Verified locally on a 20k-topic / 40k-partition / 20k-consumer-group
cluster: with the failure simulated, sample volume drops from 480k to
380k (only the 100k earliest-dependent samples lost) instead of the
prior 300k+ loss when the whole `_collect_topic_metadata` aborted.

* Use actual PR number for changelog file

* Drop dead lowwater code path and rename get_watermark_offsets -> get_highwater_offsets
Co-authored-by: dkirov-dd <166512750+dkirov-dd@users.noreply.github.com>
Co-authored-by: Steven Yuen <steven.yuen@datadoghq.com>
@pull pull Bot locked and limited conversation to collaborators May 11, 2026
@pull pull Bot added the ⤵️ pull label May 11, 2026
@pull pull Bot merged commit df7c7d2 into ConnectionMaster:master May 11, 2026
1 check passed
@pull pull Bot temporarily deployed to release May 11, 2026 20:44 Inactive

4 participants