Skip to content

fix(collector): remove prometheus annotations on disablePrometheusAnnotations update#5069

Open
praveen9354 wants to merge 1 commit into
open-telemetry:mainfrom
praveen9354:fix/operator-disable-prom-annotations-5043
Open

fix(collector): remove prometheus annotations on disablePrometheusAnnotations update#5069
praveen9354 wants to merge 1 commit into
open-telemetry:mainfrom
praveen9354:fix/operator-disable-prom-annotations-5043

Conversation

@praveen9354
Copy link
Copy Markdown

Fixes #5043

Description

When spec.observability.metrics.disablePrometheusAnnotations flipped from
false to true on an existing OpenTelemetryCollector, the operator's pod-
template mutate path preserved the prometheus.io/scrape, port, and path
annotations on the rolled pods.

The renderer in manifestutils.PodAnnotations correctly omitted those keys
from the desired pod template, but mutatePodTemplate runs a
mergeWithOverride against the existing pod template's annotations — that
merge intentionally preserves any key present in existing but absent from
desired (the long-standing "external annotations on the pod template
should survive a reconcile" contract enforced by the Ingress mutate tests).
So the prometheus.io/* keys planted at create-time stuck around forever.

Fix: track a small fixed set of operator-managed pod-template annotation
keys (currently the three prometheus.io annotations) and strip those keys
from existing.Annotations before the merge runs, when they are absent from
desired. Keys outside that set continue to be preserved.

Adds three table-driven test cases against mutatePodTemplate:

  • disable removes prometheus annotations but keeps user keys (the bug)
  • enable keeps prometheus annotations when desired sets them (regression)
  • no operator-managed keys on either side leaves user annotations intact

Fixes #5043

Signed-off-by: praveen9354 praveen9354@gmail.com

Link to tracking Issue(s)

See description and commits above.

Testing

Local verification of the changed file's adjacent test suite.
Upstream CI runs the full matrix on push.

Documentation

  • Yes, PR includes docs updates
  • Yes, issue opened
  • No

…otations update

When `spec.observability.metrics.disablePrometheusAnnotations` flipped from
false to true on an existing OpenTelemetryCollector, the operator's pod-
template mutate path preserved the prometheus.io/scrape, port, and path
annotations on the rolled pods.

The renderer in `manifestutils.PodAnnotations` correctly omitted those keys
from the desired pod template, but `mutatePodTemplate` runs a
mergeWithOverride against the existing pod template's annotations — that
merge intentionally preserves any key present in existing but absent from
desired (the long-standing "external annotations on the pod template
should survive a reconcile" contract enforced by the Ingress mutate tests).
So the prometheus.io/* keys planted at create-time stuck around forever.

Fix: track a small fixed set of operator-managed pod-template annotation
keys (currently the three prometheus.io annotations) and strip those keys
from existing.Annotations before the merge runs, when they are absent from
desired. Keys outside that set continue to be preserved.

Adds three table-driven test cases against `mutatePodTemplate`:
- disable removes prometheus annotations but keeps user keys (the bug)
- enable keeps prometheus annotations when desired sets them (regression)
- no operator-managed keys on either side leaves user annotations intact

Fixes open-telemetry#5043

Signed-off-by: praveen9354 <praveen9354@gmail.com>
@praveen9354 praveen9354 requested a review from a team as a code owner May 11, 2026 17:33
@swiatekm
Copy link
Copy Markdown
Contributor

This is a little dicey, because these annotations don't belong to the operator, but rather to Prometheus. If someone flips the attribute on and then adds their own, we'd remove them with this change. @open-telemetry/operator-approvers what do you think? We'd need to track the state in some way - for example add another annotation indicating that it was the operator who added these.

@github-actions
Copy link
Copy Markdown
Contributor

E2E Test Results

 36 files  259 suites   3h 52m 59s ⏱️
109 tests 106 ✅ 0 💤 3 ❌
281 runs  277 ✅ 0 💤 4 ❌

Results for commit 1ca314b.

@iblancasa
Copy link
Copy Markdown
Contributor

We'd need to track the state in some way - for example add another annotation indicating that it was the operator who added these.

Yep. It makes sense.

@praveen9354
Copy link
Copy Markdown
Author

@swiatekm @iblancasa — agreed, the unconditional strip is too aggressive. The operator should only remove keys it planted.

Plan for v2:

  1. When the operator stamps prometheus.io/{scrape,port,path} onto the pod template, also write a meta-annotation opentelemetry.io/operator-managed-annotations: prometheus.io/scrape,prometheus.io/port,prometheus.io/path recording which keys it owns on that pod template.
  2. On reconcile, before the merge, parse the meta-annotation from existing.Annotations into a set, and strip only those keys when they are absent from desired. Update the meta-annotation to reflect the current desired set after the strip.
  3. If a user later adds their own prometheus.io/scrape while disablePrometheusAnnotations: true, that key is not in the meta-set, so it survives the merge.

Migration shape:

  • Pod templates created before this change have no meta-annotation. On the first reconcile after upgrade, the operator stamps the meta-annotation reflecting the current desired set (so existing operator-planted keys are correctly tracked going forward), but does not strip anything that turn — there is no recorded ownership yet.
  • From the second reconcile onward, behavior matches the new contract.

Tests (mutatePodTemplate, table-driven):

  • operator-added key + disable flip → key stripped, user keys preserved (the bug).
  • operator-added key + enable flip → key kept, meta-annotation updated.
  • user-added prometheus.io/scrape while disable=true and no meta → user key preserved (legacy migration).
  • user-added prometheus.io/scrape while disable=true with explicit meta → user key preserved, operator-managed keys still stripped.
  • no operator-managed keys on either side → user annotations untouched.

CI:

  • .chloggen/<filename>.yaml entry added (the current changelog check is failing because this PR has no chloggen file).
  • make lint clean.
  • e2e instrumentation/targetallocator jobs re-run after the redesign.

Will rework on the same branch — likely a force-push since the diff shape changes, unless you would prefer me to stack a second commit so the v1 → v2 evolution is visible in the PR history. Let me know which you prefer.

If the meta-annotation key name (opentelemetry.io/operator-managed-annotations) is reserved for something else or you have a preferred namespace key, happy to adjust.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

disablePrometheusAnnotations does not remove annotations

3 participants