
🌱 Migrate e2e prometheus from custom chart to kube-prometheus-stack #2757

Open
pedjak wants to merge 1 commit into operator-framework:main from pedjak:migrate-prometheus-to-kube-prometheus-stack

Conversation

@pedjak pedjak commented Jun 10, 2026

Contributor

Description

🌱 Replace the hand-rolled prometheus-operator install script (hack/test/install-prometheus.sh) and the custom Helm chart (helm/prometheus/) with the official kube-prometheus-stack community chart (v86.2.2), installed from an OCI registry.

Motivation: The custom chart required maintaining 12 templates, a separate install script, kustomize tooling, and conftest policies. The official chart provides the same functionality with less maintenance burden.

What changed

  • Prometheus deployment: Single helm install from oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack replaces the kustomize-based operator install + custom chart template pipeline
  • Unused components disabled: grafana, alertmanager, node-exporter, kube-state-metrics, default rules, admission webhooks, operator TLS
  • Kubelet ServiceMonitor: Uses chart's built-in kubelet support with kubelet: false (only cAdvisor), honorTimestamps: false, trackTimestampsStaleness: false, and cAdvisorRelabelings: [] to match old behavior
  • Custom ServiceMonitors: operator-controller and catalogd added via additionalServiceMonitors using bearerTokenFile (projected SA token) instead of the legacy prometheus-metrics-token Secret
  • PrometheusRules: Split into controller-panic-alerts and controller-resource-alerts map keys so the experimental override only replaces the resource-usage group (reducing duplication)
  • NetworkPolicies: Managed by the chart via prometheus.networkPolicy and prometheusOperator.networkPolicy settings
  • Install logic: Inlined into the Makefile prometheus target — no separate script
  • Conftest: Removed prometheus-networkpolicies.rego policy (NetworkPolicy now managed by the chart)
  • Kustomize: Removed the kustomize bingo tooling, which was only used by the deleted install script
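
As a rough illustration of the values described above, a kube-prometheus-stack values file that disables the unused components and keeps only the cAdvisor endpoint of the kubelet ServiceMonitor might look like the sketch below. This is not the PR's actual testdata/prometheus/values.yaml. The key paths reflect the chart's values layout as commonly documented and should be checked against the pinned chart version.

```yaml
# Sketch only: assumed key paths for kube-prometheus-stack; verify against the chart version in use.
grafana:
  enabled: false
alertmanager:
  enabled: false
nodeExporter:
  enabled: false
kubeStateMetrics:
  enabled: false
defaultRules:
  create: false

prometheusOperator:
  admissionWebhooks:
    enabled: false
  tls:
    enabled: false
  networkPolicy:
    enabled: true          # chart-managed NetworkPolicy for the operator

prometheus:
  networkPolicy:
    enabled: true          # chart-managed NetworkPolicy for Prometheus

kubelet:
  enabled: true
  serviceMonitor:
    kubelet: false                   # skip the kubelet's own metrics endpoint
    cAdvisor: true                   # keep container-level metrics
    honorTimestamps: false
    trackTimestampsStaleness: false
    cAdvisorRelabelings: []          # empty list overrides the chart's default relabelings
```

Setting cAdvisorRelabelings to an empty list matters because Helm replaces list values wholesale, so the chart's default relabelings are dropped rather than merged.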

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

Copilot AI review requested due to automatic review settings June 10, 2026 16:24
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress label Jun 10, 2026
openshift-ci Bot commented Jun 10, 2026


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign perdasilva for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copilot AI left a comment

Contributor

Pull request overview

This PR migrates the e2e Prometheus setup from a bespoke Helm chart + install script to the upstream kube-prometheus-stack chart (installed from the OCI registry), aiming to reduce maintenance overhead while preserving the existing e2e monitoring/alerting behavior.

Changes:

  • Added kube-prometheus-stack values files under testdata/prometheus/ (including an experimental override).
  • Updated the prometheus Makefile target to install kube-prometheus-stack directly via Helm and removed Prometheus-specific helm+conftest linting.
  • Removed the custom Prometheus Helm chart, the install script, and the Prometheus NetworkPolicy conftest policy.
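
To make the relationship between the baseline values file and the experimental override concrete, here is a hedged sketch of how the two PrometheusRule map keys named in the PR description could be laid out. The map keys controller-panic-alerts and controller-resource-alerts come from the PR. The group names, alert names, expressions, namespace, and thresholds are hypothetical placeholders, and the two files are shown as two YAML documents in one block.

```yaml
# testdata/prometheus/values.yaml (sketch; rule contents are hypothetical)
additionalPrometheusRulesMap:
  controller-panic-alerts:
    groups:
      - name: controller-panic             # hypothetical group name
        rules:
          - alert: reconciler-panic        # hypothetical alert name
            expr: increase(controller_runtime_reconcile_panics_total[5m]) > 0
            labels:
              severity: critical
  controller-resource-alerts:
    groups:
      - name: controller-resource-usage    # hypothetical group name
        rules:
          - alert: memory-usage-high       # hypothetical alert name and threshold
            expr: 'sum by (pod) (container_memory_working_set_bytes{namespace="olmv1-system"}) > 1e8'
            for: 5m
            labels:
              severity: warning
---
# testdata/prometheus/values-experimental.yaml (sketch; only the resource group is overridden)
additionalPrometheusRulesMap:
  controller-resource-alerts:
    groups:
      - name: controller-resource-usage
        rules:
          - alert: memory-usage-high
            expr: 'sum by (pod) (container_memory_working_set_bytes{namespace="olmv1-system"}) > 2e8'
            for: 5m
            labels:
              severity: warning
```

Because Helm deep-merges maps across values files but replaces lists wholesale, passing both files leaves controller-panic-alerts untouched and swaps in only the groups list under controller-resource-alerts.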

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

Show a summary per file

| File | Description |
| --- | --- |
| testdata/prometheus/values.yaml | New baseline kube-prometheus-stack values for e2e (disables unused components, adds ServiceMonitors and rules). |
| testdata/prometheus/values-experimental.yaml | New experimental override values to adjust alert thresholds. |
| Makefile | Replaces the Prometheus install script with an inline Helm install from OCI; updates linting and experimental values path. |
| hack/conftest/policy/README.md | Updates documentation to reflect removal of Prometheus-specific conftest policies and lint flow. |
| AGENTS.md | Updates repository layout documentation to reference the new testdata/prometheus/ location. |
| helm/prometheus/Chart.yaml (deleted) | Removes the custom Prometheus Helm chart definition. |
| helm/prometheus/values.yaml (deleted) | Removes custom chart values (thresholds/namespaces). |
| helm/prometheus/templates/servicemonitor-operator-controller-controller-manager-metrics-monitor.yml (deleted) | Removes custom ServiceMonitor for operator-controller metrics. |
| helm/prometheus/templates/servicemonitor-catalogd-controller-manager-metrics-monitor.yml (deleted) | Removes custom ServiceMonitor for catalogd metrics. |
| helm/prometheus/templates/servicemonitor-kubelet.yml (deleted) | Removes custom kubelet ServiceMonitor. |
| helm/prometheus/templates/prometheusrule-controller-alerts.yml (deleted) | Removes custom PrometheusRule definitions (now supplied via upstream chart values). |
| helm/prometheus/templates/prometheus-prometheus.yml (deleted) | Removes custom Prometheus CR manifest. |
| helm/prometheus/templates/serviceaccount-prometheus.yml (deleted) | Removes custom Prometheus ServiceAccount manifest. |
| helm/prometheus/templates/secret-prometheus-metrics-token.yml (deleted) | Removes the legacy service-account-token Secret approach for scraping. |
| helm/prometheus/templates/service-prometheus-service.yml (deleted) | Removes the custom NodePort Service manifest. |
| helm/prometheus/templates/networkpolicy-prometheus.yml (deleted) | Removes custom Prometheus NetworkPolicy manifest. |
| helm/prometheus/templates/networkpolicy-prometheus-operator.yml (deleted) | Removes custom Prometheus Operator NetworkPolicy manifest. |
| helm/prometheus/templates/clusterrole-prometheus.yml (deleted) | Removes custom Prometheus ClusterRole manifest. |
| helm/prometheus/templates/clusterrolebinding-prometheus.yml (deleted) | Removes custom Prometheus ClusterRoleBinding manifest. |
| helm/prom_experimental.yaml (deleted) | Removes the legacy experimental Prometheus values file (replaced by testdata/prometheus/values-experimental.yaml). |
| hack/test/install-prometheus.sh (deleted) | Removes the bespoke Prometheus install script (replaced by Makefile logic). |
| hack/conftest/policy/prometheus-networkpolicies.rego (deleted) | Removes Prometheus NetworkPolicy conftest policy (now chart-managed). |
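
For readers unfamiliar with how the deleted ServiceMonitor templates and the token Secret map onto chart values, the following is a minimal sketch of an additionalServiceMonitors entry that authenticates with the projected ServiceAccount token via bearerTokenFile. The namespace, selector label, port name, scrape interval, and TLS setting are assumptions for illustration and are not taken from the PR's values file.

```yaml
# Sketch only: one custom ServiceMonitor supplied through chart values.
prometheus:
  additionalServiceMonitors:
    - name: operator-controller
      namespaceSelector:
        matchNames:
          - olmv1-system                                   # assumed namespace
      selector:
        matchLabels:
          app.kubernetes.io/name: operator-controller      # hypothetical label
      endpoints:
        - port: https                                      # hypothetical port name
          path: /metrics
          scheme: https
          interval: 60s                                    # hypothetical interval
          # Token mounted into the Prometheus pod; replaces the legacy token Secret.
          bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          tlsConfig:
            insecureSkipVerify: true                       # assumption: acceptable only in e2e clusters
```

A catalogd entry would follow the same shape with its own selector. With this approach the chart no longer needs a long-lived service-account-token Secret, since Prometheus reads the token file that Kubernetes already projects into its pod.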


Comment thread Makefile
Comment thread Makefile Outdated
netlify Bot commented Jun 10, 2026


Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit 1ae258f
🔍 Latest deploy log https://app.netlify.com/projects/olmv1/deploys/6a29bdfd5d736c00087043c2
😎 Deploy Preview https://deploy-preview-2757--olmv1.netlify.app

Copilot AI review requested due to automatic review settings June 10, 2026 18:59

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.

Comment thread testdata/prometheus/values.yaml
Comment thread testdata/prometheus/values.yaml
Comment thread testdata/prometheus/values.yaml
Comment thread testdata/prometheus/values.yaml
Comment thread Makefile Outdated
Copilot AI review requested due to automatic review settings June 10, 2026 19:11
@pedjak pedjak force-pushed the migrate-prometheus-to-kube-prometheus-stack branch from ba05248 to 3251893 Compare June 10, 2026 19:11

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.

Comment thread Makefile Outdated
Comment thread testdata/prometheus/values.yaml Outdated
Comment thread testdata/prometheus/values.yaml Outdated
Comment thread testdata/prometheus/values.yaml
codecov Bot commented Jun 10, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.82%. Comparing base (23b7e52) to head (1ae258f).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2757      +/-   ##
==========================================
- Coverage   66.84%   66.82%   -0.02%     
==========================================
  Files         149      149              
  Lines       11382    11382              
==========================================
- Hits         7608     7606       -2     
- Misses       3218     3219       +1     
- Partials      556      557       +1     
Flag Coverage Δ
e2e 35.24% <ø> (+0.10%) ⬆️
experimental-e2e 52.28% <ø> (-0.07%) ⬇️
unit 52.17% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.


@pedjak pedjak force-pushed the migrate-prometheus-to-kube-prometheus-stack branch from 3251893 to dc22383 Compare June 10, 2026 19:37
…-stack

Replace the hand-rolled prometheus-operator install script and custom
Helm chart (helm/prometheus/) with the official kube-prometheus-stack
community chart (v86.2.2), installed from OCI registry.

- Disable all unused components (grafana, alertmanager, exporters,
  default rules, admission webhooks, operator TLS)
- Configure Prometheus instance, NetworkPolicies, and kubelet
  ServiceMonitor via chart values
- Add operator-controller and catalogd ServiceMonitors as
  additionalServiceMonitors using bearerTokenFile (projected SA token)
  instead of the legacy prometheus-metrics-token Secret
- Split PrometheusRules into controller-panic-alerts and
  controller-resource-alerts so the experimental override only
  replaces the resource-usage group
- Inline the install logic into the Makefile prometheus target
- Remove conftest prometheus-networkpolicies.rego policy (NetworkPolicy
  now managed by the chart)
- Remove unused kustomize bingo tooling

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 10, 2026 19:41
@pedjak pedjak force-pushed the migrate-prometheus-to-kube-prometheus-stack branch from dc22383 to 1ae258f Compare June 10, 2026 19:41

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.

Comment thread Makefile
Comment on lines 125 to +127
 lint-helm: $(HELM) $(CONFTEST) #HELP Run helm linter
 	helm lint helm/olmv1
-	helm lint helm/prometheus
-	(set -euo pipefail; helm template olmv1 helm/olmv1; helm template prometheus helm/prometheus) | $(CONFTEST) test --policy hack/conftest/policy/ --combine -n main -n prometheus -
+	(set -euo pipefail; helm template olmv1 helm/olmv1) | $(CONFTEST) test --policy hack/conftest/policy/ --combine -n main -
@pedjak pedjak marked this pull request as ready for review June 10, 2026 21:19
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress label Jun 10, 2026
@openshift-ci openshift-ci Bot requested review from ankitathomas and tmshort June 10, 2026 21:19
@pedjak pedjak requested a review from dtfranz June 10, 2026 21:20

Labels

None yet

Projects

None yet

2 participants