🌱 Migrate e2e prometheus from custom chart to kube-prometheus-stack #2757
pedjak wants to merge 1 commit into
[APPROVALNOTIFIER] This PR is NOT APPROVED. It needs approval from an approver in each of the affected files.
Pull request overview
This PR migrates the e2e Prometheus setup from a bespoke Helm chart + install script to the upstream kube-prometheus-stack chart (installed from the OCI registry), aiming to reduce maintenance overhead while preserving the existing e2e monitoring/alerting behavior.
Changes:
- Added `kube-prometheus-stack` values files under `testdata/prometheus/` (including an experimental override).
- Updated the `prometheus` Makefile target to install `kube-prometheus-stack` directly via Helm and removed Prometheus-specific helm+conftest linting.
- Removed the custom Prometheus Helm chart, the install script, and the Prometheus NetworkPolicy conftest policy.
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| testdata/prometheus/values.yaml | New baseline kube-prometheus-stack values for e2e (disables unused components, adds ServiceMonitors and rules). |
| testdata/prometheus/values-experimental.yaml | New experimental override values to adjust alert thresholds. |
| Makefile | Replaces the Prometheus install script with an inline Helm install from OCI; updates linting and experimental values path. |
| hack/conftest/policy/README.md | Updates documentation to reflect removal of Prometheus-specific conftest policies and lint flow. |
| AGENTS.md | Updates repository layout documentation to reference the new testdata/prometheus/ location. |
| helm/prometheus/Chart.yaml (deleted) | Removes the custom Prometheus Helm chart definition. |
| helm/prometheus/values.yaml (deleted) | Removes custom chart values (thresholds/namespaces). |
| helm/prometheus/templates/servicemonitor-operator-controller-controller-manager-metrics-monitor.yml (deleted) | Removes custom ServiceMonitor for operator-controller metrics. |
| helm/prometheus/templates/servicemonitor-catalogd-controller-manager-metrics-monitor.yml (deleted) | Removes custom ServiceMonitor for catalogd metrics. |
| helm/prometheus/templates/servicemonitor-kubelet.yml (deleted) | Removes custom kubelet ServiceMonitor. |
| helm/prometheus/templates/prometheusrule-controller-alerts.yml (deleted) | Removes custom PrometheusRule definitions (now supplied via upstream chart values). |
| helm/prometheus/templates/prometheus-prometheus.yml (deleted) | Removes custom Prometheus CR manifest. |
| helm/prometheus/templates/serviceaccount-prometheus.yml (deleted) | Removes custom Prometheus ServiceAccount manifest. |
| helm/prometheus/templates/secret-prometheus-metrics-token.yml (deleted) | Removes the legacy service-account-token Secret approach for scraping. |
| helm/prometheus/templates/service-prometheus-service.yml (deleted) | Removes the custom NodePort Service manifest. |
| helm/prometheus/templates/networkpolicy-prometheus.yml (deleted) | Removes custom Prometheus NetworkPolicy manifest. |
| helm/prometheus/templates/networkpolicy-prometheus-operator.yml (deleted) | Removes custom Prometheus Operator NetworkPolicy manifest. |
| helm/prometheus/templates/clusterrole-prometheus.yml (deleted) | Removes custom Prometheus ClusterRole manifest. |
| helm/prometheus/templates/clusterrolebinding-prometheus.yml (deleted) | Removes custom Prometheus ClusterRoleBinding manifest. |
| helm/prom_experimental.yaml (deleted) | Removes the legacy experimental Prometheus values file (replaced by testdata/prometheus/values-experimental.yaml). |
| hack/test/install-prometheus.sh (deleted) | Removes the bespoke Prometheus install script (replaced by Makefile logic). |
| hack/conftest/policy/prometheus-networkpolicies.rego (deleted) | Removes Prometheus NetworkPolicy conftest policy (now chart-managed). |
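
As a rough illustration of what the `testdata/prometheus/values.yaml` row describes, a baseline kube-prometheus-stack values file could look something like the sketch below. This is not the PR's actual file. The component toggles are the ones named in the PR's commit message; the ServiceMonitor name, namespace, label selector, and port name are placeholders, and placing the kubelet settings under `kubelet.serviceMonitor` is an assumption about the chart layout.

```yaml
# Hypothetical excerpt; values in the real file may differ.

# Disable components the e2e setup does not use.
grafana:
  enabled: false
alertmanager:
  enabled: false
nodeExporter:
  enabled: false
kubeStateMetrics:
  enabled: false
defaultRules:
  create: false

prometheusOperator:
  admissionWebhooks:
    enabled: false
  tls:
    enabled: false
  networkPolicy:
    enabled: true          # NetworkPolicy rendered by the chart

# Kubelet ServiceMonitor: scrape only cAdvisor (assumed location of these keys).
kubelet:
  enabled: true
  serviceMonitor:
    kubelet: false
    cAdvisor: true
    honorTimestamps: false
    trackTimestampsStaleness: false
    cAdvisorRelabelings: []

prometheus:
  networkPolicy:
    enabled: true
  additionalServiceMonitors:
    - name: operator-controller-metrics        # placeholder name
      namespaceSelector:
        matchNames:
          - olmv1-system                       # placeholder namespace
      selector:
        matchLabels:
          app.kubernetes.io/name: operator-controller   # placeholder label
      endpoints:
        - port: https                          # placeholder port name
          scheme: https
          path: /metrics
          # Projected ServiceAccount token mounted into the Prometheus pod,
          # used instead of a long-lived token Secret.
          bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          tlsConfig:
            insecureSkipVerify: true           # acceptable for e2e only
```

A second entry under `additionalServiceMonitors` for catalogd would follow the same shape with its own selector.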
✅ Deploy Preview for olmv1 ready!
Force-pushed from ba05248 to 3251893.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #2757 +/- ##
==========================================
- Coverage 66.84% 66.82% -0.02%
==========================================
Files 149 149
Lines 11382 11382
==========================================
- Hits 7608 7606 -2
- Misses 3218 3219 +1
- Partials 556 557 +1
Force-pushed from 3251893 to dc22383.
…-stack

Replace the hand-rolled prometheus-operator install script and custom Helm chart (helm/prometheus/) with the official kube-prometheus-stack community chart (v86.2.2), installed from OCI registry.

- Disable all unused components (grafana, alertmanager, exporters, default rules, admission webhooks, operator TLS)
- Configure Prometheus instance, NetworkPolicies, and kubelet ServiceMonitor via chart values
- Add operator-controller and catalogd ServiceMonitors as additionalServiceMonitors using bearerTokenFile (projected SA token) instead of the legacy prometheus-metrics-token Secret
- Split PrometheusRules into controller-panic-alerts and controller-resource-alerts so the experimental override only replaces the resource-usage group
- Inline the install logic into the Makefile prometheus target
- Remove conftest prometheus-networkpolicies.rego policy (NetworkPolicy now managed by the chart)
- Remove unused kustomize bingo tooling

Co-Authored-By: Claude <noreply@anthropic.com>
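
The reason for splitting the rules across two keys can be sketched as follows. Helm deep-merges maps from successive values files but replaces lists wholesale, so an override that touches one map key leaves its sibling keys intact. The sketch assumes the rules are supplied through the chart's `additionalPrometheusRulesMap` value; group names, alert names, expressions, and thresholds are illustrative, not taken from the PR.

```yaml
# --- Baseline values file (hypothetical excerpt) ---
additionalPrometheusRulesMap:
  controller-panic-alerts:
    groups:
      - name: controller-panic                 # illustrative group name
        rules:
          - alert: reconciler-panic            # illustrative alert
            expr: 'increase(controller_runtime_reconcile_panics_total[5m]) > 0'
  controller-resource-alerts:
    groups:
      - name: controller-resource-usage        # illustrative group name
        rules:
          - alert: controller-memory-high      # illustrative alert
            expr: 'container_memory_working_set_bytes{pod=~"operator-controller.*"} > 100 * 1024 * 1024'

# --- Experimental override file (hypothetical excerpt, a separate file) ---
# Only the resource-usage key is restated with a different threshold.
# After merging, controller-panic-alerts still comes from the baseline.
#
# additionalPrometheusRulesMap:
#   controller-resource-alerts:
#     groups:
#       - name: controller-resource-usage
#         rules:
#           - alert: controller-memory-high
#             expr: 'container_memory_working_set_bytes{pod=~"operator-controller.*"} > 150 * 1024 * 1024'
```

Had both groups lived in a single list, the override would have needed to repeat the panic rules as well, because the whole list would be replaced.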
Force-pushed from dc22383 to 1ae258f.
 lint-helm: $(HELM) $(CONFTEST) #HELP Run helm linter
 	helm lint helm/olmv1
-	helm lint helm/prometheus
-	(set -euo pipefail; helm template olmv1 helm/olmv1; helm template prometheus helm/prometheus) | $(CONFTEST) test --policy hack/conftest/policy/ --combine -n main -n prometheus -
+	(set -euo pipefail; helm template olmv1 helm/olmv1) | $(CONFTEST) test --policy hack/conftest/policy/ --combine -n main -
Description
🌱 Replace the hand-rolled prometheus-operator install script (`hack/test/install-prometheus.sh`) and custom Helm chart (`helm/prometheus/`) with the official kube-prometheus-stack community chart (v86.2.2), installed from the OCI registry.

Motivation: The custom chart required maintaining 12 templates, a separate install script, kustomize tooling, and conftest policies. The official chart provides the same functionality with less maintenance burden.
What changed
- `helm install` from `oci://ghcr.io/prometheus-community/charts/kube-prometheus-stack` replaces the kustomize-based operator install + custom chart template pipeline
- … `kubelet: false` (only cAdvisor), `honorTimestamps: false`, `trackTimestampsStaleness: false`, and `cAdvisorRelabelings: []` to match old behavior
- … `additionalServiceMonitors` using `bearerTokenFile` (projected SA token) instead of the legacy `prometheus-metrics-token` Secret
- … `controller-panic-alerts` and `controller-resource-alerts` map keys so the experimental override only replaces the resource-usage group (reducing duplication)
- … `prometheus.networkPolicy` and `prometheusOperator.networkPolicy` settings
- … `prometheus` target; no separate script
- … `prometheus-networkpolicies.rego` policy (NetworkPolicy now managed by the chart)

Reviewer Checklist