
NE-1113: Implement podMonitor provisioning with Gateway API #1425

Open
rikatz wants to merge 1 commit into openshift:master from rikatz:ne-1113

Conversation


@rikatz rikatz commented Apr 28, 2026

Summary

Adds a new gateway-podmonitor controller that creates a PodMonitor resource per Gateway in openshift-ingress, enabling Prometheus to scrape Istio gateway pod metrics for consumption by Kiali.

  • New unmanaged dependent controller following the gateway-networkpolicy pattern
  • Creates a PodMonitor per Gateway selecting pods via gateway.networking.k8s.io/gateway-name
  • Scrapes Istio envoy metrics at /stats/prometheus (port http-envoy-prom) every 60s
  • Filters metrics via metricRelabelings to keep only istio_* and select envoy_* series
  • PodMonitor lifecycle tied to Gateway via OwnerReference (cascading deletion)
  • RBAC scoped to openshift-ingress namespace via namespaced Role/RoleBinding (not ClusterRole)
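
Putting the bullets above together, the desired PodMonitor can be sketched in the map-based unstructured form the PR description says `podmonitor.go` uses. This is a sketch inferred from the description, not the actual code; the exact allowlist regex and field layout in the PR may differ.

```go
package main

import "fmt"

// desiredPodMonitor sketches the PodMonitor the controller would build for a
// Gateway: selected by the gateway-name label, scraping http-envoy-prom at
// /stats/prometheus every 60s, with a "keep" allowlist relabeling. All field
// values here are inferred from the PR description.
func desiredPodMonitor(gatewayName string) map[string]interface{} {
	return map[string]interface{}{
		"apiVersion": "monitoring.coreos.com/v1",
		"kind":       "PodMonitor",
		"metadata": map[string]interface{}{
			"name":      gatewayName + "-monitor",
			"namespace": "openshift-ingress",
		},
		"spec": map[string]interface{}{
			"selector": map[string]interface{}{
				"matchLabels": map[string]interface{}{
					"gateway.networking.k8s.io/gateway-name": gatewayName,
				},
			},
			"podMetricsEndpoints": []interface{}{
				map[string]interface{}{
					"port":     "http-envoy-prom",
					"path":     "/stats/prometheus",
					"interval": "60s",
					"metricRelabelings": []interface{}{
						map[string]interface{}{
							"action":       "keep",
							"sourceLabels": []interface{}{"__name__"},
							// Illustrative allowlist; the PR's actual envoy_* patterns differ.
							"regex": "istio_.*|envoy_server_.*",
						},
					},
				},
			},
		},
	}
}

func main() {
	pm := desiredPodMonitor("demo-gateway")
	fmt.Println(pm["metadata"].(map[string]interface{})["name"]) // demo-gateway-monitor
}
```

The OwnerReference to the Gateway (omitted here) is what makes deletion cascade when the Gateway is removed.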

Files Changed

| File | Change |
| --- | --- |
| pkg/operator/controller/gateway-podmonitor/controller.go | New controller: watches Gateways and PodMonitors in openshift-ingress |
| pkg/operator/controller/gateway-podmonitor/podmonitor.go | PodMonitor ensure/create/update logic using unstructured.Unstructured |
| pkg/operator/controller/names.go | Added GatewayPodMonitorName() returning <gateway-name>-monitor |
| pkg/operator/operator.go | Registered controller as dependent of gatewayapi controller |
| manifests/01-role.yaml | Added namespaced Role for podmonitors in openshift-ingress |
| manifests/01-role-binding.yaml | Added corresponding RoleBinding |
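
The naming helper added to names.go amounts to the convention stated in the table: PodMonitor name = gateway name plus a "-monitor" suffix. A minimal sketch (the real helper may return a NamespacedName rather than a plain string):

```go
package main

import "fmt"

// gatewayPodMonitorName sketches the GatewayPodMonitorName() helper described
// above: it derives the PodMonitor name from the owning Gateway's name.
func gatewayPodMonitorName(gatewayName string) string {
	return gatewayName + "-monitor"
}

func main() {
	fmt.Println(gatewayPodMonitorName("demo-gateway")) // demo-gateway-monitor
}
```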

Prometheus Cardinality Impact

The metricRelabelings keep filter restricts scraped metrics to:

  • istio_* (request count, duration, size, TCP metrics)
  • Select envoy_* (upstream connections/requests, listener activity, server memory, uptime)

Estimated time series per deployment

| Scale | Series estimate | Impact |
| --- | --- | --- |
| 1 Gateway, 2 replicas | ~4,000 | Negligible |
| 5 Gateways, 3 replicas | ~30,000 | Modest |
| 20 Gateways, 3 replicas | ~120,000 | Moderate — monitor prometheus_tsdb_head_series |
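
The estimates above are consistent with a simple linear model: roughly 2,000 series per gateway replica (a figure assumed from the first row, and which in practice varies with routes, clusters, and histogram buckets):

```go
package main

import "fmt"

// estimateSeries models the table above: total head series scale linearly
// with gateways × replicas × per-replica series. The 2,000-per-replica figure
// is an assumption backed out of the "1 Gateway, 2 replicas ≈ 4,000" row.
func estimateSeries(gateways, replicasPerGateway, seriesPerReplica int) int {
	return gateways * replicasPerGateway * seriesPerReplica
}

func main() {
	const perReplica = 2000
	fmt.Println(estimateSeries(1, 2, perReplica))  // 4000
	fmt.Println(estimateSeries(5, 3, perReplica))  // 30000
	fmt.Println(estimateSeries(20, 3, perReplica)) // 120000
}
```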

Mitigations in place

  • keep action with regex filter drops all metrics not matching the allowlist
  • 60s scrape interval reduces ingestion load 2-4x vs common 15-30s scrape intervals
  • Per-gateway PodMonitor allows targeted deletion if a specific Gateway causes issues
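
The effect of the keep filter can be illustrated with a regular expression over metric names. The regex below is a stand-in; the PR's actual envoy_* allowlist is not reproduced on this page:

```go
package main

import (
	"fmt"
	"regexp"
)

// allowlist mirrors the shape of the "keep" metricRelabeling: only metric
// names matching the regex survive scraping. Patterns here are illustrative.
var allowlist = regexp.MustCompile(`^(istio_.*|envoy_server_.*|envoy_listener_.*)$`)

// kept reports whether a scraped metric name would survive the keep filter.
func kept(name string) bool {
	return allowlist.MatchString(name)
}

func main() {
	for _, name := range []string{
		"istio_requests_total",           // kept: istio_*
		"envoy_server_memory_allocated",  // kept: allowlisted envoy_*
		"envoy_http_downstream_rq_total", // dropped: not in the allowlist
	} {
		fmt.Printf("%s keep=%v\n", name, kept(name))
	}
}
```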

Recommendations for high-scale environments

  • Monitor prometheus_tsdb_head_series after enabling Gateway PodMonitors
  • If histogram buckets drive excessive cardinality, add a drop relabeling for istio_request_duration_milliseconds_bucket (keep only _count and _sum)
  • For >20 Gateways, consider tightening the regex further based on actual Kiali requirements
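
The histogram-bucket recommendation above can be sketched as a predicate for an additional drop relabeling: discard the per-bucket series while keeping _count and _sum, which still supports average-latency queries at far lower cardinality. Metric and function names here are illustrative:

```go
package main

import "fmt"

// dropBucket sketches the suggested drop rule: per-bucket histogram series
// for istio request duration are dropped; _count and _sum survive.
func dropBucket(name string) bool {
	return name == "istio_request_duration_milliseconds_bucket"
}

func main() {
	for _, name := range []string{
		"istio_request_duration_milliseconds_bucket",
		"istio_request_duration_milliseconds_count",
		"istio_request_duration_milliseconds_sum",
	} {
		fmt.Printf("%s drop=%v\n", name, dropBucket(name))
	}
}
```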

Followup

  • Add a NetworkPolicy for gateway provisioning that blocks metric scraping and allows only the openshift-monitoring namespace to do so

Test Plan

  • Deploy operator with Gateway API feature gate enabled
  • Create a Gateway in openshift-ingress
  • Verify PodMonitor <gateway-name>-monitor is created in openshift-ingress
  • Verify PodMonitor selector matches Gateway pod labels
  • Verify Prometheus targets show the Gateway pods as scrape targets
  • Verify istio_* and allowed envoy_* metrics appear in Prometheus
  • Delete the Gateway and verify the PodMonitor is garbage-collected
  • Verify no PodMonitor leak after Gateway deletion
  • Check Prometheus prometheus_tsdb_head_series for cardinality impact

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 28, 2026

openshift-ci-robot commented Apr 28, 2026

@rikatz: This pull request references NE-1113 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set.

Details

In response to this: (the PR description, quoted in full above)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


coderabbitai Bot commented Apr 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: f9c091c6-aa72-4e1e-b8ab-9c0962aaaafb

📥 Commits

Reviewing files that changed from the base of the PR and between 5ceccfd and 7ae9aed.

📒 Files selected for processing (6)
  • manifests/00-cluster-role.yaml
  • pkg/operator/controller/gateway-podmonitor/controller.go
  • pkg/operator/controller/gateway-podmonitor/podmonitor.go
  • pkg/operator/controller/names.go
  • pkg/operator/operator.go
  • test/e2e/gateway_api_test.go
✅ Files skipped from review due to trivial changes (1)
  • manifests/00-cluster-role.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/operator/controller/names.go
  • pkg/operator/operator.go
  • pkg/operator/controller/gateway-podmonitor/controller.go

📝 Walkthrough

Walkthrough

The pull request introduces a new unmanaged gateway-podmonitor controller that reconciles PodMonitor resources for each Gateway. The controller watches Gateway objects and PodMonitor resources in the operand namespace, mapping PodMonitor events to their owning Gateway. On reconciliation, it creates or updates a PodMonitor with configured metrics scraping on port http-envoy-prom at path /stats/prometheus. The controller is integrated into the operator's lifecycle management and includes corresponding RBAC permissions and e2e test coverage to validate PodMonitor creation and configuration.

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: implementing PodMonitor provisioning for Gateway API, which is the core objective of this PR. |
| Description check | ✅ Passed | The description comprehensively covers the changeset with detailed rationale, cardinality analysis, test plan, and implementation notes directly related to the PodMonitor controller addition. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All Ginkgo test declarations in gateway_api_test.go contain stable, deterministic names without dynamic content, fmt.Sprintf, string concatenation, or variable interpolation. |
| Test Structure And Quality | ✅ Passed | The assertGatewayPodMonitor helper function tests a single logical behavior with appropriate setup/cleanup patterns, proper timeouts, meaningful assertion messages, and follows existing codebase patterns. |
| Microshift Test Compatibility | ✅ Passed | New e2e test additions do not use MicroShift-incompatible APIs. PodMonitor resources from the monitoring.coreos.com/v1 API and standard Kubernetes APIs are used without OpenShift-specific dependencies. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | The assertGatewayPodMonitor test helper function does not make multi-node or HA cluster assumptions and works equally well in SNO environments. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR does not introduce topology-aware scheduling constraints; creates only configuration PodMonitors using standard label selectors. |
| Ote Binary Stdout Contract | ✅ Passed | Pull request introduces gateway-podmonitor controller code that properly adheres to the OTE Binary Stdout Contract, with all logging using the configured logf.Logger writing to stderr. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | The assertGatewayPodMonitor test helper contains no IPv4-specific assumptions, IP parsing logic, or external connectivity requirements, operating entirely within the Kubernetes cluster using standard client operations. |



openshift-ci Bot commented Apr 28, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign miciah for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/operator/operator.go (1)

358-375: ⚠️ Potential issue | 🟠 Major

Avoid one-shot startup for the PodMonitor dependent controller.

gatewayapi.ensureDependentControllers() starts dependents once and never retries after a Start() error. That worked for the existing Gateway API dependents because their prerequisite CRDs are created by the gatewayapi bootstrap path, but gateway-podmonitor adds a separate dependency on monitoring.coreos.com/v1/PodMonitor. If this controller fails to start during bootstrap, PodMonitor reconciliation stays off until the operator restarts.

Please gate this controller on PodMonitor CRD readiness or make dependent-controller startup retryable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/operator.go` around lines 358 - 375, The gateway-podmonitor
controller (created via gatewaypodmonitorcontroller.NewUnmanaged and passed into
the DependentControllers slice of gatewayapicontroller.New) may fail to start
one-shot if the monitoring.coreos.com/v1/PodMonitor CRD isn’t present; update
the startup logic so PodMonitor is gated on CRD readiness or make
ensureDependentControllers retryable: either add a CRD-ready check for
monitoring.coreos.com/v1/PodMonitor before registering/starting
gatewayPodMonitorController, or change gatewayapi.ensureDependentControllers to
attempt Start() with retry/backoff and continue retrying on transient Start()
errors so PodMonitor reconciliation comes up without requiring an operator
restart.
🧹 Nitpick comments (1)
test/e2e/gateway_api_test.go (1)

1647-1658: Assert the keep relabeling too.

The critical safeguard in this feature is the metricRelabelings keep rule. Right now the test still passes if a future change drops that filter and starts scraping the full Envoy metric set, which is exactly the failure mode called out in the PR description.

Suggested assertion
```diff
 	// Verify podMetricsEndpoints configuration.
 	endpoints, found, err := unstructured.NestedSlice(pm.Object, "spec", "podMetricsEndpoints")
 	if assert.NoError(t, err) && assert.True(t, found, "PodMonitor should have spec.podMetricsEndpoints") {
 		if assert.Len(t, endpoints, 1, "PodMonitor should have exactly one endpoint") {
 			ep, ok := endpoints[0].(map[string]interface{})
 			if assert.True(t, ok, "endpoint should be a map") {
 				assert.Equal(t, "/stats/prometheus", ep["path"], "endpoint path should be /stats/prometheus")
 				assert.Equal(t, "60s", ep["interval"], "endpoint interval should be 60s")
 				assert.Equal(t, "http-envoy-prom", ep["port"], "endpoint port should be http-envoy-prom")
+				relabelings, ok := ep["metricRelabelings"].([]interface{})
+				if assert.True(t, ok, "endpoint should define metricRelabelings") &&
+					assert.Len(t, relabelings, 1, "endpoint should have exactly one metricRelabeling") {
+					relabeling, ok := relabelings[0].(map[string]interface{})
+					if assert.True(t, ok, "metricRelabeling should be a map") {
+						assert.Equal(t, "keep", relabeling["action"], "metricRelabeling action should be keep")
+					}
+				}
 			}
 		}
 	}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/gateway_api_test.go` around lines 1647 - 1658, Add an assertion that
the endpoint's metricRelabelings includes a keep rule: after you obtain ep from
endpoints[0] (the existing variables pm.Object, endpoints and ep), fetch
ep["metricRelabelings"] as a slice, assert no error and that it exists, then
assert at least one entry is a map containing "action" == "keep" (and optionally
the expected "regex" key if you want to be stricter) so the test fails if the
keep relabeling is removed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/operator/controller/gateway-podmonitor/controller.go`:
- Around line 63-75: The mapper enqueueRequestForOwningGateway currently only
uses manifests.OwningGatewayLabel; update it to fall back to the object's owner
references when that label is absent or changed: if
labels[manifests.OwningGatewayLabel] is missing, iterate a.GetOwnerReferences(),
find an owner with Kind matching the Gateway kind (use the Gateway kind constant
if available or "Gateway"), take owner.Name and enqueue a reconcile.Request for
that name (use operatorcontroller.DefaultOperandNamespace for the gateway
namespace to match desiredGatewayPodMonitor behavior); keep the original
label-based path and only use the ownerRef fallback when label lookup fails.

---

Outside diff comments:
In `@pkg/operator/operator.go`:
- Around line 358-375: The gateway-podmonitor controller (created via
gatewaypodmonitorcontroller.NewUnmanaged and passed into the
DependentControllers slice of gatewayapicontroller.New) may fail to start
one-shot if the monitoring.coreos.com/v1/PodMonitor CRD isn’t present; update
the startup logic so PodMonitor is gated on CRD readiness or make
ensureDependentControllers retryable: either add a CRD-ready check for
monitoring.coreos.com/v1/PodMonitor before registering/starting
gatewayPodMonitorController, or change gatewayapi.ensureDependentControllers to
attempt Start() with retry/backoff and continue retrying on transient Start()
errors so PodMonitor reconciliation comes up without requiring an operator
restart.

---

Nitpick comments:
In `@test/e2e/gateway_api_test.go`:
- Around line 1647-1658: Add an assertion that the endpoint's metricRelabelings
includes a keep rule: after you obtain ep from endpoints[0] (the existing
variables pm.Object, endpoints and ep), fetch ep["metricRelabelings"] as a
slice, assert no error and that it exists, then assert at least one entry is a
map containing "action" == "keep" (and optionally the expected "regex" key if
you want to be stricter) so the test fails if the keep relabeling is removed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 7025f267-551c-4c6a-be65-faec8aa0a877

📥 Commits

Reviewing files that changed from the base of the PR and between 961ac21 and 5ceccfd.

📒 Files selected for processing (7)
  • manifests/01-role-binding.yaml
  • manifests/01-role.yaml
  • pkg/operator/controller/gateway-podmonitor/controller.go
  • pkg/operator/controller/gateway-podmonitor/podmonitor.go
  • pkg/operator/controller/names.go
  • pkg/operator/operator.go
  • test/e2e/gateway_api_test.go

Comment thread pkg/operator/controller/gateway-podmonitor/controller.go

rikatz commented Apr 28, 2026

The pre-merge findings do not make sense, as I am not changing openshift/origin tests.


rikatz commented Apr 28, 2026

Claude's answer on guarding the controller for the CRD and ensureDependentControllers:

Regarding the ensureDependentControllers CRD readiness finding — no fix is needed. Here's why:

  1. Pre-existing pattern: ensureDependentControllers uses a fire-and-forget Start() in goroutines for all
    dependent controllers (networkpolicy, status, labeler, DNS). The controllersStarted flag prevents retry after
    the first attempt. This is the established design, not something specific to our PodMonitor controller.
  2. monitoring.coreos.com CRDs are always present on OpenShift: The cluster-monitoring-operator is a required
    component and installs these CRDs. The existing ServiceMonitor code in
    pkg/operator/controller/ingress/monitoring.go already makes this same assumption — it creates
    monitoring.coreos.com/v1/ServiceMonitor resources without any CRD readiness gating.
  3. Scope: Changing ensureDependentControllers to be retryable or CRD-gated would affect all dependent
    controllers and is a broader architectural change beyond the scope of this PodMonitor addition.

All the other findings from the review have already been verified and fixed where needed (RBAC list/watch
verbs, %w error wrapping, AlreadyExists guard, unused struct fields, consistent cmp.Equal usage). The
implementation is complete.


rikatz commented Apr 30, 2026

image


rikatz commented Apr 30, 2026

image


rikatz commented Apr 30, 2026

/hold
holding for now, we need to discuss a better strategy to ship metrics on Gateways and also possible impacts

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 30, 2026

openshift-ci Bot commented Apr 30, 2026

@rikatz: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn | 7ae9aed | link | true | /test e2e-aws-ovn |
| ci/prow/e2e-aws-operator-techpreview | 7ae9aed | link | false | /test e2e-aws-operator-techpreview |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
