Kyma Telemetry auto-wiring

When the Kyma Telemetry module is present on the cluster, automatically create a MetricPipeline CR that scrapes DCGM metrics from the gpu-operator namespace and forwards them to the user's existing metrics backend.

NVIDIA DCGM Exporter already ships GPU metrics (utilization, memory, temperature) inside the gpu-operator namespace. Today they go nowhere. With this feature, enabling the GPU module is enough - metrics appear in the user's dashboards automatically with zero additional configuration.



How

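1. On every reconcile, check whether the MetricPipeline CRD exists on the cluster. Its presence is the signal that Telemetry is installed.
2. If the CRD is present, create a MetricPipeline that targets the gpu-operator namespace and forwards to Kyma's internal OTel Collector. If it is absent, skip silently and check again on the next reconcile.
3. Delete the pipeline when the GPU module is deleted.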
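
The steps above can be sketched as a pure decision function that a reconciler would call once per pass. This is a minimal illustration, not the module's actual code: the `ClusterState` struct, its field names, and the `Action` values are all hypothetical, and the real controller would fill the state from API lookups (CRD discovery, a Get on the pipeline, the module's deletion timestamp).

```go
package main

import "fmt"

// Action is what a single reconcile pass should do with the MetricPipeline.
// All names in this sketch are hypothetical.
type Action string

const (
	ActionNone   Action = "none"   // desired state already reached
	ActionSkip   Action = "skip"   // Telemetry absent: no error, retry on next reconcile
	ActionCreate Action = "create" // Telemetry present, pipeline missing
	ActionDelete Action = "delete" // GPU module is being deleted, clean up
)

// ClusterState is an assumed snapshot gathered at the start of a reconcile.
type ClusterState struct {
	MetricPipelineCRDPresent bool // step 1: does the CRD exist?
	PipelineExists           bool // did we already create our pipeline?
	GPUModuleDeleting        bool // is the GPU module being removed?
}

// Decide maps the observed state to one action. Deletion is checked first
// so that cleanup wins over creation.
func Decide(s ClusterState) Action {
	switch {
	case s.GPUModuleDeleting && s.PipelineExists:
		return ActionDelete
	case s.GPUModuleDeleting:
		return ActionNone
	case !s.MetricPipelineCRDPresent:
		return ActionSkip
	case !s.PipelineExists:
		return ActionCreate
	default:
		return ActionNone
	}
}

func main() {
	// Telemetry installed after the GPU module: the first pass skips,
	// a later pass (CRD now present) creates the pipeline.
	fmt.Println(Decide(ClusterState{}))
	fmt.Println(Decide(ClusterState{MetricPipelineCRDPresent: true}))
}
```

Keeping the decision separate from the API calls makes each acceptance criterion checkable with a plain unit test, without a cluster.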

Open question

Confirm the internal Telemetry OTel Collector service endpoint. The pipeline output points at this endpoint so that metrics flow to whatever backend the user already configured in Telemetry.
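
To make the open question concrete, here is a rough sketch of the object the controller might create, with the endpoint left as a parameter. The field layout (`spec.input.prometheus`, `spec.output.otlp.endpoint.value`) and the `telemetry.kyma-project.io/v1alpha1` API version reflect my understanding of the Telemetry module's MetricPipeline API and should be checked against the CRD version installed on the cluster. The pipeline name and the endpoint URL below are placeholders, not confirmed values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BuildMetricPipeline returns an untyped MetricPipeline object.
// Assumption: the field names follow the v1alpha1 MetricPipeline schema;
// verify them against the installed CRD before relying on this.
func BuildMetricPipeline(name, namespace, endpoint string) map[string]any {
	return map[string]any{
		"apiVersion": "telemetry.kyma-project.io/v1alpha1",
		"kind":       "MetricPipeline",
		"metadata":   map[string]any{"name": name},
		"spec": map[string]any{
			"input": map[string]any{
				"prometheus": map[string]any{
					"enabled": true,
					// Restrict scraping to the namespace that hosts DCGM Exporter.
					"namespaces": map[string]any{"include": []string{namespace}},
				},
			},
			"output": map[string]any{
				"otlp": map[string]any{
					// The open question: which collector endpoint goes here.
					"endpoint": map[string]any{"value": endpoint},
				},
			},
		},
	}
}

func main() {
	// Placeholder name and endpoint, for illustration only.
	p := BuildMetricPipeline("gpu-dcgm-metrics", "gpu-operator", "http://collector.example.invalid:4317")
	out, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Passing the endpoint in as an argument keeps the unresolved value in one place, so it can be swapped once the service address is confirmed.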

Acceptance criteria

- Telemetry not installed -> no error, no pipeline, silent skip
- Telemetry installed -> MetricPipeline created automatically on GPU module install
- GPU module deleted -> MetricPipeline cleaned up
- Telemetry installed after GPU module -> next reconcile picks it up
- GPU metrics visible in user's backend with no manual steps

Kyma Telemetry auto-wiring #70
