Motivation
The operator reconciles RestateCluster CRs and has access to their full parsed spec during each reconcile. Today, none of this configuration is exposed as Prometheus metrics. Operators who want to understand how their clusters are configured — retention durations, resource limits, concurrency settings, partition counts — have to query each CR individually via kubectl.
Exposing the effective configuration as Prometheus gauges makes this information queryable, alertable, and dashboardable alongside the runtime metrics that Restate already emits.
Proposal
Add per-RestateCluster Prometheus gauges, updated during each successful reconcile. All gauges are labelled by namespace and are removed when the CR is deleted (via the finalizer).
Configuration gauges (parsed from spec.compute.env[] and resource fields):
| Metric | Source | Notes |
| --- | --- | --- |
| restate_operator_cluster_journal_retention_seconds | RESTATE_DEFAULT_JOURNAL_RETENTION | Duration string → seconds |
| restate_operator_cluster_journal_retention_max_seconds | RESTATE_MAX_JOURNAL_RETENTION | Duration string → seconds |
| restate_operator_cluster_concurrency_limit | RESTATE_WORKER__INVOKER__CONCURRENT_INVOCATIONS_LIMIT | |
| restate_operator_cluster_throttle_rate | RESTATE_WORKER__INVOKER__ACTION_THROTTLING__RATE | |
| restate_operator_cluster_throttle_capacity | RESTATE_WORKER__INVOKER__ACTION_THROTTLING__CAPACITY | |
| restate_operator_cluster_compaction_threads | RESTATE_STORAGE_LOW_PRIORITY_BG_THREADS | |
| restate_operator_cluster_rocksdb_cache_bytes | RESTATE_ROCKSDB_TOTAL_MEMORY_SIZE | Quantity string → bytes |
| restate_operator_cluster_query_engine_memory_bytes | RESTATE_ADMIN__QUERY_ENGINE__MEMORY_SIZE | Quantity string → bytes |
| restate_operator_cluster_query_engine_parallelism | RESTATE_ADMIN__QUERY_ENGINE__QUERY_PARALLELISM | |
| restate_operator_cluster_num_partitions | RESTATE_DEFAULT_NUM_PARTITIONS | Immutable after creation |
| restate_operator_cluster_cpu_limit_millicores | spec.compute.resources.limits.cpu | Quantity string → millicores |
| restate_operator_cluster_memory_limit_bytes | spec.compute.resources.limits.memory | Quantity string → bytes |
| restate_operator_cluster_storage_request_bytes | spec.storage.storageRequestBytes | Already a number |
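The "Quantity string → bytes" and "Quantity string → millicores" conversions could look roughly like the following. This is a minimal std-only sketch, not the operator's actual code: the function names are hypothetical, only a subset of the Kubernetes suffixes is handled (no exponent notation, no `n`/`u`/`P`/`E`), and it assumes the env var values use the same suffix grammar as the resource fields, which has not been verified here.

```rust
/// Hypothetical helper: parses a quantity such as "4Gi", "100m" or "2"
/// into a plain number in base units (bytes for memory, cores for CPU).
/// Returns None for anything it does not recognise.
fn parse_quantity(s: &str) -> Option<f64> {
    let s = s.trim();
    // The numeric part runs until the first character that is not a digit or '.'.
    let split = s
        .find(|c: char| !(c.is_ascii_digit() || c == '.'))
        .unwrap_or(s.len());
    let (num, suffix) = s.split_at(split);
    let value: f64 = num.parse().ok()?;
    let multiplier = match suffix {
        "" => 1.0,
        "m" => 1e-3, // milli, e.g. "100m" CPU
        "k" => 1e3,
        "M" => 1e6,
        "G" => 1e9,
        "T" => 1e12,
        "Ki" => 1024.0,
        "Mi" => 1024.0 * 1024.0,
        "Gi" => 1024.0 * 1024.0 * 1024.0,
        "Ti" => 1024.0 * 1024.0 * 1024.0 * 1024.0,
        _ => return None, // unsupported suffix: skip the metric rather than guess
    };
    Some(value * multiplier)
}

/// Hypothetical helper: CPU quantity ("2", "500m") expressed in millicores.
fn quantity_to_millicores(s: &str) -> Option<f64> {
    parse_quantity(s).map(|cores| cores * 1000.0)
}
```

With this sketch, `parse_quantity("4Gi")` yields the byte count for a memory limit and `quantity_to_millicores("2")` yields the CPU limit gauge value. Returning `None` on unknown input keeps the "metric not emitted" behaviour instead of reporting a misleading 0.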
Info metric (string labels):
| Metric | Labels |
| --- | --- |
| restate_operator_cluster_info | namespace, image, replicas |
Status gauges:
| Metric | Source |
| --- | --- |
| restate_operator_cluster_ready | status.conditions[type=Ready].status as 1/0 |
| restate_operator_cluster_generation_drift | metadata.generation - status.observedGeneration |
A generation_drift value that stays non-zero for more than a few seconds means a spec change hasn't been reconciled: it is either still pending or failing silently. This complements the Ready condition, which may stay True during non-disruptive spec changes that the operator hasn't picked up yet.
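The two status gauges reduce to small pure functions. The sketch below uses hypothetical function names and plain `Option` inputs in place of the real CR types. Not emitting the drift gauge when either field is absent is an assumption, not something this proposal specifies.

```rust
/// Hypothetical helper: maps the Ready condition's status string to 1/0.
/// Anything other than "True" (including "Unknown" or a missing condition)
/// is reported as 0.
fn ready_gauge(status: Option<&str>) -> f64 {
    match status {
        Some("True") => 1.0,
        _ => 0.0,
    }
}

/// Hypothetical helper: metadata.generation - status.observedGeneration.
/// Assumption: if either field is missing, return None so that no
/// sample is emitted.
fn generation_drift(generation: Option<i64>, observed: Option<i64>) -> Option<i64> {
    Some(generation? - observed?)
}
```

For example, `generation_drift(Some(5), Some(3))` reports a drift of 2, and an alert on the gauge staying above 0 for some minutes would surface a stuck reconcile.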
These metrics give the operator the same level of configuration visibility that the Restate server provides for runtime state — but for the cluster's declared configuration.
Why this is valuable
Alert correlation. Write alerts that reference configuration alongside runtime behaviour:
- "Storage > 80% AND journal_retention > 86400s" → retention is likely the cause, not unbounded growth
- "Memory > 90% AND concurrency_limit > 500" → fan-out is driving memory pressure
- "CPU > 75% AND compaction_threads < 4 AND compaction_pending > 0" → compaction is bottlenecked
Today these correlations require a human to cross-reference kubectl with Grafana.
Fleet dashboards. A single Grafana table query shows every cluster's effective configuration. Useful for:
- Which clusters are on non-default retention?
- What's the distribution of concurrency limits across the fleet?
- Are any clusters still on an old image version?
- Which clusters have single-digit partition counts (potential hot-partition risk)?
Version management. Alert on clusters running outdated images: restate_operator_cluster_info{image!~".*:1\\.6\\..*"} finds clusters not on the current release. Useful for tracking rollout progress or finding stragglers.
Capacity planning. Aggregate resource allocation across the fleet: total CPU limits, total storage requested, distribution of memory limits by tier. All of this is queryable without kubectl.
Incident triage. When investigating a customer issue, the on-call engineer can query the cluster's full configuration from Grafana without needing kubectl access or a port-forward. Reduces time-to-context during incidents.
Prior art
Flux CD's controllers emit reconciliation and configuration metrics for their CRs (GitRepository, Kustomization, HelmRelease). This gives operators fleet-wide visibility via Prometheus without needing to query individual resources. The pattern is: controllers emit what they know about the resources they manage.
Implementation notes
- The prometheus crate and /metrics endpoint are already wired up (src/metrics.rs)
- Use GaugeVec with a namespace label for per-CR metrics
- Parse values during the reconcile loop (spec is already deserialized)
- On CR deletion (finalizer path), call remove_label_values to avoid stale series
- Duration parsing: Restate durations look like "24h" or "1 week" (human-readable rather than strictly Go-style, since Go's duration parser has no week unit): humantime crate or a small parser
- Quantity parsing: K8s resource quantities ("4Gi", "100m"): k8s-openapi (its Quantity type only wraps the raw string, so a parsing step is still needed) or a small parser
- Env var extraction: iterate spec.compute.env once per reconcile, match by name, parse values. Missing env vars → metric not emitted (not 0)
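The env var extraction and duration parsing steps above can be sketched as follows, using only the standard library. `EnvVar` is a hypothetical stand-in for an entry of spec.compute.env and the function names are illustrative. The "small parser" here handles a single number plus unit only (no compound values such as "1h 30m"), and only three of the proposed metrics are wired up, to keep the example short.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one entry of spec.compute.env.
/// `value` is None when the variable is sourced indirectly (e.g. valueFrom).
struct EnvVar {
    name: String,
    value: Option<String>,
}

/// Parses a single-unit duration such as "24h", "90s" or "1 week" into seconds.
fn parse_duration_secs(s: &str) -> Option<f64> {
    let s = s.trim();
    let split = s
        .find(|c: char| !(c.is_ascii_digit() || c == '.'))
        .unwrap_or(s.len());
    let (num, unit) = s.split_at(split);
    let value: f64 = num.parse().ok()?;
    let secs_per_unit = match unit.trim() {
        "s" | "sec" | "secs" | "second" | "seconds" => 1.0,
        "m" | "min" | "mins" | "minute" | "minutes" => 60.0,
        "h" | "hour" | "hours" => 3_600.0,
        "d" | "day" | "days" => 86_400.0,
        "w" | "week" | "weeks" => 604_800.0,
        _ => return None,
    };
    Some(value * secs_per_unit)
}

/// Walks the env list once and returns metric name → value.
/// Variables that are missing or fail to parse produce no entry at all,
/// so the corresponding gauge is simply not emitted.
fn config_gauges(env: &[EnvVar]) -> HashMap<&'static str, f64> {
    let mut out = HashMap::new();
    for var in env {
        let raw = match var.value.as_deref() {
            Some(r) => r,
            None => continue, // no literal value to parse
        };
        let parsed = match var.name.as_str() {
            "RESTATE_DEFAULT_JOURNAL_RETENTION" => parse_duration_secs(raw)
                .map(|v| ("restate_operator_cluster_journal_retention_seconds", v)),
            "RESTATE_WORKER__INVOKER__CONCURRENT_INVOCATIONS_LIMIT" => raw
                .trim()
                .parse::<u64>()
                .ok()
                .map(|v| ("restate_operator_cluster_concurrency_limit", v as f64)),
            "RESTATE_DEFAULT_NUM_PARTITIONS" => raw
                .trim()
                .parse::<u64>()
                .ok()
                .map(|v| ("restate_operator_cluster_num_partitions", v as f64)),
            _ => None,
        };
        if let Some((metric, value)) = parsed {
            out.insert(metric, value);
        }
    }
    out
}
```

In the reconcile loop, each entry of the returned map would be written to the matching GaugeVec under the cluster's namespace label. Any metric absent from the map for that cluster would need its label values removed, so that a variable dropped from the spec does not leave a stale series behind.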