content/patterns/hybrid-mesh-platform/observability.md
36 additions & 16 deletions
@@ -14,48 +14,68 @@ Grafana panels, Kiali graphs, and Kafka Console views help you confirm that fact
_Observability stack overview: Grafana for multi-cluster dashboards, Kiali for mesh topology, Kafka Console for streaming health, and OpenTelemetry for distributed traces._
_Hub Grafana — fleet dashboards with datasources from hub Prometheus plus east/west via Skupper auth proxy._
+## How cross-cluster metrics flow
-[](/images/hybrid-mesh-platform/product-kiali-service-mesh.png)
+Spoke clusters scrape their own Prometheus metrics (Kafka JMX, ztunnel L4, workload RED signals). An **nginx auth proxy** in front of Thanos Querier injects a bearer token, and a **Skupper Connector** exposes it on port 9091. On the hub, Skupper **Listeners** `prometheus-east` and `prometheus-west` materialize as ClusterIP services that Grafana queries as plain HTTP datasources, so no spoke tokens are stored on the hub.
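
As a rough illustration of what a "plain HTTP datasource" means here, the sketch below builds Prometheus HTTP API instant-query URLs for the two hub listeners. It assumes the hub-side listener services use the same port as the connector (9091) and resolve by their short names from the querying pod; the text confirms neither detail, and `instant_query_url` and `fleet_query_urls` are hypothetical helpers.

```python
from urllib.parse import urlencode

# Listener names come from the text above. The port is an assumption:
# the text only states that the spoke-side connector uses 9091.
LISTENERS = ("prometheus-east", "prometheus-west")
ASSUMED_PORT = 9091


def instant_query_url(listener: str, promql: str) -> str:
    """Build a Prometheus HTTP API instant-query URL for one hub listener."""
    if listener not in LISTENERS:
        raise ValueError(f"unknown listener: {listener}")
    # Plain HTTP with no Authorization header: the bearer token is injected
    # by the nginx proxy on the spoke side, not by the caller on the hub.
    query_string = urlencode({"query": promql})
    return f"http://{listener}:{ASSUMED_PORT}/api/v1/query?{query_string}"


def fleet_query_urls(promql: str) -> dict:
    """Return one query URL per spoke listener for the same PromQL expression."""
    return {name: instant_query_url(name, promql) for name in LISTENERS}


if __name__ == "__main__":
    for name, url in fleet_query_urls("up").items():
        print(name, url)
```

The URLs carry no credentials, which is the point of the design: anything on the hub that can reach the listener service can read spoke metrics, so access to those services is what needs protecting.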
-_Kiali traffic graph: Service Mesh topology showing L4 ztunnel connections and L7 waypoint traffic between hub gateway and spoke services._
Hub Grafana aggregates three datasources: local hub Thanos plus `prometheus-east` and `prometheus-west` via Skupper. Dashboards are deployed by the `components/grafana-dashboards` chart.
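
The three-datasource layout can be sketched as a Grafana provisioning document. This is a minimal sketch, not the chart's actual configuration: the hub Thanos URL, the datasource name `hub-thanos`, the choice of default datasource, and the spoke listener port are all assumptions made for illustration.

```python
import json

# Hypothetical URL for the hub's local Thanos Querier; the real value
# depends on the cluster and is not given in the text.
ASSUMED_HUB_THANOS_URL = "https://thanos-querier.example.svc:9091"
SPOKE_LISTENERS = ("prometheus-east", "prometheus-west")


def datasource(name: str, url: str, is_default: bool = False) -> dict:
    """One entry in Grafana's datasource provisioning format."""
    return {
        "name": name,
        "type": "prometheus",
        "access": "proxy",  # Grafana's backend issues the queries
        "url": url,
        "isDefault": is_default,
    }


def fleet_datasources(hub_url: str = ASSUMED_HUB_THANOS_URL,
                      spoke_port: int = 9091) -> dict:
    """Hub Thanos plus one plain-HTTP datasource per Skupper listener."""
    spokes = [datasource(name, f"http://{name}:{spoke_port}")
              for name in SPOKE_LISTENERS]
    return {
        "apiVersion": 1,
        "datasources": [datasource("hub-thanos", hub_url, True), *spokes],
    }


if __name__ == "__main__":
    print(json.dumps(fleet_datasources(), indent=2))
```

The spoke entries need no authentication settings because the token is added on the spoke side; how the hub datasource authenticates to its local Thanos is outside what the text describes.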
-_Architecture diagram: spoke Prometheus → nginx auth proxy → Skupper connector → hub listener → Grafana datasource. No bearer tokens stored on the hub._
_Hub Grafana landing — fleet dashboards organized by cluster and workload type._
-Multi-cluster fleet dashboards on the hub (east/west traffic, Service Mesh L4/L7, Kafka health):
+The **east-west-traffic** dashboard shows Kafka broker health (gauges), leader/partition distribution (pie), and API request rates (bargauge) across both spokes:
[](/images/hybrid-mesh-platform/product-grafana-observability-2.png)
-_East-west traffic dashboard: Kafka broker state gauges, leader/partition distribution pie charts, and API request bargauges per cluster._
+_East-west traffic: broker state gauges, leader distribution, and API request bargauges per cluster._
+
+The **multi-cluster-istio** dashboard plots L4 ztunnel TCP connections, bytes timeseries, and cross-cluster error rates:
_Extended fleet KPI panels: combined Kafka and mesh signals for operational health across all clusters._
+_Fleet KPI: combined Kafka + mesh health across all clusters._
+
+## Kiali: mesh topology visualization
-## Kiali and mesh topology views
+Each cluster runs Kiali with an OSSMConsole CR (OpenShift Console plugin). On the hub, Kiali shows multi-cluster topology using remote secrets, without requiring Istio multi-cluster trust federation. With ztunnel active, the graph shows L4 connections; L7 detail appears only for paths routed through waypoints.
+
+[](/images/hybrid-mesh-platform/product-kiali-service-mesh.png)
+
+_Kiali traffic graph: L4 ztunnel connections between hub gateway and spoke services._
[](/images/hybrid-mesh-platform/product-kiali-service-mesh-2.png)
-## Kafka Console views
+_Kiali detail view: per-service traffic rates, error percentages, and response time distributions._
+
+## Kafka Console: streaming health across clusters
+
+The Streams for Apache Kafka Console on the hub registers the hub's `prod-cluster` (full metrics) plus the spoke clusters `dev-cluster` and `factory-cluster` via Skupper bootstrap listeners. Operators can view topics, consumer groups, partitions, and broker status from a single UI.
[](/images/hybrid-mesh-platform/product-kafka-console-amq-streams-2.png)
+_Cluster detail: topics, partitions, and replicas per spoke Kafka cluster over Skupper._
+
[](/images/hybrid-mesh-platform/product-kafka-console-amq-streams-3.png)
+_Broker and topic metrics: producer/consumer rates, lag, and partition leadership distribution._