Commit b99a4de

Restructure Observability page with text between all images.
Add explanatory paragraphs between every screenshot: cross-cluster metrics flow, per-dashboard descriptions, Kiali topology detail, and Kafka Console multi-cluster explanation.

Signed-off-by: Maximiliano Pizarro <maximiliano.pizarro.5@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent bcd093a commit b99a4de

1 file changed

Lines changed: 36 additions & 16 deletions

content/patterns/hybrid-mesh-platform/observability.md

Original file line number | Diff line number | Diff line change
@@ -14,48 +14,68 @@ Grafana panels, Kiali graphs, and Kafka Console views help you confirm that fact
1414

1515
_Observability stack overview: Grafana for multi-cluster dashboards, Kiali for mesh topology, Kafka Console for streaming health, and OpenTelemetry for distributed traces._
1616

17-
[![Grafana multi-cluster dashboards](/images/hybrid-mesh-platform/product-grafana-observability.png)](/images/hybrid-mesh-platform/product-grafana-observability.png)
18-
19-
_Hub Grafana — fleet dashboards with datasources from hub Prometheus plus east/west via Skupper auth proxy._
17+
## How cross-cluster metrics flow
2018

21-
[![Kiali service mesh](/images/hybrid-mesh-platform/product-kiali-service-mesh.png)](/images/hybrid-mesh-platform/product-kiali-service-mesh.png)
19+
Spoke clusters scrape their own Prometheus metrics (Kafka JMX, ztunnel L4, workload RED signals). An **nginx auth proxy** in front of Thanos Querier injects a bearer token, and a **Skupper Connector** exposes it on port 9091. On the hub, Skupper **Listeners** `prometheus-east` and `prometheus-west` materialize as ClusterIP services that Grafana queries as plain HTTP datasources — no spoke tokens are stored on the hub.
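
To make the flow above concrete, here is a minimal Python sketch of its two ends: the header injection the spoke-side proxy performs, and the plain-HTTP query URL a hub-side client would call. The listener names come from the paragraph above. The hub-side port (assumed to match the connector's 9091), the function names, and the use of the standard Prometheus `/api/v1/query` endpoint are illustrative assumptions, not taken from the pattern's charts.

```python
from urllib.parse import urlencode

# Listener names are from the text; the hub-side port is an assumption
# (taken to match the connector's 9091).
LISTENERS = ("prometheus-east", "prometheus-west")
ASSUMED_PORT = 9091


def hub_query_url(listener: str, promql: str, port: int = ASSUMED_PORT) -> str:
    """Build the plain-HTTP instant-query URL a hub-side client would call."""
    return f"http://{listener}:{port}/api/v1/query?{urlencode({'query': promql})}"


def inject_bearer(headers: dict, token: str) -> dict:
    """Mimic the spoke-side proxy: add the Authorization header before forwarding."""
    forwarded = dict(headers)  # copy so the caller's headers stay untouched
    forwarded["Authorization"] = f"Bearer {token}"
    return forwarded


if __name__ == "__main__":
    for name in LISTENERS:
        print(hub_query_url(name, "up"))
```

The hub-side request carries no credentials. In this sketch the token only appears inside `inject_bearer`, which stands in for what happens on the spoke.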
2220

23-
_Kiali traffic graph — Service Mesh topology showing L4 ztunnel connections and L7 waypoint traffic between hub gateway and spoke services._
21+
[![Observability pipeline](/images/hybrid-mesh-platform/arch-observability-pipeline.png)](/images/hybrid-mesh-platform/arch-observability-pipeline.png)
2422

25-
[![Kafka Console](/images/hybrid-mesh-platform/product-kafka-console-amq-streams.png)](/images/hybrid-mesh-platform/product-kafka-console-amq-streams.png)
23+
_Architecture: spoke Prometheus → nginx auth proxy → Skupper connector → hub listener → Grafana datasource._
2624

27-
_Kafka Console — multi-cluster view of hub `prod-cluster` and spoke `dev-cluster` / `factory-cluster` topics over Skupper bootstrap._
25+
## Grafana — multi-cluster dashboards
2826

29-
[![Observability pipeline](/images/hybrid-mesh-platform/arch-observability-pipeline.png)](/images/hybrid-mesh-platform/arch-observability-pipeline.png)
27+
Hub Grafana aggregates three datasources: local hub Thanos plus `prometheus-east` and `prometheus-west` via Skupper. Dashboards are deployed by the `components/grafana-dashboards` chart.
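
As a rough illustration of the three-datasource layout, the Python sketch below builds a structure shaped like a Grafana datasource provisioning file. The field names (`apiVersion`, `datasources`, `name`, `type`, `access`, `url`, `isDefault`) follow Grafana's provisioning format. The hub URL, the `hub-thanos` name, and port 9091 on the spoke listeners are hypothetical placeholders, and the actual `components/grafana-dashboards` chart may define datasources differently.

```python
import json


def datasource(name: str, url: str, default: bool = False) -> dict:
    """One Prometheus-type datasource entry, queried by Grafana in proxy mode."""
    return {
        "name": name,
        "type": "prometheus",
        "access": "proxy",
        "url": url,
        "isDefault": default,
    }


def provisioning(hub_url: str, spoke_port: int = 9091) -> dict:
    """Hub datasource plus the two Skupper listener services (port is assumed)."""
    spokes = [
        datasource(name, f"http://{name}:{spoke_port}")
        for name in ("prometheus-east", "prometheus-west")
    ]
    hub = datasource("hub-thanos", hub_url, default=True)  # hypothetical name
    return {"apiVersion": 1, "datasources": [hub, *spokes]}


if __name__ == "__main__":
    # The hub URL below is a placeholder, not a value from the pattern.
    print(json.dumps(provisioning("http://hub-thanos.example:9091"), indent=2))
```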
3028

31-
_Architecture diagram: spoke Prometheus → nginx auth proxy → Skupper connector → hub listener → Grafana datasource. No bearer tokens stored on the hub._
29+
[![Grafana multi-cluster dashboards](/images/hybrid-mesh-platform/product-grafana-observability.png)](/images/hybrid-mesh-platform/product-grafana-observability.png)
3230

33-
## Grafana dashboard views
31+
_Hub Grafana landing — fleet dashboards organized by cluster and workload type._
3432

35-
Multi-cluster fleet dashboards on the hub (east/west traffic, Service Mesh L4/L7, Kafka health):
33+
The **east-west-traffic** dashboard shows Kafka broker health (gauges), leader/partition distribution (pie), and API request rates (bargauge) across both spokes:
3634

3735
[![Grafana — east-west traffic and Service Mesh](/images/hybrid-mesh-platform/product-grafana-observability-2.png)](/images/hybrid-mesh-platform/product-grafana-observability-2.png)
3836

39-
_East-west traffic dashboard: Kafka broker state gauges, leader/partition distribution pie charts, and API request bargauges per cluster._
37+
_East-west traffic: broker state gauges, leader distribution, and API request bargauges per cluster._
38+
39+
The **multi-cluster-istio** dashboard plots L4 ztunnel TCP connections, bytes timeseries, and cross-cluster error rates:
4040

4141
[![Grafana — multi-cluster Istio metrics (ztunnel L4)](/images/hybrid-mesh-platform/product-grafana-observability-3.png)](/images/hybrid-mesh-platform/product-grafana-observability-3.png)
4242

43-
_Multi-cluster Istio dashboard: L4 ztunnel TCP connections, bytes sent/received timeseries, and cross-cluster error rates._
43+
_Multi-cluster Istio: ztunnel TCP connections, bytes sent/received, error rate per cluster._
44+
45+
Extended KPI panels combine Kafka and mesh signals for a single operational health view:
4446

4547
[![Grafana — extended fleet KPI panels](/images/hybrid-mesh-platform/product-grafana-observability-4.png)](/images/hybrid-mesh-platform/product-grafana-observability-4.png)
4648

47-
_Extended fleet KPI panels: combined Kafka and mesh signals for operational health across all clusters._
49+
_Fleet KPI: combined Kafka + mesh health across all clusters._
50+
51+
## Kiali — mesh topology visualization
4852

49-
## Kiali and mesh topology views
53+
Each cluster runs Kiali with an OSSMConsole CR (OpenShift Console plugin). On the hub, Kiali shows multi-cluster topology using remote secrets — without requiring Istio multi-cluster trust federation. With ztunnel active, the graph shows L4 connections; L7 detail appears for paths routed through waypoints.
54+
55+
[![Kiali service mesh](/images/hybrid-mesh-platform/product-kiali-service-mesh.png)](/images/hybrid-mesh-platform/product-kiali-service-mesh.png)
56+
57+
_Kiali traffic graph: L4 ztunnel connections between hub gateway and spoke services._
5058

5159
[![Kiali — service mesh traffic graph](/images/hybrid-mesh-platform/product-kiali-service-mesh-2.png)](/images/hybrid-mesh-platform/product-kiali-service-mesh-2.png)
5260

53-
## Kafka Console views
61+
_Kiali detail view: per-service traffic rates, error percentages, and response time distributions._
62+
63+
## Kafka Console — streaming health across clusters
64+
65+
The Streams for Apache Kafka Console on the hub registers hub `prod-cluster` (full metrics) plus spoke `dev-cluster` and `factory-cluster` via Skupper bootstrap listeners. Operators can view topics, consumer groups, partitions, and broker status from a single UI.
66+
67+
[![Kafka Console](/images/hybrid-mesh-platform/product-kafka-console-amq-streams.png)](/images/hybrid-mesh-platform/product-kafka-console-amq-streams.png)
68+
69+
_Kafka Console landing: five registered clusters (hub + east/west × dev/factory)._
5470

5571
[![Kafka Console — multi-cluster clusters and topics](/images/hybrid-mesh-platform/product-kafka-console-amq-streams-2.png)](/images/hybrid-mesh-platform/product-kafka-console-amq-streams-2.png)
5672

73+
_Cluster detail: topics, partitions, and replicas per spoke Kafka cluster over Skupper._
74+
5775
[![Kafka Console — broker and topic detail over Skupper](/images/hybrid-mesh-platform/product-kafka-console-amq-streams-3.png)](/images/hybrid-mesh-platform/product-kafka-console-amq-streams-3.png)
5876

77+
_Broker and topic metrics: producer/consumer rates, lag, and partition leadership distribution._
78+
5979
## Observability architecture
6080

6181
| Layer | Technology | Role |
