
GTI-686: normalize Redis Cluster zone values in Region #25

Merged

ymendez-redis merged 1 commit into main from GTI-686/fix-cluster-zone-region-grouping on Apr 27, 2026

Conversation

@ymendez-redis
Collaborator

Problem

GCP exposes Redis Cluster and Valkey metrics under node-level monitored resources, redis.googleapis.com/ClusterNode and memorystore.googleapis.com/InstanceNode, whose location label is a zone (for example us-central1-b), not a region.

memorystore.py was writing that value verbatim into the Region column, so a 4-node cluster spread across 3 zones produced 3 distinct Region values for what is actually a single regional cluster.

Downstream, redis2re groups by Region when sizing, so the same cluster surfaced as multiple duplicate rows in the Salesforce sizing CSV, one per zone, with throughput split across them.

Standalone Redis (redis_instance) is unaffected because it exposes region explicitly. The bug only appears on cluster / Valkey node-level metrics.

Fix

  • Added _normalize_location(value) -> (region, zone) to detect zone-shaped GCP location strings using ^([a-z]+(?:-[a-z]+)+\d+)-[a-z]$ and split them into region and zone components (see the sketch after this list).
  • Region-shaped inputs such as us-east4, plus values like us, global, and empty strings, pass through unchanged as the region with an empty zone.
  • Applied the helper in _accumulate_commands when resolving REGION_LABELS.
  • Kept explicit zone labels as higher precedence when present, preserving existing behavior for resources that already populate both fields.
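
A minimal sketch of the helper, assuming the signature described above; the regex is the one quoted in the first bullet, while the function body and the module-level constant name are illustrative rather than copied from the diff:

```python
import re

# Zone-shaped GCP locations are "<multi-segment region><digits>-<single letter>",
# e.g. "us-central1-b"; group 1 captures the parent region ("us-central1").
_ZONE_RE = re.compile(r"^([a-z]+(?:-[a-z]+)+\d+)-[a-z]$")


def _normalize_location(value):
    """Split a GCP location string into (region, zone).

    Zone-shaped values ("us-central1-b") become ("us-central1", "us-central1-b").
    Region-shaped or degenerate values ("us-east4", "us", "global", "") pass
    through unchanged as the region, with an empty zone.
    """
    value = (value or "").strip()
    match = _ZONE_RE.match(value)
    if match:
        return match.group(1), value
    return value, ""
```

In _accumulate_commands the derived pair only fills in when no explicit zone label is present, per the last bullet above.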

The regex was validated against the current GCP namespace:

  • 130/130 current zones matched and mapped to the correct parent region
  • 43/43 current regions were correctly rejected

The multi-segment form (?:-[a-z]+)+ also supports hypothetical 3+ word-segment regions without changing behavior for names currently in use.
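
For reference, a spot check along the same lines can be run as below; the zone and region samples are a small hypothetical subset standing in for the full 130-zone / 43-region catalog used in the actual validation:

```python
# Hypothetical spot check reusing _normalize_location from the sketch above;
# the real validation ran over every current GCP zone and region.
sample_zones = {
    "us-central1-b": "us-central1",
    "europe-west4-a": "europe-west4",
    "asia-southeast1-c": "asia-southeast1",
}
sample_regions = ["us-east4", "us-central1", "europe-west4", "us", "global"]

for zone, parent_region in sample_zones.items():
    assert _normalize_location(zone) == (parent_region, zone), zone

for region in sample_regions:
    assert _normalize_location(region) == (region, ""), region
```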

Verification

Reproduced end-to-end against a live multi-zone cluster, redislabs-sales-pivotal/memorystore-redis-cluster, with 2 shards × 2 replicas across us-central1-{b,c,f} after generating mixed-command traffic.

| Stage | Before fix | After fix |
| --- | --- | --- |
| memorystore.py Region column | us-central1-b, us-central1-c, us-central1-f | us-central1 for all 4 nodes |
| memorystore.py Zone column | empty | per-node zone (-b / -c / -f) |
| redis2re sizing rows for the cluster | 3 duplicates (0.01 + 0.01 + 0.10 ops) | 1 consolidated row (0.04 ops, throughput aggregated correctly) |

Verification artifacts, including memorystore.py CSV before/after and redis2re sizing CSV before/after, are attached to the Jira ticket.

Tests

Added in test_msstats.py:

  • test_normalize_location_zone_form — splits us-central1-a into ("us-central1", "us-central1-a"); see the sketch after this list
  • test_normalize_location_region_form — passes us-east4, us, global, and "" through unchanged
  • test_accumulate_commands_cluster_node_splits_zone_from_loc — drives _accumulate_commands with a synthetic ClusterNode time series carrying a zone-shaped location and asserts the resulting Region / Zone split
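
As an illustration, the first two tests could look roughly like this; the import path and exact assertions are assumptions, not the literal contents of test_msstats.py:

```python
import pytest

# Assumed import path; the helper lives in memorystore.py per the description above.
from memorystore import _normalize_location


def test_normalize_location_zone_form():
    # Zone-shaped locations split into (parent region, full zone string).
    assert _normalize_location("us-central1-a") == ("us-central1", "us-central1-a")


@pytest.mark.parametrize("value", ["us-east4", "us", "global", ""])
def test_normalize_location_region_form(value):
    # Region-shaped and degenerate values pass through as the region, zone empty.
    assert _normalize_location(value) == (value, "")
```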

Validation:

  • pytest test_msstats.py -> 27/27 passing
  • black --check clean

Redis Cluster node-level metrics expose a `location` label whose value
is a zone (e.g. us-central1-a). That zone was being written verbatim
into the Region column, which caused downstream redis2re grouping to
emit one row per zone per cluster.

Add _normalize_location() to split a GCP location string into its
(region, zone) components and apply it in _accumulate_commands.
Explicit `zone` labels (Valkey, Redis standalone with a zone metric
label) keep their precedence over the derived value. Region-shaped
strings (e.g. us-east4) are passed through unchanged.
ymendez-redis merged commit 207df46 into main on Apr 27, 2026
6 checks passed