feat(gateway): URL-stable identity-based cluster naming for API-level upstreams#2137
Draft
mehara-rothila wants to merge 2 commits into
Draft
feat(gateway): URL-stable identity-based cluster naming for API-level upstreams#2137mehara-rothila wants to merge 2 commits into
mehara-rothila wants to merge 2 commits into
Conversation
Contributor
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
812acc1 to
534247e
Compare
… upstreams Derive API-level main/sandbox cluster names from sha256(apiID) instead of the backend URL, in both the Envoy xDS translator and the RuntimeDeployConfig transform path. The final name shape is main_<16hex> / sandbox_<16hex>; an API's main and sandbox clusters share the hash fragment (the env prefix provides the distinction), so an operator can pair an API's clusters at a glance in stats and config dumps. Because the name no longer encodes the URL, a host, port, or scheme edit is delivered to Envoy as an update to the same named cluster, which Envoy warms and swaps, instead of removing one cluster name and adding another, and a path-only edit touches just the route rewrite: routes never repoint, name-keyed stats stay continuous, and no request failures were observed while updating under continuous traffic. ClusterKey and EnvoyClusterName are unified to the same string so the policy engine's default upstream cluster always resolves to a real Envoy cluster, and the same API's main and sandbox upstreams can no longer collide into one cluster when they share a URL. Cross-API name uniqueness rests on the 64-bit hash truncation (collision-resistant, not impossible). The integration test asserts cluster identity through the Envoy admin endpoint: identity-named clusters present, URL-derived names absent, and the exact cluster-name set unchanged across host-changing URL updates (a new capture/compare step pair), so the scenarios fail under the previous naming scheme. Rollout note: on upgrade, each existing API's main/sandbox clusters are renamed once on their next translation, so per-cluster connection pools and upstream_cluster metric labels churn once; dashboards keyed on the old cluster_<scheme>_<host> names need a one-time update. The data-model spec doc is updated to describe the new naming.
534247e to
a99b916
Compare
Contributor
Author
|
@coderabbitai full review |
Contributor
✅ Action performedFull review finished. |
Both xDS builders now build "<env>_<hash>" through clusterkey.APILevelName so the two paths cannot drift. Add known-answer vectors pinning SHA-256[:8].
5c3aed7 to
840cb13
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Changes how the gateway names the Envoy clusters for an API's main and sandbox upstreams. Cluster names are now derived from the API's identity,
main_<hash>/sandbox_<hash>where the hash is the first 8 bytes of sha256(apiID|env) in hex, instead of from the backend URL (cluster_<scheme>_<sanitized-host>). Both xDS builders (the Envoy xDS translator and the RuntimeDeployConfig transform path) derive the name through one shared helper, so they can never disagree.Why
With URL-derived names, editing a backend URL renames the cluster: the old cluster is destroyed, a new one is created, and routes are repointed, draining in-flight connections on every URL change. With identity-derived names the cluster name never changes for the lifetime of the API, so a URL edit propagates as an endpoint update on the same cluster and live traffic is not dropped. The name carries no URL information, so host, port, scheme, and path edits are all equally safe.
Design decisions
Identity in, URL out. The hash input is apiID plus env (main or sandbox) only. Anything URL-derived in the name would defeat the purpose, since some class of URL edit would still rename the cluster.
One naming helper for both builders. The gateway has two builders that must agree on cluster names (the main Envoy xDS translator and the policy-side RuntimeDeployConfig transform). Both now call
pkg/utils/clusterkey, replacing the two hand-rolled sanitize functions that had to be kept in sync manually.ClusterKey and EnvoyClusterName unified. In the RDC path the internal cluster key and the Envoy-facing cluster name are now the same string. If they differ, the policy engine's default upstream cluster metadata points at a name Envoy does not know, which surfaces as 503 NoRoute.
Latent collision fixed. Under URL-derived naming, a main and sandbox upstream pointing at the same URL collided into one cluster, and two different APIs sharing a backend URL shared one cluster. With env in the hash input main and sandbox are always distinct, and each API owns its clusters.
Migration notes (one-time effects on existing deployments)
cluster_<scheme>_<host>names in theupstream_clusterlabel need a one-time update.What changed
pkg/utils/clusterkey(new):APILevel(apiID, env)naming helper plus unit tests.pkg/xds/translator.go:resolveUpstreamClustertakes the API id and derives the hash name;sanitizeClusterNameremoved.pkg/transform/restapi.go:addUpstreamClusterderives the same hash name;EnvoyClusterNameset to the cluster key;sanitizeEnvoyClusterNameremoved.cmd/controller/main.go: comment documenting that both xDS paths must keep cluster-name derivation in sync (plus an import-order fix in the touched file).gateway/spec/impls/1-basic-gateway-with-controller/data-model.md: cluster-naming section updated from the old URL-sanitized scheme to the identity-derived scheme.Test coverage
clusterkeytests; translator tests assert the hash name shape and URL-independence (TestResolveUpstreamCluster_DedupSameAPIDifferentURLs,TestResolveUpstreamCluster_MainSandboxNeverCollide); transformer tests assert the name shape on routes and clusters, that a route's default cluster always references a registered cluster (TestRestAPITransformer_APILevelDefaultClusterMatchesRealCluster), and that a URL edit does not rename the cluster (TestRestAPITransformer_APILevelURLStableAcrossURLEdit).api-level-url-stable.featurewith three scenarios: a main upstream URL update that changes both the host and the path (the host change is what URL-derived naming could not survive), a sandbox upstream URL update, and default cluster resolution under cluster_header routing. The main and sandbox scenarios also assert cluster identity through the Envoy admin endpoint (identity-named clusters present, URL-derivedcluster_<scheme>_<host>names absent, checked before and after the update), so these scenarios fail under the previous naming scheme.Related PRs