Commit a463838

Fix CEL and add examples
Signed-off-by: jukie <10012479+Jukie@users.noreply.github.com>
1 parent a7dea7c commit a463838

4 files changed

Lines changed: 414 additions & 1 deletion

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
---
apiVersion: v1
kind: Service
metadata:
  name: backend-utilization
  labels:
    app: backend-utilization
spec:
  selector:
    app: backend-utilization
  ports:
    - protocol: TCP
      port: 3000
      targetPort: 3000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-utilization-low
  labels:
    app: backend-utilization
    util: low
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend-utilization
      util: low
  template:
    metadata:
      labels:
        app: backend-utilization
        util: low
    spec:
      containers:
        - name: backend
          image: envoyproxy/gateway-backend-utilization:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: SERVICE_NAME
              value: backend-utilization
            - name: ORCA_CPU_UTILIZATION
              value: "0.1"
          resources:
            requests:
              cpu: 10m
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-utilization-high
  labels:
    app: backend-utilization
    util: high
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend-utilization
      util: high
  template:
    metadata:
      labels:
        app: backend-utilization
        util: high
    spec:
      containers:
        - name: backend
          image: envoyproxy/gateway-backend-utilization:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: SERVICE_NAME
              value: backend-utilization
            - name: ORCA_CPU_UTILIZATION
              value: "0.9"
          resources:
            requests:
              cpu: 10m
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: backend-utilization
spec:
  parentRefs:
    - name: eg
  hostnames:
    - "www.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /backend-utilization
      backendRefs:
        - name: backend-utilization
          port: 3000
Lines changed: 217 additions & 0 deletions
@@ -0,0 +1,217 @@
---
title: "Backend Utilization Load Balancing"
---

BackendUtilization load balancing uses [Open Resource Cost Application (ORCA)][ORCA] load metrics reported by the backend to dynamically weight endpoints. Under the hood it is implemented as [Envoy's client-side weighted round-robin][client-side-wrr] policy: each endpoint's weight is derived from the utilization metrics it emits, so instances running hot receive proportionally less traffic than those with headroom.

If an endpoint reports no ORCA metrics, it is treated as evenly weighted.

See the [Load Balancing concepts page][concepts-lb] for a deeper explanation of ORCA metric formats.

## Prerequisites

* Your backend (or a sidecar in front of it) must emit ORCA load metrics as response headers or trailers. See [Backend instrumentation](#backend-instrumentation) below.
* {{< boilerplate prerequisites >}}

## Build and Deploy the Example Backend

The Envoy Gateway repository includes a small HTTP server under `examples/backend-utilization/` that emits a fixed ORCA `cpu_utilization` value (set via the `ORCA_CPU_UTILIZATION` environment variable) on every response. The example manifest deploys two sets of pods behind a single Service: one reporting `0.1` (idle) and one reporting `0.9` (hot). This lets you observe the weighting effect without wiring real load into a backend.

**Note:** The `envoyproxy/gateway-backend-utilization` image is not published to a public registry, so you need to build it locally from a checkout of the Envoy Gateway repository.

* Build the example backend image

```shell
make -C examples/backend-utilization docker-buildx
```

* Make the image available to your cluster

{{< tabpane text=true >}}
{{% tab header="local kind server" %}}

```shell
kind load docker-image --name envoy-gateway envoyproxy/gateway-backend-utilization:latest
```

{{% /tab %}}
{{% tab header="other Kubernetes server" %}}

```shell
docker tag envoyproxy/gateway-backend-utilization:latest $YOUR_DOCKER_REPO/gateway-backend-utilization:latest
docker push $YOUR_DOCKER_REPO/gateway-backend-utilization:latest
```

If you push to your own registry, update the `image:` field in `examples/kubernetes/backend-utilization.yaml` to match before applying.

{{% /tab %}}
{{< /tabpane >}}

* Apply the example manifest (Service, two Deployments, HTTPRoute)

```shell
kubectl apply -f https://raw.githubusercontent.com/envoyproxy/gateway/latest/examples/kubernetes/backend-utilization.yaml -n default
```

Verify the two Deployments are ready:

```shell
kubectl get deployment/backend-utilization-low deployment/backend-utilization-high -n default
```
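
If you would rather block until the pods are ready than poll by hand, `kubectl rollout status` can do the waiting. The sketch below wraps it in a helper; the function name `wait_for_backends` and the 120-second timeout are illustrative choices, not part of the example.

```shell
# Hypothetical helper: block until both example Deployments finish rolling out.
# $1 = namespace (defaults to "default"); the 120s timeout is an arbitrary choice.
wait_for_backends() {
  local ns="${1:-default}"
  local deploy
  for deploy in backend-utilization-low backend-utilization-high; do
    kubectl rollout status "deployment/${deploy}" -n "${ns}" --timeout=120s || return 1
  done
}
```

Calling `wait_for_backends default` after applying the manifest returns once both rollouts complete, or fails if either one times out.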

## Configure BackendUtilization

Apply a [BackendTrafficPolicy][BackendTrafficPolicy] with `loadBalancer.type: BackendUtilization`:

{{< tabpane text=true >}}
{{% tab header="Apply from stdin" %}}
```shell
cat <<EOF | kubectl apply -f -
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: backend-utilization
  namespace: default
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: backend-utilization
  loadBalancer:
    type: BackendUtilization
    backendUtilization:
      blackoutPeriod: 1s # shorten so the demo shifts traffic quickly
      weightUpdatePeriod: 500ms
EOF
```
{{% /tab %}}
{{% tab header="Apply from file" %}}
```yaml
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: backend-utilization
  namespace: default
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: backend-utilization
  loadBalancer:
    type: BackendUtilization
    backendUtilization:
      blackoutPeriod: 1s # shorten so the demo shifts traffic quickly
      weightUpdatePeriod: 500ms
```
{{% /tab %}}
{{< /tabpane >}}

Setting `backendUtilization: {}` accepts the defaults, but the default `blackoutPeriod` of `10s` means traffic appears evenly split for the first 10 seconds of the test. The shorter values above make the weighting visible almost immediately. The `backendUtilization` field itself is required when `type` is `BackendUtilization`; omitting it fails CEL validation.
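
For reference, the smallest policy that satisfies the validation rule sets `backendUtilization` to an empty object. The helper below only prints such a manifest so that it can be piped to `kubectl apply -f -`; `render_minimal_policy` is a hypothetical name, and the field values are copied from the policy above.

```shell
# Hypothetical helper: print a minimal BackendTrafficPolicy that relies on the
# default backendUtilization settings. $1 = HTTPRoute name, $2 = namespace.
render_minimal_policy() {
  local route="${1:-backend-utilization}"
  local ns="${2:-default}"
  cat <<EOF
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: ${route}
  namespace: ${ns}
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: ${route}
  loadBalancer:
    type: BackendUtilization
    backendUtilization: {}   # present but empty: all defaults apply
EOF
}
```

Usage: `render_minimal_policy | kubectl apply -f -`. Removing the `backendUtilization` line from the output should reproduce the validation failure described above.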

## Configuration Fields

All fields on `backendUtilization` are optional.

| Field | Default | Purpose |
|---|---|---|
| `blackoutPeriod` | `10s` | How long an endpoint must report metrics before its reported weight is trusted. Prevents traffic from shifting based on a single noisy sample. |
| `weightExpirationPeriod` | `3m` | If an endpoint stops reporting for this long, its reported weight is discarded and it reverts to the default weight. |
| `weightUpdatePeriod` | `1s` | How often Envoy recomputes the weight table. Values below `100ms` are raised to `100ms`. |
| `errorUtilizationPenaltyPercent` | `0` | Multiplier, expressed as a percentage (multiplier × 100), that scales how much an endpoint's error rate (eps/qps) adds to its effective utilization. `100` = 1.0×, `150` = 1.5×, `200` = 2.0×. Higher values push error-prone endpoints out of rotation faster. |
| `metricNamesForComputingUtilization` | _unset_ | Custom ORCA metric keys to feed into the weight formula when `application_utilization` isn't reported. Use `named_metrics.<key>` for keys inside the ORCA proto's `named_metrics` map. |
| `keepResponseHeaders` | `false` | By default Envoy strips the ORCA headers/trailers before forwarding the response. Set to `true` to let downstream clients see them (useful for chained load balancers or debugging). |
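
For intuition about how utilization and the error penalty interact, gRPC proposal A58 describes the client-side weighted round-robin weight as `qps / (utilization + (eps / qps) × penalty)`. The sketch below evaluates that formula with `awk`, assuming Envoy's policy follows the same formula and that the penalty is given in the percent form used by `errorUtilizationPenaltyPercent`. `endpoint_weight` is a hypothetical helper, and Envoy's internal scaling of the final weights may differ.

```shell
# Hypothetical helper: weight = qps / (utilization + (eps / qps) * penalty).
# Arguments: qps, utilization (0..1), eps, penalty percent (150 means 1.5x).
endpoint_weight() {
  awk -v qps="$1" -v util="$2" -v eps="$3" -v pct="$4" 'BEGIN {
    if (qps <= 0) { print "n/a"; exit }      # no traffic, no usable signal
    denom = util + (eps / qps) * (pct / 100)
    if (denom <= 0) { print "n/a"; exit }    # avoid dividing by zero
    printf "%.2f\n", qps / denom
  }'
}
```

With equal request rates and no errors, `endpoint_weight 100 0.1 0 0` and `endpoint_weight 100 0.9 0 0` differ by a factor of nine, which is the ratio the demo backends are expected to show.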

### Example: Tuned for a Bursty Backend

```yaml
loadBalancer:
  type: BackendUtilization
  backendUtilization:
    blackoutPeriod: 30s # ignore reports during slow-start
    weightExpirationPeriod: 1m # shorter memory: react faster to silent endpoints
    weightUpdatePeriod: 500ms # faster reweighting
    errorUtilizationPenaltyPercent: 150 # 1.5× penalty for error-prone endpoints
```

### Example: Application-Defined Utilization

If your backend reports a custom metric (for example, queue depth) instead of CPU utilization, wire it in through `metricNamesForComputingUtilization`:

```yaml
loadBalancer:
  type: BackendUtilization
  backendUtilization:
    metricNamesForComputingUtilization:
      - named_metrics.queue_depth
```

The backend would then emit:

```http
endpoint-load-metrics: TEXT named_metrics.queue_depth=0.42
```

## Backend Instrumentation

Your backend must emit ORCA load metrics. Envoy accepts metrics in three formats on response **headers or trailers**:

| Format | Header | Payload |
|---|---|---|
| Binary | `endpoint-load-metrics-bin` | Base64-encoded serialized [`OrcaLoadReport`][orca-proto] proto |
| JSON | `endpoint-load-metrics` | `JSON {"cpu_utilization": 0.3, "mem_utilization": 0.8}` |
| TEXT | `endpoint-load-metrics` | `TEXT cpu_utilization=0.3,mem_utilization=0.8,named_metrics.queue_depth=0.42` |

For gRPC backends, the [xDS ORCA][grpc-orca] libraries emit these automatically via the `orca_load_report` service. For HTTP backends, add a response middleware that measures and serializes your CPU/memory/custom metrics on each response.
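
For an HTTP backend, the TEXT form is the simplest to produce by hand. As an illustration, the helper below assembles the header line from `key=value` arguments; `orca_text_header` is a hypothetical name, and the metric keys in the usage note follow the field names shown in the JSON row above.

```shell
# Hypothetical helper: build an ORCA header line in TEXT format from
# key=value arguments, joined with commas.
orca_text_header() {
  local IFS=,   # "$*" joins the arguments with the first character of IFS
  printf 'endpoint-load-metrics: TEXT %s\n' "$*"
}
```

For example, `orca_text_header cpu_utilization=0.3 named_metrics.queue_depth=0.42` prints `endpoint-load-metrics: TEXT cpu_utilization=0.3,named_metrics.queue_depth=0.42`, which a test server can write among its response headers.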

## Combining With Zone-Aware Routing

`BackendUtilization` composes with `weightedZones` to produce locality-aware weighted round-robin (Envoy's `wrr_locality` policy). See the [WeightedZones example][zone-aware-weighted] on the zone-aware routing page.

`preferLocal` is **not** supported with `BackendUtilization`.

## Testing

Ensure the `GATEWAY_HOST` environment variable from the [Quickstart](../../quickstart) is set. If not, follow the Quickstart instructions to set the variable.

Give Envoy a few seconds after applying the policy to collect ORCA samples and compute endpoint weights; until then, traffic will appear roughly even. Then send 200 requests and tally which pod handled each. Because `backend-utilization-low` reports `cpu_utilization=0.1` and `backend-utilization-high` reports `0.9`, Envoy should weight the `low` pods roughly 9× more heavily.

```shell
for i in $(seq 1 200); do
  curl -s -H "Host: www.example.com" "http://${GATEWAY_HOST}/backend-utilization" | jq -r '.pod'
done | sort | uniq -c
```

Expected output (exact counts will vary, but `low` should dominate ~9:1):

```console
90 backend-utilization-low-6b9cf46b59-l7df7
87 backend-utilization-low-6b9cf46b59-xxrw2
12 backend-utilization-high-5fdb65cb87-mctlp
11 backend-utilization-high-5fdb65cb87-rrdvq
```
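
To reduce the per-pod counts to a single ratio, the `uniq -c` output can be summed per Deployment. This is a small sketch that assumes the pod names keep the `backend-utilization-low-` and `backend-utilization-high-` prefixes from the example manifest; `summarize_split` is a hypothetical helper.

```shell
# Hypothetical helper: read `uniq -c` output (count, pod name) on stdin and
# print total requests per Deployment plus the low:high ratio.
summarize_split() {
  awk '
    $2 ~ /^backend-utilization-low-/  { low  += $1 }
    $2 ~ /^backend-utilization-high-/ { high += $1 }
    END {
      if (high > 0) printf "low=%d high=%d ratio=%.2f\n", low, high, low / high
      else          printf "low=%d high=%d ratio=n/a\n", low, high
    }'
}
```

Appending `| summarize_split` to the request loop prints one summary line; for the sample counts above it reports `low=177 high=23 ratio=7.70`.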

If you instead see a roughly even split, the weights may not have stabilized yet. Wait a few seconds and retry. You can verify the per-endpoint weights directly through the Envoy admin interface:

```shell
ENVOY_POD=$(kubectl get pods -n envoy-gateway-system -l gateway.envoyproxy.io/owning-gateway-name=eg -o jsonpath='{.items[0].metadata.name}')
kubectl -n envoy-gateway-system port-forward pod/${ENVOY_POD} 19000:19000 &
sleep 2 # give the port-forward a moment to start listening
curl -s localhost:19000/clusters | grep "backend-utilization" | grep weight
```

You should see weights of roughly `10000` for the `low` pods and `1111` for the `high` pods (inversely proportional to the reported utilization).
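
The 9:1 expectation follows from weights that are inversely proportional to reported utilization. The sketch below works through that arithmetic for two groups of pods, assuming identical request rates and no errors; `expected_low_share` is a hypothetical helper, not part of the example.

```shell
# Hypothetical helper: expected fraction of traffic for the "low" group when
# each pod's weight is proportional to 1 / utilization.
# Arguments: low pod count, low utilization, high pod count, high utilization.
expected_low_share() {
  awk -v nl="$1" -v ul="$2" -v nh="$3" -v uh="$4" 'BEGIN {
    wl = nl / ul   # combined weight of the low-utilization pods
    wh = nh / uh   # combined weight of the high-utilization pods
    printf "%.2f\n", wl / (wl + wh)
  }'
}
```

`expected_low_share 2 0.1 2 0.9` prints `0.90`, so about 180 of the 200 test requests should land on the `low` pods once the weights have settled.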

## Clean-Up

```shell
kubectl delete backendtrafficpolicy/backend-utilization -n default
kubectl delete -f https://raw.githubusercontent.com/envoyproxy/gateway/latest/examples/kubernetes/backend-utilization.yaml -n default
```

[ORCA]: https://docs.google.com/document/d/1NSnK3346BkBo1JUU3I9I5NYYnaJZQPt8_Z_XCBCI3uA
[orca-proto]: https://www.envoyproxy.io/docs/envoy/latest/xds/data/orca/v3/orca_load_report.proto
[client-side-wrr]: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/load_balancing_policies/client_side_weighted_round_robin/v3/client_side_weighted_round_robin.proto
[grpc-orca]: https://github.com/grpc/proposal/blob/master/A51-custom-backend-metrics.md
[concepts-lb]: ../../../concepts/load-balancing#backend-utilization-orca
[zone-aware-weighted]: ../zone-aware-routing#weightedzones
[BackendTrafficPolicy]: ../../../api/extension_types#backendtrafficpolicy
