@@ -133,6 +133,8 @@ Each metric will record the following dimensions:
 
 ### Surfacing Signals from the Edge
 
+![HTTP Metering Signal Pipeline](./signal-pipeline.png)
+
 All metering signals originate from the **edge cluster**, where the Envoy
 Gateway proxies (`datum-downstream-gateway`) actually serve customer traffic.
 There is no central collection point that observes individual requests — the
@@ -144,50 +146,47 @@ The raw access log already carries everything the meters need *except* one
 thing: the `route_name` field identifies the owning project only by its
 control-plane namespace UID (e.g. `ns-<project-uid>`), not by the
 human-readable project name. To populate the `project_name` dimension, three
-components must be updated, all operating at the edge:
-
-1. **Network Services Operator (controller).** When the operator reconciles a
-   customer `HTTPRoute` into its downstream representation, it injects the
-   project name as a request header (`x-datum-project-name`) via a
-   `RequestHeaderModifier` filter on each route rule. The project name is read
-   from the upstream cluster identity (the Milo project name) that the
-   operator already holds while mapping upstream → downstream resources. Routes
-   that already define a `RequestHeaderModifier` are merged into rather than
-   duplicated, since Gateway API permits at most one such filter per rule.
-
-2. **Envoy access log format.** The `EnvoyProxy` access log JSON format is
-   extended with a `project_name` field sourced from the injected header:
-   `project_name: "%REQ(X-DATUM-PROJECT-NAME)%"`. Because the header is set on
-   the route before the access log is written, every logged request for a
-   customer route carries the resolved project name. (We use `%REQ()%` rather
-   than `%METADATA(ROUTE:...)%` because Envoy Gateway's JSON access log
-   formatter does not register the metadata formatter, so route metadata is not
-   accessible from JSON access logs.)
+components collaborate at the edge:
+
+1. **Extension Server (xDS mutation).** The NSO extension server implements
+   `ApplyTPPRouteConfig` in `internal/extensionserver/mutate/tpp.go`. During
+   each xDS route-config build, it iterates every VirtualHost owned by NSO and
+   calls `injectProjectNameMetadata` on every route, which writes the resolved
+   `project_name` string directly into
+   `filter_metadata["datum-gateway"]["project_name"]` on the Envoy
+   `RouteConfiguration` proto. This happens for every NSO-owned route
+   regardless of whether a `TrafficProtectionPolicy` governs it — WAF config is
+   an optional overlay on top of the metadata that is always stamped. The
+   project name is sourced from `idx.ProjectNames[dsNS]`, the
+   downstream-namespace → project-name mapping the operator maintains in its
+   policy index.
+
+2. **Envoy access log format.** The `EnvoyProxy` access log JSON format
+   includes a `project_name` field read from the xDS route metadata:
+   `project_name: "%METADATA(ROUTE:datum-gateway:project_name)%"`. Because the
+   extension server stamps the metadata into the xDS route before any request is
+   served, every logged request for a customer route carries the resolved
+   project name. (`%METADATA(ROUTE:...)%` is used because the name lives in xDS
+   route metadata — it is not a per-request value and does not need to travel as
+   a header.)
 
 3. **Vector billing collector.** The `billing-usage-collector-vector` VRL
-   transform reads the `project_name` field from each access log line and adds
-   it as a dimension on all four emitted CloudEvents (requests, ingress-bytes,
-   egress-bytes, connection-seconds), and subject. An absent or empty value (rendered by
-   Envoy as `"-"`) is normalized to an empty string so unmatched routes do not
+   transform reads the `project_name` field from each parsed access log line
+   and adds it as a dimension on all four emitted CloudEvents (requests,
+   ingress-bytes, egress-bytes, connection-seconds). An absent or Envoy-default
+   `"-"` value is normalized to an empty string so unmatched routes do not
    pollute the dimension.
 
-This keeps the entire signal path — request handling, name resolution, log
-emission, parsing, and CloudEvent forwarding — co-located on the edge cluster.
+This keeps the entire signal path — xDS route enrichment, log emission,
+parsing, and CloudEvent forwarding — co-located on the edge cluster.
 
 #### Transport: how access logs reach Vector
 
-The access log line must travel from the Envoy proxy to the
-`billing-usage-collector-vector` agent. Two transports are viable; see
-[Access Log Transport](#access-log-transport-file-sink-vs-otlp-sink) under
-Alternatives for the trade-offs. In short:
-
-- **File sink (stdout) + `kubernetes_logs`** — the current/baseline approach,
-  where Envoy writes JSON to stdout and Vector tails the node's container logs.
-  This requires Vector to run as a per-node DaemonSet co-located with the Envoy
-  pod, which holds on edge clusters but not where Vector runs as an aggregator.
-- **OpenTelemetry (OTLP) sink** — Envoy pushes access logs directly to Vector's
-  OTLP receiver over the network, independent of pod/node topology. This is
-  implemented in a draft PR (see below).
+The access log line travels from the Envoy proxy to the
+`billing-usage-collector-vector` agent via the **File sink (stdout) +
+`kubernetes_logs`** approach: Envoy writes JSON to `/dev/stdout` (the `File`
+sink configured on `datum-downstream-gateway`) and the node-local Vector
+DaemonSet tails the container log file via its `kubernetes_logs` source.
 
 ---
 
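The three-step path described in the added lines (stamp route metadata, render the access log field, normalize in the collector) can be modelled in a few lines of Python to check the expected behaviour for matched and unmatched routes. This is only an illustrative sketch, not the NSO or VRL implementation: the function names, the dict-based route, and the event shape are hypothetical, and skipping routes whose namespace is missing from the index is an assumption. The `datum-gateway` namespace, the `project_name` key, the `"-"` placeholder, and the four meter names come from the design text.

```python
import json

# Namespace and meter names are taken from the design text; everything else
# in this sketch (function names, dict shapes) is hypothetical.
METADATA_NAMESPACE = "datum-gateway"
METER_TYPES = ("requests", "ingress-bytes", "egress-bytes", "connection-seconds")


def stamp_project_name(route: dict, project_names: dict, downstream_ns: str) -> None:
    """Step 1 (model): write project_name into the route's filter_metadata."""
    name = project_names.get(downstream_ns)
    if name is None:
        return  # assumption: routes with no known project are left unstamped
    metadata = route.setdefault("filter_metadata", {})
    metadata.setdefault(METADATA_NAMESPACE, {})["project_name"] = name


def render_access_log(route: dict, route_name: str) -> str:
    """Step 2 (model): emulate the route-metadata lookup; missing values become "-"."""
    namespace = route.get("filter_metadata", {}).get(METADATA_NAMESPACE, {})
    value = namespace.get("project_name", "-")
    return json.dumps({"route_name": route_name, "project_name": value})


def normalize_project_name(raw) -> str:
    """Step 3 (model): map an absent or "-" value to an empty string."""
    if raw is None or raw == "-":
        return ""
    return raw


def to_events(log_line: str) -> list:
    """Emit one event per meter, each carrying the project_name dimension."""
    record = json.loads(log_line)
    project_name = normalize_project_name(record.get("project_name"))
    return [
        {"type": meter, "dimensions": {"project_name": project_name}}
        for meter in METER_TYPES
    ]
```

Feeding a stamped route through `render_access_log` and `to_events` yields four events that share the same `project_name`; an unstamped route yields four events with an empty dimension, which matches the normalization rule stated in item 3.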