Commit 12b056e

juliusmarminge and codex committed
Fix relay Axiom HTTP span view fields
Co-authored-by: codex <codex@users.noreply.github.com>
1 parent 701e768 commit 12b056e

3 files changed

Lines changed: 40 additions & 53 deletions

docs/relay-observability.md

Lines changed: 21 additions & 50 deletions
@@ -1,68 +1,39 @@
 # Relay observability
 
-The relay Alchemy stack owns Axiom resources for post-hoc diagnostics:
+The relay Alchemy stack owns a focused Axiom trace setup:
 
-- `t3-code-relay-events` for Effect logs and spans
-- `t3-code-relay-metrics` for Effect metrics
-- `t3-code-relay-otel-ingest` for Worker OTLP ingest
-- `t3-code-relay-readonly-query` for human/agent log lookup
-- `T3 Code Relay Operations` dashboard
-- starter views for recent logs and recent failures
-- monitors for warning/error logs, APNS failures, managed tunnel provisioning failures, and quiet log ingestion
+- `t3-code-relay-traces`, an OpenTelemetry trace dataset for Worker requests
+- `t3-code-relay-otel-ingest`, a dataset-scoped ingest token bound to the Worker
+- `t3-code-relay-readonly-query`, a dataset-scoped token for scripted diagnostics
+- `t3-code-relay-recent-spans`, a view of recent request and endpoint spans
 
 Deploy from `infra/relay` with the normal Alchemy workflow:
 
 ```sh
 bun run deploy
 ```
 
-Alchemy resolves Axiom credentials through the Axiom provider. Use either environment credentials or `alchemy login --configure` before deploy.
+Alchemy resolves Axiom deployment credentials through its provider. At runtime, the Worker
+receives only the scoped ingest token; it does not receive the diagnostics query token.
 
-Useful APL queries:
+The Worker emits Effect's built-in HTTP server spans plus endpoint and database child spans.
+Effect's OpenTelemetry exporter stores semantic HTTP attributes below the `attributes.` prefix.
+For example:
 
 ```apl
-['t3-code-relay-events']
+['t3-code-relay-traces']
+| where name startswith 'http.server'
+| project _time, name, trace_id, duration,
+  ['attributes.http.request.method'],
+  ['attributes.url.path'],
+  ['attributes.http.response.status_code']
 | order by _time desc
 | limit 200
 ```
 
-```apl
-['t3-code-relay-events']
-| extend logSeverity = column_ifexists('severityText', '')
-| extend logBody = column_ifexists('body', '')
-| where logSeverity in ("WARN", "WARNING", "ERROR", "FATAL")
-  or logBody contains "failed"
-  or logBody contains "error"
-| order by _time desc
-| limit 200
-```
-
-Metrics intentionally capture product and state signals that are not just trace counts:
-
-- `relay_managed_tunnel_provisions_total`: managed tunnel provisioning outcomes, split by `created` versus `reused`
-- `relay_environment_links_total`: link and unlink lifecycle operations
-- `relay_managed_tunnels_active`: current active managed-tunnel links
-- `relay_environment_links_active`: current active environment links
-- `relay_mobile_devices_registered`: current registered mobile devices
-- `relay_live_activity_targets_active`: current active Live Activity targets
-- `relay_agent_activities_active`: current active agent activity rows
-- `relay_agent_activity_publishes_total`: agent activity publish/replay lifecycle events
-- `relay_apns_deliveries_total`: APNS enqueue/send outcomes for Live Activities and push notifications
-
-The `*_active` and `*_registered` values are gauges refreshed from the relay database, which is the source of truth for current state. Lifecycle counters are updated from the mutation path after successful writes or delivery outcomes.
-
-Useful metrics queries:
-
-```mpl
-`t3-code-relay-metrics`:`relay_managed_tunnels_active`
-| group using sum
-```
-
-```mpl
-`t3-code-relay-metrics`:`relay_managed_tunnel_provisions_total`
-| map increase
-| align to 5m using sum
-| group by outcome, tunnelProvisionKind using sum
-```
+Endpoint failure annotations and other relay-specific attributes are also emitted under
+`attributes.relay.*` when present on a span.
 
-Agents should prefer Axiom views or APL queries for completed incidents instead of tailing the Cloudflare Worker. Use the read-only query token when scripted access is needed; keep the ingest token reserved for the Worker.
+Agents should prefer the provisioned view or APL queries for completed incidents instead of
+tailing the Cloudflare Worker. Use the read-only query token when scripted access is needed;
+keep the ingest token reserved for the Worker.
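
The updated doc recommends the read-only query token for scripted access but does not show what such a script looks like. The following is a minimal sketch, not part of this commit. It assumes Axiom's APL query endpoint is `POST /v1/datasets/_apl?format=tabular` with a JSON body carrying the `apl` string; check the path and body shape against the Axiom API documentation before relying on them. `buildAplRequest`, `AplRequest`, and the `AXIOM_QUERY_TOKEN` environment variable are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper types; none of these names exist in the relay codebase.
interface AplRequest {
  readonly url: string;
  readonly init: {
    readonly method: "POST";
    readonly headers: Record<string, string>;
    readonly body: string;
  };
}

// The example query from docs/relay-observability.md, one clause per line.
const recentHttpSpans = [
  "['t3-code-relay-traces']",
  "| where name startswith 'http.server'",
  "| project _time, name, trace_id, duration,",
  "  ['attributes.http.request.method'],",
  "  ['attributes.url.path'],",
  "  ['attributes.http.response.status_code']",
  "| order by _time desc",
  "| limit 200",
].join("\n");

// Builds the request without sending it, so the shape can be inspected first.
// ASSUMPTION: endpoint path and body shape follow Axiom's public query API.
const buildAplRequest = (
  apl: string,
  queryToken: string,
  baseUrl: string = "https://api.axiom.co",
): AplRequest => ({
  url: `${baseUrl}/v1/datasets/_apl?format=tabular`,
  init: {
    method: "POST",
    headers: {
      Authorization: `Bearer ${queryToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ apl }),
  },
});

// Usage (network call left commented out; AXIOM_QUERY_TOKEN is a hypothetical name):
// const { url, init } = buildAplRequest(recentHttpSpans, process.env.AXIOM_QUERY_TOKEN!);
// const response = await fetch(url, init);
```

Keeping request construction separate from `fetch` makes it easy to confirm that only the read-only token is ever attached to a query, in line with the doc's advice to reserve the ingest token for the Worker.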

infra/relay/src/infra/RelayObservability.test.ts

Lines changed: 12 additions & 0 deletions
@@ -4,6 +4,7 @@ import {
   RELAY_AXIOM_TRACE_DATASET,
   relayAxiomIngestDatasetCapabilities,
   relayAxiomQueryDatasetCapabilities,
+  relayRecentSpansQuery,
   relayTraceQuery,
 } from "./RelayObservability.ts";
 
@@ -25,4 +26,15 @@ describe("RelayObservability", () => {
       "['relay-traces-test']\n| where name == 'GET /health'",
     );
   });
+
+  it("projects Effect HTTP span attributes through their OTLP field names", () => {
+    const query = relayRecentSpansQuery("relay-traces-test");
+
+    expect(query).toContain("['relay-traces-test']");
+    expect(query).toContain("attributes.http.request.method");
+    expect(query).toContain("attributes.http.response.status_code");
+    expect(query).toContain("attributes.url.path");
+    expect(query).toContain("attributes.relay.endpoint");
+    expect(query).not.toContain("['http.request.method']");
+  });
 });

infra/relay/src/infra/RelayObservability.ts

Lines changed: 7 additions & 3 deletions
@@ -8,6 +8,12 @@ export const RELAY_AXIOM_TRACE_DATASET = "t3-code-relay-traces";
 export const relayTraceQuery = (query: string, dataset: string = RELAY_AXIOM_TRACE_DATASET) =>
   `['${dataset}']\n${query}`;
 
+export const relayRecentSpansQuery = (dataset: string = RELAY_AXIOM_TRACE_DATASET) =>
+  relayTraceQuery(
+    "| where isnotnull(span_id) or isnotnull(trace_id)\n| extend requestMethod = column_ifexists('attributes.http.request.method', ''), path = column_ifexists('attributes.url.path', ''), statusCode = column_ifexists('attributes.http.response.status_code', 0), endpoint = column_ifexists('attributes.relay.endpoint', '')\n| project _time, name, trace_id, span_id, duration, requestMethod, path, statusCode, endpoint\n| order by _time desc\n| limit 200",
+    dataset,
+  );
+
 export const relayAxiomIngestDatasetCapabilities = (
   dataset: string = RELAY_AXIOM_TRACE_DATASET,
 ) => ({
@@ -44,9 +50,7 @@ export const provisionRelayObservability = Effect.gen(function* () {
     name: "t3-code-relay-recent-spans",
     description: "Recent relay HTTP request spans.",
     datasets: [RELAY_AXIOM_TRACE_DATASET],
-    aplQuery: relayTraceQuery(
-      "| where isnotnull(span_id) or isnotnull(trace_id)\n| project _time, name, trace_id, span_id, duration, ['http.request.method'], ['url.path'], ['http.response.status_code'], ['relay.endpoint']\n| order by _time desc\n| limit 200",
-    ),
+    aplQuery: relayRecentSpansQuery(),
   });
 
   return { traces, ingestToken, queryToken } as const;
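
The single-line APL string in `relayRecentSpansQuery` is hard to review in a diff. As a reading aid, here is a sketch that assembles the same query one clause at a time. `spanColumns` and `relayRecentSpansQuerySketch` are hypothetical names introduced for illustration, and the commit itself keeps the literal string; `RELAY_AXIOM_TRACE_DATASET` and `relayTraceQuery` are copied from the diff so the block runs on its own.

```typescript
// Copied from RelayObservability.ts as shown in the diff.
const RELAY_AXIOM_TRACE_DATASET = "t3-code-relay-traces";

const relayTraceQuery = (query: string, dataset: string = RELAY_AXIOM_TRACE_DATASET) =>
  `['${dataset}']\n${query}`;

// Hypothetical table of [alias, OTLP field name, APL fallback literal].
// The fallback is what column_ifexists returns when the field is absent.
const spanColumns: ReadonlyArray<readonly [string, string, string]> = [
  ["requestMethod", "attributes.http.request.method", "''"],
  ["path", "attributes.url.path", "''"],
  ["statusCode", "attributes.http.response.status_code", "0"],
  ["endpoint", "attributes.relay.endpoint", "''"],
];

// Illustrative equivalent of relayRecentSpansQuery, built clause by clause.
const relayRecentSpansQuerySketch = (dataset: string = RELAY_AXIOM_TRACE_DATASET): string => {
  // Alias each prefixed attribute so the projection uses short, stable names.
  const extendClause = spanColumns
    .map(([alias, field, fallback]) => `${alias} = column_ifexists('${field}', ${fallback})`)
    .join(", ");
  const projectClause = [
    "_time",
    "name",
    "trace_id",
    "span_id",
    "duration",
    ...spanColumns.map(([alias]) => alias),
  ].join(", ");

  return relayTraceQuery(
    [
      "| where isnotnull(span_id) or isnotnull(trace_id)",
      `| extend ${extendClause}`,
      `| project ${projectClause}`,
      "| order by _time desc",
      "| limit 200",
    ].join("\n"),
    dataset,
  );
};
```

Read this way, the change is easier to see: the old view projected bare names such as `['http.request.method']`, while the new query reads each attribute under the `attributes.` prefix through `column_ifexists`, which supplies the listed fallback when a span lacks that field.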
