
Commit 4d1b556

feat(proxy): server-side RTT chart — TCP_INFO + ICMP path ping (#401, #404) (#402)
## Summary

Two complementary RTT signals on a single chart immediately below the bitrate chart, sharing the time axis with bitrate / buffer / FPS via `getChartsForSession` so spikes line up visually.

**TCP_INFO RTT (#401)**: a 100 ms ticker reads `getsockopt(TCP_INFO)` on each session's most-recent connection and folds the reads into a 1 s window that is drained on every snapshot tick. Six metrics: `client_rtt_ms` (smoothed avg), `_max` / `_min` (per-window peak/trough), `_min_lifetime` (kernel's per-connection floor), `_var` (jitter), `_rto` (retransmit timeout). RTT family on the left axis; RTO on the right axis (kernel default ≥200 ms, can spike to seconds during a wedge; sharing one axis would flatten RTT). Charted with a min/max envelope, dashed lifetime-min reference, hidden-by-default RTO line. Wedge detection is the gap between `rtt` and `rto`: RTO climbs while smoothed RTT flatlines because no fresh ACKs arrive.

**Out-of-band ICMP path ping (#404)**: 1 Hz ICMP echo from go-proxy → `player_ip` via a shared `*icmp.PacketConn` with a receiver goroutine demuxing by (id, seq). Independent of streaming throughput because ICMP packets bypass the application's send queue. New `client_path_ping_rtt_ms` field, cyan line on the same chart's left axis. The line that **stays put when shaping kicks in while TCP_INFO RTT climbs**; the gap between them is the queueing delay the app is inducing on itself. ICMP filtering renders a gap rather than misleading data.

Linux-only kernel reads (`getsockopt(TCP_INFO)`, raw ICMP). `!linux` build tags compile macOS dev with stubs that emit zeros / disabled samplers so `go build ./...` stays green.

Closes #401, closes #404.
## Test plan

- [x] `go vet` + `go build` clean for `go-proxy/...` and `analytics/go-forwarder/...` on darwin and linux/amd64
- [x] `node --check` clean on `session-shell.js` and `testing-session-ui.js`
- [x] Pre-commit Sonnet review pass: surfaced rttCharts cleanup leak (fixed) + fold() zero-overwrite of latest-fields (fixed)
- [x] `make test-deploy-dev` clean: `rtt sampler started (100ms cadence)` + `path ping sampler started (1Hz)` boot logs present, all 9 go-proxy ports listening, no fatals
- [x] `make analytics-migrate` applied seven `ADD COLUMN IF NOT EXISTS` lines (six TCP_INFO + one path-ping); `DESCRIBE session_snapshots` confirms all seven columns
- [x] RTT chart parity with bandwidth/buffer/FPS: in `getChartsForSession`, `ensureChartLiveWheelAnchor`, `refreshLegendHoverAll`, `chartsToDestroy`, event-marker pool. Pause/zoom/pan inherit via DOM and shared zoom options.
- [x] Right-edge alignment: dual-Y chart uses y1 `afterFit` width, with no double-counting of `layout.padding.right`
- [ ] LAN baseline: drive a session via `testing-session.html`, RTT chart shows `client_rtt_ms < 1 ms`, `client_path_ping_rtt_ms` ≈ same, `client_rtt_min_lifetime_ms` ≈ same, RTO at kernel default
- [ ] Throttle to 1 Mbps mid-stream: TCP_INFO lines climb (queueing); path-ping line **stays at LAN baseline**; that gap is the bufferbloat the chart was built to surface
- [ ] Add `tc netem delay 100ms` on top of throttle: BOTH lines step up by ~100 ms (path itself got longer)
- [ ] Wedge: drop packets via fault-injection UI, toggle RTO visible; RTO rises while RTT flatlines
- [ ] Replay an archived session in `session-viewer.html`: same chart populated from ClickHouse
- [ ] Block ICMP outbound at host firewall: path-ping line shows gap; TCP_INFO lines unaffected

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 4cba20f commit 4d1b556

17 files changed

Lines changed: 1519 additions & 27 deletions


README.md

Lines changed: 21 additions & 3 deletions
@@ -257,11 +257,11 @@ ATS-style transfer timeouts the proxy enforces against the *client* — useful f
257257

258258
When a timeout fires the network-log waterfall renders the row with `!⏱` so you can tell it apart from a fault-injection cut (`!✂`) or a player abort (`!↩`).
259259

260-
### Bitrate chart (with buffer depth and FPS)
260+
### Bitrate chart (with RTT, buffer depth, and FPS)
261261

262262
![Playback state chart](docs/screenshots/playback-state-chart.png)
263263

264-
The session card has a collapsible **Bitrate Chart** that stacks an events timeline + up to three time-series charts, all sharing a 10-minute rolling window and unified zoom/pan. Legend entries toggle series, scroll zooms, drag pans, `` pauses live updates. The four panels share an x-axis so a bandwidth dip lines up visually with its buffer / FPS / variant-shift impact.
264+
The session card has a collapsible **Bitrate Chart** that stacks an events timeline + up to four time-series charts, all sharing a 10-minute rolling window and unified zoom/pan. Legend entries toggle series, scroll zooms, drag pans, `` pauses live updates. The five panels share an x-axis so a bandwidth dip lines up visually with its RTT / buffer / FPS / variant-shift impact.
265265

266266
- **Events timeline** *(top)* — swim-lane visualization of what the player and server are doing right now:
267267
- **PLAYER variants** — one lane per ladder rung (e.g. `1920×1080:7.1Mbps`, `1280×720:3.5Mbps`). Coloured blocks show which variant the player was on at each moment. The variant shift on a throughput collapse is visible as a downstep across lanes.
@@ -275,10 +275,14 @@ The session card has a collapsible **Bitrate Chart** that stacks an events timel
275275
- **Reference lines**: `Limit` (shaping ceiling, stepped when a pattern is active), `Server Rendition` (what the server believes it delivered), one line per ladder `Variant` (hidden by default).
276276
- **Events**: `STALL` and `RESTART` markers annotate player stalls and restarts.
277277
- **Y-axis**: `Auto` or fixed `5 / 10 / 20 / 30 / 40 / 50 / 100` Mbps — pin the scale when comparing two sessions side by side.
278+
- **RTT chart** — two independent round-trip-time signals on the same time axis (issues #401 / #404):
279+
- **TCP_INFO RTT family** (purple lines, left Y-axis): what the streaming TCP connection actually experiences. Sampled inside go-proxy via `getsockopt(TCP_INFO)` every 100 ms (10 Hz), drained into 1 s windows: `RTT avg` (smoothed RFC 6298 SRTT), `RTT max` / `min` (per-window peak/trough envelope), `RTT lifetime min` (sticky path floor, dashed reference), `RTO` (right Y-axis, hidden by default; climbs above RTT during a wedge while smoothed RTT flatlines).
280+
- **Path ping** (cyan line, left Y-axis) — out-of-band ICMP echo from go-proxy → `player_ip` at 1 Hz, routed through a high-priority band inside the per-port shaping class. Sees the configured netem delay but jumps the bulk segment queue. The closest thing to "what the path could deliver if you weren't loading it." Zero / gap when ICMP is filtered.
281+
- **Why both?** The ping line is the network's contribution; the gap up to the TCP_INFO line is the application stack's contribution (queueing under throttle, delayed ACKs, receiver load). Together they decompose latency under shaping: rising netem moves both lines together; rising throttle inflates only TCP_INFO via bufferbloat. See [`analytics/README.md`](analytics/README.md#reading-the-rtt-chart-issues-401-404) for the deep interpretation guide and per-shaping-knob test recipes.
278282
- **Buffer depth chart** — player `buffered` TimeRanges (`player_metrics_buffer_depth_s`) on the left axis; **Wall-Clock Offset** (player playhead vs encoder PDT) on the right axis.
279283
- **FPS chart** — rendered and dropped frames/s from `player_metrics_frames_displayed` / `_dropped_frames`, 2 s sliding window, exponential smoothing (α = 0.15). Series: `FPS (smoothed)`, `Low FPS` (red below threshold), `FPS Baseline` (75th percentile), `Low Threshold` (`0.75 × baseline`), `Dropped Frames/s` (right Y-axis).
280284

281-
The four panels' plot areas all align on the same right edge so vertical x-axis ticks line up across every chart and the events timeline above.
285+
The five panels' plot areas all align on the same right edge so vertical x-axis ticks line up across every chart and the events timeline above.
282286

283287
### Network log waterfall (HAR view)
284288

@@ -545,6 +549,20 @@ Full API (`/api/content`, `/api/jobs`, `/api/sessions/*`, `/api/nftables/*`, etc
545549
| `mbps_transfer_rate` | 250 ms | Byte-change-gated rate during segment transfer, aligned to HTB burst edges. Reports at drain/refill boundaries |
546550
| `mbps_transfer_complete` | per segment | Total bytes / total time for one completed segment transfer (backlog drained to 0) |
547551

552+
### Server-side RTT metrics
553+
554+
Sampled inside go-proxy via `getsockopt(TCP_INFO)` on each session's most-recent connection. The 100 ms sampler folds reads into a 1 s window that drains on every snapshot tick — same cadence as the player-metrics PATCH heartbeat, so the RTT chart shares a time axis with the bitrate chart above it. Linux-only (the kernel option doesn't exist on macOS); the dev build compiles via a stub that emits zeros. All values in milliseconds.
555+
556+
| Metric | Source field | What it measures |
557+
|---|---|---|
558+
| `client_rtt_ms` | `tcpi_rtt` (avg of 1 s window) | Smoothed RTT (RFC 6298 SRTT, kernel EWMA) |
559+
| `client_rtt_max_ms` | window max of `tcpi_rtt` | Peak smoothed RTT in window — catches sub-second spikes the kernel's EWMA would mask |
560+
| `client_rtt_min_ms` | window min of `tcpi_rtt` | Trough during the same 1 s window |
561+
| `client_rtt_min_lifetime_ms` | `tcpi_min_rtt` | Min RTT ever observed on this connection — the path floor |
562+
| `client_rtt_var_ms` | `tcpi_rttvar` | Smoothed mean deviation (jitter) |
563+
| `client_rto_ms` | `tcpi_rto` | Current retransmit timeout — rises during a wedge while smoothed RTT flatlines; the gap between `rto` and `rtt` is the canonical "kernel suspects this connection is stalling" signal |
564+
| `client_path_ping_rtt_ms` | ICMP echo, 1 Hz | **Out-of-band path latency** (issue #404). Independent of the streaming connection's queue contribution: TCP_INFO RTT inflates under throttle, but ICMP packets bypass the application's send queue, so this line stays at its baseline when rate throttling kicks in (it still rises with any configured netem delay). The vertical gap between this line and `client_rtt_ms` is the queueing delay the application is inducing on itself. Zero / gap when ICMP is filtered. |
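
The window folding described above can be sketched as follows. This is a minimal illustration, not the go-proxy implementation: the `rttWindow` type and its `Fold` / `Drain` methods are hypothetical names, and the sketch assumes samples have already been converted to milliseconds.

```go
package main

import "fmt"

// rttWindow is a hypothetical accumulator for one 1 s emit window.
// The real go-proxy type is not shown in this README.
type rttWindow struct {
	sum, lo, hi float64
	n           int
}

// Fold adds one 100 ms RTT sample (in milliseconds) to the window.
func (w *rttWindow) Fold(ms float64) {
	if w.n == 0 || ms < w.lo {
		w.lo = ms
	}
	if w.n == 0 || ms > w.hi {
		w.hi = ms
	}
	w.sum += ms
	w.n++
}

// Drain returns the window average, peak, and trough, then resets the
// window. ok is false when no samples were folded since the last drain.
func (w *rttWindow) Drain() (avg, hi, lo float64, ok bool) {
	if w.n == 0 {
		return 0, 0, 0, false
	}
	avg, hi, lo = w.sum/float64(w.n), w.hi, w.lo
	*w = rttWindow{}
	return avg, hi, lo, true
}

func main() {
	var w rttWindow
	for _, ms := range []float64{2, 4, 9, 1} {
		w.Fold(ms)
	}
	avg, hi, lo, ok := w.Drain()
	fmt.Println(avg, hi, lo, ok)
}
```

Returning `ok == false` for an empty window lets the caller emit a gap instead of a zero that could be mistaken for a real sample.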
565+
548566
### Metric semantics
549567

550568
- **Limit value** (`nftables` shaping rate): configured ceiling for the session port; a control target, not a measured throughput.

analytics/README.md

Lines changed: 212 additions & 0 deletions
@@ -77,6 +77,218 @@ docker compose up -d --force-recreate grafana
7777
- `TTL toDateTime(ts) + INTERVAL 30 DAY` enforces retention; tune via
7878
`ALTER TABLE ... MODIFY TTL`.
7979

80+
### Reading the RTT chart (issues #401, #404)
81+
82+
Two independent measurement schemes plotted on the same chart:
83+
84+
- **TCP_INFO family** (`client_rtt_*_ms`, purple lines) — what the
85+
*streaming TCP connection* actually experiences. Self-loaded:
86+
every sample includes the application's own queue contribution.
87+
- **Path ping** (`client_path_ping_rtt_ms`, cyan line) — out-of-band
88+
ICMP echo from go-proxy → player_ip at 1 Hz, routed through a
89+
high-priority band (TC_PRIO_INTERACTIVE → tc band 0) inside the
90+
per-port shaping class. Sees the configured netem delay but jumps
91+
the bulk queue. The closest thing to "what the path could deliver
92+
if you weren't loading it."
93+
94+
#### The two probe paths through the kernel
95+
96+
Both probes share the same physical egress (eth0) and the same per-
97+
port HTB class, but they take different *bands* through the prio
98+
scheduler that sits inside the class as the leaf qdisc:
99+
100+
```
egress eth0
└── HTB root (1:1)
    └── HTB class 1:<port> (rate-limited at configured Mbps)
        └── prio leaf qdisc (3 bands, default priomap)
            ├── band 0 (1810:1) ← PROBE LANE
            │   └── netem (configured delay/loss + 5% jitter)
            ├── band 1 (1810:2) ← BULK DATA LANE
            │   └── netem (same delay/loss/jitter)
            └── band 2 (1810:3) unused

filters at parent 1:0:
  ip sport <port> → flowid 1:<port> (TCP segments → bulk lane via priomap default)
  ip protocol icmp dst <ip> → flowid 1:<port> (ICMP probes → probe lane via IP_TOS=0x10 → priomap band 0)
```
115+
116+
How a packet picks its band:
117+
118+
- **Bulk segment data**: leaves the proxy with `skb->priority = 0`
119+
(`TC_PRIO_BESTEFFORT`). Default prio priomap routes that to band 1
120+
(the middle band) → classid `1810:2` → bulk netem.
121+
- **Path-ping ICMP**: socket has `IP_TOS = 0x10` set. Kernel's
122+
`rt_tos2priority` table maps that to `TC_PRIO_INTERACTIVE` (priority
123+
6). Default priomap routes that to band 0 → classid `1810:1`
124+
→ probe netem.
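
The band selection above can be checked with a small lookup. This is a sketch that assumes the prio qdisc's documented default priomap and the priority constants from `linux/pkt_sched.h`; the Go identifiers (`defaultPriomap`, `bandFor`) are illustrative names, not part of go-proxy.

```go
package main

import "fmt"

// Priority values as defined in linux/pkt_sched.h.
const (
	tcPrioBestEffort  = 0 // bulk segment data (skb->priority = 0)
	tcPrioInteractive = 6 // what IP_TOS = 0x10 maps to
)

// defaultPriomap is the prio qdisc's default priority-to-band table
// as documented in tc-prio(8): "1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1".
var defaultPriomap = [16]int{1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1}

// bandFor returns the prio band (0 drains first) for a packet priority.
func bandFor(priority int) int {
	return defaultPriomap[priority&15]
}

func main() {
	fmt.Println("bulk band:", bandFor(tcPrioBestEffort))
	fmt.Println("probe band:", bandFor(tcPrioInteractive))
}
```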
125+
126+
Both bands carry **identical netem configuration**: `UpdateNetem`
127+
writes the same `delay/loss/jitter` to all three bands in lockstep,
128+
so the configured network conditions apply uniformly regardless of
129+
priority. The only thing the prio scheduler changes is *queueing
130+
order* when both bands have eligible packets: strict priority means
131+
band 0 always drains first.
132+
133+
Net effect:
134+
135+
- **Bulk** sits in band 1's netem queue, gets the configured delay,
136+
then waits for HTB's rate-limit token. If the rate is full it
137+
queues behind earlier bulk packets too — bufferbloat.
138+
- **Probe** sits in band 0's netem queue, gets the same configured
139+
delay, then jumps every queued bulk packet on the next HTB dequeue.
140+
Worst case it waits one MTU's serialization (≈ 12 ms at 1 Mbps,
141+
2.4 ms at 5 Mbps) for a bulk packet already on the wire.
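
The worst-case probe wait quoted above is the serialization time of one full-size packet. A short sketch of that arithmetic, assuming a 1500-byte MTU; `serializationDelay` is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"time"
)

// serializationDelay returns how long one packet occupies a link
// shaped to rateMbps. bits / (Mbps * 1e6) seconds is the same as
// bits * 1000 / Mbps nanoseconds.
func serializationDelay(packetBytes int, rateMbps float64) time.Duration {
	bits := float64(packetBytes) * 8
	return time.Duration(bits * 1000 / rateMbps)
}

func main() {
	for _, mbps := range []float64{1, 5} {
		fmt.Printf("%v Mbps: %v\n", mbps, serializationDelay(1500, mbps))
	}
}
```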
142+
143+
So the ping line shows you *path + netem*, almost free of bufferbloat.
144+
The TCP_INFO lines show you *path + netem + bufferbloat + ACK overhead*
145+
— what bulk segment data really pays.
146+
147+
#### Why we ship both signals
148+
149+
They answer different questions. Either alone is misleading:
150+
151+
- **Just TCP_INFO** would conflate "the path is bad" with "I'm
152+
loading the path" — a player ABR algorithm seeing high RTT can't
153+
tell whether to back off or just wait for its own queue to drain.
154+
Lifetime min approximates the path floor but is sticky and slow
155+
to update during sustained load.
156+
- **Just ping** would tell you nothing about what the streaming
157+
connection actually experiences — the ABR doesn't react to ICMP,
158+
it reacts to TCP behavior. A perfectly healthy ping line above a
159+
stalling player would be a useless diagnostic.
160+
161+
Together they decompose the latency budget: ping is the *physical
162+
network's contribution* (path + configured delay), and the gap up
163+
to TCP_INFO is the *application stack's contribution* (queueing
164+
under throttle, delayed ACKs, receiver load, reverse-path queuing).
165+
When bitrate drops or buffer drains, the chart tells you *which
166+
component* moved — was it the path getting longer, or my own
167+
queue piling up?
168+
169+
#### Field reference
170+
171+
- `client_rtt_ms` — kernel's smoothed RTT (RFC 6298 SRTT). Current
172+
path latency with EWMA, lags real changes by a fraction of a second.
173+
- `client_rtt_max_ms` / `client_rtt_min_ms` — peak/trough of smoothed
174+
RTT *within the 1 s emit window*. Catches sub-second spikes the
175+
kernel's EWMA would otherwise mask.
176+
- `client_rtt_min_lifetime_ms` — connection's *path floor*: the best
177+
RTT ever seen on that TCP connection. Sticky-low, never climbs back.
178+
- `client_rtt_var_ms` — kernel's smoothed mean deviation (jitter).
179+
- `client_rto_ms` — current retransmit timeout. Rises during a wedge
180+
while smoothed RTT flatlines because no fresh ACKs are coming back.
181+
- `client_path_ping_rtt_ms` — ICMP echo round-trip, 1 Hz cadence.
182+
Zero / absent when ICMP is filtered on the path.
183+
184+
#### Why the ping line is *always* lower than TCP_INFO RTT
185+
186+
Even on a perfectly idle, unshaped LAN, expect TCP_INFO RTT to sit
187+
above the ping line. Sources of inflation, all real, all working as
188+
designed:
189+
190+
- **Delayed ACKs** — receiver TCP holds ACKs up to 40–200 ms (Linux/iOS
191+
default ~40 ms) to coalesce them. ICMP echo replies aren't subject
192+
to this; they go out immediately.
193+
- **TCP_INFO is the smoothed SRTT** — exponentially averaged across
194+
recent ACKs, including ACKs generated under load. ICMP samples a
195+
single round-trip on the kernel fast path.
196+
- **Receiver processing latency under load** — at multi-Mbps the
197+
player's TCP stack is busy demuxing segment data; ACK generation
198+
slips behind. ICMP echo handling skips userspace entirely.
199+
- **Reverse-path queuing** — the player's egress (ACKs going *back*)
200+
has its own tiny outbound queue. ICMP replies skip it.
201+
202+
So `TCP_INFO − ping` on healthy unshaped LAN ≈ delayed-ACK + receiver
203+
load + reverse queueing. This is the network stack's overhead, not
204+
a fault.
205+
206+
#### Expected behavior under shaping
207+
208+
The two signals respond differently to the two shaping knobs (netem
209+
delay and HTB rate limit). Useful test recipes:
210+
211+
| Action | TCP_INFO RTT | Path ping | Why |
212+
|---|---|---|---|
213+
| **No shaping** | `path + ACK overhead + receiver load` (typically 5–50 ms LAN) | `~path RTT` (sub-ms LAN) | Baseline. The ping line is the floor; TCP_INFO is everything else the stack adds. |
214+
| **Set netem delay = 25 ms** | rises by ~25 ms (mean) | rises by ~25 ms (mean) | Both packets traverse the same per-band netem inside the HTB class. Matched movement. Per-packet variance is ±5 % of mean (~1 ms stddev at 25 ms — see jitter note below). |
215+
| **Set throttle = 1 Mbps** (no netem) | climbs into bufferbloat range (often 100s of ms during downloads) | unchanged from baseline | Bulk segment data fills the HTB queue; each MTU waits for a rate token. The probe escapes via prio band 0 — at most one MTU's serialization (~12 ms at 1 Mbps for 1500 B). |
216+
| **Throttle + netem combined** | `path + netem + bufferbloat + ACK overhead` (compounded) | `~path + netem + at-most-one-MTU` | Effects stack additively on bulk data. The ping line shows you what's *just* the configured delay so you can subtract bufferbloat by eye. |
217+
| **Toggle shaping mid-stream** | step changes correlate visibly with bitrate / buffer drops on the chart above | flat through bandwidth changes; steps on netem changes only | Whole point of having both signals. Bitrate dropped because shaping was applied → both lines confirm in different ways. |
218+
| **Drop packets via fault-injection** | smoothed RTT eventually flatlines (no fresh ACKs); `client_rto_ms` climbs as kernel doubles its timeout | `0` / gap (echo replies dropped too) | RTO − RTT divergence is the canonical wedge indicator. |
219+
220+
Reading the chart in one sentence:
221+
**ping = network's contribution; (TCP_INFO − ping) = stack + bufferbloat
222+
contribution.**
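
The one-sentence reading can be written as a small decomposition. This is an illustrative sketch only: `latencySplit` and `decompose` are hypothetical names, and treating a ping value of zero as "no probe reply" follows the field reference's "zero / absent when ICMP is filtered" convention.

```go
package main

import "fmt"

// latencySplit separates one snapshot's latency into the two
// contributions the chart is meant to expose.
type latencySplit struct {
	NetworkMs float64 // path + configured netem delay (ping line)
	StackMs   float64 // bufferbloat + ACK overhead (TCP_INFO minus ping)
	PingValid bool    // false when the probe reply was filtered or dropped
}

// decompose splits a TCP_INFO RTT sample using the path-ping sample
// from the same window. Both inputs are in milliseconds.
func decompose(tcpRTTMs, pingMs float64) latencySplit {
	if pingMs <= 0 {
		// No probe reply: the split cannot be computed for this window.
		return latencySplit{}
	}
	stack := tcpRTTMs - pingMs
	if stack < 0 {
		stack = 0 // sampling noise can briefly put ping above SRTT
	}
	return latencySplit{NetworkMs: pingMs, StackMs: stack, PingValid: true}
}

func main() {
	fmt.Printf("%+v\n", decompose(180, 26)) // throttled with 25 ms netem
	fmt.Printf("%+v\n", decompose(180, 0))  // ICMP filtered
}
```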
223+
224+
#### A note on jitter
225+
226+
`UpdateNetem` adds a fixed 5 % normal-distributed jitter (`stddev =
227+
delay/20`). For configured 25 ms delay: stddev ≈ 1 ms, so ~99.7 % of
228+
per-packet delays land in [22, 28] ms. The ping per-window min sits
229+
at the configured floor; per-window max shows the tight scatter above.
230+
Configured delays ≤19 ms get zero jitter (integer divide rounds to 0)
231+
— fine for low-RTT testing where any noise would dominate the signal.
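
The jitter arithmetic above, including the integer-divide cutoff, can be reproduced with a few lines. This sketch only mirrors the `delay/20` rule stated in this note; the function names are hypothetical and this is not the `UpdateNetem` source.

```go
package main

import "fmt"

// jitterStddevMs applies the 5 % rule with integer division,
// so any delay below 20 ms yields zero jitter.
func jitterStddevMs(delayMs int) int {
	return delayMs / 20
}

// threeSigmaRangeMs returns the interval expected to contain about
// 99.7 % of per-packet delays for a normal distribution.
func threeSigmaRangeMs(delayMs int) (lo, hi int) {
	s := jitterStddevMs(delayMs)
	return delayMs - 3*s, delayMs + 3*s
}

func main() {
	for _, d := range []int{19, 20, 25, 100} {
		lo, hi := threeSigmaRangeMs(d)
		fmt.Printf("delay=%d ms stddev=%d ms range=[%d, %d] ms\n",
			d, jitterStddevMs(d), lo, hi)
	}
}
```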
232+
233+
If a future test needs higher jitter (ABR resilience to RTT scatter,
234+
out-of-order arrivals via wider Gaussian draws), add a separate
235+
`jitter_ms` parameter to the shaping API rather than re-deriving from
236+
the delay value. The 5 % default is tuned for "I configured 25 ms,
237+
the chart should read ~25 ms," not for stress-testing variance.
238+
239+
#### Reading the RTO line (wedge detection)
240+
241+
`client_rto_ms` is hidden by default — toggle it on from the chart
242+
legend when you suspect a wedge. RTO answers a different question
243+
than the rest of the chart: not *how long does a round-trip take*
244+
but *how long is the kernel willing to wait for one before giving
245+
up and retransmitting*.
246+
247+
What it is. RTO is the kernel's **retransmission timeout** for the
248+
TCP connection. After sending a segment, the sender starts a timer;
249+
if no ACK arrives before the timer expires, the segment is
250+
retransmitted and the timer doubles. RFC 6298 sets the steady-state
251+
value to roughly `SRTT + 4 × RTTVAR` (smoothed RTT plus four times
252+
its smoothed deviation), with a kernel floor of 200 ms and ceiling
253+
of 120 s. So on a quiet, healthy connection RTO sits a small
254+
multiple of RTT above the smoothed-RTT line — *not* zero.
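
The steady-state formula above can be sketched as a clamp. This mirrors only the `SRTT + 4 × RTTVAR` expression and the 200 ms / 120 s bounds quoted in this section; it is not the kernel's exact computation, and `steadyStateRTO` is a hypothetical name.

```go
package main

import (
	"fmt"
	"time"
)

const (
	rtoFloor   = 200 * time.Millisecond // kernel floor quoted above
	rtoCeiling = 120 * time.Second      // kernel ceiling quoted above
)

// steadyStateRTO returns SRTT + 4*RTTVAR clamped to [rtoFloor, rtoCeiling].
func steadyStateRTO(srtt, rttvar time.Duration) time.Duration {
	rto := srtt + 4*rttvar
	if rto < rtoFloor {
		return rtoFloor
	}
	if rto > rtoCeiling {
		return rtoCeiling
	}
	return rto
}

func main() {
	// LAN-like connection: the floor dominates.
	fmt.Println(steadyStateRTO(time.Millisecond, 250*time.Microsecond))
	// WAN-like connection: the formula dominates.
	fmt.Println(steadyStateRTO(100*time.Millisecond, 50*time.Millisecond))
}
```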
255+
256+
Why it's the wedge canary. When ACKs stop flowing entirely (transport
257+
fault, dropped packets, broken middlebox), `tcpi_rtt` flatlines —
258+
no fresh ACK round-trips means no new samples to update the EWMA.
259+
Looking at the smoothed-RTT line alone, you can't tell whether the
260+
connection is healthy-and-idle or wedged-and-silent. RTO has its
261+
own state machine driven by the kernel's retransmission timer, not
262+
by ACK arrivals: every retry doubles it (`200ms → 400ms → 800ms →
263+
1.6s → 3.2s → 6.4s → 12.8s → 25.6s → 51.2s → 102.4s`, capped at
264+
the kernel max). So when the connection wedges:
265+
266+
```
RTT (purple) ──flat─────────────────────────────── ← no ACKs, no fresh samples
RTO (red) ───────⌐──┘─⌐──┘──⌐──┘──⌐────┘──── ← kernel doubling on each retry
```
270+
271+
The growing gap between the two is the unambiguous "kernel suspects
272+
this connection is stalling" signal. It appears within seconds of
273+
the wedge starting — much faster than a stall on the bitrate chart
274+
above (which only triggers after the player's buffer drains).
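
The doubling sequence quoted above is easy to regenerate, which helps when estimating how long a wedge has lasted from the RTO value on the chart. A sketch assuming the 200 ms starting point and 120 s cap stated in this section; `backoffSchedule` is a hypothetical helper:

```go
package main

import (
	"fmt"
	"time"
)

// backoffSchedule lists the RTO value in effect before each of the
// first `retries` retransmissions, doubling each time up to ceiling.
func backoffSchedule(initial, ceiling time.Duration, retries int) []time.Duration {
	out := make([]time.Duration, 0, retries)
	rto := initial
	for i := 0; i < retries; i++ {
		out = append(out, rto)
		rto *= 2
		if rto > ceiling {
			rto = ceiling
		}
	}
	return out
}

func main() {
	for i, rto := range backoffSchedule(200*time.Millisecond, 120*time.Second, 11) {
		fmt.Printf("retry %2d: RTO %v\n", i, rto)
	}
}
```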
275+
276+
What recovery looks like. Once ACKs start flowing again (fault
277+
cleared, retry succeeds), the kernel resets RTO to `SRTT + 4×RTTVAR`
278+
on the very next ACK. The red line snaps back down; the purple
279+
smoothed-RTT line resumes updating. So a recovered wedge is visible
280+
as a sawtooth-like RTO climb followed by an instant drop, with the
281+
RTT line resuming its normal track.
282+
283+
Useful pairing. RTO + the path-ping line together disambiguate the
284+
wedge cause:
285+
286+
| RTT line | RTO line | Path ping | What it means |
287+
|---|---|---|---|
288+
| flat | climbing | also gone (ICMP filtered/dropped) | Wedge or transport fault — entire path is dead from proxy's view. |
289+
| flat | climbing | still arriving normally | Wedge is TCP-specific — ICMP gets through but TCP is stuck. Likely middlebox dropping the connection or a broken player TCP stack, not a network outage. |
290+
| climbing slowly | tracking RTT (small multiple above) | climbing the same | Genuine path latency increase, not a wedge — RTO is just following healthy RTT growth. |
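
The first two rows of the table can be expressed as a small classifier, for example to annotate archived sessions. This is a sketch under stated assumptions: the boolean inputs are taken as already derived from the series (how "flat" and "climbing" are detected is left open), and `classifyWedge` and the diagnosis strings are hypothetical.

```go
package main

import "fmt"

// classifyWedge maps the three chart observations to a diagnosis.
// rttFlat: smoothed RTT has stopped updating.
// rtoClimbing: RTO is growing across consecutive snapshots.
// pingArriving: path-ping replies are still being received.
func classifyWedge(rttFlat, rtoClimbing, pingArriving bool) string {
	if rttFlat && rtoClimbing {
		if pingArriving {
			return "tcp-specific wedge" // ICMP gets through, TCP is stuck
		}
		return "path dead" // nothing gets through from the proxy's view
	}
	// Includes the third row: RTO tracking a slowly rising RTT.
	return "no wedge"
}

func main() {
	fmt.Println(classifyWedge(true, true, false))
	fmt.Println(classifyWedge(true, true, true))
	fmt.Println(classifyWedge(false, false, true))
}
```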
291+
80292
## Securing a WAN-exposed deployment
81293

82294
Default docker-compose binds ClickHouse to `127.0.0.1` (host-only) and
