@@ -18,60 +18,66 @@ rebuilding trails from replay has to infer leg boundaries from the recorded data
 
 `identd` takes a snapshot of the aircraft present at most once per sample
 interval and appends it to an in-memory block that covers a fixed span of time,
-five minutes by default. A snapshot holds a timestamp and the set of aircraft
-visible at that moment. When a sample arrives that belongs to a later span, the
-open block is finalized and written, and a new one starts.
+five minutes. A snapshot holds a timestamp and the set of aircraft visible at
+that moment. Empty snapshots still matter because they prove the receiver was
+being sampled even when no aircraft were visible. When a sample arrives that
+belongs to a later span, the open block is finalized and written, and a new one
+starts.
 
 The block currently being filled is not listed and not served until it rolls
-over. The smallest thing a viewer can load is therefore one finalized block. Both
-the block length and the sample interval are configurable, with the block length
-bounded below at one minute so a block always spans more than a single sample.
+over. The smallest thing a viewer can load is therefore one finalized block. The
+sample interval is configurable; the block duration is fixed so storage paths,
+cache metadata, and frontend loading all agree about the same time grid.
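A minimal sketch of the fixed time grid and the rollover rule described in the added text, written in Python for illustration only. It assumes blocks are aligned to multiples of the block duration since the Unix epoch, which the diff does not state; `BLOCK_SECONDS`, `block_start`, `OpenBlock`, and `append_sample` are hypothetical names, not `identd` identifiers.

```python
BLOCK_SECONDS = 300  # five-minute span from the text; the constant name is hypothetical


def block_start(ts, block_seconds=BLOCK_SECONDS):
    """Return the epoch second at which the block containing ts begins.

    Assumes epoch-aligned blocks, so every component derives the same grid.
    """
    return int(ts // block_seconds) * block_seconds


class OpenBlock:
    """The in-memory block being filled; not listed or served until it rolls over."""

    def __init__(self, start):
        self.start = start
        self.snapshots = []  # (timestamp, aircraft) pairs; the aircraft set may be empty


def append_sample(open_block, ts, aircraft, finalize):
    """Append one snapshot, finalizing the open block if ts belongs to a later span."""
    start = block_start(ts)
    if open_block is not None and start > open_block.start:
        finalize(open_block)  # hand the finished block to the writer
        open_block = None
    if open_block is None:
        open_block = OpenBlock(start)
    # Empty snapshots are kept: they show the receiver was being sampled.
    open_block.snapshots.append((ts, frozenset(aircraft)))
    return open_block
```

Because the grid depends only on the timestamp and the fixed duration, a storage path, a cache entry, and a frontend request for the same moment would all resolve to the same block start under this assumption.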
 
 ## On-disk layout
 
-Blocks live in a single flat directory. Each finalized block is one
-zstd-compressed JSON file whose name encodes the time range it covers. An index
-file sits beside that directory and caches the list of blocks between restarts.
-
-At startup `identd` reads the index, scans the directory, and merges the two,
-preferring what the scan actually finds on disk. It does not decompress every
-block to validate it; the file name and size are enough to build the in-memory
-list, and decompressing the whole corpus on a cold boot would dominate startup
-time on modest hardware. A block is only read from disk when a viewer asks for
-it. If the index is missing, unreadable, or written in a version this build does
-not recognize, `identd` falls back to the directory scan and records a diagnostic
-rather than refusing to start.
-
-The block format carries its own version. A block whose version this build does
-not support is skipped, not deleted. Earlier behavior deleted mismatched blocks
-and could silently destroy recorded history across an upgrade or downgrade, so
-the current code never deletes a block on the basis of its version.
+Blocks are grouped by UTC day instead of all living in one directory. Each
+finalized block is one zstd-compressed JSON file whose name encodes the time
+range it covers. The grouping is for filesystem fanout and static serving; it is
+not a time-retention policy.
+
+Replay keeps cache manifests next to the blocks. The root cache is intentionally
+small: it records the covered days and the overall range, not every block. Each
+day cache records the blocks for that day. A valid cache lets startup avoid
+walking the full tree and statting every historical file, which matters on small
+receiver hosts. If the cache is missing or unreadable, an operator-controlled
+reindex setting decides whether `identd` scans filenames to rebuild the cache or
+starts with replay unavailable and records a diagnostic.
+
+The normal startup path trusts cache metadata. It does not decompress every block
+to validate it; the file name and size are enough to publish availability, and
+decompressing the whole corpus on a cold boot would dominate startup time on
+modest hardware. A block is only read from disk when a viewer asks for it. If a
+cached block is missing or a viewer reports that it could not be decoded, the
+cache is corrected for that day and a diagnostic is recorded instead of leaving
+the stale coverage in place.
 
 ## Retention
 
-Two limits bound disk use, and both are required when replay is enabled:
+Replay is bounded by a byte budget. The operator sets the high watermark for
+finalized blocks. When the estimated size rises above that watermark, `identd`
+removes the oldest cached blocks until usage falls below a lower target.
 
-- A byte budget caps the total size of finalized blocks. When the total would
-  exceed it, the oldest blocks are removed first until the total fits. This is
-  checked both before writing a new block and after.
-- An age cap sets the oldest a block may be. Blocks past that age are removed
-  regardless of how much room the byte budget has left.
-
-The two cover different failure modes. A byte budget alone does not bound how old
-data gets: on a quiet receiver the budget might never fill, leaving stale history
-around indefinitely. An age cap alone does not protect against disk exhaustion
-when traffic is unexpectedly heavy. Together the byte budget is the hard ceiling
-on space and the age cap sets the history window.
+Using two watermarks avoids deleting a single old block every time a new block
+rolls over near the limit. The tradeoff is that a cleanup pass can remove a
+batch of history at once. That is deliberate: it reduces metadata churn on
+storage that may be SD-card-backed. There is no separate age cap in this storage
+version, so a quiet receiver can keep old history as long as it fits inside the
+byte budget.
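The two-watermark cleanup described in the added text can be sketched as follows. This is an illustration in Python, not the `identd` implementation: `plan_eviction` and its parameters are hypothetical names, and blocks are modelled as `(start_timestamp, size_bytes)` pairs.

```python
def plan_eviction(blocks, high_watermark, low_target):
    """Decide which blocks a cleanup pass would remove.

    blocks: iterable of (start_timestamp, size_bytes) for finalized blocks.
    Returns (starts_to_remove, bytes_remaining).
    """
    if low_target > high_watermark:
        raise ValueError("low_target must not exceed high_watermark")
    blocks = list(blocks)
    total = sum(size for _, size in blocks)
    # Nothing happens until usage rises above the high watermark.
    if total <= high_watermark:
        return [], total
    to_remove = []
    # Oldest first; stop once usage has fallen below the lower target.
    for start, size in sorted(blocks):
        if total < low_target:
            break
        to_remove.append(start)
        total -= size
    return to_remove, total
```

With three 40-byte blocks, a high watermark of 100, and a low target of 60, one pass removes the two oldest blocks and leaves 40 bytes. The next several rollovers then trigger no deletion at all, which is the batching effect the text describes.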
 
 ## Serving blocks
 
-Two endpoints make replay available to the frontend. One returns a manifest: the
-enabled flag, the time range covered, the block length, and the list of finalized
-blocks with their URLs and sizes. The other serves a single block file by name,
-after checking the requested name against the expected pattern so a request
-cannot reach outside the blocks directory. Finalized blocks are served as
-cacheable and immutable, since a block's contents are fixed once its time range
-has passed.
+Replay exposes a dynamic manifest endpoint plus a static artifact subtree. The
+manifest tells the frontend which finalized blocks are available for playback.
+The artifact subtree contains immutable block files and cache manifests that can
+be served directly by a reverse proxy. Dynamic repair endpoints live outside
+that subtree so a deployment can hand static replay artifacts to the proxy
+without hiding the `identd` APIs that still need application logic.
+
+When `identd` serves a block itself, it checks that the requested name has the
+date-partitioned shape that replay writes and that the block is present in its
+current cache. A name that does not match replay's own storage shape is rejected
+before it can become a filesystem lookup.
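A sketch of the two checks in the paragraph above: shape first, then cache membership. The diff does not give the actual file naming scheme, so the pattern below (`YYYY-MM-DD/<start>-<end>.json.zst` with ten-digit epoch seconds) is purely an assumption for illustration, as are the names `BLOCK_NAME` and `resolve_block`.

```python
import re

# Hypothetical storage shape; the real naming scheme is not shown in the diff.
BLOCK_NAME = re.compile(r"\d{4}-\d{2}-\d{2}/\d{10}-\d{10}\.json\.zst")


def resolve_block(name, cached_names):
    """Return name if it may be served, else None.

    The shape check runs before anything touches the filesystem, so a name
    containing '..' or an absolute path never becomes a lookup.
    """
    if BLOCK_NAME.fullmatch(name) is None:
        return None
    if name not in cached_names:
        return None
    return name
```

Using `fullmatch` rather than `match` matters here: it rejects a well-formed prefix followed by extra path components or a trailing newline.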
 
 ### Why blocks are negotiated, not decompressed
 
@@ -89,15 +95,13 @@ that common case.
 The server resolves this by content negotiation. When the request says it accepts
 zstd, the server sets the encoding header and ships the raw bytes; the browser
 decompresses them natively and JavaScript receives JSON. When the request does
-not say so — the plain-HTTP browser being the case that matters — the server
-ships the same raw bytes with no encoding header, and the frontend decompresses
-them itself.
-It decides which path applies by inspecting the first few bytes of
-the body for the zstd frame signature rather than trusting a response header,
-because a browser strips the encoding header once it has decoded a response and a
-cache may surface either form for the same URL. The frontend caps the size it
-will expand a block to, and a decode failure shows up as a diagnostic in the
-notification area rather than a silent blank.
+not say so, the server ships the same raw bytes with no encoding header, and the
+frontend decompresses them itself. The frontend decides which path applies by
+inspecting the first few bytes of the body for the zstd frame signature rather
+than trusting a response header, because a browser strips the encoding header
+once it has decoded a response and a cache may surface either form for the same
+URL. The frontend caps the size it will expand a block to, and a decode failure
+shows up as a diagnostic in the notification area rather than a silent blank.
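Both halves of the negotiation can be sketched briefly. The real frontend check would run in browser JavaScript; Python is used here only to keep the examples in one language. `accepts_zstd` and `is_zstd_frame` are hypothetical names. The magic bytes are those of a standard Zstandard frame (magic number `0xFD2FB528`, stored little-endian).

```python
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"  # Zstandard frame magic 0xFD2FB528, little-endian


def is_zstd_frame(body):
    """Client side: True if the body still begins with a zstd frame signature."""
    return body[:4] == ZSTD_MAGIC


def accepts_zstd(accept_encoding):
    """Server side: True only if the header lists zstd explicitly with q > 0.

    A wildcard or an identity-only header counts as not accepting zstd.
    """
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        if token.strip().lower() != "zstd":
            continue
        q = 1.0  # default weight when no q parameter is present
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    return False  # malformed weight: treat as not accepted
        return q > 0
    return False
```

Sniffing the body works regardless of which form a cache hands back: bytes the browser already decoded begin with JSON text, while raw bytes begin with the four-byte signature.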
 
 Treating a wildcard or an explicit request for no encoding as "does not accept
 zstd" is deliberate: a wildcard only says unlisted encodings are acceptable, not