Commit 68a7cdc

test: add codec/shard/buffer invariant tests + design doc
The new SyncCodecPipeline accumulated several correctness bugs that all
had the same shape: case-by-case reasoning about how codec / shard /
IO / buffer pieces interact, missing a combination. Each bug went
undetected until a particular CI configuration tripped over it.

Address the underlying problem by writing the invariants down and
enforcing them with focused tests:

- docs/superpowers/specs/2026-04-17-codec-pipeline-invariants.md
- tests/test_codec_invariants.py

The doc declares contracts in three groups:

- C1-C3 Codec chain invariants — codecs only mutate shape; per-call
  chunk_spec; pipeline never branches on codec type.
- S1-S3 Shard layout invariants — compact (not dense); empty chunks
  are skipped under the default config; byte-range fast path requires
  write_empty_chunks=True.
- B1-B3 Buffer invariants — store IO buffers may be read-only;
  decode_chunk may return a view; prototype is per-call.

The corresponding tests are quick and run on every test invocation. A
pipeline change that violates any contract fails immediately, instead
of waiting for an end-to-end test in a particular config to surface
the problem.

Two of the S3 tests are skipped when SyncCodecPipeline is not the
default — they're meaningful on the bench branch and a no-op otherwise.
1 parent 12e304c commit 68a7cdc

2 files changed

Lines changed: 505 additions & 0 deletions

# Codec Pipeline Invariants

This document describes the contracts that the codec pipeline, codec
chain, sharding codec, and buffer abstractions rely on. Each invariant
is enforced by a runtime test in `tests/test_codec_invariants.py`.
Any change to the pipeline that violates one of these invariants
should fail those tests immediately, instead of waiting for an
end-to-end test in a particular configuration to surface the bug.
## Why this exists

The new `SyncCodecPipeline` accumulated several correctness bugs that
all had the same shape: **case-by-case reasoning about how codec /
shard / IO / buffer pieces interact, missing a combination.** Examples:

- The `ChunkTransform` cached prototype-bearing specs at construction
  time, so per-call GPU prototypes were silently discarded.
- The byte-range write path produced a dense layout, breaking the
  shard-compactness contract that downstream code (and existing
  tests) depended on.
- The shard index decoded from a read-only buffer crashed when the
  byte-range path tried to mutate it.
- Empty inner chunks weren't compacted out of partial shard writes.

In each case, the code "worked" for the configuration the author was
thinking about, and broke for a different one. The invariants below
are the contracts the author should have written down first.
## Codec chain invariants

### C1. Codecs only mutate the array spec's `shape`.

`Codec.resolve_metadata(spec)` returns an `ArraySpec` that may differ
from `spec` in `shape` (e.g. `TransposeCodec` permutes it). Every
other field — `prototype`, `dtype`, `fill_value`, `config` — is
unchanged.

**Consequence:** A `ChunkTransform` cannot bake any field other than
`shape` into a cache. Anything else (notably `prototype`) must be
read from the runtime `chunk_spec` on every call.

**Test:** for every registered codec, call `resolve_metadata(spec)`
with various specs and assert `result.prototype is spec.prototype`,
`result.dtype == spec.dtype`, etc.
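The shape of that test can be sketched in miniature. `ArraySpec` and `TransposeLikeCodec` below are stand-in models (only the field names come from this doc, not the real implementation); the real test iterates over every registered codec:

```python
from dataclasses import dataclass, replace

# Stand-ins for the real ArraySpec and a shape-permuting codec; field
# names follow the doc, but these are illustrative models.
@dataclass(frozen=True)
class ArraySpec:
    shape: tuple
    dtype: str
    fill_value: object
    prototype: object
    config: object = None

class TransposeLikeCodec:
    def __init__(self, order):
        self.order = order

    def resolve_metadata(self, spec: ArraySpec) -> ArraySpec:
        # C1: only `shape` may change; every other field passes through.
        return replace(spec, shape=tuple(spec.shape[i] for i in self.order))

def check_c1(codec, spec: ArraySpec) -> None:
    out = codec.resolve_metadata(spec)
    assert out.prototype is spec.prototype, "codec replaced the prototype"
    assert out.dtype == spec.dtype, "codec changed the dtype"
    assert out.fill_value == spec.fill_value, "codec changed the fill_value"
    assert out.config == spec.config, "codec changed the config"

sentinel_proto = object()
spec = ArraySpec(shape=(4, 8), dtype="u1", fill_value=0, prototype=sentinel_proto)
check_c1(TransposeLikeCodec(order=(1, 0)), spec)
```

Using an identity check (`is`) on `prototype` makes the test robust to prototypes that happen to compare equal.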
### C2. Each codec call receives the spec at its position in the chain.

A `chunk_spec` flows through the codec chain. AA codecs see the
running spec (each step potentially changes shape). The AB codec sees
the spec after all AA codecs have resolved. BB codecs all see the
post-AB spec.

**Consequence:** the pipeline must walk the AA codecs to get the AB
spec for any given input. The walk is cheap; what matters is that
the pipeline walks from the *runtime* `chunk_spec`, not from a
constructor-time spec.

**Test:** a fake codec that asserts `chunk_spec.prototype is X` where
`X` was passed to the pipeline at call time.
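A minimal sketch of that test idea, assuming nothing beyond the doc: a recording fake codec plus a chain walk that starts from the runtime `chunk_spec`, so per-call prototypes are observed instead of a constructor-time one:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ArraySpec:
    shape: tuple
    prototype: object

class RecordingCodec:
    """Fake AA codec that records the prototype it is called with."""
    def __init__(self):
        self.seen = []

    def resolve_metadata(self, spec):
        self.seen.append(spec.prototype)
        return spec  # shape unchanged for this fake

def walk_chain(codecs, chunk_spec):
    # The walk starts from the *runtime* chunk_spec, never a cached one.
    spec = chunk_spec
    for codec in codecs:
        spec = codec.resolve_metadata(spec)
    return spec

codec = RecordingCodec()
cpu_proto, gpu_proto = object(), object()
walk_chain([codec], ArraySpec((4,), cpu_proto))
walk_chain([codec], ArraySpec((4,), gpu_proto))
assert codec.seen == [cpu_proto, gpu_proto]  # per-call prototypes observed
```

A cache keyed on the constructor-time spec would make `seen` contain the same prototype twice, which is exactly the bug C2 guards against.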
### C3. The pipeline owns IO unless a codec opts in to partial encode/decode.

Codecs are pure compute. The pipeline is responsible for fetching
encoded blobs from the store, calling `decode_chunk`, and writing
encoded results back. The exception is when the AB codec implements
`ArrayBytesCodecPartialDecodeMixin` / `ArrayBytesCodecPartialEncodeMixin`
— then the codec receives the store handle and handles its own IO
(this is how `ShardingCodec` does byte-range reads/writes).

**Consequence:** A pipeline must check `supports_partial_encode` /
`supports_partial_decode` and dispatch to the codec when true. It must
not branch on codec type (`isinstance(..., ShardingCodec)`).

**Test:** the pipeline should produce identical results when forced
through both code paths (force a partial-encode-capable codec to use
the generic path, and vice versa where possible).
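The dispatch rule can be illustrated with a toy pipeline fragment. The classes and the string-valued "chunks" are hypothetical; only the capability check mirrors the contract above:

```python
class GenericCodec:
    supports_partial_decode = False

    def decode(self, blob):
        return blob.upper()

class PartialCodec:
    supports_partial_decode = True

    def decode(self, blob):
        return blob.upper()

    def decode_partial(self, store, key, byte_range):
        # The codec owns its IO on this path (like ShardingCodec).
        start, stop = byte_range
        return store[key][start:stop].upper()

def read_chunk(codec, store, key, byte_range=None):
    # Dispatch on capability, never on isinstance(codec, SomeCodec).
    if byte_range is not None and codec.supports_partial_decode:
        return codec.decode_partial(store, key, byte_range)
    decoded = codec.decode(store[key])  # the pipeline owns the IO
    if byte_range is not None:
        start, stop = byte_range
        decoded = decoded[start:stop]   # slice after a full decode
    return decoded

store = {"c/0": "abcdef"}
# Both paths must agree, which is what the C3 test asserts.
assert read_chunk(GenericCodec(), store, "c/0", (1, 3)) == "BC"
assert read_chunk(PartialCodec(), store, "c/0", (1, 3)) == "BC"
```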
## Shard layout invariants

### S1. The shard layout is *compact*, not dense.

A shard blob contains only the inner chunks that are present, packed
together (in Morton order). Absent chunks are recorded in the index
with `(MAX_UINT_64, MAX_UINT_64)` and consume no space in the data
region.

**Consequence:** Even with fixed-size inner codecs, the shard size
varies with how many chunks are present. Anything that writes a shard
must produce the compact layout.

**Counter-example we hit:** the byte-range fast path wrote every
affected chunk into a fixed slot, producing a dense layout. This
worked for reads (the index correctly recorded offsets) but broke
size-checking tests and wasted space.
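The compact layout can be modeled in a few lines. `pack_shard` is a hypothetical helper, not the real encoder; only the `(MAX_UINT_64, MAX_UINT_64)` sentinel comes from the contract above:

```python
MAX_UINT_64 = 2**64 - 1

def pack_shard(chunks):
    """Pack present chunks back-to-back; absent ones get the sentinel."""
    index, data = [], b""
    for chunk in chunks:  # iterate in Morton order
        if chunk is None:
            index.append((MAX_UINT_64, MAX_UINT_64))  # absent: no bytes
        else:
            index.append((len(data), len(chunk)))     # (offset, length)
            data += chunk
    return index, data

index, data = pack_shard([b"aaaa", None, b"bb"])
assert index == [(0, 4), (MAX_UINT_64, MAX_UINT_64), (4, 2)]
assert len(data) == 6  # compact: the absent chunk consumes no space
```

A dense layout would instead reserve a fixed slot per chunk, making `len(data)` independent of how many chunks are present, which is the counter-example described above.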
### S2. `write_empty_chunks=False` (the default) means an inner chunk that equals `fill_value` must NOT be written.

When merged data equals the fill value, the chunk is omitted from
the shard entirely (no entry in the data region, `MAX_UINT_64` in
the index). If all chunks become absent, the entire shard is deleted.

**Consequence:** any partial-shard-write code path must compute the
merged chunk content, check against `fill_value`, and omit it if
they match.

**Test:** write fill-value data to a sharded array; assert the
relevant store keys do not exist.
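The S2 rule, sketched with plain lists standing in for chunk buffers (`chunks_to_write` is illustrative, not the real write path):

```python
def chunks_to_write(merged, fill_value, write_empty_chunks=False):
    """Drop chunks whose merged content equals fill_value (the default)."""
    out = {}
    for key, chunk in merged.items():
        if not write_empty_chunks and all(v == fill_value for v in chunk):
            continue  # omitted: sentinel in the index, nothing in the data region
        out[key] = chunk
    return out  # an empty dict means the whole shard is deleted

merged = {"0/0": [0, 0, 0, 0], "0/1": [0, 1, 2, 3]}
assert list(chunks_to_write(merged, fill_value=0)) == ["0/1"]
assert chunks_to_write({"0/0": [0, 0]}, 0) == {}  # all absent: delete shard
```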
### S3. The byte-range fast path requires `write_empty_chunks=True`.

Byte-range writes update fixed slots in an existing shard blob. They
cannot compact away chunks that newly equal `fill_value` (that would
require rewriting the whole blob anyway). The optimization is only
valid when the user explicitly opts out of empty-chunk skipping.

**Consequence:** `_encode_partial_sync`'s byte-range path is gated
on `not skip_empty`. Under the default config, partial shard writes
take the full-rewrite path.
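The gate can be written as a tiny decision function (hypothetical names and return values; the real gate lives in `_encode_partial_sync`):

```python
def choose_write_path(write_empty_chunks: bool, shard_exists: bool) -> str:
    skip_empty = not write_empty_chunks  # default config: skip_empty is True
    if shard_exists and not skip_empty:
        return "byte-range"   # fixed slots are safe: nothing needs compacting
    return "full-rewrite"     # must recompute the compact layout

# Default config: partial shard writes always take the full-rewrite path.
assert choose_write_path(write_empty_chunks=False, shard_exists=True) == "full-rewrite"
# Only an explicit opt-out of empty-chunk skipping enables the fast path.
assert choose_write_path(write_empty_chunks=True, shard_exists=True) == "byte-range"
```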
## Buffer invariants

### B1. Buffers returned from store IO may be read-only.

`LocalStore` returns mmap-backed buffers; `ZipStore` returns views
into a zip member; remote stores may return immutable buffers.

**Consequence:** any code that decodes from a store-returned buffer
and then mutates the result must `.copy()` first.

**Counter-example we hit:** `_ShardIndex.set_chunk_slice` mutates
`self.offsets_and_lengths`. When the index was decoded from a
read-only buffer, this raised `ValueError: assignment destination
is read-only`.
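The failure mode and the fix can be demonstrated with stdlib buffers, using a `memoryview` over `bytes` as a stand-in for an mmap-backed store buffer:

```python
raw = bytes(range(8))        # what a store get() might hand back
view = memoryview(raw)       # read-only, like an mmap-backed buffer
assert view.readonly

try:
    view[0] = 255            # the bug: mutating a store-returned buffer
except TypeError:
    pass                     # "cannot modify read-only memory"

mutable = bytearray(view)    # the fix: copy before mutating
mutable[0] = 255
assert mutable[0] == 255 and raw[0] == 0  # original buffer untouched
```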
### B2. `decode_chunk` may return a view of its input.

`BytesCodec._decode_sync` returns an `NDBuffer` that views the same
memory as the input `Buffer`. Subsequent mutations affect both.

**Consequence:** if the caller intends to mutate the decoded array
(e.g. for a partial-write merge), it must `.copy()` first.
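View semantics can be shown with a stdlib `memoryview` standing in for an `NDBuffer` that views its input:

```python
buf = bytearray(b"\x01\x02\x03\x04")
decoded = memoryview(buf)    # a view, like the decode output described above
decoded[0] = 99
assert buf[0] == 99          # mutating the view mutated the input buffer

safe = bytearray(decoded)    # the rule: copy before a partial-write merge
safe[1] = 77
assert buf[1] == 2           # the input is untouched once we work on a copy
```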
### B3. The buffer prototype is a runtime parameter, not metadata.

The `prototype` field of `ArraySpec` indicates what kind of buffer
the *caller* wants for this operation. The same array can be read
into CPU buffers on one call and GPU buffers on the next. The
codec pipeline must use the per-call prototype, not a cached one.

**Consequence:** see C1, C2 above.
## Test plan

`tests/test_codec_invariants.py` should contain:

1. **C1 enforcement:** parametric test over all registered codecs
   that calls `resolve_metadata` with a sentinel `ArraySpec` and
   asserts the non-shape fields are unchanged.

2. **C2 enforcement:** a test that passes a `ChunkTransform`
   different prototypes per call (using a fake codec that records
   the prototype it was called with) and asserts the recorded
   prototypes match.

3. **S1 + S2 enforcement:** a test that writes various combinations
   of present/absent chunks to a sharded array and asserts the
   resulting shard sizes match the compact-layout formula. Run for
   both `BatchedCodecPipeline` and `SyncCodecPipeline`.

4. **S3 enforcement:** a test that uses a mock store to record
   `set_range_sync` calls. Verify that under the default config no
   `set_range_sync` happens for partial shard writes; under
   `write_empty_chunks=True` it does.

5. **B1 enforcement:** a test that creates a sharded array on
   `LocalStore`, then triggers a partial write, and asserts no
   `ValueError: assignment destination is read-only` is raised.

6. **B2 enforcement:** a test that calls `BytesCodec._decode_sync`
   then mutates the result, and verifies the input buffer is not
   modified (or, if we accept the view semantics, document that
   callers must copy, and add a test that decode-then-mutate
   without a copy gives a clear error / known behavior).
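Item 4's mock store might be sketched like this; `set_sync` / `set_range_sync` follow the names used in the plan, but the class body is illustrative, not the real store API:

```python
class RecordingStore:
    """Mock store that records byte-range writes for later assertions."""
    def __init__(self):
        self.data = {}
        self.range_writes = []

    def set_sync(self, key, value):
        self.data[key] = bytes(value)

    def set_range_sync(self, key, offset, value):
        self.range_writes.append((key, offset, len(value)))
        buf = bytearray(self.data.get(key, b""))
        buf[offset:offset + len(value)] = value
        self.data[key] = bytes(buf)

store = RecordingStore()
store.set_sync("shard/0", b"\x00" * 8)
store.set_range_sync("shard/0", 2, b"\xff\xff")
# The S3 test asserts range_writes is empty under the default config
# and non-empty under write_empty_chunks=True.
assert store.range_writes == [("shard/0", 2, 2)]
assert store.data["shard/0"] == b"\x00\x00\xff\xff\x00\x00\x00\x00"
```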
