| 1 | +# Row encoding and its limits |
| 2 | + |
| 3 | +`Row` is Materialize's in-memory representation of a tuple of `Datum`s. |
| 4 | +It is a `Tag`-based byte sequence defined in `src/repr/src/row.rs`, where each datum is a one-byte tag followed by a tag-specific payload. |
| 5 | +This document collects the size limits the encoding and its datum types impose, so that callers know what a `Row` can hold and where overflow is rejected. |
| 6 | +The limits live in scattered constants across `src/repr/src`; this is the index to them. |
| 7 | + |
| 8 | +## What the encoding is (and isn't) |
| 9 | + |
| 10 | +The `Tag` encoding has three properties that bound how these limits may change. |
| 11 | + |
| 12 | +* It is **not durably persisted**: persist stores data as Arrow (`ProtoRow`), so the in-memory layout can change without a migration. |
| 13 | +* `Row` **sort order is implementation-defined**, so layout changes cannot break ordering correctness. |
| 14 | +* `Row` **equality is byte equality**, so any value must encode to identical bytes regardless of how it was built. |
| 15 | + |
| 16 | +These limits are therefore in-memory-`Row` limits, distinct from durable (persist) limits and from transport limits. |
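
The byte-equality property can be illustrated with a minimal sketch of a length-tiered encoder. The tier names follow the per-datum table in this document; the tag byte values, the little-endian length layout, and the function names `push_bytes` and `encode_row` are assumptions made for illustration and are not the definitions in `row.rs`. The point is that always choosing the smallest tier that fits makes the encoding canonical, so two rows holding the same values compare equal byte for byte.

```rust
// Hypothetical tag bytes for illustration only; the real values live in `row.rs`.
const TAG_BYTES_TINY: u8 = 0; // length stored as u8
const TAG_BYTES_SHORT: u8 = 1; // length stored as u16
const TAG_BYTES_LONG: u8 = 2; // length stored as u32
const TAG_BYTES_HUGE: u8 = 3; // length stored as u64

/// Appends one byte-string datum: a one-byte tag, a length whose width depends
/// on the tier (little-endian here, by assumption), then the payload.
/// Always picking the smallest tier that fits keeps the encoding canonical.
fn push_bytes(buf: &mut Vec<u8>, payload: &[u8]) {
    let len = payload.len();
    if len <= u8::MAX as usize {
        buf.push(TAG_BYTES_TINY);
        buf.push(len as u8);
    } else if len <= u16::MAX as usize {
        buf.push(TAG_BYTES_SHORT);
        buf.extend_from_slice(&(len as u16).to_le_bytes());
    } else if len <= u32::MAX as usize {
        buf.push(TAG_BYTES_LONG);
        buf.extend_from_slice(&(len as u32).to_le_bytes());
    } else {
        buf.push(TAG_BYTES_HUGE);
        buf.extend_from_slice(&(len as u64).to_le_bytes());
    }
    buf.extend_from_slice(payload);
}

/// Encodes a sequence of byte strings into one contiguous buffer.
fn encode_row(datums: &[&[u8]]) -> Vec<u8> {
    let mut buf = Vec::new();
    for d in datums {
        push_bytes(&mut buf, d);
    }
    buf
}
```

If an encoder were allowed to store a 2-byte payload under the Short tier, the same value would have two byte representations and byte equality would no longer coincide with value equality.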
| 17 | + |
| 18 | +## Per-datum limits |
| 19 | + |
| 20 | +| Datum type | Limit | Enforcement | Source | |
| 21 | +| --- | --- | --- | --- | |
| 22 | +| `String` / `Bytes` / `List` payload | up to `u64::MAX` bytes (tag tiers Tiny `u8`, Short `u16`, Long `u32`, Huge `u64`) | none; bounded by `usize` and memory | `row.rs` `Tag`, `read_lengthed_datum` | |
| 23 | +| `char(n)` / `varchar(n)` length | ≤ `10_485_760` characters (10 × 2^20; a character count, not a byte size) | error at type construction | `adt/char.rs::MAX_LENGTH`, `adt/varchar.rs::MAX_MAX_LENGTH` |
| 24 | +| `numeric` precision | ≤ 39 significant digits (`13 * 3`) | error / rounding at construction | `adt/numeric.rs::NUMERIC_DATUM_MAX_PRECISION` | |
| 25 | +| array dimensions | ≤ 6 | `InvalidArrayError` at construction | `adt/array.rs::MAX_ARRAY_DIMENSIONS` | |
| 26 | +| `regex` pattern | ≤ 1 MiB source, ≤ 10 MiB compiled | error at compilation | `adt/regex.rs::MAX_REGEX_SIZE_BEFORE_COMPILATION` / `_AFTER_COMPILATION` | |
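
The "error at construction" rows in the table above can be sketched as plain upper-bound checks. This is an illustrative sketch only: the constants are local stand-ins that mirror the values quoted in the table (real code should import them from `src/repr/src/adt`), and `LimitError` together with the `check_*` functions are hypothetical names, not the error types or constructors the `adt` modules actually use.

```rust
// Local stand-ins mirroring the values quoted in the table. Real code should
// import the constants from `src/repr/src/adt` instead of copying literals.
const MAX_CHAR_LENGTH: u32 = 10_485_760;
const MAX_ARRAY_DIMENSIONS: usize = 6;
const NUMERIC_DATUM_MAX_PRECISION: u8 = 13 * 3;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum LimitError {
    CharLengthTooLarge(u32),
    TooManyDimensions(usize),
    PrecisionTooLarge(u8),
}

/// Accepts a declared `char(n)` / `varchar(n)` length up to the maximum.
/// Only the upper bound is checked here.
fn check_char_length(n: u32) -> Result<u32, LimitError> {
    if n > MAX_CHAR_LENGTH {
        Err(LimitError::CharLengthTooLarge(n))
    } else {
        Ok(n)
    }
}

/// Accepts an array with at most `MAX_ARRAY_DIMENSIONS` dimensions.
fn check_array_dimensions(dims: usize) -> Result<usize, LimitError> {
    if dims > MAX_ARRAY_DIMENSIONS {
        Err(LimitError::TooManyDimensions(dims))
    } else {
        Ok(dims)
    }
}

/// Accepts a numeric precision up to 39 significant digits.
fn check_numeric_precision(p: u8) -> Result<u8, LimitError> {
    if p > NUMERIC_DATUM_MAX_PRECISION {
        Err(LimitError::PrecisionTooLarge(p))
    } else {
        Ok(p)
    }
}
```

Each check returns a `Result` rather than panicking, matching the "graceful error" category that the maintenance section asks contributors to record.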
| 27 | + |
| 28 | +## Whole-`Row` size |
| 29 | + |
| 30 | +A `Row` itself stores its bytes in a `CompactBytes` (heap-backed like `Vec<u8>` once it spills past 23 inline bytes), so the in-memory value is bounded by `usize` and available memory rather than by a fixed cap. |
| 31 | +The containers a `Row` flows through impose tighter bounds: |
| 32 | + |
| 33 | +* **Arrangements** store `Row`s in the columnar `Rows` container (`impl Columnar for Row`), whose offset bounds are `u64` (default `Vec<u64>`). This does *not* cap a row at 32 bits. |
| 34 | +* **Persist** encodes columns as Arrow `BinaryArray` / `StringArray`, which use 32-bit (`i32`) offsets. The cumulative offsets within a part's column must fit `i32`, so a single encoded `Row` (and each variable-length column) is bounded to **< 2 GiB** on the durable path. Parts target far less (~128 MiB) in practice. |
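
The 2 GiB bound on the persist path follows from how 32-bit offsets accumulate. The sketch below is a standalone illustration, not the Arrow or persist API: `build_i32_offsets` is a hypothetical helper that builds an Arrow-style offsets buffer from value lengths and reports failure once the running total no longer fits in an `i32`.

```rust
/// Builds Arrow-style `i32` offsets for a variable-length column: one leading
/// zero, then the running total of value lengths. Returns `None` as soon as a
/// single value or the cumulative total would exceed `i32::MAX` bytes.
fn build_i32_offsets(value_lens: &[usize]) -> Option<Vec<i32>> {
    let mut offsets = Vec::with_capacity(value_lens.len() + 1);
    let mut total: i32 = 0;
    offsets.push(total);
    for &len in value_lens {
        // A single value longer than i32::MAX can never be addressed.
        if len > i32::MAX as usize {
            return None;
        }
        // The bound applies to the running total, not just to each value.
        total = total.checked_add(len as i32)?;
        offsets.push(total);
    }
    Some(offsets)
}
```

For example, `build_i32_offsets(&[3, 0, 5])` yields `Some(vec![0, 3, 3, 8])`, while a column holding one value of `i32::MAX` bytes followed by a one-byte value yields `None`: the limit is on all values of the column within a part taken together, which is why a single `Row` can never reach 2 GiB there.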
| 35 | + |
| 36 | +Two other limits typically apply to a `Row` before either container bound is reached: |
| 37 | + |
| 38 | +* the per-statement result size (`max_result_size` session variable; see `doc/developer/design/20250415_large_select_result_size.md`), and |
| 39 | +* the transport message size (gRPC) between `environmentd` and clusters. |
| 40 | + |
| 41 | +## Maintaining this document |
| 42 | + |
| 43 | +When adding a datum type, a length-tiered tag, or a fixed-size header, record the limit here with its enforcing constant and whether overflow is a graceful error, a panic, or unbounded. |
| 44 | +Keep the table pointing at the constant, not a copied literal, so it does not drift. |