Commit 7a6d748

Project flag fields as flat string arrays in canonical JSON
A flag-word spec entry now surfaces in canonical-doc as a flat sorted `string[]` instead of a `{flags, flagsRaw}` wrapper. Each entry is either a canonical slug for a named set bit or `bit<N>` for set bits the spec table doesn't name. Canonical sort: named slugs alphabetical first, then `bit<N>` numerically.

Why: the wrapper had a redundant key collision when the parent field itself was named `flags` (e.g. `header.flags = {flags: [...], flagsRaw: ...}`), the wrapper-vs-array split made named and unnamed bits diff differently, and `flagsRaw` was hostile to hand-edits since toggling one bit required recomputing the hex. The flat array gives uniform diffs across named and unnamed bits, makes hand-edits one entry per bit, and preserves the strict-disjoint invariant via per-element refines (a `bit<N>` whose position falls inside the named-mask is rejected at both the schema layer and the wire boundary, and `bit<N>` with N >= codec width is also rejected).

`slugifyCodedName` rejects display strings whose slug would collide with the `bit<N>` sentinel namespace, reserving it for the projection's synthesized entries.

This is a wire-shape change for `*.pro.json` snapshots. Snapshots produced by previous versions need to be re-saved through this version's CLI before they can be loaded.
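The canonical sort order described in the message is not a plain string sort: a lexicographic sort would place `bit10` before `bit2` and would interleave sentinels with named slugs. A minimal sketch of a comparator that produces the stated order follows. The names `compareFlagEntries` and `sortFlagArray` are hypothetical and do not appear in the commit, and the code-unit comparison for named slugs is an assumption about what "alphabetical" means here.

```typescript
// Hypothetical helper, not from the repo: orders a flat flag array canonically.
const BIT_SENTINEL = /^bit(\d+)$/;

function compareFlagEntries(a: string, b: string): number {
    const bitA = BIT_SENTINEL.exec(a);
    const bitB = BIT_SENTINEL.exec(b);
    // Two sentinels compare by bit position, so bit2 sorts before bit10.
    if (bitA && bitB) return Number(bitA[1]) - Number(bitB[1]);
    // A sentinel always sorts after a named slug.
    if (bitA) return 1;
    if (bitB) return -1;
    // Assumption: named slugs compare by plain code-unit order.
    return a < b ? -1 : a > b ? 1 : 0;
}

function sortFlagArray(entries: readonly string[]): string[] {
    // Copy first so the caller's array is left untouched.
    return [...entries].sort(compareFlagEntries);
}

const sorted = sortFlagArray(["bit10", "noBlock", "bit2", "lightThru"]);
// sorted: ["lightThru", "noBlock", "bit2", "bit10"]
```

In the commit itself the order falls out of construction (`intToFlagArray` concatenates the alphabetically compiled named entries with an ascending bit scan), so a comparator like this would only matter for normalising hand-edited arrays.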
1 parent 3838996 commit 7a6d748

37 files changed

Lines changed: 442 additions & 406 deletions

binary/INTERNALS.md

Lines changed: 5 additions & 5 deletions
@@ -321,15 +321,15 @@ These are non-negotiable across PRO and MAP:
 5. **Linked structures.** Same-struct: array length drives count via `enforceLinkedCounts(spec, doc)` + zod refinement. Cross-struct (`fromCtx`): orchestrator owns the binding; the count flows in via the read-time ctx.
 6. **No work-time artifacts in the repo.** Exception: `tmp/` (in `.gitignore`).

-7. **Sorted-array projection for flag fields.** A flag-word spec entry (`{codec, flags: Table}`) surfaces in canonical-doc as a `{flags: string[], flagsRaw?: hexString}` strict-object wrapper, not as the raw int. `flags` lists the slugified-camelCase names of every set bit, alphabetically sorted; toggling one bit adds or removes one entry. `flagsRaw` is an optional hex-string reservoir for unnamed bits in the wire word, omitted when every set bit has a name. Strict-disjoint invariant: named bits never appear in `flagsRaw`, enforced at the wire boundary by `flagArrayToInt`. `compileFlagTable` slugifies the table's display strings to camelCase canonical keys (`"NoBlock"` -> `noBlock`); `intToFlagArray` / `flagArrayToInt` translate at the wire codec boundary via the `FlagArraySchema` wrapper.
+7. **Flat-array projection for flag fields.** A flag-word spec entry (`{codec, flags: Table}`) surfaces in canonical-doc as a flat sorted `string[]`, not as the raw int. Each entry is either a named slug (slugified-camelCase from the table's display string) or `bit<N>` (zero-based bit position) for set bits the table doesn't name. Canonical sort order: named slugs first alphabetically, then `bit<N>` in ascending bit position. Toggling one bit adds or removes one entry at its sorted position - same shape for named and unnamed bits, so diffs read uniformly. `compileFlagTable` slugifies display strings to camelCase canonical keys (`"NoBlock"` -> `noBlock`); `slugifyCodedName` rejects display strings whose slug would collide with the `bit<N>` sentinel namespace. `intToFlagArray` / `flagArrayToInt` translate at the wire codec boundary via the `FlagArraySchema` wrapper.

-Slugified identifiers (rather than the raw display strings) are the canonical token shape because the construction API (`docs/todo.md`) surfaces flags as TS members typed against a literal-name union - identifier-shaped names get the canonical dot-trigger autocomplete with per-flag JSDoc visible inline, which a quoted-display-string union does not. Schema validation messages and JSON Schema `items.enum` autocomplete also benefit from identifier tokens (no spacing/casing ambiguities like "No LOS required" vs "No los required"). The display string remains the parsed-tree label; the slug is the toolchain token, with one translation point (label <-> slug) at the projection boundary.
+Strict-disjoint invariant: a `bit<N>` entry whose position falls inside the named-mask is rejected at both the schema layer and the wire boundary - hand-edits must use the canonical slug for any spec-named bit. `bit<N>` with N >= codec width is also rejected at both layers, so synthesized entries cannot reference a bit past the wire word.

-The wrapper-at-canonical-doc shape (`header.flags = {flags: [...], flagsRaw: "..."}`) keeps the spec -> canonical-doc orchestration one-key-per-spec-field; a flat sidecar `header.flagsRaw` would require a special case in `toZodSchema` to emit two parent keys per flag spec entry.
+Slugified identifiers (rather than the raw display strings) are the canonical token shape because the construction API (`docs/todo.md`) surfaces flags as TS members typed against a literal-name union - identifier-shaped names get the canonical dot-trigger autocomplete with per-flag JSDoc visible inline, which a quoted-display-string union does not. Schema validation messages and JSON Schema `items.enum` autocomplete also benefit from identifier tokens (no spacing/casing ambiguities like "No LOS required" vs "No los required"). The display string remains the parsed-tree label; the slug is the toolchain token, with one translation point (label <-> slug) at the projection boundary.

-This is a consistent application of rule #1 - `packedAs`+`bitRange` already exposes byte-packed sub-fields as peer scalar entries; the named-bit projection exposes bit-packed sub-fields the same way (the wire packs N independent semantic units into one int; canonical separates them).
+This is a consistent application of rule #1 - `packedAs`+`bitRange` already exposes byte-packed sub-fields as peer scalar entries; the flat-array projection exposes bit-packed sub-fields the same way (the wire packs N independent semantic units into one int; canonical separates them into one entry per set bit).

-8. **Lossless reservoirs preserve unnamed bits.** Adding a name to a flag table is a non-breaking spec evolution: old snapshots load via the `flagsRaw` path, re-saving promotes the bit to its new name. `schemaVersion` does NOT bump for additive name changes; bumping is reserved for _re-interpretive_ spec changes (a previously-parsed field's meaning changes), which require explicit migration code in the snapshot codec. The byte round-trip invariant `serialize(parse(b)) === b` is the load-bearing property - every existing `*-roundtrip.test.ts` enforces it.
+8. **Lossless preservation of unnamed bits.** Adding a name to a flag table is a non-breaking spec evolution: old snapshots load via `bit<N>` entries, re-saving promotes the bit to its new slug. `schemaVersion` does NOT bump for additive name changes; bumping is reserved for _re-interpretive_ spec changes (a previously-parsed field's meaning changes), which require explicit migration code in the snapshot codec. The byte round-trip invariant `serialize(parse(b)) === b` is the load-bearing property - every existing `*-roundtrip.test.ts` enforces it.

 9. **Enums and PIDs stay numeric in canonical-doc by design.** Where flag fields project to sorted-array name lists (rule #7), enum and PID fields stay as raw integers - the diff-friendliness gain doesn't justify the complication. Half the enum fields drive dispatch (`objectType`, `subType`, `scriptType`, MAP `version` / `rotation` / `elevation`) and would force a conversion at every dispatch site if projected to strings; the rest produce diffs of the same line count whether named or numeric (`5 -> 0` vs `"Items" -> "Background"`), unlike flags where the diff is fundamentally lossy. The display layer's `enum` table resolves names for editor dropdowns and hover; the snapshot stays close to the wire.

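The strict-disjoint and width rules in rule #7 are per-element checks, which is what lets them run both as schema refines and at the wire boundary. The sketch below shows one way such a check could look as a plain function. `checkFlagEntry`, its `string | null` return shape, and the example masks are all hypothetical; the repo's actual schema-layer refine is not shown in this diff.

```typescript
// Hypothetical per-element check; name, signature and messages are illustrative only.
function checkFlagEntry(
    entry: string,
    namedKeys: ReadonlyMap<string, number>, // canonical slug -> bit mask
    codecBitWidth: number,
): string | null {
    // A spec-named slug is always acceptable.
    if (namedKeys.has(entry)) return null;
    const match = /^bit(\d+)$/.exec(entry);
    if (!match) return `unknown flag "${entry}"`;
    const position = Number(match[1]);
    // No synthesized entry may point past the wire word.
    if (position >= codecBitWidth) return `bit position ${position} exceeds codec width ${codecBitWidth}`;
    // OR all named masks together to get the named-mask.
    let namedMask = 0;
    namedKeys.forEach((mask) => {
        namedMask = (namedMask | mask) >>> 0;
    });
    const bitMask = (1 << position) >>> 0;
    // Strict-disjoint: a spec-named bit must be written as its slug.
    if ((bitMask & namedMask) !== 0) return `bit position ${position} is spec-named; use its slug`;
    return null;
}

// Assumed example masks, not taken from any real spec table.
const exampleKeys = new Map([
    ["noBlock", 0x10],
    ["lightThru", 0x20000000],
]);
const okNamed = checkFlagEntry("noBlock", exampleKeys, 32); // null: named slug
const okUnnamed = checkFlagEntry("bit5", exampleKeys, 32); // null: unnamed bit inside the word
const overlap = checkFlagEntry("bit4", exampleKeys, 32); // rejected: bit 4 is noBlock
const tooWide = checkFlagEntry("bit32", exampleKeys, 32); // rejected: past a 32-bit word
```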
binary/src/itm/schemas.ts

Lines changed: 6 additions & 6 deletions
@@ -15,17 +15,17 @@ import { itmAbilitySpecAnnotated } from "./specs/ability.overrides";
 // Wire codecs use the *annotated* specs so flag fields project through
 // `intToFlagArray` / `flagArrayToInt` at the byte boundary - the
 // canonical-doc surface (which is built off the same annotated specs) sees
-// flags as sorted-array `{flags, flagsRaw?}` projections, matching what the
-// zod schema validates.
+// flags as flat sorted `string[]` projections (named slugs + `bit<N>`
+// sentinels), matching what the zod schema validates.
 export const itmHeaderSchema = toTypedBinarySchema(itmHeaderSpecAnnotated);
 export const itmAbilitySchema = toTypedBinarySchema(itmAbilitySpecAnnotated);
 export const effectSchema = toTypedBinarySchema(effectSpecAnnotated);

 // Re-export the data types projected from the *annotated* specs so flag
-// fields surface as `{flags: string[], flagsRaw?: string}` (the sorted-array
-// projection) rather than `number`. The bare-spec types in `./specs/header`
-// etc. remain the underlying source for the spread, but consumers (parser,
-// canonical reader/writer) should consume the annotated projection.
+// fields surface as `string[]` (the flat sorted-array projection) rather
+// than `number`. The bare-spec types in `./specs/header` etc. remain the
+// underlying source for the spread, but consumers (parser, canonical
+// reader/writer) should consume the annotated projection.
 import type { SpecData } from "../spec/types";

 export type ItmHeaderData = SpecData<typeof itmHeaderSpecAnnotated>;

binary/src/map/canonical-reader.ts

Lines changed: 1 addition & 1 deletion
@@ -308,7 +308,7 @@ export function rebuildMapCanonicalDocument(parseResult: ParseResult): MapCanoni
         ...headerScalars,
         version: headerScalars.version >>> 0,
         filename: readString(headerGroup, "Filename"),
-        // `flags` is a sorted-array projection (`{flags, flagsRaw?}`) produced
+        // `flags` is a flat sorted-array projection (`string[]`) produced
         // by `walkGroup`; no signedness coercion applies.
         timestamp: headerScalars.timestamp >>> 0,
         defaultElevation: clampNumericValue(headerScalars.defaultElevation, "int32", {

binary/src/map/schemas.ts

Lines changed: 6 additions & 5 deletions
@@ -23,12 +23,13 @@ export interface MapHeader {
     numLocalVars: number;
     scriptId: number;
     /**
-     * `flags` is the sorted-array projection produced by the wire codec
-     * (`FlagArraySchema`). Consumers that need a numeric mask use the
-     * `FlagArray` overload of `hasElevation` (see `./types.ts`); raw-int
-     * access goes through `flagArrayToInt(MapFlags, ...)` if needed.
+     * `flags` is the flat sorted-array projection produced by the wire
+     * codec (named slugs first, then `bit<N>` for unnamed set bits).
+     * Consumers that need a numeric mask use the `FlagArray` overload of
+     * `hasElevation` (see `./types.ts`); raw-int access goes through
+     * `flagArrayToInt(MapFlags, ..., 32)` if needed.
      */
-    flags: { flags: string[]; flagsRaw?: string };
+    flags: string[];
     darkness: number;
     numGlobalVars: number;
     mapId: number;

binary/src/map/types.ts

Lines changed: 2 additions & 2 deletions
@@ -131,11 +131,11 @@ export const ObjectFlags: Record<number, string> = {
  * bitmask form is preserved as a fallback for callers that still hold a
  * raw int (e.g., low-level parsers).
  */
-export function hasElevation(flags: number | { flags: string[]; flagsRaw?: string }, elevation: number): boolean {
+export function hasElevation(flags: number | string[], elevation: number): boolean {
     if (typeof flags === "number") {
         return (flags & (0x2 << elevation)) === 0;
     }
     const key =
         elevation === 0 ? "skipElevation0Tiles" : elevation === 1 ? "skipElevation1Tiles" : "skipElevation2Tiles";
-    return !flags.flags.includes(key);
+    return !flags.includes(key);
 }

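To make the two call forms of `hasElevation` concrete, the sketch below copies the post-change function body from the hunk above so it runs on its own, then calls it with a raw int and with the equivalent flat array. It assumes that the MAP flag table maps masks `0x2`, `0x4` and `0x8` to the three `skipElevation<N>Tiles` slugs; the function body implies this, but the table itself is not part of this diff.

```typescript
// Copied in shape from the post-change hunk so the example is self-contained.
function hasElevation(flags: number | string[], elevation: number): boolean {
    if (typeof flags === "number") {
        // Raw-int form: a set skip bit means the elevation is absent.
        return (flags & (0x2 << elevation)) === 0;
    }
    const key =
        elevation === 0 ? "skipElevation0Tiles" : elevation === 1 ? "skipElevation1Tiles" : "skipElevation2Tiles";
    // Flat-array form: the slug's presence means the elevation is absent.
    return !flags.includes(key);
}

// Assumed equivalence: 0x2 | 0x8 projects to the two slugs below.
const rawWord = 0x2 | 0x8;
const projectedFlags = ["skipElevation0Tiles", "skipElevation2Tiles"];

const fromInt = [0, 1, 2].map((e) => hasElevation(rawWord, e));
const fromArray = [0, 1, 2].map((e) => hasElevation(projectedFlags, e));
// Both are [false, true, false]: only elevation 1 is present.
```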
binary/src/pro/canonical-reader.ts

Lines changed: 4 additions & 5 deletions
@@ -105,11 +105,10 @@ function readClampedFieldNumber(
 }

 /**
- * Read a flag-word field from the display tree and project it to the
- * sorted-array `{flags, flagsRaw?}` shape canonical-doc expects. Width is
- * hard-coded per call site to match the underlying spec codec - every PRO
- * flag word is u8 / u24 / u32 in the spec, mapped to the matching
- * `codecBitWidth` here.
+ * Read a flag-word field from the display tree and project it to the flat
+ * sorted `string[]` shape canonical-doc expects. Width is hard-coded per
+ * call site to match the underlying spec codec - every PRO flag word is
+ * u8 / u24 / u32 in the spec, mapped to the matching `codecBitWidth` here.
  */
 function readFlagArray(
     group: ParsedGroup,

binary/src/pro/index.ts

Lines changed: 1 addition & 1 deletion
@@ -319,7 +319,7 @@ function parseCritter(data: CritterData): ParsedGroup[] {
     // flagsField. The dynamic-access subset for fieldsFromDefs is the
     // numeric-only view.
     const critterData = data as unknown as Record<string, number>;
-    const critterFlagsInt = flagArrayToInt(CritterFlags, data.critterFlags);
+    const critterFlagsInt = flagArrayToInt(CritterFlags, data.critterFlags, 32);

     return [
         group("Critter Properties", [

binary/src/spec/coded-projection.ts

Lines changed: 78 additions & 47 deletions
@@ -69,9 +69,21 @@ export function slugifyCodedName(displayName: string): string {
             `slugifyCodedName: name "${displayName}" produced "${camel}" which is not a valid JS identifier`,
         );
     }
+    if (RESERVED_BIT_PATTERN.test(camel)) {
+        // `bit<N>` is the reserved sentinel for unnamed set bits in
+        // FlagArray projections (see `intToFlagArray`). Spec authors must
+        // not pick a display name whose slug collides with that namespace,
+        // since the projection cannot distinguish a spec-named `bit13`
+        // from "bit at position 13" on encode.
+        throw new Error(
+            `slugifyCodedName: name "${displayName}" produced reserved sentinel "${camel}"; flag display names must not slugify to bit<N>`,
+        );
+    }
     return camel;
 }

+const RESERVED_BIT_PATTERN = /^bit\d+$/;
+
 export interface FlagBitEntry {
     readonly key: string;
     readonly mask: number;
@@ -82,9 +94,9 @@ export interface FlagBitEntry {
  * Compile a flag table (`{[mask]: displayName}`) into a sorted entry list with
  * canonical keys, plus the OR'd `namedMask` covering every named bit.
  *
- * Sorted alphabetically by canonical key so the projected `flags` array
- * serialises in stable order - toggling one bit adds or removes one entry
- * at its alphabetical position, regardless of bit position.
+ * Sorted alphabetically by canonical key so the projected array serialises
+ * in stable order - toggling a named bit adds or removes one entry at its
+ * alphabetical position, regardless of bit position.
  *
  * Returns frozen entries to discourage in-place mutation.
  */
@@ -108,90 +120,109 @@ export function compileFlagTable(table: Readonly<Record<number, string>>): {
 }

 /**
- * Build a default array projection - empty `flags`, no `flagsRaw`. Used by
- * structural-edit transitions and as a default in test fixtures or
- * construction APIs.
+ * Sorted-array projection of a flag word. Each entry is one of:
+ *
+ * - A canonical (slugified-camelCase) key from the spec table, identifying
+ *   a named set bit (e.g. `lightThru`).
+ * - `bit<N>` where N is the zero-based bit position, identifying a set bit
+ *   the spec table doesn't name (e.g. `bit5`, `bit13`).
+ *
+ * Canonical sort order: named keys alphabetically first, then `bit<N>`
+ * entries in ascending bit-position order. Toggling one bit adds or removes
+ * exactly one entry at its sorted position - same shape for named and
+ * unnamed bits, so diffs read uniformly.
+ *
+ * Strict-disjoint invariant: a `bit<N>` entry is rejected at the wire
+ * boundary if `1 << N` falls inside the spec table's named-mask, since a
+ * hand-edit must use the canonical name for any spec-named bit. Encode
+ * also rejects N >= codecBitWidth (no synthetic bits past the wire word).
  */
-export function emptyFlagArray(_table: Readonly<Record<number, string>>): FlagArray {
-    return { flags: [] };
-}
+export type FlagArray = string[];

 /**
- * Sorted-array projection of a flag word - `flags` lists every set bit by its
- * canonical (slugified-camelCase) name, `flagsRaw` carries any wire bits the
- * spec table doesn't name as a hex string. Both fields are wire-shape:
- * `flags` order is alphabetical for stable diffs, `flagsRaw` is omitted in
- * the common case where every set bit has a name.
+ * Build a default flag-array projection (empty array). Used by
+ * structural-edit transitions and as a default in test fixtures or
+ * construction APIs.
  */
-export interface FlagArray {
-    flags: string[];
-    flagsRaw?: string;
+export function emptyFlagArray(_table: Readonly<Record<number, string>>): FlagArray {
+    return [];
 }

 /**
- * Project an integer flag word to a sorted array of slugified names. Each
- * named bit that's set contributes its canonical key; unnamed bits land in
- * `flagsRaw` as a lowercase hex string. `flagsRaw` is omitted when all set
- * bits are named.
+ * Project an integer flag word to a sorted FlagArray. Named set bits
+ * contribute their canonical key (alphabetical); unnamed set bits within
+ * the codec's bit width contribute `bit<N>` entries (numeric).
  *
- * `codecBitWidth` (8 / 16 / 24 / 32) masks `flagsRaw` to the wire width so
- * sign-extended bits a JS bit-OR might surface don't leak in.
+ * `codecBitWidth` (8 / 16 / 24 / 32) bounds the per-bit scan so sign-
+ * extended bits a JS bit-OR might surface don't leak in.
  */
 export function intToFlagArray(
     table: Readonly<Record<number, string>>,
     value: number,
     codecBitWidth: number,
 ): FlagArray {
     const { entries, namedMask } = compileFlagTable(table);
-    const flags: string[] = [];
+    const named: string[] = [];
     for (const entry of entries) {
-        if ((value & entry.mask) !== 0) flags.push(entry.key);
+        if ((value & entry.mask) !== 0) named.push(entry.key);
     }
     const codecMask = codecBitWidth >= 32 ? 0xffffffff : (1 << codecBitWidth) - 1;
     const reservoir = (value & ~namedMask & codecMask) >>> 0;
-    if (reservoir !== 0) {
-        return { flags, flagsRaw: `0x${reservoir.toString(16)}` };
+    const bits: string[] = [];
+    for (let i = 0; i < codecBitWidth; i++) {
+        if ((reservoir & (1 << i)) !== 0) bits.push(`bit${i}`);
     }
-    return { flags };
+    return [...named, ...bits];
 }

 /**
- * Pack a flag array back to an integer. Every name in `flags` contributes its
- * mask; `flagsRaw` (hex) ORs in. Throws on unknown names, duplicate names,
- * malformed `flagsRaw`, or a `flagsRaw` value overlapping a named bit
- * (strict-disjoint invariant - the hand-edit surface should not let the same
- * bit be specified twice).
+ * Pack a FlagArray back to an integer. Each entry contributes its bit(s):
+ * named keys via the spec table, `bit<N>` via `1 << N`. Throws on:
+ *
+ * - unknown names that match neither the table nor `bit<N>`,
+ * - duplicate entries,
+ * - `bit<N>` whose position overlaps a named-bit mask (strict-disjoint
+ *   invariant - the hand-edit surface should not let the same bit be
+ *   specified twice),
+ * - `bit<N>` with N >= codecBitWidth (no synthetic bits past the wire
+ *   word).
  */
-export function flagArrayToInt(table: Readonly<Record<number, string>>, projection: FlagArray): number {
+export function flagArrayToInt(
+    table: Readonly<Record<number, string>>,
+    projection: FlagArray,
+    codecBitWidth: number,
+): number {
     const { entries, namedMask } = compileFlagTable(table);
     const byKey = new Map(entries.map((entry) => [entry.key, entry.mask]));
     const seen = new Set<string>();
     let value = 0;
-    for (const name of projection.flags) {
+    for (const name of projection) {
         if (seen.has(name)) {
             throw new Error(`flagArrayToInt: duplicate flag name "${name}"`);
         }
         seen.add(name);
         const mask = byKey.get(name);
-        if (mask === undefined) {
+        if (mask !== undefined) {
+            value = (value | mask) >>> 0;
+            continue;
+        }
+        const bitMatch = RESERVED_BIT_PATTERN.exec(name);
+        if (!bitMatch) {
             throw new Error(`flagArrayToInt: unknown flag "${name}" (known: ${entries.map((e) => e.key).join(", ")})`);
         }
-        value = (value | mask) >>> 0;
-    }
-    if (projection.flagsRaw !== undefined) {
-        if (typeof projection.flagsRaw !== "string" || !/^0x[0-9a-f]+$/i.test(projection.flagsRaw)) {
-            throw new TypeError(
-                `flagArrayToInt: flagsRaw must be a hex string ("0x..."); got ${String(projection.flagsRaw)}`,
+        const position = Number(bitMatch[0].slice(3));
+        if (position >= codecBitWidth) {
+            throw new Error(
+                `flagArrayToInt: bit position ${position} exceeds codec width ${codecBitWidth} for "${name}"`,
             );
         }
-        const reservoir = Number.parseInt(projection.flagsRaw, 16);
-        if ((reservoir & namedMask) !== 0) {
-            const overlapHex = (reservoir & namedMask).toString(16);
+        const bitMask = (1 << position) >>> 0;
+        if ((bitMask & namedMask) !== 0) {
             throw new Error(
-                `flagArrayToInt: flagsRaw ${projection.flagsRaw} overlaps named-bit mask 0x${overlapHex}; named bits must be set via the flags array`,
+                `flagArrayToInt: bit position ${position} overlaps named-bit mask; named bits must be set by their canonical name, not "${name}"`,
             );
         }
-        value = (value | reservoir) >>> 0;
+        value = (value | bitMask) >>> 0;
     }
     return value;
 }
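A worked round trip may help when reviewing the two functions above. The sketch below is a simplified stand-in, not the repo code: `projectSketch` and `packSketch` are hypothetical names, the table is pre-slugged and hard-coded with assumed masks rather than passed as `{[mask]: displayName}`, and the duplicate, width and overlap checks of `flagArrayToInt` are left out to keep it short.

```typescript
// Assumed example table, already slugged and in alphabetical order.
const exampleEntries: ReadonlyArray<readonly [string, number]> = [
    ["lightThru", 0x20000000],
    ["noBlock", 0x10],
];
const exampleNamedMask = exampleEntries.reduce((acc, [, mask]) => (acc | mask) >>> 0, 0);

// Project: named slugs first, then bit<N> for unnamed set bits, ascending.
function projectSketch(value: number, codecBitWidth: number): string[] {
    const named = exampleEntries.filter(([, mask]) => (value & mask) !== 0).map(([key]) => key);
    const bits: string[] = [];
    for (let i = 0; i < codecBitWidth; i++) {
        const bit = (1 << i) >>> 0;
        if ((value & bit) !== 0 && (exampleNamedMask & bit) === 0) bits.push(`bit${i}`);
    }
    return [...named, ...bits];
}

// Pack: named slugs via the table, bit<N> via 1 << N. Only unknown
// entries are rejected here; the real function checks more.
function packSketch(projection: readonly string[]): number {
    let value = 0;
    for (const name of projection) {
        const entry = exampleEntries.find(([key]) => key === name);
        if (entry) {
            value = (value | entry[1]) >>> 0;
            continue;
        }
        const match = /^bit(\d+)$/.exec(name);
        if (!match) throw new Error(`unknown flag "${name}"`);
        value = (value | (1 << Number(match[1]))) >>> 0;
    }
    return value;
}

const word = 0x20002030; // bits 4, 5, 13 and 29 set
const projected = projectSketch(word, 32);
// projected: ["lightThru", "noBlock", "bit5", "bit13"]
const repacked = packSketch(projected);
// repacked === word, which is the per-field form of serialize(parse(b)) === b
```

Bits 4 and 29 come back as their slugs, while bits 5 and 13 have no name in the assumed table and surface as sentinels, so removing `"bit5"` from the array is a one-line edit that clears exactly one bit.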
20 Bytes
Binary file not shown.
