
Commit 851f7cf

krisnye and claude authored
feat(data): add Guid type — 128-bit RFC 4122 v4, linear-memory ECS-compatible (#106)
* feat(data): add Guid type (128-bit RFC 4122 v4, linear-memory ECS-compatible)

  Adds `Guid` to `@adobe/data` as a 4×u32 tuple schema (16 bytes / 128 bits), following the namespace pattern alongside `Time` and `Boolean` in the schema folder. The tuple representation slots directly into `createStructBuffer` / the ECS column path without any infrastructure changes.

  Helpers: `create` (RFC 4122 v4 via `crypto.getRandomValues`), `toString`, `fromString`, `equals`, `nil`.

  Includes 30 vitest tests covering layout validity, struct buffer round-trips, version/variant bit correctness, and `toString`/`fromString` interop with `crypto.randomUUID()`.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(guid): add toUnserializableKey, performance tests, and README

  - Adds `Guid.toUnserializableKey`: 8-char WTF-16 Map key (minimum-length JS string for 128 bits, ~10× cheaper to produce than toString)
  - Adds `guid.performance.test.ts`: storage write/read comparison across StructTypedBuffer, BigUint64Array, and Array<bigint> at N=1M; Map key comparison across UUID string, BigInt, and toUnserializableKey at N=100K
  - Adds README with design rationale, full benchmark tables, and conclusions

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(guid): rename toString/fromString → toUUID/fromUUID

  Clarifies that the canonical output format is a UUID string ("xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx"), not an arbitrary string. Renames source files, tests, exports, README, and performance test references consistently.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(guid): surface toUUID vs toUnserializableKey performance inline

  Adds ~950 ns / ~87 ns cost annotations to the README API table and to to-uuid.ts, so the trade-off is visible at the call site without needing to cross-reference the performance test.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 2d0771a commit 851f7cf

18 files changed

Lines changed: 787 additions & 0 deletions
Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
# Guid

A 128-bit RFC 4122 v4 globally-unique identifier, designed for linear-memory
ECS storage and efficient in-process Map lookups.

## Representation

```ts
type Guid = readonly [number, number, number, number]; // 4 × u32
```

128 bits stored as a tuple of four unsigned 32-bit integers. This is the only
representation that slots into the ECS `StructTypedBuffer` column path without
any infrastructure changes: the struct codegen layer (`DataView32`,
`getStructLayout`, `createReadStruct`) is 32-bit-quad-indexed, so `F64`-based
or `bigint`-based schemas are rejected at that layer.

The schema is a fixed-length `U32` array (16 bytes, `std140`-aligned):

```ts
Guid.schema // → { type: 'array', items: U32.schema, minItems: 4, maxItems: 4 }
Guid.layout // → StructLayout { size: 16, type: 'array', fields: { 0,1,2,3 } }
```

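To make the 16-byte layout concrete, here is a minimal sketch that stores Guids in a flat `Uint32Array` with a stride of four words. It is only a stand-in for the idea behind a struct column: `createColumn`, `writeGuid`, and `readGuid` are hypothetical helpers, not part of `@adobe/data`, and the real `StructTypedBuffer` may be organised differently.

```ts
type Guid = readonly [number, number, number, number];

const WORDS_PER_GUID = 4; // 4 × u32 = 16 bytes per Guid

// Hypothetical column: one contiguous Uint32Array, 4 words per slot.
const createColumn = (capacity: number): Uint32Array =>
    new Uint32Array(capacity * WORDS_PER_GUID);

const writeGuid = (column: Uint32Array, index: number, g: Guid): void => {
    const base = index * WORDS_PER_GUID;
    column[base] = g[0];
    column[base + 1] = g[1];
    column[base + 2] = g[2];
    column[base + 3] = g[3];
};

const readGuid = (column: Uint32Array, index: number): Guid => {
    const base = index * WORDS_PER_GUID;
    return [column[base], column[base + 1], column[base + 2], column[base + 3]];
};

const column = createColumn(2);
writeGuid(column, 1, [0x12345678, 0x9abcdef0, 0x11223344, 0x55667788]);
const stored = readGuid(column, 1);
const bytesUsed = column.byteLength; // 2 slots × 16 bytes
```

Reads and writes here are plain typed-array element accesses with no conversion step, which is the property the tuple representation is chosen for.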

## API

```ts
Guid.create()                // → Guid     RFC 4122 v4 via crypto.getRandomValues
Guid.nil                     // → Guid     [0, 0, 0, 0]
Guid.equals(a, b)            // → boolean
Guid.toUUID(g)               // → string   "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx"   ~950 ns
Guid.fromUUID(s)             // → Guid     throws TypeError on bad input
Guid.toUnserializableKey(g)  // → string   8-char WTF-16 Map key, NOT serializable   ~87 ns
```

`toUUID` is for human-readable output and cross-system interop. Use
`toUnserializableKey` on any hot path where the result stays in-process:
it is **~10–11× faster** to produce (~87 ns vs ~950 ns) and hashes faster as a
Map key (~93 ns/set vs ~215 ns/set at N=100K).

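As an illustration of the canonical string format, the sketch below formats a tuple the way the `fromUUID` test vector in this commit implies: each `u32` contributes eight hex digits, most significant word first. `toUUIDSketch` is a hypothetical name; the real `Guid.toUUID` is not shown here and may be implemented differently.

```ts
type Guid = readonly [number, number, number, number];

// Assumption: word 0 holds the first 8 hex digits, word 3 the last 8.
const toUUIDSketch = (g: Guid): string => {
    const h = g.map((n) => (n >>> 0).toString(16).padStart(8, "0")).join("");
    return [
        h.slice(0, 8),
        h.slice(8, 12),
        h.slice(12, 16),
        h.slice(16, 20),
        h.slice(20),
    ].join("-");
};

const sample = toUUIDSketch([0x12345678, 0x9abcdef0, 0x11223344, 0x55667788]);
// sample === "12345678-9abc-def0-1122-334455667788"
const nilString = toUUIDSketch([0, 0, 0, 0]);
// nilString === "00000000-0000-0000-0000-000000000000"
```

The string building (four hex conversions, padding, slicing, joining) hints at why producing the 36-character form costs far more than a fixed 8-code-unit key.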

### `Guid.toUnserializableKey`

Returns an 8-character JS string that encodes all 128 bits by splitting each
`u32` into two UTF-16 code units via `String.fromCharCode`. This is the
**minimum-length** JS string for 128 bits (8 × 16 bits) and the fastest Map key
to produce.

**Use only as a transient in-process Map/Set key.** Some code units may be
lone surrogates (0xD800–0xDFFF), which are valid in WTF-16 JS strings but are
not safe to serialize: `TextEncoder`, for example, replaces each lone surrogate
with U+FFFD, and other paths (JSON, postMessage) are not guaranteed to preserve
them across systems. Do not store, transmit, or serialize the result.

```ts
const key = Guid.toUnserializableKey(g); // fast: ~84–92 ns vs ~950 ns for toUUID
const map = new Map<string, SomeValue>();
map.set(key, value);
map.get(key);
```

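The encoding described above can be sketched in a few lines. This is an assumption-laden illustration, not the library's code: `toKeySketch` is a hypothetical name, and the order of the high and low halves within each word is a guess. The second half shows why the result must stay in-process.

```ts
type Guid = readonly [number, number, number, number];

// Assumption: high 16 bits of each u32 first, then the low 16 bits.
const toKeySketch = (g: Guid): string =>
    String.fromCharCode(
        g[0] >>> 16, g[0] & 0xffff,
        g[1] >>> 16, g[1] & 0xffff,
        g[2] >>> 16, g[2] & 0xffff,
        g[3] >>> 16, g[3] & 0xffff,
    );

// The low half of word 0 is 0xD800, which becomes a lone high surrogate.
const key = toKeySketch([0x0041d800, 0x00000000, 0x80000000, 0xffffffff]);
const keyLength = key.length; // 8 code units = 128 bits

// In-process use is fine: Map compares strings by code unit.
const lookup = new Map<string, number>([[key, 1]]);
const found = lookup.get(key);

// UTF-8 encoding cannot represent a lone surrogate, so it is replaced by U+FFFD.
const roundTripped = new TextDecoder().decode(new TextEncoder().encode(key));
const survived = roundTripped === key; // false: the key was altered
```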

---

## Performance

Tests run at N = 1,000,000 (storage) and N = 100,000 (Map keys) in both the
Node and browser vitest projects. Numbers are from Node; browser results were
within 10–20% in most cases. Source: `guid.performance.test.ts`.

### Storage: write (N = 1,000,000)

| Strategy | ns/op | Memory |
|---|---|---|
| **StructTypedBuffer** (4×u32, current) | **~8–10** | **15.3 MB** |
| `BigUint64Array` packed (2×u64, identical footprint) | ~185–250 | 15.3 MB |
| `Array<bigint>` heap (1×128-bit BigInt per slot) | ~270–345 | ~30.5 MB est. |

### Storage: read (N = 1,000,000)

| Strategy | ns/op |
|---|---|
| **StructTypedBuffer** | **~6–7** |
| `BigUint64Array` packed | ~150–205 |
| `Array<bigint>` heap | ~36–37 |

StructTypedBuffer is **20–30× faster** than `BigUint64Array` for both read and
write, despite the identical raw byte footprint of the two typed-array
approaches, and roughly 30–40× faster than `Array<bigint>` on write. The gap
narrows to about 5–6× for `Array<bigint>` reads. The cost is the
`u32 ↔ BigInt` conversion required on access: JavaScript's BigInt arithmetic is
expensive relative to direct typed-array element reads.

The `Array<bigint>` read is faster than `BigUint64Array` read because the
128-bit value is pre-boxed (no re-packing step), but it doubles the memory
footprint and its write is the slowest due to heap allocation and seven BigInt
operations per entry.

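To show where the conversion cost comes from, here is a minimal sketch of packing a tuple into a single 128-bit BigInt and unpacking it again. The helper names are hypothetical and this is not the benchmark's code, which is not shown here; the point is only that every access involves several BigInt conversions, shifts, and masks instead of four plain element reads.

```ts
type Guid = readonly [number, number, number, number];

const B32 = BigInt(32);
const B64 = BigInt(64);
const B96 = BigInt(96);
const MASK32 = BigInt(0xffffffff);

// Pack: word 0 is treated as the most significant 32 bits (an assumption).
const toBigIntSketch = (g: Guid): bigint =>
    (BigInt(g[0]) << B96) | (BigInt(g[1]) << B64) | (BigInt(g[2]) << B32) | BigInt(g[3]);

// Unpack: shift, mask, and convert each word back to a number.
const fromBigIntSketch = (v: bigint): Guid => [
    Number((v >> B96) & MASK32),
    Number((v >> B64) & MASK32),
    Number((v >> B32) & MASK32),
    Number(v & MASK32),
];

const packed = toBigIntSketch([0x12345678, 0x9abcdef0, 0x11223344, 0x55667788]);
const packedHex = packed.toString(16); // "123456789abcdef01122334455667788"
const unpacked = fromBigIntSketch(packed);
```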

### Map key comparison (N = 100,000)

Key encoding is measured separately from the Map operation so the two costs
can be evaluated independently.

#### Set

| Key type | Map set | Encode cost | Est. total memory |
|---|---|---|---|
| 36-char UUID string | ~215–221 ns/op | ~950–1005 ns/op | ~10.7 MB |
| 128-bit BigInt | ~94–100 ns/op | ~270–370 ns/op | ~8.4 MB |
| **8-char min UTF-16 (`toUnserializableKey`)** | **~93–110 ns/op** | **~84–92 ns/op** | **~8.4 MB** |

#### Get

| Key type | Map get |
|---|---|
| 36-char UUID string | ~72–92 ns/op |
| 128-bit BigInt | ~94–97 ns/op |
| **8-char min UTF-16** | **~58–75 ns/op** |

Memory estimates (V8, 64-bit, pointer compression off):
- `SeqOneByteString` (UUID): ~64 bytes/key + ~48 bytes/entry = ~10.7 MB at N=100K
- `BigInt` (128-bit, 2 digits): ~40 bytes/key + ~48 bytes/entry = ~8.4 MB at N=100K
- `SeqTwoByteString` (min-string): ~40 bytes/key + ~48 bytes/entry = ~8.4 MB at N=100K

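The memory figures follow from simple arithmetic, reproduced below as a sketch. It assumes "MB" in the tables means 2^20 bytes, which is the reading under which the per-key and per-entry estimates match the quoted totals; the per-key sizes themselves are taken from the list above, not measured here.

```ts
const MIB = 1024 * 1024;

// (bytes per key + bytes per Map entry) × number of keys, in MiB.
const estimateMiB = (bytesPerKey: number, bytesPerEntry: number, n: number): number =>
    ((bytesPerKey + bytesPerEntry) * n) / MIB;

const uuidStringMiB = estimateMiB(64, 48, 100000); // ≈ 10.68
const compactKeyMiB = estimateMiB(40, 48, 100000); // ≈ 8.39 (BigInt and 8-char string)

// Dense storage: 16 bytes per Guid at N = 1,000,000.
const structColumnMiB = (16 * 1000000) / MIB; // ≈ 15.26
```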

### Conclusions

**For dense ECS component storage**, use `StructTypedBuffer` via the schema
(`createArchetype({ ..., guid: Guid.schema })`). It is 20–30× faster than
packed `BigUint64Array` storage with the same 16-byte linear memory footprint,
and at least 5× faster than `Array<bigint>` at half the memory.

**For Map/Set lookups keyed on GUIDs**, use `Guid.toUnserializableKey`. It
matches BigInt on set speed, beats it on get, uses the same memory, and is
**~10–11× faster to encode** than `Guid.toUUID`. The UUID string is the slowest
option to encode and to set and uses the most memory; only its get
(~72–92 ns/op) is competitive with BigInt.

**Only use `Guid.toUUID` / `Guid.fromUUID` when human readability or
cross-system interop is required** (logging, APIs, serialization). On a hot
lookup path, the 36-char UUID string costs ~1 µs per key just to produce.
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
// © 2026 Adobe. MIT License. See /LICENSE for details.

import { describe, it, expect } from "vitest";
import { Guid } from "./index.js";

describe("Guid.create", () => {
    it("returns a 4-element tuple of numbers", () => {
        const g = Guid.create();
        expect(g).toHaveLength(4);
        expect(typeof g[0]).toBe("number");
        expect(typeof g[1]).toBe("number");
        expect(typeof g[2]).toBe("number");
        expect(typeof g[3]).toBe("number");
    });

    it("all elements are in u32 range [0, 4294967295]", () => {
        const g = Guid.create();
        for (const n of g) {
            expect(n).toBeGreaterThanOrEqual(0);
            expect(n).toBeLessThanOrEqual(0xFFFFFFFF);
        }
    });

    it("sets RFC 4122 v4 version nibble (bits 48-51 = 0100)", () => {
        const g = Guid.create();
        // Version nibble is bits [15:12] of g[1]
        const versionNibble = (g[1] >>> 12) & 0xF;
        expect(versionNibble).toBe(4);
    });

    it("sets RFC 4122 variant (top 2 bits of g[2] = 10)", () => {
        const g = Guid.create();
        const variantBits = (g[2] >>> 30) & 0x3;
        expect(variantBits).toBe(0b10);
    });

    it("generates unique values", () => {
        const seen = new Set<string>();
        for (let i = 0; i < 100; i++) {
            seen.add(Guid.toUUID(Guid.create()));
        }
        expect(seen.size).toBe(100);
    });
});
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
// © 2026 Adobe. MIT License. See /LICENSE for details.

import type { Guid } from "./index.js";

// RFC 4122 v4: version nibble (bits 48-51) = 4, variant (bits 64-65) = 10
export const create = (): Guid => {
    const arr = new Uint32Array(4);
    crypto.getRandomValues(arr);
    // Clear bits [15:12] of word 1, then set them to 0100 (version 4).
    arr[1] = (arr[1] & 0xFFFF0FFF) | 0x00004000;
    // Clear the top two bits of word 2, then set them to 10 (RFC 4122 variant).
    arr[2] = (arr[2] & 0x3FFFFFFF) | 0x80000000;
    return [arr[0], arr[1], arr[2], arr[3]];
};
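
One detail worth illustrating: JavaScript bitwise operators return signed 32-bit integers, so the variant mask on its own yields a negative number. In `create` the result is written back into a `Uint32Array`, which coerces it to an unsigned value. The sketch below applies the same mask to a plain number to show the difference; the variable names are illustrative only.

```ts
// Same variant mask as in create, applied to a plain number.
const word2 = 0x12345678;
const masked = (word2 & 0x3fffffff) | 0x80000000; // int32 result: negative

// Writing into a Uint32Array converts the value to its unsigned form.
const viaTypedArray = new Uint32Array([masked])[0]; // 0x92345678

// Without a typed array, an unsigned right shift by zero does the same.
const viaShift = masked >>> 0; // 0x92345678

// The top two bits are 10, as the RFC 4122 variant requires.
const variantBits = (viaShift >>> 30) & 0x3; // 2
```

If the masking were ever moved off the typed array, the u32-range test in this commit would catch the negative value.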
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
// © 2026 Adobe. MIT License. See /LICENSE for details.

import { describe, it, expect } from "vitest";
import { Guid } from "./index.js";

describe("Guid.equals", () => {
    it("returns true for identical Guids", () => {
        const g: Guid = [1, 2, 3, 4];
        expect(Guid.equals(g, g)).toBe(true);
    });

    it("returns true for two nil Guids", () => {
        expect(Guid.equals(Guid.nil, [0, 0, 0, 0])).toBe(true);
    });

    it("returns false when element 0 differs", () => {
        expect(Guid.equals([1, 2, 3, 4], [9, 2, 3, 4])).toBe(false);
    });

    it("returns false when element 1 differs", () => {
        expect(Guid.equals([1, 2, 3, 4], [1, 9, 3, 4])).toBe(false);
    });

    it("returns false when element 2 differs", () => {
        expect(Guid.equals([1, 2, 3, 4], [1, 2, 9, 4])).toBe(false);
    });

    it("returns false when element 3 differs", () => {
        expect(Guid.equals([1, 2, 3, 4], [1, 2, 3, 9])).toBe(false);
    });

    it("returns true for two created Guids that are copies", () => {
        const g = Guid.create();
        const copy: Guid = [g[0], g[1], g[2], g[3]];
        expect(Guid.equals(g, copy)).toBe(true);
    });
});
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
// © 2026 Adobe. MIT License. See /LICENSE for details.

import type { Guid } from "./index.js";

export const equals = (a: Guid, b: Guid): boolean =>
    a[0] === b[0] && a[1] === b[1] && a[2] === b[2] && a[3] === b[3];
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
// © 2026 Adobe. MIT License. See /LICENSE for details.

import { describe, it, expect } from "vitest";
import { Guid } from "./index.js";

describe("Guid.fromUUID", () => {
    it("parses a canonical lowercase UUID string", () => {
        const g = Guid.fromUUID("12345678-9abc-def0-1122-334455667788");
        expect(g).toEqual([0x12345678, 0x9abcdef0, 0x11223344, 0x55667788]);
    });

    it("parses uppercase hex", () => {
        const g = Guid.fromUUID("12345678-9ABC-DEF0-1122-334455667788");
        expect(g).toEqual([0x12345678, 0x9abcdef0, 0x11223344, 0x55667788]);
    });

    it("round-trips with toUUID", () => {
        const g: Guid = [0xdeadbeef, 0xcafebabe, 0x80004000, 0x01234567];
        expect(Guid.equals(Guid.fromUUID(Guid.toUUID(g)), g)).toBe(true);
    });

    it("round-trips a created Guid through toUUID/fromUUID", () => {
        const g = Guid.create();
        expect(Guid.equals(Guid.fromUUID(Guid.toUUID(g)), g)).toBe(true);
    });

    it("round-trips with crypto.randomUUID() strings", () => {
        const uuidStr = crypto.randomUUID();
        const g = Guid.fromUUID(uuidStr);
        expect(Guid.toUUID(g)).toBe(uuidStr.toLowerCase());
    });

    it("throws TypeError for wrong length", () => {
        expect(() => Guid.fromUUID("12345678-9abc-def0-1122-33445566778")).toThrow(TypeError);
    });

    it("throws TypeError for missing dashes", () => {
        expect(() => Guid.fromUUID("123456789abcdef011223344556677881")).toThrow(TypeError);
    });

    it("throws TypeError for non-hex characters", () => {
        expect(() => Guid.fromUUID("12345678-9xyz-def0-1122-334455667788")).toThrow(TypeError);
    });

    it("throws TypeError for empty string", () => {
        expect(() => Guid.fromUUID("")).toThrow(TypeError);
    });
});
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
// © 2026 Adobe. MIT License. See /LICENSE for details.

import type { Guid } from "./index.js";

// Canonical 8-4-4-4-12 hex form, case-insensitive.
const PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

export const fromUUID = (s: string): Guid => {
    if (!PATTERN.test(s)) {
        throw new TypeError(`Invalid GUID string: "${s}"`);
    }
    // Drop the dashes, then read the 32 hex digits as four 8-digit u32 words.
    const h = s.replace(/-/g, "");
    return [
        parseInt(h.slice(0, 8), 16),
        parseInt(h.slice(8, 16), 16),
        parseInt(h.slice(16, 24), 16),
        parseInt(h.slice(24, 32), 16),
    ];
};
