Commit 7c5ef23

fix migration doc and add xdr architecture doc
1 parent 4a4dff1 commit 7c5ef23

2 files changed

Lines changed: 372 additions & 10 deletions

docs/ARCHITECTURE.md

Lines changed: 352 additions & 0 deletions
@@ -0,0 +1,352 @@
# XDR Architecture

This document describes the internal architecture of the in-tree XDR layer
under `src/xdr/`. It's intended for SDK contributors: people fixing bugs,
adding types, or extending the runtime. For consumer-facing migration
guidance, see [`XDR_MIGRATION.md`](./XDR_MIGRATION.md).

The XDR layer replaces what used to be a thin wrapper around
`@stellar/js-xdr`. It's a four-layer design (`core` → `types` → `values` →
`generated`), with `dx` overlays and a public `index.ts` on top, driven by
codegen against a canonical schema file (`xdr/xdr.json`).

---
## Layered architecture

```mermaid
flowchart TB
    XJSON[("xdr/xdr.json<br/><i>canonical schema source</i>")]
    CODEGEN["tools/xdrgen/generate.mjs<br/><i>codegen script</i>"]

    subgraph runtime["src/xdr/ (runtime)"]
        CORE["core/<br/><b>Reader · Writer</b><br/><b>BaseType&lt;T&gt;</b> · XdrError"]
        TYPES["types/<br/><b>schema primitives</b><br/>struct · union · enum<br/>opaque · varOpaque · string<br/>int32 · int64 · uint64<br/>array · fixedArray · option · lazy · void · bool"]
        VALUES["values/<br/><b>consumer value classes</b><br/>XdrValue (base)<br/>BytesValue · EnumValue · BigIntValue<br/>XdrString · to-json walker"]
        GEN["generated/<br/><b>447 generated classes</b><br/>Asset · ScVal · Memo<br/>TransactionEnvelope · ..."]
        DX["dx/<br/><b>DX overlays</b><br/>Int128 · Uint128<br/>Int256 · Uint256<br/>(bigint wrappers)"]
        INDEX["<b>index.ts</b><br/><i>public surface</i>"]
    end

    XJSON --> CODEGEN
    CODEGEN -.->|emits 447 files| GEN

    CORE --> TYPES
    CORE --> VALUES
    TYPES --> VALUES
    TYPES --> GEN
    VALUES --> GEN
    GEN --> DX

    GEN --> INDEX
    DX --> INDEX
    VALUES --> INDEX

    style CORE fill:#f6f8fa,stroke:#bbb
    style TYPES fill:#dff,stroke:#39c
    style VALUES fill:#fde,stroke:#c39
    style GEN fill:#efe,stroke:#3a3
    style DX fill:#ffd,stroke:#cc3
    style INDEX fill:#fff,stroke:#000,stroke-width:2px
```
The dependency edges encode the design constraints:

- **`core/`** is the bottom layer, with no dependencies on anything else in
  `xdr/`. It holds just the buffer reader/writer, the abstract `BaseType<T>`
  schema interface, and the error class. It could be lifted into a
  standalone XDR runtime as-is.
- **`types/`** depends only on `core/`. Pure schema primitives,
  charset-agnostic, with no knowledge of consumer-facing value classes. Also
  liftable into a standalone runtime; see the "Why this layering" section
  below.
- **`values/`** depends on `types/` *and* `core/`. Adds consumer ergonomics:
  the `XdrValue` base class with `toXdr`/`fromXdr`/`toJson`/`fromJson`, the
  shared subclass bases (`BytesValue`, `EnumValue`, `BigIntValue`,
  `XdrString`), and the SEP-0051 JSON walker.
- **`generated/`** is codegen output. Each file references its schema
  primitives from `types/` and extends the right value-class base from
  `values/`. 447 files, one per named XDR type.
- **`dx/`** holds hand-written ergonomic overlays on top of `generated/`.
  For example, `Int128` exposes a single `bigint` instead of the generated
  `Int128Parts` struct's `{hi, lo}` split.
- **`index.ts`** is the only public entry. It re-exports the generated
  barrel, the value-class bases, the DX overlays, the primitive shims, and
  the schema-builder primitives.
### Why this layering

The split between `types/` and `values/` is the central design decision.
**`types/` is a generic XDR schema runtime that knows nothing about
Stellar-specific value types, JSON, or class semantics.** It would compile
and work in a Bitcoin SDK, an NFS implementation, or any other RFC 4506
consumer.

`values/` is where Stellar's consumer ergonomics live: the `XdrValue` base
class consumers extend, the SEP-0051 JSON walker, the StrKey-aware
overrides, and the `BytesValue` / `XdrString` wrappers that make
byte-string-shaped fields ergonomic. If we ever spin the XDR runtime out
into a separate package, `types/` + `core/` go together; `values/` stays
here because it's where SDK-specific behavior lives.

A practical consequence: when adding a new schema primitive, ask whether
the new primitive is generic XDR (struct, union, primitive int variant,
container) or Stellar-flavored ergonomics (StrKey override, JSON encoding
rule). The former goes in `types/`; the latter in `values/`. The codegen
is set up to dispatch on the schema's `kind` (a `types/`-level concept),
not on its `name` (which can be Stellar-specific).
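To make the `core/` and `types/` boundary concrete, here is a minimal sketch of what a generic schema runtime of this shape could look like. Every class and method body below is a simplified, hypothetical stand-in (the real `Reader`, `Writer`, `BaseType<T>`, and `struct` in `src/xdr/` may differ in signature and behavior); the point is only that nothing in it refers to Stellar, JSON, or value classes.

```typescript
// core/ stand-ins (hypothetical): big-endian byte I/O plus the abstract schema interface.
class Writer {
  private readonly bytes: number[] = [];
  writeInt32(value: number): void {
    const view = new DataView(new ArrayBuffer(4));
    view.setInt32(0, value, false); // XDR integers are big-endian
    for (let i = 0; i < 4; i++) this.bytes.push(view.getUint8(i));
  }
  finish(): Uint8Array {
    return Uint8Array.from(this.bytes);
  }
}

class Reader {
  private offset = 0;
  private readonly view: DataView;
  constructor(bytes: Uint8Array) {
    this.view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  }
  readInt32(): number {
    const value = this.view.getInt32(this.offset, false);
    this.offset += 4;
    return value;
  }
}

abstract class BaseType<T> {
  abstract readonly kind: string;
  abstract _read(reader: Reader): T;
  abstract _write(value: T, writer: Writer): void;
}

// types/ stand-ins: generic primitives that only know about core/.
class Int32Type extends BaseType<number> {
  readonly kind = "int32";
  _read(reader: Reader): number {
    return reader.readInt32();
  }
  _write(value: number, writer: Writer): void {
    writer.writeInt32(value);
  }
}
const int32 = (): BaseType<number> => new Int32Type();

type Wire = Record<string, unknown>;

class StructType extends BaseType<Wire> {
  readonly kind = "struct";
  readonly name: string;
  readonly entries: Record<string, BaseType<unknown>>;
  constructor(name: string, entries: Record<string, BaseType<unknown>>) {
    super();
    this.name = name;
    this.entries = entries;
  }
  // Fields are read and written in declaration order, delegating to each field's schema.
  _read(reader: Reader): Wire {
    const wire: Wire = {};
    for (const [field, schema] of Object.entries(this.entries)) wire[field] = schema._read(reader);
    return wire;
  }
  _write(value: Wire, writer: Writer): void {
    for (const [field, schema] of Object.entries(this.entries)) schema._write(value[field], writer);
  }
}
const struct = (name: string, entries: Record<string, BaseType<unknown>>): StructType =>
  new StructType(name, entries);

// Usage: a two-field struct round-trips through the wire shape.
const point = struct("Point", { x: int32(), y: int32() });
const writer = new Writer();
point._write({ x: 1, y: -2 }, writer);
const encoded = writer.finish();
const decoded = point._read(new Reader(encoded));
```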

---

## Anatomy of a single generated class

What you actually see when you read a generated file, using `AlphaNum4` as
the example:

```mermaid
flowchart LR
    CLASS["<b>class AlphaNum4 extends XdrValue</b><br/>━━━━━━━━━━━━━━━━━━━━━━<br/>readonly assetCode: AssetCode4<br/>readonly issuer: PublicKey<br/>━━━━━━━━━━━━━━━━━━━━━━<br/>static schema = struct('AlphaNum4', {…})<br/>━━━━━━━━━━━━━━━━━━━━━━<br/>toXdrObject() → wire<br/>static fromXdrObject(wire) → instance"]

    XDRVALUE["<b>XdrValue</b><br/><i>values/xdr-value.ts</i><br/>━━━━━━━━━━━━━━━━━━<br/>toXdr / fromXdr<br/>toJson / fromJson<br/>equals / toString"]

    STRUCT["<b>struct() / StructType</b><br/><i>types/struct.ts</i><br/>━━━━━━━━━━━━━━━━━━<br/>_read / _write iterate entries<br/>delegate to each field's schema"]

    AC4["<b>AssetCode4</b><br/><i>generated/asset-code4.ts</i><br/>extends BytesValue<br/>schema = opaque(4, 'AssetCode4')"]

    PK["<b>PublicKey</b><br/><i>generated/public-key.ts</i><br/>discriminated union<br/>schema = union('PublicKey', …)"]

    WALKER["<b>JSON walker</b><br/><i>values/to-json.ts</i><br/>dispatch by schema.kind<br/>+ schema.name overrides<br/>(StrKey, wide-int parts, …)"]

    CLASS -->|extends| XDRVALUE
    CLASS -->|schema instance| STRUCT
    XDRVALUE -->|toJson delegates to| WALKER
    STRUCT -->|field| AC4
    STRUCT -->|field| PK
    WALKER -->|reads schema.entries| STRUCT

    style CLASS fill:#efe,stroke:#3a3,stroke-width:2px
    style XDRVALUE fill:#fde,stroke:#c39
    style STRUCT fill:#dff,stroke:#39c
    style AC4 fill:#efe,stroke:#3a3
    style PK fill:#efe,stroke:#3a3
    style WALKER fill:#fde,stroke:#c39
```
Every generated class has two things:

1. **A `static readonly schema`**, built from `types/` primitives. This is
   the source of truth for wire layout. `_read` and `_write` on the
   schema do the actual byte I/O; everything else delegates to it.
2. **An instance shape that matches the schema**: fields named after XDR
   struct/union/enum members, all `readonly`, with a `toXdrObject()` that
   returns the wire shape and a `static fromXdrObject(wire)` going the
   other way.

Inherited methods (`toXdr`, `fromXdr`, `toJson`, `fromJson`, `equals`,
`toString`) all work automatically: the wire round-trip uses the schema,
and the JSON round-trip uses the walker. Both can inspect the schema
generically because the schema graph is fully introspectable.
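The two-part shape described above can be sketched as follows. This is an illustrative reduction, not the generated code: the `Schema` interface, the hand-rolled `priceSchema`, and the `Price` class are hypothetical stand-ins (a real generated class builds its schema from `types/` primitives and inherits `fromXdr` from `XdrValue` rather than redefining it).

```typescript
// Hypothetical minimal schema contract: wire shape <-> bytes.
interface Schema<W> {
  write(wire: W): Uint8Array;
  read(bytes: Uint8Array): W;
}

type PriceWire = { n: number; d: number };

// Hand-rolled stand-in for `struct("Price", { n: int32(), d: int32() })`.
const priceSchema: Schema<PriceWire> = {
  write({ n, d }) {
    const view = new DataView(new ArrayBuffer(8));
    view.setInt32(0, n, false); // big-endian, per XDR
    view.setInt32(4, d, false);
    return new Uint8Array(view.buffer);
  },
  read(bytes) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    return { n: view.getInt32(0, false), d: view.getInt32(4, false) };
  },
};

// Stand-in for the shared base: toXdr() composes toXdrObject() with the
// subclass's static schema, so subclasses never touch bytes directly.
abstract class XdrValue<W> {
  abstract toXdrObject(): W;
  toXdr(): Uint8Array {
    const { schema } = this.constructor as unknown as { schema: Schema<W> };
    return schema.write(this.toXdrObject());
  }
}

class Price extends XdrValue<PriceWire> {
  // 1. The static schema: the source of truth for wire layout.
  static readonly schema = priceSchema;

  // 2. A readonly instance shape that matches the schema.
  readonly n: number;
  readonly d: number;
  constructor(n: number, d: number) {
    super();
    this.n = n;
    this.d = d;
  }

  toXdrObject(): PriceWire {
    return { n: this.n, d: this.d };
  }
  static fromXdrObject(wire: PriceWire): Price {
    return new Price(wire.n, wire.d);
  }
  // Simplification: written out here so the sketch stays self-contained.
  static fromXdr(bytes: Uint8Array): Price {
    return Price.fromXdrObject(Price.schema.read(bytes));
  }
}

const priceBytes = new Price(1, 3).toXdr();
const priceAgain = Price.fromXdr(priceBytes);
```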

---

## Data flow

### Encoding (instance → bytes)

```mermaid
flowchart LR
    INST["instance<br/>(generated class)"]
    WIRE["wire shape<br/>(plain object)"]
    BYTES["Uint8Array"]

    INST -->|toXdrObject| WIRE
    WIRE -->|schema._write| BYTES
    INST ==>|toXdr<br/>shortcut| BYTES
```
`toXdrObject()` translates the typed instance into the "wire shape" (a
plain JS object whose fields match the XDR struct/union layout). The
schema's `_write` then serializes that object into a `Uint8Array` via the
`Writer`. The `toXdr()` shortcut composes both.

### Decoding (bytes → instance)

```mermaid
flowchart LR
    BYTES["Uint8Array"]
    WIRE["wire shape"]
    INST["instance"]

    BYTES -->|schema._read| WIRE
    WIRE -->|fromXdrObject| INST
    BYTES ==>|fromXdr<br/>shortcut| INST
```
The schema's `_read` parses bytes off a `Reader` into the wire shape;
`fromXdrObject(wire)` rebuilds the typed instance.

### JSON round-trip

```mermaid
flowchart LR
    INST["instance"]
    WIRE["wire shape"]
    JSON["JsonValue"]

    INST -->|toXdrObject| WIRE
    WIRE -->|walker dispatches<br/>on schema.kind| JSON
    JSON -->|walker dispatches<br/>+ overrides| WIRE
    WIRE -->|fromXdrObject| INST

    INST ==>|toJson<br/>shortcut| JSON
    JSON ==>|fromJson<br/>shortcut| INST
```
JSON serialization goes through the wire shape, not the instance; that's
how the walker stays schema-driven and class-agnostic. The walker
dispatches on `schema.kind` (struct / union / enum / opaque / …) and
checks an override map keyed on `schema.name` for type-specific
serializers (StrKey forms, wide-int decimal-string collapse, asset-code
trim rules).
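The dispatch described above can be sketched in a few lines. This is a simplified illustration covering only the to-JSON direction: the `SchemaNode` type, the three kinds handled, the hex fallback for opaque bytes, and the `AssetCode4` trimming rule are all assumptions made for the example, not the actual walker in `values/to-json.ts` or the exact SEP-0051 rules.

```typescript
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };

// Hypothetical, pared-down view of an introspectable schema graph.
type SchemaNode =
  | { kind: "int32"; name?: string }
  | { kind: "opaque"; name?: string }
  | { kind: "struct"; name?: string; entries: Record<string, SchemaNode> };

// Overrides are keyed on schema.name and win over the generic kind dispatch.
const OVERRIDES = new Map<string, (wire: unknown) => JsonValue>();

function walkToJson(schema: SchemaNode, wire: unknown): JsonValue {
  const override = schema.name !== undefined ? OVERRIDES.get(schema.name) : undefined;
  if (override) return override(wire);

  switch (schema.kind) {
    case "int32":
      return wire as number;
    case "opaque": // assumed generic fallback: lowercase hex
      return Array.from(wire as Uint8Array, (b) => b.toString(16).padStart(2, "0")).join("");
    case "struct": {
      const fields = wire as Record<string, unknown>;
      const out: { [key: string]: JsonValue } = {};
      for (const [field, fieldSchema] of Object.entries(schema.entries)) {
        out[field] = walkToJson(fieldSchema, fields[field]);
      }
      return out;
    }
  }
}

// Illustrative override: render a 4-byte code as text with trailing zero bytes trimmed.
OVERRIDES.set("AssetCode4", (wire) => {
  const bytes = wire as Uint8Array;
  let end = bytes.length;
  while (end > 0 && bytes[end - 1] === 0) end--;
  return Array.from(bytes.subarray(0, end), (b) => String.fromCharCode(b)).join("");
});

const sampleSchema: SchemaNode = {
  kind: "struct",
  name: "Sample",
  entries: {
    assetCode: { kind: "opaque", name: "AssetCode4" },
    amount: { kind: "int32" },
  },
};
const sampleJson = walkToJson(sampleSchema, {
  assetCode: Uint8Array.from([0x55, 0x53, 0x44, 0x00]), // "USD" plus one padding byte
  amount: 5,
});
```

Because the walker only sees the schema node and the wire shape, the same function serves every generated class; the class itself never appears in the dispatch.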
### Where the wire layout is decided

The wire layout is determined entirely by the `static schema` field on the
generated class. The schema is built from primitives in `types/`, each of
which knows exactly how to read and write its wire bytes per RFC 4506. The
codegen mechanically translates XDR source into the appropriate primitive
calls:

| XDR source                       | Schema expression                                |
|----------------------------------|--------------------------------------------------|
| `int`                            | `int32()`                                        |
| `unsigned int`                   | `uint32()`                                       |
| `hyper`                          | `int64()`                                        |
| `unsigned hyper`                 | `uint64()`                                       |
| `bool`                           | `bool()`                                         |
| `opaque foo[N]`                  | `opaque(N, "Foo")`                               |
| `opaque foo<N>`                  | `varOpaque(N, "Foo")`                            |
| `string foo<N>`                  | `xdrString(N)`                                   |
| `T foo<N>` (variable array)      | `array(T.schema, N)`                             |
| `T foo[N]` (fixed array)         | `fixedArray(T.schema, N)`                        |
| `T* foo` / `T foo?`              | `option(T.schema)`                               |
| `enum E { A=0, B=1 }`            | `enumType("E", { A: 0, B: 1 })`                  |
| `struct S { T1 a; T2 b; }`       | `struct("S", { a: T1.schema, b: T2.schema })`    |
| `union U switch (D d) { … }`     | `union("U", { switchOn: D.schema, cases: […] })` |
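As a worked example of what one of these primitives has to do on the wire, here is a sketch of the RFC 4506 layout for variable-length opaque data (`opaque foo<N>`): a 4-byte big-endian length, the bytes themselves, then zero padding up to a multiple of four. The function names are hypothetical and stand apart from the real `varOpaque` in `types/`, which works against a `Reader`/`Writer` rather than whole buffers.

```typescript
// Encode variable-length opaque data: uint32 length + data + zero padding to 4 bytes.
function encodeVarOpaque(data: Uint8Array, maxLength: number): Uint8Array {
  if (data.length > maxLength) {
    throw new RangeError(`opaque length ${data.length} exceeds maximum ${maxLength}`);
  }
  const paddedLength = (data.length + 3) & ~3; // round up to a multiple of 4
  const out = new Uint8Array(4 + paddedLength); // zero-filled, so padding is already 0
  new DataView(out.buffer).setUint32(0, data.length, false); // big-endian length prefix
  out.set(data, 4);
  return out;
}

// Decode the same layout, returning only the payload (padding is dropped).
function decodeVarOpaque(bytes: Uint8Array, maxLength: number): Uint8Array {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const length = view.getUint32(0, false);
  if (length > maxLength) {
    throw new RangeError(`opaque length ${length} exceeds maximum ${maxLength}`);
  }
  const paddedLength = (length + 3) & ~3;
  if (bytes.length < 4 + paddedLength) {
    throw new RangeError("truncated opaque value");
  }
  return bytes.slice(4, 4 + length);
}

// Five payload bytes occupy 4 (length) + 5 (data) + 3 (padding) = 12 wire bytes.
const opaqueWire = encodeVarOpaque(Uint8Array.from([1, 2, 3, 4, 5]), 8);
const opaquePayload = decodeVarOpaque(opaqueWire, 8);
```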

---

## Where to extend

When you need to change behavior, the rule of thumb is to push the change
as far down the stack as possible so the most types benefit:

| You want to…                                                      | Edit here                                                                                     |
|-------------------------------------------------------------------|-----------------------------------------------------------------------------------------------|
| Add a new XDR type to the schema                                  | `xdr/xdr.json` (then `pnpm run xdrgen`)                                                       |
| Change codegen output (TS type expr, generated factory shape, …)  | `tools/xdrgen/generate.mjs`                                                                   |
| Change wire serialization of an existing primitive kind           | `src/xdr/types/<kind>.ts`                                                                     |
| Add a new schema primitive (e.g. a new int variant)               | new file in `src/xdr/types/`, then wire into codegen `schemaExpr`                             |
| Change JSON encoding of an existing kind                          | the `walkToJson` / `walkFromJson` switch in `src/xdr/values/to-json.ts`                       |
| Add a Stellar-specific JSON override (e.g. new strkey)            | the `OVERRIDES` map in `src/xdr/values/to-json.ts`                                            |
| Add a consumer-side ergonomic helper for a class                  | hand-written file in `src/xdr/dx/`                                                            |
| Hand-edit one specific generated class                            | **Don't.** Codegen output is overwritten on regen; change `tools/xdrgen/generate.mjs` instead |

A few cross-cutting changes worth knowing how to do:
### Adding a new schema primitive

1. Create `src/xdr/types/<kind>.ts` with a class extending `BaseType<T>`:
   - Implement `_read(reader, path): T` and `_write(value, writer, path): void`.
   - Set `readonly kind = "<kind>"` (the discriminant the walker uses).
   - Export both the class (for type lookups) and a builder function
     `<kind>(...): XdrType<T>`.
2. Wire it into codegen: add a `case "<kind>":` to `schemaExpr` and
   `tsTypeExpr` (and `wireTypeExpr` if the wire shape differs from the
   TS-side shape).
3. If the walker should support this kind, add a `case "<kind>":` to both
   `walkToJson` and `walkFromJson` in `src/xdr/values/to-json.ts`.
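Step 1 might look roughly like the sketch below, which adds a single-precision `float` primitive (RFC 4506 defines it as a 4-byte IEEE 754 value in big-endian order). The `Reader`, `Writer`, and `BaseType<T>` here are hypothetical stand-ins so the example is self-contained, and the `path` parameter is assumed to be a plain string used for error messages; check the real signatures in `core/` before copying anything. In the actual file both the class and the builder would be exported.

```typescript
// Hypothetical stand-ins for core/; the real classes will differ.
class Writer {
  readonly bytes: number[] = [];
  write(chunk: Uint8Array): void {
    chunk.forEach((b) => this.bytes.push(b));
  }
}

class Reader {
  private pos = 0;
  private readonly buf: Uint8Array;
  constructor(buf: Uint8Array) {
    this.buf = buf;
  }
  read(n: number): Uint8Array {
    if (this.pos + n > this.buf.length) throw new RangeError("unexpected end of input");
    const chunk = this.buf.subarray(this.pos, this.pos + n);
    this.pos += n;
    return chunk;
  }
}

abstract class BaseType<T> {
  abstract readonly kind: string;
  abstract _read(reader: Reader, path: string): T;
  abstract _write(value: T, writer: Writer, path: string): void;
}

// The new primitive: 4 bytes, IEEE 754 single precision, big-endian.
class FloatType extends BaseType<number> {
  readonly kind = "float"; // the discriminant the walker would switch on

  _read(reader: Reader, _path: string): number {
    const chunk = reader.read(4);
    return new DataView(chunk.buffer, chunk.byteOffset, 4).getFloat32(0, false);
  }

  _write(value: number, writer: Writer, path: string): void {
    if (typeof value !== "number") throw new TypeError(`${path}: expected a number`);
    const chunk = new Uint8Array(4);
    new DataView(chunk.buffer).setFloat32(0, value, false);
    writer.write(chunk);
  }
}

// Builder function, mirroring the `<kind>(...)` convention described above.
function float(): BaseType<number> {
  return new FloatType();
}

const floatWriter = new Writer();
float()._write(1.5, floatWriter, "example.value");
const floatBack = float()._read(new Reader(Uint8Array.from(floatWriter.bytes)), "example.value");
```

Steps 2 and 3 (the codegen `case` and the walker `case`) are not shown; they depend on the internals of `generate.mjs` and `to-json.ts`.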
### Adding a Stellar-specific JSON override

The walker's override map (`OVERRIDES` in `values/to-json.ts`) keys on
`schema.name`. To add a new override (e.g. a new StrKey form):

```ts
OVERRIDES.set("YourTypeName", {
  toJson(wire) { /* produce JSON */ },
  fromJson(json) { /* return wire shape */ },
});
```

For the override to fire on direct calls (`value.toJson()` on a top-level
instance), the schema needs a name, passed through to the primitive
builder (e.g., `opaque(N, "YourTypeName")`). The codegen does this for
named typedefs automatically; if you're hand-rolling a schema you need to
pass the name yourself.
### Adding a DX overlay

DX overlays are hand-written files in `src/xdr/dx/` that wrap a generated
class with a more ergonomic shape (e.g. `Int128` exposing a single
`bigint`). An overlay typically:

1. Extends a value-class base (`BigIntValue`, `BytesValue`, …) or
   `XdrValue` directly.
2. Reuses the underlying generated class's `static schema` so wire format
   stays identical.
3. Provides convenient constructor and accessor patterns.
4. Is added to the public exports in `src/xdr/index.ts`.
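The core of an `Int128`-style overlay is the conversion between the `{hi, lo}` parts and a single `bigint`. A minimal sketch of that arithmetic follows, assuming `hi` is a signed 64-bit value and `lo` an unsigned 64-bit value; the function names are hypothetical and the real overlay wraps this in a value class that reuses the generated schema.

```typescript
// Assumed wire shape: hi is a signed int64, lo is an unsigned uint64.
type Int128Parts = { hi: bigint; lo: bigint };

// Combine the parts: the high word carries the sign, the low word fills the bottom 64 bits.
function partsToBigInt({ hi, lo }: Int128Parts): bigint {
  return (BigInt.asIntN(64, hi) << 64n) | BigInt.asUintN(64, lo);
}

// Split a bigint back into parts, rejecting values outside the signed 128-bit range.
function bigIntToParts(value: bigint): Int128Parts {
  if (BigInt.asIntN(128, value) !== value) {
    throw new RangeError("value does not fit in a signed 128-bit integer");
  }
  return {
    hi: value >> 64n, // arithmetic shift keeps the sign
    lo: BigInt.asUintN(64, value), // low 64 bits, reinterpreted as unsigned
  };
}

const minusOne = bigIntToParts(-1n); // hi = -1, lo = 2^64 - 1
const bigValue = partsToBigInt({ hi: 1n, lo: 5n }); // 2^64 + 5
```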

---

## Testing strategy

The XDR layer has a layered test suite mirroring the layered architecture:

| Layer                       | File                                           | Purpose                                                                                          |
|-----------------------------|------------------------------------------------|--------------------------------------------------------------------------------------------------|
| 1. Hand-written smoke       | `test/unit/base/xdr/legacy_round_trip.test.ts` | ~15 representative shapes; byte-equality against the legacy `@stellar/js-xdr` runtime            |
| 2. Real-traffic corpus      | `test/unit/base/xdr/corpus_round_trip.test.ts` | Wire bytes captured from Horizon mainnet; both SDKs decode/re-encode losslessly                  |
| 3. Schema-driven exhaustive | `test/unit/base/xdr/schema_exhaustive.test.ts` | Auto-generated default-value sample for every named class (~2000 tests, no manual additions)     |
| 4. JSON walker tests        | `test/unit/base/xdr/to_json.test.ts`           | SEP-0051 conformance per encoding rule + field-level JSON round-trips                            |
| Smoke / generated           | `test/unit/base/xdr/generated.test.ts`         | Sanity tests on codegen output (cyclic unions, lazy refs, inlined typedefs, …)                   |
| Slice                       | `test/unit/base/xdr/slice.test.ts`             | Hand-curated round-trips for the most-used types (Asset, PublicKey, AlphaNum4, Memo, Int128, …)  |

The legacy `@stellar/js-xdr`-backed generated files are checked in at
`test/fixtures/legacy-xdr/` as the wire-format oracle. Long term, once the
new SDK has stayed agreement-green for a release or two, the legacy
fixtures can be dropped and the corpus fixtures become frozen ground
truth.

To refresh the mainnet corpus:

```
pnpm tsx scripts/refresh-horizon-corpus.ts
```

---

## Key invariants

A few properties the runtime depends on. Breaking any of these will cause
broad regressions across the suite, so they're worth being deliberate
about:

- **`types/` has no `import` from `values/`.** This is enforceable by
  inspection; CI doesn't pin it yet but should.
- **Every generated class has a `static schema` and a
  `static fromXdrObject(wire)`.** `XdrValue.fromXdr` and the walker both
  depend on these being present.
- **`schema.kind` matches one of the kinds enumerated in
  `values/to-json.ts`.** Adding a new primitive without updating the
  walker will produce runtime errors on any consumer call to `toJson()`.
- **Schemas reachable from each other must form a DAG** *or* go through
  `lazy()`. Direct cyclic refs (e.g. `ScVal` containing `ScVal[]` directly
  rather than via `lazy()`) trigger a temporal-dead-zone error at module
  load. Codegen detects cycles and inserts `lazy()` automatically;
  hand-rolled schemas need to account for this themselves.
- **The wire layer is byte-honest for `Uint8Array` and `XdrString`
  passthrough.** Tests assume `decode(encode(x))` produces byte-identical
  output. Adding silent transformations at the wire layer (lossy
  re-encoding, charset conversion, etc.) will break the corpus tests
  immediately.
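The `lazy()` invariant is easiest to see with a small self-referential schema. The sketch below uses a deliberately tiny, hypothetical `Schema` interface (a `kind` plus an `accepts` check) rather than the real `BaseType<T>`; the technique it illustrates is only the deferral: the thunk is not called until the schema is first used, by which time every `const` in the module has been initialized.

```typescript
// Hypothetical minimal schema shape, just enough to demonstrate deferral.
interface Schema {
  readonly kind: string;
  accepts(value: unknown): boolean;
}

// lazy() wraps a thunk and resolves it on first use, then caches the result.
function lazy(thunk: () => Schema): Schema {
  let resolved: Schema | undefined;
  const get = (): Schema => (resolved ??= thunk());
  return {
    get kind() {
      return get().kind;
    },
    accepts: (value) => get().accepts(value),
  };
}

const int: Schema = {
  kind: "int",
  accepts: (v) => typeof v === "number" && Number.isInteger(v),
};

const array = (element: Schema): Schema => ({
  kind: "array",
  accepts: (v) => Array.isArray(v) && v.every((item) => element.accepts(item)),
});

const anyOf = (...arms: Schema[]): Schema => ({
  kind: "union",
  accepts: (v) => arms.some((arm) => arm.accepts(v)),
});

// A value is either an int or a list of values: a cycle in the schema graph.
// Writing `array(value)` here would read `value` before its initialization
// (a temporal-dead-zone error); the thunk postpones that read until first use.
const valueList: Schema = array(lazy(() => value));
const value: Schema = anyOf(int, valueList);

const nestedOk = value.accepts([1, [2, [3]]]);
const nestedBad = value.accepts([1, ["two"]]);
```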
