| 1 | +# ADR-006: Serialization & Wire Format |
| 2 | + |
| 3 | +- **Status:** Accepted |
| 4 | +- **Date:** 2026-05-29 |
| 5 | +- **Deciders:** @ayhammouda |
| 6 | +- **Roadmap refs:** principles 2.5, 2.7; decisions 5.3, 5.4, 5.5, 5.8 |
| 7 | + |
| 8 | +## Context and Problem Statement |
| 9 | + |
| 10 | +The server returns structured data for tools such as `search_docs`, |
| 11 | +`list_versions`, and `compare_versions`, while `get_docs` returns the markdown |
| 12 | +documentation body that users asked for. The structured tools need an explicit |
| 13 | +serializer layer so the project can keep one stable internal result model while |
| 14 | +choosing the wire representation independently. |
| 15 | + |
| 16 | +Serialization is not a storage decision and does not change retrieval behavior. |
| 17 | +Storage remains SQLite plus markdown. Token economy is empirical, not |
| 18 | +architectural, so any non-JSON wire format must earn its place through measured |
| 19 | +client behavior rather than architectural preference. |
| 20 | + |
| 21 | +## Decision Drivers |
| 22 | + |
| 23 | +- Preserve backward compatibility for clients that already consume compact JSON. |
| 24 | +- Keep tool behavior cloneable by documenting the serializer as a distinct layer. |
| 25 | +- Allow empirical token and latency work without coupling it to storage, |
| 26 | + retrieval, or transport. |
| 27 | +- Measure the real client path: token cost and latency after client-side rewrap, |
| 28 | + not only raw payload size. |
| 29 | +- Keep `get_docs` markdown because markdown is the canonical documentation body, |
| 30 | + not a structured result that needs alternate serialization. |
| 31 | + |
| 32 | +## Considered Options |
| 33 | + |
| 34 | +1. JSON only. |
| 35 | + - Simple, stable, and fully backward-compatible. |
| 36 | + - Leaves no room for an empirically proven smaller structured-tool format in |
| 37 | + v0.3.x. |
| 38 | +2. JSON default + `format="toon"` opt-in on structured tools. (chosen) |
| 39 | + - Keeps compact JSON as the default wire format. |
| 40 | + - Allows TOON only as an explicit opt-in and only if the v0.3.0 empirical |
| 41 | + study proves a meaningful win after client-side rewrap, with acceptable |
| 42 | + latency. |
| 43 | + - Applies only to `search_docs`, `list_versions`, and `compare_versions`. |
| 44 | +3. TOON as the storage format. (rejected - decision 5.3) |
| 45 | + - Rejected because storage stays SQLite plus markdown. |
| 46 | + - Would mix storage and wire-format concerns, weaken debuggability, and |
| 47 | + reopen a decision already closed by the roadmap. |
| 48 | + |
| 49 | +## Decision Outcome |
| 50 | + |
| 51 | +The locked shape is: compact JSON is the default; `format="toon"` is opt-in and |
| 52 | +gated by the v0.3.0 empirical study; the `format` parameter exists on |
| 53 | +`search_docs`, `list_versions`, `compare_versions` only; `get_docs` stays |
| 54 | +markdown; TOON-as-storage is rejected. |
| 55 | + |
| 56 | +The v0.3.x implementation may expose a `format` parameter for those three |
| 57 | +structured tools only after the study in decision 5.8 confirms that any TOON win |
| 58 | +holds after client-side rewrap, not just on raw payloads. |
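
The locked shape above can be sketched as a small parameter check. This is an illustrative sketch only, not the project's implementation: the `resolve_format` helper, the `toon_enabled` flag, and the error messages are hypothetical names introduced here, and the sketch assumes the study gate is modelled as a simple boolean.

```python
from typing import Optional

# Tools that may carry a `format` parameter, per the decision above.
STRUCTURED_TOOLS = frozenset({"search_docs", "list_versions", "compare_versions"})


def resolve_format(tool: str, requested: Optional[str] = None, *,
                   toon_enabled: bool = False) -> str:
    """Return the wire format for a tool call, or raise ValueError.

    `toon_enabled` is a hypothetical flag standing in for the outcome of
    the v0.3.0 empirical study (decision 5.8).
    """
    if tool == "get_docs":
        # get_docs stays markdown and takes no `format` parameter.
        if requested is not None:
            raise ValueError("get_docs does not accept a format parameter")
        return "markdown"
    if tool not in STRUCTURED_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    if requested is None or requested == "json":
        # Compact JSON is the default, so clients that never opt in see no change.
        return "json"
    if requested == "toon":
        if not toon_enabled:
            raise ValueError("format='toon' is gated by the v0.3.0 study")
        return "toon"
    raise ValueError(f"unsupported format: {requested}")
```

Under these assumptions, `resolve_format("search_docs")` returns `"json"`, and `format="toon"` is rejected until the gate is opened.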
| 59 | + |
| 60 | +### Consequences |
| 61 | + |
| 62 | +**Positive:** Existing clients keep receiving compact JSON unless they opt in to |
| 63 | +another format. The serializer layer can evolve independently from retrieval, |
| 64 | +storage, cache, and transport. The project has a clear contract to cite when the |
| 65 | +v0.3.x `format` work begins. |
| 66 | + |
| 67 | +**Negative / risks:** The TOON path cannot be treated as decided until the |
| 68 | +empirical study is complete. Supporting more than one wire format adds testing |
| 69 | +and documentation work for every structured tool that opts in. Client-side |
| 70 | +rewrap may erase a raw-payload token win, which would make JSON-only the right |
| 71 | +outcome. |
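
To illustrate why raw payload size alone can mislead, the sketch below compares a wire string before and after a rewrap step. It rests on loud assumptions: "client-side rewrap" is modelled here as embedding the wire string as a JSON string value inside an envelope, the envelope shape is hypothetical, and UTF-8 byte counts are only a rough proxy for token cost, which depends on the client's tokenizer. The helper names `rewrap` and `size_report` are made up for this example.

```python
import json


def rewrap(wire: str) -> str:
    # Assumption: the client embeds the wire string as a JSON string value,
    # so quotes and backslashes inside it get escaped.
    return json.dumps({"type": "text", "text": wire}, separators=(",", ":"))


def size_report(wire: str) -> dict:
    """Compare UTF-8 byte sizes of a payload before and after rewrap."""
    raw = len(wire.encode("utf-8"))
    wrapped = len(rewrap(wire).encode("utf-8"))
    return {
        "raw_bytes": raw,
        "rewrapped_bytes": wrapped,
        "overhead_bytes": wrapped - raw,
    }
```

A study along these lines would run the same comparison for each candidate format on the same result model and compare the rewrapped numbers, not the raw ones.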
| 72 | + |
| 73 | +## Layer Contract (principle 2.7) |
| 74 | + |
| 75 | +- **Inputs:** Structured tool result model plus the chosen wire format. |
| 76 | +- **Outputs:** Wire string returned to the MCP client. |
| 77 | +- **Invariants:** Serialization is a pure function of the result and chosen |
| 78 | + format; clients that do not opt in see no behavior change. |
| 79 | + |
| 80 | +The serializer is one of the eight documented layers in principle 2.7: source |
| 81 | +connector, ingestion, storage, retrieval, budget, serializer, cache, and |
| 82 | +transport. |
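
The contract above can be sketched as a single pure function. This is a minimal sketch under stated assumptions, not the project's serializer: the `VersionList` model, its fields, and the `serialize` name are hypothetical, and the TOON branch is deliberately left unimplemented because the format is still gated by the study.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class VersionList:
    """Hypothetical result model for a structured tool."""
    versions: Tuple[str, ...]
    default: str


def serialize(result, fmt: str = "json") -> str:
    """Pure function of (result, fmt): no I/O, no hidden state."""
    if fmt == "json":
        # Compact separators and sorted keys keep the output deterministic.
        return json.dumps(asdict(result), separators=(",", ":"), sort_keys=True)
    if fmt == "toon":
        # Placeholder: no TOON encoder is assumed to exist yet.
        raise NotImplementedError("TOON output is gated by the v0.3.0 study")
    raise ValueError(f"unsupported format: {fmt}")
```

Because the function depends only on its two inputs, calling it twice with the same result and format yields the same wire string, which is the invariant the layer contract states.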
| 83 | + |
| 84 | +## Links |
| 85 | + |
| 86 | +- STRATEGIC-ROADMAP-2026-05-29.md §2.5, §2.7, §5.3-§5.5, §5.8 |
| 87 | +- (future) v0.3.0 TOKEN-STUDY.md |