| title | GKG ADR 012: GOON Format (Graph Object Output Notation) | ||
|---|---|---|---|
| creation-date | 2026-04-15 | ||
| last-updated | 2026-05-10 | ||
| authors |
|
||
| toc_hide | true |
Accepted
2026-05-10
The GKG server returns graph query results through ResponseFormat: RAW produces structured JSON from GraphFormatter; LLM produces text from GoonFormatter. The LLM path existed in proto, gRPC routing, and CLI wiring before any encoding shipped — GoonFormatter delegated to GraphFormatter and returned the same JSON. No LLM-optimized encoding existed.
When an agent calls query_graph and receives the full GraphResponse JSON, a traversal of 50 users and 200 merge requests with 200 edges runs roughly 12,000–15,000 tokens. The same information in a columnar text format fits in 5,000–7,000 tokens while preserving the graph topology the agent needs. Over a multi-turn session of 5–10 graph queries, that is 50,000–80,000 tokens of context spent on structural JSON syntax.
Encoding choice also affects correctness. Google's "Talk like a Graph" study (Fatemi et al., ICLR 2024) measured graph reasoning accuracy as a function of encoding alone and observed swings between 4.8% and 61.8% per task. Incident encoding (grouping edges by source) outperformed flat edge lists and adjacency matrices across most tasks. Dense encodings with many edges acted as distractors. Application-context framing ("who authored which merge requests") outperformed abstract graph framing by up to 18 percentage points.
The encoding choice is a correctness concern as much as a token-budget one.
Adopt GOON, a line-oriented text format for format=llm responses. Section-marker grammar with deterministic ordering, type-grouped node bodies, relationship-grouped edges, table-shaped aggregation rows, and chain notation for paths.
The format spec was validated by:
- A 5-variant Pareto benchmark on Haiku 4.5 against the production GitLab.com graph (
gitlab-org/orbit/gkg-evals-harness). 430 task-runs over two cohorts measured cost, duration, and tool-call correctness forkv,col,hier,min,incident, and the raw JSON baseline. - A corpus audit pass running the full query corpus (then
fixtures/queries/corpus-input.json, since reorganized underfixtures/queries/corpus/) against production via four parallel sub-agents, surfacing six production-confirmed encoder gaps (each fixed and regression-tested). - A post-merge data-loss audit using two parallel sub-agents to walk every field of
GraphResponseand verify the encoder reads it. Four silent drops found and fixed.
The kv variant was Pareto-dominant over raw JSON: −11% cost, −15% duration, +4.8pp correctness, p=0.043 on tool_sequence_length. The min variant matched kv on accuracy at lower token cost, which led to the @hints block being descoped from the format — its presence in kv did not improve agent behavior over min's absence of it.
GOON is line-oriented text. Sections are delimited by @-prefixed markers, emitted in a fixed order: @header, @nodes, then exactly one of @edges (for traversal, search, neighbors) or @paths (for path_finding), and @rows for aggregation. Empty sections still emit their marker so a parser does not have to special-case absence.
Query metadata. One field per line. Always present.
@header
query_type:traversal
goon_version:1.0.0
nodes:10
edges:5
Fields:
| Field | When | Meaning |
|---|---|---|
query_type |
always | One of traversal, aggregation, path_finding, search, neighbors. |
goon_version |
always | This encoder's wire version (semver). Distinct from the upstream GraphResponse.format_version, which is the source schema version owned by ADR 004. |
nodes / edges |
always | Counts of entries in @nodes / @edges. For aggregation with node-kind group columns, nodes reflects the deduplicated entities lifted from rows. |
rows |
aggregation | Number of rows in @rows. |
group_by |
aggregation with group_by |
Comma-separated descriptors: name(kind) or name(kind:detail). Kind is node (detail is the entity type, e.g. severity(node:Vulnerability)) or property (detail is the underlying ontology property when the alias differs, e.g. severity_bucket(property:severity)). |
aggregations |
aggregation with metrics | Comma-separated descriptors: name(function), name(function:target), or name(function:target.property). The richest form names both the node alias and the property being aggregated, so latest_update(max:v.updated_at) is unambiguous. |
has_more |
when cursor paginated and there are more rows | true. |
total_rows |
when cursor paginated | Authorized row count before cursor slicing. |
Nodes grouped by entity type. Each type starts with TypeName(count): then one entity per line. Each line begins with the integer ID, followed by key=value pairs.
@nodes
User(3):
1 username=alice name="Alice Smith" state=active
2 username=bob name="Bob Chen" state=active
3 username=carol name="Carol Davis" state=blocked
MergeRequest(2):
42 iid=101 state=merged title="Fix auth bug"
43 iid=102 state=merged title="Add caching layer"
Ordering is (entity_type, id) for traversal, search, neighbors, and path_finding. For aggregation, server row order is preserved so the order specified by aggregation_sort survives the encode pass.
For aggregation queries with node-kind group columns, the encoder lifts each unique {type, id, properties} cell from @rows into @nodes (deduplicated by (entity_type, id)). This keeps row lines one line each and avoids repeating node bodies on every bucket.
| Source | Encoded as | Example |
|---|---|---|
null, empty string |
omitted from the row | (key does not appear) |
null in a @rows cell |
bare null (literal) |
severity=null count=5 — a real bucket value, distinct from the string "null" which is quoted |
true / false (JSON bool) |
bare token | draft=true |
"true" / "false" / "null" (JSON string) |
quoted | state="true" — distinguishes a string from a native boolean |
| Integer | bare digits | iid=18, id=12971673076 (precision preserved up to i64) |
| Finite float | bare | avg_duration=941.131772070606 |
NaN, ±Inf |
dropped | (key does not appear) |
String matching [A-Za-z0-9_\-:./@+]+ or an ISO datetime |
bare | username=stanhu, created_at=2026-05-08T22:55:58Z |
ClickHouse datetime YYYY-MM-DD HH:MM:SS[.fraction] |
T-form (space at position 10 swapped to T) |
created_at=2026-05-08T22:55:58.467450 |
| Any other string | double-quoted with \\, \", \n, \r, \t escapes; other control chars dropped |
title="line one\nline two" |
Long text (body, description, name, note, title) over 200 chars |
truncated with ... plus a sibling <key>_len=N breadcrumb |
description="..." description_len=2308 |
| Any other string over 1000 chars | same truncation + breadcrumb |
Datetime validation goes through chrono::NaiveDateTime::parse_from_str and DateTime::parse_from_rfc3339. The output is built byte-for-byte from the input with at most one byte (the space at position 10) swapped to T; the source's fractional precision is preserved exactly rather than being round-tripped through chrono's nanosecond default.
Property order within a node row is column-priority then alphabetical: identity (iid, username, name, full_path, path, uuid) first, then status enums (state, status, visibility_level), then everything else, then timestamps (created_at, updated_at, merged_at, closed_at), then long text (title, description, body, note) last. This means a truncated description never hides a shorter identity field.
Edges grouped by relationship type. Arrow notation matches what models naturally produce when formatting graph data.
@edges
AUTHORED(3):
User:1 --> MergeRequest:42
User:1 --> MergeRequest:43
User:2 --> MergeRequest:44
IN_PROJECT(2):
MergeRequest:42 --> Project:100
MergeRequest:43 --> Project:100
Ordering is total: (path_id, step, edge_type, from, from_id, to, to_id, depth). Duplicates are removed using the same key. The total ordering means shuffle-invariance under property tests.
For variable-length traversals, an edge with a depth tag carries it on the row:
@edges
MEMBER_OF(2):
User:1 --> Group:100 depth=1
User:1 --> Group:200 depth=2
For path_finding, @edges is replaced by @paths. Each path is one line: a chain of Type:id --REL--> Type:id segments. The agent gets the path identity without reassembling it from scattered edges.
@paths
path=0: User:64248 --AUTHORED--> MergeRequest:482927048 --IN_PROJECT--> Project:278964
The @nodes section still carries full properties for each entity referenced in the chain.
Table-shaped aggregation rows. One row per line. Group columns come first, then metric columns, in the order declared by group_by: and aggregations:.
Property grouping with a single metric:
@header
query_type:aggregation
goon_version:1.0.0
nodes:0
edges:0
rows:5
group_by:severity(property)
aggregations:vulnerability_count(count:v)
@nodes
@edges
@rows
severity=medium vulnerability_count=8421
severity=high vulnerability_count=2350
severity=low vulnerability_count=1542
severity=critical vulnerability_count=120
severity=info vulnerability_count=42
Node grouping, with the lifted entity in @nodes:
@header
query_type:aggregation
goon_version:1.0.0
nodes:3
edges:0
rows:3
group_by:u(node:User)
aggregations:merged_count(count:u)
@nodes
User(3):
1243277 username=ghost1
35702613 username=bot_a
26832240 username=bot_b
@edges
@rows
u=User:1243277 merged_count=65555
u=User:35702613 merged_count=21277
u=User:26832240 merged_count=20289
Ungrouped (scalar) aggregation flows through the same @rows path with a single row:
@header
query_type:aggregation
goon_version:1.0.0
nodes:0
edges:0
rows:1
aggregations:total(count:u)
@nodes
@edges
@rows
total=2347
GoonFormatter lives in crates/query-engine/formatters/src/goon/. It implements the ResultFormatter trait the same way GraphFormatter does: format(&self, output: &PipelineOutput) -> Value. For LLM responses it composes GraphFormatter::build_response(output) with goon::encode(&response, &GOON_OUTPUT_FORMAT_VERSION) and wraps the result in Value::String.
Wiring at crates/gkg-server/src/grpc/service.rs dispatches statically per request: req.format == ResponseFormat::Llm calls GoonFormatter.format_stamped(&output); otherwise GraphFormatter.format_stamped(&output). The result rides the gRPC ExecuteQueryResult.formatted_text field with format-name and format-version metadata.
The encoder reads every field of GraphResponse (audited via parallel sub-agents post-implementation). Fields that travel:
- All node and edge fields, including
GraphEdge.depthfor variable-length traversals. ColumnDescriptor.targetand.property(rendered in theaggregations:line asfunction:target.property).GroupColumnDescriptor.entity(rendered asname(node:Entity)) and.propertywhen the alias differs from the underlying property name (rendered asname(property:underlying)).Value::Nullin@rowscells renders as barenullso a "no severity assigned" bucket stays distinguishable from an absent column.
Fields intentionally not surfaced:
GraphResponse.format_version— the upstream RAW schema version. The encoder emitsgoon_versioninstead; mixing both in one header creates the same field-name conflict that motivated the rename.GroupColumnDescriptor.node— the source node alias is internal compiler state.GraphEdge.path_id/step— used as sort keys and to drive@pathschain order; not surfaced as visible fields.
Locked by property tests with 64 cases each:
shuffle_invariant: a random payload shuffled by a seeded RNG must encode byte-identically.encoding_is_pure: same input encodes to same bytes across calls.output_starts_with_header:@header\nalways first.no_unescaped_control_chars: no raw\ror\treaches the output.
config/GOON_OUTPUT_FORMAT_VERSION (semver, 1.0.0 at first release) follows the same discipline as RAW_OUTPUT_FORMAT_VERSION from ADR 004:
scripts/check-goon-format-version.shmirrorscheck-response-schema-version.sh. It watchescrates/query-engine/formatters/src/goon/**.rs,graph.rs, andlib.rs, and requires a version bump on any change.- Lefthook runs the check pre-commit; GitLab CI runs it on MRs in the
lintstage. - Bypass for wire-neutral edits:
[skip goon-format-version-check]in the MR description orSKIP_GOON_FORMAT_VERSION_CHECK=1locally.
| Layer | Where | Count | Covers |
|---|---|---|---|
| Unit | crates/query-engine/formatters/src/goon/tests.rs |
51 | Header structure, sections, quoting, escape rules, datetime normalization, truncation, numerics, edges, dedup, path-finding, aggregation shapes (property + node + ungrouped), Value::Null row cells, depth on variable-length edges |
Property (proptest) |
tests/goon_properties.rs |
4 × 64 | Shuffle invariance, idempotence, header prefix, no unescaped control chars |
Snapshot (insta) |
tests/goon_snapshots.rs |
7 | One golden file per query shape + pagination |
| Integration | crates/integration-tests/tests/server/goon_formatter.rs |
8 subtests | Full compile → execute → redact → hydrate → format path against ClickHouse testcontainers; asserts format_stamped returns (Value::String, version, FormatName::Goon), headers carry goon_version, escape behavior, aggregation shapes, raw/goon count agreement |
Return JSON with a token-efficient serialization (MessagePack, CBOR). Binary formats save bytes, not tokens. LLMs tokenize text; a base64-wrapped MessagePack payload tokenizes about the same as the JSON it encodes.
Return natural language summaries. Natural language is the least token-efficient encoding. "User alice authored merge request 42 titled Fix auth bug which is merged" is 15 tokens; 42 iid=101 title="Fix auth bug" state=merged is 9. It is also unparseable for follow-up queries.
Pipe-delimited columnar tables (TOON-style headers + values). Declaring column names once and emitting 1|alice|active is ~10–15% more token-efficient than key=value for large result sets. Analysis of 316 AI coding sessions (83 Claude Code, 233 Codex) showed models never produce columnar output when formatting graph data. They consistently use inline key=value. The Pareto benchmark confirmed: col did not beat kv on cost-adjusted correctness.
Return the same JSON as format=raw. Wastes 40–60% of tokens on {, }, "key":, and commas. The starting point that motivated this ADR.
JSON with abbreviated keys. Shortening "username" to "u" saves tokens but forces the model to maintain a key mapping across the response. The key=value shape keeps full names inline at comparable token cost.
Keep the @hints navigation block. The kv variant carried a @hints block listing available outgoing relationships per entity type. The Pareto benchmark showed min (no hints) matched kv on accuracy at lower token cost. The hints did not change agent behavior enough to pay for themselves, so the block was descoped.
Hierarchical / incident-only encoding (one section per source node). The hier and incident variants scored higher on isolated graph-reasoning tasks but worse on the cost/duration axes in the benchmark. The Pareto front was won by kv's flat type-grouped shape.
What improves:
- 40–60% token reduction on the LLM path, validated across all five query shapes against production data.
- −11% cost, −15% duration, +4.8pp correctness against raw JSON on Haiku 4.5 (p=0.043 on
tool_sequence_length). - Format matches what models naturally produce and consume for graph data (validated against 316 session transcripts).
- The shape is a pure function of
GraphResponse— every visible field is a function of one input field. Adding a new field to the wire response is the only way to extend the format. orbit query --format=llmoutputs human-scannable text instead of dense JSON.
What gets harder:
- GOON is GKG-specific. Changes to
GraphResponseneed corresponding changes to the encoder (caught by the versioned CI check). - String quoting has more rules than pipe-delimited (escapes for
\\,\",\n,\r,\t; control char drops). The encoder treats this as one pass; the rules are tested. - Snapshot tests catch any unintended format drift, which is intentional but means any deliberate format change is a multi-line snapshot diff.
Cross-language parity (Rust encoder vs the Python prototype in gkg-evals-harness:vendor/skills/orbit-goon-kv/scripts/goon_encode.py) is locked on the Rust side only. Byte-identical output on identical inputs is a deferred follow-up.
HTTP response body shape. Workhorse owns the HTTP response. For format=llm, the body is currently {result: "<goon-text>", query_type, raw_query_strings, row_count} — the goon string lives in the result field of a JSON envelope. Returning the goon text as a plain text/plain body (so a viewer renders real newlines instead of \n escapes) requires a change in workhorse/internal/orbit/sendquery.go, not in GKG. That change is not part of this ADR.
- Issue: #271
- Implementation MR: !1289 (merged)
- Audit follow-up MR: !1291
- Property grouping (changed aggregation wire shape): !1287
- Benchmark harness:
gkg-evals-harness!1 - Benchmark result note: MR note 3331102607
- Eval archive:
gitlab-org/orbit/orbit-evals-results-archive - ADR 003 (API Design):
003_api_design.md - ADR 004 (Unified Query Response Schema):
004_unified_response_schema.md - ADR 008 (Workhorse Query Acceleration):
008_workhorse_query_acceleration.md - Fatemi, Halcrow, Perozzi. "Talk like a Graph: Encoding Graphs for Large Language Models." ICLR 2024.
- TOON specification
- Orbit API Design (snippet)
- Encoder source:
crates/query-engine/formatters/src/goon/ - Wire version constant:
config/GOON_OUTPUT_FORMAT_VERSION - Response JSON Schema:
config/schemas/query_response.json - Proto:
crates/gkg-server/proto/gkg.proto(ResponseFormat::RESPONSE_FORMAT_LLM)