The Format Tax — Recommendation Policy

This document defines the policy behind the homepage recommendation engine.

It is intentionally not the paper. The paper stays descriptive and conservative. The homepage is allowed to recommend a primary format, but every recommendation must expose:

  • a primary choice
  • a fallback
  • a confidence level
  • an evidence basis
  • the schema or validator pairing that makes the recommendation hold
  • the parser pairing where streaming matters
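
Concretely, that contract can be pictured as a small record type. A minimal TypeScript sketch with hypothetical field names (this is not the engine's actual API); the label values anticipate the Evidence Ladder and Confidence Levels sections below:

```ts
// Illustrative only: hypothetical field names, not the engine's real API.
type Confidence = "high" | "medium" | "low";
type Evidence = "benchmark-backed" | "literature-backed" | "operational-heuristic";

interface Recommendation {
  primary: string;         // e.g. "JSON + constrained decoding"
  fallback: string;        // e.g. "YAML"
  confidence: Confidence;
  evidence: Evidence;
  schemaPairing: string;   // the schema/validator pairing that makes the pick hold
  parserPairing?: string;  // only set where streaming matters
}
```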

Design Principle

Users do not want a pile of formats. They want:

  1. the most practical answer for their use case
  2. why it wins
  3. what would change the answer

That means the old category tree is replaced with a decision policy.

Evidence Ladder

Homepage recommendations are labeled with one of three evidence bases:

benchmark-backed

Used when the project has locally reproduced benchmark evidence for the relevant decision boundary.

literature-backed

Used when the strongest available support comes from published work, official project benchmarks, or official platform documentation, but local reproduction is incomplete.

operational-heuristic

Used when the recommendation is mainly an engineering judgment based on ecosystem maturity, ergonomics, and deployment reality rather than a clean benchmark win.

Confidence Levels

high

  • strong ecosystem support
  • direct operational fit
  • little disagreement in the available evidence

medium

  • plausible winner
  • evidence is incomplete, contested, or not yet locally reproduced

low

  • speculative
  • emerging format or narrow benchmark support

Low-confidence winners should be avoided on the homepage unless the user explicitly opts into experimental paths.

Inputs

The engine ranks candidates from four inputs:

  1. Boundary
    • LLM input
    • LLM output
    • streaming UI
    • config
    • backend transport
    • storage
  2. Data shape
    • uniform tabular
    • nested structured
    • mixed prose + data
    • state / memory
    • simple key-value
  3. Priority
    • accuracy
    • token cost
    • generation validity
    • maintainability
    • latency
  4. Hard constraints
    • constrained decoding
    • progressive streaming
    • comments / embedded docs
    • broad interoperability
    • schema evolution

Hard Rules

These are non-negotiable overrides.

If the boundary is LLM output and the target is software

  • Primary: JSON + constrained decoding
  • Fallback: YAML
  • Confidence: High
  • Evidence: Literature-backed

Reason:

  • JSON Schema-backed structured output tooling is the strongest current ecosystem.
  • The relevant advantage is not “JSON alone”; it is JSON + schema-aware generation.
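
A minimal sketch of what "JSON + schema-aware generation" looks like in practice, using an OpenAI-style structured-output call. The model name, request shape, and schema are illustrative; verify the exact fields against current provider documentation before relying on them:

```ts
// Hedged sketch: the schema, not the format, does the heavy lifting.
import OpenAI from "openai";

const client = new OpenAI();

// The schema constrains decoding: keys and values outside it cannot be emitted.
const taskSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    priority: { type: "string", enum: ["low", "medium", "high"] },
  },
  required: ["name", "priority"],
  additionalProperties: false,
};

const completion = await client.chat.completions.create({
  model: "gpt-4o-mini", // illustrative model name
  messages: [{ role: "user", content: "Extract the task: ship the report, it's urgent" }],
  response_format: {
    type: "json_schema",
    json_schema: { name: "task", schema: taskSchema, strict: true },
  },
});

// With strict schema-guided decoding, this parse should not fail on shape.
const task = JSON.parse(completion.choices[0].message.content ?? "{}");
```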

If the boundary is streaming UI and the granularity is property-level

  • Primary: YAML
  • Fallback: JSONL
  • Confidence: High
  • Evidence: Literature-backed

Reason:

  • meaningful partial prefixes matter more than universal interchange
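
A hedged sketch of why this holds: in a flat YAML mapping, every completed line is a meaningful prefix, so a UI can render each property the moment its line closes. The parser below is hand-rolled for illustration only; a production path would use an incremental YAML parser:

```ts
// Minimal property-level stream consumer for a flat YAML mapping.
function makePropertyStream(onProperty: (key: string, value: string) => void) {
  let buffer = "";
  return (chunk: string) => {
    buffer += chunk;
    let newline: number;
    while ((newline = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newline);
      buffer = buffer.slice(newline + 1);
      // Flat `key: value` lines only; nested structures need a real parser.
      const match = /^(\w+):\s*(.*)$/.exec(line);
      if (match) onProperty(match[1], match[2]); // render this field now
    }
  };
}

// Usage: feed raw model tokens as they stream in.
const feed = makePropertyStream((k, v) => console.log(`render ${k} = ${v}`));
feed("title: Quarterly repo");
feed("rt\nsummary: Revenue up 4%\n"); // "title" renders once its line completes
```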

If the boundary is config and the shape is simple key-value

  • Primary: TOML
  • Fallback: YAML
  • Confidence: High
  • Evidence: Operational heuristic

Reason:

  • maintainability beats universality for human-authored simple config
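
The "parse then validate" pairing referenced in the table below, sketched with hypothetical library choices (the smol-toml parser and Zod); any TOML parser and validator pair works the same way:

```ts
// Hedged sketch: step 1 parses (syntax errors surface here),
// step 2 validates (shape errors surface here, with useful messages).
import { parse } from "smol-toml"; // assumed parser choice
import { z } from "zod";

const ConfigSchema = z.object({
  name: z.string(),
  port: z.number().int().positive(),
  debug: z.boolean().default(false),
});

const raw = `
name = "format-tax"
port = 8080
`;

const parsed = parse(raw);
const config = ConfigSchema.parse(parsed);
```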

If the boundary is config and the artifact mixes prose with structure

  • Primary: Markdown + Frontmatter
  • Fallback: YAML
  • Confidence: High
  • Evidence: Operational heuristic

Reason:

  • the artifact is partly documentation, not just a payload
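
A sketch of this pairing, assuming the gray-matter package: the YAML header gets validated, and the Markdown body passes through as prose:

```ts
// Hedged sketch of "Zod on frontmatter": structure is checked,
// documentation is left alone.
import matter from "gray-matter";
import { z } from "zod";

const Frontmatter = z.object({
  title: z.string(),
  model: z.string(),
  temperature: z.number().min(0).max(2),
});

const doc = `---
title: Summarizer prompt
model: gpt-4o-mini
temperature: 0.2
---
Summarize the input in three bullet points.`;

const { data, content } = matter(doc);
const meta = Frontmatter.parse(data); // the structured half is validated
// `content` is the prose/documentation half of the artifact
```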

Current Recommendation Table

| Use case | Primary | Fallback | Confidence | Evidence | Schema / Parser pairing |
| --- | --- | --- | --- | --- | --- |
| LLM output to software pipeline | JSON + constrained decoding | YAML | High | Literature-backed | JSON Schema |
| Agent state / memory for model retrieval | Markdown-KV | YAML | Medium | Literature-backed | Template checks or pre-validation |
| Mixed prose + structured instructions | Markdown + Frontmatter | YAML | High | Operational heuristic | Zod or Pydantic on frontmatter |
| Simple model-facing key-value input | YAML | TOML | Medium | Operational heuristic | Parse then validate |
| Flat repeated records, optimize for token cost | TOON | CSV | Medium | Literature-backed | Validate source data before conversion |
| Flat repeated records, optimize for retrieval accuracy | CSV | TOON | Medium | Literature-backed | External validation before conversion |
| Property-level streaming UI | YAML | JSONL | High | Literature-backed | json-render-style YAML streaming |
| Element-level streaming UI | JSONL | YAML | High | Benchmark-backed | Line-by-line parser |
| Simple human-maintained config | TOML | YAML | High | Operational heuristic | Parse then validate |
| Nested operational config | YAML | TOML | High | Operational heuristic | JSON Schema or Pydantic after parse |
| Low-latency backend transport | FlatBuffers | Protobuf | Medium | Literature-backed | .fbs + codegen |
| General inter-service transport | Protobuf | JSON | High | Operational heuristic | .proto + compatibility rules |
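
For the element-level streaming row, the pairing is worth spelling out: each complete JSONL line is a full, valid JSON element, so a line-by-line parser can render elements as they arrive. A minimal hand-rolled sketch, for illustration:

```ts
// Element-level JSONL stream consumer: buffer the trailing partial line,
// emit every completed line as a parsed element.
function makeElementStream<T>(onElement: (el: T) => void) {
  let buffer = "";
  return (chunk: string) => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the incomplete final line
    for (const line of lines) {
      if (line.trim()) onElement(JSON.parse(line) as T);
    }
  };
}

const feed = makeElementStream<{ id: number }>((el) => console.log("render", el.id));
feed('{"id": 1}\n{"id"');
feed(': 2}\n'); // the second element renders only when its line closes
```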

Tiebreakers

When more than one candidate survives the hard rules, rank them in this order:

  1. hard constraint satisfaction
  2. operational maturity
  3. evidence strength
  4. fit to the selected priority
  5. migration cost

This ordering is deliberate. A slightly more efficient format does not win if it forces a fragile or poorly supported workflow.
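
One way to picture the ordering is a lexicographic comparator: an earlier criterion always dominates the later ones, so efficiency can never outrank maturity. The scores below are hypothetical inputs the engine would compute per candidate:

```ts
// Hedged sketch of the tiebreaker as a lexicographic sort.
interface Candidate {
  format: string;
  constraintsSatisfied: number; // hard constraints met (higher wins)
  maturity: number;             // operational maturity (higher wins)
  evidence: number;             // evidence strength (higher wins)
  priorityFit: number;          // fit to the selected priority (higher wins)
  migrationCost: number;        // lower wins
}

function rank(candidates: Candidate[]): Candidate[] {
  return [...candidates].sort(
    (a, b) =>
      b.constraintsSatisfied - a.constraintsSatisfied ||
      b.maturity - a.maturity ||
      b.evidence - a.evidence ||
      b.priorityFit - a.priorityFit ||
      a.migrationCost - b.migrationCost
  );
}
```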

Why Not JSON?

The recommendation engine should explain this explicitly whenever JSON is not the winner.

Common reasons:

  • repeated keys waste tokens on model-facing input
  • monolithic documents are poor for progressive streaming
  • JSON is hostile to comments and embedded explanation
  • binary protocols are better for hot service-to-service paths

Important:

  • JSON often is the winner for model output
  • JSON is not the villain; uniform JSON usage across all stages is

Schema Policy

The recommendation engine treats schemas as a separate layer.

  • Validators do not automatically make a format better for a model
  • The strongest current generation benefit comes from schema-guided output strategies
  • In practice, this usually means:
    • emit JSON Schema directly, or
    • generate JSON Schema from Zod, TypeBox, or Pydantic

The homepage should not imply that Zod, Valibot, Effect Schema, or Pydantic each independently improve generation quality by virtue of being those libraries. That is a benchmark question, not an assumption.
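
A sketch of the recommended layering, assuming the zod-to-json-schema package: the validator library is an authoring convenience, and the artifact the model actually sees is JSON Schema (Pydantic's model_json_schema() plays the same role in Python):

```ts
// Hedged sketch of "generate JSON Schema from Zod".
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

const Task = z.object({
  name: z.string(),
  priority: z.enum(["low", "medium", "high"]),
});

// This JSON Schema, not the Zod object, is what schema-guided decoding consumes.
const jsonSchema = zodToJsonSchema(Task, "task");
```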

Non-Goals

The engine should not:

  • return five equally weighted answers
  • rank a format highly just because its own repository claims it wins
  • merge local results with external claims into one synthetic score
  • hide uncertainty behind confident copy

Relationship To The Paper

The paper asks:

  • what is measured
  • what has been reproduced
  • what remains uncertain

The homepage answers:

  • what should I use right now?

Those are different products and should remain different.