This document defines the policy behind the homepage recommendation engine.
It is intentionally not the paper. The paper stays descriptive and conservative. The homepage is allowed to recommend a primary format, but every recommendation must expose:
- a primary choice
- a fallback
- a confidence level
- an evidence basis
- the schema or validator pairing that makes the recommendation hold
- the parser pairing where streaming matters
Users do not want a pile of formats. They want:
- the most practical answer for their use case
- why it wins
- what would change the answer
That means the old category tree is replaced with a decision policy.
Homepage recommendations are labeled with one of three evidence bases:
Benchmark-backed:
Used when the project has locally reproduced benchmark evidence for the relevant decision boundary.
Literature-backed:
Used when the strongest available support comes from published work, official project benchmarks, or official platform documentation, but local reproduction is incomplete.
Operational heuristic:
Used when the recommendation is mainly an engineering judgment based on ecosystem maturity, ergonomics, and deployment reality rather than a clean benchmark win.
Confidence levels are labeled separately:
High:
- strong ecosystem support
- direct operational fit
- little disagreement in the available evidence
Medium:
- plausible winner
- evidence is incomplete, contested, or not yet locally reproduced
Low:
- speculative
- emerging format or narrow benchmark support
Low-confidence winners should be avoided on the homepage unless the user explicitly opts into experimental paths.
The engine ranks candidates from four inputs:
- Boundary
  - LLM input
  - LLM output
  - streaming UI
  - config
  - backend transport
  - storage
- Data shape
  - uniform tabular
  - nested structured
  - mixed prose + data
  - state / memory
  - simple key-value
- Priority
  - accuracy
  - token cost
  - generation validity
  - maintainability
  - latency
- Hard constraints
  - constrained decoding
  - progressive streaming
  - comments / embedded docs
  - broad interoperability
  - schema evolution
Hard constraints are non-negotiable overrides: a format that cannot satisfy one is eliminated before ranking begins.
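That override behavior amounts to a filter that runs before any scoring. A minimal sketch; the candidate capability sets below are invented for illustration, not measured:

```python
# Illustrative hard-constraint filter: a candidate that cannot satisfy
# every hard constraint is eliminated before ranking begins.
CANDIDATES = {
    "JSON":  {"constrained decoding", "broad interoperability"},
    "YAML":  {"progressive streaming", "comments / embedded docs"},
    "TOML":  {"comments / embedded docs"},
    "JSONL": {"progressive streaming", "broad interoperability"},
}

def survivors(hard_constraints: set[str]) -> list[str]:
    """Return candidates whose capability set covers every hard constraint."""
    return [name for name, caps in CANDIDATES.items()
            if hard_constraints <= caps]
```

With this shape, ranking criteria never see a candidate that fails a hard constraint, which is exactly the override semantics described above.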
LLM output to software pipeline:
- Primary: JSON + constrained decoding
- Fallback: YAML
- Confidence: High
- Evidence: Literature-backed
Reason:
- JSON Schema-backed structured output tooling is the strongest current ecosystem.
- The relevant advantage is not "JSON alone"; it is JSON + schema-aware generation.
Property-level streaming UI:
- Primary: YAML
- Fallback: JSONL
- Confidence: High
- Evidence: Literature-backed
Reason:
- meaningful partial prefixes matter more than universal interchange
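For the JSONL fallback, element-level streaming reduces to buffering until a newline arrives and parsing each complete line. A stdlib-only sketch:

```python
import json

def iter_jsonl(chunks):
    """Yield parsed objects as complete JSONL lines arrive.

    `chunks` is any iterable of string fragments (e.g. network reads);
    partial lines are buffered until their terminating newline shows up.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    if buffer.strip():  # trailing record without a final newline
        yield json.loads(buffer)
```

Each emitted element is complete and independently valid, which is why JSONL suits element-level rather than property-level streaming.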
Simple human-maintained config:
- Primary: TOML
- Fallback: YAML
- Confidence: High
- Evidence: Operational heuristic
Reason:
- maintainability beats universality for human-authored simple config
Mixed prose + structured instructions:
- Primary: Markdown + Frontmatter
- Fallback: YAML
- Confidence: High
- Evidence: Operational heuristic
Reason:
- the artifact is partly documentation, not just a payload
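Validating frontmatter presupposes splitting the document first. A minimal splitter, assuming the conventional `---` delimiters and deliberately leaving the frontmatter text to a real YAML parser plus Zod/Pydantic-style checks:

```python
def split_frontmatter(doc: str) -> tuple[str, str]:
    """Return (frontmatter_text, body) for a Markdown + Frontmatter document.

    Only the split is handled here; the frontmatter text would then be
    parsed with a YAML library and validated separately.
    """
    if not doc.startswith("---\n"):
        return "", doc          # no frontmatter block present
    end = doc.find("\n---\n", 4)
    if end == -1:
        return "", doc          # unterminated block: treat as plain body
    return doc[4:end], doc[end + 5:]
```

Keeping the split dumb is a design choice: the prose half stays untouched documentation while only the structured half enters the validation pipeline.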
| Use case | Primary | Fallback | Confidence | Evidence | Schema / Parser pairing |
|---|---|---|---|---|---|
| LLM output to software pipeline | JSON + constrained decoding | YAML | High | Literature-backed | JSON Schema |
| Agent state / memory for model retrieval | Markdown-KV | YAML | Medium | Literature-backed | Template checks or pre-validation |
| Mixed prose + structured instructions | Markdown + Frontmatter | YAML | High | Operational heuristic | Zod or Pydantic on frontmatter |
| Simple model-facing key-value input | YAML | TOML | Medium | Operational heuristic | Parse then validate |
| Flat repeated records, optimize for token cost | TOON | CSV | Medium | Literature-backed | Validate source data before conversion |
| Flat repeated records, optimize for retrieval accuracy | CSV | TOON | Medium | Literature-backed | External validation before conversion |
| Property-level streaming UI | YAML | JSONL | High | Literature-backed | json-render-style YAML streaming |
| Element-level streaming UI | JSONL | YAML | High | Benchmark-backed | line-by-line parser |
| Simple human-maintained config | TOML | YAML | High | Operational heuristic | Parse then validate |
| Nested operational config | YAML | TOML | High | Operational heuristic | JSON Schema or Pydantic after parse |
| Low-latency backend transport | FlatBuffers | Protobuf | Medium | Literature-backed | .fbs + codegen |
| General inter-service transport | Protobuf | JSON | High | Operational heuristic | .proto + compatibility rules |
When more than one candidate survives the hard rules, rank them in this order:
- hard constraint satisfaction
- operational maturity
- evidence strength
- fit to the selected priority
- migration cost
This ordering is deliberate. A slightly more efficient format does not win if it forces a fragile or poorly supported workflow.
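The ordering above maps naturally onto lexicographic comparison: score each surviving candidate on each criterion and compare tuples in priority order. The scores below are invented for illustration (hard-constraint satisfaction is assumed already enforced by filtering):

```python
# Illustrative tie-break: criteria are compared in strict priority order,
# so operational maturity beats evidence strength, which beats priority
# fit, which beats migration cost (negated so lower cost ranks higher).
candidates = {
    # name: (maturity, evidence_strength, priority_fit, -migration_cost)
    "YAML":  (3, 2, 3, -1),
    "JSONL": (3, 3, 2, -1),
    "TOON":  (1, 3, 3, -2),  # efficient but immature: must not win
}

ranked = sorted(candidates, key=candidates.get, reverse=True)
```

Because comparison is lexicographic, TOON's strong priority-fit score never gets consulted: it already lost on maturity, which is exactly the behavior the ordering is meant to guarantee.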
Whenever JSON is not the winner, the recommendation engine should say so explicitly and explain why.
Common reasons:
- repeated keys waste tokens on model-facing input
- monolithic documents are poor for progressive streaming
- JSON is hostile to comments and embedded explanation
- binary protocols are better for hot service-to-service paths
Important:
- JSON often is the winner for model output
- JSON is not the villain; uniform JSON usage across all stages is
The recommendation engine treats schemas as a separate layer.
- Validators do not automatically make a format better for a model
- The strongest current generation benefit comes from schema-guided output strategies
- In practice, this usually means:
- emit JSON Schema directly, or
- generate JSON Schema from Zod, TypeBox, or Pydantic
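A sketch of the first option, writing a JSON Schema by hand; the schema content and field names are invented for illustration, and the resulting dict would be handed to whatever schema-guided decoding API the serving stack exposes:

```python
import json

# Illustrative schema, emitted directly as JSON Schema rather than
# generated from Zod, TypeBox, or Pydantic.
EXTRACTION_SCHEMA = {
    "type": "object",
    "properties": {
        "title":  {"type": "string"},
        "tags":   {"type": "array", "items": {"type": "string"}},
        "urgent": {"type": "boolean"},
    },
    "required": ["title", "urgent"],
    "additionalProperties": False,
}

# The schema itself is plain JSON, which is the form schema-guided
# generation endpoints typically accept.
payload = json.dumps(EXTRACTION_SCHEMA)
```

Either route ends at the same artifact: a JSON Schema document. That is why the schema layer, not the authoring library, carries the generation benefit.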
The homepage should not imply that Zod, Valibot, Effect Schema, or Pydantic each independently improve generation quality by virtue of being those libraries. That is a benchmark question, not an assumption.
The engine should not:
- return five equally weighted answers
- rank a format highly just because its own repository claims it wins
- merge local results with external claims into one synthetic score
- hide uncertainty behind confident copy
The paper asks:
- what is measured
- what has been reproduced
- what remains uncertain
The homepage answers:
- what should I use right now?
Those are different products and should remain different.