Schemas are a separate layer from serialization.
That sounds obvious, but many format debates blur the two:
- a format is blamed for generation failures that are really schema failures
- a validator library is credited for generation quality when the real win came from JSON Schema or constrained decoding
- teams compare Zod, Pydantic, and Valibot as if they were serialization formats
This document keeps those layers separate.
A schema layer does three jobs. The first is definition: what shape is allowed?
Examples:
- JSON Schema
- Pydantic models
- Zod schemas
- TypeBox schemas
- Protobuf `.proto`
The second job is validation: did the actual payload conform?
Examples:
- Zod parse / safeParse
- Pydantic validation
- AJV against JSON Schema
The third job is generation guidance: can the model be steered toward the contract at generation time?
Examples:
- JSON Schema passed to structured-output systems
- schema text embedded in the prompt
- constrained decoding
The third job is the one most directly relevant to generation quality.
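The three jobs can be sketched against a single contract using only the Python standard library. The `TICKET_SCHEMA` contract, the `validate` helper, and the prompt wording are illustrative assumptions, not part of the project. The checker handles only a tiny subset of JSON Schema (`type`, `required`, `properties`) and is not a substitute for a real validator such as AJV or Pydantic.

```python
import json

# Job 1 - definition: a small JSON Schema document (hypothetical example contract).
TICKET_SCHEMA = {
    "type": "object",
    "required": ["id", "priority"],
    "properties": {
        "id": {"type": "integer"},
        "priority": {"type": "string"},
    },
}

# Maps the JSON Schema type names this sketch supports to Python types.
_PY_TYPES = {"object": dict, "string": str, "integer": int}


def validate(payload, schema):
    """Job 2 - validation: return a list of error strings (empty means conforming)."""
    expected = _PY_TYPES[schema["type"]]
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(payload, expected) or isinstance(payload, bool):
        return [f"expected {schema['type']}"]
    errors = []
    if schema["type"] == "object":
        for key in schema.get("required", []):
            if key not in payload:
                errors.append(f"missing required field: {key}")
        for key, sub in schema.get("properties", {}).items():
            if key in payload:
                errors += [f"{key}: {e}" for e in validate(payload[key], sub)]
    return errors


def prompt_with_schema(task, schema):
    """Job 3 - generation guidance: embed the schema text in the prompt."""
    return f"{task}\nReturn JSON matching this JSON Schema:\n{json.dumps(schema, indent=2)}"


model_output = '{"id": 7, "priority": "high"}'  # stand-in for a model response
print(validate(json.loads(model_output), TICKET_SCHEMA))  # []
print(validate({"id": "7"}, TICKET_SCHEMA))
# ['missing required field: priority', 'id: expected integer']
```

Note that the same schema object feeds both the validator and the prompt, which is the sense in which definition, validation, and guidance are separate uses of one contract.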
Schema-guided output strategies improve structural reliability more clearly than they improve semantic correctness.
This is the core distinction the project now enforces.
- if you need valid structured output, schema guidance matters
- if you need correct answers, schema guidance solves only part of the problem
- choosing a validator library is not the same thing as choosing a generation strategy
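The gap between structural validity and semantic correctness is easier to see when the two are scored separately. The sketch below is illustrative only: the output contract, the arithmetic task, and both helper functions are hypothetical.

```python
import json

REQUIRED_FIELDS = {"expression": str, "result": int}  # hypothetical output contract


def is_structurally_valid(raw):
    """True if raw parses as a JSON object with the required typed fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(
        isinstance(data.get(name), typ) and not isinstance(data.get(name), bool)
        for name, typ in REQUIRED_FIELDS.items()
    )


def is_semantically_correct(raw, expected_result):
    """True only if the output is valid AND carries the right answer."""
    return is_structurally_valid(raw) and json.loads(raw)["result"] == expected_result


outputs = [
    '{"expression": "5 + 7", "result": 12}',  # valid and correct
    '{"expression": "5 + 7", "result": 13}',  # valid but wrong
    '{"expression": "5 + 7", "result": 12',   # right content, broken syntax
]
for raw in outputs:
    print(is_structurally_valid(raw), is_semantically_correct(raw, 12))
# True True
# True False
# False False
```

Schema guidance can plausibly move the third case toward the first, but nothing in a structural contract catches the second case.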
JSON Schema is best treated as the canonical interchange schema for model output.
Why it matters:
- it is the most important bridge between application schemas and structured-output tooling
- it is the common target that other validators often export toward
- it is the cleanest way to talk about constrained decoding today
Zod is best treated as a TypeScript-first source-of-truth schema that can feed JSON Schema-oriented workflows.
Why it matters:
- good developer ergonomics
- widely used in TypeScript app stacks
- pairs well in practice with JSON output workflows
TypeBox is best treated as the most direct TypeScript-to-JSON-Schema path.
Why it matters:
- if your generation stack wants JSON Schema, TypeBox keeps the translation gap small
Pydantic is best treated as the Python source-of-truth model layer with JSON Schema export.
Why it matters:
- strong fit for Python-first LLM systems
- natural bridge from application model to validation and structured output
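The "application model to JSON Schema" bridge can be illustrated without any dependency. Pydantic v2 provides this through `BaseModel.model_json_schema()`; the sketch below is a standard-library stand-in built on dataclasses, not Pydantic itself. The `Ticket` model and the `export_json_schema` helper are hypothetical, and only flat models with primitive fields are handled.

```python
from dataclasses import MISSING, dataclass, fields
from typing import get_type_hints

# Python primitive types mapped to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class Ticket:  # hypothetical application model
    id: int
    title: str
    urgent: bool = False


def export_json_schema(model):
    """Derive a flat JSON Schema object from a dataclass with primitive fields."""
    hints = get_type_hints(model)
    properties = {f.name: {"type": _JSON_TYPES[hints[f.name]]} for f in fields(model)}
    # A field is required when it has neither a default nor a default factory.
    required = [
        f.name
        for f in fields(model)
        if f.default is MISSING and f.default_factory is MISSING
    ]
    return {"type": "object", "properties": properties, "required": required}


print(export_json_schema(Ticket))
```

The design point is that the model class stays the single source of truth, and the JSON Schema handed to validation or structured-output tooling is derived from it rather than maintained by hand.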
Valibot and Effect Schema are best treated as validator choices whose relevance to generation depends on whether they cleanly interoperate with JSON Schema-oriented workflows.
Why it matters:
- runtime ergonomics and bundle/runtime tradeoffs may be excellent
- direct evidence that either library independently improves model generation remains limited
The project should not claim:
- “Zod improves generation quality”
- “Valibot beats Pydantic for LLM output”
- “Effect Schema produces more semantically correct JSON than TypeBox”
unless those claims are benchmarked directly.
At present, the safer claim is:
- the strongest generation benefit comes from the schema strategy
- library choice matters mostly through:
  - export quality
  - runtime ergonomics
  - integration cost
| Boundary | Best schema layer | Why |
|---|---|---|
| LLM output to software | JSON Schema | strongest structured-output target |
| TypeScript app with model output | Zod or TypeBox -> JSON Schema | developer ergonomics plus output contract |
| Python app with model output | Pydantic -> JSON Schema | one model layer for validation and export |
| Human-authored config | parse first, then validate | schema is downstream of the file format |
| Model-facing input serialization | pre-validate source data | validators protect the source before conversion |
- Preferred schema layer: JSON Schema
- Typical source schema: Zod, TypeBox, or Pydantic
- Why: strongest operational support for structured output and downstream validation
- Preferred schema layer: validate after parsing
- Typical validators: JSON Schema, Zod, Pydantic
- Why: YAML itself is not the validator; it is the transport syntax
- Preferred schema layer: validate the frontmatter only
- Why: the prose body is documentation, not a single typed payload
- Preferred schema layer: template checks or source validation
- Why: this format is useful precisely because it is lightweight and model-readable, not because it plugs into a rich validation ecosystem
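The "validate after parsing" and "validate the frontmatter only" recommendations above can be sketched together. This is a minimal illustration under stated assumptions: the `---` delimiter convention, the flat `key: value` parser, and the required keys are all hypothetical, and a real pipeline would hand the frontmatter to a proper YAML parser before validating.

```python
def split_frontmatter(text):
    """Split a document into (frontmatter_text, body) using '---' delimiter lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return None, text  # no closing delimiter: treat everything as body


def parse_flat_frontmatter(frontmatter):
    """Parse 'key: value' lines only; stands in for a real YAML parser."""
    result = {}
    for line in frontmatter.splitlines():
        if line.strip():
            key, _, value = line.partition(":")
            result[key.strip()] = value.strip()
    return result


def validate_frontmatter(meta, required_keys=("title", "status")):
    """Return the required keys that are missing or empty (hypothetical contract)."""
    return [key for key in required_keys if not meta.get(key)]


doc = """---
title: Schema layers
status: draft
---
Free-form prose that is never validated as a typed payload.
"""
front, body = split_frontmatter(doc)
meta = parse_flat_frontmatter(front)  # parse first
print(validate_frontmatter(meta))     # then validate: []
```

Parsing and validation are deliberately separate functions here: the syntax layer produces a plain mapping, and the contract check runs on that mapping without touching the prose body.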
The schema question should be tested in layers:
- no schema guidance
- schema in prompt
- post-parse validation
- native constrained decoding
That is why the benchmark suite now includes an experimental schema-guidance track.
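One way to keep the four conditions comparable is to express them as an explicit experiment matrix. The sketch below is an assumption about how such a track could be organised, not a description of the project's benchmark code. The `Guidance` enum, `build_request`, and the `decoding_schema` field are hypothetical, and nothing is sent to a model.

```python
import json
from enum import Enum


class Guidance(Enum):
    NONE = "no schema guidance"
    PROMPT = "schema in prompt"
    POST_PARSE = "post-parse validation"
    CONSTRAINED = "native constrained decoding"


def build_request(task, schema, guidance):
    """Describe what each condition changes relative to the unguided baseline."""
    request = {"prompt": task, "decoding_schema": None, "validate_after": False}
    if guidance is Guidance.PROMPT:
        request["prompt"] = f"{task}\nJSON Schema:\n{json.dumps(schema)}"
    elif guidance is Guidance.POST_PARSE:
        request["validate_after"] = True
    elif guidance is Guidance.CONSTRAINED:
        # Hypothetical field standing in for a provider's structured-output option.
        request["decoding_schema"] = schema
    return request


schema = {"type": "object", "required": ["answer"]}  # hypothetical contract
for condition in Guidance:
    request = build_request("Answer the question.", schema, condition)
    print(condition.value, "->", sorted(k for k, v in request.items() if v))
```

Holding the task and schema fixed while varying only the guidance condition is what lets a result be attributed to the schema strategy rather than to the library that authored the schema.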
- If the output must be machine-consumable, think in terms of JSON Schema even if your app authoring layer starts in Zod or Pydantic.
- For generation, validators matter most when they shape the generation strategy, not just the post-hoc parse step.
- For input serialization, validate the source data before conversion; the model-facing format and the schema layer solve different problems.
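For the input-serialization takeaway, a short sketch shows validation running on the source records before they are converted to a model-facing text. The record contract, the `name: qty` line format, and both function names are hypothetical choices made for illustration.

```python
def validate_record(record):
    """Source-side check: 'name' is a non-empty str, 'qty' a non-negative int."""
    errors = []
    if not isinstance(record.get("name"), str) or not record["name"]:
        errors.append("name must be a non-empty string")
    qty = record.get("qty")
    if not isinstance(qty, int) or isinstance(qty, bool) or qty < 0:
        errors.append("qty must be a non-negative integer")
    return errors


def to_model_text(records):
    """Convert records to a compact line-based text, refusing invalid sources."""
    problems = {}
    for index, record in enumerate(records):
        errors = validate_record(record)
        if errors:
            problems[index] = errors
    if problems:
        raise ValueError(f"invalid source records: {problems}")
    return "\n".join(f"{r['name']}: {r['qty']}" for r in records)


print(to_model_text([{"name": "bolt", "qty": 40}, {"name": "nut", "qty": 12}]))
# bolt: 40
# nut: 12
```

The conversion step never needs its own schema: by the time data reaches the model-facing format, the contract has already been enforced on the structured source.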
These remain benchmark questions rather than settled doctrine:
- Does schema-in-prompt materially improve semantic correctness, or mostly syntax?
- How much does native constrained decoding outperform prompt-only schema guidance for real tasks?
- Does library choice matter after normalizing to the same JSON Schema contract?
- Are there use cases where validator strictness harms reasoning quality by overconstraining the response space?
Those are exactly the questions the project should keep open until the benchmark data exists.