Benchmark suite, paper draft, and recommendation tool for serialization choices in human-AI systems.
formattax.dev | Paper draft | Recommendation policy | Benchmarks | Schemas
Serialization is not just an implementation detail.
The same system may cross several boundaries with incompatible needs:
- model-facing input
- model-facing output
- progressive UI streaming
- human-maintained config
- inter-service transport
- storage
Using one format for all of them imposes a Format Tax: unnecessary structural tokens, weaker streaming behavior, poorer ergonomics, or a less reliable output path than the stage actually requires.
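As a rough illustration of the structural-token component, the sketch below serializes the same rows as pretty-printed JSON and as CSV and compares character counts. This is only a proxy; a real measurement would run a model tokenizer over both encodings.

```ts
// Rough proxy for structural overhead: the same records as pretty JSON
// vs. CSV. Character counts stand in for tokens here; the benchmark
// tracks would use an actual tokenizer.
const rows = [
  { id: 1, name: "ada", role: "admin" },
  { id: 2, name: "lin", role: "viewer" },
];

const asJson = JSON.stringify(rows, null, 2);
const asCsv = ["id,name,role", ...rows.map((r) => `${r.id},${r.name},${r.role}`)].join("\n");

// JSON repeats keys, quotes, and braces per row; CSV pays once in the header.
console.log({ jsonChars: asJson.length, csvChars: asCsv.length });
```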
This repository contains three related artifacts:
- A paper draft: an academic-facing document that stays conservative about evidence and separates local results from external claims.
- A benchmark suite: canonical datasets, question sets, encoders, and runner tracks for measuring format behavior.
- A homepage and recommendation engine: a practitioner-facing tool that returns a primary format, fallback, confidence, and schema/parser pairing.
Every claim in the project should fit one of these buckets (sketched as a type after the list):
- Locally measured — produced by this repository’s benchmark code
- Externally reported — from papers, official project benchmarks, or official docs
- Open hypothesis — worth testing, not yet proven here
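One way to keep those buckets honest in code is a discriminated union. The field names below are illustrative, not the repository's actual types:

```ts
// Illustrative provenance tagging for claims; names are assumptions,
// not types taken from this repository.
type ClaimBasis =
  | { basis: "locally-measured"; runId: string }     // produced by the benchmark code
  | { basis: "externally-reported"; source: string } // paper, official benchmark, or docs
  | { basis: "open-hypothesis"; note?: string };     // worth testing, not yet proven here
```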
This matters because the public serialization discourse is noisy:
- some results are academic
- some are official format-project benchmarks
- some are independent reproductions
- some are placeholder numbers used for site development
The project is being rewritten to keep those apart.
Implemented or in-progress runner tracks (an illustrative flag mapping follows the list):
- Token efficiency
- Retrieval accuracy
- Streaming readiness
- Generation quality
- Schema guidance
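For orientation, here is a hypothetical mapping from track names to the runner's `--track` flag. Only the track names come from this README; the numbering is assumed, not read from the runner source:

```ts
// Hypothetical track numbering for the --track flag used in the usage
// section below; the numbers are assumptions for illustration.
const TRACKS = {
  1: "token-efficiency",
  2: "retrieval-accuracy",
  3: "streaming-readiness",
  4: "generation-quality",
  5: "schema-guidance",
} as const;

type TrackId = keyof typeof TRACKS; // e.g. bun run src/index.ts --track=1
```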
Planned but not yet implemented as a local runner track:
- Binary throughput / transport benchmarking
The homepage is intentionally more operational than the paper.
It answers:
- what boundary am I at?
- what shape is the data?
- what is the main optimization target?
- what hard constraints apply?
It then returns (see the typed sketch after this list):
- a primary recommendation
- a fallback
- a confidence level
- an evidence basis
- a schema pairing
- a parser pairing, if relevant
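A minimal sketch of that question/answer contract as types, assuming names that are not the engine's actual API:

```ts
// Assumed shapes for the recommendation contract; every field and type
// name here is illustrative only.
type Boundary =
  | "model-input"
  | "model-output"
  | "ui-streaming"
  | "human-config"
  | "inter-service"
  | "storage";

interface RecommendationRequest {
  boundary: Boundary;
  dataShape: "flat" | "nested" | "tabular" | "document";
  optimizeFor: "tokens" | "streaming" | "ergonomics" | "reliability";
  hardConstraints?: string[];
}

interface Recommendation {
  primary: string;        // e.g. a format name
  fallback: string;
  confidence: "low" | "medium" | "high";
  evidenceBasis: "locally-measured" | "externally-reported" | "open-hypothesis";
  schemaPairing: string;
  parserPairing?: string; // only when relevant
}
```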
The homepage is allowed to be decisive. The paper is not allowed to smuggle those decisions in as settled research results.
Validators and schemas are treated as a separate layer.
Current project stance:
- validator libraries are not serialization formats
- the strongest generation benefit usually comes from schema strategy, especially JSON Schema-oriented workflows (a minimal export sketch follows this list)
- library choice mostly matters through:
  - JSON Schema export quality
  - runtime ergonomics
  - ecosystem fit
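To make the schema-strategy-over-library-choice point concrete, here is a minimal sketch assuming Zod plus the zod-to-json-schema package; the repository does not mandate either:

```ts
// Minimal sketch, assuming Zod and zod-to-json-schema; neither is
// prescribed by this project. The validator library authors the schema,
// but the exported JSON Schema is what guides generation.
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

const Event = z.object({
  id: z.string(),
  kind: z.enum(["click", "view"]),
  ts: z.number().int(),
});

// The exported JSON Schema is what a structured-output or
// constrained-decoding API consumes; runtime validation with
// Event.parse() is a separate concern.
const eventJsonSchema = zodToJsonSchema(Event, "Event");
console.log(JSON.stringify(eventJsonSchema, null, 2));
```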
Some benchmark content under `site/src/content/benchmarks/` and format metadata still exists primarily to support site UI development. Those placeholders should not be read as final reproduced benchmark results unless the corresponding run metadata is populated and `dummy` is false.
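A hypothetical guard matching that rule; the field names (`dummy`, `runMeta`) are assumptions about the content frontmatter, not verified against the repo:

```ts
// Hypothetical filter for site content entries; "dummy" and "runMeta"
// are assumed field names, not a confirmed repository schema.
interface BenchmarkEntry {
  dummy: boolean;
  runMeta?: { runId: string; date: string };
}

const isMeasured = (entry: BenchmarkEntry): boolean =>
  entry.dummy === false && entry.runMeta !== undefined;
```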
Repository layout:

```
the-format-tax/
├── README.md
├── docs/
│   ├── paper.md
│   ├── decision-tree.md
│   ├── benchmarks.md
│   ├── schemas.md
│   └── superpowers/
├── benchmarks/
│   ├── datasets/
│   ├── questions/
│   ├── references/
│   └── runner/
├── paper/
└── site/
```
To run the site locally:

```sh
cd site
bun install
bun run dev
```

To run a benchmark track:

```sh
cd benchmarks/runner
bun run src/index.ts --track=1
```

The runner currently expects Bun.
Useful contributions:
- independent reproductions of public format claims
- better harness normalization
- schema-guidance experiments
- binary transport benchmark implementation
- clearer separation between dummy site content and measured benchmark output
MIT for code. Documentation and paper material follow the project’s existing documentation license conventions.