metanorma site generate: add --formats flag to override metanorma.yml output formats #418

@andrew2net

Summary

metanorma site generate currently has no way for the caller to restrict which output formats are produced. It builds everything the flavor supports (HTML, PDF, XML, RXL). For automation that only needs one specific format (e.g. RXL for downstream Relaton indexing), this means paying the cost of PDF rendering and HTML generation for every document, even when those artifacts will be discarded.

PDF generation via mn2pdf.jar is particularly expensive (~7–10 minutes per document via Java/XSL-FO). For repos with dozens of sources, this can push CI jobs past GitHub Actions' 6h hard timeout.
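As a rough back-of-envelope check on those figures, the sketch below estimates how many documents fit inside the limit. It assumes documents are rendered one after another and that PDF rendering dominates the job time; both are assumptions for illustration, not measurements from the run.

```ruby
# Back-of-envelope estimate: how many PDFs fit inside a CI time limit.
# Assumes sequential rendering and ignores all non-PDF work.
TIMEOUT_MINUTES = 6 * 60 # the 6h job limit cited above

# Whole documents that can be rendered before the timeout is hit.
def max_documents(minutes_per_pdf, timeout_minutes = TIMEOUT_MINUTES)
  timeout_minutes / minutes_per_pdf # integer division: whole documents only
end

puts max_documents(7)  # lower end of the per-document estimate
puts max_documents(10) # upper end of the per-document estimate
```

Under these assumptions the budget runs out somewhere between 36 and 51 documents, which matches the "dozens of sources" scale described above.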

Concrete case

relaton/relaton-data-bipm's crawler clones metanorma/bipm-si-brochure and runs:

bundle exec metanorma site generate --agree-to-terms

…but only consumes _site/documents/*.rxl afterward. The HTML/PDF/XML output is built and thrown away.

A recent CI run hit the 6h cancellation: https://github.com/relaton/relaton-data-bipm/actions/runs/25994267578/job/76405880612

Note on metanorma.yml

There is no formats (or equivalent) key in the site manifest schema today. lib/metanorma/site_manifest.rb declares only files under metanorma.source, and SiteGenerator never reads any format key — anything added there is silently ignored. So no config-side workaround exists either. Tracking the absence here in case the eventual implementation also adds a metanorma.source.formats: entry to the manifest.
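If a manifest key were added later, one plausible precedence rule is "CLI flag wins, manifest is the fallback, absence of both means build everything". The sketch below only illustrates that rule: the `formats` key, the file paths, and the `effective_formats` helper are all hypothetical and do not exist in metanorma today.

```ruby
require "yaml"

# Hypothetical manifest: the `formats` key under metanorma.source is NOT
# part of the current schema, and the file paths are placeholders.
MANIFEST = <<~YAML
  metanorma:
    source:
      files:
        - sources/doc-a.adoc
        - sources/doc-b.adoc
      formats:
        - html
        - rxl
YAML

# Hypothetical helper: the CLI value wins, then the manifest value,
# then nil (meaning "no restriction, build everything").
def effective_formats(manifest, cli_formats = nil)
  return cli_formats unless cli_formats.nil? || cli_formats.empty?

  manifest.dig("metanorma", "source", "formats")
end

manifest = YAML.safe_load(MANIFEST)
p effective_formats(manifest)          # falls back to the manifest list
p effective_formats(manifest, ["rxl"]) # CLI value overrides the manifest
```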

Proposed CLI

metanorma site generate [SITE_MANIFEST_PATH] --extensions rxl
metanorma site generate [SITE_MANIFEST_PATH] --extensions html,rxl

Easiest implementation path: add extensions to the options declared in lib/metanorma/cli/commands/site.rb and to the filter_compile_options allow-list in lib/metanorma/cli/thor_with_config.rb. Compiler already consumes it, so nothing else needs changing. Naming it --extensions (matching metanorma compile) keeps the surface consistent; --formats is fine too if that reads better.
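A minimal, framework-free sketch of the two pieces involved: an allow-list that decides which CLI options reach the compiler, and parsing of the comma-separated value. `COMPILE_OPTION_KEYS`, `allow_listed_options`, and `parse_extensions` are hypothetical names standing in for the real code in thor_with_config.rb, whose contents are not shown here. Whether Compiler expects the raw string or a parsed list is also an assumption.

```ruby
# Hypothetical allow-list, standing in for the one behind filter_compile_options.
# The real list's contents are not shown in this issue.
COMPILE_OPTION_KEYS = %i[agree_to_terms extensions].freeze

# Keep only the options that are allowed to pass through to the compiler.
def allow_listed_options(options, allowed = COMPILE_OPTION_KEYS)
  options.select { |key, _value| allowed.include?(key.to_sym) }
end

# Turn "html, rxl" into [:html, :rxl]; nil or blank means "no restriction".
def parse_extensions(value)
  return nil if value.nil? || value.strip.empty?

  value.split(",").map(&:strip).reject(&:empty?).map(&:to_sym).uniq
end

cli_options = { "extensions" => "html, rxl", "agree_to_terms" => true, "unrelated" => 1 }
compile_options = allow_listed_options(cli_options)
p compile_options                                 # "unrelated" is dropped
p parse_extensions(compile_options["extensions"]) # list of requested extensions
```

The point of the sketch is that an option missing from the allow-list is silently dropped before it reaches the compiler, which is why the proposal touches both files.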

Help-text snippet

-x, --extensions=EXTENSIONS    # Comma-separated list of output extensions to build
                               # for each document. Examples: rxl | html,rxl |
                               # html,pdf,xml,rxl. Matches `metanorma compile -x`.

Out of scope

  • Format-specific options (compile/render settings per format): users can keep using metanorma.yml for those.
  • Per-document format overrides: also out of scope.

The ask here is just a single top-level filter applied to all sources in the run.
