Summary
metanorma site generate currently has no way for the caller to restrict which output formats are produced. It builds everything the flavor supports (HTML, PDF, XML, RXL). For automation that only needs one specific format (e.g. RXL for downstream Relaton indexing), this means paying the cost of PDF rendering and HTML generation for every document, even when those artifacts will be discarded.
PDF generation via mn2pdf.jar is particularly expensive (~7–10 minutes per document via Java/XSL-FO). For repos with dozens of sources, this can push CI jobs past GitHub Actions' 6h hard timeout.
Concrete case
relaton/relaton-data-bipm's crawler clones metanorma/bipm-si-brochure and runs:
bundle exec metanorma site generate --agree-to-terms
…but only consumes _site/documents/*.rxl afterward. The HTML/PDF/XML output is built and thrown away.
A recent CI run hit the 6h cancellation: https://github.com/relaton/relaton-data-bipm/actions/runs/25994267578/job/76405880612
Note on metanorma.yml
There is no formats (or equivalent) key in the site manifest schema today. lib/metanorma/site_manifest.rb declares only files under metanorma.source, and SiteGenerator never reads any format key — anything added there is silently ignored. So no config-side workaround exists either. Tracking the absence here in case the eventual implementation also adds a metanorma.source.formats: entry to the manifest.
Proposed CLI
metanorma site generate [SITE_MANIFEST_PATH] --extensions rxl
metanorma site generate [SITE_MANIFEST_PATH] --extensions html,rxl
Easiest implementation path: add extensions to the options declared in lib/metanorma/cli/commands/site.rb and to the filter_compile_options allow-list in lib/metanorma/cli/thor_with_config.rb. Compiler already consumes it, so nothing else needs changing. Naming it --extensions (matching metanorma compile) keeps the surface consistent; --formats is fine too if that reads better.
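A minimal sketch of the two pieces involved, assuming the allow-list is a plain list of option names and that the comma-separated value is split into symbols before compilation. `ALLOWED_COMPILE_OPTIONS`, its contents, and `parse_extensions` are hypothetical names used for illustration; `filter_compile_options` here is a simplified stand-in, not the actual code in lib/metanorma/cli/thor_with_config.rb.

```ruby
# Hypothetical allow-list; the real one lives in
# lib/metanorma/cli/thor_with_config.rb and contains other entries.
ALLOWED_COMPILE_OPTIONS = %i[agree_to_terms extensions].freeze

# Simplified stand-in: keep only the options that may be forwarded
# to the compiler, dropping everything else.
def filter_compile_options(options)
  options.select { |key, _value| ALLOWED_COMPILE_OPTIONS.include?(key.to_sym) }
end

# Hypothetical helper: turn "html, rxl" into [:html, :rxl].
def parse_extensions(value)
  value.to_s.split(",").map(&:strip).reject(&:empty?).map(&:to_sym)
end

cli_options = { "extensions" => "html,rxl", "output_dir" => "_site" }
forwarded = filter_compile_options(cli_options)
puts forwarded.inspect
puts parse_extensions(forwarded["extensions"]).inspect
```

Under these assumptions, adding `extensions` to the allow-list is the only change needed for the value to reach the compiler; without it the option would be dropped by the filter.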
Help-text snippet
-x, --extensions=EXTENSIONS  # Comma-separated list of output extensions to build
                             # for each document. Examples: rxl | html,rxl |
                             # html,pdf,xml,rxl. Matches `metanorma compile -x`.
Out of scope
- Format-specific options (compile/render settings per format): users can keep using metanorma.yml for those.
- Per-document format overrides: likewise.
The ask here is just a single top-level filter applied to all sources in the run.
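To illustrate the intended semantics, the sketch below applies one extension list uniformly to every source. `planned_outputs` and the file names are hypothetical; the code only models which artifacts a run would produce and does not reflect how SiteGenerator actually names or lays out its output.

```ruby
# Hypothetical model: one top-level extension list, applied
# identically to every source in the run.
def planned_outputs(sources, extensions)
  sources.flat_map do |source|
    base = File.basename(source, ".*") # strip directory and extension
    extensions.map { |ext| "#{base}.#{ext}" }
  end
end

sources = ["sources/a.adoc", "sources/b.adoc"] # illustrative file names
puts planned_outputs(sources, [:rxl]).inspect
puts planned_outputs(sources, %i[html rxl]).inspect
```

With `[:rxl]` the model yields one RXL file per source and nothing else, which is the behavior the relaton-data-bipm crawler needs.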