Visibility into the automation that builds and publishes the third-party module catalogue helps contributors reason about changes and spot failure points early. This document summarizes the current canonical pipeline and the parts of the broader architecture that are still future-facing.
The supported production pipeline is orchestrated via `node scripts/orchestrator/index.ts run full-refresh-parallel` (also exposed as `node --run all`). The orchestrator now drives four registered stages across three operational phases: metadata collection, parallel module processing, and publication.
| Order | Stage ID | Key Outputs |
|---|---|---|
| 1 | `collect-metadata` | in-memory metadata payload, `gitHubData.json` |
| 2 | `parallel-processing` | in-memory stage-5 payload, `modules/`, `modules_temp/`, `website/images/` |
| 3 | `aggregate-catalogue` | `modules.json`, `modules.min.json`, `stats.json` |
| 4 | `generate-result-markdown` | `result.md` |
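The four-stage, fully in-memory handoff described above can be sketched as a declarative stage registry. This is a minimal illustration, not the actual orchestrator code: the `Stage`, `runPipeline`, and handler names are hypothetical, and the handlers only stand in for the real collect/process/aggregate work.

```typescript
// Hypothetical sketch of a declarative stage registry. Each stage
// receives the previous stage's in-memory payload and returns the next
// one; nothing is persisted between stages.
type StageHandler = (input: unknown) => Promise<unknown>;

interface Stage {
  id: string;
  run: StageHandler;
}

// The four registered stages, in execution order (placeholder handlers).
const stages: Stage[] = [
  { id: "collect-metadata", run: async () => ({ modules: [] }) },
  { id: "parallel-processing", run: async (metadata) => ({ stage5: metadata }) },
  { id: "aggregate-catalogue", run: async (stage5) => ({ catalogue: stage5 }) },
  {
    id: "generate-result-markdown",
    run: async (payload) => `# Result\n${JSON.stringify(payload)}`,
  },
];

// Run stages sequentially; the payload hand-off stays fully in memory.
// An optional `only` set mimics --only-style stage filtering.
async function runPipeline(only?: Set<string>): Promise<unknown> {
  let payload: unknown = undefined;
  for (const stage of stages) {
    if (only && !only.has(stage.id)) continue;
    payload = await stage.run(payload);
  }
  return payload;
}
```

Modelling the graph as plain data is what makes `--only`/`--skip` filtering and uniform retry/logging wrappers straightforward to layer on top.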
```mermaid
flowchart TB
    orchestrator[[Orchestrator<br>4-stage execution]]
    subgraph phase1["Phase 1: Metadata Collection"]
        seed[("Module seed list")] --> collect{{Collect metadata}}
        collect --> cache[("gitHubData.json cache")]
        collect --> metadata["metadata payload (in-memory)"]
    end
    subgraph phase2["Phase 2: Parallel Module Processing"]
        metadata --> parallel{{Parallel processing}}
        parallel --> clones[("modules/<br>modules_temp/")]
        parallel --> images[("website/images/")]
        parallel --> stage5["stage-5 payload (in-memory)"]
    end
    subgraph phase3["Phase 3: Catalogue Aggregation"]
        stage5 --> aggregate{{Aggregate catalogue}}
        aggregate --> outputs[("modules.json<br>modules.min.json<br>stats.json")]
        stage5 --> result{{Generate result markdown}}
        outputs --> result
        result --> resultMd[("result.md")]
    end
    orchestrator -.controls.-> collect
    orchestrator -.controls.-> parallel
    orchestrator -.controls.-> aggregate
    orchestrator -.controls.-> result
```
- Orchestrator CLI: Declarative stage graph with `--only`/`--skip` support, retries, and structured logging
- Worker Pool Stage: `parallel-processing` encapsulates clone, enrich, image, and analysis work behind a single supported stage
- Aggregation Stage: `aggregate-catalogue` builds published JSON artifacts from the in-memory stage-5 payload
- Schema Validation: JSON schemas enforce contracts at the published boundaries (`modules.json`, `modules.min.json`, `stats.json`)
- Shared Utilities: HTTP, Git, filesystem, and rate limiting in `scripts/shared/`
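The retry and structured-logging behavior mentioned above could be wrapped around each stage roughly like this. The helper name `runWithRetries` and the log shape are assumptions for illustration, not the orchestrator's real API.

```typescript
// Hypothetical per-stage retry wrapper with structured (JSON-line)
// logging. Each attempt emits one log entry; failures are retried up
// to maxAttempts before the error propagates.
interface LogEntry {
  stage: string;
  attempt: number;
  status: "ok" | "retry" | "failed";
}

async function runWithRetries<T>(
  stageId: string,
  fn: () => Promise<T>,
  maxAttempts = 3,
  log: (entry: LogEntry) => void = (entry) => console.log(JSON.stringify(entry)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      const result = await fn();
      log({ stage: stageId, attempt, status: "ok" });
      return result;
    } catch (err) {
      if (attempt >= maxAttempts) {
        log({ stage: stageId, attempt, status: "failed" });
        throw err;
      }
      log({ stage: stageId, attempt, status: "retry" });
    }
  }
}
```

Emitting one machine-readable line per attempt keeps transient failures (rate limits, flaky clones) visible in CI logs without failing the whole run.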
The pipeline implements intelligent caching and skip logic to avoid redundant work:
| Scope | Optimization | Current Behavior | Why It Helps |
|---|---|---|---|
| Metadata | API cache TTL | Reuses recent host API responses during `collect-metadata` | Reduces external API traffic |
| Module processing | Clone reuse | Recycles `modules_temp/` when repositories can be refreshed in place | Avoids unnecessary full re-clones |
| Module processing | Worker batching | Processes modules in bounded child-process batches | Keeps memory bounded and throughput predictable |
| Analysis cache | Cache read/write | Worker-compatible `moduleCache.json` drives skip/read/write/prune in `parallel-processing` | Restores second-run skip behavior while preserving worker throughput |
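The worker-batching row can be illustrated with a small sketch. The helper name `processInBatches` is hypothetical, and async tasks stand in for the real child processes; the point is that only one bounded batch is in flight at a time.

```typescript
// Hypothetical bounded-batch runner. The module list is split into
// fixed-size batches; each batch runs concurrently, and the next batch
// starts only after the current one completes, keeping memory bounded.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Concurrency within the batch, sequencing between batches.
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}
```

Results come back in input order, which matters when the stage-5 payload is later aggregated deterministically.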
Open follow-up work is tracked centrally in Open Items.
No persisted intermediate stage boundary remains. Stage handoffs are fully in-memory.
This section describes how module data enters the system and reaches downstream consumers. Unlike the canonical pipeline above, part of this flow is still conceptual.
```mermaid
flowchart LR
    wiki[(module wiki list<br><i>- unreliable -</i>)]
    pipeline{{automation pipeline}}
    api[(API<br>modules.json)]
    remote[MMM-Remote-Control]
    modinstall[MMM-ModInstall]
    config[MMM-Config]
    mmpm[mmpm]
    moduleWebsite[website<br>modules.magicmirror.builders]
    wiki --> pipeline --> api
    api --> mmpm
    api --> remote
    api --> modinstall
    api --> config
    api --> moduleWebsite
```
```mermaid
flowchart LR
    ui[(Form-based front end<br>for adding, editing, and<br>deleting modules<br><i>- not yet conceptualized -</i>)]
    pipeline{{automation pipeline}}
    api[(API<br>modules.json)]
    remote[MMM-Remote-Control]
    modinstall[MMM-ModInstall]
    config[MMM-Config]
    mmpm[mmpm]
    moduleWebsite[website<br>modules.magicmirror.builders]
    ui --> pipeline --> api
    api --> remote
    api --> modinstall
    api --> config
    api --> mmpm
    api --> moduleWebsite
```
If this direction is pursued, the wiki would be replaced with a form-based frontend while downstream consumers continue using the unchanged API endpoint.
- Deterministic Outputs — Guarantees for reproducible builds
- Check Modules Reference — Rule registry and fixtures
- CONTRIBUTING.md — Setup instructions and workflow tips