This document describes the guarantees around reproducible, deterministic outputs from the module processing pipeline.
Deterministic outputs provide several benefits:
- Cleaner Git Diffs: When only meaningful changes are made, diffs show exactly what changed without noise from reordered keys or renamed files
- Easier Code Reviews: Reviewers can focus on actual changes rather than structural reorganization
- Reproducible Builds: Running the pipeline multiple times on the same input produces identical output
- Debugging: Comparing pipeline runs becomes straightforward with consistent formatting
All JSON files generated by the pipeline (modules.json, modules.min.json, stats.json, metadata files) have the following guarantees:
- Sorted Object Keys: All object keys are sorted alphabetically at every nesting level
- Consistent Indentation: 2-space indentation for pretty-printed files
- Trailing Newline: Every JSON file ends with exactly one newline character
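The key-order and trailing-newline guarantees above can be checked mechanically. The following is a minimal sketch, not part of the pipeline: `keysAreSorted` and `hasSingleTrailingNewline` are hypothetical helper names, and the ordering check assumes plain code-unit string comparison, which may differ from the comparator the pipeline actually uses.

```typescript
// Hypothetical helpers for illustration; not part of the pipeline's codebase.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// True when every object inside the value lists its keys in ascending order.
// Assumption: "alphabetical" means the default code-unit order of sort().
function keysAreSorted(value: Json): boolean {
  if (Array.isArray(value)) {
    return value.every(keysAreSorted);
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value);
    const sorted = [...keys].sort();
    const ordered = keys.every((key, i) => key === sorted[i]);
    return ordered && Object.values(value).every(keysAreSorted);
  }
  return true; // primitives have no keys to check
}

// True when the text ends with exactly one newline character.
function hasSingleTrailingNewline(text: string): boolean {
  return text.endsWith("\n") && !text.endsWith("\n\n");
}

const sample = '{\n  "a": 1,\n  "b": { "x": true, "y": null }\n}\n';
console.log(keysAreSorted(JSON.parse(sample) as Json)); // true
console.log(hasSingleTrailingNewline(sample)); // true
```

JavaScript always enumerates integer-like keys such as "10" in numeric order, so a check built on `JSON.parse` like this one is only reliable for keys that are not integer-like.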
Implementation: The stringifyDeterministic() function in scripts/shared/deterministic-output.ts recursively sorts all object keys before serialization.
Example:
{
  "description": "A weather module",
  "id": "MMM-Weather",
  "maintainer": "example",
  "url": "https://github.com/example/MMM-Weather"
}

Module screenshots are stored with deterministic filenames to ensure:
- Same Module → Same Filename: The same module always gets the same screenshot filename
- Different Modules → Different Filenames: No collisions between different modules
- No Source Dependency: Renaming the source image doesn't change the output filename
- Human-Readable: Filename clearly identifies the module for easy debugging
Implementation: The createDeterministicImageName() function uses the module identifier (moduleName---maintainer) directly as the filename base.
Format: <moduleName>---<maintainer>.<extension>
Example:
- Module: MMM-Weather by example
- Screenshot: MMM-Weather---example.jpg (always the same for this module)
The previous approach embedded the original source filename in the output name, which caused issues:
❌ Old: MMM-Weather---example---path/to/screenshot.jpg
✅ New: MMM-Weather---example.jpg
Problems with the old approach:
- Renaming the source image triggered unnecessary file changes
- Path separators in filenames caused issues
- Filenames were long and unpredictable
Benefits of the simple deterministic approach:
- Consistent, predictable filenames
- No dependency on source filename
- Human-readable for easy debugging
- Simple implementation, no hashing needed
import { writeJson } from "./shared/fs-utils.ts";

// Automatically uses sorted keys
await writeJson("output.json", { b: 2, a: 1, c: 3 });
// Result: {"a": 1, "b": 2, "c": 3}

import { stringifyDeterministic } from "./shared/deterministic-output.ts";

const data = { z: 26, a: 1, m: 13 };
const json = stringifyDeterministic(data, 2);
// Result: "{\n  \"a\": 1,\n  \"m\": 13,\n  \"z\": 26\n}"

import { createDeterministicImageName } from "./shared/deterministic-output.ts";

const filename = createDeterministicImageName("MMM-Weather", "example", "jpg");
// Result: "MMM-Weather---example.jpg" (deterministic, always the same)

To verify deterministic output:
# Run the pipeline once
npm run pipeline
# Save a copy of the output
cp website/data/modules.json /tmp/modules-run1.json
# Run the pipeline again
npm run pipeline
# Compare outputs - should be identical
diff website/data/modules.json /tmp/modules-run1.json

If diff prints nothing, the two runs produced identical files.
The sortObjectKeys() function recursively processes values:
- Primitives (null, string, number, boolean): returned as-is
- Arrays: mapped recursively, preserving order
- Objects: keys sorted alphabetically, values processed recursively
This ensures deterministic output at all nesting levels.
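The three rules above translate almost directly into code. The following is a minimal sketch, not the actual contents of scripts/shared/deterministic-output.ts: `sortKeysSketch` and `stringifySketch` are hypothetical stand-ins, and the code-unit sort order is an assumption about what "alphabetically" means in the real implementation.

```typescript
// Hypothetical stand-ins for sortObjectKeys() and stringifyDeterministic().
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function sortKeysSketch(value: Json): Json {
  if (Array.isArray(value)) {
    // Arrays: element order is preserved, each element is processed recursively.
    return value.map((item) => sortKeysSketch(item));
  }
  if (value !== null && typeof value === "object") {
    // Objects: rebuild with keys inserted in sorted order (code-unit comparison).
    // Note: JavaScript still enumerates integer-like keys in numeric order.
    const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
    const sorted: { [key: string]: Json } = {};
    for (const [key, child] of entries) {
      sorted[key] = sortKeysSketch(child);
    }
    return sorted;
  }
  // Primitives: returned as-is.
  return value;
}

function stringifySketch(value: Json, indent = 2): string {
  return JSON.stringify(sortKeysSketch(value), null, indent);
}

console.log(stringifySketch({ z: 26, a: 1, m: 13 }));
// {
//   "a": 1,
//   "m": 13,
//   "z": 26
// }
```

Rebuilding each object rather than sorting in place leaves the caller's data untouched, which keeps the function safe to call on values that are reused elsewhere.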
Screenshot filenames follow a simple, deterministic pattern:
- Format: ${moduleName}---${maintainer}.${extension}
- Example: MMM-Weather---example.jpg
- Benefits: Human-readable, debuggable, no collisions
No hashing required - the module identifier itself is already unique and deterministic.
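As a rough illustration of why no hashing is needed, the pattern reduces to a single template string. This is a sketch under assumptions, not the real createDeterministicImageName(): `imageNameSketch` is a hypothetical name, and the real function may validate or normalize its inputs in ways not shown here.

```typescript
// Hypothetical stand-in for createDeterministicImageName().
function imageNameSketch(moduleName: string, maintainer: string, extension: string): string {
  // The source image's own filename is deliberately not an input,
  // so renaming the source cannot change the result.
  return `${moduleName}---${maintainer}.${extension}`;
}

const first = imageNameSketch("MMM-Weather", "example", "jpg");
const second = imageNameSketch("MMM-Weather", "example", "jpg");

console.log(first); // MMM-Weather---example.jpg
console.log(first === second); // true
console.log(imageNameSketch("MMM-Weather", "someone-else", "jpg")); // MMM-Weather---someone-else.jpg
```

Because the result depends only on the three arguments, repeated pipeline runs cannot drift, and two modules collide only if they share both name and maintainer.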
Performance impact of these safeguards:
- Key Sorting: Negligible (<1% overhead on typical module counts)
- Filename Generation: Instant string concatenation
- Overall: No measurable impact on pipeline runtime
Some older snapshots in the repository may still contain pre-standardized screenshot filenames (for example, names derived from source paths). The current canonical output uses the deterministic <moduleName>---<maintainer>.<extension> format.
Downstream consumers should always read screenshot paths from modules.json rather than hard-coding file names.
Deterministic output safeguards (sorted keys and deterministic image naming) are part of the current pipeline behavior.
Potential deterministic-output enhancements are tracked centrally in Open Items under "Backlog (Optional)".