| 1 | +# YAML Data Refactoring Design |
| 2 | + |
| 3 | +**Date:** 2026-01-12 |
| 4 | +**Status:** Approved |
| 5 | +**Goal:** Refactor data loading from legacy MJS files to YAML-based multi-canon architecture |
| 6 | + |
| 7 | +## Background |
| 8 | + |
| 9 | +The legacy data files (`scriptdata.mjs`, `scriptlang.mjs`, `scriptregex.mjs`, `coc.mjs`, `coc-mapping.mjs`) have been archived and replaced with a YAML-based structure under `data/canons/` and `data/shared/`. The source code still imports the archived files and needs refactoring. |
| 10 | + |
| 11 | +## Decisions |
| 12 | + |
| 13 | +| Decision | Choice | |
| 14 | +|----------|--------| |
| 15 | +| Deployment target | NPM package only | |
| 16 | +| YAML handling | Build-time compilation to JS | |
| 17 | +| Canon priority | Explicit → LDS default → auto-detect fallback | |
| 18 | +| COC mapping loading | Lazy load on first use | |
| 19 | +| Canon conversion API | Explicit `convertCanon()` function only | |
| 20 | +| Result canon field | Only include when different from expected | |
| 21 | + |
| 22 | +## Build Pipeline |
| 23 | + |
| 24 | +### YAML Compilation Step |
| 25 | + |
| 26 | +New script `build/compile-yaml.mjs` compiles YAML to JS modules: |
| 27 | + |
| 28 | +``` |
| 29 | +data/ src/data/ (compiled, gitignored) |
| 30 | +├── canons/ ├── canons/ |
| 31 | +│ ├── bible/ │ ├── bible/ |
| 32 | +│ │ ├── _structure.yml → │ │ ├── structure.mjs |
| 33 | +│ │ └── en.yml → │ │ └── en.mjs |
| 34 | +│ ├── lds/ │ ├── lds/ |
| 35 | +│ │ ├── _structure.yml → │ │ ├── structure.mjs |
| 36 | +│ │ └── en.yml → │ │ └── en.mjs |
| 37 | +│ └── coc/ │ └── coc/ |
| 38 | +│ ├── _structure.yml → │ ├── structure.mjs |
| 39 | +│ └── en.yml → │ └── en.mjs |
| 40 | +└── shared/ └── shared/ |
| 41 | + ├── en.yml → ├── en.mjs |
| 42 | + └── ko.yml → └── ko.mjs |
| 43 | +``` |
| 44 | + |
| 45 | +Additionally, `_archive/data/coc-mapping.mjs` is compiled to `src/data/canons/coc/mapping.mjs`. |
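A minimal sketch of what the compilation step could look like, assuming the YAML has already been parsed into a plain object (this design does not name a parser). The helper names `outputPathFor` and `toModuleSource` are hypothetical. The only behaviour taken from the tree above is dropping the leading underscore, swapping `.yml` for `.mjs`, and writing under `src/data/`.

```javascript
// Hypothetical helpers for build/compile-yaml.mjs; names are illustrative only.

// Map a source path under data/ to its compiled path under src/data/,
// e.g. data/canons/lds/_structure.yml -> src/data/canons/lds/structure.mjs
function outputPathFor(sourcePath) {
  const parts = sourcePath.split('/');
  const file = parts.pop()
    .replace(/^_/, '')            // _structure.yml -> structure.yml
    .replace(/\.ya?ml$/, '.mjs'); // structure.yml  -> structure.mjs
  // Prefix with src/ so that data/... becomes src/data/...
  return ['src', ...parts, file].join('/');
}

// Serialize already-parsed data as an ES module with a default export.
function toModuleSource(data) {
  return `export default ${JSON.stringify(data, null, 2)};\n`;
}
```

Emitting a default export matches how `data-loader.mjs` imports the compiled files (`import ldsStructure from '…/structure.mjs'`).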
| 46 | + |
| 47 | +### Build Command |
| 48 | + |
| 49 | +```bash |
| 50 | +npm run build |
| 51 | +# 1. compile-yaml.mjs: YAML → JS in src/data/ |
| 52 | +# 2. build.mjs: Bundle to dist/ |
| 53 | +``` |
| 54 | + |
| 55 | +### Gitignore Addition |
| 56 | + |
| 57 | +``` |
| 58 | +src/data/ |
| 59 | +``` |
| 60 | + |
| 61 | +## Data Loading Architecture |
| 62 | + |
| 63 | +### New Module: `src/lib/data-loader.mjs` |
| 64 | + |
| 65 | +```javascript |
| 66 | +// Eagerly loaded (small, always needed) |
| 67 | +import bibleStructure from '../data/canons/bible/structure.mjs'; |
| 68 | +import ldsStructure from '../data/canons/lds/structure.mjs'; |
| 69 | +import sharedEn from '../data/shared/en.mjs'; |
| 70 | + |
| 71 | +// Lazy loaded caches |
| 72 | +let cocStructure = null; |
| 73 | +let cocMapping = null; |
| 74 | +let languageCache = {}; |
| 75 | + |
| 76 | +export function getCanonStructure(canon) { |
| 77 | + // Returns structure, lazy-loads COC if needed |
| 78 | + // Merges parent structure if canon extends another |
| 79 | +} |
| 80 | + |
| 81 | +export function getLanguageData(canon, lang) { |
| 82 | + // Returns merged: shared/{lang} + canons/{canon}/{lang} |
| 83 | + // Uses deep-merge utility |
| 84 | +} |
| 85 | + |
| 86 | +export async function getCocMapping() { |
| 87 | + // Lazy loads 881KB mapping only when convertCanon() is called |
| 88 | + if (!cocMapping) { |
| 89 | + const module = await import('../data/canons/coc/mapping.mjs'); |
| 90 | + cocMapping = module.default; |
| 91 | + } |
| 92 | + return cocMapping; |
| 93 | +} |
| 94 | +``` |
| 95 | + |
| 96 | +### Loading Strategy |
| 97 | + |
| 98 | +| Data | Loading | Reason | |
| 99 | +|------|---------|--------| |
| 100 | +| Bible structure | Eager | Always needed (LDS extends it) | |
| 101 | +| LDS structure | Eager | Default canon | |
| 102 | +| COC structure | Lazy | Only for COC references | |
| 103 | +| COC mapping | Lazy | Only for `convertCanon()` | |
| 104 | +| Language data | Lazy + cached | Load per language on demand | |
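The "lazy + cached" row and the shared-plus-canon merge in `getLanguageData()` could be sketched as follows. This is an illustration under assumptions, not the actual `deep-merge.mjs` or `data-loader.mjs`: the `load(path)` callback is a hypothetical stand-in for reading a compiled module, and the merge rule (canon values override shared values, arrays are replaced rather than concatenated) is assumed.

```javascript
// Illustrative only: the real deep-merge.mjs and data-loader.mjs may differ.

// Recursively merge plain objects; values in `override` win, arrays are replaced.
function deepMerge(base, override) {
  const isObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);
  const out = { ...base };
  for (const [key, value] of Object.entries(override)) {
    out[key] = isObject(value) && isObject(out[key])
      ? deepMerge(out[key], value)
      : value;
  }
  return out;
}

// Build a cached getter. `load(path)` is a hypothetical stand-in for reading
// a compiled module such as shared/en or canons/lds/en.
function createLanguageLoader(load) {
  const cache = new Map();
  return function getLanguageData(canon, lang) {
    const key = `${canon}:${lang}`;
    if (!cache.has(key)) {
      const shared = load(`shared/${lang}`) ?? {};
      const canonData = load(`canons/${canon}/${lang}`) ?? {};
      cache.set(key, deepMerge(shared, canonData));
    }
    return cache.get(key);
  };
}
```

Keying the cache on both canon and language means each combination is merged once, and repeated lookups return the same merged object.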
| 105 | + |
| 106 | +## API Changes |
| 107 | + |
| 108 | +### Existing Functions (Backward Compatible) |
| 109 | + |
| 110 | +Signatures unchanged: |
| 111 | +- `lookupReference(query, language?, config?)` |
| 112 | +- `generateReference(verseIds, language?, config?)` |
| 113 | +- `detectReferences(text, language?, callback?)` |
| 114 | +- `setLanguage(lang)` / `getLanguage()` |
| 115 | + |
| 116 | +### New Config Option |
| 117 | + |
| 118 | +```javascript |
| 119 | +// Explicit canon |
| 120 | +lookupReference("1 Nephi 1:1", "en", { canon: "lds" }) |
| 121 | + |
| 122 | +// Default (LDS) |
| 123 | +lookupReference("1 Nephi 1:1", "en") |
| 124 | + |
| 125 | +// Auto-detect fallback (verse 150 doesn't exist in LDS 1 Nephi 3) |
| 126 | +lookupReference("1 Nephi 3:150", "en") |
| 127 | +// → { ref: "1 Nephi 3:150", verse_ids: [...], canon: "coc" } |
| 128 | +``` |
| 129 | + |
| 130 | +### New Functions |
| 131 | + |
| 132 | +```javascript |
| 133 | +// Convert between canons |
| 134 | +convertCanon(verseIds, { from: 'coc', to: 'lds' }) |
| 135 | +// Returns: { verse_ids: number[], partial: boolean } |
| 136 | + |
| 137 | +// Set/get default canon |
| 138 | +setCanon('lds') |
| 139 | +getCanon() |
| 140 | +``` |
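A rough sketch of how `convertCanon()` might produce its result. Several things here are assumptions rather than part of this design: the mapping shape (`{ [sourceId]: targetId[] }`), the meaning of `partial` (true when at least one input verse has no counterpart), and the helper names `mapVerseIds` and `convertCanonSketch`. The `from`/`to` direction handling is left out. Since `getCocMapping()` uses a dynamic import, a wrapper that awaits it would itself be asynchronous.

```javascript
// Hypothetical sketch. The mapping shape ({ [sourceId]: targetId[] }) and the
// meaning of `partial` (some inputs had no counterpart) are assumptions.
function mapVerseIds(verseIds, mapping) {
  const out = new Set();
  let partial = false;
  for (const id of verseIds) {
    const targets = mapping[id];
    if (!targets || targets.length === 0) {
      partial = true; // no counterpart in the target canon
      continue;
    }
    for (const target of targets) out.add(target);
  }
  // De-duplicated and sorted numerically.
  return { verse_ids: [...out].sort((a, b) => a - b), partial };
}

// `loadMapping` stands in for getCocMapping(); awaiting it makes this async.
async function convertCanonSketch(verseIds, loadMapping) {
  return mapVerseIds(verseIds, await loadMapping());
}
```

Keeping the pure mapping step separate from the lazy load makes it testable without touching the large mapping file.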
| 141 | + |
| 142 | +### Result Canon Field |
| 143 | + |
| 144 | +Include `canon` in the result only when it differs from the expected canon (the explicit `canon` option, or the default canon when none is given): |
| 145 | + |
| 146 | +```javascript |
| 147 | +// Using default (LDS), found in LDS → no canon field |
| 148 | +lookupReference("John 3:16") |
| 149 | +// → { ref: "John 3:16", verse_ids: [26136] } |
| 150 | + |
| 151 | +// Using default (LDS), auto-detected COC → canon field included |
| 152 | +lookupReference("1 Nephi 3:150") |
| 153 | +// → { ref: "1 Nephi 3:150", verse_ids: [...], canon: "coc" } |
| 154 | + |
| 155 | +// Explicit COC, found in COC → no canon field |
| 156 | +lookupReference("1 Nephi 3:150", "en", { canon: "coc" }) |
| 157 | +// → { ref: "1 Nephi 3:150", verse_ids: [...] } |
| 158 | +``` |
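The three cases above, together with the "Explicit → LDS default → auto-detect fallback" priority, could be sketched like this. It is a sketch under assumptions: `lookupIn(canon, query)` is a hypothetical per-canon lookup that returns `{ ref, verse_ids }` with an empty `verse_ids` on a miss, `FALLBACK_CANONS` is an assumed detection order, and the rule that an explicit canon never falls back is one reading of the priority order, not something this design states.

```javascript
// Hypothetical sketch; `lookupIn(canon, query)` stands in for the real
// per-canon lookup and returns { ref, verse_ids } (empty verse_ids on a miss).
const FALLBACK_CANONS = ['lds', 'coc']; // assumed order for auto-detection

let defaultCanon = 'lds';
function setCanon(canon) { defaultCanon = canon; }
function getCanon() { return defaultCanon; }

function resolveWithCanon(query, config, lookupIn) {
  const explicit = config?.canon;
  const expected = explicit ?? getCanon();
  const primary = lookupIn(expected, query);

  // Assumption: an explicit canon is never overridden by auto-detection.
  if (primary.verse_ids.length > 0 || explicit) return primary;

  for (const canon of FALLBACK_CANONS) {
    if (canon === expected) continue;
    const hit = lookupIn(canon, query);
    // The canon is reported only because it differs from the expected one.
    if (hit.verse_ids.length > 0) return { ...hit, canon };
  }
  return primary; // nothing matched in any canon
}
```

With this shape, callers who never pass `canon` and only look up LDS references see exactly the same result objects as before, which keeps the existing signatures backward compatible.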
| 159 | + |
| 160 | +## File Structure |
| 161 | + |
| 162 | +``` |
| 163 | +src/ |
| 164 | +├── scriptures.mjs # Main entry (updated imports) |
| 165 | +├── canon-converter.mjs # Renamed from scriptcanon.mjs |
| 166 | +├── lib/ |
| 167 | +│ ├── data-loader.mjs # NEW: lazy loading, caching, merging |
| 168 | +│ ├── deep-merge.mjs # KEEP: merges shared + canon data |
| 169 | +│ └── options-resolver.mjs # KEEP: resolves language/canon options |
| 170 | +├── data/ # Compiled JS (gitignored) |
| 171 | +│ ├── canons/ |
| 172 | +│ │ ├── bible/ |
| 173 | +│ │ │ ├── structure.mjs |
| 174 | +│ │ │ └── en.mjs |
| 175 | +│ │ ├── lds/ |
| 176 | +│ │ │ ├── structure.mjs |
| 177 | +│ │ │ └── en.mjs |
| 178 | +│ │ └── coc/ |
| 179 | +│ │ ├── structure.mjs |
| 180 | +│ │ ├── en.mjs |
| 181 | +│ │ └── mapping.mjs |
| 182 | +│ └── shared/ |
| 183 | +│ ├── en.mjs |
| 184 | +│ └── ko.mjs |
| 185 | + |
| 186 | +build/ |
| 187 | +├── build.mjs # Existing bundler |
| 188 | +└── compile-yaml.mjs # NEW: YAML → JS compiler |
| 189 | + |
| 190 | +data/ # Source YAML (committed) |
| 191 | +├── canons/ |
| 192 | +│ ├── bible/ |
| 193 | +│ │ ├── _structure.yml |
| 194 | +│ │ └── en.yml |
| 195 | +│ ├── lds/ |
| 196 | +│ │ ├── _structure.yml |
| 197 | +│ │ └── en.yml |
| 198 | +│ └── coc/ |
| 199 | +│ ├── _structure.yml |
| 200 | +│ └── en.yml |
| 201 | +└── shared/ |
| 202 | + ├── en.yml |
| 203 | + └── ko.yml |
| 204 | +``` |
| 205 | + |
| 206 | +## Files to Delete |
| 207 | + |
| 208 | +After refactoring is complete, clear these files out of `src/lib/` (the first is relocated, the second is deleted): |
| 209 | +- `yaml-loader.mjs` (only the build script needs it; move it to `build/`) |
| 210 | +- `canon-loader.mjs` (replaced by `data-loader.mjs`; delete it) |
| 211 | + |
| 212 | +## Implementation Tasks |
| 213 | + |
| 214 | +1. Create `build/compile-yaml.mjs` script |
| 215 | +2. Create `src/lib/data-loader.mjs` module |
| 216 | +3. Update `src/scriptures.mjs` to use data-loader |
| 217 | +4. Rename `scriptcanon.mjs` to `canon-converter.mjs`, make generic |
| 218 | +5. Add `canon` option to lookup/generate/detect functions |
| 219 | +6. Add `setCanon()`/`getCanon()` functions |
| 220 | +7. Implement auto-detect fallback logic |
| 221 | +8. Update `build/build.mjs` to run compile-yaml first |
| 222 | +9. Add `src/data/` to `.gitignore` |
| 223 | +10. Update tests for new canon functionality |
| 224 | +11. Move `yaml-loader.mjs` to `build/` directory |
| 225 | +12. Remove `canon-loader.mjs` |
| 226 | + |
| 227 | +## Testing Strategy |
| 228 | + |
| 229 | +- All existing tests must pass (backward compatibility) |
| 230 | +- Add tests for explicit canon selection |
| 231 | +- Add tests for auto-detect fallback |
| 232 | +- Add tests for `convertCanon()` with lazy loading |
| 233 | +- Add tests for `setCanon()`/`getCanon()` |