1 | | -Trying to make Wiktionary Lua pronunciation modules to work in browser using fengari and wasmoon. |
| 1 | +# wiktionary_pron |
| 2 | + |
| 3 | +Browser-side IPA transcription using Wiktionary's Lua pronunciation modules executed via |
| 4 | +[wasmoon](https://github.com/ceifa/wasmoon) (Lua 5.4 → WebAssembly). |
| 5 | + |
| 6 | +Live: https://hellpanderrr.github.io/wiktionary_pron/ |
| 7 | + |
| 8 | +## Supported languages |
| 9 | + |
| 10 | +| Language | Code | Styles | Forms | Help | |
| 11 | +|----------|------|--------|-------|------| |
| 12 | +| [Latin](https://hellpanderrr.github.io/wiktionary_pron/?lang=Latin) | `la` | Classical, Ecclesiastical, Vulgar | Phonetic, Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/latin.html) | |
| 13 | +| [Ancient Greek](https://hellpanderrr.github.io/wiktionary_pron/?lang=Greek) | `grc` | 5th BCE Attic, 1st CE Egyptian, 4th CE Koine, 10th CE Byzantine, 15th CE Constantinopolitan | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/greek.html) | |
| 14 | +| [Armenian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Armenian) | `hy` | Western, Eastern | Phonemic, Phonetic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/armenian.html) | |
| 15 | +| [German](https://hellpanderrr.github.io/wiktionary_pron/?lang=German) | `de` | Default | Phonemic, Phonetic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/german.html) | |
| 16 | +| [French](https://hellpanderrr.github.io/wiktionary_pron/?lang=French) | `fr` | Default, Parisian (experimental) | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/french.html) | |
| 17 | +| [Spanish](https://hellpanderrr.github.io/wiktionary_pron/?lang=Spanish) | `es` | Castilian, Latin American | Phonetic, Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/spanish.html) | |
| 18 | +| [Portuguese](https://hellpanderrr.github.io/wiktionary_pron/?lang=Portuguese) | `pt` | Brazil, Portugal | Phonetic, Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/portuguese.html) | |
| 19 | +| [Russian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Russian) | `ru` | Default | Phonetic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/russian.html) | |
| 20 | +| [Ukrainian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Ukrainian) | `uk` | Default | Phonetic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/ukrainian.html) | |
| 21 | +| [Belarusian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Belorussian) | `be` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/belorussian.html) |
| 22 | +| [Polish](https://hellpanderrr.github.io/wiktionary_pron/?lang=Polish) | `pl` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/polish.html) | |
| 23 | +| [Bulgarian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Bulgarian) | `bg` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/bulgarian.html) | |
| 24 | +| [Czech](https://hellpanderrr.github.io/wiktionary_pron/?lang=Czech) | `cs` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/czech.html) | |
| 25 | +| [Lithuanian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Lithuanian) | `lt` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/lithuanian.html) | |
| 26 | +| [Icelandic](https://hellpanderrr.github.io/wiktionary_pron/?lang=Icelandic) | `is` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/icelandic.html) | |
| 27 | +| [Mongolian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Mongolian) | `mn` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/mongolian.html) | |
| 28 | + |
| 29 | +## Architecture |
| 30 | + |
| 31 | +### Lua runtime (`scripts/lua_init.js`) |
| 32 | + |
| 33 | +[wasmoon](https://github.com/ceifa/wasmoon) is initialized once on page load. A custom `require` |
| 34 | +shim replaces the default Lua `require`: |
| 35 | + |
| 36 | +- Converts dot-separated module paths to slash-separated paths |
| 37 | + (e.g. `ustring.charsets` → `ustring/charsets`) |
| 38 | +- Fetches `.lua` files from `lua_modules/` over HTTP |
| 39 | +- Executes them via `load(text)()` |
| 40 | +- Wraps the whole `require` in a Lua-side `memoize` to avoid redundant fetches |
| 41 | + |
| 42 | +After the runtime is ready, `loadLanguage(code)` runs `require("<code>-pron_wasm")` inside Lua |
| 43 | +and exposes the result as `window[code + "_ipa"]` for JS-side calls. |
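
The path conversion and fetch de-duplication described for the `require` shim can be sketched in plain JavaScript. This is a minimal illustration under stated assumptions, not the code in `scripts/lua_init.js`: the names `modulePathToUrl` and `makeSourceLoader` are hypothetical, the exact URL layout (`lua_modules/<path>.lua`) is assumed, and the cache here lives on the JS side, whereas the README says the real memoization is done Lua-side.

```javascript
// Hypothetical helper names; the real shim lives in scripts/lua_init.js.
const MODULE_ROOT = "lua_modules"; // directory named in the README

// Convert a dot-separated Lua module name into a fetchable path,
// e.g. "ustring.charsets" -> "lua_modules/ustring/charsets.lua".
// The ".lua" suffix and root prefix are assumptions about the URL layout.
function modulePathToUrl(moduleName, root = MODULE_ROOT) {
  return `${root}/${moduleName.split(".").join("/")}.lua`;
}

// Wrap a fetch-like function so each module is requested at most once.
// `fetchText` is injected so the sketch can run outside a browser.
function makeSourceLoader(fetchText) {
  const cache = new Map(); // module name -> whatever fetchText returned
  return function loadSource(moduleName) {
    if (!cache.has(moduleName)) {
      cache.set(moduleName, fetchText(modulePathToUrl(moduleName)));
    }
    return cache.get(moduleName);
  };
}
```

In a browser, `fetchText` could be `(url) => fetch(url).then((r) => r.text())`; caching the returned promise means concurrent requests for the same module also share a single fetch.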
| 44 | + |
| 45 | +### Lua modules (`lua_modules/`) |
| 46 | + |
| 47 | +Two categories: |
| 48 | + |
| 49 | +- **Wiktionary modules** — taken verbatim from |
| 50 | + [Wiktionary pronunciation modules](https://en.wiktionary.org/wiki/Category:Pronunciation_modules) |
| 51 | + and the [wiktra](https://github.com/kbatsuren/wiktra) MediaWiki compatibility layer (`mw.lua`, |
| 52 | + `mw-text.lua`, `mw-title.lua`, etc.) |
| 53 | +- **`*_wasm.lua` wrappers** — thin adapters (e.g. `la-pron_wasm.lua`, `grc-pron_wasm.lua`) |
| 54 | + that bridge the Wiktionary module API to the interface expected by `loadLanguage()` |
| 55 | + |
| 56 | +### Caching |
| 57 | + |
| 58 | +| Data | Storage | TTL | |
| 59 | +|------|---------|-----| |
| 60 | +| IPA results | `localStorage` | 7 days | |
| 61 | +| Audio (TTS) | IndexedDB | persistent | |
| 62 | +| Lexicons | IndexedDB ([localForage](https://github.com/localForage/localForage)) | persistent |
| 63 | + |
| 64 | +IPA result caching is implemented in `scripts/utils.js` via `memoizeLocalStorage()`. |
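
The 7-day expiry for IPA results can be illustrated with a small TTL memoizer. This is a hedged sketch, not the actual `memoizeLocalStorage()` from `scripts/utils.js`, whose signature is not shown here: the name `memoizeWithTtl`, the `ipa:` key prefix, and the stored `{ value, savedAt }` shape are all assumptions made for illustration.

```javascript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000; // TTL from the caching table

// Hypothetical TTL memoizer. `storage` only needs getItem/setItem,
// so window.localStorage fits, as does an in-memory stand-in.
// `now` is injectable to make expiry easy to exercise.
function memoizeWithTtl(fn, storage, options = {}) {
  const { prefix = "ipa:", ttlMs = SEVEN_DAYS_MS, now = Date.now } = options;
  return function memoized(...args) {
    const key = prefix + JSON.stringify(args);
    const raw = storage.getItem(key); // null when the key is absent
    if (raw !== null) {
      const entry = JSON.parse(raw);
      if (now() - entry.savedAt < ttlMs) {
        return entry.value; // entry is still fresh
      }
    }
    // Missing or expired: recompute and overwrite the stored entry.
    const value = fn(...args);
    storage.setItem(key, JSON.stringify({ value, savedAt: now() }));
    return value;
  };
}
```

Storing the timestamp next to the value keeps expiry checks local to each entry, so stale results are replaced lazily on the next lookup rather than by a separate cleanup pass.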
| 65 | + |
| 66 | +### TTS (`scripts/tts.js`) |
| 67 | + |
| 68 | +- **Web Speech API** — via [EasySpeech](https://github.com/jankapunkt/easy-speech) wrapper |
| 69 | +- **Edge TTS** — `StreamingTTS` class, proxied through Cloudflare Workers; responses cached in |
| 70 | + IndexedDB |
| 71 | + |
| 72 | +### Lexicons (`scripts/lexicon.js`) |
| 73 | + |
| 74 | +Some languages use a dictionary lookup before falling back to Lua rules. Lexicons are stored as |
| 75 | +compressed `.zip` files, decompressed client-side via [JSZip](https://stuk.github.io/jszip/), |
| 76 | +and loaded into `globalThis.lexicon` as `Map` objects. Loading is deferred until the language |
| 77 | +is first selected. |
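
The "dictionary first, Lua rules second" flow with deferred loading can be sketched as follows. This is an illustration under stated assumptions rather than the code in `scripts/lexicon.js`: `makeLexiconStore`, `transcribe`, and the `loadEntries` callback are hypothetical names, and loading is shown synchronously for brevity, whereas the real code has to fetch and unzip the lexicon asynchronously.

```javascript
// Build each language's lexicon at most once, on first request.
// `loadEntries(lang)` is a hypothetical callback that returns an
// iterable of [word, ipa] pairs for the given language.
function makeLexiconStore(loadEntries) {
  const lexicons = new Map(); // language -> Map(word -> IPA string)
  return function getLexicon(lang) {
    if (!lexicons.has(lang)) {
      lexicons.set(lang, new Map(loadEntries(lang)));
    }
    return lexicons.get(lang);
  };
}

// A dictionary hit wins; otherwise fall back to the rule-based
// transcriber (in this project, the Lua module for the language).
function transcribe(word, lexicon, ruleBased) {
  const hit = lexicon.get(word);
  return hit !== undefined ? hit : ruleBased(word);
}
```

Keeping the fallback as a plain function argument means the same lookup works for languages without a lexicon: pass an empty `Map` and every word goes to the rule-based path.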
| 78 | + |
| 79 | +### Other tools |
| 80 | + |
| 81 | +- [`macronizer.html`](https://hellpanderrr.github.io/wiktionary_pron/macronizer.html) — Latin |
| 82 | + vowel length marking (dictionary-based) |
| 83 | +- `index_fengari.html` — legacy version using [fengari](https://github.com/fengari-lua/fengari) |
| 84 | + (Lua 5.3 in JS), kept for reference |
| 85 | + |
| 86 | +## Dependencies |
| 87 | + |
| 88 | +| Library | Purpose | |
| 89 | +|---------|---------| |
| 90 | +| [wasmoon](https://github.com/ceifa/wasmoon) | Lua 5.4 runtime via WebAssembly | |
| 91 | +| [EasySpeech](https://github.com/jankapunkt/easy-speech) | Web Speech API wrapper | |
| 92 | +| [localForage](https://github.com/localForage/localForage) | IndexedDB/localStorage abstraction |
| 93 | +| [JSZip](https://stuk.github.io/jszip/) | Lexicon decompression | |
| 94 | +| [pdf-lib](https://github.com/Hopding/pdf-lib) | Client-side PDF export | |
| 95 | + |
| 96 | +## Deployment |
| 97 | + |
| 98 | +GitHub Pages, no build step. All assets are static; Lua modules are fetched on demand at runtime. |