Commit ed5b6cf

Revise README.md with project details and structure
Updated README.md to provide detailed project information, including supported languages, architecture, dependencies, and deployment instructions.
1 parent e9ae653 commit ed5b6cf

1 file changed

Lines changed: 98 additions & 1 deletion

wiktionary_pron/README.md

@@ -1 +1,98 @@
1-
Trying to make Wiktionary Lua pronunciation modules to work in browser using fengari and wasmoon.
1+
# wiktionary_pron
2+
3+
Browser-side IPA transcription using Wiktionary's Lua pronunciation modules executed via
4+
[wasmoon](https://github.com/ceifa/wasmoon) (Lua 5.4 → WebAssembly).
5+
6+
Live: https://hellpanderrr.github.io/wiktionary_pron/
7+
8+
## Supported languages
9+
10+
| Language | Code | Styles | Forms | Help |
11+
|----------|------|--------|-------|------|
12+
| [Latin](https://hellpanderrr.github.io/wiktionary_pron/?lang=Latin) | `la` | Classical, Ecclesiastical, Vulgar | Phonetic, Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/latin.html) |
13+
| [Ancient Greek](https://hellpanderrr.github.io/wiktionary_pron/?lang=Greek) | `grc` | 5th BCE Attic, 1st CE Egyptian, 4th CE Koine, 10th CE Byzantine, 15th CE Constantinopolitan | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/greek.html) |
14+
| [Armenian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Armenian) | `hy` | Western, Eastern | Phonemic, Phonetic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/armenian.html) |
15+
| [German](https://hellpanderrr.github.io/wiktionary_pron/?lang=German) | `de` | Default | Phonemic, Phonetic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/german.html) |
16+
| [French](https://hellpanderrr.github.io/wiktionary_pron/?lang=French) | `fr` | Default, Parisian (experimental) | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/french.html) |
17+
| [Spanish](https://hellpanderrr.github.io/wiktionary_pron/?lang=Spanish) | `es` | Castilian, Latin American | Phonetic, Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/spanish.html) |
18+
| [Portuguese](https://hellpanderrr.github.io/wiktionary_pron/?lang=Portuguese) | `pt` | Brazil, Portugal | Phonetic, Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/portuguese.html) |
19+
| [Russian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Russian) | `ru` | Default | Phonetic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/russian.html) |
20+
| [Ukrainian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Ukrainian) | `uk` | Default | Phonetic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/ukrainian.html) |
21+
| [Belarusian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Belorussian) | `be` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/belorussian.html) |
22+
| [Polish](https://hellpanderrr.github.io/wiktionary_pron/?lang=Polish) | `pl` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/polish.html) |
23+
| [Bulgarian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Bulgarian) | `bg` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/bulgarian.html) |
24+
| [Czech](https://hellpanderrr.github.io/wiktionary_pron/?lang=Czech) | `cs` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/czech.html) |
25+
| [Lithuanian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Lithuanian) | `lt` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/lithuanian.html) |
26+
| [Icelandic](https://hellpanderrr.github.io/wiktionary_pron/?lang=Icelandic) | `is` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/icelandic.html) |
27+
| [Mongolian](https://hellpanderrr.github.io/wiktionary_pron/?lang=Mongolian) | `mn` | Default | Phonemic | [help](https://hellpanderrr.github.io/wiktionary_pron/help/mongolian.html) |
28+
29+
## Architecture
30+
31+
### Lua runtime (`scripts/lua_init.js`)
32+
33+
[wasmoon](https://github.com/ceifa/wasmoon) is initialized once on page load. A custom `require`
34+
shim replaces the default Lua `require`:
35+
36+
- Converts dot-separated module paths to slash-separated paths
37+
(e.g. `ustring.charsets` → `ustring/charsets`)
38+
- Fetches `.lua` files from `lua_modules/` over HTTP
39+
- Executes them via `load(text)()`
40+
- Wraps the whole `require` in a Lua-side `memoize` to avoid redundant fetches
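The shim's steps can be sketched in JavaScript roughly as follows. This is an illustration under assumptions, not the code in `scripts/lua_init.js`: the names `modulePathToUrl` and `makeModuleLoader` and the injected `fetchText` callback are hypothetical, and the memoization here lives on the JS side, whereas the project does it in Lua.

```javascript
// Hypothetical helper: map a dot-separated Lua module name to a URL
// under lua_modules/ (the directory name comes from the README).
function modulePathToUrl(moduleName, baseDir = "lua_modules") {
  return `${baseDir}/${moduleName.split(".").join("/")}.lua`;
}

// Hypothetical loader factory. `fetchText(url)` is injected so the sketch
// runs anywhere; in a browser it could be
//   (url) => fetch(url).then((r) => r.text())
function makeModuleLoader(fetchText) {
  const cache = new Map(); // module name -> Promise of Lua source text
  return function loadModuleSource(moduleName) {
    if (!cache.has(moduleName)) {
      const pending = Promise.resolve().then(() =>
        fetchText(modulePathToUrl(moduleName))
      );
      // Drop failed fetches so a later call can retry.
      pending.catch(() => cache.delete(moduleName));
      cache.set(moduleName, pending);
    }
    return cache.get(moduleName);
  };
}
```

Caching the promise rather than the resolved text means two concurrent requests for the same module share a single fetch.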
41+
42+
After the runtime is ready, `loadLanguage(code)` runs `require("<code>-pron_wasm")` inside Lua
43+
and exposes the result as `window[code + "_ipa"]` for JS-side calls.
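A rough sketch of that step, assuming the wrapper module returns something callable from JavaScript. Unlike the real `loadLanguage(code)`, this version takes the engine and the target object as parameters so it can run outside a browser; `stubEngine` is a hypothetical stand-in for the Lua runtime.

```javascript
// `engine` is anything with an async doString(luaSource) method.
// wasmoon's LuaEngine provides doString; a stub is used below so the
// example runs without WebAssembly.
async function loadLanguage(engine, code, target = globalThis) {
  // Assumption: the wrapper's return value is usable directly from JS.
  const ipa = await engine.doString(`return require("${code}-pron_wasm")`);
  target[`${code}_ipa`] = ipa; // e.g. target.la_ipa for Latin
  return ipa;
}

// Illustrative stub: echoes its input together with the Lua chunk it received.
const stubEngine = {
  async doString(source) {
    return (text) => `/${text}/ via ${source}`;
  },
};
```

In a page, passing `window` as `target` would give the `window[code + "_ipa"]` binding described above.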
44+
45+
### Lua modules (`lua_modules/`)
46+
47+
Two categories:
48+
49+
- **Wiktionary modules** — taken verbatim from
50+
[Wiktionary pronunciation modules](https://en.wiktionary.org/wiki/Category:Pronunciation_modules)
51+
and the [wiktra](https://github.com/kbatsuren/wiktra) MediaWiki compatibility layer (`mw.lua`,
52+
`mw-text.lua`, `mw-title.lua`, etc.)
53+
- **`*_wasm.lua` wrappers** — thin adapters (e.g. `la-pron_wasm.lua`, `grc-pron_wasm.lua`)
54+
that bridge the Wiktionary module API to the interface expected by `loadLanguage()`
55+
56+
### Caching
57+
58+
| Data | Storage | TTL |
59+
|------|---------|-----|
60+
| IPA results | `localStorage` | 7 days |
61+
| Audio (TTS) | IndexedDB | persistent |
62+
| Lexicons | IndexedDB ([localforage](https://github.com/localForage/localForage)) | persistent |
63+
64+
IPA result caching is implemented in `scripts/utils.js` via `memoizeLocalStorage()`.
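One way such a helper could work is sketched below. It is not the project's `memoizeLocalStorage()`: the name `memoizeWithTtl`, the key prefix, and the injected `storage` and `now` parameters are assumptions made so the example is testable outside a browser.

```javascript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical TTL memoizer. `storage` needs getItem/setItem like
// window.localStorage; `now` is injectable so expiry can be tested.
function memoizeWithTtl(fn, { storage, prefix = "ipa:", ttlMs = SEVEN_DAYS_MS, now = Date.now }) {
  return function memoized(...args) {
    const key = prefix + JSON.stringify(args);
    const raw = storage.getItem(key);
    if (raw != null) {
      try {
        const entry = JSON.parse(raw);
        if (now() - entry.savedAt < ttlMs) return entry.value; // fresh hit
      } catch {
        // Corrupt entry: fall through and recompute.
      }
    }
    const value = fn(...args);
    storage.setItem(key, JSON.stringify({ value, savedAt: now() }));
    return value;
  };
}

// Minimal in-memory stand-in for localStorage (for Node or tests).
function makeMemoryStorage() {
  const map = new Map();
  return {
    getItem: (k) => (map.has(k) ? map.get(k) : null),
    setItem: (k, v) => void map.set(k, String(v)),
  };
}
```

Storing the timestamp next to the value lets stale entries be detected lazily on read, so no background cleanup is needed.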
65+
66+
### TTS (`scripts/tts.js`)
67+
68+
- **Web Speech API** — via [EasySpeech](https://github.com/jankapunkt/easy-speech) wrapper
69+
- **Edge TTS**: `StreamingTTS` class, proxied through Cloudflare Workers; responses cached in
70+
IndexedDB
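The response caching for the proxied path might look roughly like this. The function name `getAudio`, the key format, and the injected `synthesize` callback are hypothetical; `store` only needs async `getItem`/`setItem`, which localforage provides over IndexedDB.

```javascript
// Hypothetical cache-first lookup for synthesized audio.
// `synthesize(text, voice)` stands in for the remote TTS request.
async function getAudio(text, voice, { store, synthesize }) {
  const key = `tts:${voice}:${text}`;
  const cached = await store.getItem(key);
  if (cached != null) return cached; // served from the persistent cache
  const audio = await synthesize(text, voice);
  await store.setItem(key, audio);
  return audio;
}

// In-memory stand-in for an async key-value store (illustrative only).
function makeAsyncMemoryStore() {
  const map = new Map();
  return {
    getItem: async (k) => (map.has(k) ? map.get(k) : null),
    setItem: async (k, v) => void map.set(k, v),
  };
}
```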
71+
72+
### Lexicons (`scripts/lexicon.js`)
73+
74+
Some languages use a dictionary lookup before falling back to Lua rules. Lexicons are stored as
75+
compressed `.zip` files, decompressed client-side via [JSZip](https://stuk.github.io/jszip/),
76+
and loaded into `globalThis.lexicon` as `Map` objects. Loading is deferred until the language
77+
is first selected.
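The lookup-then-fallback flow with deferred loading can be sketched as below. All names are hypothetical: `loadEntries(lang)` stands in for fetching and unzipping a lexicon, `ruleBased(word)` for the Lua transcription, and a local `Map` plays the role the README assigns to `globalThis.lexicon`.

```javascript
// Hypothetical factory tying a lazily loaded lexicon to a rule-based fallback.
function makeTranscriber({ loadEntries, ruleBased }) {
  const lexicons = new Map(); // language code -> Promise of Map(word -> IPA)

  function getLexicon(lang) {
    if (!lexicons.has(lang)) {
      // Loading starts on first use of the language, then is reused.
      lexicons.set(
        lang,
        Promise.resolve(loadEntries(lang)).then((pairs) => new Map(pairs))
      );
    }
    return lexicons.get(lang);
  }

  return async function transcribe(word, lang) {
    const lexicon = await getLexicon(lang);
    // A dictionary hit wins; otherwise defer to the rules.
    return lexicon.has(word) ? lexicon.get(word) : ruleBased(word);
  };
}
```

Keeping the loader and the fallback as parameters makes the ordering (dictionary first, rules second) easy to test without real lexicon files.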
78+
79+
### Other tools
80+
81+
- [`macronizer.html`](https://hellpanderrr.github.io/wiktionary_pron/macronizer.html) — Latin
82+
vowel length marking (dictionary-based)
83+
- `index_fengari.html` — legacy version using [fengari](https://github.com/fengari-lua/fengari)
84+
(Lua 5.3 in JS), kept for reference
85+
86+
## Dependencies
87+
88+
| Library | Purpose |
89+
|---------|---------|
90+
| [wasmoon](https://github.com/ceifa/wasmoon) | Lua 5.4 runtime via WebAssembly |
91+
| [EasySpeech](https://github.com/jankapunkt/easy-speech) | Web Speech API wrapper |
92+
| [localforage](https://github.com/localForage/localForage) | IndexedDB/localStorage abstraction |
93+
| [JSZip](https://stuk.github.io/jszip/) | Lexicon decompression |
94+
| [pdf-lib](https://github.com/Hopding/pdf-lib) | Client-side PDF export |
95+
96+
## Deployment
97+
98+
GitHub Pages, no build step. All assets are static; Lua modules are fetched on demand at runtime.
