Commit 5e6dfca

Merge pull request #87 from engalar/worktree-i18n-design
docs: MDL i18n design proposal
2 parents b33a48c + 0372b51 commit 5e6dfca

1 file changed

+234
-0
lines changed

@@ -0,0 +1,234 @@
# MDL Internationalization (i18n) Support

**Date:** 2026-04-03 (updated 2026-04-03)
**Status:** Proposal (revised per review feedback)
**Author:** @engalar

## Problem

MDL currently handles all translatable text fields (page titles, widget captions, enumeration captions, microflow message templates) as single-language strings. When creating or describing model elements, only the default language is read or written. All other translations are silently dropped.

This means:
- `DESCRIBE PAGE` output loses translations: roundtripping a page strips non-default languages
- `CREATE PAGE` can only set one language: multi-language projects require Studio Pro for translation
- There is no way to see all translations in context (note: `SHOW LANGUAGES` and the `QUAL005 MissingTranslations` linter rule already provide language inventory and gap detection via the catalog `strings` table)

Mendix stores translations as `Texts$Text` objects containing an array of `Texts$Translation` entries (one per language). The mxcli internal model (`model.Text`) already represents translations as `map[string]string`, and the BSON reader/writer already handles multi-language serialization. The gap is purely at the MDL syntax and command layer.

## Scope

**In scope (syntax-layer extension):**
- Inline multi-language text literal syntax for CREATE, ALTER, and ALTER PAGE SET
- `DESCRIBE ... WITH TRANSLATIONS` output mode
- Writer changes to serialize multi-language BSON correctly

**Out of scope:**
- Batch export/import (CSV, XLIFF): future proposal
- Standalone ALTER TRANSLATION command: future proposal
- Translation memory or machine translation integration

## Design

### 1. Translated Text Literal Syntax

Any MDL property that accepts a string literal `'text'` can alternatively accept a translation map:

```sql
-- Single language (backward compatible, unchanged)
Title: 'Hello World'

-- Multi-language
Title: {
  en_US: 'Hello World',
  zh_CN: '你好世界',
  nl_NL: 'Hallo Wereld'
}
```

**Grammar (ANTLR4):**

New rule:

```antlr
translationMap
    : LBRACE translationEntry (COMMA translationEntry)* COMMA? RBRACE
    ;

translationEntry
    : IDENTIFIER COLON STRING_LITERAL
    ;
```

Integration into `propertyValueV3` (MDLParser.g4 line ~1961):

```antlr
propertyValueV3
    : STRING_LITERAL
    | translationMap                    // NEW: { en_US: 'Hello', zh_CN: '你好' }
    | NUMBER_LITERAL
    | booleanLiteral
    | qualifiedName
    | IDENTIFIER
    | H1 | H2 | H3 | H4 | H5 | H6
    | LBRACKET (expression (COMMA expression)*)? RBRACKET
    ;
```

**Disambiguation from widget body `{`**: `translationMap` only appears inside `propertyValueV3`, which follows `COLON` or `EQUALS` in property definitions. Widget bodies (`widgetBodyV3`) follow `)` at statement level, never `:`. When the parser sees `Caption: {`, it enters `propertyValueV3 → translationMap`. There is no ambiguity, because `widgetBodyV3` is a separate production in `widgetStatementV3` that requires `(...)` before `{`.

**AST node:**

```go
type TranslatedText struct {
	Translations map[string]string // languageCode → text
	IsMultiLang  bool              // false = single bare string
}
```

**Semantics:**
- Bare string `'text'` writes to the project's `DefaultLanguageCode`. Existing translations in other languages are preserved.
- Map `{ lang: 'text', ... }` writes the specified languages. Languages not mentioned in the map are preserved (merge, not replace).
- No syntax for deleting a translation (use Studio Pro).

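The merge rules above can be sketched at the level of plain language maps. This is only an illustration: `applyMap` and `applyBare` are hypothetical helper names, not functions from mxcli, and the sketch ignores BSON entirely.

```go
package main

import "fmt"

// applyMap returns a copy of existing with every entry of incoming written
// over it. Languages absent from incoming are preserved (merge, not replace).
func applyMap(existing, incoming map[string]string) map[string]string {
	merged := make(map[string]string, len(existing)+len(incoming))
	for lang, text := range existing {
		merged[lang] = text
	}
	for lang, text := range incoming {
		merged[lang] = text
	}
	return merged
}

// applyBare treats a bare string as a one-entry map for the default language.
func applyBare(existing map[string]string, defaultLang, text string) map[string]string {
	return applyMap(existing, map[string]string{defaultLang: text})
}

func main() {
	existing := map[string]string{"en_US": "Save", "nl_NL": "Opslaan"}

	// Bare string: only the default language changes.
	afterBare := applyBare(existing, "en_US", "Store")
	fmt.Println(afterBare["en_US"], afterBare["nl_NL"]) // Store Opslaan

	// Map: the listed language is added, the others are kept.
	afterMap := applyMap(existing, map[string]string{"zh_CN": "保存"})
	fmt.Println(len(afterMap), afterMap["zh_CN"]) // 3 保存
}
```

Treating the bare string as a one-entry map keeps a single code path for both syntaxes, which matches the "merge, not replace" rule for each.
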
### 2. DESCRIBE WITH TRANSLATIONS

```sql
-- Default: single language output (backward compatible)
DESCRIBE PAGE Module.MyPage;
-- Output: Title: 'Hello World'

-- New: all translations
DESCRIBE PAGE Module.MyPage WITH TRANSLATIONS;
-- Output:
--   Title: {
--     en_US: 'Hello World',
--     zh_CN: '你好世界'
--   }
```

**Rules:**
- Without `WITH TRANSLATIONS`: outputs only the default language as a bare string (current behavior).
- With `WITH TRANSLATIONS`: if only one language exists, still uses a bare string; if two or more languages exist, uses map syntax.
- Output must be re-parseable by the MDL parser (roundtrip guarantee).

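A minimal sketch of how a describe command might render one translatable value under these rules. `formatText` and `quote` are hypothetical names. The two-space indent and the doubling of embedded single quotes are assumptions about MDL output, not confirmed behavior.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// quote wraps text in single quotes. Doubling embedded quotes is an
// assumption about MDL string escaping.
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// formatText renders a value as a bare string or as a translation map.
func formatText(translations map[string]string, defaultLang string, withTranslations bool) string {
	if !withTranslations {
		return quote(translations[defaultLang])
	}
	// A single language is shown as a bare string only when it is the default
	// language, so that re-parsing writes it back to the same language.
	if text, ok := translations[defaultLang]; len(translations) == 0 || (ok && len(translations) == 1) {
		return quote(text)
	}
	langs := make([]string, 0, len(translations))
	for lang := range translations {
		langs = append(langs, lang)
	}
	sort.Strings(langs) // deterministic, diff-friendly order

	var b strings.Builder
	b.WriteString("{\n")
	for i, lang := range langs {
		fmt.Fprintf(&b, "  %s: %s", lang, quote(translations[lang]))
		if i < len(langs)-1 {
			b.WriteString(",")
		}
		b.WriteString("\n")
	}
	b.WriteString("}")
	return b.String()
}

func main() {
	title := map[string]string{"en_US": "Hello World", "zh_CN": "你好世界"}
	fmt.Println("Title: " + formatText(title, "en_US", false))
	fmt.Println("Title: " + formatText(title, "en_US", true))
}
```

The rules do not say what happens when the only existing translation is in a non-default language. The sketch falls back to map syntax in that case, since a bare string would be re-parsed into the default language. That choice is an assumption made here to keep the roundtrip guarantee.
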
**Grammar:**

```antlr
describeStatement
    : DESCRIBE objectType qualifiedName withTranslationsClause?
    ;

withTranslationsClause
    : WITH TRANSLATIONS
    ;
```

**Affected commands:**
- DESCRIBE PAGE / SNIPPET: Title, widget Caption, Placeholder
- DESCRIBE ENTITY: validation rule messages
- DESCRIBE MICROFLOW / NANOFLOW: LogMessage, ShowMessage, ValidationFeedback templates
- DESCRIBE ENUMERATION: value captions
- DESCRIBE WORKFLOW: task names, descriptions, outcome captions

### 3. ALTER PAGE SET with Translation Maps

Translation maps work in ALTER PAGE SET, enabling in-place translation updates:

```sql
ALTER PAGE Module.MyPage
  SET WIDGET saveButton Caption: { en_US: 'Save', zh_CN: '保存' };
```

This reuses the `translationMap` rule inside `propertyValueV3`. No additional grammar changes are needed, since ALTER PAGE SET already uses `propertyValueV3` for values.

### 4. Relationship to Existing Translation Features

`SHOW LANGUAGES` (commit a060152) already lists project languages with string counts. The `QUAL005 MissingTranslations` linter rule already detects missing translations. The catalog `strings` FTS5 table already stores per-language text, which can be queried with `SELECT * FROM CATALOG.strings WHERE Language = 'nl_NL'`.

This proposal does **not** duplicate those features. It addresses the gap they cannot fill: **writing and round-tripping multi-language text in MDL syntax**.

### 5. Writer Layer Changes

When executing CREATE/ALTER with multi-language text, the writer serializes all provided translations into the standard Mendix BSON format:

```go
// Sort language codes first (standard "sort" package): Go map iteration
// order is random, and the Items array must be deterministic.
langCodes := make([]string, 0, len(translatedText.Translations))
for langCode := range translatedText.Translations {
	langCodes = append(langCodes, langCode)
}
sort.Strings(langCodes)

titleItems := bson.A{int32(2)} // marker for non-empty
for _, langCode := range langCodes {
	titleItems = append(titleItems, bson.D{
		{Key: "$ID", Value: generateUUID()},
		{Key: "$Type", Value: "Texts$Translation"},
		{Key: "LanguageCode", Value: langCode},
		{Key: "Text", Value: translatedText.Translations[langCode]},
	})
}
```

**Merge semantics for bare strings (architectural change):**

Currently, all writer functions construct `Texts$Text` from scratch (for example, `writer_pages.go:219-247` builds a new `Items` array every time). Bare-string merge semantics require a **read-modify-write cycle**:

1. Read the existing `Texts$Text` BSON from the MPR via `GetRawUnit`
2. Parse the existing `Items` array to find the entry for `DefaultLanguageCode`
3. Update that entry's `Text` field (or insert the entry if missing)
4. Preserve all other `Texts$Translation` entries unchanged
5. Write back the modified `Items` array

This is a significant change to writer architecture. A shared helper should be introduced:

```go
// mergeTranslation reads existing Texts$Text, merges new translations, returns updated BSON.
// For bare strings: translations = {defaultLang: text}
// For maps: translations = the full map
func mergeTranslation(existingBSON bson.D, translations map[string]string) bson.D
```

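The core of such a helper could look roughly like the following sketch. To stay self-contained it uses a plain `translation` struct in place of the real `bson.D` documents, so the type, `mergeItems`, and the `newID` callback are all hypothetical stand-ins. Keeping the `$ID` of an updated entry is also an assumption about what Studio Pro expects.

```go
package main

import (
	"fmt"
	"sort"
)

// translation stands in for one Texts$Translation document (hypothetical type).
type translation struct {
	ID           string
	LanguageCode string
	Text         string
}

// mergeItems updates entries named in updates, inserts missing ones, keeps all
// other entries unchanged, and returns the result sorted by language code.
func mergeItems(existing []translation, updates map[string]string, newID func() string) []translation {
	merged := make([]translation, 0, len(existing)+len(updates))
	seen := make(map[string]bool, len(existing))
	for _, item := range existing {
		if text, ok := updates[item.LanguageCode]; ok {
			item.Text = text // update the copy, keeping the existing ID
		}
		seen[item.LanguageCode] = true
		merged = append(merged, item)
	}
	for lang, text := range updates {
		if !seen[lang] {
			merged = append(merged, translation{ID: newID(), LanguageCode: lang, Text: text})
		}
	}
	sort.Slice(merged, func(i, j int) bool {
		return merged[i].LanguageCode < merged[j].LanguageCode
	})
	return merged
}

func main() {
	existing := []translation{
		{ID: "id-2", LanguageCode: "nl_NL", Text: "Opslaan"},
		{ID: "id-1", LanguageCode: "en_US", Text: "Save"},
	}
	updates := map[string]string{"en_US": "Store", "zh_CN": "保存"}
	for _, t := range mergeItems(existing, updates, func() string { return "id-new" }) {
		fmt.Println(t.LanguageCode, t.Text, t.ID)
	}
}
```

A real implementation would wrap this in the BSON decoding and encoding of steps 1 and 5, and would need to be checked against files written by Studio Pro.
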
**Affected writer functions (11+ call sites):**
- `writer_pages.go`: Page Title, widget Caption/Placeholder
- `writer_enumeration.go`: EnumerationValue Caption
- `writer_microflow.go`: StringTemplate (log/show/validation messages)
- `writer_widgets.go`: all widget Caption/Placeholder properties
- `writer_widgets_action.go`, `writer_widgets_display.go`, `writer_widgets_input.go`

**Serialization ordering:** Translations within the `Items` array must be sorted by language code for deterministic BSON output and diff-friendly DESCRIBE.

## Translatable Fields Inventory

The following fields use `Texts$Text` and are affected by this proposal:

| Category | StringContext | Count | Examples |
|----------|---------------|-------|----------|
| Page metadata | `page_title` | 1 | Page.Title |
| Enumeration values | `enum_caption` | per value | EnumerationValue.Caption |
| Microflow actions | `log_message`, `show_message`, `validation_message` | 3 | LogMessageAction, ShowMessageAction |
| Workflow objects | `task_name`, `task_description`, `outcome_caption`, `activity_caption` | 4 | UserTask.Name, UserTask.Description |
| Widget properties | `caption`, `placeholder` | 7+ | ActionButton.Caption, TextInput.Placeholder |

**Note:** Widget-level translations (caption, placeholder) are not currently indexed in the catalog `strings` table. A follow-up task should extend `catalog/builder_strings.go` to extract these.

## Implementation Phases

| Phase | Scope | Dependency | Risk |
|-------|-------|------------|------|
| **P1** | DESCRIBE WITH TRANSLATIONS: all describe commands output multi-language | None (read-only; the only grammar change is the optional `WITH TRANSLATIONS` clause) | Low |
| **P2** | Grammar + AST: `translationMap` rule, `TranslatedText` node | None | Low |
| **P3** | Visitor: parse `{ lang: 'text' }` into AST | P2 | Low |
| **P4** | Writer `mergeTranslation` helper + multi-lang BSON write | P3 | **High**: architectural change to writer, must test against Studio Pro |
| **P5** | Widget translation indexing: extend catalog builder for widget-level translations | None (independent) | Low |

P1 comes first: it has the highest user value and the lowest risk, because it is read-only. P4 is the riskiest phase.

**Dropped**: SHOW TRANSLATIONS command. `SHOW LANGUAGES` + `QUAL005` + `SELECT ... FROM CATALOG.strings` already cover translation auditing.

## Compatibility

- **Backward compatible**: existing MDL scripts with bare strings continue to work identically.
- **Forward compatible**: MDL scripts using `{ lang: 'text' }` syntax will fail gracefully on older mxcli versions with a parse error pointing to the `{` token.
- **DESCRIBE roundtrip**: `DESCRIBE ... WITH TRANSLATIONS` output can be fed back to `CREATE OR REPLACE` to reproduce the same translations.

## Risks

| Risk | Mitigation |
|------|------------|
| `{` ambiguity with widget body blocks | Grammar context: `translationMap` only appears in property value position, not statement position. Widget bodies follow `)`, not `:`. |
| Translation ordering in BSON | Mendix does not depend on translation order within the `Items` array. Sort by language code for deterministic output. |
| Large translation maps cluttering DESCRIBE output | `WITH TRANSLATIONS` is opt-in; the default remains single-language. |
