Localization: AI translation primitives by jkmassel · Pull Request #25705 · wordpress-mobile/WordPress-iOS

jkmassel · 2026-06-26T04:48:30Z

Reusable Ruby primitives for the AI translation tier of the localization pipeline — the service behind the human ?? AI ?? English floor whose AI stub (ai_translate_plural → nil) was left open in #25688. All of it is pure prompt-building + validation with the Anthropic SDK call injected, so every line of logic is unit-testable without the gem or the network; the live SDK wiring is one thin factory.

Nothing is wired into a lane yet — these are the building blocks, deliberately decoupled from the GlotPress / catalog plumbing that's still in flux.

Summary

TranslationValidator — the format-specifier safety gate. A machine translation must preserve the source's printf/NSString arguments exactly (count + type; positional %1$@ may reorder, which is the whole point). A mismatch is rejected, so a broken translation falls back to English rather than shipping a crash in a locale no one on the team can read.
Glossary — brand do-not-translate list (WordPress, Jetpack, …) plus per-locale preferred terms and a register note, rendered into the prompt. Pure data in; sourcing it (the WordPress.org per-locale glossaries / style guides) is pre-processing handed in later.
AITranslator — three translation shapes: translate (one string), translate_plural (a whole CLDR form-set in one request, so the model keeps one stem across forms), and translate_all (batched regular strings). Structured outputs (output_config json_schema) enforce the reply shape on the plural and batch paths.
AnthropicBatch — the async Message Batches path for a bulk backfill (~50% cheaper): submit → await (poll) → results → collect_batch, plus all the SDK-shape glue, shared with the sync path so the request shape can't drift between them.
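The format-specifier gate described for TranslationValidator can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the module name `PlaceholderSketch`, the regular expression, and the method names are all assumptions. It reduces each string to a sorted list of (argument position, type) pairs, so positional specifiers may reorder freely while any change in count or type is a mismatch.

```ruby
# Hypothetical sketch; not the PR's TranslationValidator.
module PlaceholderSketch
  # One printf/NSString conversion: optional position ("1$"), flags, width,
  # precision, length modifier, then the conversion character.
  SPEC = /%(?:(\d+)\$)?[-+ 0#]*\d*(?:\.\d+)?(hh|h|ll|l|q|z|t|j|L)?([@dDiuUxXoOfFeEgGaAcCsSp])/

  # Returns a sorted list of [argument_index, type] pairs for a format string.
  # Unnumbered specifiers take their index from their order of appearance.
  def self.signature(string)
    implicit = 0
    pairs = string.gsub('%%', '').scan(SPEC).map do |pos, length, conv|
      index = pos ? pos.to_i : (implicit += 1)
      [index, "#{length}#{conv}"]
    end
    pairs.sort
  end

  # True when the translation consumes the same arguments as the source.
  def self.compatible?(source, translation)
    signature(source) == signature(translation)
  end
end

# Positional arguments may reorder:
PlaceholderSketch.compatible?('%1$@ liked %2$d posts', '%2$d posts liked by %1$@') # => true
# A dropped placeholder is a mismatch:
PlaceholderSketch.compatible?('You have %1$d new posts', 'You have new posts') # => false
```

Comparing sorted pairs rather than the raw specifier order is what lets `%1$@` move within the sentence; an unnumbered `%@ … %d` swapped to `%d … %@` still fails, because the implicit positions change.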

Design notes

The SDK call is injected

AITranslator takes a complete: callable (and the batch path takes a client), so prompt-building, validation, batching, and result-assembly are all exercised by unit tests with a canned reply. AITranslator.with_anthropic / AnthropicBatch.client build the live instances (default model claude-opus-4-8). The two live-API bugs we hit — a custom_id that didn't match ^[a-zA-Z0-9_-]{1,64}$, and results_streaming yielding raw JSONL strings rather than typed objects — were both invisible to a permissive fake client and only caught by real calls. The SDK seams are now live-verified, not just fake-tested.
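As a rough illustration of the injection seam and of the custom_id constraint mentioned above, here is a minimal sketch. The class name `InjectedTranslatorSketch`, the prompt wording, and the hashing scheme are assumptions made for the example; only the `complete:` keyword and the `^[a-zA-Z0-9_-]{1,64}$` pattern come from the description.

```ruby
require 'digest'

# Hypothetical sketch; names and prompt wording are illustrative only.
class InjectedTranslatorSketch
  CUSTOM_ID_PATTERN = /\A[a-zA-Z0-9_-]{1,64}\z/

  # `complete` is any object responding to #call(prompt) and returning reply text.
  # In production it would wrap the SDK; in tests it is a lambda with a canned reply.
  def initialize(complete:)
    @complete = complete
  end

  def translate(source, locale:, comment: nil)
    lines = ["Translate into #{locale}: #{source}"]
    lines << "Developer comment: #{comment}" if comment
    @complete.call(lines.join("\n")).to_s.strip
  end

  # Builds a batch custom_id that satisfies the pattern for any catalog key.
  # The id is a hash, so the caller must keep its own id => key map.
  def self.custom_id(locale, key)
    id = "#{locale.tr('^a-zA-Z0-9', '_')}-#{Digest::SHA256.hexdigest(key)[0, 40]}"
    raise ArgumentError, "invalid custom_id: #{id}" unless id.match?(CUSTOM_ID_PATTERN)

    id
  end
end

# A canned reply stands in for the SDK call.
fake = ->(_prompt) { " Suivre \n" }
translator = InjectedTranslatorSketch.new(complete: fake)
translator.translate('Follow', locale: 'fr', comment: 'Button title (verb)') # => "Suivre"
```

A fake like this exercises prompt-building and reply handling, but as the paragraph above notes, it will accept any custom_id, which is why the sketch checks the pattern locally instead of relying on the client to reject it.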

The placeholder gate is a hard floor

Every machine cell — single string, plural form, or batch entry — passes through TranslationValidator before it's returned. This is the same invariant the catalog needs_review machinery already assumes: the AI tier can only ever produce a safe translation or nil.
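The "safe translation or nil" invariant, and the way it slots into the `human ?? AI ?? English` floor, might look something like this. The module `FloorSketch`, its method names, and the injected `valid:` callable are hypothetical; the stand-in check in the usage lines only understands positional `%N$@` / `%N$d` specifiers and is far weaker than a real validator.

```ruby
# Hypothetical sketch of the "safe translation or nil" invariant.
module FloorSketch
  # Returns the candidate only if the validator accepts it, otherwise nil.
  # `valid` is any callable taking (source, candidate) and returning true/false.
  def self.gate(source, candidate, valid:)
    return nil if candidate.nil? || candidate.strip.empty?
    return nil unless valid.call(source, candidate)

    candidate
  end

  # Ruby spelling of `human ?? AI ?? English`: the first non-nil tier wins.
  def self.resolve(source:, human:, machine:, valid:)
    human || gate(source, machine, valid: valid) || source
  end
end

# Stand-in validator for the example: compares positional specifiers only.
positional = /%\d+\$[@d]/
check = ->(source, text) { source.scan(positional).sort == text.scan(positional).sort }

FloorSketch.resolve(source: 'You have %1$d new posts', human: nil,
                    machine: 'Vous avez de nouveaux articles', valid: check)
# => "You have %1$d new posts"
```

Because `gate` can only return the candidate or nil, a caller never has to distinguish "unsafe" from "missing": both fall through to the English source.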

Plural consistency

Translating each CLDR category independently let the model drift between synonyms across forms (Polish słowo → wyrazy → słów). translate_plural sends the whole form-set in one request and instructs the model to keep one consistent stem; a live run now yields słowo / słowa / słów.
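A whole-form-set request of this kind could be assembled along these lines. Everything here is a sketch under assumptions: the module `PluralSketch`, the prompt wording, and the method names are invented for illustration, and how the schema is attached to the API request (the description mentions output_config json_schema) is deliberately not shown.

```ruby
require 'json'

# Hypothetical sketch; not the PR's translate_plural.
module PluralSketch
  # One prompt covering every target category, so the model sees all forms together.
  def self.prompt(locale:, source_forms:, categories:, comment: nil)
    lines = ["Translate this plural form-set into #{locale}.",
             'Use one consistent noun stem across all forms.',
             "Return these CLDR categories: #{categories.join(', ')}."]
    lines << "Developer comment: #{comment}" if comment
    source_forms.each { |category, text| lines << "source #{category}: #{text}" }
    lines.join("\n")
  end

  # JSON Schema requiring exactly one string per requested category.
  def self.schema(categories)
    {
      type: 'object',
      properties: categories.to_h { |c| [c, { type: 'string' }] },
      required: categories,
      additionalProperties: false
    }
  end

  # Parses a reply; returns nil unless it holds exactly the requested categories.
  def self.parse(reply_json, categories)
    data = JSON.parse(reply_json)
    return nil unless data.is_a?(Hash) && data.keys.sort == categories.sort
    return nil unless data.values.all? { |v| v.is_a?(String) }

    data
  rescue JSON::ParserError
    nil
  end
end

categories = %w[one few many] # illustrative subset of the target locale's categories
reply = '{"one":"%d słowo","few":"%d słowa","many":"%d słów"}'
PluralSketch.parse(reply, categories)
# => {"one"=>"%d słowo", "few"=>"%d słowa", "many"=>"%d słów"}
```

Each parsed form would still need to pass the placeholder gate individually before being returned.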

Verified live

Against the real API (fr/de/ja/pl): verb/adjective disambiguated from the dev comment (Suivre/Suivi, Folgen/Gefolgt), brand terms kept verbatim, German informal register (Dein), French space-before-?, plural stems consistent, and a full Batch round-trip (submit → await → collect) returning Suivre / %1$@ vues mapped back to keys and grouped by locale. Placeholders were preserved throughout — the gate never had to fire on real output.
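The collect step of the batch round-trip (raw JSONL lines mapped back to keys and grouped by locale) can be sketched as below. The module `CollectSketch`, the `id_map` structure, and the assumed line shape (`custom_id`, `result.type`, and the text under `result.message.content[0].text`) are assumptions for illustration; verify the shape against real batch output before relying on it.

```ruby
require 'json'

# Hypothetical sketch; not the PR's collect_batch.
module CollectSketch
  # `lines` is any enumerable of raw JSONL strings (one result per line).
  # `id_map` maps each custom_id to { locale:, key: } recorded at submit time.
  # Returns { locale => { key => translated_text } }, skipping failed entries.
  def self.collect(lines, id_map)
    lines.each_with_object(Hash.new { |h, k| h[k] = {} }) do |line, out|
      next if line.strip.empty?

      entry = JSON.parse(line)
      target = id_map[entry['custom_id']]
      result = entry['result'] || {}
      next unless target && result['type'] == 'succeeded'

      text = result.dig('message', 'content', 0, 'text')
      out[target[:locale]][target[:key]] = text.strip if text
    end
  end
end

id_map = { 'fr-a1' => { locale: 'fr', key: 'follow.button' } }
lines = ['{"custom_id":"fr-a1","result":{"type":"succeeded",' \
         '"message":{"content":[{"type":"text","text":"Suivre"}]}}}']
CollectSketch.collect(lines, id_map) # => {"fr"=>{"follow.button"=>"Suivre"}}
```

Parsing each line explicitly, rather than expecting typed objects, reflects the results_streaming behavior noted earlier; unknown ids and non-succeeded results are simply dropped so they fall back to English downstream.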

Not in this PR (deliberate)

Lane wiring — download_localized_plurals still calls the nil stub; switching it to AITranslator#for_plural (and a translate_all pass over the regular catalog) is a separate change.
A .strings / .xcstrings reader — these consume a {key, source, comment} array and plural form-sets; building that from the real catalog (plus any pre-processing) is upstream of here.
Glossary sourcing — Glossary is data-in; pulling the WordPress.org per-locale glossaries / style guides into it comes later.
Bulk-backfill ergonomics — model tiering for cost, and the submit-now / collect-later split for very large batches (vs. the synchronous await).

Test plan

All checks are pure Ruby — stdlib minitest, no bundle, no network:

ruby fastlane/lanes/translation_validator_test.rb (9 tests)
ruby fastlane/lanes/translation_glossary_test.rb (5 tests)
ruby fastlane/lanes/anthropic_batch_test.rb (6 tests)
ruby fastlane/lanes/ai_translator_test.rb (30 tests)
rubocop clean on all eight files
Live translation pass (needs ANTHROPIC_API_KEY + bundle install): ruby fastlane/lanes/ai_translator.rb fr "You have %1$d new posts" "Notification. %1$d is the count."

Reusable, unit-tested Ruby primitives for the AI translation tier of the localization pipeline: the service behind the `human ?? AI ?? English` floor whose AI stub was left open in #25688. Pure prompt-building and validation with the Anthropic SDK call injected, so the logic is testable without the gem or the network. Not wired into any lane yet.

- TranslationValidator: format-specifier safety gate. A translation must preserve the source's placeholders (count and type; positional reordering allowed), or it is rejected and falls back to English.
- Glossary: brand do-not-translate list plus per-locale terms and register.
- AITranslator: single-string, per-key plural form-set (one consistent stem across CLDR forms), and batched string translation, with structured-output (output_config) enforcement.
- AnthropicBatch: Message Batches submit/await/results/collect for bulk backfill.

50 unit tests, rubocop clean.

The pure-Ruby unit suites (TranslationValidator, Glossary, AnthropicBatch, AITranslator) weren't executed by any pipeline step — the "Unit Tests" jobs are the Xcode/XCTest suites, and rubocop (via Danger) only lints them. Add a lightweight Buildkite step that runs each fastlane/lanes/*_test.rb with plain ruby (stdlib minitest — no Xcode, no app build, no bundle). Runs unconditionally rather than behind should-skip-job.sh --job-type validation, which skips on tooling-only changes — i.e. exactly the PRs that touch these files.

jkmassel added 2 commits June 26, 2026 11:35

jkmassel force-pushed the jkmassel/claude-string-translation branch from 28b37b5 to 7412850 Compare June 26, 2026 17:37

This was referenced Jun 26, 2026

Localization: wire AI plural translation into the GlotPress reverse fold #25710 (Draft)

Localization: stage regular-string translations into Localizable.xcstrings (manual) #25713 (Draft)

jkmassel wants to merge 2 commits into trunk from jkmassel/claude-string-translation

3 participants
