Localization: AI translation primitives #25705

Draft
jkmassel wants to merge 2 commits into trunk from jkmassel/claude-string-translation

@jkmassel

Contributor

Reusable Ruby primitives for the AI translation tier of the localization pipeline: the service behind the human ?? AI ?? English floor whose AI stub (ai_translate_plural → nil) was left open in #25688. All of it is pure prompt-building and validation with the Anthropic SDK call injected, so every line of logic is unit-testable without the gem or the network; the live SDK wiring is one thin factory.

Nothing is wired into a lane yet — these are the building blocks, deliberately decoupled from the GlotPress / catalog plumbing that's still in flux.

Summary

  • TranslationValidator: the format-specifier safety gate. A machine translation must preserve the source's printf/NSString arguments exactly (count + type; positional %1$@ may reorder, which is the whole point). A mismatch is rejected, so a broken translation falls back to English rather than shipping a crash in a locale no one on the team can read.
  • Glossary: brand do-not-translate list (WordPress, Jetpack, …) plus per-locale preferred terms and a register note, rendered into the prompt. Pure data in; sourcing it (the WordPress.org per-locale glossaries / style guides) is pre-processing handed in later.
  • AITranslator: three translation shapes, namely translate (one string), translate_plural (a whole CLDR form-set in one request, so the model keeps one stem across forms), and translate_all (batched regular strings). Structured outputs (output_config json_schema) enforce the reply shape on the plural and batch paths.
  • AnthropicBatch: the async Message Batches path for a bulk backfill (~50% cheaper), run as submit → await (poll) → results → collect_batch, plus all the SDK-shape glue, shared with the sync path so the request shape can't drift between them.
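To make the Glossary bullet above concrete, here is a minimal sketch of rendering a do-not-translate list, per-locale preferred terms, and a register note into prompt text. GlossarySketch, its field names, to_prompt, and the sample German term are all hypothetical illustrations, not the PR's actual Glossary API; the sketch also assumes the register note is stored per locale.

```ruby
# Hypothetical sketch: field names and to_prompt are assumptions, not the PR's Glossary.
GlossarySketch = Struct.new(:do_not_translate, :preferred_terms, :register_notes,
                            keyword_init: true) do
  # Render the glossary for one locale as plain-text prompt lines.
  def to_prompt(locale)
    lines = []
    unless do_not_translate.empty?
      lines << "Keep these brand terms verbatim: #{do_not_translate.join(', ')}."
    end
    # Locales without preferred terms simply contribute no term lines.
    preferred_terms.fetch(locale, {}).each do |source, target|
      lines << %(Translate "#{source}" as "#{target}".)
    end
    note = register_notes[locale]
    lines << "Register: #{note}" if note
    lines.join("\n")
  end
end

# Illustrative data only; real terms would come from the per-locale glossaries.
glossary = GlossarySketch.new(
  do_not_translate: %w[WordPress Jetpack],
  preferred_terms: { 'de' => { 'post' => 'Beitrag' } },
  register_notes: { 'de' => 'informal (du/dein)' }
)
puts glossary.to_prompt('de')
```

Keeping the glossary as plain data with a pure rendering method means the prompt text can be asserted on directly in a unit test, with no model call involved.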

Design notes

The SDK call is injected

AITranslator takes a complete: callable (and the batch path takes a client), so prompt-building, validation, batching, and result-assembly are all exercised by unit tests with a canned reply. AITranslator.with_anthropic / AnthropicBatch.client build the live instances (default model claude-opus-4-8). The two live-API bugs we hit — a custom_id that didn't match ^[a-zA-Z0-9_-]{1,64}$, and results_streaming yielding raw JSONL strings rather than typed objects — were both invisible to a permissive fake client and only caught by real calls. The SDK seams are now live-verified, not just fake-tested.
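As a rough illustration of the injection described above, the sketch below passes the model call in as a callable so a test can substitute a canned reply and inspect the prompt. TranslatorSketch, its prompt wording, and the method signature are hypothetical; only the idea of a complete: callable comes from the PR description.

```ruby
# Hypothetical sketch of constructor injection; not the PR's AITranslator.
class TranslatorSketch
  # complete: any callable that takes a prompt String and returns the reply String.
  def initialize(complete:)
    @complete = complete
  end

  def translate(source, locale:, comment: nil)
    prompt = "Translate into #{locale}: #{source}"
    prompt += "\nDeveloper comment: #{comment}" if comment
    reply = @complete.call(prompt).to_s.strip
    reply.empty? ? nil : reply
  end
end

# In a unit test the "model" is a lambda that records the prompt and returns a canned reply.
seen_prompts = []
fake_complete = lambda do |prompt|
  seen_prompts << prompt
  'Suivre'
end

translator = TranslatorSketch.new(complete: fake_complete)
result = translator.translate('Follow', locale: 'fr', comment: 'Button title (verb).')
puts result
puts seen_prompts.first
```

A fake this permissive accepts any request shape, which is why the two live-API bugs mentioned above could only surface against the real service.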

The placeholder gate is a hard floor

Every machine cell — single string, plural form, or batch entry — passes through TranslationValidator before it's returned. This is the same invariant the catalog needs_review machinery already assumes: the AI tier can only ever produce a safe translation or nil.
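The "safe translation or nil" contract could be sketched roughly as follows. PlaceholderGateSketch and its regular expression are assumptions made for illustration: the pattern covers common printf/NSString specifiers only and is not the PR's TranslationValidator. Arguments without an explicit position are numbered in order of appearance, so positional specifiers may reorder while non-positional ones may not.

```ruby
# Hypothetical sketch of a placeholder gate; the specifier pattern is a simplification.
module PlaceholderGateSketch
  # Group 1: optional position (the 1 in %1$@). Group 2: length modifier + conversion.
  # The trailing alternative matches a literal "%%", which consumes no argument.
  SPECIFIER = /%(?:(\d+)\$)?[-+0]*\d*(?:\.\d+)?((?:hh|h|ll|l|q|L|z|t|j)?[@dDiuUxXoOfFeEgGaAcCsSp])|%%/

  # Sorted list of [argument_index, type] pairs for one string.
  def self.signature(text)
    next_implicit = 0
    text.scan(SPECIFIER).filter_map do |position, type|
      next if type.nil? # "%%" matched: both groups are nil
      index = position ? position.to_i : (next_implicit += 1)
      [index, type]
    end.sort
  end

  def self.safe?(source, translation)
    signature(source) == signature(translation)
  end

  # Returns the translation when it is safe, otherwise nil.
  def self.gate(source, translation)
    translation if translation && safe?(source, translation)
  end
end

p PlaceholderGateSketch.gate('You have %1$d new posts', 'Vous avez %1$d nouveaux articles')
p PlaceholderGateSketch.gate('%1$@ liked %2$@', '%2$@ a été aimé par %1$@') # reordered, still safe
p PlaceholderGateSketch.gate('%d views', '%@ vues')                         # => nil (type changed)
p PlaceholderGateSketch.gate('%d views', 'vues')                            # => nil (argument dropped)
```

Because the gate returns nil rather than raising, a caller can chain it into a fallback such as human || ai || english without special-casing rejected output.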

Plural consistency

Translating each CLDR category independently let the model drift between synonyms across forms (Polish słowo / wyrazy / słów). translate_plural sends the whole form-set in one request and instructs the model to keep one consistent stem; a live check confirmed it now yields słowo / słowa / słów.
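One way to picture the single-request approach: build one prompt and one JSON Schema covering every CLDR category for the locale, then accept the reply only if every category comes back. PluralRequestSketch, its prompt wording, and the canned reply are hypothetical; how the schema is attached to the request (the output_config mentioned earlier) is not shown here.

```ruby
require 'json'

# Hypothetical sketch; module, method names, and prompt text are assumptions.
module PluralRequestSketch
  # A JSON Schema requiring exactly one string per CLDR category.
  def self.schema(categories)
    {
      type: 'object',
      properties: categories.to_h { |category| [category, { type: 'string' }] },
      required: categories,
      additionalProperties: false
    }
  end

  # One prompt for the whole form-set, so the model sees all forms together.
  def self.prompt(locale, source_forms, categories)
    <<~PROMPT
      Translate this plural form-set into #{locale}.
      Use one consistent noun stem across all forms.
      Source forms: #{JSON.generate(source_forms)}
      Reply with a JSON object with exactly these keys: #{categories.join(', ')}.
    PROMPT
  end

  # Returns { category => translation }, or nil if the reply is malformed or incomplete.
  def self.parse(reply_json, categories)
    forms = JSON.parse(reply_json)
    return nil unless forms.is_a?(Hash)
    return nil unless categories.all? { |c| forms[c].is_a?(String) && !forms[c].empty? }

    forms.slice(*categories)
  rescue JSON::ParserError
    nil
  end
end

polish = %w[one few many other] # CLDR plural categories for Polish
source = { 'one' => '%d word', 'other' => '%d words' }
puts PluralRequestSketch.prompt('pl', source, polish)

# Canned, illustrative reply standing in for the model.
canned = '{"one":"%d słowo","few":"%d słowa","many":"%d słów","other":"%d słowa"}'
p PluralRequestSketch.parse(canned, polish)
```

Each returned form would still need to pass the placeholder gate before use; this sketch only covers request building and reply shape.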

Verified live

Against the real API (fr/de/ja/pl): verb/adjective disambiguated from the dev comment (Suivre/Suivi, Folgen/Gefolgt), brand terms kept verbatim, German informal register (Dein), French space-before-?, plural stems consistent, and a full Batch round-trip (submit → await → collect) returning Suivre / %1$@ vues mapped back to keys and grouped by locale. Placeholders were preserved throughout — the gate never had to fire on real output.
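The "mapped back to keys and grouped by locale" step can be sketched as below, under the assumption that each request gets an opaque custom_id and a lookup table remembers which locale and key it stands for. The helper names, the req-N id scheme, and the sample keys are hypothetical. Opaque ids sidestep the custom_id constraint quoted earlier, since catalog keys may contain dots or exceed 64 characters; note that Ruby needs \A and \z where the quoted pattern uses ^ and $, because ^ and $ match at line boundaries in Ruby.

```ruby
# Hypothetical sketch; helper names and the "req-N" id scheme are assumptions.
CUSTOM_ID_PATTERN = /\A[a-zA-Z0-9_-]{1,64}\z/

# Assign a pattern-safe id to every (locale, key) pair and remember the mapping.
def build_id_index(locales, keys)
  index = {}
  locales.product(keys).each_with_index do |(locale, key), i|
    id = "req-#{i}"
    raise ArgumentError, "invalid custom_id: #{id}" unless id.match?(CUSTOM_ID_PATTERN)

    index[id] = { locale: locale, key: key }
  end
  index
end

# results: Hash of custom_id => translated text, as collected from a finished batch.
def group_by_locale(index, results)
  results.each_with_object({}) do |(id, text), grouped|
    entry = index.fetch(id)
    (grouped[entry[:locale]] ||= {})[entry[:key]] = text
  end
end

index = build_id_index(%w[fr de], ['reader.follow', 'stats.views'])
# Canned results standing in for a real batch; one request is deliberately missing.
results = { 'req-0' => 'Suivre', 'req-1' => '%1$@ vues', 'req-2' => 'Folgen' }
p group_by_locale(index, results)
```

A missing id simply leaves that key absent for the locale, which fits the fallback-to-English behaviour described above.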

Not in this PR (deliberate)

  • Lane wiring: download_localized_plurals still calls the nil stub; switching it to AITranslator#for_plural (and a translate_all pass over the regular catalog) is a separate change.
  • A .strings / .xcstrings reader: these consume a {key, source, comment} array and plural form-sets; building that from the real catalog (plus any pre-processing) is upstream of here.
  • Glossary sourcing: Glossary is data-in; pulling the WordPress.org per-locale glossaries / style guides into it comes later.
  • Bulk-backfill ergonomics: model tiering for cost, and the submit-now / collect-later split for very large batches (vs. the synchronous await).

Test plan

The unit checks are pure Ruby (stdlib minitest, no bundle, no network); only the live pass at the end needs the gem and an API key:

  • ruby fastlane/lanes/translation_validator_test.rb: 9 tests
  • ruby fastlane/lanes/translation_glossary_test.rb: 5 tests
  • ruby fastlane/lanes/anthropic_batch_test.rb: 6 tests
  • ruby fastlane/lanes/ai_translator_test.rb: 30 tests
  • rubocop clean on all eight files
  • Live translation pass (needs ANTHROPIC_API_KEY + bundle install; single quotes keep the shell from expanding $d): ruby fastlane/lanes/ai_translator.rb fr 'You have %1$d new posts' 'Notification. %1$d is the count.'

Related

@dangermattic

Collaborator
1 Warning
⚠️ This PR is larger than 500 lines of changes. Please consider splitting it into smaller PRs for easier and faster reviews.
1 Message
📖 This PR is still a Draft: some checks will be skipped.

Generated by 🚫 Danger

@wpmobilebot

wpmobilebot commented Jun 26, 2026

Contributor
📲 You can test the changes from this Pull Request in Jetpack by scanning the QR code below to install the corresponding build.
App Name: Jetpack
Configuration: Release-Alpha
Build Number: 32867
Version: PR #25705
Bundle ID: com.jetpack.alpha
Commit: 7412850
Installation URL: 2brteotvt3r8g
Automatticians: You can use our internal self-serve MC tool to give yourself access to those builds if needed.

@wpmobilebot

wpmobilebot commented Jun 26, 2026

Contributor
📲 You can test the changes from this Pull Request in WordPress by scanning the QR code below to install the corresponding build.
App Name: WordPress
Configuration: Release-Alpha
Build Number: 32867
Version: PR #25705
Bundle ID: org.wordpress.alpha
Commit: 7412850
Installation URL: 18r3gimi0qglo
Automatticians: You can use our internal self-serve MC tool to give yourself access to those builds if needed.

jkmassel added 2 commits June 26, 2026 11:35
Reusable, unit-tested Ruby primitives for the AI translation tier of the
localization pipeline — the service behind the `human ?? AI ?? English` floor
whose AI stub was left open in #25688. Pure prompt-building and validation with
the Anthropic SDK call injected, so the logic is testable without the gem or the
network. Not wired into any lane yet.

- TranslationValidator: format-specifier safety gate — a translation must
  preserve the source's placeholders (count and type; positional reordering
  allowed), or it is rejected and falls back to English.
- Glossary: brand do-not-translate list plus per-locale terms and register.
- AITranslator: single-string, per-key plural form-set (one consistent stem
  across CLDR forms), and batched string translation, with structured-output
  (output_config) enforcement.
- AnthropicBatch: Message Batches submit/await/results/collect for bulk backfill.

50 unit tests, rubocop clean.

The pure-Ruby unit suites (TranslationValidator, Glossary, AnthropicBatch,
AITranslator) weren't executed by any pipeline step — the "Unit Tests" jobs are
the Xcode/XCTest suites, and rubocop (via Danger) only lints them. Add a
lightweight Buildkite step that runs each fastlane/lanes/*_test.rb with plain
ruby (stdlib minitest — no Xcode, no app build, no bundle).

Runs unconditionally rather than behind should-skip-job.sh --job-type validation,
which skips on tooling-only changes — i.e. exactly the PRs that touch these files.