-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Localization: String Catalog pipeline (plurals + catalog generation) #25688
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
40040cb
ad2a555
b4df4fa
61f521b
c8138f1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| #!/bin/bash -eu | ||
|
|
||
| # Verifies that the build-free String Catalog generation (xcstringstool extract/sync) captures every string | ||
| # the legacy genstrings flow finds over the same source — guarding against extraction regressions (e.g. the | ||
| # same-basename .stringsdata collision). Runs on the `mac` queue (needs Xcode's genstrings/xcstringstool). | ||
|
|
||
| if "$(dirname "${BASH_SOURCE[0]}")/should-skip-job.sh" --job-type validation; then | ||
| exit 0 | ||
| fi | ||
|
|
||
| echo "--- :rubygems: Setting up Gems" | ||
| install_gems | ||
|
|
||
| echo "--- :writing_hand: Copy Files" | ||
| mkdir -pv ~/.configure/wordpress-ios/secrets | ||
| cp -v fastlane/env/project.env-example ~/.configure/wordpress-ios/secrets/project.env | ||
|
Comment on lines
+11
to
+16
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Side note: I looked at https://github.com/wordpress-mobile/WordPress-iOS/blob/d7fec8765364c9a7f7f8330fe5840ddbdef38d99/.buildkite/commands/shared-set-up.sh to understand if it would have been better here (I wouldn't have) and noticed that script could do with the |
||
|
|
||
| echo "--- :package: Generate Localizable.xcstrings from source" | ||
| bundle exec fastlane ios generate_strings_catalog | ||
|
|
||
| echo "--- :mag: Verify the catalog covers every genstrings string" | ||
| bundle exec fastlane ios verify_strings_catalog | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When adding strings that support plural, in addition to the Swift code, do we need to manually update this file, too?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, it actually kind of goes the other way – you add strings to the catalogue then use the generated symbols in code: https://developer.apple.com/documentation/xcode/using-generated-localizable-symbols-in-your-code. But I think that's once we're fully using the catalogues, in the interim there's more bookkeeping. We don't have any real plural strings yet, so right now this is just groundwork.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Gotcha. I'm a little bit behind on this 😅 . I was only aware of the stringdicts file for plural support. Does that mean we'll have similar files in the Swift package modules? Which means the "forward" and "reverse" processes will need to handle multiple sources of the original localizable strings?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Same! To be fair. Our tooling has been behind for a long time (not to shift blame, but part of it was GlotPress only doing |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,40 @@ | ||
| { | ||
| "sourceLanguage" : "en", | ||
| "strings" : { | ||
| "blogging.reminders.weeklyCount" : { | ||
| "comment" : "Number of blogging reminders per week. %1$d is the count.", | ||
| "localizations" : { | ||
| "en" : { | ||
| "variations" : { | ||
| "plural" : { | ||
| "one" : { | ||
| "stringUnit" : { "state" : "translated", "value" : "%1$d time a week" } | ||
| }, | ||
| "other" : { | ||
| "stringUnit" : { "state" : "translated", "value" : "%1$d times a week" } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| }, | ||
| "editor.textCounter.wordCount" : { | ||
| "comment" : "Number of words in the editor. %1$ld is the count.", | ||
| "localizations" : { | ||
| "en" : { | ||
| "variations" : { | ||
| "plural" : { | ||
| "one" : { | ||
| "stringUnit" : { "state" : "translated", "value" : "%1$ld word" } | ||
| }, | ||
| "other" : { | ||
| "stringUnit" : { "state" : "translated", "value" : "%1$ld words" } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| }, | ||
| "version" : "1.0" | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,98 @@ | ||
| # frozen_string_literal: true | ||
|
|
||
| require 'json' | ||
|
|
||
| # Helpers for the build-free catalog generation pipeline (genstrings-coverage verification + needs_review | ||
| # reconciliation). Plain Ruby with no fastlane dependencies, so it's unit-testable directly — the lanes in | ||
| # `localization_catalog.rb` call into it. | ||
| module CatalogHelper | ||
| module_function | ||
|
|
||
| # --- coverage verification: catalog vs the legacy genstrings output ------------------------------------- | ||
|
|
||
| # printf-style format specifier (incl. positional %N$ and length modifiers). The space flag (`% d`) is | ||
| # deliberately excluded: it's vanishingly rare in our strings, and allowing it makes `% <letter>` match | ||
| # inside ordinary prose ("100% sure" → "% s"), corrupting the canonical form used for the coverage compare. | ||
| FORMAT_SPECIFIER = /%(?:\d+\$)?[#0\-+']*(?:\d+|\*)?(?:\.(?:\d+|\*))?(?:hh|h|ll|l|L|q|z|t|j)?[@dDiuUxXoOfFeEgGaAcCsSpn%]/ | ||
|
|
||
| # Keys present in `reference` (e.g. genstrings output) but absent from `catalog_keys`, compared on the | ||
| # format-canonical form (so `%li` vs `%1$li` don't read as false gaps). Both lists arrive already decoded — | ||
| # genstrings keys via `L10nHelper.read_strings_file_as_hash` (Apple's `plutil`), catalog keys straight from | ||
| # the parsed JSON — so there's no unescaping to do here. | ||
| def coverage_gap(reference, catalog_keys) | ||
| catalog_canonical = catalog_keys.to_set { |key| canonical(key) } | ||
| reference.reject { |key| catalog_canonical.include?(canonical(key)) } | ||
| end | ||
|
|
||
| # Collapse format specifiers to a single token so source-form (%li) and normalized (%1$li) compare equal. | ||
| def canonical(key) | ||
| key.gsub(FORMAT_SPECIFIER, "\u0001") | ||
| end | ||
|
|
||
| # --- needs_review reconciliation ---------------------------------------------------------------------- | ||
|
|
||
|
|
||
| # `xcstringstool sync` does NOT reconcile an existing key whose English source VALUE changed: it leaves | ||
| # both the stored English value and the affected translations untouched (verified — source "Settings" → | ||
| # "Preferences" left en="Settings" and fr="translated"). The in-Xcode build does this reconciliation; the | ||
| # standalone CLI does not. This closes that gap: where the freshly-extracted English differs from what the | ||
| # catalog stores, it updates the English value and flips that key's translations from `translated` to | ||
| # `needs_review` (so the AI/human pipeline re-checks them). | ||
| # | ||
| # Out of scope here (handled elsewhere): English-as-key strings — editing their text changes the KEY, which | ||
| # sync already handles as new/stale; and plural entries, whose English is itself a plural variation, so | ||
| # `reconcile_entry!` bails (no flat English `stringUnit`) — those live in the separate plurals catalog. | ||
| # Translation-side device/width variations of a regular string ARE reconciled (see `string_units`). | ||
| # | ||
| # @param catalog [Hash] parsed `.xcstrings`, mutated in place | ||
| # @param current_en [Hash{String=>String}] key => freshly-extracted English value | ||
| # @return [Array<String>] keys that were reconciled (English updated + translations re-flagged) | ||
| def reconcile_source_changes!(catalog, current_en) | ||
| (catalog['strings'] || {}).filter_map do |key, entry| | ||
| key if reconcile_entry!(entry, current_en[key]) | ||
| end | ||
| end | ||
|
|
||
| # Reconcile one entry against its freshly-extracted English value. Returns the entry (truthy) if it | ||
| # changed, nil otherwise — matching the Ruby bang-method convention (cf. String#gsub!). | ||
| def reconcile_entry!(entry, new_value) | ||
| return if new_value.nil? | ||
|
|
||
| english = entry.dig('localizations', 'en', 'stringUnit') | ||
| return if english.nil? || english['value'] == new_value | ||
|
|
||
| english['value'] = new_value | ||
| flag_translations_for_review!(entry['localizations']) | ||
| entry | ||
| end | ||
|
|
||
| def flag_translations_for_review!(localizations) | ||
| localizations.each do |locale, body| | ||
| next if locale == 'en' || body.nil? | ||
|
|
||
| string_units(body).each do |unit| | ||
| unit['state'] = 'needs_review' if unit['state'] == 'translated' | ||
| end | ||
| end | ||
| end | ||
|
|
||
| # All stringUnits in a localization body, whether stored flat (`stringUnit`) or nested under one or more | ||
| # `variations` (a regular string's translation can be varied by device/width, and variations can nest). | ||
| # Returns the unit hashes themselves so a caller can flip their `state` in place — a single top-level | ||
| # `body['stringUnit']` lookup would miss the varied leaves entirely. | ||
| def string_units(node) | ||
| return [] unless node.is_a?(Hash) | ||
|
|
||
| units = [] | ||
| units << node['stringUnit'] if node['stringUnit'].is_a?(Hash) | ||
| variations = node['variations'] | ||
| if variations.is_a?(Hash) | ||
| variations.each_value do |cases| | ||
| next unless cases.is_a?(Hash) | ||
|
|
||
| cases.each_value { |child| units.concat(string_units(child)) } | ||
| end | ||
| end | ||
| units | ||
| end | ||
| end | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It never crossed my mind before AI made generating helper code cheap, but we can obviously write tests for this kind of code, too. I've been getting my AI to generate tests for new fastlane automation, e.g. in Studio |
||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't hurt, but, is it useful? All this info is more pertinent in the
generate_strings_catalogandverify_strings_catalogheaders.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind having it here so someone can look at the script and know what it's up to 🤷