Commit 42e2159

Add build-free Localizable.xcstrings generation + CI coverage gate
Replaces genstrings with Apple's `xcstringstool extract`/`sync` to generate Localizable.xcstrings from source with no app build, keeping AppLocalizedString and its call sites unchanged. Each extract chunk gets its own output directory so same-basename `.stringsdata` (e.g. the two NSDate+Helpers.swift) don't overwrite each other and silently drop strings. `sync` leaves a key's translations untouched when its English source value changes, so the lane reconciles them to `needs_review` afterward, walking device/width variations as well as flat units. Adds a "Verify String Catalog Coverage" CI step that runs genstrings over the same files and fails if the catalog is missing any key, comparing on a format-canonical form. The catalog is generated as an artifact, not wired into the runtime build.
1 parent 586df38 commit 42e2159

6 files changed

Lines changed: 332 additions & 0 deletions
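
The "format-canonical form" comparison named in the commit message can be sketched in isolation. This is a minimal standalone sketch, not the lane itself: the pattern is copied from `FORMAT_SPECIFIER` in `catalog_helper.rb`, and the sample keys are hypothetical.

```ruby
require 'set'

# Copied from FORMAT_SPECIFIER in catalog_helper.rb so the sketch runs on its own.
SPECIFIER = /%(?:\d+\$)?[#0\-+']*(?:\d+|\*)?(?:\.(?:\d+|\*))?(?:hh|h|ll|l|L|q|z|t|j)?[@dDiuUxXoOfFeEgGaAcCsSpn%]/

# Collapse every format specifier to one placeholder so `%li` and `%1$li` compare equal.
def canonical(key)
  key.gsub(SPECIFIER, "\u0001")
end

# Reference keys whose canonical form is absent from the catalog.
def coverage_gap(reference, catalog_keys)
  seen = catalog_keys.to_set { |key| canonical(key) }
  reference.reject { |key| seen.include?(canonical(key)) }
end

# Hypothetical sample keys: genstrings keeps the source form, the catalog stores the positional form.
genstrings_keys = ['%li posts', 'Hello %@', 'Only in genstrings']
catalog_keys    = ['%1$li posts', 'Hello %@']

puts coverage_gap(genstrings_keys, catalog_keys).inspect
# prints ["Only in genstrings"]
```

Because the space flag is left out of the pattern, a key such as `"100% sure"` passes through `canonical` unchanged instead of having `% s` swallowed as a specifier.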

.buildkite/commands/verify-strings-catalog.sh

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+#!/bin/bash -eu
+
+# Verifies that the build-free String Catalog generation (xcstringstool extract/sync) captures every string
+# the legacy genstrings flow finds over the same source, guarding against extraction regressions (e.g. the
+# same-basename .stringsdata collision). Runs on the `mac` queue (needs Xcode's genstrings/xcstringstool).
+
+if "$(dirname "${BASH_SOURCE[0]}")/should-skip-job.sh" --job-type validation; then
+  exit 0
+fi
+
+echo "--- :rubygems: Setting up Gems"
+install_gems
+
+echo "--- :writing_hand: Copy Files"
+mkdir -pv ~/.configure/wordpress-ios/secrets
+cp -v fastlane/env/project.env-example ~/.configure/wordpress-ios/secrets/project.env
+
+echo "--- :package: Generate Localizable.xcstrings from source"
+bundle exec fastlane ios generate_strings_catalog
+
+echo "--- :mag: Verify the catalog covers every genstrings string"
+bundle exec fastlane ios verify_strings_catalog

.buildkite/pipeline.yml

Lines changed: 7 additions & 0 deletions
@@ -130,6 +130,13 @@ steps:
     command: .buildkite/commands/lint-localized-strings-format.sh
     plugins: [$CI_TOOLKIT_PLUGIN]

+  - label: ":mag: Verify String Catalog Coverage"
+    command: .buildkite/commands/verify-strings-catalog.sh
+    plugins: [$CI_TOOLKIT_PLUGIN]
+    notify:
+      - github_commit_status:
+          context: "Verify String Catalog Coverage"
+
   #################
   # Claude Build Analysis - dynamically uploaded so Build result conditions evaluate at runtime after the wait
   #################

RELEASE-NOTES.txt

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 27.1
 -----
+* [*] [build tooling] Add a String Catalog localization pipeline (plurals + build-free catalog generation) with a CI coverage gate [#25688]


 27.0

fastlane/Fastfile

Lines changed: 2 additions & 0 deletions
@@ -166,6 +166,8 @@ end
 import 'lanes/build.rb'
 import 'lanes/codesign.rb'
 import 'lanes/localization.rb'
+import 'lanes/localization_plurals.rb'
+import 'lanes/localization_catalog.rb'
 import 'lanes/release.rb'
 import 'lanes/screenshots.rb'

fastlane/lanes/catalog_helper.rb

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+# frozen_string_literal: true
+
+require 'json'
+
+# Helpers for the build-free catalog generation pipeline (genstrings-coverage verification + needs_review
+# reconciliation). Plain Ruby with no fastlane dependencies, so it's unit-testable directly; the lanes in
+# `localization_catalog.rb` call into it.
+module CatalogHelper
+  module_function
+
+  # --- coverage verification: catalog vs the legacy genstrings output -------------------------------------
+
+  # printf-style format specifier (incl. positional %N$ and length modifiers). The space flag (`% d`) is
+  # deliberately excluded: it's vanishingly rare in our strings, and allowing it makes `% <letter>` match
+  # inside ordinary prose ("100% sure" → "% s"), corrupting the canonical form used for the coverage compare.
+  FORMAT_SPECIFIER = /%(?:\d+\$)?[#0\-+']*(?:\d+|\*)?(?:\.(?:\d+|\*))?(?:hh|h|ll|l|L|q|z|t|j)?[@dDiuUxXoOfFeEgGaAcCsSpn%]/
+
+  # Keys present in `reference` (e.g. genstrings output) but absent from `catalog_keys`, compared on the
+  # format-canonical form (so `%li` vs `%1$li` don't read as false gaps). Both lists arrive already decoded:
+  # genstrings keys via `L10nHelper.read_strings_file_as_hash` (Apple's `plutil`), catalog keys straight from
+  # the parsed JSON, so there's no unescaping to do here.
+  def coverage_gap(reference, catalog_keys)
+    catalog_canonical = catalog_keys.to_set { |key| canonical(key) }
+    reference.reject { |key| catalog_canonical.include?(canonical(key)) }
+  end
+
+  # Collapse format specifiers to a single token so source-form (%li) and normalized (%1$li) compare equal.
+  def canonical(key)
+    key.gsub(FORMAT_SPECIFIER, "\u0001")
+  end
+
+  # --- needs_review reconciliation ----------------------------------------------------------------------
+
+
+  # `xcstringstool sync` does NOT reconcile an existing key whose English source VALUE changed: it leaves
+  # both the stored English value and the affected translations untouched (verified: source "Settings" →
+  # "Preferences" left en="Settings" and fr="translated"). The in-Xcode build does this reconciliation; the
+  # standalone CLI does not. This closes that gap: where the freshly-extracted English differs from what the
+  # catalog stores, it updates the English value and flips that key's translations from `translated` to
+  # `needs_review` (so the AI/human pipeline re-checks them).
+  #
+  # Out of scope here (handled elsewhere): English-as-key strings (editing their text changes the KEY, which
+  # sync already handles as new/stale); and plural entries, whose English is itself a plural variation, so
+  # `reconcile_entry!` bails (no flat English `stringUnit`); those live in the separate plurals catalog.
+  # Translation-side device/width variations of a regular string ARE reconciled (see `string_units`).
+  #
+  # @param catalog [Hash] parsed `.xcstrings`, mutated in place
+  # @param current_en [Hash{String=>String}] key => freshly-extracted English value
+  # @return [Array<String>] keys that were reconciled (English updated + translations re-flagged)
+  def reconcile_source_changes!(catalog, current_en)
+    (catalog['strings'] || {}).filter_map do |key, entry|
+      key if reconcile_entry!(entry, current_en[key])
+    end
+  end
+
+  # Reconcile one entry against its freshly-extracted English value. Returns the entry (truthy) if it
+  # changed, nil otherwise, matching the Ruby bang-method convention (cf. String#gsub!).
+  def reconcile_entry!(entry, new_value)
+    return if new_value.nil?
+
+    english = entry.dig('localizations', 'en', 'stringUnit')
+    return if english.nil? || english['value'] == new_value
+
+    english['value'] = new_value
+    flag_translations_for_review!(entry['localizations'])
+    entry
+  end
+
+  def flag_translations_for_review!(localizations)
+    localizations.each do |locale, body|
+      next if locale == 'en' || body.nil?
+
+      string_units(body).each do |unit|
+        unit['state'] = 'needs_review' if unit['state'] == 'translated'
+      end
+    end
+  end
+
+  # All stringUnits in a localization body, whether stored flat (`stringUnit`) or nested under one or more
+  # `variations` (a regular string's translation can be varied by device/width, and variations can nest).
+  # Returns the unit hashes themselves so a caller can flip their `state` in place; a single top-level
+  # `body['stringUnit']` lookup would miss the varied leaves entirely.
+  def string_units(node)
+    return [] unless node.is_a?(Hash)
+
+    units = []
+    units << node['stringUnit'] if node['stringUnit'].is_a?(Hash)
+    variations = node['variations']
+    if variations.is_a?(Hash)
+      variations.each_value do |cases|
+        next unless cases.is_a?(Hash)
+
+        cases.each_value { |child| units.concat(string_units(child)) }
+      end
+    end
+    units
+  end
+end
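
To illustrate the `needs_review` reconciliation on a concrete entry, here is a minimal standalone sketch. It condenses `reconcile_entry!`, `flag_translations_for_review!`, and `string_units` into two functions for brevity (the `reconcile!` name and its boolean return are assumptions of this sketch, not the helper's API), and the sample entry with its translations is hypothetical.

```ruby
require 'json'

# Collect every stringUnit hash under a localization body, flat or nested in variations.
def string_units(node)
  return [] unless node.is_a?(Hash)

  units = node['stringUnit'].is_a?(Hash) ? [node['stringUnit']] : []
  variations = node['variations'].is_a?(Hash) ? node['variations'] : {}
  variations.each_value do |cases|
    next unless cases.is_a?(Hash)

    cases.each_value { |child| units.concat(string_units(child)) }
  end
  units
end

# Update the stored English value and re-flag translated units; returns true when something changed.
def reconcile!(entry, new_english)
  english = entry.dig('localizations', 'en', 'stringUnit')
  return false if new_english.nil? || english.nil? || english['value'] == new_english

  english['value'] = new_english
  entry['localizations'].each do |locale, body|
    next if locale == 'en'

    string_units(body).each { |unit| unit['state'] = 'needs_review' if unit['state'] == 'translated' }
  end
  true
end

# Hypothetical catalog entry: fr is stored flat, de is varied by device.
entry = JSON.parse(<<~JSON)
  {
    "localizations": {
      "en": { "stringUnit": { "state": "translated", "value": "Settings" } },
      "fr": { "stringUnit": { "state": "translated", "value": "Réglages" } },
      "de": {
        "variations": {
          "device": {
            "iphone": { "stringUnit": { "state": "translated", "value": "Einstellungen" } },
            "other": { "stringUnit": { "state": "new", "value": "" } }
          }
        }
      }
    }
  }
JSON

reconcile!(entry, 'Preferences')
puts entry.dig('localizations', 'fr', 'stringUnit', 'state') # needs_review
puts entry.dig('localizations', 'de', 'variations', 'device', 'iphone', 'stringUnit', 'state') # needs_review
puts entry.dig('localizations', 'de', 'variations', 'device', 'other', 'stringUnit', 'state') # new
```

Only units already marked `translated` are flipped, so the `new` variation keeps its state, and a second call with the same English value is a no-op.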

fastlane/lanes/localization_catalog.rb

Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
+# frozen_string_literal: true
+
+require 'json'
+require 'tmpdir'
+require 'fileutils'
+require_relative 'catalog_helper'
+
+#################################################
+# Catalog generation (forward / extraction)
+#
+# Build-free replacement for the genstrings step: extract the app's ENGLISH source strings into a
+# String Catalog using Apple's own `xcstringstool extract` + `sync` (NOT a full app build). This is the
+# first step of moving the localization backing store to String Catalogs for the AI translation pipeline.
+#
+# `xcstringstool extract --legacy-localizable-strings --modern-localizable-strings -s AppLocalizedString`
+# recognizes NSLocalizedString + ObjC siblings (legacy), `String(localized:)`/`LocalizedStringResource`
+# (modern, so catalog-native code is covered the moment it's written), and the app's custom
+# `AppLocalizedString` routine (the same `-s` flag genstrings uses today; call sites stay UNCHANGED).
+# `sync` then MERGES all the extracted `.stringsdata` (every source that targets the Localizable table) into
+# the one catalog, deduped by key, applying the per-string state machine (new / extracted_with_value / stale).
+#
+# NOTE: this lane only GENERATES the English-source catalog as the future backing store. It writes to a
+# non-synchronized folder so it is NOT yet a build member (the runtime still uses the committed
+# `Localizable.strings`). Wiring the catalog into the target and retiring the legacy `.strings` is a separate
+# migration step.
+#################################################
+
+# Generated English-source catalog (Localizable table). In WordPress/Resources (NON-synced) so it is produced
+# as an artifact without auto-joining the target / conflicting with the existing Localizable.strings.
+LOCALIZABLE_CATALOG = File.join(PROJECT_ROOT_FOLDER, 'WordPress', 'Resources', 'Localizable.xcstrings')
+
+# Source roots to extract from; mirrors `generate_strings_file`'s genstrings inputs.
+CATALOG_SOURCE_ROOTS = [
+  File.join(PROJECT_ROOT_FOLDER, 'WordPress'),
+  File.join(PROJECT_ROOT_FOLDER, 'Modules', 'Sources')
+].freeze
+
+# The custom localization routine to additionally extract (same as the genstrings `routines:` today).
+CATALOG_LOCALIZATION_ROUTINE = 'AppLocalizedString'
+
+platform :ios do
+  # Extracts English source strings from code into Localizable.xcstrings (build-free; replaces genstrings).
+  #
+  # @option gutenberg_path [String] Optional path to a Gutenberg source clone to also extract from
+  #   (Gutenberg ships as a binary XCFramework, so its source must be cloned, same as the legacy lane).
+  desc 'Generates Localizable.xcstrings from source via xcstringstool extract + sync (build-free)'
+  lane :generate_strings_catalog do |gutenberg_path: nil, swiftui: false|
+    roots = CATALOG_SOURCE_ROOTS + [gutenberg_path].compact
+    files = catalog_source_files(roots)
+    UI.user_error!('No source files found to extract from') if files.empty?
+    UI.message("Extracting localizable strings from #{files.count} source files in #{roots.count} roots…")
+
+    Dir.mktmpdir do |stringsdata_dir|
+      extract_stringsdata(files: files, output_dir: stringsdata_dir, swiftui: swiftui)
+      synced = sync_localizable_catalog(stringsdata_dir: stringsdata_dir)
+      reconciled = reconcile_changed_sources(stringsdata_dir: stringsdata_dir)
+      report_catalog(LOCALIZABLE_CATALOG, extracted_count: synced, reconciled_count: reconciled)
+    end
+  end
+
+  # Verifies the generated catalog captures every string the legacy genstrings flow finds over the SAME
+  # source files: the safety net proving the build-free extraction loses nothing, and guarding against
+  # regressions like the same-basename `.stringsdata` collision. Fails listing any string only genstrings found.
+  desc 'Verifies Localizable.xcstrings covers every string genstrings extracts (coverage gate)'
+  lane :verify_strings_catalog do |gutenberg_path: nil|
+    UI.user_error!("#{LOCALIZABLE_CATALOG} not found — run generate_strings_catalog first") unless File.exist?(LOCALIZABLE_CATALOG)
+    files = catalog_source_files(CATALOG_SOURCE_ROOTS + [gutenberg_path].compact)
+
+    Dir.mktmpdir do |genout|
+      run_genstrings(files: files, output_dir: genout)
+      reference = Fastlane::Helper::Ios::L10nHelper.read_strings_file_as_hash(path: File.join(genout, 'Localizable.strings')).keys
+      catalog_keys = JSON.parse(File.read(LOCALIZABLE_CATALOG))['strings'].keys
+      gap = CatalogHelper.coverage_gap(reference, catalog_keys)
+
+      if gap.empty?
+        UI.success("Localizable.xcstrings covers all #{reference.count} genstrings keys. ✅")
+      else
+        gap.sort.first(25).each { |key| UI.error(" MISSING from catalog: #{key.inspect}") }
+        UI.user_error!("#{gap.count} string(s) found by genstrings are missing from Localizable.xcstrings.")
+      end
+    end
+  end
+
+  #################################################
+  # Helpers
+  #################################################
+
+  # Runs the legacy genstrings extraction (the verification reference) over the same files into output_dir.
+  def run_genstrings(files:, output_dir:)
+    sh('genstrings', '-s', CATALOG_LOCALIZATION_ROUTINE, '-o', output_dir, *files)
+  end
+
+  # Enumerate .swift/.m source files under the given roots, applying the same exclusions as the legacy lane:
+  # vendored code, the unit-test harness, and AppLocalizedString.swift itself (its definition would otherwise
+  # be misparsed as a call site).
+  def catalog_source_files(roots)
+    roots.flat_map { |root| Dir.glob(File.join(root, '**', '*.{swift,m}')) }
+         .reject { |path| catalog_excluded?(path) }
+         .uniq
+         .sort
+  end
+
+  def catalog_excluded?(path)
+    path.include?('Vendor/') ||
+      path.include?('/WordPressTest/') ||
+      File.basename(path) == 'AppLocalizedString.swift'
+  end
+
+  # xcstringstool extract -> one .stringsdata per source file (basename-disambiguated). Chunked to stay under
+  # the OS argument limit; each chunk gets its own output subdir (see below), which sync then consumes together.
+  # `--SwiftUI-Text` (extract `Text("literal")`) is OFF by default and gated behind `swiftui:`. The app has
+  # ~91 such literals but only 16 `Text(verbatim:)`, so non-translatable glyphs (`Text("Aa")`, `Text("A")`)
+  # are NOT guarded; extracting them would feed garbage to translators. Enabling it is a deliberate coverage
+  # expansion that needs a cleanup pass first (convert non-translatable literals to `verbatim:`); then pass
+  # `swiftui: true`.
+  def extract_stringsdata(files:, output_dir:, swiftui: false)
+    flags = [
+      '--legacy-localizable-strings', # NSLocalizedString + ObjC siblings
+      '--modern-localizable-strings', # String(localized:) / LocalizedStringResource (future catalog-native code)
+      '-s', CATALOG_LOCALIZATION_ROUTINE # the app's AppLocalizedString custom routine
+    ]
+    flags << '--SwiftUI-Text' if swiftui
+    # Chunk to stay under ARG_MAX, but give each chunk its OWN output dir. `extract` names .stringsdata by
+    # source basename and only disambiguates collisions WITHIN a single invocation, so two same-named files
+    # in different chunks (e.g. the two NSDate+Helpers.swift / SupportDataProvider.swift) would otherwise
+    # overwrite each other in a shared dir and silently drop strings.
+    files.each_slice(400).with_index do |chunk, index|
+      chunk_dir = File.join(output_dir, "chunk-#{index}")
+      FileUtils.mkdir_p(chunk_dir)
+      sh('xcrun', 'xcstringstool', 'extract', *chunk, *flags, '--output-directory', chunk_dir)
+    end
+  end
+
+  # All .stringsdata under a dir (recursive, since extract writes one subdir per chunk).
+  def stringsdata_files(dir)
+    Dir.glob(File.join(dir, '**', '*.stringsdata'))
+  end
+
+  # sync all the .stringsdata into Localizable.xcstrings. The catalog FILENAME selects the table, so this only
+  # pulls in the `Localizable` table; strings routed to other tables (AppLocalizedString tableName:) are
+  # ignored here and would sync into their own `<Table>.xcstrings`. Returns the resulting key count.
+  def sync_localizable_catalog(stringsdata_dir:)
+    ensure_catalog_exists(LOCALIZABLE_CATALOG)
+    stringsdata = stringsdata_files(stringsdata_dir)
+    UI.user_error!('xcstringstool produced no .stringsdata') if stringsdata.empty?
+
+    sh('xcrun', 'xcstringstool', 'sync', LOCALIZABLE_CATALOG, *stringsdata.flat_map { |f| ['--stringsdata', f] })
+    JSON.parse(File.read(LOCALIZABLE_CATALOG))['strings'].count
+  end
+
+  # Create the catalog as an empty shell if it doesn't exist yet; leave an existing one untouched so its
+  # translations survive across runs; that persistence is what makes reconcile_changed_sources meaningful.
+  def ensure_catalog_exists(path)
+    FileUtils.mkdir_p(File.dirname(path))
+    return if File.exist?(path)
+
+    File.write(path, "#{JSON.pretty_generate('sourceLanguage' => 'en', 'strings' => {}, 'version' => '1.0')}\n")
+  end
+
+  # `xcstringstool sync` leaves an existing key's English value (and its translations) untouched when the
+  # source text changes (verified). Re-derive the current English from a fresh extraction and, where it
+  # differs from the catalog, update the English and flip that key's translations to `needs_review`.
+  def reconcile_changed_sources(stringsdata_dir:)
+    current_en = current_english_values(stringsdata_dir)
+    catalog = JSON.parse(File.read(LOCALIZABLE_CATALOG))
+    reconciled = CatalogHelper.reconcile_source_changes!(catalog, current_en)
+    unless reconciled.empty?
+      File.write(LOCALIZABLE_CATALOG, "#{JSON.pretty_generate(catalog)}\n")
+      UI.important("Re-flagged #{reconciled.count} key(s) as needs_review — English source changed.")
+    end
+    reconciled.count
+  end
+
+  # Current English value per key, by syncing the extraction into a throwaway empty catalog (every key is
+  # 'new', so its English is populated straight from source, which is what `sync` won't do for keys that
+  # already exist in the real catalog).
+  def current_english_values(stringsdata_dir)
+    Dir.mktmpdir do |tmp|
+      fresh = File.join(tmp, 'Localizable.xcstrings')
+      File.write(fresh, "#{JSON.pretty_generate('sourceLanguage' => 'en', 'strings' => {}, 'version' => '1.0')}\n")
+      stringsdata = stringsdata_files(stringsdata_dir)
+      sh('xcrun', 'xcstringstool', 'sync', fresh, *stringsdata.flat_map { |f| ['--stringsdata', f] })
+      english_values(JSON.parse(File.read(fresh)))
+    end
+  end
+
+  # { key => English value } for every catalog entry that has one (skips key-as-source entries).
+  def english_values(catalog)
+    catalog['strings'].each_with_object({}) do |(key, entry), acc|
+      value = entry.dig('localizations', 'en', 'stringUnit', 'value')
+      acc[key] = value unless value.nil?
+    end
+  end
+
+  def report_catalog(path, extracted_count:, reconciled_count:)
+    catalog = JSON.parse(File.read(path))
+    with_value = catalog['strings'].count { |_, v| v.dig('localizations', 'en', 'stringUnit', 'value') }
+    message = "Generated #{File.basename(path)} with #{extracted_count} keys (#{with_value} carry an explicit English value; the rest are key-as-source)."
+    message += " Re-flagged #{reconciled_count} for review (English source changed)." if reconciled_count.positive?
+    UI.success(message)
+  end
+end
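
The same-basename collision that the per-chunk output directories guard against can be reproduced without Xcode. In this rough sketch, `fake_extract` is a hypothetical stand-in that only mimics the naming behaviour described in the comments of `extract_stringsdata` (one `<basename>.stringsdata` per source file); it is not the real tool's output format. The source paths and the chunk size of 1 are also illustrative.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-in for `xcstringstool extract`: writes one <basename>.stringsdata per source path.
def fake_extract(paths, out_dir)
  paths.each do |path|
    File.write(File.join(out_dir, "#{File.basename(path, '.*')}.stringsdata"), path)
  end
end

# Run the extraction in chunks and return how many .stringsdata files survive under root.
def run_chunks(paths, root, per_chunk_dirs:, chunk_size: 1)
  paths.each_slice(chunk_size).with_index do |chunk, index|
    out_dir = per_chunk_dirs ? File.join(root, "chunk-#{index}") : root
    FileUtils.mkdir_p(out_dir)
    fake_extract(chunk, out_dir)
  end
  Dir.glob(File.join(root, '**', '*.stringsdata')).count
end

# Hypothetical paths: two files with the same basename that land in different chunks.
sources = ['App/NSDate+Helpers.swift', 'Modules/NSDate+Helpers.swift']

shared   = Dir.mktmpdir { |root| run_chunks(sources, root, per_chunk_dirs: false) }
isolated = Dir.mktmpdir { |root| run_chunks(sources, root, per_chunk_dirs: true) }
puts "shared dir: #{shared} file(s), per-chunk dirs: #{isolated} file(s)"
# prints "shared dir: 1 file(s), per-chunk dirs: 2 file(s)"
```

With a shared directory the second chunk overwrites the first chunk's file. With one directory per chunk both files survive, and a recursive glob like the one in `stringsdata_files` picks them all up.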
