fix(i18n): #192 Étape 3 — deterministic translation consistency fixes#432

Open
jsboige wants to merge 2 commits into master from fix/192-etape3-translation-polish

Conversation

@jsboige
Contributor

@jsboige jsboige commented Jun 3, 2026

#192 Étape 3 — Translation consistency polish

Changes

3A: Scenarii PT case normalization (commit 7cb86d52)

  • 113 cells fixed in Cards/Scenarii/Argumentum Scenarii - Cards.csv
  • category_pt and subcategory_pt fields: normalized to Title Case canonical forms
  • All 25 PT inconsistencies from the translation-consistency-audit-192 report were case-related — no content changes
  • Canonical forms: História, Mitologia, Política, Cultura pop, Vida pessoal, Vida profissional, Antiguidade, Séculos XX e XXI, Literatura, Ciência, etc.
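A minimal sketch of how this kind of case-only normalization could be scripted. The helper name `normalize_pt_label` is hypothetical, the canonical list is limited to the forms quoted above, and this is not necessarily the script used for commit 7cb86d52.

```python
# Canonical forms copied from the list above; the real set is longer ("etc.").
CANONICAL_PT = [
    "História", "Mitologia", "Política", "Cultura pop", "Vida pessoal",
    "Vida profissional", "Antiguidade", "Séculos XX e XXI", "Literatura",
    "Ciência",
]

# Map the case-folded spelling to its canonical spelling.
_LOOKUP = {form.casefold(): form for form in CANONICAL_PT}


def normalize_pt_label(value: str) -> str:
    """Return the canonical spelling when `value` differs from it only by case.

    Unknown values are returned unchanged, so no content is ever rewritten.
    """
    return _LOOKUP.get(value.casefold(), value)
```

Because the lookup key is the case-folded string, a cell can only ever be replaced by a spelling that is identical to it up to case, which is what keeps the change cosmetic.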

3B: Virtues deterministic outlier fixes (commit 638d0ee3)

  • 5 cells fixed in Cards/Fallacies/Argumentum Virtues - Taxonomy.csv
  • Single-record outliers aligned to majority consensus:
| PK | Field | Old (1x) | New (majority) | Count |
|---|---|---|---|---|
| 1 | family_ar | حجة وجيهة الصلة | حجة ملائمة | 1→32 |
| 34 | family_fa | ارائهٔ درستکارانه | ارائه صادقانه | 1→24 |
| 34 | family_zh | 诚信呈现 | 诚实陈述 | 1→24 |
| 59 | family_es | Rigor matemático | Exactitud matemática | 1→19 |
| 135 | subfamily_es | Definiciones admisibles | Definiciones aceptables | 1→6 |
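The "single-record outlier aligned to majority" rule can be sketched as follows. This assumes rows are dicts (for example from `csv.DictReader`) with a `pk` key, and that grouping is done on the French source field; the function name and the exact tie-handling are illustrative assumptions, not the PR's actual tooling.

```python
from collections import Counter, defaultdict


def singleton_outlier_fixes(rows, group_field, target_field):
    """Return (pk, old, new) for values that occur exactly once in a group
    that has a clear majority translation.

    Groups with a tie for first place are skipped: there is no deterministic
    winner, so they need editorial judgment instead.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row)

    fixes = []
    for members in groups.values():
        counts = Counter(m[target_field] for m in members)
        ranked = counts.most_common()
        if len(ranked) < 2:
            continue  # already unanimous
        (majority, top), (_, second) = ranked[0], ranked[1]
        if top == second:
            continue  # no clear majority
        for m in members:
            if counts[m[target_field]] == 1:
                fixes.append((m["pk"], m[target_field], majority))
    return fixes
```

Variants that appear more than once are deliberately left alone, which matches the split between the five fixes here and the deferred minority variants listed under "Remaining work".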

Remaining work (NOT in this PR)

~16 Virtues cells with significant minority variants (5-23 records) — need gpt-5.5 or editorial decision:

  • AR: "أمانة فكرية" (24) vs "النزاهة الفكرية" (3), "استدلال صحيح" (47) vs "الاستدلال الصحيح" (8)
  • FA: "استنتاج معتبر" (32) vs "استدلال معتبر" (23), "تبادل غنی‌ساز" (39) vs "تبادل پربار" (5)
  • ZH: "充实性交流" (39) vs "富有成效的交流" (5)
  • AR/FA/ZH subfamily splits

These require linguistic judgment → targeted gpt-5.5 pass (tracked separately).

Merge timing

⚠️ Per ai-01 dispatch: merge AFTER jsboige visual validation Thursday.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2 commits added June 3, 2026 08:54
…192 étape 3A)

Standardize 113 cells in category_pt and subcategory_pt to Title Case:
- category_pt: História, Mitologia, Política, Cultura pop, Vida pessoal, Vida profissional
- subcategory_pt: Antiguidade, Séculos XX e XXI, Literatura, Ciência, Romance, etc.

Cosmetic only — no content change, purely capitalization alignment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

PK=1:   family_ar    حجة وجيهة الصلة → حجة ملائمة (1 vs 32)
PK=34:  family_fa    ارائهٔ درستکارانه → ارائه صادقانه (1 vs 24)
PK=34:  family_zh    诚信呈现 → 诚实陈述 (1 vs 24)
PK=59:  family_es    Rigor matemático → Exactitud matemática (1 vs 19)
PK=135: subfamily_es Definiciones admisibles → Definiciones aceptables (1 vs 6)

Part of #192 Étape 3 — translation consistency polish.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Collaborator

@clusterManager-Myia clusterManager-Myia left a comment

[NanoClaw]

Reviewed: 2 CSV files, 78 removed + 78 added lines (1:1 row replacement, no insertions/deletions).

Taxonomy CSV — 5 cell changes across 4 rows

All changes align single-record outliers to the majority translation consensus. Verified against the French source:

  • pk=1 family_ar: حجة وجيهة الصلة → حجة ملائمة. Correct. Old form was over-descriptive ("argument of strong relevance"); new ملائمة matches "pertinent" more accurately and aligns with the 32-record majority.
  • pk=34 family_zh: 诚信呈现 → 诚实陈述. Correct. Old used 诚信 (integrity/trustworthiness) which connotes moral character; new 诚实 (honest/truthful) better matches "Présentation intègre" as "transparent/honest presentation".
  • pk=34 family_fa: ارائهٔ درستکارانه → ارائه صادقانه. Correct. Old درستکارانه is "ethical/proper" (more about conduct); new صادقانه is "truthful" (aligns with the French "transparente et honnête").
  • pk=59 family_es: Rigor matemático → Exactitud matemática. Correct. French "Rigueur mathématique" emphasizes precision, not rigor/discipline. Exactitud is the better match.
  • pk=135 subfamily_es: Definiciones admisibles → Definiciones aceptables. Correct. French "Définitions recevables" = acceptable, not admissible (legal connotation).

Scenarii CSV — BOM removal + PT title-case normalization

  • Header: UTF-8 BOM stripped (`<BOM>path` → `path`). Harmless fix.
  • category_pt and subcategory_pt: ~113 cells changed from lowercase to Title Case (e.g. história → História, antiguidade → Antiguidade). All are pure case normalization with zero content changes. Portuguese orthographic convention for categories/subcategories requires Title Case.

Security: No formula injection, script injection, or path traversal patterns detected.
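For reference, a formula-injection scan of the kind mentioned here might look like the sketch below. The prefix list follows the commonly cited spreadsheet trigger characters; the function name is hypothetical, and this is an assumption about what such a check involves, not NanoClaw's actual implementation.

```python
import csv
import io

# Leading characters that spreadsheet applications may interpret as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def suspicious_cells(csv_text: str):
    """Return (row_number, column_number, cell) for cells starting with a
    formula trigger character. Row and column numbers are 1-based.

    Note: this is deliberately strict and also flags plain negative numbers.
    """
    hits = []
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    for row_no, row in enumerate(reader, start=1):
        for col_no, cell in enumerate(row, start=1):
            if cell.startswith(FORMULA_PREFIXES):
                hits.append((row_no, col_no, cell))
    return hits
```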

Structural: Line counts match 1:1 (78/78). No CSV quoting or structural integrity issues.

No concerns. The PR description accurately reflects the changes, including the honest note about remaining minority variants deferred to editorial/gpt-5.5 review.

@myia-ai-01

Cell-by-cell content review (ai-01, non-delegated lane): ✅ VERIFIED

I verified both parts independently (not only through the bot), in line with my role as the content and terminology reviewer.

3A: Scenarii PT (113 cells): case-only PROVEN

Lowercase identity test: after stripping the BOM and lowercasing, the master and branch versions of Argumentum Scenarii - Cards.csv are byte-identical. Every change in the 148 lines is therefore purely a case change, with zero content drift hidden in the long multilingual lines. The header BOM was removed (consistent with the in-house no-BOM convention; CsvHelper handles both). ✓
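A minimal sketch of this lowercase identity test, assuming both versions of the file are available as raw bytes; the function name `case_only_diff` is hypothetical and the reviewer's actual commands are not shown in this thread.

```python
import codecs


def case_only_diff(before: bytes, after: bytes) -> bool:
    """Return True when two UTF-8 files are identical once a leading BOM is
    stripped and all text is lowercased, i.e. they differ at most by case
    and by the presence of a BOM."""

    def canon(raw: bytes) -> bytes:
        # Drop a leading UTF-8 BOM if present.
        if raw.startswith(codecs.BOM_UTF8):
            raw = raw[len(codecs.BOM_UTF8):]
        # Lowercase on decoded text so accented letters are handled correctly.
        return raw.decode("utf-8").lower().encode("utf-8")

    return canon(before) == canon(after)
```

Lowercasing must happen on decoded text rather than raw bytes; a byte-level lowercase would leave letters such as `É` untouched and miss non-ASCII case changes.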

3B: Virtues (5 cells): deterministic alignment to intra-family consensus

Each cell was a singleton outlier (1 record) against a pre-existing majority; after the fix, every family/subfamily group is unanimous (verified by grouping on family_fr/subfamily_fr):

| PK | Field | Before | After |
|---|---|---|---|
| 1 | family_ar | 1 outlier + 32 | 33/33 unanimous |
| 34 | family_fa | 1 + 24 | 25/25 |
| 34 | family_zh | 1 + 24 | 25/25 |
| 59 | family_es | 1 + 19 | 20/20 |
| 135 | subfamily_es | 1 + 6 | 7/7 |

The alignment is deterministic (the new value already existed as the majority within the same family), so nothing was invented by an LLM. ✓
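The post-fix unanimity check could be expressed as the short sketch below, again assuming dict rows keyed by column name; `non_unanimous_groups` is a hypothetical helper, not the reviewer's script.

```python
from collections import defaultdict


def non_unanimous_groups(rows, group_field, target_field):
    """Return {group: sorted variants} for every group that still has more
    than one distinct translation. An empty dict means all groups agree."""
    variants = defaultdict(set)
    for row in rows:
        variants[row[group_field]].add(row[target_field])
    return {g: sorted(v) for g, v in variants.items() if len(v) > 1}
```

Running it with `group_field="family_fr"` and each translated `family_*` column in turn should return an empty dict for the five groups listed in the table once the fixes are applied.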

Scope deferral approved

The ~16 minority Virtues variants (5-23 records) are correctly deferred (Étape 3C): these are genuine splits, not singletons, so deterministic alignment would be inappropriate; they call for linguistic judgment (gpt-5.5/editorial). Good split of the work.

Verdict

Content APPROVED. CI 3/3 green, MERGEABLE/CLEAN, NanoClaw COMMENTED (0 CHANGES_REQUESTED).

Merge is sequenced AFTER jsboige's visual sign-off on Thursday (#140): these cells render on the Scenarii cards (PT categories) and the Virtues cards (AR/FA/ZH/ES families), so the merge reopens their visual validation. The #140 record validates the current stable master; #432 lands right after. Merge trigger: #140 sign-off → merge → targeted Scenarii+Virtues regeneration → ai-01 visual re-check.
