feat: expand i18n and Vietnamese source support #12

Open
nguyen-hung-dev wants to merge 12 commits into mouseart2025:main from nguyen-hung-dev:feat/frontend-i18n

@nguyen-hung-dev
Contributor

Summary

  • add a lightweight frontend i18n provider with zh-CN and en locale files
  • migrate app shell, navigation, demo shell, and language setting to translation keys
  • document the full FE/BE i18n integration roadmap in I18N_PLAN.md
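
For illustration only, here is a minimal sketch of how a lightweight provider might resolve translation keys. It assumes flat key/value catalogs and a fallback to zh-CN as the source locale. The names `translate`, `catalogs`, and `DEFAULT_LOCALE`, along with the sample keys, are hypothetical and are not taken from this PR.

```typescript
// Hypothetical sketch: identifiers and message keys are illustrative,
// not the ones used in frontend/src/i18n/.
type Locale = "zh-CN" | "en";
type Messages = { readonly [key: string]: string | undefined };

// Assumption: zh-CN is the source locale, so it acts as the fallback.
const DEFAULT_LOCALE: Locale = "zh-CN";

// Stand-ins for the locale files; the real catalogs will differ.
const catalogs: Record<Locale, Messages> = {
  "zh-CN": { "nav.home": "首页", "settings.language": "语言" },
  en: { "nav.home": "Home" },
};

// Look the key up in the active locale, then in the default locale,
// and finally return the key itself so missing strings stay visible.
function translate(locale: Locale, key: string): string {
  return catalogs[locale][key] ?? catalogs[DEFAULT_LOCALE][key] ?? key;
}
```

Returning the key for a missing string is one possible design choice: untranslated entries show up in the UI during migration instead of rendering as blank text.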

Scope

  • keeps zh-CN as the default/source locale
  • does not translate uploaded novel content, extracted entity names, or AI-generated analysis output
  • leaves backend localization for a later phase

Validation

  • cd frontend && npm run build

Closes #11

@mouseart2025
Owner

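@nguyen-hung-dev Thank you for this very valuable contribution! The i18n infrastructure matters a great deal for the project's long-term internationalization.

Current status: this PR is a large refactor touching 61 files (+5,107/-1,129 lines). The project is currently preparing its EMNLP 2026 ARR submission (deadline May 25), and by internal agreement we are freezing all non-bugfix feature changes during this period to focus on stability. Even if CI passes, the review depth required for 61 files, plus the potential risk of breaking the UI or the packaged build, is too high for the submission window.

Plan: the ARR acceptance notification is expected around 2026-08-20. This PR will be prioritized for review and merge after Aug 20. At that point I will:

  1. Pull the latest main and resolve conflicts
  2. Run full validation (Mac + Windows desktop build + live demo + core-path smoke test)
  3. Possibly split it into 2-3 smaller PRs to ease review (your I18N_PLAN.md conveniently provides a phase breakdown)

The PR will not be closed; it stays open for tracking. Thank you for your patience!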

@nguyen-hung-dev changed the title from "feat(frontend): add i18n foundation" to "feat: expand i18n and Vietnamese source support" on Apr 23, 2026
@mouseart2025
Owner

Hi @nguyen-hung-dev — thank you for the substantial work here. 13k+ lines across 100+ files with full Vietnamese support is a serious contribution and the multilingual ambition is genuinely appreciated.

Two distinct concerns bundled together

Looking carefully, this PR contains two largely independent pieces of work:

(A) Frontend i18n — well-scoped, low risk:

  • frontend/src/i18n/ provider + runtime
  • 3 locales (zh-CN / en / vi, 1397 lines each)
  • Components migrated to translation keys
  • frontend/scripts/i18n/index.mjs extraction tooling
  • I18N_PLAN.md

(B) Vietnamese source pipeline — broader surface, touches L3 files:

  • New files: source_language_adapter.py, source_language_heuristics.py, domain_labels.py (678 lines), entity_identity.py, utils/source_language.py
  • Modified: fact_validator.py (+180), entity_aggregator.py (+157/-88), world_structure_agent.py (+58), analysis_service.py (+61/-26), visualization_service.py (+96), name_authority.py (+28)
  • Vietnamese fixture + 3 new test files

Per our CLAUDE.md, those modified backend files are pipeline-critical (L3) and require impact analysis + 5-novel Chinese gold standard regression before merging.

Project timeline

We're submitting to ARR by May 25, 2026 (~2.5 weeks out). The paper's §3.7 contamination-free eval and §3.3 single-root claim both depend on the current pipeline producing reproducible numbers on the 5-novel benchmark. Modifying the L3 files in (B) before submission would mean re-running the full benchmark and updating paper tables — not feasible at this stage.

Proposed path

Could you split this into two PRs?

  • PR-12A — frontend i18n only: I'll review and merge this week (after you rebase against the recent v0.71.4–v0.71.6 work).
  • PR-12B — Vietnamese pipeline support: I'll give it the careful L3 review it deserves after May 25. The Vietnamese fixture work especially is well thought-out and I want to see it land properly.

Two practical notes regardless:

  • This PR currently shows CONFLICTING / DIRTY against main. Recent hotfixes in v0.71.4 (alias merge), v0.71.5 (export), v0.71.6 (LM Studio support) likely touch overlapping code.
  • Last CI run was Apr 23, before those changes — a fresh CI is needed.

If splitting isn't workable on your end, I'm happy to keep this PR open and we'll review post-May 25 as a single unit. Just slower.

Truly appreciate the multilingual vision and the careful structure of the Vietnamese adapter. Looking forward to landing this properly.

— Lei


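TL;DR (a note so later readers can get the gist quickly):

Thank you for the very thoughtful contribution. This PR actually bundles two things: frontend i18n (low risk, can be merged on its own) and Vietnamese extraction pipeline support (1100+ lines of new modules plus changes to several critical files marked L3 in CLAUDE.md, which must pass the 5-novel Chinese gold standard regression before merging). I cannot touch the L3 files before the ARR submission on May 25; otherwise I would have to rerun the full benchmark and update the numbers in §3.7 and §3.3 of the paper.

Suggested path: split into PR-12A (i18n only, mergeable this week) and PR-12B (Vietnamese pipeline, careful review after May 25). If you would rather not split, keeping it as a single PR until after May 25 is also fine, just slower.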

@mouseart2025
Owner

Thanks for the substantial contribution! This is a well-structured i18n framework (13K+ lines, complete provider + locale infrastructure).

Heads up: the project is currently in the ACL Rolling Review evaluation cycle (reviews land ~July 7-13, rebuttal closes ~July 13). To avoid disruptive main-branch changes during the reviewer reproducibility window, I'll do a careful review of this PR after the July rebuttal period ends.

Two things before merge can be considered:

  1. Rebase against main — there are conflicts now (main has v0.71.x hotfixes from late May)
  2. Vietnamese translation scope — happy to merge the i18n infra now and have Vietnamese as an opt-in second locale; the Chinese-first audience is the current main user base, so en + zh-CN is the core

I'll re-engage on this in mid-July. Apologies for the delay, and thanks for your patience!
