7 issues found across 29 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/rules/glossary.rs">
<violation number="1" location="src/rules/glossary.rs:86">
P1: Skipping synthetic banned issues when a regular issue already exists can let TM downgrade the only report, breaking the intended `banned > TM` precedence.</violation>
</file>
<file name="src/engine/consistency.rs">
<violation number="1" location="src/engine/consistency.rs:148">
P1: Do not bypass group matching when only one term group exists; it can select unrelated glossary terms and create false consistency diagnostics.</violation>
</file>
<file name="src/mcp/tools.rs">
<violation number="1" location="src/mcp/tools.rs:853">
P1: Fix-mode ordering lets TM downgrade glossary-banned synthetic errors, breaking the intended `banned > TM` precedence.</violation>
<violation number="2" location="src/mcp/tools.rs:1083">
P2: `tools/list` schema was not updated for the new `exempt_blockquotes`/`glossary`/`consistency` arguments, causing API contract drift.</violation>
</file>
<file name="src/main.rs">
<violation number="1" location="src/main.rs:869">
P2: The new `--exempt-blockquotes` mode is not represented in scan-cache keys, so cached results can be incorrect when toggling the flag.</violation>
<violation number="2" location="src/main.rs:1194">
P1: Cache-hit paths can drop source text needed by the new glossary/consistency features, causing missed findings.</violation>
</file>
<file name="tests/realworld_calques.rs">
<violation number="1" location="tests/realworld_calques.rs:62">
P2: Match on containment here so the collocation regression is caught even when the scanner reports the full phrase instead of the bare term.</violation>
</file>
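The two `banned > TM` findings above (in `src/rules/glossary.rs` and `src/mcp/tools.rs`) share one ordering invariant, which can be sketched as follows. This is a minimal illustration, not the project's code: `Issue`, `Severity`, `inject_banned`, and `apply_tm_downgrade` are hypothetical names, and downgrading to `Info` is an assumed stand-in for whatever the TM pass actually does.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Info,
    Warning,
    Error,
}

// Hypothetical, simplified issue record.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Issue {
    offset: usize,
    term: String,
    severity: Severity,
    banned: bool, // true once the glossary has marked this span as banned
}

/// If a regular issue already covers the span, promote it instead of skipping;
/// otherwise push a new synthetic Error. Either way the span ends up marked banned.
fn inject_banned(issues: &mut Vec<Issue>, offset: usize, term: &str) {
    if let Some(existing) = issues
        .iter_mut()
        .find(|i| i.offset == offset && i.term == term)
    {
        existing.severity = Severity::Error;
        existing.banned = true;
    } else {
        issues.push(Issue {
            offset,
            term: term.to_string(),
            severity: Severity::Error,
            banned: true,
        });
    }
}

/// TM pass (assumed behavior): downgrade matching issues, but never banned ones.
fn apply_tm_downgrade(issues: &mut [Issue], tm_terms: &[&str]) {
    for issue in issues.iter_mut() {
        if !issue.banned && tm_terms.contains(&issue.term.as_str()) {
            issue.severity = Severity::Info;
        }
    }
}
```

Because the banned marker lives on the issue itself, the result does not depend on whether the TM pass runs before or after glossary injection in a real pipeline, as long as the TM pass checks the marker.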
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
P2: The new --exempt-blockquotes mode is not represented in scan-cache keys, so cached results can be incorrect when toggling the flag.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main.rs, line 869:
<comment>The new `--exempt-blockquotes` mode is not represented in scan-cache keys, so cached results can be incorrect when toggling the flag.</comment>
<file context>
@@ -825,6 +866,9 @@ fn run_lint_batch(params: &LintBatchParams<'_>) -> Result<()> {
     if params.relaxed {
         cfg = cfg.with_relaxed();
     }
+    if params.exempt_blockquotes {
+        cfg = cfg.with_exempt_blockquotes(true);
+    }
</file context>
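One way to close the gap described in this comment is to make every output-affecting option part of the cache key. The sketch below assumes a hypothetical `ScanCacheKey` struct and `cache_key` helper; the real cache layout in `src/main.rs` is not shown in this review and may differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical cache key: each option that changes scan output is a field,
/// so toggling any of them produces a different key.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ScanCacheKey {
    content_hash: u64,
    relaxed: bool,
    exempt_blockquotes: bool,
}

/// Hash the document text with the standard library's default hasher.
fn hash_text(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

/// Build a key from the text plus the flags in effect for this scan.
fn cache_key(text: &str, relaxed: bool, exempt_blockquotes: bool) -> ScanCacheKey {
    ScanCacheKey {
        content_hash: hash_text(text),
        relaxed,
        exempt_blockquotes,
    }
}
```

With a key like this, a run with `--exempt-blockquotes` cannot reuse results cached from a run without it, even when the file content is unchanged.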
A real-world deployment study [1] reported mainland-Chinese terms slipping past the linter in published zh-TW articles, blockquote citation contexts producing ~50 false positives across a 72-article corpus, and ASCII quotes auto-converted to 「」 inside YAML frontmatter breaking downstream parsers.

User-facing additions:
- '--consistency' reports mixed regional usage of one concept (both 線程 and 執行緒 in the same document). Groups by the rule's "english" anchor; skips TM-suppressed terms.
- '--exempt-blockquotes' (CLI + '[markdown]' config) excludes pulldown-cmark 'Tag::BlockQuote' ranges from scanning. Off by default: adopted blockquote prose is real content.
- YAML frontmatter preserves ASCII '"' / ''' scalar delimiters. Body prose still converts to 「」.
- '[glossary]' section in '.zhtw-mcp.toml': banned / preferred / proper_nouns lists. Banned terms inject synthetic Errors that TM cannot downgrade; proper_nouns suppress matching issues; both honor exclusion zones.
- Per-rule 'editorial_confidence' (low / medium / high) flows through issue inflation into MCP explain output. Low forces auto_fix_safe = false and needs_review = true. 優化, 算法, 場景 tagged low because both regional forms are valid zh-TW.

Calque-audit refinements:
- 消息 gains positional_clues; 好消息 / 壞消息 / 消息來源 no longer fire.
- Symmetric 元資料 rule mirrors 元數據: both use to: [] plus english: "metadata", surfacing the English original as the preferred form. 詮釋資料 and 後設資料 (NAER terminology bank) remain unflagged as acceptable zh-TW alternatives.
- Real-world regression fixture pins the 14 documented blind-spot terms.

[1] https://ai-muninn.com/zh-TW/blog/zhtw-mcp-calque-blindspot-sweep
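The 'editorial_confidence' behavior described above can be illustrated with a small sketch. The type and function names here (`EditorialConfidence`, `Diagnostic`, `inflate`) are hypothetical, and the handling of medium and high confidence (pass the rule's own `auto_fix_safe` through, leave `needs_review` false) is an assumption; the description only specifies what low does.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EditorialConfidence {
    Low,
    Medium,
    High,
}

// Hypothetical, simplified diagnostic record.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Diagnostic {
    term: String,
    editorial_confidence: EditorialConfidence,
    auto_fix_safe: bool,
    needs_review: bool,
}

/// Build a diagnostic from a rule hit. Low confidence always disables
/// automatic fixing and requests human review, whatever the rule itself says.
fn inflate(term: &str, confidence: EditorialConfidence, rule_auto_fix_safe: bool) -> Diagnostic {
    let low = confidence == EditorialConfidence::Low;
    Diagnostic {
        term: term.to_string(),
        editorial_confidence: confidence,
        auto_fix_safe: rule_auto_fix_safe && !low,
        needs_review: low, // assumption: only Low sets this flag
    }
}
```

For example, `inflate("優化", EditorialConfidence::Low, true)` yields a diagnostic with `auto_fix_safe == false` and `needs_review == true`, matching the rule for terms where both regional forms are valid zh-TW.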
Summary by cubic
Adds a document-wide terminology consistency report and a project glossary to enforce preferred terms. Reduces false positives in blockquote citations and preserves ASCII quotes in YAML frontmatter.
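The grouping idea behind the consistency report can be sketched roughly as below. This is an illustration under stated assumptions: `Rule` and `mixed_usage` are hypothetical names, matching is plain substring search, and TM suppression and exclusion zones are not modeled.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified rule: a Chinese term and its English anchor.
struct Rule {
    term: &'static str,
    english: &'static str,
}

/// For each English anchor, collect the distinct terms that occur in `text`,
/// then keep only anchors where two or more variants appear together.
fn mixed_usage(
    text: &str,
    rules: &[Rule],
) -> BTreeMap<&'static str, BTreeSet<&'static str>> {
    let mut groups: BTreeMap<&'static str, BTreeSet<&'static str>> = BTreeMap::new();
    for rule in rules {
        if text.contains(rule.term) {
            groups.entry(rule.english).or_default().insert(rule.term);
        }
    }
    // A single variant is consistent usage; only mixed groups are reported.
    groups.retain(|_, terms| terms.len() >= 2);
    groups
}
```

A document containing both 線程 and 執行緒 would produce one entry under the anchor `thread`, while a document using only one of them would produce none.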
New Features
- `--consistency`: groups by a rule's `english` anchor to catch mixed regional terms in one doc; ignores TM-suppressed issues.
- `.zhtw-mcp.toml` (`[glossary]`): `banned` (always error), `preferred` (guides suggestions), `proper_nouns` (suppress); all honor exclusion zones.
- `--exempt-blockquotes` and `[markdown].exempt_blockquotes = true` exclude Markdown blockquotes from scanning (off by default).
- `editorial_confidence` (low/medium/high): included in diagnostics; `low` forces `auto_fix_safe = false` and `needs_review = true`.
Bug Fixes
- YAML frontmatter preserves ASCII `"` and `'` scalar delimiters; body prose still converts to 「」.
- Refined the `消息` rule (no more false positives like 好消息/壞消息/消息來源); added a symmetric `元資料` rule mirroring `元數據` with `english: "metadata"` while keeping `詮釋資料`/`後設資料` acceptable.
Written for commit 269c8dd. Summary will update on new commits.