Skip to content

Commit 23905eb

Browse files
RiskeyLclaude
andauthored
chore: bulk formatting cleanup and post-writing format-check skills (#751)
* style: move trailing colons outside bold asterisks Relocates colons from inside bold markers (`**Start:**`) to after them (`**Start**:`) across en, zh, and ja. Colons that sit inside bold markers can break rendering when adjacent to other inline syntax, so keeping them outside produces more reliable output. Preserves colon width (half-width and full-width) and leaves code fences and inline code spans untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * style: normalize punctuation widths across en, zh, ja Corrects punctuation width to match the surrounding language: full-width forms (,。:?()) in Chinese and Japanese prose when adjacent to CJK characters, half-width forms in English prose. Japanese uses 、 for commas. One Latin-heavy list in ja/use-dify/workspace/model-providers.mdx was left untouched because converting only its trailing comma would break visual consistency. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: document frontmatter quoting rule Replaces the vague "quote values that contain special characters" guidance with a concrete rule: leave values unquoted by default, wrap in double quotes only when the value contains a colon followed by a space. This is the one case where YAML would otherwise misparse the value. Example frontmatter block updated to show bare values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * style: remove triple-asterisk horizontal rule dividers Deletes standalone *** divider lines across en, zh, and ja docs. Our pages rely on headings for structure, so horizontal rules add visual noise without carrying meaning. Excludes the versions/ folder and use-dify/knowledge/create-knowledge/import-text-data/readme.mdx where the dividers serve a layout purpose. Code fences are left untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * style: convert -ing headings to base form across en docs Applies the formatting guide's heading verb rule to ~111 headings and frontmatter titles: `Creating X` becomes `Create X`, `Publishing the Plugin` becomes `Publish the Plugin`, and so on. Preserves noun-phrase gerund headings that name a concept or feature (`Troubleshooting`, `Getting Started`, `Monitoring Data List`, `Streaming Output`, etc.) and skips heading-less gerunds used as adjectives (`Publishing Methods`, `Mounting & Volumes`). Heading numeric prefixes like `### 6.` are handled before verb detection. Also fixes two pre-existing missing-space issues on numbered headings in embedding-in-websites.mdx. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: add English and CJK format-check skills Adds two post-writing verification skills under .claude/skills/. Each combines a Python linter for mechanical rules with a SKILL.md prose section for judgment-call rules, following the same pattern as dify-docs-terminology-check. The English skill enforces every rule in writing-guides/formatting-guide.md. The CJK skill covers general rules plus the Chinese-specific rules in tools/translate/formatting-zh.md and the Japanese-specific rules in tools/translate/formatting-ja.md, auto-selecting the rule set based on the file path. Both linters skip fenced code blocks, inline code spans, and markdown link syntax where appropriate, and they exempt the translation disclaimer from the cross-language link check so zh/ja pages can continue to link back to their English source. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: centralize post-writing verification in writing-guides index Moves the canonical list of post-writing checks into writing-guides/index.md so it lives in one place, adds the two new format-check skills to that list, and shortens the Post-Writing Verification section in each writing skill (guides, api-reference, env-vars) to a one-line pointer. Also unwraps a pre-existing hard-wrapped paragraph in the api-reference skill's Cross-API Links section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address Copilot review comments Applies twelve fixes raised in the PR review: - Corrects the "Veriyfing" typo in a newly-renamed en heading and updates it to the imperative base form (Verify). - Removes a stray empty secondary frontmatter block from the same file. - Restores four garbled translation strings that were flagged by the reviewer (ja loop-variable description, ja conversation_id bullet, zh team-member access description, zh annotation-threshold sentence). - Repairs an unbalanced bold markup in ja/use-dify/workspace/team-members-management.mdx introduced by the earlier bold-colon relocation pass. - Converts half-width colons to full-width `:` after bold labels adjacent to CJK across 26 ja and 2 zh files (174 occurrences). The earlier bold-colon pass moved the colons outside `**...**` but left them half-width; Japanese and Chinese prose use `:` next to CJK. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address second round of Copilot review comments Covers nine findings from the follow-up review: - Extends the bold-colon width fix to any `**...**:` pattern on a line containing CJK, not only when the bolded text itself contains CJK. Catches 34 occurrences in 11 ja and zh files such as `**CentOS**:`, `**Windows**:`, and `- **Professional**:` where the bolded label is Latin but the surrounding prose is Japanese or Chinese. - Replaces `左上角` with `左上` in three ja files. The original translation carried over the Chinese term instead of using the natural Japanese form. - Unquotes 99 frontmatter values across 56 en files that were wrapped in single or double quotes without any `: ` in the value. Converts the four single-quoted values to unquoted, matching the new frontmatter quoting rule documented in the formatting guide. - Converts one stray `**Vector Search Settings**:` in en/use-dify/knowledge/create-knowledge/setting-indexing-methods.mdx back to `:` so English prose uses half-width punctuation. - Relaxes `_yaml_needs_quotes()` in both format-check linters to match the documented rule (quote only when the value contains `: `). The previous stricter check flagged cases the guide intentionally left unquoted, which would make the skills report violations that the guide does not treat as errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 7b291e5 commit 23905eb

221 files changed

Lines changed: 2841 additions & 1063 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.claude/skills/dify-docs-api-reference/SKILL.md

Lines changed: 2 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -145,8 +145,7 @@ Pattern: `{verb}{AppType}{Resource}`
145145

146146
### Cross-API Links
147147

148-
When a description mentions another endpoint, add a markdown link.
149-
Pattern: `/api-reference/{category}/{endpoint-name}` (kebab-case from endpoint summary).
148+
When a description mentions another endpoint, add a markdown link. Pattern: `/api-reference/{category}/{endpoint-name}` (kebab-case from endpoint summary).
150149

151150
## Parameters
152151

@@ -222,7 +221,4 @@ Exception: Tags without a create operation (e.g., Conversations). GET list comes
222221

223222
## Post-Writing Verification
224223

225-
After completing the document:
226-
227-
1. Invoke `dify-docs-terminology-check` to verify terminology consistency against the glossary and codebase.
228-
2. Invoke `dify-docs-reader-test` to verify it from the reader's perspective.
224+
After completing the document, run the post-writing checks listed in `writing-guides/index.md#post-writing-verification`.

.claude/skills/dify-docs-env-vars/SKILL.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -143,7 +143,4 @@ The automated translation pipeline does not cover `en/self-host/configuration/en
143143

144144
## Post-Writing Verification
145145

146-
After completing the document:
147-
148-
1. Invoke `dify-docs-terminology-check` to verify terminology consistency against the glossary and codebase.
149-
2. Invoke `dify-docs-reader-test` to verify it from the reader's perspective.
146+
After completing the document, run the post-writing checks listed in `writing-guides/index.md#post-writing-verification`.
Lines changed: 158 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,158 @@
1+
---
2+
name: dify-docs-format-check-cjk
3+
description: >
4+
Check formatting compliance in changed Chinese and Japanese documentation
5+
against writing-guides/formatting-guide.md, tools/translate/formatting-zh.md,
6+
and tools/translate/formatting-ja.md. Use after finalizing a translation
7+
batch, or when the user says "check formatting (zh/ja)", "check CJK
8+
formatting", or "format audit" on translated content.
9+
---
10+
11+
# CJK Formatting Check (Chinese + Japanese)
12+
13+
## Purpose
14+
15+
Verify changed Chinese (`zh/`) and Japanese (`ja/`) documentation against every rule in:
16+
17+
- `writing-guides/formatting-guide.md` — general rules
18+
- `tools/translate/formatting-zh.md` — Chinese-specific rules
19+
- `tools/translate/formatting-ja.md` — Japanese-specific rules
20+
21+
Mechanical rules are enforced by the linter script (`check-format-cjk.py`); judgment-call rules are checked by reading the file. The linter automatically selects the right rule set based on whether the path starts with `zh/` or `ja/`.
22+
23+
## Before Starting
24+
25+
Detect changed documentation files by combining:
26+
27+
- `git diff --name-only`
28+
- `git diff --cached --name-only`
29+
- Untracked files from `git status --porcelain` (lines starting with `??`)
30+
31+
Filter for `.mdx` and `.md` files under `zh/` or `ja/`. If no changed files are detected, ask the user which files to check.
32+
33+
Because translations are typically produced as a zh/ja pair from the same English source, it is natural to audit both languages in a single session.
34+
35+
## Checks
36+
37+
### Part 1 — Deterministic (run the linter)
38+
39+
Run the linter:
40+
41+
```bash
42+
python3 .claude/skills/dify-docs-format-check-cjk/check-format-cjk.py <file> [<file> ...]
43+
```
44+
45+
The script selects zh or ja rules based on the file path. Rules fall into three groups: shared (both zh and ja), zh-only, and ja-only.
46+
47+
**Shared structural rules (zh + ja)**
48+
49+
- `F-title-missing` — frontmatter missing `title`.
50+
- `F-desc-trailing-period``description` ends with a period.
51+
- `F-quote-needed` / `F-quote-unnecessary` / `F-single-quote` — frontmatter quoting rules.
52+
- `F-blank-after-fm` — no blank line between frontmatter close and body.
53+
- `H-trailing-hash`, `H-blank-before`, `H-blank-after`, `H-skip-level` — heading structure rules from the general guide.
54+
- `B-trailing-colon-inside` — colon inside `**...**`.
55+
- `L-asterisk-bullet`, `L-nested-indent`, `L-blank-before`, `L-blank-after` — list structure.
56+
- `C-no-language`, `C-blank-before`, `C-blank-after` — code block rules.
57+
- `Li-click-here`, `Li-http-external` — link rules.
58+
- `I-raw-img-tag`, `I-alt-too-long`, `I-caption-alt-mismatch`, `I-filename-*` — image rules (same set as the EN skill).
59+
- `M-tab-no-title`, `M-component-blank-before`, `M-component-blank-after` — Mintlify component rules.
60+
- `S-double-blank`, `S-trailing-whitespace` — spacing.
61+
- `P-em-dash-spaces`, `P-en-dash-spaces` — general punctuation.
62+
63+
**Shared CJK rules (zh + ja)**
64+
65+
- `CJK-latin-spacing` — CJK character directly adjacent to Latin letter, digit, or backtick without a space. Exceptions: punctuation boundary, start/end of line, inside code/URLs.
66+
- `CJK-halfwidth-punct` — half-width punctuation `, . : ; ? ! ( )` directly adjacent to a CJK character. (Slash and other exceptions are handled by language-specific rules below.)
67+
- `CJK-bold-no-space` — bold span `**...**` adjacent to CJK character without a space on each side.
68+
- `CJK-link-no-space` — markdown link text adjacent to CJK character without a space on each side.
69+
- `CJK-italic``*text*` italic used on a CJK span. Chinese and Japanese should use bold, never italic.
70+
- `CJK-em-dash` — em dash `` or double em dash `——` appears in CJK text. Restructure the sentence.
71+
- `CJK-disclaimer-missing` — translation disclaimer (`<Note> ⚠️ ...`) missing directly below the frontmatter.
72+
- `CJK-cross-lang-link` — internal link begins with the wrong language prefix (e.g., `/en/...` inside a `zh/` file).
73+
74+
**Heading rules**
75+
76+
- `H-heading-end-punct` — CJK heading ends with sentence-ending punctuation (`。,、;:`).
77+
78+
**Chinese-only rules**
79+
80+
- `ZH-ascii-ellipsis``...` used where Chinese ellipsis `……` is expected.
81+
- `ZH-fullwidth-slash` — full-width slash `` used; must be `/`.
82+
- `ZH-quotes` — mainland-style double or single quotation marks `""`, `''` used; must be corner brackets `「」` (single) or `『』` (nested).
83+
- `ZH-range-hyphen` — numeric range uses `-` or ``; must use ``.
84+
- `ZH-percent-space` — space between a digit and `%` or `°`.
85+
86+
**Japanese-only rules**
87+
88+
- `JA-fullwidth-digit` — full-width digit used (``, ``, ...).
89+
- `JA-fullwidth-latin` — full-width Latin letter used (``, ``, ...).
90+
- `JA-fullwidth-space` — full-width space (` `) used.
91+
- `JA-sentence-too-long` — sentence longer than 80 Japanese characters.
92+
- `JA-go-prefix``` prefix used on a verb in the "avoid" list (`ご確認ください`, `ご参照ください`, `ご入力ください`). `ご利用` is allowed.
93+
- `JA-heading-sentence-ending` — heading ends with `します` / `します。` / `します?`, indicating a full-sentence rather than noun-phrase form.
94+
- `JA-style-mix` — the file contains both です/ます and だ/である forms in body text. Only one register should be used.
95+
96+
### Part 2 — Judgment-call review (LLM reads each file)
97+
98+
For each changed file, read it and look for:
99+
100+
**Translatable elements**
101+
102+
- Tab titles (`<Tab title="...">`) translated with the glossary value.
103+
- Frame captions and image alt text translated.
104+
- Bold UI labels translated, matching the codebase i18n file (`web/i18n/zh-Hans/` for zh, `web/i18n/ja-JP/` for ja). Cross-check with the terminology-check skill.
105+
- Natural-language prompt examples inside code blocks translated. Variable placeholders (`{{variable_name}}`) stay unchanged.
106+
107+
**Anchor translation (zh/ja)**
108+
109+
- Cross-references `[text](#anchor)` or `[text](/path#anchor)` use the translated heading slug, not the English original. Example: the heading `## 响应` produces `#响应`, not `#response`.
110+
111+
**Chinese style**
112+
113+
- Enumeration comma `` used for parallel items within a sentence (where Chinese separates items that would use commas in English).
114+
- Chinese ellipsis `……` used correctly; not combined with `` in the same phrase.
115+
- Arabic numerals used for technical content (`3 种`, not `三种`) — judgment call since some idiomatic expressions use Chinese numerals.
116+
- Translationese patterns from `tools/translate/formatting-zh.md`: redundant `你的`, unnecessary ``, `当...时` wrappers, `能够` instead of ``, `可以` where `` would suffice. Flag sentences that read as machine-translated.
117+
118+
**Japanese style**
119+
120+
- `です/ます` maintained in body text.
121+
- Headings are noun phrases, not full sentences.
122+
- Katakana long-vowel mark rules: short loanwords (3 morae or fewer) keep the trailing ``; longer ones drop it. Established compound katakana terms (ワークフロー, ナレッジベース) match the glossary.
123+
- Middle dot `` only where needed for readability; established compound terms have no middle dot.
124+
- Translationese patterns from `tools/translate/formatting-ja.md`: redundant `あなたの`, `〜することができます` instead of `〜できます`, `〜とき/〜場合` wrappers, stacked ``. Flag sentences that read as machine-translated.
125+
126+
**General**
127+
128+
- Terminology matches the glossary (`writing-guides/glossary.md`). Prefer invoking the terminology-check skill for rigorous checking rather than duplicating its logic here.
129+
130+
## Output Format
131+
132+
```
133+
## CJK Formatting Check Results
134+
135+
### File: {path} ({lang})
136+
137+
**Deterministic violations** ({n})
138+
- Line {n} [{rule-id}]: {message}
139+
- ...
140+
141+
**Judgment-call findings** ({n})
142+
- Line {n}: {description of issue} — {rule area}
143+
- ...
144+
145+
**Clean checks**
146+
- ✅ Disclaimer present
147+
- ✅ No bold/CJK spacing issues
148+
- ...
149+
```
150+
151+
Group by file. If a file has no issues at all, report a single ✅ line.
152+
153+
## Important
154+
155+
- Do NOT modify any files. This is a read-only audit.
156+
- When a deterministic rule flags something that is clearly intentional on inspection (e.g., a half-width comma inside a Latin acronym, a mainland quotation mark quoted from an external source), surface it but note the ambiguity so the user can decide.
157+
- The terminology-check skill is the authoritative source for glossary and UI-label verification. If a finding overlaps, defer to that skill and just note the overlap.
158+
- Japanese sentence-length and style-mix checks are heuristic. They flag candidates for human review; they are not definitive rules violations.

0 commit comments

Comments
 (0)