| 1 | +--- |
| 2 | +name: doris-doc-optimize |
| 3 | +description: Restructure and SEO/GEO-optimize an existing Apache Doris user documentation file (`.md` / `.mdx`), primarily under `i18n/zh-CN/docusaurus-plugin-content-docs-next/current/` or `docs-next/`. The skill reorganizes the doc from a user-scenario perspective, applies tables and lists for parameters / scenarios / FAQs, fixes formatting (4-space indent, code-block language tags, blank lines, Chinese/English spacing, command-concatenation bugs), validates every external link with `curl` and every `/images/...` reference against `static/`, and updates the frontmatter `description` + `keywords` per the bundled `./references/seo-geo.md` guide — all without changing the original meaning or dropping any technical content (commands, parameters, YAML, sample outputs, images are preserved verbatim). Use this skill whenever the user points at a single Doris doc file and asks to "优化这篇文档", "重新组织这篇文档", "帮我优化文档结构", "对这篇文章进行 SEO 优化", "按 SEO/GEO 优化这篇文档", "调整一下这篇文档的结构", "把这篇文档润色一下", "optimize this doc", "restructure this doc", "polish this doc", or anything similar — even if the user only says "make this doc better" or "这篇文档读起来不顺" while pointing at a Doris doc path. Do NOT use this skill for translating between languages (use `doris-translate-zh-to-en` instead), writing a brand-new doc from scratch (use `doris-feature-card` or `doc-coauthoring`), or pure link-checking without restructure (use `check-md-links` instead). |
| 4 | +--- |
| 5 | + |
| 6 | +# Doris Doc Optimize |
| 7 | + |
| 8 | +Optimize an existing Apache Doris user-documentation file in place: restructure for clarity, apply SEO/GEO best practices, fix formatting, and verify links — all while preserving every piece of technical content from the original. |
| 9 | + |
| 10 | +## Audience and intent |
| 11 | + |
| 12 | +The user has a Doris doc that is technically correct but hard to scan, has weak SEO metadata, or has accumulated small format / grammar issues. They want the same content reorganized to be more useful for readers and search engines. They are **not** asking for new technical material, translation, or a from-scratch rewrite. |
| 13 | + |
| 14 | +The most common files this skill operates on: |
| 15 | +- `i18n/zh-CN/docusaurus-plugin-content-docs-next/current/**/*.md` (Chinese docs, JSON-style frontmatter inside `---` fences) |
| 16 | +- `docs-next/**/*.md` or `docs-next/**/*.mdx` (English docs, YAML frontmatter) |
| 17 | + |
| 18 | +## Inputs |
| 19 | + |
| 20 | +- **Required**: a path to one `.md` or `.mdx` file (typically the user `@`-references it or pastes the path). If the user gives multiple files or a directory, stop and ask which single file to optimize — this skill operates on one file at a time. |
| 21 | +- **Optional**: extra instructions the user adds (e.g. "保留所有截图", "再加一节 troubleshooting"). Treat these as constraints layered on top of the default workflow. |
| 22 | + |
| 23 | +If no file path is provided, ask for one before doing anything else. Do not guess. |
| 24 | + |
| 25 | +## Paths used by this skill |
| 26 | + |
| 27 | +At runtime the working directory is the `doris-website` repo root. The paths below mix two conventions — pay attention to which is which: |
| 28 | + |
| 29 | +- **SEO/GEO guide (must read at runtime)**: `./references/seo-geo.md` **relative to this skill's directory** — i.e. resolve it as `doc-tools/skills/doris-doc-optimize/references/seo-geo.md` from the repo root. This is the source of truth for description length, keyword strategy, knowledge-type meta comments, and structural patterns. Read it on every run; it can evolve. |
| 30 | +- **Static image assets** (repo-relative): `./static/` — internal references like `/images/next/install/foo.png` must resolve to `./static/images/next/install/foo.png`. |
| 31 | +- **Internal doc cross-references**: never include `.md` / `.mdx` extensions (project convention). |
| 32 | + |
| 33 | +## Workflow |
| 34 | + |
| 35 | +Run the steps in order. Step 1 (read the file) and Step 2 (read the SEO/GEO guide) can run in parallel. |
| 36 | + |
| 37 | +### Step 1 — Read the target file |
| 38 | + |
| 39 | +Read the entire file. Note: |
| 40 | +- Frontmatter style (JSON-inside-`---` for zh-CN Doris docs, YAML elsewhere) — preserve it exactly. |
| 41 | +- All code blocks, YAML samples, image references, and example outputs — these are immutable content; you may regroup them but never delete or rewrite their substance. |
| 42 | +- The language (zh-CN vs en) — this affects keyword strategy and section names. |
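The frontmatter-style check in Step 1 can be sketched as a small shell helper. This is a minimal sketch, not part of the skill itself: the function name `frontmatter_style` is hypothetical, and it assumes the opening `---` fence sits on line 1 with no trailing whitespace.

```shell
# Hypothetical helper: report whether a doc's frontmatter is JSON-style or YAML.
# Assumption: the opening `---` fence is exactly line 1 of the file.
frontmatter_style() {
    file="$1"
    # Without an opening fence on line 1 there is no frontmatter to classify.
    if [ "$(sed -n '1p' "$file")" != "---" ]; then
        echo "none"
        return 0
    fi
    # The first non-blank line after the fence decides the style.
    first=$(sed -n '2,$p' "$file" | grep -v '^[[:space:]]*$' | head -n 1)
    case "$first" in
        "{"*) echo "json" ;;   # JSON object inside the fences (zh-CN Doris docs)
        *)    echo "yaml" ;;   # anything else is treated as YAML
    esac
}
```

Calling `frontmatter_style path/to/doc.md` before and after the rewrite gives a quick way to confirm the frontmatter shape did not change.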
| 43 | + |
| 44 | +### Step 2 — Read `./references/seo-geo.md` |
| 45 | + |
| 46 | +Always read the bundled SEO/GEO guide at runtime so the latest conventions apply. The path is relative to this skill's directory — resolve it to `doc-tools/skills/doris-doc-optimize/references/seo-geo.md` from the repo root. Pull from it: |
| 47 | +- The frontmatter checklist (title / description / keywords). |
| 48 | +- The GEO knowledge-type and 适用场景 meta-comment patterns. |
| 49 | +- The structural recommendations per doc type (Guide / Reference / Feature / Tutorial / FAQ / Mixed). |
| 50 | + |
| 51 | +### Step 3 — Plan the new structure |
| 52 | + |
| 53 | +Before writing, sketch the target outline. The goal is **scenario-driven** organization — the reader should be able to answer "is this doc for me?" within the first screen, then follow a clear path to action. |
| 54 | + |
| 55 | +A typical reshape adds these sections near the top (use only the ones that apply — don't pad): |
| 56 | + |
| 57 | +1. **Opening paragraph** — one short paragraph framing who the doc is for and what they will achieve. |
| 58 | +2. **适用场景 / Use cases** — a table when there are multiple scenarios; omit if the doc has a single obvious use case. |
| 59 | +3. **前置条件 / Prerequisites** — a bulleted list of environment / version / permission requirements; omit if there are none. |
| 60 | +4. **流程总览 / Overview** — a numbered list of the high-level steps; only when the doc has a multi-step procedure (≥3 steps). |
| 61 | + |
| 62 | +For the body: |
| 63 | +- Group related content into clear H2 sections, with H3 sub-steps. |
| 64 | +- Each procedural block follows **目的 → 命令 → 说明** (intent → command → explanation), so a reader copying commands always knows why. |
| 65 | +- Move parameter explanations into tables (one row per field). |
| 66 | +- End with a **常见问题 / Troubleshooting** table when the doc has clear failure modes (the original had warnings, "if X then Y" prose, or known pitfalls). If the original has none, do not invent failure modes. |
| 67 | + |
| 68 | +**Important**: every heading from the original must have its content reachable in the new version. You may rename, regroup, or split sections — but if any original H2/H3 carried content, that content must land somewhere in the output. |
| 69 | + |
| 70 | +### Step 4 — Rewrite the file |
| 71 | + |
| 72 | +Apply the plan. While rewriting, enforce: |
| 73 | + |
| 74 | +**Content preservation** |
| 75 | +- All commands, YAML, JSON, sample outputs, and image references are kept verbatim. You may move them, but not edit them — except to fix obvious typos in the prose around them, or to fix code-block-internal bugs that are clearly wrong (e.g. two shell commands concatenated with no newline; missing closing brace in a non-functional snippet). When in doubt, leave it as-is and note it in the summary. |
| 76 | +- Never invent new commands, parameters, version numbers, or facts. If the original is ambiguous, preserve the ambiguity. |
| 77 | + |
| 78 | +**Writing** |
| 79 | +- Tighten paragraphs to ≤3 sentences each. |
| 80 | +- Unify terminology — `Kubernetes` (not `kubernetes` or `K8s` mid-sentence), `Prometheus`, `Grafana`, `Helm`, `Doris`, etc. Match the canonical casing of each product name. |
| 81 | +- Insert a space between Chinese characters and adjacent ASCII letters / digits (e.g. `部署 Prometheus`, not `部署Prometheus`). |
| 82 | +- Fix obvious grammar, punctuation, and typos in the prose. |
| 83 | + |
| 84 | +**Structure** |
| 85 | +- Tables for: parameter explanations, scenario matrices, troubleshooting, comparison. |
| 86 | +- Ordered lists for: sequential steps. |
| 87 | +- Unordered lists for: prerequisites, non-sequential items, bullet-point summaries. |
| 88 | + |
| 89 | +**Format hygiene** |
| 90 | +- Indentation: 4 spaces for nested Markdown lists and for YAML inside fenced blocks. For YAML this is a whitespace-only change; keys and values stay verbatim. |
| 91 | +- Code-block language tags: `shell` / `bash` for commands, `yaml` for YAML, `json` for JSON, `sql` for SQL, `text` for plain output (Pod listings, log lines, expected stdout). Do not use `shell` for non-shell output. |
| 92 | +- One blank line between blocks; no trailing blank lines at end of file. |
| 93 | +- Spaces around inline code: `` 访问 `http://...` `` style. |
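One way to spot code blocks that are missing a language tag is a short `awk` pass over the file. This is a rough sketch: the function name `untagged_fences` is hypothetical, and it only understands plain triple-backtick fences (not `~~~` or longer fences).

```shell
# Hypothetical helper: list opening code fences that carry no language tag.
# Prints "<line number>: untagged fence" for each one found in file $1.
untagged_fences() {
    awk '
        /^[ \t]*```/ {
            if (in_fence) {
                in_fence = 0            # this line closes the current block
            } else {
                in_fence = 1            # this line opens a new block
                if ($0 ~ /^[ \t]*```[ \t]*$/)
                    print NR ": untagged fence"
            }
        }
    ' "$1"
}
```

Each reported line still needs a human decision on the right tag (`shell`, `yaml`, `text`, and so on); the helper only finds the candidates.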
| 94 | + |
| 95 | +**SEO / GEO** |
| 96 | +- Update frontmatter `description` to be problem-oriented and ≤120 chars. Preserve the existing frontmatter shape (JSON-style for zh-CN Doris docs, YAML elsewhere). |
| 97 | +- Expand `keywords` to cover synonyms, error/scenario keywords, and (for zh-CN) Chinese long-tail variants. Keep the original keywords; only add. |
| 98 | +- Insert `<!-- 知识类型: ... -->` and `<!-- 适用场景: ... -->` HTML comments at the top of major sections per `./references/seo-geo.md`. Place them above the H2/H3 they describe, not inline. |
| 99 | + |
| 100 | +**Internal links** |
| 101 | +- Doc cross-references must NOT include `.md` / `.mdx` extensions (project convention). If the original violates this, fix it. |
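The extension-stripping rule can be sketched as a `sed` filter. This is an illustrative sketch only: the name `strip_doc_ext` is hypothetical, and the pattern deliberately skips any link target containing a colon so that external URLs ending in `.md` are left alone.

```shell
# Hypothetical filter: strip .md / .mdx from relative Markdown link targets.
# Reads Markdown on stdin, writes the result to stdout.
# Targets containing ":" (for example https://...) are not touched,
# and a trailing #anchor is preserved.
strip_doc_ext() {
    sed -E 's/\]\(([^):]*)\.mdx?(#[^)]*)?\)/](\1\2)/g'
}
```

As with any mechanical rewrite, the output should be reviewed before it replaces the original prose, since link text inside fenced code blocks would be rewritten too.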
| 102 | + |
| 103 | +### Step 5 — Validate links |
| 104 | + |
| 105 | +Run the three checks below in parallel; they are fast and worth the time. |
| 106 | + |
| 107 | +**External `http(s)` links** — for each unique URL: |
| 108 | +```shell |
| 109 | +curl -sI -o /dev/null -w "%{http_code}\n" <url> |
| 110 | +``` |
| 111 | +If HEAD returns 4xx (especially 403/405, which often mean the server rejects HEAD requests), retry with `curl -sIL -o /dev/null -w "%{http_code}\n" <url>` (follow redirects), and if that still fails, with `curl -sL -o /dev/null -w "%{http_code}\n" <url>` (GET). Treat 2xx and 3xx as alive. Flag anything else in the summary; do not silently remove the link. |
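The HEAD-then-GET fallback can be sketched as two small functions. This is a sketch, not the skill's required implementation: `is_alive` and `check_url` are hypothetical names, and for simplicity the sketch retries on any non-2xx/3xx status rather than on 4xx only.

```shell
# Hypothetical helper: succeed when an HTTP status code counts as "alive".
is_alive() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# Hypothetical helper: check one URL with HEAD, then HEAD + redirects, then GET.
# Prints "OK <code> <url>" or "FLAG <code> <url>".
check_url() {
    url="$1"
    code=$(curl -sI -o /dev/null -w "%{http_code}" "$url") || true
    if ! is_alive "$code"; then
        # Some servers reject plain HEAD; follow redirects first.
        code=$(curl -sIL -o /dev/null -w "%{http_code}" "$url") || true
    fi
    if ! is_alive "$code"; then
        # Last resort: a full GET, discarding the body.
        code=$(curl -sL -o /dev/null -w "%{http_code}" "$url") || true
    fi
    if is_alive "$code"; then
        echo "OK $code $url"
    else
        echo "FLAG $code $url"
    fi
}
```

A `FLAG` line is only a candidate for the summary; the link itself stays in the doc either way.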
| 112 | + |
| 113 | +**Image references** — for each `/images/...` path, check that `./static/images/...` exists: |
| 114 | +```shell |
| 115 | +test -f ./static/images/next/install/foo.png && echo OK || echo MISSING |
| 116 | +``` |
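To run the same check for every image reference in a file, the `test -f` call can be wrapped in a loop. This is a rough sketch: the function name `check_images` is hypothetical, and the `grep` pattern is a simplification that will also pick up external URLs whose path happens to contain `/images/`.

```shell
# Hypothetical helper: verify each /images/... reference in a doc.
# $1 = doc file, $2 = static root (defaults to ./static).
# Prints "OK <ref>" or "MISSING <ref>" once per unique reference.
check_images() {
    doc="$1"
    static_root="${2:-./static}"
    # Extract unique /images/... paths; a path ends at whitespace, ")" or a quote.
    grep -oE '/images/[^[:space:])"]+' "$doc" | sort -u | while read -r ref; do
        if [ -f "$static_root$ref" ]; then
            echo "OK $ref"
        else
            echo "MISSING $ref"
        fi
    done
}
```

Running `check_images path/to/doc.md` from the repo root lists every reference with its status, which maps directly onto the image line of the final summary.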
| 117 | + |
| 118 | +**Internal doc links** — verify the target file exists (with `.md` / `.mdx` / `index.md` fallback, since the project strips extensions). If a link points to a non-existent doc, flag it; do not delete. |
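The extension fallback for internal links can be sketched as follows. This is a minimal sketch under assumptions: `resolve_doc_link` is a hypothetical name, and the caller is assumed to pass the directory that the link is relative to.

```shell
# Hypothetical helper: resolve an extension-less doc link to a file on disk.
# $1 = directory the link is relative to, $2 = link target without extension.
# Prints the resolved path, or "MISSING <target>" with a non-zero exit status.
resolve_doc_link() {
    base="$1"
    target="${2%%#*}"    # drop any #anchor before touching the filesystem
    for candidate in "$base/$target.md" "$base/$target.mdx" "$base/$target/index.md"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "MISSING $2"
    return 1
}
```

A `MISSING` result means the link should be flagged in the summary, not deleted from the doc.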
| 119 | + |
| 120 | +### Step 6 — Write the file |
| 121 | + |
| 122 | +Overwrite the target file in place using the `Write` tool. If you have not read the file yet in this turn, read it first (you usually will have, in Step 1). |
| 123 | + |
| 124 | +### Step 7 — Report |
| 125 | + |
| 126 | +Print a short summary listing: |
| 127 | + |
| 128 | +1. **Structural changes** — sections added (适用场景, 前置条件, 流程总览, 常见问题, etc.), sections merged, parameter table extracted from prose. |
| 129 | +2. **SEO/GEO additions** — new `description` (quote it), `keywords` added, knowledge-type meta comments inserted, FAQ table added. |
| 130 | +3. **Bug fixes** — typos / grammar / command-concatenation fixes / wrong code-block language tags / `.md` extensions stripped from internal links. List concretely so the user can spot-check. |
| 131 | +4. **Link-check results** — count of external links checked (all green / N flagged), images verified (all present / N missing). For any flag, name the URL or path. |
| 132 | + |
| 133 | +Keep the summary under ~15 lines. The user can read the diff; the summary's job is to highlight non-obvious changes and any flags that need their attention. |
| 134 | + |
| 135 | +## Constraints and guardrails |
| 136 | + |
| 137 | +These exist because the failure modes for this skill are silent content loss and hallucinated facts, both of which are worse than leaving the doc unchanged. |
| 138 | + |
| 139 | +- **Never invent**: no new commands, parameters, version numbers, error messages, or troubleshooting steps that weren't in the original (or obviously implied by it). |
| 140 | +- **Never delete**: code blocks, image references, example outputs, and admonitions stay. If a section feels redundant, merge it; don't drop it. |
| 141 | +- **Preserve frontmatter shape**: zh-CN Doris docs use a JSON object inside `---` fences — keep that exact form. English docs use standard YAML — keep that. |
| 142 | +- **No `.md` / `.mdx` extensions in internal doc cross-references** — project convention. |
| 143 | +- **Don't translate**: if the user wants Chinese ↔ English, that is `doris-translate-zh-to-en`, not this skill. |
| 144 | +- **One file per invocation**: if asked to optimize a folder, ask which single file to start with. |
| 145 | + |
| 146 | +## Reference example |
| 147 | + |
| 148 | +The canonical example is the install-prometheus-and-grafana doc under `i18n/zh-CN/docusaurus-plugin-content-docs-next/current/install/deploy-on-kubernetes/separating-storage-compute/`. The optimization: |
| 149 | + |
| 150 | +- Added `适用场景` (table), `前置条件` (bullet list), `部署流程总览` (numbered list) at the top. |
| 151 | +- Split each `Step N` into `N.1 / N.2 / N.3` sub-steps, each with intent → command → explanation. |
| 152 | +- Extracted ServiceMonitor YAML field meanings into a parameter table. |
| 153 | +- Added a `常见问题` table at the end covering the four most common failure modes implied by the original prose (Targets not visible, Targets DOWN, Grafana empty, port unreachable). |
| 154 | +- Updated `description` to a 50-char problem-oriented summary; expanded `keywords` with Chinese long-tail terms (`存算分离`, `集群监控`, `指标采集`, `Dashboard`). |
| 155 | +- Inserted `<!-- 知识类型 -->` and `<!-- 适用场景 -->` meta comments at the top of major sections. |
| 156 | +- Fixed an upstream bug where `helm repo add ... helm-charts` and `helm repo update` had been concatenated into one line with no separator. |
| 157 | +- Verified all three external links (`get-helm-3`, prometheus-community charts repo, dashboard JSON) returned 200, and confirmed all three image files existed under `static/images/next/install/`. |
| 158 | + |
| 159 | +That run is a good shape to emulate, but adapt to the doc in front of you — don't force these exact sections onto a doc that doesn't need them. |
| 160 | + |
| 161 | +## Self-check before reporting done |
| 162 | + |
| 163 | +Before printing the summary: |
| 164 | +- Diff the new file against the original mentally: is every code block, image, and YAML sample still present? |
| 165 | +- Did you read `./references/seo-geo.md` this run? If not, read it now and re-check the SEO/GEO edits. |
| 166 | +- Did you actually run `curl` on every external link, not just eyeball them? |
| 167 | +- Is the frontmatter shape unchanged? |
| 168 | +- Are there any internal doc cross-references with `.md` / `.mdx` still on them? |
| 169 | + |
| 170 | +If any check fails, fix the problem before reporting. |