diff --git a/.github/workflows/docs-next-build.yml b/.github/workflows/docs-next-build.yml index 06731855b803f..090be3ae24b7e 100644 --- a/.github/workflows/docs-next-build.yml +++ b/.github/workflows/docs-next-build.yml @@ -53,6 +53,36 @@ jobs: echo "No docs-next-related changes detected; skipping build." fi + - name: Enforce docs-next mirror for legacy 4.x doc changes + run: | + set -e + BASE_SHA="${{ github.event.pull_request.base.sha }}" + HEAD_SHA="${{ github.event.pull_request.head.sha }}" + CHANGED=$(git diff --name-only "$BASE_SHA" "$HEAD_SHA") + + LEGACY_EN=$(echo "$CHANGED" | grep -E '^versioned_docs/version-4\.x/' || true) + LEGACY_ZH=$(echo "$CHANGED" | grep -E '^i18n/zh-CN/docusaurus-plugin-content-docs/version-4\.x/' || true) + NEXT_EN=$(echo "$CHANGED" | grep -E '^docs-next/' || true) + NEXT_ZH=$(echo "$CHANGED" | grep -E '^i18n/zh-CN/docusaurus-plugin-content-docs-next/current/' || true) + + FAIL=0 + if [ -n "$LEGACY_EN" ] && [ -z "$NEXT_EN" ]; then + echo "::error::Changes detected under versioned_docs/version-4.x/ but no corresponding changes under docs-next/. Please mirror the edits into docs-next/." + echo "Legacy EN files changed:" + echo "$LEGACY_EN" + FAIL=1 + fi + if [ -n "$LEGACY_ZH" ] && [ -z "$NEXT_ZH" ]; then + echo "::error::Changes detected under i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/ but no corresponding changes under i18n/zh-CN/docusaurus-plugin-content-docs-next/current/. Please mirror the edits." 
+ echo "Legacy zh-CN files changed:" + echo "$LEGACY_ZH" + FAIL=1 + fi + if [ "$FAIL" -ne 0 ]; then + exit 1 + fi + echo "✅ Legacy 4.x / docs-next mirroring check passed" + - name: Use Node.js if: steps.changes.outputs.relevant == 'true' uses: actions/setup-node@v4 diff --git a/prompt/img-illustration-style-guide.md b/doc-tools/prompts/img-illustration-style-guide.md similarity index 100% rename from prompt/img-illustration-style-guide.md rename to doc-tools/prompts/img-illustration-style-guide.md diff --git a/doc-tools/skills/doris-doc-optimize/SKILL.md b/doc-tools/skills/doris-doc-optimize/SKILL.md new file mode 100644 index 0000000000000..5168e10f6866e --- /dev/null +++ b/doc-tools/skills/doris-doc-optimize/SKILL.md @@ -0,0 +1,170 @@ +--- +name: doris-doc-optimize +description: Restructure and SEO/GEO-optimize an existing Apache Doris user documentation file (`.md` / `.mdx`), primarily under `i18n/zh-CN/docusaurus-plugin-content-docs-next/current/` or `docs-next/`. The skill reorganizes the doc from a user-scenario perspective, applies tables and lists for parameters / scenarios / FAQs, fixes formatting (4-space indent, code-block language tags, blank lines, Chinese/English spacing, command-concatenation bugs), validates every external link with `curl` and every `/images/...` reference against `static/`, and updates the frontmatter `description` + `keywords` per the bundled `./references/seo-geo.md` guide — all without changing the original meaning or dropping any technical content (commands, parameters, YAML, sample outputs, images are preserved verbatim). Use this skill whenever the user points at a single Doris doc file and asks to "优化这篇文档", "重新组织这篇文档", "帮我优化文档结构", "对这篇文章进行 SEO 优化", "按 SEO/GEO 优化这篇文档", "调整一下这篇文档的结构", "把这篇文档润色一下", "optimize this doc", "restructure this doc", "polish this doc", or anything similar — even if the user only says "make this doc better" or "这篇文档读起来不顺" while pointing at a Doris doc path. 
Do NOT use this skill for translating between languages (use `doris-translate-zh-to-en` instead), writing a brand-new doc from scratch (use `doris-feature-card` or `doc-coauthoring`), or pure link-checking without restructure (use `check-md-links` instead). +--- + +# Doris Doc Optimize + +Optimize an existing Apache Doris user-documentation file in place: restructure for clarity, apply SEO/GEO best practices, fix formatting, and verify links — all while preserving every piece of technical content from the original. + +## Audience and intent + +The user has a Doris doc that is technically correct but hard to scan, has weak SEO metadata, or has accumulated small format / grammar issues. They want the same content reorganized to be more useful for readers and search engines. They are **not** asking for new technical material, translation, or a from-scratch rewrite. + +The most common files this skill operates on: +- `i18n/zh-CN/docusaurus-plugin-content-docs-next/current/**/*.md` (Chinese docs, JSON-style frontmatter inside `---` fences) +- `docs-next/**/*.md` or `docs-next/**/*.mdx` (English docs, YAML frontmatter) + +## Inputs + +- **Required**: a path to one `.md` or `.mdx` file (typically the user `@`-references it or pastes the path). If the user gives multiple files or a directory, stop and ask which single file to optimize — this skill operates on one file at a time. +- **Optional**: extra instructions the user adds (e.g. "保留所有截图", "再加一节 troubleshooting"). Treat these as constraints layered on top of the default workflow. + +If no file path is provided, ask for one before doing anything else. Do not guess. + +## Paths used by this skill + +At runtime the working directory is the `doris-website` repo root. The paths below mix two conventions — pay attention to which is which: + +- **SEO/GEO guide (must read at runtime)**: `./references/seo-geo.md` **relative to this skill's directory** — i.e. 
resolve it as `doc-tools/skills/doris-doc-optimize/references/seo-geo.md` from the repo root. This is the source of truth for description length, keyword strategy, knowledge-type meta comments, and structural patterns. Read it on every run; it can evolve. +- **Static image assets** (repo-relative): `./static/` — internal references like `/images/next/install/foo.png` must resolve to `./static/images/next/install/foo.png`. +- **Internal doc cross-references**: never include `.md` / `.mdx` extensions (project convention). + +## Workflow + +Run the steps in order. Step 1 (read the file) and step 2 (read the SEO/GEO guide) can run in parallel. + +### Step 1 — Read the target file + +Read the entire file. Note: +- Frontmatter style (JSON-inside-`---` for zh-CN Doris docs, YAML elsewhere) — preserve it exactly. +- All code blocks, YAML samples, image references, and example outputs — these are immutable content; you may regroup them but never delete or rewrite their substance. +- The language (zh-CN vs en) — this affects keyword strategy and section names. + +### Step 2 — Read `./references/seo-geo.md` + +Always read the bundled SEO/GEO guide at runtime so the latest conventions apply. The path is relative to this skill's directory — resolve it to `doc-tools/skills/doris-doc-optimize/references/seo-geo.md` from the repo root. Pull from it: +- The frontmatter checklist (title / description / keywords). +- The GEO knowledge-type and 适用场景 meta-comment patterns. +- The structural recommendations per doc type (Guide / Reference / Feature / Tutorial / FAQ / Mixed). + +### Step 3 — Plan the new structure + +Before writing, sketch the target outline. The goal is **scenario-driven** organization — the reader should be able to answer "is this doc for me?" within the first screen, then follow a clear path to action. + +A typical reshape adds these sections near the top (use only the ones that apply — don't pad): + +1. 
**Opening paragraph** — one short paragraph framing who the doc is for and what they will achieve. +2. **适用场景 / Use cases** — a table when there are multiple scenarios; omit if the doc has a single obvious use case. +3. **前置条件 / Prerequisites** — a bulleted list of environment / version / permission requirements; omit if there are none. +4. **流程总览 / Overview** — a numbered list of the high-level steps; only when the doc has a multi-step procedure (≥3 steps). + +For the body: +- Group related content into clear H2 sections, with H3 sub-steps. +- Each procedural block follows **目的 → 命令 → 说明** (intent → command → explanation), so a reader copying commands always knows why. +- Move parameter explanations into tables (one row per field). +- End with a **常见问题 / Troubleshooting** table when the doc has clear failure modes (the original had warnings, "if X then Y" prose, or known pitfalls). If the original has none, do not invent failure modes. + +**Important**: every heading from the original must have its content reachable in the new version. You may rename, regroup, or split sections — but if any original H2/H3 carried content, that content must land somewhere in the output. + +### Step 4 — Rewrite the file + +Apply the plan. While rewriting, enforce: + +**Content preservation** +- All commands, YAML, JSON, sample outputs, and image references are kept verbatim. You may move them, but not edit them — except to fix obvious typos in the prose around them, or to fix code-block-internal bugs that are clearly wrong (e.g. two shell commands concatenated with no newline; missing closing brace in a non-functional snippet). When in doubt, leave it as-is and note it in the summary. +- Never invent new commands, parameters, version numbers, or facts. If the original is ambiguous, preserve the ambiguity. + +**Writing** +- Tighten paragraphs to ≤3 sentences each. 
+- Unify terminology — `Kubernetes` (not `kubernetes` or `K8s` mid-sentence), `Prometheus`, `Grafana`, `Helm`, `Doris`, etc. Match the canonical casing of each product name. +- Insert a space between Chinese characters and adjacent ASCII letters / digits (e.g. `部署 Prometheus`, not `部署Prometheus`). +- Fix obvious grammar, punctuation, and typos in the prose. + +**Structure** +- Tables for: parameter explanations, scenario matrices, troubleshooting, comparison. +- Ordered lists for: sequential steps. +- Unordered lists for: prerequisites, non-sequential items, bullet-point summaries. + +**Format hygiene** +- Indentation: 4 spaces (Markdown nested lists and YAML inside fenced blocks). +- Code-block language tags: `shell` / `bash` for commands, `yaml` for YAML, `json` for JSON, `sql` for SQL, `text` for plain output (Pod listings, log lines, expected stdout). Do not use `shell` for non-shell output. +- One blank line between blocks; no trailing blank lines at end of file. +- Spaces around inline code: `` 访问 `http://...` `` style. + +**SEO / GEO** +- Update frontmatter `description` to be problem-oriented and ≤120 chars. Preserve the existing frontmatter shape (JSON-style for zh-CN Doris docs, YAML elsewhere). +- Expand `keywords` to cover synonyms, error/scenario keywords, and (for zh-CN) Chinese long-tail variants. Keep the original keywords; only add. +- Insert `` and `` HTML comments at the top of major sections per `./references/seo-geo.md`. Place them above the H2/H3 they describe, not inline. + +**Internal links** +- Doc cross-references must NOT include `.md` / `.mdx` extensions (project convention). If the original violates this, fix it. + +### Step 5 — Validate links + +Run validation in parallel; fast and worth the time. 
+ + **External `http(s)` links** — for each unique URL: +```shell +curl -sI -o /dev/null -w "%{http_code}\n" <URL> +``` +If HEAD returns 4xx (especially 403/405), retry with `curl -sIL` (follow redirects) and then `curl -sL -o /dev/null -w "%{http_code}\n" <URL>` (GET). Accept 2xx and 3xx as alive. Anything else, flag in the summary; do not silently remove the link. + +**Image references** — for each `/images/...` path, check that `./static/images/...` exists: +```shell +test -f ./static/images/next/install/foo.png && echo OK || echo MISSING +``` + +**Internal doc links** — verify the target file exists (with `.md` / `.mdx` / `index.md` fallback, since the project strips extensions). If a link points to a non-existent doc, flag it; do not delete. + +### Step 6 — Write the file + +Overwrite the target file in place using the `Write` tool. Read first if you haven't yet in this turn (you usually will have, in Step 1). + +### Step 7 — Report + +Print a short summary listing: + +1. **Structural changes** — sections added (适用场景, 前置条件, 流程总览, 常见问题, etc.), sections merged, parameter table extracted from prose. +2. **SEO/GEO additions** — new `description` (quote it), `keywords` added, knowledge-type meta comments inserted, FAQ table added. +3. **Bug fixes** — typos / grammar / command-concatenation fixes / wrong code-block language tags / `.md` extensions stripped from internal links. List concretely so the user can spot-check. +4. **Link-check results** — count of external links checked (all green / N flagged), images verified (all present / N missing). For any flag, name the URL or path. + +Keep the summary under ~15 lines. The user can read the diff; the summary's job is to highlight non-obvious changes and any flags that need their attention. + +## Constraints and guardrails + +These exist because the failure mode for this skill is silent content loss or hallucinated facts — both worse than leaving the doc unchanged. 
+ +- **Never invent**: no new commands, parameters, version numbers, error messages, or troubleshooting steps that weren't in the original (or obviously implied by it). +- **Never delete**: code blocks, image references, example outputs, and admonitions stay. If a section feels redundant, merge it; don't drop it. +- **Preserve frontmatter shape**: zh-CN Doris docs use a JSON object inside `---` fences — keep that exact form. English docs use standard YAML — keep that. +- **No `.md` / `.mdx` extensions in internal doc cross-references** — project convention. +- **Don't translate**: if the user wants Chinese ↔ English, that is `doris-translate-zh-to-en`, not this skill. +- **One file per invocation**: if asked to optimize a folder, ask which single file to start with. + +## Reference example + +The canonical example is the install-prometheus-and-grafana doc under `i18n/zh-CN/docusaurus-plugin-content-docs-next/current/install/deploy-on-kubernetes/separating-storage-compute/`. The optimization: + +- Added `适用场景` (table), `前置条件` (bullet list), `部署流程总览` (numbered list) at the top. +- Split each `Step N` into `N.1 / N.2 / N.3` sub-steps, each with intent → command → explanation. +- Extracted ServiceMonitor YAML field meanings into a parameter table. +- Added a `常见问题` table at the end covering the four most common failure modes implied by the original prose (Targets not visible, Targets DOWN, Grafana empty, port unreachable). +- Updated `description` to a 50-char problem-oriented summary; expanded `keywords` with Chinese long-tail terms (`存算分离`, `集群监控`, `指标采集`, `Dashboard`). +- Inserted `` and `` meta comments at the top of major sections. +- Fixed an upstream bug where `helm repo add ... helm-charts` and `helm repo update` had been concatenated into one line with no separator. +- Verified all three external links (`get-helm-3`, prometheus-community charts repo, dashboard JSON) returned 200, and confirmed all three image files existed under `static/images/next/install/`. 
+ +That run is a good shape to emulate, but adapt to the doc in front of you — don't force these exact sections onto a doc that doesn't need them. + +## Self-check before reporting done + +Before printing the summary: +- Diff the new file against the original mentally: is every code block, image, and YAML sample still present? +- Did you read `./references/seo-geo.md` this run? (Yes / re-read it.) +- Did you actually run `curl` on every external link, not just eyeball them? +- Is the frontmatter shape unchanged? +- Are there any internal doc cross-references with `.md` / `.mdx` still on them? + +If any answer is wrong, fix before reporting. diff --git a/doc-tools/skills/doris-doc-optimize/references/seo-geo.md b/doc-tools/skills/doris-doc-optimize/references/seo-geo.md new file mode 100644 index 0000000000000..1c9453ca21daa --- /dev/null +++ b/doc-tools/skills/doris-doc-optimize/references/seo-geo.md @@ -0,0 +1,112 @@ +# SEO and GEO Optimization Techniques (General) + +## SEO (Search Engine Optimization) + +### 1. Frontmatter Optimization + +- **title**: Close to real search queries; include common user keywords +- **description**: Problem-oriented, usable as a search snippet; keep within 120 characters +- **keywords**: Cover synonyms and long-tail variants (optional) + +### 2. Content Structure + +- Clear H2/H3 hierarchy for easy parsing by search engines +- Provide a checklist or quick navigation at the beginning +- Each section has explicit subheadings to support quick scanning + +### 3. Keyword Coverage + +- Cover common error keywords or failure scenarios (e.g., `Too many open files`) +- Naturally weave in key synonyms (e.g., vector search / semantic search) +- Make the technical context explicit (Kafka / Iceberg / S3, etc.) + +### 4. 
Enhancement Blocks + +- **FAQ block**: Cover common questions to broaden search coverage +- **Troubleshooting section**: Add common errors and their solutions +- **Comparison table**: Clarify trade-offs across technology choices + +--- + +## GEO (Generative Engine Optimization) + +### 1. Knowledge Type Annotation + +Add meta comments at the start of a section to declare the content type: + +```html + + + + +``` + +### 2. Use-Case Annotation + +```html + + +``` + +### 3. Structured Expression + +| Technique | Description | +|-----------|-------------| +| Tabulation | Present parameters, configs, and version mappings as tables | +| Step-by-step | Standardize operational content as "Purpose → Command → Explanation" | +| Minimal example | Keep example code concise and direct; avoid redundant context | +| One-line definition | Feature docs should provide a quotable capability definition | + +### 4. Conciseness and Decoupling + +- Keep each explanatory paragraph to 3 sentences or fewer +- Avoid mixing content types (Guide vs. Reference) +- Front-load key information; drop redundant explanation + +### 5. 
Unified Terminology + +- Use the same term for the same concept throughout +- Define a term on first appearance, then use it directly afterward +- Avoid mixing synonyms + +--- + +## Docusaurus Adaptation + +### Frontmatter Best Practices + +```yaml +--- +title: A title the user would actually search +description: Problem-oriented description, within 120 characters, usable as a search snippet +keywords: + - synonym1 + - synonym2 + - error-scenario keyword +--- +``` + +### Content Organization + +- Use H2 for top-level sections and H3 for sub-steps +- Add a brief intro sentence before each table +- Highlight important configuration with tables or lists +- Annotate expected output for code blocks (when applicable) + +### Cross-link + +- Add links between related docs +- Use meaningful link text; avoid "click here" + +--- + +## SEO/GEO Strategy by Document Type + +| Type | SEO Focus | GEO Focus | +|------|-----------|-----------| +| **Guide** | Executable steps, FAQ, Troubleshooting | Structured steps, knowledge-type annotation | +| **Reference** | Parameter accuracy, long-tail coverage, clear headings | Tabulated parameters, parseable structure | +| **Feature** | Scenario coverage, comparison, problem-oriented titles | Capability definition, decomposable structure | +| **Tutorial** | Clear path, no skipped steps, reusability | Step independence, minimal examples | +| **FAQ** | Comprehensive question coverage, keyword embedding | Concise, directly quotable answers | +| **Mixed** | Clear partitioning by type, heading differentiation | Independent blocks, selectively quotable | diff --git a/docs-next/install/deploy-on-kubernetes/intro.mdx b/docs-next/install/deploy-on-kubernetes/intro.mdx index e2d558c56c943..b59a4cb90fe1d 100644 --- a/docs-next/install/deploy-on-kubernetes/intro.mdx +++ b/docs-next/install/deploy-on-kubernetes/intro.mdx @@ -30,4 +30,10 @@ On Kubernetes, Apache Doris is managed by Doris Operator. 
Choose the guide that description="Deploy a Doris cluster on Kubernetes in storage-compute separation mode" link="separating-storage-compute/install-doris-cluster" /> + + diff --git a/docs-next/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md b/docs-next/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md new file mode 100644 index 0000000000000..0164217fc615e --- /dev/null +++ b/docs-next/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md @@ -0,0 +1,241 @@ +--- +{ + "title": "Deploy Prometheus and Grafana", + "language": "en", + "description": "Deploy Prometheus and Grafana on Kubernetes with Helm to collect and visualize metrics for a Doris compute-storage decoupled cluster.", + "keywords": ["Doris", "decoupled storage and compute", "compute-storage decoupled", "Kubernetes", "K8s", "Prometheus", "Grafana", "Helm", "ServiceMonitor", "metric", "metric collection", "cluster monitoring", "monitoring deployment", "Dashboard", "kube-prometheus-stack"] +} +--- + + + + +This document describes how to deploy Prometheus and Grafana on Kubernetes with Helm and connect them to an Apache Doris compute-storage decoupled cluster for metric collection, visualization, and alerting. Prometheus scrapes the HTTP and bRPC metrics exposed by FE, BE, and Meta Service. Grafana presents the cluster status through dashboards. + +## Use Cases + +| Scenario | Description | +|------|------| +| New cluster onboarding | Set up monitoring before the Doris compute-storage decoupled cluster goes into production so anomalies can be detected in time. | +| Day-to-day operations | Continuously observe the key metrics of FE, BE, and Meta Service, along with node resource usage. | +| Troubleshooting | Use historical metrics, dashboard views, and alerts to quickly pinpoint performance or availability issues. 
| +| Capacity planning | Evaluate when to scale out based on the trend of node and component metrics. | + +## Prerequisites + +- A usable Kubernetes cluster with `kubectl` already configured. +- A compute-storage decoupled cluster already deployed in the `default` namespace through Doris Operator, with all three component types (FE, BE/Compute Group, Meta Service) ready. +- Nodes have public network access and can download the Helm installation script, Prometheus Community Charts, and the Grafana Dashboard JSON file. +- Permissions to create namespaces, Helm Releases, ServiceMonitors, and other resources in Kubernetes. + +## Deployment Overview + +1. Install Helm, and deploy Prometheus, Grafana, and Alertmanager in one step through `kube-prometheus-stack`. +2. Configure a Prometheus `ServiceMonitor` so that Prometheus can auto-discover and scrape the HTTP and bRPC metrics of the Doris cluster. +3. Log in to Grafana, import the official Doris dashboard, and add a node monitoring panel as needed. + +## Step 1: Deploy Helm, Prometheus, and Grafana + + + +### 1.1 Install Helm + +Purpose: Install Helm 3 on a local machine or operations node to install and manage the monitoring components on Kubernetes. + +```shell +curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash +``` + +### 1.2 Add the Prometheus Community Helm Repository + +Purpose: Register the repository that hosts the `kube-prometheus-stack` Chart, and refresh the local cache. + +```shell +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm repo update +``` + +### 1.3 Deploy kube-prometheus-stack + +Purpose: Deploy Prometheus, Grafana, Alertmanager, and the related Operator in a dedicated `monitoring` namespace. 
+ +```shell +# Create the namespace +kubectl create namespace monitoring + +# Deploy kube-prometheus-stack +helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring +``` + +### 1.4 Check Pod Status + +Purpose: Confirm that all Pods in the monitoring stack are in the `Running` state before moving on to the next step. + +```shell +kubectl get pods -n monitoring +``` + +A normal output looks like the following: + +```text +NAME READY STATUS RESTARTS AGE +alertmanager-prometheus-kube-prometheus-alertmanager-0 2/2 Running 8 (5h28m ago) 4d23h +prometheus-grafana-7994c77c7-8nk7j 3/3 Running 12 (5h28m ago) 5d +prometheus-kube-prometheus-operator-5576477887-dgp8h 1/1 Running 4 (5h28m ago) 5d +prometheus-kube-state-metrics-77885ddddc-hldlw 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-kube-prometheus-prometheus-0 2/2 Running 0 4h11m +prometheus-prometheus-node-exporter-2tl9s 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-node-exporter-b58rd 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-node-exporter-fqp6v 1/1 Running 4 (5h28m ago) 4d23h +``` + +## Step 2: Configure Prometheus to Scrape Doris Metrics + + + + +Create a `ServiceMonitor` so that Prometheus Operator auto-discovers Doris Services in the `default` namespace that carry the label `app.doris.disaggregated.cluster=test-disaggregated-cluster`, and scrapes their metrics grouped by the three component types: FE, BE, and Meta Service. + +### 2.1 Prepare the ServiceMonitor YAML + +Purpose: Declare the scrape targets, endpoint paths, and scrape interval, and use `relabelings` to assign a unified `group` label to each service by role for dashboard filtering. 
+ +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: doris-disaggregated-monitor + namespace: monitoring + labels: + release: prometheus +spec: + namespaceSelector: + matchNames: + - default + selector: + matchLabels: + app.doris.disaggregated.cluster: test-disaggregated-cluster + endpoints: + - port: http + path: /metrics + interval: 15s + relabelings: + # 1. Unify the job name + - action: replace + targetLabel: job + replacement: doris-cluster + # 2. Map Service name suffix to component group: -cg1 -> be, -fe -> fe, -ms -> meta_service + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-cg1 + replacement: be + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-fe + replacement: fe + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-ms + replacement: meta_service + targetLabel: group + + - port: brpc-port + path: /brpc_metrics + interval: 15s + relabelings: + # 1. Unify the job name + - action: replace + targetLabel: job + replacement: doris-cluster + # 2. Map Service name suffix to component group: -cg1 -> be, -fe -> fe, -ms -> meta_service + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-cg1 + replacement: be + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-fe + replacement: fe + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-ms + replacement: meta_service + targetLabel: group +``` + +### 2.2 Key Fields of the ServiceMonitor + +| Field | Value | Description | +|------|------|------| +| `metadata.namespace` | `monitoring` | The ServiceMonitor must reside in the same namespace as the Prometheus instance. | +| `metadata.labels.release` | `prometheus` | Must match the Helm Release name. Prometheus Operator uses this label to discover ServiceMonitors. | +| `spec.namespaceSelector.matchNames` | `default` | The namespace where the Doris cluster runs. 
Adjust to match your environment. | +| `spec.selector.matchLabels` | `app.doris.disaggregated.cluster: test-disaggregated-cluster` | Selects the Service of the Doris compute-storage decoupled cluster. Update the cluster name as needed. | +| `endpoints[0].port` | `http` | The HTTP port name on which FE, BE, and Meta Service expose `/metrics`. | +| `endpoints[1].port` | `brpc-port` | The bRPC port name on which BE exposes `/brpc_metrics`. | +| `endpoints[*].interval` | `15s` | Scrape interval. Adjust based on data volume and precision requirements. | +| The `group` label in `relabelings` | `be` / `fe` / `meta_service` | Divides metrics into three component categories by Service name suffix, for dashboard variable filtering. | + +### 2.3 Apply the YAML and Verify + +Purpose: Let Prometheus Operator detect the new `ServiceMonitor` and refresh its scrape targets. + +```shell +kubectl apply -f doris-monitor.yaml +``` + +In a browser, open Prometheus (default port `9090`, for example `http://your_ip:9090`), navigate to **Status → Targets**, and confirm that the FE, BE, and Meta Service targets under `doris-cluster` are all in the `UP` state. + +## Step 3: Configure Grafana and the Dashboard + + + + +### 3.1 Log In to Grafana + +Purpose: Access the Grafana bundled with `kube-prometheus-stack` and complete the first login. + +1. In a browser, open Grafana (default port `3000`, for example `http://your_ip:3000`). +2. The username is `admin`. Retrieve the initial password with the following command: + + ```shell + kubectl get secret --namespace monitoring prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo + ``` + +### 3.2 Import the Doris Dashboard + +Purpose: Use the official Dashboard JSON file to visualize Doris cluster metrics. + +1. Download the official Dashboard file: [Doris-Dashboard-Cloud.json](https://doris.apache.org/files/doris-grafana-dashboard-cloud.json) +2. 
In Grafana, go to **Dashboards → New → Import**, import the JSON file, and select the bundled Prometheus as the data source. +3. Append `&var-cluster_id=doris-cluster` to the dashboard URL to match the `job` name set in the ServiceMonitor. For example: + + ```text + http://your_ip:3000/d/3fFiWJ4mz456/doris-cloud-dashboard-overview?orgId=1&var-cluster_id=doris-cluster&refresh=5s + ``` + +### 3.3 Add Node Monitoring (Optional) + +Purpose: The example JSON file does not include a host node monitoring panel. Use the official Grafana template `1860` to display `node-exporter` metrics directly. + +1. In Grafana, import a dashboard: + + ![image-for-grafana-import-dashboard](/images/next/install/image-for-grafana-import-dashboard.png) + +2. Select the official template number `1860`: + + ![image-for-grafana-demo-1860](/images/next/install/image-for-grafana-demo-1860.png) + +3. After the import completes, you can view the node metrics: + + ![image-for-node-metrics](/images/next/install/image-for-node-metrics.png) + +## Common Issues + + + +| Issue | Possible cause | Resolution | +|------|----------|----------| +| Doris targets do not appear under Prometheus **Targets** | The `namespaceSelector` or `matchLabels` of the ServiceMonitor does not match the actual Doris cluster; the `release` label does not match the Helm Release name. | Verify the cluster namespace, the Service label `app.doris.disaggregated.cluster`, and confirm that the `release` label on the `ServiceMonitor` is set to `prometheus`. | +| Targets are listed but show `DOWN` | The Pod is not ready, or the `http` / `brpc-port` port name does not match the port name that is actually exposed. | Use `kubectl get svc -n default` and `kubectl describe pod` to confirm the port names, the readiness state, and that `/metrics` and `/brpc_metrics` are accessible inside the container. 
| +| Grafana dashboard panels are empty | The URL is missing `var-cluster_id=doris-cluster`, or the `job` name in the ServiceMonitor has been changed. | Check that the `var-cluster_id` in the dashboard URL and the `job` label in the `ServiceMonitor` are both set to `doris-cluster`. | +| Cannot access Prometheus on port 9090 or Grafana on port 3000 | The Service type defaults to `ClusterIP`, which is not reachable from outside the cluster. | Forward the port with `kubectl port-forward`, or change the corresponding Service type to `NodePort` or `LoadBalancer`. | +| The command to retrieve the Grafana password returns `NotFound` | The Helm Release is not named `prometheus`, so the Secret name differs. | Use `kubectl get secret -n monitoring` to find the actual Grafana Secret name, then substitute it for `prometheus-grafana` in the command. | diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/install/deploy-on-kubernetes/intro.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/install/deploy-on-kubernetes/intro.mdx index 967e95f19a7a3..34524e7128492 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/install/deploy-on-kubernetes/intro.mdx +++ b/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/install/deploy-on-kubernetes/intro.mdx @@ -30,4 +30,10 @@ Apache Doris 在 Kubernetes 上由 Doris Operator 管理。请根据您要部署 description="在 Kubernetes 上以存算分离模式部署 Doris 集群" link="separating-storage-compute/install-doris-cluster" /> + + diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md b/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md new file mode 100644 index 0000000000000..63c341d7192d7 --- /dev/null +++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs-next/current/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md @@ -0,0 +1,241 @@ +--- +{ + "title": "部署 Prometheus 和 Grafana", + "language": "zh-CN", + "description": "在 Kubernetes 上使用 Helm 部署 Prometheus 与 Grafana,采集并可视化 Doris 存算分离集群指标。", + "keywords": ["Doris", "存算分离", "compute-storage decoupled", "Kubernetes", "K8s", "Prometheus", "Grafana", "Helm", "ServiceMonitor", "metric", "指标采集", "集群监控", "监控部署", "Dashboard", "kube-prometheus-stack"] +} +--- + + + + +本文介绍如何在 Kubernetes 上使用 Helm 部署 Prometheus 和 Grafana,并将其接入 Apache Doris 存算分离集群,实现指标采集、可视化与告警。Prometheus 负责拉取 FE/BE/Meta Service 暴露的 HTTP 与 bRPC 指标,Grafana 负责通过 Dashboard 呈现集群状态。 + +## 适用场景 + +| 场景 | 说明 | +|------|------| +| 新集群上线 | 在 Doris 存算分离集群投入使用前完成监控接入,确保异常可被及时发现 | +| 日常运维 | 持续观察 FE/BE/Meta Service 的关键指标与节点资源使用情况 | +| 故障排查 | 通过历史指标、Dashboard 视图与告警快速定位性能或可用性问题 | +| 容量规划 | 基于节点与组件指标趋势评估扩容时机 | + +## 前置条件 + +- 已具备一个可用的 Kubernetes 集群,并已配置好 `kubectl`。 +- 已经使用 Doris Operator 在 `default` 命名空间部署了一个存算分离集群(FE、BE/Compute Group、Meta Service 三类组件已就绪)。 +- 节点能够访问公网,可下载 Helm 安装脚本、Prometheus Community Charts 与 Grafana Dashboard JSON 文件。 +- 具备在 Kubernetes 中创建命名空间、Helm Release、ServiceMonitor 等资源的权限。 + +## 部署流程总览 + +1. 安装 Helm,并通过 `kube-prometheus-stack` 一键部署 Prometheus、Grafana、Alertmanager。 +2. 配置 Prometheus `ServiceMonitor`,让其能自动发现并采集 Doris 集群的 HTTP 与 bRPC 指标。 +3. 
登录 Grafana,导入 Doris 官方 Dashboard,并按需补充节点监控面板。 + +## 第 1 步:部署 Helm、Prometheus 与 Grafana + + + +### 1.1 安装 Helm + +目的:在本地或运维节点安装 Helm 3,用于安装与管理 Kubernetes 上的监控组件。 + +```shell +curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash +``` + +### 1.2 添加 Prometheus Community Helm 仓库 + +目的:注册 `kube-prometheus-stack` Chart 所在的仓库,并刷新本地缓存。 + +```shell +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm repo update +``` + +### 1.3 部署 kube-prometheus-stack + +目的:在独立的 `monitoring` 命名空间中部署 Prometheus、Grafana、Alertmanager 及相关 Operator。 + +```shell +# 创建命名空间 +kubectl create namespace monitoring + +# 部署 kube-prometheus-stack +helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring +``` + +### 1.4 检查 Pod 状态 + +目的:确认监控栈中的全部 Pod 处于 `Running` 状态后再进入下一步。 + +```shell +kubectl get pods -n monitoring +``` + +正常输出示例如下: + +```text +NAME READY STATUS RESTARTS AGE +alertmanager-prometheus-kube-prometheus-alertmanager-0 2/2 Running 8 (5h28m ago) 4d23h +prometheus-grafana-7994c77c7-8nk7j 3/3 Running 12 (5h28m ago) 5d +prometheus-kube-prometheus-operator-5576477887-dgp8h 1/1 Running 4 (5h28m ago) 5d +prometheus-kube-state-metrics-77885ddddc-hldlw 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-kube-prometheus-prometheus-0 2/2 Running 0 4h11m +prometheus-prometheus-node-exporter-2tl9s 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-node-exporter-b58rd 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-node-exporter-fqp6v 1/1 Running 4 (5h28m ago) 4d23h +``` + +## 第 2 步:配置 Prometheus 采集 Doris 指标 + + + + +通过创建 `ServiceMonitor`,让 Prometheus Operator 自动发现位于 `default` 命名空间下、带有 `app.doris.disaggregated.cluster=test-disaggregated-cluster` 标签的 Doris Service,并按 FE、BE、Meta Service 三类分组采集指标。 + +### 2.1 准备 ServiceMonitor YAML + +目的:声明采集目标、端点路径、采集周期,并通过 `relabelings` 将服务按角色打上统一的 `group` 标签,便于在 Dashboard 中过滤。 + +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: 
doris-disaggregated-monitor + namespace: monitoring + labels: + release: prometheus +spec: + namespaceSelector: + matchNames: + - default + selector: + matchLabels: + app.doris.disaggregated.cluster: test-disaggregated-cluster + endpoints: + - port: http + path: /metrics + interval: 15s + relabelings: + # 1. 统一 job 名称 + - action: replace + targetLabel: job + replacement: doris-cluster + # 2. 按 Service 名称后缀映射到组件分组:-cg1 -> be, -fe -> fe, -ms -> meta_service + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-cg1 + replacement: be + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-fe + replacement: fe + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-ms + replacement: meta_service + targetLabel: group + + - port: brpc-port + path: /brpc_metrics + interval: 15s + relabelings: + # 1. 统一 job 名称 + - action: replace + targetLabel: job + replacement: doris-cluster + # 2. 按 Service 名称后缀映射到组件分组:-cg1 -> be, -fe -> fe, -ms -> meta_service + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-cg1 + replacement: be + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-fe + replacement: fe + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-ms + replacement: meta_service + targetLabel: group +``` + +### 2.2 ServiceMonitor 关键字段说明 + +| 字段 | 取值 | 说明 | +|------|------|------| +| `metadata.namespace` | `monitoring` | ServiceMonitor 必须与 Prometheus 实例位于同一命名空间 | +| `metadata.labels.release` | `prometheus` | 与 Helm Release 名称保持一致,Prometheus Operator 据此发现 ServiceMonitor | +| `spec.namespaceSelector.matchNames` | `default` | Doris 集群所在的命名空间,按实际情况调整 | +| `spec.selector.matchLabels` | `app.doris.disaggregated.cluster: test-disaggregated-cluster` | 用于选中 Doris 存算分离集群的 Service,集群名按实际情况修改 | +| `endpoints[0].port` | `http` | FE/BE/Meta Service 暴露 `/metrics` 的 HTTP 端口名 | +| `endpoints[1].port` | `brpc-port` | BE 暴露 `/brpc_metrics` 的 bRPC 端口名 | +| 
`endpoints[*].interval` | `15s` | 抓取间隔,可按数据量与精度需求调整 | +| `relabelings` 的 `group` 标签 | `be` / `fe` / `meta_service` | 通过 Service 名称后缀将指标划分到三类组件,用于 Dashboard 变量过滤 | + +### 2.3 应用 YAML 并验证 + +目的:让 Prometheus Operator 检测到新的 `ServiceMonitor` 并刷新采集目标。 + +```shell +kubectl apply -f doris-monitor.yaml +``` + +在浏览器中访问 Prometheus(默认端口 `9090`,例如 `http://your_ip:9090`),打开 **Status → Targets**,确认 `doris-cluster` 下的 FE、BE、Meta Service 目标均处于 `UP` 状态。 + +## 第 3 步:配置 Grafana 与 Dashboard + + + + +### 3.1 登录 Grafana + +目的:访问 `kube-prometheus-stack` 自带的 Grafana,并完成首次登录。 + +1. 在浏览器中访问 Grafana(默认端口 `3000`,例如 `http://your_ip:3000`)。 +2. 用户名为 `admin`,通过以下命令获取初始密码: + + ```shell + kubectl get secret --namespace monitoring prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo + ``` + +### 3.2 导入 Doris Dashboard + +目的:使用官方提供的 Dashboard JSON 文件可视化 Doris 集群指标。 + +1. 下载官方 Dashboard 文件:[Doris-Dashboard-Cloud.json](https://doris.apache.org/files/doris-grafana-dashboard-cloud.json) +2. 在 Grafana 中通过 **Dashboards → New → Import** 导入该 JSON 文件,并将数据源选择为已自带的 Prometheus。 +3. 在 Dashboard URL 后追加 `&var-cluster_id=doris-cluster` 以匹配 ServiceMonitor 中设置的 `job` 名称,例如: + + ```text + http://your_ip:3000/d/3fFiWJ4mz456/doris-cloud-dashboard-overview?orgId=1&var-cluster_id=doris-cluster&refresh=5s + ``` + +### 3.3 补充节点监控(可选) + +目的:示例 JSON 文件未包含主机节点监控面板,可使用 Grafana 官方模板 `1860` 直接呈现 `node-exporter` 指标。 + +1. 在 Grafana 中导入 Dashboard: + + ![image-for-grafana-import-dashboard](/images/next/install/image-for-grafana-import-dashboard.png) + +2. 选择官方模板编号 `1860`: + + ![image-for-grafana-demo-1860](/images/next/install/image-for-grafana-demo-1860.png) + +3. 
导入完成后即可查看节点指标: + + ![image-for-node-metrics](/images/next/install/image-for-node-metrics.png) + +## 常见问题 + + + +| 问题 | 可能原因 | 处理方式 | +|------|----------|----------| +| Prometheus **Targets** 中看不到 Doris 目标 | ServiceMonitor 的 `namespaceSelector` 或 `matchLabels` 未匹配实际 Doris 集群;`release` 标签与 Helm Release 名称不一致 | 核对集群所在命名空间、Service 标签 `app.doris.disaggregated.cluster`,以及 `ServiceMonitor` 的 `release` 标签是否为 `prometheus` | +| Targets 显示但状态为 `DOWN` | Pod 未就绪,或 `http` / `brpc-port` 端口与实际暴露的端口名不一致 | 使用 `kubectl get svc -n default` 与 `kubectl describe pod` 确认端口名称、就绪状态及容器内 `/metrics`、`/brpc_metrics` 可访问 | +| Grafana Dashboard 面板为空 | URL 中未带 `var-cluster_id=doris-cluster`,或 ServiceMonitor 的 `job` 名称被改动 | 检查 Dashboard URL 的 `var-cluster_id` 与 `ServiceMonitor` 中 `job` 标签是否同为 `doris-cluster` | +| 无法访问 Prometheus 的 9090 或 Grafana 的 3000 | Service 默认类型为 `ClusterIP`,外部不可达 | 通过 `kubectl port-forward` 转发,或将对应 Service 改为 `NodePort` / `LoadBalancer` | +| 获取 Grafana 密码命令报 `NotFound` | Helm Release 名称不是 `prometheus`,导致 Secret 名称不同 | 使用 `kubectl get secret -n monitoring` 查看实际的 Grafana Secret 名称,再替换命令中的 `prometheus-grafana` | diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md new file mode 100644 index 0000000000000..63c341d7192d7 --- /dev/null +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md @@ -0,0 +1,241 @@ +--- +{ + "title": "部署 Prometheus 和 Grafana", + "language": "zh-CN", + "description": "在 Kubernetes 上使用 Helm 部署 Prometheus 与 Grafana,采集并可视化 Doris 存算分离集群指标。", + "keywords": ["Doris", "存算分离", "compute-storage decoupled", "Kubernetes", "K8s", "Prometheus", "Grafana", "Helm", "ServiceMonitor", "metric", "指标采集", "集群监控", "监控部署", "Dashboard", 
"kube-prometheus-stack"] +} +--- + + + + +本文介绍如何在 Kubernetes 上使用 Helm 部署 Prometheus 和 Grafana,并将其接入 Apache Doris 存算分离集群,实现指标采集、可视化与告警。Prometheus 负责拉取 FE/BE/Meta Service 暴露的 HTTP 与 bRPC 指标,Grafana 负责通过 Dashboard 呈现集群状态。 + +## 适用场景 + +| 场景 | 说明 | +|------|------| +| 新集群上线 | 在 Doris 存算分离集群投入使用前完成监控接入,确保异常可被及时发现 | +| 日常运维 | 持续观察 FE/BE/Meta Service 的关键指标与节点资源使用情况 | +| 故障排查 | 通过历史指标、Dashboard 视图与告警快速定位性能或可用性问题 | +| 容量规划 | 基于节点与组件指标趋势评估扩容时机 | + +## 前置条件 + +- 已具备一个可用的 Kubernetes 集群,并已配置好 `kubectl`。 +- 已经使用 Doris Operator 在 `default` 命名空间部署了一个存算分离集群(FE、BE/Compute Group、Meta Service 三类组件已就绪)。 +- 节点能够访问公网,可下载 Helm 安装脚本、Prometheus Community Charts 与 Grafana Dashboard JSON 文件。 +- 具备在 Kubernetes 中创建命名空间、Helm Release、ServiceMonitor 等资源的权限。 + +## 部署流程总览 + +1. 安装 Helm,并通过 `kube-prometheus-stack` 一键部署 Prometheus、Grafana、Alertmanager。 +2. 配置 Prometheus `ServiceMonitor`,让其能自动发现并采集 Doris 集群的 HTTP 与 bRPC 指标。 +3. 登录 Grafana,导入 Doris 官方 Dashboard,并按需补充节点监控面板。 + +## 第 1 步:部署 Helm、Prometheus 与 Grafana + + + +### 1.1 安装 Helm + +目的:在本地或运维节点安装 Helm 3,用于安装与管理 Kubernetes 上的监控组件。 + +```shell +curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash +``` + +### 1.2 添加 Prometheus Community Helm 仓库 + +目的:注册 `kube-prometheus-stack` Chart 所在的仓库,并刷新本地缓存。 + +```shell +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm repo update +``` + +### 1.3 部署 kube-prometheus-stack + +目的:在独立的 `monitoring` 命名空间中部署 Prometheus、Grafana、Alertmanager 及相关 Operator。 + +```shell +# 创建命名空间 +kubectl create namespace monitoring + +# 部署 kube-prometheus-stack +helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring +``` + +### 1.4 检查 Pod 状态 + +目的:确认监控栈中的全部 Pod 处于 `Running` 状态后再进入下一步。 + +```shell +kubectl get pods -n monitoring +``` + +正常输出示例如下: + +```text +NAME READY STATUS RESTARTS AGE +alertmanager-prometheus-kube-prometheus-alertmanager-0 2/2 Running 8 (5h28m ago) 4d23h +prometheus-grafana-7994c77c7-8nk7j 3/3 Running 12 (5h28m ago) 5d 
+prometheus-kube-prometheus-operator-5576477887-dgp8h 1/1 Running 4 (5h28m ago) 5d +prometheus-kube-state-metrics-77885ddddc-hldlw 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-kube-prometheus-prometheus-0 2/2 Running 0 4h11m +prometheus-prometheus-node-exporter-2tl9s 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-node-exporter-b58rd 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-node-exporter-fqp6v 1/1 Running 4 (5h28m ago) 4d23h +``` + +## 第 2 步:配置 Prometheus 采集 Doris 指标 + + + + +通过创建 `ServiceMonitor`,让 Prometheus Operator 自动发现位于 `default` 命名空间下、带有 `app.doris.disaggregated.cluster=test-disaggregated-cluster` 标签的 Doris Service,并按 FE、BE、Meta Service 三类分组采集指标。 + +### 2.1 准备 ServiceMonitor YAML + +目的:声明采集目标、端点路径、采集周期,并通过 `relabelings` 将服务按角色打上统一的 `group` 标签,便于在 Dashboard 中过滤。 + +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: doris-disaggregated-monitor + namespace: monitoring + labels: + release: prometheus +spec: + namespaceSelector: + matchNames: + - default + selector: + matchLabels: + app.doris.disaggregated.cluster: test-disaggregated-cluster + endpoints: + - port: http + path: /metrics + interval: 15s + relabelings: + # 1. 统一 job 名称 + - action: replace + targetLabel: job + replacement: doris-cluster + # 2. 按 Service 名称后缀映射到组件分组:-cg1 -> be, -fe -> fe, -ms -> meta_service + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-cg1 + replacement: be + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-fe + replacement: fe + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-ms + replacement: meta_service + targetLabel: group + + - port: brpc-port + path: /brpc_metrics + interval: 15s + relabelings: + # 1. 统一 job 名称 + - action: replace + targetLabel: job + replacement: doris-cluster + # 2. 
按 Service 名称后缀映射到组件分组:-cg1 -> be, -fe -> fe, -ms -> meta_service + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-cg1 + replacement: be + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-fe + replacement: fe + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-ms + replacement: meta_service + targetLabel: group +``` + +### 2.2 ServiceMonitor 关键字段说明 + +| 字段 | 取值 | 说明 | +|------|------|------| +| `metadata.namespace` | `monitoring` | ServiceMonitor 必须与 Prometheus 实例位于同一命名空间 | +| `metadata.labels.release` | `prometheus` | 与 Helm Release 名称保持一致,Prometheus Operator 据此发现 ServiceMonitor | +| `spec.namespaceSelector.matchNames` | `default` | Doris 集群所在的命名空间,按实际情况调整 | +| `spec.selector.matchLabels` | `app.doris.disaggregated.cluster: test-disaggregated-cluster` | 用于选中 Doris 存算分离集群的 Service,集群名按实际情况修改 | +| `endpoints[0].port` | `http` | FE/BE/Meta Service 暴露 `/metrics` 的 HTTP 端口名 | +| `endpoints[1].port` | `brpc-port` | BE 暴露 `/brpc_metrics` 的 bRPC 端口名 | +| `endpoints[*].interval` | `15s` | 抓取间隔,可按数据量与精度需求调整 | +| `relabelings` 的 `group` 标签 | `be` / `fe` / `meta_service` | 通过 Service 名称后缀将指标划分到三类组件,用于 Dashboard 变量过滤 | + +### 2.3 应用 YAML 并验证 + +目的:让 Prometheus Operator 检测到新的 `ServiceMonitor` 并刷新采集目标。 + +```shell +kubectl apply -f doris-monitor.yaml +``` + +在浏览器中访问 Prometheus(默认端口 `9090`,例如 `http://your_ip:9090`),打开 **Status → Targets**,确认 `doris-cluster` 下的 FE、BE、Meta Service 目标均处于 `UP` 状态。 + +## 第 3 步:配置 Grafana 与 Dashboard + + + + +### 3.1 登录 Grafana + +目的:访问 `kube-prometheus-stack` 自带的 Grafana,并完成首次登录。 + +1. 在浏览器中访问 Grafana(默认端口 `3000`,例如 `http://your_ip:3000`)。 +2. 用户名为 `admin`,通过以下命令获取初始密码: + + ```shell + kubectl get secret --namespace monitoring prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo + ``` + +### 3.2 导入 Doris Dashboard + +目的:使用官方提供的 Dashboard JSON 文件可视化 Doris 集群指标。 + +1. 
下载官方 Dashboard 文件:[Doris-Dashboard-Cloud.json](https://doris.apache.org/files/doris-grafana-dashboard-cloud.json) +2. 在 Grafana 中通过 **Dashboards → New → Import** 导入该 JSON 文件,并将数据源选择为已自带的 Prometheus。 +3. 在 Dashboard URL 后追加 `&var-cluster_id=doris-cluster` 以匹配 ServiceMonitor 中设置的 `job` 名称,例如: + + ```text + http://your_ip:3000/d/3fFiWJ4mz456/doris-cloud-dashboard-overview?orgId=1&var-cluster_id=doris-cluster&refresh=5s + ``` + +### 3.3 补充节点监控(可选) + +目的:示例 JSON 文件未包含主机节点监控面板,可使用 Grafana 官方模板 `1860` 直接呈现 `node-exporter` 指标。 + +1. 在 Grafana 中导入 Dashboard: + + ![image-for-grafana-import-dashboard](/images/next/install/image-for-grafana-import-dashboard.png) + +2. 选择官方模板编号 `1860`: + + ![image-for-grafana-demo-1860](/images/next/install/image-for-grafana-demo-1860.png) + +3. 导入完成后即可查看节点指标: + + ![image-for-node-metrics](/images/next/install/image-for-node-metrics.png) + +## 常见问题 + + + +| 问题 | 可能原因 | 处理方式 | +|------|----------|----------| +| Prometheus **Targets** 中看不到 Doris 目标 | ServiceMonitor 的 `namespaceSelector` 或 `matchLabels` 未匹配实际 Doris 集群;`release` 标签与 Helm Release 名称不一致 | 核对集群所在命名空间、Service 标签 `app.doris.disaggregated.cluster`,以及 `ServiceMonitor` 的 `release` 标签是否为 `prometheus` | +| Targets 显示但状态为 `DOWN` | Pod 未就绪,或 `http` / `brpc-port` 端口与实际暴露的端口名不一致 | 使用 `kubectl get svc -n default` 与 `kubectl describe pod` 确认端口名称、就绪状态及容器内 `/metrics`、`/brpc_metrics` 可访问 | +| Grafana Dashboard 面板为空 | URL 中未带 `var-cluster_id=doris-cluster`,或 ServiceMonitor 的 `job` 名称被改动 | 检查 Dashboard URL 的 `var-cluster_id` 与 `ServiceMonitor` 中 `job` 标签是否同为 `doris-cluster` | +| 无法访问 Prometheus 的 9090 或 Grafana 的 3000 | Service 默认类型为 `ClusterIP`,外部不可达 | 通过 `kubectl port-forward` 转发,或将对应 Service 改为 `NodePort` / `LoadBalancer` | +| 获取 Grafana 密码命令报 `NotFound` | Helm Release 名称不是 `prometheus`,导致 Secret 名称不同 | 使用 `kubectl get secret -n monitoring` 查看实际的 Grafana Secret 名称,再替换命令中的 `prometheus-grafana` | diff --git a/sidebars-next.ts b/sidebars-next.ts index 83be1f3840a1b..e9434173083a0 100644 --- 
a/sidebars-next.ts +++ b/sidebars-next.ts @@ -89,6 +89,7 @@ const sidebars: SidebarsConfig = { 'install/deploy-on-kubernetes/separating-storage-compute/config-fe', 'install/deploy-on-kubernetes/separating-storage-compute/config-cg', 'install/deploy-on-kubernetes/separating-storage-compute/config-cluster', + 'install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana', ], }, ], diff --git a/static/files/doris-grafana-dashboard-cloud.json b/static/files/doris-grafana-dashboard-cloud.json new file mode 100644 index 0000000000000..74f4f20b73b65 --- /dev/null +++ b/static/files/doris-grafana-dashboard-cloud.json @@ -0,0 +1,15065 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": { + "type": "datasource" + }, + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "target": { + "limit": 100, + "matchAny": false, + "tags": [], + "type": "dashboard" + }, + "type": "dashboard" + } + ] + }, + "description": "Dashboard for Doris Cloud", + "editable": true, + "fiscalYearStartMonth": 0, + "gnetId": 9734, + "graphTooltip": 0, + "id": 1, + "links": [], + "liveNow": true, + "panels": [ + { + "collapsed": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 0 + }, + "id": 46, + "panels": [], + "repeat": "cluster_name", + "targets": [ + { + "datasource": { + "type": "prometheus" + }, + "refId": "A" + } + ], + "title": "Cluster Overview", + "type": "row" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Total Frontends node number", + "fieldConfig": { + "defaults": { + "color": { + "mode": "thresholds" + }, + "mappings": [ + { + "options": { + "match": "null", + "result": { + "text": "N/A" + } + }, + "type": "special" + } + ], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": 
"none" + }, + "overrides": [] + }, + "gridPos": { + "h": 4, + "w": 6, + "x": 0, + "y": 1 + }, + "id": 10, + "links": [], + "maxDataPoints": 100, + "options": { + "colorMode": "none", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "horizontal", + "reduceOptions": { + "calcs": [ + "mean" + ], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.5.21", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "count(up{group=\"fe\", job=\"$cluster_id\"})", + "format": "time_series", + "instant": true, + "intervalFactor": 1, + "refId": "A" + } + ], + "title": "FE Node Number", + "type": "stat" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Total not alive number of Frontends.", + "fieldConfig": { + "defaults": { + "color": { + "mode": "thresholds" + }, + "mappings": [ + { + "options": { + "match": "null", + "result": { + "text": "N/A" + } + }, + "type": "special" + } + ], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "none" + }, + "overrides": [] + }, + "gridPos": { + "h": 4, + "w": 5, + "x": 6, + "y": 1 + }, + "id": 12, + "links": [], + "maxDataPoints": 100, + "options": { + "colorMode": "none", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "horizontal", + "reduceOptions": { + "calcs": [ + "mean" + ], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.5.21", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "exemplar": false, + "expr": "count(up{group=\"fe\", job=\"$cluster_id\"}) - sum(up{group=\"fe\", job=\"$cluster_id\"})", + "format": "time_series", + "hide": false, + "instant": true, + "intervalFactor": 2, + "legendFormat": "__auto", + "range": false, + "refId": "A" + } + ], + "title": "FE Not Alive 
Number", + "type": "stat" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Total Backends node number", + "fieldConfig": { + "defaults": { + "color": { + "mode": "thresholds" + }, + "mappings": [ + { + "options": { + "match": "null", + "result": { + "text": "N/A" + } + }, + "type": "special" + } + ], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "none" + }, + "overrides": [] + }, + "gridPos": { + "h": 4, + "w": 6, + "x": 11, + "y": 1 + }, + "id": 11, + "links": [], + "maxDataPoints": 100, + "options": { + "colorMode": "none", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "horizontal", + "reduceOptions": { + "calcs": [ + "mean" + ], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.5.21", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "count(up{group=\"be\", job=\"$cluster_id\"})", + "format": "time_series", + "instant": true, + "intervalFactor": 1, + "refId": "A" + } + ], + "title": "BE Node Number", + "type": "stat" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Total not alive number of Backends.", + "fieldConfig": { + "defaults": { + "color": { + "mode": "thresholds" + }, + "mappings": [ + { + "options": { + "match": "null", + "result": { + "text": "N/A" + } + }, + "type": "special" + } + ], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "none" + }, + "overrides": [] + }, + "gridPos": { + "h": 4, + "w": 7, + "x": 17, + "y": 1 + }, + "id": 14, + "links": [], + "maxDataPoints": 100, + "options": { + "colorMode": "none", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "horizontal", + "reduceOptions": { + "calcs": [ + "mean" + ], + 
"fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.5.21", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "count(up{group=\"be\", job=\"$cluster_id\"}) - sum(up{group=\"be\", job=\"$cluster_id\"})", + "format": "time_series", + "instant": true, + "intervalFactor": 1, + "refId": "A" + } + ], + "title": "BE Not Alive Number", + "type": "stat" + }, + { + "datasource": {}, + "description": "", + "fieldConfig": { + "defaults": { + "color": { + "mode": "thresholds" + }, + "mappings": [ + { + "options": { + "match": "null", + "result": { + "text": "N/A" + } + }, + "type": "special" + } + ], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "bytes" + }, + "overrides": [] + }, + "gridPos": { + "h": 6, + "w": 11, + "x": 0, + "y": 5 + }, + "id": 59, + "links": [], + "maxDataPoints": 100, + "options": { + "colorMode": "none", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "horizontal", + "reduceOptions": { + "calcs": [ + "lastNotNull" + ], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.5.21", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_fe_table_data_size{job=\"$cluster_id\"}) / count(up{group=\"fe\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "sum", + "range": true, + "refId": "A" + } + ], + "title": "Data Size", + "type": "stat" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "axisSoftMax": -1, + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { 
+ "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "bytes" + }, + "overrides": [] + }, + "gridPos": { + "h": 6, + "w": 13, + "x": 11, + "y": 5 + }, + "id": 402, + "options": { + "legend": { + "calcs": [ + "last" + ], + "displayMode": "table", + "placement": "right", + "showLegend": true, + "sortBy": "Last", + "sortDesc": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_fe_table_data_size{job=\"$cluster_id\"})/2", + "hide": true, + "legendFormat": "sum", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by (db_name, table_name) (doris_fe_table_data_size{job=\"$cluster_id\"})", + "hide": false, + "legendFormat": "{{db_name}}:{{table_name}}", + "range": true, + "refId": "B" + } + ], + "title": "Data size per table", + "type": "timeseries" + }, + { + "aliasColors": { + "percentage": "#890f02" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "JVM Heap usage of specified Frontend.\nLeft Y Axes shows the used/max heap size.\nRight Y Axes shows the used percentage.", + "fill": 1, + "fillGradient": 0, + "format": "time_series", + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 11 + }, + "hiddenSeries": false, + "id": 172, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + 
"min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:263", + "alias": "percentage", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(jvm_heap_size_bytes{group=\"fe\", type=\"used\", instance=~\"$fe_instance\", job=\"$cluster_id\"} * 100) by (instance, job) / sum(jvm_heap_size_bytes{group=\"fe\", instance=~\"$fe_instance\", type=\"max\", job=\"$cluster_id\"}) by (instance, job)", + "hide": false, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "FE JVM Heap Used Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2307", + "format": "percent", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2308", + "format": "percent", + "logBase": 1, + "max": "100", + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": {}, + "description": "The compaction score of each BE", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 11 + }, + "hiddenSeries": false, + "id": 142, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "sort": "avg", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], 
+ "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by(backend) (doris_fe_tablet_max_compaction_score{job=\"$cluster_id\", group=\"fe\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "backend-{{backend}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Compaction Score", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:214", + "decimals": 0, + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:215", + "format": "short", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": {}, + "description": "Left Y axes indicates the total received bytes rate of txn. 
Right Y axes indicates the loaded rows rate of txn.", + "fieldConfig": { + "defaults": { + "unit": "ops" + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 11 + }, + "hiddenSeries": false, + "id": 180, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:4038", + "alias": "rows", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(rate(doris_be_stream_load{group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", job=\"$cluster_id\", type=\"load_rows\"}[$__rate_interval])) by (instance)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-rows", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Load Rows Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:3114", + "decimals": 1, + "format": "ops", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:3115", + "format": "ops", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Queries per seconds on each Frontends.\nQueries only 
include Select requests.", + "fill": 1, + "fillGradient": 0, + "format": "time_series", + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 18 + }, + "hiddenSeries": false, + "id": 178, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + "hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by(job, instance) (doris_fe_qps{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", user=\"\", cluster_id=\"\"})", + "hide": false, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "QPS", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1179", + "decimals": 2, + "format": "ops", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1180", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "99 quantiles of query latency on each Frontends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 18 + }, + "hiddenSeries": false, + "id": 182, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + 
"hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by (job, instance, user) (doris_fe_query_latency_ms{job=\"$cluster_id\", quantile=\"0.99\", group=\"fe\", instance=~\"$fe_instance\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-{{user}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "99th Latency", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1557", + "format": "ms", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1558", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "unit": "none" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 18 + }, + "hiddenSeries": false, + "id": 211, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + "hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { 
+ "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_fe_tablet_status_count{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"unhealthy\"}) by (instance)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Unhealthy Tablet Number", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1557", + "format": "none", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1558", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "unit": "percentunit" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "format": "time_series", + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 25 + }, + "hiddenSeries": false, + "id": 422, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + "hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + 
"stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by (instance) (rate(doris_be_num_io_bytes_read_from_cache{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", name=~\".*\"}[$__rate_interval])) / (sum by (instance) (rate(doris_be_num_io_bytes_read_total{job=\"$cluster_id\", group=\"be\",compute_group=~\"$compute_group\", name=~\".*\"}[$__rate_interval])) + 1e-10)", + "hide": false, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE File Cache Hit Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1179", + "format": "percentunit", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1180", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": {}, + "description": "", + "fieldConfig": { + "defaults": { + "unit": "MBs" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "format": "time_series", + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 25 + }, + "hiddenSeries": false, + "id": 423, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + "hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { 
+ "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_disk_bytes_read{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", device=~\".*\"}) by (job, instance) / sum(doris_be_disk_read_time_ms{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", device=~\".*\"}) by (job, instance) / 1048576 ", + "hide": false, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Cache Read MB/s", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1179", + "format": "MBs", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1180", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "unit": "MBs" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "format": "time_series", + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 25 + }, + "hiddenSeries": false, + "id": 424, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + "hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": 
"prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_disk_bytes_written{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", device=~\".*\"}) by (job, instance) / sum(doris_be_disk_read_time_ms{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", device=~\".*\"}) by (job, instance) / 1048576 ", + "hide": false, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Cache Write MB/s", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1179", + "format": "MBs", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1180", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "unit": "percentunit" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "format": "time_series", + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 32 + }, + "hiddenSeries": false, + "id": 425, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + "hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + 
}, + "editorMode": "code", + "expr": "doris_fe_cache_hit{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"partition\"}", + "hide": false, + "legendFormat": "partition-{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_cache_hit{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"sql\"}", + "hide": false, + "legendFormat": "sql-{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "FE Cache Hit", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1179", + "format": "percentunit", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1180", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 39 + }, + "id": 184, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "CPU Used Rate dashboard includes 'iowait' mode", + "fieldConfig": { + "defaults": { + "unit": "percent" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 2 + }, + "hiddenSeries": false, + "id": 186, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": 
"9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "100 - sum by(job, instance) (rate(node_cpu_seconds_total{job=~\"$cluster_id.*\",mode=\"idle\",group=~\"$node_exporter_group\"}[$__rate_interval])) / sum by(job, instance) (rate(node_cpu_seconds_total{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}[$__rate_interval])) * 100", + "hide": false, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "exemplar": false, + "expr": "node:cpu_usage_percent:recorded:1m{job=~\"$cluster_id.*\", group=~\"$node_exporter_group\"}", + "hide": true, + "instant": false, + "interval": "", + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "CPU Used Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1454", + "format": "percent", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1455", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "unit": "decgbytes" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 2 + }, + "hiddenSeries": false, + "id": 190, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + 
"nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by(job, instance) (node_memory_MemTotal_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"} - node_memory_MemFree_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"} - node_memory_Cached_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"} - node_memory_Buffers_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"} - node_memory_SReclaimable_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}) / 1.073741824e+09", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Mem Usage", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1393", + "decimals": 2, + "format": "decgbytes", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1394", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "unit": "percent" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 2 + }, + "hiddenSeries": false, + "id": 188, + "legend": { + "avg": false, 
+ "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by(job, instance) (node_memory_MemTotal_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"} - node_memory_MemFree_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"} - node_memory_Cached_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"} - node_memory_Buffers_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"} - node_memory_SReclaimable_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}) / sum by(job, instance) (node_memory_MemTotal_bytes{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}) * 100", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Mem Used Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1574", + "format": "percent", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1575", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "unit": 
"percent" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 8 + }, + "hiddenSeries": false, + "id": 192, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "avg by(job, instance, device) (rate(node_disk_io_time_seconds_total{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}[$__rate_interval])) * 100", + "legendFormat": "{{instance}}-{{device}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "avg by(job, instance) (rate(node_disk_io_time_seconds_total{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}[$__rate_interval])) * 100", + "hide": false, + "legendFormat": "{{instance}}-allDisks", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "I/O Util", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:900", + "format": "percent", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:901", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + 
"dashes": false, + "datasource": {}, + "fieldConfig": { + "defaults": { + "unit": "percent" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 8 + }, + "hiddenSeries": false, + "id": 202, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "exemplar": false, + "expr": "sum by (instance, mountpoint) (100 - (node_filesystem_avail_bytes{fstype!~\"rootfs|tmpfs|devtmpfs|proc|sysfs|cgroup|bpf\",job=~\"$cluster_id.*\",mountpoint=~\".*\",group=~\"$node_exporter_group\"} * 100) / (node_filesystem_size_bytes{fstype!~\"rootfs|tmpfs|devtmpfs|proc|sysfs|cgroup|bpf\",job=~\"$cluster_id.*\",mountpoint=~\".*\",group=~\"$node_exporter_group\"}))", + "format": "time_series", + "instant": false, + "interval": "", + "legendFormat": "{{instance}}-{{mountpoint}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Disk Used Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1694", + "format": "percent", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1695", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + 
"aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "unit": "GBs" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 8 + }, + "hiddenSeries": false, + "id": 200, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "avg by(job, instance, device) (rate(node_disk_written_bytes_total{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}[$__rate_interval])) / 1.048576e+09", + "legendFormat": "{{instance}}-{{device}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "avg by(job, instance) (rate(node_disk_written_bytes_total{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}[$__rate_interval])) / 1.048576e+09", + "hide": false, + "legendFormat": "{{instance}}-allDisks", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Disk Write Throughput", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "GBs", + "logBase": 1, + "show": true + }, + { + "format": "short", + 
"logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "unit": "GBs" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 15 + }, + "hiddenSeries": false, + "id": 198, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by(job, instance, device) (rate(node_disk_read_bytes_total{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}[$__rate_interval])) / 1.048576e+09", + "format": "time_series", + "interval": "", + "legendFormat": "{{instance}}-{{device}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "avg by(job, instance) (rate(node_disk_read_bytes_total{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}[$__rate_interval])) / 1.048576e+09", + "hide": false, + "legendFormat": "{{instance}}-allDisks", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Disk Read Throughput", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": 
true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1755", + "format": "GBs", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1756", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "unit": "GBs" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 15 + }, + "hiddenSeries": false, + "id": 205, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by(job, instance) (rate(node_network_transmit_bytes_total{device!~\"bond.*?|lo\", group=~\"$node_exporter_group\" ,job=~\"$cluster_id.*\"}[$__rate_interval])) / 1.048576e+09", + "format": "time_series", + "interval": "", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Network Outbound Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1273", + "format": "GBs", + "logBase": 1, + "show": true + }, + { + 
"$$hashKey": "object:1274", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "unit": "GBs" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 15 + }, + "hiddenSeries": false, + "id": 194, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by(job, instance) (rate(node_network_receive_bytes_total{device!~\"bond.*?|lo\",job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"}[$__rate_interval])) / 1.048576e+09", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Network Inbound Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1875", + "format": "GBs", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1876", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + 
"dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "unit": "none" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 22 + }, + "hiddenSeries": false, + "id": 204, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by(job, instance) (node_filesystem_files_free{fstype!~\"tmpfs|rootfs\",job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"})", + "format": "time_series", + "interval": "", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Inode Free Count", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "none", + "logBase": 1, + "show": true + }, + { + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "unit": "none" + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 22 + }, + 
"hiddenSeries": false, + "id": 203, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true, + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "abs(node_ntp_offset_seconds{job=~\"$cluster_id.*\",group=~\"$node_exporter_group\"})", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "NTP Offset", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "s", + "logBase": 1, + "show": true + }, + { + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + } + ], + "title": "Host Monitor", + "type": "row" + }, + { + "collapsed": true, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 40 + }, + "id": 47, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Requests per seconds on each Frontends.\nRequests include all requests sending to the Frontends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 8, + "x": 0, + "y": 3 + }, + "hiddenSeries": false, + "id": 52, + "legend": { + "alignAsTable": true, + "avg": true, + 
"current": false, + "hideEmpty": false, + "hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by(job, instance, user) (doris_fe_rps{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-{{user}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "RPS", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1329", + "decimals": 2, + "format": "ops", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1330", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Queries per seconds on each Frontends.\nQueries only include Select requests.", + "fill": 1, + "fillGradient": 0, + "format": "time_series", + "gridPos": { + "h": 9, + "w": 8, + "x": 8, + "y": 3 + }, + "hiddenSeries": false, + "id": 53, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + "hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + 
"linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by(job, instance) (doris_fe_qps{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", user=\"\", cluster_id=\"\"})", + "hide": false, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "QPS", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1179", + "decimals": 2, + "format": "ops", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1180", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "99 quantiles of query latency on each Frontends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 8, + "x": 16, + "y": 3 + }, + "hiddenSeries": false, + "id": 54, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + "hideZero": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + 
"steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by (job, instance, user) (doris_fe_query_latency_ms{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\", quantile=\"0.99\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-{{user}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "99th Latency", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1557", + "format": "ms", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1558", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axes indicates 95 to 99 quantiles of query latency on each Frontends.\nRight Y axes indicates the query rate per 1 min.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 12 + }, + "hiddenSeries": false, + "id": 30, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "max": false, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:479", + "alias": "query rate", + "lines": false, + "points": true, + "yaxis": 2 + }, + { + "$$hashKey": "object:480", + "alias": "0.999", + "legend": false, + "lines": false + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": 
false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by(job, instance, quantile, user) (doris_fe_query_latency_ms{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\"})", + "format": "time_series", + "hide": false, + "intervalFactor": 2, + "legendFormat": "{{quantile}}-{{instance}}-{{user}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Query Percentile", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:499", + "format": "ms", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:500", + "decimals": 0, + "format": "short", + "logBase": 1, + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axes indicates the accumulated error queries number.\nRight Y axes indicates the error query rate per 1 min.\nNormally, the error query rate should be 0.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 12 + }, + "hiddenSeries": false, + "id": 33, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": true, + "min": true, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:1205", + "alias": "/query_err_.*/", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": 
{ + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by (job, instance, user) (doris_fe_query_err{job=\"$cluster_id\", instance=~\"$fe_instance\", cluster_id=\"\"})", + "format": "time_series", + "hide": false, + "intervalFactor": 2, + "legendFormat": "query_err_count-{{instance}}-{{user}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by(job, instance, user) (doris_fe_query_err_rate{job=\"$cluster_id\", instance=~\"$fe_instance\", cluster_id=\"\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "query_err_rate-{{instance}}-{{user}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Query Error", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1212", + "decimals": 2, + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1213", + "format": "cps", + "logBase": 1, + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Number of mysql connections of each Frontend.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 12 + }, + "hiddenSeries": false, + "id": 34, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], 
+ "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "max by(job, instance) (doris_fe_connection_total{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "FE Mysql Connections", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:970", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:971", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "refId": "A" + } + ], + "title": "Query Statistics", + "type": "row" + }, + { + "collapsed": true, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 41 + }, + "id": 128, + "panels": [ + { + "columns": [], + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Statistic of Broker load jobs's num in each Load State.", + "fontSize": "100%", + "gridPos": { + "h": 6, + "w": 6, + "x": 0, + "y": 4 + }, + "id": 141, + "links": [], + "scroll": true, + "showHeader": true, + "sort": { + "col": 0, + "desc": true + }, + "styles": [ + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "state", + "thresholds": [], + "type": "string", + "unit": "short" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + 
"rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "mappingType": 1, + "pattern": "Value", + "thresholds": [], + "type": "number", + "unit": "none" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "mappingType": 1, + "pattern": "/.*/", + "thresholds": [], + "type": "hidden", + "unit": "short" + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", exported_job=\"load\", type=\"BROKER\", instance=\"$fe_master\"}", + "format": "table", + "hide": false, + "instant": true, + "intervalFactor": 2, + "refId": "A" + } + ], + "title": "Broker Load Job", + "transform": "table", + "type": "table-old" + }, + { + "columns": [], + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Statistic of load jobs's num in each Load State which is generated by Insert Stmt.", + "fontSize": "100%", + "gridPos": { + "h": 6, + "w": 6, + "x": 6, + "y": 4 + }, + "id": 140, + "links": [], + "scroll": true, + "showHeader": true, + "sort": { + "col": 0, + "desc": true + }, + "styles": [ + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "state", + "thresholds": [], + "type": "string", + "unit": "short" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "mappingType": 1, + "pattern": "Value", + "thresholds": [], + "type": "number", + "unit": "none" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 
172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "mappingType": 1, + "pattern": "/.*/", + "thresholds": [], + "type": "hidden", + "unit": "short" + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", exported_job=\"load\", type=\"INSERT\", instance=\"$fe_master\"}", + "format": "table", + "instant": true, + "intervalFactor": 2, + "refId": "A" + } + ], + "title": "Insert Load Job", + "transform": "table", + "type": "table-old" + }, + { + "columns": [], + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Statistic of Routine load jobs's num in each Load State.", + "fontSize": "100%", + "gridPos": { + "h": 6, + "w": 6, + "x": 12, + "y": 4 + }, + "id": 164, + "links": [], + "scroll": true, + "showHeader": true, + "sort": { + "col": 0, + "desc": true + }, + "styles": [ + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "state", + "thresholds": [], + "type": "string", + "unit": "short" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "mappingType": 1, + "pattern": "Value", + "thresholds": [], + "type": "number", + "unit": "none" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "mappingType": 1, + "pattern": "/.*/", + "thresholds": [], + "type": "hidden", + "unit": "short" + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", exported_job=\"load\", 
type=\"ROUTINE_LOAD\", instance=\"$fe_master\"}", + "format": "table", + "instant": true, + "intervalFactor": 2, + "refId": "A" + } + ], + "title": "Routine Load Job", + "transform": "table", + "type": "table-old" + }, + { + "columns": [], + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Statistic of Spark load jobs's num in each Load State.", + "fontSize": "100%", + "gridPos": { + "h": 6, + "w": 6, + "x": 18, + "y": 4 + }, + "id": 166, + "links": [], + "scroll": true, + "showHeader": true, + "sort": { + "col": 0, + "desc": true + }, + "styles": [ + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "state", + "thresholds": [], + "type": "string", + "unit": "short" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "mappingType": 1, + "pattern": "Value", + "thresholds": [], + "type": "number", + "unit": "none" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "mappingType": 1, + "pattern": "/.*/", + "thresholds": [], + "type": "hidden", + "unit": "short" + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", exported_job=\"load\", type=\"SPARK\", instance=\"$fe_master\"}", + "format": "table", + "instant": true, + "intervalFactor": 2, + "refId": "A" + } + ], + "title": "Spark Load Job", + "transform": "table", + "type": "table-old" + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + 
"description": "The trend report of broker load job", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 6, + "x": 0, + "y": 10 + }, + "hiddenSeries": false, + "id": 133, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sort": "current", + "sortDesc": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", exported_job=\"load\", type=\"BROKER\", instance=\"$fe_master\", state=~\"PENDING|ETL|LOADING|FINISHED\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{state}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Broker Load Tendency", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:721", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:722", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The trend report of insert load job", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 6, + "x": 6, + "y": 10 + }, + "hiddenSeries": false, + "id": 134, + "legend": { + "alignAsTable": false, + "avg": false, + 
"current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sort": "current", + "sortDesc": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", exported_job=\"load\", type=\"INSERT\", instance=\"$fe_master\", state=~\"PENDING|ETL|LOADING|FINISHED\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{state}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Insert Load Tendency", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:929", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:930", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The trend report of routine load job", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 6, + "x": 12, + "y": 10 + }, + "hiddenSeries": false, + "id": 170, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sort": "current", + "sortDesc": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + 
"options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", exported_job=\"load\", type=\"ROUTINE_LOAD\", instance=\"$fe_master\", state=~\"NEED_SCHEDULE|RUNNING|PAUSED\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{state}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Routine Load Tendency", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2767", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2768", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The trend report of spark load job", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 6, + "x": 18, + "y": 10 + }, + "hiddenSeries": false, + "id": 168, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sort": "current", + "sortDesc": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": 
false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", exported_job=\"load\", type=\"SPARK\", instance=\"$fe_master\", state=~\"PENDING|ETL|LOADING|FINISHED\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{state}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Spark Load Tendency", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2985", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2986", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "columns": [], + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Number of running schema change jobs.", + "fontSize": "100%", + "gridPos": { + "h": 3, + "w": 6, + "x": 0, + "y": 16 + }, + "id": 135, + "links": [], + "scroll": true, + "showHeader": true, + "sort": { + "col": 0, + "desc": true + }, + "styles": [ + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "state", + "thresholds": [], + "type": "string", + "unit": "short" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "mappingType": 1, + "pattern": "Value", + "thresholds": [], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + 
"decimals": 2, + "link": false, + "mappingType": 1, + "pattern": "/.*/", + "thresholds": [], + "type": "hidden", + "unit": "short" + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", instance=\"$fe_master\", type=\"SCHEMA_CHANGE\"}", + "format": "table", + "hide": false, + "instant": true, + "intervalFactor": 2, + "legendFormat": "asds", + "refId": "A" + } + ], + "title": "SC Job", + "transform": "table", + "type": "table-old" + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Queue size of report in Master FE.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 6, + "x": 6, + "y": 16 + }, + "hiddenSeries": false, + "id": 137, + "legend": { + "avg": true, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_report_queue_size{job=\"$cluster_id\", instance=\"$fe_master\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Report queue size", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Report Queue Size", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2027", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + 
}, + { + "$$hashKey": "object:2028", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Stream Load Job Txn Request", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 6, + "x": 12, + "y": 16 + }, + "hiddenSeries": false, + "id": 210, + "legend": { + "avg": true, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "rate(doris_be_stream_load_txn_request{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-{{type}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Stream Load Job", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2027", + "decimals": 2, + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2028", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "columns": [], + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Number of running rollup jobs.", + "fontSize": "100%", + "gridPos": { + "h": 3, + "w": 
6, + "x": 0, + "y": 19 + }, + "id": 136, + "links": [], + "scroll": true, + "showHeader": true, + "sort": { + "col": 0, + "desc": true + }, + "styles": [ + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "state", + "thresholds": [], + "type": "string", + "unit": "short" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "mappingType": 1, + "pattern": "Value", + "thresholds": [], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "align": "auto", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "mappingType": 1, + "pattern": "/.*/", + "thresholds": [], + "type": "hidden", + "unit": "short" + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_job{job=\"$cluster_id\", instance=\"$fe_master\", type=\"ROLLUP\"}", + "format": "table", + "instant": true, + "intervalFactor": 2, + "refId": "A" + } + ], + "title": "Rollup Job", + "transform": "table", + "type": "table-old" + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "refId": "A" + } + ], + "title": "Jobs", + "type": "row" + }, + { + "collapsed": true, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 42 + }, + "id": 107, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Show the number and rate of txn begin and success", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, 
+ "x": 0, + "y": 5 + }, + "hiddenSeries": false, + "id": 124, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "txn failed", + "yaxis": 1 + }, + { + "alias": "rate", + "yaxis": 2 + }, + { + "alias": "txn begin rate", + "lines": false, + "points": true, + "yaxis": 2 + }, + { + "alias": "txn success rate", + "lines": false, + "points": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_txn_counter{type=\"begin\"}", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + "legendFormat": "{{instance}}_txn begin", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_txn_counter{type=\"success\"}", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + "legendFormat": "{{instance}}_txn success", + "range": true, + "refId": "D" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_fe_txn_counter{type=\"begin\"}[$__rate_interval])", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + "legendFormat": "{{instance}}_txn begin rate", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_fe_txn_counter{type=\"success\"}[$__rate_interval])", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + 
"legendFormat": "{{instance}}_txn success rate", + "range": true, + "refId": "C" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Txn Begin/Success on FE", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2530", + "format": "none", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2531", + "format": "ops", + "logBase": 1, + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Show the failed txn request. Including rejected request and failed txn", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 5 + }, + "hiddenSeries": false, + "id": 123, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "txn failed rate", + "lines": false, + "points": true + }, + { + "alias": "txn reject rate", + "lines": false, + "points": true + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "rate(doris_fe_txn_counter{type=\"reject\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}_txn reject rate", + "range": true, + "refId": "C" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": 
"rate(doris_fe_txn_counter{type=\"failed\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}_txn failed rate", + "range": true, + "refId": "D" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Txn Failed/Reject on FE", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2966", + "format": "ops", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2967", + "format": "ops", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The number of total publish task request and error rate.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 5 + }, + "hiddenSeries": false, + "id": 126, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_engine_requests_total{job=\"$cluster_id\", type=\"publish\", status=\"total\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + 
"editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", type=\"publish\", status=\"failed\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Publish Task on BE", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Show the txn status on FE", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 11 + }, + "hiddenSeries": false, + "id": 102, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "/rollback/", + "color": "#bf1b00", + "points": true + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_txn_status{group=\"fe\", job=\"$cluster_id\", type=\"prepare\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "prepare", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + 
"editorMode": "code", + "expr": "doris_fe_txn_status{group=\"fe\", job=\"$cluster_id\", type=\"precommitted\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "precommitted", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_txn_status{group=\"fe\", job=\"$cluster_id\", type=\"committed\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "committed", + "range": true, + "refId": "C" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_txn_status{group=\"fe\", job=\"$cluster_id\", type=\"aborted\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "aborted", + "range": true, + "refId": "D" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "exemplar": false, + "expr": "doris_fe_txn_status{group=\"fe\", job=\"$cluster_id\", type=\"visible\"}", + "hide": false, + "legendFormat": "visible", + "range": true, + "refId": "E" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "exemplar": false, + "expr": "doris_fe_txn_status{group=\"fe\", job=\"$cluster_id\", type=\"unknown\"}", + "hide": false, + "legendFormat": "unknown", + "range": true, + "refId": "F" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Txn Status on FE", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "none", + "logBase": 1, + "show": true + }, + { + "format": "ops", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axes indicates the total 
received bytes rate of txn. Right Y axes indicates the loaded rows rate of txn.", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 11 + }, + "hiddenSeries": false, + "id": 103, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "rows", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(rate(doris_be_stream_load{group=\"be\", compute_group=~\"$compute_group\", job=\"$cluster_id\", type=\"receive_bytes\"}[$__rate_interval]))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "bytes", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(rate(doris_be_stream_load{group=\"be\", compute_group=~\"$compute_group\", job=\"$cluster_id\", type=\"load_rows\"}[$__rate_interval]))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "rows", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Txn Load Bytes/Rows rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:3114", + "format": "Bps", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:3115", + "format": "ops", + "logBase": 1, + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + } + 
], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "refId": "A" + } + ], + "title": "Transactions", + "type": "row" + }, + { + "collapsed": true, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 43 + }, + "id": 49, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The max replayed meta data journal id on Frontends.\nNormally, all Frontends should be same on this metrics, or just slightly different for a short period.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 6 + }, + "hiddenSeries": false, + "id": 63, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_max_journal_id{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Max Replayed Journal ID", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1483", + "format": "none", + "logBase": 1, + "show": true + }, + { + "$$hashKey": 
"object:1484", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The edit log size for each FE", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 6 + }, + "hiddenSeries": false, + "id": 150, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_edit_log{job=\"$cluster_id\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-{{type}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Edit Log Size", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:162", + "decimals": 0, + "format": "decbytes", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:163", + "format": "none", + "label": "", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The edit log clean of each FE", + "fill": 0, + "fillGradient": 0, + 
"gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 6 + }, + "hiddenSeries": false, + "id": 144, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "sort": "avg", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_edit_log_clean{job=\"$cluster_id\", type=\"success\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}_success", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_edit_log_clean{job=\"$cluster_id\", type=\"failed\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\"}", + "hide": false, + "legendFormat": "{{instance}}_failed", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Edit Log Clean", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:162", + "decimals": 0, + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:163", + "format": "none", + "label": "", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + 
"description": "The image push of each FE", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 12 + }, + "hiddenSeries": false, + "id": 154, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "sort": "avg", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_image_push{job=\"$cluster_id\", type=\"success\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}_success", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_image_push{job=\"$cluster_id\", type=\"failed\"}", + "hide": false, + "legendFormat": "{{instance}}_failed", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Image Push", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:162", + "decimals": 0, + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:163", + "format": "none", + "label": "", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The image Write of each FE", + 
"fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 12 + }, + "hiddenSeries": false, + "id": 156, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "sort": "avg", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_image_write{job=\"$cluster_id\", type=\"success\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}_success", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_image_write{job=\"$cluster_id\", type=\"failed\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\"}", + "hide": false, + "legendFormat": "{{instance}}_failed", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Image Write", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:162", + "decimals": 0, + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:163", + "format": "none", + "label": "", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": 
"prometheus", + "uid": "" + }, + "description": "The image clean of each FE", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 12 + }, + "hiddenSeries": false, + "id": 146, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "sort": "avg", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_image_clean{job=\"$cluster_id\", type=\"success\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}_success", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_image_clean{job=\"$cluster_id\", type=\"failed\"}", + "hide": false, + "legendFormat": "{{instance}}_failed", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Image Clean", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:162", + "decimals": 0, + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:163", + "format": "none", + "label": "", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": 
"The counter of meta data image generation on Master Frontend. And the counter of image successfully pushing to other Non-master Frontends.\nThese metrics is expected to increase at reasonable intervals. And normally, they should be equal.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 18 + }, + "hiddenSeries": false, + "id": 65, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_image_write{job=\"$cluster_id\", instance=\"$fe_master\"}", + "format": "time_series", + "instant": false, + "intervalFactor": 1, + "legendFormat": "{{instance}}-write", + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_image_push{job=\"$cluster_id\", instance=\"$fe_master\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-push", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Image Counter", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:263", + "decimals": 0, + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:264", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + 
"aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The left Y axes shows write latency of 99th. The right Y axes shows the write per seconds of journal.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 18 + }, + "hiddenSeries": false, + "id": 112, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:773", + "alias": "/.*-rate/", + "points": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_editlog_write_latency_ms{job=\"$cluster_id\", instance=~\"$fe_instance\", quantile=\"0.99\"}", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + "legendFormat": "{{instance}}-99th", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "rate(doris_fe_edit_log{job=\"$cluster_id\", type=\"write\", instance=~\"$fe_instance\"}[$__rate_interval])", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + "legendFormat": "{{instance}}-write-rate", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BDBJE Write Latency & Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:500", + "format": 
"ms", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:501", + "format": "wps", + "logBase": 1, + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The left Y axes shows the read per seconds of journal.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 18 + }, + "hiddenSeries": false, + "id": 152, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:457", + "alias": "/.*-rate/", + "points": true + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "exemplar": false, + "expr": "rate(doris_fe_edit_log{job=\"$cluster_id\", type=\"read\", instance=~\"$fe_instance\"}[$__rate_interval])", + "format": "time_series", + "hide": false, + "instant": false, + "intervalFactor": 1, + "legendFormat": "{{instance}}-read-rate", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BDBJE Read Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:500", + "format": "rps", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:501", + "format": "wps", + "logBase": 1, + "show": false + } + ], + 
"yaxis": { + "align": false + } + }, + { + "aliasColors": { + "percentage": "#890f02" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "JVM Heap usage of specified Frontend.\nLeft Y Axes shows the used/max heap size.\nRight Y Axes shows the used percentage.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 24 + }, + "hiddenSeries": false, + "id": 13, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:1409", + "alias": "/.*-percentage/", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "jvm_heap_size_bytes{job=\"$cluster_id\", instance=~\"$fe_instance\", type=\"used\"}", + "format": "time_series", + "hide": false, + "instant": false, + "intervalFactor": 2, + "legendFormat": "{{instance}}-used", + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "jvm_heap_size_bytes{job=\"$cluster_id\", instance=~\"$fe_instance\", type=\"max\"}", + "format": "time_series", + "hide": false, + "intervalFactor": 2, + "legendFormat": "{{instance}}-max", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by (instance) (jvm_heap_size_bytes{job=\"$cluster_id\", instance=~\"$fe_instance\", type=\"used\"}) * 100 / sum by (instance) 
(jvm_heap_size_bytes{job=\"$cluster_id\",instance=~\"$fe_instance\", type=\"max\"})", + "format": "time_series", + "hide": false, + "intervalFactor": 2, + "legendFormat": "{{instance}}-percentage", + "range": true, + "refId": "C" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "JVM Heap", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2307", + "format": "bytes", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2308", + "format": "percent", + "logBase": 1, + "max": "100", + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": { + "percentage": "#890f02" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "JVM Non Heap usage of specified Frontend.\nLeft Y Axes shows the used/committed non heap size.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 24 + }, + "hiddenSeries": false, + "id": 24, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:1679", + "alias": "percentage", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "jvm_non_heap_size_bytes{job=\"$cluster_id\", instance=~\"$fe_instance\", type=\"used\"}", + "format": "time_series", + "hide": false, + "instant": false, + 
"intervalFactor": 2, + "legendFormat": "{{instance}}-used", + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "jvm_non_heap_size_bytes{job=\"$cluster_id\", instance=~\"$fe_instance\", type=\"committed\"}", + "format": "time_series", + "hide": false, + "intervalFactor": 2, + "legendFormat": "{{instance}}-committed", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "JVM Non Heap", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2336", + "format": "bytes", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2337", + "format": "percent", + "logBase": 1, + "max": "100", + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Num of threads of FE JVM", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 24 + }, + "hiddenSeries": false, + "id": 88, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": true, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "jvm_thread{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"count\"}", + "format": "time_series", + "intervalFactor": 1, + 
"legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "JVM Threads", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2481", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2482", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": { + "percentage": "#890f02" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "JVM old generation usage of specified Frontend. Left Y Axes shows the used/max old generation size. Right Y Axes shows the used percentage.\nNormally, the usage percentage should be less than 80%.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 30 + }, + "hiddenSeries": false, + "id": 426, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:1224", + "alias": "/.*-percentage/", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "jvm_old_size_bytes{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\", type=\"used\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{instance}}-used", + "range": true, + "refId": "A" + }, 
+ { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "jvm_old_size_bytes{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", cluster_id=\"\", type=\"max\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{instance}}-max", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by (instance) (jvm_old_size_bytes{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"used\"}) * 100 / sum by (instance) (jvm_old_size_bytes{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"max\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{instance}}-percentage", + "range": true, + "refId": "C" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "JVM Old", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2365", + "format": "bytes", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2366", + "format": "percent", + "logBase": 1, + "max": "100", + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": { + "percentage": "#890f02" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "JVM young generation usage of specified Frontend.\nLeft Y Axes shows the used/max young generation size.\nRight Y Axes shows the used percentage.\nNormally, the usage percentage should be less than 80%.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 30 + }, + "hiddenSeries": false, + "id": 27, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 
1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:872", + "alias": "/.*-percentage/", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "jvm_young_size_bytes{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"used\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{instance}}-used", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "jvm_young_size_bytes{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"max\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{instance}}-max", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by (instance) (jvm_young_size_bytes{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"used\"}) * 100 / sum by (instance) (jvm_young_size_bytes{job=\"$cluster_id\", group=\"fe\", instance=~\"$fe_instance\", type=\"max\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{instance}}-percentage", + "range": true, + "refId": "C" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "JVM Young", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2365", + "format": "bytes", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2366", + "format": "percent", + "label": "", + "logBase": 1, + "max": "100", + "min": "0", + "show": true + } + ], + 
"yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Cluster's max Compaction Score value collected by master FE", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 30 + }, + "hiddenSeries": false, + "id": 158, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "sort": "avg", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_max_tablet_compaction_score{job=\"$cluster_id\", cluster_id=\"\"} * on(instance) group_left(group, job) (node_info{type=\"is_master\", group=\"fe\"} == 1)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Master FE: {{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Cluster Max Compaction Score", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:214", + "decimals": 0, + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:215", + "format": "short", + "logBase": 1, + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": 
"prometheus", + "uid": "" + }, + "description": "Number of tablets being scheduled. These tablets may be in the recovery process or the balance process.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 36 + }, + "hiddenSeries": false, + "id": 117, + "legend": { + "avg": false, + "current": true, + "max": false, + "min": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_scheduled_tablet_num{job=\"$cluster_id\", instance=\"$fe_master\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Scheduling tablet number", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Scheduling Tablets", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1383", + "decimals": 0, + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1384", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "refId": "A" + } + ], + "title": "FE", + "type": "row" + }, + { + "collapsed": true, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 44 + }, + "id": 50, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": 
"prometheus", + "uid": "" + }, + "description": "The file descriptor usage of Backends. Left Y axes shows the used fd num. Right Y axes shows the soft limit open file number.", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 94, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:1675", + "alias": "/.*limit/", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_be_process_fd_num_used{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-used", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_be_process_fd_num_limit_soft{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-soft limit", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE FD Count", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1657", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": 
"object:1658", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The thread number of Backends", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 95, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_be_process_thread_num{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Thread Num", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1686", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1687", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Number of tablets of 
each Backend", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 7 + }, + "hiddenSeries": false, + "id": 115, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_fe_tablet_num{job=\"$cluster_id\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{backend}}-backend", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Tablet Distribution", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1609", + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1610", + "format": "short", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Base compaction rate of Backends.\nNormally, base compaction only runs between 20:00 and 4:00 and it is configurable.\nRight Y axes indicates the total base compaction bytes.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 0, + "y": 14 + }, + "hiddenSeries": false, + "id": 39, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": true, + "min": 
false, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 0, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:2154", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + }, + { + "$$hashKey": "object:2155", + "alias": "Total", + "color": "rgb(27, 255, 0)", + "fill": 0, + "points": true, + "steppedLine": false, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": true, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "rate(doris_be_compaction_bytes_total{type=\"base\", job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_compaction_bytes_total{type=\"base\", job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Compaction Base", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1813", + "format": "Bps", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1814", + "format": "bytes", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + 
"datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Cumulative compaction rate of Backends.\nRight Y axes indicates the total cumulative compaction bytes.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 12, + "y": 14 + }, + "hiddenSeries": false, + "id": 40, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": true, + "min": false, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 0, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:2388", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + }, + { + "$$hashKey": "object:2389", + "alias": "Total", + "color": "rgb(15, 255, 0)", + "fill": 0, + "points": true, + "steppedLine": false, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": true, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "rate(doris_be_compaction_bytes_total{type=\"cumulative\", job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_compaction_bytes_total{type=\"cumulative\", job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Compaction Cumulate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" 
+ }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1842", + "format": "Bps", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1843", + "format": "bytes", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Loading rate of Backends.\nThis indicates the rate of file downloading in LOADING state of load job(MINI and BROKER load).\nRight Y axes indicates the total rate of file downloading.", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 0, + "y": 19 + }, + "hiddenSeries": false, + "id": 41, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sort": "avg", + "sortDesc": false, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:2629", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + }, + { + "$$hashKey": "object:2630", + "alias": "Total rate", + "bars": false, + "color": "rgb(56, 255, 0)", + "lines": true, + "points": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "rate(doris_be_push_request_write_bytes{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { 
+ "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(rate(doris_be_push_request_write_bytes{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval]))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total rate", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Push Bytes", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2006", + "format": "Bps", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2007", + "format": "Bps", + "logBase": 1, + "min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Loading rows rate of Backends.\nThis indicates the rate of rows loaded in LOADING state of load job. 
Right Y axes shows the total push rate of cluster.", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 12, + "y": 19 + }, + "hiddenSeries": false, + "id": 42, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:2862", + "alias": "/Total/", + "color": "rgb(0, 255, 26)", + "points": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "rate(doris_be_push_request_write_rows{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(rate(doris_be_push_request_write_rows{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "Total", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Push Rows", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1925", + "format": "ops", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1926", + "format": "ops", + "logBase": 1, + 
"min": "0", + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Scan rate of Backends.\nThis indicates the read rate when processing queries.", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 0, + "y": 24 + }, + "hiddenSeries": false, + "id": 43, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "rate(doris_be_query_scan_bytes{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Scan Bytes", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "Bps", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": 
false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Scan rows rate of Backends.\nThis indicates the read rows rate when processing queries.", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 12, + "y": 24 + }, + "hiddenSeries": false, + "id": 44, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:1418", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "rate(doris_be_query_scan_rows{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Scan Rows", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1954", + "format": "ops", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:1955", + "format": "ops", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "bthread worker count.", + "fill": 0, + "fillGradient": 0, 
+ "gridPos": { + "h": 5, + "w": 12, + "x": 0, + "y": 29 + }, + "hiddenSeries": false, + "id": 208, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "bthread_worker_count{job=\"$cluster_id\", group=\"be-brpc\", compute_group=~\"$compute_group\", instance=~\"$brpc_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Bthread Worker Count", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "Bps", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 12, + "y": 29 + }, + "hiddenSeries": false, + "id": 207, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, 
+ "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "bthread_timer_usage{job=\"$cluster_id\", group=\"be-brpc\", compute_group=~\"$compute_group\", instance=~\"$brpc_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Bthread Timer Usage", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "Bps", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "bthread worker usage.", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 0, + "y": 34 + }, + "hiddenSeries": false, + "id": 209, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": 
"9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "bthread_worker_usage{job=\"$cluster_id\", group=\"be-brpc\", compute_group=~\"$compute_group\", instance=~\"$brpc_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Bthread Worker Usage", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "Bps", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "BE light work pool queue size", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 12, + "y": 34 + }, + "hiddenSeries": false, + "id": 215, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": true, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + 
"stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_be_light_work_pool_queue_size{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Light Work pool Queue Size", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Brpc heavy thread pool queue size", + "fieldConfig": { + "defaults": { + "unit": "none" + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 0, + "y": 39 + }, + "hiddenSeries": false, + "id": 212, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": true, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": 
"prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_be_heavy_work_pool_queue_size{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Heavy Work pool Queue Size", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "BE light work active threads", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 12, + "y": 39 + }, + "hiddenSeries": false, + "id": 213, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": true, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_be_light_work_active_threads{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", 
instance=~\"$be_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "BE Light Work Active Threads", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "BE heavy work active threads", + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 0, + "y": 44 + }, + "hiddenSeries": false, + "id": 214, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": true, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "doris_be_heavy_work_active_threads{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], +
"timeRegions": [], + "title": "BE Heavy Work Active Threads", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "none", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "unit": "percent" + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 12, + "y": 44 + }, + "hiddenSeries": false, + "id": 430, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": true, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "(sum by(cluster_name, group, instance, compute_group, job, workload_group) (rate(doris_be_workload_group_cpu_time_sec{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval])) / scalar(max(doris_be_avail_cpu_num{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\",instance=~\"$be_instance\"}))) * 100", + "format": "time_series", +
"intervalFactor": 1, + "legendFormat": "{{instance}}-{{workload_group}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Workload Group CPU Used Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "percent", + "logBase": 1, + "max": "100", + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "max": "100", + "min": "0", + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "unit": "decmbytes" + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 0, + "y": 49 + }, + "hiddenSeries": false, + "id": 432, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": true, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by (job, instance, workload_group) (doris_be_workload_group_mem_used_bytes{job=\"$cluster_id\", group=\"be\", instance=~\"$be_instance\"} / 1.048576e+06)", + "format": "time_series", + "intervalFactor": 1, +
"legendFormat": "{{instance}}-{{workload_group}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Workload Group Mem Used", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "decmbytes", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "unit": "MBs" + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 5, + "w": 12, + "x": 12, + "y": 49 + }, + "hiddenSeries": false, + "id": 431, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "max": true, + "min": true, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.21", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:3082", + "alias": "/Counter/", + "stack": true, + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by(job, compute_group, instance, group, workload_group) (rate(doris_be_workload_group_local_scan_bytes{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval]) / 1.048576e+06)", + "format": "time_series", + "intervalFactor": 1, +
"legendFormat": "local-{{instance}}-{{workload_group}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum by(job, compute_group, instance, group, workload_group) (rate(doris_be_workload_group_remote_scan_bytes{job=\"$cluster_id\", group=\"be\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval]) / 1.048576e+06)", + "hide": false, + "legendFormat": "remote-{{instance}}-{{workload_group}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Workload Group File Read Rate(Local & Remote)", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2143", + "format": "MBs", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:2144", + "format": "ops", + "logBase": 1, + "show": false + } + ], + "yaxis": { + "align": false + } + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "refId": "A" + } + ], + "title": "BE", + "type": "row" + }, + { + "collapsed": true, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 45 + }, + "id": 75, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axes indicates the failure rate of specified tasks. 
Normally, it should be 0.\nRight Y axes indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 8 + }, + "hiddenSeries": false, + "id": 78, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:4272", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:4273", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"report_all_tablets\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}-status: {{status}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Tablets Report", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:441", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:442", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axes indicates 
the failure rate of specified tasks. Normally, it should be 0.\nRight Y axes indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 8 + }, + "hiddenSeries": false, + "id": 82, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:4369", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:4370", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"finish_task\", status=\"total\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"finish_task\", status=\"failed\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Finish Task Report", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": 
"object:499", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:500", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axes indicates the failure rate of specified tasks. Normally, it should be 0.\nRight Y axes indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 8 + }, + "hiddenSeries": false, + "id": 80, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:4466", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:4467", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"delete\", status=\"total\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"delete\", status=\"failed\"}[$__rate_interval])", 
+ "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Delete", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:598", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:599", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axes indicates the failure rate of specified tasks. Normally, it should be 0.\nRight Y axes indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 14 + }, + "hiddenSeries": false, + "id": 91, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sortDesc": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:4563", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_push_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", status=\"SUCCESS\"})", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + 
"legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_push_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", status=\"FAIL\"}[$__rate_interval])", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + "legendFormat": "{{instance}}-failed", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Push Task", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:540", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:541", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "The average cost time of push tasks on each Backend.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 14 + }, + "hiddenSeries": false, + "id": 92, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sort": "avg", + "sortDesc": false, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:4653", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:4654", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + 
"targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_push_request_duration_us{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Push Task Cost Time", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:569", + "format": "µs", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:570", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axes indicates the failure rate of specified tasks. 
Normally, it should be 0.\nRight Y axes indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 14 + }, + "hiddenSeries": false, + "id": 81, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:4750", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:4751", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"clone\", status=\"total\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"clone\", status=\"failed\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Clone", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:766", + "format": "short", + "logBase": 1, + "min": "0", + 
"show": true + }, + { + "$$hashKey": "object:767", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axes indicates the failure rate of specified tasks. Normally, it should be 0.\nRight Y axes indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 20 + }, + "hiddenSeries": false, + "id": 84, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:4847", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:4848", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"base_compaction\", status=\"total\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"base_compaction\", status=\"failed\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 
1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Base Compaction", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:627", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:628", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axis indicates the failure rate of specified tasks. Normally, it should be 0.\nRight Y axis indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 20 + }, + "hiddenSeries": false, + "id": 83, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sort": "total", + "sortDesc": false, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:2446", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:2447", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_engine_requests_total{job=\"$cluster_id\", instance=~\"$be_instance\", type=\"cumulative_compaction\", status=\"total\"})", + "format": 
"time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", instance=~\"$be_instance\", type=\"cumulative_compaction\", status=\"failed\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Cumulative Compaction", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:683", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:684", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axis indicates the failure rate of specified tasks. 
Normally, it should be 0.\nRight Y axis indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 20 + }, + "hiddenSeries": false, + "id": 73, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:4944", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:4945", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"create_tablet\", status=\"total\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"create_tablet\", status=\"failed\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Create Tablet", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:880", + "format": "short", + 
"logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:881", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axis indicates the failure rate of specified tasks. Normally, it should be 0.\nRight Y axis indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 26 + }, + "hiddenSeries": false, + "id": 76, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:5041", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:5042", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"create_rollup\", status=\"total\"})", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"create_rollup\", status=\"failed\"}[$__rate_interval])", + "format": 
"time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Create Rollup", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:822", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:823", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "Left Y axis indicates the failure rate of specified tasks. Normally, it should be 0.\nRight Y axis indicates the total number of specified tasks in all Backends.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 26 + }, + "hiddenSeries": false, + "id": 77, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": true, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.5.16", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "$$hashKey": "object:1014", + "alias": "Failed", + "yaxis": 2 + }, + { + "$$hashKey": "object:1015", + "alias": "Total", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "sum(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"schema_change\", status=\"total\"})", + 
"format": "time_series", + "intervalFactor": 1, + "legendFormat": "Total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(doris_be_engine_requests_total{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$be_instance\", type=\"schema_change\", status=\"failed\"}[$__rate_interval])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{instance}}", + "range": true, + "refId": "B" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Schema Change", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:851", + "format": "short", + "logBase": 1, + "min": "0", + "show": true + }, + { + "$$hashKey": "object:852", + "decimals": 0, + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + } + ], + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "refId": "A" + } + ], + "title": "BE tasks", + "type": "row" + }, + { + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 46 + }, + "id": 216, + "panels": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + 
"thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 9 + }, + "id": 7, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"rpc_server_.*_doris_cloud_meta_service_get_tablet_qps\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"rpc_server_.*_selectdb_meta_service_get_tablet_qps\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "__auto", + "range": true, + "refId": "B" + } + ], + "title": "get_tablet_qps", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 9 + }, + "id": 
4, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"rpc_server_.*_doris_cloud_meta_service_finish_tablet_job_qps\",job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "finish-job-{{instance}}", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"rpc_server_.*_selectdb_meta_service_finish_tablet_job_qps\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "__auto", + "range": true, + "refId": "A" + } + ], + "title": "tablet_job_qps", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 17 + }, + "id": 22, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + 
}, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"ms_get_tablet_stats_.*_qps\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "{{__name__}}-{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"ms_commit_txn.*_latency\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": true, + "legendFormat": "{{__name__}}-{{instance}}", + "range": true, + "refId": "B" + } + ], + "title": "warehouse-get_tablet_stats_qps", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 17 + }, + "id": 3, + "options": { + "legend": { + "calcs": [ + "lastNotNull", + "min", + "max" + ], + "displayMode": "table", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": 
"{__name__=~\"rpc_server_.*_doris_cloud_meta_service_begin_txn_qps\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "begin-{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "builder", + "expr": "rpc_server_5000_selectdb_meta_service_commit_txn_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "commit-{{instance}}", + "range": true, + "refId": "B" + } + ], + "title": "txn_qps", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 25 + }, + "id": 9, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_get_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_get_qps ", + "type": "timeseries" + }, + { + "datasource": { + 
"type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "µs" + }, + "overrides": [ + { + "matcher": { + "id": "byRegexp", + "options": "/.*99%.*/" + }, + "properties": [ + { + "id": "custom.transform", + "value": "negative-Y" + } + ] + } + ] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 25 + }, + "id": 23, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_get_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": true, + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_get_latency\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "{{instance}}-avg", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_get_latency_99\", job=\"$cluster_id\", 
group=\"meta_service\"}", + "hide": false, + "legendFormat": "{{instance}}-99%", + "range": true, + "refId": "C" + } + ], + "title": "txn_kv_get_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 33 + }, + "id": 223, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_range_get_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_range_get_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + 
"legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "µs" + }, + "overrides": [ + { + "matcher": { + "id": "byRegexp", + "options": "/.*99%.*/" + }, + "properties": [ + { + "id": "custom.transform", + "value": "negative-Y" + } + ] + } + ] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 33 + }, + "id": 225, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_range_get_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": true, + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_range_get_latency\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "{{instance}}-avg", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_range_get_latency_99\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "{{instance}}-99%", + "range": true, + "refId": "C" + } + ], + "title": "txn_kv_range_get_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" 
+ }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 41 + }, + "id": 224, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_put_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_put_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], 
+ "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "µs" + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 41 + }, + "id": 226, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_put_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_put_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 49 + }, + "id": 19, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + 
"expr": "txn_kv_commit_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_commit_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "µs" + }, + "overrides": [ + { + "matcher": { + "id": "byRegexp", + "options": "/.*99%.*/" + }, + "properties": [ + { + "id": "custom.transform", + "value": "negative-Y" + } + ] + } + ] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 49 + }, + "id": 228, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_commit_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": true, + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_commit_latency\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": 
false, + "legendFormat": "{{instance}}-avg", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_commit_latency_99\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "{{instance}}-99%", + "range": true, + "refId": "C" + } + ], + "title": "txn_kv_commit_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 57 + }, + "id": 227, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_commit_error{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_commit_error", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": 
"palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 57 + }, + "id": 20, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_commit_conflict{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_commit_conflict", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": 
"off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 65 + }, + "id": 17, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_atomic_set_ver_value_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_atomic_set_ver_value_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 65 + }, + "id": 229, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + 
"editorMode": "code", + "expr": "{__name__=~\"txn_kv_atomic_set_ver_value_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_atomic_set_ver_value_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 73 + }, + "id": 18, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_atomic_set_ver_key_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_atomic_set_ver_key_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + 
"axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 73 + }, + "id": 230, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_atomic_set_ver_key_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_atomic_set_ver_key_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": 
"absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 81 + }, + "id": 16, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_remove_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_remove_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 81 + }, + "id": 31, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_remove_latency.*\", job=\"$cluster_id\", 
group=\"meta_service\"}", + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_remove_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 89 + }, + "id": 15, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_range_remove_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_range_remove_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + 
"legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 89 + }, + "id": 25, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_range_remove_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_range_remove_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + 
"h": 8, + "w": 12, + "x": 0, + "y": 97 + }, + "id": 8, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_atomic_add_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_atomic_add_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 97 + }, + "id": 232, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_atomic_add_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + } + ], + "title": 
"txn_kv_atomic_add_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 105 + }, + "id": 231, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_get_read_version_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_get_read_version_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + 
"pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 105 + }, + "id": 32, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_get_read_version_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_get_read_version_latency", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 113 + }, + "id": 233, + "options": { + "legend": { + "calcs": 
[], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "txn_kv_get_committed_version_qps{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_get_committed_version_qps ", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 113 + }, + "id": 234, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"txn_kv_get_committed_version_latency.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + } + ], + "title": "txn_kv_get_committed_version_latency", + "type": "timeseries" + }, + { + 
"datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 121 + }, + "id": 51, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"rpc_server_.*_doris_cloud_meta_service_commit_rowset_count.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}-{{__name__}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\"rpc_server_.*_selectdb_meta_service_commit_rowset_count.*\", job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "__auto", + "range": true, + "refId": "B" + } + ], + "title": "meta_service_commit_rowset_count", + "type": "timeseries" + } + ], + "title": "Meta Service", + "type": "row" + }, + { + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 47 + }, + "id": 218, + "panels": [ + { + 
"datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 10 + }, + "id": 72, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "s3_delete_object_qps{job=\"$cluster_id\", compute_group=~\"$compute_group\", instance=~\"$brpc_instance\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "s3_delete_object_qps", + "type": "timeseries" + } + ], + "title": "Recycler", + "type": "row" + }, + { + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 48 + }, + "id": 385, + "panels": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": 
"none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "none" + }, + "overrides": [] + }, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 11 + }, + "id": 401, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\".*_file_cache_normal_queue_element_count\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": "{{instance}}-{{__name__}}-data", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\".*_file_cache_index_queue_element_count\",job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": "{{instance}}--{{__name__}}-index", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\".*_file_cache_disposable_queue_element_count\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": "{{instance}}--{{__name__}}-disposable", + "range": true, + "refId": "C" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\".*_file_cache_ttl_element_count\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": 
"{{instance}}--{{__name__}}-ttl", + "range": true, + "refId": "D" + } + ], + "title": "file cache element count", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "binBps" + }, + "overrides": [] + }, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 11 + }, + "id": 419, + "options": { + "legend": { + "calcs": [], + "displayMode": "table", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(label_replace({__name__=~\".*_file_cache_total_evict_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}, \"name_label\", \"$1\",\"__name__\", \"(.+)\")[$__rate_interval:])", + "legendFormat": "{{instance}}-total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(label_replace({__name__=~\".*_file_cache_normal_queue_evict_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}, \"name_label\", \"$1\",\"__name__\", \"(.+)\")[$__rate_interval:])", + "hide": false, + 
"legendFormat": "{{instance}}-normal", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(label_replace({__name__=~\".*_file_cache_index_queue_evict_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}, \"name_label\", \"$1\",\"__name__\", \"(.+)\")[$__rate_interval:])", + "hide": false, + "legendFormat": "{{instance}}-index", + "range": true, + "refId": "C" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(label_replace({__name__=~\".*_file_cache_disposable_queue_evict_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}, \"name_label\", \"$1\",\"__name__\", \"(.+)\")[$__rate_interval:])", + "hide": false, + "legendFormat": "{{instance}}-disposable", + "range": true, + "refId": "D" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(label_replace({__name__=~\".*_file_cache_ttl_cache_evict_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}, \"name_label\", \"$1\",\"__name__\", \"(.+)\")[$__rate_interval:])", + "hide": false, + "legendFormat": "{{instance}}-ttl", + "range": true, + "refId": "E" + } + ], + "title": "file cache evict rate", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + 
"mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "bytes" + }, + "overrides": [] + }, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 11 + }, + "id": 400, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\".*_file_cache_cache_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "legendFormat": "{{instance}}-{{__name__}}-total", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\".*_file_cache_normal_queue_cache_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": "{{instance}}-normal", + "range": true, + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\".*_file_cache_index_queue_cache_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": "{{instance}}-{{__name__}}-index", + "range": true, + "refId": "C" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\".*_file_cache_disposable_queue_cache_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": "{{instance}}-{{__name__}}-disposable", + "range": true, + "refId": "D" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "{__name__=~\".*_file_cache_ttl_cache_size\", job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": "{{instance}}-{{__name__}}-ttl", + "range": true, + "refId": "E" + 
} + ], + "title": "file cache data size", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "binBps" + }, + "overrides": [] + }, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 18 + }, + "id": 421, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(s3_file_writer_bytes_written{job=\"$cluster_id\", instance=~\"$brpc_instance\"}[$__rate_interval])", + "hide": false, + "legendFormat": "write_rate-{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "irate(s3_file_reader_bytes_read{job=\"$cluster_id\", instance=~\"$brpc_instance\"}[$__rate_interval])", + "hide": false, + "legendFormat": "read_rate-{{instance}}", + "range": true, + "refId": "B" + } + ], + "title": "s3_write_read_rate", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", 
+ "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "bytes" + }, + "overrides": [] + }, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 18 + }, + "id": 405, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "s3_file_writer_bytes_written{job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": "{{__name__}}-{{instance}}", + "range": true, + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "s3_file_reader_bytes_read{job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "hide": false, + "legendFormat": "{{__name__}}-{{instance}}", + "range": true, + "refId": "B" + } + ], + "title": "s3_write_read", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "", + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + 
"barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 18 + }, + "id": 387, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "buffered_reader_bytes_downloaded{job=\"$cluster_id\", instance=~\"$brpc_instance\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "load bytes downloaded from s3", + "type": "timeseries" + } + ], + "title": "Remote Storage", + "type": "row" + }, + { + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 49 + }, + "id": 217, + "panels": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + 
"mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "bytes" + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 12 + }, + "id": 45, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "fdb_data_total_disk_used_bytes{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "fdb_data_total_disk_used_bytes", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + }, + "unit": "bytes" + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 12 + }, + "id": 219, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + 
"targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "fdb_data_total_kv_size_bytes{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "fdb_data_total_kv_size_bytes", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 20 + }, + "id": 220, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "fdb_client_count{job=\"$cluster_id\", group=\"meta_service\"}", + "hide": false, + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "fdb_client_count", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + 
"axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [ + { + "matcher": { + "id": "byRegexp", + "options": ".*read.*" + }, + "properties": [ + { + "id": "custom.transform", + "value": "negative-Y" + } + ] + } + ] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 20 + }, + "id": 221, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "fdb_workload_read_rate_hz{job=\"$cluster_id\", group=\"meta_service\"}", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "fdb_read_rate_hz", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "When the queue backlog exceeds 1500 MB, it indicates that FDB resources are under pressure.", + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + 
"scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [ + { + "matcher": { + "id": "byRegexp", + "options": ".*read.*" + }, + "properties": [ + { + "id": "custom.transform", + "value": "negative-Y" + } + ] + } + ] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 28 + }, + "id": 222, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "fdb_qos_worst_storage_server_queue_bytes{job=\"$cluster_id\"}/1024/1024", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "fdb_qos_worst_storage_server_queue_mbytes", + "type": "timeseries" + }, + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "description": "When the log queue backlog exceeds 400 MB, it indicates that FDB resources are under pressure.", + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + 
"steps": [ + { + "color": "green" + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [ + { + "matcher": { + "id": "byRegexp", + "options": ".*read.*" + }, + "properties": [ + { + "id": "custom.transform", + "value": "negative-Y" + } + ] + } + ] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 28 + }, + "id": 57, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "" + }, + "editorMode": "code", + "expr": "fdb_qos_worst_log_server_queue_bytes{job=\"$cluster_id\"}/1024/1024", + "legendFormat": "{{instance}}", + "range": true, + "refId": "A" + } + ], + "title": "fdb_qos_worst_log_server_queue_mbytes", + "type": "timeseries" + } + ], + "title": "Fdb", + "type": "row" + } + ], + "refresh": "5s", + "revision": 1, + "schemaVersion": 38, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "selected": false, + "text": "", + "value": "" + }, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "definition": "label_values(up, job)", + "hide": 2, + "includeAll": false, + "multi": false, + "name": "cluster_id", + "options": [], + "query": { + "query": "label_values(up, job)", + "refId": "StandardVariableQuery" + }, + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "current": { + "selected": false, + "text": "", + "value": "" + }, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "definition": "label_values(up{job=\"$cluster_id\"},cluster_name)", + "hide": 2, + "includeAll": false, + "multi": false, + "name": "cluster_name", + "options": [], + "query": { + "query": "label_values(up{job=\"$cluster_id\"},cluster_name)", + "refId": "StandardVariableQuery" + }, + "refresh": 1, + "regex": "", + 
"skipUrlSync": false, + "sort": 0, + "type": "query" + }, + { + "current": { + "selected": false, + "text": "", + "value": "" + }, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "definition": "query_result(node_info{group=\"fe\", job=\"$cluster_id\", type=\"is_master\"})", + "hide": 0, + "includeAll": false, + "multi": false, + "name": "fe_master", + "options": [], + "query": { + "query": "query_result(node_info{group=\"fe\", job=\"$cluster_id\", type=\"is_master\"})", + "refId": "StandardVariableQuery" + }, + "refresh": 1, + "regex": "/instance=\"(.+:\\d+)\"/", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "current": { + "selected": true, + "text": [ + "All" + ], + "value": [ + "$__all" + ] + }, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "definition": "up{group=\"fe\", job=\"$cluster_id\"}", + "hide": 0, + "includeAll": true, + "multi": true, + "name": "fe_instance", + "options": [], + "query": { + "query": "up{group=\"fe\", job=\"$cluster_id\"}", + "refId": "StandardVariableQuery" + }, + "refresh": 1, + "regex": "/instance=\"(.+:\\d+)/", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "current": { + "selected": true, + "text": [ + "All" + ], + "value": [ + "$__all" + ] + }, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "definition": "up{group=\"be\", compute_group!=\"\", job=\"$cluster_id\"}", + "hide": 0, + "includeAll": true, + "multi": true, + "name": "compute_group", + "options": [], + "query": { + "query": "up{group=\"be\", compute_group!=\"\", job=\"$cluster_id\"}", + "refId": "StandardVariableQuery" + }, + "refresh": 1, + "regex": "/.*compute_group=\"([^\"]*).*/", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "current": { + "selected": true, + "text": [ + "All" + ], + 
"value": [ + "$__all" + ] + }, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "definition": "up{group=\"be\", compute_group=~\"$compute_group\", job=\"$cluster_id\"}", + "hide": 0, + "includeAll": true, + "multi": true, + "name": "be_instance", + "options": [], + "query": { + "query": "up{group=\"be\", compute_group=~\"$compute_group\", job=\"$cluster_id\"}", + "refId": "StandardVariableQuery" + }, + "refresh": 1, + "regex": "/instance=\"(.+:\\d+)/", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "current": { + "selected": true, + "text": [ + "All" + ], + "value": [ + "$__all" + ] + }, + "datasource": { + "type": "prometheus", + "uid": "" + }, + "definition": "up{group=\"be-brpc\", compute_group=~\"$compute_group\", job=\"$cluster_id\"}", + "hide": 0, + "includeAll": true, + "multi": true, + "name": "brpc_instance", + "options": [], + "query": { + "query": "up{group=\"be-brpc\", compute_group=~\"$compute_group\", job=\"$cluster_id\"}", + "refId": "StandardVariableQuery" + }, + "refresh": 1, + "regex": "/instance=\"(.+:\\d+)/", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "current": { + "selected": true, + "text": "all-node-exporter", + "value": "(cluster-node-exporter|fe-node-exporter|be-node-exporter)" + }, + "hide": 0, + "includeAll": false, + "label": "", + "multi": false, + "name": "node_exporter_group", + "options": [ + { + "selected": true, + "text": "all-node-exporter", + "value": "(cluster-node-exporter|fe-node-exporter|be-node-exporter)" + }, + { + "selected": false, + "text": "fe-node-exporter", + "value": "(cluster-node-exporter|fe-node-exporter)" + }, + { + "selected": false, + "text": "be-node-exporter", + "value": "(cluster-node-exporter|be-node-exporter)" + } + ], + "query": 
"(cluster-node-exporter|fe-node-exporter|be-node-exporter),(cluster-node-exporter|fe-node-exporter),(cluster-node-exporter|be-node-exporter)", + "queryValue": "", + "skipUrlSync": false, + "type": "custom" + } + ] + }, + "time": { + "from": "now-5m", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "", + "title": "Doris Cloud Dashboard Overview", + "uid": "3fFiWJ4mz456", + "version": 1, + "weekStart": "" +} \ No newline at end of file diff --git a/static/images/next/install/image-for-grafana-demo-1860.png b/static/images/next/install/image-for-grafana-demo-1860.png new file mode 100644 index 0000000000000..2185ae2e02443 Binary files /dev/null and b/static/images/next/install/image-for-grafana-demo-1860.png differ diff --git a/static/images/next/install/image-for-grafana-import-dashboard.png b/static/images/next/install/image-for-grafana-import-dashboard.png new file mode 100644 index 0000000000000..2b44392ab8269 Binary files /dev/null and b/static/images/next/install/image-for-grafana-import-dashboard.png differ diff --git a/static/images/next/install/image-for-node-metrics.png b/static/images/next/install/image-for-node-metrics.png new file mode 100644 index 0000000000000..b29f250e4ae5f Binary files /dev/null and b/static/images/next/install/image-for-node-metrics.png differ diff --git a/versioned_docs/version-4.x/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md b/versioned_docs/version-4.x/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md new file mode 100644 index 0000000000000..0164217fc615e --- /dev/null +++ b/versioned_docs/version-4.x/install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana.md @@ -0,0 +1,241 @@ +--- +{ + "title": 
"Deploy Prometheus and Grafana", + "language": "en", + "description": "Deploy Prometheus and Grafana on Kubernetes with Helm to collect and visualize metrics for a Doris compute-storage decoupled cluster.", + "keywords": ["Doris", "decoupled storage and compute", "compute-storage decoupled", "Kubernetes", "K8s", "Prometheus", "Grafana", "Helm", "ServiceMonitor", "metric", "metric collection", "cluster monitoring", "monitoring deployment", "Dashboard", "kube-prometheus-stack"] +} +--- + + + + +This document describes how to deploy Prometheus and Grafana on Kubernetes with Helm and connect them to an Apache Doris compute-storage decoupled cluster for metric collection, visualization, and alerting. Prometheus scrapes the HTTP and bRPC metrics exposed by FE, BE, and Meta Service. Grafana presents the cluster status through dashboards. + +## Use Cases + +| Scenario | Description | +|------|------| +| New cluster onboarding | Set up monitoring before the Doris compute-storage decoupled cluster goes into production so anomalies can be detected in time. | +| Day-to-day operations | Continuously observe the key metrics of FE, BE, and Meta Service, along with node resource usage. | +| Troubleshooting | Use historical metrics, dashboard views, and alerts to quickly pinpoint performance or availability issues. | +| Capacity planning | Evaluate when to scale out based on the trend of node and component metrics. | + +## Prerequisites + +- A usable Kubernetes cluster with `kubectl` already configured. +- A compute-storage decoupled cluster already deployed in the `default` namespace through Doris Operator, with all three component types (FE, BE/Compute Group, Meta Service) ready. +- Nodes have public network access and can download the Helm installation script, Prometheus Community Charts, and the Grafana Dashboard JSON file. +- Permissions to create namespaces, Helm Releases, ServiceMonitors, and other resources in Kubernetes. + +## Deployment Overview + +1. 
Install Helm, then deploy Prometheus, Grafana, and Alertmanager in one step through `kube-prometheus-stack`. +2. Configure a Prometheus `ServiceMonitor` so that Prometheus can auto-discover and scrape the HTTP and bRPC metrics of the Doris cluster. +3. Log in to Grafana, import the official Doris dashboard, and add a node monitoring panel as needed. + +## Step 1: Deploy Helm, Prometheus, and Grafana + + + +### 1.1 Install Helm + +Purpose: Set up Helm 3 on a local machine or operations node so that it can install and manage the monitoring components on Kubernetes. + +```shell +curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash +``` + +### 1.2 Add the Prometheus Community Helm Repository + +Purpose: Register the repository that hosts the `kube-prometheus-stack` Chart, and refresh the local cache. + +```shell +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm repo update +``` + +### 1.3 Deploy kube-prometheus-stack + +Purpose: Deploy Prometheus, Grafana, Alertmanager, and the related Operator in a dedicated `monitoring` namespace. + +```shell +# Create the namespace +kubectl create namespace monitoring + +# Deploy kube-prometheus-stack +helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring +``` + +### 1.4 Check Pod Status + +Purpose: Confirm that all Pods in the monitoring stack are in the `Running` state before moving on to the next step. 
+ +```shell +kubectl get pods -n monitoring +``` + +A normal output looks like the following: + +```text +NAME READY STATUS RESTARTS AGE +alertmanager-prometheus-kube-prometheus-alertmanager-0 2/2 Running 8 (5h28m ago) 4d23h +prometheus-grafana-7994c77c7-8nk7j 3/3 Running 12 (5h28m ago) 5d +prometheus-kube-prometheus-operator-5576477887-dgp8h 1/1 Running 4 (5h28m ago) 5d +prometheus-kube-state-metrics-77885ddddc-hldlw 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-kube-prometheus-prometheus-0 2/2 Running 0 4h11m +prometheus-prometheus-node-exporter-2tl9s 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-node-exporter-b58rd 1/1 Running 4 (5h28m ago) 4d23h +prometheus-prometheus-node-exporter-fqp6v 1/1 Running 4 (5h28m ago) 4d23h +``` + +## Step 2: Configure Prometheus to Scrape Doris Metrics + + + + +Create a `ServiceMonitor` so that Prometheus Operator auto-discovers Doris Services in the `default` namespace that carry the label `app.doris.disaggregated.cluster=test-disaggregated-cluster`, and scrapes their metrics grouped by the three component types: FE, BE, and Meta Service. + +### 2.1 Prepare the ServiceMonitor YAML + +Purpose: Declare the scrape targets, endpoint paths, and scrape interval, and use `relabelings` to assign a unified `group` label to each service by role for dashboard filtering. + +```yaml +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: doris-disaggregated-monitor + namespace: monitoring + labels: + release: prometheus +spec: + namespaceSelector: + matchNames: + - default + selector: + matchLabels: + app.doris.disaggregated.cluster: test-disaggregated-cluster + endpoints: + - port: http + path: /metrics + interval: 15s + relabelings: + # 1. Unify the job name + - action: replace + targetLabel: job + replacement: doris-cluster + # 2. 
Map Service name suffix to component group: -cg1 -> be, -fe -> fe, -ms -> meta_service + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-cg1 + replacement: be + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-fe + replacement: fe + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-ms + replacement: meta_service + targetLabel: group + + - port: brpc-port + path: /brpc_metrics + interval: 15s + relabelings: + # 1. Unify the job name + - action: replace + targetLabel: job + replacement: doris-cluster + # 2. Map Service name suffix to component group: -cg1 -> be, -fe -> fe, -ms -> meta_service + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-cg1 + replacement: be + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-fe + replacement: fe + targetLabel: group + - sourceLabels: [__meta_kubernetes_service_name] + regex: .*-ms + replacement: meta_service + targetLabel: group +``` + +### 2.2 Key Fields of the ServiceMonitor + +| Field | Value | Description | +|------|------|------| +| `metadata.namespace` | `monitoring` | The ServiceMonitor must reside in the same namespace as the Prometheus instance. | +| `metadata.labels.release` | `prometheus` | Must match the Helm Release name. Prometheus Operator uses this label to discover ServiceMonitors. | +| `spec.namespaceSelector.matchNames` | `default` | The namespace where the Doris cluster runs. Adjust to match your environment. | +| `spec.selector.matchLabels` | `app.doris.disaggregated.cluster: test-disaggregated-cluster` | Selects the Service of the Doris compute-storage decoupled cluster. Update the cluster name as needed. | +| `endpoints[0].port` | `http` | The HTTP port name on which FE, BE, and Meta Service expose `/metrics`. | +| `endpoints[1].port` | `brpc-port` | The bRPC port name on which BE exposes `/brpc_metrics`. | +| `endpoints[*].interval` | `15s` | Scrape interval. 
Adjust based on data volume and precision requirements. | +| The `group` label in `relabelings` | `be` / `fe` / `meta_service` | Divides metrics into three component categories by Service name suffix, for dashboard variable filtering. | + +### 2.3 Apply the YAML and Verify + +Purpose: Save the YAML from section 2.1 as `doris-monitor.yaml`, then apply it so that Prometheus Operator detects the new `ServiceMonitor` and refreshes its scrape targets. + +```shell +kubectl apply -f doris-monitor.yaml +``` + +In a browser, open Prometheus (default port `9090`, for example `http://your_ip:9090`), navigate to **Status → Targets**, and confirm that the FE, BE, and Meta Service targets under `doris-cluster` are all in the `UP` state. + +## Step 3: Configure Grafana and the Dashboard + + + + +### 3.1 Log In to Grafana + +Purpose: Access the Grafana bundled with `kube-prometheus-stack` and complete the first login. + +1. In a browser, open Grafana (default port `3000`, for example `http://your_ip:3000`). +2. The username is `admin`. Retrieve the initial password with the following command: + + ```shell + kubectl get secret --namespace monitoring prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo + ``` + +### 3.2 Import the Doris Dashboard + +Purpose: Use the official Dashboard JSON file to visualize Doris cluster metrics. + +1. Download the official Dashboard file: [Doris-Dashboard-Cloud.json](https://doris.apache.org/files/doris-grafana-dashboard-cloud.json) +2. In Grafana, go to **Dashboards → New → Import**, import the JSON file, and select the bundled Prometheus as the data source. +3. Append `&var-cluster_id=doris-cluster` to the dashboard URL to match the `job` name set in the ServiceMonitor. For example: + + ```text + http://your_ip:3000/d/3fFiWJ4mz456/doris-cloud-dashboard-overview?orgId=1&var-cluster_id=doris-cluster&refresh=5s + ``` + +### 3.3 Add Node Monitoring (Optional) + +Purpose: The Doris dashboard JSON file does not include a host node monitoring panel. 
Use the official Grafana template `1860` to display `node-exporter` metrics directly. + +1. In Grafana, import a dashboard: + + ![image-for-grafana-import-dashboard](/images/next/install/image-for-grafana-import-dashboard.png) + +2. Select the official template number `1860`: + + ![image-for-grafana-demo-1860](/images/next/install/image-for-grafana-demo-1860.png) + +3. After the import completes, you can view the node metrics: + + ![image-for-node-metrics](/images/next/install/image-for-node-metrics.png) + +## Common Issues + + + +| Issue | Possible cause | Resolution | +|------|----------|----------| +| Doris targets do not appear under Prometheus **Targets** | The `namespaceSelector` or `matchLabels` of the ServiceMonitor does not match the actual Doris cluster, or the `release` label does not match the Helm Release name. | Verify the cluster namespace and the Service label `app.doris.disaggregated.cluster`, and confirm that the `release` label on the `ServiceMonitor` is set to `prometheus`. | +| Targets are listed but show `DOWN` | The Pod is not ready, or the `http` / `brpc-port` port name does not match the port name that is actually exposed. | Use `kubectl get svc -n default` and `kubectl describe pod` to confirm the port names and the readiness state, and check that `/metrics` and `/brpc_metrics` are accessible inside the container. | +| Grafana dashboard panels are empty | The URL is missing `var-cluster_id=doris-cluster`, or the `job` name in the ServiceMonitor has been changed. | Check that the `var-cluster_id` in the dashboard URL and the `job` label in the `ServiceMonitor` are both set to `doris-cluster`. | +| Cannot access Prometheus on port 9090 or Grafana on port 3000 | The Service type defaults to `ClusterIP`, which is not reachable from outside the cluster. | Forward the port with `kubectl port-forward`, or change the corresponding Service type to `NodePort` or `LoadBalancer`. 
| +| The command to retrieve the Grafana password returns `NotFound` | The Helm Release is not named `prometheus`, so the Secret name differs. | Use `kubectl get secret -n monitoring` to find the actual Grafana Secret name, then substitute it for `prometheus-grafana` in the command. | diff --git a/versioned_sidebars/version-4.x-sidebars.json b/versioned_sidebars/version-4.x-sidebars.json index 29de294461cf0..9060a442fe71d 100644 --- a/versioned_sidebars/version-4.x-sidebars.json +++ b/versioned_sidebars/version-4.x-sidebars.json @@ -69,7 +69,8 @@ "install/deploy-on-kubernetes/separating-storage-compute/config-ms", "install/deploy-on-kubernetes/separating-storage-compute/config-fe", "install/deploy-on-kubernetes/separating-storage-compute/config-cg", - "install/deploy-on-kubernetes/separating-storage-compute/install-doris-cluster" + "install/deploy-on-kubernetes/separating-storage-compute/install-doris-cluster", + "install/deploy-on-kubernetes/separating-storage-compute/install-prometheus-and-grafana" ] } ]