You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs-next/key-features/analytic-functions.mdx
+19-14Lines changed: 19 additions & 14 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -20,42 +20,44 @@ featureCard:
20
20
- performance
21
21
---
22
22
23
-
> **TL;DR**Analytic functions (window functions) compute a value for every row of a result set without collapsing rows the way `GROUP BY` does. Doris supports the standard set, `ROW_NUMBER`, `RANK`, `DENSE_RANK`, `NTILE`, `LAG`, `LEAD`, `FIRST_VALUE`, `LAST_VALUE`, `NTH_VALUE`, `PERCENT_RANK`, `CUME_DIST`, plus every aggregate (`SUM`, `AVG`, `COUNT`, `MIN`, `MAX`) used with `OVER`. The Nereids optimizer also rewrites `WHERE row_num <= K` into a partition-aware Top-N below the window, so a "top 10 per category" query never sorts the whole partition.
23
+
> **TL;DR**Apache Doris analytic functions (window functions) compute a value for every row of a result set without collapsing rows the way `GROUP BY` does. Apache Doris supports the standard set, `ROW_NUMBER`, `RANK`, `DENSE_RANK`, `NTILE`, `LAG`, `LEAD`, `FIRST_VALUE`, `LAST_VALUE`, `NTH_VALUE`, `PERCENT_RANK`, `CUME_DIST`, plus every aggregate (`SUM`, `AVG`, `COUNT`, `MIN`, `MAX`) used with `OVER`. The Nereids optimizer also rewrites `WHERE row_num <= K` into a partition-aware Top-N below the window, so a "top 10 per category" query never sorts the whole partition.
24
24
25
25

26
-
## Why it matters{#why}
26
+
## Why use analytic functions in Apache Doris?{#why}
27
27
28
-
Three questions show up in almost every analytics workload, and `GROUP BY`answers none of them on its own.
28
+
Apache Doris analytic functions answer three questions that show up in almost every analytics workload and that `GROUP BY`cannot answer on its own.
29
29
30
30
- "Number every order per customer in time order, then keep the first 10." A self-join on `(customer_id, order_date)` works but reads the table twice.
31
31
- "What was the running total of sales by day, and the 7-day moving average?" `GROUP BY` collapses the rows you still want to see.
32
32
- "How does each row compare to the previous row?" Year-over-year, period-over-period, gap-between-events, this is the lag/lead pattern, and writing it without window functions means a self-join with an offset predicate that the planner cannot optimize.
33
33
34
-
Analytic functions answer all three in one pass over the data, with no self-join and the original rows preserved.
34
+
Apache Doris analytic functions answer all three in one pass over the data, with no self-join and the original rows preserved.
35
35
36
-
## What it is{#what}
36
+
## What are Apache Doris analytic functions?{#what}
37
37
38
-
An analytic function is a SQL function that, for each input row, computes a value over a *window* of related rows defined by an `OVER` clause. The window can be the whole partition, a fixed range around the current row, or anything in between. The output has the same number of rows as the input.
38
+
An Apache Doris analytic function is a SQL function that, for each input row, computes a value over a *window* of related rows defined by an `OVER` clause. The window can be the whole partition, a fixed range around the current row, or anything in between. The output has the same number of rows as the input.
39
39
40
40
```
41
41
function(args) OVER ( [PARTITION BY ...] [ORDER BY ...] [<frame>] )
42
42
```
43
43
44
44
**Key terms**
45
45
46
-
-**`OVER` clause**: tells Doris this call is a window function rather than a regular aggregate. Required.
46
+
-**`OVER` clause**: tells Apache Doris this call is a window function rather than a regular aggregate. Required.
47
47
-**`PARTITION BY`**: splits the input into independent groups. Each partition is computed on its own. Different from table partitions, this is a runtime concept.
48
48
-**`ORDER BY` (inside `OVER`)**: orders rows within each partition. `LAG`, `LEAD`, `ROW_NUMBER`, `RANK`, and any frame with `PRECEDING`/`FOLLOWING` need it.
49
-
-**Window frame**: the slice of the partition the function reads for the current row. Doris supports `ROWS BETWEEN ... PRECEDING/FOLLOWING/CURRENT ROW/UNBOUNDED ...` and a restricted form of `RANGE`.
49
+
-**Window frame**: the slice of the partition the function reads for the current row. Apache Doris supports `ROWS BETWEEN ... PRECEDING/FOLLOWING/CURRENT ROW/UNBOUNDED ...` and a restricted form of `RANGE`.
50
50
-**PartitionTopN**: an internal operator the planner inserts when it can prove a `WHERE rank <= K` filter only needs the top K rows per partition. Source: `CreatePartitionTopNFromWindow.java`.
51
51
52
-
## How it works {#how}
52
+
## How do Apache Doris analytic functions work? {#how}
53
+
54
+
Apache Doris analytic functions run through a five-step pipeline: plan, shuffle, sort, evaluate, and push down.
53
55
54
56
1.**Plan.** The optimizer parses the `OVER` clause into a `WindowExpression`, normalizes the frame (`CheckAndStandardizeWindowFunctionAndFrame`), and groups window calls that share the same `PARTITION BY` plus `ORDER BY` into a single physical window operator. Calls that share a partition and order do not pay for an extra sort.
55
57
2.**Shuffle by partition.** If `PARTITION BY` is present, the engine shuffles rows so all rows for the same partition land on the same backend. With no `PARTITION BY`, the whole window runs in a single pipeline (the parallelism upper bound).
56
58
3.**Sort within partition.** Each backend sorts its partitions by the `ORDER BY` columns. Ties produce a non-deterministic row order unless the `ORDER BY` is unique, which is why the docs warn that `SUM() OVER (ORDER BY date_col)` can return different results on tied dates.
57
-
4.**Evaluate per row.** Doris walks each partition once, maintaining the window frame as it goes. Ranking functions emit one integer per row; aggregate functions over `ROWS UNBOUNDED PRECEDING` keep a running total; sliding-window aggregates add the new row and drop the row that fell off the back.
58
-
5.**Push down filters and Top-N.**`CreatePartitionTopNFromWindow` turns `WHERE row_number() OVER (PARTITION BY a ORDER BY b) <= K` into a `PartitionTopN(K)` operator below the window, so each partition only carries the top K rows into the window operator. `PushDownFilterThroughWindow` lifts filters on `PARTITION BY` columns past the window, so Doris filters before sorting instead of after.
59
+
4.**Evaluate per row.**Apache Doris walks each partition once, maintaining the window frame as it goes. Ranking functions emit one integer per row; aggregate functions over `ROWS UNBOUNDED PRECEDING` keep a running total; sliding-window aggregates add the new row and drop the row that fell off the back.
60
+
5.**Push down filters and Top-N.**`CreatePartitionTopNFromWindow` turns `WHERE row_number() OVER (PARTITION BY a ORDER BY b) <= K` into a `PartitionTopN(K)` operator below the window, so each partition only carries the top K rows into the window operator. `PushDownFilterThroughWindow` lifts filters on `PARTITION BY` columns past the window, so Apache Doris filters before sorting instead of after.
59
61
60
62
## Quick start {#quick-start}
61
63
@@ -92,11 +94,13 @@ FROM orders;
92
94
93
95
One pass, three windowed columns, no self-joins. The original five rows are preserved. `LAG(..., 1, 0)` returns `0` instead of `NULL` for the first row of each partition.
94
96
95
-
## When to use it {#when}
97
+
## When should you use Apache Doris analytic functions? {#when}
98
+
99
+
Apache Doris analytic functions fit any row-preserving computation that depends on neighboring rows, including Top-N per group, running totals, lag/lead comparisons, percentile bucketing, and share-of-total ratios.
96
100
97
101
**Good fit**
98
102
99
-
- Top-N per group: "top 10 orders per customer," "best-selling product per region." Add a `WHERE rn <= 10` filter and Doris pushes a partition Top-N below the window.
103
+
- Top-N per group: "top 10 orders per customer," "best-selling product per region." Add a `WHERE rn <= 10` filter and Apache Doris pushes a partition Top-N below the window.
100
104
- Running totals, moving averages, and centered moving averages with `ROWS BETWEEN n PRECEDING AND m FOLLOWING`.
101
105
- Year-over-year, day-over-day, and gap-between-events analysis with `LAG` and `LEAD`. One pass over the table replaces a self-join.
102
106
- Bucketing for percentile or quartile reports with `NTILE`.
@@ -105,7 +109,7 @@ One pass, three windowed columns, no self-joins. The original five rows are pres
105
109
**Not a good fit**
106
110
107
111
- A pure aggregation that collapses rows. `SELECT category, SUM(amount) FROM t GROUP BY category` does the same work without sorting and without keeping every row in memory. Reach for `GROUP BY` first, and only switch to `OVER` when you also need the unaggregated columns.
108
-
-`RANGE` frames with numeric offsets, like `RANGE BETWEEN 5 PRECEDING AND CURRENT ROW`. Doris's`RANGE` frame is restricted to `UNBOUNDED` boundaries or `CURRENT ROW`; arbitrary `RANGE n PRECEDING/FOLLOWING` is not supported. Use `ROWS` if you need a numeric offset.
112
+
-`RANGE` frames with numeric offsets, like `RANGE BETWEEN 5 PRECEDING AND CURRENT ROW`. The Apache Doris `RANGE` frame is restricted to `UNBOUNDED` boundaries or `CURRENT ROW`; arbitrary `RANGE n PRECEDING/FOLLOWING` is not supported. Use `ROWS` if you need a numeric offset.
109
113
- A window with no `PARTITION BY`. The whole result set lands on one pipeline, and that pipeline becomes the bottleneck on large inputs. Add a partition key whenever the workload allows it.
110
114
-`ORDER BY` on a non-unique column when you care about deterministic output. `SUM(x) OVER (ORDER BY day)` can return different cumulative totals across runs when several rows share the same day. Add a tie-breaker; see [Window Functions Overview](../sql-manual/sql-functions/window-functions/overview).
111
115
- Recomputing the same window result on every refresh. If the same `ROW_NUMBER()` query runs every minute against a slow-moving table, an [async materialized view](../query-acceleration/materialized-view/async-materialized-view/overview) with a partial refresh is cheaper than re-windowing each time.
@@ -118,3 +122,4 @@ One pass, three windowed columns, no self-joins. The original five rows are pres
118
122
-[Pipeline Execution Engine: how partitioned windows get parallelized across BEs](./pipeline-execution-engine)
119
123
-[Vectorized Execution](./vectorized-execution): the engine that runs each window's batch through SIMD-accelerated operators.
120
124
-[Async Materialized View: precompute window results that re-run every minute](../query-acceleration/materialized-view/async-materialized-view/overview)
125
+
-[MPP Architecture](./mpp): how window/analytic operators are shuffled and partitioned across BEs.
Copy file name to clipboardExpand all lines: docs-next/key-features/batch-load.mdx
+11-7Lines changed: 11 additions & 7 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -20,22 +20,22 @@ featureCard:
20
20
- bulk-ingestion
21
21
---
22
22
23
-
> **TL;DR** Batch Load is how you move large files from S3, HDFS, or another warehouse into Doris in one shot. You submit a `LOAD LABEL` statement, the FE plans the work and fans it out to BEs, and you check progress with `SHOW LOAD`. The job is asynchronous, so the client can disconnect; the label is what lets you find the result and what dedupes a retry. For [external catalogs](./multi-catalog) and TVF reads, the synchronous `INSERT INTO ... SELECT` form covers the same ground with simpler ETL.
23
+
> **TL;DR**Apache Doris Batch Load moves large files from S3, HDFS, or another warehouse into Doris in one shot. You submit a `LOAD LABEL` statement, the FE plans the work and fans it out to BEs, and you check progress with `SHOW LOAD`. The job is asynchronous, so the client can disconnect; the label is what lets you find the result and what dedupes a retry. For [external catalogs](./multi-catalog) and TVF reads, the synchronous `INSERT INTO ... SELECT` form covers the same ground with simpler ETL.
24
24
25
25

26
-
## Why it matters{#why}
26
+
## Why use batch load in Apache Doris?{#why}
27
27
28
-
Real-time loaders are the right tool for events trickling in, but they fall apart when you need to move a TB of historical data, or rebuild a fact table from a Hive snapshot, or backfill last quarter from Parquet on S3. A [Stream Load](./stream-load) over HTTP times out. A per-row `INSERT` would take days. The job has to run server-side, retry on its own, and survive the client losing its connection.
28
+
Apache Doris Batch Load is the right tool when a single ingestion job runs for hours and the warehouse, not the client, has to own its lifecycle. Real-time loaders are the right tool for events trickling in, but they fall apart when you need to move a TB of historical data, or rebuild a fact table from a Hive snapshot, or backfill last quarter from Parquet on S3. A [Stream Load](./stream-load) over HTTP times out. A per-row `INSERT` would take days. The job has to run server-side, retry on its own, and survive the client losing its connection.
29
29
30
30
- A nightly ETL has to land a few hundred GB of Parquet from S3 before the morning dashboards refresh.
31
31
- A migration from Snowflake or Hive needs to copy whole partitions, with column-level filters and type casts.
32
32
- A backfill of one missing day cannot block the connection that submitted it.
33
33
34
34
Batch Load handles all three by treating large bulk ingestion as a long-running, label-tracked job that the warehouse owns end to end.
35
35
36
-
## What it is{#what}
36
+
## What is Apache Doris batch load?{#what}
37
37
38
-
Batch Load is the family of asynchronous bulk-load methods in Doris. The main entry point is the `LOAD LABEL ... WITH S3|HDFS|BROKER` statement, historically called Broker Load, which today covers S3, HDFS, and any storage system reachable through a Broker process. The synchronous `INSERT INTO ... SELECT` variant covers the same ingestion needs when the source is an external catalog (Hive, Iceberg, JDBC) or a file behind a TVF such as `S3()` or `HDFS()`. Both share the same transaction model as the rest of Doris loading: one label per job, atomic commit, replica-level publish.
38
+
Apache Doris Batch Load is the family of asynchronous bulk-load methods in Doris. The main entry point is the `LOAD LABEL ... WITH S3|HDFS|BROKER` statement, historically called Broker Load, which today covers S3, HDFS, and any storage system reachable through a Broker process. The synchronous `INSERT INTO ... SELECT` variant covers the same ingestion needs when the source is an external catalog (Hive, Iceberg, JDBC) or a file behind a TVF such as `S3()` or `HDFS()`. Both share the same transaction model as the rest of Doris loading: one label per job, atomic commit, replica-level publish.
39
39
40
40
**Key terms**
41
41
@@ -45,7 +45,9 @@ Batch Load is the family of asynchronous bulk-load methods in Doris. The main en
45
45
-**TVF (Table-Valued Function)**: `S3(...)`, `HDFS(...)`, `LOCAL(...)`, `iceberg_meta(...)`, and the like. They expose files or external metadata as a table you can `SELECT` from, then pipe into `INSERT INTO ... SELECT` for synchronous ingestion.
46
46
-**`max_filter_ratio`**: the per-job tolerance for malformed rows. Default is zero, so a single bad row cancels the load.
47
47
48
-
## How it works {#how}
48
+
## How does Apache Doris batch load work? {#how}
49
+
50
+
Apache Doris Batch Load registers a labeled job on the FE, parallelizes file scans across BEs, and atomically commits the result once every replica publishes the new version.
49
51
50
52
1.**Submit and label.** A `LOAD LABEL my_db.daily_orders ...` request lands on the FE. The FE registers the label, opens a transaction, and the load enters `PENDING`. Resubmitting the same label inside the retention window short-circuits to the original job.
51
53
2.**Plan and fan out.** The FE estimates file sizes (one BE handles between `min_bytes_per_broker_scanner` and `max_bytes_per_broker_scanner`, default 64 MB to 500 GB), splits the work, and dispatches scanner tasks to BEs. Job state moves through `LOADING`.
`SHOW LOAD ORDER BY CreateTime DESC LIMIT 1` returns when the job is `FINISHED`. Up to 1% of malformed rows are tolerated; anything more cancels the job and the URL field points at a sample of the offending rows.
89
91
90
-
## When to use it {#when}
92
+
## When should you use Apache Doris batch load? {#when}
93
+
94
+
Use Apache Doris Batch Load when a single ingestion run handles tens of GB or more from object storage, an external warehouse, or a Hive-partitioned directory.
0 commit comments