
Add internal repartition metrics #21152

Open
gene-bordegaray wants to merge 2 commits into apache:main from gene-bordegaray:gene.bordegaray/2026/03/repartition_metrics

Conversation

@gene-bordegaray
Contributor

@gene-bordegaray gene-bordegaray commented Mar 25, 2026

Which issue does this PR close?

Rationale for this change

RepartitionExec spends time in different phases, but the existing metrics only expose broad timings. More granular timings are useful when investigating repartition bottlenecks without adding default EXPLAIN ANALYZE noise or runtime overhead.

What changes are included in this PR?

  • Adds an internal datafusion.explain.analyze_level for low-level development metrics.
  • Adds conditional metric registration through MetricBuilder::if_enabled.
  • Adds internal RepartitionExec timing metrics as the first consumer of this internal metrics level.
  • Preserves AnalyzeExec metric types through physical plan protobuf serialization and uses them during execution so metrics are registered consistently.
  • Updates user-facing explain and metrics documentation.
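
The conditional registration described in the list above can be pictured with a small standalone sketch. This is not the code in this PR and not the real `MetricBuilder` API: `AnalyzeLevel`, `TimerBuilder`, `Timer`, and the metric names are hypothetical stand-ins, and the ordering `Summary < Dev < Internal` is an assumption. The idea it illustrates is that a metric above the configured level is never registered and its timer becomes a no-op, so the hot path pays for no clock reads.

```rust
use std::time::{Duration, Instant};

/// Hypothetical verbosity levels, ordered from least to most detailed.
/// `derive(PartialOrd, Ord)` orders variants by declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AnalyzeLevel {
    Summary,
    Dev,
    Internal,
}

/// A timer that either records elapsed time or does nothing.
#[derive(Debug)]
struct Timer {
    enabled: bool,
    elapsed: Duration,
}

impl Timer {
    /// Runs `f`, adding its wall-clock time only when the timer is enabled.
    fn time<T>(&mut self, f: impl FnOnce() -> T) -> T {
        if !self.enabled {
            return f(); // disabled: no clock reads at all
        }
        let start = Instant::now();
        let out = f();
        self.elapsed += start.elapsed();
        out
    }
}

/// Hypothetical builder: registers a metric only when the configured
/// level is at least the level the metric belongs to.
struct TimerBuilder {
    configured: AnalyzeLevel,
    registered: Vec<&'static str>,
}

impl TimerBuilder {
    fn new(configured: AnalyzeLevel) -> Self {
        Self { configured, registered: Vec::new() }
    }

    fn timer(&mut self, name: &'static str, level: AnalyzeLevel) -> Timer {
        let enabled = level <= self.configured;
        if enabled {
            self.registered.push(name);
        }
        Timer { enabled, elapsed: Duration::ZERO }
    }
}

fn main() {
    // Configured at `Dev`: the `Internal` timer below is skipped.
    let mut builder = TimerBuilder::new(AnalyzeLevel::Dev);
    let mut coarse = builder.timer("coarse_time", AnalyzeLevel::Dev);
    let mut fine = builder.timer("fine_grained_time", AnalyzeLevel::Internal);

    let sum: u64 = coarse.time(|| (0..1000u64).sum());
    let _ = fine.time(|| sum.wrapping_mul(31));

    println!("registered: {:?}", builder.registered);
    println!("coarse: {:?}, fine: {:?}", coarse.elapsed, fine.elapsed);
}
```

Gating at registration time, rather than filtering at display time, is what would keep the default configuration free of extra overhead in a design like this.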

Are these changes tested?

Yes. Added coverage for conditional metric registration, RepartitionExec internal metric registration, EXPLAIN ANALYZE output, and physical plan protobuf roundtrip behavior.

Are there any user-facing changes?

Yes. Users can set datafusion.explain.analyze_level = 'internal' to include low-level repartition timing metrics in EXPLAIN ANALYZE. The default dev output is unchanged.

@github-actions github-actions Bot added the documentation (Improvements or additions to documentation) and physical-plan (Changes to the physical-plan crate) labels Mar 25, 2026
@gene-bordegaray gene-bordegaray changed the title from "Gene.bordegaray/2026/03/repartition metrics" to "feat: add granular repartition metrics" Mar 25, 2026
@2010YOUY01
Contributor

I have some concerns about these low-level (kernel-profiling) metrics, so I’m sharing a few suggestions. (Not trying to block this, given this is useful to solve real problems—just offering additional perspectives and possible improvements.)

Metrics are typically used for query tuning at the application level, while these low-level ones are mainly for internal debugging and are less frequently used. They may also introduce execution overhead that’s hard to observe, and bring maintenance overhead. In general, it might be better to keep metrics that directly help application tuning, are frequently used, or are difficult to capture with external profilers. I suspect some of them can be directly observed with profilers/flamegraphs, maybe they can be simplified?

Additionally, we could consider introducing a new analyze level Internal in datafusion.explain.analyze_level (https://datafusion.apache.org/user-guide/configs.html) to hide these metrics from regular output. It might also be worth exploring ways to conditionally disable certain metrics tracking to reduce runtime overhead, then those low level metrics can get added more easily I think.

@Dandandan
Contributor

run benchmarks

@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4131756738-554-2ghhk 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing gene.bordegaray/2026/03/repartition_metrics (cd5514f) to 4084a18 (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4131756738-555-wj4xx 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing gene.bordegaray/2026/03/repartition_metrics (cd5514f) to 4084a18 (merge-base) diff using: tpcds
Results will be posted here when complete



@adriangbot

🤖 Benchmark running (GKE) | trigger
Linux bench-c4131756738-556-9gs2g 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux
Comparing gene.bordegaray/2026/03/repartition_metrics (cd5514f) to 4084a18 (merge-base) diff using: tpch
Results will be posted here when complete



@adriangbot

🤖 Benchmark completed (GKE) | trigger

Details

Comparing HEAD and gene.bordegaray_2026_03_repartition_metrics
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query     ┃                           HEAD ┃ gene.bordegaray_2026_03_repartition_metrics ┃    Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 1  │ 45.39 / 47.01 ±1.18 / 48.93 ms │              45.42 / 46.33 ±0.84 / 47.78 ms │ no change │
│ QQuery 2  │ 21.36 / 21.75 ±0.61 / 22.96 ms │              21.14 / 22.21 ±1.19 / 24.09 ms │ no change │
│ QQuery 3  │ 31.77 / 32.18 ±0.29 / 32.63 ms │              31.92 / 32.27 ±0.20 / 32.46 ms │ no change │
│ QQuery 4  │ 20.00 / 21.17 ±0.86 / 22.38 ms │              20.44 / 20.98 ±0.43 / 21.70 ms │ no change │
│ QQuery 5  │ 48.09 / 49.51 ±1.00 / 50.86 ms │              47.96 / 50.02 ±1.62 / 51.64 ms │ no change │
│ QQuery 6  │ 17.43 / 18.32 ±0.84 / 19.76 ms │              17.35 / 17.57 ±0.18 / 17.81 ms │ no change │
│ QQuery 7  │ 53.37 / 55.85 ±1.50 / 57.94 ms │              54.88 / 56.01 ±1.21 / 57.86 ms │ no change │
│ QQuery 8  │ 48.13 / 49.06 ±0.70 / 49.97 ms │              48.59 / 49.36 ±0.66 / 50.18 ms │ no change │
│ QQuery 9  │ 54.82 / 55.51 ±0.51 / 56.32 ms │              54.37 / 55.20 ±0.58 / 56.00 ms │ no change │
│ QQuery 10 │ 69.96 / 71.74 ±1.40 / 74.07 ms │              71.47 / 73.27 ±1.69 / 75.32 ms │ no change │
│ QQuery 11 │ 13.92 / 14.21 ±0.36 / 14.91 ms │              14.18 / 14.49 ±0.27 / 14.98 ms │ no change │
│ QQuery 12 │ 28.04 / 28.84 ±1.00 / 30.61 ms │              27.57 / 28.67 ±0.79 / 29.82 ms │ no change │
│ QQuery 13 │ 38.08 / 39.04 ±0.70 / 39.96 ms │              39.21 / 39.70 ±0.48 / 40.33 ms │ no change │
│ QQuery 14 │ 28.70 / 28.93 ±0.23 / 29.37 ms │              28.67 / 28.87 ±0.21 / 29.22 ms │ no change │
│ QQuery 15 │ 33.54 / 34.26 ±0.69 / 35.49 ms │              34.06 / 34.72 ±0.68 / 35.87 ms │ no change │
│ QQuery 16 │ 15.96 / 16.22 ±0.20 / 16.47 ms │              16.09 / 16.55 ±0.46 / 17.43 ms │ no change │
│ QQuery 17 │ 72.67 / 74.49 ±1.94 / 78.26 ms │              72.60 / 74.02 ±1.11 / 75.47 ms │ no change │
│ QQuery 18 │ 77.61 / 78.82 ±0.84 / 79.70 ms │              76.75 / 78.92 ±1.14 / 80.04 ms │ no change │
│ QQuery 19 │ 37.45 / 37.78 ±0.24 / 38.18 ms │              37.42 / 37.80 ±0.32 / 38.27 ms │ no change │
│ QQuery 20 │ 39.98 / 40.53 ±0.44 / 40.91 ms │              41.08 / 41.84 ±0.59 / 42.57 ms │ no change │
│ QQuery 21 │ 63.80 / 65.51 ±1.51 / 67.87 ms │              64.57 / 66.18 ±1.12 / 67.73 ms │ no change │
│ QQuery 22 │ 17.68 / 18.33 ±0.51 / 19.09 ms │              17.78 / 18.42 ±0.39 / 18.96 ms │ no change │
└───────────┴────────────────────────────────┴─────────────────────────────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                                          ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)                                          │ 899.05ms │
│ Total Time (gene.bordegaray_2026_03_repartition_metrics)   │ 903.40ms │
│ Average Time (HEAD)                                        │  40.87ms │
│ Average Time (gene.bordegaray_2026_03_repartition_metrics) │  41.06ms │
│ Queries Faster                                             │        0 │
│ Queries Slower                                             │        0 │
│ Queries with No Change                                     │       22 │
│ Queries with Failure                                       │        0 │
└────────────────────────────────────────────────────────────┴──────────┘

Resource Usage

tpch — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 4.8s |
| Peak memory | 4.1 GiB |
| Avg memory | 3.6 GiB |
| CPU user | 33.4s |
| CPU sys | 3.0s |
| Disk read | 0 B |
| Disk write | 136.0 KiB |

tpch — branch

| Metric | Value |
| --- | --- |
| Wall time | 4.8s |
| Peak memory | 4.1 GiB |
| Avg memory | 3.6 GiB |
| CPU user | 33.6s |
| CPU sys | 2.9s |
| Disk read | 0 B |
| Disk write | 60.0 KiB |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Details

Comparing HEAD and gene.bordegaray_2026_03_repartition_metrics
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                     HEAD ┃ gene.bordegaray_2026_03_repartition_metrics ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │           43.05 / 43.97 ±0.86 / 45.02 ms │              43.14 / 44.07 ±0.76 / 45.33 ms │     no change │
│ QQuery 2  │        145.62 / 146.45 ±0.57 / 147.21 ms │           147.41 / 148.85 ±1.06 / 150.02 ms │     no change │
│ QQuery 3  │        113.93 / 114.81 ±0.90 / 116.11 ms │           115.12 / 115.95 ±0.63 / 116.89 ms │     no change │
│ QQuery 4  │    1279.04 / 1292.40 ±10.27 / 1305.82 ms │       1282.98 / 1315.29 ±20.88 / 1339.57 ms │     no change │
│ QQuery 5  │        172.19 / 174.24 ±1.09 / 175.26 ms │           173.26 / 174.42 ±0.84 / 175.45 ms │     no change │
│ QQuery 6  │     972.06 / 1000.23 ±18.29 / 1023.30 ms │        966.26 / 1026.55 ±35.46 / 1068.55 ms │     no change │
│ QQuery 7  │        349.08 / 353.00 ±2.61 / 356.15 ms │           349.76 / 351.98 ±2.04 / 355.47 ms │     no change │
│ QQuery 8  │        115.86 / 117.02 ±0.87 / 118.31 ms │           115.14 / 116.63 ±1.19 / 118.32 ms │     no change │
│ QQuery 9  │        101.88 / 105.24 ±2.70 / 107.88 ms │           102.69 / 108.46 ±4.66 / 116.97 ms │     no change │
│ QQuery 10 │        107.06 / 107.77 ±0.59 / 108.77 ms │           107.56 / 108.01 ±0.51 / 108.90 ms │     no change │
│ QQuery 11 │       867.69 / 887.05 ±10.17 / 895.38 ms │          878.93 / 894.13 ±13.96 / 913.10 ms │     no change │
│ QQuery 12 │           43.91 / 46.37 ±1.96 / 49.03 ms │              45.00 / 46.30 ±1.21 / 48.20 ms │     no change │
│ QQuery 13 │        400.14 / 402.46 ±1.70 / 404.40 ms │           398.36 / 401.72 ±2.36 / 404.18 ms │     no change │
│ QQuery 14 │     1022.31 / 1026.01 ±5.12 / 1035.43 ms │        1008.33 / 1016.69 ±6.66 / 1023.79 ms │     no change │
│ QQuery 15 │           16.16 / 17.39 ±1.15 / 19.00 ms │              15.22 / 16.38 ±1.00 / 17.89 ms │ +1.06x faster │
│ QQuery 16 │           41.63 / 42.66 ±0.58 / 43.26 ms │              40.88 / 41.21 ±0.49 / 42.18 ms │     no change │
│ QQuery 17 │        246.35 / 247.59 ±1.40 / 250.33 ms │           240.48 / 241.32 ±0.81 / 242.58 ms │     no change │
│ QQuery 18 │        130.36 / 132.85 ±1.76 / 135.14 ms │           127.56 / 129.14 ±1.21 / 131.22 ms │     no change │
│ QQuery 19 │        158.71 / 160.22 ±1.76 / 163.25 ms │           154.61 / 155.82 ±0.65 / 156.57 ms │     no change │
│ QQuery 20 │           14.28 / 14.97 ±0.50 / 15.61 ms │              13.18 / 13.91 ±0.56 / 14.58 ms │ +1.08x faster │
│ QQuery 21 │           20.18 / 20.69 ±0.34 / 21.19 ms │              19.56 / 19.77 ±0.23 / 20.16 ms │     no change │
│ QQuery 22 │        496.19 / 502.33 ±4.78 / 508.10 ms │           482.88 / 485.99 ±3.70 / 493.10 ms │     no change │
│ QQuery 23 │        903.23 / 909.09 ±4.41 / 916.57 ms │           875.55 / 886.15 ±8.72 / 900.93 ms │     no change │
│ QQuery 24 │        412.31 / 416.89 ±2.66 / 420.64 ms │           410.92 / 413.68 ±1.74 / 415.55 ms │     no change │
│ QQuery 25 │        352.67 / 356.40 ±2.09 / 358.72 ms │           351.34 / 353.45 ±1.24 / 354.99 ms │     no change │
│ QQuery 26 │           81.86 / 82.58 ±0.40 / 82.94 ms │              81.17 / 82.47 ±0.88 / 83.52 ms │     no change │
│ QQuery 27 │        350.80 / 352.31 ±1.97 / 356.16 ms │           342.88 / 347.18 ±2.54 / 350.26 ms │     no change │
│ QQuery 28 │        150.57 / 151.84 ±0.97 / 153.38 ms │           148.42 / 149.42 ±1.05 / 151.32 ms │     no change │
│ QQuery 29 │        298.44 / 302.03 ±3.31 / 307.73 ms │           297.52 / 299.38 ±1.67 / 301.39 ms │     no change │
│ QQuery 30 │           43.58 / 45.68 ±1.62 / 48.01 ms │              41.37 / 42.81 ±0.79 / 43.59 ms │ +1.07x faster │
│ QQuery 31 │        170.69 / 172.88 ±1.55 / 175.08 ms │           168.44 / 171.42 ±1.73 / 173.09 ms │     no change │
│ QQuery 32 │         58.71 / 68.03 ±16.98 / 101.97 ms │              56.42 / 57.38 ±0.99 / 59.12 ms │ +1.19x faster │
│ QQuery 33 │        140.62 / 142.13 ±1.01 / 143.74 ms │           140.30 / 142.48 ±1.11 / 143.26 ms │     no change │
│ QQuery 34 │        107.46 / 108.32 ±0.44 / 108.63 ms │           106.24 / 107.85 ±1.09 / 109.23 ms │     no change │
│ QQuery 35 │        108.37 / 109.72 ±0.71 / 110.45 ms │           108.10 / 110.44 ±1.48 / 112.34 ms │     no change │
│ QQuery 36 │        210.36 / 217.62 ±4.20 / 221.99 ms │           213.82 / 219.12 ±3.12 / 221.89 ms │     no change │
│ QQuery 37 │        177.16 / 179.74 ±1.67 / 182.03 ms │           178.14 / 182.74 ±2.78 / 186.60 ms │     no change │
│ QQuery 38 │           85.67 / 87.35 ±2.20 / 91.60 ms │              84.23 / 87.65 ±2.15 / 90.63 ms │     no change │
│ QQuery 39 │        125.64 / 129.22 ±2.32 / 132.11 ms │           124.26 / 126.05 ±1.51 / 128.11 ms │     no change │
│ QQuery 40 │        111.13 / 117.66 ±7.77 / 132.44 ms │           110.77 / 116.00 ±5.10 / 125.39 ms │     no change │
│ QQuery 41 │           14.18 / 14.40 ±0.14 / 14.62 ms │              13.99 / 14.32 ±0.27 / 14.70 ms │     no change │
│ QQuery 42 │        107.08 / 108.44 ±1.05 / 109.86 ms │           108.10 / 108.63 ±0.49 / 109.35 ms │     no change │
│ QQuery 43 │           83.66 / 84.57 ±0.79 / 85.61 ms │              83.06 / 84.96 ±1.59 / 87.54 ms │     no change │
│ QQuery 44 │           11.16 / 12.18 ±1.07 / 14.13 ms │              11.36 / 11.91 ±0.40 / 12.46 ms │     no change │
│ QQuery 45 │           51.11 / 53.50 ±1.94 / 56.73 ms │              52.14 / 53.53 ±1.22 / 55.71 ms │     no change │
│ QQuery 46 │        231.16 / 233.00 ±1.50 / 235.65 ms │           227.82 / 232.32 ±2.59 / 235.90 ms │     no change │
│ QQuery 47 │        691.72 / 703.38 ±5.90 / 707.57 ms │           698.36 / 703.67 ±4.67 / 712.30 ms │     no change │
│ QQuery 48 │        284.42 / 290.92 ±4.86 / 298.68 ms │           289.20 / 291.79 ±2.65 / 296.75 ms │     no change │
│ QQuery 49 │        254.36 / 260.50 ±6.08 / 271.77 ms │           252.92 / 255.45 ±2.09 / 259.15 ms │     no change │
│ QQuery 50 │        230.61 / 236.41 ±4.79 / 244.07 ms │           225.90 / 234.50 ±4.77 / 240.14 ms │     no change │
│ QQuery 51 │        183.53 / 186.48 ±1.71 / 187.96 ms │           181.07 / 184.80 ±3.24 / 189.83 ms │     no change │
│ QQuery 52 │        107.87 / 108.95 ±1.00 / 110.42 ms │           107.30 / 109.14 ±1.63 / 111.60 ms │     no change │
│ QQuery 53 │        103.23 / 103.97 ±0.71 / 105.27 ms │           103.08 / 104.15 ±1.01 / 105.98 ms │     no change │
│ QQuery 54 │        147.26 / 148.77 ±1.30 / 150.78 ms │           146.39 / 148.12 ±1.07 / 149.38 ms │     no change │
│ QQuery 55 │        107.54 / 108.75 ±0.90 / 110.31 ms │           108.01 / 110.08 ±1.35 / 112.03 ms │     no change │
│ QQuery 56 │        141.70 / 145.17 ±2.86 / 149.42 ms │           139.98 / 141.40 ±0.85 / 142.41 ms │     no change │
│ QQuery 57 │        174.23 / 176.98 ±1.79 / 179.26 ms │           171.79 / 173.93 ±1.51 / 175.59 ms │     no change │
│ QQuery 58 │        292.34 / 296.81 ±4.29 / 304.47 ms │           286.26 / 298.67 ±6.65 / 304.77 ms │     no change │
│ QQuery 59 │        199.29 / 201.83 ±1.66 / 204.29 ms │           197.88 / 199.90 ±1.47 / 202.00 ms │     no change │
│ QQuery 60 │        144.95 / 147.15 ±2.09 / 150.66 ms │           142.17 / 144.60 ±1.51 / 146.79 ms │     no change │
│ QQuery 61 │        171.75 / 173.27 ±1.22 / 174.73 ms │           169.43 / 171.84 ±1.51 / 173.62 ms │     no change │
│ QQuery 62 │      887.61 / 937.35 ±46.02 / 1022.06 ms │          876.45 / 909.69 ±19.53 / 934.80 ms │     no change │
│ QQuery 63 │        103.65 / 107.66 ±4.91 / 117.06 ms │           104.70 / 106.97 ±2.23 / 109.66 ms │     no change │
│ QQuery 64 │        693.32 / 701.90 ±5.49 / 710.65 ms │           694.05 / 697.11 ±2.00 / 700.35 ms │     no change │
│ QQuery 65 │        245.90 / 249.63 ±2.61 / 252.33 ms │           245.36 / 250.11 ±3.30 / 254.20 ms │     no change │
│ QQuery 66 │       244.39 / 260.43 ±11.27 / 275.22 ms │          244.37 / 254.61 ±10.06 / 273.79 ms │     no change │
│ QQuery 67 │        305.35 / 313.20 ±7.01 / 324.65 ms │           304.39 / 314.79 ±7.04 / 325.71 ms │     no change │
│ QQuery 68 │        277.88 / 284.43 ±5.34 / 291.68 ms │           279.86 / 283.78 ±3.01 / 287.68 ms │     no change │
│ QQuery 69 │        102.19 / 103.90 ±1.20 / 105.68 ms │           101.86 / 104.70 ±1.43 / 105.74 ms │     no change │
│ QQuery 70 │       319.28 / 349.37 ±19.90 / 374.36 ms │           336.91 / 345.94 ±9.17 / 362.16 ms │     no change │
│ QQuery 71 │        134.13 / 137.47 ±2.97 / 141.60 ms │           134.41 / 138.45 ±3.17 / 143.68 ms │     no change │
│ QQuery 72 │       709.39 / 722.34 ±10.84 / 742.08 ms │           697.82 / 709.69 ±7.65 / 718.66 ms │     no change │
│ QQuery 73 │        103.05 / 105.08 ±1.41 / 107.31 ms │           104.24 / 105.11 ±1.09 / 107.10 ms │     no change │
│ QQuery 74 │       560.15 / 571.52 ±11.40 / 586.50 ms │           534.01 / 541.06 ±4.81 / 547.26 ms │ +1.06x faster │
│ QQuery 75 │        275.67 / 277.88 ±1.96 / 281.09 ms │           274.76 / 278.99 ±2.33 / 281.36 ms │     no change │
│ QQuery 76 │        132.77 / 134.25 ±1.13 / 136.07 ms │           132.60 / 133.75 ±1.14 / 135.79 ms │     no change │
│ QQuery 77 │        187.47 / 189.72 ±2.16 / 193.10 ms │           186.70 / 187.68 ±0.82 / 189.06 ms │     no change │
│ QQuery 78 │        351.92 / 356.47 ±3.45 / 361.37 ms │           342.87 / 347.58 ±4.36 / 354.86 ms │     no change │
│ QQuery 79 │        237.62 / 240.77 ±3.62 / 247.43 ms │           232.52 / 234.32 ±1.06 / 235.75 ms │     no change │
│ QQuery 80 │        328.41 / 331.64 ±3.14 / 337.08 ms │           324.77 / 328.15 ±2.18 / 331.62 ms │     no change │
│ QQuery 81 │           25.99 / 27.48 ±0.90 / 28.75 ms │              25.91 / 26.90 ±0.67 / 28.01 ms │     no change │
│ QQuery 82 │        200.03 / 202.24 ±1.74 / 204.71 ms │           200.65 / 201.37 ±0.70 / 202.37 ms │     no change │
│ QQuery 83 │           39.42 / 40.11 ±0.79 / 41.57 ms │              38.82 / 39.64 ±1.07 / 41.69 ms │     no change │
│ QQuery 84 │           48.44 / 49.51 ±0.71 / 50.41 ms │              48.67 / 49.33 ±0.58 / 50.40 ms │     no change │
│ QQuery 85 │        148.62 / 150.09 ±1.33 / 151.91 ms │           147.10 / 149.11 ±1.06 / 149.94 ms │     no change │
│ QQuery 86 │           39.57 / 40.43 ±0.46 / 40.93 ms │              38.67 / 40.45 ±1.53 / 43.10 ms │     no change │
│ QQuery 87 │           82.14 / 88.82 ±4.58 / 94.39 ms │              85.57 / 90.23 ±3.73 / 94.71 ms │     no change │
│ QQuery 88 │          98.75 / 99.39 ±0.58 / 100.41 ms │             98.41 / 99.49 ±0.79 / 100.27 ms │     no change │
│ QQuery 89 │        117.29 / 118.74 ±1.22 / 120.26 ms │           118.83 / 119.79 ±0.79 / 121.21 ms │     no change │
│ QQuery 90 │           22.84 / 24.39 ±0.91 / 25.40 ms │              23.60 / 24.14 ±0.56 / 25.13 ms │     no change │
│ QQuery 91 │           65.23 / 65.57 ±0.34 / 65.99 ms │              63.07 / 64.44 ±1.07 / 65.70 ms │     no change │
│ QQuery 92 │           57.12 / 58.60 ±1.28 / 60.44 ms │              56.82 / 57.86 ±0.66 / 58.77 ms │     no change │
│ QQuery 93 │        189.81 / 194.21 ±3.26 / 197.70 ms │           190.02 / 191.43 ±1.75 / 194.60 ms │     no change │
│ QQuery 94 │           62.21 / 63.47 ±1.53 / 66.37 ms │              60.92 / 61.54 ±0.45 / 62.31 ms │     no change │
│ QQuery 95 │        134.02 / 135.75 ±1.14 / 137.36 ms │           133.31 / 134.50 ±0.94 / 135.87 ms │     no change │
│ QQuery 96 │           72.92 / 74.73 ±1.21 / 76.24 ms │              72.80 / 74.31 ±0.95 / 75.51 ms │     no change │
│ QQuery 97 │        127.25 / 131.12 ±3.25 / 136.87 ms │           127.74 / 128.87 ±0.88 / 130.21 ms │     no change │
│ QQuery 98 │        154.26 / 155.93 ±1.17 / 157.85 ms │           150.61 / 155.32 ±3.39 / 161.19 ms │     no change │
│ QQuery 99 │ 10725.40 / 10762.18 ±31.10 / 10802.43 ms │    10708.32 / 10776.53 ±58.85 / 10874.14 ms │     no change │
└───────────┴──────────────────────────────────────────┴─────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                          │ 33558.42ms │
│ Total Time (gene.bordegaray_2026_03_repartition_metrics)   │ 33403.72ms │
│ Average Time (HEAD)                                        │   338.97ms │
│ Average Time (gene.bordegaray_2026_03_repartition_metrics) │   337.41ms │
│ Queries Faster                                             │          5 │
│ Queries Slower                                             │          0 │
│ Queries with No Change                                     │         94 │
│ Queries with Failure                                       │          0 │
└────────────────────────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 168.1s |
| Peak memory | 5.3 GiB |
| Avg memory | 4.5 GiB |
| CPU user | 270.0s |
| CPU sys | 18.4s |
| Disk read | 0 B |
| Disk write | 637.7 MiB |

tpcds — branch

| Metric | Value |
| --- | --- |
| Wall time | 167.3s |
| Peak memory | 5.7 GiB |
| Avg memory | 4.7 GiB |
| CPU user | 267.8s |
| CPU sys | 18.1s |
| Disk read | 0 B |
| Disk write | 152.0 KiB |


@adriangbot

🤖 Benchmark completed (GKE) | trigger

Details

Comparing HEAD and gene.bordegaray_2026_03_repartition_metrics
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃ gene.bordegaray_2026_03_repartition_metrics ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │          1.30 / 4.61 ±6.46 / 17.52 ms │                1.31 / 4.59 ±6.42 / 17.43 ms │     no change │
│ QQuery 1  │        14.71 / 14.98 ±0.18 / 15.23 ms │              14.63 / 14.85 ±0.14 / 15.04 ms │     no change │
│ QQuery 2  │        43.95 / 44.23 ±0.22 / 44.53 ms │              45.43 / 45.84 ±0.27 / 46.19 ms │     no change │
│ QQuery 3  │        42.79 / 44.76 ±1.28 / 46.49 ms │              42.93 / 45.08 ±1.74 / 47.29 ms │     no change │
│ QQuery 4  │     297.46 / 304.40 ±5.05 / 312.95 ms │           292.44 / 301.73 ±6.06 / 309.36 ms │     no change │
│ QQuery 5  │     353.82 / 359.92 ±3.51 / 364.58 ms │           343.49 / 355.95 ±6.87 / 364.44 ms │     no change │
│ QQuery 6  │           5.67 / 6.54 ±0.67 / 7.49 ms │                 5.54 / 5.97 ±0.31 / 6.45 ms │ +1.10x faster │
│ QQuery 7  │        17.29 / 17.57 ±0.24 / 17.87 ms │              17.26 / 18.08 ±0.89 / 19.34 ms │     no change │
│ QQuery 8  │     430.34 / 435.43 ±4.65 / 443.59 ms │           438.58 / 443.34 ±5.54 / 454.00 ms │     no change │
│ QQuery 9  │     677.37 / 683.33 ±3.89 / 688.53 ms │           679.77 / 683.29 ±1.84 / 684.93 ms │     no change │
│ QQuery 10 │        92.35 / 94.54 ±2.19 / 98.41 ms │             93.27 / 95.48 ±3.00 / 101.44 ms │     no change │
│ QQuery 11 │     106.13 / 107.23 ±0.81 / 108.08 ms │           106.22 / 107.48 ±1.37 / 109.96 ms │     no change │
│ QQuery 12 │     358.83 / 364.56 ±4.07 / 371.07 ms │           348.51 / 356.75 ±5.40 / 365.47 ms │     no change │
│ QQuery 13 │     471.38 / 484.05 ±8.16 / 496.31 ms │          472.37 / 489.46 ±16.86 / 519.07 ms │     no change │
│ QQuery 14 │     356.74 / 367.05 ±7.50 / 379.79 ms │           363.77 / 368.69 ±2.52 / 370.93 ms │     no change │
│ QQuery 15 │    375.90 / 391.69 ±15.17 / 419.88 ms │           374.32 / 386.29 ±9.31 / 398.08 ms │     no change │
│ QQuery 16 │    736.36 / 767.81 ±21.85 / 802.72 ms │          738.13 / 765.04 ±18.25 / 794.34 ms │     no change │
│ QQuery 17 │     722.96 / 734.04 ±6.34 / 740.61 ms │           724.03 / 729.80 ±4.84 / 737.93 ms │     no change │
│ QQuery 18 │ 1462.60 / 1503.03 ±27.15 / 1542.95 ms │       1445.39 / 1483.94 ±27.63 / 1529.02 ms │     no change │
│ QQuery 19 │       35.44 / 48.66 ±11.19 / 68.04 ms │              35.45 / 39.68 ±4.23 / 47.39 ms │ +1.23x faster │
│ QQuery 20 │    715.93 / 737.80 ±24.26 / 781.13 ms │          718.93 / 731.80 ±15.71 / 761.43 ms │     no change │
│ QQuery 21 │    756.62 / 781.85 ±20.16 / 810.37 ms │           761.69 / 767.64 ±6.14 / 778.11 ms │     no change │
│ QQuery 22 │  1133.31 / 1141.52 ±9.80 / 1159.20 ms │        1123.88 / 1129.34 ±4.35 / 1135.30 ms │     no change │
│ QQuery 23 │ 3078.38 / 3093.13 ±16.43 / 3123.01 ms │       3057.28 / 3073.39 ±13.30 / 3093.74 ms │     no change │
│ QQuery 24 │      98.23 / 100.57 ±1.29 / 101.86 ms │            98.66 / 102.37 ±2.59 / 106.40 ms │     no change │
│ QQuery 25 │     138.99 / 141.07 ±2.19 / 145.20 ms │           138.98 / 140.51 ±0.80 / 141.35 ms │     no change │
│ QQuery 26 │      95.71 / 100.08 ±2.41 / 103.16 ms │            98.08 / 100.09 ±1.37 / 101.43 ms │     no change │
│ QQuery 27 │    850.76 / 868.04 ±11.52 / 883.56 ms │           847.91 / 854.89 ±9.00 / 872.59 ms │     no change │
│ QQuery 28 │ 7767.26 / 7802.92 ±26.74 / 7845.87 ms │       7716.63 / 7763.38 ±24.90 / 7784.15 ms │     no change │
│ QQuery 29 │        50.76 / 54.79 ±4.42 / 61.34 ms │              51.18 / 55.43 ±6.16 / 67.65 ms │     no change │
│ QQuery 30 │     372.29 / 380.04 ±6.66 / 389.24 ms │           363.53 / 370.31 ±5.11 / 377.02 ms │     no change │
│ QQuery 31 │    360.22 / 380.05 ±15.61 / 405.48 ms │           370.86 / 383.50 ±6.68 / 389.47 ms │     no change │
│ QQuery 32 │ 1275.14 / 1309.30 ±25.51 / 1340.04 ms │       1035.99 / 1086.41 ±56.37 / 1183.10 ms │ +1.21x faster │
│ QQuery 33 │ 1446.89 / 1499.98 ±42.47 / 1575.45 ms │       1475.02 / 1490.23 ±11.38 / 1507.59 ms │     no change │
│ QQuery 34 │ 1472.65 / 1486.96 ±10.53 / 1501.35 ms │        1483.28 / 1487.58 ±3.65 / 1492.80 ms │     no change │
│ QQuery 35 │     388.41 / 396.82 ±6.41 / 407.53 ms │           394.69 / 402.46 ±5.98 / 412.31 ms │     no change │
│ QQuery 36 │     115.08 / 122.21 ±5.71 / 128.67 ms │           113.47 / 123.05 ±4.99 / 127.47 ms │     no change │
│ QQuery 37 │        48.78 / 50.46 ±2.41 / 55.19 ms │              47.38 / 49.46 ±1.20 / 51.10 ms │     no change │
│ QQuery 38 │        76.99 / 79.02 ±2.41 / 83.71 ms │              76.39 / 78.62 ±1.25 / 79.82 ms │     no change │
│ QQuery 39 │     212.52 / 216.91 ±3.71 / 221.69 ms │           215.74 / 225.23 ±9.23 / 236.60 ms │     no change │
│ QQuery 40 │        25.04 / 27.66 ±1.77 / 29.82 ms │              24.82 / 26.70 ±1.48 / 28.22 ms │     no change │
│ QQuery 41 │        20.17 / 21.79 ±1.19 / 23.18 ms │              20.00 / 22.05 ±1.19 / 23.57 ms │     no change │
│ QQuery 42 │        19.70 / 20.00 ±0.28 / 20.49 ms │              19.81 / 20.55 ±0.52 / 21.39 ms │     no change │
└───────────┴───────────────────────────────────────┴─────────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                          │ 27591.39ms │
│ Total Time (gene.bordegaray_2026_03_repartition_metrics)   │ 27226.34ms │
│ Average Time (HEAD)                                        │   641.66ms │
│ Average Time (gene.bordegaray_2026_03_repartition_metrics) │   633.17ms │
│ Queries Faster                                             │          3 │
│ Queries Slower                                             │          0 │
│ Queries with No Change                                     │         40 │
│ Queries with Failure                                       │          0 │
└────────────────────────────────────────────────────────────┴────────────┘

Resource Usage

clickbench_partitioned — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 139.1s |
| Peak memory | 41.2 GiB |
| Avg memory | 28.5 GiB |
| CPU user | 1295.9s |
| CPU sys | 102.8s |
| Disk read | 0 B |
| Disk write | 2.8 GiB |

clickbench_partitioned — branch

| Metric | Value |
| --- | --- |
| Wall time | 137.3s |
| Peak memory | 42.1 GiB |
| Avg memory | 31.6 GiB |
| CPU user | 1292.9s |
| CPU sys | 89.8s |
| Disk read | 0 B |
| Disk write | 132.0 KiB |


@gene-bordegaray
Contributor Author

> In general, it might be better to keep metrics that directly help application tuning, are frequently used, or are difficult to capture with external profilers. I suspect some of them can be directly observed with profilers/flamegraphs, maybe they can be simplified?
>
> Additionally, we could consider introducing a new analyze level Internal in datafusion.explain.analyze_level (https://datafusion.apache.org/user-guide/configs.html) to hide these metrics from regular output.

Ya, I agree and am trying to be wary about this as well. I don't see any changes on the benches, but microregressions may be hard to catch.

I am using these and they are quite useful (and very convenient compared to a profiling tool like xctrace). Maybe these could be kept at the 'dev' level of analyze output?

@2010YOUY01
Contributor

> I am using these and they are quite useful (and very convenient compared to a profiling tool like xctrace). Maybe these could be kept at the 'dev' level of analyze output?

I was thinking the dev level is still for application insights and always tracked, but is less frequently used compared to 'summary'; the internal level is for DataFusion kernel hacking, and could be made optional (no runtime overhead) when running at higher levels.

I'll open an issue for this idea.

@gene-bordegaray
Contributor Author

> > I am using these and they are quite useful (and very convenient compared to a profiling tool like xctrace). Maybe these could be kept at the 'dev' level of analyze output?
>
> I was thinking the dev level is still for application insights and always tracked, but is less frequently used compared to 'summary'; the internal level is for DataFusion kernel hacking, and could be made optional (no runtime overhead) when running at higher levels.
>
> I'll open an issue for this idea.

Hey, following up on this: I wasn't able to find an issue, but I would be willing to take it on for these metrics. I think they are worth the additional level 👍

@2010YOUY01
Contributor

@gene-bordegaray Thanks, really appreciate it! Please go ahead.

@gene-bordegaray gene-bordegaray force-pushed the gene.bordegaray/2026/03/repartition_metrics branch from cd5514f to 769ad4f Compare May 13, 2026 18:25
@gene-bordegaray gene-bordegaray changed the title from "feat: add granular repartition metrics" to "Add internal repartition metrics" May 13, 2026
@gene-bordegaray gene-bordegaray force-pushed the gene.bordegaray/2026/03/repartition_metrics branch from 769ad4f to fe6c854 Compare May 13, 2026 18:29
@github-actions github-actions Bot added the physical-expr (Changes to the physical-expr crates), core (Core DataFusion crate), common (Related to common crate), and proto (Related to proto crate) labels May 13, 2026
@github-actions

github-actions Bot commented May 13, 2026

Thank you for opening this pull request!

Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).

Details
     Cloning apache/main
    Building datafusion v53.1.0 (current)
       Built [  97.914s] (current)
     Parsing datafusion v53.1.0 (current)
      Parsed [   0.033s] (current)
    Building datafusion v53.1.0 (baseline)
       Built [  97.451s] (baseline)
     Parsing datafusion v53.1.0 (baseline)
      Parsed [   0.034s] (baseline)
    Checking datafusion v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.613s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 198.113s] datafusion
    Building datafusion-common v53.1.0 (current)
       Built [  32.309s] (current)
     Parsing datafusion-common v53.1.0 (current)
      Parsed [   0.055s] (current)
    Building datafusion-common v53.1.0 (baseline)
       Built [  32.675s] (baseline)
     Parsing datafusion-common v53.1.0 (baseline)
      Parsed [   0.055s] (baseline)
    Checking datafusion-common v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.629s] 222 checks: 221 pass, 1 fail, 0 warn, 30 skip

--- failure enum_variant_added: enum variant added on exhaustive enum ---

Description:
A publicly-visible enum without #[non_exhaustive] has a new variant.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#enum-variant-new
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.47.0/src/lints/enum_variant_added.ron

Failed in:
  variant MetricType:Internal in /home/runner/work/datafusion/datafusion/datafusion/common/src/format.rs:233

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [  67.264s] datafusion-common
    Building datafusion-physical-expr-common v53.1.0 (current)
       Built [  19.204s] (current)
     Parsing datafusion-physical-expr-common v53.1.0 (current)
      Parsed [   0.019s] (current)
    Building datafusion-physical-expr-common v53.1.0 (baseline)
       Built [  19.370s] (baseline)
     Parsing datafusion-physical-expr-common v53.1.0 (baseline)
      Parsed [   0.020s] (baseline)
    Checking datafusion-physical-expr-common v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.203s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  39.990s] datafusion-physical-expr-common
    Building datafusion-physical-plan v53.1.0 (current)
       Built [  31.635s] (current)
     Parsing datafusion-physical-plan v53.1.0 (current)
      Parsed [   0.119s] (current)
    Building datafusion-physical-plan v53.1.0 (baseline)
       Built [  31.751s] (baseline)
     Parsing datafusion-physical-plan v53.1.0 (baseline)
      Parsed [   0.120s] (baseline)
    Checking datafusion-physical-plan v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.556s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [  65.978s] datafusion-physical-plan
    Building datafusion-proto v53.1.0 (current)
       Built [  52.722s] (current)
     Parsing datafusion-proto v53.1.0 (current)
      Parsed [   0.130s] (current)
    Building datafusion-proto v53.1.0 (baseline)
       Built [  52.779s] (baseline)
     Parsing datafusion-proto v53.1.0 (baseline)
      Parsed [   0.129s] (baseline)
    Checking datafusion-proto v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   1.678s] 222 checks: 221 pass, 1 fail, 0 warn, 30 skip

--- failure constructible_struct_adds_field: externally-constructible struct adds field ---

Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
        ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.47.0/src/lints/constructible_struct_adds_field.ron

Failed in:
  field AnalyzeExecNode.metric_types in /home/runner/work/datafusion/datafusion/datafusion/proto/src/generated/prost.rs:1837
  field AnalyzeExecNode.metric_types in /home/runner/work/datafusion/datafusion/datafusion/proto/src/generated/prost.rs:1837

     Summary semver requires new major version: 1 major and 0 minor checks failed
    Finished [ 109.778s] datafusion-proto
    Building datafusion-sqllogictest v53.1.0 (current)
       Built [ 168.147s] (current)
     Parsing datafusion-sqllogictest v53.1.0 (current)
      Parsed [   0.020s] (current)
    Building datafusion-sqllogictest v53.1.0 (baseline)
       Built [ 168.211s] (baseline)
     Parsing datafusion-sqllogictest v53.1.0 (baseline)
      Parsed [   0.022s] (baseline)
    Checking datafusion-sqllogictest v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.085s] 222 checks: 222 pass, 30 skip
     Summary no semver update required
    Finished [ 340.060s] datafusion-sqllogictest
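
The two failures reported above (`enum_variant_added` and `constructible_struct_adds_field`) can be illustrated from the downstream side. The sketch below uses simplified stand-ins: the variant names of `MetricType` other than `Internal`, the `verbose` field, and the type of `metric_types` are assumptions for illustration and are not taken from the DataFusion source. It shows the two patterns that keep downstream code compiling when such additions land: a wildcard match arm, and a struct literal completed with `..Default::default()` (prost-generated messages generally implement `Default`).

```rust
/// Stand-in for an exhaustive public enum that gains a variant (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum MetricType {
    Summary,
    Dev,
    Internal, // the newly added variant
}

/// Downstream code with a wildcard arm keeps compiling when a variant is added;
/// a match listing only `Summary` and `Dev` would stop compiling instead.
fn label(t: MetricType) -> &'static str {
    match t {
        MetricType::Summary => "summary",
        MetricType::Dev => "dev",
        _ => "other",
    }
}

/// Simplified stand-in for a generated message that gains a public field.
#[derive(Debug, Default, PartialEq)]
struct AnalyzeExecNode {
    verbose: bool,
    metric_types: Vec<i32>, // newly added field; the type is an assumption
}

/// A literal that names every field breaks when a field is added;
/// filling the rest from `Default` does not.
fn build_node() -> AnalyzeExecNode {
    AnalyzeExecNode {
        verbose: true,
        ..Default::default()
    }
}
```

Downstream callers that already use these patterns are unaffected by the change; callers that match exhaustively or spell out every field would need a small update, which is why the tool flags both as requiring a major version.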

@github-actions github-actions Bot added the auto detected api change Auto detected API change label May 13, 2026
@gene-bordegaray gene-bordegaray force-pushed the gene.bordegaray/2026/03/repartition_metrics branch from fe6c854 to 1b1ee57 Compare May 14, 2026 00:42
@github-actions github-actions Bot removed physical-expr Changes to the physical-expr crates core Core DataFusion crate common Related to common crate proto Related to proto crate labels May 14, 2026
@gene-bordegaray gene-bordegaray force-pushed the gene.bordegaray/2026/03/repartition_metrics branch from 1b1ee57 to 5924eaf Compare May 14, 2026 00:43
@github-actions github-actions Bot added physical-expr Changes to the physical-expr crates core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) common Related to common crate proto Related to proto crate labels May 14, 2026
@gene-bordegaray
Contributor Author

gene-bordegaray commented May 14, 2026

@2010YOUY01 I have added these behind an 'internal' metrics level. I decided to keep it in this PR, since introducing the 'internal' metrics level with repartition as its first use case seemed fitting and not too large a scope.

These metrics have led me to find an improvement in the operator on fanned-out small batches here: #22159

Let me know what you think of the way it is introduced. I'm also not opposed to splitting it into two PRs if preferred.


Labels

auto detected api change Auto detected API change common Related to common crate core Core DataFusion crate documentation Improvements or additions to documentation physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate proto Related to proto crate sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add internal EXPLAIN ANALYZE metric level
Add Granular Metrics to RepartitionExec

4 participants