[WIP][SQL] DSv2 Write Timing Metrics by ZiyaZa · Pull Request #56033 · apache/spark

ZiyaZa · 2026-05-21T10:20:16Z

What changes were proposed in this pull request?

Adds 3 new metrics for write operations in DSv2 (the descriptions below are AI-generated):

executionTimeMs - Total wall-clock time, in milliseconds, of the V2 write exec from entry of writeWithV2 through runJob and metric-collection bookkeeping, snapshotted just before the connector's BatchWrite.commit(...). Includes: query.execute() (RDD lineage construction, plan prepare(), kicking off DPP/runtime-filter subqueries, AQE work that runs during execute), writer-factory creation, the main runJob, post-job driver bookkeeping, and getWriteSummary itself. Excludes: Catalyst analysis / optimization before V2TableWriteExec.run(), the connector's own BatchWrite.commit (the commit is what consumes this summary).
groupFilterTimeMs - Cumulative time, in milliseconds, spent in the runtime group-filter subquery injected by RowLevelOperationRuntimeGroupFiltering - the subquery that identifies which target files contain rows matching the row-level operation condition. Measured as the sum of collectTime over all ExecSubqueryExpressions in the runtime filters of the row-level target's BatchScanExec.
writeJobTimeMs - Wall-clock duration, in milliseconds, of the main Spark write job, i.e. the sparkContext.runJob(...) call inside writeWithV2. This job runs the fused scan + filter + (join, for MERGE) + project + write tasks that produce the row-level deltas or replacement data. Excludes: driver-side preparation, runtime group-filter subquery execution (which runs in its own job in a separate thread pool), and post-job processing.
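
To illustrate how the two wall-clock metrics nest, here is a minimal sketch in plain Scala, not the code from this PR. The names `WriteTimingSketch`, `WriteTimings`, `simulateWrite`, `prepare`, `runJob`, and `postJob` are hypothetical placeholders for the phases described above. It assumes both metrics are read from the same monotonic clock, so executionTimeMs can never be smaller than writeJobTimeMs.

```scala
import java.util.concurrent.TimeUnit

object WriteTimingSketch {
  // Hypothetical container for the two wall-clock metrics.
  final case class WriteTimings(executionTimeMs: Long, writeJobTimeMs: Long)

  // Hypothetical stand-in for writeWithV2. The three callbacks are placeholders:
  //   prepare - driver-side work before the job (e.g. query.execute(), writer-factory creation)
  //   runJob  - the main write job (sparkContext.runJob(...) in the real code)
  //   postJob - post-job driver bookkeeping, which ends before the connector's commit
  def simulateWrite(prepare: () => Unit, runJob: () => Unit, postJob: () => Unit): WriteTimings = {
    val startNs = System.nanoTime()
    prepare()

    val jobStartNs = System.nanoTime()
    runJob()
    val jobEndNs = System.nanoTime()

    postJob()
    // Snapshot taken before any commit step, mirroring the description above.
    val endNs = System.nanoTime()

    WriteTimings(
      executionTimeMs = TimeUnit.NANOSECONDS.toMillis(endNs - startNs),
      writeJobTimeMs = TimeUnit.NANOSECONDS.toMillis(jobEndNs - jobStartNs))
  }
}
```

Because the job interval lies entirely inside the execution interval, the difference between the two values approximates the driver-side preparation and post-job overhead.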

groupFilterTimeMs is -1 when the runtime group-filter subquery is not injected for the row-level operation. This happens in the following cases:

No effective WHERE clause — The operation has no condition, has WHERE TRUE, or the condition folds to TrueLiteral. The rule requires cond != TrueLiteral.
Runtime group filter disabled — spark.sql.optimizer.runtimeRowLevelOperation.groupFilter.enabled = false.
MERGE with NOT MATCHED BY SOURCE — RewriteMergeIntoTable does not set groupFilterCondition when notMatchedBySourceActions is non-empty, since the whole target must be read.
InsertOnlyMerge optimization — MERGE rewritten to an append-style insert-only plan with no row-level target scan.
Delta-based DELETE — RewriteDeleteFromTable.buildWriteDeltaPlan never sets groupFilterCondition. Delta-based DELETE reads only row IDs, so file pruning by partition columns is not applicable.
Partition column pruned from the read schema — The rule's gate scan.filterAttributes.nonEmpty requires the partition column to be in the scan's read schema. For delta-based UPDATE, when the partition column is being assigned (e.g. SET dep = 'literal') and is not referenced by the condition, column pruning removes it.
AQE eliminates the scan — When the injected runtime-filter subquery returns empty, AQE replaces the BatchScanExec with a null-constant projection. The subquery executed but its parent scan no longer exists in the final plan, so the helper can't reach it.
Condition folds to a constant-false / empty-result branch — Cases like WHERE col = NULL, WHERE false, or empty target tables, where the optimizer short-circuits the scan before runtime filtering is considered.
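
The sentinel behaviour described above can be sketched as follows. This is an illustrative, dependency-free sketch rather than the PR's helper: `GroupFilterTimeSketch` and its input shape are assumptions. It takes the collectTime values (in milliseconds) of the subqueries found in the row-level target scan's runtime filters, and treats an empty list as "subquery not injected or not reachable in the final plan".

```scala
object GroupFilterTimeSketch {
  // Sentinel reported when no runtime group-filter subquery is found.
  val NotInjected: Long = -1L

  // `subqueryCollectTimesMs` is a hypothetical input: one collectTime value per
  // subquery expression found in the target scan's runtime filters.
  def groupFilterTimeMs(subqueryCollectTimesMs: Seq[Long]): Long =
    if (subqueryCollectTimesMs.isEmpty) NotInjected
    else subqueryCollectTimesMs.sum
}
```

Using -1 instead of 0 lets a consumer distinguish "no group filter ran" from "the group filter ran and finished in under a millisecond".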

Why are the changes needed?

To give better visibility into DML queries by showing how long each phase of a write operation took.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added validation for new metrics to existing tests.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

Timing metrics

88e83df

ZiyaZa wants to merge 1 commit into apache:master from ZiyaZa:timing-metrics
