
[WIP][SQL] DSv2 Write Timing Metrics #56033

Draft
ZiyaZa wants to merge 1 commit into apache:master from ZiyaZa:timing-metrics

@ZiyaZa (Contributor) commented May 21, 2026

What changes were proposed in this pull request?

Adds 3 new metrics for write operations in DSv2 (the descriptions below are AI-generated):

  • executionTimeMs - Total wall-clock time, in milliseconds, of the V2 write exec, measured from entry into writeWithV2 through runJob and metric-collection bookkeeping, and snapshotted just before the connector's BatchWrite.commit(...). Includes: query.execute() (RDD lineage construction, plan prepare(), launching DPP/runtime-filter subqueries, and any AQE work that runs during execute), writer-factory creation, the main runJob, post-job driver bookkeeping, and getWriteSummary itself. Excludes: Catalyst analysis and optimization before V2TableWriteExec.run(), and the connector's own BatchWrite.commit(...), because the commit is what consumes this summary.
  • groupFilterTimeMs - Cumulative time, in milliseconds, spent in the runtime group-filter subquery injected by RowLevelOperationRuntimeGroupFiltering, that is, the subquery that identifies which target files contain rows matching the row-level operation's condition. Measured as the sum of collectTime over all ExecSubqueryExpression instances in the runtime filters of the row-level target's BatchScanExec.
  • writeJobTimeMs - Wall-clock duration, in milliseconds, of the main Spark write job, i.e. the sparkContext.runJob(...) call inside writeWithV2. This job runs the fused scan + filter + (join, for MERGE) + project + write tasks that produce the row-level deltas or replacement data. Excludes: driver-side preparation, runtime group-filter subquery execution (which runs as its own job in a separate thread pool), and post-job processing.
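
How the two wall-clock metrics nest can be sketched as below. This is a minimal Python model under stated assumptions, not the Spark implementation: `write_with_v2`, `prepare`, `run_job`, `post_job`, and `FakeClock` are hypothetical stand-ins for the phases named in the descriptions, and the clock is injected only so the example is deterministic.

```python
import time
from typing import Callable, Dict


def write_with_v2(
    prepare: Callable[[], None],
    run_job: Callable[[], None],
    post_job: Callable[[], None],
    clock_ns: Callable[[], int] = time.monotonic_ns,
) -> Dict[str, int]:
    """Hypothetical model of the timing boundaries (not Spark code)."""
    exec_start = clock_ns()      # entry into the write exec
    prepare()                    # stands in for query.execute(), writer-factory creation

    job_start = clock_ns()
    run_job()                    # stands in for sparkContext.runJob(...)
    job_end = clock_ns()

    post_job()                   # stands in for post-job driver bookkeeping
    exec_end = clock_ns()        # snapshot taken before the connector's commit

    return {
        "executionTimeMs": (exec_end - exec_start) // 1_000_000,
        "writeJobTimeMs": (job_end - job_start) // 1_000_000,
    }


class FakeClock:
    """Deterministic clock used only for this illustration."""

    def __init__(self) -> None:
        self.now_ns = 0

    def advance_ms(self, ms: int) -> None:
        self.now_ns += ms * 1_000_000

    def __call__(self) -> int:
        return self.now_ns


clock = FakeClock()
metrics = write_with_v2(
    prepare=lambda: clock.advance_ms(30),
    run_job=lambda: clock.advance_ms(200),
    post_job=lambda: clock.advance_ms(5),
    clock_ns=clock,
)
print(metrics)  # {'executionTimeMs': 235, 'writeJobTimeMs': 200}
```

In this model the write-job interval always lies inside the execution interval, so writeJobTimeMs can never exceed executionTimeMs. The commit is deliberately outside both intervals.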

groupFilterTimeMs is -1 when the runtime group-filter subquery is not injected for the row-level operation, or when it can no longer be reached in the final plan (case 7). This happens in the following cases:

  1. No effective WHERE clause: The operation has no condition, has WHERE TRUE, or has a condition that folds to TrueLiteral. The rule requires cond != TrueLiteral.
  2. Runtime group filter disabled: spark.sql.optimizer.runtimeRowLevelOperation.groupFilter.enabled = false.
  3. MERGE with NOT MATCHED BY SOURCE: RewriteMergeIntoTable does not set groupFilterCondition when notMatchedBySourceActions is non-empty, since the whole target must be read.
  4. InsertOnlyMerge optimization: The MERGE is rewritten to an append-style, insert-only plan with no row-level target scan.
  5. Delta-based DELETE: RewriteDeleteFromTable.buildWriteDeltaPlan never sets groupFilterCondition. Delta-based DELETE reads only row IDs, so file pruning by partition columns is not applicable.
  6. Partition column pruned from the read schema: The rule's gate scan.filterAttributes.nonEmpty requires the partition column to be in the scan's read schema. For delta-based UPDATE, when the partition column is being assigned (e.g. SET dep = 'literal') and is not referenced by the condition, column pruning removes it.
  7. AQE eliminates the scan: When the injected runtime-filter subquery returns an empty result, AQE replaces the BatchScanExec with a null-constant projection. The subquery executed, but its parent scan no longer exists in the final plan, so the metric-collection helper cannot reach it.
  8. Condition folds to a constant-false / empty-result branch: Cases such as WHERE col = NULL, WHERE false, or empty target tables, where the optimizer short-circuits the scan before runtime filtering is considered.
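
The sum-or-sentinel behaviour described above can be sketched as follows. This is a minimal Python model rather than the PR's code: `group_filter_time_ms` and `mean_group_filter_ms` are hypothetical names, and treating an empty list of collect times as "no subquery found" is an assumption made for the illustration.

```python
from typing import List, Optional

NOT_INJECTED = -1  # sentinel value described in the PR text


def group_filter_time_ms(collect_times_ms: List[int]) -> int:
    """Sum per-subquery collect times, or return the sentinel.

    Assumption: an empty list models every case in which no runtime
    group-filter subquery is reachable from the row-level target scan.
    """
    if not collect_times_ms:
        return NOT_INJECTED
    return sum(collect_times_ms)


def mean_group_filter_ms(samples: List[int]) -> Optional[float]:
    """Average the metric across queries while skipping the sentinel.

    Hypothetical consumer-side helper: including -1 in an average would
    silently drag the result down.
    """
    measured = [s for s in samples if s != NOT_INJECTED]
    if not measured:
        return None
    return sum(measured) / len(measured)


injected = group_filter_time_ms([120, 35])  # two subqueries were collected
skipped = group_filter_time_ms([])          # e.g. group filter disabled
average = mean_group_filter_ms([injected, skipped, 45])
print(injected, skipped, average)  # 155 -1 100.0
```

Anything that aggregates this metric across queries would need to filter out -1 in a similar way, since it marks "not measured" rather than a duration.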

Why are the changes needed?

These metrics give better visibility into DML queries by showing how long each operation took.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added validation for new metrics to existing tests.
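
The PR text does not list the assertions that were added. As an illustration only, the sketch below shows sanity checks that follow from the metric definitions in this description; `timing_metric_violations` is a hypothetical helper written in Python, not code from the patch or its tests.

```python
from typing import List, Mapping

EXPECTED_METRICS = ("executionTimeMs", "groupFilterTimeMs", "writeJobTimeMs")


def timing_metric_violations(metrics: Mapping[str, int]) -> List[str]:
    """Return human-readable problems; an empty list means all checks pass."""
    missing = [name for name in EXPECTED_METRICS if name not in metrics]
    if missing:
        return [f"missing metric: {name}" for name in missing]

    problems: List[str] = []
    execution = metrics["executionTimeMs"]
    write_job = metrics["writeJobTimeMs"]
    group_filter = metrics["groupFilterTimeMs"]

    if execution < 0:
        problems.append("executionTimeMs is negative")
    if write_job < 0:
        problems.append("writeJobTimeMs is negative")
    # The write job runs inside the measured execution window.
    if write_job > execution:
        problems.append("writeJobTimeMs exceeds executionTimeMs")
    # -1 is the only allowed negative value (subquery not found).
    if group_filter < -1:
        problems.append("groupFilterTimeMs is below the -1 sentinel")
    return problems


ok = timing_metric_violations(
    {"executionTimeMs": 235, "groupFilterTimeMs": -1, "writeJobTimeMs": 200}
)
bad = timing_metric_violations(
    {"executionTimeMs": 100, "groupFilterTimeMs": 12, "writeJobTimeMs": 150}
)
print(ok)   # []
print(bad)  # ['writeJobTimeMs exceeds executionTimeMs']
```

No upper bound relative to executionTimeMs is checked for groupFilterTimeMs, because the description defines it as a cumulative sum over subqueries rather than a wall-clock interval.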

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7
