fix(merge): partition pruning, EXPLAIN routing, and data-loss fix on the MERGE path (rebased) (#134)
* fix(custom_type_coercion): fall back to plan.schema() for leaf nodes
`CustomTypeCoercionRewriter::analyze_internal` built its lookup schema
from `merge_schema(&plan.inputs())`. For leaf nodes like `LogicalPlan::TableScan`,
`plan.inputs()` is empty and the merged schema has no fields, so any
binary-op expression attached directly to the leaf — e.g. via
`LogicalPlanBuilder::scan_with_filters` — would fail coercion with
"Schema error: No field named <col>" during the analyzer pass.
This broke the target-side partition pruning hint path that
`UserQuery::merge_query` wires up when a MERGE source is a partitioned
`DataFusionTable`: `target_filter_expression()` builds a per-partition
`col(source) >= min AND col(source) <= max` predicate and pushes it into
the target `TableScan`'s filters via `scan_with_filters`, expecting
Iceberg's file pruner to use it at manifest level. The filter never made
it past the analyzer.
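A rough sketch of that hint path, with illustrative names (the helper name, the table-source setup, and the string-typed `min`/`max` bounds are assumptions for the sketch, not the actual Embucket code):

```rust
use std::sync::Arc;

use datafusion::error::Result;
use datafusion::logical_expr::{col, lit, Expr, LogicalPlan, LogicalPlanBuilder, TableSource};

/// Hypothetical stand-in for `target_filter_expression()`: build the
/// per-partition pruning hint `col >= min AND col <= max` and attach it to the
/// target `TableScan` through `scan_with_filters`, so Iceberg's file pruner
/// can use it at manifest level.
fn scan_target_with_partition_hint(
    target: Arc<dyn TableSource>,
    partition_col: &str,
    min: &str,
    max: &str,
) -> Result<LogicalPlan> {
    let hint: Expr = col(partition_col)
        .gt_eq(lit(min))
        .and(col(partition_col).lt_eq(lit(max)));
    // The predicate lands directly on the leaf TableScan, which is exactly the
    // kind of expression the analyzer previously failed to coerce.
    LogicalPlanBuilder::scan_with_filters("target", target, None, vec![hint])?.build()
}
```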
Fix: when `plan.inputs().is_empty()`, use `plan.schema()` directly for
type resolution, mirroring the pattern DataFusion's built-in
`TypeCoercion` analyzer uses. All existing `custom_type_coercion`
snapshot tests still pass, and the full `merge_into` suite (22 tests)
stays green.
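A minimal sketch of the fallback, assuming a simplified rewriter (the fold over input schemas below stands in for whatever `merge_schema(&plan.inputs())` does in the real code):

```rust
use datafusion::common::DFSchema;
use datafusion::logical_expr::LogicalPlan;

/// Simplified shape of the fix: choose the schema that expressions are
/// resolved against during the custom type-coercion pass.
fn resolution_schema(plan: &LogicalPlan) -> DFSchema {
    if plan.inputs().is_empty() {
        // Leaf nodes such as TableScan have no inputs, so merging input
        // schemas would yield an empty schema; use the node's own schema.
        plan.schema().as_ref().clone()
    } else {
        // Non-leaf nodes keep the original behaviour: merge the input schemas
        // (a stand-in for the rewriter's merge_schema call).
        let mut merged = DFSchema::empty();
        for input in plan.inputs() {
            merged.merge(input.schema().as_ref());
        }
        merged
    }
}
```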
Verified end-to-end against a deployed Embucket Lambda:
`MERGE INTO demo.atomic.events_hooli_tiny USING demo.atomic.events_hooli_ident`
where the source is partitioned by `identity(event_name)` — previously
failed with `custom_type_coercion / Schema error: No field named event_name`,
now returns 100 matched rows and the update lands on disk.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(merge): route EXPLAIN / EXPLAIN ANALYZE into the MERGE planner
Embucket has its own MERGE planner (`UserQuery::merge_query`) because
DataFusion's SQL path doesn't produce a usable plan for `MERGE INTO`.
The side effect was that `EXPLAIN MERGE INTO …` and
`EXPLAIN ANALYZE MERGE INTO …` both fell through `execute()` to
`execute_sql`, which hands the statement to DataFusion's planner and
bounces back with:
SQL compilation error: unsupported feature: Unsupported SQL statement:
MERGE INTO …
That left no observability for MERGE plans or per-scan metrics, which
made it impossible to verify partition-pruning behaviour on partitioned
Iceberg targets (files scanned, bytes scanned, manifest-level pruning
counters).
Changes:
1. Split `merge_query` into a pure plan-builder `merge_to_logical_plan`
and a thin wrapper that calls `execute_logical_plan`.
2. In `execute()`, when the parsed statement is
`DFStatement::Explain(..)` whose inner statement is
`Statement::Merge { .. }`, build the MERGE logical plan via
`merge_to_logical_plan`, then wrap it in the same
`LogicalPlan::Explain` / `LogicalPlan::Analyze` shape DataFusion's
own `explain_to_plan` constructs (see the sketch after this list).
Everything downstream (physical planning, execution, output
formatting) is unchanged.
3. Add a snapshot test `merge_into_explain` over a minimal
unpartitioned target + source — asserts the logical and physical
plans render. `EXPLAIN ANALYZE` is exercised end-to-end through the
deployed Lambda rather than via snapshot because the formatted-table
column widths depend on the pre-redaction metric value widths and
aren't stable across runs.
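A rough sketch of the detection step in item 2, using DataFusion's parser types; `merge_to_logical_plan` and the Explain/Analyze wrapping are left out, and the `ExplainStatement` field layout is assumed to match current DataFusion:

```rust
use datafusion::sql::parser::Statement as DFStatement;
use datafusion::sql::sqlparser::ast::Statement as SqlStatement;

/// Returns true when the parsed statement is `EXPLAIN [ANALYZE] MERGE INTO …`:
/// a DataFusion `Explain` wrapper whose inner statement is a SQL `MERGE`.
/// When it is, execute() builds the plan via `merge_to_logical_plan` and wraps
/// it in `LogicalPlan::Explain` / `LogicalPlan::Analyze` (wrapping elided here).
fn is_explain_merge(statement: &DFStatement) -> bool {
    if let DFStatement::Explain(explain) = statement
        && let DFStatement::Statement(inner) = explain.statement.as_ref()
    {
        return matches!(inner.as_ref(), SqlStatement::Merge { .. });
    }
    false
}
```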
After this change:
- `EXPLAIN MERGE INTO t USING s ON ... WHEN MATCHED THEN UPDATE ...`
returns the logical plan + physical plan (including
`MergeIntoSinkExec`, `HashJoinExec`, `DataSourceExec { file_groups,
projection, file_type }` for each side).
- `EXPLAIN ANALYZE` of the same statement executes the MERGE and
additionally reports per-node runtime metrics. The
`DataSourceExec` rows now surface the DataFusion/Parquet scan
counters that were previously invisible: `bytes_scanned`,
`files_ranges_pruned_statistics`, `row_groups_pruned_statistics`,
`pushdown_rows_pruned`, `page_index_rows_pruned`. That's the signal
you need to verify source-side partition-hint pruning actually prunes.
All 23 `merge_into` tests pass (22 existing + 1 new). The full
`cargo test -p executor --lib` run is 359 passed / 0 failed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(merge): expose updated/inserted/deleted row counts as MetricsSet
`MergeIntoCOWSinkExec` tracked per-clause row counts in `AtomicI64`
purely to populate the final result batch. After the EXPLAIN / EXPLAIN
ANALYZE routing fix on this branch, `EXPLAIN ANALYZE MERGE INTO …`
reports rich per-scan metrics on every `DataSourceExec` in the plan,
but the sink line was still rendering as `MergeIntoSinkExec, metrics=[]`
because this node didn't own an `ExecutionPlanMetricsSet`.
Wire one up: register `Count` metrics `updated_rows`, `inserted_rows`,
and `deleted_rows` via `MetricBuilder::new(&self.metrics).counter(..)`
at the start of `execute()`, clone them into the async write closure,
and `add()` the final `AtomicI64` values after the transaction commits.
Implement `ExecutionPlan::metrics()` to return
`Some(self.metrics.clone_inner())` so DataFusion's plan formatter picks
them up. Row counts that exceed `usize::MAX` saturate via `try_from`
rather than panicking.
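A condensed sketch of that wiring under the names above; the operator is reduced to the metrics-relevant pieces and only the `updated_rows` counter is shown:

```rust
use std::sync::atomic::{AtomicI64, Ordering};

use datafusion::physical_plan::metrics::{
    Count, ExecutionPlanMetricsSet, MetricBuilder, MetricsSet,
};

struct MergeIntoCOWSinkExec {
    metrics: ExecutionPlanMetricsSet,
    updated_rows: AtomicI64,
    // inserted_rows / deleted_rows elided for brevity
}

impl MergeIntoCOWSinkExec {
    /// Registered at the start of execute(); the returned Count is a cheap
    /// clone that can be moved into the async write closure.
    fn register_updated_counter(&self, partition: usize) -> Count {
        MetricBuilder::new(&self.metrics).counter("updated_rows", partition)
    }

    /// Called after the transaction commits: saturate instead of panicking
    /// if the i64 doesn't fit a usize.
    fn record_after_commit(&self, updated: &Count) {
        let n = usize::try_from(self.updated_rows.load(Ordering::Relaxed))
            .unwrap_or(usize::MAX);
        updated.add(n);
    }

    /// What ExecutionPlan::metrics() returns so DataFusion's plan formatter
    /// renders the counters in EXPLAIN ANALYZE output.
    fn exposed_metrics(&self) -> Option<MetricsSet> {
        Some(self.metrics.clone_inner())
    }
}
```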
After this change, `EXPLAIN ANALYZE MERGE` shows the sink counters
alongside the child scan counters, so an operator can read updated /
inserted / deleted counts directly off the plan output instead of only
from the result row.
All 23 `merge_into` tests stay green.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(merge): preserve target rows when MERGE batch contains only target
The MergeCOWFilterStream "no matches in this batch" fast path
short-circuited on `matching_data_and_manifest_files.is_empty()`
without checking the cumulative `all_matching_data_files` set. If a
target file had been seen as matching in an earlier batch and a later
batch contained only target rows for that file, the rows in the later
batch were silently dropped. The downstream COW commit then overwrote
the original file with the partial result, permanently losing the
unmatched target rows from the batches that hit that fast path.
The fix tightens the guard to also require `all_matching_data_files`
to be empty before taking the fast path. When a batch belongs to a
file already in the overwrite set, it falls through to the main
filter path which correctly emits target rows via
`file_predicate OR source_exists`.
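In guard form, the change is roughly the following sketch; the variable names come from the commit text, and the `HashSet<String>` types are placeholders for the real Iceberg file descriptors:

```rust
use std::collections::HashSet;

/// Sketch of the tightened guard: the fast path may only fire when neither
/// the current batch nor any earlier batch of the stream produced a matching
/// data file.
fn take_no_match_fast_path(
    matching_data_and_manifest_files: &HashSet<String>,
    all_matching_data_files: &HashSet<String>,
) -> bool {
    // Old guard: matching_data_and_manifest_files.is_empty() alone, which
    // dropped target-only batches for files already marked as matching.
    matching_data_and_manifest_files.is_empty() && all_matching_data_files.is_empty()
}
```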
Adds three unit tests against MergeCOWFilterStream covering the
matching-then-target patterns, plus a SQL snapshot test that exercises
the same shape end-to-end.
Fixes #128
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style: apply rustfmt after rebase on main
Rustfmt on Rust 1.94 formats the long #[instrument(...)] attribute and
the fn signature that follows it differently from the toolchain the PR
was originally authored against. No semantic change.
* style(merge): collapse nested if into let-chain for EXPLAIN MERGE route
Rust 1.94 clippy (clippy::collapsible_if, denied via clippy::all) flags
the nested `if let ... { if matches!(...) { ... } }` guard in execute().
Merge both conditions into a single let-chain so clippy is happy without
changing the observable behaviour of the MERGE EXPLAIN routing.
* test(merge): normalize RoundRobinBatch fan-out in EXPLAIN snapshots
The DataFusion planner uses the host CPU count as the RoundRobinBatch
partition target, so the EXPLAIN snapshot literal differed between the
PR author's dev box (10 cores) and the 4-core ubuntu-latest GitHub
runner. Add an insta filter to the shared test_query! macro that rewrites
`RoundRobinBatch(N)` to `RoundRobinBatch([N])`, and regenerate the
`query_merge_into_explain` snapshot to use the normalized token so the
test is stable across core counts.
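The normalization can be expressed as an insta filter along these lines (a sketch of the idea rather than the actual `test_query!` internals; `explain_output` is a placeholder for the rendered plan text):

```rust
fn assert_explain_snapshot(explain_output: &str) {
    let mut settings = insta::Settings::clone_current();
    // Rewrite the core-count-dependent fan-out before snapshotting, so
    // RoundRobinBatch(10) on a dev box and RoundRobinBatch(4) on CI both
    // render as RoundRobinBatch([N]).
    settings.add_filter(r"RoundRobinBatch\(\d+\)", "RoundRobinBatch([N])");
    settings.bind(|| insta::assert_snapshot!(explain_output));
}
```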
* ci: re-trigger clippy after transient actions/checkout HTTP 500
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
// MERGE INTO" because `execute()` never unwrapped
7
+
// `DFStatement::Explain(..MERGE..)` and fell through to DataFusion's default
8
+
// SQL path, which doesn't know about Embucket's MERGE planner.
9
+
//
10
+
// This test covers the plan shape only. `EXPLAIN ANALYZE MERGE` is
11
+
// exercised end-to-end against the deployed Lambda — its output contains
12
+
// per-run metric values whose width varies the formatted-table column
13
+
// padding, which is too unstable for an insta snapshot.
14
+
test_query!(
15
+
merge_into_explain,
16
+
"EXPLAIN MERGE INTO merge_target USING merge_source ON merge_target.id = merge_source.id WHEN MATCHED THEN UPDATE SET merge_target.description = merge_source.description",
17
+
setup_queries = [
18
+
"CREATE TABLE embucket.public.merge_target (ID INTEGER, description VARCHAR)",
19
+
"CREATE TABLE embucket.public.merge_source (ID INTEGER, description VARCHAR)",
20
+
"INSERT INTO embucket.public.merge_target VALUES (1, 'existing row')",
21
+
"INSERT INTO embucket.public.merge_source VALUES (1, 'updated row')",
22
+
],
23
+
snapshot_path = "merge_into"
24
+
);
25
+
3
26
test_query!(
4
27
merge_into_only_update,
5
28
"SELECT count(CASE WHEN description = 'updated row' THEN 1 ELSE NULL END) updated, count(CASE WHEN description = 'existing row' THEN 1 ELSE NULL END) existing FROM embucket.public.merge_target",
@@ -299,3 +322,25 @@ test_query!(
299
322
],
300
323
snapshot_path = "merge_into"
301
324
);
325
+
326
+
// Regression test for https://github.com/Embucket/embucket/issues/128.
327
+
//
328
+
// Target is one data file with many rows; source is a mix of updates (matches) and
329
+
// inserts (no match), and the target rows of the join land in the filter stream in
330
+
// batches where some contain source_exists=true rows and some only contain target
331
+
// rows. Previously the "no matches, no source" fast path would silently drop the
332
+
// target-only batches for a file that had already been marked as matching in an
333
+
// earlier batch, causing the final row count to be less than the expected
334
+
// (target_rows + new_source_rows). This test asserts that no target row is lost.
335
+
test_query!(
336
+
merge_into_mixed_unsorted_multi_row_no_data_loss,
337
+
"SELECT COUNT(*) as total_rows, COUNT(CASE WHEN description = 'updated row' THEN 1 END) as updated_rows, COUNT(CASE WHEN description = 'original row' THEN 1 END) as preserved_rows, COUNT(CASE WHEN description = 'new row' THEN 1 END) as inserted_rows FROM embucket.public.merge_target",
338
+
setup_queries = [
339
+
"CREATE TABLE embucket.public.merge_target (id INTEGER, description VARCHAR)",
340
+
"CREATE TABLE embucket.public.merge_source (id INTEGER, description VARCHAR)",
"MERGE INTO merge_target t USING merge_source s ON t.id = s.id WHEN MATCHED THEN UPDATE SET t.description = s.description WHEN NOT MATCHED THEN INSERT (id, description) VALUES (s.id, s.description)",