Skip to content

Commit 3e1f856

Browse files
adriangbclaude
andcommitted
docs: clarify error semantics of in-place plan rewriting
Address review feedback on the in-place optimizer rewrite: - Document the error contract of `rewrite_plan_in_place`: on `Err` the plan is left in an unspecified state because `rule.rewrite()` consumes it by value, and explain why it cannot be recovered without the clone the function exists to avoid. Note how `Optimizer::optimize` handles it. - Move the `mem::take` bridge explanation from the doc comment into an inline comment next to the code it describes. - Drop the inaccurate "Arc refcount is 1 is always true" claim. - Document that `LogicalPlan::map_children_mut` does not roll back partial mutations when `f` fails. Comment-only changes; no behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 7d5359f commit 3e1f856

2 files changed

Lines changed: 33 additions & 6 deletions

File tree

datafusion/expr/src/logical_plan/tree_node.rs

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -402,6 +402,14 @@ impl LogicalPlan {
402402
/// When the refcount is >1, it clones the inner value first.
403403
///
404404
/// Returns `Ok(true)` if any child was modified by `f`.
405+
///
406+
/// # Error semantics
407+
///
408+
/// If `f` returns `Err` for a child, this method returns that error
409+
/// immediately. Children visited earlier in the same call keep whatever
410+
/// modifications `f` already applied to them — they are **not** rolled
411+
/// back. Callers that need the pre-call tree on error must save a copy
412+
/// beforehand.
405413
pub fn map_children_mut<F: FnMut(&mut Self) -> Result<bool>>(
406414
&mut self,
407415
mut f: F,

datafusion/optimizer/src/optimizer.rs

Lines changed: 25 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -365,13 +365,22 @@ impl TreeNodeRewriter for Rewriter<'_> {
365365
///
366366
/// This avoids the `Arc::unwrap_or_clone` + `Arc::new` cycle that the
367367
/// ownership-based `TreeNode::rewrite` performs at every child node.
368-
/// When the `Arc` refcount is 1 (always true in the optimizer),
369-
/// `Arc::make_mut` is essentially free.
370368
///
371-
/// The `rule.rewrite()` API takes ownership, so we bridge the `&mut` to an
372-
/// owned value with [`std::mem::take`]. `LogicalPlan::default()` is a cheap
373-
/// empty placeholder (shared empty schema, no allocation) and is overwritten
374-
/// with the rule's output on the very next line.
369+
/// # Error semantics
370+
///
371+
/// On `Err`, `*plan` is left in an **unspecified** state and must not be used.
372+
/// Because `rule.rewrite()` consumes the plan by value, a failing rule drops
373+
/// the node it was handed and the [`std::mem::take`] placeholder
374+
/// (`LogicalPlan::default()`, an `EmptyRelation`) is left in its place — at the
375+
/// root for a top-level failure, or somewhere in a subtree for a failure deeper
376+
/// in the recursion. The pre-rule plan is **not** recoverable here: restoring it
377+
/// would require cloning it before every rule invocation, which is exactly the
378+
/// allocation this function exists to avoid.
379+
///
380+
/// Callers must therefore discard `*plan` on `Err`, or restore it from a copy
381+
/// saved beforehand. [`Optimizer::optimize`] does this: with `skip_failed_rules`
382+
/// it restores `new_plan` from the `prev_plan` clone it already holds, and
383+
/// without it the error aborts the pass and the plan is dropped unobserved.
375384
#[cfg_attr(feature = "recursive_protection", recursive::recursive)]
376385
fn rewrite_plan_in_place(
377386
plan: &mut LogicalPlan,
@@ -382,6 +391,10 @@ fn rewrite_plan_in_place(
382391
// f_down phase
383392
let mut changed = false;
384393
if apply_order == ApplyOrder::TopDown {
394+
// `rule.rewrite()` takes the plan by value, so bridge the `&mut` to an
395+
// owned value with `std::mem::take`. `LogicalPlan::default()` is a cheap
396+
// empty placeholder (shared empty schema, no allocation) and is
397+
// overwritten with the rule's output on the next line.
385398
let owned = std::mem::take(plan);
386399
let result = rule.rewrite(owned, config)?;
387400
*plan = result.data;
@@ -514,6 +527,12 @@ impl Optimizer {
514527
// No subqueries: use in-place rewriting
515528
// with Arc::make_mut for zero-cost CoW on
516529
// children, avoiding Arc unwrap/rewrap.
530+
//
531+
// On error `new_plan` is left in an unspecified
532+
// state (see `rewrite_plan_in_place`); the result
533+
// handling below discards it, restoring `prev_plan`
534+
// when `skip_failed_rules` is set or propagating
535+
// the error otherwise.
517536
rewrite_plan_in_place(
518537
&mut new_plan,
519538
apply_order,

0 commit comments

Comments
 (0)