Skip to content

Fix MERGE16_FIXME: cleanup AO vacuum, PartitionSelector, and qual pushdown#1820

Open
lss602726449 wants to merge 9 commits into
apache:mainfrom
lss602726449:fix-merge16-fixmes
Open

Fix MERGE16_FIXME: cleanup AO vacuum, PartitionSelector, and qual pushdown#1820
lss602726449 wants to merge 9 commits into
apache:mainfrom
lss602726449:fix-merge16-fixmes

Conversation

@lss602726449

Copy link
Copy Markdown
Contributor

Summary

  • Remove unnecessary vacuum_set_xid_limits calls for AO/AOCO tables: AO tables don't have per-tuple transaction IDs, so freeze/cutoff computation is meaningless. Removed the calls from vacuum_ao.c, appendonlyam_handler.c (CLUSTER), and aocsam_handler.c (CLUSTER). Fixed cluster.c to properly reset relminmxid for AO tables. Deleted the now-unused vacuum_set_xid_limits function entirely.

  • Remove incorrect FIXME in PartitionSelector: The CreatePartitionPruneState call is correct for Cloudberry's join-based partition pruning (distinct from ExecInitPartitionPruning which adds initial pruning + subplan map renumbering). Also removed two dead declarations (ExecCreatePartitionPruneState, ExecFindInitialMatchingSubPlans) left over from PG14.

  • Add UNSAFE_HAS_SUBPLAN flag for qual pushdown safety: Replaced the incorrect reuse of UNSAFE_NOTIN_PARTITIONBY_CLAUSE (a "soft" unsafe flag that still allows window run conditions) with a proper UNSAFE_HAS_SUBPLAN flag (hard block) for output columns containing subplans. This prevents unsafe pushdown of subplan-containing quals into subqueries in Cloudberry's distributed execution model.

Test plan

  • Compilation passes with --enable-cassert
  • make installcheck regression tests pass
  • isolation2 tests pass
  • Verify AO/AOCO VACUUM and CLUSTER operations work correctly
  • Verify partition pruning with PartitionSelector still works

AO/AOCO tables have no per-tuple xmin/xmax -- visibility is managed
via visibility map at segment level, not per-tuple transaction IDs.
The freeze limits computed by vacuum_set_xid_limits are meaningless
for AO tables. Worse, passing MultiXactCutoff to vac_update_relstats
(vacuum) or swap_relation_files (CLUSTER) incorrectly sets relminmxid
on AO tables (whose relminmxid should remain InvalidMultiXactId),
causing them to unnecessarily participate in database-wide datminmxid
calculation.

Fix by:
- vacuum_ao.c: remove vacuum_set_xid_limits call, pass Invalid values
  directly to vac_update_relstats
- appendonlyam_handler.c / aocsam_handler.c: remove vacuum_set_xid_limits
  call in copy_for_cluster, return Invalid values to caller
- cluster.c: relax MultiXactId assert to allow InvalidMultiXactId,
  and reset relminmxid to InvalidMultiXactId for AO tables (matching
  the existing relfrozenxid override)

vacuum_set_xid_limits was a pre-PG16 API kept only for AO callers.
With all callers removed, delete the function and its declaration.
The CreatePartitionPruneState() call in nodePartitionSelector.c is
correct -- PartitionSelector only needs the pruning data structure,
not the initial pruning and subplan map renumbering that
ExecInitPartitionPruning() adds on top. Remove the incorrect FIXME.

Also remove two dead declarations in execPartition.h:
- ExecCreatePartitionPruneState: renamed to CreatePartitionPruneState
  in PG15 (commit 297daa9), declaration was never cleaned up
- ExecFindInitialMatchingSubPlans: folded into ExecFindMatchingSubPlans
  in the same refactor, declaration was never cleaned up
The subplan check in check_output_expressions was incorrectly using
UNSAFE_NOTIN_PARTITIONBY_CLAUSE, which only prevents normal pushdown
but still allows the qual to be pushed as a window run condition.
Subplans in output expressions should completely block pushdown in
Cloudberry's distributed execution model.

Add a dedicated UNSAFE_HAS_SUBPLAN flag and include it in the fully
unsafe set in qual_is_pushdown_safe, so quals referencing output
columns containing subplans are never pushed down.
@tuhaihe tuhaihe requested review from chenjinbao1989 and reshke June 16, 2026 03:43
The pgstat infrastructure is fully functional now. Remove the early
return in ReportTemporaryFileUsage that disabled both temp file stats
reporting and log_temp_files logging. Also restore the assert in
pgstat_assert_is_up.
…c_mpp_query

The commented-out code stripped write permissions from RTEs on non-root
slices. This is unnecessary because InitPlan already skips permission
checks on non-writer segments (execMain.c:1821). Additionally, PG16
moved requiredPerms from RTE to RTEPermissionInfo, making the original
code incompatible.
The forceoverwrite check is unnecessary here because
verify_dir_is_empty_or_create already handles non-empty directories
before tar extraction begins.
Re-enable segwalrep/select_throttle in isolation2_schedule (verified
passing) and workfile_mgr_test in singlenode_isolation2_schedule.
The pg_waldump command works correctly with appendonly WAL records.
Remove start_ignore/end_ignore wrappers and update expected output
to use matchsubs-masked format consistent with the isolation2 version.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant