[UT][VL] Refresh TPC-H q19 plan stability golden file #12374
The ExprId normalizer in GlutenPlanStabilitySuite uses regex `#\d+` which inadvertently matches TPC-H string literals such as Brand#11, Brand#12, Brand#13 (p_brand values in q19's filter). Over the 264 commits since the golden file was added in apache#11805, new optimizer rules shifted the ExprId counter so Brand#12 now normalizes to Brand#6 and _pre_1#14 to _pre_1#13, causing a spurious plan mismatch.
Regenerated by running GlutenTPCHPlanStabilitySuite with SPARK_GENERATE_GOLDEN_FILES=1. Only q19/explain.txt changes; simplified.txt and all other queries are unaffected.
Verified: q19 fails on main without this fix (21/22); passes with it (22/22).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closing: the q19 golden does not actually need refreshing. After further investigation, the q19 "failure" only reproduces when Confirmed by #12151 (which touches Velox backend Scala and therefore runs The underlying normalizer limitation is tracked in #12375; a static golden refresh can't fix it reliably anyway (the normalization is run-context dependent). Closing as not needed.
Fixes #12375.
Problem
`GlutenTPCHPlanStabilitySuite` → `tpch/q19` has been failing in `spark-test-spark40` CI runs for PRs that touch Velox backend Scala files.
Root cause
`GlutenPlanStabilitySuite.glutenNormalizeIds()` uses the regex `(?<prefix>(?<!id=)#)\\d+L?`, which matches any `#<number>` in the explain text that is not preceded by `id=`, including TPC-H string literals. The `p_brand` filter in q19 uses the values `Brand#11`, `Brand#12`, `Brand#13` (actual TPC-H spec data values). These appear unquoted in the explain output.
The normalizer incorrectly treats `#12` as an ExprId and remaps it sequentially. The suite code itself warns about this at lines 67–68.
What changed
The golden file was committed in #11805 (`c37fee4e5`, 2026-03-24). Since then 264 commits have landed on `main`, shifting the ExprId counter. `Brand#12` now normalizes to `Brand#6` and `_pre_1#14` shifts to `_pre_1#13`.
Exact diff (original vs current):
Evidence that this is pre-existing
Ran `GlutenTPCHPlanStabilitySuite` on `main` at commit `6097b59a6` (2026-06-25, [MINOR][VL] Build Arrow 18 with patch for Power #12344), without any pending PR applied: q19 fails (21/22).
Then regenerated with `SPARK_GENERATE_GOLDEN_FILES=1` and re-ran: q19 passes (22/22).
Only `q19/explain.txt` changed. `simplified.txt` and all other queries (q1–q18, q20–q22) are unaffected.
Why it only surfaces on PRs touching Velox backend Scala files
`spark-test-spark40` is only triggered when Velox backend Scala files are modified. Most PRs touch native C++ code, docs, or non-Velox modules and never trigger this check.
Fix
Regenerated `q19/explain.txt` by running `GlutenTPCHPlanStabilitySuite` with `SPARK_GENERATE_GOLDEN_FILES=1 SPARK_ANSI_SQL_MODE=false`.
A proper long-term fix (tracked in #12375) would be to make `glutenNormalizeIds` skip `#N` occurrences inside string literal contexts.
Impact
`gluten-ut/spark40/src/test/resources/backends-velox/gluten-tpch-plan-stability/q19/explain.txt` changes
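To make the root cause concrete, here is a minimal Python sketch of how a sequential id normalizer can turn the unquoted literal `Brand#12` into different values depending on the surrounding ExprIds. It is an illustration under assumptions, not the suite's code: the renumbering scheme (1, 2, 3, ... by first appearance) is inferred from the description "remaps it sequentially", the two plan fragments are made up, the named group uses Python's `(?P<prefix>...)` syntax, and `GUARDED_EXPR_ID` is a hypothetical guard, not the fix tracked in #12375.

```python
import re

# Same shape as the regex quoted in the description, with the named group
# written in Python syntax. Matches "#<digits>" (optionally followed by "L")
# unless the "#" is directly preceded by "id=".
EXPR_ID = re.compile(r"(?P<prefix>(?<!id=)#)\d+L?")

# Hypothetical guard (an assumption for illustration only): also refuse to
# match a "#" that directly follows the literal prefix "Brand".
GUARDED_EXPR_ID = re.compile(r"(?P<prefix>(?<!id=)(?<!Brand)#)\d+L?")


def normalize_ids(plan, pattern=EXPR_ID):
    """Renumber every matched id as 1, 2, 3, ... in order of first appearance.

    Assumed behavior, not copied from glutenNormalizeIds. The optional "L"
    suffix is dropped from the output in this sketch.
    """
    mapping = {}

    def renumber(match):
        token = match.group(0)
        if token not in mapping:
            mapping[token] = str(len(mapping) + 1)
        return match.group("prefix") + mapping[token]

    return pattern.sub(renumber, plan)


# Two made-up plan fragments with identical structure. Only the raw ExprId
# counter differs (shifted by 5 in run_b); the data literal Brand#12 does not.
run_a = "Project [p_partkey#10, p_brand#12] Filter (p_brand#12 = Brand#12)"
run_b = "Project [p_partkey#15, p_brand#17] Filter (p_brand#17 = Brand#12)"

print(normalize_ids(run_a))
print(normalize_ids(run_b))
print(normalize_ids(run_a, GUARDED_EXPR_ID) == normalize_ids(run_b, GUARDED_EXPR_ID))
```

With the unguarded pattern, `run_a` normalizes to `... Filter (p_brand#2 = Brand#2)` because the literal collides with the real ExprId `#12`, while `run_b` normalizes to `... Filter (p_brand#2 = Brand#3)`. The plans are structurally identical, yet the normalized text differs, which is the kind of spurious mismatch described above. With the guarded pattern both runs normalize to the same text and the last line prints `True`. A prefix-specific lookbehind is deliberately narrow; a real fix would need a more general way to recognize literal contexts.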