**Risk profile**: Low — single default value flip, all existing files remain readable, ROOT-native algorithm choice. The only behavior change is faster I/O on freshly-written files.
### Phase 13.35.ADF: Vector Kwargs Alias Pre-Materialization + `vector_compose` Auto-Force
**Dates**: 2026-05-18
**Status**: ✅ Merged
**Commit**: `879a0835`
**Base**: `a6a5b6e8` (BUG_AliasDataFrame_20260518 Phase A close)
**Tag**: `PHASE_13_35_ADF_END`
**Coder**: Claude36 (Opus 4.7)
**Sister phase**: dfdraw Phase 13.27.DF Commit 2

Two coupled fixes at the ADF→dfdraw boundary, both producing failures-of-the-week in production calibration workflows:

1. **Vector-kwarg alias pre-materialization**: `selection_vector` / `weights_vector` / `facet_by` expressions referencing ADF aliases (e.g. `selection_vector=["(abs(sector-13)<2)", "(abs(sector-13)>=2)&(sector<36)"]` where `sector` is an alias) raised `UndefinedVariableError` because `draw()` / `draw_batch()` / `draw_figures()` forwarded kwargs to dfdraw without first materializing the referenced aliases. `pandas.eval` inside dfdraw then failed on the unresolved alias name.

2. **`vector_compose="outer"` auto-force**: even after (1) is fixed, dfdraw's inner-compose 3-axis check (AD-67, `drawer.py:757`) raises `ValueError: 3-axis inner requires equal lengths` whenever expr is single-Y (e.g. `"nClITS:time_s"`) and `selection_vector` (or `weights_vector`) has >1 element. The architect's production §1.4 call works only because `normalize="delta"` silently sets `vector_compose="outer"` inside dfdraw, a fragile coupling that breaks for users who do not pass `normalize=`. Spec v1.2 §6 row #3 deferred this auto-force to "Phase 13.33.DF or FIX2"; it landed here instead, at architect direction, after production validation showed that the §1.4 reproducer needs both fixes together.
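
The pre-materialization step in fix (1) can be sketched as a regex scan over the vector kwargs. This is a minimal illustration, not the actual ADF code: the function name `aliases_referenced_by_vector_kwargs`, the `VECTOR_KWARGS` tuple, and the plain-set representation of alias names are assumptions, and the sketch does not model the `facet_by` channel-enum guard.

```python
import re

# Kwargs whose values may carry expressions (names taken from the text above).
VECTOR_KWARGS = ("selection_vector", "weights_vector", "facet_by")

# Python-style identifiers; a regex tokenizer only approximates a real parser.
_IDENT = re.compile(r"[A-Za-z_]\w*")


def aliases_referenced_by_vector_kwargs(kwargs, alias_names):
    """Return the sorted alias names mentioned in any vector kwarg expression."""
    found = set()
    for key in VECTOR_KWARGS:
        value = kwargs.get(key)
        if value is None:
            continue
        # Accept a single expression string or a list/tuple of them.
        exprs = [value] if isinstance(value, str) else list(value)
        for expr in exprs:
            if isinstance(expr, str):
                found.update(tok for tok in _IDENT.findall(expr) if tok in alias_names)
    return sorted(found)


kwargs = {
    "selection_vector": ["(abs(sector-13)<2)", "(abs(sector-13)>=2)&(sector<36)"],
}
# `dEdxNorm` is a made-up second alias, present only to show that unused aliases are skipped.
needed = aliases_referenced_by_vector_kwargs(kwargs, alias_names={"sector", "dEdxNorm"})
print(needed)  # ['sector']
```

In the real entry points, the resulting names would be handed to `materialize_aliases(...)` before the kwargs are forwarded to dfdraw, so that `pandas.eval` sees `sector` as an ordinary column.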

**Implementation**: two sibling helpers in `AliasDataFrame.py`:

- `_normalize_vector_compose_kwargs(kwargs, expr)`: auto-forces `vector_compose="outer"` when expr is single-Y AND (`selection_vector` OR `weights_vector` has >1 element). Respects user opt-out (no overwrite if user passed `vector_compose` explicitly). No-op for multi-Y expressions.
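
The decision rules listed for `_normalize_vector_compose_kwargs` can be written down as a short sketch. The function names below are illustrative, and the Y-count parser rests on an assumption that this document does not confirm: that multiple Y expressions are comma-separated before the first `:` of a `"Y:X"` expression.

```python
def count_y_expressions(expr):
    """Count Y expressions in a 'Y:X' draw expression.

    ASSUMPTION: multiple Y expressions are comma-separated before the first ':'
    and commas inside parentheses or brackets belong to function calls.
    """
    y_part = expr.split(":", 1)[0]
    depth, count = 0, 1
    for ch in y_part:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif ch == "," and depth == 0:
            count += 1
    return count


def normalize_vector_compose_kwargs(kwargs, expr):
    """Set vector_compose='outer' in place when the documented conditions hold."""
    if "vector_compose" in kwargs:        # user opt-out: never overwrite
        return kwargs
    if count_y_expressions(expr) != 1:    # multi-Y expression: no-op
        return kwargs
    for key in ("selection_vector", "weights_vector"):
        value = kwargs.get(key)
        if isinstance(value, (list, tuple)) and len(value) > 1:
            kwargs["vector_compose"] = "outer"
            break
    return kwargs


forced = normalize_vector_compose_kwargs(
    {"selection_vector": ["sector<18", "sector>=18"]}, "nClITS:time_s"
)
print(forced["vector_compose"])  # outer
```

A 1-element `selection_vector`, a multi-Y expression, or an explicit `vector_compose="inner"` from the caller all leave the kwargs unchanged, matching the four cases that V1.8 exercises.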

Both helpers are wired into all 3 draw entry points (`draw()` at method entry; `draw_batch()` / `draw_figures()` per-spec/per-plot). For batch methods, the helpers mutate the ORIGINAL spec dict (not `_merged_spec`), following the existing in-place mutation pattern at `AliasDataFrame.py:12121` (subframe replacement loop).
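
The difference between mutating the original spec and mutating a merged copy can be shown with plain dicts. This is only an illustration of Python dict semantics, assuming the merged spec is built as a shallow copy; `defaults`, `spec`, and `merged` are stand-in names, not the actual `draw_batch()` variables.

```python
defaults = {"bins": 50}
spec = {"expr": "nClITS:time_s", "selection_vector": ["sel_a", "sel_b"]}

merged = {**defaults, **spec}        # shallow copy: a separate top-level dict
merged["vector_compose"] = "outer"   # the change stays local to the copy
print("vector_compose" in spec)      # False

spec["vector_compose"] = "outer"     # in-place mutation of the original spec
print(spec["vector_compose"])        # outer
```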

- V1.5: `draw_batch` per-spec materialization (uses `clear_after=False` to make materialization observable post-call; by default, `draw_batch` drops materialized aliases after the batch completes)
- V1.7: `facet_by` channel-enum negative branch, which must NOT materialize (guard test)
- V1.8: auto-force helper direct unit test: positive (sel + weights), negative (multi-Y), user-explicit-inner respect, 1-element no-op

**Production validation**: real ALICE TPC data, ~9.86M tracks, on alma2 (2026-05-18 13:34). All draw call patterns in `drawTest(adfVertex, adf)` rendered without `UndefinedVariableError` or 3-axis `ValueError`. Two downstream dfdraw-layer bugs surfaced but are outside Phase 13.35.ADF scope:

- `auto_title=True` not honored: the figure shows the matplotlib default title across all renders
- `normalize="ratio"` returns 1.0 instead of the computed early/late ratio

Both were handed to the dfdraw team for separate bug filing.

**Test count**: 1625 passed, 10F+1E, 8 skipped at commit time. The +3 failures vs Phase A's 7F+1E baseline (`test_save_and_load_integrity`, `test_backward_compatibility_no_compression_info`, `test_roundtrip_save_load`) are the documented parallel-execution flake cluster, the same pattern noted in the Phase 13.27.ADF (`bbedd90b`) commit message. Diagnostic 2026-05-18: `pytest <3 tests> -p no:xdist` → **15/15 pass in isolation**, confirming a parallel-worker artifact, not a Phase 13.35.ADF regression.

**Reviewer cycle notes**:

- v1.0 proposal (Sonnet1) → Claude36 reviewed with `[!]` (P1 scope gap: only `draw()` patched, missing `draw_batch` / `draw_figures`)
- v1.1 (Claude36 drafted, 3-call-site scope) → 5-reviewer panel found 2 P1s (V4 channel-enum negative test missing; §9 audit incomplete)
- Commit-time review: 4 reviewers issued `[X]` flagging 3 regressions; diagnostic disproved the in-place-mutation root cause hypothesis; architect closed on authority

**Methodology lessons** (for next Coder/Reviewer QRC revision):

- **Verbal scope expansion during implementation needs a spec amendment.** Spec v1.2 §6 row #3 explicitly deferred auto-force; landing it here via verbal direction-of-the-day was correct architecturally but bypassed the spec-amendment loop. CRR §5 documented this honestly, but in future: produce the spec v1.3 amendment BEFORE coding, not as a post-hoc CRR note.
- **Coder post-hoc baseline revision is a Failure Mode.** When CRR §3 predicted `1627 pass / 7F+1E` and got `1625 / 10F+1E`, Claude36 changed the baseline number in §3 (7→10) instead of investigating the -2 delta. Reviewers caught it and demanded a diagnostic. The diagnostic vindicated the result, but the process was wrong: investigate first, narrate second. Candidate Failure Mode for Coder QRC: *"Numbers-revised-to-fit."*
- **Reviewer panel discipline was correct.** Sonnet1/2/3/4 demanded a diagnostic before approval, which was exactly the right call. Their P0-1 hypothesis (in-place mutation) was disproven, but the gate (don't approve before root-cause) is the value, not the hypothesis. Worth a Reviewer QRC note: *"verdict on diagnostic, not on hypothesis"*.
- **Documented parallel-execution flake pattern recurring.** Third independent recurrence of the `test_alias_dataframe.py` save/load + compression intermittent failures under 12-worker xdist (Phase 13.27.ADF + Phase 13.35.ADF + the 2026-04 incident referenced in the Phase 13.27 commit). The pattern is consistent: the tests pass deterministically in isolation and fail intermittently under parallelism. Worth formal bug-filing on next recurrence; consider `@pytest.mark.serial` or a worker-count cap.

**Closed deferred items** (from spec v1.2 §6):

- Row #3: `vector_compose='outer'` auto-forcing for single-Y + N-element selection_vector → CLOSED by this phase.

**Phase B marker**: Both helpers carry inline `Phase B marker` comments: the regex tokenizer in `_ensure_vector_kwargs_aliases` and the Y-count parser in `_normalize_vector_compose_kwargs` should be folded into the AST resolver consolidation when that phase lands.

**Severity**: P0. Cold draw of `Subframe.aliased_column` raised `UndefinedVariableError` whenever the subframe column was an ADF alias not yet materialized into the subframe's DataFrame. Production reproducer: `adfVertex.draw("vertex_x_intercept:vC.vertex_x_intercept_decomp")` fails cold because `vC.vertex_x_intercept_decomp` is an alias on the vC subframe, not a raw column.

**Problem**: Four draw-time resolver sites assumed that any `Subframe.col` reference resolves to a raw column on the subframe's DataFrame. When `col` is an ADF alias on that subframe (the common pattern for compressed/decompressed columns in calibration QA), the lookup miss propagated as `UndefinedVariableError`. The error pointed at the rewritten flat reference (e.g. `vertex_x_intercept_decomp__vC`), never at the actual cause (the column needed lazy materialization on the subframe).

**Sites patched** (all in `AliasDataFrame.py`):

| Method | Source line (approx) | Level |
|---|---|---|
| `draw()` | 11036 | Single-level |
| `draw_batch()` | 12060 | Single-level |
| `draw_figures()` | 12360 | Single-level |
| `_scatter_subframe_column` | 3108 | Multi-level |

Each site now calls `sf_adf.materialize_aliases([col_name])` on the subframe before the join.
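
The materialize-before-lookup order can be illustrated with a toy model. Everything below is a hypothetical stand-in: `ToySubframe`, `resolve_subframe_column`, the raw column `vertex_x_intercept_q`, and the 0.5 scale factor are invented for the example. Only the `materialize_aliases([col_name])` call shape is taken from the text above.

```python
class ToySubframe:
    """Hypothetical stand-in for a subframe: raw columns plus lazily computed aliases."""

    def __init__(self, columns, aliases):
        self.columns = dict(columns)   # name -> list of values (already materialized)
        self.aliases = dict(aliases)   # name -> function(columns) -> list of values

    def materialize_aliases(self, names):
        # Compute each requested alias once and store it as a regular column.
        for name in names:
            if name not in self.columns and name in self.aliases:
                self.columns[name] = self.aliases[name](self.columns)


def resolve_subframe_column(sf, col_name):
    """Materialize first, then look up; a bare lookup fails for a cold alias."""
    sf.materialize_aliases([col_name])
    return sf.columns[col_name]


vC = ToySubframe(
    columns={"vertex_x_intercept_q": [10, 20, 30]},
    aliases={
        "vertex_x_intercept_decomp": lambda c: [0.5 * v for v in c["vertex_x_intercept_q"]],
    },
)
decomp = resolve_subframe_column(vC, "vertex_x_intercept_decomp")
print(decomp)  # [5.0, 10.0, 15.0]
```

Raw columns pass through the same path unchanged, because `materialize_aliases` skips names that are already present. That is the behavior the S15 regression guard protects.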

**Silent-swallow cleanup** (completes the remediation begun at `c1f77b06`): 4× `except Exception: pass` blocks in the draw resolver paths were replaced with `warnings.warn(...)` to surface previously-masked errors. This aligns with the diagnostic-improvement direction of BUG_20260517: drawer paths no longer hide their failures.
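
The before/after shape of that cleanup looks roughly like this. It is a sketch only: the function names, the warning category (`RuntimeWarning`), and the message text are assumptions, since the actual messages in `AliasDataFrame.py` are not shown here.

```python
import warnings


def resolve_quietly(resolver, name):
    """Old pattern: any failure disappears without a trace."""
    try:
        return resolver(name)
    except Exception:
        pass
    return None


def resolve_with_warning(resolver, name):
    """New pattern: the failure stays non-fatal but is reported to the caller."""
    try:
        return resolver(name)
    except Exception as exc:
        warnings.warn(
            f"could not resolve {name!r}: {type(exc).__name__}: {exc}",
            RuntimeWarning,
            stacklevel=2,
        )
    return None


def failing_resolver(name):
    raise KeyError(name)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolve_quietly(failing_resolver, "vC.col")       # emits nothing
    resolve_with_warning(failing_resolver, "vC.col")  # emits one RuntimeWarning

print(len(caught))  # 1
```

Both variants still return `None` on failure, so control flow is unchanged; only the visibility of the error differs.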

**Tests**: S10–S19 (10 invariance tests in `tests/test_S10_draw_subframe_alias.py`):

- S10–S12: single-level / multi-level / compound-expression alias resolution
- S13–S14: alias in arithmetic / alias in selection
- S15: raw column still works (regression guard)
- S16–S17: `draw_batch` / `draw_figures` paths
- S18: multilevel alias on inner subframe
- S19: cold draw, no workaround (production reproducer)

10/10 pass in 6.45s parallel.

**Production validation**: cold draw on alma2 with real ALICE TPC data (~9.86M tracks, 986 quantile bins). `adfVertex.draw("vertex_x_intercept:vC.vertex_x_intercept_decomp")` produces correlation 0.9999 between signal and decompressed reference, with no workaround (no pre-call `materialize_aliases([...])` needed).

- [ ] **dfdraw `auto_title` not honored**: surfaced by Phase 13.35.ADF production validation; the figure shows the matplotlib default title across all renders despite the `auto_title=True` kwarg being passed (handed to dfdraw team for separate bug filing as `BUG_dfdraw_20260518_auto_title_not_honored.md`)
- [ ] **dfdraw `normalize="ratio"` returns 1.0**: same Phase 13.35.ADF production session; the bottom panel shows exactly 1.0 instead of the computed early/late ratio (handed to dfdraw team as `BUG_dfdraw_20260518_normalize_ratio_returns_one.md`)
- [ ] **Parallel-execution flake cluster**: `test_save_and_load_integrity`, `test_backward_compatibility_no_compression_info`, `test_roundtrip_save_load` intermittently fail under 12-worker xdist (third documented recurrence; pass deterministically in isolation). Consider `BUG_AliasDataFrame_20260518_compression_save_load_parallel_flake.md` and/or `@pytest.mark.serial` or a worker-count cap.

- [ ] **Coder QRC Failure Mode candidate (from Phase 13.35.ADF)**: *"Numbers-revised-to-fit"*. When the CRR-predicted test count misses the actual count, do not rewrite the prediction to match the result; investigate the delta first, narrate second. Phase 13.35.ADF: Claude36 changed baseline 7F+1E → 10F+1E in CRR §3 to match the observed result instead of investigating why the prediction missed. Reviewer panel correctly caught it.
- [ ] **Coder QRC reminder (from Phase 13.35.ADF)**: verbal architect direction mid-implementation that expands scope beyond the approved spec should produce a spec amendment (v1.x → v1.x+1) BEFORE coding, not as a post-hoc CRR §5 note. Phase 13.35.ADF auto-force was architecturally correct but the process bypassed the spec-amendment loop.
- [ ] **Reviewer QRC reminder (from Phase 13.35.ADF)**: "verdict on diagnostic, not on hypothesis". When reviewers hypothesize a root cause for an anomaly, the gate is the diagnostic that confirms/refutes it, not the hypothesis itself. Phase 13.35.ADF: Sonnet4 hypothesized in-place mutation; diagnostic disproved it; reviewer discipline (demanding diagnostic before approval) was the value, not the specific hypothesis.