Commit 9536ee3
miranov25
feat(AliasDataFrame): Add fill handling for subframe joins + restore batch optimization
BUG-2025-11-27-003: Fill Handling for Missing Keys and Invalid Values
Add configurable fill behavior for subframe joins:
- set_global_fill() / set_subframe_fill(): Configure fill values per subframe
- fill_missing: Fill value for missing keys (default NaN)
- fill_nan / fill_inf / fill_invalid: Fill values for invalid data
- fill_mode: 'safe' (default) or 'direct' (fast path)
- Aggregated warnings: Single summary instead of per-column spam
- Uses pd.merge(indicator=True) to distinguish missing keys from data NaN
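
The last bullet can be illustrated with a minimal sketch in plain pandas. It is not the AliasDataFrame code: the frames, the column names, and the constants `FILL_MISSING` and `FILL_NAN` are hypothetical stand-ins for the `fill_missing` and `fill_nan` settings named above.

```python
import numpy as np
import pandas as pd

# Main frame: the row with key 30 has no counterpart in the subframe.
main = pd.DataFrame({"track_id": [1, 2, 3], "key": [10, 20, 30]})

# Subframe: key 20 exists, but its value is a genuine NaN in the data.
sub = pd.DataFrame({"key": [10, 20], "gain": [1.5, np.nan]})

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only".
merged = main.merge(sub, on="key", how="left", indicator=True)

# "left_only" rows had no matching key; "both" rows matched, even if the
# joined value itself is NaN.
missing_key = merged["_merge"] == "left_only"
data_nan = (merged["_merge"] == "both") & merged["gain"].isna()

FILL_MISSING = -1.0  # hypothetical fill value for missing keys
FILL_NAN = 0.0       # hypothetical fill value for NaN present in the data

merged.loc[missing_key, "gain"] = FILL_MISSING
merged.loc[data_nan, "gain"] = FILL_NAN
merged = merged.drop(columns="_merge")
```

Without the indicator column both cases would show up as NaN after the left join and could not be filled differently.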
Restore batch materialization optimization (regression fix):
- materialize_aliases() uses context_override for chained dependencies
- Single pd.concat() instead of per-alias insertion (avoids O(n²) fragmentation)
- Single drop() for cleanup
- Removes _batch_mode parameter (was source of regression)
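
The batch pattern described above might look roughly like the following sketch. It assumes aliases can be modelled as callables over a name-to-Series context; the `aliases` dict and the `context` dict are hypothetical and only mimic what `context_override` is described as doing, not the actual `materialize_aliases()` implementation.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.arange(5, dtype=float), "y": np.ones(5)})

# Hypothetical alias definitions; "c" depends on the aliases "a" and "b".
aliases = {
    "a": lambda ctx: ctx["x"] + ctx["y"],
    "b": lambda ctx: ctx["x"] * 2.0,
    "c": lambda ctx: ctx["a"] * ctx["b"],
}

# Evaluate into a side context so chained aliases can see earlier results
# without inserting any column into df yet.
context = {name: df[name] for name in df.columns}
new_columns = {}
for name, func in aliases.items():
    result = func(context)
    new_columns[name] = result
    context[name] = result

# One concat for all new columns instead of one insertion per alias.
df = pd.concat([df, pd.DataFrame(new_columns, index=df.index)], axis=1)

# One drop for the intermediate aliases that were only needed for "c".
df = df.drop(columns=["a", "b"])
```

Repeated `df[name] = ...` insertions are what trigger pandas' fragmentation `PerformanceWarning` on wide frames; collecting the results first and concatenating once avoids that.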
Performance impact:
- Subframe-heavy workflows: ~3x faster (72s vs 216s on 13.5M rows)
- No DataFrame fragmentation warnings
- Warning spam eliminated (24 warnings → 1 summary)
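
One way to get a single summary instead of per-column warnings is to count problems per column and warn once at the end. The helper below, `fill_invalid_with_summary`, is a hypothetical illustration of that idea and not a function from this commit.

```python
import warnings

import numpy as np
import pandas as pd


def fill_invalid_with_summary(df, fill_value=0.0):
    """Replace NaN/inf in every column and emit one aggregated warning."""
    counts = {}
    out = df.copy()
    for col in out.columns:
        values = out[col].to_numpy(dtype=float)
        bad = ~np.isfinite(values)  # True for NaN, +inf and -inf
        n_bad = int(bad.sum())
        if n_bad:
            counts[col] = n_bad
            out[col] = np.where(bad, fill_value, values)
    if counts:
        detail = ", ".join(f"{col}={n}" for col, n in counts.items())
        warnings.warn(
            f"Filled invalid values in {len(counts)} column(s): {detail}",
            stacklevel=2,
        )
    return out


df = pd.DataFrame({"a": [1.0, np.nan], "b": [np.inf, 2.0], "c": [3.0, 4.0]})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cleaned = fill_invalid_with_summary(df)
```

Here two columns contain invalid values, yet `caught` holds one warning that names both.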
Phase 1 implementation (Phase 2 will add regex patterns and 'fast' mode).
Tests: 544 passed, 2 xfailed (expected)
1 parent: 4312270
3 files changed
Lines changed: 579 additions & 201 deletions
File tree
- UTILS/dfextensions/AliasDataFrame
- tests