Commit 6d4f5ea
miranov25
Phase 13.58.ADF: single-tree lazy time-series loading & lazy drawing (D1-D4)
Implements use case 1 of the lazy time-series work: a single TTree read lazily,
drawn through the full gallery surface, loading only the branches each draw needs.
All changes are additive; the eager path is byte-identical by default.
Deliverables (ratified proposal v1.6 + joint test plan v2.0)
------------------------------------------------------------
D1 Dependency-resolver consolidation. get_required_branches() now routes the
inline `expr` through the AST analyzer _analyze_expression() instead of the
old top-level string split, via a new bracket-aware _split_top_level_colon().
_analyze_expression() queries the runtime _registered_functions registry at
parse time, so a call like corr(xM, driftM) treats `corr` as a function, not
a column. Behaviour-preserving for every existing expression; the inline
registered-function and bracket-vector forms (e.g. "[corr(x,y), x]:z" -> {x,y,z})
now resolve correctly. The alias path was already AST-based and is unchanged.
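A minimal sketch of the D1 idea, assuming the draw expressions are parseable by Python's own `ast` module. The names `split_top_level_colon` and `required_names` are hypothetical stand-ins for illustration, not the actual `_split_top_level_colon()` / `_analyze_expression()` code in this commit. It splits on `:` only at bracket depth zero, then collects identifier names while skipping the callee of any call whose name is in a supplied function registry.

```python
import ast


def split_top_level_colon(expr):
    """Split on ':' only where no bracket is open (hypothetical stand-in)."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(expr):
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == ":" and depth == 0:
            parts.append(expr[start:i].strip())
            start = i + 1
    parts.append(expr[start:].strip())
    return parts


def required_names(expr, registered_functions=frozenset()):
    """Return the identifiers an expression needs, minus registered callees."""
    needed = set()
    for part in split_top_level_colon(expr):
        tree = ast.parse(part, mode="eval")
        # Name nodes that are the callee of a registered function call.
        callees = {
            id(node.func)
            for node in ast.walk(tree)
            if isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in registered_functions
        }
        needed |= {
            node.id
            for node in ast.walk(tree)
            if isinstance(node, ast.Name) and id(node) not in callees
        }
    return needed


# With 'corr' registered, only the columns remain.
print(sorted(required_names("[corr(x,y), x]:z", {"corr"})))
```

Without the registry entry, `corr` would be reported as a column, which is the misclassification the D1 text describes.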
D2 Draw-surface branch-scan gap closed. get_required_branches() gains
facet_by / weights / weights_vector / selection_vector parameters, which are
threaded through all three call sites: draw(), draw_batch(), draw_figures().
A branch referenced ONLY via one of these kwargs now pre-loads in lazy mode
(this closes the silent-empty-figure class of failures). Integer count kwargs
(*_bins / *_quantiles) are excluded structurally.
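The kwarg scan in D2 can be sketched as an allow-list walk. This is an illustration under assumptions: `kwarg_branches`, `names_in`, and the two tuples are hypothetical, the kwarg values are assumed to be expression strings (or lists of them) that Python's `ast` can parse, and the real get_required_branches() signature may differ. Because only allow-listed keys are inspected, integer count kwargs such as a `*_bins` value are never scanned at all.

```python
import ast

# Hypothetical allow-lists: only these kwargs carry branch expressions.
EXPR_KWARGS = ("facet_by", "weights")
EXPR_LIST_KWARGS = ("weights_vector", "selection_vector")


def names_in(expr):
    """Identifiers referenced by one expression string."""
    tree = ast.parse(expr, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


def kwarg_branches(**kwargs):
    """Collect branch names referenced only through draw kwargs."""
    needed = set()
    for key in EXPR_KWARGS:
        value = kwargs.get(key)
        if isinstance(value, str):
            needed |= names_in(value)
    for key in EXPR_LIST_KWARGS:
        for item in kwargs.get(key) or ():
            if isinstance(item, str):
                needed |= names_in(item)
    return needed


branches = kwarg_branches(
    facet_by="sector",
    weights="w",
    weights_vector=["w1", "w2 * scale"],
    selection_vector=["(pt > 1) & (eta < 0.8)"],
    facet_by_bins=5,  # integer count kwarg: not in any allow-list, so ignored
)
print(sorted(branches))
```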
D3 LazyTreeReader.estimate_memory() implemented using the real per-branch dtype
item size from the TTree interpretation (not a float32 constant), so it matches
the eager sum(df[col].nbytes) exactly. Removes the single-tree-lazy
AttributeError (ADF.estimate_memory delegated to a method that only existed on
the chain reader). LazyChainReader.estimate_memory docstring corrected (it
documented 3 keys but returns 5).
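To illustrate why D3 uses the per-branch item size rather than a float32 constant, here is a small standard-library sketch. It is not the LazyTreeReader code: `estimate_memory`, its return keys, and the three-branch schema are assumptions, and `array.array` stands in for the eagerly loaded columns.

```python
from array import array


def estimate_memory(schema, n_rows):
    """schema maps branch name -> item size in bytes (hypothetical shape)."""
    per_branch = {name: n_rows * itemsize for name, itemsize in schema.items()}
    return {"total_bytes": sum(per_branch.values()), "per_branch": per_branch}


n_rows = 1000
# Stand-in for eagerly loaded columns: float64, float32 and int8 data.
eager = {
    "time": array("d", [0.0] * n_rows),
    "drift": array("f", [0.0] * n_rows),
    "flag": array("b", [0] * n_rows),
}

# The estimator only needs the item size of each branch, not the data.
schema = {name: col.itemsize for name, col in eager.items()}
estimate = estimate_memory(schema, n_rows)

eager_bytes = sum(len(col.tobytes()) for col in eager.values())
float32_guess = 4 * n_rows * len(eager)

print(estimate["total_bytes"], eager_bytes, float32_guess)
```

In this toy schema the dtype-aware estimate equals the eager byte count, while the flat 4-bytes-per-value guess is off because one column is 8 bytes wide and another is 1 byte wide.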
D4 Gallery integration. build_adf() gains additive lazy= / tree_name= parameters
that swap only the constructor (eager root_to_adf vs read_tree_lazy) and share
every post-read step; the default lazy=False is byte-identical. Sampled-lazy is
rejected with a clear error (out of scope). validate_lazy_vs_eager() harness
added for the alma2 gallery double-run.
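The D4 pattern (swap only the constructor, share everything after it) can be sketched as below. All names here are hypothetical: the readers are injected as plain callables because the real root_to_adf / read_tree_lazy signatures are not shown in this commit message, and `sample` is an assumed name for whatever option selects sampled reading.

```python
def build_adf_sketch(path, eager_reader, lazy_reader, post_steps,
                     lazy=False, tree_name=None, sample=None):
    """Pick the reader by flag, then run identical post-read steps."""
    if lazy and sample is not None:
        # Assumed guard: the sampled + lazy combination is out of scope.
        raise ValueError("lazy=True cannot be combined with sampling")
    adf = lazy_reader(path, tree_name) if lazy else eager_reader(path)
    for step in post_steps:
        adf = step(adf)  # same steps for both modes
    return adf


# Stub readers and one shared post-read step, for illustration only.
def eager_stub(path):
    return {"source": path, "mode": "eager"}


def lazy_stub(path, tree_name):
    return {"source": path, "mode": "lazy", "tree": tree_name}


def add_aliases(adf):
    return {**adf, "aliases": True}


eager_adf = build_adf_sketch("ts.root", eager_stub, lazy_stub, [add_aliases])
lazy_adf = build_adf_sketch("ts.root", eager_stub, lazy_stub, [add_aliases],
                            lazy=True, tree_name="tree")
print(eager_adf["mode"], lazy_adf["mode"])
```

Keeping the branch point to a single line is what makes a lazy-versus-eager double-run a meaningful comparison: any difference in output has to come from the reader, not from diverging post-processing.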
Tests (synthetic, ROOT/PyROOT-free; uproot mktree TTrees, seed 42)
------------------------------------------------------------------
tests/test_phase1358_lazy_timeseries.py (8): loader mechanism + resolver/estimator
gates - only-needed-load, registered-function alias, facet/weights preload,
estimate_memory tolerance 0, resolver regression + colon-grammar, lazy==eager
data, and the bracket-vector + composition resolver lock.
tests/test_phase1358_lazy_draw_invariance.py (21): real adf.draw() / draw_batch() /
draw_figures() lazy vs eager across hist/profile/scatter x selection/group_by/
facet_by/weights/color, registered-function alias (expr- and selection-side),
weights_vector and selection_vector lists, nested alias, multi-draw no-reload,
non-existent-branch negative control, and the D2 loud-raise negative control
(D2 disabled -> ValueError, raw branch). Every load assertion is exact
(loaded == need) with >=3 decoy branches in the fixture.
tests/test_phase1358_gallery_lazy.py (1, env-gated): TIME-SERIES gallery double-run
(validate_lazy_vs_eager); skips without the time-series file / dfdraw. Not calibITS.
tests/test_phase1358_lazy_calibITS.py (4): real-data lazy invariance on the committed
calibITS.root fixture (main tree 'AlignITS5', 28,956 rows, two ADF subframes). A real
profile / group_by / hist draw each loads EXACTLY its branches; lazy stats == eager.
Uses the file's actual schema (not the time-series build_adf). Skips without the fixture.
tests/data/calibITS.root: ~4 MB distributable fixture for the test above.
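The "exact load with decoys" assertion style used by these tests can be sketched with a recording stand-in reader. `RecordingReader` and the fixture below are hypothetical and do not reproduce the real LazyTreeReader or the test fixtures; the point is only that an equality check fails if any decoy branch is loaded, whereas a subset check would not.

```python
class RecordingReader:
    """Toy reader that remembers which branches were requested."""

    def __init__(self, branches):
        self._branches = branches
        self.loaded = set()

    def load(self, names):
        names = set(names)
        unknown = names - set(self._branches)
        if unknown:
            raise KeyError(f"no such branch: {sorted(unknown)}")
        self.loaded |= names
        return {name: self._branches[name] for name in names}


# Two needed branches plus three decoys that a draw must not touch.
fixture = {
    "time": [0, 1, 2],
    "drift": [0.1, 0.2, 0.3],
    "decoy_a": [0, 0, 0],
    "decoy_b": [1, 1, 1],
    "decoy_c": [2, 2, 2],
}

reader = RecordingReader(fixture)
need = {"time", "drift"}
reader.load(need)

# Exact equality, not a subset test: a stray decoy load fails here.
assert reader.loaded == need
```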
Note: ADF selection is numexpr-style (& / |), not C-style (&& / ||); the logical-
selection test uses & (&& raises in numexpr).
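A short sketch of the selection-syntax point, using Python's own parser as a stand-in rather than calling numexpr, on the assumption that the selection grammar follows Python operator syntax. `check_selection` is a hypothetical helper, not part of ADF. The second half shows why comparisons should be parenthesized: `&` binds more tightly than `>` and `<`.

```python
import ast


def check_selection(selection):
    """Reject C-style logical operators before the expression is evaluated."""
    for token in ("&&", "||"):
        if token in selection:
            raise ValueError(
                f"C-style {token!r} is not supported; use {token[0]!r} "
                "between parenthesized comparisons"
            )
    ast.parse(selection, mode="eval")  # raises SyntaxError if malformed
    return selection


ok = check_selection("(a > 1) & (b < 2)")

# Precedence: without parentheses this parses as the chained comparison
# a > (1 & b) < 2, not as a logical AND of two comparisons.
unparenthesized = ast.parse("a > 1 & b < 2", mode="eval").body
parenthesized = ast.parse("(a > 1) & (b < 2)", mode="eval").body
print(type(unparenthesized).__name__, type(parenthesized).__name__)
```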
Governance / taxonomy
---------------------
tests/feature_taxonomy.py: + LAZY.timeseries_draw (mapping the three test files);
total 50 -> 51.
tests/test_phase_13_56_adf_post_audit.py: TG4 count-lock bumped 50 -> 51 in the
same commit (per the plan).
Test baseline (full run_tests.sh, alma2, 12 workers)
----------------------------------------------------
1731 passed / 8 failed / 1 error / 9 skipped after the TG4 fix. The 8 failures and
1 error are all PRE-EXISTING (identical to the run before this work) and outside the
D1-D4 surface:
- K1_3_draw_batch_forwards_batch_kwargs, K2_3_production_reproducer_mirror
  (the separate vector-kwarg change on this branch, not D1-D4)
- AliasDataFrameRDF: missing_keys_in_friend, collision_from_friend_tree,
  composite_index_friend
- invariance_backend I2_6 (numba vs numpy chained subframe)
- invariance_compression I4_2, I4_3
- ERROR: test_schema_serialization.py
This change adds 21 passing tests and zero net failures.
Not included (closure-pending)
------------------------------
- Real-data lazy gate: SATISFIED by tests/test_phase1358_lazy_calibITS.py on the committed
calibITS.root fixture (4 passing). The time-series gallery double-run
(test_phase1358_gallery_lazy.py) remains env-gated on the large time-series file and is
optional, not the closure gate.
- T11 (subframe-column lazy draw, skipif-gated): calibITS HAS subframes (R, AlignDzITS5),
so a subframe-column draw test can be added against the committed fixture (follow-up).
- docs/CAPABILITY_MATRIX.md regeneration (deferred to the closure commit, not this one).
Refs: PHASE_13_58_ADF proposal v1.6; joint test plan v2.0; panel summaries v1.0/v2.0.1
Parent: f37208c; commit: 6d4f5ea
10 files changed
Lines changed: 907 additions & 16 deletions
File tree
- UTILS/dfextensions/AliasDataFrame
- examples/time_series
- tests