Commit edf2f5f
miranov25
fix(benchmarks): Fix L2 Numba reference and add 3-level roofline model
ROOFLINE FIX:
The L2 (Numba @njit) reference was measuring cache-bound performance
because the source array (1288×8 = 41KB) fit entirely in L1 cache.
This gave impossibly fast times (0.0008s, i.e. 80 GB/s for the 64MB output, above RAM bandwidth).
Fix: Use a memory-bound source array matching the output size (2M×8 = 64MB).
Before: L2 = 0.0008s (cache-bound, incorrect)
After: L2 = 0.0039s (memory-bound, correct)
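
The diagnosis above rests on two small calculations: the working-set size of the source array and the bandwidth implied by a timing. A minimal sketch of that arithmetic follows. It assumes 4-byte elements (the size that makes 1288×8 come to about 41KB and 2M×8 to 64MB) and a hypothetical 48 KiB L1 data cache. The function and constant names are illustrative and are not taken from the benchmark code.

```python
# Assumption: 4-byte elements, which reproduces the 41KB and 64MB figures.
ELEMENT_BYTES = 4
# Hypothetical L1 data cache size; the real value depends on the CPU.
L1_CACHE_BYTES = 48 * 1024


def working_set_bytes(rows, cols, element_bytes=ELEMENT_BYTES):
    """Size in bytes of a dense rows x cols array."""
    return rows * cols * element_bytes


def effective_bandwidth_gbs(n_bytes, seconds):
    """Bandwidth in GB/s implied by moving n_bytes in the given time."""
    return n_bytes / seconds / 1e9


small = working_set_bytes(1288, 8)       # old source array
large = working_set_bytes(2_000_000, 8)  # new source array, same size as output

fits_in_l1 = small <= L1_CACHE_BYTES     # True under the assumed cache size

before_gbs = effective_bandwidth_gbs(large, 0.0008)  # cache-bound timing
after_gbs = effective_bandwidth_gbs(large, 0.0039)   # memory-bound timing

print(f"small source: {small} bytes, fits in L1: {fits_in_l1}")
print(f"before: {before_gbs:.1f} GB/s, after: {after_gbs:.1f} GB/s")
```

Under these assumptions the old timing implies a rate well above typical RAM bandwidth, while the new timing lands near the 16 GB/s quoted for L2.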
3-LEVEL REFERENCE MODEL:
L1 Hardware (memcpy): 0.003s (33 GB/s) - theoretical ceiling
L2 Numba (@njit parallel): 0.004s (16 GB/s) - compiled Python ceiling
L3 NumPy (C backend): 0.015s (4 GB/s) - Python ceiling (official)
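
For orientation, an L1-style bulk-copy reference can be approximated with the standard library alone. The sketch below is not the benchmark's implementation. It times a `bytearray` slice copy (a single bulk memory copy in CPython) and keeps the best of several repeats. `best_copy_time` and `bandwidth_gbs` are hypothetical names.

```python
import time


def best_copy_time(n_bytes, repeats=5):
    """Best-of-N wall time for one bulk copy of n_bytes."""
    src = bytearray(n_bytes)
    dst = bytearray(n_bytes)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        dst[:] = src  # same-length slice assignment: one bulk copy
        best = min(best, time.perf_counter() - t0)
    return best


def bandwidth_gbs(n_bytes, seconds):
    """GB/s counting only the bytes written (not read + written)."""
    return n_bytes / seconds / 1e9
```

Calling `best_copy_time(64_000_000)` and passing the result to `bandwidth_gbs` gives a machine-specific figure comparable to the L1 row. The counting convention matters here: counting bytes read plus bytes written doubles the reported GB/s for the same timing.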
EFFICIENCY (direct mode, 0.217s):
vs L3: 6.8% (official metric)
vs L2: 1.8%
vs L1: 1.4%
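
The efficiency figures are the ratio of each reference time to the measured direct-mode time. A short sketch using the rounded times quoted in this message follows; the names are illustrative. With these rounded inputs the L3 ratio comes out at 6.9% rather than the quoted 6.8%, presumably because the commit used unrounded timings.

```python
DIRECT_MODE_S = 0.217  # measured direct-mode time from this commit message

# Reference times as quoted above (rounded).
REFERENCES_S = {
    "L3 NumPy": 0.015,
    "L2 Numba": 0.0039,
    "L1 memcpy": 0.003,
}


def efficiency_pct(reference_s, measured_s):
    """Efficiency as the percentage of the reference speed that was achieved."""
    return 100.0 * reference_s / measured_s


for name, ref_s in REFERENCES_S.items():
    print(f"vs {name}: {efficiency_pct(ref_s, DIRECT_MODE_S):.1f}%")
```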
PROFILER ANALYSIS (where the 97% overhead goes):
NumPy array operations: 54%
Pandas DataFrame ops: 28%
Python interpreter: 15%
Numba kernels: 3% ← algorithm is near-optimal
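
One way to read this breakdown is as an Amdahl's-law bound: speeding up a component that takes fraction p of the runtime cannot gain more than 1/(1-p) overall. The sketch below applies that bound to the percentages above. It illustrates the reasoning and is not part of the benchmark.

```python
def max_speedup(fraction_accelerated, factor=float("inf")):
    """Amdahl's law: overall speedup when a fraction of runtime gets `factor`x faster."""
    remaining = (1.0 - fraction_accelerated) + fraction_accelerated / factor
    return 1.0 / remaining


# Making the Numba kernels (3% of runtime) infinitely fast:
kernel_only = max_speedup(0.03)

# Removing all NumPy/Pandas/interpreter overhead (97% of runtime):
overhead_removed = max_speedup(0.97)

print(f"kernels only: {kernel_only:.3f}x, overhead removed: {overhead_removed:.1f}x")
```

Even perfect kernels would yield about a 3% gain, whereas the overhead categories bound the achievable speedup at roughly 33x.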
Also:
- Changed default subframe_size from 1000 to 1288 (ALICE TPC)
- Added test_l2_timing.py for standalone verification
- Updated README with roofline documentation
Addresses WP4 feedback on reference definition.
Reviewed-by: GPT, Gemini, Claude
1 parent 46d2320 commit edf2f5f
2 files changed
Lines changed: 305 additions & 28 deletions
@@ -108,6 +108,92 @@ (86 lines added at new lines 111-196; diff content not captured)