Commit dade382

Author: miranov25
feat(benchmarks): Complete benchmark infrastructure with history and profiling
Add comprehensive benchmark infrastructure for performance tracking:

- Row count configuration: quick=500K, default=1M, full=2M rows
- Profile naming: bench_<component>_<scenario>_<timestamp>_<commit>.prof
- History archiving: Every run archived with git commit info
- Diff command: Compare arbitrary history files with threshold detection
- History analysis: DataFrame utilities (long/wide format) for custom queries

New files:
- history_analysis.py: Load history into pandas DataFrames

Modified files:
- benchmark_materialize_aliases.py: --full flag, profile naming, row counts
- baseline_utils.py: diff command, get_git_info()
- run_benchmark.sh: --full flag passthrough
- README.md: Documentation for new features

Usage:
  ./run_benchmark.sh --full                          # Full analysis with profiling
  python baseline_utils.py diff A.json B.json        # Compare runs
  python history_analysis.py list results/history/   # List metrics

Part of benchmark infrastructure for Phase 3 join optimization.
1 parent 18caba7 commit dade382

6 files changed

Lines changed: 1245 additions & 21 deletions


UTILS/dfextensions/AliasDataFrame/benchmarks/README.md

Lines changed: 102 additions & 1 deletion
@@ -15,6 +15,99 @@ Performance benchmarks for AliasDataFrame operations.
```
./run_benchmark.sh --synthetic-only
```

## New Features

### Full Analysis Mode

Run complete benchmark with profiling, baseline comparison, and history archiving:

```bash
./run_benchmark.sh --full
```

This enables:
- Profiler output (`.prof` and `.txt` files)
- Baseline comparison
- Automatic history archiving with git info

### Profiler Output

Generate detailed profiler output for performance analysis:

```bash
./run_benchmark.sh --profile

# Or combined with full analysis
./run_benchmark.sh --full
```

Profile files are saved to `results/profiles/` with naming:
```
bench_<component>_<scenario>_<timestamp>_<commit>.prof
```
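
The naming pattern above can be built and parsed mechanically. The following is a minimal sketch, not the code from `benchmark_materialize_aliases.py`: the helper names `profile_name` and `parse_profile_name` are hypothetical, and it assumes that `<component>` and `<scenario>` contain no underscores and that `<timestamp>` has the `YYYYMMDD_HHMMSS` form seen in the example file names.

```python
import re
from datetime import datetime

# Hypothetical pattern: assumes component and scenario contain no underscores
# and the timestamp is YYYYMMDD_HHMMSS, as in the example file names.
PROFILE_RE = re.compile(
    r"^bench_(?P<component>[a-z0-9]+)_(?P<scenario>[a-z0-9]+)_"
    r"(?P<timestamp>\d{8}_\d{6})_(?P<commit>[0-9a-f]+)\.prof$"
)


def profile_name(component, scenario, when, commit):
    """Build a profile file name from its four parts (hypothetical helper)."""
    return f"bench_{component}_{scenario}_{when:%Y%m%d_%H%M%S}_{commit}.prof"


def parse_profile_name(filename):
    """Split a profile file name back into its parts (hypothetical helper)."""
    match = PROFILE_RE.match(filename)
    if match is None:
        raise ValueError(f"not a profile file name: {filename!r}")
    fields = match.groupdict()
    # Convert the timestamp text into a datetime for sorting and filtering.
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%Y%m%d_%H%M%S")
    return fields


fields = parse_profile_name("bench_materialize_safe_20251130_164906_18caba76.prof")
print(fields["component"], fields["scenario"], fields["commit"])
```

Parsing names this way makes it possible to select profiles by commit or scenario without opening the files.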

Analyze profiles with standard Python tools:
```python
import pstats
p = pstats.Stats('results/profiles/bench_materialize_safe_20251130_164906_18caba76.prof')
p.sort_stats('cumulative').print_stats(20)

# Or use snakeviz for visualization
# pip install snakeviz
# snakeviz results/profiles/bench_materialize_safe_20251130_164906_18caba76.prof
```
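
For readers who want to produce a comparable `.prof`/`.txt` pair outside the benchmark scripts, here is a minimal sketch using only the standard library. The function `profile_to_files` is a hypothetical helper, and the benchmark scripts may generate their files differently.

```python
import cProfile
import io
import pstats
from pathlib import Path


def profile_to_files(func, prof_path, top=20):
    """Profile func() and write a binary .prof plus a .txt summary beside it.

    Hypothetical helper for illustration only.
    """
    prof_path = Path(prof_path)
    prof_path.parent.mkdir(parents=True, exist_ok=True)

    profiler = cProfile.Profile()
    profiler.runcall(func)                 # run func under the profiler
    profiler.dump_stats(str(prof_path))    # binary stats, readable by pstats/snakeviz

    # Render the same statistics as text, sorted by cumulative time.
    buffer = io.StringIO()
    stats = pstats.Stats(str(prof_path), stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)

    txt_path = prof_path.with_suffix(".txt")
    txt_path.write_text(buffer.getvalue())
    return txt_path
```

The `.txt` file holds the same report that `print_stats` would show on the terminal, which is convenient for quick inspection or attaching to an issue.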

### History and Comparison

Every benchmark run is archived to `results/history/` with git information.

**Compare two runs:**
```bash
# Compare specific files
python baseline_utils.py diff results/history/benchmark_*_f9df9cf.json results/history/benchmark_*_18caba7.json

# Supports glob patterns
python baseline_utils.py diff 'results/history/*f9df9cf*' 'results/history/*18caba7*'

# With strict mode (exit code 1 on regression)
python baseline_utils.py diff file_a.json file_b.json --strict
```
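
The idea behind threshold detection and strict mode can be sketched as follows. This is an illustration under stated assumptions, not the logic of `baseline_utils.py`: it assumes each run reduces to a flat mapping of metric name to a timing in seconds (lower is better), and the names `find_regressions`, `exit_code`, and the 10% default threshold are hypothetical.

```python
def find_regressions(old, new, threshold=0.10):
    """Return {metric: relative_change} for timings that got slower by more than threshold.

    Assumes flat {metric_name: seconds} mappings where lower is better
    (hypothetical schema; the real history JSON layout is not shown here).
    """
    regressions = {}
    for name in sorted(old.keys() & new.keys()):   # only metrics present in both runs
        before, after = old[name], new[name]
        if before <= 0:
            continue                               # relative change is undefined
        change = (after - before) / before
        if change > threshold:
            regressions[name] = change
    return regressions


def exit_code(regressions, strict=False):
    """Mirror the documented strict behaviour: exit code 1 only if strict and a regression exists."""
    return 1 if strict and regressions else 0


# Made-up timings for illustration.
run_a = {"materialize_time_s": 2.0, "read_time_s": 1.0}
run_b = {"materialize_time_s": 2.5, "read_time_s": 1.05}
slower = find_regressions(run_a, run_b)
print(slower, exit_code(slower, strict=True))
```

A metric where higher is better, such as a speedup ratio, would need the sign of the comparison flipped.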

### History Analysis

Load history into pandas DataFrames for custom analysis:

```python
from history_analysis import load_history_long, load_history_wide

# Long format (one row per metric) - good for filtering
df_long = load_history_long('results/history/')
df_long[df_long['metric'] == 'direct_vs_safe_speedup']

# Wide format (one row per run) - good for correlation
df_wide = load_history_wide('results/history/')
df_wide[['commit', 'materialize_aliases_time_s', 'materialize_aliases_direct_vs_safe_speedup']]

# Time series of specific metric
from history_analysis import get_metric_history
ts = get_metric_history(df_long, 'materialize_aliases', 'direct_vs_safe_speedup')
```
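
To make the relationship between the two layouts concrete, here is a small plain-Python sketch of turning long records into wide ones. It does not reproduce `history_analysis.py`: the `benchmark` and `value` keys are assumptions, the numbers are made up, and the wide column name is assumed to be `<benchmark>_<metric>`, as the `materialize_aliases_*` columns above suggest.

```python
from collections import defaultdict


def long_to_wide(rows):
    """Pivot long records (one per metric) into wide records (one per commit).

    Assumes each row has 'commit', 'benchmark', 'metric' and 'value' keys;
    'benchmark' and 'value' are hypothetical names used for illustration.
    """
    wide = defaultdict(dict)
    for row in rows:
        column = f"{row['benchmark']}_{row['metric']}"
        wide[row["commit"]][column] = row["value"]
    # One dict per commit, in first-seen order.
    return [{"commit": commit, **columns} for commit, columns in wide.items()]


# Made-up values for illustration only.
long_rows = [
    {"commit": "f9df9cf", "benchmark": "materialize_aliases", "metric": "time_s", "value": 1.5},
    {"commit": "f9df9cf", "benchmark": "materialize_aliases", "metric": "direct_vs_safe_speedup", "value": 2.0},
    {"commit": "18caba7", "benchmark": "materialize_aliases", "metric": "time_s", "value": 1.25},
]
for record in long_to_wide(long_rows):
    print(record)
```

The long layout suits filtering on a single metric, while the wide layout places all metrics of a run side by side for comparison.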

**CLI commands:**
```bash
# List available metrics
python history_analysis.py list results/history/

# Show recent runs
python history_analysis.py show results/history/ --last 10

# Show specific metric
python history_analysis.py show results/history/ --metric direct_vs_safe_speedup

# Export for external tools
python history_analysis.py export results/history/ --format wide -o history.csv
```

## Overview

| Script | Purpose | Data Required |
@@ -631,15 +724,23 @@ benchmarks/
 ├── generate_synthetic_data.py # Creates test ROOT file (~5MB)
 ├── diagnose_read_performance.py # Diagnostic tool for slowdowns
 ├── benchmark_performance.py # Core operations timing
-├── benchmark_materialize_aliases.py # Alias DAG + subframe benchmark (NEW)
+├── benchmark_materialize_aliases.py # Alias DAG + subframe benchmark
 ├── benchmark_parallel.py # Parallel scaling tests
 ├── benchmark_read_tree.py # ROOT file read comparison
 ├── benchmark_subframe.py # Subframe validation
 ├── baseline_utils.py # Baseline management utilities
+├── history_analysis.py # DataFrame utilities for history analysis (NEW)
 ├── baselines.json # Saved baselines (auto-generated)
 ├── baseline.json # Unified baseline for regression detection
 ├── synthetic_data.root # Test data (auto-generated, gitignored)
 └── results/ # Output directory (gitignored)
+    ├── history/ # Archived runs with git info (NEW)
+    │   ├── benchmark_20251128_150047_f9df9cf.json
+    │   └── benchmark_20251130_164906_18caba76.json
+    ├── profiles/ # Profiler output (NEW)
+    │   ├── bench_materialize_safe_20251130_164906_18caba76.prof
+    │   ├── bench_materialize_safe_20251130_164906_18caba76.txt
+    │   └── ...
     ├── benchmark_*.json # Detailed results
     ├── benchmark_merged_*.json # Merged results for comparison
     ├── comparison_*.json # Regression comparison results
