Summary
Add a generate_k_fold_backtest_windows() function that splits a date range into k chronological folds, each with a train and test period. The API mirrors generate_rolling_backtest_windows() and needs only start_date, end_date, and n_splits; gap_days and min_train_days are optional.
Motivation
generate_rolling_backtest_windows() requires the user to manually tune train_days, test_days, and step_days to control how many windows are produced and how much of the date range is covered. K-fold solves this differently: the user asks for exactly k splits, and the k test folds sit back to back, so every day in the range is meant to fall in exactly one test fold. That gives full coverage without dead zones.
This is especially useful for parameter selection / strategy ranking -- you pick the parameter sets that are consistently good across all k folds, not just one split.
Proposed API
```python
from datetime import datetime, timezone

from investing_algorithm_framework import generate_k_fold_backtest_windows

windows = generate_k_fold_backtest_windows(
    start_date=datetime(2021, 1, 1, tzinfo=timezone.utc),
    end_date=datetime(2024, 12, 31, tzinfo=timezone.utc),
    n_splits=5,
    gap_days=0,        # gap between train end and test start (same as rolling)
    min_train_days=0,  # skip folds where train history is shorter than this
)

for window in windows:
    train_range = window["train_range"]  # BacktestDateRange
    test_range = window["test_range"]    # BacktestDateRange
    fold_index = window["fold_index"]    # int, 0-based
```
Behaviour
- The total date range is divided into n_splits equal-sized test folds (strictly chronological, no shuffling). When the range is not evenly divisible by n_splits, the remainder days have to be assigned explicitly (for example to the last fold) so that no days are dropped.
- For fold i, training covers [start_date, test_fold_i.start - gap_days) -- an expanding train window. Fold 0 starts at start_date, so it has no training history.
- gap_days works the same way as in generate_rolling_backtest_windows (useful to avoid indicator lag leakage).
- min_train_days silently skips early folds where the training history is shorter than a strategy's warmup requirement (e.g. a 200-day EMA needs at least 200 days of training data before results are valid).
- Return type is List[Dict] with keys "train_range", "test_range", "fold_index" -- drop-in compatible with existing window consumers (run_vector_backtests, run_backtests).
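As a worked example of the fold arithmetic described above, the following sketch computes the boundaries with only the standard library. The helper name k_fold_boundaries and the choice to end the last fold at end_date are assumptions made for illustration; neither is part of the proposed API.

```python
from datetime import datetime, timedelta, timezone


def k_fold_boundaries(start_date, end_date, n_splits, gap_days=0):
    """Return (fold_index, train_start, train_end, test_start, test_end) tuples.

    Hypothetical helper: it only illustrates the date arithmetic.
    """
    total_days = (end_date - start_date).days
    fold_size = total_days // n_splits
    rows = []
    for i in range(n_splits):
        test_start = start_date + timedelta(days=i * fold_size)
        # Assumption: the last fold runs to end_date so remainder days are kept.
        if i == n_splits - 1:
            test_end = end_date
        else:
            test_end = test_start + timedelta(days=fold_size)
        # Expanding train window: it always starts at start_date and stops
        # gap_days before the test fold begins.
        train_end = test_start - timedelta(days=gap_days)
        rows.append((i, start_date, train_end, test_start, test_end))
    return rows


start = datetime(2021, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 12, 31, tzinfo=timezone.utc)
rows = k_fold_boundaries(start, end, n_splits=5, gap_days=2)

for i, train_start, train_end, test_start, test_end in rows:
    # A negative span (fold 0 with a gap) means there is no usable history.
    train_days = max((train_end - train_start).days, 0)
    print(f"fold {i}: train {train_days:4d} days, "
          f"test {test_start.date()} to {test_end.date()}")
```

The range from 2021-01-01 to 2024-12-31 spans 1460 days, so each of the 5 test folds is 292 days long. Fold 0 has no training history, which is the case min_train_days is meant to filter out.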
Implementation sketch
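```python
from datetime import datetime
from typing import Dict, List

import pandas as pd

# Import path as used in the proposed API; inside the package it may differ.
from investing_algorithm_framework import BacktestDateRange


def generate_k_fold_backtest_windows(
    start_date: datetime,
    end_date: datetime,
    n_splits: int = 5,
    gap_days: int = 0,
    min_train_days: int = 0,
) -> List[Dict]:
    if n_splits < 1:
        raise ValueError("n_splits must be at least 1")

    total_days = (end_date - start_date).days
    fold_size = total_days // n_splits
    if fold_size < 1:
        raise ValueError("date range is too short for n_splits folds")

    windows = []
    for i in range(n_splits):
        test_start = start_date + pd.Timedelta(days=i * fold_size)
        # The last fold runs to end_date so the remainder days of a
        # non-divisible range are not dropped.
        if i == n_splits - 1:
            test_end = end_date
        else:
            test_end = test_start + pd.Timedelta(days=fold_size)

        # Expanding train window: [start_date, test_start - gap_days).
        train_end = test_start - pd.Timedelta(days=gap_days)
        train_days = (train_end - start_date).days

        # Skip folds without enough history. Note that fold 0 has
        # train_days == 0 when gap_days == 0, so it is kept (with an empty
        # train range) unless min_train_days > 0.
        if train_days < min_train_days:
            continue

        windows.append({
            "train_range": BacktestDateRange(
                name=f"train_fold_{i}",
                start_date=start_date,
                end_date=train_end,
            ),
            "test_range": BacktestDateRange(
                name=f"test_fold_{i}",
                start_date=test_start,
                end_date=test_end,
            ),
            "fold_index": i,
        })
    return windows
```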
Acceptance criteria
- Function lives in investing_algorithm_framework/analysis/backtest_data_ranges.py alongside generate_rolling_backtest_windows
- Exported from investing_algorithm_framework/__init__.py
- Returns window dicts (train_range, test_range as BacktestDateRange, plus fold_index)
- gap_days applied between train end and test start
- min_train_days skips folds with insufficient history
- Tests cover gap_days > 0, min_train_days filtering, non-divisible date ranges