Commit 0f05c47

Merge remote-tracking branch 'origin/dev' into feat/503-pipeline-api-phase3-live
2 parents 08a3b7f + 9ece0fb commit 0f05c47

13 files changed

Lines changed: 601 additions & 84 deletions

docs/design/pipeline-api.md

Lines changed: 188 additions & 0 deletions
@@ -0,0 +1,188 @@
1+
# Pipeline API — Design Doc
2+
3+
> Status: **DRAFT for review**.
4+
> Tracking issue: [#438](https://github.com/coding-kitties/investing-algorithm-framework/issues/438).
5+
> Phase issues: [#501](https://github.com/coding-kitties/investing-algorithm-framework/issues/501) (event), [#502](https://github.com/coding-kitties/investing-algorithm-framework/issues/502) (vector), [#503](https://github.com/coding-kitties/investing-algorithm-framework/issues/503) (live).
6+
7+
## 1. Goals & non-goals
8+
9+
### Goals
10+
11+
1. Declarative cross-sectional factor / filter / classifier computation across an asset universe at each bar.
12+
2. Look-ahead-safe by construction: a factor evaluated at bar `t` sees only data with timestamp `≤ t`.
13+
3. Strict opt-in: strategies without `pipelines = [...]` see **zero** behavioural or performance change.
14+
4. Three execution backends (event backtest, vector backtest, live) sharing the same `Pipeline` definition.
15+
16+
### Non-goals (v1)
17+
18+
- Full Zipline-Pipeline parity (no classifier hierarchies, no winsorization, no OLS factors).
19+
- Live pipelines on sub-daily timeframes.
20+
- Universes outside the supported envelope (see §6).
21+
- Cross-market order routing (separate issue).
22+
23+
## 2. Public API
24+
25+
```python
from investing_algorithm_framework import (
    TradingStrategy, Pipeline, Returns, AverageDollarVolume, TimeUnit,
)

class MomentumScreener(Pipeline):
    dollar_volume = AverageDollarVolume(window=30)
    momentum = Returns(window=60)

    universe = dollar_volume.top(100)
    alpha = momentum.rank(mask=universe)


class MyStrategy(TradingStrategy):
    time_unit = TimeUnit.DAY
    interval = 1
    pipelines = [MomentumScreener]
    universe = ["BTC/EUR", "ETH/EUR", ...]  # candidate symbols

    def run_strategy(self, context, data):
        out = data["MomentumScreener"]  # pl.DataFrame, one row per surviving symbol
        ...
```
48+
49+
### Class attributes added to `TradingStrategy`
50+
51+
| Attribute | Type | Default | Meaning |
52+
|---|---|---|---|
53+
| `pipelines` | `list[type[Pipeline]]` | `[]` | Pipelines to run before each `run_strategy` call. |
54+
| `universe` | `list[str] \| list[DataSource]` | `[]` | Candidate symbols. Folded into `data_sources` at app startup; pipelines filter down. |
55+
56+
### `Pipeline` class
57+
58+
- Class attributes that are `Factor` / `Filter` instances are introspected via `__init_subclass__`.
59+
- A class attribute named `universe` is treated as the **root mask**: if present, every other column is computed on the masked subset.
60+
- All other attributes become columns of the output frame.
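
The introspection rules above could be implemented roughly as follows. This is a minimal sketch, not the framework's code: the shared base class `Term` and the `columns` / `root_mask` attributes are hypothetical names chosen for illustration.

```python
class Term:
    """Hypothetical common base for factors and filters."""


class Factor(Term):
    pass


class Filter(Term):
    pass


class Pipeline:
    columns: dict = {}
    root_mask = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Collect Factor/Filter instances declared directly on the subclass,
        # in definition order.
        terms = {
            name: value
            for name, value in vars(cls).items()
            if isinstance(value, Term)
        }
        # `universe` is the root mask rather than an output column.
        cls.root_mask = terms.pop("universe", None)
        cls.columns = terms


class MomentumScreener(Pipeline):
    dollar_volume = Factor()
    momentum = Factor()
    universe = Filter()
    alpha = Factor()


print(list(MomentumScreener.columns))  # ['dollar_volume', 'momentum', 'alpha']
```

Note that `vars(cls)` only sees attributes declared on the subclass itself, so a real implementation would still need to decide how terms inherited from a parent pipeline are treated.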
61+
62+
## 3. Panel shape
63+
64+
The engine's internal representation is a **long-form Polars DataFrame**:
65+
66+
```
schema = {
    "datetime": pl.Datetime,
    "symbol": pl.Utf8,
    "open": pl.Float32,
    "high": pl.Float32,
    "low": pl.Float32,
    "close": pl.Float32,
    "volume": pl.Float32,
}
```
77+
78+
Long-form is chosen because:
79+
80+
- Polars rolling/group-by is faster on long form than on wide.
81+
- Sparse symbols (delisted, late-listed) are natural — no NaN columns.
82+
- Cache files are smaller (no per-symbol column duplication).
83+
84+
Per-bar pipeline output handed to the strategy is a **wide** frame keyed by symbol:
85+
86+
```
out = pl.DataFrame({
    "symbol": pl.Utf8,
    "<factor name>": pl.Float64,  # one column per Factor/Filter on the Pipeline
    ...
})
```
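
The relationship between the long-form panel and the per-bar output can be sketched without Polars. The example below uses plain dictionaries; the `snapshot` helper and the two toy factors are hypothetical. It only illustrates that a late-listed symbol has no row until its first bar, so no NaN padding is needed.

```python
from collections import defaultdict
from datetime import datetime

# Long-form panel: one record per (datetime, symbol). ETH/EUR lists a day late.
panel = [
    {"datetime": datetime(2024, 1, 1), "symbol": "BTC/EUR", "close": 100.0, "volume": 10.0},
    {"datetime": datetime(2024, 1, 2), "symbol": "BTC/EUR", "close": 110.0, "volume": 20.0},
    {"datetime": datetime(2024, 1, 2), "symbol": "ETH/EUR", "close": 50.0, "volume": 5.0},
]

# Hypothetical factors: each maps a symbol's time-ordered bars to one value.
factors = {
    "last_close": lambda bars: bars[-1]["close"],
    "dollar_volume": lambda bars: bars[-1]["close"] * bars[-1]["volume"],
}


def snapshot(panel, as_of, factors):
    """Return one row per symbol that has data at or before `as_of`."""
    by_symbol = defaultdict(list)
    for bar in sorted(panel, key=lambda b: b["datetime"]):
        if bar["datetime"] <= as_of:
            by_symbol[bar["symbol"]].append(bar)
    rows = []
    for symbol in sorted(by_symbol):
        row = {"symbol": symbol}
        for name, compute in factors.items():
            row[name] = compute(by_symbol[symbol])
        rows.append(row)
    return rows


day_one = snapshot(panel, datetime(2024, 1, 1), factors)  # BTC/EUR only
day_two = snapshot(panel, datetime(2024, 1, 2), factors)  # BTC/EUR and ETH/EUR
```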
93+
94+
## 4. Engine API (internal)
95+
96+
```python
from datetime import datetime
from typing import Protocol

import polars as pl

from investing_algorithm_framework import Pipeline


class PipelineEngine(Protocol):
    def evaluate_at(
        self,
        pipeline: type[Pipeline],
        as_of: datetime,
    ) -> pl.DataFrame:
        """Event mode: return wide per-symbol frame for the given timestamp."""
        ...

    def evaluate_range(
        self,
        pipeline: type[Pipeline],
        start: datetime,
        end: datetime,
    ) -> pl.DataFrame:
        """Vector mode: return long (date, symbol)-indexed frame for the range."""
        ...
```
113+
114+
Two implementations:
115+
116+
- `LazyPolarsPipelineEngine` (event + vector). Compiles Factor expressions into a single `pl.LazyFrame` plan; `collect()` only at the boundary.
117+
- `LiveBatchedPipelineEngine` (Phase 3). Adds async batched fetch + universe-refresh.
118+
119+
## 5. Cache key (Phase 2)
120+
121+
Cache lives under `<resource_dir>/pipeline_cache/`. Key:
122+
123+
```
hash(
    universe_hash: sha1(sorted(symbol_list)),
    daterange: (start.isoformat(), end.isoformat()),
    timeframe: e.g. "1d",
    expr_hash: sha1(canonical_repr(factor_expression_tree)),
    schema_version: int,  # bump on any cache-incompatible change
)
```
132+
133+
Hits return the cached panel/factor frame without recomputation.
134+
Parameter sweeps over **non-pipeline** attributes (signal thresholds, position sizing) reuse the cache for free.
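
One way the key could be derived is sketched below. The function names are hypothetical, and the canonical expression repr is passed in as a plain string because the doc does not specify it. The outer `hash(...)` is assumed here to be another SHA-1 digest, since Python's built-in `hash()` is randomized per process for strings and would not be stable across runs.

```python
import hashlib
import json
from datetime import datetime


def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def pipeline_cache_key(symbols, start, end, timeframe, expr_repr, schema_version=1):
    """Build a deterministic key from the five components listed above."""
    payload = {
        # Sorting makes the key independent of the order symbols are declared in.
        "universe_hash": sha1_hex("\n".join(sorted(symbols))),
        "daterange": [start.isoformat(), end.isoformat()],
        "timeframe": timeframe,
        "expr_hash": sha1_hex(expr_repr),
        "schema_version": schema_version,
    }
    return sha1_hex(json.dumps(payload, sort_keys=True))


start, end = datetime(2023, 1, 1), datetime(2024, 1, 1)
key_a = pipeline_cache_key(["ETH/EUR", "BTC/EUR"], start, end, "1d", "Returns(window=60)")
key_b = pipeline_cache_key(["BTC/EUR", "ETH/EUR"], start, end, "1d", "Returns(window=60)")
key_c = pipeline_cache_key(["BTC/EUR", "ETH/EUR"], start, end, "1d", "Returns(window=90)")
```

Under these assumptions `key_a` and `key_b` match, so reordering the universe still hits the cache, while changing a factor parameter (`key_c`) produces a different key.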
135+
136+
## 6. Performance contract
137+
138+
| Mode | Timeframe | Max universe | Tested in CI |
139+
|---|---|---|---|
140+
| Event BT | daily | 5,000 ||
141+
| Event BT | 4h / 1h | 1,000 / 500 ||
142+
| Event BT | < 1h || ❌ raises |
143+
| Vector BT | daily | 5,000 ||
144+
| Vector BT | 4h / 1h | 1,000 / 500 ||
145+
| Vector BT | < 1h || ❌ raises |
146+
| Live | daily | 50 | smoke only |
147+
| Live | < daily || ❌ raises |
148+
149+
**Opt-in guarantee (CI-asserted):** vector backtest of the existing single-symbol example must run within ±10% of the pre-pipeline baseline wall-clock.
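
The envelope in the table could be enforced with a lookup like the one below. This is a sketch under stated assumptions: the timeframe strings, the error type, and the function name are hypothetical; `4h / 1h` is read as limits of 1,000 and 500 respectively; and exceeding a limit is assumed to be rejected in the same way as an unsupported timeframe.

```python
# Limits transcribed from the table above; the timeframe strings are assumed.
MAX_UNIVERSE = {
    ("event", "1d"): 5000,
    ("event", "4h"): 1000,
    ("event", "1h"): 500,
    ("vector", "1d"): 5000,
    ("vector", "4h"): 1000,
    ("vector", "1h"): 500,
    ("live", "1d"): 50,
}


class UnsupportedPipelineConfig(ValueError):
    """Hypothetical error type for configurations outside the envelope."""


def check_envelope(mode: str, timeframe: str, universe_size: int) -> int:
    """Return the applicable limit, or raise if the configuration is unsupported."""
    limit = MAX_UNIVERSE.get((mode, timeframe))
    if limit is None:
        raise UnsupportedPipelineConfig(
            f"{mode} pipelines do not support timeframe {timeframe!r}"
        )
    if universe_size > limit:
        raise UnsupportedPipelineConfig(
            f"universe of {universe_size} exceeds the {mode}/{timeframe} limit of {limit}"
        )
    return limit
```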
150+
151+
## 7. Built-in factors (v1)
152+
153+
| Factor | Formula |
154+
|---|---|
155+
| `Returns(window=N)` | `close.pct_change(N)` |
156+
| `AverageDollarVolume(window=N)` | `(close * volume).rolling_mean(N)` |
157+
| `SMA(window=N)` | `close.rolling_mean(N)` |
158+
| `RSI(window=N)` | standard Wilder RSI |
159+
| `Volatility(window=N)` | `log_returns.rolling_std(N) * sqrt(periods_per_year)` |
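
As a worked illustration of the first three formulas, the sketch below re-implements them on plain Python lists for a single symbol. The helper names are hypothetical, and the warm-up behaviour (`None` until a full window is available) is an assumption, not something the table specifies.

```python
def pct_change(values, n):
    """values[t] / values[t - n] - 1, with None for the first n entries."""
    return [
        None if i < n else values[i] / values[i - n] - 1
        for i in range(len(values))
    ]


def rolling_mean(values, n):
    """Right-aligned mean over the last n entries, None until the window fills."""
    return [
        None if i + 1 < n else sum(values[i + 1 - n:i + 1]) / n
        for i in range(len(values))
    ]


close = [100.0, 110.0, 120.0, 150.0]
volume = [10.0, 20.0, 30.0, 40.0]

returns_2 = pct_change(close, 2)      # Returns(window=2)
sma_2 = rolling_mean(close, 2)        # SMA(window=2) -> [None, 105.0, 115.0, 135.0]
dollar_volume = [c * v for c, v in zip(close, volume)]
adv_2 = rolling_mean(dollar_volume, 2)  # -> [None, 1600.0, 2900.0, 4800.0]
```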
160+
161+
All other factors mentioned in #438's original draft (`MACD`, `BollingerBands`, `EWMA`, `VWAP`, `MaxDrawdown`) are deferred. Users can subclass `CustomFactor`:
162+
163+
```python
class MACD(CustomFactor):
    inputs = ["close"]
    window = 26

    def compute(self, close: pl.Series) -> pl.Series:
        ...
```
171+
172+
## 8. Look-ahead safety
173+
174+
Factors operate on a Polars `LazyFrame` filtered to `datetime <= as_of` *before* any rolling op. Rolling windows are right-aligned (closed on the right). Tests must assert that injecting a future bar does not change a past factor value.
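
The required test can be sketched on plain Python data. The helper below is hypothetical and stands in for a real factor: it filters to `as_of` first and only then takes a right-aligned window, so appending a later bar cannot affect the result.

```python
from datetime import date


def trailing_mean_at(bars, as_of, window):
    """Mean of the last `window` closes among bars dated at or before `as_of`."""
    # Filter to the visible history *before* applying the rolling window.
    visible = sorted((d, c) for d, c in bars if d <= as_of)
    if len(visible) < window:
        return None
    return sum(c for _, c in visible[-window:]) / window


bars = [
    (date(2024, 1, 1), 100.0),
    (date(2024, 1, 2), 102.0),
    (date(2024, 1, 3), 104.0),
]
as_of = date(2024, 1, 3)

before = trailing_mean_at(bars, as_of, window=2)
# Inject a bar from the future with an extreme value.
after = trailing_mean_at(bars + [(date(2024, 1, 4), 999.0)], as_of, window=2)

assert before == after == 103.0
```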
175+
176+
## 9. Open questions
177+
178+
1. **Universe declaration ergonomics.** Do we accept a callable `universe = lambda ctx: top_500_by_market_cap()` or only a static list in v1? (Proposed: static list in v1, callable in v2.)
179+
2. **Pipeline scheduling.** Always run every bar, or honour a per-pipeline `time_unit`? (Proposed: same `time_unit` as the strategy in v1; per-pipeline scheduling in v2.)
180+
3. **Multiple pipelines on one strategy.** Independent (each gets its own cache key) or composable (one pipeline can reference another's column)? (Proposed: independent in v1.)
181+
4. **Float32 vs float64.** Default to float32 for memory; users opt into float64 per factor? (Proposed: yes, factor-level `dtype=` override.)
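
For question 4, one concrete consequence of a float32 default is that integers above 2^24 are no longer all representable, which may matter for the `volume` column and for dollar-volume products. The sketch below demonstrates this with the standard library only; whether it matters in practice depends on the data.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


exact_limit = 2 ** 24  # 16_777_216: last point where every integer is exact
rounded = to_float32(float(exact_limit + 1))

print(rounded)  # 16777216.0, i.e. the +1 is lost
```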
182+
183+
## 10. Order of work
184+
185+
1. ✅ This doc reviewed and merged.
186+
2. Phase 1 ([#501](https://github.com/coding-kitties/investing-algorithm-framework/issues/501)) — event backtest + 5 factors.
187+
3. Phase 2 ([#502](https://github.com/coding-kitties/investing-algorithm-framework/issues/502)) — vector + cache + benchmark.
188+
4. Phase 3 ([#503](https://github.com/coding-kitties/investing-algorithm-framework/issues/503)) — live, gated on async CCXT fetch.

investing_algorithm_framework/domain/backtesting/backtest.py

Lines changed: 30 additions & 6 deletions
@@ -418,6 +418,23 @@ def save(
418418
os.makedirs(destination_run_path, exist_ok=True)
419419
br.save(destination_run_path)
420420

421+
# Always rebuild the summary from the current set of backtest runs
422+
# before writing it to disk. This guarantees that summary.json is
423+
# self-consistent with the per-run metrics.json files (e.g.
424+
# number_of_windows == number of runs, total_net_gain == sum of
425+
# per-run total_net_gain, etc.) regardless of how the in-memory
426+
# Backtest was constructed (single backtest, walk-forward,
427+
# merge(), …). See issue #511.
428+
if self.backtest_runs:
429+
per_run_metrics = [
430+
br.backtest_metrics for br in self.backtest_runs
431+
if br.backtest_metrics is not None
432+
]
433+
if per_run_metrics:
434+
self.backtest_summary = generate_backtest_summary_metrics(
435+
per_run_metrics
436+
)
437+
421438
# Save combined backtest metrics if available
422439
if self.backtest_summary:
423440
summary_file = os.path.join(
@@ -517,15 +534,22 @@ def merge(self, other: 'Backtest') -> 'Backtest':
             Backtest: The merged Backtest instance.
         """

-        merged = Backtest()
+        merged = Backtest(algorithm_id=self.algorithm_id)
         merged.backtest_runs = self.backtest_runs + other.backtest_runs

-        summary = BacktestSummaryMetrics()
-
-        for bt_run in merged.get_all_backtest_metrics():
-            summary.add(bt_run)
+        # Rebuild the summary from the full set of merged backtest runs.
+        # `BacktestSummaryMetrics` is a plain dataclass and does not expose
+        # an `add()` method, so the previous incremental approach raised
+        # AttributeError and silently produced a stale / single-window
+        # summary. See issue #511.
+        merged_metrics = merged.get_all_backtest_metrics()
+        if merged_metrics:
+            merged.backtest_summary = generate_backtest_summary_metrics(
+                merged_metrics
+            )
+        else:
+            merged.backtest_summary = BacktestSummaryMetrics()

-        merged.backtest_summary = summary
         merged.backtest_permutation_tests = \
             self.backtest_permutation_tests + other.backtest_permutation_tests

investing_algorithm_framework/domain/backtesting/backtest_metrics.py

Lines changed: 15 additions & 0 deletions
@@ -20,6 +20,21 @@ class BacktestMetrics:
2020
total return, annualized return, volatility, Sharpe ratio,
2121
and maximum drawdown.
2222
23+
.. note:: Field semantics & known duplicates (issue #511)
24+
25+
- ``total_loss`` is the **gross loss magnitude** (a non-negative
26+
number equal to ``sum(abs(net_gain))`` over losing trades).
27+
``total_loss_percentage`` is ``total_loss /
28+
initial_unallocated`` (decimal, non-negative). These fields
29+
no longer mirror ``total_net_gain`` / ``total_net_gain_pct``;
30+
see B1 in issue #511.
31+
32+
- ``total_growth`` / ``total_growth_percentage`` are
33+
numerically equivalent to ``total_net_gain`` /
34+
``total_net_gain_percentage`` for the standard
35+
mark-to-market portfolio used in backtests. They are kept as
36+
legacy aliases. See B3 in issue #511.
37+
2338
Attributes:
2439
backtest_date_range_name (str): The name of the date range
2540
used for the backtest.
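
The per-run definitions in the note above amount to the following arithmetic. This is a sketch with a hypothetical helper and made-up trade values, not the framework's implementation.

```python
def total_loss_metrics(trade_net_gains, initial_unallocated):
    """Gross loss magnitude and its fraction of the initial capital."""
    # Only losing trades contribute, and each contributes its absolute value.
    total_loss = sum(abs(gain) for gain in trade_net_gains if gain < 0)
    return total_loss, total_loss / initial_unallocated


# Two winners (+120, +10) and two losers (-50, -30) on 1,000 of initial capital.
loss, loss_pct = total_loss_metrics([120.0, -50.0, -30.0, 10.0], 1000.0)
# loss is 80.0 (non-negative) and loss_pct is 0.08, even though the
# net gain over the same trades is +50.
```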

investing_algorithm_framework/domain/backtesting/backtest_summary_metrics.py

Lines changed: 24 additions & 0 deletions
@@ -13,6 +13,30 @@ class BacktestSummaryMetrics:
1313
Represents the summarized results of a backtest,
1414
focusing on key headline performance and risk metrics.
1515
16+
.. note:: Field semantics & known duplicates (issue #511)
17+
18+
- ``total_loss`` / ``total_loss_percentage`` are gross-loss
19+
based: ``total_loss`` is ``sum(per-run gross_loss)`` (a
20+
non-negative magnitude in account currency) and
21+
``total_loss_percentage`` is ``total_loss /
22+
sum(initial_unallocated)`` (decimal). They no longer mix
23+
with net-return semantics. See B1/B2 in issue #511.
24+
25+
- ``total_growth`` / ``total_growth_percentage`` are
26+
numerically equivalent to ``total_net_gain`` /
27+
``total_net_gain_percentage`` for closed-position backtests
28+
because both are derived from the same start/end portfolio
29+
values. They are kept for backwards compatibility but should
30+
be considered legacy aliases. See B3 in issue #511.
31+
32+
- ``average_net_gain``, ``average_loss``, ``average_growth``
33+
(and their ``*_percentage`` counterparts) are time-weighted
34+
means **across windows**. For a single-window backtest they
35+
collapse to the corresponding ``total_*`` value by
36+
definition (weighted mean of one element equals that
37+
element). See B4 in issue #511. For per-trade averages use
38+
``average_trade_*`` instead.
39+
1640
Attributes:
1741
total_net_gain (float): Total net gain from the backtest.
1842
total_net_gain_percentage (float): Total net gain percentage
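
The single-window collapse described for the `average_*` fields follows directly from the definition of a weighted mean. In the sketch below the weights are assumed to be window lengths in days; the note does not say which weighting the framework uses.

```python
def weighted_mean(values, weights):
    """Weighted mean of per-window values; weights are assumed window lengths."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# One window: the weight cancels, so the mean equals the single value.
single = weighted_mean([0.125], [90])

# Two windows of 30 and 90 days: the longer window dominates.
two_windows = weighted_mean([0.10, 0.20], [30, 90])
```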

investing_algorithm_framework/domain/backtesting/combine_backtests.py

Lines changed: 35 additions & 16 deletions
@@ -39,28 +39,32 @@ def _compound_percentage_returns(percentages):
     Compound percentage returns across multiple periods.

     For example, if period 1 has 10% return and period 2 has 5% return,
-    the compounded return is: (1 + 0.10) * (1 + 0.05) - 1 = 15.5%
-    NOT simply 10% + 5% = 15%
+    the compounded return is: (1 + 0.10) * (1 + 0.05) - 1 = 0.155 = 15.5%
+    NOT simply 0.10 + 0.05 = 0.15.
+
+    The framework consistently represents percentages as **decimals**
+    (e.g. ``0.10`` for 10%), so this helper expects decimal inputs and
+    returns a decimal. See issue #511 (B5) — earlier versions assumed
+    whole-number percentages, which silently produced results off by a
+    factor of ~100 once multi-window aggregation was exercised.

     Args:
-        percentages (List[float | None]): List of percentage returns
-            (as whole numbers, e.g., 10 for 10%).
+        percentages (List[float | None]): List of period returns expressed
+            as decimals (e.g. ``0.10`` for 10%).

     Returns:
-        float | None: The compounded percentage return, or None if no
-            valid percentages.
+        float | None: The compounded return as a decimal, or ``None`` if
+            no valid percentages.
     """
     valid_percentages = [p for p in percentages if p is not None]
     if not valid_percentages:
         return None

-    # Convert percentages to decimals, compound, then convert back
     compounded = 1.0
     for pct in valid_percentages:
-        compounded *= (1 + pct / 100)
+        compounded *= (1 + pct)

-    # Convert back to percentage
-    return (compounded - 1) * 100
+    return compounded - 1


 def combine_backtests(backtests):
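
The effect of this hunk can be checked with the docstring's own numbers. The two functions below are standalone copies of the new and the removed logic, written for illustration (the names `compound_decimal` and `compound_whole_number` are hypothetical).

```python
import math


def compound_decimal(returns):
    """New behaviour: inputs and output are decimals (0.10 means 10%)."""
    valid = [r for r in returns if r is not None]
    if not valid:
        return None
    compounded = 1.0
    for r in valid:
        compounded *= 1 + r
    return compounded - 1


def compound_whole_number(returns):
    """Removed behaviour: inputs and output are whole numbers (10 means 10%)."""
    valid = [r for r in returns if r is not None]
    if not valid:
        return None
    compounded = 1.0
    for r in valid:
        compounded *= 1 + r / 100
    return (compounded - 1) * 100


# 10% then 5%, expressed as decimals, compounds to 15.5%.
assert math.isclose(compound_decimal([0.10, None, 0.05]), 0.155)

# Feeding the same decimals to the removed version treats them as 0.10% and
# 0.05%, so the result is about 0.15005 and the compounding term is lost.
assert math.isclose(compound_whole_number([0.10, 0.05]), 0.15005)
```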
@@ -173,9 +177,13 @@ def generate_backtest_summary_metrics(
         b.total_net_gain for b in valid_metrics
         if b.total_net_gain is not None
     )
+    # B1/B2 fix (issue #511): per-run ``total_loss`` is now the gross
+    # loss magnitude, so the aggregate is simply the sum of per-run
+    # ``total_loss`` (equivalent to ``sum(gross_loss)``). Both per-run
+    # and aggregate use the same unit (positive currency).
     total_loss = sum(
-        b.gross_loss for b in valid_metrics
-        if b.gross_loss is not None
+        b.total_loss for b in valid_metrics
+        if b.total_loss is not None
     )
     total_growth = sum(
         b.total_growth for b in valid_metrics
@@ -184,13 +192,24 @@ def generate_backtest_summary_metrics(

     # === PERCENTAGE RETURNS (compounded, not summed) ===
     # Compound returns: (1 + r1) * (1 + r2) * ... - 1
-    # For percentages stored as whole numbers (e.g., 10 for 10%)
+    # All percentages are stored as decimals (e.g. 0.10 for 10%).
     total_net_gain_percentage = _compound_percentage_returns(
         [b.total_net_gain_percentage for b in valid_metrics]
     )
-    total_loss_percentage = _compound_percentage_returns(
-        [b.total_loss_percentage for b in valid_metrics]
-    )
+    # ``total_loss`` is a non-multiplicative magnitude (it does not
+    # compound across windows). Express the aggregate as the sum of
+    # gross losses divided by the sum of initial capital across
+    # windows, which keeps the unit (decimal fraction) consistent
+    # with the per-run definition. See issue #511 (B2).
+    total_initial_value = 0.0
+    for b in valid_metrics:
+        iv = getattr(b, "initial_unallocated", None)
+        if isinstance(iv, (int, float)) and iv > 0:
+            total_initial_value += iv
+    if total_initial_value > 0 and total_loss is not None:
+        total_loss_percentage = total_loss / total_initial_value
+    else:
+        total_loss_percentage = None
     total_growth_percentage = _compound_percentage_returns(
         [b.total_growth_percentage for b in valid_metrics]
     )
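
The aggregate loss calculation introduced above can be reproduced in isolation. `RunMetrics` is a hypothetical stand-in carrying only the two fields the hunk reads, and the numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunMetrics:
    """Hypothetical stand-in for the per-run metrics fields used here."""
    total_loss: Optional[float]
    initial_unallocated: Optional[float]


def aggregate_total_loss(runs):
    """Sum gross losses and divide by the summed initial capital."""
    total_loss = sum(r.total_loss for r in runs if r.total_loss is not None)
    total_initial = sum(
        r.initial_unallocated
        for r in runs
        if isinstance(r.initial_unallocated, (int, float))
        and r.initial_unallocated > 0
    )
    pct = total_loss / total_initial if total_initial > 0 else None
    return total_loss, pct


runs = [RunMetrics(50.0, 1000.0), RunMetrics(150.0, 1000.0)]
total, pct = aggregate_total_loss(runs)  # total is 200.0, pct is about 0.10
```

For comparison, compounding the two per-run loss fractions (0.05 and 0.15) as the removed code did would give 1.05 × 1.15 − 1 = 0.2075, which does not correspond to any loss-to-capital ratio.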
