Merged
29 commits
8a3d987
feat(pipeline): add Phase 1 cross-sectional Pipeline API (#501)
MDUYN May 1, 2026
550f1b1
feat(pipeline): event-backtest examples + AverageTradedValue alias
MDUYN May 2, 2026
5b8aad4
Merge remote-tracking branch 'origin/dev' into feat/501-pipeline-api-…
MDUYN May 2, 2026
47776c1
feat(pipeline): add VectorPipelineEngine (#502 phase 2a)
MDUYN May 2, 2026
48bf3e6
feat(pipeline): wire VectorPipelineEngine into VectorBacktestService …
MDUYN May 2, 2026
aabf9ae
feat(pipeline): factor arithmetic, cross-sectional transforms, lazy e…
MDUYN May 2, 2026
4af8b2d
docs(pipeline): document factor algebra, cross-sectional transforms, …
MDUYN May 2, 2026
2850d5d
feat(pipeline): validate pipeline warmup_window at strategy construct…
MDUYN May 2, 2026
6a46b3a
feat(pipeline): live envelope, universe-refresh cache, per-pipeline r…
MDUYN May 2, 2026
042c8ab
feat(pipeline): risk-neutrality primitives (#504)
MDUYN May 2, 2026
02797d1
docs(pipeline): list risk-neutrality primitives in built-in factor ta…
MDUYN May 2, 2026
43c7631
test(pipeline): convert pipeline tests from pytest to unittest
MDUYN May 4, 2026
93f3a19
test(pipeline): convert pipeline tests from pytest to unittest
MDUYN May 4, 2026
127115d
test(pipeline): convert phase 2/3 pipeline tests from pytest to unittest
MDUYN May 4, 2026
babadae
test(pipeline): convert pipeline tests from pytest to unittest
MDUYN May 4, 2026
c714f52
test(pipeline): convert phase 2/3 pipeline tests from pytest to unittest
MDUYN May 4, 2026
c8e7cf4
test(pipeline): convert risk-neutrality tests from pytest to unittest
MDUYN May 4, 2026
9ece0fb
Merge pull request #506 from coding-kitties/feat/501-pipeline-api-phase1
MDUYN May 4, 2026
08a3b7f
Merge pull request #509 from coding-kitties/feat/504-pipeline-risk-ne…
MDUYN May 4, 2026
0f05c47
Merge remote-tracking branch 'origin/dev' into feat/503-pipeline-api-…
MDUYN May 4, 2026
0d0e99c
Merge pull request #515 from coding-kitties/feat/503-pipeline-api-pha…
MDUYN May 4, 2026
0c56940
feat(backtesting): single-bundle binary persistence format (#487)
MDUYN May 5, 2026
b84cce7
perf(backtesting): 3x faster Backtest.open + dashboard bundle support
MDUYN May 5, 2026
01c8154
feat(backtesting): streaming migrate_backtests with resume support
MDUYN May 5, 2026
d723f3e
feat(backtesting): add delete_source option to migrate_backtests
MDUYN May 5, 2026
d142374
perf(metrics): O(N^2) -> O(N) snapshot lookup in rolling Sharpe
MDUYN May 5, 2026
1e0c64a
docs(readme): clarify .iafbt format features (versioned, language-por…
MDUYN May 5, 2026
a271960
Merge pull request #516 from coding-kitties/fix/497-report-html-scaling
MDUYN May 5, 2026
2780eee
style: fix flake8 errors (E501, E131, F401)
MDUYN May 5, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -164,4 +164,5 @@ venv
examples/tutorial/data
examples/tutorial/backtest_results
examples/tutorial/resources
examples/**/resources/
.data_cache/
1 change: 1 addition & 0 deletions README.md
@@ -86,6 +86,7 @@ This framework is built around the full loop: **create strategies → vector bac
- 🎯 **Return Scenario Projections** — Good, average, bad & very bad year projections from backtest data
- 📉 **Benchmark Comparison** — Beat-rate analysis vs Buy & Hold, DCA, risk-free & custom benchmarks
- 📄 **One-Click HTML Report** — Self-contained file, no server, dark & light theme, shareable
- 📦 **Custom `.iafbt` Backtest Bundle Format** — An explicit, versioned, compressed, language-portable container (zstd + msgpack with magic-byte header) plus a separate parquet index for fast filtering without loading. ~21× smaller and ~27× fewer files than standard file-based directory layouts, with parallel I/O for fast load/save of large numbers of backtests.
- 🌐 **Load External Data** — Fetch CSV, JSON, or Parquet from any URL with caching and auto-refresh
- **[Record Custom Variables](https://coding-kitties.github.io/investing-algorithm-framework/Advanced%20Concepts/recording-variables)** — Track any indicator or metric during backtests with `context.record()`
- 🚀 **Build → Backtest → Deploy** — Local dev, cloud deploy (AWS / Azure), or monetize on Finterion
239 changes: 239 additions & 0 deletions docusaurus/docs/Advanced Concepts/pipelines-event-backtest.md
@@ -0,0 +1,239 @@
---
sidebar_position: 10
title: Pipelines — Event-driven backtest
description: Use cross-sectional pipelines in the event-driven backtest engine (Phase 1, available today).
---

# Pipelines: Event-driven backtest

This is the full Phase 1 reference. Read [Pipelines](pipelines.md) first
for the high-level concept.

## Quick start

```python
from typing import Any, Dict

from investing_algorithm_framework import (
    AverageDollarVolume,
    BacktestDateRange,
    Context,
    DataSource,
    Pipeline,
    Returns,
    TimeUnit,
    TradingStrategy,
    create_app,
)


class MomentumScreener(Pipeline):
    dollar_volume = AverageDollarVolume(window=30)
    momentum = Returns(window=30)

    universe = dollar_volume.top(3)
    alpha = momentum.rank(mask=universe)


class CrossSectionalMomentum(TradingStrategy):
    algorithm_id = "cross-sectional-momentum"
    time_unit = TimeUnit.DAY
    interval = 1
    data_sources = [
        DataSource(
            data_type="OHLCV",
            market="binance",
            symbol=symbol,
            warmup_window=60,
            time_frame="1d",
            identifier=f"{symbol}-ohlcv",
        )
        for symbol in ["BTC/EUR", "ETH/EUR", "SOL/EUR", "ADA/EUR", "XRP/EUR"]
    ]
    pipelines = [MomentumScreener]

    def run_strategy(self, context: Context, data: Dict[str, Any]):
        screen = data["MomentumScreener"]
        top = screen.sort("alpha", descending=True).head(2)
        for row in top.iter_rows(named=True):
            print(row["symbol"], row["momentum"], row["alpha"])


app = create_app()
app.add_strategy(CrossSectionalMomentum)
app.add_market(market="binance", trading_symbol="EUR", initial_balance=1000)

if __name__ == "__main__":
    app.run_backtest(
        backtest_date_range=BacktestDateRange(
            start_date="2024-01-01", end_date="2024-06-01"
        ),
    )
```

A complete runnable example lives in
[`examples/pipeline_momentum_screener.py`](https://github.com/coding-kitties/investing-algorithm-framework/blob/dev/examples/pipeline_momentum_screener.py).

## How it works

On every iteration of the event loop:

1. The framework discovers your strategy's OHLCV data sources from
   `strategy.data_sources` (filtered by `DataType.OHLCV`).
2. For each `Pipeline` listed in `strategy.pipelines`:
   - Each per-symbol OHLCV frame is converted to Polars (Pandas inputs
     are auto-converted) and lower-cased.
   - The frames are stacked into a long-form panel and **truncated at
     the current bar** (`datetime <= as_of`), which guarantees no
     look-ahead.
   - Each declared `Factor` is computed in vectorised Polars over the
     full panel.
   - The pipeline's `universe` filter (if any) is applied as a top-level
     mask; symbols failing the mask are dropped.
   - The frame is sliced to the current bar, and the result is stored
     under `data["YourPipelineClassName"]`.

The output is a `polars.DataFrame` with columns:

```
symbol | <factor_1> | <factor_2> | ... | <factor_n>
```

Symbols with no data at the current bar (e.g. listed late) are dropped.
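
The per-iteration steps above can be illustrated with a small plain-Python sketch. It is not the framework's Polars implementation: the `snapshot` helper, the tuple-based bars, and the sample prices are all hypothetical, and the universe step is left out for brevity. It only shows truncation at the current bar, a `Returns`-style factor, and dropping symbols without a bar at the current timestamp. How the framework reports a symbol that is still in warmup is not specified here; the sketch simply leaves its value as `None`.

```python
from datetime import date


def snapshot(frames, as_of, window):
    """Return one row per symbol for the bar at `as_of` (illustrative only)."""
    rows = []
    for symbol, bars in frames.items():
        # Truncate at the current bar: later rows are never visible.
        # Bars are assumed to be sorted by time.
        visible = [(dt, close) for dt, close in bars if dt <= as_of]
        # Drop symbols that have no bar at the current timestamp.
        if not visible or visible[-1][0] != as_of:
            continue
        closes = [close for _, close in visible]
        # Returns-style factor: close[t] / close[t - window] - 1.
        momentum = (
            closes[-1] / closes[-1 - window] - 1 if len(closes) > window else None
        )
        rows.append({"symbol": symbol, "momentum": momentum})
    return rows


# Hypothetical closes, not real market data.
frames = {
    "AAA": [
        (date(2024, 1, 1), 100.0),
        (date(2024, 1, 2), 110.0),
        (date(2024, 1, 3), 121.0),
    ],
    "BBB": [(date(2024, 1, 1), 50.0), (date(2024, 1, 2), 55.0)],  # no bar on Jan 3
}

jan2 = snapshot(frames, date(2024, 1, 2), window=1)
jan3 = snapshot(frames, date(2024, 1, 3), window=1)
print(jan2)
print(jan3)
```

Because the Jan 2 snapshot is built only from rows with `datetime <= as_of`, changing the Jan 3 close of `AAA` cannot change it.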

## Declaring a pipeline

```python
from investing_algorithm_framework import (
    AverageDollarVolume, Pipeline, Returns, RSI, SMA, Volatility,
)

class MyScreen(Pipeline):
    # Any factor declared as a class attribute becomes an output column
    # named after the attribute.
    momentum = Returns(window=30)
    rsi = RSI(window=14)
    vol = Volatility(window=30)

    # Optional: a Filter assigned to `universe` becomes the master mask.
    # Every other column is restricted to symbols where universe is True.
    # The universe column itself is NOT exposed in the output.
    universe = AverageDollarVolume(window=30).top(100)

    # `rank` works inside the universe.
    alpha = momentum.rank(mask=universe)
```

Rules enforced at class definition time:

- A pipeline must declare at least one factor column (otherwise
`TypeError`).
- The `universe` attribute, if present, **must** be a `Filter` (e.g.
`factor.top(n)` / `factor.bottom(n)`); using a plain `Factor` raises
`TypeError`.
- Factors and filters are inherited via the MRO; subclass declarations
override parent ones with the same name.

## Built-in factors

| Class | Inputs | Notes |
| --- | --- | --- |
| `Returns(window)` | close | `close[t] / close[t - window] - 1` |
| `AverageDollarVolume(window)` | close, volume | rolling mean of `close * volume` |
| `SMA(window)` | close | simple moving average |
| `RSI(window)` | close | Wilder's RSI; clamps to 100 when there are no losses |
| `Volatility(window, periods_per_year=252)` | close | rolling stdev of log returns × √periods_per_year |

All built-ins are computed per symbol via `over("symbol")`, so one
symbol's history never affects another symbol's values.
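
As a worked illustration of the formulas in the table, here is a plain-Python sketch for a single symbol. The function names and sample numbers are hypothetical, and the use of the sample standard deviation in `volatility` is an assumption of this sketch rather than a confirmed detail of the built-in `Volatility` factor.

```python
import math
import statistics


def returns(closes, window):
    """close[t] / close[t - window] - 1 at the latest bar."""
    return closes[-1] / closes[-1 - window] - 1


def average_dollar_volume(closes, volumes, window):
    """Mean of close * volume over the last `window` bars."""
    dollar = [c * v for c, v in zip(closes, volumes)]
    return sum(dollar[-window:]) / window


def volatility(closes, window, periods_per_year=252):
    """Stdev of the last `window` log returns, scaled by sqrt(periods_per_year)."""
    log_returns = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    # Sample standard deviation; this choice is an assumption of the sketch.
    return statistics.stdev(log_returns[-window:]) * math.sqrt(periods_per_year)


# Hypothetical single-symbol history, oldest bar first.
closes = [100.0, 102.0, 101.0, 105.0]
volumes = [10.0, 20.0, 30.0, 40.0]

ret = returns(closes, window=3)                         # 105 / 100 - 1
adv = average_dollar_volume(closes, volumes, window=2)  # (101*30 + 105*40) / 2
vol = volatility(closes, window=3)
print(ret, adv, vol)
```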

## Custom factors

Subclass `CustomFactor` and implement `compute_panel`:

```python
import polars as pl
from investing_algorithm_framework import CustomFactor

class HighLowRange(CustomFactor):
    inputs = ["high", "low"]
    window = 1

    def compute_panel(self, panel: pl.DataFrame) -> pl.Series:
        return (panel["high"] - panel["low"]).rename("range")
```

`compute_panel` receives the full long-form panel and must return a
`pl.Series` aligned with the panel rows. Set:

- `inputs` — the OHLCV columns you read from the panel.
- `window` — the lookback in bars (used for warmup sizing checks in
future phases; also exposed via `pipeline.required_window()`).

## Cross-sectional ops

```python
factor.rank(mask=optional_filter) # ascending ordinal rank per bar
factor.top(n) # mask: top-n per bar by descending value
factor.bottom(n) # mask: bottom-n per bar by ascending value
```

`rank` returns ordinal ranks (1, 2, 3, …) within each `datetime`. With
a `mask`, symbols outside the mask receive `null`.
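
To make the semantics concrete, the following plain-Python sketch mimics the three operations for a single bar. The helper functions and the sample values are hypothetical stand-ins, not the framework's Polars code, and the tie-breaking shown (ties keep input order) is an assumption.

```python
def top(values, n):
    """Mask: True for the n largest values in one bar."""
    keep = sorted(values, key=values.get, reverse=True)[:n]
    return {symbol: symbol in keep for symbol in values}


def bottom(values, n):
    """Mask: True for the n smallest values in one bar."""
    keep = sorted(values, key=values.get)[:n]
    return {symbol: symbol in keep for symbol in values}


def rank(values, mask=None):
    """Ascending ordinal rank (1 = smallest); None outside the mask."""
    eligible = [s for s in values if mask is None or mask[s]]
    # sorted() is stable, so tied values keep their input order (assumption).
    ordered = sorted(eligible, key=values.get)
    ranks = {symbol: None for symbol in values}
    for position, symbol in enumerate(ordered, start=1):
        ranks[symbol] = position
    return ranks


# One bar of hypothetical factor values.
dollar_volume = {"BTC": 900.0, "ETH": 700.0, "SOL": 300.0, "ADA": 100.0}
momentum = {"BTC": 0.02, "ETH": 0.10, "SOL": -0.05, "ADA": 0.30}

universe = top(dollar_volume, 3)
alpha = rank(momentum, mask=universe)
print(universe)
print(alpha)
```

Because the rank is ascending, the strongest momentum inside the mask gets the highest rank, which is why the quick start sorts `alpha` in descending order. `ADA` has the largest momentum here but falls outside the mask, so it receives no rank.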

## Reading the result

```python
import polars as pl

def run_strategy(self, context, data):
    screen: pl.DataFrame = data["MomentumScreener"]
    if screen.is_empty():
        return  # universe drained or warmup not yet satisfied

    top = screen.sort("alpha", descending=True).head(5)
    symbols = top["symbol"].to_list()
```

Common patterns:

```python
# Symbol → row dict
rows = {row["symbol"]: row for row in screen.iter_rows(named=True)}

# To pandas if you prefer:
pdf = screen.to_pandas()
```
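
Once the rows are plain dicts, selection logic can be written and unit-tested without Polars. The sketch below uses a hypothetical `pick_targets` helper and made-up rows shaped like the output of `screen.iter_rows(named=True)`. It skips rows whose score is `None`, which could occur if a `rank` mask differs from the pipeline's `universe`.

```python
def pick_targets(rows, column, n):
    """Return the symbols with the n highest non-null `column` values."""
    scored = [row for row in rows if row[column] is not None]
    scored.sort(key=lambda row: row[column], reverse=True)
    return [row["symbol"] for row in scored[:n]]


# Hypothetical rows; values are illustrative only.
rows = [
    {"symbol": "BTC/EUR", "momentum": 0.02, "alpha": 2},
    {"symbol": "ETH/EUR", "momentum": 0.10, "alpha": 3},
    {"symbol": "SOL/EUR", "momentum": -0.05, "alpha": 1},
    {"symbol": "ADA/EUR", "momentum": 0.30, "alpha": None},
]

targets = pick_targets(rows, "alpha", n=2)
no_targets = pick_targets([], "alpha", n=2)  # empty screen yields an empty list
print(targets, no_targets)
```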

## Performance notes

Phase 1 is **eager**: the panel is rebuilt on every iteration. That is
fine for daily/hourly backtests with up to a few hundred symbols. If
you need to push further, Phase 2 ([#502](https://github.com/coding-kitties/investing-algorithm-framework/issues/502))
introduces a vector-mode pipeline executor that materialises factors
once over the full backtest window.
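
The difference between the two execution styles can be sketched in plain Python for a single causal factor (a simple moving average). This is only an illustration of the idea under stated assumptions, not the Phase 1 or Phase 2 engine: `eager` and `materialised_once` are hypothetical names, and the sample closes are made up. Because the factor reads only past and current bars, recomputing it on a truncated history at every bar gives the same values as one pass over the full history.

```python
def sma_at(closes, window):
    """SMA at the latest bar, or None during warmup."""
    if len(closes) < window:
        return None
    return sum(closes[-window:]) / window


def eager(closes, window):
    """Eager style: rebuild the truncated history on every bar."""
    return [sma_at(closes[: t + 1], window) for t in range(len(closes))]


def materialised_once(closes, window):
    """Vector style: a single pass with a running sum over the full history."""
    out, running = [], 0.0
    for t, close in enumerate(closes):
        running += close
        if t >= window:
            running -= closes[t - window]  # drop the bar that left the window
        out.append(running / window if t >= window - 1 else None)
    return out


closes = [10.0, 11.0, 12.0, 13.0, 14.0]  # hypothetical closes
eager_values = eager(closes, 3)
vector_values = materialised_once(closes, 3)
print(eager_values)
print(vector_values)
```

The eager version copies and rescans history at each bar, so its cost grows with the square of the number of bars, while the single pass stays linear.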

## Limitations (Phase 1)

- No factor arithmetic (`a + b`, `a / b`, `(a - b).zscore()`); use a
`CustomFactor` for now.
- No cached results between bars (rebuilt each iteration).
- Only OHLCV inputs. External data joining is on the roadmap.

These are intentional — the goal of Phase 1 is to nail down the public
declarative surface (`Pipeline`, `Factor`, `Filter`, `top` / `bottom` /
`rank`) before scaling the executor.

## Troubleshooting

- **`TypeError: <Pipeline>.universe must be a Filter`** — assign a
`Filter` (e.g. `factor.top(100)`), not a raw `Factor`.
- **`KeyError: ... missing required column 'volume'`** — your data
source is not OHLCV; pipelines need full OHLCV frames.
- **Empty result frame** — either your warmup hasn't yet satisfied the
largest `window`, or your universe filtered every symbol out.

## See also

- [Pipelines](pipelines.md) — concept page.
- [Pipelines: Vector backtest](pipelines-vector-backtest.md) — Phase 2 roadmap.
- [Pipelines: Live trading](pipelines-live.md) — Phase 3 roadmap.
- Design doc: [`docs/design/pipeline-api.md`](https://github.com/coding-kitties/investing-algorithm-framework/blob/dev/docs/design/pipeline-api.md).