| 1 | +# Bundle Format v2 — Public Specification |
| 2 | + |
| 3 | +**Status:** Stable. Default writer since `v8.9.0` (May 2026). |
| 4 | +**File extension:** `.iafbt` |
| 5 | +**Backwards compatibility:** v1 bundles remain readable indefinitely. |
| 6 | + |
| 7 | +This document describes the on-disk binary format produced by |
| 8 | +`save_bundle()` and consumed by `open_bundle()` / |
| 9 | +`Backtest.open()`. Third-party tools (e.g. the Finterion upload CLI |
| 10 | +and ingestion pipeline) can rely on this contract. |
| 11 | + |
| 12 | +--- |
| 13 | + |
| 14 | +## Why v2 |
| 15 | + |
| 16 | +v1 stored the entire `Backtest.to_dict()` graph as a single |
| 17 | +zstd-compressed MessagePack document. That was already efficient for |
| 18 | +small backtests, but two structural problems became visible at scale |
| 19 | +(thousands of bundles per user): |
| 20 | + |
| 21 | +1. **Heavy time series stored as JSON-ish lists of `(float, |
| 22 | + ISO-string)` tuples** — the strings dominate the on-disk size for |
| 23 | + long-running backtests (e.g. 10y daily ≈ 2,500 entries × 8 series). |
| 24 | + ISO-8601 strings are ~25 bytes each; an `int64` epoch-ms is 8 bytes, |
| 25 | + and Parquet's columnar encoding and compression shrink the |
| 26 | + regularly spaced timestamps further. |
| 27 | + |
| 28 | +2. **No way to distinguish vector from event backtests** in the |
| 29 | + on-disk envelope, even though they're produced by separate engines |
| 30 | + with subtly different semantics. Reports and analyses had to |
| 31 | + guess from filename or metadata. |
| 32 | + |
| 33 | +v2 fixes both without breaking v1. |
| 34 | + |
| 35 | +--- |
| 36 | + |
| 37 | +## Outer envelope (unchanged from v1) |
| 38 | + |
| 39 | +``` |
| 40 | ++-----------+-----------+--------------------------------+ |
| 41 | +| 4 bytes | 4 bytes | N bytes | |
| 42 | +| "IAFB" | uint32 LE | zstd(level=7, msgpack(doc)) | |
| 43 | ++-----------+-----------+--------------------------------+ |
| 44 | + magic version compressed body |
| 45 | +``` |
| 46 | + |
| 47 | +The 4-byte little-endian uint32 holds the format version (1 or 2). |
| 48 | +The body is always zstd-compressed MessagePack with `use_bin_type=True`. |
| 49 | + |
| 50 | +Readers MUST reject any version greater than the highest they support, |
| 51 | +and SHOULD inspect the magic before attempting to decompress. |
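
As an illustration of the header check described above, here is a minimal sketch using only the standard library. The names `parse_header` and `MAX_SUPPORTED_VERSION` are hypothetical and not part of the framework's API; decompressing and unpacking the body is out of scope here.

```python
import struct

MAGIC = b"IAFB"
MAX_SUPPORTED_VERSION = 2  # assumption: this reader understands v1 and v2
HEADER = struct.Struct("<4sI")  # 4-byte magic + little-endian uint32 version


def parse_header(raw: bytes) -> int:
    """Validate the 8-byte envelope header and return the format version."""
    if len(raw) < HEADER.size:
        raise ValueError("file too short to be a bundle")
    magic, version = HEADER.unpack_from(raw)
    if magic != MAGIC:
        raise ValueError(f"bad magic: {magic!r}")
    # Reject versions this reader does not know before touching the body.
    if not 1 <= version <= MAX_SUPPORTED_VERSION:
        raise ValueError(f"unsupported bundle version: {version}")
    return version
```

Checking the magic and version first means a reader never feeds an unrelated or too-new file to the zstd decoder.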
| 52 | + |
| 53 | +--- |
| 54 | + |
| 55 | +## v2 document structure |
| 56 | + |
| 57 | +```python |
| 58 | +{ |
| 59 | + "format_version": 2, |
| 60 | + "engine_type": "vector" | "event" | None, |
| 61 | + |
| 62 | + # Engine-agnostic top-level fields (carry across both engines) |
| 63 | + "algorithm_id": str, |
| 64 | + "metadata": dict, |
| 65 | + "risk_free_rate": float | None, |
| 66 | + "strategy_ids": list, |
| 67 | + "parameters": dict, |
| 68 | + "tag": str | None, |
| 69 | + "backtest_permutation_tests": list | None, |
| 70 | + |
| 71 | + # Exactly ONE of the three pairs below is populated, based on engine_type: |
| 72 | + "vector_runs": [run_dict, ...], # if engine_type == "vector" |
| 73 | + "vector_metrics": summary_dict, # if engine_type == "vector" |
| 74 | + |
| 75 | + "event_runs": [run_dict, ...], # if engine_type == "event" |
| 76 | + "event_metrics": summary_dict, # if engine_type == "event" |
| 77 | + |
| 78 | + # Fallback for legacy / unknown-engine bundles: |
| 79 | + "backtest_runs": [run_dict, ...], # if engine_type is None |
| 80 | + "backtest_summary": summary_dict, # if engine_type is None |
| 81 | + |
| 82 | + # Optional: embedded heavy-series Parquet blobs |
| 83 | + "blobs": { |
| 84 | + "runs/<idx>/metrics/<field>.parquet": bytes, |
| 85 | + ... |
| 86 | + }, |
| 87 | + |
| 88 | + # Optional: OHLCV manifest (unchanged from v1) |
| 89 | + "ohlcv": { |
| 90 | + "store_dir": str, # relative to bundle file |
| 91 | + "manifest": {key: relative_path}, |
| 92 | + }, |
| 93 | +} |
| 94 | +``` |
| 95 | + |
| 96 | +### Engine routing |
| 97 | + |
| 98 | +| `engine_type` | Runs key        | Summary key        | |
| 99 | +| ------------- | --------------- | ------------------ | |
| 100 | +| `"vector"`    | `vector_runs`   | `vector_metrics`   | |
| 101 | +| `"event"`     | `event_runs`    | `event_metrics`    | |
| 102 | +| `None`        | `backtest_runs` | `backtest_summary` | |
| 103 | + |
| 104 | +A bundle holds exactly **one** engine's results. Mixing engines in a |
| 105 | +single bundle is not supported in v2 — produce two bundles and store |
| 106 | +them in the same directory. |
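
The routing table can be expressed as a simple lookup. The sketch below is illustrative only: `select_results` is a hypothetical helper, and two behaviours are assumptions rather than part of this specification: a missing `engine_type` key is treated like `None`, and an unrecognised value raises `ValueError`.

```python
from typing import Optional, Tuple

# Mirrors the routing table: engine_type -> (runs key, summary key).
ROUTING = {
    "vector": ("vector_runs", "vector_metrics"),
    "event": ("event_runs", "event_metrics"),
    None: ("backtest_runs", "backtest_summary"),
}


def select_results(doc: dict) -> Tuple[list, Optional[dict]]:
    """Return (runs, summary) for the engine recorded in a decoded v2 document."""
    engine_type = doc.get("engine_type")  # assumption: missing key means None
    if engine_type not in ROUTING:
        # Assumption: unknown engines are rejected rather than guessed at.
        raise ValueError(f"unknown engine_type: {engine_type!r}")
    runs_key, summary_key = ROUTING[engine_type]
    return doc.get(runs_key, []), doc.get(summary_key)
```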
| 107 | + |
| 108 | +### Metric blob extraction |
| 109 | + |
| 110 | +Eight `BacktestMetrics` fields are extracted from each run's |
| 111 | +`backtest_metrics` dict and replaced with a `{"@blob": "<key>"}` |
| 112 | +reference; the actual Parquet bytes go into the top-level `blobs` map. |
| 113 | + |
| 114 | +The eight fields are all `List[Tuple[float, datetime|date]]`: |
| 115 | + |
| 116 | +- `equity_curve` |
| 117 | +- `drawdown_series` |
| 118 | +- `cumulative_return_series` |
| 119 | +- `rolling_sharpe_ratio` |
| 120 | +- `monthly_returns` |
| 121 | +- `yearly_returns` |
| 122 | +- `twr_equity_curve` |
| 123 | +- `twr_drawdown_series` |
| 124 | + |
| 125 | +Each blob is a 2-column Parquet file (zstd compression level 5): |
| 126 | + |
| 127 | +| Column  | Type    | Semantics              | |
| 128 | +| ------- | ------- | ---------------------- | |
| 129 | +| `ts`    | int64   | UTC epoch milliseconds | |
| 130 | +| `value` | float64 | The metric value       | |
| 131 | + |
| 132 | +The blob key follows the convention |
| 133 | +`runs/<index>/metrics/<field_name>.parquet` where `<index>` is the |
| 134 | +zero-based offset of the run within `vector_runs` / `event_runs` / |
| 135 | +`backtest_runs` and `<field_name>` is one of the eight names above. |
| 136 | + |
| 137 | +If a series has no usable `(value, datetime)` pair, the writer leaves it |
| 138 | +inline (no blob extraction). Readers MUST handle both forms for any field. |
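
To make the key convention and the column layout concrete, the following sketch builds a blob key and splits a series into the `ts` / `value` columns. It stops short of Parquet encoding, which would need a third-party library. All function names are hypothetical, and treating naive datetimes and plain dates as UTC is an assumption, not something this specification states.

```python
from datetime import date, datetime, timedelta, timezone
from typing import List, Sequence, Tuple, Union

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
Stamp = Union[date, datetime]


def blob_key(run_index: int, field_name: str) -> str:
    """Build the key used in the top-level `blobs` map."""
    return f"runs/{run_index}/metrics/{field_name}.parquet"


def to_epoch_ms(stamp: Stamp) -> int:
    """Convert a date or datetime to UTC epoch milliseconds."""
    # datetime is a subclass of date, so check for datetime first.
    if isinstance(stamp, datetime):
        dt = stamp
    else:
        dt = datetime(stamp.year, stamp.month, stamp.day)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    # Integer timedelta division avoids float rounding on large timestamps.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def series_to_columns(
    series: Sequence[Tuple[float, Stamp]],
) -> Tuple[List[int], List[float]]:
    """Split [(value, timestamp), ...] into the `ts` and `value` columns."""
    ts = [to_epoch_ms(stamp) for _, stamp in series]
    values = [float(value) for value, _ in series]
    return ts, values
```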
| 139 | + |
| 140 | +### Other fields |
| 141 | + |
| 142 | +Fields that are NOT extracted into Parquet blobs in v2: |
| 143 | + |
| 144 | +- `portfolio_snapshots`, `trades`, `orders`, `positions` — stay as |
| 145 | + msgpack lists of dicts. Their schemas are unstable across model |
| 146 | + changes, and msgpack is sufficient for the typical row counts. |
| 147 | +- All scalar metrics (`sharpe_ratio`, `max_drawdown`, etc.) — stay |
| 148 | + inline. The whole point is keeping these fast to read. |
| 149 | +- `signals`, `signal_events`, `recorded_values`, `data_sources`, |
| 150 | + `metadata` on each run — stay inline. |
| 151 | + |
| 152 | +A future v2.x revision MAY extract additional fields. Readers MUST |
| 153 | +treat the `blobs` map as authoritative: any key found there |
| 154 | +overrides the inline value (the writer is required to leave the |
| 155 | +inline placeholder as `{"@blob": "<key>"}` to make this unambiguous). |
| 156 | + |
| 157 | +--- |
| 158 | + |
| 159 | +## Reader contract |
| 160 | + |
| 161 | +`open_bundle(path)` MUST: |
| 162 | + |
| 163 | +1. Read 8 bytes; verify magic, parse version. |
| 164 | +2. Decompress (zstd) and unpack (msgpack) the body. |
| 165 | +3. If `version == 1`: dispatch through the v1 reader (legacy |
| 166 | + `{"backtest": <to_dict>}` envelope). |
| 167 | +4. If `version == 2`: route runs/summary based on `engine_type`, |
| 168 | + resolve blob references against the `blobs` map (replacing each |
| 169 | + `{"@blob": "<key>"}` with the decoded `[(value, iso_string), ...]` |
| 170 | + list), and reconstruct a `Backtest` via `Backtest.from_dict`. |
| 171 | +5. Reject any `version > BUNDLE_FORMAT_VERSION`. |
| 172 | + |
| 173 | +### Summary-only mode |
| 174 | + |
| 175 | +`open_bundle(path, summary_only=True)` skips the Parquet decode step. |
| 176 | +Each blob reference is replaced with an empty list (so |
| 177 | +`BacktestMetrics.from_dict` doesn't choke). All scalar summary |
| 178 | +metrics (Sharpe, Sortino, max DD, CAGR, win-rate, …) remain fully |
| 179 | +populated. Use this for bulk listing / ranking pipelines that don't |
| 180 | +draw charts. |
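
Step 4 of the reader contract and the summary-only mode can be sketched as one recursive walk over the decoded document. This is an illustration under stated assumptions, not the framework's implementation: `resolve_blobs` is a hypothetical name, the Parquet decoder is passed in as a `decode` callable so the sketch stays dependency-free, and a reference to a missing blob is assumed to raise `KeyError`.

```python
from typing import Any, Callable, Dict, List


def is_blob_ref(node: Any) -> bool:
    """True for the inline placeholder {"@blob": "<key>"}."""
    return isinstance(node, dict) and set(node) == {"@blob"}


def resolve_blobs(
    node: Any,
    blobs: Dict[str, bytes],
    decode: Callable[[bytes], List[tuple]],
    summary_only: bool = False,
) -> Any:
    """Replace every blob reference in `node` with its decoded series."""
    if is_blob_ref(node):
        if summary_only:
            return []  # skip the Parquet decode entirely
        # Assumption: a dangling reference is an error (KeyError).
        return decode(blobs[node["@blob"]])
    if isinstance(node, dict):
        return {k: resolve_blobs(v, blobs, decode, summary_only) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_blobs(v, blobs, decode, summary_only) for v in node]
    return node  # scalars and inline series pass through unchanged
```

Because the walk only touches placeholder dicts, series that the writer left inline and blob keys that nothing references are both handled without special cases.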
| 181 | + |
| 182 | +--- |
| 183 | + |
| 184 | +## Writer contract |
| 185 | + |
| 186 | +`save_bundle(backtest, path)` MUST: |
| 187 | + |
| 188 | +1. Default to `format_version = BUNDLE_FORMAT_VERSION` (currently 2). |
| 189 | +2. Accept `format_version=1` for explicit downgrade. |
| 190 | +3. Write atomically (write to `<path>.tmp`, then `os.replace`). |
| 191 | +4. Set `engine_type` from `backtest.engine_type`. |
| 192 | +5. For v2: extract the eight metric series into Parquet blobs only |
| 193 | + when the source list has at least one usable `(value, datetime)` |
| 194 | + pair; leave malformed or empty series inline. |
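
The atomic-write requirement in step 3 follows a common pattern, sketched below with the standard library. `write_atomically` is a hypothetical helper; the `fsync` call and the cleanup of a partial temp file are assumptions that go beyond what the contract requires.

```python
import os


def write_atomically(path: str, payload: bytes) -> None:
    """Write `payload` to `<path>.tmp`, then swap it into place."""
    tmp_path = f"{path}.tmp"
    try:
        with open(tmp_path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # assumption: flush to disk before the swap
        # os.replace overwrites an existing destination in a single step,
        # so readers see either the old bundle or the new one, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        # Assumption: do not leave a partial temp file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The temp file lives next to the destination so that both paths are on the same filesystem, which `os.replace` needs in order to be atomic.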
| 195 | + |
| 196 | +### OHLCV float32 quantization |
| 197 | + |
| 198 | +`save_bundle(..., float32_ohlcv=True)` downcasts float64 OHLCV |
| 199 | +columns to float32 before Parquet encoding. Typical reduction is ~2x |
| 200 | +on the OHLCV side store; backtest metrics are unaffected for |
| 201 | +crypto / equity time series. Off by default to preserve the v1 |
| 202 | +exact-round-trip contract — opt in for upload / archive workflows. |
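
To give a feel for what the downcast costs, the sketch below round-trips a single price through IEEE 754 float32 using the standard library. It only illustrates the precision bound of the number format (a relative error of at most 2^-24 for values in the normal float32 range); it makes no claim about how the framework performs the conversion. The price is an arbitrary example value.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


price = 43251.37  # arbitrary example value
quantized = to_float32(price)
relative_error = abs(quantized - price) / price

# float32 keeps a 24-bit significand, so round-to-nearest is off by
# at most 2**-24 (about 6e-8) relative to the original value.
assert relative_error <= 2 ** -24
```

Values that are exactly representable in float32, such as `0.5`, survive the round trip unchanged; most decimal prices do not, which is why the option breaks exact round-tripping.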
| 203 | + |
| 204 | +--- |
| 205 | + |
| 206 | +## Size expectations |
| 207 | + |
| 208 | +For a 10-year daily backtest with one run and three trades per week, |
| 209 | +typical metric-series savings are: |
| 210 | + |
| 211 | +| Item | v1 inline (ISO strings)| v2 Parquet blob | |
| 212 | +| ------------------------------ | ----------------------:| ---------------:| |
| 213 | +| `equity_curve` (2,500 entries) | ~120 KB | ~25 KB | |
| 214 | +| `drawdown_series` (2,500) | ~120 KB | ~22 KB | |
| 215 | +| `monthly_returns` (120) | ~6 KB | ~2 KB | |
| 216 | +| 8 series total | ~500 KB | ~80 KB | |
| 217 | + |
| 218 | +Typical full-bundle size reduction for "metric-heavy" backtests |
| 219 | +(many runs, long horizons): **30-80%**. For "snapshot-heavy" |
| 220 | +backtests where `portfolio_snapshots` dominates, savings are smaller |
| 221 | +(snapshots aren't extracted in v2.0); a future v2.x revision will |
| 222 | +address this. |
| 223 | + |
| 224 | +For tiny / smoke-test backtests with <50 entries per series, v2 can |
| 225 | +be **slightly larger** than v1 because Parquet's per-file overhead (footer |
| 226 | +and schema metadata) exceeds the savings. This is expected and harmless. |
| 227 | + |
| 228 | +--- |
| 229 | + |
| 230 | +## Versioning policy |
| 231 | + |
| 232 | +- Bumping the bundle `format_version` integer is a **breaking change |
| 233 | + for readers** shipped with older framework versions. |
| 234 | +- The framework will continue to read all historical versions |
| 235 | + indefinitely. There is no plan to drop v1 read support. |
| 236 | +- Writers default to the highest version the framework knows about. |
| 237 | +- Additive changes within v2 (e.g. extracting more fields into |
| 238 | + blobs) MUST be safe for v2 readers that don't know about the new |
| 239 | + blobs — they should receive the inline value as a fallback. |
| 240 | +- A bundle with `format_version=2` MAY contain blob keys the reader |
| 241 | + doesn't recognise. Readers MUST ignore unknown blob keys. |