Pipeline API — Phase 3 follow-ups (live hardening, residual work)
Tracks the remaining building blocks for production-grade live
pipeline trading after #503 phase 3a–3d shipped (PRs #508 + #509).
Phase 3a (warmup validation), 3b (live envelope), 3c
(refresh_universe_every cache), and 3d (per-pipeline resilience)
are all merged in feat/503-pipeline-api-phase3-live.
Remaining building blocks
1. Batched / async OHLCV fetch in CCXTOHLCVDataProvider
CCXT's fetch_ohlcv is per-symbol. For pipelines spanning many
symbols this serialises the cold path of every iteration. We need a
rate-limit-aware concurrent fan-out (asyncio + a bounded semaphore
keyed off exchange.rateLimit). This was deliberately split off
from #503 because the failure modes (per-exchange rate-limit drift,
partial fills under burst) really need live integration testing
against a real exchange before we're confident.
Acceptance
- Concurrent fetch with rate-limit awareness keyed on the CCXT
exchange's reported limit.
- A single failing per-symbol fetch is logged and the surviving
symbols still produce a panel (already enforced at the engine
level by phase 3d, but we want it at the provider level too so
the pipeline gets null for the failed symbol rather than a
silent retry storm).
- 24h paper-trading run on Bitvavo with a 20-symbol daily universe.
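A minimal sketch of the provider-level fan-out described above, not the CCXTOHLCVDataProvider implementation. The names fetch_many, fetch_one, max_concurrency and the fake fetcher are hypothetical; rate_limit_ms stands in for CCXT's exchange.rateLimit (milliseconds between requests). In a real provider, fetch_one would presumably wrap fetch_ohlcv on a ccxt.async_support exchange. A failed symbol is logged and mapped to None instead of being retried.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Dict, List, Optional, Tuple

logger = logging.getLogger(__name__)

# Hypothetical fetcher signature: async (symbol) -> list of OHLCV rows.
Fetcher = Callable[[str], Awaitable[List[list]]]


async def fetch_many(
    symbols: List[str],
    fetch_one: Fetcher,
    rate_limit_ms: float,
    max_concurrency: int = 4,
) -> Dict[str, Optional[List[list]]]:
    """Fetch all symbols concurrently; a failed symbol maps to None."""
    semaphore = asyncio.Semaphore(max_concurrency)  # bounds in-flight requests
    pacing_lock = asyncio.Lock()                    # spaces out request starts
    min_interval = rate_limit_ms / 1000.0

    async def guarded(symbol: str) -> Tuple[str, Optional[List[list]]]:
        async with semaphore:
            # Only one task at a time may wait out the pacing interval, so
            # request starts are at least min_interval seconds apart.
            async with pacing_lock:
                await asyncio.sleep(min_interval)
            try:
                return symbol, await fetch_one(symbol)
            except Exception:
                # Log and surface None; no retry loop at this level.
                logger.exception("OHLCV fetch failed for %s", symbol)
                return symbol, None

    results = await asyncio.gather(*(guarded(s) for s in symbols))
    return dict(results)


async def fake_fetch(symbol: str) -> List[list]:
    """Stand-in for a real exchange call; fails for one symbol."""
    if symbol == "BAD/EUR":
        raise RuntimeError("exchange error")
    return [[0, 1.0, 1.0, 1.0, 1.0, 10.0]]


symbols = ["BTC/EUR", "BAD/EUR", "ETH/EUR"]
panel_inputs = asyncio.run(fetch_many(symbols, fake_fetch, rate_limit_ms=10))
```

The pacing lock is the simplest way to honour a per-request interval; a token bucket would allow short bursts but is harder to reason about when the exchange's limit drifts.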
2. Cross-market order routing for pipeline universes
A pipeline universe spanning, e.g., BITVAVO + BINANCE should
place orders on the right portfolio. Today the order layer assumes
one strategy → one market. Likely needs a market column on the
pipeline output the order helper can pivot on, plus a portfolio
lookup keyed by (strategy_id, market). Flagged in #503 as
"likely its own sub-issue" — confirmed.
Acceptance
- Pipeline output includes (or makes resolvable) the source market
per row.
- context.create_order(symbol, market=...) dispatches to the
matching portfolio.
- Test covering a 2-market universe placing orders on each.
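The portfolio lookup keyed by (strategy_id, market) could look roughly like the sketch below. It uses plain dicts in place of the real pipeline frame and portfolio objects; route_orders, the row layout, and the portfolio identifiers are all hypothetical and only illustrate pivoting on a market column.

```python
from typing import Dict, List, Tuple

# Hypothetical registry: (strategy_id, market) -> portfolio identifier.
PortfolioKey = Tuple[str, str]


def route_orders(
    strategy_id: str,
    rows: List[Dict[str, str]],
    portfolios: Dict[PortfolioKey, str],
) -> Dict[str, List[str]]:
    """Group pipeline output rows by the portfolio that owns their market."""
    routed: Dict[str, List[str]] = {}
    for row in rows:
        key = (strategy_id, row["market"])
        if key not in portfolios:
            # Fail loudly rather than silently placing on the wrong market.
            raise KeyError(
                f"no portfolio for strategy {strategy_id!r} "
                f"on market {row['market']!r}"
            )
        routed.setdefault(portfolios[key], []).append(row["symbol"])
    return routed


# Pipeline output with the source market resolvable per row.
rows = [
    {"symbol": "BTC/EUR", "market": "BITVAVO"},
    {"symbol": "ETH/USDT", "market": "BINANCE"},
    {"symbol": "ETH/EUR", "market": "BITVAVO"},
]
portfolios = {
    ("momentum", "BITVAVO"): "bitvavo-portfolio",
    ("momentum", "BINANCE"): "binance-portfolio",
}
routed = route_orders("momentum", rows, portfolios)
```

Raising on an unknown (strategy_id, market) pair is a design choice for the sketch; the real order helper might prefer to skip the row and log it.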
3. Streaming / append-only panel updates
Today every iteration rebuilds the long-form panel from scratch.
For high-frequency live use this is wasteful: all historical bars
are already in memory and only the latest bar is new. We should
support a streaming mode that appends the new bar(s) and
re-evaluates incrementally.
Acceptance
- A pipeline engine variant that keeps the panel resident across
iterations and grows it by appending new bars.
- Cache invalidation when as_of jumps backwards (e.g. backtest
re-runs).
- Memory benchmarks vs. full rebuild on a 50-symbol / 1y / 1d
panel.
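As a rough illustration of the resident-panel idea, the sketch below keeps rows in memory, appends only bars newer than the previous as_of, and falls back to a full rebuild when as_of moves backwards. ResidentPanel, load_full, and load_since are hypothetical names, and tuples stand in for the real long-form panel.

```python
from datetime import datetime
from typing import Callable, List, Optional, Tuple

Bar = Tuple[datetime, str, float]  # (bar datetime, symbol, close)


class ResidentPanel:
    """Keeps the panel in memory across iterations and grows it by appending."""

    def __init__(self) -> None:
        self.rows: List[Bar] = []
        self.last_as_of: Optional[datetime] = None
        self.rebuilds = 0

    def update(
        self,
        as_of: datetime,
        load_full: Callable[[datetime], List[Bar]],
        load_since: Callable[[datetime, datetime], List[Bar]],
    ) -> List[Bar]:
        if self.last_as_of is None or as_of < self.last_as_of:
            # Cold start, or as_of jumped backwards: invalidate and rebuild.
            self.rows = list(load_full(as_of))
            self.rebuilds += 1
        else:
            # Warm path: append only bars in (last_as_of, as_of].
            self.rows.extend(load_since(self.last_as_of, as_of))
        self.last_as_of = as_of
        return self.rows


# In-memory history standing in for the data provider.
history = [(datetime(2024, 1, d), "BTC/EUR", 100.0 + d) for d in range(1, 6)]


def load_full(as_of: datetime) -> List[Bar]:
    return [bar for bar in history if bar[0] <= as_of]


def load_since(after: datetime, up_to: datetime) -> List[Bar]:
    return [bar for bar in history if after < bar[0] <= up_to]


panel = ResidentPanel()
n_cold = len(panel.update(datetime(2024, 1, 3), load_full, load_since))
n_append = len(panel.update(datetime(2024, 1, 5), load_full, load_since))
n_rewind = len(panel.update(datetime(2024, 1, 2), load_full, load_since))
```

The sketch only covers panel maintenance; incremental re-evaluation of factors on top of the appended rows is a separate problem.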
4. Partial-bar / unclosed-candle handling
A live fetch_ohlcv typically returns the in-progress (unclosed)
bar as the last row. Pipelines must not trade on that bar by
default — it changes second-to-second and produces unstable
factor values. We need an explicit policy:
- Default: drop the in-progress bar from the panel before
evaluation.
- Opt-in: Pipeline.allow_partial_bar = True for strategies that
explicitly want the live bar (e.g. intra-day momentum probes).
Acceptance
- Detection of partial bars (compare last datetime to the
timeframe's expected close).
- Panel filtering at engine level keyed off the pipeline class
attribute.
- Documented opt-in escape hatch.
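Partial-bar detection can be sketched as follows, assuming the CCXT OHLCV row layout [open timestamp in ms, open, high, low, close, volume] with the oldest bar first. drop_partial_bar and the TIMEFRAME_SECONDS table are hypothetical; the allow_partial_bar argument mirrors the proposed pipeline class attribute.

```python
from datetime import datetime, timedelta, timezone
from typing import List

# Hypothetical lookup; a real engine would derive this from the timeframe.
TIMEFRAME_SECONDS = {"1m": 60, "1h": 3600, "1d": 86400}


def drop_partial_bar(
    bars: List[list],
    timeframe: str,
    now: datetime,
    allow_partial_bar: bool = False,
) -> List[list]:
    """Drop the last bar if its expected close is still in the future."""
    if allow_partial_bar or not bars:
        return bars
    last_open = datetime.fromtimestamp(bars[-1][0] / 1000, tz=timezone.utc)
    expected_close = last_open + timedelta(seconds=TIMEFRAME_SECONDS[timeframe])
    # A bar counts as closed once now has reached its expected close.
    return bars[:-1] if now < expected_close else bars


def to_ms(dt: datetime) -> int:
    return int(dt.timestamp() * 1000)


bars = [
    [to_ms(datetime(2024, 1, 1, tzinfo=timezone.utc)), 100, 110, 95, 105, 12.0],
    [to_ms(datetime(2024, 1, 2, tzinfo=timezone.utc)), 105, 108, 101, 104, 3.5],
]
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)  # mid-way through day 2

closed_only = drop_partial_bar(bars, "1d", now)
with_partial = drop_partial_bar(bars, "1d", now, allow_partial_bar=True)
```

Passing now explicitly keeps the check deterministic in tests and lets backtests supply their own clock.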
5. Live observability hooks for pipeline output
Today pipeline output goes into data["YourPipelineClassName"]
and that's it. For live runs we want:
- An on_pipeline_output strategy hook (parallel to
on_strategy_run) that fires per-pipeline per-iteration with the
resolved frame.
- A built-in logger snapshot (latest pipeline output to disk on
each iteration, rolled by date) for after-the-fact debugging of
live runs.
- Optional integration with the existing portfolio snapshot
cadence so pipeline state is captured alongside portfolio state.
Acceptance
- New hook documented in
docusaurus/docs/Advanced Concepts/pipelines-live.md.
- Log/snapshot writer with bounded retention.
- Test covering hook invocation order and snapshot content.
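One way the hook and the date-rolled snapshot writer might fit together is sketched below. The on_pipeline_output signature, PipelineSnapshotWriter, keep_days, and the file naming scheme are all assumptions for illustration; rows are plain dicts rather than the real resolved frame.

```python
import json
import tempfile
from datetime import date
from pathlib import Path
from typing import List


class PipelineSnapshotWriter:
    """Writes the latest output per pipeline per day and prunes old files."""

    def __init__(self, directory: Path, keep_days: int = 7) -> None:
        self.directory = directory
        self.keep_days = keep_days

    def write(self, pipeline_name: str, day: date, rows: List[dict]) -> Path:
        self.directory.mkdir(parents=True, exist_ok=True)
        # One file per day; later iterations on the same day overwrite it.
        path = self.directory / f"{pipeline_name}-{day.isoformat()}.json"
        path.write_text(json.dumps(rows))
        self._prune(pipeline_name)
        return path

    def _prune(self, pipeline_name: str) -> None:
        # ISO dates sort lexicographically, so the oldest files come first.
        files = sorted(self.directory.glob(f"{pipeline_name}-*.json"))
        for old in files[: -self.keep_days]:
            old.unlink()


class LoggingStrategy:
    """Hypothetical strategy that snapshots every pipeline output it receives."""

    def __init__(self, writer: PipelineSnapshotWriter) -> None:
        self.writer = writer

    def on_pipeline_output(
        self, pipeline_name: str, day: date, rows: List[dict]
    ) -> None:
        self.writer.write(pipeline_name, day, rows)


with tempfile.TemporaryDirectory() as tmp:
    strategy = LoggingStrategy(PipelineSnapshotWriter(Path(tmp), keep_days=2))
    for d in range(1, 5):
        strategy.on_pipeline_output(
            "MomentumPipeline",
            date(2024, 1, d),
            [{"symbol": "BTC/EUR", "score": d}],
        )
    remaining = sorted(p.name for p in Path(tmp).iterdir())
```

Retention here is counted in files per pipeline, which equals days because there is one file per day; a size-based bound would be an equally reasonable choice.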
Out of scope (still)
- Sub-daily live pipelines.
- Universes > 50 symbols.
- Sector / industry classifiers (no live data source).
Related
PRs #508 (feat(pipeline): #503 Phase 3a — warmup_window validation) and #509 (feat(pipeline): #504 risk-neutrality primitives (sector neutrality, beta neutralisation, OLS risk models)).