Skip to content

Pipeline API — Phase 3 follow-ups (async CCXT fetch, cross-market routing, streaming panel, partial-bar handling, live observability) #510

Description

@MDUYN

Pipeline API — Phase 3 follow-ups (live hardening, residual work)

Tracks the remaining building blocks for production-grade live
pipeline trading after #503 phase 3a–3d shipped (PRs #508 + #509).
Phase 3a (warmup validation), 3b (live envelope), 3c
(refresh_universe_every cache), and 3d (per-pipeline resilience)
are all merged in feat/503-pipeline-api-phase3-live.

Remaining building blocks

1. Batched / async OHLCV fetch in CCXTOHLCVDataProvider

CCXT's fetch_ohlcv is per-symbol. For pipelines spanning many
symbols this serialises the cold path of every iteration. We need a
rate-limit-aware concurrent fan-out (asyncio + a bounded semaphore
keyed off exchange.rateLimit). This was deliberately split off
from #503 because the failure modes (per-exchange rate-limit drift,
partial fills under burst) really need live integration testing
against a real exchange before we're confident.

Acceptance

  • Concurrent fetch with rate-limit awareness keyed on the CCXT
    exchange's reported limit.
  • A single failing per-symbol fetch is logged and the surviving
    symbols still produce a panel (already enforced at the engine
    level by phase 3d, but we want it at the provider level too so
    the pipeline gets null for the failed symbol rather than a
    silent retry storm).
  • 24h paper-trading run on Bitvavo with a 20-symbol daily universe.

2. Cross-market order routing for pipeline universes

A pipeline universe spanning, e.g., BITVAVO + BINANCE should
place orders on the right portfolio. Today the order layer assumes
one strategy → one market. Likely needs a market column on the
pipeline output the order helper can pivot on, plus a portfolio
lookup keyed by (strategy_id, market). Flagged in #503 as
"likely its own sub-issue" — confirmed.

Acceptance

  • Pipeline output includes (or makes resolvable) the source market
    per row.
  • context.create_order(symbol, market=...) dispatches to the
    matching portfolio.
  • Test covering a 2-market universe placing orders on each.

3. Streaming / append-only panel updates

Today every iteration rebuilds the long-form panel from scratch.
For high-frequency live use this is wasteful — we already have all
historical bars in memory, only the latest bar is new. We should
support a streaming mode that appends the new bar(s) and re-
evaluates incrementally.

Acceptance

  • A pipeline engine variant that keeps the panel resident across
    iterations and grows it by appending new bars.
  • Cache invalidation when as_of jumps backwards (e.g. backtest
    re-runs).
  • Memory benchmarks vs. full rebuild on a 50-symbol / 1y / 1d
    panel.

4. Partial-bar / unclosed-candle handling

A live fetch_ohlcv typically returns the in-progress (unclosed)
bar as the last row. Pipelines must not trade on that bar by
default — it changes second-to-second and produces unstable
factor values. We need an explicit policy:

  • Default: drop the in-progress bar from the panel before
    evaluation.
  • Opt-in: Pipeline.allow_partial_bar = True for strategies that
    explicitly want the live bar (e.g. intra-day momentum probes).

Acceptance

  • Detection of partial bars (compare last datetime to the
    timeframe's expected close).
  • Panel filtering at engine level keyed off the pipeline class
    attribute.
  • Documented opt-in escape hatch.

5. Live observability hooks for pipeline output

Today pipeline output goes into data["YourPipelineClassName"]
and that's it. For live runs we want:

  • An on_pipeline_output strategy hook (parallel to
    on_strategy_run) that fires per-pipeline per-iteration with the
    resolved frame.
  • A built-in logger snapshot (latest pipeline output to disk on
    each iteration, rolled by date) for after-the-fact debugging of
    live runs.
  • Optional integration with the existing portfolio snapshot
    cadence so pipeline state is captured alongside portfolio state.

Acceptance

  • New hook documented in docusaurus/docs/Advanced Concepts/pipelines-live.md.
  • Log/snapshot writer with bounded retention.
  • Test covering hook invocation order and snapshot content.

Out of scope (still)

  • Sub-daily live pipelines.
  • Universes > 50 symbols.
  • Sector / industry classifiers (no live data source).

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions