Skip to content

Latest commit

 

History

History
60 lines (44 loc) · 3.42 KB

File metadata and controls

60 lines (44 loc) · 3.42 KB

Changelog

All notable changes to the Sub-Microsecond Execution Engine will be documented in this file.

[v2.4.0] - 2025-12-30

Added

  • Jitter Profiler & Stall Detector: New JitterProfiler class to detect micro-stalls caused by OS interrupts (SMI, Context Switches) in the busy-wait loop.
    • Why it helps: "Mean latency" metrics hide tail events. HFT requires consistent variance (jitter). This tool flags if the isolated core is truly isolated or being interrupted.
  • Explicit L1 Cache Prefetching: Added prefetch_L1 and prefetch_next_line utilities.
    • Why it helps: Allows hot-path loops to pull the next 64-byte cache line into L1 cache before it's needed, hiding the ~80ns DRAM latency.
  • Cache Warming & Hot-Path Optimization: Implemented a 50,000-iteration "Warm-Up Phase" before market data ingestion starts.
    • Why it helps: Pre-faults instruction pages into i-cache, populates the TLB, and trains CPU branch predictors. This ensures the very first market tick is processed with fully optimized CPU state, eliminating "cold start" latency spikes.

[v2.3.0] - 2025-12-30

Added

  • SIMD-Accelerated Alpha Extraction: Replaced scalar feature extraction loops with hand-optimized AVX2 (x86) and NEON (ARM) intrinsics within simd_features.hpp.
    • Why it helps: Shifts the "Order Flow Imbalance" (OFI) and volume imbalance calculations from sequential $O(Depth)$ to effectively constant hardware-parallel time. This mimics FPGA-like pipelining in software, directly reducing the critical path latency for signal generation.

[v2.2.0] - 2025-12-28

Added

  • Vectorized Multi-Kernel Hawkes Engine: Implemented a SIMD-accelerated engine supporting 4 simultaneous kernels.
    • Why it helps: Markets react at different speeds; this captures both microsecond liquidity bursts and millisecond price trends simultaneously, significantly improving signal accuracy without increasing latency.
  • Engagement Framework: Added docs/ENGAGEMENTS.md and GitHub issue templates.
    • Why it helps: Provides a structured pathway for institutional partners and research labs to request custom hardware (FPGA/NIC) integrations or collaborative research without public disclosure.

Improved

  • Build System: Updated scripts/build_all.sh with automatic macOS Homebrew path detection.
    • Why it helps: Streamlines developer onboarding and ensures seamless local testing on modern Apple Silicon (ARM64) research environments.
  • Documentation: Simplified README and organized commercial support sections.

[v2.1.0] - 2025-12-16

Enhancement

  • 890ns End-to-End Latency: Optimized decision pipeline down to 890ns median.
  • Institutional Logging: Implemented 7-layer professional logging (NIC, TSC, PTP, etc.).
  • Deterministic Replay: Bit-identical verification with SHA-256 manifests.

[v2.0.0] - 2025-12-15

Added

  • Custom NIC Drivers: Support for Intel X710 and Mellanox ConnectX hardware mapping.
  • Solarflare ef_vi: Integration for X2522/X2542 series.
  • Kernel Bypass: Lock-free DPDK-style interface.

[v1.5.0] - 2025-12-10

Added

  • AVX-512 SIMD: Vectorized OFI computation (40ns).
  • Vectorized ML: FPGA-simulated inference pipeline (400ns fixed latency).

[v1.0.0] - 2025-12-01

Added

  • Initial High-Frequency Trading skeleton.
  • Hawkes Process Engine (Single-Kernel).
  • Avellaneda-Stoikov Market Making strategy.
  • Lock-Free SPSC Queues and Atomic Risk Controls.