
Commit e99de35

feat(tier1): Initialize Global Sentinel Tier 1 HA Architecture
- Formalized the QuanuX Physical Laws of High Availability in `.agent/skills/tier1_ha_skill.md`.
- Documented the Active-Passive NATS Supercluster architecture and global routing realities in `docs/architecture/high_availability.md`, accounting for BGP Anycast propagation ("The Long-Dark").
- Established the STONITH Apoptosis security protocols in `docs/security/failover_protocols.md`, mandating Out-Of-Band (OOB) hardware fencing over primary network channels.
- Drafted the HA implementation roadmap in `docs/planning/HA_IMPLEMENTATION_PLAN.md`, explicitly anchoring the GlobalSentinelLoop into the FastAPI lifespan and enforcing Typer CLI standards for `quanuxctl`.
- Enforced strict state segregation between NATS (Control State) and Hybrid Analytical Memory (DuckDB/HDF5/NAS choice).
1 parent d16941c commit e99de35

4 files changed

Lines changed: 174 additions & 0 deletions

File tree

.agent/skills/tier1_ha_skill.md
docs/architecture/high_availability.md
docs/planning/HA_IMPLEMENTATION_PLAN.md
docs/security/failover_protocols.md

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
---
description: The Physical Laws of Global High Availability (HA) for QuanuX Tier 1 Leader Nodes and the Active-Passive NATS Supercluster.
---

# QuanuX Tier 1 High Availability (HA) Skill File

> [!IMPORTANT]
> **IMMORTAL GUIDANCE:** This document establishes the absolute physical and cryptographic laws of the QuanuX Tier 1 Global Supercluster. Any code, architecture, or CLI tools you write concerning Tier 1 routing, orchestration, or node control MUST strictly adhere to this protocol.
## 1. The NATS JetStream KV Lock (Leader Election)

The Tier 1 Leader is determined *strictly* by which node holds the `quanux.tier1.leader` key via Raft consensus.

- There is no ambiguous state: a node is either the Leader or a Follower.
- Holding the KV lock grants the Tier 1 server total god-mode over the Global Sentinel network.
- When the Leader's TTL (Time-To-Live) on the lock expires without a heartbeat, Raft consensus automatically allows a hot-standby Follower to acquire the lock and promote itself.
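The lock law above reduces to a pure decision function, sketched here with illustrative interval/TTL values (the real figures belong in cluster configuration, not code):

```python
# Hypothetical constants mirroring the heartbeat/TTL model of this skill file.
HEARTBEAT_INTERVAL_MS = 50
LOCK_TTL_MS = 250


def lock_expired(last_heartbeat_ms: float, now_ms: float, ttl_ms: int = LOCK_TTL_MS) -> bool:
    """True when the Leader's lock TTL has lapsed without a heartbeat."""
    return (now_ms - last_heartbeat_ms) > ttl_ms


def next_role(current_role: str, last_heartbeat_ms: float, now_ms: float) -> str:
    """A Follower may promote itself only after TTL expiry; there is no third state."""
    if current_role == "leader":
        return "leader"
    return "promote" if lock_expired(last_heartbeat_ms, now_ms) else "follower"
```

In production the timestamps would come from the JetStream KV entry itself; the point of the sketch is that promotion is a deterministic function of heartbeat age, never a judgment call.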
## 2. Active-Passive Global Nodes

Only **ONE** Tier 1 Server acts as the active orchestrator.

- **The Leader:** Actively commands the Nests, deploys risk updates, and emits orchestration directives.
- **The Followers:** Maintain absolute silence. They are hot-standby replicas observing the NATS JetStream state, passively syncing to the exact nanosecond of the Leader's event log without taking any authoritative action.
## 3. The STONITH Fencing Law (Preventing Split-Brain)

This is the most critical axiom of QuanuX high availability. If a Follower promotes itself to Leader, its *very first cryptographic act* MUST be to execute a precise Fencing operation.

- **The OOB Network Mandate:** Fencing must be executed via IPMI/PDU over a strictly separate Out-Of-Band (OOB) Ethernet network. During a network partition, cryptographic tokens and SSH will fail; hard power-cycling is the only absolute proof of death.
- **No Infinite Blocking:** The STONITH sequence MUST have a severe hard-timeout (e.g., 2000ms). If it fails to reach the BMC/PDU, it must abort and transition to `CRITICAL_PENDING` with alarms sent to the Architect. It cannot block the event loop indefinitely.
- We cannot permit a Split-Brain reality in which two Tier 1 nodes believe they are the Leader and issue conflicting logic to the Tier 4 Fiber Nests.
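The "No Infinite Blocking" rule maps naturally onto `asyncio.wait_for`. A minimal sketch, assuming a fencing coroutine is injected (alarm delivery to the Architect is elided):

```python
import asyncio

FENCED = "FENCED"
CRITICAL_PENDING = "CRITICAL_PENDING"


async def fence_with_timeout(fence_coro, timeout_s: float = 2.0) -> str:
    """Run a fencing coroutine under the mandated hard-timeout.

    Returns FENCED on success. On timeout, it aborts and reports
    CRITICAL_PENDING rather than blocking the event loop forever.
    """
    try:
        await asyncio.wait_for(fence_coro(), timeout=timeout_s)
        return FENCED
    except asyncio.TimeoutError:
        # The BMC/PDU was unreachable within the budget: abort and alarm.
        return CRITICAL_PENDING
```

The caller (the newly promoted Leader) must treat `CRITICAL_PENDING` as a refusal to assume command, not a soft warning.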
## 4. BGP Convergence & The "Long-Dark"

- Do not assume immediate route convergence. Global BGP shifts take 3 seconds to 3 minutes.
- Tier 4 Nests must be programmed to survive the "Long-Dark," halting *new* entries and executing *existing* exit logic blindly until routes converge.
## 5. State Segregation (NATS vs. Analytical Storage)

- **NATS JetStream** holds ONLY the Control State (active deployments, risk updates).
- **Analytical Storage (Hybrid Choice)** holds historic/analytic Memory (tick data, deep backtest results). The end user configures whether this is DuckDB, HDF5, or another NAS target. HA failover only guarantees NATS orchestration.
## 6. The CLI Authority (`quanuxctl`)

Automated Raft consensus governs normal operation, but the Architect commands the system via `quanuxctl`.

- `quanuxctl` is the ONLY manual interface authorized to override the Raft election.
- Use `quanuxctl` to force failovers, demote a struggling Leader, or permanently fence a rogue node from the cluster.
- When generating strategy deployment or node orchestration logic, remember that `quanuxctl` can seamlessly interrupt or redirect the system flow through these administrative paths.

**When analyzing or extending QuanuX HA features, you must always verify that your solution complies with OOB STONITH limits, BGP delay realities, Analytical State Segregation, and the singular authority of the CLI.**
docs/architecture/high_availability.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# QuanuX Global High Availability Architecture

This document dictates the design and structural reality of the QuanuX Active-Passive NATS Supercluster: a globally distributed Control Plane orchestrating decentralized Tier 4 Nests across multiple financial hubs (e.g., Aurora, Carteret, Frankfurt).

## 1. The Global Sentinel Vision

QuanuX operates under the paradigm that the Data Plane is ruthless and hyper-localized (C++ execution loops over bare-metal Linux sockets), while the Control Plane is globally distributed, highly available, and impervious to catastrophic localized failure.

To achieve this, QuanuX leverages an **Active-Passive Global Tier 1** mechanism backed by a NATS JetStream Supercluster.

### The Split

* **Data Plane (Tick-to-Trade):** 59ns lock-free Dual-Thread execution in Tier 4 Fiber Nests. This layer operates out-of-band of the central node.
* **Control Plane (Orchestration & Risk):** Synchronous, deterministic, Raft-driven governance managed by the Tier 1 Global Leader Server.
## 2. Infrastructure Anatomy

The Tier 1 Control Plane is structured via **Leader Election**:

* **Tier 1 Leader:** The sole commander holding the JetStream KV lock (`quanux.tier1.leader`). Responsible for emitting immutable orchestration logs, adjusting risk metrics, and managing the Biological Lore (e.g., triggering Apoptosis).
* **Regional Followers:** Live in datacenters worldwide. They are silent hot-standbys that persist the JetStream event log.

---
## 3. Failover Sequence: The Millisecond Anatomy of a Crash

When a Tier 1 Leader experiences physical destruction, network segmentation, or a fatal OS panic, the QuanuX cluster executes a mathematically deterministic failover protocol.

### Step 1: Leader TTL Expiration

The Tier 1 Leader sends a high-frequency heartbeat while holding the JetStream lock. If the cluster goes `N` milliseconds without a heartbeat, the lock's TTL expires.

### Step 2: Follower Promotion

Raft consensus awakens the Followers. The fastest Follower (usually the geographically closest, with the best ping to the quorum) instantly seizes the `quanux.tier1.leader` KV lock. At this point, it is logically the Leader.
### Step 3: STONITH Apoptosis (Fencing)

Before issuing a single command to the edge nodes, the new Leader issues a mathematically guaranteed kill-pill, a `STATE_HALT` command (Apoptosis), directed at the physical ID of the fallen Leader.

* *Why?* It prevents a "Split-Brain." If the old Leader was severed by a **Network Partition**, it cannot be reached via normal SSH or software protocols. Fencing MUST occur over a strictly separate **Out-Of-Band (OOB) Hardware Management Network** (IPMI/iLO/PDU) to physically cut its power. Otherwise, the old Leader will resurrect locally when the partition heals, creating chaotic dual-command horizons.
### Step 4: Event Sourcing & Deterministic Replay

The new Leader replays the last uncommitted NATS JetStream log. By traversing the deterministic event sequence of the entire cluster, the new Leader rebuilds the exact working memory and risk state that the old Leader possessed microseconds before crashing. No configurations or deployments are dropped.
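Deterministic replay is a fold over the ordered event log. A minimal sketch, with hypothetical event shapes (`{"op": ..., "key": ..., "value": ...}`); the real log schema is owned by the Tier 1 orchestrator:

```python
from typing import Any, Iterable, Mapping


def replay(events: Iterable[Mapping[str, Any]]) -> dict:
    """Rebuild working state by folding the event log in order.

    Because the sequence is deterministic, replaying the same log
    always yields byte-identical state: the property Step 4 relies on.
    """
    state: dict = {}
    for event in events:
        if event["op"] == "set":
            state[event["key"]] = event["value"]
        elif event["op"] == "delete":
            state.pop(event["key"], None)
    return state
```

Two nodes folding the same log land in the same state, which is why a Follower can adopt the dead Leader's memory footprint without a snapshot transfer.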
### Step 5: Global Anycast IP & The "Long-Dark" Reconnection

* **Control Plane Routing:** Tier 1 IPs are configured as Virtual IPs (VIPs) using BGP Anycast. When the failover occurs, the new Leader triggers a BGP route update to shift traffic globally.
* **The "Long-Dark" Survival Mode:** The execution-edge Nests detect a ping timeout. BGP route convergence across the global internet requires anywhere from 3 seconds to 3 minutes. The Nests do **NOT** panic, because they understand the reality of the propagation delay.
* **Ritchie FSM (Finite State Machine):** During the blackout, Nests throttle or completely halt *new* strategy entries locally. They rely on their FSM to blindly execute *existing* exit logic via raw sockets. When the BGP routes finally converge, the physical internet seamlessly routes them to the new Leader node. Connection restored; state synchronized.
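The Long-Dark behavior above can be sketched as a two-state machine. State and method names here are illustrative, not the production Ritchie FSM:

```python
from enum import Enum, auto


class NestState(Enum):
    CONNECTED = auto()
    LONG_DARK = auto()


class RitchieFSMSketch:
    """Minimal sketch of Long-Dark survival semantics."""

    def __init__(self) -> None:
        self.state = NestState.CONNECTED

    def on_ping_timeout(self) -> None:
        self.state = NestState.LONG_DARK

    def on_route_converged(self) -> None:
        self.state = NestState.CONNECTED

    def may_enter(self) -> bool:
        # New strategy entries halt during the blackout.
        return self.state is NestState.CONNECTED

    def may_exit(self) -> bool:
        # Existing exit logic always runs, even blind.
        return True
```

The asymmetry is the whole design: entries are gated on connectivity, exits never are.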
## 4. Summary

The QuanuX High Availability Architecture bridges biological resilience with institutional-grade networking. By coupling Raft election to STONITH fencing and separating the Control Plane from the localized Execution Plane, the cluster can dynamically survive the loss of master operational nodes worldwide.
docs/planning/HA_IMPLEMENTATION_PLAN.md

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
# HA Implementation Roadmap & CLI Expansion Plan

This document serves as the exact engineering blueprint for migrating the QuanuX Tier 1 Server from a single-node Python orchestrator into a fully distributed Global Supercluster.

## Phase 1: NATS JetStream vs. Analytical Boundary

**Objective:** Replace local state management with a globally distributed KV lock while enforcing the boundary between Execution State and Analytical Memory.
1. **Initialize Global Bucket:** Create a NATS JetStream Key-Value bucket named `quanux_cluster_state`, replicated across `N` geographic regions.
2. **Lock Definition:** Define the primary lock `quanux.tier1.leader`.
3. **State Reflection (The Dichotomy):**
   - **Control State (NATS JetStream):** Risk profiles, Supervisor limits, active deployments, and real-time orchestration events are strictly appended to the JetStream Event Log for immediate HA replay.
   - **Analytical State (Hybrid):** Historic tick data, heavy backtesting memory, and massive order-flow archives must NEVER be stored in NATS. They are safely offloaded to a user-configured storage engine (DuckDB, HDF5, NAS). HA failover strictly guarantees the Control State, not the analytics volume.
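The dichotomy can be enforced mechanically at the write path. A sketch with hypothetical key prefixes (the real taxonomy is defined by the orchestrator, not this snippet):

```python
# Illustrative prefixes only; actual key names are an assumption.
CONTROL_PREFIXES = ("risk.", "supervisor.", "deploy.", "orchestration.")
ANALYTICAL_PREFIXES = ("ticks.", "backtest.", "orderflow.")


def storage_target(key: str) -> str:
    """Route a state key to NATS (Control State) or the analytical store."""
    if key.startswith(CONTROL_PREFIXES):
        return "nats_jetstream"
    if key.startswith(ANALYTICAL_PREFIXES):
        return "analytical_store"  # DuckDB / HDF5 / NAS, user-configured
    raise ValueError(f"unclassified state key: {key}")
```

Raising on unclassified keys keeps the boundary strict: nothing lands in JetStream by accident.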
## Phase 2: Python Background Leader Election Loop

**Objective:** Implement the Active-Passive heartbeat and Raft observer within the FastAPI backend, with defensive STONITH timeouts.

1. **The Observer Task (FastAPI Lifespan):** The Tier 1 Server is driven by FastAPI. The background `asyncio` task (`GlobalSentinelLoop`) does not exist in a vacuum; it is spun up and managed by the FastAPI lifespan context manager (or startup/shutdown events). Furthermore, the FastAPI routing logic must be Raft-aware: if the node is a Follower, specific orchestration endpoints must automatically reject or redirect traffic until the node is promoted to Leader.
2. **Heartbeat Maintenance:** If the server is Leader, it writes the current timestamp to `quanux.tier1.leader` every `50ms`.
3. **The Watcher:** If the server is a Follower, it establishes a JetStream Watcher on the lock. If the lock's TTL is exceeded (e.g., no update for `250ms`), the server attempts a targeted `Update` with its own `Node_ID` to seize the lock.
4. **Apoptosis Hook (Defended):** Upon acquiring the lock, the backend triggers `execute_stonith(old_leader_id)`. **CRITICAL:** This call must have a strict hard-timeout (e.g., `2000ms`). If the IPMI interface of the dead datacenter is offline, it cannot block indefinitely. If the call hits the timeout, it abandons the lock, enters a `CRITICAL_PENDING` state, and fires a severe alarm via `quanuxctl`/SMS/PagerDuty to the Architect.
5. **State Rehydration:** Once Fencing is verified, the server replays the NATS Event Log to rehydrate the application state and begins accepting `quanuxctl` and Nest connections.
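Steps 2 and 3 can be sketched as a single asyncio loop with the KV client injected. `kv` here is a hypothetical object exposing `get()`/`put()` (in production it would wrap the `quanux.tier1.leader` JetStream key), and the loop runs a bounded number of ticks so the sketch terminates; the STONITH hook from step 4 would fire at the marked line:

```python
import asyncio
import time


async def sentinel_loop(kv, node_id: str, *, interval_s=0.05, ttl_s=0.25, ticks=3):
    """Sketch of the GlobalSentinelLoop body (heartbeat + watcher)."""
    role = "follower"
    for _ in range(ticks):
        entry = await kv.get()  # entry is (node_id, timestamp) or None
        now = time.monotonic()
        if role == "leader":
            await kv.put(node_id, now)      # heartbeat maintenance
        elif entry is None or now - entry[1] > ttl_s:
            await kv.put(node_id, now)      # seize the expired lock
            role = "leader"                 # <- STONITH hook would fire here
        await asyncio.sleep(interval_s)
    return role
```

In the real service this coroutine would be created in the FastAPI lifespan context manager and cancelled on shutdown, and the seize step would use a revision-checked `Update` rather than a blind `put`.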
## Phase 3: The `quanuxctl` CLI Expansion

**Objective:** Grant the Architect "God-Mode" over the Raft cluster and the manual failover hierarchy. Any automated clustering protocol must have deterministic manual overrides.

The `quanuxctl` CLI will be expanded to include the `cluster` command group.

### `quanuxctl cluster status`

* **Action:** Queries NATS for the telemetry of the global supercluster.
* **Output:**
  * Identifies the current **Leader** (Node ID, Region, Uptime).
  * Lists all **Followers** (Node IDs, Ping to Leader, Replay Lag).
  * Displays the health and state of the `quanux.tier1.leader` lock.

### `quanuxctl cluster promote <node_id>`

* **Action:** Forces a manual Raft election override.
* **Execution:** Administratively commands the current Leader to drop the lock and artificially boosts the priority/election-timer of the specified `<node_id>` so it is guaranteed to become the new Leader.
* **Use Case:** Pre-emptive maintenance of a datacenter, or shifting latency footprints before major economic releases.

### `quanuxctl cluster demote`

* **Action:** Forces the current Leader to step down gracefully without explicitly assigning a successor.
* **Execution:** The Leader deletes its lock on `quanux.tier1.leader` and enters a 5-second backoff period during which it refuses to vote or run for election, allowing the remaining Followers to elect a new Leader.

### `quanuxctl cluster fence <node_id>`

* **Action:** Manually triggers STONITH (Apoptosis) against a rogue or "zombie" node.
* **Execution:** Bypasses Raft consensus entirely and immediately fires the deepest available Fencing mechanism (Cryptographic -> OS -> Hardware) against the specified Node ID.
* **Use Case:** Resolving complex network splits, or permanently blinding a node that has been compromised or is behaving erratically outside of normal cluster logic.
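The plan mandates Typer for `quanuxctl`; as a dependency-free sketch of the same command surface, here is the `cluster` group expressed with stdlib `argparse` (handler bodies omitted, structure only):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command surface of the planned `quanuxctl cluster` group.

    This is a stand-in for the real Typer app: same subcommands and
    arguments, no handlers. The Typer version would mirror this shape
    with a `cluster` sub-app.
    """
    parser = argparse.ArgumentParser(prog="quanuxctl")
    groups = parser.add_subparsers(dest="group", required=True)
    cluster = groups.add_parser("cluster").add_subparsers(dest="command", required=True)
    cluster.add_parser("status")
    cluster.add_parser("promote").add_argument("node_id")
    cluster.add_parser("demote")
    cluster.add_parser("fence").add_argument("node_id")
    return parser
```

Keeping the surface declarative like this makes the Typer migration mechanical: one sub-app, four commands, two of which take a `node_id`.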
---

**Execution Mandate:** Development must proceed linearly from Phase 1 to Phase 3. The foundational AI context (`tier1_ha_skill.md`) provides the parameters. Code generation must strictly reference this plan when structuring the `quanuxctl` Typer framework extensions.
docs/security/failover_protocols.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# QuanuX Security & Failover Protocols

## 1. Overview

The security of the QuanuX Tier 1 Global Supercluster relies equally on cryptography and brutal pragmatism. When a failover event is triggered, the cluster cannot assume the old Leader is "dead" simply because it vanished from the network quorum; it must be **proven** dead. This document formalizes the STONITH (Shoot The Other Node In The Head) protocol, also known as the Apoptosis Directive.

## 2. The STONITH Protocol (Apoptosis)

A "Split-Brain" scenario, in which two Tier 1 Orchestrators believe they are the Leader and stream conflicting commands to Execution Nests, is an unrecoverable, catastrophic event.
### 2.1 The Prime Directive of Promotion

When a Follower successfully acquires the `quanux.tier1.leader` NATS JetStream KV lock via Raft consensus, **it is strictly prohibited from emitting orchestration commands** until it has fully terminated the previous Leader.

### 2.2 Fencing Mechanisms

The new Leader will execute a `STATE_HALT` kill-pill against the fallen Leader's `Node_ID`. The stark reality of distributed systems is that **Network Partitions** render software fencing useless. If the old Leader is unreachable because its primary switch failed, it is still alive locally; sending a cryptographic token or trying to SSH into it will fail. Thus, QuanuX prioritizes physical infrastructure segregation:

1. **Hardware Fencing (IPMI / iLO / PDU) via the Out-Of-Band (OOB) Network:**
   **[PRIMARY MANDATE]** Hardware power fencing is the ONLY valid STONITH in a network split. The new Leader sends a kill-pill over a physically separate OOB Ethernet network wired to a distinct management switch. It connects directly to the Baseboard Management Controller (BMC) or the Power Distribution Rack to physically cut power to the old Leader.
2. **OS-Level Fencing (SSH / API Kill):**
   *(Secondary)* If the OOB network is unavailable but the OS is reachable, the new Leader issues a kernel-level `sysrq-trigger` or tears down the supervisor.
3. **Cryptographic Fencing (The Ritchie Protocol Extension):**
   *(Tertiary/Fallback)* The new Leader broadcasts a signed global banishment token across JetStream. NATS will brutally reject any further connections originating from the old Leader's TLS identity, acting as a final logical fence if power cannot be cut.

*If all fencing attempts fail or block, the new Leader must ABORT its lock acquisition via the hard-timeout to prevent the "Infinite Blocking Trap" and log a `CRITICAL_PENDING` alarm to the Architect.*
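The tiered fallback above can be sketched as a cascade over an ordered list of fencers. The `(name, callable)` shape is illustrative; real fencers would be IPMI/SSH/JetStream clients, each already wrapped in its own timeout:

```python
def fence_cascade(fencers) -> str:
    """Attempt fencing tiers in priority order.

    `fencers` is an ordered list of (name, callable) pairs; each callable
    returns True on a confirmed kill. Returns the name of the tier that
    succeeded, or CRITICAL_PENDING if every tier fails.
    """
    for name, attempt in fencers:
        try:
            if attempt():
                return name
        except Exception:
            continue  # unreachable BMC/host: fall through to the next tier
    return "CRITICAL_PENDING"  # abort lock acquisition, alarm the Architect
```

Note the contract: the cascade never raises. Either some tier proves the old Leader dead, or the caller receives `CRITICAL_PENDING` and must refuse promotion.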
## 3. The Millisecond Timeline of Failover Execution

The life-cycle of a high-availability failover across continents happens far faster than a human operator can react, right up until the grim reality of global BGP propagation.

* `T+0ms`: The active Tier 1 Leader (e.g., Aurora) physically crashes or drops its network connection to the Quorum.
* `T+Nms`: The `quanux.tier1.leader` JetStream lock's TTL (Time-To-Live) heartbeat expires.
* `T+N+10ms`: Raft election begins. Followers in Carteret and Frankfurt detect the missing Leader.
* `T+N+25ms`: The Frankfurt node, having the fastest consensus ping with the remaining Quorum, acquires the `quanux.tier1.leader` lock.
* `T+N+30ms`: **Apoptosis Initiated.** Frankfurt fires a hardware kill-pill (STONITH) to Aurora's physical IPMI over the OOB network.
* `T+N+45ms`: **Event Sourcing Replay.** Frankfurt replays the JetStream log in deterministic order to rebuild its risk and orchestration state, adopting the precise memory footprint of Aurora right before the crash.
* `T+N+50ms`: **BGP Route Advertisement.** The VIP routing shifts. Frankfurt broadcasts that it is now the origin of the Control Plane IP via BGP Anycast (or GARP for intra-DC local failovers).
* `T+N+55ms`: **The Long-Dark Blackout Begins.** While GARP works instantly locally, BGP convergence across the global internet takes anywhere from 3 to 180 seconds. The Edge Nests enter the "Long-Dark."
* `T+2000ms - T+120s`: **Edge Nest Survival (The Ritchie FSM).** The Tier 4 Fiber Nests know the Control Plane is dark. They throttle or halt new strategy entries locally, blindly executing their existing exit logic and risk-liquidation checks via raw sockets until global routing converges.
* `T+Route_Converged`: **Reconnection.** The global physical network finally routes the VIP to Frankfurt. Nests seamlessly reconnect. Standard operation resumes.
## 4. Unattended Operations

Tier 3 and Tier 4 Nests are biologically resilient. If the entire Control Plane is briefly severed, Nests will freeze non-essential risk updates but **will continue deterministic order execution and local liquidation checks** via raw sockets. They are designed never to panic in the dark.
