| 1 | +# AllSpark Edge Control Plane - Architecture & Design Decisions |
| 2 | + |
| 3 | +This document captures the core architectural choices and design patterns implemented for the AllSpark Edge Control Plane. |
| 4 | + |
| 5 | +## 1. Framework Selection: NiceGUI |
| 6 | +**Decision:** We chose [NiceGUI](https://nicegui.io/) over traditional decoupled SPA frameworks (like React, Vue, or Angular) or heavy template rendering (like Django/Jinja). |
| 7 | +**Rationale:** |
| 8 | +- **Python-Native:** NiceGUI lets the entire frontend and backend live in pure Python, so the AI/backend engineers who maintain the AllSpark ecosystem can extend the UI without context-switching into a separate frontend stack. |
| 9 | +- **Reactive Data Binding:** It provides Vue-like reactivity natively in Python, so lists, tables, and graphs update automatically without hand-written WebSocket or polling boilerplate. |
| 10 | + |
| 11 | +## 2. The Sidecar Architecture Pattern |
| 12 | +**Decision:** The control plane executes as a completely detached **Sidecar Process** (`python control_plane/main.py`) rather than being tightly integrated into the main `aiohttp` edge server (`server.py`). |
| 13 | +**Rationale:** |
| 14 | +- **Isolation of Concerns:** The primary Edge Server manages high-frequency WebSocket streams, large QUIC video blob uploads, and Bonjour service discovery. Keeping the UI rendering and polling decoupled ensures that an expensive UI redraw or long-running query doesn't block the asyncio event loop handling critical edge ingestion. |
| 15 | +- **Port Offset:** The sidecar automatically reads the Edge Server's configured port (e.g., `8080`) from `config.json` and binds itself to `port + 1` (e.g., `8081`), ensuring no port collisions while remaining predictable. |
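The port-offset rule above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `port` key name, the `8080` fallback, and the helper name `sidecar_port` are assumptions; only the `config.json` file and the `port + 1` rule come from this document.

```python
import json
from pathlib import Path


def sidecar_port(config_path: Path, default_port: int = 8080) -> int:
    """Return the sidecar's port: the Edge Server's configured port plus one.

    Assumes the port is stored under a top-level "port" key (hypothetical).
    """
    config = json.loads(config_path.read_text(encoding="utf-8"))
    edge_port = int(config.get("port", default_port))
    return edge_port + 1
```

Deriving the port from the shared file, rather than hard-coding `8081`, keeps the sidecar in step if the Edge Server's port is changed.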
| 16 | + |
| 17 | +## 3. Single Source of Truth Configuration |
| 18 | +**Decision:** Both the Edge Server and the Sidecar UI read from a unified `config.json` in the root `edge_server/` directory. |
| 19 | +**Rationale:** |
| 20 | +- **Bootstrapping:** Whichever service boots first (typically `server.py` or the `node` equivalent) checks for `config.json`. If it's missing, it dynamically generates it with internal defaults. |
| 21 | +- **Synchronization:** The control plane knows where the `uploadPath` is located, what IP constraints exist, and what the base port is, without duplicating any of this in environment variables. |
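A minimal sketch of the bootstrapping behavior described above, assuming a flat JSON object. The values in `DEFAULT_CONFIG` and the helper name `load_or_create_config` are hypothetical; the real internal defaults are not listed in this document.

```python
import json
from pathlib import Path

# Hypothetical defaults, for illustration only.
DEFAULT_CONFIG = {"port": 8080, "uploadPath": "uploads"}


def load_or_create_config(config_path: Path) -> dict:
    """Load config.json, writing the defaults first if the file is missing."""
    if not config_path.exists():
        config_path.parent.mkdir(parents=True, exist_ok=True)
        config_path.write_text(json.dumps(DEFAULT_CONFIG, indent=2), encoding="utf-8")
    return json.loads(config_path.read_text(encoding="utf-8"))
```

This sketch ignores the case where both services start at the same instant; if that can happen, creating the file atomically (for example, opening it with mode `"x"`) would be one way to avoid a race.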
| 22 | + |
| 23 | +## 4. Integration Strategies for Edge Data |
| 24 | + |
| 25 | +We use three distinct strategies to populate the reactive UI, each chosen for speed and a small footprint: |
| 26 | + |
| 27 | +### A. Polling for Client State (REST API) |
| 28 | +- **Method:** An `aiohttp` client polls `/api/status` every 5 seconds via `ui.timer(5.0)`. |
| 29 | +- **Reasoning:** Rather than constructing an inter-process WebSocket bridge between the Edge Server and the Sidecar to sync mobile rig connections, the sidecar leverages the Edge Server's existing lightweight REST API. |
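The polling approach might look roughly like the sketch below. The host, the scheme, and the assumption that `/api/status` returns a JSON object are not stated in this document, and `status_url`, `fetch_status`, `state`, and `refresh` are illustrative names. The session is passed in so that a single `aiohttp.ClientSession` can be reused across polls.

```python
def status_url(port: int, host: str = "127.0.0.1", scheme: str = "http") -> str:
    """Build the Edge Server status URL (host and scheme are assumptions)."""
    return f"{scheme}://{host}:{port}/api/status"


async def fetch_status(session, url: str) -> dict:
    """Fetch the status payload once.

    `session` is expected to be an aiohttp.ClientSession.
    """
    async with session.get(url) as response:
        response.raise_for_status()
        return await response.json()


# Wiring inside a NiceGUI page (not executed here; `state` is a hypothetical dict):
#
#     async def refresh() -> None:
#         state.update(await fetch_status(session, status_url(8080)))
#
#     ui.timer(5.0, refresh)
```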
| 30 | + |
| 31 | +### B. Shared File Mounts (Capture Browser) |
| 32 | +- **Method:** `os.path` and `glob` traversal of the upload directory, served through `app.add_media_files()`. |
| 33 | +- **Reasoning:** Instead of creating file transfer endpoints, the Sidecar reads the absolute system path defined in `config.json` (`uploads/orgs/default/...`) and builds UI cards from whatever exists on disk at that moment, so native HTML5 `video` playback over HTTP works immediately. |
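As a rough sketch of the disk traversal, the helpers below list capture files and map each one to a URL under a mounted media route. This uses `pathlib` rather than `os.path` and `glob` for brevity. The `*.mp4` pattern, the `/captures` URL prefix, and both function names are assumptions, not taken from the project.

```python
from pathlib import Path


def list_captures(upload_root: Path, pattern: str = "*.mp4") -> list[Path]:
    """Return capture files under upload_root, newest first.

    The "*.mp4" pattern is an assumption; adjust it to the real capture format.
    """
    files = [p for p in upload_root.rglob(pattern) if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


def media_url(path: Path, upload_root: Path, url_prefix: str = "/captures") -> str:
    """Map a file on disk to the URL it would have under a mounted media route."""
    return f"{url_prefix}/{path.relative_to(upload_root).as_posix()}"


# Wiring in NiceGUI (not executed here):
#     app.add_media_files("/captures", upload_root)
#     for path in list_captures(upload_root):
#         ui.video(media_url(path, upload_root))
```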
| 34 | + |
| 35 | +### C. Direct Broker Attachment (MQTT Anomalies) |
| 36 | +- **Method:** Background `paho-mqtt` thread processing wildcard topics (`#`). |
| 37 | +- **Reasoning:** System anomalies generated by AllSpark agents bypass the core Edge Server entirely and hit the local MQTT broker (port `1883`). The control plane runs a dedicated Paho subscriber thread that captures anomaly events, structures them, and keeps a bounded number in memory, pushing updates to the UI. |
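One way to capture, structure, and limit events in memory is a lock-protected `deque` with a fixed maximum length, sketched below. The class name `AnomalyBuffer`, the 200-event cap, and the assumption that payloads are usually JSON are all illustrative. The `on_message` method follows the `(client, userdata, msg)` callback signature used by `paho-mqtt`, but nothing here imports the library.

```python
import json
import threading
from collections import deque


class AnomalyBuffer:
    """Thread-safe, bounded in-memory store for anomaly events."""

    def __init__(self, max_events: int = 200) -> None:
        # deque(maxlen=...) silently drops the oldest event once full.
        self._events = deque(maxlen=max_events)
        self._lock = threading.Lock()

    def on_message(self, client, userdata, msg) -> None:
        """Callback with paho-mqtt's on_message signature."""
        raw = msg.payload.decode("utf-8", errors="replace")
        try:
            body = json.loads(raw)
        except json.JSONDecodeError:
            body = raw  # keep non-JSON payloads as plain text
        with self._lock:
            self._events.append({"topic": msg.topic, "body": body})

    def snapshot(self) -> list:
        """Return a copy that the UI thread can iterate safely."""
        with self._lock:
            return list(self._events)


# Wiring with paho-mqtt (not executed here; the broker host is an assumption):
#     client.on_message = buffer.on_message
#     client.connect("localhost", 1883)
#     client.subscribe("#")
#     client.loop_start()  # runs the network loop in a background thread
```

The UI side can then call `snapshot()` from a timer and render the returned list, so the subscriber thread never touches UI elements directly.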
| 38 | + |
| 39 | +## 5. Rerun.io Data Plane Mocking |
| 40 | +**Decision:** Integrated an `iframe` pointing to a local `rerun-sdk` web viewer (`http://localhost:9090`). |
| 41 | +**Rationale:** This lets the control plane wrap complex, high-performance, Rust-rendered 3D robotics visualizations inside the standard Python UI workflow until a unified authentication and embedding pipeline is established. |
| 42 | + |
| 43 | +--- |
| 44 | + |
| 45 | +## Next Steps & Future Work |
| 46 | + |
| 47 | +While Phases 1 and 2 established the decoupled control plane, the following steps are planned for the evolution of the AllSpark Edge architecture: |
| 48 | + |
| 49 | +1. **Native Integration (Merging the Sidecar):** |
| 50 | + - Although the detached Sidecar pattern (running on `port + 1`) is currently used to guarantee the Edge Server's critical event loop remains unblocked, the ultimate goal is to **natively integrate** the control plane into the main Edge Server application. This will unify the deployment footprint and eliminate the need to run two separate Python processes. |
| 51 | + |
| 52 | +2. **Real Rerun.io Server Integration:** |
| 53 | + - Replace the `dummy_rerun_server.py` mock with the actual `rerun-sdk` data integration pipeline. This involves piping the live telemetry and `.quic` video streams directly into the native Rerun data plane for real-time 3D and spatial debugging. |
| 54 | + |
| 55 | +3. **Expanded Configuration & Data Discovery:** |
| 56 | + - Extend the `config.json` schema to encompass a broader range of telemetry locations and application logs. |
| 57 | + - Update `pages/capture.py` to ingest and parse these diverse logs globally, providing a centralized diagnostic view beyond just the MQTT streams and media uploads. |
| 58 | + |
| 59 | +4. **Agentic Framework Hookup:** |
| 60 | + - Replace the UI stubs in `pages/agent.py` with actual programmatic calls to the underlying AllSpark Agentic nodes. This will allow the control plane to seamlessly dispatch identified MQTT anomalies directly to the Vertex/LLM-Farm agents and display their diagnostic reports dynamically within the dashboard. |
| 61 | + |
| 62 | +5. **Unified Security & Authentication:** |
| 63 | + - Sync the SSL context and authentication layers from the main Edge app into the control plane so that remote debugging over the network is secured with JWT or other token-based guards. |