| 1 | +# Future Runtime: event bus, state, and shutdown |
| 2 | + |
| 3 | +## Scope |
| 4 | + |
| 5 | +This file defines the future `src/runtime/` boundary for the Polymarket CLI. |
| 6 | +The runtime layer owns process-wide coordination that does not belong inside a |
| 7 | +single command, adapter, or renderer: |
| 8 | + |
| 9 | +- a typed event bus for internal status changes, |
| 10 | +- a small state machine for lifecycle reporting, |
| 11 | +- a shutdown path that drains in-flight work before the process exits, |
| 12 | +- a stable API that command modules can use without depending on a specific |
| 13 | + async executor. |
| 14 | + |
| 15 | +The near-term target module is `src/runtime/mod.rs`. Follow-up modules can split |
| 16 | +this into `event_bus.rs`, `state.rs`, and `shutdown.rs` once the first runtime |
| 17 | +stub lands. |
| 18 | + |
| 19 | +## Mission |
| 20 | + |
| 21 | +Commands should be able to start workers, publish progress, react to shutdown, |
| 22 | +and expose state to the TUI or JSON output without each command inventing its |
| 23 | +own channels and signal handling. The runtime should stay boring: predictable |
| 24 | +types, bounded queues, explicit shutdown reasons, and no hidden background work. |
| 25 | + |
| 26 | +## Current state |
| 27 | + |
| 28 | +Runtime behavior is mostly implicit. Command handlers own their own loops and |
| 29 | +shutdown decisions. State is usually printed directly, which makes it hard to |
| 30 | +offer the same information to logs, the TUI, and machine-readable output. The |
| 31 | +existing protocol file, `docs/future/PROTOCOL.md`, contains the broad future |
| 32 | +catalog, but this split file is the runtime-specific contract for Phase B. |
| 33 | + |
| 34 | +## Pain points |
| 35 | + |
| 36 | +- [Runtime-1] Event shape is not centralized. A command can emit a message that |
| 37 | + another command cannot parse, so shared output and monitoring stay brittle. |
| 38 | +- [Runtime-2] State transitions are inferred from logs instead of represented |
| 39 | + as data. That makes resume, status, and graceful shutdown difficult to test. |
| 40 | +- [Runtime-3] Shutdown semantics are inconsistent. Some paths exit immediately, |
| 41 | + while others wait for work to finish, and callers cannot tell which occurred. |
| 42 | +- [Runtime-4] Backpressure is undefined. A hot feed can overwhelm a slow UI or |
| 43 | + logger because the queue policy is not named. |
| 44 | +- [Runtime-5] Async runtime choice is premature. The CLI needs an interface that |
| 45 | + can be implemented with synchronous tests first and swapped to Tokio later. |
| 46 | + |
| 47 | +## Proposals |
| 48 | + |
| 49 | +### [Runtime-1] Typed runtime events |
| 50 | + |
| 51 | +Define `RuntimeEvent` as the only event shape that crosses runtime boundaries. |
| 52 | +Initial variants should cover lifecycle transitions, command progress, |
| 53 | +warnings, and shutdown requests. Payloads should be small and cloneable. |
| 54 | + |
| 55 | +Acceptance: |
| 56 | + |
| 57 | +- event producers call `EventBus::publish(RuntimeEvent)`, |
| 58 | +- command code never sends raw strings through runtime channels, |
| 59 | +- every event variant has a documented consumer expectation. |
| 60 | + |
| 61 | +### [Runtime-2] Explicit lifecycle state |
| 62 | + |
| 63 | +Represent runtime status as `RuntimeState`, with `Starting`, `Running`, |
| 64 | +`Draining`, `Stopped`, and `Failed` states. State changes should be emitted as |
| 65 | +events and stored as the last-known state for status commands. |
| 66 | + |
| 67 | +Acceptance: |
| 68 | + |
| 69 | +- a new command can report lifecycle through state transitions only, |
| 70 | +- tests can assert state without parsing logs, |
| 71 | +- failure carries a short reason string or typed error code. |
| 72 | + |
| 73 | +### [Runtime-3] Graceful shutdown contract |
| 74 | + |
| 75 | +Introduce `ShutdownSignal` with a reason and drain policy. The default policy is |
| 76 | +soft drain: stop accepting new work, emit a shutdown event, flush queued events, |
| 77 | +then exit. Hard stop is reserved for corrupted state, repeated signal delivery, |
| 78 | +or operator-requested abort. |
| 79 | + |
| 80 | +Acceptance: |
| 81 | + |
| 82 | +- repeating the same shutdown request is idempotent, |
| 83 | +- a second stronger request can escalate to hard stop, |
| 84 | +- shutdown state is visible to TUI and JSON callers before exit. |
| 85 | + |
| 86 | +### [Runtime-4] Bounded event queue |
| 87 | + |
| 88 | +Use a bounded in-memory queue for the first implementation. When the queue is |
| 89 | +full, keep shutdown and failure events and drop low-priority progress events, |
| 90 | +counting each drop. This keeps the process responsive under noisy market feeds. |
| 91 | + |
| 92 | +Acceptance: |
| 93 | + |
| 94 | +- queue capacity is configurable in `RuntimeConfig`, |
| 95 | +- dropped progress count is exposed as runtime state, |
| 96 | +- shutdown and failure events are never silently dropped. |
| 97 | + |
| 98 | +### [Runtime-5] Executor-neutral API |
| 99 | + |
| 100 | +Keep the first `src/runtime/` API synchronous and dependency-light. A future |
| 101 | +Tokio-backed implementation can sit behind the same bus and shutdown types |
| 102 | +without forcing every command module to become async. |
| 103 | + |
| 104 | +Acceptance: |
| 105 | + |
| 106 | +- the runtime stub compiles without external crates, |
| 107 | +- command modules can own their own blocking work while publishing events, |
| 108 | +- async integration is deferred until a command demonstrates a real need. |
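
One way to picture the executor-neutral seam is a small trait that commands publish through. This is a minimal sketch, not the planned API: `EventSink`, `VecSink`, `Event`, and the `sync-markets` command name are all hypothetical and only illustrate that command code can stay blocking while a later Tokio-backed sink implements the same trait.

```rust
/// Hypothetical stand-in for `RuntimeEvent`, reduced to one variant.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Event {
    Progress { command: String, message: String },
}

/// Hypothetical seam: commands depend on this trait, not on an executor.
pub trait EventSink {
    fn publish(&mut self, event: Event);
}

/// Synchronous in-memory sink, enough for unit tests.
#[derive(Debug, Default)]
pub struct VecSink {
    pub events: Vec<Event>,
}

impl EventSink for VecSink {
    fn publish(&mut self, event: Event) {
        self.events.push(event);
    }
}

/// A command that owns its blocking loop and reports progress via the sink.
pub fn sync_markets(sink: &mut dyn EventSink, pages: u32) {
    for page in 1..=pages {
        // Blocking work (for example an HTTP call) would happen here.
        sink.publish(Event::Progress {
            command: "sync-markets".to_string(),
            message: format!("page {page} of {pages}"),
        });
    }
}
```

Because `sync_markets` only sees `&mut dyn EventSink`, a test can pass a `VecSink` and assert on the collected events without starting any runtime.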
| 109 | + |
| 110 | +Deferred: |
| 111 | + |
| 112 | +- OS signal registration is deferred to the command runner because it depends on |
| 113 | + the final CLI entrypoint shape. |
| 114 | +- Cross-process event persistence is deferred until storage design is ready. |
| 115 | + |
| 116 | +## Runtime state model |
| 117 | + |
| 118 | +The state model should be small enough to render in one status row: |
| 119 | + |
| 120 | +| State | Meaning | Exit behavior | |
| 121 | +| --- | --- | --- | |
| 122 | +| `Starting` | Runtime is building command resources. | no exit | |
| 123 | +| `Running` | Runtime accepts events and work. | no exit | |
| 124 | +| `Draining` | Runtime rejects new work and flushes current work. | exits after drain | |
| 125 | +| `Stopped` | Runtime completed normally. | exit code 0 | |
| 126 | +| `Failed` | Runtime stopped with a failure reason. | non-zero exit | |
| 127 | + |
| 128 | +State transitions should be monotonic after shutdown starts. `Draining` can move |
| 129 | +to `Stopped` or `Failed`; it should not move back to `Running`. |
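
The monotonic rule can be expressed as a small transition predicate that tests call directly. The sketch below is illustrative only: `Phase` is a hypothetical payload-free mirror of `RuntimeState`, `can_transition` is not part of the Appendix stub, and allowing `Starting` or `Running` to jump straight to `Stopped` is an assumption made to cover hard stops.

```rust
/// Hypothetical payload-free mirror of `RuntimeState`, for illustration only.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Phase {
    Starting,
    Running,
    Draining,
    Stopped,
    Failed,
}

/// Returns true if moving from `from` to `to` respects the monotonic rule:
/// once shutdown starts, the runtime never returns to `Running`.
pub fn can_transition(from: Phase, to: Phase) -> bool {
    use Phase::*;
    match (from, to) {
        (Starting, Running) => true,
        // Shutdown or failure may begin before the runtime is fully running.
        (Starting | Running, Draining | Stopped | Failed) => true,
        (Draining, Stopped | Failed) => true,
        // `Stopped` and `Failed` are terminal; everything else is rejected.
        _ => false,
    }
}
```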
| 130 | + |
| 131 | +## Event bus contract |
| 132 | + |
| 133 | +The event bus should be the narrow waist between command modules and observers. |
| 134 | +It should not know about terminal rendering, HTTP clients, market schemas, or |
| 135 | +storage. Those modules translate their own domain events into `RuntimeEvent` |
| 136 | +values before publishing. |
| 137 | + |
| 138 | +Minimum event variants: |
| 139 | + |
| 140 | +- `RuntimeEvent::StateChanged(RuntimeState)`, |
| 141 | +- `RuntimeEvent::Progress { command, message }`, |
| 142 | +- `RuntimeEvent::Warning { code, message }`, |
| 143 | +- `RuntimeEvent::ShutdownRequested(ShutdownSignal)`, |
| 144 | +- `RuntimeEvent::DroppedProgress { count }`. |
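
The translation step could look like the sketch below, where a feed module converts its own events before publishing. Everything here is assumed for illustration: `FeedUpdate`, its variants, the `feed` command label, and the `stale-quote` warning code do not exist in the codebase, and the `RuntimeEvent` shown is a trimmed copy of two variants from the list above.

```rust
/// Trimmed copy of two `RuntimeEvent` variants, for illustration only.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum RuntimeEvent {
    Progress { command: String, message: String },
    Warning { code: String, message: String },
}

/// Hypothetical domain event owned by a market-feed module.
#[derive(Clone, Debug, PartialEq)]
pub enum FeedUpdate {
    PriceTick { market: String, price: f64 },
    StaleQuote { market: String, age_secs: u64 },
}

/// The feed module, not the bus, decides how its events map to runtime events.
impl From<FeedUpdate> for RuntimeEvent {
    fn from(update: FeedUpdate) -> Self {
        match update {
            FeedUpdate::PriceTick { market, price } => RuntimeEvent::Progress {
                command: "feed".to_string(),
                message: format!("{market} @ {price:.2}"),
            },
            FeedUpdate::StaleQuote { market, age_secs } => RuntimeEvent::Warning {
                code: "stale-quote".to_string(),
                message: format!("{market} quote is {age_secs}s old"),
            },
        }
    }
}
```

With a conversion like this, the bus only ever sees `RuntimeEvent` values and stays unaware of market schemas.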
| 145 | + |
| 146 | +## Shutdown contract |
| 147 | + |
| 148 | +Shutdown should be observable and repeatable: |
| 149 | + |
| 150 | +1. Receive a `ShutdownSignal`. |
| 151 | +2. Publish `ShutdownRequested`. |
| 152 | +3. Move state from `Starting` or `Running` to `Draining`. |
| 153 | +4. Stop accepting new command work. |
| 154 | +5. Drain queued events. |
| 155 | +6. Move to `Stopped` or `Failed`. |
| 156 | + |
| 157 | +A hard stop can skip drain, but it must still update state when possible. |
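
The steps above can be sketched with a toy model that makes the idempotent and escalating cases testable. This is a sketch under stated assumptions, not the stub itself: `Runtime`, `Policy`, `State`, `finish_drain`, and the string-based `log` are hypothetical simplifications, and a hard stop here leaves queued events unflushed.

```rust
/// Hypothetical, trimmed-down types; the Appendix stub has fuller versions.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Policy {
    SoftDrain,
    HardStop,
}

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum State {
    Running,
    Draining,
    Stopped,
}

#[derive(Debug)]
pub struct Runtime {
    pub state: State,
    pub queued: Vec<String>,
    pub log: Vec<String>,
}

impl Runtime {
    pub fn new(queued: Vec<String>) -> Self {
        Self { state: State::Running, queued, log: Vec::new() }
    }

    /// Steps 1 to 3: record the request and update state.
    pub fn request_shutdown(&mut self, policy: Policy) {
        match (self.state, policy) {
            // Already stopped: nothing left to do.
            (State::Stopped, _) => {}
            // Repeated soft request while draining: idempotent no-op.
            (State::Draining, Policy::SoftDrain) => {}
            // Hard stop, first or escalated: skip the drain, still update state.
            (_, Policy::HardStop) => {
                self.log.push("shutdown-requested".to_string());
                self.state = State::Stopped;
            }
            (State::Running, Policy::SoftDrain) => {
                self.log.push("shutdown-requested".to_string());
                self.state = State::Draining;
            }
        }
    }

    /// Step 4: new work is accepted only while running.
    pub fn accepts_work(&self) -> bool {
        self.state == State::Running
    }

    /// Steps 5 and 6: flush queued events, then stop. No-op unless draining.
    pub fn finish_drain(&mut self) -> Vec<String> {
        if self.state != State::Draining {
            return Vec::new();
        }
        let flushed = std::mem::take(&mut self.queued);
        self.state = State::Stopped;
        flushed
    }
}
```

In this model a second soft request leaves the log untouched, while a hard request after a soft one is logged again and moves straight to `Stopped`.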
| 158 | + |
| 159 | +## Verification notes |
| 160 | + |
| 161 | +The Appendix Rust block is self-contained so it can be compiled as a library |
| 162 | +stub before `src/runtime/` exists. The target copy path for Phase B is |
| 163 | +`src/runtime/mod.rs`. |
| 164 | + |
| 165 | +## Appendix: `src/runtime/mod.rs` starter stub |
| 166 | + |
| 167 | +```rust |
| 168 | +use std::collections::VecDeque; |
| 169 | + |
| 170 | +/// Lifecycle state. Once shutdown starts, it never returns to `Running`. |
| 171 | +#[derive(Clone, Debug, Eq, PartialEq)] |
| 172 | +pub enum RuntimeState { |
| 173 | +    Starting, |
| 174 | +    Running, |
| 175 | +    Draining { reason: ShutdownReason }, |
| 176 | +    Stopped, |
| 177 | +    Failed { reason: String }, |
| 178 | +} |
| 179 | + |
| 180 | +#[derive(Clone, Debug, Eq, PartialEq)] |
| 181 | +pub enum ShutdownReason { |
| 182 | +    Operator, |
| 183 | +    Signal, |
| 184 | +    InternalError, |
| 185 | +} |
| 186 | + |
| 187 | +#[derive(Clone, Debug, Eq, PartialEq)] |
| 188 | +pub enum DrainPolicy { |
| 189 | +    SoftDrain, |
| 190 | +    HardStop, |
| 191 | +} |
| 192 | + |
| 193 | +#[derive(Clone, Debug, Eq, PartialEq)] |
| 194 | +pub struct ShutdownSignal { |
| 195 | +    pub reason: ShutdownReason, |
| 196 | +    pub policy: DrainPolicy, |
| 197 | +} |
| 198 | + |
| 199 | +impl ShutdownSignal { |
| 200 | +    pub fn soft(reason: ShutdownReason) -> Self { |
| 201 | +        Self { |
| 202 | +            reason, |
| 203 | +            policy: DrainPolicy::SoftDrain, |
| 204 | +        } |
| 205 | +    } |
| 206 | + |
| 207 | +    pub fn hard(reason: ShutdownReason) -> Self { |
| 208 | +        Self { |
| 209 | +            reason, |
| 210 | +            policy: DrainPolicy::HardStop, |
| 211 | +        } |
| 212 | +    } |
| 213 | +} |
| 214 | + |
| 215 | +#[derive(Clone, Debug, Eq, PartialEq)] |
| 216 | +pub enum RuntimeEvent { |
| 217 | +    StateChanged(RuntimeState), |
| 218 | +    Progress { command: String, message: String }, |
| 219 | +    Warning { code: String, message: String }, |
| 220 | +    ShutdownRequested(ShutdownSignal), |
| 221 | +    DroppedProgress { count: u64 }, |
| 222 | +} |
| 223 | + |
| 224 | +#[derive(Clone, Debug, Eq, PartialEq)] |
| 225 | +pub struct RuntimeConfig { |
| 226 | +    pub event_capacity: usize, |
| 227 | +} |
| 228 | + |
| 229 | +impl Default for RuntimeConfig { |
| 230 | +    fn default() -> Self { |
| 231 | +        Self { event_capacity: 256 } |
| 232 | +    } |
| 233 | +} |
| 234 | + |
| 235 | +#[derive(Debug)] |
| 236 | +pub struct EventBus { |
| 237 | +    capacity: usize, |
| 238 | +    events: VecDeque<RuntimeEvent>, |
| 239 | +    dropped_progress: u64, |
| 240 | +} |
| 241 | + |
| 242 | +impl EventBus { |
| 243 | +    pub fn new(config: RuntimeConfig) -> Self { |
| 244 | +        Self { |
| 245 | +            // Clamp to 1 so the queue can always hold at least one event. |
| 246 | +            capacity: config.event_capacity.max(1), |
| 247 | +            events: VecDeque::new(), |
| 248 | +            dropped_progress: 0, |
| 249 | +        } |
| 250 | +    } |
| 251 | + |
| 252 | +    /// Queues an event. When the queue is full, incoming progress is dropped |
| 253 | +    /// and counted; other events evict an older, non-critical event. |
| 254 | +    pub fn publish(&mut self, event: RuntimeEvent) { |
| 255 | +        if self.events.len() < self.capacity { |
| 256 | +            self.events.push_back(event); |
| 257 | +            return; |
| 258 | +        } |
| 259 | + |
| 260 | +        if matches!(event, RuntimeEvent::Progress { .. }) { |
| 261 | +            self.dropped_progress = self.dropped_progress.saturating_add(1); |
| 262 | +            return; |
| 263 | +        } |
| 264 | + |
| 265 | +        // Critical events are always stored, even past `capacity`; other |
| 266 | +        // events are stored only if a slot could be freed. |
| 267 | +        if self.evict_one() || Self::is_critical(&event) { |
| 268 | +            self.events.push_back(event); |
| 269 | +        } |
| 270 | +    } |
| 271 | + |
| 272 | +    /// Removes and returns all queued events, oldest first. |
| 273 | +    pub fn drain(&mut self) -> Vec<RuntimeEvent> { |
| 274 | +        self.events.drain(..).collect() |
| 275 | +    } |
| 276 | + |
| 277 | +    pub fn dropped_progress(&self) -> u64 { |
| 278 | +        self.dropped_progress |
| 279 | +    } |
| 280 | + |
| 281 | +    /// Shutdown requests and failures must never be dropped. |
| 282 | +    fn is_critical(event: &RuntimeEvent) -> bool { |
| 283 | +        matches!( |
| 284 | +            event, |
| 285 | +            RuntimeEvent::ShutdownRequested(_) |
| 286 | +                | RuntimeEvent::StateChanged(RuntimeState::Failed { .. }) |
| 287 | +        ) |
| 288 | +    } |
| 289 | + |
| 290 | +    /// Frees one slot: the oldest progress event if there is one (counted as |
| 291 | +    /// dropped), otherwise the oldest non-critical event. Returns `false` |
| 292 | +    /// when only critical events are queued. |
| 293 | +    fn evict_one(&mut self) -> bool { |
| 294 | +        let progress = self |
| 295 | +            .events |
| 296 | +            .iter() |
| 297 | +            .position(|event| matches!(event, RuntimeEvent::Progress { .. })); |
| 298 | +        if let Some(index) = progress { |
| 299 | +            self.events.remove(index); |
| 300 | +            self.dropped_progress = self.dropped_progress.saturating_add(1); |
| 301 | +            return true; |
| 302 | +        } |
| 303 | + |
| 304 | +        let other = self.events.iter().position(|event| !Self::is_critical(event)); |
| 305 | +        if let Some(index) = other { |
| 306 | +            self.events.remove(index); |
| 307 | +            return true; |
| 308 | +        } |
| 309 | + |
| 310 | +        false |
| 311 | +    } |
| 312 | +} |
| 313 | + |
| 314 | +#[derive(Debug)] |
| 315 | +pub struct RuntimeController { |
| 316 | +    state: RuntimeState, |
| 317 | +    bus: EventBus, |
| 318 | +} |
| 319 | + |
| 320 | +impl RuntimeController { |
| 321 | +    pub fn new(config: RuntimeConfig) -> Self { |
| 322 | +        let mut bus = EventBus::new(config); |
| 323 | +        let state = RuntimeState::Starting; |
| 324 | +        bus.publish(RuntimeEvent::StateChanged(state.clone())); |
| 325 | +        Self { state, bus } |
| 326 | +    } |
| 327 | + |
| 328 | +    /// Moves `Starting` to `Running`; ignored in every other state. |
| 329 | +    pub fn mark_running(&mut self) { |
| 330 | +        if self.state == RuntimeState::Starting { |
| 331 | +            self.set_state(RuntimeState::Running); |
| 332 | +        } |
| 333 | +    } |
| 334 | + |
| 335 | +    pub fn request_shutdown(&mut self, signal: ShutdownSignal) { |
| 336 | +        if self.is_terminal() { |
| 337 | +            return; |
| 338 | +        } |
| 339 | + |
| 340 | +        // A repeated soft request while draining is a no-op; a hard request |
| 341 | +        // still escalates. |
| 342 | +        if matches!(self.state, RuntimeState::Draining { .. }) |
| 343 | +            && signal.policy == DrainPolicy::SoftDrain |
| 344 | +        { |
| 345 | +            return; |
| 346 | +        } |
| 347 | + |
| 348 | +        self.bus |
| 349 | +            .publish(RuntimeEvent::ShutdownRequested(signal.clone())); |
| 350 | + |
| 351 | +        match signal.policy { |
| 352 | +            DrainPolicy::SoftDrain => { |
| 353 | +                self.set_state(RuntimeState::Draining { |
| 354 | +                    reason: signal.reason, |
| 355 | +                }); |
| 356 | +            } |
| 357 | +            DrainPolicy::HardStop => { |
| 358 | +                self.set_state(RuntimeState::Stopped); |
| 359 | +            } |
| 360 | +        } |
| 361 | +    } |
| 362 | + |
| 363 | +    /// Completes a soft drain by moving `Draining` to `Stopped`. |
| 364 | +    pub fn finish_drain(&mut self) { |
| 365 | +        if matches!(self.state, RuntimeState::Draining { .. }) { |
| 366 | +            self.set_state(RuntimeState::Stopped); |
| 367 | +        } |
| 368 | +    } |
| 369 | + |
| 370 | +    /// Records a failure unless the runtime already reached a terminal state. |
| 371 | +    pub fn fail(&mut self, reason: impl Into<String>) { |
| 372 | +        if self.is_terminal() { |
| 373 | +            return; |
| 374 | +        } |
| 375 | + |
| 376 | +        self.set_state(RuntimeState::Failed { |
| 377 | +            reason: reason.into(), |
| 378 | +        }); |
| 379 | +    } |
| 380 | + |
| 381 | +    pub fn state(&self) -> &RuntimeState { |
| 382 | +        &self.state |
| 383 | +    } |
| 384 | + |
| 385 | +    pub fn bus_mut(&mut self) -> &mut EventBus { |
| 386 | +        &mut self.bus |
| 387 | +    } |
| 388 | + |
| 389 | +    fn is_terminal(&self) -> bool { |
| 390 | +        matches!(self.state, RuntimeState::Stopped | RuntimeState::Failed { .. }) |
| 391 | +    } |
| 392 | + |
| 393 | +    fn set_state(&mut self, state: RuntimeState) { |
| 394 | +        self.state = state.clone(); |
| 395 | +        self.bus.publish(RuntimeEvent::StateChanged(state)); |
| 396 | +    } |
| 397 | +} |
| 398 | +``` |