Commit 486bbe5

NagyVikt authored
Add future runtime protocol doc (#207)
Co-authored-by: NagyVikt <nagy.viktordp@gmail.com>
1 parent fd801aa commit 486bbe5

1 file changed

Lines changed: 352 additions & 0 deletions

docs/future/02-runtime.md

@@ -0,0 +1,352 @@
# Future Runtime: event bus, state, and shutdown

## Scope

This file defines the future `src/runtime/` boundary for the Polymarket CLI.
The runtime layer owns process-wide coordination that does not belong inside a
single command, adapter, or renderer:

- a typed event bus for internal status changes,
- a small state machine for lifecycle reporting,
- a shutdown path that drains in-flight work before the process exits,
- a stable API that command modules can use without depending on a specific
  async executor.

The near-term target module is `src/runtime/mod.rs`. Follow-up modules can split
this into `event_bus.rs`, `state.rs`, and `shutdown.rs` once the first runtime
stub lands.

## Mission

Commands should be able to start workers, publish progress, react to shutdown,
and expose state to the TUI or JSON output without each command inventing its
own channels and signal handling. The runtime should stay boring: predictable
types, bounded queues, explicit shutdown reasons, and no hidden background work.

## Current state

Runtime behavior is mostly implicit. Command handlers own their own loops and
shutdown decisions. State is usually printed directly, which makes it hard to
offer the same information to logs, the TUI, and machine-readable output. The
existing protocol file, `docs/future/PROTOCOL.md`, contains the broad future
catalog, but this split file is the runtime-specific contract for Phase B.

## Pain points

- [Runtime-1] Event shape is not centralized. A command can emit a message that
  another command cannot parse, so shared output and monitoring stay brittle.
- [Runtime-2] State transitions are inferred from logs instead of represented
  as data. That makes resume, status, and graceful shutdown difficult to test.
- [Runtime-3] Shutdown semantics are inconsistent. Some paths exit immediately,
  while others wait for work to finish, and callers cannot tell which occurred.
- [Runtime-4] Backpressure is undefined. A hot feed can overwhelm a slow UI or
  logger because the queue policy is not named.
- [Runtime-5] Async runtime choice is premature. The CLI needs an interface that
  can be implemented with synchronous tests first and swapped to Tokio later.

## Proposals

### [Runtime-1] Typed runtime events

Define `RuntimeEvent` as the only event shape that crosses runtime boundaries.
Initial variants should cover lifecycle transitions, command progress,
warnings, and shutdown requests. Payloads should be small and cloneable.

Acceptance:

- event producers call `EventBus::publish(RuntimeEvent)`,
- command code never sends raw strings through runtime channels,
- every event variant has a documented consumer expectation.

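One way to keep "every event variant has a documented consumer expectation" honest is an exhaustive `match` in each consumer. The sketch below is illustrative only: it uses a trimmed copy of `RuntimeEvent`, and `render_line` is a hypothetical consumer that is not part of the proposal.

```rust
/// Trimmed copy of the event type from the appendix, for illustration only.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum RuntimeEvent {
    Progress { command: String, message: String },
    Warning { code: String, message: String },
    DroppedProgress { count: u64 },
}

/// Hypothetical consumer that turns an event into one log line.
/// The match has no wildcard arm, so adding a variant to `RuntimeEvent`
/// fails to compile until this consumer says how it handles the new variant.
pub fn render_line(event: &RuntimeEvent) -> String {
    match event {
        RuntimeEvent::Progress { command, message } => format!("[{command}] {message}"),
        RuntimeEvent::Warning { code, message } => format!("warning {code}: {message}"),
        RuntimeEvent::DroppedProgress { count } => {
            format!("dropped {count} progress events")
        }
    }
}
```

Because producers build enum values rather than strings, a consumer never has to parse message text to learn what kind of event it received.
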
### [Runtime-2] Explicit lifecycle state

Represent runtime status as `RuntimeState`, with `Starting`, `Running`,
`Draining`, `Stopped`, and `Failed` states. State changes should be emitted as
events and stored as the last-known state for status commands.

Acceptance:

- a new command can report lifecycle through state transitions only,
- tests can assert state without parsing logs,
- failure carries a short reason string or typed error code.

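As a rough illustration of asserting state without parsing logs, the sketch below stores the last-known state next to a transition history. `StatusTracker` and `FailureCode` are hypothetical names, and the typed failure code is only one of the two options the acceptance list allows (the appendix stub uses a reason string instead).

```rust
/// Hypothetical typed error codes; the real set is not defined in this document.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum FailureCode {
    FeedDisconnected,
    ConfigInvalid,
}

/// Simplified state enum: `Failed` carries a typed code here.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum RuntimeState {
    Starting,
    Running,
    Draining,
    Stopped,
    Failed(FailureCode),
}

/// Hypothetical holder for the last-known state plus every recorded transition.
#[derive(Debug)]
pub struct StatusTracker {
    last: RuntimeState,
    history: Vec<RuntimeState>,
}

impl StatusTracker {
    pub fn new() -> Self {
        Self {
            last: RuntimeState::Starting,
            history: vec![RuntimeState::Starting],
        }
    }

    /// Records a transition and makes it the last-known state.
    pub fn record(&mut self, next: RuntimeState) {
        self.last = next.clone();
        self.history.push(next);
    }

    pub fn last_known(&self) -> &RuntimeState {
        &self.last
    }

    pub fn history(&self) -> &[RuntimeState] {
        &self.history
    }
}
```

A test can then compare `last_known()` against an expected variant directly, and a status command can read the same value without replaying events.
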
### [Runtime-3] Graceful shutdown contract

Introduce `ShutdownSignal` with a reason and drain policy. The default policy is
soft drain: stop accepting new work, emit a shutdown event, flush queued events,
then exit. Hard stop is reserved for corrupted state, repeated signal delivery,
or operator-requested abort.

Acceptance:

- repeating a shutdown request with the same policy is idempotent,
- a second, stronger request can escalate from soft drain to hard stop,
- shutdown state is visible to TUI and JSON callers before exit.

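The idempotency and escalation rules can be expressed as a small latch that only ever moves toward the stronger policy. This is a sketch under assumptions: `ShutdownLatch` is a hypothetical helper, and deriving `Ord` on `DrainPolicy` (so that `SoftDrain < HardStop`) is a choice made for this example, not something the appendix stub does.

```rust
/// Variant order matters: the derived `Ord` makes `SoftDrain < HardStop`.
#[derive(Clone, Copy, Debug, Eq, Ord, PartialEq, PartialOrd)]
pub enum DrainPolicy {
    SoftDrain,
    HardStop,
}

/// Hypothetical latch that remembers the strongest shutdown policy requested.
#[derive(Debug, Default)]
pub struct ShutdownLatch {
    active: Option<DrainPolicy>,
}

impl ShutdownLatch {
    /// Returns `true` only when the request changes the policy in force:
    /// the first request, or an escalation from soft drain to hard stop.
    /// Repeats and weaker requests are no-ops and return `false`.
    pub fn request(&mut self, policy: DrainPolicy) -> bool {
        match self.active {
            Some(current) if policy <= current => false,
            _ => {
                self.active = Some(policy);
                true
            }
        }
    }

    pub fn active(&self) -> Option<DrainPolicy> {
        self.active
    }
}
```

A caller could publish `ShutdownRequested` only when `request` returns `true`, which keeps repeated signal delivery from flooding the bus.
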
### [Runtime-4] Bounded event queue

Use a bounded in-memory queue for the first implementation. When full, preserve
shutdown and failure events, then drop low-priority progress events with a
counter. This keeps the process responsive under noisy market feeds.

Acceptance:

- queue capacity is configurable in `RuntimeConfig`,
- dropped progress count is exposed as runtime state,
- shutdown and failure events are never silently dropped.

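The sketch below shows one possible way to surface the dropped-progress counter to observers: append a summary event when the queue is drained. It uses a simplified `Event` type, and `BoundedQueue` and `drain_with_summary` are hypothetical names. It also assumes that a failure event may push the queue one past capacity when there is no progress event left to evict.

```rust
use std::collections::VecDeque;

/// Simplified stand-in for `RuntimeEvent`.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Event {
    Progress(String),
    Failure(String),
    DroppedProgress { count: u64 },
}

/// Hypothetical bounded queue that favors failures over progress.
pub struct BoundedQueue {
    capacity: usize,
    events: VecDeque<Event>,
    dropped: u64,
}

impl BoundedQueue {
    pub fn new(capacity: usize) -> Self {
        Self { capacity: capacity.max(1), events: VecDeque::new(), dropped: 0 }
    }

    pub fn publish(&mut self, event: Event) {
        if self.events.len() < self.capacity {
            self.events.push_back(event);
            return;
        }
        match event {
            // Incoming progress is dropped and counted when the queue is full.
            Event::Progress(_) => self.dropped = self.dropped.saturating_add(1),
            other => {
                // Evict the oldest queued progress event to make room.
                let oldest = self.events.iter().position(|e| matches!(e, Event::Progress(_)));
                if let Some(index) = oldest {
                    self.events.remove(index);
                    self.dropped = self.dropped.saturating_add(1);
                }
                // Assumption: queue the event even if nothing could be evicted.
                self.events.push_back(other);
            }
        }
    }

    /// Drains the queue and appends one summary event if anything was dropped.
    pub fn drain_with_summary(&mut self) -> Vec<Event> {
        let mut out: Vec<Event> = self.events.drain(..).collect();
        if self.dropped > 0 {
            out.push(Event::DroppedProgress { count: self.dropped });
            self.dropped = 0;
        }
        out
    }
}
```

With a capacity of 2, publishing three progress events followed by one failure leaves the second progress event, the failure, and a `DroppedProgress { count: 2 }` summary in the drained output.
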
### [Runtime-5] Executor-neutral API

Keep the first `src/runtime/` API synchronous and dependency-light. A future
Tokio-backed implementation can sit behind the same bus and shutdown types
without forcing every command module to become async.

Acceptance:

- the runtime stub compiles without external crates,
- command modules can own their own blocking work while publishing events,
- async integration is deferred until a command demonstrates a real need.

Deferred:
111+
112+
- OS signal registration is deferred to the command runner because it depends on
113+
the final CLI entrypoint shape.
114+
- Cross-process event persistence is deferred until storage design is ready.
115+
## Runtime state model

The state model should be small enough to render in one status row:

| State | Meaning | Exit behavior |
| --- | --- | --- |
| `Starting` | Runtime is building command resources. | no exit |
| `Running` | Runtime accepts events and work. | no exit |
| `Draining` | Runtime rejects new work and flushes current work. | exits after drain |
| `Stopped` | Runtime completed normally. | exit code 0 |
| `Failed` | Runtime stopped with a failure reason. | non-zero exit |

State transitions should be monotonic after shutdown starts. `Draining` can move
to `Stopped` or `Failed`; it should not move back to `Running`.

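The table and the monotonic rule can be captured in two small functions. In this sketch, only the edges out of `Draining` come from the text above. The edges out of `Starting` and `Running`, the helper names `can_transition` and `exit_code`, and the choice of `1` as the non-zero exit code are assumptions for illustration.

```rust
/// Payload-free copy of the state enum, for illustration only.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum RuntimeState {
    Starting,
    Running,
    Draining,
    Stopped,
    Failed,
}

/// Returns whether `from -> to` is an allowed transition.
/// `Stopped` and `Failed` are terminal, and nothing returns to `Running`
/// once shutdown has started.
pub fn can_transition(from: RuntimeState, to: RuntimeState) -> bool {
    use RuntimeState::*;
    matches!(
        (from, to),
        (Starting, Running)
            | (Starting | Running, Draining | Stopped | Failed) // assumed edges
            | (Draining, Stopped | Failed)
    )
}

/// Maps a state to the process exit code from the table, if it implies an exit.
pub fn exit_code(state: RuntimeState) -> Option<i32> {
    match state {
        RuntimeState::Stopped => Some(0),
        RuntimeState::Failed => Some(1), // assumed value; the table only says non-zero
        RuntimeState::Starting | RuntimeState::Running | RuntimeState::Draining => None,
    }
}
```

A state setter could call `can_transition` before publishing `StateChanged`, which would turn an illegal move such as `Draining` to `Running` into a test failure instead of a confusing status row.
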
## Event bus contract

The event bus should be the narrow waist between command modules and observers.
It should not know about terminal rendering, HTTP clients, market schemas, or
storage. Those modules translate their own domain events into `RuntimeEvent`
values before publishing.

Minimum event variants:

- `RuntimeEvent::StateChanged(RuntimeState)`,
- `RuntimeEvent::Progress { command, message }`,
- `RuntimeEvent::Warning { code, message }`,
- `RuntimeEvent::ShutdownRequested(ShutdownSignal)`,
- `RuntimeEvent::DroppedProgress { count }`.

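Translation at the module boundary could be done with a `From` implementation, so the bus never sees domain types. The sketch below is illustrative: `FeedEvent`, its variants, and the `"rate-limited"` warning code are hypothetical, and `RuntimeEvent` is trimmed to two variants.

```rust
/// Trimmed copy of the event type, for illustration only.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum RuntimeEvent {
    Progress { command: String, message: String },
    Warning { code: String, message: String },
}

/// Hypothetical domain event owned by a feed module.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum FeedEvent {
    PageFetched { page: u32 },
    RateLimited { retry_after_secs: u64 },
}

/// The feed module owns the translation; the bus only ever sees `RuntimeEvent`.
impl From<FeedEvent> for RuntimeEvent {
    fn from(event: FeedEvent) -> Self {
        match event {
            FeedEvent::PageFetched { page } => RuntimeEvent::Progress {
                command: "feed".to_string(),
                message: format!("fetched page {page}"),
            },
            FeedEvent::RateLimited { retry_after_secs } => RuntimeEvent::Warning {
                code: "rate-limited".to_string(),
                message: format!("retry in {retry_after_secs}s"),
            },
        }
    }
}
```

A producer would then publish `FeedEvent::PageFetched { page: 2 }.into()`, keeping market-specific fields out of the runtime layer.
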
## Shutdown contract

Shutdown should be observable and repeatable:

1. Receive a `ShutdownSignal`.
2. Publish `ShutdownRequested`.
3. Move state from `Running` to `Draining`.
4. Stop accepting new command work.
5. Drain queued events.
6. Move to `Stopped` or `Failed`.

A hard stop can skip drain, but it must still update state when possible.

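The six steps can be walked through in a single synchronous function. This is a toy model under assumptions, not the stub from the appendix: `Runtime`, `submit`, and the simplified `Event` and `State` types are hypothetical, `Failed` is omitted, and a hard stop is modeled as discarding the queue before the final state update.

```rust
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum State {
    Running,
    Draining,
    Stopped,
}

#[derive(Clone, Debug, Eq, PartialEq)]
pub enum Event {
    Progress(u32),
    ShutdownRequested,
    StateChanged(State),
}

/// Toy runtime: `queued` holds pending events, `delivered` is what observers saw.
#[derive(Debug)]
pub struct Runtime {
    state: State,
    accepting_work: bool,
    queued: Vec<Event>,
    delivered: Vec<Event>,
}

impl Runtime {
    pub fn new() -> Self {
        Self { state: State::Running, accepting_work: true, queued: Vec::new(), delivered: Vec::new() }
    }

    /// Queues a progress event; returns `false` once shutdown has begun.
    pub fn submit(&mut self, id: u32) -> bool {
        if !self.accepting_work {
            return false;
        }
        self.queued.push(Event::Progress(id));
        true
    }

    pub fn shutdown(&mut self, hard: bool) {
        if self.state == State::Stopped {
            return; // repeated requests after completion are no-ops
        }
        if hard {
            self.accepting_work = false;
            self.queued.clear(); // hard stop skips the drain
        } else {
            self.queued.push(Event::ShutdownRequested); // steps 1-2
            self.set_state(State::Draining); // step 3
            self.accepting_work = false; // step 4
            self.delivered.append(&mut self.queued); // step 5
        }
        self.set_state(State::Stopped); // step 6
        self.delivered.append(&mut self.queued); // observers still see the final state
    }

    pub fn delivered(&self) -> &[Event] {
        &self.delivered
    }

    fn set_state(&mut self, state: State) {
        self.state = state;
        self.queued.push(Event::StateChanged(state));
    }
}
```

In the soft path, observers see the queued progress, then `ShutdownRequested`, `StateChanged(Draining)`, and `StateChanged(Stopped)` in that order. In the hard path, they see only the final state change.
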
## Verification notes

The Appendix Rust block is self-contained so it can be compiled as a library
stub before `src/runtime/` exists. The target copy path for Phase B is
`src/runtime/mod.rs`.

## Appendix: `src/runtime/mod.rs` starter stub

```rust
use std::collections::VecDeque;

#[derive(Clone, Debug, Eq, PartialEq)]
pub enum RuntimeState {
    Starting,
    Running,
    Draining { reason: ShutdownReason },
    Stopped,
    Failed { reason: String },
}

#[derive(Clone, Debug, Eq, PartialEq)]
pub enum ShutdownReason {
    Operator,
    Signal,
    InternalError,
}

#[derive(Clone, Debug, Eq, PartialEq)]
pub enum DrainPolicy {
    SoftDrain,
    HardStop,
}

#[derive(Clone, Debug, Eq, PartialEq)]
pub struct ShutdownSignal {
    pub reason: ShutdownReason,
    pub policy: DrainPolicy,
}

impl ShutdownSignal {
    pub fn soft(reason: ShutdownReason) -> Self {
        Self {
            reason,
            policy: DrainPolicy::SoftDrain,
        }
    }

    pub fn hard(reason: ShutdownReason) -> Self {
        Self {
            reason,
            policy: DrainPolicy::HardStop,
        }
    }
}

#[derive(Clone, Debug, Eq, PartialEq)]
pub enum RuntimeEvent {
    StateChanged(RuntimeState),
    Progress { command: String, message: String },
    Warning { code: String, message: String },
    ShutdownRequested(ShutdownSignal),
    DroppedProgress { count: u64 },
}

#[derive(Clone, Debug, Eq, PartialEq)]
pub struct RuntimeConfig {
    pub event_capacity: usize,
}

impl Default for RuntimeConfig {
    fn default() -> Self {
        Self { event_capacity: 256 }
    }
}

#[derive(Debug)]
pub struct EventBus {
    capacity: usize,
    events: VecDeque<RuntimeEvent>,
    dropped_progress: u64,
}

impl EventBus {
    pub fn new(config: RuntimeConfig) -> Self {
        Self {
            // Clamp to at least one slot so a zero capacity cannot reject everything.
            capacity: config.event_capacity.max(1),
            events: VecDeque::new(),
            dropped_progress: 0,
        }
    }

    pub fn publish(&mut self, event: RuntimeEvent) {
        if self.events.len() < self.capacity {
            self.events.push_back(event);
            return;
        }

        // The queue is full: decide what to sacrifice based on event priority.
        match event {
            // Shutdown and failure events are always queued.
            RuntimeEvent::ShutdownRequested(_)
            | RuntimeEvent::StateChanged(RuntimeState::Failed { .. }) => {
                self.drop_oldest_progress();
                self.events.push_back(event);
            }
            // Incoming progress is the lowest priority: count it and drop it.
            RuntimeEvent::Progress { .. } => {
                self.dropped_progress = self.dropped_progress.saturating_add(1);
            }
            other => {
                self.drop_oldest_progress();
                self.events.push_back(other);
            }
        }
    }

    pub fn drain(&mut self) -> Vec<RuntimeEvent> {
        self.events.drain(..).collect()
    }

    pub fn dropped_progress(&self) -> u64 {
        self.dropped_progress
    }

    /// Frees one slot. Prefers the oldest progress event, then the oldest other
    /// event that is not a shutdown or failure event. If only shutdown and
    /// failure events are queued, nothing is evicted and the queue grows one
    /// past `capacity`, so those events are never silently dropped.
    fn drop_oldest_progress(&mut self) {
        if let Some(index) = self
            .events
            .iter()
            .position(|event| matches!(event, RuntimeEvent::Progress { .. }))
        {
            self.events.remove(index);
            self.dropped_progress = self.dropped_progress.saturating_add(1);
            return;
        }

        if let Some(index) = self.events.iter().position(|event| {
            !matches!(
                event,
                RuntimeEvent::ShutdownRequested(_)
                    | RuntimeEvent::StateChanged(RuntimeState::Failed { .. })
            )
        }) {
            self.events.remove(index);
        }
    }
}

#[derive(Debug)]
pub struct RuntimeController {
    state: RuntimeState,
    bus: EventBus,
}

impl RuntimeController {
    pub fn new(config: RuntimeConfig) -> Self {
        let mut bus = EventBus::new(config);
        let state = RuntimeState::Starting;
        bus.publish(RuntimeEvent::StateChanged(state.clone()));
        Self { state, bus }
    }

    pub fn mark_running(&mut self) {
        // Only `Starting` may move to `Running`; once shutdown has begun the
        // state must not move back.
        if matches!(self.state, RuntimeState::Starting) {
            self.set_state(RuntimeState::Running);
        }
    }

    pub fn request_shutdown(&mut self, signal: ShutdownSignal) {
        if matches!(self.state, RuntimeState::Stopped | RuntimeState::Failed { .. }) {
            return;
        }

        // A repeated soft request while already draining is a no-op; only a
        // hard stop can escalate from `Draining`.
        if matches!(self.state, RuntimeState::Draining { .. })
            && signal.policy == DrainPolicy::SoftDrain
        {
            return;
        }

        self.bus
            .publish(RuntimeEvent::ShutdownRequested(signal.clone()));

        match signal.policy {
            DrainPolicy::SoftDrain => {
                self.set_state(RuntimeState::Draining {
                    reason: signal.reason,
                });
            }
            DrainPolicy::HardStop => {
                self.set_state(RuntimeState::Stopped);
            }
        }
    }

    pub fn fail(&mut self, reason: impl Into<String>) {
        self.set_state(RuntimeState::Failed {
            reason: reason.into(),
        });
    }

    pub fn state(&self) -> &RuntimeState {
        &self.state
    }

    pub fn bus_mut(&mut self) -> &mut EventBus {
        &mut self.bus
    }

    // Stores the new state and publishes it so observers stay in sync.
    fn set_state(&mut self, state: RuntimeState) {
        self.state = state.clone();
        self.bus.publish(RuntimeEvent::StateChanged(state));
    }
}
```