feat: live bytes/sec progress for in-flight requests in Logs view (#410)
## Summary
Adds real-time throughput visibility for in-flight streaming requests in
the Request Logs page.
Previously, once a request was dispatched to a provider, the Logs UI
showed a static pending row with no indication of whether bytes were
actually flowing. This change surfaces live byte counts and throughput
from the existing `StallInspector` ring buffer via SSE, updated once per
second.
## How it works
**Backend:**
- `StallInspector.getStats()` — new read-only method that snapshots
current state (bytes received, bytes/sec via sliding window) without
touching any stream logic. Throughput is computed from the ring buffer
regardless of stall enforcement state, so it works even during the grace
period.
- `UsageStorageService` — new in-flight registry (`registerInFlight` /
`deregisterInFlight` / `getProgressUpdates`). Keyed by `requestId`,
stores a reference to the inspector and the owning API key for scoping.
- `response-handler.ts` — registers the inspector when it enters the
pipeline; deregisters in `cleanupDisconnectWiring` (fires on both `end`
and `error`), so the registry never leaks.
- `/v0/management/events` SSE endpoint — adds a 1s `setInterval` per
connected client that calls `getProgressUpdates()`, filters by the
client's scoped API key, and emits `progress` events. Fire-and-forget;
write errors are swallowed. Interval is cleared on connection close
alongside the existing listeners.
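
The backend flow above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `SketchInspector`, `InFlightRegistry`, `ScopedProgress`, `toSseFrame`, `progressFramesFor`, the `record` hook, the window length, and the buffer capacity are all hypothetical. Only the method names `getStats`, `registerInFlight`, `deregisterInFlight`, and `getProgressUpdates` come from the description. Timestamps are passed in explicitly so the sketch stays deterministic.

```typescript
// Hypothetical sketch: class names, fields, and defaults are assumptions.
interface Sample { atMs: number; bytes: number }
interface InspectorStats { bytesReceived: number; bytesPerSec: number }

class SketchInspector {
  private samples: Sample[] = [];
  private totalBytes = 0;

  constructor(private readonly windowMs = 5000, private readonly capacity = 64) {}

  // Assumed hook: the stream pipeline calls this once per chunk.
  record(bytes: number, nowMs: number): void {
    this.totalBytes += bytes;
    this.samples.push({ atMs: nowMs, bytes });
    if (this.samples.length > this.capacity) this.samples.shift(); // keep the buffer bounded
  }

  // Read-only snapshot: sums bytes inside the sliding window, mutates nothing.
  getStats(nowMs: number): InspectorStats {
    const cutoff = nowMs - this.windowMs;
    let windowBytes = 0;
    for (const s of this.samples) {
      if (s.atMs > cutoff) windowBytes += s.bytes;
    }
    return { bytesReceived: this.totalBytes, bytesPerSec: windowBytes / (this.windowMs / 1000) };
  }
}

interface ScopedProgress extends InspectorStats { requestId: string; apiKey: string }

class InFlightRegistry {
  private readonly entries = new Map<string, { inspector: SketchInspector; apiKey: string }>();

  registerInFlight(requestId: string, inspector: SketchInspector, apiKey: string): void {
    this.entries.set(requestId, { inspector, apiKey });
  }

  deregisterInFlight(requestId: string): void {
    this.entries.delete(requestId);
  }

  getProgressUpdates(nowMs: number): ScopedProgress[] {
    const updates: ScopedProgress[] = [];
    this.entries.forEach((entry, requestId) => {
      const stats = entry.inspector.getStats(nowMs);
      updates.push({ requestId, apiKey: entry.apiKey, bytesReceived: stats.bytesReceived, bytesPerSec: stats.bytesPerSec });
    });
    return updates;
  }
}

// Standard SSE wire framing: an event name, a data line, and a blank line.
function toSseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// What one tick of the per-client interval would emit, scoped to that client's key.
function progressFramesFor(registry: InFlightRegistry, clientKey: string, nowMs: number): string[] {
  return registry
    .getProgressUpdates(nowMs)
    .filter((u) => u.apiKey === clientKey)
    .map((u) => toSseFrame("progress", { requestId: u.requestId, bytesReceived: u.bytesReceived, bytesPerSec: u.bytesPerSec }));
}
```

In a real handler, something like `progressFramesFor` would run inside the 1s `setInterval`, with each frame written to the response in a `try`/`catch` that ignores write errors. Dividing by the full window length understates throughput during the first few seconds of a stream; dividing by the elapsed time since the oldest in-window sample is an alternative.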
**Frontend:**
- New `progress` SSE event handler updates a `progressMapRef`
(`Map<requestId, ProgressUpdate>`) and increments a render-tick counter
to trigger re-renders without mutating the `logs` array.
- Progress entries are cleared when a `completed` event arrives for the
same request.
- **Desktop Perf column** — for pending rows with live data: shows bytes
received + KB/s instead of the empty Duration/TTFT/TPS fields.
- **Mobile Latency card** — same conditional rendering.
- New `formatBytes()` helper in `format.ts`.
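
A framework-free sketch of the frontend bookkeeping described above. The `formatBytes` body, the 1024-based units, and the helper names `applyProgress`, `applyCompleted`, and `perfCell` are assumptions for illustration. The actual helper in `format.ts` and the React ref/tick wiring may differ.

```typescript
// Hypothetical sketch: shapes and helper names are assumptions.
interface ProgressUpdate { requestId: string; bytesReceived: number; bytesPerSec: number }
type ProgressMap = Map<string, ProgressUpdate>;

// Assumed behavior: 1024-based units, one decimal place above bytes.
function formatBytes(bytes: number): string {
  if (!isFinite(bytes) || bytes <= 0) return "0 B";
  const units = ["B", "KB", "MB", "GB"];
  let value = bytes;
  let i = 0;
  while (value >= 1024 && i < units.length - 1) {
    value /= 1024;
    i++;
  }
  return i === 0 ? `${Math.round(value)} B` : `${value.toFixed(1)} ${units[i]}`;
}

// `progress` event: store the latest snapshot for the request.
function applyProgress(map: ProgressMap, update: ProgressUpdate): void {
  map.set(update.requestId, update);
}

// `completed` event: drop the live entry so the row shows its final metrics.
function applyCompleted(map: ProgressMap, requestId: string): void {
  map.delete(requestId);
}

// Perf cell text for a row; null means "render the normal Duration/TTFT/TPS fields".
function perfCell(requestId: string, pending: boolean, map: ProgressMap): string | null {
  const p = pending ? map.get(requestId) : undefined;
  if (!p) return null;
  return `${formatBytes(p.bytesReceived)}, ${(p.bytesPerSec / 1024).toFixed(1)} KB/s`;
}
```

Keeping live progress in a side map rather than in the `logs` array means a once-per-second update only needs a tick to re-render, and the log rows themselves are never rewritten.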
## What is not changed
- Stall detection/abort logic is untouched — `getStats()` is purely
observational.
- No new DB writes or queries; all data comes from the in-memory ring
buffer.
- No new API endpoints.
- Only chat/messages/responses/gemini routes have a `StallInspector` in
the pipeline; other request types (embeddings, images, speech,
transcriptions) simply have no registry entry and are skipped silently.
README.md (1 addition, 0 deletions)
@@ -204,6 +204,7 @@ When a provider fails, Plexus removes it from rotation using exponential backoff
 - Client disconnects now cancel the upstream provider request, reducing wasted tokens/quota on abandoned streams.
 - Global and per-provider upstream timeouts cut off requests that run too long.
 - Optional stall detection can fail over slow-to-start providers before bytes reach the client, and can abort streams that become too slow mid-flight.
+- The Request Logs page shows **live bytes received and throughput (KB/s)** for in-flight streaming requests, updated every second via SSE, so you can see whether a provider is actively responding before it completes.

 → See [Configuration: Request Timeouts](docs/CONFIGURATION.md#request-timeouts) and [Configuration: Stall Detection](docs/CONFIGURATION.md#stall-detection)