| 1 | +# Multi-Instance Stage 1: Read-Only Observer — Design Spec |
| 2 | + |
| 3 | +Issue: #127 (Stage 1 of 3). Prerequisites: #125 (explicit workspace join + named |
| 4 | +broker instances — pear side landed `435d78c`, relay/cloud halves in flight), |
| 5 | +#126 (remote host support — not started). |
| 6 | + |
| 7 | +Status: DESIGN ONLY. Written 2026-06-06 during the #125 build-out; contract |
| 8 | +references below are to the locked #125 naming contract. |
| 9 | + |
| 10 | +## Goal |
| 11 | + |
| 12 | +A second Pear instance (same user, different machine — multi-user comes with |
| 13 | +invite scoping later in this stage) can open a project that is "hosted" |
| 14 | +elsewhere, see the same agent graph, and watch live terminal output. It cannot |
| 15 | +send PTY input, spawn/stop agents, or mutate project settings. |
| 16 | + |
| 17 | +## Non-goals (Stage 1) |
| 18 | + |
| 19 | +- No write path of any kind from the observer (Stage 2). |
| 20 | +- No per-user permission levels beyond owner/observer (Stage 3 adds editor). |
| 21 | +- No presence avatars/notifications UI beyond a minimal "N instances connected" |
| 22 | + indicator (Stage 3). |
| 23 | +- No CRDT/merge for project definitions: single-writer (the host instance), |
| 24 | + observers treat shared state as read-only snapshots. |
| 25 | +- No web observer; Electron only. |
| 26 | + |
| 27 | +## Foundations this builds on (all #125 vintage) |
| 28 | + |
| 29 | +| Primitive | Where | Why it matters here | |
| 30 | +|---|---|---| |
| 31 | +| Explicit workspace join | relay `--workspace-key` / `AGENT_RELAY_WORKSPACE_KEY`, fail-loud on invalid key | Observer joins the project workspace; MUST hard-fail rather than silently create a fresh one (the #125 failure mode) | |
| 32 | +| Named broker instances | `--instance-name` / `AGENT_RELAY_BROKER_NAME`; `RuntimeSpawnOptions.workspaceKey?/brokerName?` | Instance identity. Observers are addressable, and ownership checks key off the name | |
| 33 | +| Workspace key seam | `brokerManager.workspaceKeyForProject(projectId)` (broker.ts) | The host-side source of truth an invite token wraps | |
| 34 | +| Remote attach primitive | `attachCloudSandbox()` connects by `execUrl + apiKey` (broker.ts:1467) | The observer's connection path to the host broker is the same shape (#126 generalizes it) | |
| 35 | +| Event dedupe discipline | `slackLogicalInjections` canonical-identity claims (integration-event-bridge.ts) | Fan-out to N instances multiplies the duplicate-delivery surface; reuse logical-identity claims, never per-copy revisions | |
| 36 | + |
| 37 | +## Instance naming |
| 38 | + |
| 39 | +Local broker names are currently `pear-${project.relayWorkspaceId}` |
| 40 | +(project-store.ts:58) — project-stable but NOT instance-unique; two instances |
| 41 | +on one project collide. This was explicitly deferred out of #125. Stage 1 is |
| 42 | +where it lands: |
| 43 | + |
| 44 | +- Name = `pear-<relayWorkspaceId>-<machineSlug>` where machineSlug = |
| 45 | + hostname-derived, 8 chars, persisted in local app config on first run |
| 46 | + (NOT regenerated per session — names must be stable for ownership checks). |
| 47 | +- The HOST instance keeps the legacy un-suffixed name working via PEAR |
| 48 | + METADATA, not broker-side aliasing (relay-worker ruling, 2026-06-06): the |
| 49 | + shared project definition publishes both `brokerName` (canonical suffixed) |
| 50 | + and `legacyBrokerName`; consumers resolve through the definition. Broker-side |
| 51 | + aliasing would disturb the just-stabilized strict-name registration |
| 52 | + semantics and is rejected. |
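
The naming rule above can be sketched as follows. This is a minimal illustration, not the project's implementation: `deriveMachineSlug` and `brokerNamesFor` are hypothetical helpers, and hashing the hostname with SHA-256 is an assumption (the spec only says "hostname-derived, 8 chars"). The caller is expected to persist the slug in local app config on first run and reuse it afterwards.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derive a stable 8-char slug from a hostname.
// Hashing is an assumption; the spec only requires "hostname-derived, 8 chars".
export function deriveMachineSlug(hostname: string): string {
  return createHash("sha256")
    .update(hostname.toLowerCase())
    .digest("hex")
    .slice(0, 8);
}

// The two names the shared project definition publishes for the host.
export interface BrokerNames {
  brokerName: string; // canonical, suffixed
  legacyBrokerName: string; // un-suffixed name used before Stage 1
}

// Hypothetical helper: build both names from the workspace id and the slug.
export function brokerNamesFor(
  relayWorkspaceId: string,
  machineSlug: string,
): BrokerNames {
  const legacyBrokerName = `pear-${relayWorkspaceId}`;
  return { brokerName: `${legacyBrokerName}-${machineSlug}`, legacyBrokerName };
}
```

Consumers would resolve either name through the published definition rather than asking the broker to alias one to the other.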
| 53 | + |
| 54 | +## 1. Shared project definitions |
| 55 | + |
| 56 | +Authoritative project definition moves to the relay workspace as a single |
| 57 | +relayfile document: `/pear/project.json` (channels, integration scopes, roots, |
| 58 | +host assignment, schema version). Rationale for relayfile over a new relay |
| 59 | +cloud API: sync, change events, and conflict surface already exist; observers |
| 60 | +already need relayfile access for mirrors. |
| 61 | + |
| 62 | +- Host instance: writes `/pear/project.json` on every local mutation |
| 63 | + (debounced). Local `projects.json` stays the cache/bootstrap copy. |
| 64 | +- Observer instance: subscribes to `/pear/project.json` change events |
| 65 | + (the same `subscribe()` machinery the integration-event bridge uses); applies |
| 66 | + snapshots read-only; never writes. |
| 67 | +- Conflict rule (Stage 1): host wins, always — observers don't write, so the |
| 68 | + only conflict is host vs stale cache, resolved by `revision` compare on open. |
| 69 | +- Schema versioned (`schemaVersion: 1`); observer with unknown newer version |
| 70 | + shows "upgrade required" instead of guessing. |
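
A minimal sketch of the observer-side snapshot rule, under stated assumptions: `revision` is taken to be a monotonically increasing number, and the `ProjectSnapshot` shape and `applySnapshot` name are hypothetical. Only `schemaVersion` and the `revision` compare come from the spec.

```typescript
const SUPPORTED_SCHEMA_VERSION = 1;

// Hypothetical shape; the real document also carries channels, integration
// scopes, roots, and host assignment.
interface ProjectSnapshot {
  schemaVersion: number;
  revision: number; // assumption: monotonically increasing, written by the host
}

type ApplyResult =
  | { kind: "applied"; snapshot: ProjectSnapshot }
  | { kind: "kept-cache" }
  | { kind: "upgrade-required"; schemaVersion: number };

// Decide what an observer does with a snapshot read from /pear/project.json.
function applySnapshot(
  cached: ProjectSnapshot | undefined,
  incoming: ProjectSnapshot,
): ApplyResult {
  // Unknown newer schema: surface "upgrade required" instead of guessing.
  if (incoming.schemaVersion > SUPPORTED_SCHEMA_VERSION) {
    return { kind: "upgrade-required", schemaVersion: incoming.schemaVersion };
  }
  // Nothing newer than the local cache: keep what we have.
  if (cached !== undefined && incoming.revision <= cached.revision) {
    return { kind: "kept-cache" };
  }
  // Host wins: the cache is stale, replace it with the host's snapshot.
  return { kind: "applied", snapshot: incoming };
}
```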
| 71 | + |
| 72 | +## 2. Invite / join flow |
| 73 | + |
| 74 | +Stage 1 keeps tokens same-account (multi-user scoping is the second half of |
| 75 | +this stage, gated on relay-side scoped tokens): |
| 76 | + |
| 77 | +``` |
| 78 | +InviteToken = base64url(JSON{ |
| 79 | + v: 1, |
| 80 | + workspaceKey, // from workspaceKeyForProject(projectId) |
| 81 | + relayWorkspaceId, // account workspace (cloud API URL construction) |
| 82 | + hostBrokerName, // addressing + ownership root |
| 83 | + brokerUrl?, // #126 remote-host URL when available; absent = cloud-relay discovery |
| 84 | + role: 'observer' |
| 85 | +}) |
| 86 | +``` |
| 87 | + |
| 88 | +- Generate: host instance, new IPC `workspace.invite(projectId)` → token string |
| 89 | + (UI: copy button in project settings). |
| 90 | +- Join: `workspace.join(token)` → validates schema/version → spawns/joins |
| 91 | + observer-side broker session with `workspaceKey` + its own instance name → |
| 92 | + fetches `/pear/project.json` → materializes a read-only project entry in the |
| 93 | + local store (flagged `origin: 'shared'`, `role: 'observer'`). |
| 94 | +- Fail-loud inheritance: a bad/expired key surfaces the broker's strict-join |
| 95 | + error verbatim. No fallback to create. The broker distinguishes fatal |
| 96 | + rejection (401/403 — "rejected") from rate-limiting (429 — "rate-limited", |
| 97 | + HTTP status preserved through AuthHttpError): the join UI treats the former |
| 98 | + as a bad invite and the latter as retryable with backoff. |
| 99 | +- Token carries no bearer secret beyond the workspace key in Stage 1 |
| 100 | + (same-account); the multi-user variant swaps `workspaceKey` for a relay-issued |
| 101 | + scoped token and is EXPLICITLY out of scope until relay exposes one. |
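
The token shape above could be encoded and validated roughly like this. It is a sketch, assuming Node's `Buffer` with the `base64url` encoding; `encodeInvite` and `decodeInvite` are hypothetical names, and the error messages are illustrative only.

```typescript
interface InviteToken {
  v: 1;
  workspaceKey: string;
  relayWorkspaceId: string;
  hostBrokerName: string;
  brokerUrl?: string;
  role: "observer";
}

function encodeInvite(token: InviteToken): string {
  return Buffer.from(JSON.stringify(token), "utf8").toString("base64url");
}

// Validates version, role, and required fields; throws on anything unexpected
// so a malformed invite never reaches the broker join step.
function decodeInvite(raw: string): InviteToken {
  let parsed: unknown;
  try {
    parsed = JSON.parse(Buffer.from(raw, "base64url").toString("utf8"));
  } catch {
    throw new Error("invite: not valid base64url JSON");
  }
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("invite: not an object");
  }
  const t = parsed as Record<string, unknown>;
  if (t.v !== 1) throw new Error(`invite: unsupported version ${String(t.v)}`);
  if (t.role !== "observer") {
    throw new Error(`invite: unsupported role ${String(t.role)}`);
  }
  for (const field of ["workspaceKey", "relayWorkspaceId", "hostBrokerName"] as const) {
    if (typeof t[field] !== "string" || t[field] === "") {
      throw new Error(`invite: missing ${field}`);
    }
  }
  if (t.brokerUrl !== undefined && typeof t.brokerUrl !== "string") {
    throw new Error("invite: brokerUrl must be a string");
  }
  return {
    v: 1,
    workspaceKey: t.workspaceKey as string,
    relayWorkspaceId: t.relayWorkspaceId as string,
    hostBrokerName: t.hostBrokerName as string,
    ...(typeof t.brokerUrl === "string" ? { brokerUrl: t.brokerUrl } : {}),
    role: "observer",
  };
}
```

Because the Stage 1 token wraps the raw workspace key, the encoded string should be handled like a credential (clipboard only, never logged).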
| 102 | + |
| 103 | +## 3. Read-only enforcement |
| 104 | + |
| 105 | +Two layers, because UI-only enforcement is not enforcement: |
| 106 | + |
| 107 | +1. **Pear layer (UX):** project entries with `role: 'observer'` get |
| 108 | + permission-aware guards in the renderer stores — spawn/stop/input/settings |
| 109 | + actions disabled with tooltips. IPC handlers for mutating calls check the |
| 110 | + role flag and reject (`ROLE_OBSERVER_READONLY` error) so a buggy renderer |
| 111 | + can't mutate either. |
| 112 | +2. **Broker layer (authority):** per relay-worker (2026-06-06) the right home |
| 113 | + is a readonly capability on the CONNECTION/API layer — a flag in the local |
| 114 | + HTTP/WS connection/session context that rejects mutating REST endpoints and |
| 115 | + delivery/spawn/release/write actions, while the host connection keeps |
| 116 | + normal identity. The lease API is explicitly the wrong layer (leases govern |
| 117 | + broker lifetime, not caller authority). Effort: moderate; scheduled with |
| 118 | + relay, not in Stage 1's critical path. Stage 1 ships with pear-layer |
| 119 | + enforcement only; the trust boundary is then "same account", acceptable for |
| 120 | + same-user Stage 1, NOT for multi-user invites (hard gate: multi-user waits |
| 121 | + for the broker-side capability). |
| 122 | + |
| 123 | +## 4. PTY fan-out |
| 124 | + |
| 125 | +The broker already supports multiple clients; #125 makes both instances land in |
| 126 | +one workspace. Stage 1 needs verification + hardening, not new plumbing: |
| 127 | + |
| 128 | +- Test matrix: 2 instances × (local host, remote host) × (agent spawn before / |
| 129 | + after observer join) — observer must receive output chunks for agents |
| 130 | + spawned both before and after it connected. Catch-up contract per |
| 131 | + relay-worker (2026-06-06): current visible-screen PTY snapshot (existing |
| 132 | + snapshot/state machinery) + live stream from join. There is NO durable |
| 133 | + per-observer scrollback contract — historical replay is out of scope and the |
| 134 | + UI labels the point where the observer's view begins. |
| 135 | +- Duplicate-event hardening per AGENTS.md guidance: PTY chunks are |
| 136 | + sequence-numbered per (agentName, ptyId); pear-side consumer drops |
| 137 | + already-seen sequence numbers — same canonical-identity discipline as the |
| 138 | + slack dedupe, trivially cheaper (monotonic seq, not content hashes). |
| 139 | +- Broker events (`agent_spawned`, `agent_exited`, …) fan out to all instances; |
| 140 | + observer applies them to its read-only graph. Event `instanceName` field |
| 141 | + (from the #125 named-instance work) distinguishes "who did that" for the |
| 142 | + Stage 3 presence layer — carried but unused in Stage 1. |
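
The sequence-number dedupe described above might look like the following on the pear side. This is a sketch with hypothetical names (`PtyChunk`, `PtyChunkDeduper`); it assumes chunks for a given (agentName, ptyId) arrive in order, so a single high-water mark per stream is enough. If out-of-order delivery is possible, a late older chunk would be dropped by this version.

```typescript
// Hypothetical chunk shape; only the (agentName, ptyId, seq) identity is from the spec.
interface PtyChunk {
  agentName: string;
  ptyId: string;
  seq: number;
  data: string;
}

class PtyChunkDeduper {
  // Highest sequence number seen so far, per (agentName, ptyId) stream.
  private readonly lastSeq = new Map<string, number>();

  // Returns true if the chunk is new and should be rendered, false if it is
  // a duplicate (or older than something already seen) and should be dropped.
  accept(chunk: PtyChunk): boolean {
    const key = JSON.stringify([chunk.agentName, chunk.ptyId]);
    const last = this.lastSeq.get(key);
    if (last !== undefined && chunk.seq <= last) {
      return false;
    }
    this.lastSeq.set(key, chunk.seq);
    return true;
  }
}
```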
| 143 | + |
| 144 | +## 5. Minimal presence (Stage 1 slice) |
| 145 | + |
| 146 | +- Each instance publishes `{instanceName, role, joinedAt}` on a relaycast |
| 147 | + channel `#pear-presence-<projectId>` on join, sends a heartbeat every 60s, |
| 148 | + and publishes a tombstone on clean exit; peers expire silent entries by TTL. |
| 149 | +- UI: "N instances connected" pill on the project header. Nothing else. |
| 150 | +- This channel is also Stage 2's coordination root (ownership claims), so the |
| 151 | + message schema gets a `kind` discriminator from day one. |
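
One way the presence slice could be tracked by each peer, as a sketch only. The `kind` values (`"presence"`, `"tombstone"`), the `PresenceTracker` class, and the TTL of three missed heartbeats are all assumptions; the spec fixes only the 60s heartbeat, the payload fields, and the existence of a `kind` discriminator.

```typescript
// Hypothetical message schema with the `kind` discriminator.
type PresenceMessage =
  | { kind: "presence"; instanceName: string; role: "owner" | "observer"; joinedAt: number }
  | { kind: "tombstone"; instanceName: string };

const HEARTBEAT_MS = 60_000;
const TTL_MS = 3 * HEARTBEAT_MS; // assumption: expire after three missed heartbeats

class PresenceTracker {
  // Last time (ms) each instance was heard from.
  private readonly lastSeen = new Map<string, number>();

  handle(msg: PresenceMessage, now: number): void {
    if (msg.kind === "tombstone") {
      this.lastSeen.delete(msg.instanceName); // clean exit
    } else {
      this.lastSeen.set(msg.instanceName, now); // join or heartbeat
    }
  }

  // Instances considered connected at `now`; silent entries are expired first.
  connected(now: number): string[] {
    this.lastSeen.forEach((seen, name) => {
      if (now - seen > TTL_MS) this.lastSeen.delete(name);
    });
    return Array.from(this.lastSeen.keys()).sort();
  }
}
```

The pill text would then be derived from `connected(Date.now()).length`.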
| 152 | + |
| 153 | +## IPC / type additions |
| 154 | + |
| 155 | +- `workspace.invite(projectId) → string` |
| 156 | +- `workspace.join(token) → { projectId }` |
| 157 | +- `workspace.onPresenceUpdate(projectId, instances[])` |
| 158 | +- Project record: `origin: 'local' | 'shared'`, `role: 'owner' | 'observer'`, |
| 159 | + `hostBrokerName?`, `sharedRevision?` |
| 160 | +- Mutating IPC handlers gain the role guard (single helper, applied at the |
| 161 | + handler boundary, not per-store). |
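
The single role-guard helper could be a thin wrapper applied where each mutating handler is registered. A sketch under assumptions: `guardMutating`, `RoleError`, and the `RoleLookup` callback are hypothetical; only the `ROLE_OBSERVER_READONLY` error code and the owner/observer roles come from the spec.

```typescript
type Role = "owner" | "observer";

class RoleError extends Error {
  readonly code = "ROLE_OBSERVER_READONLY";
  constructor(projectId: string) {
    super(`project ${projectId} is read-only for observers`);
    this.name = "RoleError";
  }
}

// Hypothetical lookup into the local project store.
type RoleLookup = (projectId: string) => Role | undefined;

// Wraps a mutating handler so it rejects before doing any work when the
// project is held with the observer role.
function guardMutating<A extends unknown[], R>(
  roleOf: RoleLookup,
  handler: (projectId: string, ...args: A) => R,
): (projectId: string, ...args: A) => R {
  return (projectId, ...args) => {
    if (roleOf(projectId) === "observer") {
      throw new RoleError(projectId);
    }
    return handler(projectId, ...args);
  };
}
```

Keeping the check in one wrapper makes the table-driven "every mutating handler is guarded" unit test a matter of iterating the registration list.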
| 162 | + |
| 163 | +## Dependencies / sequencing |
| 164 | + |
| 165 | +``` |
| 166 | +#125 relay strict-join + named instances [in flight, T3] |
| 167 | +#125 cloud verbatim env injection [in flight, T4] |
| 168 | +#126 remote host (broker URL for observer) [not started — Stage 1 can demo |
| 169 | + against a cloud-sandbox host first] |
| 170 | +relay: scoped invite tokens [needed for multi-user only] |
| 171 | +relay: broker-side readonly capability [needed for multi-user; stub OK same-user] |
| 172 | +``` |
| 173 | + |
| 174 | +Buildable order inside Stage 1: instance-name uniqueness → shared |
| 175 | +project.json (host write path, observer read path) → invite/join IPC + UI → |
| 176 | +PTY fan-out verification → presence slice. Each lands behind a |
| 177 | +`PEAR_MULTI_INSTANCE` flag until the stage is coherent. |
| 178 | + |
| 179 | +## Open questions (for #general before implementation) |
| 180 | + |
| 181 | +1. relay-worker: name alias for the legacy un-suffixed host broker name vs |
| 182 | + pear publishing both names? Resolved 2026-06-06: pear publishes both. |
| 183 | +2. relay-worker: per-connection readonly capability, connection API or |
| 184 | + lease API? Resolved 2026-06-06: connection layer; off Stage 1's critical path. |
| 185 | +3. cloud-lead: does `/pear/project.json` in the account workspace collide with |
| 186 | + any cloud-side relayfile path conventions/reserved prefixes? |
| 187 | +4. PTY backfill: scrollback replay on attach, or "live from join"? Resolved |
| 188 | + 2026-06-06: visible-screen snapshot + live stream from join; no replay. |
| 189 | + |
| 190 | +## Test plan sketch |
| 191 | + |
| 192 | +- Unit: invite token round-trip (incl. version/role rejection), role guard on |
| 193 | + every mutating IPC handler (table-driven), project.json snapshot apply + |
| 194 | + revision conflict. |
| 195 | +- Integration: two BrokerManager instances against one broker (the |
| 196 | + broker.test.ts harness already mocks spawn; extend with a second client), |
| 197 | + PTY seq dedupe under interleaved chunks, observer join while agent mid-run. |
| 198 | +- Manual/e2e (needs T2-style debug logging): second machine joins via token and |
| 199 | + watches a live agent; then `kill -9` the host instance and confirm the observer |
| 200 | + survives in read-only state with a stale-host indicator. |