|
| 1 | +# Agent App Platform |
| 2 | + |
| 3 | +> Status: DESIGN — pre-build, pending validation gate. |
| 4 | +> Layer impact: new L11 plugin (`plugins/appsandbox`), new L11 server (`pkg/appstore/server`), new utilities (`pkg/appmanifest`, `pkg/appfetch`, `internal/appkeys`), new L12 binaries (`cmd/pilot-submit`, `cmd/appstore-server`), extended L12 binary (`cmd/pilotctl`). |
| 5 | +
|
| 6 | +A curated app registry and sandboxed runtime that lets autonomous agents discover, install, and use stateful apps without human-in-the-loop setup. Apps are signed by Pilot project keys, distributed only via `apps.pilotprotocol.network`, and run inside per-app Linux sandboxes on the host daemon. |
| 7 | + |
| 8 | +--- |
| 9 | + |
| 10 | +## Model |
| 11 | + |
| 12 | +**Each compiled app binary ships with a skill-file template** (`skill.md.tmpl`): |
| 13 | + |
| 14 | +- **Static parts** — vendor-authored, immutable instructions to the agent: what the app does, available RPC methods, schemas, examples. Signed in the tarball by the vendor and countersigned by Pilot at deploy. |
| 15 | +- **Fractional parts** — slots in the template (Jinja-shaped: `{{tasks_count}}`, `{{recent: limit=5}}`) that the daemon's renderer fills at expansion time by calling the app's `summary` RPC. The RPC reads from the app's own SQLite database and returns a digest of current state. |
| 16 | + |
| 17 | +The skill file is the agent's *map*. SQLite stays inside the sandbox; agents never touch it directly. SQL traversal is the app's job — exposed through manifest-declared RPC methods. The skill file describes which methods to call; the app implements them. |
| 18 | + |
| 19 | +This is `skillinject` v2: instead of one global heartbeat injected into every agent tool dir, each installed app contributes one rendered SKILL.md to each tool dir. It generalizes the existing pattern and deprecates the single-heartbeat model. |
| 20 | + |
| 21 | +--- |
| 22 | + |
| 23 | +## End-to-end flow |
| 24 | + |
| 25 | +```mermaid |
| 26 | +flowchart TB |
| 27 | + %% ============ Pilot-operated registry ============ |
| 28 | + subgraph REG["apps.pilotprotocol.network — Pilot-operated registry"] |
| 29 | + direction LR |
| 30 | + QUEUE["Submission queue<br/>(pilot-submit uploads)"] |
| 31 | + REVIEW["Manual review<br/>(v1 = founder)"] |
| 32 | + SIGNER["Sign with<br/>Pilot project key"] |
| 33 | + CDN[("Signed binary CDN")] |
| 34 | + DENY[("revoked.json<br/>deny-list")] |
| 35 | + QUEUE --> REVIEW --> SIGNER --> CDN |
| 36 | + end |
| 37 | +
|
| 38 | + VENDOR(["Vendor<br/>e.g. Legora"]) -- "pilot-submit tarball<br/>{bin, manifest.toml, skill.md.tmpl, schema.sql}" --> QUEUE |
| 39 | +
|
| 40 | + %% ============ Local host ============ |
| 41 | + subgraph HOST["Local host running pilot-daemon"] |
| 42 | + direction TB |
| 43 | + DAEMON{{"pilot-daemon (L7)"}} |
| 44 | +
|
| 45 | + subgraph PLUGIN["plugins/appsandbox (L11)"] |
| 46 | + FETCH["appfetch + appkeys<br/>HTTPS + sig verify"] |
| 47 | + LIFECYCLE["Sandbox lifecycle<br/>namespaces + seccomp +<br/>cgroups + bind-mount"] |
| 48 | + RENDER["Skill-file renderer<br/>(static + fractional)"] |
| 49 | + REVCHK["Revocation check<br/>24h cached"] |
| 50 | + end |
| 51 | +
|
| 52 | + subgraph APPFS["~/.pilot/apps/<name>/"] |
| 53 | + BIN[/"bin/app binary"/] |
| 54 | + MAN[/"manifest.toml<br/>caps + RPC methods"/] |
| 55 | + TMPL[/"skill.md.tmpl<br/>static + {{slots}}"/] |
| 56 | + DB[("state.db<br/>per-app SQLite")] |
| 57 | + end |
| 58 | +
|
| 59 | + subgraph SBX["sandboxed app process"] |
| 60 | + RUNTIME["App runtime<br/>(handles RPC methods)"] |
| 61 | + SUMMARY["summary RPC<br/>returns fractional state JSON"] |
| 62 | + end |
| 63 | +
|
| 64 | + subgraph DIRS["Agent tool dirs"] |
| 65 | + CSKILL[/"~/.claude/skills/<name>/SKILL.md<br/>(rendered)"/] |
| 66 | + OSKILL[/"~/.openclaw/skills/<name>/SKILL.md<br/>(rendered)"/] |
| 67 | + end |
| 68 | + end |
| 69 | +
|
| 70 | + %% ============ Agent ============ |
| 71 | + AGENT(["LLM agent<br/>(Claude Code, OpenClaw, …)"]) |
| 72 | +
|
| 73 | + %% ----- Install flow (numbered) ----- |
| 74 | + DAEMON -- "1. install <name>" --> FETCH |
| 75 | + CDN -- "2. HTTPS GET<br/>signed tarball" --> FETCH |
| 76 | + FETCH -- "3. verify Pilot sig" --> LIFECYCLE |
| 77 | + DENY -. "pull every 24h" .-> REVCHK |
| 78 | + REVCHK -- "block if revoked" --> LIFECYCLE |
| 79 | + LIFECYCLE -- "4. materialize folder" --> APPFS |
| 80 | + LIFECYCLE -- "5. spawn sandboxed" --> RUNTIME |
| 81 | + RUNTIME -- "owns r/w" --> DB |
| 82 | +
|
| 83 | + %% ----- Skill render flow (lettered) ----- |
| 84 | + RENDER -- "A. read static" --> TMPL |
| 85 | + RENDER -- "B. call summary RPC" --> SUMMARY |
| 86 | + SUMMARY -- "query" --> DB |
| 87 | + SUMMARY -- "fractional JSON<br/>{tasks_count, recent: [...]}" --> RENDER |
| 88 | + RENDER -- "C. expand + write" --> CSKILL |
| 89 | + RENDER -- "C. expand + write" --> OSKILL |
| 90 | +
|
| 91 | + %% ----- Agent runtime invocation (roman) ----- |
| 92 | + AGENT -- "i. reads SKILL.md<br/>(learns commands + sees state)" --> CSKILL |
| 93 | + AGENT -- "ii. pilotctl app call <name> <method> <args>" --> DAEMON |
| 94 | + DAEMON -- "iii. route via L9 IPC" --> RUNTIME |
| 95 | + RUNTIME -- "iv. SQL r/w + response" --> DAEMON |
| 96 | + DAEMON -- "v. return JSON to agent" --> AGENT |
| 97 | +
|
| 98 | + %% ----- Styling ----- |
| 99 | + classDef cloud fill:#e8f0ff,stroke:#3366cc |
| 100 | + classDef host fill:#f5f5f5,stroke:#666 |
| 101 | + classDef sandbox fill:#fff4e0,stroke:#cc8800 |
| 102 | + classDef store fill:#e8f8e8,stroke:#3a9d3a |
| 103 | + class REG cloud |
| 104 | + class HOST host |
| 105 | + class SBX sandbox |
| 106 | + class APPFS store |
| 107 | +``` |
| 108 | + |
| 109 | +Three independent loops, deliberately separate: |
| 110 | + |
| 111 | +1. **Install (1→5)** — runs once per app, on `pilotctl app install`. Pulls signed tarball, verifies against Pilot project key (pinned at compile in `internal/appkeys`), checks deny-list, materializes folder, spawns under sandbox primitives. |
| 112 | +2. **Skill render (A→C)** — runs periodically (default 5 min) and on app-published state-change events. This is the loop that expands the template's fractional slots. It stays cheap because the `summary` RPC returns a digest, never raw rows. |
| 113 | +3. **Agent invocation (i→v)** — runtime hot path. Agent reads SKILL.md to learn commands and see digest state; calls happen via L9 IPC mediated by the daemon; app handles SQL internally; response returns as JSON. |
| 114 | + |
| 115 | +--- |
| 116 | + |
| 117 | +## Why direct SQLite stubs are deliberately excluded from skill files |
| 118 | + |
| 119 | +- The app's schema is implementation, not contract. Schema migrations would break agents if exposed. |
| 120 | +- Arbitrary SQL from an LLM against a vendor's DB is a footgun (correctness, security, performance). |
| 121 | +- The RPC surface is the stable contract. The vendor controls the queries; the agent stays at the API boundary. |
| 122 | + |
| 123 | +If an app needs to expose richer query capability, the manifest declares additional RPC methods (`search`, `filter`, etc.) that the app implements over its own DB. |
| 124 | + |
| 125 | +--- |
| 126 | + |
| 127 | +## Manifest example |
| 128 | + |
| 129 | +```toml |
| 130 | +name = "pilot-tasks" |
| 131 | +version = "0.1.0" |
| 132 | +publisher = "pilot-project" |
| 133 | +signed-by = "ed25519:PILOT_PROJECT_KEY..." |
| 134 | + |
| 135 | +[capabilities] |
| 136 | +allowed-network = [] # this app needs no outbound |
| 137 | +allowed-fs = ["state.db", "cache/"] # paths relative to bind-mount root |
| 138 | +exec = "none" |
| 139 | + |
| 140 | +[sqlite] |
| 141 | +schema = "schema.sql" |
| 142 | + |
| 143 | +[rpc] |
| 144 | +methods = ["add", "list", "complete", "summary"] |
| 145 | +# summary is the fractional-render contract: returns |
| 146 | +# { tasks_count: int, recent: [{title, ts}], in_progress: int } |
| 147 | +``` |
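
Because `allowed-fs` entries are relative to the bind-mount root, a manifest validator would presumably reject entries that are absolute or that climb out of the root. The following is a minimal sketch of that check under those assumptions; `validateAllowedFS` is a hypothetical name, and the real `pkg/appmanifest` rules may be stricter.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// validateAllowedFS is a hypothetical check: every entry must be a
// non-empty, slash-separated path that stays inside the bind-mount root.
func validateAllowedFS(entries []string) error {
	for _, e := range entries {
		if e == "" {
			return fmt.Errorf("allowed-fs entry is empty")
		}
		if path.IsAbs(e) {
			return fmt.Errorf("allowed-fs entry %q must be relative", e)
		}
		// Clean resolves "." and ".." segments lexically, so any path that
		// still starts with ".." after cleaning escapes the root.
		clean := path.Clean(e)
		if clean == ".." || strings.HasPrefix(clean, "../") {
			return fmt.Errorf("allowed-fs entry %q escapes the bind-mount root", e)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateAllowedFS([]string{"state.db", "cache/"}))
	fmt.Println(validateAllowedFS([]string{"cache/../../etc/passwd"}))
}
```

This is a lexical check only; it does not follow symlinks, which the sandbox's bind-mount setup would still need to handle.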
| 148 | + |
| 149 | +--- |
| 150 | + |
| 151 | +## Skill file template example |
| 152 | + |
| 153 | +```markdown |
| 154 | +# pilot-tasks |
| 155 | + |
| 156 | +A persistent task queue. Use this app for any task you want to remember across sessions. |
| 157 | + |
| 158 | +## Current state |
| 159 | +- You have {{tasks_count}} open tasks. |
| 160 | +- {{in_progress}} in progress. |
| 161 | +- Recent: {{recent | bullet_list}} |
| 162 | + |
| 163 | +## Commands |
| 164 | +- `pilotctl app call pilot-tasks add '{"title": "..."}'` — create a task |
| 165 | +- `pilotctl app call pilot-tasks list` — list open tasks |
| 166 | +- `pilotctl app call pilot-tasks complete '{"id": "..."}'` — mark complete |
| 167 | +``` |
| 168 | + |
| 169 | +After rendering with live state: |
| 170 | + |
| 171 | +```markdown |
| 172 | +# pilot-tasks |
| 173 | + |
| 174 | +A persistent task queue. Use this app for any task you want to remember across sessions. |
| 175 | + |
| 176 | +## Current state |
| 177 | +- You have 3 open tasks. |
| 178 | +- 1 in progress. |
| 179 | +- Recent: |
| 180 | + - draft contract for Acme |
| 181 | + - review PR #1247 |
| 182 | + - file weekly status |
| 183 | + |
| 184 | +## Commands |
| 185 | +- `pilotctl app call pilot-tasks add '{"title": "..."}'` — create a task |
| 186 | +- `pilotctl app call pilot-tasks list` — list open tasks |
| 187 | +- `pilotctl app call pilot-tasks complete '{"id": "..."}'` — mark complete |
| 188 | +``` |
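
The expansion from template to rendered file can be sketched as a small slot substitution over the `summary` payload. This is an illustrative sketch, not the renderer's actual design: `renderSkill` and `slotPattern` are hypothetical, only the `{{name}}` and `{{name | bullet_list}}` slot forms from the example above are handled, and `bullet_list` is assumed to print each item's `title` field.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// slotPattern matches {{name}} and {{name | filter}} (assumed slot grammar).
var slotPattern = regexp.MustCompile(`\{\{\s*(\w+)\s*(?:\|\s*(\w+)\s*)?\}\}`)

// renderSkill is a hypothetical helper that fills template slots from the
// JSON object returned by the app's summary RPC.
func renderSkill(tmpl string, summaryJSON []byte) (string, error) {
	dec := json.NewDecoder(strings.NewReader(string(summaryJSON)))
	dec.UseNumber() // keep integers printable as "3" rather than as floats
	var state map[string]any
	if err := dec.Decode(&state); err != nil {
		return "", fmt.Errorf("decode summary: %w", err)
	}
	var firstErr error
	fail := func(err error) {
		if firstErr == nil {
			firstErr = err
		}
	}
	out := slotPattern.ReplaceAllStringFunc(tmpl, func(slot string) string {
		m := slotPattern.FindStringSubmatch(slot)
		name, filter := m[1], m[2]
		val, ok := state[name]
		if !ok {
			fail(fmt.Errorf("summary has no field %q", name))
			return slot
		}
		switch filter {
		case "":
			return fmt.Sprint(val)
		case "bullet_list":
			items, ok := val.([]any)
			if !ok {
				fail(fmt.Errorf("field %q is not a list", name))
				return slot
			}
			var b strings.Builder
			for _, it := range items {
				obj, _ := it.(map[string]any)
				fmt.Fprintf(&b, "\n  - %v", obj["title"])
			}
			return b.String()
		default:
			fail(fmt.Errorf("unknown filter %q", filter))
			return slot
		}
	})
	if firstErr != nil {
		return "", firstErr
	}
	return out, nil
}

func main() {
	tmpl := "- You have {{tasks_count}} open tasks.\n- Recent: {{recent | bullet_list}}"
	summary := []byte(`{"tasks_count": 3, "recent": [{"title": "review PR #1247", "ts": 0}]}`)
	out, err := renderSkill(tmpl, summary)
	fmt.Println(out, err)
}
```

Failing the whole render on a missing field, rather than writing a partially expanded file, keeps a stale but complete SKILL.md in place; that trade-off is an assumption of this sketch.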
| 189 | + |
| 190 | +--- |
| 191 | + |
| 192 | +## Layer mapping (cross-reference with `layers.yaml`) |
| 193 | + |
| 194 | +| Component | Layer | Package | |
| 195 | +|---|---|---| |
| 196 | +| Manifest TOML schema + parser | utility | `pkg/appmanifest/` | |
| 197 | +| Pilot project pubkey + sig verify | utility | `internal/appkeys/` | |
| 198 | +| HTTPS fetch + verify of signed tarball | utility | `pkg/appfetch/` | |
| 199 | +| Sandbox runtime (namespaces, seccomp, cgroups, bind-mount) | L11 plugin | `plugins/appsandbox/` | |
| 200 | +| Agent ↔ app IPC routing | L11 plugin (inside appsandbox) | uses L9 via L10 contract | |
| 201 | +| App outbound network mediation | L11 plugin (inside appsandbox) | uses L10 `Streams` | |
| 202 | +| Skill-file renderer | L11 plugin (inside appsandbox) | replaces `plugins/skillinject` semantics | |
| 203 | +| `pilotctl app *` subcommands | L12 | `cmd/pilotctl/` (extension) | |
| 204 | +| `pilot-submit` CLI | L12 | `cmd/pilot-submit/` (new) | |
| 205 | +| Registry server (queue + review + sign + CDN + deny-list publish) | L11 server | `pkg/appstore/server/` | |
| 206 | +| Server binary | L12 | `cmd/appstore-server/` | |
| 207 | + |
| 208 | +No new layer is required. All imports remain strictly downward (P1). Wire formats L1/L4/L5/L6/L7 are untouched (P6). |
| 209 | + |
| 210 | +--- |
| 211 | + |
| 212 | +## Prerequisites flagged by the existing architecture docs |
| 213 | + |
| 214 | +The following items in `web4-private/architecture-notes/` become **blocking** for the appsandbox plugin, not merely transitional: |
| 215 | + |
| 216 | +1. **Per-plugin panic supervisor** (`03-INVARIANTS.md §8`). Today, an L11 plugin panic kills the daemon. `appsandbox` manages N child processes — a panic during app lifecycle kills the daemon AND orphans every running app. Must ship before `appsandbox` launches. |
| 217 | +2. **T7.1 — plugin lifecycle out of `pkg/daemon` into `cmd/daemon`** (`layers.yaml` transitional). Today `pkg/daemon` (L7) imports `pkg/coreapi` (L10), which is the wrong direction. Must resolve before introducing a heavier plugin. |
| 218 | +3. **Audit #15 — `TestDaemonShutdownStopsGoroutines`** (`03-INVARIANTS.md §4`). Listed as "the prerequisite for all extraction work." `appsandbox` adds goroutines per running app; without the baseline test, sandbox lifetime leaks are invisible. |
| 219 | + |
| 220 | +These are not new concerns; the architecture docs already track them. The agent app platform makes them load-bearing instead of background hygiene. |
| 221 | + |
| 222 | +--- |
| 223 | + |
| 224 | +## Out of scope for v1 |
| 225 | + |
| 226 | +- Cross-host app calls (agent on host A invoking app on host B). v1 is local-host install only. |
| 227 | +- App → app composition. v1 forbids one app calling another; capability composition deferred to v1.1. |
| 228 | +- Schema migrations (`migrations/` directory in tarball). v1 = single-version installs only. |
| 229 | +- Python SDK. v1 ships Go SDK in `pkg/appclient/`; most LLM agents are Python-shaped, so this is a known tradeoff. |
| 230 | +- Multi-reviewer submission workflow. v1 = founder-manual review at ≤1 app/week throughput. |
| 231 | +- Automated submission scanning (static/dynamic analysis, malware). v1.1+. |
| 232 | +- Per-app billing/metering. |
| 233 | +- Confidential computing / TEE. Apps trust the host. |
| 234 | + |
| 235 | +--- |
| 236 | + |
| 237 | +## Open questions |
| 238 | + |
| 239 | +The current open items most relevant to this doc: |
| 240 | + |
| 241 | +- First-party reference app: `pilot-tasks` or `pilot-data`? `pilot-tasks` directly demonstrates the state-inversion thesis. |
| 242 | +- Skill-render frequency and event triggers (currently default 5 min periodic + on state-change events). |
| 243 | +- Multi-tenant blast radius for one host running many apps. |
| 244 | +- App crash + state recovery semantics. |
| 245 | + |
| 246 | +--- |
| 247 | + |
| 248 | +## See also |
| 249 | + |
| 250 | +- `layers.yaml` — single source of truth for layer membership |
| 251 | +- `web4-private/architecture-notes/01-LAYERS.md` — full layer definitions |
| 252 | +- `web4-private/architecture-notes/03-INVARIANTS.md` — the ten principles + state ownership + lock graph + panic boundaries |
| 253 | +- `internal/skillinject/` — v1 of the skill-injection mechanism this platform generalizes |