Commit 762e715

Initial extraction from web4
Lifts docs/ to a standalone repo.
- SPEC.md + SPEC-*.md: protocol-internal specs
- RUNBOOK-*.md: operator procedures (pilot-ca ceremony, compat-mode ops)
- WHITEPAPER.{tex,pdf}: the protocol whitepaper
- enterprise-readiness-report.{tex,pdf}: enterprise audit catalog
- ietf/: IETF drafts
- research/: comparison + social-structures academic papers
- media/: diagrams + screencasts referenced by other docs
- 22 files, ~2.9MB
0 parents  commit 762e715

23 files changed

Lines changed: 6773 additions & 0 deletions

AGENT-APPS.md

Lines changed: 253 additions & 0 deletions
@@ -0,0 +1,253 @@
1+
# Agent App Platform
2+
3+
> Status: DESIGN — pre-build, pending validation gate.
4+
> Layer impact: new L11 plugin (`plugins/appsandbox`), new L11 server (`pkg/appstore/server`), new utilities (`pkg/appmanifest`, `pkg/appfetch`, `internal/appkeys`), new L12 binary (`cmd/pilot-submit`), extended L12 binary (`cmd/pilotctl`).
5+
6+
A curated app registry and sandboxed runtime that lets autonomous agents discover, install, and use stateful apps without human-in-the-loop setup. Apps are signed by Pilot project keys, distributed only via `apps.pilotprotocol.network`, and run inside per-app Linux sandboxes on the host daemon.
7+
8+
---
9+
10+
## Model
11+
12+
**Each compiled app binary ships with a skill-file template** (`skill.md.tmpl`):
13+
14+
- **Static parts** — vendor-authored, immutable instructions to the agent: what the app does, available RPC methods, schemas, examples. Signed in the tarball by the vendor and countersigned by Pilot at deploy.
15+
- **Fractional parts** — slots in the template (Jinja-shaped: `{{tasks_count}}`, `{{recent: limit=5}}`) that the daemon's renderer fills at expansion time by calling the app's `summary` RPC. The RPC reads from the app's own SQLite database and returns a digest of current state.
16+
17+
The skill file is the agent's *map*. SQLite stays inside the sandbox; agents never touch it directly. SQL traversal is the app's job — exposed through manifest-declared RPC methods. The skill file describes which methods to call; the app implements them.
18+
19+
This is `skillinject` v2: instead of one global heartbeat injected into every agent tool dir, each installed app contributes one rendered SKILL.md to each tool dir. Generalizes the existing pattern; deprecates the single-heartbeat model.
20+
21+
---
22+
23+
## End-to-end flow
24+
25+
```mermaid
26+
flowchart TB
27+
%% ============ Pilot-operated registry ============
28+
subgraph REG["apps.pilotprotocol.network — Pilot-operated registry"]
29+
direction LR
30+
QUEUE["Submission queue<br/>(pilot-submit uploads)"]
31+
REVIEW["Manual review<br/>(v1 = founder)"]
32+
SIGNER["Sign with<br/>Pilot project key"]
33+
CDN[("Signed binary CDN")]
34+
DENY[("revoked.json<br/>deny-list")]
35+
QUEUE --> REVIEW --> SIGNER --> CDN
36+
end
37+
38+
VENDOR(["Vendor<br/>e.g. Legora"]) -- "pilot-submit tarball<br/>{bin, manifest.toml, skill.md.tmpl, schema.sql}" --> QUEUE
39+
40+
%% ============ Local host ============
41+
subgraph HOST["Local host running pilot-daemon"]
42+
direction TB
43+
DAEMON{{"pilot-daemon (L7)"}}
44+
45+
subgraph PLUGIN["plugins/appsandbox (L11)"]
46+
FETCH["appfetch + appkeys<br/>HTTPS + sig verify"]
47+
LIFECYCLE["Sandbox lifecycle<br/>namespaces + seccomp +<br/>cgroups + bind-mount"]
48+
RENDER["Skill-file renderer<br/>(static + fractional)"]
49+
REVCHK["Revocation check<br/>24h cached"]
50+
end
51+
52+
subgraph APPFS["~/.pilot/apps/&lt;name&gt;/"]
53+
BIN[/"bin/app binary"/]
54+
MAN[/"manifest.toml<br/>caps + RPC methods"/]
55+
TMPL[/"skill.md.tmpl<br/>static + {{slots}}"/]
56+
DB[("state.db<br/>per-app SQLite")]
57+
end
58+
59+
subgraph SBX["sandboxed app process"]
60+
RUNTIME["App runtime<br/>(handles RPC methods)"]
61+
SUMMARY["summary RPC<br/>returns fractional state JSON"]
62+
end
63+
64+
subgraph DIRS["Agent tool dirs"]
65+
CSKILL[/"~/.claude/skills/&lt;name&gt;/SKILL.md<br/>(rendered)"/]
66+
OSKILL[/"~/.openclaw/skills/&lt;name&gt;/SKILL.md<br/>(rendered)"/]
67+
end
68+
end
69+
70+
%% ============ Agent ============
71+
AGENT(["LLM agent<br/>(Claude Code, OpenClaw, …)"])
72+
73+
%% ----- Install flow (numbered) -----
74+
DAEMON -- "1. install &lt;name&gt;" --> FETCH
75+
CDN -- "2. HTTPS GET<br/>signed tarball" --> FETCH
76+
FETCH -- "3. verify Pilot sig" --> LIFECYCLE
77+
DENY -. "pull every 24h" .-> REVCHK
78+
REVCHK -- "block if revoked" --> LIFECYCLE
79+
LIFECYCLE -- "4. materialize folder" --> APPFS
80+
LIFECYCLE -- "5. spawn sandboxed" --> RUNTIME
81+
RUNTIME -- "owns r/w" --> DB
82+
83+
%% ----- Skill render flow (lettered) -----
84+
RENDER -- "A. read static" --> TMPL
85+
RENDER -- "B. call summary RPC" --> SUMMARY
86+
SUMMARY -- "query" --> DB
87+
SUMMARY -- "fractional JSON<br/>{tasks_count, recent: [...]}" --> RENDER
88+
RENDER -- "C. expand + write" --> CSKILL
89+
RENDER -- "C. expand + write" --> OSKILL
90+
91+
%% ----- Agent runtime invocation (roman) -----
92+
AGENT -- "i. reads SKILL.md<br/>(learns commands + sees state)" --> CSKILL
93+
AGENT -- "ii. pilotctl app call &lt;name&gt; &lt;method&gt; &lt;args&gt;" --> DAEMON
94+
DAEMON -- "iii. route via L9 IPC" --> RUNTIME
95+
RUNTIME -- "iv. SQL r/w + response" --> DAEMON
96+
DAEMON -- "v. return JSON to agent" --> AGENT
97+
98+
%% ----- Styling -----
99+
classDef cloud fill:#e8f0ff,stroke:#3366cc
100+
classDef host fill:#f5f5f5,stroke:#666
101+
classDef sandbox fill:#fff4e0,stroke:#cc8800
102+
classDef store fill:#e8f8e8,stroke:#3a9d3a
103+
class REG cloud
104+
class HOST host
105+
class SBX sandbox
106+
class APPFS store
107+
```
108+
109+
Three independent loops, deliberately separate:
110+
111+
1. **Install (1→5)** — runs once per app, on `pilotctl app install`. Pulls signed tarball, verifies against Pilot project key (pinned at compile in `internal/appkeys`), checks deny-list, materializes folder, spawns under sandbox primitives.
112+
2. **Skill render (A→C)** — runs periodically (default 5 min) AND on app-published state-change events. The fractional memory expansion loop. Stays cheap: `summary` RPC returns a digest, never raw rows.
113+
3. **Agent invocation (i→v)** — runtime hot path. Agent reads SKILL.md to learn commands and see digest state; calls happen via L9 IPC mediated by the daemon; app handles SQL internally; response returns as JSON.
114+
115+
---
116+
117+
## Why direct SQLite stubs are deliberately excluded from skill files
118+
119+
- The app's schema is implementation, not contract. Schema migrations would break agents if exposed.
120+
- Arbitrary SQL from an LLM against a vendor's DB is a footgun (correctness, security, performance).
121+
- The RPC surface is the stable contract. The vendor controls the queries; the agent stays at the API boundary.
122+
123+
If an app needs to expose richer query capability, the manifest declares additional RPC methods (`search`, `filter`, etc.) that the app implements over its own DB.
124+
125+
---
126+
127+
## Manifest example
128+
129+
```toml
130+
name = "pilot-tasks"
131+
version = "0.1.0"
132+
publisher = "pilot-project"
133+
signed-by = "ed25519:PILOT_PROJECT_KEY..."
134+
135+
[capabilities]
136+
allowed-network = [] # this app needs no outbound
137+
allowed-fs = ["state.db", "cache/"] # paths relative to bind-mount root
138+
exec = "none"
139+
140+
[sqlite]
141+
schema = "schema.sql"
142+
143+
[rpc]
144+
methods = ["add", "list", "complete", "summary"]
145+
# summary is the fractional-render contract: returns
146+
# { tasks_count: int, recent: [{title, ts}], in_progress: int }
147+
```
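
After parsing, a manifest like this might be held and sanity-checked roughly as below. This is a sketch under stated assumptions, not the `pkg/appmanifest` API: the struct layout, the `Validate` method, and both rules it enforces (every app must declare `summary`, and `allowed-fs` entries must stay inside the bind-mount root) are illustrative readings of the example above. TOML decoding is left out because it would need a third-party parser.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// Capabilities mirrors the [capabilities] table (hypothetical field names).
type Capabilities struct {
	AllowedNetwork []string
	AllowedFS      []string
	Exec           string
}

// Manifest mirrors the top-level keys plus the [sqlite] and [rpc] tables.
type Manifest struct {
	Name, Version, Publisher, SignedBy string
	Capabilities                       Capabilities
	SQLiteSchema                       string
	RPCMethods                         []string
}

// Validate applies two illustrative checks; the real rule set is not specified here.
func (m Manifest) Validate() error {
	if m.Name == "" || m.Version == "" {
		return errors.New("manifest: name and version are required")
	}
	hasSummary := false
	for _, method := range m.RPCMethods {
		if method == "summary" {
			hasSummary = true
		}
	}
	if !hasSummary {
		return errors.New("manifest: rpc.methods must include \"summary\" for skill rendering")
	}
	for _, p := range m.Capabilities.AllowedFS {
		clean := path.Clean(p)
		if path.IsAbs(p) || clean == ".." || strings.HasPrefix(clean, "../") {
			return fmt.Errorf("manifest: allowed-fs entry %q escapes the bind-mount root", p)
		}
	}
	return nil
}

func main() {
	m := Manifest{
		Name:         "pilot-tasks",
		Version:      "0.1.0",
		Publisher:    "pilot-project",
		Capabilities: Capabilities{AllowedFS: []string{"state.db", "cache/"}, Exec: "none"},
		SQLiteSchema: "schema.sql",
		RPCMethods:   []string{"add", "list", "complete", "summary"},
	}
	fmt.Println("valid:", m.Validate() == nil)

	m.Capabilities.AllowedFS = append(m.Capabilities.AllowedFS, "cache/../../etc")
	fmt.Println("escape rejected:", m.Validate() != nil)
}
```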
148+
149+
---
150+
151+
## Skill file template example
152+
153+
```markdown
154+
# pilot-tasks
155+
156+
A persistent task queue. Use this app for any task you want to remember across sessions.
157+
158+
## Current state
159+
- You have {{tasks_count}} open tasks.
160+
- {{in_progress}} in progress.
161+
- Recent: {{recent | bullet_list}}
162+
163+
## Commands
164+
- `pilotctl app call pilot-tasks add '{"title": "..."}'` — create a task
165+
- `pilotctl app call pilot-tasks list` — list open tasks
166+
- `pilotctl app call pilot-tasks complete '{"id": "..."}'` — mark complete
167+
```
168+
169+
After rendering with live state:
170+
171+
```markdown
172+
# pilot-tasks
173+
174+
A persistent task queue. Use this app for any task you want to remember across sessions.
175+
176+
## Current state
177+
- You have 3 open tasks.
178+
- 1 in progress.
179+
- Recent:
180+
- draft contract for Acme
181+
- review PR #1247
182+
- file weekly status
183+
184+
## Commands
185+
- `pilotctl app call pilot-tasks add '{"title": "..."}'` — create a task
186+
- `pilotctl app call pilot-tasks list` — list open tasks
187+
- `pilotctl app call pilot-tasks complete '{"id": "..."}'` — mark complete
188+
```
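
The expansion from template to rendered file could look something like the Go sketch below. It is not the daemon's renderer: `Render`, `bulletList`, and the regular expression are hypothetical, only the `{{name}}` and `{{name | bullet_list}}` slot shapes from the example are handled (the `{{recent: limit=5}}` form is not), and the two-space indent for nested bullets is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// slotRE matches {{name}} and {{name | filter}}.
var slotRE = regexp.MustCompile(`\{\{\s*(\w+)\s*(?:\|\s*(\w+)\s*)?\}\}`)

// Render fills template slots from the JSON returned by the summary RPC.
func Render(tmpl string, summaryJSON []byte) (string, error) {
	var state map[string]any
	if err := json.Unmarshal(summaryJSON, &state); err != nil {
		return "", fmt.Errorf("decode summary: %w", err)
	}
	var firstErr error
	out := slotRE.ReplaceAllStringFunc(tmpl, func(slot string) string {
		m := slotRE.FindStringSubmatch(slot)
		name, filter := m[1], m[2]
		val, ok := state[name]
		switch {
		case !ok:
			if firstErr == nil {
				firstErr = fmt.Errorf("summary has no field %q", name)
			}
		case filter == "":
			return fmt.Sprint(val)
		case filter == "bullet_list":
			return bulletList(val)
		default:
			if firstErr == nil {
				firstErr = fmt.Errorf("unknown filter %q", filter)
			}
		}
		return slot // leave the slot untouched; the error is reported below
	})
	if firstErr != nil {
		return "", firstErr
	}
	return out, nil
}

// bulletList renders a JSON array as nested bullets, using each object's "title".
func bulletList(v any) string {
	items, _ := v.([]any)
	var b strings.Builder
	for _, it := range items {
		title := fmt.Sprint(it)
		if obj, ok := it.(map[string]any); ok {
			title = fmt.Sprint(obj["title"])
		}
		b.WriteString("\n  - " + title)
	}
	return b.String()
}

func main() {
	tmpl := "- You have {{tasks_count}} open tasks.\n- Recent:{{recent | bullet_list}}\n"
	summary := []byte(`{"tasks_count": 3, "recent": [{"title": "draft contract for Acme", "ts": 1}]}`)
	out, err := Render(tmpl, summary)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Failing the whole render when a slot has no matching field, instead of writing a half-filled file, keeps a stale but complete SKILL.md in place for the agent.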
189+
190+
---
191+
192+
## Layer mapping (cross-reference with `layers.yaml`)
193+
194+
| Component | Layer | Package |
195+
|---|---|---|
196+
| Manifest TOML schema + parser | utility | `pkg/appmanifest/` |
197+
| Pilot project pubkey + sig verify | utility | `internal/appkeys/` |
198+
| HTTPS fetch + verify of signed tarball | utility | `pkg/appfetch/` |
199+
| Sandbox runtime (namespaces, seccomp, cgroups, bind-mount) | L11 plugin | `plugins/appsandbox/` |
200+
| Agent ↔ app IPC routing | L11 plugin (inside appsandbox) | uses L9 via L10 contract |
201+
| App outbound network mediation | L11 plugin (inside appsandbox) | uses L10 `Streams` |
202+
| Skill-file renderer | L11 plugin (inside appsandbox) | replaces `plugins/skillinject` semantics |
203+
| `pilotctl app *` subcommands | L12 | `cmd/pilotctl/` (extension) |
204+
| `pilot-submit` CLI | L12 | `cmd/pilot-submit/` (new) |
205+
| Registry server (queue + review + sign + CDN + deny-list publish) | L11 server | `pkg/appstore/server/` |
206+
| Server binary | L12 | `cmd/appstore-server/` |
207+
208+
No new layer is required. All imports remain strictly downward (P1). Wire formats L1/L4/L5/L6/L7 are untouched (P6).
209+
210+
---
211+
212+
## Prerequisites flagged by the existing architecture docs
213+
214+
The following items in `web4-private/architecture-notes/` become **blocking** for the appsandbox plugin, not merely transitional:
215+
216+
1. **Per-plugin panic supervisor** (`03-INVARIANTS.md §8`). Today, an L11 plugin panic kills the daemon. `appsandbox` manages N child processes — a panic during app lifecycle kills the daemon AND orphans every running app. Must ship before `appsandbox` launches.
217+
2. **T7.1 — plugin lifecycle out of `pkg/daemon` into `cmd/daemon`** (`layers.yaml` transitional). Today `pkg/daemon (L7)` imports `pkg/coreapi (L10)` — wrong direction. Must resolve before introducing a heavier plugin.
218+
3. **Audit #15 `TestDaemonShutdownStopsGoroutines`** (`03-INVARIANTS.md §4`). Listed as "the prerequisite for all extraction work." `appsandbox` adds goroutines per running app; without the baseline test, sandbox lifetime leaks are invisible.
219+
220+
These are not new concerns; the architecture docs already track them. The agent app platform makes them load-bearing instead of background hygiene.
221+
222+
---
223+
224+
## Out of scope for v1
225+
226+
- Cross-host app calls (agent on host A invoking app on host B). v1 is local-host install only.
227+
- App → app composition. v1 forbids one app calling another; capability composition deferred to v1.1.
228+
- Schema migrations (`migrations/` directory in tarball). v1 = single-version installs only.
229+
- Python SDK. v1 ships Go SDK in `pkg/appclient/`; most LLM agents are Python-shaped, so this is a known tradeoff.
230+
- Multi-reviewer submission workflow. v1 = founder-manual review at ≤1 app/week throughput.
231+
- Automated submission scanning (static/dynamic analysis, malware). v1.1+.
232+
- Per-app billing/metering.
233+
- Confidential computing / TEE. Apps trust the host.
234+
235+
---
236+
237+
## Open questions
238+
239+
The current open items most relevant to this doc:
240+
241+
- First-party reference app: `pilot-tasks` or `pilot-data`? `pilot-tasks` directly demonstrates the state-inversion thesis.
242+
- Skill-render frequency and event triggers (currently default 5 min periodic + on state-change events).
243+
- Multi-tenant blast radius for one host running many apps.
244+
- App crash + state recovery semantics.
245+
246+
---
247+
248+
## See also
249+
250+
- `layers.yaml` — single source of truth for layer membership
251+
- `web4-private/architecture-notes/01-LAYERS.md` — full layer definitions
252+
- `web4-private/architecture-notes/03-INVARIANTS.md` — the ten principles + state ownership + lock graph + panic boundaries
253+
- `internal/skillinject/` — v1 of the skill-injection mechanism this platform generalizes
