Commit 5892713

Step 9: deprecate subsumed specialists (v0.2 complete) (#14)
* Step 9: deprecate subsumed specialists (v0.2 complete)

Remove fetch-page-title and extract-structured-data from the advertised skill set. Both are reachable via browser-task: page-title as a trivial intent, structured extraction via a "return JSON: {...}" instruction carried in the planner's done(result=...) channel. Cost delta is 2-3x tokens per call, acceptable given zero deterministic high-volume callers today.

extract-structured-data was also out of spec on §7.1: it called the no-argument CreateSessionAsync overload and accepted any host. The generalist enforces allowlists by design.

Advertised v0.2 surface lands at three skills: browser-task, learn-form-schema, execute-form-batch.

- Delete FetchPageTitleCapability, ExtractStructuredDataCapability, and the shared CapabilityInput URL/description parser (no other consumers). browser-task has its own BrowserTaskInput; form capabilities have their own input classes.
- Delete the session-level one-shot helpers that only the removed specialists used: IBrowserSession.FetchPageTitleAsync, CapturePageSnapshotAsync, PageSnapshot, PageSnapshotSource.
- Delete the corresponding tests: 7 unit tests for the capabilities, the PlaywrightBrowserSessionTests + PageSnapshotTests integration suites, and the ExtractStructuredDataIntegrationTests real-LLM benchmark. BrowserTaskIntegrationTests remains the real-LLM surface.
- Trim StubBrowserSessionFactory + FakeAgentBrowserSession to match the pruned IBrowserSession.

Update metadata: deploy/rockbot-seed/*.json, docker-compose.yml description + curl smoke example, .env.example comments, Program.cs comment, docs/capabilities.md, spec §5.2 capability table and §9.1 step-9 description, CLAUDE.md Status + Browser + Capabilities sections, framework-feedback step-9 section.

Version bumped 0.2.0-alpha.8 → 0.2.0-alpha.9.

Tests: 48 passed (46 Agent unit + 1 Forms integration + 1 placeholder), 3 real-LLM BrowserTaskIntegrationTests skipped as expected. Build clean on Release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Step 9 follow-ups: fixes surfaced during end-to-end validation

Running RockBot → Foragent with a real browser-task (MacBook-price search across apple.com + bestbuy.com) surfaced three pre-existing issues that blocked step 9's claimed end-to-end validation. Fixing them here on the step-9 branch keeps the PR's test plan honest.

1. BrowserTaskPriming required IEmbeddingGenerator (DI resolution bug)

   The primary-constructor parameter was annotated nullable (IEmbeddingGenerator<string, Embedding<float>>?), but MSDI ignores C# nullable annotations; it only honors default parameter values. Reordered to put embeddingGenerator last with = null so MSDI treats it as optional. Spec §5.6 says missing embeddings should downgrade to BM25-only retrieval; that claim is now actually true. Two test callers updated to drop the explicit embeddingGenerator: null arg.

2. Skill names with dotted hosts failed silently

   RockBot 0.9's FileSkillStore.ValidateName rejects '.', so every real host (bsky.app, apple.com, example.com) threw ArgumentException on save. BskySeedSkillService swallowed the throw as a startup warning, TryWriteLearnedSkillAsync swallowed it on the error path, and form schemas just never persisted. Added SkillNaming.SanitizeHost that replaces '.' → '-' (bsky.app → bsky-app) and applied it at three call sites: BskySeedSkillService, BrowserTaskCapability.TryWriteLearnedSkillAsync, LearnFormSchemaCapability.DeriveSkillName. Allowlist matching and memory-search categories keep the original dotted host; only skill names need sanitization. Test assertions (BrowserTaskCapabilityTests, BskySeedSkillServiceTests, LearnFormSchemaCapabilityTests) updated to the sanitized names; skill-optimize.md directive examples updated so the dream loop produces valid names.

3. Fresh named volume masks Dockerfile chown

   The Foragent Dockerfile chowns /data to the non-root foragent user (uid 1655) at image-build time, but Docker mounts a fresh named volume root-owned, masking the build-time chown. Added a foragent-init busybox one-shot (mirroring rockbot-init) that chmod -R 777 /data/foragent on volume creation.

Docs updated: CLAUDE.md Status + Learning-substrate sections, docs/capabilities.md, spec §5.6 skill-naming paragraph (calls out the sanitization rule), framework-feedback step-9 follow-up section with three framework observations (MSDI nullable footgun, validator's dot rejection making real hosts fail, named-volume permissions pattern).

Tests: 48 passed / 3 LLM-gated skipped. End-to-end smoke: RockBot dispatches browser-task to Foragent over the bus; Foragent plans 2 steps (navigate + snapshot), emits done with JSON result, reply lands on user.response.RockBot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Step 9 follow-ups: A2A DataPart input + agent-card hints

When RockBot's LLM called invoke_agent with free-form prose (message="...allowedHosts: ['*']..."), Foragent's three input parsers kept rejecting with 'Missing allowedHosts' because they only consumed text parts and expected a JSON object. RockBot 0.9.11+ supports structured input via an A2A DataPart (AgentMessagePart{Kind="data", Data=<json>}), but Foragent never advertised that it consumed data parts and the invoke_agent tool description steered the LLM to omit 'data' unless the target "is known to consume data." Result: loop.

Fix spans three surfaces:

1. Parsers accept DataPart. BrowserTaskInput, LearnFormSchemaInput, and ExecuteFormBatchInput now look for a Kind="data" part first and use its Data string as the JSON source. Text-JSON fallback stays (curl callers), and for browser-task, a prose text part serves as the intent fallback when the data part doesn't supply one. Metadata overrides remain.
2. Skill descriptions explicitly direct callers to use the data parameter. Each SkillDefinition.Description now leads with "PASS INPUT AS AN A2A DATA PART (a structured JSON object), not as prose inside the text message. When calling via RockBot's invoke_agent, populate the 'data' parameter with this object." Matching entries in deploy/rockbot-seed/well-known-agents.json updated so the LLM sees the same guidance through list_known_agents.
3. Tests. Four new unit tests: one per input parser verifying a DataPart with JSON is consumed; one for browser-task's text-as-intent fallback when the data part omits intent. TestContext gained RequestWithData(...) to build the dual-part shape RockBot's invoke_agent produces.

Image bumped to rockylhotka/rockbot-agent:0.9.14, which softens the invoke_agent 'data' tool description upstream, complementing the skill-description hints on the Foragent side. CLAUDE.md Status paragraph updated.

Docs: CLAUDE.md Capabilities section gains a note on the DataPart contract. framework-feedback step-9 follow-up section extended with the three-surface lesson (sender tool description ↔ target skill description ↔ target parser all need to agree on the canonical shape).

Tests: 52 passed / 3 LLM-gated skipped. Build clean. Curl smoke (text-JSON path) returns valid JSON via browser-task unchanged. Live Blazor end-to-end test is next, against the updated 0.9.14 rockbot image.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Step 9 follow-ups: A2A task cancellation + self-teaching errors

RockBot 0.9.15 now publishes agent.task.cancel.{agentName} messages when a wisp's local state fails after dispatching an A2A task (the duplicate-dispatch scenario observed during step-9 validation). Foragent's previous behavior inherited the framework's default AgentTaskCancelHandler, which always replies TaskNotCancelable because it assumes stateless agents. Foragent is stateful (a browser task is potentially minutes long), so leaving the default would orphan browser runs.

Implementation:

- InFlightTaskRegistry (singleton): ConcurrentDictionary<taskId, CancellationTokenSource> with Register/TryCancel/Remove. Register returns a linked CT that fires on either external cancel or the parent message CT. Redelivered task ids cancel the prior registration before replacing it, so stale work unwinds.
- ForagentTaskHandler wraps the capability's AgentTaskContext so the CT observed via context.MessageContext.CancellationToken is the linked one from the registry. Capabilities observe cancellation without any signature change.
- ForagentCancelHandler (IMessageHandler<AgentTaskCancelRequest>): on match calls TryCancel and publishes nothing (the running task's own terminal reply is the acknowledgment); on miss publishes AgentTaskError{Code=TaskNotFound}. Registered via agent.HandleMessage<AgentTaskCancelRequest, ForagentCancelHandler>() after AddA2A; last AddScoped wins, overriding the default.
- 11 new unit tests across registry, cancel handler, and task-handler integration (parent-cancel → linked CT fires, external cancel → linked CT fires, register/remove ties to finally, Remove drops registration even on thrown capability).

Also in this commit, incorporating earlier step-9 follow-ups for the same RockBot 0.9.x interop round:

- Self-teaching errors. When a parser rejects for a missing required field, the response now tells the LLM exactly how to fix the call: "Pass inputs as a JSON object on the A2A DataPart — in RockBot's invoke_agent tool, that means filling the 'data' parameter, NOT adding fields to the 'message' text. Example data: {...}." Observed behavior: LLMs that ignore skill descriptions do read error replies and adjust subsequent calls. Applied to all three parsers.
- Docker image bumped to rockylhotka/rockbot-agent:0.9.15, which brings (a) invoke_agent's structured 'data' parameter (0.9.11), (b) softened tool description encouraging DataPart usage (0.9.14), and (c) the cancel-publisher that this commit consumes (0.9.15). CLAUDE.md Status section updated accordingly.

Framework-feedback step-9 follow-up section extended with the cancel-handler-override pattern as a candidate for upstream WithTaskCancellation ergonomics (non-blocking: ~50 LOC across consumers isn't unbearable).

Tests: 63 passed / 3 LLM-gated skipped. Build clean. Foragent starts cleanly on fresh volumes; agent.task.cancel.Foragent subscription verified active at boot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
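
The in-flight registry described in the last follow-up can be sketched in a few lines. The project itself is C#; the Python below is only a language-neutral illustration of the Register/TryCancel/Remove contract, with `CancelToken` and `LinkedToken` as hypothetical stand-ins for .NET's `CancellationTokenSource` and a linked token. Comparing the token in `remove` is an assumption added here so that a stale run's cleanup cannot drop a newer registration; the commit message does not say how the real registry handles that case.

```python
import threading


class CancelToken:
    """Minimal cancellation flag (stand-in for a CancellationTokenSource)."""

    def __init__(self):
        self._event = threading.Event()

    def cancel(self):
        self._event.set()

    @property
    def cancelled(self):
        return self._event.is_set()


class LinkedToken:
    """Reports cancelled when either its own source or the parent has fired."""

    def __init__(self, own, parent):
        self.own = own
        self.parent = parent

    @property
    def cancelled(self):
        return self.own.cancelled or self.parent.cancelled


class InFlightTaskRegistry:
    """Maps task ids to cancel sources so a cancel message can find the run."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sources = {}

    def register(self, task_id, parent):
        own = CancelToken()
        with self._lock:
            prior = self._sources.get(task_id)
            self._sources[task_id] = own
        if prior is not None:
            prior.cancel()  # redelivered id: let the stale run unwind
        return LinkedToken(own, parent)

    def try_cancel(self, task_id):
        with self._lock:
            source = self._sources.get(task_id)
        if source is None:
            return False  # caller would reply with a TaskNotFound error
        source.cancel()
        return True

    def remove(self, task_id, token):
        # Assumption: only drop the entry if it still belongs to this run.
        with self._lock:
            if self._sources.get(task_id) is token.own:
                del self._sources[task_id]
```

A task handler would call `register` before invoking the capability, pass the returned token down as the capability's cancellation signal, and call `remove` in a `finally` block so the entry disappears even when the capability throws.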
1 parent 9e2023f commit 5892713

44 files changed

Lines changed: 999 additions & 1207 deletions

.env.example

Lines changed: 5 additions & 6 deletions
@@ -2,14 +2,13 @@
 #
 # These are consumed by docker-compose.yml. Foragent and the RockBot agent can
 # be pointed at different models — Foragent uses FORAGENT_LLM_* for its own
-# LLM-backed capabilities (extract-structured-data and beyond), the RockBot
-# agent uses LLM_* for its own reasoning.
+# LLM-backed capabilities (browser-task planner, form-schema enrichment), the
+# RockBot agent uses LLM_* for its own reasoning.
 #
-# Direct curl tests of capabilities that don't need an LLM (fetch-page-title)
-# still work without either set; anything LLM-backed will fail until Foragent's
-# config is populated.
+# Every Foragent capability needs the LLM wired; without it, the host fails
+# fast at startup.
 
-# ── Foragent's LLM (REQUIRED for extract-structured-data) ────────────────────
+# ── Foragent's LLM (REQUIRED — browser-task + form-schema enrichment) ────────
 # Azure AI Foundry / OpenAI-compatible endpoints are both fine.
 # Foundry endpoint shape: https://<resource>-<region>.cognitiveservices.azure.com/openai/v1/
 FORAGENT_LLM_ENDPOINT=https://your-resource-region.cognitiveservices.azure.com/openai/v1/
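
The new comment says the host fails fast at startup when the LLM is not configured. As a rough illustration of that behavior (not the host's actual C# startup code), a fail-fast check over the three `FORAGENT_LLM_*` variables that docker-compose.yml marks as required in this commit might look like the sketch below. The function names are hypothetical.

```python
import os

# Variable names taken from docker-compose.yml in this commit.
REQUIRED = (
    "FORAGENT_LLM_ENDPOINT",
    "FORAGENT_LLM_MODEL_ID",
    "FORAGENT_LLM_API_KEY",
)


def missing_llm_settings(env):
    """Return the required variable names that are absent or blank."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


def require_llm_settings(env=None):
    """Raise before any work starts if the LLM configuration is incomplete."""
    env = os.environ if env is None else env
    missing = missing_llm_settings(env)
    if missing:
        raise RuntimeError("Missing required LLM settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

Calling `require_llm_settings()` once at process start surfaces a single, complete error message instead of a failure on the first LLM-backed request.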

CLAUDE.md

Lines changed: 7 additions & 5 deletions
Large diffs are not rendered by default.

Directory.Build.props

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
     <ImplicitUsings>enable</ImplicitUsings>
     <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
     <EnforceCodeStyleInBuild>true</EnforceCodeStyleInBuild>
-    <Version>0.2.0-alpha.8</Version>
+    <Version>0.2.0-alpha.9</Version>
     <Authors>Marimer LLC</Authors>
     <Company>Marimer LLC</Company>
     <Copyright>Copyright (c) Marimer LLC</Copyright>

deploy/rockbot-seed/agent-trust.json

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
   {
     "agentId": "Foragent",
     "level": 4,
-    "approvedSkills": ["browser-task", "fetch-page-title", "extract-structured-data"],
+    "approvedSkills": ["browser-task", "learn-form-schema", "execute-form-batch"],
     "firstSeen": "2026-04-21T00:00:00+00:00",
     "lastInteraction": "2026-04-21T00:00:00+00:00",
     "interactionCount": 0

deploy/rockbot-seed/well-known-agents.json

Lines changed: 8 additions & 8 deletions
@@ -1,7 +1,7 @@
 [
   {
     "agentName": "Foragent",
-    "description": "Browser agent — navigates pages with Chromium and exposes task-level skills over HTTP A2A.",
+    "description": "Browser agent — navigates pages with Chromium and exposes task-level skills over HTTP A2A. All skills consume structured input on the A2A DataPart (invoke_agent 'data' parameter); the text 'message' is only for human-readable summaries.",
     "version": "1.0",
     "url": "http://foragent:8080",
     "protocolVersion": "1.0",
@@ -11,17 +11,17 @@
       {
         "id": "browser-task",
         "name": "Browser Task (generalist)",
-        "description": "Drive a browser with an LLM-in-the-loop planner to accomplish a free-form intent. Input JSON {\"intent\":\"...\",\"allowedHosts\":[\"host\",\"*.host\",\"*\"],\"url\":\"optional start\",\"credentialId\":\"optional\",\"maxSteps\":60,\"maxSeconds\":120}. allowedHosts is required and empty rejects. Returns a structured JSON result with status (done/failed/incomplete), summary, optional result, step count, and navigations."
+        "description": "Drive a browser with an LLM-in-the-loop planner to accomplish a free-form intent. PASS INPUT AS AN A2A DATA PART — populate invoke_agent's 'data' parameter with {\"intent\":\"...\",\"allowedHosts\":[\"host\",\"*.host\",\"*\"],\"url\":\"optional start\",\"credentialId\":\"optional\",\"maxSteps\":60,\"maxSeconds\":120}. 'intent' and 'allowedHosts' are required; an empty allowlist rejects the task. Use [\"*\"] explicitly when any host is acceptable. Returns a structured JSON result with status (done/failed/incomplete), summary, optional result, step count, and navigations."
       },
       {
-        "id": "fetch-page-title",
-        "name": "Fetch Page Title",
-        "description": "Navigate to a URL with a real browser and return the contents of its <title> element."
+        "id": "learn-form-schema",
+        "name": "Learn Form Schema",
+        "description": "Navigate to a web form, extract its structure (fields, types, options, validation), and persist it as a reusable skill. PASS INPUT AS AN A2A DATA PART — populate invoke_agent's 'data' parameter with {\"url\":\"https://...\",\"allowedHosts\":[\"host\"],\"formSelector\":\"optional\",\"credentialId\":\"optional\",\"skillName\":\"optional override\",\"intent\":\"optional prose\"}. 'url' and 'allowedHosts' are required. Returns the typed form schema plus the skill name it was persisted under."
       },
       {
-        "id": "extract-structured-data",
-        "name": "Extract Structured Data",
-        "description": "Navigate to a URL and extract data matching a natural-language description, returning JSON. Input the target URL and a description of what to extract."
+        "id": "execute-form-batch",
+        "name": "Execute Form Batch",
+        "description": "Submit a batch of rows against a learned form schema. PASS INPUT AS AN A2A DATA PART — populate invoke_agent's 'data' parameter with {\"schemaRef\":\"sites/host/forms/name\" OR \"schema\":{...FormSchema...},\"rows\":[{fieldName:value,...}],\"allowedHosts\":[\"host\"],\"credentialId\":\"optional\",\"mode\":\"abort-on-first\"|\"continue\",\"successIndicator\":\"optional CSS selector\"}. 'rows', 'allowedHosts', and exactly one of schemaRef/schema are required. Streams per-row progress. Default mode aborts on first failure."
       }
     ]
   }
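
The skill descriptions above, together with the commit message, describe a parsing order for `browser-task`: prefer the data part, fall back to JSON in a text part, use a prose text part as the intent when the structured input has none, and reject with an instructive error when `intent` or `allowedHosts` is missing or empty. The following Python sketch shows that order under stated assumptions: parts are modeled as plain dicts, the function and exception names are hypothetical, the hint text is paraphrased from the commit message, and the metadata overrides mentioned there are left out. The real parsers are C# classes such as `BrowserTaskInput`.

```python
import json

HINT = (
    "Pass inputs as a JSON object on the A2A DataPart; in RockBot's "
    "invoke_agent tool, fill the 'data' parameter rather than adding "
    "fields to the 'message' text."
)


class InputError(ValueError):
    """Raised when required fields are missing; the message explains the fix."""


def _try_json_object(text):
    """Return the parsed dict, or None if text is not a JSON object."""
    try:
        value = json.loads(text)
    except (TypeError, ValueError):
        return None
    return value if isinstance(value, dict) else None


def parse_browser_task_input(parts):
    data_obj = text_obj = prose = None
    for part in parts:
        if part.get("kind") == "data":
            if data_obj is None:
                raw = part.get("data")
                data_obj = raw if isinstance(raw, dict) else _try_json_object(raw)
        elif isinstance(part.get("text"), str):
            parsed = _try_json_object(part["text"])
            if parsed is not None:
                text_obj = text_obj or parsed
            elif prose is None:
                prose = part["text"].strip() or None

    # Data part wins; text JSON is the fallback used by curl callers.
    payload = dict(data_obj if data_obj is not None else (text_obj or {}))
    if not payload.get("intent") and prose:
        payload["intent"] = prose  # prose text serves as the intent fallback

    # An empty allowlist counts as missing, matching "empty rejects".
    missing = [key for key in ("intent", "allowedHosts") if not payload.get(key)]
    if missing:
        raise InputError("Missing " + ", ".join(missing) + ". " + HINT)
    return payload
```

Putting the corrective hint in the error itself reflects the observation in the commit message that LLM callers which skip skill descriptions still read error replies.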

docker-compose.yml

Lines changed: 22 additions & 7 deletions
@@ -5,7 +5,7 @@
 # - foragent — this project; exposes HTTP A2A on port 5210
 # - rockbot-init — seeds /data/agent with RockBot profile + well-known-agents.json
 # pointing at foragent
-# - rockbot — rockylhotka/rockbot-agent:0.9.11, configured to know Foragent
+# - rockbot — rockylhotka/rockbot-agent:0.9.15, configured to know Foragent
 # as an A2A peer it can delegate tasks to. 0.9.11 brings
 # the structured-data invoke_agent surface (PR #291) so
 # RockBot can consume Foragent's FormSchema JSON results
@@ -21,7 +21,7 @@
 # curl -X POST http://localhost:5210/ \
 # -H "X-Api-Key: rockbot-calls-foragent" \
 # -H "Content-Type: application/json" \
-# -d '{"jsonrpc":"2.0","id":1,"method":"message/send","params":{"message":{"role":"ROLE_USER","messageId":"m1","parts":[{"text":"https://example.com"}]},"metadata":{"skill":"fetch-page-title"}}}'
+# -d '{"jsonrpc":"2.0","id":1,"method":"message/send","params":{"message":{"role":"ROLE_USER","messageId":"m1","parts":[{"text":"{\"intent\":\"fetch the page title\",\"url\":\"https://example.com\",\"allowedHosts\":[\"example.com\"]}"}]},"metadata":{"skill":"browser-task"}}}'
 # Note: the A2A v1-preview schema uses protobuf-style enum values (ROLE_USER, not "user")
 # and parts are bare {"text":"..."} objects — no "kind" field.
 #
@@ -47,12 +47,27 @@ services:
       timeout: 3s
       retries: 15
 
+  foragent-init:
+    # One-shot ownership fix for the foragent-data named volume. The Foragent
+    # Dockerfile chowns /data to the non-root `foragent` user at build time, but
+    # Docker mounts a fresh named volume root-owned and masks the chown —
+    # FileSkillStore then hits UnauthorizedAccessException on first boot. This
+    # init container runs as root once per volume creation and mirrors the
+    # rockbot-init pattern. Subsequent boots skip the mkdir+chown if already set.
+    image: busybox:latest
+    user: root
+    command: ["sh", "-c", "mkdir -p /data/foragent/skills /data/foragent/memory && chmod -R 777 /data/foragent"]
+    volumes:
+      - foragent-data:/data/foragent
+
   foragent:
     build:
       context: .
     depends_on:
       rabbitmq:
         condition: service_healthy
+      foragent-init:
+        condition: service_completed_successfully
     ports:
       - "5210:8080"
     environment:
@@ -64,12 +79,12 @@ services:
       RabbitMq__VirtualHost: /
       Gateway__AgentName: Foragent
       Gateway__InternalAgentName: Foragent
-      Gateway__Description: "Browser agent — browser-task (generalist), learn-form-schema, execute-form-batch, fetch-page-title, extract-structured-data"
+      Gateway__Description: "Browser agent — browser-task (generalist), learn-form-schema, execute-form-batch"
       # RockBot will call Foragent with header X-Api-Key: rockbot-calls-foragent
       ApiKeys__rockbot-calls-foragent__AgentId: RockBot
       ApiKeys__rockbot-calls-foragent__DisplayName: RockBot
-      # LLM required for the extract-structured-data capability. Namespaced so
-      # Foragent can point at a different model than the RockBot side.
+      # LLM required for the browser-task planner and form-schema enrichment.
+      # Namespaced so Foragent can point at a different model than the RockBot side.
       ForagentLlm__Endpoint: ${FORAGENT_LLM_ENDPOINT:?FORAGENT_LLM_ENDPOINT is required}
       ForagentLlm__ModelId: ${FORAGENT_LLM_MODEL_ID:?FORAGENT_LLM_MODEL_ID is required}
       ForagentLlm__ApiKey: ${FORAGENT_LLM_API_KEY:?FORAGENT_LLM_API_KEY is required}
@@ -100,7 +115,7 @@ services:
       - foragent-data:/data/foragent
 
   rockbot-init:
-    image: rockylhotka/rockbot-agent:0.9.11
+    image: rockylhotka/rockbot-agent:0.9.15
     user: root
     entrypoint: ["/bin/sh", "-c"]
     command:
@@ -136,7 +151,7 @@ services:
       - ./deploy/rockbot-seed:/seed:ro
 
   rockbot:
-    image: rockylhotka/rockbot-agent:0.9.11
+    image: rockylhotka/rockbot-agent:0.9.15
     depends_on:
       rockbot-init:
         condition: service_completed_successfully
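
The updated curl smoke example nests a JSON document inside a JSON string, which is easy to mis-escape by hand. A small helper that builds the same `message/send` body programmatically may be useful for scripted smoke tests. This is a sketch that mirrors only the fields visible in the curl comment above; `build_message_send` is a hypothetical name, and nothing beyond the shown shape of the A2A request is assumed.

```python
import json


def build_message_send(skill, task_input, request_id=1, message_id="m1"):
    """Build the JSON-RPC body from the curl example for a given skill.

    task_input is serialized into a single bare text part, matching the
    text-JSON path that curl callers use.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "message/send",
        "params": {
            "message": {
                "role": "ROLE_USER",  # protobuf-style enum, not "user"
                "messageId": message_id,
                "parts": [{"text": json.dumps(task_input)}],
            },
            "metadata": {"skill": skill},
        },
    }


body = build_message_send(
    "browser-task",
    {
        "intent": "fetch the page title",
        "url": "https://example.com",
        "allowedHosts": ["example.com"],
    },
)
payload = json.dumps(body)  # string to send as the POST body
```

Per the curl comment, the request goes to `http://localhost:5210/` with the `X-Api-Key: rockbot-calls-foragent` and `Content-Type: application/json` headers.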

docs/capabilities.md

Lines changed: 22 additions & 8 deletions
@@ -8,13 +8,22 @@ invoke capabilities by name; Foragent handles the browser mechanics.
 - `browser-task` — **generalist**, spec §5.2. LLM-in-the-loop planner that
   drives a real browser to accomplish a free-form intent. Shipped in
   step 6; step 7 added skills + memory priming (spec §5.6).
-- `fetch-page-title` — specialist. Inherited from step 1/2.
-- `extract-structured-data` — specialist. Inherited from step 3.
-
-The step-4 `post-to-site` capability was removed in step 7 — the
-generalist `browser-task` plus the seeded `sites/bsky.app/login` skill
-subsume its function, and the project is still pre-public so no consumer
-needed a deprecation path.
+- `learn-form-schema` — specialist (phase-1, spec §5.5). Introspects a
+  form and returns a typed `FormSchema` persisted at `sites/{host}/forms/{slug}`.
+- `execute-form-batch` — specialist (phase-3, spec §5.5). Submits rows
+  against a learned schema, streaming per-row progress over A2A.
+
+Three v0.1/v0.2 specialists have been removed as `browser-task` subsumes
+them. The project is pre-public so no deprecation path was required:
+
+- `post-to-site` — removed in step 7. `browser-task` + the seeded
+  `sites/bsky-app/login` skill cover the use case.
+- `fetch-page-title` — removed in step 9. Was a milestone-1 smoke
+  target; `browser-task` with a simple intent produces the same result.
+- `extract-structured-data` — removed in step 9. `browser-task` with a
+  "return JSON: {…}" intent produces the same result. Its typed input
+  shape also lacked the mandatory allowlist required by spec §7.1; the
+  generalist enforces that by design.
 
 ## `browser-task` input shape
 
@@ -60,7 +69,12 @@ A JSON object in a single text part:
 ```
 
 `incomplete` means the budget was exhausted before `done`/`fail` was
-called.
+called. For extraction-style tasks, instruct the planner to return JSON
+via the `result` field — e.g. intent `"Open https://shop.example/p/42
+and return {\"name\":..., \"price_usd\":...} as JSON in the result
+field."`. The planner is not schema-enforced the way
+`extract-structured-data` used to be, so keep the target shape explicit
+in the intent.
 
 ## `browser-task` tool surface

docs/foragent-specification.md

Lines changed: 23 additions & 15 deletions
@@ -311,15 +311,15 @@ callers cheap — not to proliferate.
 | `browser-task` | Generalist | Given intent + optional URL, credential id, and allowed-hosts list, plan and drive the browser to fulfill the intent. Uses RockBot skills + memory as priming. Returns a result or a structured intermediate artifact (e.g. a learned form schema). |
 | `learn-form-schema` | Specialist (phase-1) | Given a URL and optional credential, introspect a form and return its schema — fields, types, dropdown dependencies, validation rules. Persists the schema as a skill (§5.6). Returns the schema to the caller for review. |
 | `execute-form-batch` | Specialist (phase-2) | Given a learned schema (by id or inline) and a batch of row data, submit the form once per row. Streams A2A progress updates. Handles partial failure. |
-| `fetch-page-title` | Specialist | Return the `<title>` of a URL. Inherited from milestone 2. |
-| `extract-structured-data` | Specialist | Extract structured data from a page matching a natural-language description. Inherited from milestone 3. |
 
-The v0.1 `post-to-site` capability ships in the main codebase as a
-regression test for credential handling. After step 7 it is removed
-from the advertised skill list; `browser-task` subsumes its function.
-
-The v0.1 `monitor-page` and `fill-form` capabilities fold into
-`browser-task` and do not ship as separate advertised skills.
+After step 9 the v0.2 surface is three skills. The v0.1 `post-to-site`
+capability was removed in step 7 once the seeded `bsky.app` skill +
+`browser-task` covered it; the v0.1 `fetch-page-title` and v0.1
+`extract-structured-data` specialists were removed in step 9 — the
+generalist subsumes both at the cost of 2–3× tokens per call, which
+is acceptable given zero deterministic high-volume callers today. The
+v0.1 `monitor-page` and `fill-form` capabilities fold into `browser-task`
+and do not ship as separate advertised skills either.
 
 ### 5.3 Capabilities explicitly out of scope (v1)
 
@@ -393,18 +393,21 @@ site knowledge, rather than building a Foragent-local store.
   `RockBot.Host.Abstractions` + `RockBot.Host.AgentMemoryExtensions.WithSkills()`).
   Stores site knowledge as markdown skills. Two origin categories:
   - **Human-authored skills** — operator-written primers for a site
-    (e.g. `sites/bsky.app/overview`). Treated as priming hints for the
+    (e.g. `sites/bsky-app/overview`). Treated as priming hints for the
     generalist planner.
   - **Agent-learned skills** — written by the generalist on successful
-    task completion (e.g. `sites/bsky.app/learned/login-flow`). Tagged
+    task completion (e.g. `sites/bsky-app/learned/login-flow`). Tagged
     with `metadata.source = "agent-learned"` and an importance score.
 - **`ILongTermMemory`** (file-backed, BM25 + semantic —
   `WithLongTermMemory()`). Declarative observations that don't fit the
   procedural skill shape: failed attempts, site-version notes, ambient
   facts.
 
 **Skill naming:** `sites/{host}/{phase-or-intent}` — e.g.
-`sites/bsky.app/login`, `sites/bsky.app/compose-post`. Hierarchical `/`
+`sites/bsky-app/login`, `sites/bsky-app/compose-post`. Host segments are
+sanitized (`.` → `-`) because RockBot 0.9's `FileSkillStore.ValidateName`
+rejects dots; `bsky.app` becomes `bsky-app`. Allowlists and memory
+categories keep the original dotted host. Hierarchical `/`
 nesting is supported by the store. `seeAlso` links cross-reference
 skills for the same site so retrieval surfaces a small knowledge
 cluster, not one skill at a time.
@@ -714,10 +717,15 @@ hard design questions until usage forces them.
    Resolve open question #6 (how to persist typed JSON alongside
    markdown skills) in the deliverable.
 
-9. **Deprecate subsumed specialists.** Review whether `fetch-page-title`
-   / `extract-structured-data` still pay their way or fold into
-   `browser-task` with equivalent cost. Land on the minimum advertised
-   capability set v0.2 actually needs.
+9. **Deprecate subsumed specialists.** Reviewed whether `fetch-page-title`
+   / `extract-structured-data` still paid their way vs. `browser-task` at
+   equivalent cost. Both removed: `fetch-page-title` was a milestone-1
+   smoke-test relic that `browser-task` subsumes trivially;
+   `extract-structured-data` was functionally equivalent to a `browser-task`
+   intent that asks for JSON in the `done.result` channel (cost delta
+   ~2–3× tokens per call, zero deterministic high-volume callers today),
+   and was out of spec on §7.1 mandatory allowlists. Advertised surface
+   lands at `browser-task` + `learn-form-schema` + `execute-form-batch`.
 
 Each milestone produces framework feedback. Capture it in
 `docs/framework-feedback.md` — some will be small ergonomic fixes; some
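
The skill-naming rule added to §5.6 in the diff above (dots in the host segment become hyphens, while allowlists keep the dotted host) is small enough to show directly. The Python below is an illustrative sketch only; the real helper is the C# `SkillNaming.SanitizeHost` named in the commit message, and `skill_name` is a hypothetical convenience wrapper for the `sites/{host}/{phase-or-intent}` pattern.

```python
def sanitize_host(host):
    """Replace dots so the host is a valid skill-name segment."""
    return host.replace(".", "-")


def skill_name(host, leaf):
    """Compose sites/{host}/{phase-or-intent} using the sanitized host."""
    return "sites/" + sanitize_host(host) + "/" + leaf


# The allowlist entry stays dotted; only the skill name is sanitized.
allowed_hosts = ["bsky.app"]
login_skill = skill_name(allowed_hosts[0], "login")
```

Keeping sanitization in one place matters because the commit message reports that a rejected name was swallowed silently at three separate call sites before this change.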
