Commit 5892713
Step 9: deprecate subsumed specialists (v0.2 complete) (#14)
* Step 9: deprecate subsumed specialists (v0.2 complete)
Remove fetch-page-title and extract-structured-data from the advertised
skill set. Both are reachable via browser-task — page-title as a trivial
intent, structured extraction via a "return JSON: {...}" instruction
carried in the planner's done(result=...) channel. Cost delta is 2-3x
tokens per call, acceptable given zero deterministic high-volume
callers today. extract-structured-data was also out of spec on §7.1 —
it called the no-argument CreateSessionAsync overload and accepted any
host. The generalist enforces allowlists by design.
Advertised v0.2 surface lands at three skills: browser-task,
learn-form-schema, execute-form-batch.
- Delete FetchPageTitleCapability, ExtractStructuredDataCapability,
and the shared CapabilityInput URL/description parser (no other
consumers). browser-task has its own BrowserTaskInput; form
capabilities have their own input classes.
- Delete the session-level one-shot helpers that only the removed
specialists used: IBrowserSession.FetchPageTitleAsync,
CapturePageSnapshotAsync, PageSnapshot, PageSnapshotSource.
- Delete the corresponding tests — 7 unit tests for the capabilities,
the PlaywrightBrowserSessionTests + PageSnapshotTests integration
suites, and the ExtractStructuredDataIntegrationTests real-LLM
benchmark. BrowserTaskIntegrationTests remains the real-LLM surface.
- Trim StubBrowserSessionFactory + FakeAgentBrowserSession to match
the pruned IBrowserSession.
Update metadata: deploy/rockbot-seed/*.json, docker-compose.yml
description + curl smoke example, .env.example comments, Program.cs
comment, docs/capabilities.md, spec §5.2 capability table and §9.1
step-9 description, CLAUDE.md Status + Browser + Capabilities
sections, framework-feedback step-9 section. Version bumped
0.2.0-alpha.8 → 0.2.0-alpha.9.
Tests: 48 passed (46 Agent unit + 1 Forms integration + 1 placeholder),
3 real-LLM BrowserTaskIntegrationTests skipped as expected. Build
clean on Release.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Step 9 follow-ups: fixes surfaced during end-to-end validation
Running RockBot → Foragent with a real browser-task (MacBook-price search
across apple.com + bestbuy.com) surfaced three pre-existing issues that
blocked step 9's claimed end-to-end validation. Fixing them here on the
step-9 branch keeps the PR's test plan honest.
1. BrowserTaskPriming required IEmbeddingGenerator (DI resolution bug)
The primary-constructor parameter was annotated nullable
(IEmbeddingGenerator<string, Embedding<float>>?), but MSDI ignores C#
nullable annotations — it only honors default parameter values.
Reordered to put embeddingGenerator last with = null so MSDI treats
it as optional. Spec §5.6 says missing embeddings should downgrade
to BM25-only retrieval; that claim is now actually true. Two test
callers updated to drop the explicit embeddingGenerator: null arg.
2. Skill names with dotted hosts failed silently
RockBot 0.9's FileSkillStore.ValidateName rejects '.' — every real
host (bsky.app, apple.com, example.com) threw ArgumentException on
save. BskySeedSkillService swallowed the throw as a startup warning,
TryWriteLearnedSkillAsync swallowed it on the error path, and form
schemas just never persisted. Added SkillNaming.SanitizeHost that
replaces '.' → '-' (bsky.app → bsky-app) and applied it at three
call sites: BskySeedSkillService, BrowserTaskCapability.
TryWriteLearnedSkillAsync, LearnFormSchemaCapability.DeriveSkillName.
Allowlist matching and memory-search categories keep the original
dotted host — only skill names need sanitization. Test assertions
(BrowserTaskCapabilityTests, BskySeedSkillServiceTests,
LearnFormSchemaCapabilityTests) updated to the sanitized names;
skill-optimize.md directive examples updated so the dream loop
produces valid names.
3. Fresh named volume masks Dockerfile chown
The Foragent Dockerfile chowns /data to the non-root foragent user
(uid 1655) at image-build time, but Docker mounts a fresh named
volume root-owned, masking the build-time chown. Added a
foragent-init busybox one-shot (mirroring rockbot-init) that runs
chmod -R 777 /data/foragent on volume creation.
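A reflection-only sketch of why the reorder in item 1 helps: a nullable annotation leaves no default value on the constructor parameter, while = null does, and a default value is the only signal the message above says MSDI honors. IEmbedder, PrimingBefore, PrimingAfter, and CtorProbe are hypothetical stand-ins for illustration (the real types are IEmbeddingGenerator and BrowserTaskPriming), and the sketch deliberately does not call MSDI itself.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for IEmbeddingGenerator<string, Embedding<float>>.
public interface IEmbedder { }

public sealed class PrimingBefore
{
    // Nullable annotation only: reflection still reports a required parameter.
    public PrimingBefore(IEmbedder? embedder) { }
}

public sealed class PrimingAfter
{
    // Default value: reflection reports the parameter as optional.
    public PrimingAfter(IEmbedder? embedder = null) { }
}

public static class CtorProbe
{
    // True when the type's single public constructor has exactly one
    // parameter and that parameter declares a default value.
    public static bool HasOptionalParameter(Type type) =>
        type.GetConstructors().Single().GetParameters().Single().HasDefaultValue;
}
```

CtorProbe.HasOptionalParameter(typeof(PrimingBefore)) is false and the same call on PrimingAfter is true, which is the difference a container inspecting constructor metadata can act on.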
Docs updated: CLAUDE.md Status + Learning-substrate sections,
docs/capabilities.md, spec §5.6 skill-naming paragraph (calls out the
sanitization rule), framework-feedback step-9 follow-up section with
three framework observations (MSDI nullable footgun, validator's dot
rejection making real hosts fail, named-volume permissions pattern).
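The sanitization rule from item 2 is small enough to sketch directly. This is a minimal illustration under the assumption that the helper does nothing beyond the '.' → '-' replacement described above; the class name SkillNamingSketch and the argument check are additions for the example, not the actual SkillNaming code.

```csharp
using System;

public static class SkillNamingSketch
{
    // Replaces every '.' in a host with '-' so the result passes a
    // validator that rejects dots (bsky.app -> bsky-app).
    public static string SanitizeHost(string host)
    {
        if (string.IsNullOrWhiteSpace(host))
            throw new ArgumentException("Host must be non-empty.", nameof(host));

        return host.Replace('.', '-');
    }
}
```

Only the skill name passes through this helper; allowlist matching and memory-search categories would keep using the original dotted host, as the message above states.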
Tests: 48 passed / 3 LLM-gated skipped. End-to-end smoke: RockBot
dispatches browser-task to Foragent over the bus; Foragent plans 2
steps (navigate + snapshot), emits done with JSON result, reply lands
on user.response.RockBot.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Step 9 follow-ups: A2A DataPart input + agent-card hints
When RockBot's LLM called invoke_agent with free-form prose (message=
"...allowedHosts: ['*']..."), Foragent's three input parsers kept
rejecting with 'Missing allowedHosts' — they only consumed text parts
and expected a JSON object. RockBot 0.9.11+ supports structured input
via an A2A DataPart (AgentMessagePart{Kind="data", Data=<json>}), but
Foragent never advertised that it consumed data parts and the invoke_agent
tool description steered the LLM to omit 'data' unless the target
"is known to consume data." Result: the LLM kept retrying the same
rejected call in a loop.
Fix spans three surfaces:
1. Parsers accept DataPart. BrowserTaskInput, LearnFormSchemaInput, and
ExecuteFormBatchInput now look for a Kind="data" part first and use
its Data string as the JSON source. Text-JSON fallback stays (curl
callers), and for browser-task, a prose text part serves as the
intent fallback when the data part doesn't supply one. Metadata
overrides remain.
2. Skill descriptions explicitly direct callers to use the data
parameter. Each SkillDefinition.Description now leads with "PASS
INPUT AS AN A2A DATA PART (a structured JSON object), not as prose
inside the text message. When calling via RockBot's invoke_agent,
populate the 'data' parameter with this object." Matching entries
in deploy/rockbot-seed/well-known-agents.json updated so the LLM
sees the same guidance through list_known_agents.
3. Tests. Four new unit tests: one per input parser verifying a
DataPart with JSON is consumed; one for browser-task's text-as-
intent fallback when the data part omits intent. TestContext
gained RequestWithData(...) to build the dual-part shape RockBot's
invoke_agent produces.
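The parser precedence described in item 1 (data part first, text-JSON fallback, prose text as intent fallback) might look roughly like the following. PartSketch, TaskInputSketch, and InputParserSketch are hypothetical types invented for this example; the real AgentMessagePart and BrowserTaskInput shapes are not shown in this commit message, and metadata overrides are left out.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical, simplified stand-ins for the A2A message part and parsed input.
public sealed record PartSketch(string Kind, string? Text = null, string? Data = null);
public sealed record TaskInputSketch(string? Intent, IReadOnlyList<string> AllowedHosts);

public static class InputParserSketch
{
    public static TaskInputSketch Parse(IReadOnlyList<PartSketch> parts)
    {
        string? dataJson = parts.FirstOrDefault(p => p.Kind == "data")?.Data;
        string? text = parts.FirstOrDefault(p => p.Kind == "text")?.Text;

        // Prefer the data part; otherwise the text part must itself be JSON (curl callers).
        string json = dataJson ?? text
            ?? throw new ArgumentException("No data or text part supplied.");

        JsonDocument doc;
        try { doc = JsonDocument.Parse(json); }
        catch (JsonException) { throw new ArgumentException("Input is not a JSON object."); }

        using (doc)
        {
            JsonElement root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object)
                throw new ArgumentException("Input is not a JSON object.");

            List<string> hosts =
                root.TryGetProperty("allowedHosts", out JsonElement h) && h.ValueKind == JsonValueKind.Array
                    ? h.EnumerateArray().Select(e => e.GetString() ?? "").ToList()
                    : throw new ArgumentException("Missing allowedHosts");

            string? intent =
                root.TryGetProperty("intent", out JsonElement i) && i.ValueKind == JsonValueKind.String
                    ? i.GetString()
                    : null;

            // Dual-part shape: prose text fills in the intent when the data part omits it.
            if (intent is null && dataJson is not null && !string.IsNullOrWhiteSpace(text))
                intent = text;

            return new TaskInputSketch(intent, hosts);
        }
    }
}
```

With a data part of {"allowedHosts":["apple.com"]} and a text part of "find MacBook prices", this sketch yields the text as the intent; with only a text part containing JSON, it parses that JSON as before.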
Image bumped to rockylhotka/rockbot-agent:0.9.14 — softens the
invoke_agent 'data' tool description upstream, complementing the
skill-description hints on the Foragent side. CLAUDE.md Status
paragraph updated.
Docs: CLAUDE.md Capabilities section gains a note on the DataPart
contract. framework-feedback step-9 follow-up section extended with
the three-surface lesson (sender tool description ↔ target skill
description ↔ target parser all need to agree on the canonical shape).
Tests: 52 passed / 3 LLM-gated skipped. Build clean. Curl smoke
(text-JSON path) returns valid JSON via browser-task unchanged.
Live Blazor end-to-end test is next, against the updated 0.9.14
rockbot image.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Step 9 follow-ups: A2A task cancellation + self-teaching errors
RockBot 0.9.15 now publishes agent.task.cancel.{agentName} messages when
a wisp's local state fails after dispatching an A2A task (the duplicate-
dispatch scenario observed during step-9 validation). Foragent's
previous behavior inherited the framework's default
AgentTaskCancelHandler, which always replies TaskNotCancelable because
it assumes stateless agents. Foragent is stateful — a browser task is
potentially minutes long — so leaving the default would orphan browser
runs.
Implementation:
- InFlightTaskRegistry (singleton): ConcurrentDictionary<taskId,
CancellationTokenSource> with Register/TryCancel/Remove. Register
returns a linked CT that fires on either external cancel or the
parent message CT. Redelivered task ids cancel the prior
registration before replacing it, so stale work unwinds.
- ForagentTaskHandler wraps the capability's AgentTaskContext so the
CT observed via context.MessageContext.CancellationToken is the
linked one from the registry. Capabilities observe cancellation
without any signature change.
- ForagentCancelHandler (IMessageHandler<AgentTaskCancelRequest>):
on match calls TryCancel and publishes nothing (the running task's
own terminal reply is the acknowledgment); on miss publishes
AgentTaskError{Code=TaskNotFound}. Registered via
agent.HandleMessage<AgentTaskCancelRequest, ForagentCancelHandler>()
after AddA2A — last AddScoped wins, overriding the default.
- 11 new unit tests across registry, cancel handler, and task-handler
integration (parent-cancel → linked CT fires, external cancel →
linked CT fires, Register/Remove are paired via finally, so Remove
drops the registration even when the capability throws).
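The registry behavior listed above (linked token, cancel-on-redelivery, TryCancel/Remove) can be sketched as follows. This is an illustration built only from the description in this message, not the committed InFlightTaskRegistry; the class name, string task ids, and the ObjectDisposedException guard are assumptions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class InFlightTaskRegistrySketch
{
    private readonly ConcurrentDictionary<string, CancellationTokenSource> _inFlight = new();

    // Returns a token that fires on external cancel or when the parent token fires.
    // A redelivered task id cancels the prior registration before replacing it.
    public CancellationToken Register(string taskId, CancellationToken parent)
    {
        var linked = CancellationTokenSource.CreateLinkedTokenSource(parent);
        _inFlight.AddOrUpdate(taskId, linked, (_, prior) =>
        {
            prior.Cancel(); // idempotent, so a repeated factory call is harmless
            return linked;
        });
        return linked.Token;
    }

    // True when a registration existed and was signalled; false on a miss.
    public bool TryCancel(string taskId)
    {
        if (!_inFlight.TryGetValue(taskId, out var cts)) return false;
        try { cts.Cancel(); }
        catch (ObjectDisposedException) { return false; } // lost a race with Remove
        return true;
    }

    // Intended to be called from the task handler's finally block.
    public void Remove(string taskId)
    {
        if (_inFlight.TryRemove(taskId, out var cts)) cts.Dispose();
    }
}
```

A handler would call Register before running the capability and Remove in a finally block. A cancel handler would publish nothing when TryCancel returns true and a TaskNotFound error when it returns false. One caveat of this simplified version: a stale Remove from a superseded run would also drop the newer registration for the same id, so a production registry would likely need to compare sources before removing.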
Also in this commit, incorporating earlier step-9 follow-ups for the
same RockBot 0.9.x interop round:
- Self-teaching errors. When a parser rejects for a missing required
field, the response now tells the LLM exactly how to fix the call:
"Pass inputs as a JSON object on the A2A DataPart — in RockBot's
invoke_agent tool, that means filling the 'data' parameter, NOT
adding fields to the 'message' text. Example data: {...}." Observed
behavior: LLMs that ignore skill descriptions do read error replies
and adjust subsequent calls. Applied to all three parsers.
- Docker image bumped to rockylhotka/rockbot-agent:0.9.15 — brings
(a) invoke_agent's structured 'data' parameter (0.9.11), (b)
softened tool description encouraging DataPart usage (0.9.14), and
(c) the cancel-publisher that this commit consumes (0.9.15).
CLAUDE.md Status section updated accordingly.
Framework-feedback step-9 follow-up section extended with the cancel-
handler-override pattern as a candidate for upstream WithTaskCancellation
ergonomics (non-blocking — ~50 LOC across consumers isn't unbearable).
Tests: 63 passed / 3 LLM-gated skipped. Build clean. Foragent starts
cleanly on fresh volumes; agent.task.cancel.Foragent subscription
verified active at boot.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 9e2023f commit 5892713
44 files changed
Lines changed: 999 additions & 1207 deletions
File tree
- deploy/rockbot-seed
- docs
- src
- Foragent.Agent
- directives
- Foragent.Browser
- Foragent.Capabilities
- BrowserTask
- Forms
- tests
- Foragent.Agent.Tests
- BrowserTask
- Forms
- Foragent.Browser.Tests