Commit afb4879

fix(dev,html-app): adjust skills index to align with Cloudflare Agent Skills Discovery RFC
1 parent e718677 commit afb4879

3 files changed

Lines changed: 112 additions & 29 deletions

dev/tools/AGENTS.md

Lines changed: 36 additions & 17 deletions
@@ -319,11 +319,11 @@ porting between repos stays trivial.
 ### URL shape

-| Surface | URL | Content type | Source on disk |
-|-------------------|----------------------------------------------|------------------|-----------------------------------------|
-| Tool page | `https://{{TOOLS_HOST}}/<path>` | `text/html` | `dev/tools/<name>.html` |
-| Per-tool skill | `https://{{TOOLS_HOST}}/<path>.md` | `text/markdown` | `dev/tools/<name>.skill.md` |
-| Aggregate index | `https://{{TOOLS_HOST}}/llms.txt` | `text/markdown` | generated at deploy time from .html metadata; also the destination of `/`'s `Accept: text/markdown` redirect |
+| Surface | URL | Content type | Source on disk |
+|-----------------|------------------------------------|-----------------|--------------------------------------------------------------------------------------------------------------|
+| Tool page | `https://{{TOOLS_HOST}}/<path>` | `text/html` | `dev/tools/<name>.html` |
+| Per-tool skill | `https://{{TOOLS_HOST}}/<path>.md` | `text/markdown` | `dev/tools/<name>.skill.md` |
+| Aggregate index | `https://{{TOOLS_HOST}}/llms.txt` | `text/markdown` | generated at deploy time from .html metadata; also the destination of `/`'s `Accept: text/markdown` redirect |

 Two **separate responders** per tool: one for the HTML body, one for the markdown.
 This avoids fragile content-negotiation, keeps the HTML body under
@@ -448,12 +448,12 @@ checklist asks every agent-friendly site to publish. None of them require any
 per-tool authoring; they are derived 1:1 from the same HTML registry +
 `*.skill.md` directory listing as `llms.txt`.

-| URL | Content type | Source of truth | Responder env var |
-|------------------------------------------------|---------------------|--------------------------------------------|-----------------------------------------------------|
-| `/robots.txt` | `text/plain` | `buildRobotsTxt()` in `deploy.ts` | `SECUTILS_HTML_APP_RESPONDER_ID_ROBOTS_TXT` |
-| `/sitemap.xml` | `application/xml` | `buildSitemapXml()` in `deploy.ts` | `SECUTILS_HTML_APP_RESPONDER_ID_SITEMAP_XML` |
-| `/.well-known/agent-skills/index.json` | `application/json` | `buildAgentSkillsIndex()` in `deploy.ts` | `SECUTILS_HTML_APP_RESPONDER_ID_AGENT_SKILLS_INDEX` |
-| `Link:` headers on `/` | (HTTP response headers) | hard-coded `indexLinkHeaders` in `deploy.ts` | (no extra responder; pinned via index settings) |
+| URL | Content type | Source of truth | Responder env var |
+|----------------------------------------|-------------------------|----------------------------------------------|-----------------------------------------------------|
+| `/robots.txt` | `text/plain` | `buildRobotsTxt()` in `deploy.ts` | `SECUTILS_HTML_APP_RESPONDER_ID_ROBOTS_TXT` |
+| `/sitemap.xml` | `application/xml` | `buildSitemapXml()` in `deploy.ts` | `SECUTILS_HTML_APP_RESPONDER_ID_SITEMAP_XML` |
+| `/.well-known/agent-skills/index.json` | `application/json` | `buildAgentSkillsIndex()` in `deploy.ts` | `SECUTILS_HTML_APP_RESPONDER_ID_AGENT_SKILLS_INDEX` |
+| `Link:` headers on `/` | (HTTP response headers) | hard-coded `indexLinkHeaders` in `deploy.ts` | (no extra responder; pinned via index settings) |

 #### `/robots.txt`
@@ -486,12 +486,31 @@ engines respect it as a hint, not a contract.
 #### `/.well-known/agent-skills/index.json`

 [Cloudflare's Agent Skills Discovery RFC v0.2.0](https://github.com/cloudflare/agent-skills-discovery-rfc)
-shape: `$schema` field plus a `skills` array where each entry has `name`,
-`type: "skill"`, `description` (mirrors the HTML's `su-tool-description`),
-`url` (the live `<path>.md` URL), and `sha256` of the deployed skill body.
-The hash is computed from the **substituted** Markdown body that actually
-ships, so an agent that's already cached the skill can detect updates with
-a single GET.
+shape: `$schema` URI (pinned to the canonical
+`https://schemas.agentskills.io/discovery/0.2.0/schema.json` - the spec
+requires strict clients to match it exactly) plus a `skills` array where
+each entry has:
+
+- `name` - the **frontmatter `name:` value from the SKILL.md**, not the
+  file slug. This is the canonical Agent Skills identifier (e.g.
+  `pem-certificate-decoder`, `mock-response`); the slug (`pem`, `echo`) is
+  a deploy-time path concern and would diverge from the promo site's
+  `/.well-known/agent-skills/index.json`, which keys off the same field.
+  `deploy.ts` parses the frontmatter at index build time and **fails the
+  deploy** if any skill is missing a `name:` or if two skills collide on
+  it - agents cache by name, so a collision corrupts that cache.
+- `type: "skill-md"` - the v0.2.0 RFC requires `"skill-md"` or
+  `"archive"`; strict clients silently skip unrecognized values. Earlier
+  deploys used `"skill"`, which would have made every entry invisible to
+  a literal RFC implementation.
+- `description` - mirrors the HTML's `su-tool-description` `<meta>` so
+  marketing/SEO/agent copy stays in sync from one source.
+- `url` - the live `<path>.md` URL.
+- `digest: "sha256:<hex>"` - per the RFC's "Integrity and Verification"
+  section. The hash is computed from the **substituted** Markdown body
+  that actually ships, so an agent that's already cached the skill can
+  detect updates with a single GET. Earlier deploys emitted a bare
+  `sha256: <hex>` field instead, which strict clients would not recognise.

 #### `Link:` headers on `/`
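The `digest` bullet in the AGENTS.md hunk above says an agent with a cached skill can detect updates by comparing digests. A minimal client-side sketch of that comparison follows, assuming Node's `node:crypto` module; `matchesDigest` is a hypothetical helper name and is not part of this commit or of the RFC.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper (not in the commit): returns true when `body` hashes to
// the value carried in a `digest: "sha256:<hex>"` field. Any digest string that
// is not in that exact form, including a bare hex string, is rejected.
function matchesDigest(body: string, digest: string): boolean {
  const match = /^sha256:([0-9a-f]{64})$/.exec(digest);
  if (!match) return false;
  // Same hashing call shape as deploy.ts uses when it builds the index.
  const actual = createHash("sha256").update(body, "utf-8").digest("hex");
  return actual === match[1];
}
```

A client would call this with its cached skill body and the `digest` from a freshly fetched index entry, and re-download the skill only when the result is `false`.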

dev/tools/deploy.ts

Lines changed: 55 additions & 8 deletions
@@ -250,16 +250,47 @@ function buildSitemapXml(tools: ToolMeta[], toolsHost: string): string {
 // `/.well-known/agent-skills/index.json` -- Cloudflare's Agent Skills
 // Discovery RFC v0.2.0 format (https://github.com/cloudflare/agent-skills-discovery-rfc).
-// One entry per deployed `<slug>.skill.md`. The sha256 of each skill body is
-// included so agent skill loaders can detect updates without re-fetching.
+// One entry per deployed `<slug>.skill.md`. The digest of each skill body is
+// included (as `sha256:<hex>` per the spec) so agent skill loaders can detect
+// updates without re-fetching.
+//
+// Strict v0.2.0 conformance (was wrong in earlier deploys, fixed for review
+// feedback from Cloudflare):
+// - `$schema` is the canonical `https://schemas.agentskills.io/...` URL,
+//   not the `agentskills.io/schema/...` variant.
+// - `type` is `"skill-md"` (was `"skill"`).
+// - Integrity field is `digest: "sha256:<hex>"` (was a bare `sha256: <hex>`).
+// - `name` is taken from the SKILL.md YAML frontmatter `name:` field, NOT
+//   from the file slug. The slug is a deploy-time path concern; the skill's
+//   canonical identifier (e.g. `pem-certificate-decoder`, `mock-response`)
+//   lives in the SKILL.md itself, where it must match the Agent Skills
+//   spec naming rules and stay in sync with the promo site's
+//   `/.well-known/agent-skills/index.json`, which keys off the same field.
 type SkillIndexEntry = {
   name: string;
-  type: "skill";
+  type: "skill-md";
   description: string;
   url: string;
-  sha256: string;
+  digest: string;
 };

+// Extracts the `name:` value from a SKILL.md YAML frontmatter block. The
+// frontmatter is always the first `---`-delimited block at the top of the
+// file (we generate it that way ourselves). Returns `undefined` if there is
+// no frontmatter or no `name:` line -- the caller treats that as a hard
+// error rather than silently falling back to the slug, because a wrong name
+// in the discovery index makes the skill indistinguishable from a different
+// one cached by clients keying on `name`.
+const FRONTMATTER_RE = /^---\r?\n([\s\S]*?)\r?\n---/;
+const FRONTMATTER_NAME_RE = /^name:\s*["']?([^"'\r\n]+?)["']?\s*$/m;
+
+function extractSkillName(body: string): string | undefined {
+  const block = FRONTMATTER_RE.exec(body);
+  if (!block) return undefined;
+  const name = FRONTMATTER_NAME_RE.exec(block[1])?.[1]?.trim();
+  return name || undefined;
+}
+
 function buildAgentSkillsIndex(
   tools: ToolMeta[],
   skillBodies: Map<string, string>,
@@ -269,19 +300,35 @@ function buildAgentSkillsIndex(
   // non-promoted skills are still served at `<path>.md` for direct fetching.
   const ordered = tools.filter((t) => t.promote && t.path !== "/");
   const skills: SkillIndexEntry[] = [];
+  const seenNames = new Set<string>();
   for (const t of ordered) {
     const body = skillBodies.get(t.slug);
     if (!body) continue;
+    const name = extractSkillName(body);
+    if (!name) {
+      throw new Error(
+        `agent-skills index: ${t.slug}.skill.md is missing a \`name:\` field in its YAML frontmatter`,
+      );
+    }
+    if (seenNames.has(name)) {
+      throw new Error(
+        `agent-skills index: duplicate skill name "${name}" -- two SKILL.md files share the same frontmatter \`name:\``,
+      );
+    }
+    seenNames.add(name);
+    const hex = createHash("sha256").update(body, "utf-8").digest("hex");
     skills.push({
-      name: t.slug,
-      type: "skill",
+      name,
+      type: "skill-md",
       description: t.description,
       url: `https://${toolsHost}${t.path}.md`,
-      sha256: createHash("sha256").update(body, "utf-8").digest("hex"),
+      digest: `sha256:${hex}`,
     });
   }
   const doc = {
-    $schema: "https://agentskills.io/schema/v0.2.0/index.schema.json",
+    // Canonical RFC v0.2.0 schema URL. See
+    // https://github.com/cloudflare/agent-skills-discovery-rfc for the spec.
+    $schema: "https://schemas.agentskills.io/discovery/0.2.0/schema.json",
     skills,
   };
   return JSON.stringify(doc, null, 2) + "\n";
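
To make the frontmatter parsing in the deploy.ts diff above easier to review, here is a small standalone sketch that exercises `extractSkillName` on a few inputs. The two regexes and the function body are copied from the diff; the sample skill bodies are hypothetical and written only for illustration.

```typescript
// Copied from the deploy.ts diff: first `---`-delimited block, then its `name:` line.
const FRONTMATTER_RE = /^---\r?\n([\s\S]*?)\r?\n---/;
const FRONTMATTER_NAME_RE = /^name:\s*["']?([^"'\r\n]+?)["']?\s*$/m;

function extractSkillName(body: string): string | undefined {
  const block = FRONTMATTER_RE.exec(body);
  if (!block) return undefined;
  const name = FRONTMATTER_NAME_RE.exec(block[1])?.[1]?.trim();
  return name || undefined;
}

// Hypothetical skill bodies (not taken from the repository).
const plain = "---\nname: pem-certificate-decoder\ndescription: Decode PEM\n---\n# PEM\n";
const quoted = '---\ndescription: Mock\nname: "mock-response"\n---\n# Mock\n';
const noName = "---\ndescription: No name here\n---\n# Untitled\n";
const noFrontmatter = "# Just a heading\nname: not-frontmatter\n";

console.log(extractSkillName(plain));         // pem-certificate-decoder
console.log(extractSkillName(quoted));        // mock-response
console.log(extractSkillName(noName));        // undefined
console.log(extractSkillName(noFrontmatter)); // undefined
```

The last case matters for the hard-error path: a `name:` line outside the leading frontmatter block is ignored, so `buildAgentSkillsIndex` would throw for that file instead of indexing it under an unintended name.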

e2e/tools/registry.spec.ts

Lines changed: 21 additions & 4 deletions
@@ -114,16 +114,33 @@ test.describe('Tools registry - cross-cutting agent-discovery artefacts', () =>
     expect(r.headers()['content-type'] ?? '').toMatch(/application\/json/);
     const doc = await r.json();
     expect(doc).toHaveProperty('$schema');
-    expect(doc.$schema).toMatch(/agentskills/);
+    // The `$schema` URI is an opaque identifier per RFC v0.2.0; strict
+    // clients MUST match it exactly. We pin to the canonical Cloudflare URL.
+    expect(doc.$schema).toBe('https://schemas.agentskills.io/discovery/0.2.0/schema.json');
     expect(Array.isArray(doc.skills)).toBe(true);
     expect(doc.skills.length, 'should list at least one skill').toBeGreaterThan(0);
     for (const skill of doc.skills) {
-      expect(skill, 'every entry must be type=skill').toMatchObject({ type: 'skill' });
-      expect(skill.name, 'name must be non-empty').toMatch(/\S/);
+      // RFC v0.2.0: every entry MUST declare `type: "skill-md"` or `"archive"`.
+      // Earlier deploys used a non-spec `"skill"` value, which strict clients
+      // would silently skip; we now emit the spec-compliant value.
+      expect(skill, 'every entry must be type=skill-md').toMatchObject({ type: 'skill-md' });
+      // Agent Skills naming spec: 1-64 chars, lowercase alphanumeric + hyphens,
+      // no leading/trailing/consecutive hyphens.
+      expect(skill.name, 'name must conform to Agent Skills naming spec').toMatch(
+        /^[a-z0-9]+(-[a-z0-9]+)*$/,
+      );
+      expect(skill.name.length).toBeLessThanOrEqual(64);
       expect(skill.description, 'description must be non-empty').toMatch(/\S/);
       expect(skill.url, 'url must point at our tools host .md').toMatch(new RegExp(`^https://${TOOLS_HOST}/.+\\.md$`));
-      expect(skill.sha256, 'sha256 must be 64 hex chars').toMatch(/^[0-9a-f]{64}$/);
+      // RFC v0.2.0 §"Integrity and Verification": digest is `sha256:<hex>`,
+      // not a bare hex string under a `sha256` field.
+      expect(skill.digest, 'digest must be sha256:<hex>').toMatch(/^sha256:[0-9a-f]{64}$/);
+      expect(skill).not.toHaveProperty('sha256');
     }
+    // The skill `name` MUST be unique across the index: agents cache by name,
+    // and duplicates corrupt that cache.
+    const names = doc.skills.map((s: { name: string }) => s.name);
+    expect(new Set(names).size, 'skill names must be unique').toBe(names.length);
     // Non-promoted tools must not advertise their skill in the discovery
     // index (their `<path>.md` is still served for direct fetching).
     const urls = doc.skills.map((s: { url: string }) => s.url);
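
The naming and uniqueness assertions in the spec above can also be expressed as plain functions, which may be handy for checking names before a deploy runs. This is only a sketch: `isValidSkillName` and `findDuplicateNames` are hypothetical helpers, and the pattern and the 64-character limit are taken from the test, not independently from the Agent Skills spec.

```typescript
// Same pattern the e2e test asserts: lowercase alphanumerics separated by
// single hyphens, with no leading, trailing, or consecutive hyphens.
const SKILL_NAME_RE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Hypothetical helper: true when `name` satisfies the pattern and is 1-64 chars.
function isValidSkillName(name: string): boolean {
  return name.length >= 1 && name.length <= 64 && SKILL_NAME_RE.test(name);
}

// Hypothetical helper: returns each name that occurs more than once, listed once.
function findDuplicateNames(names: string[]): string[] {
  const seen = new Set<string>();
  const dupes = new Set<string>();
  for (const n of names) {
    if (seen.has(n)) dupes.add(n);
    seen.add(n);
  }
  return Array.from(dupes);
}
```

An empty result from `findDuplicateNames` corresponds to the test's `new Set(names).size === names.length` check passing.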
