feat(skillet): Skill generated by skillet's new agent-orchestrator pipeline by gricha · Pull Request #143 · getsentry/skills

gricha · 2026-05-05T01:27:50Z

This PR adds the skillet skill — the meta-skill that routes a
user to the right skillet CLI subcommand — produced clean-room
by skillet's new bundled-agent pipeline.

This PR is just the skill artifact, isolated for review. The
pipeline that produced it (the rewrite that replaces skillet's
multi-phase TypeScript pipeline with a small set of bundled
authoring agents) lives in getsentry/skillet#2.

Generation stats

- Input: spec.yaml (12 behaviors/must_nots, hand-curated)
- Pipeline: skill-writer + eval-writer in parallel, then
  skill-validator + evals-validator in parallel
- Wall-clock: 175 seconds end-to-end
- Tool calls: 6 (skill-writer) + 21 (eval-writer)
- Validators: both ok=true, 0 findings on first pass; no
  re-passes needed
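
The two-stage fan-out described in the stats above (writers in parallel, then validators in parallel) could look roughly like the sketch below. This is an illustration only, not skillet's actual orchestrator: `runPipeline`, `needsRepass`, the `Writer` and `Validator` types, and the `ValidationResult` shape are all hypothetical names, chosen to mirror the `ok=true` / findings wording used in this PR.

```typescript
// Hypothetical result shape; the real agents' return types are not shown in this PR.
interface ValidationResult {
  ok: boolean;
  findings: string[];
}

// Hypothetical agent signatures: a writer turns a spec path into an artifact,
// a validator inspects an artifact and reports findings.
type Writer = (specPath: string) => Promise<string>;
type Validator = (artifact: string) => Promise<ValidationResult>;

// A re-pass is needed if any validator failed or reported at least one finding.
function needsRepass(results: ValidationResult[]): boolean {
  return results.some((r) => !r.ok || r.findings.length > 0);
}

// Stage 1: run both writers concurrently.
// Stage 2: run both validators concurrently on the writers' output.
async function runPipeline(
  specPath: string,
  skillWriter: Writer,
  evalWriter: Writer,
  skillValidator: Validator,
  evalsValidator: Validator,
): Promise<{ skill: string; evals: string; passed: boolean }> {
  const [skill, evals] = await Promise.all([
    skillWriter(specPath),
    evalWriter(specPath),
  ]);
  const checks = await Promise.all([
    skillValidator(skill),
    evalsValidator(evals),
  ]);
  return { skill, evals, passed: !needsRepass(checks) };
}
```

Under these assumptions, the run reported here corresponds to `passed` being true after a single call, since both validators returned ok=true with 0 findings.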

Files

- SKILL.md (118 lines): router for skillet CLI commands;
  imperative voice, decision table for command selection,
  Don't section for must_nots
- spec.yaml: source of truth (skillet-managed; this is what
  the agents read)
- evals/_judges.ts: 13 canonical named judges
- evals/<id>.eval.ts: one file per spec entry, 14 cases
  total. Uses upstream vitest-evals + skillet's harness via
  @sentry/skillet/evals

Eval results against the produced SKILL.md

./dist/cli.js eval skills/skillet:

- First run: 14/14 passed
- Second run: 13/14 passed. The one failure was judge variance:
  UsesScopedPackageJudge graded a clarifying-questions response
  0.0 because the scoped package mention came at the end rather
  than the start. The agent did the right thing; the judge was
  overstrict.
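
One way to make this kind of check insensitive to where the mention appears is a deterministic substring test, sketched below. This is not the real UsesScopedPackageJudge, whose implementation is not shown in this PR; `scoreScopedPackageMention` is a hypothetical helper, and using `@sentry/skillet` as the package name in the usage note is an assumption based on the import path mentioned under Files.

```typescript
// Hypothetical position-insensitive score: 1.0 if the scoped package name
// appears anywhere in the response, 0.0 otherwise.
function scoreScopedPackageMention(
  response: string,
  scopedPackage: string,
): number {
  return response.includes(scopedPackage) ? 1.0 : 0.0;
}
```

With this sketch, a response that asks clarifying questions first and names `@sentry/skillet` only in its last sentence scores the same as one that leads with it.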

Known shape vs. what skillet's legacy pipeline produced

This branch's diff vs. the legacy-produced version (on
getsentry/skillet main) is ~263 insertions / ~286 deletions.
Tighter prose, same describeEval ids, same case shapes.

Two cases lost workspace fixtures in favor of judge-only
assertions — a slight regression to address by tightening
agents/eval-writer/references/eval-contract.md in the
skillet repo.

Reviewing this PR

You're looking at the skill artifact end-state. To replicate
the generation:

git clone https://github.com/getsentry/skillet
cd skillet
git checkout experimental/agent-orchestration
npm install && npm run build
rm -rf skills/skillet/{SKILL.md,evals}
./dist/cli.js improve skills/skillet

That regenerates the same files (modulo agent variance).

This is the `skillet` skill, the meta-skill that routes a user's intent to the right `skillet` CLI subcommand, produced clean-room by skillet's new bundled-agent pipeline (skill-writer + eval-writer + skill-validator + evals-validator).

Generated from `spec.yaml` in 175 seconds. No re-passes needed: both validators returned ok=true with 0 findings on first pass. 14 eval cases across 12 spec entries; the regenerated evals run 13-14/14 against the produced SKILL.md.

This PR is *just the skill artifact*; the pipeline that produced it lives in getsentry/skillet#2.

Files:
- SKILL.md (118 lines): router for skillet CLI commands
- spec.yaml: 12 behaviors/must_nots, source of truth
- evals/_judges.ts: 13 canonical judges
- evals/<id>.eval.ts: one per spec entry, all using the vitest-evals + @sentry/skillet/evals harness shape

Co-Authored-By: Opus 4.7 <noreply@anthropic.com>

gricha wants to merge 1 commit into main from gricha/skillet-skill-from-new-pipeline
