Agent Spec Completeness Checklist

Run through this before deploying any agent. Every NO is a failure mode you haven't addressed.


IDENTITY

  • Does the agent have a specific function described in one sentence?
  • Could you distinguish this agent's output from a generic chatbot's?
  • If client-facing, is there a brand name separate from your personal identity?
  • Have you defined what the agent is NOT? (scope boundaries)

PLATFORMS

  • Is every platform the agent interacts with listed?
  • Are authentication requirements noted (key names, not keys)?
  • Are platform-specific constraints documented (rate limits, blocked endpoints, format requirements)?
  • Could someone else deploy this agent on these platforms without asking you questions?
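
One way to record key names without recording keys is to list the environment variable names the agent expects and check for them at deploy time. A minimal sketch; the variable names below are hypothetical placeholders, not names any real platform requires:

```python
import os

# Hypothetical key NAMES the spec would list. The values live only in the environment.
REQUIRED_KEY_NAMES = ["EXAMPLE_CMS_API_KEY", "EXAMPLE_CHAT_BOT_TOKEN"]

def missing_keys(key_names, environ=None) -> list[str]:
    """Return the names that are unset or empty, without ever reading out a value."""
    environ = os.environ if environ is None else environ
    return [name for name in key_names if not environ.get(name)]

# Pass a plain dict to try it without touching the real environment.
print(missing_keys(REQUIRED_KEY_NAMES, {"EXAMPLE_CMS_API_KEY": "set"}))
```

If this returns an empty list on someone else's machine, they had what they needed to deploy without asking you.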

TRIGGER

  • Is the trigger type explicit (manual, cron, event)?
  • If cron: have you calculated the daily/monthly token cost of the schedule?
  • If event: is the trigger condition specific enough to avoid firing on noise?
  • Can you predict exactly when this agent will run in the next 24 hours?
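
The cron cost question is plain arithmetic: runs per day times tokens per run, then scaled to a month. A rough sketch, assuming illustrative numbers for tokens per run and price per million tokens (neither comes from a real provider):

```python
def schedule_cost(runs_per_day: int, tokens_per_run: int,
                  usd_per_million_tokens: float, days_per_month: int = 30) -> dict:
    """Estimate the token volume and spend of a fixed schedule."""
    daily_tokens = runs_per_day * tokens_per_run
    daily_usd = daily_tokens / 1_000_000 * usd_per_million_tokens
    return {
        "daily_tokens": daily_tokens,
        "monthly_tokens": daily_tokens * days_per_month,
        "daily_usd": round(daily_usd, 2),
        "monthly_usd": round(daily_usd * days_per_month, 2),
    }

# Hypothetical: an hourly cron job using about 20,000 tokens per run at $5 per million tokens.
estimate = schedule_cost(runs_per_day=24, tokens_per_run=20_000, usd_per_million_tokens=5.0)
print(estimate)
```

Plug in your own schedule before deploying. If the monthly figure surprises you, the schedule is wrong, not the math.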

WORKFLOW

  • Is every step numbered?
  • Are there explicit decision points for ambiguous situations?
  • Does the workflow include "flag for human review" at the right moments?
  • Is the output location specified (file path, naming convention)?
  • If the agent followed these steps literally and did nothing else, would you get a usable deliverable?

OUTPUT STANDARDS

  • Are format requirements specific (markdown, plain text, JSON, etc.)?
  • Are length constraints given as ranges, not vibes?
  • Could you use these standards to reject a bad deliverable with a specific reason?
  • Have you described what bad output looks like?
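
Standards that can reject a deliverable "with a specific reason" can usually be written as checks. A minimal sketch, assuming a hypothetical 300-500 word range and an optional JSON requirement; swap in whatever your spec actually states:

```python
import json

def review_deliverable(text: str, min_words: int = 300, max_words: int = 500,
                       must_be_json: bool = False) -> list[str]:
    """Return a list of specific rejection reasons; an empty list means it passes."""
    reasons = []
    word_count = len(text.split())
    if not min_words <= word_count <= max_words:
        reasons.append(
            f"length is {word_count} words; required range is {min_words}-{max_words}"
        )
    if must_be_json:
        try:
            json.loads(text)
        except json.JSONDecodeError as err:
            reasons.append(f"not valid JSON: {err.msg}")
    return reasons

print(review_deliverable("too short"))
```

If you can't express a standard as a check like this, it is probably still a vibe.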

TOOL PERMISSIONS

  • Is this an explicit allowlist (not "use whatever you need")?
  • For each tool: is the permitted use case stated?
  • Is everything NOT on the list denied by default?
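
An explicit allowlist with default deny can be sketched as a lookup that fails for anything unlisted. The tool names and use cases here are hypothetical examples, not a real agent framework's API:

```python
# Hypothetical allowlist: tool name -> the one use case it is permitted for.
ALLOWED_TOOLS = {
    "web_search": "look up public sources for the research brief",
    "file_write": "save the finished draft to the output folder",
}

def check_tool(tool_name: str) -> str:
    """Return the permitted use case, or raise if the tool is not on the allowlist."""
    if tool_name not in ALLOWED_TOOLS:
        # Default deny: anything unlisted is refused, no exceptions to maintain.
        raise PermissionError(f"tool '{tool_name}' is not on the allowlist")
    return ALLOWED_TOOLS[tool_name]

print(check_tool("web_search"))
```

The point of the shape is that adding a tool is a deliberate edit to the list, and forgetting one fails closed.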

STANDING LIMITS

  • Are all hard NOs written as prohibitions, not suggestions?
  • Does the list cover publishing, spending, client interaction, and scope boundaries?
  • If the agent violated any of these, would it cause a real problem?

TOKEN DISCIPLINE

  • Does the agent know when to stop iterating?
  • Does the agent know when to flag uncertainty instead of guessing?
  • Is there a cost ceiling or iteration cap?
  • Would this section prevent a 10x budget overrun on a single run?
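
An iteration cap and a cost ceiling can be sketched as a loop that stops on whichever limit is hit first. This assumes a hypothetical `step` callable that returns whether the work is done and how many tokens it used; the default limits are placeholders:

```python
def run_with_limits(step, max_iterations: int = 5, token_ceiling: int = 50_000) -> dict:
    """Call step() until it reports done or a limit is hit.

    step is a hypothetical callable returning (done: bool, tokens_used: int).
    """
    total_tokens = 0
    for iteration in range(1, max_iterations + 1):
        done, tokens_used = step()
        total_tokens += tokens_used
        if done:
            return {"status": "done", "iterations": iteration, "tokens": total_tokens}
        if total_tokens >= token_ceiling:
            # Checked after each step, so the overrun is at most one step's worth.
            return {"status": "stopped: token ceiling", "iterations": iteration,
                    "tokens": total_tokens}
    return {"status": "stopped: iteration cap", "iterations": max_iterations,
            "tokens": total_tokens}

# A step that never finishes: the ceiling stops it instead of the budget.
print(run_with_limits(lambda: (False, 20_000)))
```

Either stop status is a signal to flag for human review, not to retry with a higher limit.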

FINAL CHECK

  • Read the entire spec as if you're the agent seeing it for the first time. Does it make sense without any context you haven't written down?
  • Is the spec short enough to fit in a context window without crowding out the actual work? (Target: under 1000 words for most agents)
  • Have you removed every sentence that says "be helpful" or "use your best judgment"? Those aren't instructions. They're wishes.
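
The last two checks are mechanical enough to automate. A minimal sketch, assuming the 1000-word target above and treating the two quoted phrases as the only wishes to look for; extend the tuple with your own:

```python
WISH_PHRASES = ("be helpful", "use your best judgment")

def audit_spec(spec_text: str, max_words: int = 1000) -> list[str]:
    """Return a list of problems found in the spec; empty means both checks pass."""
    problems = []
    word_count = len(spec_text.split())
    if word_count >= max_words:
        problems.append(f"spec is {word_count} words; target is under {max_words}")
    lowered = spec_text.lower()
    for phrase in WISH_PHRASES:
        if phrase in lowered:
            problems.append(f'contains the wish "{phrase}"')
    return problems

print(audit_spec("Be helpful and use your best judgment."))
```

The first check, reading the spec cold as the agent would, still has to be done by a person.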