---
id: self-documenting-cli-ai-tools
created: 2025-03-27
modified: 2025-03-27
type: research
status: active
sources:
  - https://clig.dev
  - https://oclif.io
  - https://cobra.dev
  - https://github.com/oclif/core
  - https://spec.openapis.org/oas/latest.html
  - https://json-schema.org
  - https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html
  - https://www.gnu.org/prep/standards/standards.html
  - https://man.freebsd.org/cgi/man.cgi?query=sysexits
  - https://modelcontextprotocol.io/specification
  - https://ietf.org/blog/agentic-ai-standards
  - https://arxiv.org/abs/2603.24709
  - https://arxiv.org/abs/2603.15309
  - https://arxiv.org/abs/2304.03442
  - https://arxiv.org/abs/2312.11444
  - https://git-scm.com/docs/git
  - https://kubernetes.io/docs/reference/kubectl/
  - https://docs.aws.amazon.com/cli/latest/userguide/
  - https://cli.github.com/manual/
  - https://click.palletsprojects.com/
  - https://typer.tiangolo.com/
  - https://docs.rs/clap/latest/clap/
---

# Research: Best Practices for Self-Documenting CLI Tools That AI Agents Can Learn to Use

## Executive Summary

This research investigates how to design CLI tools that are self-documenting and easily learnable by AI agents. It examines machine-readable documentation formats, help-system design patterns, AI agent interaction behaviors, industry standards, exemplary real-world tools, and auto-discovery mechanisms.

**Key Finding:** There is currently **no industry-wide standard for AI-consumable CLI documentation**, which is both a challenge and an opportunity. Mature standards such as POSIX.1-2017 and the GNU Coding Standards provide solid foundations for human-readable documentation, but the gap for machine-readable formats remains largely unfilled.

**Most AI-Friendly Approaches:**
1. **OCLIF Manifest format** (9/10 AI-friendliness) - Complete structured metadata, but framework-specific
2. **Cobra documentation generation** (8/10) - Widely adopted; generates multiple formats
3. **JSON Schema** (7/10) - Universal, but requires custom CLI-specific schema development

**Critical Discovery:** Current LLMs fail on multi-step CLI orchestration tasks, with **no model achieving above 20% task completion** when strict constraint adherence is required. CLI tools must be designed with explicit parameter typing, structured output options, and dry-run capabilities to be AI-friendly.

---

## Key Findings

### 1. Machine-Readable Documentation Formats Vary Widely in AI-Friendliness

**Evidence:** Research into RQ1 (Machine-Readable Formats)

Several structured formats exist for CLI documentation, with varying degrees of AI parseability:

| Format | AI-Friendliness Score | Pros | Cons |
|--------|----------------------|------|------|
| **OCLIF Manifest** | 9/10 | Complete metadata, type info, relationships | Node.js ecosystem only |
| **Cobra JSON** | 8/10 | Widely used, multiple output formats | Go-specific, requires code gen |
| **JSON Schema** | 7/10 | Universal, strong validation | No CLI-native concepts |
| **OpenAPI** | 5/10 | Excellent tooling | HTTP-centric, mismatches CLI semantics |
| **CLIG Guidelines** | 4/10 | Good conventions | Not a structured format |

**Implication:** Framework-native formats provide the richest metadata but limit ecosystem choice. For maximum interoperability, JSON Schema offers the best lingua franca despite requiring custom extensions for CLI-specific concepts like subcommands and shell completion.
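To make the comparison concrete, here is a hedged sketch of what a JSON-Schema-style manifest for a single hypothetical `deploy` subcommand could look like. Every field name below is illustrative; no published standard defines them:

```python
import json

# Illustrative manifest for a hypothetical "deploy" subcommand.
# The field names (flags, exitCodes, etc.) are invented for this sketch;
# a real standard would need to define them.
manifest = {
    "name": "deploy",
    "summary": "Deploy a service to an environment",
    "flags": {
        "env": {"type": "string", "enum": ["staging", "prod"], "required": True},
        "dry-run": {"type": "boolean", "default": False},
    },
    "exitCodes": {"0": "success", "64": "usage error (EX_USAGE)"},
}

print(json.dumps(manifest, indent=2))
```

The point of the sketch is that type information, enumerated values, and exit-code semantics are all explicit, which is exactly what prose help text leaves implicit.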

**Link:** See `.knowledge/notes/research-cli-ai-docs/rq1-machine-readable-formats.md` for complete format specifications and examples.

---

### 2. AI Agents Struggle with Multi-Step CLI Orchestration

**Evidence:** Research into RQ3 (AI Agent Interaction Patterns)

Current research reveals significant challenges in AI CLI tool usage:

- **No model achieves >20% task completion** when strict constraint adherence is required (CCTU benchmark, arXiv:2603.15309)
- **>50% constraint violation rate** across resource and response dimensions
- **Parameter value errors** account for a significant portion of failures
- **Limited self-refinement capacity** - LLMs cannot effectively self-correct even after receiving detailed feedback

**AI-Agent Documentation Prioritization (by utility):**
1. **Tier 1:** Structured interface definitions (JSON Schema, function signatures)
2. **Tier 2:** Usage examples and patterns
3. **Tier 3:** Inline help/man pages
4. **Tier 4:** Web documentation

**Implication:** CLI tools must provide explicit parameter typing, structured output options, dry-run capabilities, and comprehensive examples. Ambiguity is the enemy of AI usability.
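As an illustration of explicit parameter typing, a minimal Python `argparse` sketch (tool and flag names are hypothetical) shows how types, enumerated choices, and a dry-run flag can be validated before any tool logic runs:

```python
import argparse

parser = argparse.ArgumentParser(prog="mytool")
parser.add_argument("--count", type=int, default=1,
                    help="number of retries (integer)")
parser.add_argument("--format", choices=["json", "table"], default="table",
                    help="output format")
parser.add_argument("--dry-run", action="store_true",
                    help="print planned actions without executing them")

# By the time parse_args returns, types and choices have been validated;
# an out-of-range value fails fast with a usage error instead of surfacing
# deep inside the tool, where an agent cannot diagnose it.
args = parser.parse_args(["--count", "3", "--format", "json", "--dry-run"])
print(args.count, args.format, args.dry_run)
```

Failing fast at the parser boundary matters for agents precisely because of the limited self-refinement capacity noted above: a structured usage error is recoverable, a silent misbehavior is not.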

**Link:** See `.knowledge/notes/research-cli-ai-docs/rq3-ai-agent-interaction.md` for detailed failure modes and research citations.

---

### 3. A Critical Gap Exists: No Standard for AI-Consumable CLI Documentation

**Evidence:** Research into RQ4 (Standards and Specifications)

While mature standards exist for human-readable documentation:
- **POSIX.1-2017** - 14 utility syntax guidelines
- **GNU Coding Standards** - Required `--version`, `--help`, long options
- **clig.dev** - Modern comprehensive CLI guidelines

**There is NO existing standard for:**
- Machine-readable CLI documentation
- AI-consumable command descriptions
- Structured help output formats
- Tool description schemas for AI agents

**Emerging Initiatives:**
- **Model Context Protocol (MCP)** - JSON Schema-based tool definitions (most relevant)
- **IETF Agentic AI Communications** - Exploring AI agent interoperability standards

**Implication:** This represents a significant opportunity to define new conventions, potentially through:
1. Extending existing tools to generate structured output (e.g., `--help-json`)
2. Following MCP's pattern for tool definitions
3. Proposing new conventions through standards bodies
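Option 1 could plausibly be retrofitted onto existing tools. The sketch below walks an `argparse` parser's registered actions to emit a structured description; it reads the internal `_actions` attribute, so treat it as an illustration of the idea rather than a supported API:

```python
import argparse
import json

parser = argparse.ArgumentParser(prog="mytool", description="Example tool")
parser.add_argument("--env", choices=["staging", "prod"], required=True)
parser.add_argument("--verbose", action="store_true")

def help_json(parser):
    """Emit a machine-readable description of the parser's options.

    Caveat: _actions is an argparse internal, not a public API; a real
    implementation would track this metadata itself.
    """
    options = []
    for action in parser._actions:
        options.append({
            "flags": action.option_strings,
            "required": action.required,
            "choices": list(action.choices) if action.choices else None,
            "help": action.help,
        })
    return json.dumps({"program": parser.prog, "options": options}, indent=2)

print(help_json(parser))
```

A hypothetical `--help-json` flag wired to this function would give agents the Tier 1 structured interface definition they prioritize, at near-zero cost to the human-facing help.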

**Link:** See `.knowledge/notes/research-cli-ai-docs/rq4-standards-specifications.md` for complete standards analysis.

---

### 4. Exemplary CLI Tools Demonstrate Clear AI-Friendly Patterns

**Evidence:** Research into RQ5 (Real-World Examples)

Analysis of 10+ exemplary CLI tools revealed consistent AI-friendly patterns:

**Top Exemplary Tools:**
| Tool | Key AI-Friendly Feature |
|------|------------------------|
| **kubectl** | Schema documentation via `explain`, JSON output, dry-run |
| **AWS CLI** | 200+ services with consistent patterns, JMESPath queries |
| **GitHub CLI (gh)** | JSON output with field selection, documented exit codes |
| **Docker** | Go template formatting, structured output |
| **jq** | JSON-native by design |

**12-Point AI-Friendly Pattern Checklist:**
- [ ] Structured Output (JSON/YAML)
- [ ] Schema Documentation (`explain` command)
- [ ] Dry-Run Support
- [ ] Consistent Help Format (NAME, SYNOPSIS, OPTIONS, EXAMPLES)
- [ ] Usage Examples in Help
- [ ] Documented Exit Codes
- [ ] Shell Completion (Bash, Zsh, Fish)
- [ ] Environment Variable Support
- [ ] Web Documentation
- [ ] Query Language Support
- [ ] Pagination Handling
- [ ] Versioned Documentation

**Implication:** Tools using established frameworks (Cobra, Click, Typer, Clap) can easily implement these patterns through built-in features.
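Three of the checklist items (structured output, dry-run support, documented exit codes) fit in a few lines of standard-library Python; this is a minimal sketch with hypothetical command semantics, not a production tool:

```python
import argparse
import json

EX_OK, EX_USAGE = 0, 64  # sysexits.h-style codes, surfaced in the help epilog

parser = argparse.ArgumentParser(
    prog="mytool",
    epilog="Exit codes: 0 success, 64 usage error (EX_USAGE).")
parser.add_argument("--json", action="store_true",
                    help="emit JSON instead of text")
parser.add_argument("--dry-run", action="store_true",
                    help="show planned actions without executing them")

def run(argv):
    args = parser.parse_args(argv)
    # Hypothetical action; with --dry-run we report it but do nothing.
    result = {"action": "delete", "target": "cache",
              "executed": not args.dry_run}
    if args.json:
        print(json.dumps(result))
    else:
        print(f"{result['action']} {result['target']} "
              f"(executed={result['executed']})")
    return EX_OK

code = run(["--json", "--dry-run"])
```

The combination is what makes the tool explorable: an agent can probe behavior with `--json --dry-run` and parse the result without risking side effects.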

**Link:** See `.knowledge/notes/research-cli-ai-docs/rq5-real-world-examples.md` for detailed tool analysis.

---

### 5. Exit Codes and Error Messages Are Critical for AI Reliability

**Evidence:** Research into RQ2 (Help System Design)

**Exit Code Standards:**
- **POSIX:** 0 = success, non-zero = failure
- **BSD sysexits.h (64-78):** Specific codes for different error types
  - 64: EX_USAGE (command line usage error)
  - 65: EX_DATAERR (data format error)
  - 66: EX_NOINPUT (cannot open input)
  - ...through 78: EX_CONFIG (configuration error)

**Error Message Structure (clig.dev formula):**
```
Error: <what went wrong>. <why it matters>. <how to fix it>.
```

**AI-Friendly Error Features:**
- Suggest corrections for typos (Levenshtein distance)
- Include suggested fixes in error messages
- Link to documentation
- Support `--verbose` for debugging
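The typo-suggestion pattern can be approximated with Python's standard `difflib`, which uses a close-match heuristic rather than true Levenshtein distance, so treat this as a sketch of the idea:

```python
import difflib

# Hypothetical command set for the example.
KNOWN_COMMANDS = ["status", "stash", "push", "pull"]

def error_for_unknown(cmd):
    # Follows the clig.dev formula: what went wrong, why it matters,
    # how to fix it.
    msg = f"Error: unknown command '{cmd}'. Nothing was executed."
    suggestions = difflib.get_close_matches(cmd, KNOWN_COMMANDS, n=1)
    if suggestions:
        msg += f" Did you mean '{suggestions[0]}'?"
    return msg

print(error_for_unknown("stats"))
```

A message shaped this way gives an agent both a parseable failure signal and a concrete next action, instead of forcing another round of trial and error.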

**Implication:** Well-documented exit codes enable AI agents to handle errors programmatically. Self-documenting errors reduce the need for trial-and-error learning.

**Link:** See `.knowledge/notes/research-cli-ai-docs/rq2-help-system-design.md` for complete exit code reference and error message patterns.

---

### 6. Auto-Discovery Mechanisms Enable Runtime Understanding

**Evidence:** Research into RQ6 (Auto-Discovery Mechanisms)

**Introspection Patterns:**
- Standard flags: `--help`, `--version`, `--json`
- Framework-specific: Cobra's `completion` subcommand, Click's env var pattern
- Capability advertising: Feature flags, capability commands

**Shell Completion Generation:**
- All major frameworks support Bash, Zsh, Fish, PowerShell
- Static completions (predefined) vs. dynamic completions (runtime-generated)
- Completion descriptions for modern shells

**Structured Data Export:**
- `--json` flag pattern (Heroku, kubectl, AWS CLI)
- TTY detection for automatic format selection
- oclif's `enableJsonFlag` property
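TTY detection for automatic format selection is nearly a one-liner in most languages; a Python sketch, assuming JSON is the sensible default for pipes:

```python
import json
import sys

def emit(records):
    # When stdout is a terminal, render for humans; when piped to
    # another process (or an agent), emit machine-readable JSON.
    if sys.stdout.isatty():
        for r in records:
            print(f"{r['name']}\t{r['status']}")
    else:
        print(json.dumps(records))

emit([{"name": "web", "status": "running"}])
```

This is the same behavior users already know from tools like `ls` adapting column output: the interface degrades gracefully for both humans and programs without an extra flag.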

**Implication:** CLI tools should expose their interface as structured data at runtime, not just in static documentation. This enables AI agents to adapt to version differences and plugin extensions.

**Link:** See `.knowledge/notes/research-cli-ai-docs/rq6-auto-discovery-mechanisms.md` for implementation patterns.

---

## Recommendations

### For CLI Tool Developers

1. **Implement structured output** (`--json`, `--yaml`) for all commands that produce data
2. **Provide dry-run modes** (`--dry-run`) for destructive operations to enable safe exploration
3. **Document exit codes** explicitly in help text, following BSD sysexits.h conventions where appropriate
4. **Use established CLI frameworks** (Cobra, Click, Typer, Clap) to inherit AI-friendly patterns
5. **Include comprehensive examples** in help text showing realistic usage patterns
6. **Support shell completion** generation for Bash, Zsh, and Fish
7. **Generate machine-readable manifests** (OCLIF-style or JSON Schema) from code annotations

### For AI System Builders

1. **Prioritize structured interface definitions** over prose documentation when learning CLI tools
2. **Implement graduated rewards** rather than binary success/failure signals
3. **Validate constraints explicitly** - don't rely on LLMs to self-correct
4. **Cache successful patterns** for reuse across sessions
5. **Support interactive clarification** when interfaces are ambiguous
6. **Use dry-run modes** extensively for safe exploration before execution

### For Standards Organizations

1. **Define a `--help-json` standard** for machine-readable CLI documentation
2. **Extend MCP (Model Context Protocol)** to include CLI tool definitions
3. **Create compliance checkers** for AI-friendly CLI documentation
4. **Establish a CLI-to-AI bridge working group** under IETF or a similar body

### For Documentation Authors

1. **Lead with schemas** - Machine-readable definitions before prose
2. **Include realistic examples** showing complete workflows
3. **Document failure modes** - What can go wrong and how to recover
4. **Version your documentation** - AI agents need to know which version they're using
5. **Consider the LLM reader** - Write assuming documentation will be processed by AI

---

## Further Reading

### Primary Standards and Guidelines
- **POSIX.1-2017 Utility Conventions** - https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html
- **GNU Coding Standards** - https://www.gnu.org/prep/standards/standards.html
- **Command Line Interface Guidelines (clig.dev)** - https://clig.dev/

### Research Papers on AI Tool Use
- **"Training LLMs for Multi-Step Tool Orchestration"** (arXiv:2603.24709) - Cheng et al., 2026
- **"CCTU: A Benchmark for Tool Use under Complex Constraints"** (arXiv:2603.15309) - Ye et al., 2026
- **"An In-depth Look at Gemini's Language Abilities"** (arXiv:2312.11444) - Akter et al., 2023

### Framework Documentation
- **Cobra CLI Framework** - https://cobra.dev/
- **OCLIF (Open CLI Framework)** - https://oclif.io/
- **Click (Python)** - https://click.palletsprojects.com/
- **Clap (Rust)** - https://docs.rs/clap/latest/clap/

### Exemplary CLI Tools to Study
- **kubectl** - https://kubernetes.io/docs/reference/kubectl/
- **AWS CLI** - https://docs.aws.amazon.com/cli/latest/userguide/
- **GitHub CLI** - https://cli.github.com/manual/

---

## Follow-Up Questions

1. **What is the performance impact** of different documentation formats on AI agent task completion rates? (Needs empirical study)

2. **Can we develop a standard schema** for CLI tool interfaces that bridges the gap between human and machine readability?

3. **How do different AI models** (Claude, GPT-4, Gemini) perform on CLI tool tasks with varying documentation quality?

4. **What is the minimum viable documentation** required for AI agents to successfully use a CLI tool?

5. **Should we propose a new IETF standard** for AI-consumable CLI documentation formats?

6. **How can CLI frameworks** be extended to automatically generate MCP-compatible tool definitions?

7. **What patterns emerge** from studying AI agent failure modes across different CLI tool categories (dev tools, sysadmin tools, cloud CLIs)?

---

## Research Sources Summary

This report synthesizes findings from 6 parallel research investigations:

| Research Question | Scout Output | Key Sources |
|-------------------|--------------|-------------|
| RQ1: Machine-Readable Formats | 233 lines | OCLIF, Cobra, JSON Schema, OpenAPI, CLIG |
| RQ2: Help System Design | 576 lines | POSIX.1-2017, GNU Standards, clig.dev, sysexits.h |
| RQ3: AI Agent Interactions | 316 lines | 9 research papers from arXiv |
| RQ4: Standards & Specifications | 642 lines | POSIX, GNU, IETF, MCP, OpenTelemetry |
| RQ5: Real-World Examples | 338 lines | git, kubectl, AWS CLI, gh, docker, jq |
| RQ6: Auto-Discovery | 220 lines | Click, Cobra, oclif, shell completion |

**Total Research Output:** 2,325 lines of synthesized research across 6 domains

---

*Research completed: 2025-03-27*
*Research methodology: Parallel scout deployment with synthesis*
*Report format: AGENTS.md research specification*