GrayCodeAI
diff --git a/.github/dependabot.yml b/.github/dependabot.yml
Lines changed: 41 additions & 0 deletions
diff --git a/.hermes/plans/world-best-cli.md b/.hermes/plans/world-best-cli.md
Lines changed: 93 additions & 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 18 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 51 additions & 3 deletions
diff --git a/cmd/ai_comments.go b/cmd/ai_comments.go
Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,41 @@
+# Dependabot configuration for the hawk repo.
+# Keeps Go module dependencies and GitHub Actions up to date.
+# Minor/patch updates are grouped to minimize PR noise.
+version: 2
+
+updates:
+  # ---------------------------------------------------------------------------
+  # Go modules.
+  # ---------------------------------------------------------------------------
+  - package-ecosystem: "gomod"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+      day: "monday"
+    open-pull-requests-limit: 10
+    commit-message:
+      prefix: "build"
+      include: "scope"
+    groups:
+      gomod-minor-patch:
+        update-types:
+          - "minor"
+          - "patch"
+
+  # ---------------------------------------------------------------------------
+  # GitHub Actions used by the workflows.
+  # ---------------------------------------------------------------------------
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+      day: "monday"
+    open-pull-requests-limit: 10
+    commit-message:
+      prefix: "ci"
+      include: "scope"
+    groups:
+      actions-minor-patch:
+        update-types:
+          - "minor"
+          - "patch"
@@ -0,0 +1,93 @@
+# World's Best CLI - Improvement Plan
+
+## Current State Analysis
+
+Hawk is already a sophisticated AI coding agent with:
+- 200+ built-in tools
+- Bubble Tea TUI with slash commands
+- Multi-provider support via eyrie
+- Session persistence & recovery
+- Plugin system
+- Shell completions (bash/zsh/fish/powershell/json)
+- Man page generation
+- Container sandboxing
+
+## Competitive Analysis: Top OSS AI Coding Agents (2024-2025)
+
+| Agent | Stars | Language | Key Strengths | CLI/UX Patterns |
+|-------|-------|----------|---------------|-----------------|
+| **Aider** | 45.8k | Python | Git-native, multi-model, voice, IDE integration, /help with rich commands | REPL-first, slash commands, verbose flags, pipe-friendly |
+| **OpenHands** | 75.9k | Python/TypeScript | Web UI + headless, agent skills, runtime containers, eval harness | Web-first, headless API, skill marketplace |
+| **Continue** | 33.6k | TypeScript | IDE-native (VSCode/JetBrains), custom models, checks/CI | IDE sidebar, slash commands, config-driven |
+| **SWE-agent** | 19.4k | Python | Issue-to-fix automation, NeurIPS 2024, cybersecurity focus | Config YAML, single-command runs, docker-native |
+| **Open Interpreter** | 63.8k | Python | OS-level control, local execution, vision, voice | REPL, `interpreter -c`, multi-language |
+
+### Common World-Class Patterns Identified:
+1. **REPL mode** — `aider`, `open-interpreter` default to REPL, TUI optional
+2. **Pipe/JSONL streaming** — All support structured output for scripting
+3. **Config-driven** — YAML/TOML configs with full schema + env override
+4. **Skill/Plugin marketplace** — OpenHands skills, Continue checks, Aider conventions
+5. **Multi-model routing** — Aider's model aliases, Continue's custom models
+6. **Voice/TTS integration** — Aider voice-to-code, Open Interpreter voice
+7. **Watch/daemon modes** — File-watch auto-re-run, background servers
+8. **Git-native workflow** — Aider auto-commits, diff preview, branch mgmt
+9. **Rich help system** — Interactive `/help`, command palette (Cmd+K)
+10. **Onboarding wizard** — First-run setup, API key config, model picker
+
+## Gaps to World-Class
+
+### Phase 1: Performance & Startup Optimization
+- [ ] Sub-100ms cold start (lazy-load heavy deps), **in progress**
+- [ ] Background catalog/credential warming
+- [ ] Binary size optimization (`-ldflags="-s -w"` to strip symbols and debug info)
+- [x] Startup profiling, **implemented as `--startup-profile`**
+
+### Phase 2: Enhanced Discoverability & UX
+- [ ] **Interactive onboarding wizard** (`hawk init` — API keys, model picker, sandbox)
+- [ ] **Context-aware slash command suggestions** (fuzzy, frequent, project-aware)
+- [ ] **Built-in tutorial mode** (`hawk --tutorial` interactive walkthrough)
+- [ ] **Better error messages** with actionable fixes (did-you-mean, doc links)
+- [ ] **Command palette** (Cmd/Ctrl+K) — fuzzy search all commands/tools/skills
+- [ ] **Rich `/help`** with categories, examples, search (like Aider)
+
+### Phase 3: Advanced Shell Integration (HIGHEST IMPACT)
+- [ ] **REPL mode** (`hawk -p "prompt"` without TUI, streaming JSONL) — like Aider
+- [ ] **Pipe-friendly JSONL streaming** (`hawk -p "fix" --json | jq ...`)
+- [ ] **Watch mode** (`hawk watch "fix tests"` auto-re-runs on file changes)
+- [ ] **Shell widget/integration** (fzf-style inline picker, atuin-style history sync)
+- [ ] **Alias system** for common workflows (`hawk alias fix-tests="test ./..."`)
+- [ ] **Daemon/API server** (`hawk serve` — REST/SSE for IDE integrations)
+
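The JSONL streaming items in Phase 3 can be illustrated with a minimal sketch, assuming one JSON object per line on the output stream. The `Event` type, its field names, and the event kinds are hypothetical and are not hawk's actual wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// Event is a hypothetical stream event; hawk's real schema is not shown here.
type Event struct {
	Type string `json:"type"`
	Text string `json:"text,omitempty"`
}

// encodeLine renders one event as a single JSON object followed by a newline.
func encodeLine(ev Event) ([]byte, error) {
	b, err := json.Marshal(ev)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

// writeJSONL writes each event as soon as it arrives, so a consumer such as
// jq can process the stream line by line.
func writeJSONL(w io.Writer, events <-chan Event) error {
	for ev := range events {
		line, err := encodeLine(ev)
		if err != nil {
			return err
		}
		if _, err := w.Write(line); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	events := make(chan Event, 3)
	events <- Event{Type: "delta", Text: "Fixing the failing test"}
	events <- Event{Type: "tool_call", Text: "Edit"}
	events <- Event{Type: "done"}
	close(events)
	if err := writeJSONL(os.Stdout, events); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Writing one complete line per event keeps partial objects off the pipe, which is what makes the output safe to consume incrementally.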
+### Phase 4: Extensibility & API
+- [ ] **Lua/JS/WASM plugin runtime** (sandboxed, hot-reload)
+- [ ] **Tool protocol** for external tools (stdio/HTTP — like MCP but simpler)
+- [ ] **Hooks system** (pre/post tool, session events, config change)
+- [ ] **Configuration schema** with validation (JSON Schema → Go types)
+- [ ] **Skill marketplace** (install from GitHub/registry, versioned)
+
+### Phase 5: Reliability & Data Integrity
+- [ ] **CRDT-based session sync** for multi-device
+- [ ] **Undo/redo** at session level (tool call granularity)
+- [ ] **Automatic backup** before destructive ops (git commit, file snapshot)
+- [ ] **Crash recovery** with a write-ahead log (WAL), **in progress**
+
+### Phase 6: Accessibility & Polish
+- [ ] **Screen reader support** (ARIA in TUI, semantic markup)
+- [ ] **High contrast themes** (WCAG AA)
+- [ ] **Reduced motion** option
+- [ ] **i18n framework** (gettext-style, locale files)
+
+## Priority Order (Updated from Competitive Analysis)
+1. **Phase 3: REPL + Watch + JSONL** — highest user impact, matches Aider/OpenInterpreter default UX
+2. **Phase 1: Performance** — foundation, sub-100ms target
+3. **Phase 2: Onboarding + Help + Command Palette** — critical for adoption
+4. **Phase 4: Plugin Runtime + Tool Protocol** — ecosystem growth
+5. **Phase 5: Reliability** — trust, production readiness
+6. **Phase 6: Accessibility** — inclusion
+
+## Implementation Notes
+- **REPL mode** can reuse `runPrint()` with streaming JSONL output
+- **Watch mode** uses `fsnotify` + debounce + session resume
+- **Command palette** extends existing `CommandPalette` in chat_model.go
+- **Plugin runtime** — evaluate `wazero` (WASM) or `go-lua`/`gopher-lua`
+- **Config schema** — `go-jsonschema` + codegen for type-safe Settings
@@ -13,6 +13,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   with the rest of the GrayCodeAI ecosystem (`eyrie`, `tok`, `yaad`, `sight`, `inspect`).
 
 ### Added
+- **Watch mode (`--watch`)**: file-watcher loop that acts on `AI!` (do-now) and `AI?` (answer) code comments. Off by default.
+- **GitHub Action** (`.github/actions/hawk`): interactive mode on `@hawk` mentions, automation mode on labeled issues/PRs, and skill dispatch for `/`-prefixed prompts.
+- **Messaging gateways**: opt-in Telegram, Discord, and Slack gateways on the daemon for chatting with hawk from messaging apps.
+- **AST repo-map** (`internal/context/repomap`): structural repository map for richer model context.
+- **Auto codebase analysis on first run** (`internal/autoinit`): opt-in seeding of project context.
+- **Auto-lint / auto-fix cycle**: runs the matching linter after edits and iterates on fixes with bounded retries (opt-in).
+- **Image / multimodal context** (`internal/engine/vision.go`): feed screenshots/images to vision-capable models.
+- **Plan & Explore sub-agent modes**: read-only `plan` (task decomposition) and `explore` (codebase investigation) modes; explore supports `quick`/`medium`/`very-thorough` thoroughness budgets.
+- **Persona `color` and `hooks`**: per-agent display color and lifecycle hook (`pre_run`/`post_run`) fields.
+- **YAML agents & eval tasks**: define personas and eval tasks in YAML alongside markdown personas.
+- **IT-managed policy tier** (`internal/context/rules.go`): non-excludable org-policy rule tier with highest precedence; strips HTML comments from rule files.
+- **SQL exploration tool** (`internal/tool/sql.go`): read-only database exploration tool.
+- **Conventional-commit generation**: `SmartCommit` and the diff summarizer produce Conventional Commit messages from staged changes.
+- **Durable workflows + named checkpoints** (`internal/multiagent`): LangGraph-style resumable workflows with named, persisted step boundaries.
+- **Human-in-the-loop approval gates** (`internal/engine/approval_gate.go`): durable approve/reject gates within workflows.
+- **Structured output** (`internal/engine/structured_output.go`): JSON-Schema-constrained responses, validated and retried once on mismatch.
+- **MCP WebSocket transport** (`internal/mcp/ws.go`): opt-in WebSocket transport in addition to stdio/HTTP.
+- **`GET /v1/ready`**: dependency-aware readiness endpoint on the daemon.
 - REPL magic commands (%reset, %undo, %tokens, %history, %copy, %save, %compact, %model, %clear)
 - Prompt cache keep-alive pings
 - Unified Finding type in shared/types for cross-tool interoperability
 
@@ -80,7 +80,7 @@ Built with [Bubble Tea](https://github.com/charmbracelet/bubbletea) for a smooth
 | **Git** | `GitCommit`, `SmartCommit`, `EnterWorktree`, `ExitWorktree` |
 | **Web** | `WebFetch`, `WebSearch`, `CodeSearch` |
 | **Tasks** | `TodoWrite`, `TaskCreate`, `TaskList`, `TaskUpdate` |
-| **Code** | `LSP` diagnostics, `CodeSearch`, `NotebookEdit` |
+| **Code** | `LSP` diagnostics, `CodeSearch`, `NotebookEdit`, `SQL` (read-only DB exploration) |
 | **MCP** | `ListMcpResources`, `ReadMcpResource` |
 
 ### Multi-Agent Mission Mode (optional)
@@ -109,7 +109,55 @@ hawk asks before running dangerous tools. Auto-mode learns from your decisions,
 
 ### MCP & LSP Support
 
-Connect external tools via [Model Context Protocol](https://modelcontextprotocol.io/) and get code intelligence through Language Server Protocol.
+Connect external tools via [Model Context Protocol](https://modelcontextprotocol.io/) and get code intelligence through Language Server Protocol. MCP also supports a WebSocket transport (opt-in) in addition to stdio/HTTP.
+
+### Watch Mode (AI-comment loop)
+
+`hawk --watch` watches your tree for `AI!` (do it now) and `AI?` (answer my question) comments and acts on them automatically — leave a directive in code, save, and hawk responds. Off by default; enabled via the `--watch` flag.
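The exact comment syntax hawk recognizes is not shown in this README. As a rough sketch, assume a directive is a whole-line `//` or `#` comment that ends in `AI!` or `AI?`. `directiveRe` and `parseDirective` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// directiveRe matches a whole-line comment ending in AI! or AI?.
// Assumption: only // and # comment markers are considered.
var directiveRe = regexp.MustCompile(`^\s*(?://|#)\s*(.*?)\s*\bAI([!?])\s*$`)

// parseDirective reports the mode ("!" for do-now, "?" for a question) and
// the instruction text, or ok=false when the line carries no directive.
func parseDirective(line string) (mode, instruction string, ok bool) {
	m := directiveRe.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	return m[2], m[1], true
}

func main() {
	lines := []string{
		"// add error handling AI!",
		"# why is this slow? AI?",
		"// plain comment",
	}
	for _, l := range lines {
		mode, instr, ok := parseDirective(l)
		fmt.Printf("%q -> ok=%v mode=%q instruction=%q\n", l, ok, mode, instr)
	}
}
```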
+
+### CI / GitHub Action
+
+A bundled GitHub Action (`.github/actions/hawk`) runs hawk in your pipeline: interactive mode on `@hawk` mentions in issue/PR comments, automation mode on labeled issues/PRs, and skill dispatch when a prompt begins with `/` (e.g. `/code-review`).
+
+### Messaging Gateways (opt-in)
+
+The daemon exposes Telegram, Discord, and Slack gateways so you can chat with hawk from your messaging app. Disabled by default; enabled per-channel via daemon config.
+
+### AST Repo-Map & Codebase Analysis
+
+An AST-based repository map (`internal/context/repomap`) gives the model a structural overview of your code. On first run, hawk can auto-analyze the codebase to seed context (opt-in; off by default).
+
+### Auto-Lint / Auto-Fix Cycle
+
+After edits, hawk can run the matching linter and iterate on fixes (bounded retries) before handing back. Opt-in; preserves existing behavior when disabled.
+
+### Image / Multimodal Context
+
+Feed screenshots and images into the conversation for vision-capable models (`internal/engine/vision.go`).
+
+### Plan & Explore Sub-Agents
+
+Read-only sub-agent modes: `plan` decomposes a task into steps, `explore` investigates the codebase with a configurable thoroughness budget (`quick` / `medium` / `very-thorough`).
+
+### Conventional-Commit Generation
+
+`SmartCommit` and the diff summarizer generate Conventional Commit messages from your staged changes.
+
+### Durable Workflows & Approval Gates
+
+LangGraph-style durable workflows with named, resumable step checkpoints and optional human-in-the-loop approval gates that persist the gate decision.
+
+### Structured Output
+
+Request JSON-Schema-constrained responses; results are validated against the schema and retried once on mismatch.
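The validate-then-retry-once behavior can be sketched as follows. This is an illustration under stated assumptions, not the code in `internal/engine/structured_output.go`: strict decoding into a hypothetical `Review` struct stands in for real JSON Schema validation, and `generate` is a placeholder for the model call.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// Review is a hypothetical target shape used in place of a JSON Schema.
type Review struct {
	Severity string `json:"severity"`
	Summary  string `json:"summary"`
}

// validate decodes raw strictly: unknown fields and missing required fields
// both count as a mismatch.
func validate(raw string) (Review, error) {
	var r Review
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&r); err != nil {
		return Review{}, err
	}
	if r.Severity == "" || r.Summary == "" {
		return Review{}, errors.New("missing required field")
	}
	return r, nil
}

// generateValidated asks for output, validates it, and retries exactly once,
// passing the validation error back as feedback for the second attempt.
func generateValidated(generate func(feedback string) (string, error)) (Review, error) {
	feedback := ""
	var lastErr error
	for attempt := 0; attempt < 2; attempt++ {
		raw, err := generate(feedback)
		if err != nil {
			return Review{}, err
		}
		r, verr := validate(raw)
		if verr == nil {
			return r, nil
		}
		lastErr = verr
		feedback = verr.Error()
	}
	return Review{}, fmt.Errorf("output still invalid after one retry: %w", lastErr)
}

func main() {
	replies := []string{
		`{"severity":"high"}`,
		`{"severity":"high","summary":"nil dereference in handler"}`,
	}
	calls := 0
	r, err := generateValidated(func(string) (string, error) {
		reply := replies[calls]
		calls++
		return reply, nil
	})
	fmt.Println(r, err, calls)
}
```

Bounding the loop at one retry keeps the worst-case cost at two model calls per request.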
+
+### YAML Agents & Tasks
+
+Define personas and eval tasks in YAML (in addition to markdown personas), including per-agent display `color` and lifecycle `hooks`.
+
+### IT-Managed Policy Tier
+
+An optional org-policy rule tier for IT-managed deployments. Its rules cannot be excluded and take highest precedence, and HTML comments are stripped from rule files.
 
 ## Usage
 
@@ -160,7 +208,7 @@ hawk daemon status             # Check if running
 hawk daemon stop               # Graceful shutdown
 ```
 
-Endpoints: `GET /v1/health`, `POST /v1/chat` (JSON or SSE streaming)
+Endpoints: `GET /v1/health`, `GET /v1/ready` (dependency-aware readiness), `POST /v1/chat` (JSON or SSE streaming)
 
 ### Mission Mode
 
 
@@ -6,6 +6,7 @@ import (
 	"os"
 	"path/filepath"
 	"regexp"
+	"sort"
 	"strings"
 )
 
@@ -82,6 +83,86 @@ func scanForAIComments(dir string, ignore []string) []AIDirective {
 	return directives
 }
 
+// aiDispatchFn is the LLM/agent execution path used to act on a single AI
+// directive. It defaults to runAIDirectivePrint (which drives the same chat/agent
+// stream as `runPrint`), but is a package-level variable so tests can substitute
+// a mock without standing up a real model.
+var aiDispatchFn = runAIDirectivePrint
+
+// runAIDirectivePrint dispatches a single AI directive to the existing print
+// execution path. The prompt is built from the directive so the agent edits the
+// referenced file in place.
+func runAIDirectivePrint(d AIDirective) error {
+	return runPrint(formatDirectivePrompt(d))
+}
+
+// formatDirectivePrompt builds a targeted prompt for a single AI directive.
+// AI! directives instruct the agent to act now; AI? directives ask it to answer
+// the embedded question.
+func formatDirectivePrompt(d AIDirective) string {
+	var b strings.Builder
+	if d.Mode == "?" {
+		fmt.Fprintf(&b,
+			"An AI question comment was found at %s:%d.\n\nQuestion: %s\n\n",
+			d.Path, d.Line, d.Instruction)
+		b.WriteString("Answer the question. If a code change is warranted, make it. ")
+	} else {
+		fmt.Fprintf(&b,
+			"An AI instruction comment was found at %s:%d.\n\nInstruction: %s\n\n",
+			d.Path, d.Line, d.Instruction)
+		b.WriteString("Implement this change now, editing the file in place. ")
+	}
+	fmt.Fprintf(&b,
+		"The directive lives in the file %s; after you finish, the AI comment token will be removed automatically.\n",
+		d.Path)
+	return b.String()
+}
+
+// processAIDirectives scans dir for AI!/AI? directives, dispatches each one to
+// the configured execution path, and strips the AI comment token from the file
+// after a successful dispatch. It returns the number of directives that were
+// successfully processed (token stripped). Errors from individual directives are
+// reported to stderr and do not abort the remaining directives.
+//
+// Directives are processed from the bottom of each file upward so that removing a
+// line never shifts the line numbers of directives not yet handled in the same
+// file.
+func processAIDirectives(dir string, ignore []string) int {
+	directives := scanForAIComments(dir, ignore)
+	if len(directives) == 0 {
+		return 0
+	}
+
+	// Sort by file, then by descending line, so that removing a directive never
+	// shifts the line numbers of directives still pending earlier in the same file.
+	sort.SliceStable(directives, func(i, j int) bool {
+		if directives[i].Path != directives[j].Path {
+			return directives[i].Path < directives[j].Path
+		}
+		return directives[i].Line > directives[j].Line
+	})
+
+	processed := 0
+	for _, d := range directives {
+		if err := aiDispatchFn(d); err != nil {
+			fmt.Fprintf(os.Stderr, "AI directive %s:%d failed: %v\n", d.Path, d.Line, err)
+			continue
+		}
+		// Resolve the path against dir for removal; scan returns paths
+		// relative to dir.
+		full := d.Path
+		if !filepath.IsAbs(full) {
+			full = filepath.Join(dir, d.Path)
+		}
+		if err := removeAIComment(full, d.Line); err != nil {
+			fmt.Fprintf(os.Stderr, "AI directive %s:%d: failed to strip token: %v\n", d.Path, d.Line, err)
+			continue
+		}
+		processed++
+	}
+	return processed
+}
+
 // formatDirectivesAsPrompt formats found directives into a prompt string.
 func formatDirectivesAsPrompt(directives []AIDirective) string {
 	if len(directives) == 0 {