Commit e29e51a

Patel230 and claude committed
feat: close Top-20 OSS coding-agent gaps (AI-watch, GH Actions, gateways, repo-map, multimodal, subagents, more)
Implement code-feasible competitive-parity features from the Top-20 OSS gap analysis:

- wired --watch AI-comment loop end-to-end (AI!/AI? directives, auto-strip after edit)
- GitHub Action + interactive/automation mode auto-detect + skill dispatch from prompt
- Telegram (wired) + Discord + Slack messaging gateways with pairing-code allowlist
- AST repo-map (PageRank-ranked, token-budgeted)
- auto-lint/auto-fix cycle; declarative sandbox runtime_extra_deps
- image/multimodal context; SQL exploration tool; conventional-commit generation
- Plan subagent type + Explore thoroughness; persona color/hooks
- IT-managed policy tier + HTML-comment stripping in rule hierarchy
- durable workflow checkpoints + named checkpoints; structured-output schema enforcement
- human-in-the-loop approval gates; YAML agents/tasks config
- /ready endpoint; MCP WebSocket transport; auto codebase analysis on first run (HAWK_DISABLE_AUTO_INIT)

Design docs (HAWK-CLOUD-SAAS, ECOSYSTEM-MARKETPLACE, ECOSYSTEM-CONFIG, OTEL-CONVENTIONS) added under docs/. Stdlib-only, no new dependencies. Full test suite green (107/107 packages).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 4d3c43c commit e29e51a

91 files changed

Lines changed: 11995 additions & 45 deletions

.github/dependabot.yml

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+# Dependabot configuration for the hawk repo.
+# Keeps Go module dependencies and GitHub Actions up to date.
+# Minor/patch updates are grouped to minimize PR noise.
+version: 2
+
+updates:
+  # ---------------------------------------------------------------------------
+  # Go modules.
+  # ---------------------------------------------------------------------------
+  - package-ecosystem: "gomod"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+      day: "monday"
+    open-pull-requests-limit: 10
+    commit-message:
+      prefix: "build"
+      include: "scope"
+    groups:
+      gomod-minor-patch:
+        update-types:
+          - "minor"
+          - "patch"
+
+  # ---------------------------------------------------------------------------
+  # GitHub Actions used by the workflows.
+  # ---------------------------------------------------------------------------
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+      day: "monday"
+    open-pull-requests-limit: 10
+    commit-message:
+      prefix: "ci"
+      include: "scope"
+    groups:
+      actions-minor-patch:
+        update-types:
+          - "minor"
+          - "patch"

.hermes/plans/world-best-cli.md

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+# World's Best CLI - Improvement Plan
+
+## Current State Analysis
+
+Hawk is already a sophisticated AI coding agent with:
+- 200+ built-in tools
+- Bubble Tea TUI with slash commands
+- Multi-provider support via eyrie
+- Session persistence & recovery
+- Plugin system
+- Shell completions (bash/zsh/fish/powershell/json)
+- Man page generation
+- Container sandboxing
+
+## Competitive Analysis: Top OSS AI Coding Agents (2024-2025)
+
+| Agent | Stars | Language | Key Strengths | CLI/UX Patterns |
+|-------|-------|----------|---------------|-----------------|
+| **Aider** | 45.8k | Python | Git-native, multi-model, voice, IDE integration, /help with rich commands | REPL-first, slash commands, verbose flags, pipe-friendly |
+| **OpenHands** | 75.9k | TypeScript/Go | Web UI + headless, agent skills, runtime containers, eval harness | Web-first, headless API, skill marketplace |
+| **Continue** | 33.6k | TypeScript | IDE-native (VSCode/JetBrains), custom models, checks/CI | IDE sidebar, slash commands, config-driven |
+| **SWE-agent** | 19.4k | Python | Issue-to-fix automation, NeurIPS 2024, cybersecurity focus | Config YAML, single-command runs, docker-native |
+| **Open Interpreter** | 63.8k | Python | OS-level control, local execution, vision, voice | REPL, `interpreter -c`, multi-language |
+
+### Common World-Class Patterns Identified:
+1. **REPL mode** — `aider`, `open-interpreter` default to REPL, TUI optional
+2. **Pipe/JSONL streaming** — All support structured output for scripting
+3. **Config-driven** — YAML/TOML configs with full schema + env override
+4. **Skill/Plugin marketplace** — OpenHands skills, Continue checks, Aider conventions
+5. **Multi-model routing** — Aider's model aliases, Continue's custom models
+6. **Voice/TTS integration** — Aider voice-to-code, Open Interpreter voice
+7. **Watch/daemon modes** — File-watch auto-re-run, background servers
+8. **Git-native workflow** — Aider auto-commits, diff preview, branch mgmt
+9. **Rich help system** — Interactive `/help`, command palette (Cmd+K)
+10. **Onboarding wizard** — First-run setup, API key config, model picker
+
+## Gaps to World-Class
+
+### Phase 1: Performance & Startup Optimization
+- [ ] Sub-100ms cold start (lazy-load heavy deps) ✓ **started**
+- [ ] Background catalog/credential warming
+- [ ] Binary size optimization (`-ldflags="-s -w"`, trim debug)
+- [ ] Startup profiling ✓ **implemented `--startup-profile`**
+
+### Phase 2: Enhanced Discoverability & UX
+- [ ] **Interactive onboarding wizard** (`hawk init` — API keys, model picker, sandbox)
+- [ ] **Context-aware slash command suggestions** (fuzzy, frequent, project-aware)
+- [ ] **Built-in tutorial mode** (`hawk --tutorial` interactive walkthrough)
+- [ ] **Better error messages** with actionable fixes (did-you-mean, doc links)
+- [ ] **Command palette** (Cmd/Ctrl+K) — fuzzy search all commands/tools/skills
+- [ ] **Rich `/help`** with categories, examples, search (like Aider)
+
+### Phase 3: Advanced Shell Integration (HIGHEST IMPACT)
+- [ ] **REPL mode** (`hawk -p "prompt"` without TUI, streaming JSONL) — like Aider
+- [ ] **Pipe-friendly JSONL streaming** (`hawk -p "fix" --json | jq ...`)
+- [ ] **Watch mode** (`hawk watch "fix tests"` auto-re-runs on file changes)
+- [ ] **Shell widget/integration** (fzf-style inline picker, atuin-style history sync)
+- [ ] **Alias system** for common workflows (`hawk alias fix-tests="test ./..."`)
+- [ ] **Daemon/API server** (`hawk serve` — REST/SSE for IDE integrations)
+
+### Phase 4: Extensibility & API
+- [ ] **Lua/JS/WASM plugin runtime** (sandboxed, hot-reload)
+- [ ] **Tool protocol** for external tools (stdio/HTTP — like MCP but simpler)
+- [ ] **Hooks system** (pre/post tool, session events, config change)
+- [ ] **Configuration schema** with validation (JSON Schema → Go types)
+- [ ] **Skill marketplace** (install from GitHub/registry, versioned)
+
+### Phase 5: Reliability & Data Integrity
+- [ ] **CRDT-based session sync** for multi-device
+- [ ] **Undo/redo** at session level (tool call granularity)
+- [ ] **Automatic backup** before destructive ops (git commit, file snapshot)
+- [ ] **Crash recovery** with WAL ✓ **started**
+
+### Phase 6: Accessibility & Polish
+- [ ] **Screen reader support** (ARIA in TUI, semantic markup)
+- [ ] **High contrast themes** (WCAG AA)
+- [ ] **Reduced motion** option
+- [ ] **i18n framework** (gettext-style, locale files)
+
+## Priority Order (Updated from Competitive Analysis)
+1. **Phase 3: REPL + Watch + JSONL** — highest user impact, matches Aider/OpenInterpreter default UX
+2. **Phase 1: Performance** — foundation, sub-100ms target
+3. **Phase 2: Onboarding + Help + Command Palette** — critical for adoption
+4. **Phase 4: Plugin Runtime + Tool Protocol** — ecosystem growth
+5. **Phase 5: Reliability** — trust, production readiness
+6. **Phase 6: Accessibility** — inclusion
+
+## Implementation Notes
+- **REPL mode** can reuse `runPrint()` with streaming JSONL output
+- **Watch mode** uses `fsnotify` + debounce + session resume
+- **Command palette** extends existing `CommandPalette` in chat_model.go
+- **Plugin runtime** — evaluate `wazero` (WASM) or `go-lua`/`gopher-lua`
+- **Config schema** — `go-jsonschema` + codegen for type-safe Settings

CHANGELOG.md

Lines changed: 18 additions & 0 deletions
@@ -13,6 +13,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 with the rest of the GrayCodeAI ecosystem (`eyrie`, `tok`, `yaad`, `sight`, `inspect`).
 
 ### Added
+- **Watch mode (`--watch`)**: file-watcher loop that acts on `AI!` (do-now) and `AI?` (answer) code comments. Off by default.
+- **GitHub Action** (`.github/actions/hawk`): interactive mode on `@hawk` mentions, automation mode on labeled issues/PRs, and skill dispatch for `/`-prefixed prompts.
+- **Messaging gateways**: opt-in Telegram, Discord, and Slack gateways on the daemon for chatting with hawk from messaging apps.
+- **AST repo-map** (`internal/context/repomap`): structural repository map for richer model context.
+- **Auto codebase analysis on first run** (`internal/autoinit`): opt-in seeding of project context.
+- **Auto-lint / auto-fix cycle**: runs the matching linter after edits and iterates on fixes with bounded retries (opt-in).
+- **Image / multimodal context** (`internal/engine/vision.go`): feed screenshots/images to vision-capable models.
+- **Plan & Explore sub-agent modes**: read-only `plan` (task decomposition) and `explore` (codebase investigation) modes; explore supports `quick`/`medium`/`very-thorough` thoroughness budgets.
+- **Persona `color` and `hooks`**: per-agent display color and lifecycle hook (`pre_run`/`post_run`) fields.
+- **YAML agents & eval tasks**: define personas and eval tasks in YAML alongside markdown personas.
+- **IT-managed policy tier** (`internal/context/rules.go`): non-excludable org-policy rule tier with highest precedence; strips HTML comments from rule files.
+- **SQL exploration tool** (`internal/tool/sql.go`): read-only database exploration tool.
+- **Conventional-commit generation**: `SmartCommit` and the diff summarizer produce Conventional Commit messages from staged changes.
+- **Durable workflows + named checkpoints** (`internal/multiagent`): LangGraph-style resumable workflows with named, persisted step boundaries.
+- **Human-in-the-loop approval gates** (`internal/engine/approval_gate.go`): durable approve/reject gates within workflows.
+- **Structured output** (`internal/engine/structured_output.go`): JSON-Schema-constrained responses, validated and retried once on mismatch.
+- **MCP WebSocket transport** (`internal/mcp/ws.go`): opt-in WebSocket transport in addition to stdio/HTTP.
+- **`GET /v1/ready`**: dependency-aware readiness endpoint on the daemon.
 - REPL magic commands (%reset, %undo, %tokens, %history, %copy, %save, %compact, %model, %clear)
 - Prompt cache keep-alive pings
 - Unified Finding type in shared/types for cross-tool interoperability
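
The "validated and retried once on mismatch" behavior in the structured-output entry can be sketched roughly as follows. The contents of `internal/engine/structured_output.go` are not shown in this commit view, so every name here (`generateFn`, `validate`, `structured`) is hypothetical, and the required-keys check is a deliberately small stand-in for real JSON Schema validation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// generateFn stands in for a model call. Hypothetical type for this sketch.
type generateFn func(prompt string) (string, error)

// validate checks that raw is a JSON object containing every required key.
// This is a simplification: a real validator would check the full schema.
func validate(raw string, required []string) error {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal([]byte(raw), &obj); err != nil {
		return fmt.Errorf("not a JSON object: %w", err)
	}
	for _, key := range required {
		if _, ok := obj[key]; !ok {
			return fmt.Errorf("missing required key %q", key)
		}
	}
	return nil
}

// structured asks gen for a reply, validates it, and retries exactly once
// with the validation error appended to the prompt.
func structured(gen generateFn, prompt string, required []string) (string, error) {
	out, err := gen(prompt)
	if err != nil {
		return "", err
	}
	verr := validate(out, required)
	if verr == nil {
		return out, nil
	}
	out, err = gen(prompt + "\n\nThe previous reply was rejected: " + verr.Error())
	if err != nil {
		return "", err
	}
	if verr = validate(out, required); verr != nil {
		return "", fmt.Errorf("schema mismatch after one retry: %w", verr)
	}
	return out, nil
}

func main() {
	calls := 0
	// Fake model: the first reply omits "severity", the second is complete.
	fake := func(string) (string, error) {
		calls++
		if calls == 1 {
			return `{"title": "x"}`, nil
		}
		return `{"title": "x", "severity": "low"}`, nil
	}
	out, err := structured(fake, "summarize", []string{"title", "severity"})
	fmt.Println(out, err, calls)
}
```

Capping the loop at a single retry keeps the cost of a stubborn mismatch bounded at two model calls; the caller still sees the final validation error if the second reply also fails.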

README.md

Lines changed: 51 additions & 3 deletions
@@ -80,7 +80,7 @@ Built with [Bubble Tea](https://github.com/charmbracelet/bubbletea) for a smooth
 | **Git** | `GitCommit`, `SmartCommit`, `EnterWorktree`, `ExitWorktree` |
 | **Web** | `WebFetch`, `WebSearch`, `CodeSearch` |
 | **Tasks** | `TodoWrite`, `TaskCreate`, `TaskList`, `TaskUpdate` |
-| **Code** | `LSP` diagnostics, `CodeSearch`, `NotebookEdit` |
+| **Code** | `LSP` diagnostics, `CodeSearch`, `NotebookEdit`, `SQL` (read-only DB exploration) |
 | **MCP** | `ListMcpResources`, `ReadMcpResource` |
 
 ### Multi-Agent Mission Mode (optional)
@@ -109,7 +109,55 @@ hawk asks before running dangerous tools. Auto-mode learns from your decisions,
 
 ### MCP & LSP Support
 
-Connect external tools via [Model Context Protocol](https://modelcontextprotocol.io/) and get code intelligence through Language Server Protocol.
+Connect external tools via [Model Context Protocol](https://modelcontextprotocol.io/) and get code intelligence through Language Server Protocol. MCP also supports a WebSocket transport (opt-in) in addition to stdio/HTTP.
+
+### Watch Mode (AI-comment loop)
+
+`hawk --watch` watches your tree for `AI!` (do it now) and `AI?` (answer my question) comments and acts on them automatically — leave a directive in code, save, and hawk responds. Off by default; enabled via the `--watch` flag.
+
+### CI / GitHub Action
+
+A bundled GitHub Action (`.github/actions/hawk`) runs hawk in your pipeline: interactive mode on `@hawk` mentions in issue/PR comments, automation mode on labeled issues/PRs, and skill dispatch when a prompt begins with `/` (e.g. `/code-review`).
+
+### Messaging Gateways (opt-in)
+
+The daemon exposes Telegram, Discord, and Slack gateways so you can chat with hawk from your messaging app. Disabled by default; enabled per-channel via daemon config.
+
+### AST Repo-Map & Codebase Analysis
+
+An AST-based repository map (`internal/context/repomap`) gives the model a structural overview of your code. On first run, hawk can auto-analyze the codebase to seed context (default-off, opt-in).
+
+### Auto-Lint / Auto-Fix Cycle
+
+After edits, hawk can run the matching linter and iterate on fixes (bounded retries) before handing back. Opt-in; preserves existing behavior when disabled.
+
+### Image / Multimodal Context
+
+Feed screenshots and images into the conversation for vision-capable models (`internal/engine/vision.go`).
+
+### Plan & Explore Sub-Agents
+
+Read-only sub-agent modes: `plan` decomposes a task into steps, `explore` investigates the codebase with a configurable thoroughness budget (`quick` / `medium` / `very-thorough`).
+
+### Conventional-Commit Generation
+
+`SmartCommit` and the diff summarizer generate Conventional Commit messages from your staged changes.
+
+### Durable Workflows & Approval Gates
+
+LangGraph-style durable workflows with named, resumable step checkpoints and optional human-in-the-loop approval gates that persist the gate decision.
+
+### Structured Output
+
+Request JSON-Schema-constrained responses; results are validated against the schema and retried once on mismatch.
+
+### YAML Agents & Tasks
+
+Define personas and eval tasks in YAML (in addition to markdown personas), including per-agent display `color` and lifecycle `hooks`.
+
+### IT-Managed Policy Tier
+
+An optional, non-excludable org-policy rule tier (highest precedence) with HTML-comment stripping of rule files for IT-managed deployments.
 
 ## Usage
 
@@ -160,7 +208,7 @@ hawk daemon status # Check if running
 hawk daemon stop # Graceful shutdown
 ```
 
-Endpoints: `GET /v1/health`, `POST /v1/chat` (JSON or SSE streaming)
+Endpoints: `GET /v1/health`, `GET /v1/ready` (dependency-aware readiness), `POST /v1/chat` (JSON or SSE streaming)
 
 ### Mission Mode
 

cmd/ai_comments.go

Lines changed: 81 additions & 0 deletions
@@ -6,6 +6,7 @@ import (
 	"os"
 	"path/filepath"
 	"regexp"
+	"sort"
 	"strings"
 )
 
@@ -82,6 +83,86 @@ func scanForAIComments(dir string, ignore []string) []AIDirective {
 	return directives
 }
 
+// aiDispatchFn is the LLM/agent execution path used to act on a single AI
+// directive. It defaults to runAIDirectivePrint (which drives the same chat/agent
+// stream as `runPrint`), but is a package-level variable so tests can substitute
+// a mock without standing up a real model.
+var aiDispatchFn = runAIDirectivePrint
+
+// runAIDirectivePrint dispatches a single AI directive to the existing print
+// execution path. The prompt is built from the directive so the agent edits the
+// referenced file in place.
+func runAIDirectivePrint(d AIDirective) error {
+	return runPrint(formatDirectivePrompt(d))
+}
+
+// formatDirectivePrompt builds a targeted prompt for a single AI directive.
+// AI! directives instruct the agent to act now; AI? directives ask it to answer
+// the embedded question.
+func formatDirectivePrompt(d AIDirective) string {
+	var b strings.Builder
+	if d.Mode == "?" {
+		b.WriteString(fmt.Sprintf(
+			"An AI question comment was found at %s:%d.\n\nQuestion: %s\n\n",
+			d.Path, d.Line, d.Instruction))
+		b.WriteString("Answer the question. If a code change is warranted, make it. ")
+	} else {
+		b.WriteString(fmt.Sprintf(
+			"An AI instruction comment was found at %s:%d.\n\nInstruction: %s\n\n",
+			d.Path, d.Line, d.Instruction))
+		b.WriteString("Implement this change now, editing the file in place. ")
+	}
+	b.WriteString(fmt.Sprintf(
+		"The directive lives in the file %s; after you finish, the AI comment token will be removed automatically.\n",
+		d.Path))
+	return b.String()
+}
+
+// processAIDirectives scans dir for AI!/AI? directives, dispatches each one to
+// the configured execution path, and strips the AI comment token from the file
+// after a successful dispatch. It returns the number of directives that were
+// successfully processed (token stripped). Errors from individual directives are
+// reported to stderr and do not abort the remaining directives.
+//
+// Directives are processed from the bottom of each file upward so that removing a
+// line never shifts the line numbers of directives not yet handled in the same
+// file.
+func processAIDirectives(dir string, ignore []string) int {
+	directives := scanForAIComments(dir, ignore)
+	if len(directives) == 0 {
+		return 0
+	}
+
+	// Sort by file then descending line so earlier-line removals don't invalidate
+	// the line numbers of later directives in the same file.
+	sort.SliceStable(directives, func(i, j int) bool {
+		if directives[i].Path != directives[j].Path {
+			return directives[i].Path < directives[j].Path
+		}
+		return directives[i].Line > directives[j].Line
+	})
+
+	processed := 0
+	for _, d := range directives {
+		if err := aiDispatchFn(d); err != nil {
+			fmt.Fprintf(os.Stderr, "AI directive %s:%d failed: %v\n", d.Path, d.Line, err)
+			continue
+		}
+		// Resolve back to an absolute path for removal; scan returns paths
+		// relative to dir.
+		full := d.Path
+		if !filepath.IsAbs(full) {
+			full = filepath.Join(dir, d.Path)
+		}
+		if err := removeAIComment(full, d.Line); err != nil {
+			fmt.Fprintf(os.Stderr, "AI directive %s:%d: failed to strip token: %v\n", d.Path, d.Line, err)
+			continue
+		}
+		processed++
+	}
+	return processed
+}
+
 // formatDirectivesAsPrompt formats found directives into a prompt string.
 func formatDirectivesAsPrompt(directives []AIDirective) string {
 	if len(directives) == 0 {
