docs: add Session god-object decomposition design sketch by Patel230 · Pull Request #33 · GrayCodeAI/hawk

Patel230 · 2026-06-12T13:45:16Z

What

A design proposal (NOT yet implemented) for breaking the 35-collaborator Session struct at hawk/internal/engine/session.go:42-141 into 6 cohesive sub-services:

ChatService (150 LOC) — LLM transport, retry, circuit-breaker
MemoryService (200 LOC) — yaad bridge, recall, sleeptime, skill distill
ToolService (400 LOC) — registry, 15-stage post-call pipeline
PermissionService (150 LOC) — guardian, approval, budget tracking
LifecycleService (250 LOC) — self-improvement, beliefs, doom-loop
PersistenceService (300 LOC) — checkpoint, compaction, branching DAG

Why

Session is a god object. agentLoop is an 833-line function that reads ~25 fields. New contributors can't tell which field is required vs optional vs diagnostic, and unit tests can't construct a Session without wiring all 30 fields. Every new feature adds another field.

Status

Design proposal only. No code changes. The 4 concrete fixes in the companion PR fix/agent-loop-safety-and-tool-classification ship ahead of this refactor and are independent of it.

The proposed 7-PR migration sequence (Phases 2-7) is laid out in the doc. Each phase is independent and additive; each gets its own focused test suite. Estimated ~5 days of focused refactor work + tests.

Files

docs/session-decomposition.md (new, 256 lines) — full design with proposed sub-service file paths, method signatures, the slimmed-down Session shape, the 250-line refactored agentLoop, testing strategy, open questions, and effort estimate.

Seeking

Comments on the decomposition boundary, the proposed sub-service responsibilities, and whether anyone wants to start Phase 2 (ChatService extraction) before or after other work.

ai_passage.md was a 53-line, ~1000-word essay on the history and ethics of AI in general — entirely unrelated to the hawk project, no README/AGENTS.md/CHANGELOG.md reference to it. It looks like LLM-generated filler committed in '99261ca Fix CI formatting and toolchain hygiene' to satisfy a 'must have an essay' requirement that no longer applies. Untrack and delete.

…revert failure Three related fixes in the hawk agent loop: 1. **stream.go Stream retry no longer blocks ctx cancellation.** Previously: `time.Sleep(time.Duration(streamAttempt+1) * time.Second)` — a user-initiated cancel during a reasoning-only retry's backoff was ignored for up to 3s. Now uses a select with ctx.Done() so cancel is observed immediately, the in-flight stream is closed, and the loop exits with a structured error event. 2. **stream_tool_exec.go Self-review-before-apply now returns a hard error when the revert itself fails.** Previously: if the LLM said 'this diff is bad' and we tried to revert (os.Remove or os.WriteFile) and that syscall failed, the code only logged s.log.Warn and proceeded with the rejected diff still on disk. Now: capture the revert error, log it at Error level, and surface a hard tool error ('Self-review rejected the change AND the revert failed: <err>. Manual intervention required.') so the LLM can't continue building on top of code it just flagged as broken. 3. **internal/tool/tool.go now exports ReadOnlyTools + IsReadOnly() so the safe-concurrent allowlist has a single source of truth.** The map was previously duplicated in stream.go:716 and stream_tool_exec.go:29 with identical content — drift waiting to happen. Both call sites now go through tool.IsReadOnly(name), which canonicalises aliases (read/file_read, ls, etc.) before lookup. A new test (TestIsReadOnly, TestReadOnlyToolsSetContainsExpectedNames) locks the contract so a future removal of e.g. 'Read' from the allowlist fails CI in internal/tool/ rather than silently breaking concurrency classification in internal/engine/.

hawk/internal/engine/session.go has 35 collaborators, 30 of them optional *T pointer fields. The agentLoop is an 833-line function that reads ~25 of these fields, creating a hidden web of dependencies. New contributors can't tell which field is required vs optional vs diagnostic, and unit tests can't construct a Session without wiring all 30 fields. This is a design proposal (NOT yet implemented) for breaking Session into 6 cohesive sub-services: - ChatService (150 LOC) — LLM transport, retry, circuit-breaker - MemoryService (200 LOC) — yaad bridge, recall, sleeptime, skill distill - ToolService (400 LOC) — registry, 15-stage post-call pipeline - PermissionService (150 LOC) — guardian, approval, budget tracking - LifecycleService (250 LOC) — self-improvement, beliefs, doom-loop - PersistenceService (300 LOC) — checkpoint, compaction, branching DAG The proposed 7-PR migration sequence is documented in the doc. Phases 2-7 are each independent and additive; each gets its own focused test suite. The 4 concrete fixes in the current PR (time.Sleep, ReadOnlyTools, AGENTS.md, self-review revert) are independent of this refactor and ship ahead of it. See: docs/session-decomposition.md (this file)

Bump external/eyrie to dce29ff (#33 squash) and external/trace to 6feded7 so all submodule checkouts match their origin/main heads. Co-authored-by: Cursor <cursoragent@cursor.com>

* docs: remove generic LLM boilerplate ai_passage.md ai_passage.md was a 53-line, ~1000-word essay on the history and ethics of AI in general — entirely unrelated to the hawk project, no README/AGENTS.md/CHANGELOG.md reference to it. It looks like LLM-generated filler committed in '99261ca Fix CI formatting and toolchain hygiene' to satisfy a 'must have an essay' requirement that no longer applies. Untrack and delete. * fix(engine): respect ctx cancel on stream retry, surface self-review revert failure Three related fixes in the hawk agent loop: 1. **stream.go Stream retry no longer blocks ctx cancellation.** Previously: `time.Sleep(time.Duration(streamAttempt+1) * time.Second)` — a user-initiated cancel during a reasoning-only retry's backoff was ignored for up to 3s. Now uses a select with ctx.Done() so cancel is observed immediately, the in-flight stream is closed, and the loop exits with a structured error event. 2. **stream_tool_exec.go Self-review-before-apply now returns a hard error when the revert itself fails.** Previously: if the LLM said 'this diff is bad' and we tried to revert (os.Remove or os.WriteFile) and that syscall failed, the code only logged s.log.Warn and proceeded with the rejected diff still on disk. Now: capture the revert error, log it at Error level, and surface a hard tool error ('Self-review rejected the change AND the revert failed: <err>. Manual intervention required.') so the LLM can't continue building on top of code it just flagged as broken. 3. **internal/tool/tool.go now exports ReadOnlyTools + IsReadOnly() so the safe-concurrent allowlist has a single source of truth.** The map was previously duplicated in stream.go:716 and stream_tool_exec.go:29 with identical content — drift waiting to happen. Both call sites now go through tool.IsReadOnly(name), which canonicalises aliases (read/file_read, ls, etc.) before lookup. A new test (TestIsReadOnly, TestReadOnlyToolsSetContainsExpectedNames) locks the contract so a future removal of e.g. 'Read' from the allowlist fails CI in internal/tool/ rather than silently breaking concurrency classification in internal/engine/. * docs: add Session god-object decomposition design sketch hawk/internal/engine/session.go has 35 collaborators, 30 of them optional *T pointer fields. The agentLoop is an 833-line function that reads ~25 of these fields, creating a hidden web of dependencies. New contributors can't tell which field is required vs optional vs diagnostic, and unit tests can't construct a Session without wiring all 30 fields. This is a design proposal (NOT yet implemented) for breaking Session into 6 cohesive sub-services: - ChatService (150 LOC) — LLM transport, retry, circuit-breaker - MemoryService (200 LOC) — yaad bridge, recall, sleeptime, skill distill - ToolService (400 LOC) — registry, 15-stage post-call pipeline - PermissionService (150 LOC) — guardian, approval, budget tracking - LifecycleService (250 LOC) — self-improvement, beliefs, doom-loop - PersistenceService (300 LOC) — checkpoint, compaction, branching DAG The proposed 7-PR migration sequence is documented in the doc. Phases 2-7 are each independent and additive; each gets its own focused test suite. The 4 concrete fixes in the current PR (time.Sleep, ReadOnlyTools, AGENTS.md, self-review revert) are independent of this refactor and ship ahead of it. See: docs/session-decomposition.md (this file)

Bump external/eyrie to dce29ff (#33 squash) and external/trace to 6feded7 so all submodule checkouts match their origin/main heads.

* docs: remove generic LLM boilerplate ai_passage.md ai_passage.md was a 53-line, ~1000-word essay on the history and ethics of AI in general — entirely unrelated to the hawk project, no README/AGENTS.md/CHANGELOG.md reference to it. It looks like LLM-generated filler committed in '99261ca Fix CI formatting and toolchain hygiene' to satisfy a 'must have an essay' requirement that no longer applies. Untrack and delete. * fix(engine): respect ctx cancel on stream retry, surface self-review revert failure Three related fixes in the hawk agent loop: 1. **stream.go Stream retry no longer blocks ctx cancellation.** Previously: `time.Sleep(time.Duration(streamAttempt+1) * time.Second)` — a user-initiated cancel during a reasoning-only retry's backoff was ignored for up to 3s. Now uses a select with ctx.Done() so cancel is observed immediately, the in-flight stream is closed, and the loop exits with a structured error event. 2. **stream_tool_exec.go Self-review-before-apply now returns a hard error when the revert itself fails.** Previously: if the LLM said 'this diff is bad' and we tried to revert (os.Remove or os.WriteFile) and that syscall failed, the code only logged s.log.Warn and proceeded with the rejected diff still on disk. Now: capture the revert error, log it at Error level, and surface a hard tool error ('Self-review rejected the change AND the revert failed: <err>. Manual intervention required.') so the LLM can't continue building on top of code it just flagged as broken. 3. **internal/tool/tool.go now exports ReadOnlyTools + IsReadOnly() so the safe-concurrent allowlist has a single source of truth.** The map was previously duplicated in stream.go:716 and stream_tool_exec.go:29 with identical content — drift waiting to happen. Both call sites now go through tool.IsReadOnly(name), which canonicalises aliases (read/file_read, ls, etc.) before lookup. A new test (TestIsReadOnly, TestReadOnlyToolsSetContainsExpectedNames) locks the contract so a future removal of e.g. 'Read' from the allowlist fails CI in internal/tool/ rather than silently breaking concurrency classification in internal/engine/. * docs: add Session god-object decomposition design sketch hawk/internal/engine/session.go has 35 collaborators, 30 of them optional *T pointer fields. The agentLoop is an 833-line function that reads ~25 of these fields, creating a hidden web of dependencies. New contributors can't tell which field is required vs optional vs diagnostic, and unit tests can't construct a Session without wiring all 30 fields. This is a design proposal (NOT yet implemented) for breaking Session into 6 cohesive sub-services: - ChatService (150 LOC) — LLM transport, retry, circuit-breaker - MemoryService (200 LOC) — yaad bridge, recall, sleeptime, skill distill - ToolService (400 LOC) — registry, 15-stage post-call pipeline - PermissionService (150 LOC) — guardian, approval, budget tracking - LifecycleService (250 LOC) — self-improvement, beliefs, doom-loop - PersistenceService (300 LOC) — checkpoint, compaction, branching DAG The proposed 7-PR migration sequence is documented in the doc. Phases 2-7 are each independent and additive; each gets its own focused test suite. The 4 concrete fixes in the current PR (time.Sleep, ReadOnlyTools, AGENTS.md, self-review revert) are independent of this refactor and ship ahead of it. See: docs/session-decomposition.md (this file)

Bump external/eyrie to dce29ff (#33 squash) and external/trace to 6feded7 so all submodule checkouts match their origin/main heads.

Patel230 added 4 commits June 12, 2026 17:57

Merge branch 'main' of github.com:GrayCodeAI/hawk

44e79e3

Patel230 merged commit db7bb66 into main Jun 12, 2026
18 checks passed

Patel230 deleted the docs/session-decomposition-sketch branch June 12, 2026 20:14

Patel230 mentioned this pull request Jun 13, 2026

chore(submodules): align eyrie and trace to latest main #45

Merged

2 tasks

Patel230 added a commit that referenced this pull request Jun 15, 2026

chore(submodules): align eyrie and trace to latest main (#45)

7ec4b7e

Bump external/eyrie to dce29ff (#33 squash) and external/trace to 6feded7 so all submodule checkouts match their origin/main heads.

Patel230 added a commit that referenced this pull request Jun 18, 2026

chore(submodules): align eyrie and trace to latest main (#45)

22d7831

Bump external/eyrie to dce29ff (#33 squash) and external/trace to 6feded7 so all submodule checkouts match their origin/main heads.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

docs: add Session god-object decomposition design sketch#33

docs: add Session god-object decomposition design sketch#33
Patel230 merged 4 commits into
mainfrom
docs/session-decomposition-sketch

Patel230 commented Jun 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Patel230 commented Jun 12, 2026

What

Why

Status

Files

Seeking

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant