Skip to content

docs: add Session god-object decomposition design sketch#33

Merged
Patel230 merged 4 commits into
mainfrom
docs/session-decomposition-sketch
Jun 12, 2026
Merged

docs: add Session god-object decomposition design sketch#33
Patel230 merged 4 commits into
mainfrom
docs/session-decomposition-sketch

Conversation

@Patel230

Copy link
Copy Markdown
Contributor

What

A design proposal (NOT yet implemented) for breaking the 35-collaborator Session struct at hawk/internal/engine/session.go:42-141 into 6 cohesive sub-services:

  • ChatService (150 LOC) — LLM transport, retry, circuit-breaker
  • MemoryService (200 LOC) — yaad bridge, recall, sleeptime, skill distill
  • ToolService (400 LOC) — registry, 15-stage post-call pipeline
  • PermissionService (150 LOC) — guardian, approval, budget tracking
  • LifecycleService (250 LOC) — self-improvement, beliefs, doom-loop
  • PersistenceService (300 LOC) — checkpoint, compaction, branching DAG

Why

Session is a god object. agentLoop is an 833-line function that reads ~25 fields. New contributors can't tell which field is required vs optional vs diagnostic, and unit tests can't construct a Session without wiring all 30 fields. Every new feature adds another field.

Status

Design proposal only. No code changes. The 4 concrete fixes in the companion PR fix/agent-loop-safety-and-tool-classification ship ahead of this refactor and are independent of it.

The proposed 7-PR migration sequence (Phases 2-7) is laid out in the doc. Each phase is independent and additive; each gets its own focused test suite. Estimated ~5 days of focused refactor work + tests.

Files

  • docs/session-decomposition.md (new, 256 lines) — full design with proposed sub-service file paths, method signatures, the slimmed-down Session shape, the 250-line refactored agentLoop, testing strategy, open questions, and effort estimate.

Seeking

Comments on the decomposition boundary, the proposed sub-service responsibilities, and whether anyone wants to start Phase 2 (ChatService extraction) before or after other work.

Patel230 added 4 commits June 12, 2026 17:57
ai_passage.md was a 53-line, ~1000-word essay on the history and
ethics of AI in general — entirely unrelated to the hawk project,
no README/AGENTS.md/CHANGELOG.md reference to it. It looks like
LLM-generated filler committed in '99261ca Fix CI formatting and
toolchain hygiene' to satisfy a 'must have an essay' requirement
that no longer applies. Untrack and delete.
…revert failure

Three related fixes in the hawk agent loop:

1. **stream.go Stream retry no longer blocks ctx cancellation.**
   Previously: `time.Sleep(time.Duration(streamAttempt+1) * time.Second)` — a
   user-initiated cancel during a reasoning-only retry's backoff was ignored for
   up to 3s. Now uses a select with ctx.Done() so cancel is observed immediately,
   the in-flight stream is closed, and the loop exits with a structured error event.

2. **stream_tool_exec.go Self-review-before-apply now returns a hard error
   when the revert itself fails.** Previously: if the LLM said 'this diff is bad'
   and we tried to revert (os.Remove or os.WriteFile) and that syscall failed, the
   code only logged s.log.Warn and proceeded with the rejected diff still on disk.
   Now: capture the revert error, log it at Error level, and surface a hard tool
   error ('Self-review rejected the change AND the revert failed: <err>. Manual
   intervention required.') so the LLM can't continue building on top of code it
   just flagged as broken.

3. **internal/tool/tool.go now exports ReadOnlyTools + IsReadOnly() so the
   safe-concurrent allowlist has a single source of truth.** The map was previously
   duplicated in stream.go:716 and stream_tool_exec.go:29 with identical content
   — drift waiting to happen. Both call sites now go through tool.IsReadOnly(name),
   which canonicalises aliases (read/file_read, ls, etc.) before lookup. A new
   test (TestIsReadOnly, TestReadOnlyToolsSetContainsExpectedNames) locks the
   contract so a future removal of e.g. 'Read' from the allowlist fails CI in
   internal/tool/ rather than silently breaking concurrency classification in
   internal/engine/.
hawk/internal/engine/session.go has 35 collaborators, 30 of them optional
*T pointer fields. The agentLoop is an 833-line function that reads ~25 of
these fields, creating a hidden web of dependencies. New contributors can't
tell which field is required vs optional vs diagnostic, and unit tests can't
construct a Session without wiring all 30 fields.

This is a design proposal (NOT yet implemented) for breaking Session into 6
cohesive sub-services:

  - ChatService        (150 LOC) — LLM transport, retry, circuit-breaker
  - MemoryService      (200 LOC) — yaad bridge, recall, sleeptime, skill distill
  - ToolService        (400 LOC) — registry, 15-stage post-call pipeline
  - PermissionService  (150 LOC) — guardian, approval, budget tracking
  - LifecycleService   (250 LOC) — self-improvement, beliefs, doom-loop
  - PersistenceService (300 LOC) — checkpoint, compaction, branching DAG

The proposed 7-PR migration sequence is documented in the doc. Phases 2-7
are each independent and additive; each gets its own focused test suite.
The 4 concrete fixes in the current PR (time.Sleep, ReadOnlyTools, AGENTS.md,
self-review revert) are independent of this refactor and ship ahead of it.

See: docs/session-decomposition.md (this file)
@Patel230 Patel230 merged commit db7bb66 into main Jun 12, 2026
18 checks passed
@Patel230 Patel230 deleted the docs/session-decomposition-sketch branch June 12, 2026 20:14
Patel230 added a commit that referenced this pull request Jun 13, 2026
Bump external/eyrie to dce29ff (#33 squash) and external/trace to 6feded7
so all submodule checkouts match their origin/main heads.

Co-authored-by: Cursor <cursoragent@cursor.com>
Patel230 added a commit that referenced this pull request Jun 15, 2026
* docs: remove generic LLM boilerplate ai_passage.md

ai_passage.md was a 53-line, ~1000-word essay on the history and
ethics of AI in general — entirely unrelated to the hawk project,
no README/AGENTS.md/CHANGELOG.md reference to it. It looks like
LLM-generated filler committed in '99261ca Fix CI formatting and
toolchain hygiene' to satisfy a 'must have an essay' requirement
that no longer applies. Untrack and delete.

* fix(engine): respect ctx cancel on stream retry, surface self-review revert failure

Three related fixes in the hawk agent loop:

1. **stream.go Stream retry no longer blocks ctx cancellation.**
   Previously: `time.Sleep(time.Duration(streamAttempt+1) * time.Second)` — a
   user-initiated cancel during a reasoning-only retry's backoff was ignored for
   up to 3s. Now uses a select with ctx.Done() so cancel is observed immediately,
   the in-flight stream is closed, and the loop exits with a structured error event.

2. **stream_tool_exec.go Self-review-before-apply now returns a hard error
   when the revert itself fails.** Previously: if the LLM said 'this diff is bad'
   and we tried to revert (os.Remove or os.WriteFile) and that syscall failed, the
   code only logged s.log.Warn and proceeded with the rejected diff still on disk.
   Now: capture the revert error, log it at Error level, and surface a hard tool
   error ('Self-review rejected the change AND the revert failed: <err>. Manual
   intervention required.') so the LLM can't continue building on top of code it
   just flagged as broken.

3. **internal/tool/tool.go now exports ReadOnlyTools + IsReadOnly() so the
   safe-concurrent allowlist has a single source of truth.** The map was previously
   duplicated in stream.go:716 and stream_tool_exec.go:29 with identical content
   — drift waiting to happen. Both call sites now go through tool.IsReadOnly(name),
   which canonicalises aliases (read/file_read, ls, etc.) before lookup. A new
   test (TestIsReadOnly, TestReadOnlyToolsSetContainsExpectedNames) locks the
   contract so a future removal of e.g. 'Read' from the allowlist fails CI in
   internal/tool/ rather than silently breaking concurrency classification in
   internal/engine/.

* docs: add Session god-object decomposition design sketch

hawk/internal/engine/session.go has 35 collaborators, 30 of them optional
*T pointer fields. The agentLoop is an 833-line function that reads ~25 of
these fields, creating a hidden web of dependencies. New contributors can't
tell which field is required vs optional vs diagnostic, and unit tests can't
construct a Session without wiring all 30 fields.

This is a design proposal (NOT yet implemented) for breaking Session into 6
cohesive sub-services:

  - ChatService        (150 LOC) — LLM transport, retry, circuit-breaker
  - MemoryService      (200 LOC) — yaad bridge, recall, sleeptime, skill distill
  - ToolService        (400 LOC) — registry, 15-stage post-call pipeline
  - PermissionService  (150 LOC) — guardian, approval, budget tracking
  - LifecycleService   (250 LOC) — self-improvement, beliefs, doom-loop
  - PersistenceService (300 LOC) — checkpoint, compaction, branching DAG

The proposed 7-PR migration sequence is documented in the doc. Phases 2-7
are each independent and additive; each gets its own focused test suite.
The 4 concrete fixes in the current PR (time.Sleep, ReadOnlyTools, AGENTS.md,
self-review revert) are independent of this refactor and ship ahead of it.

See: docs/session-decomposition.md (this file)
Patel230 added a commit that referenced this pull request Jun 15, 2026
Bump external/eyrie to dce29ff (#33 squash) and external/trace to 6feded7
so all submodule checkouts match their origin/main heads.
Patel230 added a commit that referenced this pull request Jun 18, 2026
* docs: remove generic LLM boilerplate ai_passage.md

ai_passage.md was a 53-line, ~1000-word essay on the history and
ethics of AI in general — entirely unrelated to the hawk project,
no README/AGENTS.md/CHANGELOG.md reference to it. It looks like
LLM-generated filler committed in '99261ca Fix CI formatting and
toolchain hygiene' to satisfy a 'must have an essay' requirement
that no longer applies. Untrack and delete.

* fix(engine): respect ctx cancel on stream retry, surface self-review revert failure

Three related fixes in the hawk agent loop:

1. **stream.go Stream retry no longer blocks ctx cancellation.**
   Previously: `time.Sleep(time.Duration(streamAttempt+1) * time.Second)` — a
   user-initiated cancel during a reasoning-only retry's backoff was ignored for
   up to 3s. Now uses a select with ctx.Done() so cancel is observed immediately,
   the in-flight stream is closed, and the loop exits with a structured error event.

2. **stream_tool_exec.go Self-review-before-apply now returns a hard error
   when the revert itself fails.** Previously: if the LLM said 'this diff is bad'
   and we tried to revert (os.Remove or os.WriteFile) and that syscall failed, the
   code only logged s.log.Warn and proceeded with the rejected diff still on disk.
   Now: capture the revert error, log it at Error level, and surface a hard tool
   error ('Self-review rejected the change AND the revert failed: <err>. Manual
   intervention required.') so the LLM can't continue building on top of code it
   just flagged as broken.

3. **internal/tool/tool.go now exports ReadOnlyTools + IsReadOnly() so the
   safe-concurrent allowlist has a single source of truth.** The map was previously
   duplicated in stream.go:716 and stream_tool_exec.go:29 with identical content
   — drift waiting to happen. Both call sites now go through tool.IsReadOnly(name),
   which canonicalises aliases (read/file_read, ls, etc.) before lookup. A new
   test (TestIsReadOnly, TestReadOnlyToolsSetContainsExpectedNames) locks the
   contract so a future removal of e.g. 'Read' from the allowlist fails CI in
   internal/tool/ rather than silently breaking concurrency classification in
   internal/engine/.

* docs: add Session god-object decomposition design sketch

hawk/internal/engine/session.go has 35 collaborators, 30 of them optional
*T pointer fields. The agentLoop is an 833-line function that reads ~25 of
these fields, creating a hidden web of dependencies. New contributors can't
tell which field is required vs optional vs diagnostic, and unit tests can't
construct a Session without wiring all 30 fields.

This is a design proposal (NOT yet implemented) for breaking Session into 6
cohesive sub-services:

  - ChatService        (150 LOC) — LLM transport, retry, circuit-breaker
  - MemoryService      (200 LOC) — yaad bridge, recall, sleeptime, skill distill
  - ToolService        (400 LOC) — registry, 15-stage post-call pipeline
  - PermissionService  (150 LOC) — guardian, approval, budget tracking
  - LifecycleService   (250 LOC) — self-improvement, beliefs, doom-loop
  - PersistenceService (300 LOC) — checkpoint, compaction, branching DAG

The proposed 7-PR migration sequence is documented in the doc. Phases 2-7
are each independent and additive; each gets its own focused test suite.
The 4 concrete fixes in the current PR (time.Sleep, ReadOnlyTools, AGENTS.md,
self-review revert) are independent of this refactor and ship ahead of it.

See: docs/session-decomposition.md (this file)
Patel230 added a commit that referenced this pull request Jun 18, 2026
Bump external/eyrie to dce29ff (#33 squash) and external/trace to 6feded7
so all submodule checkouts match their origin/main heads.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant