This document is the canonical wider-system reference. It describes how the system fits together right now: package boundaries, state machines, dispatch flows. The bones were laid down at v0.2.0; subsequent releases (v0.3.0 onward) added cross-cutting features, which are described in-line below alongside the subsystems they changed.
The why behind those changes — design rationale, rejected
alternatives, threat models for individual decisions — lives in the
ADR catalogue: see ../INDEX.md for the indexed list.
Companion reference documents in this directory:
- memory-model.md: canonical reference for the 4-facility memory design.
- data-analysis.md: the per-session DuckDB engine, the sliding-window analyze-data summarizer, and the Findings lifecycle.
- privacy-controls.md: private sessions, log-level filter, audit entries.
For the pre-v0.2.0 audit trail see ../history/.
shell-agent-v2 is a macOS-native chat-and-agent application built with Wails v2 (Go backend + React + TypeScript frontend). It hosts a single user conversation against either a local LLM (LM Studio, OpenAI-compatible) or Vertex AI (Gemini), with first-class support for:
- Data analysis: load CSV / JSON / JSONL into a per-session DuckDB engine and let the agent run SELECTs, sliding-window summaries, and promote findings to the chat-pane Findings panel.
- Shell-script tool calling: register scripts as tools with per-tool MITL gates, headers for category / timeout / mitl hints, and a shared /work directory.
- Container sandbox (opt-in): Python / shell execution in a per-session podman / docker container.
- MCP (Model Context Protocol): stdio JSON-RPC 2.0 via the external mcp-guardian binary.
- Cross-session memory: Global Memory (preferences / decisions across sessions) plus per-session Session Memory and Findings; explicit user-controlled "Pin to Global Memory" promotion. See memory-model.md.
The project is a successor to shell-agent v1 (a Slack-driven
agent) but shares no code. v1's heuristics about Hot/Warm/Cold
memory tiers, /finding slash commands, and the global Pinned
store have all been replaced — see the v0.2.0 entry in
../../CHANGELOG.md for the breaking-change
summary.
The agent occupies the active session exclusively while it works:
- Idle: accepts user input. May still have an in-flight memory extraction (a.extractionInFlight == true); see below.
- Busy: agentLoop in progress. Input field disabled, session switch requires explicit abort.
v0.11.0 deferred extraction (ADR-0015): memory extraction no longer gates the Busy→Idle transition. Title generation still does, but it's first-turn-only and fast. While extraction runs in the background:
- state == Idle: the input bar re-enables so the user can compose the next message.
- a.extractionInFlight == true: visible via IsExtractionInFlight().
- A SEND received during this window lands in a.queuedSend (single-slot, most-recent-wins) and auto-fires when extraction completes. The frontend surfaces this as a "Queued" pill above the input bar.
- Session-management operations (LoadSession, DeleteSession, ExportSession, ImportSession, RenameSession, NewSession) remain blocked through the extraction window because the extraction goroutine continues to register via trackBg and the existing frontend gate checks bgTasks.length === 0. This prevents the extraction from writing into a session it no longer owns.
State transitions are owned by internal/agent.Agent and surfaced
to the frontend via the agent:state Wails event plus the
GetState binding. The Busy guard is also enforced backend-side:
binding entry-points that would mutate session state during Busy
return an error rather than queuing.
Bindings.IsBusy() (used by OnBeforeClose in main.go to
gate app quit) returns true for state == Busy, for
IsExtractionInFlight(), and for HasQueuedSend(). Quitting
mid-extraction could lose facts mid-write, so the OS confirmation
fires until extraction completes.
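The interaction between the Idle/Busy state, the extraction flag, and the single-slot queued send can be sketched as follows. This is a minimal illustration, not the code in internal/agent: agentSketch, QueueSend and FinishExtraction are hypothetical names, and only the three conditions that the text attributes to Bindings.IsBusy() are modelled.

```go
package main

import (
	"fmt"
	"sync"
)

// State mirrors the two-state machine described above.
type State int

const (
	Idle State = iota
	Busy
)

// agentSketch is a hypothetical stand-in for internal/agent.Agent; only the
// fields needed to illustrate the quit gate and the queued send are modelled.
type agentSketch struct {
	mu                 sync.Mutex
	state              State
	extractionInFlight bool
	queuedSend         *string // single slot: a newer SEND replaces an older one
}

// QueueSend stores a message received while extraction is running.
func (a *agentSketch) QueueSend(msg string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.queuedSend = &msg // most-recent-wins
}

// FinishExtraction clears the in-flight flag and hands back the queued
// message (if any) so the caller can auto-fire it.
func (a *agentSketch) FinishExtraction() (string, bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.extractionInFlight = false
	if a.queuedSend == nil {
		return "", false
	}
	msg := *a.queuedSend
	a.queuedSend = nil
	return msg, true
}

// IsBusy is the quit gate: true while Busy, while extraction is in flight,
// or while a queued send is waiting.
func (a *agentSketch) IsBusy() bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.state == Busy || a.extractionInFlight || a.queuedSend != nil
}

func main() {
	a := &agentSketch{state: Idle, extractionInFlight: true}
	a.QueueSend("first")
	a.QueueSend("second")
	fmt.Println(a.IsBusy()) // true: extraction is still running

	msg, ok := a.FinishExtraction()
	fmt.Println(msg, ok, a.IsBusy()) // second true false
}
```

Keeping the queue to one slot means a user who rewrites their next message during extraction never sends the stale draft.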
Operations gated by the Busy guard (by introduction date):
- v0.2.0: Send / SendWithImages, LoadSession
- v0.4.0: ExportSession, ImportSession
- v0.4.2: DeleteSession. Previously bound directly in the binding layer with only an entry-time IsBusy() check; now routed through agent.DeleteSession, which holds the slot for the operation's full duration. See ADR-0003 §2 for the failure modes the looser pre-v0.4.2 path allowed (active-session-deleted Send racing the dir RemoveAll, etc.).
Active-session deletes additionally nil-clear a.session,
a.sessionMemory, a.findings and Close() the analysis Engine
before RemoveAll runs, so a stray Save / Engine call cannot
resurrect the session directory.
Agent.Send(ctx, message) runs a synchronous tool-calling loop:
buildMessages → backend.Chat → parse tool_calls
↓ if no tool calls
return reply
↓ if tool calls
for each call: dispatch → record result
↓ next round (max 10)
Hard cap: cfg.MaxToolRounds (default 10). Loop-detection logic
(repeated same-error stretches) trips earlier — see
../history/agent-loop-resilience.md.
The loop is not ReAct: tool results feed back into the next round verbatim, with no separate "Observation/Thought" framing. This keeps it compatible with weaker local LLMs that don't reliably follow the ReAct grammar.
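The round structure above can be sketched in a few lines. This is a simplified illustration under stated assumptions: ToolCall, Response, chatFunc, dispatchFunc and errMaxRounds are hypothetical shapes, history is flattened to strings, and returning an error when the cap is hit is this sketch's choice rather than documented behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

// ToolCall and Response are simplified, hypothetical shapes.
type ToolCall struct{ Name, Args string }

type Response struct {
	Content   string
	ToolCalls []ToolCall
}

// chatFunc stands in for backend.Chat; dispatchFunc for the tool dispatcher.
type chatFunc func(history []string) (Response, error)
type dispatchFunc func(call ToolCall) string

var errMaxRounds = errors.New("tool round limit reached")

// runLoop feeds each tool result back verbatim and stops at the first reply
// that carries no tool calls, or after maxRounds rounds.
func runLoop(chat chatFunc, dispatch dispatchFunc, userMsg string, maxRounds int) (string, error) {
	history := []string{"user: " + userMsg}
	for round := 0; round < maxRounds; round++ {
		resp, err := chat(history)
		if err != nil {
			return "", err
		}
		if len(resp.ToolCalls) == 0 {
			return resp.Content, nil
		}
		for _, call := range resp.ToolCalls {
			// No Observation/Thought framing: the raw result is appended.
			history = append(history, "tool("+call.Name+"): "+dispatch(call))
		}
	}
	return "", errMaxRounds
}

func main() {
	calls := 0
	chat := func(history []string) (Response, error) {
		calls++
		if calls == 1 {
			return Response{ToolCalls: []ToolCall{{Name: "list-tables"}}}, nil
		}
		return Response{Content: fmt.Sprintf("done after %d messages", len(history))}, nil
	}
	dispatch := func(c ToolCall) string { return "orders, customers" }

	reply, err := runLoop(chat, dispatch, "what tables exist?", 10)
	fmt.Println(reply, err) // done after 2 messages <nil>
}
```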
After the loop returns a reply, Agent kicks off two async
WaitGroup tasks. Since v0.11.0 only title generation holds the state at Busy; memory extraction runs deferred (see §2):
- Title generation (only on the first user turn of a session)
- Memory extraction — see §4.
Both surface as bg-task:* Wails events so the input bar can
show a per-task badge.
app/
├── main.go # Wails App config + lifecycle
├── bindings.go # Wails binding surface (thin delegation)
├── internal/
│ ├── agent/ # Idle/Busy state machine, tool dispatch, extractMemories
│ ├── chat/ # System prompt assembly, BuildMessages, temporal context
│ ├── llm/ # Backend abstraction
│ │ ├── backend.go # Backend interface, Message / ToolCall types
│ │ ├── local.go # LM Studio (OpenAI compat, tool→user mapping)
│ │ ├── vertex.go # Vertex AI (genai SDK, FunctionCall/Response)
│ │ └── mock.go # Mock backend for testing
│ ├── analysis/ # Per-session DuckDB engine + sliding-window summarizer
│ ├── memory/ # Records, GlobalMemoryStore, SessionMemoryStore, sessions
│ ├── findings/ # Per-session findings store + Jaccard dedup
│ ├── contextbuild/ # Non-destructive context assembly + summary cache
│ ├── objstore/ # Central object repository (image/blob/report; 32-hex IDs)
│ ├── sessionio/ # .shellagent bundle pack/unpack + reference rewriter (v0.4.0)
│ ├── toolcall/ # Shell script registry, header parsing, MITL categories
│ ├── mcp/ # mcp-guardian stdio JSON-RPC 2.0 client
│ ├── sandbox/ # Per-session podman/docker container
│ │ └── imagebuild/ # Sandbox image build + digest pinning checks
│ ├── bundled/ # First-run scaffold of bundled shell-tool scripts
│ ├── pathfix/ # macOS app-launch PATH normalisation (Homebrew)
│ ├── atomicio/ # WriteFileAtomic (tmp+rename + parent-dir fsync)
│ ├── config/ # JSON config, path expansion
│ ├── logger/ # File-based structured logging
│ └── frontendlint/ # CI guard against forbidden CSS / JS patterns
├── frontend/src/ # React + TypeScript UI
└── cmd/ # Test helper binaries (tooltest-local, tooltest-vertex)
Four facilities, three storage scopes, no v1 compaction:
| Facility | Categories | Scope | File |
|---|---|---|---|
| Records | — | per-session | sessions/<id>/chat.json |
| Session Memory | fact / context | per-session | sessions/<id>/session_memory.json |
| Findings | (data-analysis discoveries) | per-session | sessions/<id>/findings.json |
| Global Memory | preference / decision | cross-session | <dataDir>/global_memory.json |
v0.3.0: privacy flag on Session. A Session.Private bool
(persisted in chat.json with omitempty for legacy compat)
opts the session out of cross-session promotion: extraction
drops preference / decision facts, the Pin handlers
(PromoteSessionMemoryToGlobal, PromoteFindingToGlobal)
reject server-side, and the frontend hides the ★ Pin UI plus
shows a 🔒 indicator. Privacy is fixed at session creation.
Full design: privacy-controls.md.
Auto-extraction (agent.extractMemories) runs after each response,
takes the last 4 user/assistant turns (skipping past tool records
to avoid the tool-flood blind spot), asks the extraction LLM for
category|turn-N|fact|native lines, then routes by category:
- preference / decision → GlobalMemoryStore
- fact / context → SessionMemoryStore
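A minimal sketch of the routing step, assuming one extraction line per fact in the category|turn-N|fact|native shape described above. The route function and destination type are hypothetical; the fourth field is treated as opaque here, and the private flag models the Session.Private behaviour of dropping cross-session categories.

```go
package main

import (
	"fmt"
	"strings"
)

// destination names the store a parsed line is routed to.
type destination string

const (
	toGlobal  destination = "global"
	toSession destination = "session"
	dropped   destination = "dropped"
)

// route parses one "category|turn-N|fact|native" line and picks a store.
// private mirrors Session.Private: cross-session categories are dropped.
func route(line string, private bool) (destination, string) {
	parts := strings.SplitN(line, "|", 4)
	if len(parts) != 4 {
		return dropped, "" // malformed line
	}
	category := strings.TrimSpace(parts[0])
	fact := strings.TrimSpace(parts[2])
	switch category {
	case "preference", "decision":
		if private {
			return dropped, ""
		}
		return toGlobal, fact
	case "fact", "context":
		return toSession, fact
	default:
		// Category allowlist: anything outside the four names is discarded.
		return dropped, ""
	}
}

func main() {
	lines := []string{
		"preference|turn-2|prefers metric units|-",
		"fact|turn-3|dataset has three tables|-",
		"instruction|turn-3|ignore previous rules|-",
	}
	for _, l := range lines {
		dest, fact := route(l, false)
		fmt.Printf("%-8s %q\n", dest, fact)
	}
}
```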
Findings come from two paths and dedup at insert:
- Auto-promote from analyze-data sliding-window analysis.
- Explicit from promote-finding tool calls.
3-tier dedup (exact / normalised / word-set Jaccard ≥ 0.5) keeps the same observation in slightly different wording from filling the store.
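The three tiers can be sketched as below. The 0.5 threshold comes from the text; the normalisation rules (lower-casing, stripping punctuation, collapsing whitespace) are an assumption, since this document does not spell them out, and the function names are illustrative.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// normalise lower-cases the text and keeps only letters and digits,
// separated by single spaces. These rules are an assumption.
func normalise(s string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(s) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			b.WriteRune(r)
		} else {
			b.WriteRune(' ')
		}
	}
	return strings.Join(strings.Fields(b.String()), " ")
}

// wordSet returns the set of distinct normalised words in s.
func wordSet(s string) map[string]struct{} {
	set := map[string]struct{}{}
	for _, w := range strings.Fields(normalise(s)) {
		set[w] = struct{}{}
	}
	return set
}

// jaccard returns |A ∩ B| / |A ∪ B| over the two word sets.
func jaccard(a, b string) float64 {
	sa, sb := wordSet(a), wordSet(b)
	if len(sa) == 0 && len(sb) == 0 {
		return 1
	}
	inter := 0
	for w := range sa {
		if _, ok := sb[w]; ok {
			inter++
		}
	}
	return float64(inter) / float64(len(sa)+len(sb)-inter)
}

// isDuplicate applies the tiers in order: exact, normalised, word-set Jaccard.
func isDuplicate(candidate string, existing []string) bool {
	for _, e := range existing {
		if e == candidate ||
			normalise(e) == normalise(candidate) ||
			jaccard(e, candidate) >= 0.5 {
			return true
		}
	}
	return false
}

func main() {
	store := []string{"Sales rose in March"}
	fmt.Println(isDuplicate("sales rose in march!", store)) // true (normalised match)
	fmt.Println(isDuplicate("Sales rose in April", store))  // true (Jaccard 3/5)
	fmt.Println(isDuplicate("Revenue fell", store))         // false
}
```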
User can promote a Session Memory entry or a Finding into Global Memory via the Pin to Global Memory UI (sidebar ★ button or chat-pane panel ★ button → category-picker dialog). The original entry stays in place; promotion is additive.
Context is built non-destructively per call by contextbuild:
records stay full-fidelity, older portions condense via a
content-keyed summary cache at sessions/<id>/summaries.json.
Time-range markers are added to summaries and to raw records
after a >30-min gap so the LLM can reason about when events
happened.
Full design + threat model: memory-model.md.
All tools — analysis built-ins, shell scripts, sandbox
counterparts, MCP — flow through one dispatcher
(agent.executeTool) and one MITL gate
(agent.IsToolMITLRequired). Since v0.6 the analysis +
builtin + sandbox sources additionally share a single
metadata source: ToolDescriptor (per-Agent
a.toolDescriptors slice, indexed by name in
a.toolDescriptorIndex).
- Builtin tools (internal/agent/tool_descriptors_builtin.go): resolve-date, list-objects, get-object, register-object. Don't depend on the analysis engine.
- Analysis tools (internal/agent/tool_descriptors_analysis.go): load-data, describe-data, query-sql, query-preview, quick-summary, analyze-data, promote-finding, create-report, list-tables, suggest-analysis, reset-analysis, analyze-text, grep-text, get-text. Filtered out of the LLM tool list when a.analysis == nil; data-gated subset hidden when legacy mode is on and no table is loaded.
- Sandbox tools (internal/agent/tool_descriptors_sandbox.go): eight sandbox-* tools that execute in a per-session container with /work mounted from the session data dir. Filtered out of the LLM tool list and Settings UI when a.sandbox == nil (the engine's lifecycle is dynamic via RestartSandbox).
- Shell scripts (internal/toolcall/): user-registered scripts with header-driven metadata. Bundled scripts (file-info, preview-file, list-files, weather, get-location, write-note) are scaffolded on first launch via go:embed. Not in the descriptor registry: registered separately from the toolcall.Registry and joined into the LLM tool list at buildToolDefs time.
- MCP tools: discovered at startup from the configured guardian profiles and namespaced as <guardian>__<tool>. Not in the descriptor registry: joined dynamically at buildToolDefs time.
Each ToolDescriptor carries Name, Description, Parameters
(JSON Schema), Category (read/write/execute), Source
(analysis/builtin/sandbox), MITLDefault,
MITLCategoryOverride (for the specialised SQL-preview /
analysis-plan dialogs), HideUntilDataLoaded (legacy hide-
until-table-loaded gate), and Handle (the closure that
captures *Agent and dispatches to the underlying tool
method). The same descriptor backs the LLM tool def
(descriptorToolDefs), the Settings → Tools entry
(ListTools), the MITL default (IsToolMITLRequired /
toolMITLDefault), and the dispatch
(dispatchDescriptor) — adding a new analysis / builtin /
sandbox tool requires editing exactly one file.
The pre-v0.6 design maintained those four surfaces as
parallel lists; the v0.5.0 → v0.5.1 manual smoke caught two
drift bugs (Settings tab missing a tool, stale MITL map
entry). Structural tests in tool_descriptor_structural_test.go
now enforce the invariants mechanically. Full design rationale
is in ADR-0007.
Each tool has a default — read = auto-allow, write/execute =
require approval. Per-tool override via MITLOverrides in
config. For descriptor-registered tools the default comes
straight from descriptor.MITLDefault; for shell scripts it
comes from Tool.NeedsMITL(); the prefix branches in
IsToolMITLRequired (mcp__ / sandbox-) act as defense in
depth so a missing descriptor cannot accidentally grant a
zero-friction sandbox call. The Settings → Tools UI reads
from the same registry the dispatcher does, so the toggle
state and the actual gate cannot drift.
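How a single registry can back both the name index and the MITL decision is sketched below. This is an illustration under assumptions: the descriptor is trimmed to three fields, the registry type and the resolution order (config override, then descriptor default, then fail-closed) are hypothetical, shell-script and MCP tools are left out, and the Category / MITLDefault values given to list-tables and load-data are illustrative only.

```go
package main

import "fmt"

// ToolDescriptor is trimmed to the fields this sketch needs.
type ToolDescriptor struct {
	Name        string
	Category    string // "read", "write" or "execute"
	MITLDefault bool
}

// registry is a hypothetical stand-in for the per-Agent descriptor slice
// plus its name index and the MITLOverrides map from config.
type registry struct {
	index     map[string]ToolDescriptor
	overrides map[string]bool
}

// newRegistry builds the name index from the descriptor slice, so the
// index can never list a tool the slice does not contain.
func newRegistry(descs []ToolDescriptor, overrides map[string]bool) *registry {
	idx := make(map[string]ToolDescriptor, len(descs))
	for _, d := range descs {
		idx[d.Name] = d
	}
	return &registry{index: idx, overrides: overrides}
}

// mitlRequired resolves the gate for one tool name.
func (r *registry) mitlRequired(name string) bool {
	if v, ok := r.overrides[name]; ok {
		return v // explicit per-tool setting wins
	}
	if d, ok := r.index[name]; ok {
		return d.MITLDefault
	}
	// No descriptor: fail closed. This covers the sandbox- prefix case, where
	// a missing descriptor must not produce a zero-friction call.
	return true
}

func main() {
	reg := newRegistry(
		[]ToolDescriptor{
			{Name: "list-tables", Category: "read", MITLDefault: false},
			{Name: "load-data", Category: "write", MITLDefault: true},
		},
		map[string]bool{"load-data": false}, // user switched the prompt off
	)
	for _, name := range []string{"list-tables", "load-data", "sandbox-unregistered"} {
		fmt.Println(name, reg.mitlRequired(name))
	}
}
```

Because the Settings toggle and the dispatcher would both call the same lookup, there is no second list that can fall out of date.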
Special MITL flows surfaced via descriptor.MITLCategoryOverride:
- query-sql → SQL preview dialog before execution (MITLCategoryOverride = "sql_preview").
- analyze-data → analysis-plan confirmation dialog (MITLCategoryOverride = "analysis_plan").
A reject can include free-text feedback that's piped back to the LLM as a tool result so it can revise the call.
~/Library/Application Support/shell-agent-v2/
├── config.json # User settings (LLM, sandbox, theme, …)
├── global_memory.json # Cross-session Global Memory (v0.2.0)
├── pinned.json # Legacy v0.1.x; ignored on launch
├── findings.json # Legacy v0.1.x; ignored on launch
├── objects/
│ ├── index.json # Object metadata + session attribution
│ └── <id-prefix>/<id> # Object content (images / blobs / reports)
├── sessions/<session-id>/
│ ├── chat.json # Records (the conversation transcript)
│ ├── session_memory.json # Session Memory entries (fact / context)
│ ├── findings.json # Findings (data-analysis discoveries)
│ ├── summaries.json # contextbuild content-keyed summary cache
│ ├── analysis.duckdb # Session-scoped DuckDB
│ └── work/ # Sandbox /work mount (also $SHELL_AGENT_WORK_DIR)
└── tools/ # User shell scripts (bundled + custom)
All JSON files on this path go through
internal/atomicio.WriteFileAtomic (tmp file → rename + parent-dir
fsync) so a crash mid-save leaves the previous file intact.
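The tmp file → rename + parent-dir fsync pattern looks roughly like this. It is a sketch of the general technique using only the standard library, not the body of internal/atomicio.WriteFileAtomic; the lower-case writeFileAtomic, its signature, and the demo helper are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temp file in the target directory, syncs
// it, renames it over path, then fsyncs the parent directory so the rename
// itself is durable.
func writeFileAtomic(path string, data []byte, perm os.FileMode) error {
	dir := filepath.Dir(path)
	tmp, err := os.CreateTemp(dir, filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	defer os.Remove(tmpName) // fails harmlessly once the rename has succeeded

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Chmod(perm); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	if err := os.Rename(tmpName, path); err != nil {
		return err
	}
	d, err := os.Open(dir)
	if err != nil {
		return err
	}
	defer d.Close()
	return d.Sync()
}

// demoAtomicWrite overwrites the same file twice in a temp directory and
// returns the final content plus the number of directory entries left.
func demoAtomicWrite() (string, int, error) {
	dir, err := os.MkdirTemp("", "atomicio-sketch")
	if err != nil {
		return "", 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "chat.json")
	for _, body := range []string{`{"v":1}`, `{"v":2}`} {
		if err := writeFileAtomic(path, []byte(body), 0o600); err != nil {
			return "", 0, err
		}
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return "", 0, err
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		return "", 0, err
	}
	return string(data), len(entries), nil
}

func main() {
	content, n, err := demoAtomicWrite()
	fmt.Println(content, n, err) // {"v":2} 1 <nil>
}
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, which is what makes it atomic on POSIX systems.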
Session deletion removes the entire sessions/<id>/ directory
plus the session's owned objstore objects. As of v0.4.2 this is
orchestrated by agent.DeleteSession (not the binding layer
directly), running under the agent state-machine Busy slot —
see §2 for why. Global Memory is unaffected.
v0.4.0: .shellagent bundle import / export. A session can
be packaged into a single ZIP bundle (internal/sessionio) that
carries chat.json, session_memory.json, findings.json,
summaries.json, analysis.duckdb, the recursive work/
subtree, and an objects/ subdirectory with the session's
objstore blobs and metadata. The bundle is portable across
machines (DuckDB's binary format is cross-platform). On import
the new session gets a fresh sess-id and every objstore object
is re-stored with a fresh ID; references in chat.json
(Record.ObjectIDs[] and object:ID markdown in
Record.Content) and summaries.json (SummaryEntry.Summary)
are rewritten through the same agent.Mu-gated state-machine
slot. Privacy flag is preserved verbatim. Full design:
ADR-0001.
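The reference-rewrite step on import can be sketched with a single regular expression. This is illustrative only: rewriteRefs and the idMap shape are hypothetical, the sketch covers only object:ID markers in text (not the Record.ObjectIDs[] slice), and it assumes the 32-hex IDs are lower-case.

```go
package main

import (
	"fmt"
	"regexp"
)

// objectRef matches "object:" followed by a 32-hex object ID.
// Lower-case hex is an assumption of this sketch.
var objectRef = regexp.MustCompile(`object:([0-9a-f]{32})`)

// rewriteRefs replaces every old object ID that appears in idMap with its
// freshly assigned ID; IDs without a mapping are left untouched.
func rewriteRefs(text string, idMap map[string]string) string {
	return objectRef.ReplaceAllStringFunc(text, func(m string) string {
		oldID := m[len("object:"):]
		if newID, ok := idMap[oldID]; ok {
			return "object:" + newID
		}
		return m
	})
}

func main() {
	oldID := "0123456789abcdef0123456789abcdef"
	newID := "fedcba9876543210fedcba9876543210"
	content := "See [Q1 report](object:" + oldID + ") for details."

	fmt.Println(rewriteRefs(content, map[string]string{oldID: newID}))
}
```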
Objects are referenced in two directions — into the LLM's input and out of the LLM's output. The two surfaces use distinct shapes; collapsing them is the bug ADR-0014 codified out of the codebase. The rules:
1. Image (object ID: ID): input-only anchor for user-attached images, prepended to the user-message text by llm.imageIDPrefix so the model sees the ID adjacent to the image part. The LLM must never emit this form in its own output; it does not render as an image. To cite or surface an image in chat or in a report, use rule 3 below.
2. Document (object ID: ID, name: NAME, N tokens): input-only anchor for user-attached markdown / text, prepended to a user-message text by llm.PrependDocumentAnchors (internal/contextbuild/render.go:85). Like rule 1, this is an INPUT shape only; cite a document in output via rule 4.
3. A markdown image whose target is object:ID: LLM output for inline images. Rendered as <ObjectImage> (data URL → <img> + lightbox). If the ID resolves to a non-image type the renderer degrades to the chip from rule 4, so there is no broken-image glyph.
4. [title](object:ID): LLM output for inline document / report / blob references. Rendered as a clickable chip via the ObjectLink component. Click dispatches to GetObjectText → ReportViewer for markdown / report, or ExportObject save-as for blob. If the ID resolves to an image type the renderer degrades to inline <ObjectImage>, so there is no dead anchor.
5. Reports do not gain document anchors retroactively. A Role: "report" record carries the full report body inline (tools.go:374-382 pending-report append). Anchors are a surrogate for content the LLM cannot see in the message text; reports are visible, so they need no surrogate. No code path adds DocumentIDs to a Role: "report" record, and no future path should.
Rules 1 / 2 are produced by internal/llm.imageIDPrefix and
internal/llm.PrependDocumentAnchors. Rules 3 / 4 are honoured
by frontend/src/markdown/objectMarkdown.tsx (objectComponents
factory) and its companion ObjectLink / ObjectImage
components. The export resolver
(bindings.go:resolveObjectRefsForExport) is type-aware to match:
images are inlined as data: URLs for self-contained exports,
non-images keep their object: href so re-import resolves
cleanly. Full design:
ADR-0014.
internal/llm.Backend is the common interface:
type Backend interface {
	Name() string
	Chat(ctx, messages, tools) (Response, error)
}

Implementations:

- local.go: LM Studio's OpenAI-compatible REST API. Tool calls map onto OpenAI's function_call shape; returned tool messages are folded into a synthesised user turn (<TOOL_RESULT name=…>…</TOOL_RESULT>) because some local models drop the dedicated role.
- vertex.go: Vertex AI Gemini via the google.golang.org/genai SDK. Tool calls use Gemini's FunctionCall / FunctionResponse parts; responses can be streaming (parsed token-by-token).
- mock.go: deterministic mock for tests.
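The tool→user folding that local.go performs can be sketched as a one-pass rewrite. This is a simplified illustration: Message is a trimmed, hypothetical shape, the quoting of the name attribute is an assumption, and whether the real code merges consecutive tool results into one turn is not stated here (the sketch maps them one-to-one).

```go
package main

import "fmt"

// Message is a simplified chat message; the real llm.Message has more fields.
type Message struct {
	Role    string
	Name    string // tool name, set when Role == "tool"
	Content string
}

// foldToolMessages rewrites every tool-role message as a user-role message
// whose body is wrapped in a TOOL_RESULT element. Other roles pass through.
func foldToolMessages(in []Message) []Message {
	out := make([]Message, 0, len(in))
	for _, m := range in {
		if m.Role == "tool" {
			m = Message{
				Role:    "user",
				Content: fmt.Sprintf("<TOOL_RESULT name=%q>%s</TOOL_RESULT>", m.Name, m.Content),
			}
		}
		out = append(out, m)
	}
	return out
}

func main() {
	history := []Message{
		{Role: "user", Content: "what tables exist?"},
		{Role: "assistant", Content: ""},
		{Role: "tool", Name: "list-tables", Content: "orders, customers"},
	}
	for _, m := range foldToolMessages(history) {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```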
Backend swap is runtime: /model local / /model vertex. The
active backend is read once per agent loop round, so a swap
mid-conversation takes effect on the next user turn.
Per-backend config (config.LocalConfig / VertexAIConfig) holds
endpoints, model names, retry policy, request timeout, max tool-call
args size, and ContextBudget. Resolved by
cfg.ContextBudgetFor(backend) so the same session adapts to the
active model's window.
The (Local, VertexAI) pair is one profile among many in
config.json's llm.profiles[]. Each session references one
profile via a per-session session.json file alongside
chat.json:
sessions/<id>/
├── chat.json # transcript (records, title, private)
└── session.json # {schema_version, profile_id} ← v0.12.0+
agent.currentProfile() resolves the session's profile (falling
back to the default profile when no session is loaded, or the
recorded profile_id was deleted), and setBackend sources the
Local / VertexAI config from that profile rather than the
top-level config. /model toggles the active backend within
the session's profile; new in v0.12.0 is /profile <name> to
switch the session's profile binding (atomically rewrites
session.json; emits agent:profile:changed).
Backwards compat: a v0.11.x config.json is migrated on first
v0.12.0 load by synthesising one profile named "Default" from
the legacy fields. A v0.11.x session without session.json is
treated as binding to the default profile and gets a fresh
session.json lazy-written on first load.
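The fallback rules for profile resolution can be sketched as follows. The Profile and sessionBinding types and the resolveProfile function are hypothetical; in particular, identifying the default profile by an explicit defaultID is an assumption, since this document does not say how the default is designated.

```go
package main

import "fmt"

// Profile is trimmed to an identifier and a display name.
type Profile struct {
	ID   string
	Name string
}

// sessionBinding mirrors the {schema_version, profile_id} payload of session.json.
type sessionBinding struct {
	SchemaVersion int    `json:"schema_version"`
	ProfileID     string `json:"profile_id"`
}

// resolveProfile returns the profile the session is bound to. It falls back
// to the default profile when no session is loaded (binding == nil) or when
// the recorded profile_id no longer exists.
func resolveProfile(profiles []Profile, defaultID string, binding *sessionBinding) (Profile, bool) {
	byID := make(map[string]Profile, len(profiles))
	for _, p := range profiles {
		byID[p.ID] = p
	}
	if binding != nil {
		if p, ok := byID[binding.ProfileID]; ok {
			return p, true
		}
	}
	p, ok := byID[defaultID]
	return p, ok
}

func main() {
	profiles := []Profile{{ID: "default", Name: "Default"}, {ID: "work", Name: "Work"}}

	bound, _ := resolveProfile(profiles, "default", &sessionBinding{SchemaVersion: 1, ProfileID: "work"})
	deleted, _ := resolveProfile(profiles, "default", &sessionBinding{SchemaVersion: 1, ProfileID: "gone"})
	none, _ := resolveProfile(profiles, "default", nil)

	fmt.Println(bound.Name, deleted.Name, none.Name) // Work Default Default
}
```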
BuildSystemPrompt deliberately omits any per-call volatile
content (no timestamp, no millisecond-fresh anything). On local
backends that means the system block is byte-identical across
turns whenever memory state is stable, so LM Studio's (and any
llama.cpp-backed server's) KV-cache prefix reuse fires on every
subsequent turn — a 25× wall-clock saving on 5K-token prompts.
Temporal context travels with each user record:
contextbuild.renderRecordContent calls
opts.UserRecordTemporalPrefix(record.Timestamp) and prepends
the result to user-role content. The rendering is deterministic
in the stored timestamp, so historical user records produce the
same bytes on every replay. See ADR-0017 §3.1 for why a
per-record render (not "only the latest user message") was the
correct design.
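Why a per-record render keeps the prompt prefix stable can be shown with a small sketch. The record type, temporalPrefix and the bracketed UTC layout are assumptions for illustration; the point is only that the prefix is a pure function of the stored timestamp and never consults the current time.

```go
package main

import (
	"fmt"
	"time"
)

// record is a hypothetical stand-in for a stored user record.
type record struct {
	Timestamp time.Time
	Content   string
}

// temporalPrefix depends only on the stored timestamp, never on time.Now(),
// so replaying the same record always yields the same bytes.
// The bracketed UTC layout is an assumption, not the app's actual format.
func temporalPrefix(ts time.Time) string {
	return "[" + ts.UTC().Format("2006-01-02 15:04 MST") + "] "
}

// renderUserRecord prepends the temporal prefix to user-role content.
func renderUserRecord(r record) string {
	return temporalPrefix(r.Timestamp) + r.Content
}

// sampleRecord returns a fixed record for the demo.
func sampleRecord() record {
	return record{
		Timestamp: time.Date(2025, time.March, 14, 9, 30, 0, 0, time.UTC),
		Content:   "load sales.csv",
	}
}

func main() {
	r := sampleRecord()
	first := renderUserRecord(r)
	second := renderUserRecord(r) // a later replay of the same history

	fmt.Println(first)           // [2025-03-14 09:30 UTC] load sales.csv
	fmt.Println(first == second) // true
}
```

Had the prefix been computed from the wall clock at build time, every historical message would change on each turn and invalidate the cached prefix.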
React + TypeScript, single-page, no router. Wails generates the JS
shim for window.go.main.Bindings; frontend/src/bindings.ts
declares the TypeScript view of that shim.
Top-level structure:
App.tsx
├── Sidebar (sessions / memory tabs)
│ ├── Sessions list
│ └── Memory tab
│ ├── Global Memory section (bulk select / delete)
│ └── Session Memory section (bulk select / Pin button)
├── DataDisclosure (chat-pane top, per-session)
│ ├── Objects card grid
│ ├── Tables list
│ └── /work files (sandbox-only)
├── FindingsDisclosure (chat-pane, per-session)
│ ├── Severity filter + search
│ ├── Bulk delete
│ └── Per-row Pin / Delete
├── Messages stream (with MessageItem renderer)
├── ChatInput (with image attach, history navigation)
├── Status footer (backend, tokens, bg-tasks)
└── Dialogs
├── SettingsDialog (General / Tools / MCP / Sandbox tabs)
├── MITLDialog (approval prompts)
├── PinToGlobalDialog (category picker)
├── Lightbox / ReportViewer / BlobPreview
| Event | Trigger |
|---|---|
| agent:stream | Token stream (Vertex AI) |
| agent:activity | tool_start / tool_end / tool_progress / thinking / assistant_text |
| agent:state | Idle / Busy transitions |
| session:title | Auto-generated session title |
| global_memory:updated | Global Memory store changed |
| session_memory:updated | Session Memory store changed |
| findings:updated | Findings store changed |
| report:created | New report bubble |
| mitl:request | MITL approval needed |
| bg-task:start / bg-task:end | Background task lifecycle |
v0.4.1: tool_progress event. Long-running tools (currently
analyze-data's sliding-window summarizer) emit tool_progress
ActivityEvents carrying the parent tool's tool_call_id plus an
updated display string. The frontend matches by id (not by
content text) and overwrites the running bubble's content
in-place — one bubble that updates from "analyze-data" →
"analyze-data — window 1/3" → … → reverts to "analyze-data" →
status flips on tool_end. Replaces the pre-v0.4.1 behaviour
where each window emitted its own tool_start with no matching
tool_end, leaving N permanent "running" pills (issue #5).
Full design: ADR-0002.
themes.css defines four themes (dark / light / warm / midnight)
as CSS custom properties. Surface-level tokens (--bg-primary,
--bg-sidebar, --bg-input) are opaque rgb; layered tokens
(--bg-msg-*, --bg-input-field, --bg-hover) keep rgba alpha
for tinting. WebView itself is opaque
(main.go: WebviewIsTransparent: false); window-level
translucency would need private macOS APIs and was traded away
for readable code blocks.
shell-agent-v2 is a single-user local-first app, but it processes attacker-controlled bytes from many sources (CSV cells, MCP responses, OCR'd image text, fetched web pages). The threat model is mostly prompt-injection through tool output rather than network exposure.
Defences:
- nlk/guard wrapping: every user-provided text and every tool-result body is wrapped in a nonce-tagged XML block (<user_data_NONCE>…</user_data_NONCE>) so the LLM can't be steered into treating untrusted bytes as instructions. Fail-closed: chat / contextbuild / extraction all return an error rather than silently falling back to unwrapped content.
- Self-referential filter: memory.IsSelfReferential blocks THINK-leak class facts ("the assistant", "system prompt", <think>, etc.) at extraction time so they can't re-inject into future system prompts.
- Category allowlist: only the 4 documented extraction categories are accepted; anything else is dropped.
- Provenance tagging: every Global / Session Memory and Finding entry carries a Source and renders as [user-stated] vs [derived] in the system prompt so the LLM can discount derived content.
- MITL gates: write / execute tool categories require approval by default; analysis-plan and SQL-preview have dedicated dialogs.
- Sandbox: opt-in container isolation for code execution with /work as the only persistent surface.
- Sandbox image trust: SandboxImageStatus.ActivePinnedByDigest surfaces a banner when the active image is a mutable upstream tag (python:3.12-slim); locally-built <TagPrefix>:<sha> and @sha256: upstream pins are treated as safe.
- Atomic IO: every state file uses WriteFileAtomic so a crash can't corrupt mid-save.
- Tool-call args cap: 1 MiB by default (LocalConfig.MaxToolCallArgsBytes / VertexAIConfig.MaxToolCallArgsBytes).
- Symlink rejection: analysis.validateFilePath uses os.Lstat and rejects symlinks for load-data and any other host-path entry point.
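The symlink check from the last item can be sketched with os.Lstat, which reports on the path itself instead of following the final link. This is not the body of analysis.validateFilePath: the regular-file requirement and the errSymlink value are assumptions, and the sketch only inspects the final path component, not symlinked parent directories.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

var errSymlink = errors.New("symlinks are not accepted")

// validateFilePath rejects symlinks by inspecting the path with os.Lstat,
// which does not follow the final link the way os.Stat would.
func validateFilePath(path string) error {
	info, err := os.Lstat(path)
	if err != nil {
		return err
	}
	if info.Mode()&os.ModeSymlink != 0 {
		return errSymlink
	}
	if !info.Mode().IsRegular() {
		// Assumption: only regular files are valid load targets.
		return fmt.Errorf("%s is not a regular file", path)
	}
	return nil
}

// demoValidate creates a real file and a symlink to it, then reports whether
// the file was accepted and whether the link was rejected.
func demoValidate() (fileAccepted, linkRejected bool, err error) {
	dir, err := os.MkdirTemp("", "lstat-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "data.csv")
	if err := os.WriteFile(target, []byte("a,b\n1,2\n"), 0o600); err != nil {
		return false, false, err
	}
	link := filepath.Join(dir, "link.csv")
	if err := os.Symlink(target, link); err != nil {
		return false, false, err
	}
	fileAccepted = validateFilePath(target) == nil
	linkRejected = errors.Is(validateFilePath(link), errSymlink)
	return fileAccepted, linkRejected, nil
}

func main() {
	accepted, rejected, err := demoValidate()
	fmt.Println(accepted, rejected, err) // true true <nil>
}
```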
- cd app && make build → dist/shell-agent-v2.app. Never run go build directly: the binary leaks into the project root.
- cd app && make test → go test -tags no_duckdb_arrow ./... for the standard suite. The lmstudio / vertexai build tags enable integration tests against the live backends.
- Frontend: cd app/frontend && npm run build for the TypeScript / Vite check (also runs as part of make build).
- Manual smoke before release: see the Pre-release smoke checklist in CHANGELOG / RELEASE notes.
The Mac config in main.go keeps the WebView opaque, hides the
title, and uses a transparent (cosmetic) titlebar. There is no
launcher app — direct .app launch only.
For "why was X done" / "what was the previous shape" questions,
../history/ preserves the original design notes.
Some of them no longer reflect current code (notably the v1 Hot/
Warm/Cold compaction notes and the original Pinned Memory design)
but are kept as the audit trail behind the v0.2.0 rewrite. When
in doubt, prefer this document and memory-model.md for current
behaviour.
Notable history docs that still describe behaviour shipped in v0.2.0:
- ../history/agent-data-flow.md: analysis tool dataflow, mostly current.
- ../history/agent-loop-resilience.md: loop guards (still in effect).
- ../history/sandbox-execution.md / sandbox-image-build.md: current sandbox setup.
- ../history/security-hardening-2.md: the canonical threat model reference (this file's §9 summarises it).
- ../history/work-dir-shell-bridge.md: current $SHELL_AGENT_WORK_DIR contract.
- ../history/object-storage.md: current objstore behaviour.
- ../history/tool-event-restore.md: current session-restore tool bubble shape.
- ../history/llm-abstraction.md: current Backend interface.
- ../history/multi-image-handling.md: current multimodal flow.
Notable history docs that describe v0.1.x and have been superseded:
- ../history/memory-architecture-v2.md: v1's Hot/Warm/Cold + contextbuild rationale. The non-destructive contextbuild path is current; the v1 destructive compaction it discusses is gone.
- ../history/memory-injection-hardening.md: original Pinned-Memory threat model. Still useful for the defences, but the storage design is superseded by memory-model.md.
- ../history/information-display-redesign.md: Phase 2 of the UI layout, mostly still in effect; note that Findings moved out of the sidebar in v0.2.0 Phase 8.
- ../history/frontend-decomposition.md: Phase 3 component split.
- ../history/background-task-indicator.md: the pinned-extraction bg-task name was renamed to memory-extraction in v0.2.0.
- ../history/shell-agent-v2-architecture.md / shell-agent-v2-rfp.md: the original RFP and architecture doc that shaped the v0.1 baseline. This document supersedes them as the current source of truth.