layer1labs
diff --git a/docs/REQUIREMENTS.md b/docs/REQUIREMENTS.md
Lines changed: 16 additions & 0 deletions
diff --git a/docs/TEST_SPEC.md b/docs/TEST_SPEC.md
Lines changed: 29 additions & 0 deletions
@@ -306,6 +306,22 @@
 - **REQ-SCF-EPI-002**: `enable_epistemic=true` adds epistemic governance to any project type
 - **REQ-SCF-EPI-003**: Epistemic project types get domain-specific directory structures
 
+## Token & Credit Optimization
+
+- **REQ-OPT-001**: `TokenEstimator` estimates token count from text using per-model character ratios, and estimates cost in USD from token counts and provider pricing tables
+- **REQ-OPT-002**: `ResponseCache` stores LLM responses keyed by SHA-256 hash of (provider, model, serialised messages); returns cached response on hit and records savings
+- **REQ-OPT-003**: `ResponseCache` supports configurable TTL (default 1 h) and optional JSON persistence to `.specsmith/response-cache.json`
+- **REQ-OPT-004**: `ContextManager.trim()` implements a sliding window that drops oldest non-system messages when total estimated tokens exceed `context_max_tokens`
+- **REQ-OPT-005**: `ContextManager` triggers a summarisation recommendation when history token count exceeds `summarize_threshold`
+- **REQ-OPT-006**: `ModelRouter.classify()` assigns a complexity tier (FAST/BALANCED/POWERFUL) to a user message using keyword and length heuristics, with no external API call
+- **REQ-OPT-007**: `ModelRouter.suggest_model()` returns the cheapest default model for a given (provider, tier) pair from a built-in pricing table
+- **REQ-OPT-008**: `ToolFilter.select()` scores available tools against task text and returns only the top-N relevant tools, reducing tool-schema token overhead
+- **REQ-OPT-009**: `OptimizationEngine.pre_call()` applies caching, context trim, model routing, and tool filtering before each LLM call; returns transformed messages, selected model, and an `OptimizationHint`
+- **REQ-OPT-010**: `OptimizationEngine.post_call()` records tokens saved, cache hit/miss, and the model routing decision to a running `OptimizationReport`
+- **REQ-OPT-011**: `AnthropicProvider` adds `cache_control: {"type": "ephemeral"}` to the system message when `prompt_caching=True`, enabling Anthropic’s 90% cached-read discount
+- **REQ-OPT-012**: `specsmith optimize` CLI command reads `.specsmith/` usage data and emits an `OptimizationReport` with concrete recommendations and projected monthly savings
+- **REQ-OPT-013**: `OptimizationConfig` is serialisable and can be embedded in `scaffold.yml` under `optimization:` to persist settings per project
+
 ## GUI Workbench
 
 - **REQ-GUI-001**: `specsmith gui` launches a cross-platform Qt6 desktop workbench (Windows, Linux, macOS)
 
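REQ-OPT-002 and REQ-OPT-003 in the hunk above describe a response cache keyed by a SHA-256 hash of (provider, model, serialised messages) with a configurable TTL. The following is a minimal in-memory sketch of that idea, not the project's implementation: `cache_key`, `TinyResponseCache`, the `clock` parameter, and the choice of JSON with sorted keys as the serialisation are all assumptions made for illustration. JSON persistence and savings accounting are left out.

```python
import hashlib
import json
import time


def cache_key(provider: str, model: str, messages: list[dict]) -> str:
    """Return a SHA-256 hex digest over (provider, model, serialised messages)."""
    # sort_keys and fixed separators keep the serialisation deterministic,
    # so equal inputs always produce the same key (an assumed design choice).
    payload = json.dumps(
        [provider, model, messages], sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TinyResponseCache:
    """Hypothetical stand-in for ResponseCache: in-memory, TTL-based."""

    def __init__(self, ttl_seconds: float = 3600.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds  # 1 h default, as in REQ-OPT-003
        self._clock = clock             # injectable so tests can control time
        self._entries: dict[str, tuple[float, str]] = {}
        self.hits = 0

    def put(self, provider: str, model: str, messages: list[dict], response: str) -> None:
        key = cache_key(provider, model, messages)
        self._entries[key] = (self._clock(), response)

    def get(self, provider: str, model: str, messages: list[dict]):
        key = cache_key(provider, model, messages)
        entry = self._entries.get(key)
        if entry is None:
            return None  # cold cache
        stored_at, response = entry
        if self._clock() - stored_at > self.ttl_seconds:
            del self._entries[key]  # expired: evict and report a miss
            return None
        self.hits += 1
        return response
```

Injecting the clock keeps a TTL test such as TEST-OPT-005 deterministic, because the test can advance time without sleeping.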
@@ -539,6 +539,35 @@
 - **TEST-WFL-010**: `specsmith session-end` reports unpushed commits and dirty files
   Covers: REQ-WFL-008
 
+### Token & Credit Optimization
+
+- **TEST-OPT-001**: `TokenEstimator.estimate()` returns positive int for non-empty text; GPT-4 uses 0.25 tokens/char ratio
+  Covers: REQ-OPT-001
+- **TEST-OPT-002**: `TokenEstimator.estimate_cost()` returns expected USD for known token counts and provider
+  Covers: REQ-OPT-001
+- **TEST-OPT-003**: `ResponseCache.get()` returns None on cold cache; returns response string on warm hit
+  Covers: REQ-OPT-002
+- **TEST-OPT-004**: `ResponseCache` records tokens_saved and cost_saved on cache hit
+  Covers: REQ-OPT-002
+- **TEST-OPT-005**: `ResponseCache` expires entries after TTL seconds
+  Covers: REQ-OPT-003
+- **TEST-OPT-006**: `ContextManager.trim()` returns fewer messages when total tokens exceed max_tokens
+  Covers: REQ-OPT-004
+- **TEST-OPT-007**: `ContextManager.trim()` always preserves system message
+  Covers: REQ-OPT-004
+- **TEST-OPT-008**: `ContextManager.needs_summarization()` returns True when history exceeds summarize_threshold
+  Covers: REQ-OPT-005
+- **TEST-OPT-009**: `ModelRouter.classify()` returns FAST for short/simple inputs, POWERFUL for code/architecture keywords
+  Covers: REQ-OPT-006
+- **TEST-OPT-010**: `ModelRouter.suggest_model()` returns haiku/mini/flash for FAST tier per provider
+  Covers: REQ-OPT-007
+- **TEST-OPT-011**: `ToolFilter.select()` returns subset of tools; governance tools ranked higher for audit-related tasks
+  Covers: REQ-OPT-008
+- **TEST-OPT-012**: `OptimizationEngine.pre_call()` returns cache hit and skips model call when response is cached
+  Covers: REQ-OPT-009
+- **TEST-OPT-013**: `OptimizationReport` accumulates correct cache_hits and tokens_saved across multiple calls
+  Covers: REQ-OPT-010
+
 ### GUI Workbench
 
 - **TEST-GUI-001**: `specsmith gui` command is registered and exits cleanly when PySide6 not installed
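
For the Token & Credit Optimization entries earlier in this hunk, TEST-OPT-006 and TEST-OPT-007 exercise the sliding-window behaviour of `ContextManager.trim()` from REQ-OPT-004. A rough sketch of that behaviour follows. The free functions `estimate_tokens` and `trim`, the message shape (`role`/`content` dicts), and the rounding rule are assumptions for illustration only; the 0.25 tokens/char ratio is the GPT-4 figure quoted in TEST-OPT-001.

```python
def estimate_tokens(text: str, tokens_per_char: float = 0.25) -> int:
    """Rough character-ratio estimate; returns 0 for empty text."""
    if not text:
        return 0
    return max(1, round(len(text) * tokens_per_char))


def trim(messages: list[dict], max_tokens: int) -> list[dict]:
    """Drop the oldest non-system messages until the estimate fits max_tokens."""
    kept = list(messages)  # work on a copy; the caller's list is untouched

    def total() -> int:
        return sum(estimate_tokens(m["content"]) for m in kept)

    while total() > max_tokens:
        # Index of the oldest message that is not a system message.
        idx = next((i for i, m in enumerate(kept) if m["role"] != "system"), None)
        if idx is None:
            break  # only system messages remain; nothing more to drop
        del kept[idx]
    return kept
```

With this shape, a test in the spirit of TEST-OPT-007 only needs to check that every system message in the input is still present in the output, whatever budget is passed.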