Commit d5daea7
feat: add AI Teammate training system with learn-by-example patterns (#148)
* Add AI Teammate repositioning design document
Comprehensive design for repositioning altimate from "AI tool" to "AI
teammate" — including trainable knowledge system (/teach, /train,
/feedback), Deep Research mode for multi-step investigations, team
memory that persists via git, and UX reframing from "agent modes" to
"teammate roles."
https://claude.ai/code/session_01V17Kk3qCZFp9ZJiuNYucoq
* Enrich design doc with OpenClaw research and proactive behaviors
Add detailed competitive analysis from OpenClaw (self-improving memory,
heartbeat scheduler, meet-users-where-they-are), Devin ($10.2B
valuation, "junior partner" framing), and Factory AI (workflow
embedding). Add proactive behaviors section with background monitors
(cost alerts, freshness checks, schema drift, PII scanning) and
auto-promotion of learned corrections.
https://claude.ai/code/session_01V17Kk3qCZFp9ZJiuNYucoq
* Implement AI Teammate training system and Deep Research mode
Core training infrastructure built on top of existing memory system:
Training Store & Types:
- TrainingStore wraps MemoryStore with training-specific conventions
- Four knowledge kinds: pattern, rule, glossary, standard
- Structured metadata (applied count, source, acceptance tracking)
- Training blocks stored in .opencode/memory/training/ (git-committable)
- One person teaches, whole team benefits via git
Training Tools:
- training_save: Save learned patterns, rules, glossary, standards
- training_list: List all learned knowledge with applied counts
- training_remove: Remove outdated training entries
Training Skills:
- /teach: Learn patterns from example files in the codebase
- /train: Learn standards from documents or style guides
- /training-status: Dashboard of all learned knowledge
System Prompt Injection:
- Training knowledge injected alongside memory at session start
- Structured by kind: rules first, then patterns, standards, glossary
- Budget-limited to 6000 chars to control prompt size
- Zero LLM calls on startup — just reads files from disk
Deep Research Agent Mode:
- New "researcher" agent for multi-step investigations
- 4-phase protocol: Plan → Gather → Analyze → Report
- Read-only access to all warehouse, schema, FinOps tools
- Structured reports with evidence, root causes, action items
Agent Awareness:
- All agent prompts updated with training awareness section
- Agents offer to save corrections as rules when users correct behavior
- Training tools permitted in all agent modes
Tests:
- 88 new tests across 5 test files (types, store, prompt, tools, integration)
- All tests standalone (no Instance dependency)
- Full lifecycle tests: save → list → format → inject → remove
- Edge cases: budget limits, meta roundtrips, coexistence with memory
https://claude.ai/code/session_01V17Kk3qCZFp9ZJiuNYucoq
* Polish AI Teammate training UX: auto-lowercase names, update detection, budget visibility
- Fix researcher agent permissions: add training_save/remove (was read-only)
- Auto-lowercase + space-to-hyphen name transform in training_save (ARR → arr)
- Detect update vs new save, show "Updated" with preserved applied count
- Show training budget usage (chars/percent) on save, list, and remove
- Improve training_list: group by kind, show most-applied entries, budget %
- Improve training_remove: show available entries on not-found, applied count
- Show similar entry names in duplicate warnings (not just count)
- Raise content limit from 1800 to 2500 chars
- Export TRAINING_BUDGET constant, add budgetUsage() to TrainingPrompt
- Add 30 new tests: auto-lowercase, update detection, budget overflow,
name collision, scale (80 entries), improved messaging
- All 118 training tests + 305 memory tests pass
https://claude.ai/code/session_01V17Kk3qCZFp9ZJiuNYucoq
* Enhance training UX: attribution, correction detection, priority sorting
- Builder prompt: add attribution instructions (cite training entries that
influenced output), correction detection (explicit + implicit patterns),
conflict flagging between contradictory training entries
- Add /teach, /train, /training-status to Available Skills list in builder prompt
- Sort training entries by applied count (descending) in prompt injection so
most-used entries get priority within the 6000-char budget
- Restructure Teammate Training section with clear subsections
https://claude.ai/code/session_01V17Kk3qCZFp9ZJiuNYucoq
* Fix experience gaps from user journey simulations
Simulation findings and fixes:
1. training_save now echoes back saved content so user can verify
what was captured (new saves show content preview, updates show
old vs new diff)
2. When training limit is reached, error now lists existing entries
sorted by applied count and suggests the least-applied entry
for removal
3. Researcher prompt now documents training_save/remove permissions
(was contradicting its own permissions by saying "read-only" while
having write access to training)
4. Added 10 new tests: content echo, update diff, limit suggestion,
special character preservation (SQL -->, Jinja, HTML comments,
code blocks), priority sorting verification
Verified: --> in content does NOT corrupt meta block (false positive).
The non-greedy regex terminates at the meta block's own --> correctly.
128 training tests + 305 memory tests all pass.
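The non-greedy behavior described above can be illustrated with a small sketch. The block layout (a leading `<!-- training` comment followed by the content) and the names `META_RE` and `splitBlock` are assumptions made for illustration, not the actual code in types.ts or store.ts.

```typescript
// Hypothetical block layout: a leading HTML comment holds the metadata and
// the user's content follows. The real on-disk format may differ.
const META_RE = /^<!-- training\n([\s\S]*?)-->\n?/

function splitBlock(raw: string): { meta: string | null; content: string } {
  const match = META_RE.exec(raw)
  if (match === null) return { meta: null, content: raw }
  // The lazy quantifier stops at the first "-->", which closes the meta block,
  // so any later "-->" inside the content is left untouched.
  return { meta: (match[1] ?? "").trim(), content: raw.slice(match[0].length) }
}

const sample =
  "<!-- training\nkind: rule\napplied: 3\n-->\n" +
  "Keep SQL arrows like a --> b and <!-- HTML comments --> verbatim."
const parsed = splitBlock(sample)
```

This only holds while the metadata itself never contains `-->`; a metadata value with that sequence would end the match early.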
https://claude.ai/code/session_01V17Kk3qCZFp9ZJiuNYucoq
* Add self-improvement loop: applied tracking, insights, staleness detection
OpenClaw-inspired self-improvement mechanisms:
1. Wire up incrementApplied at injection time — counters now actually
increment once per session per entry (deduped via session-scoped set),
making "Most Applied" dashboard and priority sorting meaningful
2. TrainingInsights module analyzes training metadata and surfaces:
- Stale entries (7+ days old, never applied) — suggests cleanup
- High-value entries (5+ applications) — highlights most impactful
- Near-limit warnings (18-19 of 20 entries per kind)
- Consolidation opportunities (3+ entries with shared name prefix)
3. Insights automatically shown in training_list output
4. 24 new tests covering all insight types, boundary conditions,
session tracking dedup, and format output
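A rough sketch of how the four insight checks listed above might be expressed. The thresholds (7 days, 5 applications, 18-19 of 20, 3 shared-prefix entries) come from the list; the `Entry` shape, the `analyze` function, and the choice of "text before the first hyphen" as the shared prefix are assumptions. Age is measured from `updated`, which matches the later stale-detection fix in this commit.

```typescript
type Entry = { name: string; kind: string; applied: number; updated: string }
type InsightType = "stale" | "high-value" | "near-limit" | "consolidation"
type Insight = { type: InsightType; message: string }

const DAY_MS = 24 * 60 * 60 * 1000
const MAX_PER_KIND = 20

function analyze(entries: Entry[], now: number): Insight[] {
  const insights: Insight[] = []
  const perKind = new Map<string, number>()
  const byPrefix = new Map<string, string[]>()

  for (const e of entries) {
    const ageDays = (now - Date.parse(e.updated)) / DAY_MS
    if (e.applied === 0 && ageDays >= 7) {
      insights.push({ type: "stale", message: `${e.name} was never applied` })
    }
    if (e.applied >= 5) {
      insights.push({ type: "high-value", message: `${e.name} applied ${e.applied} times` })
    }
    perKind.set(e.kind, (perKind.get(e.kind) ?? 0) + 1)
    // Assumption: the "shared prefix" is the text before the first hyphen.
    const prefix = e.name.split("-")[0] ?? e.name
    byPrefix.set(prefix, (byPrefix.get(prefix) ?? []).concat(e.name))
  }

  perKind.forEach((n, kind) => {
    if (n >= MAX_PER_KIND - 2 && n < MAX_PER_KIND) {
      insights.push({ type: "near-limit", message: `${kind}: ${n} of ${MAX_PER_KIND}` })
    }
  })
  byPrefix.forEach((names, prefix) => {
    if (names.length >= 3) {
      insights.push({ type: "consolidation", message: `${prefix}-*: ${names.join(", ")}` })
    }
  })
  return insights
}
```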
152 training tests + 305 memory tests all pass.
https://claude.ai/code/session_01V17Kk3qCZFp9ZJiuNYucoq
* fix: add dedicated training feature flag and remove unused insight type
- Add `ALTIMATE_DISABLE_TRAINING` flag independent of memory's disable flag
- Use new flag in session prompt injection and tool registry
- Remove unused `budget-warning` insight type from `TrainingInsight`
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: reset training session tracking, add error logging, fix list truncation
- Call `TrainingPrompt.resetSession()` at session start (step === 1)
to prevent applied counters from growing unbounded across sessions
- Add structured error logging to all three training tools
- Add truncation indicator (`...`) when training list preview is cut off
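The once-per-session counting and the reset at session start can be sketched as follows. `AppliedTracker` and its fields are hypothetical stand-ins; only the names `incrementApplied` and `resetSession` appear in the commit message, and their real signatures are not shown there.

```typescript
class AppliedTracker {
  // Entries already counted during the current session.
  private seen = new Set<string>()
  // Lifetime applied counts (in-memory here; persistence is out of scope).
  readonly counts = new Map<string, number>()

  // Returns true only the first time an entry is seen in a session.
  incrementApplied(id: string): boolean {
    if (this.seen.has(id)) return false
    this.seen.add(id)
    this.counts.set(id, (this.counts.get(id) ?? 0) + 1)
    return true
  }

  // Called when a new session starts so the dedup set does not carry over.
  resetSession(): void {
    this.seen.clear()
  }
}

const tracker = new AppliedTracker()
tracker.incrementApplied("rule/no-float") // counted
tracker.incrementApplied("rule/no-float") // ignored, same session
tracker.resetSession()
tracker.incrementApplied("rule/no-float") // counted again in the new session
```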
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use `.altimate-code/memory` as primary storage path with `.opencode` fallback
Memory store was hardcoded to `.opencode/memory/` but the config system
already uses `.altimate-code` as primary with `.opencode` as fallback.
Now checks for `.altimate-code/` directory first, falls back to `.opencode/`,
and defaults to `.altimate-code/` for new projects. Result is cached per
process to avoid repeated filesystem checks.
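A minimal sketch of the lookup order described above, assuming a hypothetical `resolveMemoryDir` helper. The existence check is injected as a parameter so the example stays free of filesystem access; real code would presumably pass something like Node's `fs.existsSync`. The cache is keyed by project directory, which follows the later Sentry fix in this same commit rather than the single cached value described here.

```typescript
const resolvedDirs = new Map<string, string>()

function resolveMemoryDir(
  projectDir: string,
  exists: (path: string) => boolean, // injected for illustration only
): string {
  const cached = resolvedDirs.get(projectDir)
  if (cached !== undefined) return cached

  const primary = `${projectDir}/.altimate-code`
  const fallback = `${projectDir}/.opencode`
  // Prefer the primary directory, fall back to the legacy one,
  // and default to the primary for new projects where neither exists.
  const base = exists(primary) ? primary : exists(fallback) ? fallback : primary

  const resolved = `${base}/memory`
  resolvedDirs.set(projectDir, resolved)
  return resolved
}
```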
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add Trainer agent mode with pattern discovery and training validation
Add dedicated trainer mode — the 8th primary agent — for systematically
building the AI teammate's knowledge base. Unlike inline corrections in
other modes, trainer mode actively scans codebases, validates training
against reality, and guides knowledge curation.
Changes:
- New `trainer` agent mode with read-only permissions (no write/edit/sql_execute)
- New `training_scan` tool: auto-discover patterns in models, SQL, config, tests, docs
- New `training_validate` tool: check training compliance against actual codebase
- Expand `TrainingKind` to 6 types: add `context` (background "why" knowledge)
and `playbook` (multi-step procedures)
- Update `count()` to derive from enum (prevents drift when kinds change)
- Add KIND_HEADERS for context and playbook in prompt injection
- Update injection order: rules first, playbooks last (budget priority)
- Update training-save and training-list descriptions for new kinds
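One way to read "derive `count()` from the enum" is sketched below. A const array stands in for the real enum, and the `count` signature is an assumption. Only "rules first, playbooks last" is stated in the list above; the order of the middle four kinds is a guess.

```typescript
// Single source of truth for the six kinds; order doubles as injection order.
const TRAINING_KINDS = ["rule", "pattern", "standard", "glossary", "context", "playbook"] as const
type TrainingKind = (typeof TRAINING_KINDS)[number]

function count(entries: { kind: TrainingKind }[]): Record<TrainingKind, number> {
  const totals = {} as Record<TrainingKind, number>
  // Seed every kind from the list, so adding a kind there is enough
  // for it to show up in the summary with a zero count.
  for (const kind of TRAINING_KINDS) totals[kind] = 0
  for (const entry of entries) totals[entry.kind] += 1
  return totals
}

const totals = count([{ kind: "rule" }, { kind: "rule" }, { kind: "playbook" }])
```

Because the totals object is seeded from the same list that defines the type, a newly added kind cannot be silently missing from the counts.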
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add comprehensive training guide with scenarios and limitations
- New `data-engineering/training/index.md` (350+ lines):
- Quick start with 3 entry points (trainer mode, inline corrections, /train skill)
- Deep dive into all 4 trainer workflows (scan, validate, teach, gap analysis)
- 5 comprehensive scenarios: new project onboarding, post-incident learning,
quarterly review, business domain teaching, pre-migration documentation
- Explicit limitations section (not a hard gate, budget limits, no auto-learning,
heuristic validation, no conflict resolution, no version history)
- Full reference tables for tools, skills, limits, and feature flag
- Updated `agent-modes.md`: add Researcher and Trainer mode sections with
examples, capabilities, and "when to use" guidance
- Updated `getting-started.md`: add training link to "Next steps"
- Updated `mkdocs.yml`: add Training nav section under Data Engineering
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: increase training budget to 16K chars and rewrite docs as harness customization guide
Training is not a CLAUDE.md replacement — it's the mechanism by which users
customize the data engineering harness for their specific project. The agent
works WITH the user to discover what it needs to know, rather than requiring
users to write perfect static instructions.
Changes:
- Increase TRAINING_BUDGET from 6000 to 16000 chars (removes the #1 criticism
from user simulations — budget was worse than unlimited CLAUDE.md)
- Complete docs rewrite with correct positioning:
- "Customizing Your AI Teammate" framing (not "Training Your AI Teammate")
- Research-backed "why" section (40-70% knowledge omission, guided discovery)
- Clear comparison table: training vs CLAUDE.md (complementary, not competing)
- 6 real-world scenarios including Databricks, Salesforce quirks, cost spikes
- Honest limitations section (not a linter, not an audit trail, not automatic)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: merge training into memory with context-aware relevance scoring
Replace two parallel injection systems (memory 8KB + training 16KB)
with a single unified injection that scores blocks by relevance to
the current agent.
How it works:
- All blocks (memory + training) loaded in one pass
- Each block scored: agent tag match (+10), training kind relevance
per agent (+1-5), applied count bonus (+0-3), recency (+0-2),
non-training base (+5)
- Builder sees rules/patterns first; analyst sees glossary/context first
- Budget is 20KB unified, filled greedily by score
- Training blocks still tracked with applied counts (fire-and-forget)
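The scoring and greedy fill above might look roughly like this. The names `scoreBlock`, `InjectionContext`, `UNIFIED_INJECTION_BUDGET`, and `AGENT_TRAINING_RELEVANCE` are mentioned in the commit message, but their shapes here are guesses: the `Block` type, the per-agent weights, the applied-count and recency curves, reading "20KB" as 20 * 1024 characters, and skipping (rather than stopping at) a block that does not fit are all assumptions.

```typescript
type Block = {
  id: string
  content: string
  tags: string[]
  trainingKind?: string // absent for plain memory blocks
  applied: number
  updated: string // ISO timestamp
}
type InjectionContext = { agent: string; now: number }

const UNIFIED_INJECTION_BUDGET = 20 * 1024
const DAY = 24 * 60 * 60 * 1000

// Hypothetical weights in the +1..+5 range.
const AGENT_TRAINING_RELEVANCE: Record<string, Record<string, number>> = {
  builder: { rule: 5, pattern: 5, standard: 3, playbook: 2, glossary: 1, context: 1 },
  analyst: { glossary: 5, context: 5, rule: 2, standard: 2, pattern: 1, playbook: 1 },
}

function scoreBlock(block: Block, ctx: InjectionContext): number {
  let score = 0
  if (block.tags.indexOf(ctx.agent) !== -1) score += 10 // agent tag match
  if (block.trainingKind !== undefined) {
    score += AGENT_TRAINING_RELEVANCE[ctx.agent]?.[block.trainingKind] ?? 1
    score += Math.min(3, Math.floor(block.applied / 2)) // +0..3, assumed curve
  } else {
    score += 5 // non-training base
  }
  const ageDays = (ctx.now - Date.parse(block.updated)) / DAY
  score += ageDays <= 7 ? 2 : ageDays <= 30 ? 1 : 0 // +0..2, assumed cutoffs
  return score
}

function inject(blocks: Block[], ctx: InjectionContext, budget = UNIFIED_INJECTION_BUDGET): string[] {
  const ranked = blocks
    .map((block) => ({ block, score: scoreBlock(block, ctx) }))
    .sort((a, b) => b.score - a.score)
  const picked: string[] = []
  let used = 0
  for (const { block } of ranked) {
    // Greedy fill: take each block that still fits, highest score first.
    if (used + block.content.length <= budget) {
      picked.push(block.id)
      used += block.content.length
    }
  }
  return picked
}
```

With these made-up weights, a rule block ranks ahead of a glossary block for `builder` and behind it for `analyst`, which is the behavior the list describes.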
Architecture:
- memory/prompt.ts: new scoreBlock(), unified inject() with InjectionContext
- memory/types.ts: UNIFIED_INJECTION_BUDGET, AGENT_TRAINING_RELEVANCE weights
- session/prompt.ts: single inject call with agent context (was 2 separate)
- training/prompt.ts: deprecated, delegates to MemoryPrompt (backward compat)
No changes to: MemoryStore, TrainingStore, training tools, memory tools.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: cut training_scan and training_validate, simplify docs
Research from 8 independent evaluations + SkillsBench (7,308 test runs)
found that compact focused context beats comprehensive docs by 20pp.
The training system's value is in correction capture (2-sec saves) and
team propagation (git sync) — not in regex scanning or keyword grep.
Removed:
- training_scan (255 lines) — regex pattern counting, not discovery
- training_validate (315 lines) — keyword grep, not validation
Simplified:
- trainer.txt: removed scan/validate workflows, focused on guided
teaching and curation
- agent-modes.md: updated trainer section with correction-focused example
- training docs: complete rewrite with new pitch:
"Correct the agent once. It remembers forever. Your team inherits it."
Backed by SkillsBench research showing compact > comprehensive.
Net: -753 lines. 152 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: remove dead accepted/rejected fields, add training tips, expand limitations
Gaps found by simulation team:
1. Remove `accepted`/`rejected` counters from TrainingBlockMeta — they were
never incremented anywhere in the codebase (dead code since inception)
2. Add 5 training discoverability tips to TUI tips (was 0 mentions in 152 tips)
3. Expand limitations section in docs with honest, complete list:
context budget, 20/kind limit, no approval workflow, SQL-focused,
git discipline required
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update site-wide docs for training and new agent modes
- Homepage: update from "Four agents" to "Seven agents" — add Researcher,
Trainer, Executive cards with descriptions
- Getting Started: update training link to match new pitch
"Corrections That Stick"
- Tools index: add Training row (3 tools + 3 skills) with link
- All references now consistent with simplified training system
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address Sentry review findings — 7 bugs fixed
1. stripTrainingMeta/parseTrainingMeta regex: remove multiline `m` flag
that could match user content starting with `<!-- training` mid-string
(types.ts, store.ts)
2. training_save content limit: reduce from 2500 to 1800 chars to account
for ~200 char metadata overhead against MemoryStore's 2048 char limit
(training-save.ts)
3. injectTrainingOnly: change `break` to `continue` so budget-exceeding
section headers skip to next kind instead of stopping all injection
(memory/prompt.ts)
4. injectTrainingOnly: track itemCount and return empty string when no
items injected (was returning header-only string, inflating budget
reports) (memory/prompt.ts)
5. projectDir cache: replace module-level singleton with Map keyed by
Instance.directory to prevent stale paths when AsyncLocalStorage
context changes across concurrent requests (memory/store.ts)
6. budgetUsage side effect: already fixed — delegates to injectTrainingOnly
which is read-only (no applied count increment). Sentry comments were
against pre-refactor code.
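The effect of the multiline flag in item 1 can be reproduced with two otherwise identical patterns. The exact expression used in types.ts and store.ts is not shown in this commit message, so the pattern and sample content below are assumptions.

```typescript
// Same pattern with and without the `m` flag (hypothetical meta layout).
const ANCHORED = /^<!-- training\n[\s\S]*?-->\n?/
const MULTILINE = /^<!-- training\n[\s\S]*?-->\n?/m

// User content that merely quotes a training comment on its second line.
const userContent =
  "Our docs show an example:\n<!-- training\nkind: rule\n-->\nDo not copy it."

// With `m`, "^" also matches after every newline, so the quoted comment
// is treated as a real meta block and stripped out of the content.
const strippedWithM = userContent.replace(MULTILINE, "")

// Without `m`, "^" matches only at the very start of the string,
// so the content is left intact.
const strippedWithoutM = userContent.replace(ANCHORED, "")
```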
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: CI failure + new Sentry finding — orphaned headers and agent test
1. Agent test: add researcher + trainer to "all disabled" test so it
correctly expects "no primary visible agent" when ALL agents are off
2. Orphaned section headers: add pre-check that at least one entry fits
before adding section header in both injectTrainingOnly and inject
memory section (prevents header-only output inflating budget reports)
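The header pre-check, together with the earlier `break` to `continue` change and the `itemCount` guard, can be sketched in one loop. `renderSections` is a hypothetical function loosely modeled on what the messages say about `injectTrainingOnly`; the header strings and the entry line format are invented for illustration.

```typescript
type Entry = { name: string; kind: string; content: string }

// KIND_HEADERS is named in the commit message; these values are made up.
const KIND_HEADERS: Record<string, string> = { rule: "## Rules", pattern: "## Patterns" }

const formatEntry = (e: Entry): string => `- ${e.name}: ${e.content}\n`

function renderSections(entries: Entry[], kinds: string[], budget: number): string {
  let out = ""
  let itemCount = 0
  for (const kind of kinds) {
    const header = `${KIND_HEADERS[kind] ?? "## " + kind}\n`
    const ofKind = entries.filter((e) => e.kind === kind)
    // Pre-check: emit the header only if at least one entry fits after it.
    const anyFits = ofKind.some((e) => out.length + header.length + formatEntry(e).length <= budget)
    // `continue`, not `break`: a later kind with smaller entries may still fit.
    if (!anyFits) continue
    out += header
    for (const e of ofKind) {
      const text = formatEntry(e)
      if (out.length + text.length > budget) continue
      out += text
      itemCount++
    }
  }
  // Guard kept from the earlier fix: never return header-only output.
  return itemCount === 0 ? "" : out
}
```

With a 40-character budget, an oversized rule is skipped along with its header, and a short pattern entry in the next section is still injected.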
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address multi-model code review findings
Fixes from 6-model consensus review (Claude + GPT + Gemini + Kimi + MiniMax + GLM-5):
1. training_remove: add name validation regex matching training_save
(Gemini finding — prevents path traversal via malformed names)
2. training_save: improve name transform to strip ALL non-alphanumeric
chars, not just whitespace (Gemini finding — "don't-use-float!"
now becomes "don-t-use-float" instead of failing regex)
3. incrementApplied: replace silent `.catch(() => {})` with warning
log (Kimi + GLM-5 consensus — fire-and-forget is by design but
failures should be visible in logs for debugging)
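The name transform in item 2 and the validation in item 1 might be sketched like this. `normalizeName`, `isValidName`, and the validation pattern are assumptions; the commit message gives only the input and output examples.

```typescript
// Lowercase, collapse every run of non-alphanumeric characters into a
// single hyphen, then trim hyphens left at either end.
function normalizeName(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
}

// Hypothetical validation pattern: hyphen-separated lowercase alphanumerics.
// Anything containing "/" or "." is rejected, which rules out "../" segments.
const NAME_RE = /^[a-z0-9]+(-[a-z0-9]+)*$/

function isValidName(name: string): boolean {
  return NAME_RE.test(name)
}

normalizeName("ARR")              // "arr"
normalizeName("don't-use-float!") // "don-t-use-float"
isValidName("../etc/passwd")      // false
```

Transforming on save and validating on remove keeps the two tools consistent: a name that `training_save` would produce always passes the check in `training_remove`.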
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address new Sentry findings — regex m flag and off-by-one budget check
1. formatTrainingEntry regex: remove multiline `m` flag that could
match user content mid-string (memory/prompt.ts:82)
2. Memory block budget check: change `<` to `<=` so blocks that fit
exactly into remaining budget are included (memory/prompt.ts:204)
Three prior Sentry findings were already fixed in earlier commits:
- projectDir cache (Map keyed by Instance.directory)
- injectTrainingOnly header-only return (itemCount guard)
- orphaned section headers (first-entry pre-check)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address 6-model consensus review — 4 remaining bugs
Fixes from consensus across Claude, GPT 5.2, Gemini 3.1, Kimi K2.5,
MiniMax M2.5, and GLM-5:
1. parseTrainingMeta: check safeParse().success before accessing .data
(GLM-5 + MiniMax consensus — accessing .data on failed parse returns
undefined, could cause downstream errors)
2. Stale detection: use `e.updated` not `e.created` so entries updated
recently aren't incorrectly flagged as stale (MiniMax finding)
3. training_list: pass scope/kind filter to count() so summary table
matches the filtered entries list (GPT finding)
4. training_remove: show hint entries from same scope only, not all
scopes (GPT + MiniMax finding)
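The `success` check in item 1 follows the usual discriminated-union pattern for safe-parse results. The sketch below uses a hand-rolled validator that mimics the `{ success, data }` result shape instead of importing a schema library; `TrainingMeta`, `safeParseMeta`, and this version of `parseTrainingMeta` are illustrative stand-ins, not the real code.

```typescript
type ParseResult<T> = { success: true; data: T } | { success: false; error: string }
type TrainingMeta = { kind: string; applied: number }

// Stand-in for a schema's safeParse: never throws, reports failure in the result.
function safeParseMeta(input: unknown): ParseResult<TrainingMeta> {
  if (typeof input !== "object" || input === null) {
    return { success: false, error: "not an object" }
  }
  const record = input as Record<string, unknown>
  const kind = record["kind"]
  const applied = record["applied"]
  if (typeof kind !== "string") return { success: false, error: "kind must be a string" }
  if (typeof applied !== "number") return { success: false, error: "applied must be a number" }
  return { success: true, data: { kind, applied } }
}

function parseTrainingMeta(input: unknown): TrainingMeta | undefined {
  const result = safeParseMeta(input)
  // Check the discriminant first; `data` exists only on the success branch.
  if (!result.success) return undefined
  return result.data
}
```

Returning `undefined` explicitly on failure makes the caller handle the missing metadata instead of tripping over it further downstream.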
Prior fixes already addressed: name validation on remove (Gemini),
name transform punctuation (Gemini), silent incrementApplied catch
(Kimi + GLM-5), regex m flag (MiniMax + Sentry).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
1 parent 7799ed1 commit d5daea7
File tree
46 files changed, +6107 -144 lines changed
- .github/meta
- .opencode/skills
- teach
- training-status
- train
- docs
- design
- docs
- data-engineering
- tools
- training
- packages/opencode
- .github/meta
- src
- agent
- altimate
- prompts
- tools
- training
- cli/cmd/tui/component
- flag
- memory
- session
- tool
- test
- agent
- training