Commit 335dcb8

anandgupta42 and claude committed
feat: add Trainer agent mode with pattern discovery and training validation
Add dedicated trainer mode — the 8th primary agent — for systematically building the AI teammate's knowledge base. Unlike inline corrections in other modes, trainer mode actively scans codebases, validates training against reality, and guides knowledge curation.

Changes:
- New `trainer` agent mode with read-only permissions (no write/edit/sql_execute)
- New `training_scan` tool: auto-discover patterns in models, SQL, config, tests, docs
- New `training_validate` tool: check training compliance against actual codebase
- Expand `TrainingKind` to 6 types: add `context` (background "why" knowledge) and `playbook` (multi-step procedures)
- Update `count()` to derive from enum (prevents drift when kinds change)
- Add KIND_HEADERS for context and playbook in prompt injection
- Update injection order: rules first, playbooks last (budget priority)
- Update training-save and training-list descriptions for new kinds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent b9a40a4 commit 335dcb8
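
The commit message says `count()` now derives its tally from the kind enum so that adding a kind cannot leave a counter behind. The real `TrainingKind` definition and `count()` body are not part of the hunks shown here, so the following is only a minimal sketch of the idea. It assumes a plain `as const` array in place of the real enum, and `TrainingEntry` and this `count` implementation are hypothetical.

```typescript
// Hypothetical stand-in for the commit's TrainingKind enum.
// The six values are the ones named in the commit message.
const TRAINING_KINDS = ["pattern", "rule", "glossary", "standard", "context", "playbook"] as const
type TrainingKind = (typeof TRAINING_KINDS)[number]

// Hypothetical entry shape, reduced to the fields this sketch needs.
interface TrainingEntry {
  kind: TrainingKind
  name: string
}

// Seed the tally from the kinds list instead of a hand-written object literal,
// so a kind added to TRAINING_KINDS gets a zeroed counter automatically.
function count(entries: TrainingEntry[]): Record<TrainingKind, number> {
  const counts = {} as Record<TrainingKind, number>
  for (const kind of TRAINING_KINDS) counts[kind] = 0
  for (const entry of entries) counts[entry.kind] += 1
  return counts
}

const exampleCounts = count([
  { kind: "rule", name: "no-float-financial" },
  { kind: "playbook", name: "incident-response" },
  { kind: "rule", name: "not-null-tests" },
])
console.log(exampleCounts)
```

With a hand-written `{ pattern: 0, rule: 0, ... }` literal, a new kind would silently be missing from the result; iterating the kinds list removes that failure mode.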

10 files changed

+727
-8
lines changed


packages/opencode/src/agent/agent.ts

Lines changed: 23 additions & 0 deletions
@@ -20,6 +20,7 @@ import PROMPT_VALIDATOR from "../altimate/prompts/validator.txt"
 import PROMPT_MIGRATOR from "../altimate/prompts/migrator.txt"
 import PROMPT_EXECUTIVE from "../altimate/prompts/executive.txt"
 import PROMPT_RESEARCHER from "../altimate/prompts/researcher.txt"
+import PROMPT_TRAINER from "../altimate/prompts/trainer.txt"
 // altimate_change end
 import { PermissionNext } from "@/permission/next"
 import { mergeDeep, pipe, sortBy, values } from "remeda"
@@ -258,6 +259,28 @@ export namespace Agent {
         mode: "primary",
         native: true,
       },
+      trainer: {
+        name: "trainer",
+        description: "Teach your AI teammate. Scan for patterns, validate training against code, curate knowledge. Read-only.",
+        prompt: PROMPT_TRAINER,
+        options: {},
+        permission: PermissionNext.merge(
+          defaults,
+          PermissionNext.fromConfig({
+            "*": "deny",
+            read: "allow", grep: "allow", glob: "allow", bash: "allow",
+            question: "allow",
+            training_save: "allow", training_list: "allow", training_remove: "allow",
+            training_scan: "allow", training_validate: "allow",
+            schema_inspect: "allow", schema_index: "allow", schema_search: "allow",
+            schema_cache_status: "allow",
+            warehouse_list: "allow", warehouse_discover: "allow",
+          }),
+          user,
+        ),
+        mode: "primary",
+        native: true,
+      },
       // altimate_change end
       plan: {
         name: "plan",
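
The trainer's permission block is a deny-by-default allow-list: `"*": "deny"` blocks every tool, and the named keys re-enable specific ones. How `PermissionNext.fromConfig` and `PermissionNext.merge` actually resolve rules is not shown in this diff, so the sketch below is an assumption. It uses a hypothetical `resolve` helper in which an exact tool name wins over the `"*"` fallback, and it covers only a subset of the allowed tools.

```typescript
type Action = "allow" | "deny" | "ask"

// Subset of the trainer rules from the diff, expressed as a Map for this sketch.
const trainerRules = new Map<string, Action>([
  ["*", "deny"],
  ["read", "allow"],
  ["grep", "allow"],
  ["glob", "allow"],
  ["bash", "allow"],
  ["training_scan", "allow"],
  ["training_validate", "allow"],
])

// Hypothetical resolver (not the PermissionNext API): an exact match on the
// tool name takes precedence, then the "*" wildcard, then "ask" as a last resort.
function resolve(rules: ReadonlyMap<string, Action>, tool: string): Action {
  return rules.get(tool) ?? rules.get("*") ?? "ask"
}

console.log(resolve(trainerRules, "read"))  // allowed explicitly
console.log(resolve(trainerRules, "write")) // falls through to "*"
```

Under this reading, `write`, `edit`, and `sql_execute` are denied because they are never named, which matches the commit message. `bash` is allowed outright, so at the permission layer nothing restricts it to read-only commands; that restriction appears to rest on the trainer prompt's wording.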
Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
+You are altimate-code in trainer mode — a knowledge engineering agent that systematically builds your team's AI training.
+
+Your role: Build and validate training data that makes other agent modes (builder, analyst, validator) more effective. You scan codebases, extract patterns, test understanding, and maintain training libraries.
+
+You CANNOT modify project files. You can only read, scan, validate, and manage training entries.
+
+## Training Kinds
+
+Six types of knowledge you can save:
+
+- **pattern**: Structural example learned from code (how staging models look, CTE conventions, macro organization)
+- **rule**: Hard constraint from corrections or policy (never use FLOAT for money, always add NOT NULL tests)
+- **glossary**: Domain-specific term definition (ARR = Annual Recurring Revenue, churn = subscription cancelled 30+ days)
+- **standard**: Team convention from documentation (PR requirements, code review checklist, naming conventions)
+- **context**: Background knowledge explaining "why" — not enforceable, but critical for reasoning (why we chose Snowflake, why we avoid ephemeral materialization)
+- **playbook**: Multi-step procedure for specific scenarios (incident response, migration runbook, environment setup)
+
+## Core Workflows
+
+### 1. Pattern Discovery
+When asked to scan or discover patterns:
+1. Use `training_scan` to analyze the codebase — specify target (models, sql, config, tests, docs, all)
+2. Review the discovered patterns and present them to the user
+3. For each pattern worth keeping, draft a training entry with:
+   - Appropriate kind (pattern, standard, rule, etc.)
+   - Clear, specific name (e.g., `staging-cte-structure`, not `model-pattern`)
+   - Actionable content with the "why", not just the "what"
+   - Source citation (which files demonstrate this pattern)
+4. Only save entries the user explicitly confirms. Never auto-save.
+
+### 2. Training Validation
+When asked to validate or audit training:
+1. Use `training_validate` to check entries against the actual codebase
+2. Report findings:
+   - **Followed**: Code matches the training (with compliance percentage)
+   - **Violated**: Code contradicts the training (with specific files)
+   - **Stale**: No relevant code found (training may be outdated)
+3. Suggest specific actions: update content, remove stale entries, or document exceptions
+
+### 3. Guided Teaching
+When a user wants to teach you something directly:
+1. Listen to what they want you to learn
+2. Ask clarifying questions: What's the scope? Is this a hard rule or a preference? Why does this matter?
+3. Determine the right training kind
+4. Draft the entry — show it to the user before saving
+5. Check for duplicates or conflicts with existing training via `training_list`
+6. Save only after user approval
+
+### 4. Gap Analysis
+When asked what you don't know:
+1. Fetch current training via `training_list`
+2. Identify gaps across these knowledge areas:
+   - Naming conventions (models, columns, schemas, warehouses)
+   - SQL patterns (CTE style, join conventions, aggregation rules)
+   - dbt conventions (materializations, tests, documentation, macros)
+   - Business domain (glossary terms, metric definitions)
+   - Operational procedures (incident response, deployment, migration)
+   - Architecture context (technology choices, constraints, rationale)
+3. Suggest what to teach next, prioritized by impact
+
+### 5. Training Curation
+Proactively maintain training quality:
+1. Use `training_list` to review all entries and insights
+2. Identify stale entries (saved but never applied) — suggest removal
+3. Identify high-value entries (applied frequently) — suggest reinforcement
+4. Find consolidation opportunities (multiple similar entries → one comprehensive entry)
+5. Check budget usage — if approaching limits, suggest what to trim
+
+## Available Tools
+
+### Training Management
+- `training_save` — Save a new training entry (pattern, rule, glossary, standard, context, playbook)
+- `training_list` — List all training with applied counts, budget usage, and insights
+- `training_remove` — Remove outdated or incorrect entries
+
+### Discovery & Validation
+- `training_scan` — Auto-discover patterns in the codebase (models, SQL, config, tests, docs)
+- `training_validate` — Check training compliance against actual code
+
+### Codebase Exploration
+- `read`, `grep`, `glob` — Search and read project files
+- `bash` — Run read-only commands (git log, find, wc, etc.)
+- `schema_inspect`, `schema_search`, `schema_index` — Explore warehouse schemas
+- `warehouse_list`, `warehouse_discover` — Discover warehouse connections
+
+## Quality Standards
+
+Before saving any training entry, verify:
+1. **Specific**: Is it concrete enough to apply? ("Use DECIMAL(18,2) for money" not "use good types")
+2. **Justified**: Does it include the "why"? (The reason behind the rule, not just the rule)
+3. **Validated**: Does 80%+ of the codebase actually follow this? (Use training_validate to check)
+4. **Unique**: Does it overlap with existing training? (Check training_list first)
+5. **Scoped correctly**: Is this personal preference (global) or team standard (project)?
+
+### Good vs Bad Training
+
+Bad: `rule/good-naming` → "Use descriptive names"
+Good: `rule/no-float-financial` → "Use DECIMAL(18,2) instead of FLOAT for financial columns (*_amount, *_price, *_cost). FLOAT causes rounding errors that compound across aggregations — we had a $47K reconciliation discrepancy from this."
+
+Bad: `pattern/model-pattern` → "Models should be well-structured"
+Good: `pattern/staging-cte-structure` → "Staging models follow: source CTE (rename columns) → filtered CTE (remove test data) → final (select from filtered). This pattern is in all 12 staging models. See stg_orders.sql."
+
+## Guardrails
+
+- NEVER modify project files. You teach; you don't build.
+- ALWAYS confirm with the user before saving. Never auto-save.
+- PREFER consolidation over proliferation. One well-written entry beats five shallow ones.
+- CITE sources. Every pattern should reference the file it came from.
+- BE HONEST about uncertainty. If a pattern is ambiguous or inconsistently followed, say so.
+
+## Available Skills
+- /teach — Learn a pattern from an example file (delegates to guided teaching)
+- /train — Learn standards from a document
+- /training-status — Dashboard of all learned knowledge
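
The prompt's Training Validation workflow reports each entry as Followed, Violated, or Stale, and its Quality Standards ask whether 80%+ of the codebase follows a pattern. The `training_validate` implementation is not among the hunks shown, so the following is only a sketch of how such a verdict could be derived. The `classify` function, its inputs, and the reuse of the 80% figure as the Followed/Violated cut-off are all assumptions.

```typescript
type Verdict = "followed" | "violated" | "stale"

interface ValidationResult {
  verdict: Verdict
  compliance: number | null // percentage 0-100; null when no relevant code exists
}

// Assumption: borrow the "80%+" figure from the prompt's Quality Standards.
const FOLLOWED_THRESHOLD = 80

// Hypothetical classifier: `relevantFiles` is how many files the entry applies to,
// `matchingFiles` is how many of those comply with it.
function classify(matchingFiles: number, relevantFiles: number): ValidationResult {
  if (relevantFiles === 0) return { verdict: "stale", compliance: null }
  const compliance = Math.round((matchingFiles / relevantFiles) * 100)
  const verdict: Verdict = compliance >= FOLLOWED_THRESHOLD ? "followed" : "violated"
  return { verdict, compliance }
}

console.log(classify(12, 12)) // every relevant file complies
console.log(classify(3, 10))  // most relevant files contradict the entry
console.log(classify(0, 0))   // nothing relevant found
```

Treating "no relevant files" as its own case keeps a stale entry from being reported as a 0% violation, which would suggest fixing code rather than removing outdated training.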

packages/opencode/src/altimate/tools/training-list.ts

Lines changed: 5 additions & 3 deletions
@@ -13,10 +13,10 @@ export const TrainingListTool = Tool.define("training_list", {
     "Shows what your teammate has been taught and how often each entry has been applied.",
     "Use this to review training, check what's been learned, or find entries to update/remove.",
     "",
-    "Filter by kind (pattern/rule/glossary/standard) or scope (global/project/all).",
+    "Filter by kind (pattern/rule/glossary/standard/context/playbook) or scope (global/project/all).",
   ].join("\n"),
   parameters: z.object({
-    kind: TrainingKind.optional().describe("Filter by kind: pattern, rule, glossary, or standard"),
+    kind: TrainingKind.optional().describe("Filter by kind: pattern, rule, glossary, standard, context, or playbook"),
     scope: z
       .enum(["global", "project", "all"])
       .optional()
@@ -49,6 +49,8 @@ export const TrainingListTool = Tool.define("training_list", {
           `| Rules | ${counts.rule} |`,
           `| Glossary | ${counts.glossary} |`,
           `| Standards | ${counts.standard} |`,
+          `| Context | ${counts.context} |`,
+          `| Playbooks | ${counts.playbook} |`,
           `| **Total** | **${entries.length}** |`,
           "",
           `**Context budget**: ${budget.used}/${budget.budget} chars (${budget.percent}% full)`,
@@ -77,7 +79,7 @@ export const TrainingListTool = Tool.define("training_list", {
       }

       const sections: string[] = []
-      for (const kind of ["rule", "pattern", "standard", "glossary"] as const) {
+      for (const kind of ["rule", "pattern", "standard", "glossary", "context", "playbook"] as const) {
         const items = grouped.get(kind)
         if (!items || items.length === 0) continue
         sections.push(`### ${kind.charAt(0).toUpperCase() + kind.slice(1)}s`)
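
The listing above reports a character budget (`budget.used`/`budget.budget`), and the commit message describes an injection order of "rules first, playbooks last (budget priority)". The injection code itself is not in the hunks shown, so the sketch below only illustrates how a priority order and a character budget might interact. The `inject` function and `Entry` shape are hypothetical, and the order of the four middle kinds is a guess (here it mirrors the listing loop).

```typescript
interface Entry {
  kind: string
  text: string
}

interface InjectionResult {
  included: Entry[]
  used: number
  budget: number
  percent: number
}

// Only "rule first" and "playbook last" come from the commit message;
// the middle of this list is an assumption.
const INJECTION_ORDER = ["rule", "pattern", "standard", "glossary", "context", "playbook"]

// Hypothetical budgeted injection: take entries in priority order and
// skip any entry that would push the total past the character budget.
function inject(entries: Entry[], budget: number): InjectionResult {
  const rank = (kind: string): number => {
    const index = INJECTION_ORDER.indexOf(kind)
    return index === -1 ? INJECTION_ORDER.length : index
  }
  const ordered = [...entries].sort((a, b) => rank(a.kind) - rank(b.kind))
  const included: Entry[] = []
  let used = 0
  for (const entry of ordered) {
    if (used + entry.text.length > budget) continue
    included.push(entry)
    used += entry.text.length
  }
  const percent = budget > 0 ? Math.round((used / budget) * 100) : 0
  return { included, used, budget, percent }
}

const injected = inject(
  [
    { kind: "playbook", text: "p".repeat(10) },
    { kind: "rule", text: "r".repeat(6) },
    { kind: "context", text: "c".repeat(5) },
  ],
  12,
)
console.log(injected.included.map((e) => e.kind), `${injected.used}/${injected.budget}`)
```

Placing playbooks last means that when the budget runs short, long multi-step procedures are what get dropped, while hard rules are the most likely to survive.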

packages/opencode/src/altimate/tools/training-save.ts

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,8 @@ export const TrainingSaveTool = Tool.define("training_save", {
     "- rule: A specific rule from a correction (e.g., 'never use FLOAT for financial columns')",
     "- glossary: A domain-specific term definition (e.g., 'ARR means Annual Recurring Revenue')",
     "- standard: A team standard from documentation (e.g., SQL style guide rules)",
+    "- context: Background knowledge explaining 'why' (e.g., why we chose Snowflake over BigQuery)",
+    "- playbook: A multi-step procedure (e.g., how to respond to a data quality incident)",
     "",
     `Max ${TRAINING_MAX_PATTERNS_PER_KIND} entries per kind. Training persists across sessions.`,
     "Project-scope training is committed to git so the whole team benefits.",
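
The tool description caps storage at `TRAINING_MAX_PATTERNS_PER_KIND` entries per kind, which means the two new kinds each get their own allowance instead of sharing one with the existing four. Neither the constant's value nor the save logic appears in this diff, so the following is a sketch under assumptions. The value `20`, the `save` function, and the in-memory `store` array are all hypothetical.

```typescript
type Kind = "pattern" | "rule" | "glossary" | "standard" | "context" | "playbook"

interface Entry {
  kind: Kind
  name: string
  content: string
}

// Hypothetical value; the real TRAINING_MAX_PATTERNS_PER_KIND is not shown in this diff.
const MAX_PER_KIND = 20

// Hypothetical save: reject the entry once its own kind is at the cap,
// regardless of how full the other kinds are.
function save(store: Entry[], entry: Entry, max: number = MAX_PER_KIND): { ok: boolean; reason?: string } {
  const sameKind = store.filter((e) => e.kind === entry.kind).length
  if (sameKind >= max) {
    return { ok: false, reason: `limit of ${max} ${entry.kind} entries reached` }
  }
  store.push(entry)
  return { ok: true }
}

const store: Entry[] = []
const first = save(store, { kind: "rule", name: "no-float-financial", content: "Use DECIMAL(18,2) for money." }, 2)
const second = save(store, { kind: "rule", name: "not-null-tests", content: "Always add NOT NULL tests." }, 2)
const third = save(store, { kind: "rule", name: "extra-rule", content: "One rule too many." }, 2)
const other = save(store, { kind: "context", name: "why-snowflake", content: "Background on the warehouse choice." }, 2)
console.log(first.ok, second.ok, third.ok, other.ok)
```

A per-kind cap keeps one busy category, such as rules, from crowding out context or playbooks, at the cost of a higher total ceiling once more kinds exist.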
