Commit 91923fe

anandgupta42 and claude authored
feat: comprehensive telemetry instrumentation (#39)
* feat: comprehensive telemetry instrumentation for error visibility and data moat

  Instrument 25 event types across the CLI lifecycle: auth, MCP servers, Python engine, provider errors, permissions, upgrades, context utilization, agent outcomes, workflow sequencing, and environment census. Adds telemetry docs page, firewall endpoint, 25 unit tests, and privacy-safe helpers (categorizeToolName, bucketCount). No behavioral changes; all telemetry is opt-out via config.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review feedback on telemetry instrumentation

  - compaction.ts: scope compactionAttempt per-session via Map instead of module-level counter that leaked across sessions
  - prompt.ts: remove hardcoded doom_loops/permission_denials fields from agent_outcome (were always 0); doom loops and permission denials are already tracked as separate events
  - mcp/index.ts: make mcp_server_census fire-and-forget so listTools/listResources never block the MCP connect critical path
  - processor.ts: use dedicated generationCounter for context_utilization instead of toolCallCounter which counts a different thing
  - telemetry/index.ts: refactor categorizeToolName to use a pattern array with explicit ordering, making the match logic order-independent and easier to maintain
  - installation/index.ts: add "other" to upgrade method union instead of silently casting unknown methods to "npm"

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 30b8077 commit 91923fe
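
The commit message names two privacy-safe helpers, `categorizeToolName` and `bucketCount`, and describes the first as a pattern array with explicit ordering. Their implementation is not among the files shown on this page, so the following is only a minimal sketch of the idea: the category names, regular expressions, and bucket edges are hypothetical.

```typescript
// Hypothetical sketch: categories, patterns, and bucket edges are illustrative
// assumptions, not the values used in telemetry/index.ts.
type ToolCategory = "mcp" | "sql" | "file" | "other"

// Ordered list: the first matching pattern wins, so specific rules go first.
const TOOL_PATTERNS: ReadonlyArray<{ pattern: RegExp; category: ToolCategory }> = [
  { pattern: /^mcp__/, category: "mcp" },
  { pattern: /sql|query/i, category: "sql" },
  { pattern: /read|write|edit/i, category: "file" },
]

function categorizeToolName(name: string): ToolCategory {
  for (const { pattern, category } of TOOL_PATTERNS) {
    if (pattern.test(name)) return category
  }
  return "other"
}

// Report a coarse range instead of an exact count.
function bucketCount(n: number): string {
  if (n <= 0) return "0"
  if (n <= 10) return "1-10"
  if (n <= 50) return "11-50"
  return "51+"
}
```

Because the first matching entry wins, a name such as `mcp__sql_run` is reported as `mcp` even though it also matches the `sql` pattern. Reordering the array would change that result, which is why the order is kept explicit.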

15 files changed, +995 -9 lines changed

docs/docs/configure/telemetry.md

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
+# Telemetry
+
+Altimate Code collects anonymous usage data to help us improve the product. This page describes what we collect, why, and how to opt out.
+
+## What We Collect
+
+We collect the following categories of events:
+
+| Event | Description |
+|-------|-------------|
+| `session_start` | A new CLI session begins |
+| `session_end` | A CLI session ends (includes duration) |
+| `session_forked` | A session is forked from an existing one |
+| `generation` | An AI model generation completes (model ID, token counts, duration; no prompt content) |
+| `tool_call` | A tool is invoked (tool name and category; no arguments or output) |
+| `bridge_call` | A Python engine RPC call completes (method name and duration; no arguments) |
+| `command` | A CLI command is executed (command name only) |
+| `error` | An unhandled error occurs (error type and truncated message; no stack traces) |
+| `auth_login` | Authentication succeeds or fails (provider and method; no credentials) |
+| `auth_logout` | A user logs out (provider only) |
+| `mcp_server_status` | An MCP server connects, disconnects, or errors (server name and transport) |
+| `provider_error` | An AI provider returns an error (error type and HTTP status; no request content) |
+| `engine_started` | The Python engine starts or restarts (version and duration) |
+| `engine_error` | The Python engine fails to start (phase and truncated error) |
+| `upgrade_attempted` | A CLI upgrade is attempted (version and method) |
+| `permission_denied` | A tool permission is denied (tool name and source) |
+| `doom_loop_detected` | A repeated tool call pattern is detected (tool name and count) |
+| `compaction_triggered` | Context compaction runs (strategy and token counts) |
+| `tool_outputs_pruned` | Tool outputs are pruned during compaction (count) |
+| `environment_census` | Environment snapshot on project scan (warehouse types, dbt presence, feature flags; no hostnames) |
+| `context_utilization` | Context window usage per generation (token counts, utilization percentage, cache hit ratio) |
+| `agent_outcome` | Agent session outcome (agent type, tool/generation counts, cost, outcome status) |
+| `error_recovered` | Successful recovery from a transient error (error type, strategy, attempt count) |
+| `mcp_server_census` | MCP server capabilities after connect (tool and resource counts; no tool names) |
+| `context_overflow_recovered` | Context overflow is handled (strategy) |
+
+Each event includes a timestamp, anonymous session ID, and the CLI version.
+
+## Why We Collect Telemetry
+
+Telemetry helps us:
+
+- **Detect errors**: identify crashes, provider failures, and engine issues before users report them
+- **Improve reliability**: track MCP server stability, engine startup success rates, and upgrade outcomes
+- **Understand usage patterns**: know which tools and features are used so we can prioritize development
+- **Measure performance**: track generation latency, engine startup time, and bridge call duration
+
+## Disabling Telemetry
+
+To disable all telemetry collection, add this to your configuration file (`~/.config/altimate/config.json`):
+
+```json
+{
+  "telemetry": {
+    "disabled": true
+  }
+}
+```
+
+You can also set the environment variable:
+
+```bash
+export ALTIMATE_TELEMETRY_DISABLED=true
+```
+
+When telemetry is disabled, no events are sent and no network requests are made to the telemetry endpoint.
+
+## Privacy
+
+We take your privacy seriously. Altimate Code telemetry **never** collects:
+
+- SQL queries or query results
+- Code content, file contents, or file paths
+- Credentials, API keys, or tokens
+- Database connection strings or hostnames
+- Personally identifiable information beyond your email (used only for user correlation)
+- Tool arguments or outputs
+- AI prompt content or responses
+
+Error messages are truncated to 500 characters and scrubbed of file paths before sending.
+
+## Network
+
+Telemetry data is sent to Azure Application Insights:
+
+| Endpoint | Purpose |
+|----------|---------|
+| `eastus-8.in.applicationinsights.azure.com` | Telemetry ingestion |
+
+For a complete list of network endpoints, see the [Network Reference](../network.md).
+
+## For Contributors
+
+### Naming Convention
+
+Event type names use **snake_case** with a `domain_action` pattern:
+
+- `auth_login`, `auth_logout`: authentication events
+- `mcp_server_status`, `mcp_server_census`: MCP server lifecycle
+- `engine_started`, `engine_error`: Python engine events
+- `provider_error`: AI provider errors
+- `session_forked`: session lifecycle
+- `environment_census`: environment snapshot events
+- `context_utilization`, `context_overflow_recovered`: context management events
+- `agent_outcome`: agent session events
+- `error_recovered`: error recovery events
+
+### Adding a New Event
+
+1. **Define the type**: Add a new variant to the `Telemetry.Event` union in `packages/altimate-code/src/telemetry/index.ts`
+2. **Emit the event**: Call `Telemetry.track()` at the appropriate location
+3. **Update docs**: Add a row to the event table above
+
+### Privacy Checklist
+
+Before adding a new event, verify:
+
+- [ ] No SQL, code, or file contents are included
+- [ ] No credentials or connection strings are included
+- [ ] Error messages are truncated to 500 characters
+- [ ] File paths are not included in any field
+- [ ] Only tool names are sent, never arguments or outputs

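The two opt-out switches documented in `telemetry.md` (the `telemetry.disabled` config key and the `ALTIMATE_TELEMETRY_DISABLED` environment variable) could be combined into one guard that runs before any event is queued. This is a sketch under assumptions: `isTelemetryDisabled` and `TelemetryConfig` are hypothetical names, and treating only the string `true` as an opt-out is a guess at how the variable is parsed.

```typescript
// Hypothetical helper; the real check in the CLI may be structured differently.
interface TelemetryConfig {
  telemetry?: { disabled?: boolean }
}

function isTelemetryDisabled(
  config: TelemetryConfig,
  env: Record<string, string | undefined>,
): boolean {
  // Config file: { "telemetry": { "disabled": true } }
  if (config.telemetry?.disabled === true) return true
  // Environment variable: ALTIMATE_TELEMETRY_DISABLED=true
  // Assumption: only the literal string "true" (any case) disables telemetry.
  return env["ALTIMATE_TELEMETRY_DISABLED"]?.toLowerCase() === "true"
}
```

Passing the environment in as a parameter (for example `process.env` in Node or Bun) keeps the guard easy to unit test. Either switch alone is enough to opt out.
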
docs/docs/network.md

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ altimate needs outbound HTTPS access to:
 | `registry.npmjs.org` | Package updates |
 | `models.dev` | Model catalog (can be disabled) |
 | Your warehouse endpoints | Database connections |
+| `eastus-8.in.applicationinsights.azure.com` | Telemetry (Azure Application Insights) |

 ### Disable Model Fetching

docs/mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -96,6 +96,7 @@ nav:
 - Appearance:
 - Themes: configure/themes.md
 - Keybinds: configure/keybinds.md
+- Telemetry: configure/telemetry.md
 - Integrations:
 - LSP Servers: configure/lsp.md
 - MCP Servers: configure/mcp-servers.md

packages/altimate-code/src/bridge/engine.ts

Lines changed: 49 additions & 3 deletions
@@ -17,6 +17,7 @@ import fs from "fs/promises"
 import path from "path"
 import { Global } from "../global"
 import { UI } from "../cli/ui"
+import { Telemetry } from "@/telemetry"

 declare const ALTIMATE_ENGINE_VERSION: string
 declare const ALTIMATE_CLI_VERSION: string
@@ -94,7 +95,17 @@ export async function ensureUv(): Promise<void> {
   await fs.mkdir(path.join(dir, "bin"), { recursive: true })

   const response = await fetch(url)
-  if (!response.ok) throw new Error(`Failed to download uv: ${response.statusText}`)
+  if (!response.ok) {
+    const errMsg = `Failed to download uv: ${response.statusText}`
+    Telemetry.track({
+      type: "engine_error",
+      timestamp: Date.now(),
+      session_id: Telemetry.getContext().sessionId,
+      phase: "uv_download",
+      error_message: errMsg.slice(0, 500),
+    })
+    throw new Error(errMsg)
+  }
   const buffer = Buffer.from(await response.arrayBuffer())

   const tmpFile = path.join(dir, "bin", asset)
@@ -146,8 +157,11 @@ export async function ensureEngine(): Promise<void> {

 async function ensureEngineImpl(): Promise<void> {
   const manifest = await readManifest()
+  const isUpgrade = manifest !== null
   if (manifest && manifest.engine_version === ALTIMATE_ENGINE_VERSION) return

+  const startTime = Date.now()
+
   await ensureUv()

   const uv = uvPath()
@@ -157,13 +171,35 @@ async function ensureEngineImpl(): Promise<void> {
   // Create venv if it doesn't exist
   if (!existsSync(venvDir)) {
     UI.println(`${UI.Style.TEXT_DIM}Creating Python environment...${UI.Style.TEXT_NORMAL}`)
-    execFileSync(uv, ["venv", "--python", "3.12", venvDir])
+    try {
+      execFileSync(uv, ["venv", "--python", "3.12", venvDir])
+    } catch (e: any) {
+      Telemetry.track({
+        type: "engine_error",
+        timestamp: Date.now(),
+        session_id: Telemetry.getContext().sessionId,
+        phase: "venv_create",
+        error_message: (e?.message ?? String(e)).slice(0, 500),
+      })
+      throw e
+    }
   }

   // Install/upgrade engine
   const pythonPath = enginePythonPath()
   UI.println(`${UI.Style.TEXT_DIM}Installing altimate-engine ${ALTIMATE_ENGINE_VERSION}...${UI.Style.TEXT_NORMAL}`)
-  execFileSync(uv, ["pip", "install", "--python", pythonPath, `altimate-engine==${ALTIMATE_ENGINE_VERSION}`])
+  try {
+    execFileSync(uv, ["pip", "install", "--python", pythonPath, `altimate-engine==${ALTIMATE_ENGINE_VERSION}`])
+  } catch (e: any) {
+    Telemetry.track({
+      type: "engine_error",
+      timestamp: Date.now(),
+      session_id: Telemetry.getContext().sessionId,
+      phase: "pip_install",
+      error_message: (e?.message ?? String(e)).slice(0, 500),
+    })
+    throw e
+  }

   // Get python version
   const pyVersion = execFileSync(pythonPath, ["--version"]).toString().trim()
@@ -178,6 +214,16 @@ async function ensureEngineImpl(): Promise<void> {
     installed_at: new Date().toISOString(),
   })

+  Telemetry.track({
+    type: "engine_started",
+    timestamp: Date.now(),
+    session_id: Telemetry.getContext().sessionId,
+    engine_version: ALTIMATE_ENGINE_VERSION,
+    python_version: pyVersion,
+    status: isUpgrade ? "upgraded" : "started",
+    duration_ms: Date.now() - startTime,
+  })
+
   UI.println(`${UI.Style.TEXT_SUCCESS}Engine ready${UI.Style.TEXT_NORMAL}`)
 }

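Each `engine_error` call site in `engine.ts` builds its message with `.slice(0, 500)`, and the docs page in this commit also states that file paths are scrubbed before sending. One way to keep both rules in a single place is a small helper. The sketch below is not part of this commit: `toTelemetryMessage` and its path pattern are hypothetical, and the regular expression deliberately over-matches (it also rewrites URL paths).

```typescript
const MAX_ERROR_LENGTH = 500

// Hypothetical helper: scrub anything that looks like an absolute path,
// then truncate to the 500-character limit used throughout the diff.
function toTelemetryMessage(e: unknown): string {
  const raw = e instanceof Error ? e.message : String(e)
  // Assumption: a token starting with "/" or a drive prefix such as "C:\"
  // is treated as a path, up to the next whitespace or quote character.
  const scrubbed = raw.replace(/(?:[A-Za-z]:\\|\/)[^\s'"]+/g, "<path>")
  return scrubbed.slice(0, MAX_ERROR_LENGTH)
}
```

A call site would then pass `toTelemetryMessage(e)` as `error_message` instead of repeating the slice expression. Scrubbing before truncating avoids cutting a path in half and leaving a fragment behind.
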
packages/altimate-code/src/cli/cmd/auth.ts

Lines changed: 83 additions & 0 deletions
@@ -10,6 +10,7 @@ import { Config } from "../../config/config"
 import { Global } from "../../global"
 import { Plugin } from "../../plugin"
 import { Instance } from "../../project/instance"
+import { Telemetry } from "../../telemetry"
 import type { Hooks } from "@altimateai/altimate-code-plugin"

 type PluginAuth = NonNullable<Hooks["auth"]>
@@ -78,6 +79,15 @@ async function handlePluginAuth(plugin: { auth: PluginAuth }, provider: string):
       const result = await authorize.callback()
       if (result.type === "failed") {
         spinner.stop("Failed to authorize", 1)
+        Telemetry.track({
+          type: "auth_login",
+          timestamp: Date.now(),
+          session_id: "cli",
+          provider_id: provider,
+          method: "oauth",
+          status: "error",
+          error: "OAuth auto authorization failed",
+        })
       }
       if (result.type === "success") {
         const saveProvider = result.provider ?? provider
@@ -98,6 +108,14 @@ async function handlePluginAuth(plugin: { auth: PluginAuth }, provider: string):
           })
         }
         spinner.stop("Login successful")
+        Telemetry.track({
+          type: "auth_login",
+          timestamp: Date.now(),
+          session_id: "cli",
+          provider_id: saveProvider,
+          method: "oauth",
+          status: "success",
+        })
       }
     }

@@ -110,6 +128,15 @@ async function handlePluginAuth(plugin: { auth: PluginAuth }, provider: string):
       const result = await authorize.callback(code)
       if (result.type === "failed") {
         prompts.log.error("Failed to authorize")
+        Telemetry.track({
+          type: "auth_login",
+          timestamp: Date.now(),
+          session_id: "cli",
+          provider_id: provider,
+          method: "oauth",
+          status: "error",
+          error: "OAuth code authorization failed",
+        })
       }
       if (result.type === "success") {
         const saveProvider = result.provider ?? provider
@@ -130,6 +157,14 @@ async function handlePluginAuth(plugin: { auth: PluginAuth }, provider: string):
           })
         }
         prompts.log.success("Login successful")
+        Telemetry.track({
+          type: "auth_login",
+          timestamp: Date.now(),
+          session_id: "cli",
+          provider_id: saveProvider,
+          method: "oauth",
+          status: "success",
+        })
       }
     }

@@ -142,6 +177,15 @@ async function handlePluginAuth(plugin: { auth: PluginAuth }, provider: string):
       const result = await method.authorize(inputs)
       if (result.type === "failed") {
         prompts.log.error("Failed to authorize")
+        Telemetry.track({
+          type: "auth_login",
+          timestamp: Date.now(),
+          session_id: "cli",
+          provider_id: provider,
+          method: "api_key",
+          status: "error",
+          error: "API key authorization failed",
+        })
       }
       if (result.type === "success") {
         const saveProvider = result.provider ?? provider
@@ -150,6 +194,14 @@ async function handlePluginAuth(plugin: { auth: PluginAuth }, provider: string):
           key: result.key,
         })
         prompts.log.success("Login successful")
+        Telemetry.track({
+          type: "auth_login",
+          timestamp: Date.now(),
+          session_id: "cli",
+          provider_id: saveProvider,
+          method: "api_key",
+          status: "success",
+        })
       }
       prompts.outro("Done")
       return true
@@ -270,6 +322,15 @@ export const AuthLoginCommand = cmd({
           const exit = await proc.exited
           if (exit !== 0) {
             prompts.log.error("Failed")
+            Telemetry.track({
+              type: "auth_login",
+              timestamp: Date.now(),
+              session_id: "cli",
+              provider_id: args.url!,
+              method: "api_key",
+              status: "error",
+              error: "Well-known auth command failed",
+            })
             prompts.outro("Done")
             return
           }
@@ -280,6 +341,14 @@ export const AuthLoginCommand = cmd({
             token: token.trim(),
           })
           prompts.log.success("Logged into " + args.url)
+          Telemetry.track({
+            type: "auth_login",
+            timestamp: Date.now(),
+            session_id: "cli",
+            provider_id: args.url!,
+            method: "api_key",
+            status: "success",
+          })
           prompts.outro("Done")
           return
         }
@@ -411,6 +480,14 @@ export const AuthLoginCommand = cmd({
           type: "api",
           key,
         })
+        Telemetry.track({
+          type: "auth_login",
+          timestamp: Date.now(),
+          session_id: "cli",
+          provider_id: provider,
+          method: "api_key",
+          status: "success",
+        })

         prompts.outro("Done")
       },
@@ -439,6 +516,12 @@ export const AuthLogoutCommand = cmd({
     })
     if (prompts.isCancel(providerID)) throw new UI.CancelledError()
     await Auth.remove(providerID)
+    Telemetry.track({
+      type: "auth_logout",
+      timestamp: Date.now(),
+      session_id: "cli",
+      provider_id: providerID,
+    })
     prompts.outro("Logout successful")
   },
 })

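The `auth_login` calls in `auth.ts` repeat the same fields at every call site, differing only in provider, method, and outcome. A thin wrapper could centralize the shape. The following is a sketch, not code from this commit: `AuthLoginEvent`, `trackAuthLogin`, and the in-memory `track` stand-in are hypothetical, and the field names are copied from the `Telemetry.track()` calls in the diff rather than from the actual `Telemetry.Event` union, which is not shown here.

```typescript
// Hypothetical event shape, mirroring the fields used in the auth.ts diff.
type AuthLoginEvent = {
  type: "auth_login"
  timestamp: number
  session_id: string
  provider_id: string
  method: "oauth" | "api_key"
  status: "success" | "error"
  error?: string
}

// Stand-in for Telemetry.track(); collects events in memory so the sketch runs on its own.
const sent: AuthLoginEvent[] = []
function track(event: AuthLoginEvent): void {
  sent.push(event)
}

// Passing an error string marks the attempt as failed; omitting it marks success.
function trackAuthLogin(
  providerId: string,
  method: AuthLoginEvent["method"],
  error?: string,
): void {
  track({
    type: "auth_login",
    timestamp: Date.now(),
    session_id: "cli", // the auth commands in the diff use the literal "cli"
    provider_id: providerId,
    method,
    status: error === undefined ? "success" : "error",
    ...(error !== undefined ? { error } : {}),
  })
}
```

With such a wrapper a failure branch would read `trackAuthLogin(provider, "oauth", "OAuth auto authorization failed")`, and a success branch `trackAuthLogin(saveProvider, "oauth")`. Deriving `status` from the presence of `error` keeps the two fields from drifting apart.
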