Skip to content

feat: enrich telemetry with pythonVersion and dbtCoreVersion customAttributes#1942

Draft
ralphstodomingo wants to merge 1 commit intomasterfrom
feat/telemetry-python-and-dbt-core-versions
Draft

feat: enrich telemetry with pythonVersion and dbtCoreVersion customAttributes#1942
ralphstodomingo wants to merge 1 commit intomasterfrom
feat/telemetry-python-and-dbt-core-versions

Conversation

@ralphstodomingo
Copy link
Copy Markdown
Contributor

@ralphstodomingo ralphstodomingo commented May 8, 2026

Summary

Production telemetry currently carries common.os / common.nodeArch / common.extversion etc. on every event, but not the user's Python interpreter version or installed dbt-core distribution version. That's a real blindspot — App Insights can't split error clusters by Python version (e.g. is pythonBridgeInitPythonError "mashumaro UnserializableField" a Python 3.13 + mashumaro 3.14 incompat or something else?), and we can't tell whether catalogPythonError "missing used_schemas" correlates with a specific dbt-core minor.

This PR adds two customAttributes that get merged into every event going forward.

⚠️ Blocked on: AltimateAI/altimate-dbt-integration#100@altimateai/dbt-integration@0.3.1 release. The dbt-core probe utility lives upstream so mcp-engine / mcp-server can reuse it; this PR consumes the published export and bumps the dependency.

What's added

Attribute Source Cost
pythonVersion PythonEnvironment.pythonVersion (already populated via the VS Code Python extension's API on interpreter activation) zero — no extra probe
dbtCoreVersion probeDbtCoreVersion(pythonPath) from @altimateai/dbt-integration — spawns python -c "from importlib.metadata import version; print(version('dbt-core'))" one ~50ms subprocess on activation + on interpreter change

importlib.metadata.version reads dist-info directly so the probe survives even when dbt's own import chain is broken — exactly the failure mode PR #96 targeted. The bridge can be dead and we still get the version.

Both are best-effort — if either probe fails (interpreter missing, dbt-core not installed, probe times out), the corresponding customAttribute is cleared rather than left at a stale value from the previous interpreter, and activation is never blocked.

Wiring

// In DBTPowerUserExtension.activate(), AFTER detectDBT() but BEFORE
// initializeDBTProjects() — see "Why this priming order matters" below.
await this.dbtProjectContainer.detectDBT();
void this.refreshVersionTelemetryAttributes();
const pythonEnv = this.dbtProjectContainer.getPythonEnvironment();
this.disposables.push(
  pythonEnv.onPythonEnvironmentChanged(() => {
    void this.refreshVersionTelemetryAttributes();
  }),
);
await this.dbtProjectContainer.initializeDBTProjects();

The refresh helper sets pythonVersion synchronously from the already-populated PythonEnvironment.pythonVersion, then awaits probeDbtCoreVersion for the dbt-core probe.

Why this priming order matters

initializeDBTProjects() is the python-bridge init step where pythonBridgeInitPythonError itself originates — exactly the failure mode this PR exists to dimension. If the refresh started after that line, an init-throw caught by extensionActivationError in activate()'s catch would emit without pythonVersion / dbtCoreVersion, defeating the PR's stated purpose.

Because void-fired async functions execute synchronously up to their first await (the probeDbtCoreVersion call), pythonVersion is guaranteed in customAttributes BEFORE initializeDBTProjects() synchronously starts. So a throw caught by the activate handler is dimensioned with pythonVersion deterministically; dbtCoreVersion arrives best-effort once the spawn resolves.

detectDBT() is the earliest point where PythonEnvironment.executionDetails is set and _pythonVersion is populated — running the refresh before it would no-op.

Refresh contract

  • Sequence-guarded: rapid interpreter switches can fire two refreshes concurrently. A monotonic versionRefreshSeq is captured at entry; both the synchronous pythonVersion write and the post-await dbtCoreVersion write are gated by seq !== this.versionRefreshSeq. A slower earlier probe finishing last cannot overwrite a faster later probe's results.
  • Clear-on-empty: when a probe yields no value (interpreter without dbt-core, fresh venv, etc.), the corresponding customAttribute is removed via the new TelemetryService.clearTelemetryCustomAttribute(key) rather than left at a stale value from the previous interpreter.

Test plan

  • npx tsc --noEmit — passes (validated against locally-linked @altimateai/dbt-integration from the upstream PR's worktree)
  • npx jest — 33 suites / 486 passed / 1 skipped
  • 6 refresh-contract tests (versionTelemetryRefresh.test.ts) cover:
    • Clear-on-empty (4): success → both set; new interpreter without pythonVersion → cleared; new venv without dbt-core → only dbtCoreVersion cleared; no python interpreter at all → both cleared.
    • Sequence guard (2): slow-old / fast-new race where the slow probe must NOT overwrite the fast one; 3-way race where only the third refresh's writes win.
  • Probe-level tests (success, empty, non-zero, error, sync throw, timeout) live with the probe in altimate-dbt-integration PR 'dbtPowerUser.showCompiledSQL' not found #100.

What this enables

After this ships and rolls out, App Insights queries like:

customEvents
| where name == "innoverio.vscode-dbt-power-user/pythonBridgeInitPythonError"
| summarize machines = dcount(tostring(customDimensions.['common.vscodemachineid']))
  by tostring(customDimensions.pythonVersion), tostring(customDimensions.dbtCoreVersion)
| order by machines desc

— work for every event from the rollout forward, not just the wrapper-failure ones. Lets us split all dashboards by Python or dbt-core version without per-event injection.

Follow-up

Could similarly add dbtAdaptersVersion (relevant for the catalogPythonError cluster from dbt-adapters 1.4+ signature change). Left out of this PR to keep scope tight; same pattern would apply.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores

    • Telemetry now automatically gathers Python interpreter and dbt-core versions at startup and whenever the Python environment changes; probes are best-effort, time-bounded, and won't interrupt activation.
    • Telemetry attributes are cleared when version information is missing, preventing stale values and ensuring consistent telemetry.
  • Tests

    • Added comprehensive tests covering version probe outcomes, timeouts, and refresh sequencing to ensure telemetry accuracy.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 8, 2026

Review Change Stack

Walkthrough

This PR adds a best-effort Python/dbt-core version probe, exposes an injectable SpawnFn, wires a sequence-guarded refresh into extension activation and Python environment-change subscriptions, and adds tests plus a TelemetryService method to clear custom attributes.

Changes

Telemetry Version Enrichment

Layer / File(s) Summary
Type Contracts and Constants
src/telemetry/versionProbes.ts
Defines module docs, PROBE_TIMEOUT_MS, and exported SpawnFn type describing injectable spawn behavior and best-effort semantics.
Version Probe Implementation
src/telemetry/versionProbes.ts
Implements probeDbtCoreVersion, spawning Python with an importlib.metadata script, collecting stdout, enforcing a 5s timeout, and returning trimmed version or undefined on any failure.
Probe Tests
src/test/suite/versionProbes.test.ts
Tests probe outcomes using an injected SpawnFn and fake ChildProcess: success, empty stdout, non-zero exit, async spawn error, sync spawn throw, and timeout (ensures kill called).
Extension Activation and Subscriptions
src/dbtPowerUserExtension.ts
Activation now calls refreshVersionTelemetryAttributes() and subscribes to pythonEnv.onPythonEnvironmentChanged to re-run telemetry enrichment when the interpreter/environment changes.
Telemetry Attribute Refresh
src/dbtPowerUserExtension.ts, src/telemetry/index.ts
Adds a sequence-guarded refreshVersionTelemetryAttributes() that reads pythonVersion, best-effort probes dbt-core when pythonPath exists, sets or clears pythonVersion/dbtCoreVersion telemetry attributes, swallows probe errors; TelemetryService gains clearTelemetryCustomAttribute.
Refresh Tests
src/test/suite/versionTelemetryRefresh.test.ts
Tests clear-on-empty behavior and the monotonic sequence guard with overlapping refreshes so only the latest result is applied.

Sequence Diagram(s)

sequenceDiagram
  participant VSCode
  participant Extension
  participant PythonExt
  participant ProbeModule
  participant Telemetry
  VSCode->>Extension: activate()
  Extension->>Extension: refreshVersionTelemetryAttributes()
  Extension->>PythonExt: read active pythonVersion / pythonPath
  alt pythonPath available
    Extension->>ProbeModule: probeDbtCoreVersion(pythonPath)
    ProbeModule-->>Extension: dbtCoreVersion or undefined
  end
  Extension->>Telemetry: setCustomAttribute / clearTelemetryCustomAttribute
  PythonExt->>Extension: onPythonEnvironmentChanged
  Extension->>Extension: refreshVersionTelemetryAttributes()
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • AltimateAI/vscode-dbt-power-user#1928: Modifies TelemetryService and related telemetry attribute handling; closely related to the added clearTelemetryCustomAttribute and telemetry enrichment.

Suggested reviewers

  • mdesmet
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: enrich telemetry with pythonVersion and dbtCoreVersion customAttributes' directly and precisely describes the main changes—adding two telemetry custom attributes to track Python and dbt-core versions.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The pull request description is comprehensive and follows the required template structure with all key sections completed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/telemetry-python-and-dbt-core-versions

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ralphstodomingo ralphstodomingo self-assigned this May 8, 2026
@ralphstodomingo ralphstodomingo requested a review from mdesmet May 8, 2026 08:29
@ralphstodomingo ralphstodomingo marked this pull request as ready for review May 8, 2026 08:29
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 8, 2026

Bundle Size Report

darwin-arm64: 74.2 MB
Category Size Compressed Files
Webview JS bundles 36.3 MB 12.3 MB 346
Native: altimate-core 35.1 MB 14.0 MB 1
Media assets 29.6 MB 25.8 MB 91
Native: zeromq 20.5 MB 8.2 MB 15
Webview images 15.3 MB 12.2 MB 18
Extension backend (JS) 2.7 MB 0.6 MB 1
Python packages 2.0 MB 0.5 MB 95
Native: other node_modules 1.0 MB 0.2 MB 139
Webview CSS 0.8 MB 0.1 MB 2
Webview other 0.5 MB 0.1 MB 5
Other 0.1 MB 26 KB 15
Total 143.9 MB 74.0 MB 728
linux-x64: 75.9 MB
Category Size Compressed Files
Native: altimate-core 41.8 MB 15.1 MB 1
Webview JS bundles 36.3 MB 12.3 MB 346
Media assets 29.6 MB 25.8 MB 91
Native: zeromq 21.9 MB 8.7 MB 16
Webview images 15.3 MB 12.2 MB 18
Extension backend (JS) 2.7 MB 0.6 MB 1
Python packages 2.0 MB 0.5 MB 95
Native: other node_modules 1.0 MB 0.2 MB 139
Webview CSS 0.8 MB 0.1 MB 2
Webview other 0.5 MB 0.1 MB 5
Other 0.1 MB 26 KB 15
Total 152.0 MB 75.7 MB 729
win32-x64: 76.8 MB
Category Size Compressed Files
Native: altimate-core 50.3 MB 16.2 MB 1
Webview JS bundles 36.3 MB 12.3 MB 346
Media assets 29.6 MB 25.8 MB 91
Native: zeromq 20.0 MB 8.1 MB 15
Webview images 15.3 MB 12.2 MB 18
Extension backend (JS) 2.7 MB 0.6 MB 1
Native: other node_modules 2.3 MB 0.7 MB 147
Python packages 2.0 MB 0.5 MB 95
Webview CSS 0.8 MB 0.1 MB 2
Webview other 0.5 MB 0.1 MB 5
Other 0.1 MB 26 KB 15
Total 159.8 MB 76.7 MB 736

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/dbtPowerUserExtension.ts`:
- Around line 154-169: The telemetry attributes for interpreter info are only
set when truthy, leaving stale values when an interpreter lacks data; update the
logic in dbtPowerUserExtension.ts around pythonVersion, pythonPath and
probeDbtCoreVersion so that you always call
this.telemetry.setTelemetryCustomAttribute("pythonVersion", <value>) and
this.telemetry.setTelemetryCustomAttribute("dbtCoreVersion", <value>) — passing
the actual value when present and a clear/reset placeholder (e.g., empty string
or null) when not — and ensure probeDbtCoreVersion(pythonPath) errors are
handled so dbtCoreVersion is explicitly cleared when no value is found.
- Around line 112-117: refreshVersionTelemetryAttributes can run concurrently
and an earlier slower invocation may overwrite a later one; add a sequence-guard
(e.g., an incrementing requestId or a per-instance CancellationToken) in the
class so each invocation captures the current token/id at start and only applies
its results if the token/id still matches before writing telemetry attributes.
Update the call sites around pythonEnv.onPythonEnvironmentChanged and the
similar block at the other occurrence (the code between the earlier and later
refresh locations) to use the same guard (increment the id or generate a new
token before calling refreshVersionTelemetryAttributes and have
refreshVersionTelemetryAttributes check/compare the token/id before mutating
state).

In `@src/telemetry/versionProbes.ts`:
- Around line 72-93: The timeout scheduled with setTimeout(PROBE_TIMEOUT_MS) can
remain pending after early settlement; modify the probe logic around
settled/settle/proc to capture the timer id (e.g. const timer = setTimeout(...))
and call clearTimeout(timer) inside the settle function before killing the
process and resolve to prevent dangling timers; ensure the timeout handler still
calls settle(undefined) so behavior is unchanged and use a compatible timer type
(ReturnType<typeof setTimeout> / NodeJS.Timeout) if needed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5b70ac0b-2b2f-492b-a646-c131ea782f28

📥 Commits

Reviewing files that changed from the base of the PR and between 68ef665 and 2a05a6f.

📒 Files selected for processing (3)
  • src/dbtPowerUserExtension.ts
  • src/telemetry/versionProbes.ts
  • src/test/suite/versionProbes.test.ts

Comment thread src/dbtPowerUserExtension.ts
Comment thread src/dbtPowerUserExtension.ts
Comment thread src/telemetry/versionProbes.ts Outdated
@ralphstodomingo ralphstodomingo force-pushed the feat/telemetry-python-and-dbt-core-versions branch from 2a05a6f to a47a2bd Compare May 8, 2026 08:45
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/test/suite/versionTelemetryRefresh.test.ts (1)

16-20: 🏗️ Heavy lift

These tests can drift away from the real implementation.

RefreshHarness and FakeTelemetry re-implement the production logic, so this suite can stay green even if DBTPowerUserExtension.refreshVersionTelemetryAttributes() or TelemetryService.clearTelemetryCustomAttribute() regresses. I’d strongly prefer extracting the refresh helper and testing that directly, or adding at least one case that drives the real extension + telemetry classes through the refresh path.

Also applies to: 55-105

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/test/suite/versionTelemetryRefresh.test.ts` around lines 16 - 20, The
test currently re-implements production behavior via RefreshHarness and
FakeTelemetry which allows regressions in
DBTPowerUserExtension.refreshVersionTelemetryAttributes and
TelemetryService.clearTelemetryCustomAttribute to go unnoticed; fix by
extracting the refresh helper logic into a single exported function (e.g.,
refreshVersionTelemetryAttributesHelper) and update tests to import and exercise
that helper directly, or alternately add an integration-style test that
instantiates the real DBTPowerUserExtension and the real TelemetryService (or a
thin spy) and asserts the same outcomes so the actual
DBTPowerUserExtension.refreshVersionTelemetryAttributes and
TelemetryService.clearTelemetryCustomAttribute code paths are executed; update
or remove RefreshHarness/FakeTelemetry usages to ensure at least one test drives
the real implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/dbtPowerUserExtension.ts`:
- Around line 107-125: Move the initial telemetry priming so version attributes
are set before any init that can emit telemetry: call
this.refreshVersionTelemetryAttributes() immediately after await
this.dbtProjectContainer.detectDBT() (i.e., before await
this.dbtProjectContainer.initializeDBTProjects() and before await
this.statusBars.initialize()), and keep the existing
pythonEnv.onPythonEnvironmentChanged subscription so subsequent changes still
re-run refreshVersionTelemetryAttributes(); alternatively, if a synchronous
"prime only python version" helper is preferred, add and call a small
synchronous priming method (e.g., primePythonVersionTelemetry()) before
initializeDBTProjects(), then leave refreshVersionTelemetryAttributes() where it
is for the full async refresh.

---

Nitpick comments:
In `@src/test/suite/versionTelemetryRefresh.test.ts`:
- Around line 16-20: The test currently re-implements production behavior via
RefreshHarness and FakeTelemetry which allows regressions in
DBTPowerUserExtension.refreshVersionTelemetryAttributes and
TelemetryService.clearTelemetryCustomAttribute to go unnoticed; fix by
extracting the refresh helper logic into a single exported function (e.g.,
refreshVersionTelemetryAttributesHelper) and update tests to import and exercise
that helper directly, or alternately add an integration-style test that
instantiates the real DBTPowerUserExtension and the real TelemetryService (or a
thin spy) and asserts the same outcomes so the actual
DBTPowerUserExtension.refreshVersionTelemetryAttributes and
TelemetryService.clearTelemetryCustomAttribute code paths are executed; update
or remove RefreshHarness/FakeTelemetry usages to ensure at least one test drives
the real implementation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 122543c8-027a-4375-845b-d613d82b5098

📥 Commits

Reviewing files that changed from the base of the PR and between 2a05a6f and a47a2bd.

📒 Files selected for processing (5)
  • src/dbtPowerUserExtension.ts
  • src/telemetry/index.ts
  • src/telemetry/versionProbes.ts
  • src/test/suite/versionProbes.test.ts
  • src/test/suite/versionTelemetryRefresh.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/test/suite/versionProbes.test.ts
  • src/telemetry/versionProbes.ts

Comment thread src/dbtPowerUserExtension.ts
@ralphstodomingo ralphstodomingo force-pushed the feat/telemetry-python-and-dbt-core-versions branch from a47a2bd to 66711da Compare May 8, 2026 08:55
@dev-punia-altimate
Copy link
Copy Markdown
Contributor

✅ Tests — All Passed

cc @ralphstodomingo

Copy link
Copy Markdown
Contributor

@mdesmet mdesmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be implemented here. this should be in dbt integration.

@ralphstodomingo ralphstodomingo marked this pull request as draft May 8, 2026 19:29
…tributes

Production telemetry currently carries `common.os` / `common.nodeArch` /
`common.extversion` etc. on every event but **not** the user's Python
interpreter version or installed dbt-core distribution version. That's a
real blindspot — App Insights can't split error clusters by Python
version (e.g. is `pythonBridgeInitPythonError "mashumaro
UnserializableField"` a Python 3.13 + mashumaro 3.14 incompat or
something else?) and we can't tell whether `catalogPythonError "missing
used_schemas"` correlates with a specific dbt-core minor.

Adds two customAttributes that get merged into every event going
forward via `TelemetryService.setTelemetryCustomAttribute`:

- `pythonVersion` — sourced from `PythonEnvironment.pythonVersion`
  (already populated via the VS Code Python extension's API on
  interpreter activation, no additional probe needed).
- `dbtCoreVersion` — sourced from a small `child_process.spawn` probe
  in `src/telemetry/versionProbes.ts` that runs
  `python -c "from importlib.metadata import version; print(version('dbt-core'))"`.
  Reads dist-info directly so it survives even when dbt's own import
  chain is broken (the failure mode PR #96 targeted) — the bridge can
  be dead and we still get the version.

Both are best-effort: probe failure → no customAttribute set rather
than blocking activation. Wired into `DBTPowerUserExtension.activate`
after `initializeDBTProjects` (when the Python interpreter is known)
plus a refresh on `pythonEnvironment.onPythonEnvironmentChanged` so the
dimensions track interpreter changes.

`probeDbtCoreVersion` accepts an injectable `SpawnFn` parameter
defaulted to `child_process.spawn`. `import * as childProcess` produces
ESM-style bindings that resist `jest.spyOn` / `jest.mock` redefinition
under ts-jest, so the tests pass their own spawner directly. Adds 6
unit tests covering: success, empty stdout (PackageNotFoundError),
non-zero exit, error event, sync throw, and timeout.

After this ships and rolls out, App Insights queries can split by
`customDimensions.pythonVersion == "3.13.0"` or
`customDimensions.dbtCoreVersion startswith "1.10"` on every event,
not just the wrapper-failure events. Dashboards become diagnostic by
construction without per-event injection.

Tests: 33 suites / 486 passed (32 prior + 1 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ralphstodomingo ralphstodomingo force-pushed the feat/telemetry-python-and-dbt-core-versions branch from 66711da to f735717 Compare May 8, 2026 20:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants