Skip to content

refactor(instrumentation): unify enable/disable lifecycle and default to disabled#278

Draft
overbalance wants to merge 10 commits into
open-telemetry:mainfrom
embrace-io:overbalance/unify-instrumentation-enable-disable
Draft

refactor(instrumentation): unify enable/disable lifecycle and default to disabled#278
overbalance wants to merge 10 commits into
open-telemetry:mainfrom
embrace-io:overbalance/unify-instrumentation-enable-disable

Conversation

@overbalance
Copy link
Copy Markdown
Contributor

@overbalance overbalance commented May 14, 2026

Which problem is this PR solving?

Each browser instrumentation in this package implemented its own enable/disable bookkeeping. Several used a `declare` field to dodge a class-field-initializer race against the upstream `InstrumentationBase` constructor (which calls `enable()` before subclass fields are populated). State lived in three places at once (`_isEnabled`, `_active`, `_config.enabled`), defaults differed across instrumentations, there was no public `isEnabled()`, and `setConfig` could surprise-enable a disabled instrumentation by omitting the `enabled` key. There was also no shared error-handling story: a throw from a subclass `enable()` or `disable()` could leave an instance silently half-applied with no diagnostic.

Short description of the changes

  • Add a browser-specific `InstrumentationBase` wrapper at `packages/instrumentation/src/utils/InstrumentationBase.ts` that:
    • suppresses upstream's constructor-time `enable()` by forcing `enabled: false` into the `super()` call, so subclass class-field initializers no longer fight a parallel `enable()`,
    • defaults to disabled (callers opt in via `{ enabled: true }` or by calling `.enable()`),
    • exposes `isEnabled()` publicly and a protected `_enabled` getter that reads `this._config.enabled` as the single source of truth,
    • overrides `getConfig()` to return a frozen snapshot so external callers cannot mutate `_config.enabled` and silently invalidate the lifecycle invariant,
    • routes `enable()` / `disable()` through a private `_runTransition` that wraps the subclass hook in a shared try/catch: on throw, the base rolls `_config.enabled` back to its pre-call value (so `isEnabled()` remains accurate), logs a uniform `lifecycle transition (enabled: X -> Y) threw` diagnostic via `_diag.error`, and rethrows. The same diagnostic fires whether the entry point is `enable()`, `disable()`, or `setConfig()`,
    • overrides `setConfig` so omitting `enabled` preserves the current state, and emits `_diag.warn` when `setConfig({ enabled: true })` is silently vetoed by a subclass `_canEnable()` check (so capability-detection failures are no longer invisible to callers).
  • Refactor all seven instrumentations (console, errors, navigation, navigation-timing, resource-timing, user-action, web-vitals) to extend the new base via the template-method pattern. Each subclass implements only `_onEnable` / `_onDisable` (and optionally `_canEnable` for capability detection). Constructor bodies end with `if (config.enabled === true) this.enable();`. `_isEnabled` / `_active` fields are removed, as are the `declare` workarounds, and `init()` no longer needs to be overridden per subclass.
  • Align `resource-timing`'s `PerformanceObserver` feature check with `web-vitals` (`typeof PerformanceObserver === 'undefined'`).
  • Isolate `web-vitals` registration: each of `onCLS` / `onINP` / `onLCP` / `onFCP` / `onTTFB` is wrapped in its own try/catch so a sync throw from one (e.g. an underlying observer type the browser doesn't support — INP needs `EventTiming`, CLS needs `LayoutShift`) does not quarantine the rest, and `_listenersRegistered` flips to `true` only after all five attempts so a throw cannot lock a future `enable()` retry into the 'already registered' branch.
  • Make `NavigationTimingInstrumentation._didEmit` sticky across disable/enable. A page only has one `PerformanceNavigationTiming` entry, so re-emitting after a disable/enable cycle would duplicate the same event. This matches the once-per-page-lifecycle contract used for hard navigation. Only `_retryCount` and `_lastEntry` reset on disable.
  • Fix a retry-safety hole in `NavigationInstrumentation._onEnable`: set `_isHistoryPatched = true` before wrapping `replaceState` / `pushState` so a partial throw cannot cause double-wrapping on a subsequent `enable()` retry.
  • Restore runtime null guards in `NavigationTimingInstrumentation._unsubscribeAll` per the repo convention of keeping defensive guards even when TS types make them unreachable.
  • Document the lifecycle change in the package README so users who construct instrumentations standalone (without `registerInstrumentations`) know to opt in, and that `_canEnable()` vetoes can leave `isEnabled()` at `false` even after passing `enabled: true`.
  • Fix a latent silent-failure: when an `ErrorEvent` arrives with `event.error` null/undefined (cross-origin scripts, some older browsers) but a populated `event.message`, the instrumentation now emits using the message instead of dropping the event. (Pre-existing behavior was a silent no-op.)

Test coverage added

  • `InstrumentationBase.test.ts`: full coverage of the wrapper, including all four `setConfig` transitions, the do-not-auto-enable-during-super() invariant, idempotent enable/disable, config preservation, the throw-during-transition diag log, the rollback-on-throw contract for both `_onEnable` and `_onDisable`, retry-after-throw, the `_canEnable` veto + `setConfig` warn, the no-veto-on-disable contract, the `setConfig` omitting-enabled-on-disabled-instance no-op, and the frozen `getConfig()` snapshot.
  • Negative-construction tests for `errors`, `navigation`, `resource-timing`, and `user-action` confirming that constructing without an explicit `{ enabled: true }` installs no listeners, patches no globals, and creates no observers.
  • `navigation-timing`: explicit assertion that construction without `{ enabled: true }` does not emit; pinning that a successful emit is once-per-page-lifecycle (no re-emit on disable/enable); pinning that `_retryCount` resets on disable so a second lifecycle can retry from zero.
  • `web-vitals`: assertion that INP emits exactly once per interaction across multiple enable/disable cycles, plus a structural assertion (via the `_diag` registration debug log) that all five listeners (CLS, INP, LCP, FCP, TTFB) register exactly once across multiple enable cycles. Pins the `_listenersRegistered` invariant for every metric, not just INP.
  • `navigation`: assertion that the hard-navigation event emits at most once per page lifecycle across disable/enable cycles, and that this still holds when soft navigations interleave the cycles (pins the `_hasProcessedInitialLoad` sticky-state isolation).
  • `errors`: assertion that an `ErrorEvent` with no `error` but a populated `message` still emits an `exception` log.
  • Removed a brittle private-`_diag` spy in `navigation/instrumentation.test.ts` in favor of a behavioral assertion that the hook throw is caught and the log still emits.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Note on behavior: instrumentations constructed standalone (without `registerInstrumentations`) used to default to enabled and now default to disabled. `registerInstrumentations` still auto-enables (via its own `!getConfig().enabled` check), so the supported public usage in the README is unchanged. The Lifecycle section in the package README documents the standalone-construction path and the `_canEnable()` veto behavior.

How Has This Been Tested?

  • `npm test` for `@opentelemetry/browser-instrumentation` (202 tests pass across the unit and browser projects, including the new lifecycle, rollback, veto-warn, retry-safety, once-per-lifecycle, frozen-snapshot, soft-nav interleave, and five-metric registration coverage)
  • `npm run check` (turbo `check-types` + eslint + biome) clean
  • Manually traced all four `setConfig` transitions (off-off, off-on, on-off, on-on), the rollback path for a throwing `_onEnable` and a throwing `_onDisable`, and the `_canEnable` veto path through `setConfig`

Checklist:

  • Followed the style guidelines of this project
  • Unit tests have been added
  • Documentation has been updated

// via a `browser` condition, and the public `@opentelemetry/instrumentation`
// types point at the Node variant (with its own `_enabled` field, `_modules`,
// etc.) which doesn't match what the bundler actually loads at runtime.
import { InstrumentationBase as CoreInstrumentationBase } from '@opentelemetry/instrumentation/build/esm/platform/browser/instrumentation.js';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it means the actual exports key. TS doesn't follow the browser key to dig out types

*/
export abstract class InstrumentationBase<
ConfigType extends InstrumentationConfig = InstrumentationConfig,
> extends CoreInstrumentationBase<ConfigType> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could get rid of this core dependency altogether. (Maybe a separate PR)

) {
super(instrumentationName, instrumentationVersion, {
...config,
enabled: false,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why disabled by default?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants