Skip to content

feat(observability): add @sourceloop/observability with OTEL-first bootstrap, profiles, and pluggable instrumentation#2502

Open
samarpan-b wants to merge 3 commits intomasterfrom
arc-obf
Open

feat(observability): add @sourceloop/observability with OTEL-first bootstrap, profiles, and pluggable instrumentation#2502
samarpan-b wants to merge 3 commits intomasterfrom
arc-obf

Conversation

@samarpan-b
Copy link
Copy Markdown
Contributor

@samarpan-b samarpan-b commented Apr 13, 2026

Description

This PR introduces a new reusable package, @sourceloop/observability, to provide a generic OpenTelemetry-first observability layer for Sourceloop services.

The package is designed around:

  • early bootstrap for tracing setup before app startup
  • optional LoopBack integration through ObservabilityComponent
  • profile-based configuration for OTLP, New Relic, SigNoz, and Datadog
  • optional instrumentation packages that consumers can install and enable selectively
  • generic tracing helpers for spans, errors, and propagation

Configuration model

Startup-critical observability config is resolved during bootstrap from:

  1. explicit bootstrap overrides
  2. environment variables
  3. defaults

LoopBack DI config is still supported, but only for component-level enrichment rather than early startup control.

Instrumentation behavior

By default, only:

  • http
  • express

are enabled.

All other built-in instrumentations are disabled by default and must be explicitly enabled through bootstrap config or env vars.

Instrumentation packages are optional peer dependencies. If an instrumentation is enabled but its package is not installed, bootstrap fails fast with a clear validation error.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Intermediate change (work in progress)

How Has This Been Tested?

Verified locally with:

  • npm run build
  • npm test
  • npm run lint

Checklist:

  • Performed a self-review of my own code
  • npm test passes on your machine
  • New tests added or existing tests modified to cover all changes
  • Code conforms with the style guide
  • API Documentation in code was updated
  • Any dependent changes have been merged and published in downstream modules

Copilot AI review requested due to automatic review settings April 13, 2026 21:46
@samarpan-b samarpan-b requested a review from a team as a code owner April 13, 2026 21:46
@sonarqube-agent
Copy link
Copy Markdown

sonarqube-agent bot commented Apr 13, 2026

Remediation Agent Summary 📊

🤖 To review: Fixes are ready for 4 of 6 issues found.
💪 Save time: Applying these fixes could save you an estimated 58 minutes.
Suggested fixes (4)

QualityIssue
Maintainability
🔴 High
Function has a complexity of 13 which is greater than 10 authorized.

Why is this an issue?

Why is this an issue?

The Cyclomatic Complexity of functions should not exceed a defined threshold. Complex code may perform poorly and can be difficult to test thoroughly.


Maintainability
🔴 High
Function has a complexity of 21 which is greater than 10 authorized.

Why is this an issue?

Why is this an issue?

The Cyclomatic Complexity of functions should not exceed a defined threshold. Complex code may perform poorly and can be difficult to test thoroughly.


Maintainability
🔴 High
Function has a complexity of 14 which is greater than 10 authorized.

Why is this an issue?

Why is this an issue?

The Cyclomatic Complexity of functions should not exceed a defined threshold. Complex code may perform poorly and can be difficult to test thoroughly.


Maintainability
🔴 High
Refactor this code to not nest more than 3 if/for/while/switch/try statements.

Why is this an issue?

Why is this an issue?

Nested control flow statements such as if, for, while, switch, and try are often key ingredients in creating what’s known as "Spaghetti code". This code smell can make your program difficult to understand and maintain.

When numerous control structures are placed inside one another, the code becomes a tangled, complex web. This significantly reduces the code’s readability and maintainability, and it also complicates the testing process.

Resources


Issues requiring manual fix (2)

QualityIssue
Maintainability
🟡 Low
This assertion is unnecessary since it does not change the type of the expression.

Why is this an issue?

Why is this an issue?

In TypeScript, casts and non-null assertions are two mechanisms used to inform the TypeScript compiler about the intended types of variables, expressions, or values in the code. They are used to help the compiler understand the types more accurately and to handle certain situations where type inference might not be sufficient:

  • A type assertion tells the compiler to treat the value as the specified type, even if the compiler’s type inference suggests a different type.
  • A non-null assertion is a way to tell the TypeScript compiler explicitly that you are certain that a variable will not be null or undefined, and you want to access its properties or methods without any null checks.

However, a redundant cast occurs when you perform a type assertion that is not needed because the compiler already knows the type based on the context or explicit type declarations. Similarly, a redundant non-null assertion occurs when you use the non-null assertion operator ! to assert that a variable is not null or undefined, but the compiler already knows that based on the context.

Both redundant casts and redundant non-null assertions should be avoided in TypeScript code, as they add unnecessary noise, clutter the code, and lead to confusion.

View this code on SonarQube Cloud

Remove all the redundant casts and non-null assertions based on the contextual typing information, as inferred by the TypeScript compiler.

View this code on SonarQube Cloud

Resources

Documentation


Maintainability
🟡 Low
This assertion is unnecessary since it does not change the type of the expression.

Why is this an issue?

Why is this an issue?

In TypeScript, casts and non-null assertions are two mechanisms used to inform the TypeScript compiler about the intended types of variables, expressions, or values in the code. They are used to help the compiler understand the types more accurately and to handle certain situations where type inference might not be sufficient:

  • A type assertion tells the compiler to treat the value as the specified type, even if the compiler’s type inference suggests a different type.
  • A non-null assertion is a way to tell the TypeScript compiler explicitly that you are certain that a variable will not be null or undefined, and you want to access its properties or methods without any null checks.

However, a redundant cast occurs when you perform a type assertion that is not needed because the compiler already knows the type based on the context or explicit type declarations. Similarly, a redundant non-null assertion occurs when you use the non-null assertion operator ! to assert that a variable is not null or undefined, but the compiler already knows that based on the context.

Both redundant casts and redundant non-null assertions should be avoided in TypeScript code, as they add unnecessary noise, clutter the code, and lead to confusion.

View this code on SonarQube Cloud

Remove all the redundant casts and non-null assertions based on the contextual typing information, as inferred by the TypeScript compiler.

View this code on SonarQube Cloud

Resources

Documentation


🤖 Agent created PR #2503

💡 Got issues in your backlog? Let the agent fix them for you.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new @sourceloop/observability package that bootstraps OpenTelemetry tracing early in service startup, with optional LoopBack integration (bindings + extension point) and a small helper API for spans/correlation/errors.

Changes:

  • Introduces a new packages/observability workspace package (bootstrap/runtime, LoopBack component, profiles, config resolution/validation, tracing helpers).
  • Adds built-in OTLP-based profiles (default, newrelic, signoz, datadog) and optional auto-instrumentation loading via optional peer deps.
  • Adds unit/integration tests and wires the new package into the monorepo workspace + lockfile.

Reviewed changes

Copilot reviewed 37 out of 38 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
packages/observability/tsconfig.json TypeScript build configuration for the new package
packages/observability/src/types.ts Public types for config, profiles, runtime, and instrumentations
packages/observability/src/tracing.ts withSpan / addSpanAttributes helpers
packages/observability/src/runtime.ts Singleton runtime state management helpers
packages/observability/src/profiles/signoz.profile.ts SigNoz OTLP defaults profile
packages/observability/src/profiles/registry.service.ts LoopBack extension-point registry for profiles
packages/observability/src/profiles/otlp.profile.ts Base OTLP profile: resource/sampler/exporter + auto-instrumentations
packages/observability/src/profiles/newrelic.profile.ts New Relic OTLP defaults profile
packages/observability/src/profiles/keys.ts LoopBack extension-point keys + binding template helper
packages/observability/src/profiles/instrumentations.ts Optional peer-dep instrumentation loading + toggles
packages/observability/src/profiles/index.ts Public exports for profiles
packages/observability/src/profiles/datadog.profile.ts Datadog OTLP defaults profile
packages/observability/src/keys.ts LoopBack binding keys for config/runtime/resolved config
packages/observability/src/index.ts Package public API surface
packages/observability/src/errors.ts recordException span helper
packages/observability/src/correlation.ts Trace context propagation/correlation helpers
packages/observability/src/config/validate-config.ts Startup validation (dependencies + coherent toggles)
packages/observability/src/config/resolve-config.ts Env/bootstrap/DI config resolution logic
packages/observability/src/config/defaults.ts Default resolved config + default instrumentation toggles
packages/observability/src/component.ts Optional LoopBack component wiring + built-in profile bindings
packages/observability/src/bootstrap.ts Core bootstrap/shutdown/runtime creation
packages/observability/src/tests/unit/validate-config.unit.ts Unit test for config validation
packages/observability/src/tests/unit/resolve-config.unit.ts Unit test for config resolution precedence
packages/observability/src/tests/unit/profile-defaults.unit.ts Unit tests for built-in profile defaults
packages/observability/src/tests/unit/instrumentations.unit.ts Unit tests for instrumentation loading/enabling
packages/observability/src/tests/unit/README.md Unit test folder marker doc
packages/observability/src/tests/integration/runtime.integration.ts Integration test for runtime initialization + helpers
packages/observability/src/tests/integration/README.md Integration test folder marker doc
packages/observability/src/tests/acceptance/README.md Acceptance test folder marker doc
packages/observability/package.json New package manifest (deps/peers/scripts)
packages/observability/README.md Package documentation (usage, resolution model, profiles)
packages/observability/.prettierrc Prettier config for the new package
packages/observability/.prettierignore Prettier ignore rules
packages/observability/.gitignore Package-level ignore rules
packages/observability/.eslintrc.js ESLint config (typed linting)
packages/observability/.eslintignore ESLint ignore rules
package.json Adds packages/observability/ to workspaces
package-lock.json Lockfile updates for the new workspace package and dependencies

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +116 to +119
return {
exporterName: loadOtlpExporter.name,
tracerProvider,
};
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProfileInitResult.exporterName is currently hard-coded to loadOtlpExporter.name, even though the exporter instance is created via context.createExporter(config). This can report a misleading exporter name whenever a caller supplies a custom createExporter (the integration test does this with InMemorySpanExporter). Consider deriving exporterName from context.createExporter or from the created exporter instance (e.g., constructor name) so it reflects what was actually used.

Copilot uses AI. Check for mistakes.
Comment on lines +7 to +22
function assertDependencyInstalled(moduleName: string, message: string): void {
try {
require.resolve(moduleName);
} catch {
throw new Error(message);
}
}

function isDependencyInstalled(moduleName: string): boolean {
try {
require.resolve(moduleName);
return true;
} catch {
return false;
}
}
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateObservabilityConfig re-implements dependency resolution helpers (assertDependencyInstalled / isDependencyInstalled), while profiles/instrumentations.ts already contains very similar isModuleInstalled. Consider consolidating these helpers in one place to avoid divergence (e.g., inconsistent module resolution behavior) as new instrumentations are added.

Copilot uses AI. Check for mistakes.
Comment on lines +17 to +19
export function getRuntimeState(): RuntimeState {
return state;
}
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getRuntimeState() returns the mutable singleton state object by reference. Because getRuntimeState is exported from the package public API, external callers can mutate internal runtime state without going through updateRuntimeState() / clearRuntimeState(), which can lead to hard-to-debug behavior (especially in tests). Consider returning a read-only view (e.g., Readonly<RuntimeState>) and/or a shallow copy to prevent accidental mutation.

Copilot uses AI. Check for mistakes.
Comment on lines +134 to +141
const exporterProtocol = (bootstrapOverrides?.exporterProtocol ??
(env.OTEL_EXPORTER_OTLP_PROTOCOL as ExporterProtocol | undefined) ??
DEFAULT_OBSERVABILITY_CONFIG.exporterProtocol) as ExporterProtocol;

const sampler = (bootstrapOverrides?.sampler ??
(env.OTEL_TRACES_SAMPLER as SamplerName | undefined) ??
DEFAULT_OBSERVABILITY_CONFIG.sampler) as SamplerName;

Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResolvedObservabilityConfig.exporterProtocol is treated as a discriminant in loadOtlpExporter, but the value is pulled from env and then force-cast to ExporterProtocol in resolveBootstrapConfig(). If the env var contains an unexpected value, the code silently falls back to the HTTP exporter branch instead of failing fast. Consider validating exporterProtocol (and sampler) against the allowed set during config resolution or in validateObservabilityConfig, and throw a clear startup error for invalid values.

Copilot uses AI. Check for mistakes.
Comment on lines +59 to +75
function loadOtlpExporter(config: ResolvedObservabilityConfig): SpanExporter {
if (config.exporterProtocol === 'grpc') {
const otlpGrpc = require('@opentelemetry/exporter-trace-otlp-grpc') as {
OTLPTraceExporter: new (options?: object) => SpanExporter;
};

return new otlpGrpc.OTLPTraceExporter({
url: config.otlpEndpoint,
metadata: Object.keys(config.otlpHeaders).length
? config.otlpHeaders
: undefined,
});
}

const otlpHttp = require('@opentelemetry/exporter-trace-otlp-http') as {
OTLPTraceExporter: new (options?: object) => SpanExporter;
};
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loadOtlpExporter() uses require() and types the exporter constructor as new (options?: object) => SpanExporter, which removes compile-time checking of exporter options. This makes it easy to pass an invalid config shape (e.g., gRPC vs HTTP header/metadata options) without TypeScript catching it. Prefer using typed imports (or typeof import(...)) for the exporter modules so the options object is type-checked.

Copilot uses AI. Check for mistakes.
@samarpan-b samarpan-b changed the title feat(observability): add a new package for otel based standard and best practices observability / tracing feat(observability): add @sourceloop/observability with OTEL-first bootstrap, profiles, and pluggable instrumentation Apr 14, 2026
…plexity from #2502 (#2503)

* fix: Commit 1 - Fully fix typescript:S1541

Commit 1 of SonarQube suggestions

Fully fixed issues:
- [typescript:S1541] AZ2I0JCED4ljEi2GS_-X: Function has a complexity of 13 which is greater than 10 authorized.
- [typescript:S1541] AZ2I0JCED4ljEi2GS_-Y: Function has a complexity of 21 which is greater than 10 authorized.

Generated by SonarQube Agent

* fix: Commit 2 - Fully fix typescript:S1541, typescript:S134

Commit 2 of SonarQube suggestions

Fully fixed issues:
- [typescript:S1541] AZ2I0JE5D4ljEi2GS_-c: Function has a complexity of 14 which is greater than 10 authorized.
- [typescript:S134] AZ2I0JE5D4ljEi2GS_-d: Refactor this code to not nest more than 3 if/for/while/switch/try statements.

Generated by SonarQube Agent

---------

Co-authored-by: sonarqube-agent[bot] <210722872+sonarqube-agent[bot]@users.noreply.github.com>
@sonarqubecloud
Copy link
Copy Markdown

SonarQube reviewer guide

Summary: Introduces a new @sourceloop/observability package providing OpenTelemetry-first observability bootstrap and optional LoopBack integration for Sourceloop services.

Review Focus:

  • The hybrid bootstrap/component design: ensure the separation between startup-critical config (bootstrap) and component-level enrichment (DI) is clear and maintainable
  • Instrumentation loading and dependency validation: verify that missing optional peer dependencies fail fast with clear errors during bootstrap
  • Profile system extensibility: check that the LoopBack extension point properly supports custom observability profiles
  • Config resolution precedence: ensure overrides → environment variables → defaults chain is correct across both bootstrap and component stages

Start review at: packages/observability/src/bootstrap.ts. It's the entry point that orchestrates profile resolution, config validation, and runtime initialization—understanding this flow is critical to reviewing the rest of the package's design.

💬 Please send your feedback

Quality Gate Passed Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

}

initialize(
config: ResolvedObservabilityConfig,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that bootstrapObservability() in bootstrap.ts already calls profile.applyDefaults(resolvedConfig) before invoking profile.initialize(). Since this override calls this.applyDefaults(config) again before delegating to super, applyDefaults ends up running twice — headers get merged twice and vendor.apm gets set twice. The same pattern exists in DatadogObservabilityProfile and SignozObservabilityProfile.

Removing the initialize() override here (and in Datadog/Signoz) should be enough — BaseOtlpObservabilityProfile.initialize handles initialization fully, and applyDefaults is already called by the bootstrap flow.

// applyDefaults() is already called by bootstrapObservability() before initialize() is invoked.
// The base class implementation is sufficient — no need to override initialize() here.
export class NewRelicObservabilityProfile extends BaseOtlpObservabilityProfile {
  name: ObservabilityProfileName = 'newrelic';

  applyDefaults(config: ResolvedObservabilityConfig): ResolvedObservabilityConfig {
    const licenseKey = process.env.NEW_RELIC_LICENSE_KEY?.trim();
    return {
      ...config,
      otlpEndpoint:
        config.otlpEndpoint ??
        (config.exporterProtocol === 'grpc'
          ? 'https://otlp.nr-data.net:4317'
          : 'https://otlp.nr-data.net:4318/v1/traces'),
      otlpHeaders: {
        ...(licenseKey ? {'api-key': licenseKey} : {}),
        ...config.otlpHeaders,
      },
      resourceAttributes: {
        'vendor.apm': 'newrelic',
        ...config.resourceAttributes,
      },
    };
  }
}

With the double call, the config.otlpHeaders spread runs twice, which can put headers in a state that's tricky to trace back to the source.

return await fn();
} catch (error) {
span.recordException(error as Error);
throw error;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The catch block records the exception but doesn't follow up with span.setStatus({ code: SpanStatusCode.ERROR }). Per the OpenTelemetry spec, recording an exception and setting the span status are two separate operations — the exception is attached as an event on the span, but the span status stays UNSET unless explicitly updated. Most trace backends (Jaeger, Grafana Tempo, Datadog, New Relic) use the span status to colour and surface error traces, so without this, spans that throw would appear successful in dashboards.

Worth noting that recordException in errors.ts already handles this correctly — withSpan just needs to match it:

import {context, trace, SpanStatusCode} from '@opentelemetry/api';

// inside the catch block:
} catch (error) {
  span.recordException(error as Error);
  span.setStatus({
    code: SpanStatusCode.ERROR,
    message: error instanceof Error ? error.message : String(error),
  });
  throw error;
} finally {
  span.end();
}

This is the one thing I'd consider a hard blocker — without it the package would make errors harder to find, which is the opposite of what observability is for.

validateObservabilityConfig(profileConfig);

const initResult = profile.initialize(profileConfig, {
createExporter: createOtlpExporter,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two related reliability gaps I wanted to flag together here.

First — if profile.initialize() throws (an instrumentation ABI mismatch, a tracerProvider.register() failure, a missing peer dep slipping past validation), updateRuntimeState({profile, config}) has already run but updateRuntimeState({runtime}) hasn't. A subsequent call to bootstrapObservability() then sees runtime === undefined, re-enters against corrupted partial state, and the original error is never logged. This can be surprisingly hard to diagnose in production.

Second — await runtime.shutdown() in shutdownObservability() isn't wrapped in a try-catch. The OTel SDK's tracerProvider.shutdown() can reject on flush timeout, which becomes an unhandled rejection in a SIGTERM handler and crashes the process in Node 15+.

// bootstrapObservability — wrap initialization in try-catch
try {
  const profile = resolveProfile(resolvedConfig.profile);
  const profileConfig = profile.applyDefaults(resolvedConfig);
  validateObservabilityConfig(profileConfig);

  const initResult = profile.initialize(profileConfig, {
    createExporter: createOtlpExporter,
  });

  const runtime: ObservabilityRuntime = {
    enabled: true,
    profile: profile.name,
    config: profileConfig,
    tracerProvider: initResult.tracerProvider,
    async shutdown() {
      await profile.shutdown?.();
      clearRuntimeState();
    },
  };

  updateRuntimeState({profile, config: profileConfig, runtime});
  return runtime;
} catch (err) {
  clearRuntimeState(); // prevent corrupted partial state on retry
  console.error('[observability] Initialization failed. Tracing will be disabled.', err);
  throw err;
}

// shutdownObservability — guard against rejection
export async function shutdownObservability(): Promise<void> {
  const runtime = getRuntimeState().runtime;
  if (!runtime) return;

  try {
    await runtime.shutdown();
  } catch (err) {
    // Log but don't rethrow — a shutdown error must not crash
    // the process during SIGTERM handling.
    console.error(
      '[observability] Error during shutdown. Some spans may not have been flushed.',
      err,
    );
  }
}

The clearRuntimeState() on failure is the key part — it ensures a retry starts clean rather than against partial state.

Copy link
Copy Markdown
Contributor

@rohit-sourcefuse rohit-sourcefuse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really impressive work here — the 3-tier config pipeline, the BaseOtlpObservabilityProfile abstraction shared cleanly across all four vendor profiles, and the lazy require() approach for optional instrumentation are all the right design choices for a package like this. The ProfileBootstrapContext injection seam is a particularly nice touch that makes testing straightforward without pulling in real OTel packages.

A few things I've flagged inline that I'd like to see addressed before this merges:

Worth resolving before merge:

  • Double applyDefaults in NewRelicObservabilityProfile, DatadogObservabilityProfile, and SignozObservabilityProfilebootstrap.ts already calls applyDefaults before initialize, so the overrides run it twice
  • SpanStatusCode.ERROR missing in withSpan catch block — without this, error spans appear successful in every trace backend (Jaeger, Datadog, Grafana Tempo, etc.)
  • No error handling around profile.initialize() in bootstrapObservability() — a failure there leaves the module singleton in a corrupted partial state with no logging; and shutdownObservability() needs a try-catch guard since tracerProvider.shutdown() can reject in a SIGTERM handler

Happy to defer to a follow-up:

  • ObservabilityProfileRegistry is wired into the component but bootstrapObservability() never consults it — a custom profile registered via LoopBack DI is silently ignored today. Worth either connecting it or leaving a comment noting it's planned for a later iteration
  • Exporter peer deps (@opentelemetry/exporter-trace-otlp-http, -grpc) don't have the same pre-flight check that instrumentation packages get in validateObservabilityConfig — a missing exporter gives a raw MODULE_NOT_FOUND instead of a helpful install prompt
  • Test coverage for the idempotency guard, shutdownObservability(), and the withSpan error rethrow path

Also worth confirming — are the prettier failures and Node 24 EBADENGINE warnings in CI pre-existing on the base branch, or introduced here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants