Skip to content

fix(ruby): keep OTel auto-instrumentation working on Rails 7.0#643

Draft
ccschmitz-launchdarkly wants to merge 12 commits into
mainfrom
feat/ruby-old-rails-compat
Draft

fix(ruby): keep OTel auto-instrumentation working on Rails 7.0#643
ccschmitz-launchdarkly wants to merge 12 commits into
mainfrom
feat/ruby-old-rails-compat

Conversation

@ccschmitz-launchdarkly

@ccschmitz-launchdarkly ccschmitz-launchdarkly commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Summary

The observability plugin depended on opentelemetry-instrumentation-all, whose Rails-family members raised their minimum to Rails 7.1. On Rails 7.0 every Rails-family instrumentation fails its runtime compatible? check, logs a flurry of "... failed to install" warnings, and never attaches — so the app loses all Rails auto-instrumentation (no HTTP server spans, DB spans, etc.) and gets only manual instrumentation. The meta-gem couples every instrumentation to a single version, so the fix is to stop using it.

Self-contained against main: this PR includes the boot-instrumentation foundation (previously split out as #582) plus the old-Rails compatibility fix. Supersedes #582.

Changes

Boot auto-instrumentation (formerly #582)

  • Install the OTel Rails-family instrumentations during boot via the Railtie so they attach even when the LaunchDarkly client is created lazily after Rails has booted; emit one clear warning when boot-time install was missed.

Old-Rails compatibility

  • Drop opentelemetry-instrumentation-all for individual gems. The Rails family is capped below each member's Rails-7.1-enforcing release (rails <0.42, action_pack <0.18, action_view/active_record <0.13, active_support <0.12, active_job <0.12, action_mailer <0.8, active_storage <0.5); everything else tracks the latest. Capped releases still work on Rails 7.1+, so modern apps keep the latest non-Rails instrumentations. use_all still activates any extra instrumentation gem a consumer adds.
  • InstrumentationLogFilter replaces the per-instrumentation "failed to install" flurry with one actionable summary (wraps the whole SDK.configure call, since the SDK installs instrumentations after the configure block returns).
  • Repro app e2e/ruby/rails/demo-rails70 (Rails 7.0) with a pure-Ruby in-process OTLP protobuf sink asserting traces (a server span), a log, and a captured exception are exported over the wire — runs under bundle exec rake, no Docker/Node.
  • CI: e2e-rails-legacy job in ruby-plugin.yml as a regression guard (the other e2e apps run Rails 7.2, above the floor).
  • README compatibility section + CHANGELOG.

Context

CardFlight (the original report) has since upgraded Rails, so this no longer blocks them — but it remains a valuable general regression guard so a future instrumentation-gem bump can't silently re-break Rails 7.0.

Testing

Red→green with the CI job's command (bundle exec rake in demo-rails70):

  • pre-fix (instrumentation-all → instrumentation-rails 0.42): 5 runs, 9 assertions, 4 failures
  • fixed: 5 runs, 16 assertions, 0 failures, 0 "failed to install"

demo (7.2) and api-only (7.2) pass and still resolve the latest non-Rails instrumentations (rack 0.31, redis 0.29, pg 0.36, …). Gem unit tests + rubocop green. Rebased onto main (0.2.2).

Notes

🤖 Generated with Claude Code

@abelonogov-ld abelonogov-ld left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Test app features

ccschmitz-launchdarkly and others added 12 commits June 23, 2026 14:29
The Railtie's `attach_otel_log_bridge` delegated to a module method
(`LaunchDarklyObservability.otel_logger_provider_available?`) defined in the
`class << self` block of launchdarkly_observability.rb. That block runs *after*
the `require_relative '.../rails'` near the top of the file. When the gem is
required lazily after Rails has booted, the Railtie's `config.after_initialize`
hook fires synchronously during the require — before the module method exists —
so the bridge attach failed with:

  Could not attach log bridge to Rails.logger: undefined method
  `otel_logger_provider_available?' for module LaunchDarklyObservability

Inline the availability check in the Railtie so it no longer depends on load
order, and move the `rails.rb` require to the bottom of the main file (after the
module body is fully defined) so future Railtie code can't reintroduce the same
class of bug. Adds a regression test that removes the module method to simulate
the load-order state.

Co-Authored-By: Claude <noreply@anthropic.com>
…rsions

The default instrumentation config passed `enable_recognize_route: true` to the
Rails instrumentation and `db_statement: :include` to ActiveRecord. Neither
option exists on those instrumentations (verified against the pinned gem
versions: rails 0.39.1, active_record 0.11.1, rack 0.29.0), so they were no-ops
that emitted a warning on every boot:

  Instrumentation ... ignored the following unknown configuration options [...]

Remove them. Route-based span naming (http.route) is handled automatically by
the ActionPack instrumentation, and SQL capture comes from the database adapter
instrumentations (Mysql2, PG, ...) which already obfuscate statements by default.

Co-Authored-By: Claude <noreply@anthropic.com>
The Rails e2e demo app only verified that a controller responds; nothing checked
that the OTel Rails-family instrumentations actually attached. That gap is why a
recent customer report ("Instrumentation: ... failed to install") went unnoticed
— the app initializes the plugin during Rails boot (the happy path) and never
asserted the result.

Add an integration test that asserts Rack, ActionPack, ActiveRecord,
ActiveSupport, and Rails report `installed? == true` after boot (the exact
inverse of the "failed to install" warning), and that an HTTP request produces a
server span. The plugin only configures OpenTelemetry when the LD client
registers it, which needs a non-empty SDK key, so test_helper sets a dummy key
before boot (invalid, so the client never connects).

Co-Authored-By: Claude <noreply@anthropic.com>
The OTel Rails-family instrumentations (ActionPack, ActiveRecord, ...) patch via
ActiveSupport.on_load hooks that fire while Rails boots. The plugin configured
OpenTelemetry from Plugin#register (at LDClient.new), so when an app creates the
client lazily — e.g. from a model on first request, after Rails has booted —
those hooks had already fired and every Rails instrumentation logged
"Instrumentation: ... failed to install".

Fix it in two parts:

1. Ship a hyphenated entry point (lib/launchdarkly-observability.rb) matching the
   gem name so Bundler.require auto-loads the gem — and its Railtie — during
   boot. Previously the gem was only loadable as 'launchdarkly_observability'
   (underscore), so Bundler couldn't auto-require it and users loaded it manually
   in an initializer, too late for the Railtie.

2. Add a Railtie initializer that installs auto-instrumentation during boot,
   decoupled from the LD client. project_id for the resource is resolved from
   LAUNCHDARKLY_SDK_KEY, which is present in the environment at boot in the common
   case even when the client object is built lazily. At register time the plugin
   then only attaches exporters to the existing provider instead of reconfiguring
   it (which would drop the boot-time instrumentation). It runs
   `after: :load_config_initializers` and no-ops if a provider is already
   configured, so boot-time-init apps keep their own configuration untouched.

When the client is registered after boot and boot-time install did not run (e.g.
no SDK key in the environment at boot), the plugin now logs a single actionable
warning instead of the upstream flood of "failed to install" lines.

Co-Authored-By: Claude <noreply@anthropic.com>
Add a lazy client-initialization mode to the Rails demo app (LD_LAZY_INIT=1):
the initializer skips creating the client at boot, and a LazyLdClient model
creates it on first use — mirroring apps that build the client after Rails has
booted. A new integration test boots a separate Rails process in this mode and
asserts the Rails-family instrumentations are still installed, which only holds
because the Railtie installs them during boot. Verified that removing the
boot-time install makes the test fail with "failed to install".

Co-Authored-By: Claude <noreply@anthropic.com>
…lure (red)

Reproduces the CardFlight failure: on Rails 7.0 with the current gem,
opentelemetry-instrumentation-all resolves to the latest (0.94 ->
instrumentation-rails 0.42), whose Rails-family members raised their floor to
Rails 7.1. They fail their runtime compatible? check, log a flurry of
'failed to install' warnings, and never attach, so no autoinstrumented HTTP
server span is produced.

- e2e/ruby/rails/demo-rails70: copy of demo pinned to rails ~> 7.0 (Ruby 3.3.4
  kept so bundler still resolves the latest instrumentation, faithfully matching
  a current-Ruby Rails 7.0 customer).
- test/support/otlp_sink.rb: in-process OTLP/protobuf sink (pure Ruby, decodes
  via the exporter gems' proto classes) so the e2e test asserts telemetry over
  the real export path under bundle exec rake (no Docker/Node).
- test/integration/otlp_export_e2e_test.rb: asserts traces (server span), a log
  record, and a captured exception all reach the sink. RED here.

Boot logs show 8 Rails-family 'failed to install' warnings; the new test and the
existing observability_instrumentation_test both fail (missing server span).
The observability plugin depended on opentelemetry-instrumentation-all, whose
Rails-family members raised their floor to Rails 7.1. On Rails 7.0 every
Rails-family instrumentation failed its compatible? check, logged a flurry of
'failed to install' warnings, and never attached — so apps lost all Rails
auto-instrumentation (no HTTP server spans, DB spans, etc.) and got only manual
instrumentation. The meta gem couples versions, so the only fix is to stop using
it.

- Replace opentelemetry-instrumentation-all with individual instrumentation gems
  (lib/launchdarkly_observability/instrumentations.rb). The Rails family is
  capped below each member's Rails-7.1-enforcing release (rails <0.42,
  action_pack <0.18, active_record/action_view <0.13, active_support <0.12,
  active_job <0.12, action_mailer <0.8, active_storage <0.5); everything else
  tracks the latest. These capped releases still work on Rails 7.1+, so modern
  apps are unaffected, and use_all still activates any extra instrumentation gem
  a consumer adds.
- Replace the per-instrumentation 'failed to install' log flurry with a single
  actionable summary via a logger filter (InstrumentationLogFilter), naming the
  instrumentations that could not attach and how to resolve it.

Verified on the demo-rails70 repro: 0 'failed to install', all 13
instrumentations install, and traces (server span) + a log + a captured
exception are exported to the OTLP sink. 5 runs, 16 assertions, 0 failures.
…lures correctly

- Move the log-suppression filter into its own file
  (instrumentation_log_filter.rb) with class-level capture_failures/
  failure_warning helpers (keeps OpenTelemetryConfig under the class-length
  limit and is unit-testable).
- Wrap the whole OpenTelemetry::SDK.configure call rather than just use_all:
  the SDK installs instrumentations AFTER the configure block returns, so the
  earlier placement around use_all never saw the install logging. Now the
  per-instrumentation 'failed to install' / 'successfully installed' chatter is
  actually suppressed and failures are collected for the single summary.
- rails.rb: Railtie#otel_logger_provider_available? now returns an explicit
  boolean instead of nil when the logs constant is absent. Fixes a pre-existing
  order-dependent failure in rails_railtie_test (verified failing identically on
  the unmodified gem at the same seed).

Updated e2e Gemfile.locks for the new individual-gem resolution. demo (7.2) and
api-only (7.2) suites pass; demo-rails70 (7.0) passes with 0 'failed to install'.
Adds e2e-rails-legacy to ruby-plugin.yml, running e2e/ruby/rails/demo-rails70
(Rails 7.0) under bundle exec rake. The existing e2e jobs run Rails 7.2 — above
the floor the OTel Rails-family instrumentations enforce — so they cannot catch
an old-Rails compatibility break. This job fails if Rails-family
auto-instrumentation stops attaching on Rails 7.0.

Demonstrated red-then-green with the job's exact command:
- pre-fix gemspec (instrumentation-all -> instrumentation-rails 0.42.0):
  5 runs, 9 assertions, 4 failures (ActionPack not installed, no server span)
- fixed gemspec (instrumentation-rails 0.41.0): 5 runs, 16 assertions, 0 failures
- README: replace the opentelemetry-instrumentation-all dependency note with a
  'Ruby & Rails compatibility' section (Ruby >= 3.0, Rails >= 7.0 matrix, the
  individual-gem rationale, the single-warning behavior, and the per-gem pin
  escape hatch). Update the auto-instrumentation list to the curated set.
- CHANGELOG: add an Unreleased Bug Fixes entry for the Rails 7.0 fix.
- demo-rails70/README: explain what the repro proves and how to run it.
…ncies)

rubocop's Gemspec/OrderedDependencies (run by the build CI job) requires
alphabetical order within each section. No functional change.
@ccschmitz-launchdarkly ccschmitz-launchdarkly force-pushed the feat/ruby-old-rails-compat branch from d039e6b to aaf9f8d Compare June 23, 2026 19:32
@ccschmitz-launchdarkly ccschmitz-launchdarkly changed the base branch from feat/ruby-rails-boot-instrumentation to main June 23, 2026 19:32
@github-actions

Copy link
Copy Markdown
Contributor

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines Covered Coverage Threshold Status
1327 1151 87% 0% 🟢

New Files

No new covered files...

Modified Files

No covered modified files...

updated for commit: aaf9f8d by action🐍

root: <%= Rails.root.join("tmp/storage") %>

local:
service: Disk

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High severity vulnerability may affect your project—review required:
Line 6 lists a dependency (activestorage) with a known High severity vulnerability.

ℹ️ Why this matters

Affected versions of activestorage are vulnerable to Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Active Storage DiskService builds filesystem paths from blob keys without containing them to the storage root; a key containing traversal sequences can read, write, or delete arbitrary files if that key reaches disk storage (for example when custom keys are derived from untrusted input)

References: GHSA, CVE

To resolve this comment:
Check if you are using ActiveStorage::Service::DiskService in your Ruby application.

💬 Ignore this finding

To ignore this, reply with:

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

You can view more details on this finding in the Semgrep AppSec Platform here.

@@ -0,0 +1,33 @@
test:
service: Disk

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High severity vulnerability may affect your project—review required:
Line 2 lists a dependency (activestorage) with a known High severity vulnerability.

ℹ️ Why this matters

Affected versions of activestorage are vulnerable to Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Active Storage DiskService builds filesystem paths from blob keys without containing them to the storage root; a key containing traversal sequences can read, write, or delete arbitrary files if that key reaches disk storage (for example when custom keys are derived from untrusted input)

References: GHSA, CVE

To resolve this comment:
Check if you are using ActiveStorage::Service::DiskService in your Ruby application.

💬 Ignore this finding

To ignore this, reply with:

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

You can view more details on this finding in the Semgrep AppSec Platform here.

@semgrep-code-launchdarkly

Copy link
Copy Markdown

Semgrep found 1 ssc-c8b7a1f2-4d36-4f0a-9e2b-1a5c8d7e6f30 finding:

Risk: Affected versions of vite and vite-plus are vulnerable to Exposure of Sensitive Information to an Unauthorized Actor / Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'). Vite's server.fs.deny blocklist—which protects sensitive files such as .env and certificate files from being served—can be bypassed on Windows using alternate path representations (NTFS Alternate Data Stream syntax like /.env::$DATA?raw, or 8.3 short filenames), allowing an attacker to read otherwise-denied files when the dev server is exposed to the network.

Manual Review Advice: A vulnerability from this advisory is reachable if you expose the Vite dev server or vite-plus to the network by configuring a non-loopback address using the --host CLI flag on Windows

Fix: Upgrade this library to at least version 6.4.3 at observability-sdk/yarn.lock:47256.

Reference(s): GHSA-fx2h-pf6j-xcff

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants