Skip to content

feat: add OTel spans to NATS request operations#25

Draft
bramwelt wants to merge 4 commits into
mainfrom
tbramwell/LFXV2-1743-nats-span-instrumentation
Draft

feat: add OTel spans to NATS request operations#25
bramwelt wants to merge 4 commits into
mainfrom
tbramwell/LFXV2-1743-nats-span-instrumentation

Conversation

@bramwelt
Copy link
Copy Markdown
Contributor

@bramwelt bramwelt commented May 22, 2026

Summary

  • Adds OpenTelemetry client spans around NATS RequestMsg calls so NATS request-reply operations are visible in Datadog APM traces
  • Injects W3C trace context into NATS message headers for cross-service propagation to fga-sync
  • Uses semconv/v1.39.0 constants for all standard messaging attributes; defines a local const for messaging.message.reply.body.size (no semconv equivalent yet)
  • Respects ctx deadline — uses the shorter of the explicit timeout and ctx.Deadline(); short-circuits with context.DeadlineExceeded if the deadline is already past
  • Adds TestMessagingRepository_Request_CreatesSpan to verify span creation and attributes via in-memory exporter

Changes

  • internal/infrastructure/messaging/messaging_repository.go — span instrumentation, natsHeaderCarrier, ctx-deadline timeout handling
  • internal/infrastructure/messaging/messaging_repository_test.go — span creation test with provider save/restore
  • go.mod — promote otel/trace from indirect to direct dependency

Issue: LFXV2-1743

🤖 Generated with Claude Code

- Add package-level tracer and natsHeaderCarrier for trace context
  propagation via NATS message headers
- Wrap Request() with a nats.request client span using semconv/v1.39.0
  constants for all standard attributes (messaging.system,
  messaging.operation.type=send, destination, body size)
- Define messaging.message.reply.body.size as a package-level const for
  the reply payload size (no semconv standard attribute exists yet)
- Set span status codes.Ok on success; RecordError + codes.Error on all
  failure paths
- Respect ctx deadline: use shorter of timeout and ctx.Deadline();
  short-circuit with context.DeadlineExceeded if deadline already past
- Add TestMessagingRepository_Request_CreatesSpan to verify span
  creation and attributes via in-memory exporter; save/restore global
  TracerProvider in cleanup to avoid nil-provider panics

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-1743
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
@bramwelt bramwelt requested a review from a team as a code owner May 22, 2026 00:33
Copilot AI review requested due to automatic review settings May 22, 2026 00:33
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

Review Change Stack

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d0a60311-89b8-40ee-9399-aa22b6d8c8be

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

This PR integrates OpenTelemetry tracing into NATS messaging by promoting the otel/trace dependency to direct, instrumenting the messaging repository's Request method with span creation and context propagation, and validating span attributes through unit tests.

Changes

OpenTelemetry tracing for NATS messaging

Layer / File(s) Summary
OTel dependency and package-level instrumentation
go.mod, internal/infrastructure/messaging/messaging_repository.go
go.opentelemetry.io/otel/trace v1.40.0 is promoted from indirect to direct dependency. Package-level tracer, reply size attribute key, and natsHeaderCarrier adapter for trace context injection/extraction are initialized.
Request method span creation and context propagation
internal/infrastructure/messaging/messaging_repository.go
Request method creates a client span, injects trace context into NATS message headers, manages timeout by short-circuiting to the earlier of explicit timeout and context deadline, sends via RequestMsg, records reply body size attribute, and sets span status on success or records error status on failure.
Span creation and attribute validation tests
internal/infrastructure/messaging/messaging_repository_test.go
OTel test imports (sdktrace, tracetest, semconv) are added. New test TestMessagingRepository_Request_CreatesSpan uses an in-memory span exporter to verify the Request method produces a span named "nats.request" with messaging system and destination name attributes.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add OTel spans to NATS request operations' clearly summarizes the main change: adding OpenTelemetry spans to NATS request operations.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The pull request description accurately describes the changeset, detailing OpenTelemetry instrumentation additions to NATS messaging, trace context injection, semantic conventions usage, context deadline handling, and test additions.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch tbramwell/LFXV2-1743-nats-span-instrumentation

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR instruments the NATS request/reply path in the messaging repository with OpenTelemetry client spans and injects W3C trace context into NATS message headers to enable cross-service trace propagation (e.g., into fga-sync).

Changes:

  • Added an OTel client span around NATS request operations, including standard messaging semantic convention attributes and reply body size.
  • Injected the current trace context into outbound NATS message headers via a TextMapCarrier adapter.
  • Added a unit test using an in-memory trace exporter to verify span creation/attributes; updated go.mod to make otel/trace a direct dependency.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
internal/infrastructure/messaging/messaging_repository.go Adds OTel span instrumentation, context-propagation injection, and ctx deadline/timeout handling for NATS requests.
internal/infrastructure/messaging/messaging_repository_test.go Adds a unit test verifying that Request creates a nats.request span with expected attributes.
go.mod Promotes go.opentelemetry.io/otel/trace from indirect to direct dependency.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/infrastructure/messaging/messaging_repository.go Outdated
@bramwelt bramwelt marked this pull request as draft May 26, 2026 16:57
Address review comment from copilot-pull-request-reviewer[bot]:

- messaging_repository.go: flatten ctx-deadline/timeout guard so that
  a caller-supplied timeout <= 0 is also caught. Previously, the nested
  `remaining < timeout` condition was false when timeout was negative
  and remaining was positive, silently passing an invalid duration to
  RequestMsg. Now: always clamp to ctx deadline when shorter, then a
  single `timeout <= 0` guard short-circuits with context.DeadlineExceeded
  before reaching NATS.

Resolves 1 review thread.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-1743
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
Copilot AI review requested due to automatic review settings May 26, 2026 17:00
@bramwelt
Copy link
Copy Markdown
Contributor Author

Review Feedback Addressed

Commit: f6b2f82

Changes Made

  • messaging_repository.go:120-132: Flattened the ctx-deadline/timeout guard so that a caller-supplied timeout <= 0 is also caught. The previous nested structure meant remaining < timeout was false when timeout was negative and remaining was positive, silently passing an invalid duration to RequestMsg. Now the ctx deadline is clamped unconditionally (when shorter), followed by a single timeout <= 0 short-circuit with context.DeadlineExceeded. (per copilot-pull-request-reviewer[bot])

Threads Resolved

1 of 1 unresolved threads addressed.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread internal/infrastructure/messaging/messaging_repository.go Outdated
Address review comment from copilot-pull-request-reviewer[bot]:

- messaging_repository.go: use constants.ErrMsgNATSRequestFailed as the
  span status description instead of err.Error(). Raw NATS error strings
  are high-cardinality (contain server addresses, ports, etc.) and can
  leak internal connection details into traces. RecordError(err) already
  captures the full error detail; the status description should be a
  stable, low-cardinality string.

Resolves 1 review thread.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-1743
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
@bramwelt
Copy link
Copy Markdown
Contributor Author

Review Feedback Addressed

Commit: 3eb55c0

Changes Made

  • messaging_repository.go:140: Replaced err.Error() with constants.ErrMsgNATSRequestFailed as the span status description. Raw NATS error strings are high-cardinality and can leak internal server addresses/port details into traces. RecordError(err) already captures the full error on the span event — the status description should be a stable, low-cardinality string. (per copilot-pull-request-reviewer[bot])

Threads Resolved

1 of 1 unresolved threads addressed in this iteration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants