Skip to content

feat(otelplugin): add OpenTelemetry usage plugin with W3C Baggage propagation#3761

Open
quanyeomans wants to merge 1 commit into
router-for-me:devfrom
quanyeomans:feat/otelplugin-baggage
Open

feat(otelplugin): add OpenTelemetry usage plugin with W3C Baggage propagation#3761
quanyeomans wants to merge 1 commit into
router-for-me:devfrom
quanyeomans:feat/otelplugin-baggage

Conversation

@quanyeomans

Copy link
Copy Markdown

Motivation

This PR adds an opt-in OpenTelemetry usage plugin for CLIProxyAPI so any OTel-compatible backend (Honeycomb, Tempo, Jaeger, Datadog, Grafana Cloud, Azure Monitor, AWS X-Ray, …) becomes a one-config-change away from CLIProxyAPI telemetry — instead of each operator writing a backend-specific shim against the management API.

It also wires standards-aligned cross-service identity propagation per the W3C Baggage spec so caller-supplied identity (correlation ids, tenant ids, agent ids, …) lands on each upstream LLM span without per-request plumbing in the proxy core.

Design context lives in tc-agent-zone:

What the patch adds

New package at internal/otelplugin/:

File Purpose
config.go Config schema + atomic snapshot store (SetConfig, Enabled/SetEnabled); mirrors the redisqueue toggle pattern
baggage.go W3C Baggage parser + formatter + allowlist filter + context plumbing (WithBaggage, BaggageFromContext)
middleware.go Gin middleware (Middleware()) + outbound propagation helpers (PropagateBaggage, ApplyOutbound)
plugin.go usage.Plugin implementation; tracer + exporter wiring; GenAI semconv + cost attribute construction
status.go OTel codes aliases (keeps the import surface in plugin.go small)
README.md Design notes + open questions for maintainers
*_test.go 35 unit tests covering parser, middleware, attribute mapping, cost calc, disabled-plugin no-op

Plus go.mod / go.sum entries for:

  • go.opentelemetry.io/otel v1.44.0 (+ sdk/trace, exporters/otlp/otlptrace/otlptracehttp, semconv/v1.26.0)
  • golang.org/x/sys bumped from v0.38.0 (indirect) to v0.45.0 via go mod tidy to satisfy OTel transitive constraints — aligns with upstream's crypto v0.45.0 / net v0.47.0 cadence.

Design properties

  • Off by default. telemetry.otlp.enabled: false is the safe default; no behaviour change for existing operators.
  • Self-contained. Two-line downstream wiring (engine.Use(otelplugin.Middleware()) + otelplugin.SetConfig(...)); no other proxy internals modified.
  • Mirrors existing patterns. Plugin auto-registration via init() like internal/redisqueue/plugin.go. Feature flag via atomic.Bool like redisqueue.usageStatisticsEnabled. Config snapshot under atomic.Value for lock-free hot-path reads. Maintenance burden tracks an existing pattern rather than introducing a new mechanism.
  • OpenTelemetry GenAI semconv. Attribute names follow the emerging semantic conventions (gen_ai.system, gen_ai.request.model, gen_ai.usage.*) where they exist; falls back to local cost.* namespace where they don't.
  • W3C Baggage propagation policy. Three modes (off / propagate / allowlist); off is the safe default (matches today's behaviour — no header forwarding). Trusted-boundary safe by default; operators opt in to propagate or allowlist.
  • Per-request cost attribution. Optional pricing-table block produces cost.usd per span.

Test coverage

  • 35 unit tests in internal/otelplugin/:
    • Baggage parsing/formatting (baggage_test.go) — canonical multi-key, URL decode/encode, malformed input handling, empty/nil safety, allowlist normalisation
    • Middleware (middleware_test.go) — header-present roundtrip, missing-header noop, malformed-header noop, outbound propagation policies, off mode safety
    • Plugin (plugin_test.go) — GenAI semconv attribute mapping, cost-table lookup with prefix fallback, optional-field omission when zero, include-usage/include-cost toggles, disabled-plugin no-op

All 35 pass on upstream HEAD 07c607b7 (2026-06-08).

Backward compatibility

  • Opt-in. telemetry.otlp.enabled defaults to false. Without enabling, the plugin is registered but every entry point short-circuits — zero hot-path cost.
  • Mirrors the redisqueue toggle pattern already established in internal/redisqueue/plugin.go. Operators familiar with the redisqueue enable/disable flow get the same shape here.
  • No behavioural changes to existing functionality. The patch is purely additive at the package level; only go.mod / go.sum change in existing files.

Verification

git apply 0001-otelplugin.patch  # 9 new files in internal/otelplugin/
go mod tidy                       # pulls otel v1.44.0 + transitive deps
go build ./...                    # clean
go vet ./...                      # clean (warnings pre-existing in upstream)
go test -count=1 ./internal/otelplugin/...  # 35/35 pass
go test ./...                     # same 2 pre-existing failures in
                                  # internal/pluginhost (unrelated to patch)

Happy to iterate on naming, config layout, or any other aspect — this is offered as a reference implementation, not a take-it-or-leave-it patch. Feedback welcome.

…pagation

Adds internal/otelplugin/ as an opt-in `usage.Plugin` that:
- Parses W3C Baggage on inbound requests (Gin middleware)
- Allowlist-filters the baggage before downstream propagation
- Emits OTLP usage events with GenAI semconv + cost attributes
- Propagates baggage on outbound LLM provider requests
- Mirrors the redisqueue toggle pattern for runtime enable/disable

Off by default — set telemetry.otlp.enabled: true to turn it on. No
existing behaviour changes.

The plugin mirrors the redisqueue pattern: init()-time registration with
coreusage.RegisterPlugin, atomic.Bool feature flag (Enabled /
SetEnabled), config snapshot under atomic.Value for lock-free hot-path
reads.

Why
---
- Native OpenTelemetry observability: any OTel-compatible backend
  (Honeycomb, Tempo, Jaeger, Datadog, Grafana Cloud, Azure Monitor,
  AWS X-Ray, …) becomes a one-config-change away from CLIProxyAPI
  telemetry, instead of each operator writing a backend-specific shim
  against the management API.
- Standards-aligned cross-service identity propagation per the W3C
  Baggage spec. Trusted-boundary safe default (propagation: off);
  operators opt in to propagate or allowlist.
- Early adoption of the OTel GenAI semantic conventions — LLM-routing
  proxies are a category the OTel community has identified as needing
  reference implementations.
- Per-request cost attribution: optional pricing-table block produces
  cost.usd per span.

Files
-----
internal/otelplugin/
  config.go         — Config schema + SetConfig + Enabled/SetEnabled
  baggage.go        — W3C Baggage parser, formatter, allowlist filter,
                      context plumbing (WithBaggage, BaggageFromContext)
  middleware.go     — Gin middleware + outbound propagation helpers
  plugin.go         — usage.Plugin implementation + tracer wiring
  status.go         — OTel codes aliases
  *_test.go         — 35 unit tests covering parser, middleware,
                      attribute mapping, cost calc, disabled-plugin no-op

Spec + design context:
https://github.com/three-cubes/tc-agent-zone/blob/main/docs/upstream-asks/cli-proxy-api-otel-baggage.md
@github-actions github-actions Bot changed the base branch from main to dev June 8, 2026 06:10
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown

This pull request targeted main.

The base branch has been automatically changed to dev.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the otelplugin package, which exports OpenTelemetry spans for upstream LLM calls and supports W3C Baggage propagation and cost tracking. Feedback from the review highlights several critical improvements: correcting baggage encoding/decoding by using path-based escaping instead of query-based escaping to properly handle + and spaces; optimizing the hot path in tracerFor using atomic.Value to avoid global mutex contention; ensuring deterministic pricing prefix matches by sorting and caching pricing keys; and supporting multiple Baggage headers in the Gin middleware for full spec compliance.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +54 to +57
value, err := url.QueryUnescape(rawValue)
if err != nil {
value = rawValue
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using url.QueryUnescape will decode + characters as spaces. In W3C Baggage, + is a valid character (often appearing in base64-encoded values) and should not be converted to spaces. Use url.PathUnescape instead to correctly preserve + characters.

Suggested change
value, err := url.QueryUnescape(rawValue)
if err != nil {
value = rawValue
}
value, err := url.PathUnescape(rawValue)
if err != nil {
value = rawValue
}

if strings.TrimSpace(k) == "" || strings.TrimSpace(v) == "" {
continue
}
parts = append(parts, k+"="+url.QueryEscape(v))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using url.QueryEscape will encode spaces as + instead of %20. W3C Baggage values should be percent-encoded (using %20 for spaces), and + should be preserved. Use url.PathEscape instead.

Suggested change
parts = append(parts, k+"="+url.QueryEscape(v))
parts = append(parts, k+"="+url.PathEscape(v))

Comment on lines +85 to +112
var (
tracerMu sync.Mutex
tracerActive trace.Tracer
tracerEndpt string
tracerProto string
)

// tracerFor lazily constructs the global tracer + exporter on first use. The
// guard mutex serialises construction; subsequent reads are lock-free because
// trace.Tracer is itself a goroutine-safe handle.
func tracerFor(cfg *Config) trace.Tracer {
tracerMu.Lock()
defer tracerMu.Unlock()
if tracerActive != nil &&
tracerEndpt == cfg.OTLP.Endpoint &&
tracerProto == cfg.OTLP.Protocol {
return tracerActive
}
tracer, shutdown, err := buildTracer(cfg)
if err != nil {
return nil
}
sharedExporter.reset(shutdown)
tracerActive = tracer
tracerEndpt = cfg.OTLP.Endpoint
tracerProto = cfg.OTLP.Protocol
return tracerActive
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of tracerFor acquires a global mutex (tracerMu) on every single request on the hot path. This introduces a significant performance bottleneck under high concurrent load, defeating the goal of a lock-free hot path. We can optimize this by storing the active tracer state in an atomic.Value and only acquiring the mutex on the slow path when the configuration actually changes.

type tracerState struct {
	tracer   trace.Tracer
	endpoint string
	protocol string
}

var (
	tracerMu    sync.Mutex
	tracerValue atomic.Value // stores *tracerState
)

// tracerFor lazily constructs the global tracer + exporter on first use.
// It uses an atomic fast-path to avoid mutex contention on the hot path.
func tracerFor(cfg *Config) trace.Tracer {
	if state, ok := tracerValue.Load().(*tracerState); ok && state != nil {
		if state.endpoint == cfg.OTLP.Endpoint && state.protocol == cfg.OTLP.Protocol {
			return state.tracer
		}
	}

	tracerMu.Lock()
	defer tracerMu.Unlock()

	// Double-check under lock
	if state, ok := tracerValue.Load().(*tracerState); ok && state != nil {
		if state.endpoint == cfg.OTLP.Endpoint && state.protocol == cfg.OTLP.Protocol {
			return state.tracer
		}
	}

	tracer, shutdown, err := buildTracer(cfg)
	if err != nil {
		return nil
	}
	sharedExporter.reset(shutdown)

	tracerValue.Store(&tracerState{
		tracer:   tracer,
		endpoint: cfg.OTLP.Endpoint,
		protocol: cfg.OTLP.Protocol,
	})
	return tracer
}

Comment on lines +244 to +256
func costAttributes(record coreusage.Record, cfg *Config) []attribute.KeyValue {
pricing, ok := cfg.Cost.Pricing[record.Model]
if !ok {
// Try a prefix match — operators frequently use one entry per model
// family (e.g. "claude-sonnet") rather than every variant suffix.
for model, p := range cfg.Cost.Pricing {
if strings.HasPrefix(record.Model, model) {
pricing = p
ok = true
break
}
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Go map iteration order is randomized. When performing a prefix match over cfg.Cost.Pricing, if multiple configured keys match the model prefix (e.g., gpt-4 and gpt-4o both prefix-match gpt-4o-mini), the matched pricing will be non-deterministic and vary across requests. To ensure deterministic and correct behavior (longest prefix match first), we should sort the pricing keys by length in descending order. To avoid performance overhead on the hot path, we can cache the sorted keys and only rebuild them when the configuration pointer changes.

var (
	pricingKeysMu sync.Mutex
	lastConfig    *Config
	sortedPricing []string
)

func getSortedPricingKeys(cfg *Config) []string {
	pricingKeysMu.Lock()
	defer pricingKeysMu.Unlock()
	if cfg == lastConfig {
		return sortedPricing
	}
	lastConfig = cfg
	sortedPricing = make([]string, 0, len(cfg.Cost.Pricing))
	for k := range cfg.Cost.Pricing {
		sortedPricing = append(sortedPricing, k)
	}
	sort.Slice(sortedPricing, func(i, j int) bool {
		return len(sortedPricing[i]) > len(sortedPricing[j])
	})
	return sortedPricing
}

func costAttributes(record coreusage.Record, cfg *Config) []attribute.KeyValue {
	pricing, ok := cfg.Cost.Pricing[record.Model]
	if !ok {
		// Try a deterministic prefix match sorted by length descending (longest match first)
		for _, model := range getSortedPricingKeys(cfg) {
			if strings.HasPrefix(record.Model, model) {
				pricing = cfg.Cost.Pricing[model]
				ok = true
				break
			}
		}
	}

Comment on lines +3 to +7
import (
"context"
"strings"
"sync"
"time"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Add "sort" and "sync/atomic" to the imports to support deterministic pricing key sorting and lock-free tracer caching.

Suggested change
import (
"context"
"strings"
"sync"
"time"
import (
"context"
"sort"
"strings"
"sync"
"sync/atomic"
"time"

Comment on lines +28 to +34
func Middleware() gin.HandlerFunc {
return func(c *gin.Context) {
header := c.GetHeader(HeaderBaggage)
if header == "" {
c.Next()
return
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The W3C Baggage specification allows multiple Baggage headers to be sent in a single request. Using c.GetHeader only retrieves the first header value, ignoring any subsequent ones. To be fully spec-compliant, retrieve all values of the Baggage header and join them with commas.

func Middleware() gin.HandlerFunc {
	return func(c *gin.Context) {
		headers := c.Request.Header[HeaderBaggage]
		if len(headers) == 0 {
			c.Next()
			return
		}
		header := strings.Join(headers, ",")
		b := ParseBaggageHeader(header)

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 48564723a9

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +27 to +28
func init() {
coreusage.RegisterPlugin(NewPlugin())

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Wire the package into server startup

This registration only runs if internal/otelplugin is imported, but I searched the repo outside internal/otelplugin and found no production import, no call to SetConfig, and no middleware registration. As a result, setting the documented telemetry.otlp.enabled: true block is ignored and the default server never registers this usage plugin or emits OTLP spans.

Useful? React with 👍 / 👎.

Comment on lines +249 to +253
for model, p := range cfg.Cost.Pricing {
if strings.HasPrefix(record.Model, model) {
pricing = p
ok = true
break

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Prefer the most-specific pricing prefix

When operators configure both a broad family prefix and a more specific model prefix, e.g. claude and claude-sonnet, this map iteration can match either entry for claude-sonnet-* because Go map order is randomized. That makes cost.usd nondeterministic and can report the wrong price; choose the longest matching prefix instead of breaking on the first match.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant