Commit 045c7ce

feat(compat): add vendor compatibility translation edge
Adds compat/ package and docker-compose.compat.yml overlay that accepts
telemetry from Datadog, Jaeger (legacy wire protocol), and Splunk HEC
agents and forwards OTLP to the base observability-stack collector.

Architecture: dedicated otel-collector-compat edge collector runs the
upstream collector-contrib receivers for each vendor, batches, and
forwards OTLP to the base collector, which handles all existing
processor/exporter logic unchanged. No drift risk with the base config.
Modern OTel-SDK apps (including Jaeger v1.42+) bypass the compat hop and
send OTLP directly to the base collector.

Activation uses the repo's existing INCLUDE_COMPOSE_* convention:

    echo 'INCLUDE_COMPOSE_COMPAT=docker-compose.compat.yml' >> .env
    docker compose up -d

Pipelines wired into compat collector:
- traces: [datadog, jaeger]
- metrics: [datadog, statsd, splunk_hec]
- logs: [datadog, splunk_hec]

Scope and framing:
- This is protocol-compatible ingest, not a 1:1 vendor platform
  replacement.
- We defer to upstream collector-contrib READMEs and vendor documentation
  for attribute schemas and translation behavior. The vendor READMEs here
  document only 3-5 canonical attributes per vendor and link out to the
  authoritative sources.
- Component receivers have varying stability tiers upstream
  (alpha/beta/development). Vendor READMEs document the specifics.

SignalFx deliberately omitted: upstream signalfxreceiver was deprecated
2026-02-13 with explicit guidance to migrate to OTLP.

Bundles Jaeger's hotrod demo on port 8080 as a built-in OTLP trace
generator (Jaeger's canonical demo, pointed at the base collector).

Signed-off-by: Kyle Hounslow <kylhouns@amazon.com>
1 parent aa0fb14 commit 045c7ce

8 files changed

Lines changed: 589 additions & 0 deletions

compat/README.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
# Vendor Compatibility Overlay

Accept telemetry from Datadog, Jaeger, and Splunk HEC agents without re-instrumenting your applications. The compat overlay adds a dedicated edge collector that translates vendor wire protocols to OpenTelemetry Protocol (OTLP) and forwards it to the base observability-stack collector.

## What this is (and what it isn't)

**What it is:**
- Protocol-compatible ingest for Datadog, Jaeger (legacy wire protocol), and Splunk HEC
- A translation edge that hands off to the standard observability-stack pipeline
- A quick-start for evaluating observability-stack as a destination for your existing vendor agents

**What it isn't:**
- Not a 1:1 feature replacement for any vendor platform (no Datadog APM UI, no Splunk apps, no Jaeger query UI)
- Not a semantic-convention authority: attribute translation behavior is defined by upstream OTel receivers, not by us. We defer to vendor documentation and upstream [`opentelemetry-collector-contrib`](https://github.com/open-telemetry/opentelemetry-collector-contrib) for schema questions.
- Not production-hardened: the component receivers have varying stability tiers (`alpha`, `beta`, `development`) upstream. See each vendor page and the upstream receiver README for current maturity.

## Architecture

```
                     ┌────────────────────────────────────────────────┐
vendor agents        │ otel-collector-compat (added by this overlay)  │
(Datadog,       ───▶ │ receivers: datadog, jaeger, splunk_hec, statsd │
 Jaeger legacy,      │ exporters: otlp → otel-collector:4317          │
 Splunk HEC)         └────────────────────────────────────────────────┘
                                             │
                                             │ OTLP
                                             ▼
modern OTLP          ┌────────────────────────────────────────────────┐
apps (incl.     ───▶ │ otel-collector (base, unchanged)               │
Jaeger v1.42+)       │ all processing and downstream routing          │
                     │ exports: data-prepper, prometheus              │
                     └────────────────────────────────────────────────┘
                                             │
                                             ▼
                               OpenSearch / Prometheus / OSD
```

**Why a separate edge collector?**

- Zero configuration drift with the base collector. Edge config stays minimal (translate + forward).
- Modern OTel/OTLP apps bypass the compat hop entirely: they send OTLP directly to the base collector.
- Separate failure domain: a misbehaving vendor receiver can't affect the primary observability path.
- Matches OpenTelemetry's reference pattern: translation tier at the edge, central collector for enrichment and export.

## Quickstart

```bash
# Activate the overlay via the repo's standard INCLUDE_COMPOSE_* convention
echo "INCLUDE_COMPOSE_COMPAT=docker-compose.compat.yml" >> .env
docker compose up -d

# Trigger a trace via the bundled Jaeger hotrod demo (OTLP path).
# The URL is quoted so shells such as zsh don't treat "?" as a glob.
curl "http://localhost:8080/dispatch?customer=123"

# Send a Splunk HEC log event (compat path)
curl -X POST http://localhost:8088/services/collector \
  -H "Authorization: Splunk any-token" \
  -d '{"event":"hello from splunk","sourcetype":"manual"}'

# View everything in OpenSearch Dashboards (`open` is macOS; use xdg-open on Linux)
open http://localhost:5601
```

## Supported vendors

| Vendor | Receiver(s) | Port(s) on compat collector | See |
|--------|-------------|----------------------------|-----|
| Datadog | `datadogreceiver`, `statsdreceiver` | 8126/tcp, 8125/udp | [vendors/datadog/](vendors/datadog/) |
| Jaeger (legacy wire protocol) | `jaegerreceiver` | 14250/grpc, 14268/http | [vendors/jaeger/](vendors/jaeger/) |
| Splunk HEC | `splunkhecreceiver` | 8088/tcp | [vendors/splunk/](vendors/splunk/) |

**Modern Jaeger users (v1.42+) bypass the compat collector** and send OTLP directly to the base collector on 4317/4318. See [vendors/jaeger/](vendors/jaeger/) for why.

## Source of truth for schemas

We intentionally do not maintain attribute translation tables or schema documentation. The authoritative references for how each vendor's data maps to OTel are:

- The upstream receiver README (e.g., [`datadogreceiver`](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.151.0/receiver/datadogreceiver/README.md))
- The vendor's own instrumentation documentation (e.g., [Datadog unified tagging](https://docs.datadoghq.com/getting_started/tagging/unified_service_tagging/))
- The shared OTel translator packages where applicable (e.g., [`pkg/translator/jaeger`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.151.0/pkg/translator/jaeger))

Our vendor READMEs link to these directly and document only the 3-5 canonical attributes every user of that vendor sees.

## Deployment modes

All three vendor integrations support the same migration progression:

1. **Greenfield**: only observability-stack runs on the target port
2. **Side-by-side**: run your vendor agent AND observability-stack simultaneously (remap ports via env vars below) for A/B validation
3. **Full replacement**: decommission the vendor agent, observability-stack takes over

## Directory layout

```
compat/
├── README.md                    ← you are here
├── collector/
│   ├── config.compat.yaml       ← compat collector config (translate + forward)
│   └── README.md                ← architecture notes
└── vendors/
    ├── datadog/README.md
    ├── jaeger/README.md
    └── splunk/README.md

docker-compose.compat.yml        ← overlay (adds otel-collector-compat + hotrod)
```

## Port customization

All vendor ports are remappable via env vars, which is useful when the host already has a real vendor agent bound to the default port. The compat collector's memory limit is configured the same way.

| Variable | Default | Applies to |
|----------|---------|------------|
| `COMPAT_DATADOG_APM_PORT` | 8126 | Datadog trace-agent |
| `COMPAT_DATADOG_STATSD_PORT` | 8125 | DogStatsD |
| `COMPAT_JAEGER_GRPC_PORT` | 14250 | Jaeger gRPC |
| `COMPAT_JAEGER_THRIFT_HTTP_PORT` | 14268 | Jaeger Thrift HTTP |
| `COMPAT_SPLUNK_HEC_PORT` | 8088 | Splunk HEC |
| `COMPAT_COLLECTOR_MEMORY_LIMIT` | 256M | Compat collector memory limit (not a port) |

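To check which ports a given environment would produce before starting the stack, the defaults in the table can be resolved in the shell. This is only a sketch: `compat_ports` is a hypothetical helper, and it assumes the overlay resolves these variables with Compose-style `${VAR:-default}` interpolation, which this README does not show.

```bash
# compat_ports is a hypothetical helper (not part of the repo).
# Assumption: the overlay applies ${VAR:-default} semantics to these variables.
# Defaults and protocols are copied from the tables in this README.
compat_ports() {
  printf 'datadog-apm=%s/tcp\n'        "${COMPAT_DATADOG_APM_PORT:-8126}"
  printf 'dogstatsd=%s/udp\n'          "${COMPAT_DATADOG_STATSD_PORT:-8125}"
  printf 'jaeger-grpc=%s/grpc\n'       "${COMPAT_JAEGER_GRPC_PORT:-14250}"
  printf 'jaeger-thrift-http=%s/http\n' "${COMPAT_JAEGER_THRIFT_HTTP_PORT:-14268}"
  printf 'splunk-hec=%s/tcp\n'         "${COMPAT_SPLUNK_HEC_PORT:-8088}"
}
```

For example, setting `COMPAT_DATADOG_APM_PORT=8127` and then running `compat_ports` would list the Datadog APM listener on 8127 while the other four stay on their defaults.
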
## Adding a new vendor

1. Create `vendors/<name>/README.md` (see existing vendor pages as templates)
2. Add the receiver stanza to `collector/config.compat.yaml` and wire it into the appropriate pipeline
3. Add port mappings to `docker-compose.compat.yml`

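Step 1 can be scripted. The sketch below is illustrative only: `scaffold_vendor` is a hypothetical helper, and the stub headings are borrowed from the existing Datadog page rather than from any required template. Steps 2 and 3 still need manual edits.

```bash
# scaffold_vendor is a hypothetical helper covering step 1 only.
# $1: vendor name; $2: path to the compat/ directory (defaults to ./compat).
scaffold_vendor() {
  name="$1"
  root="${2:-compat}"
  readme="$root/vendors/$name/README.md"
  mkdir -p "$root/vendors/$name"
  # Never overwrite an existing vendor page.
  if [ ! -e "$readme" ]; then
    printf '# %s → observability-stack\n\n## Protocol support\n\n## Caveats\n' "$name" > "$readme"
  fi
  # Print the path so callers can open or inspect the stub.
  printf '%s\n' "$readme"
}
```

Running `scaffold_vendor acme` from the repo root would create `compat/vendors/acme/README.md` if it does not exist yet (`acme` is a placeholder name).
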
## Related

- [OpenTelemetry collector-contrib receivers (v0.151.0)](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.151.0/receiver)

compat/collector/README.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
# Compat Collector Config

`config.compat.yaml` is the config for a **dedicated edge collector** (`otel-collector-compat`) that runs alongside the base collector when the compat overlay is active.

## Design: translate and forward

The compat collector does one thing: it accepts vendor wire protocols and forwards OTLP to the base collector. It does not run transforms, export to Data Prepper, or export to Prometheus; all of that stays in the base collector's config.

```
vendor apps ──▶ otel-collector-compat ──OTLP──▶ otel-collector (base)
                    [this config]                    [unchanged]
```

This means:

- **Zero drift risk**: our config is ~100 lines of receiver stanzas plus one forwarder. If the base config's processors change, we don't need to update this.
- **One place for business logic**: the base collector's config is the source of truth for all enrichment, filtering, and downstream routing.
- **Separate failure domains**: a broken vendor receiver can only crash the compat collector; base observability keeps working.

## How activation works

The compat overlay activates via the repo's standard `.env` convention:

```bash
echo "INCLUDE_COMPOSE_COMPAT=docker-compose.compat.yml" >> .env
docker compose up -d
```

Docker Compose's `include:` directive reads the env var and pulls the overlay in, which adds `otel-collector-compat` as a new service alongside the base stack.

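Because `>>` appends unconditionally, running the activation command twice leaves a duplicate line in `.env`. A minimal idempotent variant is sketched here; `enable_compat` is a hypothetical helper name, not something the repo provides.

```bash
# enable_compat is a hypothetical helper (not part of the repo).
# It appends the overlay switch to an env file only when that exact line is missing.
enable_compat() {
  env_file="${1:-.env}"
  line="INCLUDE_COMPOSE_COMPAT=docker-compose.compat.yml"
  touch "$env_file"
  # If the file ends without a newline, add one so the append starts a new line.
  if [ -s "$env_file" ] && [ -n "$(tail -c 1 "$env_file")" ]; then
    printf '\n' >> "$env_file"
  fi
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
  grep -qxF "$line" "$env_file" || printf '%s\n' "$line" >> "$env_file"
}
```

With this in place, `enable_compat && docker compose up -d` can be re-run safely.
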
## Receiver reference

| Receiver | Config key | Pipeline(s) | Port | Upstream stability |
|----------|-----------|-------------|------|-------------------|
| Datadog | `datadog` | traces, metrics, logs | 8126/tcp | alpha (traces); development (metrics, logs) |
| DogStatsD | `statsd` | metrics | 8125/udp | beta |
| Jaeger | `jaeger` | traces | 14250/grpc, 14268/http | beta |
| Splunk HEC | `splunk_hec` | logs, metrics | 8088/tcp | beta |

> **Note:** `signalfxreceiver` is deprecated upstream (as of 2026-02-13). SignalFx migration path is OTLP, not this compat layer.

## Pipeline wiring

All vendor receivers route to a single `otlp_grpc` exporter pointed at the base collector (the `debug` exporter only logs locally):

```yaml
service:
  pipelines:
    traces:
      receivers: [datadog, jaeger]
      processors: [batch]
      exporters: [otlp_grpc, debug]

    metrics:
      receivers: [datadog, statsd, splunk_hec]
      processors: [batch]
      exporters: [otlp_grpc, debug]

    logs:
      receivers: [datadog, splunk_hec]
      processors: [batch]
      exporters: [otlp_grpc, debug]
```

`splunk_hec` is wired into BOTH logs and metrics pipelines per upstream's recommendation; it routes events to the correct signal type internally. `datadog` is wired into all three pipelines, although its metrics and logs endpoints are Development status upstream.

The `batch` processor is present to avoid hammering the base collector with one RPC per span/event. No other processing happens here.

## Resource usage

The compat collector is intentionally lightweight. Default memory limit is `256M` (configurable via `COMPAT_COLLECTOR_MEMORY_LIMIT`). In practice, idle memory is well under 100 MB.

compat/collector/config.compat.yaml

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
# config.compat.yaml
#
# OTel Collector config for the COMPAT EDGE collector: a dedicated translation
# tier that accepts vendor wire protocols and forwards OTLP to the base collector.
#
# Architecture:
#
#   vendor apps ──▶ otel-collector-compat ──OTLP──▶ otel-collector (base)
#                       (this config)                    (unchanged)
#
# The base collector does all the real work (processors, Data Prepper export,
# Prometheus export, etc.). This one just translates and forwards. Keeping it
# minimal means zero drift risk with the base config.
#
# Source of truth for receiver configs: collector-contrib v0.151.0 upstream READMEs.

receivers:
  # === Datadog APM (datadogreceiver) ===
  # Accepts Datadog trace-agent protocol on port 8126.
  # Source: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.151.0/receiver/datadogreceiver/README.md
  # Upstream stability: alpha (traces); metrics endpoints "Development" status
  datadog:
    endpoint: 0.0.0.0:8126
    read_timeout: 60s
    trace_id_cache_size: 100

  # === DogStatsD / StatsD metrics (statsdreceiver) ===
  # Accepts StatsD protocol on port 8125/udp. Handles DogStatsD tag extensions.
  # Source: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.151.0/receiver/statsdreceiver
  # Upstream stability: beta
  statsd:
    endpoint: 0.0.0.0:8125
    aggregation_interval: 60s
    enable_metric_type: true
    is_monotonic_counter: false

  # === Jaeger (jaegerreceiver) ===
  # Accepts Jaeger native wire protocols on standard Jaeger collector ports.
  # Source: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.151.0/receiver/jaegerreceiver
  # Upstream stability: beta
  #
  # NOTE: As of Jaeger v1.42 (Feb 2023), Jaeger itself emits OTLP, so modern Jaeger
  # users should send OTLP directly to the base collector on 4317/4318 (no compat
  # hop needed). This receiver is for LEGACY apps still on archived
  # jaeger-client-{go,java,python} libraries.
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268

  # === Splunk HEC (splunkhecreceiver) ===
  # Accepts Splunk HTTP Event Collector format on port 8088.
  # Source: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.151.0/receiver/splunkhecreceiver
  # Upstream stability: beta (metrics, logs)
  splunk_hec:
    endpoint: 0.0.0.0:8088

  # NOTE: signalfxreceiver is deprecated upstream (as of 2026-02-13).
  # Upstream migration path is OTLP, not translation through a deprecated receiver.
  # Deliberately omitted from this compat layer.

processors:
  # Minimal processing: the base collector does the real work. We just batch
  # to avoid hammering the base collector with one RPC per span.
  batch:
    timeout: 10s
    send_batch_size: 1024

exporters:
  # Forward everything to the base collector as OTLP. All downstream routing
  # (to Data Prepper, Prometheus, etc.) is handled there.
  otlp_grpc:
    endpoint: otel-collector:4317
    tls:
      insecure: true

  debug:
    verbosity: basic

service:
  pipelines:
    traces:
      receivers: [datadog, jaeger]
      processors: [batch]
      exporters: [otlp_grpc, debug]

    metrics:
      receivers: [datadog, statsd, splunk_hec]
      processors: [batch]
      exporters: [otlp_grpc, debug]

    logs:
      receivers: [datadog, splunk_hec]
      processors: [batch]
      exporters: [otlp_grpc, debug]

  telemetry:
    logs:
      level: info

compat/vendors/datadog/README.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
# Datadog → observability-stack

Point Datadog-instrumented apps at observability-stack by changing one environment variable.

## Protocol support

observability-stack runs the upstream [`datadogreceiver`](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.151.0/receiver/datadogreceiver/README.md) and [`statsdreceiver`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.151.0/receiver/statsdreceiver) from `opentelemetry-collector-contrib` v0.151.0.

| Signal | Receiver | Port | Upstream endpoint status |
|--------|----------|------|--------------------------|
| Traces | `datadogreceiver` | 8126/tcp | Alpha |
| Metrics | `datadogreceiver` | 8126/tcp | Development |
| Logs | `datadogreceiver` | 8126/tcp | Development |
| DogStatsD metrics | `statsdreceiver` | 8125/udp | Beta |

"Development" / "Alpha" / "Beta" are stability tiers documented in the upstream receiver README. Refer to it for the full API-endpoint support matrix and current maturity.

## Pointing your agents at us

Datadog tracer libraries (`dd-trace-py`, `dd-trace-java`, `dd-trace-go`, `dd-trace-rb`, `dd-trace-js`, `dd-trace-dotnet`, `dd-trace-php`) all support endpoint override via environment variables:

| Variable | Default | Notes |
|----------|---------|-------|
| `DD_AGENT_HOST` | `localhost` | Agent hostname |
| `DD_TRACE_AGENT_PORT` | `8126` | Agent trace port |
| `DD_TRACE_AGENT_URL` | (none) | Full URL override |

For DogStatsD clients, point them at port 8125/udp on the host running observability-stack.

For DD-tracer configuration details, see each tracer library's documentation directly, since Datadog owns those APIs.

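For a quick smoke test without a DogStatsD client library, a datagram can be formatted by hand. The helper below is a sketch: `dogstatsd_line` is a hypothetical name, and it covers only the basic `name:value|type|#tags` form of the DogStatsD datagram format (no sample rate).

```bash
# dogstatsd_line is a hypothetical helper that formats one DogStatsD datagram:
#   <name>:<value>|<type>[|#<tag>,<tag>...]
# Types include c (counter), g (gauge), ms (timer), and h (histogram).
dogstatsd_line() {
  name="$1"; value="$2"; type="$3"
  shift 3
  payload="${name}:${value}|${type}"
  if [ "$#" -gt 0 ]; then
    # Join the remaining arguments with commas to form the tag section.
    tags="$(IFS=,; printf '%s' "$*")"
    payload="${payload}|#${tags}"
  fi
  printf '%s\n' "$payload"
}
```

In bash, `dogstatsd_line compat.smoke 1 c env:dev > /dev/udp/127.0.0.1/8125` sends the datagram to the default port (the `/dev/udp` redirection is a bash feature, not POSIX). Because the compat config sets `aggregation_interval: 60s` on the `statsd` receiver, the metric may take up to a minute to be forwarded.
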
## Canonical attribute mapping

These three Datadog identifiers map to the OTel equivalents that the upstream `datadogreceiver` documents as default behavior. For every other attribute, defer to the upstream receiver README and Datadog's own [tagging documentation](https://docs.datadoghq.com/getting_started/tagging/).

| Datadog | OTel semantic convention |
|---------|--------------------------|
| `service` | `service.name` |
| `env` | `deployment.environment` |
| `version` | `service.version` |

## Deployment modes

1. **Greenfield**: observability-stack on port 8126
2. **Side-by-side**: real Datadog Agent on 8126, observability-stack on a remapped port via `COMPAT_DATADOG_APM_PORT=8127`. Useful for pre-migration validation.
3. **Full replacement**: Datadog Agent removed, observability-stack on 8126

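In side-by-side mode the tracer has to be pointed at the remapped port explicitly. As a sketch, the value for `DD_TRACE_AGENT_URL` can be derived from the same variable that remaps the compat port; `compat_dd_trace_url` is a hypothetical helper, and it assumes the tracer reaches the compat collector over plain HTTP on the published host port.

```bash
# compat_dd_trace_url is a hypothetical helper: it prints the value to export as
# DD_TRACE_AGENT_URL so a tracer targets the compat collector instead of the real Agent.
# $1: host running observability-stack (defaults to localhost).
compat_dd_trace_url() {
  host="${1:-localhost}"
  # Falls back to 8126 when COMPAT_DATADOG_APM_PORT is unset or empty.
  printf 'http://%s:%s\n' "$host" "${COMPAT_DATADOG_APM_PORT:-8126}"
}
```

With `COMPAT_DATADOG_APM_PORT=8127` set, `export DD_TRACE_AGENT_URL="$(compat_dd_trace_url)"` points one test service at the compat collector while every other service keeps reporting to the real Agent on 8126.
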
## What isn't covered

- Live processes, profiling, network monitoring: no open-source OTel receiver exists
- Synthetic monitoring: not observability ingest
- Datadog UI features (notebooks, SLOs, monitors): use the OpenSearch Dashboards equivalents

## Caveats

- All `datadogreceiver` paths are **Alpha** stability or below upstream (traces are Alpha; metrics and logs are Development). Only the DogStatsD path via `statsdreceiver` is Beta. Don't push real Datadog production traffic through without evaluating what fidelity loss is acceptable for your use case.
- 128-bit trace IDs from Datadog-instrumented services are **gated behind a feature flag** (`receiver.datadogreceiver.Enable128BitTraceID`), disabled by default. Traces that begin in an OTel-instrumented service and then hit a Datadog-instrumented service may not correlate correctly without this flag.
- Metric temporality may need a [`deltatocumulativeprocessor`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.151.0/processor/deltatocumulativeprocessor) in the pipeline for backends expecting cumulative temporality. Validate before production use.
- For the full list of supported Datadog API endpoints and per-endpoint caveats, read the [upstream receiver README](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.151.0/receiver/datadogreceiver/README.md).

## Source of truth

For attribute schemas, mapping rules, and translation behavior beyond the canonical three above, defer to:

- [Upstream `datadogreceiver` README (v0.151.0)](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.151.0/receiver/datadogreceiver/README.md)
- [Upstream `statsdreceiver` README (v0.151.0)](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.151.0/receiver/statsdreceiver)
- [Datadog unified tagging](https://docs.datadoghq.com/getting_started/tagging/unified_service_tagging/)
