6 changes: 3 additions & 3 deletions .env
Original file line number Diff line number Diff line change
@@ -11,11 +11,11 @@ INCLUDE_COMPOSE_EXAMPLES=docker-compose.examples.yml
INCLUDE_COMPOSE_LOCAL_OPENSEARCH=docker-compose.local-opensearch.yml
INCLUDE_COMPOSE_LOCAL_OPENSEARCH_DASHBOARDS=docker-compose.local-opensearch-dashboards.yml

-OPENSEARCH_DOCKER_REPO=opensearchproject
+OPENSEARCH_DOCKER_REPO=opensearchstaging


# OpenSearch Configuration
-OPENSEARCH_VERSION=3.6.0
+OPENSEARCH_VERSION=3.7.0
OPENSEARCH_USER=admin
OPENSEARCH_PASSWORD='My_password_123!@#'
OPENSEARCH_HOST=opensearch
@@ -24,7 +24,7 @@ OPENSEARCH_PROTOCOL=https
OPENSEARCH_JAVA_OPTS=-Xms1g -Xmx1g

# OpenSearch Dashboards Configuration
-OPENSEARCH_DASHBOARDS_VERSION=3.6.0
+OPENSEARCH_DASHBOARDS_VERSION=3.7.0
OPENSEARCH_DASHBOARDS_HOST=opensearch-dashboards
OPENSEARCH_DASHBOARDS_PORT=5601
OPENSEARCH_DASHBOARDS_PROTOCOL=http
29 changes: 29 additions & 0 deletions .env.splunk-poc.example
@@ -0,0 +1,29 @@
# Splunk Distribution POC credentials. Copy to .env.splunk-poc (gitignored).
#
# This POC sends to TWO Splunk products:
# 1. Splunk Observability Cloud (traces via APM, metrics via IM)
# 2. Splunk Cloud Platform (logs via HEC, searchable in Splunk Search)
#
# Splunk Observability Cloud:
# SPLUNK_ACCESS_TOKEN: Organization > Access Tokens (ingest scope)
# SPLUNK_REALM: visible in the Splunk Observability Cloud URL (us0, us1, eu0, au0)
#
# Splunk Cloud Platform:
# SPLUNK_HEC_TOKEN: Settings > Data Inputs > HTTP Event Collector > your token
# SPLUNK_HEC_URL: https://<your-stack>.splunkcloud.com:8088/services/collector
# SPLUNK_HEC_INDEX: index the token is allowed to write to (default: main)

SPLUNK_ACCESS_TOKEN=
SPLUNK_REALM=us1

SPLUNK_HEC_TOKEN=
SPLUNK_HEC_URL=https://prd-p-XXXXX.splunkcloud.com:8088/services/collector
SPLUNK_HEC_INDEX=main

# Redirect otel-demo apps through the Splunk collector.
OTEL_COLLECTOR_HOST=splunk-otel-collector

# Activate the otel-demo overlay via the base compose's include directive.
INCLUDE_COMPOSE_OTEL_DEMO=docker-compose.otel-demo.yml

SPLUNK_COLLECTOR_VERSION=latest
4 changes: 4 additions & 0 deletions .gitignore
@@ -30,3 +30,7 @@ charts/*/charts/*.tgz

# Terraform plan files
*.plan

# Splunk distribution POC
.env.splunk-poc
docker-compose/splunk-otel-collector/*.local.*
48 changes: 48 additions & 0 deletions CONTRIBUTING.md
@@ -398,6 +398,7 @@ AI coding assistants are welcome to contribute! When contributing as an AI agent
- Note performance considerations
- Reference relevant specifications
- Keep comments up to date
- Avoid narrative prose and internal team voice (e.g., "we deliberately...") — comments are for future maintainers, not decision history

### Examples

@@ -407,6 +408,53 @@ AI coding assistants are welcome to contribute! When contributing as an AI agent
- Follow language-specific conventions
- Test examples before committing

### Writing Tenets (for agents)

Applies to all user-facing documentation: READMEs, public docs under `docs/`, migration guides, and PR descriptions.

**Framing**

- No us-vs-them. Write "Point your agents at observability-stack," not "at us." In open-source projects, the reader is part of the same community.
- Don't leak internal conversation into docs. Design-doc voice ("this is the canonical case where...", "we deliberately omitted...") belongs in PR discussion or design documents, not user-facing artifacts.
- Factual, not promotional. Avoid marketing phrases like "does ONE thing," "zero drift risk," or "honest limits."
- Acknowledge nuance via asides (`:::note` in Starlight docs) or italic notes, not prose digressions.

**Maintenance hygiene**

- Don't pin version strings. Link to `main` of upstream repos (e.g., `opentelemetry-collector-contrib/tree/main/receiver/...`), not to a specific tag. Version pins go stale.
- Don't duplicate source code in docs. Config YAML, pipeline definitions, and translation tables drift from the real source. Link to the source file instead.
- Don't maintain per-vendor translation tables beyond 3–5 canonical well-known fields. Defer to upstream receiver READMEs and vendor documentation. Positioning this repo as a schema authority creates permanent maintenance burden.
- Repo READMEs should link to public docs, not duplicate them. One source of truth per content type.

**Accuracy**

- Verify specific claims before writing them. Dates, version numbers, protocol behavior, UI terminology — check primary sources.
- If a claim cannot be verified from primary sources, phrase it more vaguely. "Modern versions support X" beats "as of v1.42, X is supported" when the version claim is unverified.
- Check existing conventions. Before using a UI name or terminology, grep the rest of the docs to see what other pages call it.
- Run the documentation build (`npm run build` in `docs/starlight-docs`) before committing doc changes. Verify internal links are valid.
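The convention check and the build step above can be sketched as a small shell helper. This is an illustrative sketch only: `find_term_usage` is a hypothetical name, and the `*.md` / `*.mdx` file patterns are an assumption about how the docs tree is laid out.

```bash
# List every line in the docs tree that mentions a term, so the existing
# spelling or UI name can be matched before introducing a new one.
# $1 = term to look up, $2 = directory to search (defaults to docs/).
find_term_usage() {
  grep -rn --include='*.md' --include='*.mdx' -- "$1" "${2:-docs}"
}
```

For example, `find_term_usage "OpenSearch Dashboards"` shows how other pages refer to the UI, and `(cd docs/starlight-docs && npm run build)` then runs the documentation build before the commit.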

**Public-doc page structure**

Pages for users migrating TO observability-stack should cover, in order:

1. Action-oriented lead (one sentence — what the reader can do)
2. Decision table when multiple paths exist ("Do I need this?" / "Which path applies?")
3. Configuration — concrete environment variables, example config, code snippet per path
4. Verify step — one-command check that it's working
5. What lands in OpenSearch — concrete example of end state (field names, index patterns)
6. Caveats — real observed gotchas surfaced from validation, not theoretical ones
7. Not covered — honest scope boundaries
8. References — upstream sources, vendor docs

**Repo READMEs** are for contributors, not migrators. Keep them short (20–40 lines for leaf READMEs; 100 max for overview). Link out to public docs for user-facing content. Include repo-local context only: config file paths, upstream receiver links, local dev workflow commands.

**Caveats from real validation are more trustworthy than theoretical ones.** When end-to-end testing reveals a gotcha (e.g., an attribute gets overwritten, a field doesn't translate), document it in the caveats section. Lead with what the user will see, not why it happens.

**Scope discipline**

- Prune aggressively when in doubt. Deletion is cheaper than maintenance.
- Don't commit working files — audit tables, compatibility matrices, session notes, TODO lists, WIP drafts. If it's not useful to a future reader with no context, it's not a docs artifact.

## Community

### Getting Help
102 changes: 102 additions & 0 deletions compat/README.md
@@ -0,0 +1,102 @@
# Vendor Compatibility Overlay

User-facing migration guides: https://observability.opensearch.org/docs/send-data/from-vendor/

This overlay adds a dedicated OpenTelemetry Collector that accepts Datadog, Jaeger, and Splunk HEC wire protocols, translates them to OTLP, and forwards to the base collector. No application code changes are required — vendor agents are pointed at observability-stack by changing an endpoint URL.

## Do I need this overlay?

| If your apps emit... | Need this overlay? |
|----------------------|--------------------|
| OpenTelemetry OTLP (gRPC or HTTP) | No. Send directly to the base collector on 4317 or 4318. |
| Datadog (dd-trace-*, DogStatsD) | Yes. |
| Jaeger native wire protocol (`jaeger-client-*`) | Yes. |
| Splunk HEC | Yes. |
| Jaeger via modern OpenTelemetry SDK + OTLP | No. |

## Architecture

```
vendor agents ──▶ otel-collector-compat ──OTLP──▶ otel-collector (base, unchanged) ──▶ Data Prepper / Prometheus ──▶ OpenSearch
                  (this overlay)

OTLP apps ─────────────────────────────────▶ (direct to base, no compat hop)
```

The compat collector uses upstream [`opentelemetry-collector-contrib`](https://github.com/open-telemetry/opentelemetry-collector-contrib) receivers. All enrichment and downstream routing happens in the base pipeline — the compat config is purely ingest + forward.

## Activation

```bash
echo "INCLUDE_COMPOSE_COMPAT=docker-compose.compat.yml" >> .env
docker compose up -d
```

This adds the `otel-collector-compat` service and the bundled Jaeger `hotrod` demo to the stack.

### Verify it's running

```bash
docker compose ps otel-collector-compat
curl -sI http://localhost:8126/info # HTTP 200 = Datadog receiver is live
```
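As a complement to the Datadog check above, a Splunk HEC round trip can be sketched as follows. This is a minimal sketch under stated assumptions: the receiver is reachable on the default host port 8088, accepts events on `/services/collector`, and does not enforce a token. `hec_event_json`, `send_hec_event`, and `HEC_TOKEN` are hypothetical names used only for illustration.

```bash
# Build a minimal HEC event body. Values are inserted verbatim, so they
# must not contain characters that need JSON escaping.
hec_event_json() {
  printf '{"event":"%s","sourcetype":"%s"}' "$1" "$2"
}

# POST one event to the compat collector's HEC endpoint.
# Port and path are assumptions; the port follows COMPAT_SPLUNK_HEC_PORT if set.
send_hec_event() {
  curl -s -X POST "http://localhost:${COMPAT_SPLUNK_HEC_PORT:-8088}/services/collector" \
    -H "Authorization: Splunk ${HEC_TOKEN:-placeholder-token}" \
    -d "$(hec_event_json "$1" "${2:-compat-smoke-test}")"
}
```

Calling `send_hec_event "hello from compat"` and then following `docker compose logs -f otel-collector-compat` should show whether a log record reached the pipeline.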

## Supported vendors

| Vendor | Receiver(s) | Default ports | Repo notes |
|--------|-------------|---------------|------------|
| Datadog | `datadogreceiver`, `statsdreceiver` | 8126/tcp, 8125/udp | [vendors/datadog/](vendors/datadog/) |
| Jaeger (legacy wire protocol) | `jaegerreceiver` | 14250/tcp, 14268/tcp | [vendors/jaeger/](vendors/jaeger/) |
| Splunk HEC | `splunkhecreceiver` | 8088/tcp | [vendors/splunk/](vendors/splunk/) |

User-facing migration guides live at https://observability.opensearch.org/docs/send-data/from-vendor/.

SignalFx is not supported. The upstream `signalfxreceiver` is deprecated with explicit guidance to migrate to OTLP.

## Deployment modes

Each vendor supports greenfield, side-by-side, and full-replacement modes. See the public migration guide for each vendor for specifics.

## Port customization

Ports are remappable via environment variables, which is useful when a real vendor agent already occupies the default port on the host.

| Variable | Default | Receiver |
|----------|---------|----------|
| `COMPAT_DATADOG_APM_PORT` | 8126 | Datadog trace-agent |
| `COMPAT_DATADOG_STATSD_PORT` | 8125 | DogStatsD |
| `COMPAT_JAEGER_GRPC_PORT` | 14250 | Jaeger gRPC |
| `COMPAT_JAEGER_THRIFT_HTTP_PORT` | 14268 | Jaeger Thrift HTTP |
| `COMPAT_SPLUNK_HEC_PORT` | 8088 | Splunk HEC |
| `COMPAT_COLLECTOR_MEMORY_LIMIT` | 256M | Compat collector memory limit |
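
One way to apply an override is to rewrite the variable in `.env` before starting the stack. The sketch below assumes the overlay reads these variables from `.env`, as the activation step does for `INCLUDE_COMPOSE_COMPAT`. `set_env_var` is a hypothetical helper, and 18126 is an arbitrary example port.

```bash
# Set KEY=VALUE in an env file, replacing any earlier assignment of KEY.
# $1 = env file, $2 = variable name, $3 = new value.
set_env_var() {
  file=$1 key=$2 value=$3
  touch "$file"
  # Keep every line except previous assignments of the key.
  grep -v "^${key}=" "$file" > "${file}.tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "${file}.tmp"
  mv "${file}.tmp" "$file"
}
```

For example, `set_env_var .env COMPAT_DATADOG_APM_PORT 18126` followed by `docker compose up -d` would move the Datadog APM listener off 8126, assuming the compose file maps that variable to the host port.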

## Attribute translation

Each receiver translates vendor-specific data to the OpenTelemetry data model. Translation behavior is defined by the upstream receivers. For schema details, consult:

- The upstream receiver READMEs under [`opentelemetry-collector-contrib/receiver/`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver)
- The vendor's own instrumentation and tagging documentation

## Directory layout

```
compat/
├── README.md                  ← this file
├── collector/
│   ├── config.compat.yaml     ← compat collector config
│   └── README.md              ← compat collector design notes
└── vendors/
    ├── datadog/README.md      ← developer notes + link to migration guide
    ├── jaeger/README.md
    └── splunk/README.md

docker-compose.compat.yml      ← overlay service definitions
```

## Adding a vendor

1. Create `vendors/<name>/README.md` with a link to the (forthcoming) public migration guide and developer notes (receiver used, config location, quick local test).
2. Add the receiver stanza to `collector/config.compat.yaml` and wire it into the appropriate pipeline(s).
3. Add port mappings to `docker-compose.compat.yml`.
4. Add a page at `docs/starlight-docs/src/content/docs/send-data/from-vendor/<name>.md` with the user migration guide.
5. Verify end-to-end: send vendor-format data → confirm it lands in OpenSearch.
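
The end-to-end check in step 5 can be sketched as a query against OpenSearch after sending vendor-format data. This is a sketch under assumptions: OpenSearch is reachable from the host on port 9200 with a self-signed certificate, `OPENSEARCH_PORT` is a hypothetical variable name, and `opensearch_url` and `list_indices` are illustrative helpers. Credentials come from `OPENSEARCH_USER` and `OPENSEARCH_PASSWORD` in `.env`.

```bash
# Compose the OpenSearch base URL from .env-style variables.
# Defaults (https, localhost, 9200) are assumptions for a local stack.
opensearch_url() {
  printf '%s://%s:%s' "${OPENSEARCH_PROTOCOL:-https}" \
    "${OPENSEARCH_HOST:-localhost}" "${OPENSEARCH_PORT:-9200}"
}

# List indices so newly created ones are visible after a test send.
# -k skips certificate verification, which is only acceptable for local testing.
list_indices() {
  curl -sk -u "${OPENSEARCH_USER:-admin}:${OPENSEARCH_PASSWORD}" \
    "$(opensearch_url)/_cat/indices?v"
}
```

Running `list_indices` before and after sending test data gives a quick view of which indices received documents.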
48 changes: 48 additions & 0 deletions compat/collector/README.md
@@ -0,0 +1,48 @@
# Compat Collector

Developer notes for `otel-collector-compat` and [`config.compat.yaml`](./config.compat.yaml).

For migration guides and usage, see the public docs: https://observability.opensearch.org/docs/send-data/from-vendor/

## Role

Accepts vendor wire protocols on their native ports, translates to the OpenTelemetry data model via upstream [`opentelemetry-collector-contrib`](https://github.com/open-telemetry/opentelemetry-collector-contrib) receivers, and forwards OTLP to the base collector. All enrichment, filtering, and downstream routing (Data Prepper, Prometheus) happens in the base collector config.

```
vendor apps ──▶ otel-collector-compat ──OTLP──▶ otel-collector (base)
                [config.compat.yaml]            [unchanged]
```

The compat config contains only receivers, the `batch` processor, and an OTLP exporter pointed at the base collector. No transforms.

## Receivers

| Receiver | Upstream |
|----------|----------|
| `datadog` | [`datadogreceiver`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/datadogreceiver) |
| `statsd` | [`statsdreceiver`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/statsdreceiver) |
| `jaeger` | [`jaegerreceiver`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/jaegerreceiver) |
| `splunk_hec` | [`splunkhecreceiver`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/splunkhecreceiver) |

Pipeline wiring lives in [`config.compat.yaml`](./config.compat.yaml) under `service.pipelines`.

## Local dev workflow

Edit `config.compat.yaml`, then:

```bash
docker compose restart otel-collector-compat
docker compose logs -f otel-collector-compat
```
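
Before restarting, the edited file can be checked with the collector's `validate` subcommand. The sketch below is an assumption-laden example rather than part of the documented workflow: the image tag is a guess (prefer whatever `docker-compose.compat.yml` uses), and `compat_config_path` and `validate_compat_config` are hypothetical helper names.

```bash
# Path of the compat config under a repo root (defaults to the current directory).
compat_config_path() {
  printf '%s/compat/collector/config.compat.yaml' "${1:-$(pwd)}"
}

# Run `validate` against the config in a throwaway container.
# The image tag below is an assumption, not taken from this repo.
validate_compat_config() {
  docker run --rm \
    -v "$(compat_config_path "$1"):/etc/otelcol/config.yaml:ro" \
    otel/opentelemetry-collector-contrib:latest \
    validate --config=/etc/otelcol/config.yaml
}
```

Running `validate_compat_config` from the repo root exits non-zero when the config fails to load, which catches YAML and wiring mistakes before the service restarts.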

The `debug` exporter is wired into every pipeline. To see what's flowing through, bump its verbosity to `detailed`:

```yaml
exporters:
  debug:
    verbosity: detailed
```

## Resource limits

Default memory limit: 256MB (`COMPAT_COLLECTOR_MEMORY_LIMIT`). Idle usage is typically under 100MB.
78 changes: 78 additions & 0 deletions compat/collector/config.compat.yaml
@@ -0,0 +1,78 @@
# config.compat.yaml
#
# Edge collector config. Translates vendor wire protocols to OTLP and forwards
# to the base collector, which handles enrichment and downstream export.
#
#   vendor apps ──▶ otel-collector-compat ──OTLP──▶ otel-collector (base)
#                   (this config)                   (unchanged)

receivers:
  # Datadog trace-agent protocol. Also accepts metrics and logs endpoints;
  # all three are wired into the service pipelines below.
  # See: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/datadogreceiver
  datadog:
    endpoint: 0.0.0.0:8126
    read_timeout: 60s
    trace_id_cache_size: 100

  # StatsD / DogStatsD over UDP.
  # See: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/statsdreceiver
  statsd:
    endpoint: 0.0.0.0:8125
    aggregation_interval: 60s
    enable_metric_type: true
    is_monotonic_counter: false

  # Jaeger native wire protocol (Thrift HTTP + gRPC). For legacy
  # jaeger-client-* applications. Modern Jaeger apps emit OTLP and send
  # directly to the base collector on 4317/4318 without this hop.
  # See: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/jaegerreceiver
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268

  # Splunk HTTP Event Collector (logs and metrics).
  # See: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/splunkhecreceiver
  splunk_hec:
    endpoint: 0.0.0.0:8088

processors:
  # Batch spans/events to reduce RPC volume against the base collector.
  batch:
    timeout: 10s
    send_batch_size: 1024

exporters:
  # Forward to the base collector. Downstream routing (Data Prepper,
  # Prometheus, etc.) is configured there.
  otlp_grpc:
    endpoint: otel-collector:4317
    tls:
      insecure: true

  debug:
    verbosity: basic

service:
  pipelines:
    traces:
      receivers: [datadog, jaeger]
      processors: [batch]
      exporters: [otlp_grpc, debug]

    metrics:
      receivers: [datadog, statsd, splunk_hec]
      processors: [batch]
      exporters: [otlp_grpc, debug]

    logs:
      receivers: [datadog, splunk_hec]
      processors: [batch]
      exporters: [otlp_grpc, debug]

  telemetry:
    logs:
      level: info
21 changes: 21 additions & 0 deletions compat/vendors/datadog/README.md
@@ -0,0 +1,21 @@
# Datadog

Migration guide and user-facing documentation: https://observability.opensearch.org/docs/send-data/from-vendor/datadog/

## Receivers used

- [`datadogreceiver`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/datadogreceiver) — traces, metrics, logs (8126/tcp)
- [`statsdreceiver`](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/statsdreceiver) — DogStatsD (8125/udp)

Config: [`../../collector/config.compat.yaml`](../../collector/config.compat.yaml) — `datadog:` and `statsd:` receiver blocks.

## Quick local test

```bash
# DogStatsD metric
echo "test.metric:1|c|#env:dev" | nc -u -w1 localhost 8125

# Datadog trace payloads are msgpack-encoded; see the upstream receiver README for format details
```

Traces are best exercised by pointing a `dd-trace-*` SDK application at `localhost:8126`.
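
To exercise more than one metric type, the DogStatsD line from the quick test can be wrapped in a small helper. This is a sketch, not part of the overlay: `dogstatsd_line` and `send_dogstatsd` are hypothetical names, and the datagram layout `<name>:<value>|<type>|#<tags>` follows the DogStatsD format used in the example above.

```bash
# Format one DogStatsD datagram: <name>:<value>|<type>[|#<tags>]
# $1 = metric name, $2 = value, $3 = type (c, g, ms, h, ...), $4 = optional tags.
dogstatsd_line() {
  if [ -n "${4:-}" ]; then
    printf '%s:%s|%s|#%s' "$1" "$2" "$3" "$4"
  else
    printf '%s:%s|%s' "$1" "$2" "$3"
  fi
}

# Send the datagram over UDP to the compat collector's StatsD port.
send_dogstatsd() {
  dogstatsd_line "$@" | nc -u -w1 localhost "${COMPAT_DATADOG_STATSD_PORT:-8125}"
}
```

For example, `send_dogstatsd test.gauge 42 g env:dev` sends a gauge. With the `aggregation_interval: 60s` setting in `config.compat.yaml`, the metric may take up to a minute to be flushed downstream.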