26 changes: 26 additions & 0 deletions examples/log-ingestion-otel/README.md
@@ -0,0 +1,26 @@
# Data Prepper Log Ingestion from OpenTelemetry Collector

This is an example of using the OpenTelemetry Collector to send log data to Data Prepper and then to OpenSearch.
The Data Prepper OTLP/gRPC endpoint is exposed at port 21892.
The same protocol can be used with the OpenTelemetry Collector, which listens at the OTLP default port 4317.
This setup lets you compare both endpoints.
The Collector will forward any data to Data Prepper for indexing in OpenSearch.

To generate some demo data, the OpenTelemetry Collector uses its filelog receiver to read the JSON log lines in `test.log`.
Additionally, it accepts logs from other sources at its OTLP/gRPC endpoint.
This lets you investigate both kinds of input in OpenSearch.

To run:

1. Run `docker compose up`
2. Wait for everything to come up.
3. Log into OpenSearch Dashboards at <http://localhost:5601> using username `admin` and password `Developer@123`.
4. Create an Index Pattern for index `otel_logs` choosing `time` as the time field.
5. Inspect the data in the Discover plugin.
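
As a quick check outside Dashboards, the steps above can be complemented by asking OpenSearch how many documents reached the `otel_logs` index. The following is a minimal sketch, not part of the example itself. It assumes the cluster is reachable at `https://localhost:9200` with the demo credentials and a self-signed certificate; the helper names `build_count_request` and `count_documents` are hypothetical.

```python
import base64
import json
import ssl
import urllib.request


def build_count_request(base_url: str, index: str, user: str, password: str) -> urllib.request.Request:
    """Build a GET request for the index's _count endpoint with HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{index}/_count",
        headers={"Authorization": f"Basic {token}"},
    )


def count_documents(request: urllib.request.Request) -> int:
    """Send the request and return the document count from the JSON response."""
    # Assumption: the demo cluster uses a self-signed certificate, so
    # verification is disabled. Do not do this outside a local demo.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return json.load(response)["count"]


# Usage against the running demo stack:
# req = build_count_request("https://localhost:9200", "otel_logs", "admin", "Developer@123")
# print(count_documents(req))
```

A count of zero after the stack has settled suggests looking at the Collector output and at `dlq.log` before creating the index pattern.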

Useful changes and additions:

1. The OpenTelemetry Collector has its [Logging Exporter](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/loggingexporter/README.md) in use. Changing the `loglevel` to `debug` or setting the `verbosity` to `detailed` will log all data to stdout. This is useful for troubleshooting.
2. The OpenTelemetry Collector can push its own metrics to Data Prepper. Follow its documentation in [Internal telemetry](https://opentelemetry.io/docs/collector/internal-telemetry/#use-internal-telemetry-to-monitor-the-collector) for details. These metrics allow comparing the event counts between the Collector and Data Prepper.
3. The OpenTelemetry Collector can be configured to translate between OTLP/HTTP and OTLP/gRPC. It can be used to proxy between sources only capable of OTLP/HTTP and Data Prepper, which only supports OTLP/gRPC.
4. The OpenTelemetry Collector can receive data from the Docker host. It can attach metadata describing the containers. Unfortunately, the required processor does not work on macOS, so this configuration is not provided in this example.
Empty file.
71 changes: 71 additions & 0 deletions examples/log-ingestion-otel/docker-compose.yaml
@@ -0,0 +1,71 @@
version: '3'
services:
  data-prepper:
    image: opensearchproject/data-prepper
    container_name: data-prepper
    volumes:
      - ./log_pipeline.yaml:/usr/share/data-prepper/pipelines/log_pipeline.yaml
      - ../data-prepper-config.yaml:/usr/share/data-prepper/config/data-prepper-config.yaml
      - ./dlq.log:/var/log/dlq.log:rw
    ports:
      - 2021:2021
      - 21892:21892
      - 4900:4900
    expose:
      - "2021"
      - "4900"
      - "21892"
    networks:
      - opensearch-net
    depends_on:
      - opensearch
  opensearch:
    container_name: opensearch
    image: docker.io/opensearchproject/opensearch:latest
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
      - "OPENSEARCH_INITIAL_ADMIN_PASSWORD=Developer@123"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the OpenSearch user, set to at least 65536 on modern systems
        hard: 65536
    ports:
      - 9200:9200
      - 9600:9600 # required for Performance Analyzer
    networks:
      - opensearch-net
  dashboards:
    image: docker.io/opensearchproject/opensearch-dashboards:latest
    container_name: opensearch-dashboards
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      OPENSEARCH_HOSTS: '["https://opensearch:9200"]'
    depends_on:
      - opensearch
    networks:
      - opensearch-net
  otel-collector:
    image: otel/opentelemetry-collector-contrib
    container_name: otel-collector
    command: ["--config=/etc/otel-collector-config.yml"]
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
      - ./test.log:/var/log/test.log
    environment:
      OTEL_RESOURCE_ATTRIBUTES: service.name=otel-collector
    ports:
      - 4317:4317
    depends_on:
      - data-prepper
    networks:
      - opensearch-net
networks:
  opensearch-net:
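
The compose file publishes several ports on the host, which gives a rough way to automate the "wait for everything to come up" step. The sketch below polls those ports with plain TCP connections. It is an illustration only: the helper names `port_open` and `wait_for_ports` and the default timings are assumptions, not part of the example.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ports(host: str, ports: list[int], deadline_s: float = 120.0,
                   interval_s: float = 2.0) -> list[int]:
    """Poll until every port accepts connections or the deadline passes.

    Returns the ports that are still closed; an empty list means all are up.
    """
    end = time.monotonic() + deadline_s
    pending = list(ports)
    while pending:
        pending = [p for p in pending if not port_open(host, p)]
        if not pending or time.monotonic() >= end:
            break
        time.sleep(interval_s)
    return pending


# Usage with ports published by the compose file above:
# still_closed = wait_for_ports("localhost", [9200, 5601, 21892, 4317])
```

An open port is only a rough signal. Docker may accept connections on a published port before the service inside the container is ready, so a follow-up request against the service itself is still advisable.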
15 changes: 15 additions & 0 deletions examples/log-ingestion-otel/log_pipeline.yaml
@@ -0,0 +1,15 @@
log-pipeline:
  source:
    otel_logs_source:
      ssl: false
      output_format: otel
  processor:
  sink:
    - opensearch:
        hosts: [ "https://opensearch:9200" ]
        insecure: true
        username: admin
        password: Developer@123
        index: otel_logs
        index_type: log-analytics-plain
        dlq_file: /var/log/dlq.log
56 changes: 56 additions & 0 deletions examples/log-ingestion-otel/otel-collector-config.yml
@@ -0,0 +1,56 @@
receivers:
  filelog:
    include: [ /var/log/test.log ]
    start_at: beginning
    operators:
      - type: json_parser
        body: attributes.body
        timestamp:
          parse_from: attributes.time
          layout: "%Y-%m-%dT%H:%M:%S.%L%z"
        severity:
          parse_from: attributes.severity
        trace:
          trace_id:
            parse_from: attributes.trace_id
          span_id:
            parse_from: attributes.span_id
          trace_flags:
            parse_from: attributes.trace_flags
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  otlp/logs:
    endpoint: data-prepper:21892
    tls:
      insecure: true
  logging:
  debug:
    verbosity: detailed
processors:
  resourcedetection/env:
    detectors: [env]
    timeout: 2s
    override: false
  attributes:
    actions:
      - key: severity
        action: delete
      - key: body
        action: delete
      - key: time
        action: delete
      - key: trace_id
        action: delete
      - key: span_id
        action: delete
      - key: trace_flags
        action: delete
service:
  pipelines:
    logs:
      receivers: [otlp,filelog]
      processors: [attributes, resourcedetection/env]
      exporters: [debug, logging, otlp/logs]
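
To make the intent of the `filelog` operators and the `attributes` processor easier to follow, here is a small Python simulation of what appears to happen to one line of `test.log`. It is a sketch of the intended outcome, not the Collector's implementation. It assumes that parsed JSON fields land in the record's attributes, that the `body` setting promotes `attributes.body` to the log body, and it leaves out the trace fields because `test.log` does not contain any. The function name `simulate_pipeline` is hypothetical.

```python
import json
from datetime import datetime

# Attribute keys removed by the `attributes` processor in the config above.
DELETED_KEYS = ("severity", "body", "time", "trace_id", "span_id", "trace_flags")


def simulate_pipeline(line: str) -> dict:
    """Approximate the log record produced from one JSON line of test.log."""
    # json_parser: the parsed fields become attributes of the record.
    attributes = json.loads(line)
    record = {
        # Python's %f stands in for the Collector's %L (milliseconds).
        "timestamp": datetime.strptime(attributes["time"], "%Y-%m-%dT%H:%M:%S.%f%z"),
        "severity": attributes.get("severity"),
        # Assumption: the `body` setting promotes attributes.body to the log body.
        "body": attributes.get("body"),
    }
    # attributes processor: drop the keys that were promoted above,
    # so only the remaining fields stay as attributes.
    record["attributes"] = {k: v for k, v in attributes.items() if k not in DELETED_KEYS}
    return record
```

For the first line of `test.log`, only the `test` object would remain in the attributes under these assumptions, while time, severity, and body move to the record's top-level fields.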
3 changes: 3 additions & 0 deletions examples/log-ingestion-otel/test.log
@@ -0,0 +1,3 @@
{"time": "2025-04-16T10:02:13.024Z", "severity": "INFO", "body": "first message", "test": {"key1":"value1", "key2": 23 }}
{"time": "2025-04-16T10:02:13.024Z", "severity": "ERROR", "body": "This creates a field conflict on \"test\" with the first message when using the plain OTel mapping in Data Prepper.", "test": "value"}
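
The two lines above deliberately disagree on the type of `test`: an object in the first, a string in the second. The sketch below shows one way to spot such conflicts in a JSON-lines file before indexing. The helper names `json_type` and `find_conflicts` are hypothetical, and the body of the second sample line is shortened here for readability.

```python
import json

# The two records from test.log; the second body is shortened.
LINES = [
    '{"time": "2025-04-16T10:02:13.024Z", "severity": "INFO", "body": "first message", "test": {"key1":"value1", "key2": 23 }}',
    '{"time": "2025-04-16T10:02:13.024Z", "severity": "ERROR", "body": "second message", "test": "value"}',
]


def json_type(value) -> str:
    """Map a decoded Python value back to its JSON type name."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if value is None:
        return "null"
    return "string"


def find_conflicts(lines) -> dict:
    """Return top-level fields that appear with more than one JSON type."""
    seen = {}
    for line in lines:
        for key, value in json.loads(line).items():
            seen.setdefault(key, set()).add(json_type(value))
    return {key: sorted(types) for key, types in seen.items() if len(types) > 1}


print(find_conflicts(LINES))  # {'test': ['object', 'string']}
```

With dynamic mapping, OpenSearch fixes a field's type when it first sees the field, so the second document would likely be rejected once `test` has been mapped as an object. The `dlq_file` configured in `log_pipeline.yaml` is the place to look for such rejected documents.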
