Commit 1bc7e06

Provide an OTel Logs Ingestion Example
This change uses the new format and a snapshot Data Prepper image. It can be used to validate the field mappings. The docker compose file indicates how to build a snapshot image, but it is set up for use after the 2.11 release.

Signed-off-by: Karsten Schnitter <k.schnitter@sap.com>
1 parent 6ad193b commit 1bc7e06

6 files changed

Lines changed: 171 additions & 0 deletions

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Data Prepper Log Ingestion from OpenTelemetry Collector
+
+This is an example of using the OpenTelemetry Collector to send log data to Data Prepper and then to OpenSearch.
+The Data Prepper OTLP/gRPC endpoint is exposed at port 21892.
+The same protocol can be used with the OpenTelemetry Collector, which listens at the OTLP default port 4317.
+This setup lets you compare both endpoints.
+The Collector will forward any data to Data Prepper for indexing in OpenSearch.
+
+To generate some demo data, the OpenTelemetry Collector uses its filelog receiver to read the JSON log lines in `test.log`, which is mounted into the container at `/var/log/test.log`.
+Additionally, it accepts OTLP/gRPC log data at port 4317 and sends it through the same pipeline.
+The two sample lines use different types for the `test` field, so you can check how the field mappings behave.
+
+To run:
+
+1. Run `docker compose up`.
+2. Wait for everything to come up.
+3. Log into OpenSearch Dashboards at <http://localhost:5601> using username `admin` and password `Developer@123`.
+4. Create an Index Pattern for index `otel_logs`, choosing `time` as the time field.
+5. Inspect the data in Discover.
+
+Useful changes and additions:
+
+1. The OpenTelemetry Collector has its [Logging Exporter](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/loggingexporter/README.md) in use. Changing the `loglevel` to `debug` or setting the `verbosity` to `detailed` will log all data to stdout. This is useful for troubleshooting.
+2. The OpenTelemetry Collector can push its own metrics to Data Prepper. Follow its documentation in [Internal telemetry](https://opentelemetry.io/docs/collector/internal-telemetry/#use-internal-telemetry-to-monitor-the-collector) for details. These metrics allow comparing the event counts between the Collector and Data Prepper.
+3. The OpenTelemetry Collector can be configured to translate between OTLP/HTTP and OTLP/gRPC. It can be used to proxy between sources only capable of OTLP/HTTP and Data Prepper, which only supports OTLP/gRPC.
+4. The OpenTelemetry Collector can receive data from the Docker host. It can attach metadata describing the containers. Unfortunately, the required processor does not work on macOS, so that configuration is not included in this example.
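
Step 2 of the run instructions ("Wait for everything to come up") can be automated with a small polling script. The following is a minimal sketch, not part of the commit: the port numbers are taken from the compose file in this example, while the host name `localhost`, the helper names `port_open` and `wait_for_ports`, and the retry settings are assumptions made for illustration.

```python
import socket
import time

# Ports published by the compose file in this example.
PORTS = {
    "OpenSearch": 9200,
    "OpenSearch Dashboards": 5601,
    "Data Prepper OTLP/gRPC": 21892,
    "OpenTelemetry Collector OTLP/gRPC": 4317,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ports(host: str, ports: dict[str, int],
                   retries: int = 60, delay: float = 2.0) -> list[str]:
    """Poll until every port accepts connections.

    Returns the names of the services that are still unreachable after
    the last attempt (an empty list means all ports are open).
    """
    pending = dict(ports)
    for _ in range(retries):
        # Keep only the services whose port is still closed.
        pending = {name: p for name, p in pending.items()
                   if not port_open(host, p)}
        if not pending:
            break
        time.sleep(delay)
    return sorted(pending)
```

Calling `wait_for_ports("localhost", PORTS)` returns an empty list once all four ports accept connections. An open TCP port only shows that the container is listening; it does not guarantee that the service has finished starting, so a short extra wait before logging into OpenSearch Dashboards may still be needed.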

examples/log-ingestion-otel/dlq.log

Whitespace-only changes.
Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+version: '3'
+services:
+  data-prepper:
+    image: opensearchproject/data-prepper
+    container_name: data-prepper
+    volumes:
+      - ./log_pipeline.yaml:/usr/share/data-prepper/pipelines/log_pipeline.yaml
+      - ../data-prepper-config.yaml:/usr/share/data-prepper/config/data-prepper-config.yaml
+      - ./dlq.log:/var/log/dlq.log:rw
+    ports:
+      - 2021:2021
+      - 21892:21892
+      - 4900:4900
+    expose:
+      - "2021"
+      - "4900"
+      - "21892"
+    networks:
+      - opensearch-net
+    depends_on:
+      - opensearch
+  opensearch:
+    container_name: opensearch
+    image: docker.io/opensearchproject/opensearch:latest
+    environment:
+      - discovery.type=single-node
+      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
+      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
+      - "OPENSEARCH_INITIAL_ADMIN_PASSWORD=Developer@123"
+    ulimits:
+      memlock:
+        soft: -1
+        hard: -1
+      nofile:
+        soft: 65536 # maximum number of open files for the OpenSearch user, set to at least 65536 on modern systems
+        hard: 65536
+    ports:
+      - 9200:9200
+      - 9600:9600 # required for Performance Analyzer
+    networks:
+      - opensearch-net
+  dashboards:
+    image: docker.io/opensearchproject/opensearch-dashboards:latest
+    container_name: opensearch-dashboards
+    ports:
+      - 5601:5601
+    expose:
+      - "5601"
+    environment:
+      OPENSEARCH_HOSTS: '["https://opensearch:9200"]'
+    depends_on:
+      - opensearch
+    networks:
+      - opensearch-net
+  otel-collector:
+    image: otel/opentelemetry-collector-contrib
+    container_name: otel-collector
+    command: ["--config=/etc/otel-collector-config.yml"]
+    volumes:
+      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
+      - ./test.log:/var/log/test.log
+    environment:
+      OTEL_RESOURCE_ATTRIBUTES: service.name=otel-collector
+    ports:
+      - 4317:4317
+    depends_on:
+      - data-prepper
+    networks:
+      - opensearch-net
+networks:
+  opensearch-net:
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+log-pipeline:
+  source:
+    otel_logs_source:
+      ssl: false
+      output_format: otel
+  processor:
+  sink:
+    - opensearch:
+        hosts: [ "https://opensearch:9200" ]
+        insecure: true
+        username: admin
+        password: Developer@123
+        index: otel_logs
+        index_type: log-analytics-plain
+        dlq_file: /var/log/dlq.log
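
To confirm that the sink above actually wrote documents, the `otel_logs` index can be queried directly instead of going through OpenSearch Dashboards. The sketch below is an illustration under assumptions, not part of the commit: it assumes OpenSearch is reachable from the host at `https://localhost:9200` (the port published in the compose file), uses the credentials from the pipeline configuration, and the function names `build_search_request` and `search` are hypothetical. Certificate verification is disabled to mirror the `insecure: true` sink setting, which is only appropriate for a local demo.

```python
import base64
import json
import ssl
import urllib.request


def build_search_request(host: str = "https://localhost:9200",
                         index: str = "otel_logs",
                         username: str = "admin",
                         password: str = "Developer@123",
                         size: int = 5) -> urllib.request.Request:
    """Build a match-all request against the index's _search endpoint."""
    query = {"size": size, "query": {"match_all": {}}}
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{host}/{index}/_search",
        data=json.dumps(query).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def search(request: urllib.request.Request) -> dict:
    """Send the request, skipping TLS verification (demo certificates only)."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

With the stack running, `search(build_search_request())["hits"]["hits"]` should list the indexed log documents. Records that the sink could not index are expected in the file configured as `dlq_file` rather than in this result.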
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+receivers:
+  filelog:
+    include: [ /var/log/test.log ]
+    start_at: beginning
+    operators:
+      - type: json_parser
+        body: attributes.body
+        timestamp:
+          parse_from: attributes.time
+          layout: "%Y-%m-%dT%H:%M:%S.%L%z"
+        severity:
+          parse_from: attributes.severity
+        trace:
+          trace_id:
+            parse_from: attributes.trace_id
+          span_id:
+            parse_from: attributes.span_id
+          trace_flags:
+            parse_from: attributes.trace_flags
+  otlp:
+    protocols:
+      grpc:
+        endpoint: 0.0.0.0:4317
+exporters:
+  otlp/logs:
+    endpoint: data-prepper:21892
+    tls:
+      insecure: true
+  logging:
+  debug:
+    verbosity: detailed
+processors:
+  resourcedetection/env:
+    detectors: [env]
+    timeout: 2s
+    override: false
+  attributes:
+    actions:
+      - key: severity
+        action: delete
+      - key: body
+        action: delete
+      - key: time
+        action: delete
+      - key: trace_id
+        action: delete
+      - key: span_id
+        action: delete
+      - key: trace_flags
+        action: delete
+service:
+  pipelines:
+    logs:
+      receivers: [otlp,filelog]
+      processors: [attributes, resourcedetection/env]
+      exporters: [debug, logging, otlp/logs]
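
The combined effect of the `json_parser` operator and the `attributes` processor in this configuration can be pictured as follows: the JSON line is parsed into attributes, a few well-known keys are promoted to top-level log record fields, and those keys are then deleted from the attributes. The sketch below only imitates that behavior for illustration; it is not the Collector's implementation. The function name `to_log_record` and the keys of the returned dictionary are assumptions, and Python's `%f` directive stands in for the `%L` (milliseconds) directive of the Collector's layout.

```python
import json
from datetime import datetime

# Keys that the configuration promotes to record fields and then deletes
# from the attributes.
PROMOTED_KEYS = ("severity", "body", "time", "trace_id", "span_id", "trace_flags")


def to_log_record(line: str) -> dict:
    """Turn one JSON log line into a simplified log record dictionary."""
    attributes = json.loads(line)
    record = {
        "body": attributes.get("body"),
        "severity_text": attributes.get("severity"),
        "timestamp": None,
    }
    if "time" in attributes:
        # Python equivalent of the layout "%Y-%m-%dT%H:%M:%S.%L%z".
        record["timestamp"] = datetime.strptime(
            attributes["time"], "%Y-%m-%dT%H:%M:%S.%f%z")
    # Trace context is optional; copy it only when present.
    for key in ("trace_id", "span_id", "trace_flags"):
        if key in attributes:
            record[key] = attributes[key]
    # Everything that was not promoted stays behind as an attribute.
    record["attributes"] = {k: v for k, v in attributes.items()
                            if k not in PROMOTED_KEYS}
    return record
```

For the first line of `test.log`, this leaves only the `test` object in the attributes, which is the part of the record whose mapping the example is meant to exercise.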
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+{"time": "2025-04-16T10:02:13.024Z", "severity": "INFO", "body": "first message", "test": {"key1":"value1", "key2": 23 }}
+{"time": "2025-04-16T10:02:13.024Z", "severity": "ERROR", "body": "This creates a field conflict on \"test\" with the first message when using the plain OTel mapping in Data Prepper.", "test": "value"}
+
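
The conflict mentioned in the second message arises because `test` is a JSON object in the first line and a string in the second, and a single index field cannot hold both shapes. The following sketch spots such conflicts in a set of JSON lines before they are ingested. It is a simplified illustration: the helper names `json_kind` and `find_conflicts` are hypothetical, only top-level fields are compared, and the actual mapping rules are decided by OpenSearch and Data Prepper, not by this script.

```python
import json

# The two sample lines from test.log, kept as raw strings so the escaped
# quotes inside the second body survive unchanged.
SAMPLE_LINES = [
    r'{"time": "2025-04-16T10:02:13.024Z", "severity": "INFO", "body": "first message", "test": {"key1":"value1", "key2": 23 }}',
    r'{"time": "2025-04-16T10:02:13.024Z", "severity": "ERROR", "body": "This creates a field conflict on \"test\" with the first message when using the plain OTel mapping in Data Prepper.", "test": "value"}',
]


def json_kind(value) -> str:
    """Classify a decoded JSON value by its JSON type."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if value is None:
        return "null"
    return "string"


def find_conflicts(lines) -> dict:
    """Return top-level fields that appear with more than one JSON type."""
    seen = {}
    for line in lines:
        if not line.strip():
            continue  # skip empty lines such as a trailing newline
        for field, value in json.loads(line).items():
            seen.setdefault(field, set()).add(json_kind(value))
    return {field: sorted(kinds) for field, kinds in seen.items()
            if len(kinds) > 1}


print(find_conflicts(SAMPLE_LINES))  # {'test': ['object', 'string']}
```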
