Commit 8fb2044

Provide an OTel Logs Ingestion Example
This change uses the new format and a snapshot Data Prepper image. It can be used to validate the field mappings. The Docker Compose file indicates how to build a snapshot image, but it is set up for use after the 2.11 release.

Signed-off-by: Karsten Schnitter <k.schnitter@sap.com>
1 parent 6ad193b commit 8fb2044

6 files changed

Lines changed: 174 additions & 0 deletions

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
# Data Prepper Log Ingestion from OpenTelemetry Collector

This is an example of using the OpenTelemetry Collector to send log data to Data Prepper and then to OpenSearch.
The Data Prepper OTLP/gRPC endpoint is exposed at port 21892.
The same protocol can be used with the OpenTelemetry Collector, which listens at the OTLP default port 4317.
This setup lets you compare both endpoints.
The Collector forwards any data it receives to Data Prepper for indexing in OpenSearch.

To generate some demo data, the OpenTelemetry Collector uses its filelog receiver to read the JSON log lines in `test.log`.
The two sample lines deliberately use different types for the `test` field.
This also lets you investigate how the Data Prepper field mappings handle such a conflict in OpenSearch.

To run:

1. Run `docker compose up`.
2. Wait for everything to come up.
3. Log into OpenSearch Dashboards at <http://localhost:5601> using username `admin` and password `Developer@123`.
4. Create an Index Pattern for index `otel_logs`, choosing `time` as the time field.
5. Inspect the data in the Discover plugin.

Useful changes and additions:

1. The OpenTelemetry Collector has its [Logging Exporter](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/loggingexporter/README.md) in use. Changing the `loglevel` to `debug` or setting the `verbosity` to `detailed` will log all data to stdout. This is useful for troubleshooting.
2. The OpenTelemetry Collector can push its own metrics to Data Prepper. Follow its documentation in [Internal telemetry](https://opentelemetry.io/docs/collector/internal-telemetry/#use-internal-telemetry-to-monitor-the-collector) for details. These metrics allow comparing the event counts between the Collector and Data Prepper.
3. The OpenTelemetry Collector can be configured to translate between OTLP/HTTP and OTLP/gRPC. It can be used to proxy between sources only capable of OTLP/HTTP and Data Prepper, which only supports OTLP/gRPC.
4. The OpenTelemetry Collector can receive data from the Docker host. It can attach metadata describing the containers. Unfortunately, the required processor does not work on macOS, so this configuration is not provided in this example.
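Step 2 of the run instructions ("Wait for everything to come up") can be sketched as a small readiness check. The following Python sketch is not part of the commit. It assumes the stack runs on `localhost` with the ports published by the compose file of this example (21892, 4317, 9200, 5601). The names `wait_for_port`, `EXAMPLE_PORTS`, and `check_example_stack` are hypothetical. Note that an open TCP port only shows that a container accepts connections, not that the service behind it has finished starting.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Ports published by the compose file of this example (assumed to be on localhost).
EXAMPLE_PORTS = {
    "data-prepper (OTLP/gRPC)": 21892,
    "otel-collector (OTLP/gRPC)": 4317,
    "opensearch": 9200,
    "opensearch-dashboards": 5601,
}


def check_example_stack(host: str = "localhost") -> dict:
    """Check every published port and return a name -> reachable mapping."""
    return {name: wait_for_port(host, port) for name, port in EXAMPLE_PORTS.items()}
```

Calling `check_example_stack()` after `docker compose up` gives a quick overview of which containers are reachable before opening OpenSearch Dashboards.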

examples/log-ingestion-otel/dlq.log

Whitespace-only changes.
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
version: '3'
services:
  data-prepper:
    # before release 2.11 you need to build your own image with
    # ./gradlew clean :release:docker:docker
    # image: opensearch-data-prepper:2.11.0-SNAPSHOT
    image: opensearchproject/data-prepper:latest
    container_name: data-prepper
    volumes:
      - ./log_pipeline.yaml:/usr/share/data-prepper/pipelines/log_pipeline.yaml
      - ../data-prepper-config.yaml:/usr/share/data-prepper/config/data-prepper-config.yaml
      - ./dlq.log:/var/log/dlq.log:rw
    ports:
      - 2021:2021
      - 21892:21892
      - 4900:4900
    expose:
      - "2021"
      - "4900"
      - "21892"
    networks:
      - opensearch-net
    depends_on:
      - opensearch
  opensearch:
    container_name: opensearch
    image: docker.io/opensearchproject/opensearch:latest
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
      - "OPENSEARCH_INITIAL_ADMIN_PASSWORD=Developer@123"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the OpenSearch user, set to at least 65536 on modern systems
        hard: 65536
    ports:
      - 9200:9200
      - 9600:9600 # required for Performance Analyzer
    networks:
      - opensearch-net
  dashboards:
    image: docker.io/opensearchproject/opensearch-dashboards:latest
    container_name: opensearch-dashboards
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      OPENSEARCH_HOSTS: '["https://opensearch:9200"]'
    depends_on:
      - opensearch
    networks:
      - opensearch-net
  otel-collector:
    image: otel/opentelemetry-collector-contrib
    container_name: otel-collector
    command: ["--config=/etc/otel-collector-config.yml"]
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
      - ./test.log:/var/log/test.log
    environment:
      OTEL_RESOURCE_ATTRIBUTES: service.name=otel-collector
    ports:
      - 4317:4317
    depends_on:
      - data-prepper
    networks:
      - opensearch-net
networks:
  opensearch-net:
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
log-pipeline:
  source:
    otel_logs_source:
      ssl: false
      output_format: otel
  processor:
  sink:
    - opensearch:
        hosts: [ "https://opensearch:9200" ]
        insecure: true
        username: admin
        password: Developer@123
        index: otel_logs
        index_type: log-analytics-plain
        dlq_file: /var/log/dlq.log
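To check that the sink above actually wrote documents, one option is to ask OpenSearch for the document count of the `otel_logs` index. The following Python sketch is not part of the commit. It assumes OpenSearch is reachable from the host at `https://localhost:9200` (the port published by the compose file) and reuses the demo credentials from the pipeline. The function names `build_count_request` and `count_documents` are hypothetical. Certificate verification is disabled because the demo cluster is assumed to use a self-signed certificate, which mirrors `insecure: true` in the sink.

```python
import base64
import json
import ssl
import urllib.request


def build_count_request(base_url: str, index: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request against the _count endpoint of an index with HTTP basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{index}/_count",
        headers={"Authorization": f"Basic {token}"},
    )


def count_documents(request: urllib.request.Request) -> int:
    """Send the request and return the 'count' field of the JSON response."""
    # Assumption: self-signed demo certificate, so verification is switched off.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return json.load(response)["count"]


# Request for the index and demo credentials used in this example.
request = build_count_request("https://localhost:9200", "otel_logs", "admin", "Developer@123")
```

With the stack running, `count_documents(request)` should return a non-zero number once the Collector has shipped the lines from `test.log`. A result of zero would suggest looking at the Collector output or at `dlq.log`.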
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
receivers:
  filelog:
    include: [ /var/log/test.log ]
    start_at: beginning
    operators:
      - type: json_parser
        body: attributes.body
        timestamp:
          parse_from: attributes.time
          layout: "%Y-%m-%dT%H:%M:%S.%L%z"
        severity:
          parse_from: attributes.severity
        trace:
          trace_id:
            parse_from: attributes.trace_id
          span_id:
            parse_from: attributes.span_id
          trace_flags:
            parse_from: attributes.trace_flags
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  otlp/logs:
    endpoint: data-prepper:21892
    tls:
      insecure: true
  logging:
  debug:
    verbosity: detailed
processors:
  resourcedetection/env:
    detectors: [env]
    timeout: 2s
    override: false
  attributes:
    actions:
      - key: severity
        action: delete
      - key: body
        action: delete
      - key: time
        action: delete
      - key: trace_id
        action: delete
      - key: span_id
        action: delete
      - key: trace_flags
        action: delete
service:
  pipelines:
    logs:
      receivers: [otlp,filelog]
      processors: [attributes, resourcedetection/env]
      exporters: [debug, logging, otlp/logs]
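The intended effect of the `filelog` operators and the `attributes` processor above can be illustrated outside the Collector. The following Python sketch is only an emulation under stated assumptions, not the Collector's implementation: it parses one JSON line, promotes `time`, `severity`, `body`, and the trace fields to top-level record fields, and then drops those keys from the attributes, as the `delete` actions do. The function name `to_log_record` and the shape of the returned dictionary are hypothetical, and the timestamp is parsed with Python's own `strptime` format rather than the Collector's layout syntax.

```python
import json
from datetime import datetime

# Keys that the configuration promotes to record fields and then deletes from the attributes.
PROMOTED_KEYS = ("time", "severity", "body", "trace_id", "span_id", "trace_flags")


def to_log_record(line: str) -> dict:
    """Emulate the parse-then-delete flow of the Collector config for one JSON log line."""
    attributes = json.loads(line)  # json_parser: every JSON key becomes an attribute
    record = {
        "body": attributes.get("body"),
        "severity": attributes.get("severity"),
        "time": None,
        "trace_id": attributes.get("trace_id"),
        "span_id": attributes.get("span_id"),
        "trace_flags": attributes.get("trace_flags"),
    }
    if "time" in attributes:
        # Python's %f and %z stand in for the %L and %z of the Collector layout.
        record["time"] = datetime.strptime(attributes["time"], "%Y-%m-%dT%H:%M:%S.%f%z")
    # attributes processor: remove the keys that were promoted above
    record["attributes"] = {k: v for k, v in attributes.items() if k not in PROMOTED_KEYS}
    return record
```

Applied to the first line of `test.log`, only the `test` object remains in the attributes, which is the part of the record whose mapping this example is meant to exercise.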
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
{"time": "2025-04-16T10:02:13.024Z", "severity": "INFO", "body": "first message", "test": {"key1":"value1", "key2": 23 }}
{"time": "2025-04-16T10:02:13.024Z", "severity": "ERROR", "body": "This creates a field conflict on \"test\" with the first message when using the plain OTel mapping in Data Prepper.", "test": "value"}
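The field conflict mentioned in the second sample line comes from `test` being a JSON object in the first line and a string in the second. The following Python sketch is not part of the commit and is a deliberately simplified illustration: it only compares the JSON types of top-level keys across lines and does not reproduce OpenSearch's dynamic mapping logic. The names `field_kind` and `find_conflicts` are hypothetical.

```python
import json


def field_kind(value) -> str:
    """Classify a decoded JSON value into a coarse type name."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, bool):  # checked before numbers because bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    return "null"


def find_conflicts(lines) -> dict:
    """Return top-level keys that appear with more than one JSON type across the lines."""
    first_seen = {}
    conflicts = {}
    for line in lines:
        for key, value in json.loads(line).items():
            kind = field_kind(value)
            first = first_seen.setdefault(key, kind)
            if first != kind:
                conflicts.setdefault(key, {first}).add(kind)
    return conflicts
```

Running `find_conflicts` over the lines of `test.log` reports `test` as the only conflicting key, since `time`, `severity`, and `body` are strings in both lines.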
