Commit 4778675

Several of the documentation pages have been moved to the OpenSearch documentation project. This commit updates the existing documentation in this project to point to the official documentation for those. (#5703)
Signed-off-by: David Venable <dlv@amazon.com>
1 parent 123a4af commit 4778675

5 files changed

Lines changed: 14 additions & 734 deletions

File tree

docs/core_apis.md

Lines changed: 3 additions & 106 deletions
@@ -1,108 +1,5 @@
11
# Core Data Prepper APIs
22

3-
All Data Prepper instances expose a server with some control APIs. By default, this server runs
4-
on port 4900. Some plugins, especially Source plugins may expose other servers. These will be
5-
on different ports and their configurations are independent of the core API.
6-
7-
For example, to shut down Data Prepper, you can run:
8-
9-
```
10-
curl -X POST http://localhost:4900/shutdown
11-
```
12-
13-
## APIs
14-
15-
The following APIs are available:
16-
17-
```
18-
GET /list
19-
POST /list
20-
```
21-
* lists running pipelines
22-
23-
```
24-
POST /shutdown
25-
```
26-
* starts a graceful shutdown of Data Prepper
27-
28-
```
29-
GET /metrics/prometheus
30-
POST /metrics/prometheus
31-
```
32-
* returns a scrape of the Data Prepper metrics in Prometheus text format. This API is available provided the
33-
`metrics_registries` parameter in the Data Prepper configuration file `data-prepper-config.yaml` has `Prometheus` as one
34-
of the registries
35-
36-
```
37-
GET /metrics/sys
38-
POST /metrics/sys
39-
```
40-
* returns JVM metrics in Prometheus text format. This API is available provided the `metrics_registries` parameter in the Data
41-
Prepper configuration file `data-prepper-config.yaml` has `Prometheus` as one of the registries
42-
43-
## Configuring the Server
44-
45-
You can configure your Data Prepper core APIs through the `data-prepper-config.yaml` file.
46-
47-
### SSL/TLS Connection
48-
49-
Many of the Getting Started guides in this project disable SSL on the endpoint.
50-
51-
```yaml
52-
ssl: false
53-
```
54-
55-
To enable SSL on your Data Prepper endpoint, configure your `data-prepper-config.yaml`
56-
with the following:
57-
58-
```yaml
59-
ssl: true
60-
key_store_file_path: "/usr/share/data-prepper/keystore.p12"
61-
key_store_password: "secret"
62-
private_key_password: "secret"
63-
```
64-
65-
For more information on configuring your Data Prepper server with SSL, see [Server Configuration](https://github.com/opensearch-project/data-prepper/blob/main/docs/configuration.md#server-configuration).
66-
67-
If you are using a self-signed certificate, you can add the `-k` flag to quickly test out sending curl requests for the core APIs with SSL.
68-
69-
```
70-
curl -k -X POST https://localhost:4900/shutdown
71-
```
72-
73-
### Authentication
74-
75-
The Data Prepper Core APIs support HTTP Basic authentication.
76-
You can set the username and password with the following
77-
configuration in `data-prepper-config.yaml`:
78-
79-
```yaml
80-
authentication:
81-
http_basic:
82-
username: "myuser"
83-
password: "mys3cr3t"
84-
```
85-
86-
You can disable authentication of core endpoints using the following
87-
configuration. Use this with caution because the shutdown API and
88-
others will be accessible to anybody with network access to
89-
your Data Prepper instance.
90-
91-
```yaml
92-
authentication:
93-
unauthenticated:
94-
```
95-
96-
### Peer Forwarder
97-
Peer forwarder can be configured to enable stateful aggregation across multiple Data Prepper nodes. For more information on configuring Peer Forwarder, see [Peer Forwarder Configuration](https://github.com/opensearch-project/data-prepper/blob/main/docs/peer_forwarder.md).
98-
It is supported by `service_map`, `otel_traces` and `aggregate` processors.
99-
100-
### Shutdown Timeouts
101-
When the Data Prepper `shutdown` API is invoked, the sink and processor `ExecutorService`s are given time to shut down gracefully and clear any in-flight data. The default graceful shutdown timeout for these `ExecutorService`s is 10 seconds. This can be configured with the following optional parameters:
102-
103-
```yaml
104-
processor_shutdown_timeout: "PT15M"
105-
sink_shutdown_timeout: 30s
106-
```
107-
108-
The values for these parameters are parsed into a `Duration` object via the [DataPrepperDurationDeserializer](https://github.com/opensearch-project/data-prepper/tree/main/data-prepper-core/src/main/java/org/opensearch/dataprepper/parser/DataPrepperDurationDeserializer.java).
3+
This has been moved to the OpenSearch Data Prepper documentation for
4+
[Core APIs](https://docs.opensearch.org/docs/latest/data-prepper/managing-data-prepper/core-apis/)
5+
in the **Managing OpenSearch Data Prepper** section.
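
The removed page describes the core API server (port 4900 by default, endpoints such as `/list` and `/shutdown`, optional HTTP Basic authentication). As a rough illustration only, the following Python sketch builds such requests with the standard library. The helper names `build_core_api_request` and `send` are hypothetical and are not part of Data Prepper; the port, paths, and sample credentials are taken from the removed text.

```python
import base64
import urllib.request


def build_core_api_request(path, method="GET", base_url="http://localhost:4900",
                           username=None, password=None):
    """Build (but do not send) a request to a Data Prepper core API endpoint.

    Hypothetical helper for illustration; 4900 is the default port named in
    the removed documentation.
    """
    request = urllib.request.Request(base_url.rstrip("/") + path, method=method)
    if username is not None and password is not None:
        # HTTP Basic authentication: base64 of "username:password"
        token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


def send(request, timeout=5):
    """Send the request; this needs a running Data Prepper instance."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `send(build_core_api_request("/list", username="myuser", password="mys3cr3t"))` would list running pipelines on a local instance that uses the sample credentials. An instance with SSL enabled would need an `https://` base URL and a trusted certificate.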

docs/log_analytics.md

Lines changed: 3 additions & 196 deletions
Original file line numberDiff line numberDiff line change
@@ -1,198 +1,5 @@
11
# Log Analytics
22

3-
## Introduction
4-
5-
Data Prepper is an extendable, configurable, and scalable solution for log ingestion into OpenSearch and Amazon OpenSearch Service.
6-
Currently, Data Prepper is focused on receiving logs from [FluentBit](https://fluentbit.io/) via the
7-
[Http Source](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/http-source/README.md), and processing those logs with a [Grok Processor](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/grok-processor/README.md) before ingesting them into OpenSearch through the [OpenSearch sink](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/opensearch/README.md).
8-
9-
Here are all of the components for log analytics with FluentBit, Data Prepper, and OpenSearch:
10-
<br />
11-
<br />
12-
![Log Analytics Components](images/LogAnalyticsComponents.png)
13-
<br />
14-
<br />
15-
16-
In your application environment you will have to run FluentBit.
17-
FluentBit can be containerized through Kubernetes, Docker, or Amazon ECS.
18-
It can also be run as an agent on EC2.
19-
You should configure the [FluentBit http output plugin](https://docs.fluentbit.io/manual/pipeline/outputs/http) to export log data to Data Prepper.
20-
You will then have to deploy Data Prepper as an intermediate component and configure it to send
21-
the enriched log data to your OpenSearch cluster or Amazon OpenSearch Service domain. From there, you can
22-
use OpenSearch Dashboards to perform more intensive visualization and analysis.
23-
24-
## Log Analytics Pipeline
25-
26-
Log analytic pipelines in Data Prepper are extremely customizable. A simple pipeline is shown below.
27-
28-
![](images/Log_Ingestion_FluentBit_DataPrepper_OpenSearch.jpg)
29-
30-
## Http Source
31-
32-
The [Http Source](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/http-source/README.md) accepts log data from FluentBit.
33-
More specifically, this source accepts log data in a JSON array format.
34-
This source supports industry-standard encryption in the form of TLS/HTTPS and HTTP basic authentication.
35-
36-
## Processor
37-
38-
The Data Prepper 1.2 release will come with a [Grok Processor](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/grok-processor/README.md).
39-
The Grok Processor can be an invaluable tool to structure and extract important fields from your logs in order to make them more queryable.
40-
41-
The Grok Processor comes with a wide variety of [default patterns](https://github.com/thekrakken/java-grok/blob/master/src/main/resources/patterns/patterns) that match against common log formats like Apache logs or syslogs,
42-
but can easily accept any custom patterns that cater to your specific log format.
43-
44-
There are a lot of complex Grok features that will not be discussed here, so please read the documentation if you are interested.
45-
46-
## OpenSearch sink
47-
48-
We have a generic sink that writes the data to OpenSearch as the destination. The [opensearch sink](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/opensearch/README.md) has configuration options related to an OpenSearch cluster like endpoint, SSL/Username, index name, index template, index state management, etc.
49-
50-
## Pipeline Configuration
51-
52-
Example `pipeline.yaml` with SSL and Basic Authentication enabled for the `http-source`:
53-
54-
55-
```yaml
56-
log-pipeline:
57-
source:
58-
http:
59-
ssl: true
60-
ssl_certificate_file: "/full/path/to/certfile.crt"
61-
ssl_key_file: "/full/path/to/keyfile.key"
62-
# The default port that will listen for incoming logs
63-
port: 2021
64-
authentication:
65-
http_basic:
66-
username: "myuser"
67-
password: "mys3cret"
68-
processor:
69-
- grok:
70-
match:
71-
# This will match logs with a "log" key against the COMMONAPACHELOG pattern (ex: { "log": "actual apache log..." } )
72-
# You should change this to match what your logs look like. See the grok documentation to get started.
73-
log: [ "%{COMMONAPACHELOG}" ]
74-
sink:
75-
- opensearch:
76-
hosts: [ "https://localhost:9200" ]
77-
# Change to your credentials
78-
username: "admin"
79-
password: "admin"
80-
# Add a certificate file if you are accessing an OpenSearch cluster with a self-signed certificate
81-
#cert: /path/to/cert
82-
# If you are connecting to an Amazon OpenSearch Service domain without
83-
# Fine-Grained Access Control, enable these settings. Comment out the
84-
# username and password above.
85-
#aws_sigv4: true
86-
#aws_region: us-east-1
87-
# Since we are grok matching for apache logs, it makes sense to send them to an OpenSearch index named apache_logs.
88-
# You should change this to correspond with how your OpenSearch indices are set up.
89-
index: apache_logs
90-
```
91-
<br></br>
92-
Example `pipeline.yaml` without SSL and Basic Authentication enabled for the `http-source`:
93-
94-
```yaml
95-
log-pipeline:
96-
source:
97-
http:
98-
# Explicitly disable SSL
99-
ssl: false
100-
# Explicitly disable authentication
101-
authentication:
102-
unauthenticated:
103-
# The default port that will listen for incoming logs
104-
port: 2021
105-
processor:
106-
- grok:
107-
match:
108-
# This will match logs with a "log" key against the COMMONAPACHELOG pattern (ex: { "log": "actual apache log..." } )
109-
# You should change this to match what your logs look like. See the grok documentation to get started.
110-
log: [ "%{COMMONAPACHELOG}" ]
111-
sink:
112-
- opensearch:
113-
hosts: [ "https://localhost:9200" ]
114-
# Change to your credentials
115-
username: "admin"
116-
password: "<admin password>"
117-
# Add a certificate file if you are accessing an OpenSearch cluster with a self-signed certificate
118-
#cert: /path/to/cert
119-
# If you are connecting to an Amazon OpenSearch Service domain without
120-
# Fine-Grained Access Control, enable these settings. Comment out the
121-
# username and password above.
122-
#aws_sigv4: true
123-
#aws_region: us-east-1
124-
# Since we are grok matching for apache logs, it makes sense to send them to an OpenSearch index named apache_logs.
125-
# You should change this to correspond with how your OpenSearch indices are set up.
126-
index: apache_logs
127-
```
128-
129-
This pipeline configuration is an example of apache log ingestion. Don't forget that you can easily configure the Grok Processor for your own custom logs.
130-
131-
You will need to modify the configuration above for your OpenSearch cluster.
132-
133-
The main changes you will need to make are:
134-
135-
* `hosts` - Set to your hosts
136-
* `index` - Change this to the OpenSearch index you want to send logs to
137-
* `username` - Provide the OpenSearch username
138-
* `password` - Provide your OpenSearch password
139-
* `aws_sigv4` - If you use Amazon OpenSearch Service with AWS signing, set this to true. It will sign requests with the default AWS credentials provider.
140-
* `aws_region` - If you use Amazon OpenSearch Service with AWS signing, set this value to your region.
141-
## FluentBit
142-
143-
You will have to run FluentBit in your service environment. You can find the installation guide of FluentBit [here](https://docs.fluentbit.io/manual/installation/getting-started-with-fluent-bit).
144-
Please ensure that you can configure the [FluentBit http output plugin](https://docs.fluentbit.io/manual/pipeline/outputs/http) to your Data Prepper Http Source. Below is an example `fluent-bit.conf` that tails a log file named `test.log` and forwards it to a locally running Data Prepper's http source, which runs
145-
by default on port 2021. Note that you should adjust the file `path`, output `Host` and `Port` according to how and where you have FluentBit and Data Prepper running.
146-
147-
Example `fluent-bit.conf` without SSL and Basic Authentication enabled on the http source:
148-
149-
```
150-
[INPUT]
151-
name tail
152-
refresh_interval 5
153-
path test.log
154-
read_from_head true
155-
156-
[OUTPUT]
157-
Name http
158-
Match *
159-
Host localhost
160-
Port 2021
161-
URI /log/ingest
162-
Format json
163-
```
164-
165-
If your http source has SSL and Basic Authentication enabled, you will need to add the details
166-
of `http_User`, `http_Passwd`, `tls.crt_file`, and `tls.key_file` to the `fluent-bit.conf` as shown below.
167-
168-
Example `fluent-bit.conf` with SSL and Basic Authentication enabled on the http source:
169-
170-
```
171-
[INPUT]
172-
name tail
173-
refresh_interval 5
174-
path test.log
175-
read_from_head true
176-
177-
[OUTPUT]
178-
Name http
179-
Match *
180-
Host localhost
181-
http_User myuser
182-
http_Passwd mys3cret
183-
tls On
184-
tls.crt_file /full/path/to/certfile.crt
185-
tls.key_file /full/path/to/keyfile.key
186-
Port 2021
187-
URI /log/ingest
188-
Format json
189-
```
190-
191-
## Next Steps
192-
193-
Follow the [Log Ingestion Demo Guide](../examples/log-ingestion/README.md) to go through a specific example of Apache log ingestion from `FluentBit -> Data Prepper -> OpenSearch` running through Docker.
194-
195-
In the future, Data Prepper will contain additional sources and processors which will make more complex log analytic pipelines available. Check out our [Roadmap](https://github.com/opensearch-project/data-prepper/projects/1) to see what is coming.
196-
197-
If there is a specific source, processor, or sink that you would like to include in your log analytic workflow, and it is not currently on the Roadmap, please bring it to our attention by making a GitHub issue. Additionally, if you
198-
are interested in contributing, see our [Contributing Guidelines](../CONTRIBUTING.md) as well as our [Developer Guide](developer_guide.md) and [Plugin Development Guide](plugin_development.md).
3+
This has been moved to the OpenSearch Data Prepper documentation for
4+
[Log Analytics](https://docs.opensearch.org/docs/latest/data-prepper/common-use-cases/log-analytics/)
5+
in the **Common use cases** section.
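
The removed guide notes that the Http Source accepts log data as a JSON array, listens on port 2021 by default, and is reached by FluentBit at `/log/ingest`. A minimal Python sketch of an equivalent request, assuming the unauthenticated, non-SSL configuration shown in the removed example, might look like this. The function name `build_log_ingest_request` is hypothetical, and wrapping each line under a `log` key is an assumption chosen to match the grok `match` setting in the sample pipeline.

```python
import json
import urllib.request


def build_log_ingest_request(log_lines, host="localhost", port=2021):
    """Build (but do not send) a POST carrying log lines as a JSON array.

    Hypothetical helper for illustration. Each record uses a "log" key so
    that the sample grok configuration (log: [ "%{COMMONAPACHELOG}" ]) would
    have a field to match against.
    """
    records = [{"log": line} for line in log_lines]
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/log/ingest",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the batch to a locally running pipeline. With SSL and Basic authentication enabled on the source, the URL scheme and an `Authorization` header would need to change accordingly.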

docs/migrating_from_opendistro.md

Lines changed: 2 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -1,25 +1,4 @@
11
# Migrating from Open Distro Data Prepper
22

3-
Starting with Data Prepper 1.1, there is only one distribution of
4-
Data Prepper - OpenSearch Data Prepper. This document is here to help existing users migrate
5-
from the old Open Distro Data Prepper to OpenSearch Data Prepper.
6-
7-
8-
### Change your Pipeline Configuration
9-
10-
The `elasticsearch` sink has changed to `opensearch`. You will
11-
need to change your existing pipeline to use the `opensearch` plugin
12-
instead of `elasticsearch`.
13-
14-
Please note that while the plugin is titled `opensearch` it remains compatible
15-
with Open Distro and Elasticsearch 7.x.
16-
17-
### Update Docker Image
18-
19-
The Open Distro Data Prepper Docker image was located at `amazon/opendistro-for-elasticsearch-data-prepper`.
20-
You will need to change this value to `opensearchproject/data-prepper`.
21-
22-
## More Information
23-
24-
You can find more information about Data Prepper configurations
25-
by going to the [Getting Started](getting_started.md) guide.
3+
This has been moved to the OpenSearch Data Prepper documentation for
4+
[Migrating from Open Distro](https://docs.opensearch.org/docs/latest/data-prepper/migrate-open-distro/).
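
The removed migration note says that the `elasticsearch` sink was renamed to `opensearch` and that existing pipelines must be updated. As a sketch only, the rename could be applied to an already parsed pipeline configuration as follows. The function `migrate_sinks` is hypothetical, and it assumes the pipeline YAML has been loaded into plain Python dictionaries and lists by a YAML parser of your choice.

```python
import copy


def migrate_sinks(pipelines):
    """Return a copy of the pipelines with `elasticsearch` sinks renamed.

    Hypothetical helper for illustration. `pipelines` maps pipeline names to
    their configuration; each `sink` entry is a list of single-key mappings,
    as in the pipeline examples from the removed documentation.
    """
    migrated = copy.deepcopy(pipelines)
    for pipeline in migrated.values():
        if "sink" not in pipeline:
            continue
        pipeline["sink"] = [
            {("opensearch" if name == "elasticsearch" else name): settings
             for name, settings in sink.items()}
            for sink in pipeline["sink"]
        ]
    return migrated
```

The sink settings are carried over unchanged, which fits the note that the `opensearch` plugin remains compatible with Open Distro. The Docker image name still has to be updated separately.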
