Commit 6e7d8e9

Parser plugin docs updates (#2371)
* docs: parsers: configuring-parser: improve config parameter table

  - Sort parameter table alphabetically
  - Add Default column with values for each parameter
  - Add missing logfmt_no_bare_keys parameter
  - Add missing mysql_quoted decoder type to decode_field and decode_field_as
  - Fix Time_System_timezone case to time_system_timezone
  - Lowercase all parameter names to match repo YAML convention
  - Remove inaccurate format restriction on types parameter

  Applies to #2370

  Signed-off-by: Eric D. Schabell <eric@schabell.org>

* docs: parsers: decoders: improve structure and config tables

  - Restructure sections to Configuration parameters before Examples
  - Sort decoder options table alphabetically
  - Sort optional actions table alphabetically
  - Add missing mysql_quoted decoder type
  - Lowercase parameter references to match repo YAML convention

  Applies to #2370

  Signed-off-by: Eric D. Schabell <eric@schabell.org>

* docs: parsers: json: add link to configuring-parser

  Applies to #2370

  Signed-off-by: Eric D. Schabell <eric@schabell.org>

* docs: parsers: logfmt: add config parameter table and link

  - Add link to configuring-parser for common parameters
  - Add format-specific parameter table for logfmt_no_bare_keys

  Applies to #2370

  Signed-off-by: Eric D. Schabell <eric@schabell.org>

* docs: parsers: regular-expression: fix config table and formatting

  - Add link to configuring-parser for common parameters
  - Lowercase skip_empty_values and its default to match repo convention
  - Fix {% end hint %} to {% endhint %}

  Applies to #2370

  Signed-off-by: Eric D. Schabell <eric@schabell.org>

* docs: parsers: multiline-parsing: add missing parameters

  - Add key_group parameter for stream grouping
  - Add key_pattern parameter for alternative match field

  Applies to #2370

  Signed-off-by: Eric D. Schabell <eric@schabell.org>

* docs: parsers: ltsv: add link to configuring-parser

  Applies to #2370

  Signed-off-by: Eric D. Schabell <eric@schabell.org>

---------

Signed-off-by: Eric D. Schabell <eric@schabell.org>
1 parent c8f31de commit 6e7d8e9

7 files changed

Lines changed: 75 additions & 51 deletions

pipeline/parsers/configuring-parser.md

Lines changed: 16 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -12,21 +12,22 @@ To define a custom parser, add an entry to the [`parsers` section](../../adminis
1212

1313
Custom parsers support the following configuration parameters:
1414

15-
| Key | Description |
16-
| --- | ----------- |
17-
| `Name` | Sets the name of your parser. |
18-
| `Format` | Specifies the format of the parser. Possible options: [`json`](json.md), [`regex`](regular-expression.md), [`ltsv`](ltsv.md), or [`logfmt`](logfmt.md). |
19-
| `Regex` | Required for parsers with the `regex` format. Specifies the Ruby regular expression for parsing and composing the structured message. |
20-
| `Time_Key` | If the log entry provides a field with a timestamp, this option specifies the name of that field. |
21-
| `Time_Format` | Specifies the format of the time field so it can be recognized and analyzed properly. Fluent Bit uses `strptime(3)` to parse time. See the [`strptime` documentation](https://linux.die.net/man/3/strptime) for available modifiers. The `%L` field descriptor is supported for fractional seconds. |
22-
| `Time_Offset` | Specifies a fixed UTC time offset (such as `-0600` or `+0200`) for local dates. |
23-
| `Time_Keep` | If enabled, when a time key is recognized and parsed, the parser will keep the original time key. If disabled, the parser will drop the original time field. |
24-
| `Time_System_timezone` | If there is no time zone (`%z`) specified in the given `Time_Format`, enabling this option will make the parser detect and use the system's configured time zone. The configured time zone is detected from the [`TZ` environment variable](https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html). |
25-
| `Types` | Specifies the data type of parsed field. The syntax is `types <field_name_1>:<type_name_1> <field_name_2>:<type_name_2> ...`. The supported types are `string` (default), `integer`, `bool`, `float`, `hex`. The option is supported by `ltsv`, `logfmt` and `regex`. |
26-
| `Decode_Field` | If the content can be decoded in a structured message, append the structured message (keys and values) to the original log message. Decoder types: `json`, `escaped`, `escaped_utf8`. The syntax is: `Decode_Field <decoder_type> <field_name>`. See [Decoders](decoders.md) for additional information. |
27-
| `Decode_Field_As` | Any decoded content (unstructured or structured) will be replaced in the same key/value, and no extra keys are added. Decoder types: `json`, `escaped`, `escaped_utf8`. The syntax is: `Decode_Field_As <decoder_type> <field_name>`. See [Decoders](decoders.md) for additional information. |
28-
| `Skip_Empty_Values` | Specifies a boolean which determines if the parser should skip empty values. The default is `true`. |
29-
| `Time_Strict` | The default value (`true`) tells the parser to be strict with the expected time format. With this option set to false, the parser will be permissive with the format of the time. You can use this when the format expects time fraction but the time to be parsed doesn't include it. |
15+
| Key | Description | Default |
16+
| --- | ----------- | ------- |
17+
| `decode_field` | If the content can be decoded in a structured message, append the structured message (keys and values) to the original log message. Decoder types: `json`, `escaped`, `escaped_utf8`, `mysql_quoted`. The syntax is: `decode_field <decoder_type> <field_name>`. See [Decoders](decoders.md) for additional information. | _none_ |
18+
| `decode_field_as` | Any decoded content (unstructured or structured) will be replaced in the same key/value, and no extra keys are added. Decoder types: `json`, `escaped`, `escaped_utf8`, `mysql_quoted`. The syntax is: `decode_field_as <decoder_type> <field_name>`. See [Decoders](decoders.md) for additional information. | _none_ |
19+
| `format` | Specifies the format of the parser. Possible options: [`json`](json.md), [`regex`](regular-expression.md), [`ltsv`](ltsv.md), or [`logfmt`](logfmt.md). | _none_ |
20+
| `logfmt_no_bare_keys` | If enabled, the `logfmt` parser rejects log entries where keys don't have associated values (bare keys). Only applies to the `logfmt` format. | `false` |
21+
| `name` | Sets the name of your parser. | _none_ |
22+
| `regex` | Required for parsers with the `regex` format. Specifies the Ruby regular expression for parsing and composing the structured message. | _none_ |
23+
| `skip_empty_values` | Specifies a boolean which determines if the parser should skip empty values. | `true` |
24+
| `time_format` | Specifies the format of the time field so it can be recognized and analyzed properly. Fluent Bit uses `strptime(3)` to parse time. See the [`strptime` documentation](https://linux.die.net/man/3/strptime) for available modifiers. The `%L` field descriptor is supported for fractional seconds. | _none_ |
25+
| `time_keep` | If enabled, when a time key is recognized and parsed, the parser will keep the original time key. If disabled, the parser will drop the original time field. | `false` |
26+
| `time_key` | If the log entry provides a field with a timestamp, this option specifies the name of that field. | _none_ |
27+
| `time_offset` | Specifies a fixed UTC time offset (such as `-0600` or `+0200`) for local dates. | _none_ |
28+
| `time_strict` | If `true`, the parser is strict with the expected time format. If `false`, the parser is permissive with the format of the time. Set to `false` when the format expects a time fraction but the time to be parsed doesn't include it. | `true` |
29+
| `time_system_timezone` | If there is no time zone (`%z`) specified in the given `time_format`, enabling this option will make the parser detect and use the system's configured time zone. The configured time zone is detected from the [`TZ` environment variable](https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html). | `false` |
30+
| `types` | Specifies the data type of a parsed field. The syntax is `types <field_name_1>:<type_name_1> <field_name_2>:<type_name_2> ...`. The supported types are `string` (default), `integer`, `bool`, `float`, `hex`. | _none_ |
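
As an illustration of the lowercase keys in the updated table, here is a minimal sketch of a `regex` parser entry. The parser name, the log layout, and the regular expression are hypothetical and chosen only for this example. The keys themselves come from the table above.

```yaml
parsers:
  - name: example_app            # hypothetical parser name
    format: regex
    # Hypothetical log layout: "<time> <level> <code>"
    regex: '^(?<time>[^ ]+) (?<level>[^ ]+) (?<code>[0-9]+)$'
    time_key: time               # field that holds the timestamp
    time_format: '%Y-%m-%dT%H:%M:%S.%L'   # %L covers fractional seconds
    time_keep: false             # drop the original time field after parsing
    types: code:integer          # remaining fields keep the default type, string
```

With `time_keep` left at its documented default of `false`, the parsed record would be expected to carry `level` and `code` but not the original `time` field.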
3031

3132
### Time resolution and fractional seconds
3233

pipeline/parsers/decoders.md

Lines changed: 36 additions & 31 deletions
@@ -20,12 +20,43 @@ The original message is handled as an escaped string. Fluent Bit will use the or
2020

2121
Decoders are a built-in feature of parsers in Fluent Bit. Each parser definition can optionally set one or more decoders. Select from one of these decoder types:
2222

23-
- `Decode_Field`: If the content can be decoded in a structured message, append
23+
- `decode_field`: If the content can be decoded in a structured message, append
2424
the structured message (keys and values) to the original log message.
25-
- `Decode_Field_As`: Any decoded content (unstructured or structured) will be
25+
- `decode_field_as`: Any decoded content (unstructured or structured) will be
2626
replaced in the same key/value, and no extra keys are added.
2727

28-
For example, the predefined Docker parser has the following definition:
28+
Each line in the parser with a key `decode_field` instructs the parser to apply a specific decoder on a given field. Optionally, it offers the option to take an extra action if the decoder doesn't succeed.
29+
30+
## Configuration parameters
31+
32+
### Decoder options
33+
34+
| Name | Description |
35+
| ---- | ----------- |
36+
| `escaped` | Decode an escaped string. |
37+
| `escaped_utf8` | Decode a UTF-8 escaped string. |
38+
| `json` | Handle the field content as a JSON map. If the decoder finds a JSON map, it replaces the content with a structured map. |
39+
| `mysql_quoted` | Decode a MySQL-quoted string. |
40+
41+
### Optional actions
42+
43+
If a decoder fails to decode the field, or if you want to try another decoder, you can define an optional action. Available actions are:
44+
45+
| Name | Description |
46+
| -----| ----------- |
47+
| `do_next` | If the decoder succeeded or failed, apply the next decoder in the list for the same field. |
48+
| `try_next` | If the decoder failed, apply the next decoder in the list for the same field. |
49+
50+
Actions are affected by some restrictions:
51+
52+
- `decode_field_as`: If successful, another decoder of the same type and the same field can be applied only if the data continues being an unstructured message (raw text).
53+
- `decode_field`: If successful, can be applied only once for the same field. `decode_field` is intended to decode a structured message.
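
A minimal sketch of a single decoder in a YAML parser definition follows. It assumes the YAML value takes the same `<decoder_type> <field_name>` string as the classic syntax, and the parser name is hypothetical. The decoder, field, and time format mirror the `escaped_utf8` example in this file.

```yaml
parsers:
  - name: example_json           # hypothetical parser name
    format: json
    time_key: time
    time_format: '%Y-%m-%dT%H:%M:%S %z'
    # <decoder_type> <field_name>: decode the `log` value in place,
    # without adding extra keys to the record
    decode_field_as: escaped_utf8 log
```

Using `decode_field_as` rather than `decode_field` keeps the record shape unchanged, which fits the case where `log` stays raw text after decoding.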
54+
55+
## Examples
56+
57+
### Docker parser
58+
59+
The predefined Docker parser has the following definition:
2960

3061
{% tabs %}
3162
{% tab title="parsers.yaml" %}
@@ -60,33 +91,7 @@ parsers:
6091
{% endtab %}
6192
{% endtabs %}
6293

63-
Each line in the parser with a key `Decode_Field` instructs the parser to apply a specific decoder on a given field. Optionally, it offers the option to take an extra action if the decoder doesn't succeed.
64-
65-
### Decoder options
66-
67-
| Name | Description |
68-
| -------------- | ----------- |
69-
| `json` | Handle the field content as a JSON map. If the decoder finds a JSON map, it replaces the content with a structured map. |
70-
| `escaped` | Decode an escaped string. |
71-
| `escaped_utf8` | Decode a UTF8 escaped string. |
72-
73-
### Optional actions
74-
75-
If a decoder fails to decode the field, or if you want to try another decoder, you can define an optional action. Available actions are:
76-
77-
| Name | Description |
78-
| -----| ----------- |
79-
| `try_next` | If the decoder failed, apply the next decoder in the list for the same field. |
80-
| `do_next` | If the decoder succeeded or failed, apply the next decoder in the list for the same field. |
81-
82-
Actions are affected by some restrictions:
83-
84-
- `Decode_Field_As`: If successful, another decoder of the same type and the same field can be applied only if the data continues being an unstructured message (raw text).
85-
- `Decode_Field`: If successful, can be applied only once for the same field. `Decode_Field` is intended to decode a structured message.
86-
87-
### Examples
88-
89-
#### `escaped_utf8`
94+
### `escaped_utf8`
9095

9196
Example input from `/path/to/log.log`:
9297

@@ -172,7 +177,7 @@ parsers:
172177
Format json
173178
Time_Key time
174179
Time_Format %Y-%m-%dT%H:%M:%S %z
175-
Decode_Field_as escaped_utf8 log
180+
Decode_Field_As escaped_utf8 log
176181
```
177182

178183
{% endtab %}

pipeline/parsers/json.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,8 @@
22

33
Use the _JSON_ parser format to create custom parsers compatible with JSON data. This format transforms JSON logs by converting them to internal binary representations.
44

5+
For available configuration parameters, see [Configuring custom parsers](configuring-parser.md).
6+
57
For example, the default parsers configuration file includes a parser for parsing Docker logs (when the Tail input plugin is used):
68

79
{% tabs %}

pipeline/parsers/logfmt.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,16 @@
22

33
Use the _logfmt_ parser format to create custom parsers compatible with [logfmt](https://pkg.go.dev/github.com/kr/logfmt?utm_source=godoc) data.
44

5+
For available configuration parameters, see [Configuring custom parsers](configuring-parser.md).
6+
7+
## Configuration parameters
8+
9+
The `logfmt` parser supports the following format-specific configuration parameter:
10+
11+
| Key | Description | Default |
12+
| --- | ----------- | ------- |
13+
| `logfmt_no_bare_keys` | If enabled, the parser rejects log entries where keys don't have associated values (bare keys). | `false` |
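
To make the effect of `logfmt_no_bare_keys` concrete, here is a minimal sketch of a parser that enables it. The parser name and the sample log lines in the comments are hypothetical.

```yaml
parsers:
  - name: logfmt_strict          # hypothetical parser name
    format: logfmt
    # Reject entries that contain keys without values.
    #   level=info msg=started   -> every key has a value
    #   level=info debug         -> `debug` is a bare key, so the entry is rejected
    logfmt_no_bare_keys: true
```

Leaving the option at its default of `false` keeps the more permissive behavior, where bare keys don't cause an entry to be rejected.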
14+
515
The following example shows a custom parser that uses the `logfmt` format:
616

717
{% tabs %}

pipeline/parsers/ltsv.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,8 @@
22

33
Use the _LTSV_ parser format to create custom parsers compatible with [Labeled Tab-separated Values (LTSV)](http://ltsv.org/) data.
44

5+
For available configuration parameters, see [Configuring custom parsers](configuring-parser.md).
6+
57
LTSV is a variant of the Tab-separated Values (TSV) format. Each record in an LTSV file is represented as a single line. Each field is separated by a tab and has a label and a value. The label and its value are separated by a colon (`:`).
68

79
Here is an example how to use this format in the Apache access log.

pipeline/parsers/multiline-parsing.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -40,6 +40,8 @@ To define a custom multiline parser, add an entry to the [`multiline_parsers` se
4040
| -------- | ----------- | ------- |
4141
| `flush_timeout` | Timeout in milliseconds to flush a non-terminated multiline buffer. | `4s` |
4242
| `key_content` | For an incoming structured message, specify the key that contains the data that should be processed by the regular expression and possibly concatenated. | _none_ |
43+
| `key_group` | For an incoming structured message, specify the key used as a grouping identifier. Lines with different values for this key are treated as separate streams. For example, Docker and CRI logs use the `stream` field to distinguish `stdout` from `stderr`. | _none_ |
44+
| `key_pattern` | For an incoming structured message, specify an alternative key to apply matching rules against, separate from `key_content`. Use to match against one field while concatenating content from another. | _none_ |
4345
| `match_string` | String to match against for `endswith` or `equal` types. Not used for `regex` type. | _none_ |
4446
| `name` | Specify a unique name for the multiline parser definition. A good practice is to prefix the name with the word `multiline_` to avoid confusion with normal parser definitions. | _none_ |
4547
| `negate` | Negate the pattern matching result. When set to `true`, a non-matching line is treated as matching. | `false` |
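
The following sketch shows where `key_group` could sit in a multiline parser definition. The parser name and both regular expressions are hypothetical. The `type` and `rules` keys come from the wider multiline parser documentation rather than from the rows shown in this diff.

```yaml
multiline_parsers:
  - name: multiline_example      # hypothetical name, prefixed with multiline_
    type: regex
    flush_timeout: 1000          # milliseconds
    key_content: log             # field whose lines are matched and concatenated
    key_group: stream            # stdout and stderr are tracked as separate streams
    rules:
      # Hypothetical patterns: a line starting with a date opens a record,
      # indented "at ..." lines continue it.
      - state: start_state
        regex: '/^\d{4}-\d{2}-\d{2}/'
        next_state: cont
      - state: cont
        regex: '/^\s+at /'
        next_state: cont
```

Grouping by `stream` is meant to keep interleaved `stdout` and `stderr` lines from being concatenated into the same record.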

pipeline/parsers/regular-expression.md

Lines changed: 7 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -8,17 +8,19 @@ Use [Tail multiline](../inputs/tail.md#multiline) when you need to support regul
88

99
This parser uses Onigmo, which is a backtracking regular expression's engine. When using complex regular expression patterns, Onigmo can take a long time to perform pattern matching. This can cause a [regular expression denial of service (ReDoS)](https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS).
1010

11-
{% end hint %}
11+
{% endhint %}
1212

1313
Setting the format to regular expressions requires a `regex` configuration key.
1414

15+
For available configuration parameters, see [Configuring custom parsers](configuring-parser.md).
16+
1517
## Configuration parameters
1618

17-
The `regex` parser supports the following configuration parameters:
19+
The `regex` parser supports the following format-specific configuration parameter:
1820

19-
| Key | Description | Default Value |
20-
| --- | ----------- | ------------- |
21-
| `Skip_Empty_Values` | If enabled, the parser ignores empty value of the record. | `True` |
21+
| Key | Description | Default |
22+
| --- | ----------- | ------- |
23+
| `skip_empty_values` | If enabled, the parser ignores empty values of the record. | `true` |
2224

2325
Fluent Bit uses the [Onigmo](https://github.com/k-takata/Onigmo) regular expression library in Ruby mode.
2426
