Commit 5dac612

docs: outputs: add ZeroBus output plugin documentation

- Introduced new ZeroBus output plugin for sending logs to Databricks via the ZeroBus streaming ingestion interface.
- Updated SUMMARY.md to include ZeroBus in the list of output plugins.
- Provided detailed configuration parameters, usage examples, and record format transformations for the ZeroBus plugin.

Signed-off-by: mats <mats.kazuki@gmail.com>

1 parent 3da2d5f commit 5dac612

2 files changed: 156 additions and 0 deletions

SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -221,6 +221,7 @@
 * [UDP](pipeline/outputs/udp.md)
 * [Vivo Exporter](pipeline/outputs/vivo-exporter.md)
 * [WebSocket](pipeline/outputs/websocket.md)
+* [ZeroBus](pipeline/outputs/zerobus.md)
 
 ## Stream processing

pipeline/outputs/zerobus.md

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
---
description: Send logs to Databricks via ZeroBus
---

# ZeroBus

{% hint style="info" %}
**Supported event types:** `logs`
{% endhint %}

The _ZeroBus_ output plugin lets you ingest log records into a [Databricks](https://www.databricks.com/) table through the ZeroBus streaming ingestion interface. Records are converted to JSON and sent via the ZeroBus SDK using gRPC.

Before you begin, you need a Databricks workspace with a Unity Catalog table configured for ZeroBus ingestion, and an OAuth2 service principal (client ID and client secret) with appropriate permissions.

## Configuration parameters

| Key | Description | Default |
| :--- | :--- | :--- |
| `endpoint` | ZeroBus gRPC endpoint URL. If no scheme is provided, `https://` is automatically prepended. | _none_ |
| `workspace_url` | Databricks workspace URL. If no scheme is provided, `https://` is automatically prepended. | _none_ |
| `table_name` | Fully qualified Unity Catalog table name in `catalog.schema.table` format. | _none_ |
| `client_id` | OAuth2 client ID for authentication. | _none_ |
| `client_secret` | OAuth2 client secret for authentication. | _none_ |
| `add_tag` | If enabled, the Fluent Bit tag is added as a `_tag` field in each record. | `true` |
| `time_key` | Key name for the injected timestamp. The timestamp is formatted as RFC 3339 with nanosecond precision. Set to an empty string to disable timestamp injection. | `_time` |
| `log_key` | Comma-separated list of record keys to include in the output. When unset, all keys are included. | _none_ |
| `raw_log_key` | If set, the full original record (before filtering by `log_key`) is stored as a JSON string under this key name. | _none_ |
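
The scheme handling for `endpoint` and `workspace_url`, and the expected shape of `table_name`, can be illustrated with a minimal Python sketch. This isn't the plugin's source code. The function names `with_default_scheme` and `split_table_name` are hypothetical, and the check for exactly three non-empty name parts is an assumption based on the documented `catalog.schema.table` format.

```python
def with_default_scheme(url: str) -> str:
    """Return the URL unchanged if it has a scheme, else prepend https://."""
    # Documented behavior: a value without a scheme gets https:// prepended.
    return url if "://" in url else "https://" + url


def split_table_name(table_name: str) -> tuple[str, str, str]:
    """Split a catalog.schema.table name into its three parts.

    Assumption for illustration: anything other than three non-empty,
    dot-separated parts is rejected.
    """
    parts = table_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected catalog.schema.table, got {table_name!r}")
    return parts[0], parts[1], parts[2]


# Usage with placeholder values
print(with_default_scheme("example.cloud.databricks.com"))
print(split_table_name("catalog.schema.logs"))
```

Running a check like this before deployment might catch a malformed table name earlier than a failed ingestion attempt would.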

## Get started

To send log records to Databricks via ZeroBus, configure the plugin with your ZeroBus endpoint, workspace URL, table name, and OAuth2 credentials.

### Configuration file

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:
  inputs:
    - name: tail
      tag: app.logs
      path: /var/log/app/*.log

  outputs:
    - name: zerobus
      match: '*'
      endpoint: https://<workspace-id>.zerobus.<region>.cloud.databricks.com
      workspace_url: https://<instance-name>.cloud.databricks.com
      table_name: catalog.schema.logs
      client_id: <your-client-id>
      client_secret: <your-client-secret>
```

{% endtab %}
{% tab title="fluent-bit.conf" %}

```text
[INPUT]
  Name tail
  Tag  app.logs
  Path /var/log/app/*.log

[OUTPUT]
  Name          zerobus
  Match         *
  Endpoint      https://<workspace-id>.zerobus.<region>.cloud.databricks.com
  Workspace_Url https://<instance-name>.cloud.databricks.com
  Table_Name    catalog.schema.logs
  Client_Id     <your-client-id>
  Client_Secret <your-client-secret>
```

{% endtab %}
{% endtabs %}

### Record format

Each log record is converted to a JSON object before ingestion. The plugin applies the following transformations in order:

1. If `raw_log_key` is set, the full original record is captured as a JSON string before any filtering.
2. If `log_key` is set, only the specified keys are included in the output record.
3. If `raw_log_key` is set, the captured JSON string is injected under the configured key (unless a key with that name already exists).
4. If `time_key` is set, a timestamp in RFC 3339 format with nanosecond precision (for example, `2024-01-15T10:30:00.123456789Z`) is injected (unless a key with that name already exists).
5. If `add_tag` is enabled, the Fluent Bit tag is injected as `_tag` (unless a key with that name already exists).
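
The steps above can be sketched in Python to make the ordering concrete. This is an illustration of the documented behavior, not the plugin's implementation. The `transform` and `format_rfc3339_nano` functions are hypothetical, and details the list doesn't specify (such as key ordering after filtering, or whitespace handling in `log_key`) are assumptions.

```python
import json
import time


def format_rfc3339_nano(seconds: int, nanoseconds: int) -> str:
    """Format a UTC timestamp as RFC 3339 with nine fractional digits."""
    base = time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(seconds))
    return f"{base}.{nanoseconds:09d}Z"


def transform(record, tag, seconds, nanoseconds, *,
              log_key=None, raw_log_key=None, time_key="_time", add_tag=True):
    """Apply the five documented steps to one record (a dict)."""
    # Step 1: capture the full original record before any filtering.
    raw = json.dumps(record, separators=(",", ":")) if raw_log_key else None

    # Step 2: keep only the keys listed in log_key, if set.
    # Assumption: surviving keys keep their order from the original record.
    if log_key:
        wanted = {key.strip() for key in log_key.split(",")}
        out = {k: v for k, v in record.items() if k in wanted}
    else:
        out = dict(record)

    # Step 3: inject the captured JSON string unless the key already exists.
    if raw_log_key and raw_log_key not in out:
        out[raw_log_key] = raw

    # Step 4: inject the timestamp unless disabled or the key already exists.
    if time_key and time_key not in out:
        out[time_key] = format_rfc3339_nano(seconds, nanoseconds)

    # Step 5: inject the tag unless disabled or _tag already exists.
    if add_tag and "_tag" not in out:
        out["_tag"] = tag

    return out


# Usage: 1705314600 is 2024-01-15T10:30:00 UTC.
record = {"level": "info", "message": "request completed", "status": 200}
print(json.dumps(transform(record, "app.logs", 1705314600, 123456789), indent=2))
```

Because each injection step checks for an existing key first, a record that already carries `_time` or `_tag` keeps its own value in this sketch.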

For example, given the following input record:

```json
{"level": "info", "message": "request completed", "status": 200}
```

The default configuration produces:

```json
{
  "level": "info",
  "message": "request completed",
  "status": 200,
  "_time": "2024-01-15T10:30:00.123456789Z",
  "_tag": "app.logs"
}
```

### Filtering keys

Use `log_key` to select specific fields from the record. Combine it with `raw_log_key` to send a filtered record while preserving the original data:

{% tabs %}
{% tab title="fluent-bit.yaml" %}

```yaml
pipeline:
  outputs:
    - name: zerobus
      match: '*'
      endpoint: https://<workspace-id>.zerobus.<region>.cloud.databricks.com
      workspace_url: https://<instance-name>.cloud.databricks.com
      table_name: catalog.schema.logs
      client_id: <your-client-id>
      client_secret: <your-client-secret>
      log_key: level,message
      raw_log_key: _raw
```

{% endtab %}
{% tab title="fluent-bit.conf" %}

```text
[OUTPUT]
  Name          zerobus
  Match         *
  Endpoint      https://<workspace-id>.zerobus.<region>.cloud.databricks.com
  Workspace_Url https://<instance-name>.cloud.databricks.com
  Table_Name    catalog.schema.logs
  Client_Id     <your-client-id>
  Client_Secret <your-client-secret>
  Log_Key       level,message
  Raw_Log_Key   _raw
```

{% endtab %}
{% endtabs %}

This produces:

```json
{
  "level": "info",
  "message": "request completed",
  "_raw": "{\"level\":\"info\",\"message\":\"request completed\",\"status\":200}",
  "_time": "2024-01-15T10:30:00.123456789Z",
  "_tag": "app.logs"
}
```
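
Because `_raw` holds the original record as a JSON string rather than a nested object, a consumer that needs the dropped fields has to parse it. The following Python sketch shows one way to do that on a row shaped like the output above. The `restore_original` helper is hypothetical and not part of Fluent Bit or Databricks.

```python
import json


def restore_original(row: dict, raw_key: str = "_raw"):
    """Parse the JSON string stored under raw_key back into a dict.

    Returns None when the key is missing or doesn't hold a string.
    """
    raw = row.get(raw_key)
    if not isinstance(raw, str):
        return None
    return json.loads(raw)


# Usage with the filtered record from this section
row = {
    "level": "info",
    "message": "request completed",
    "_raw": '{"level":"info","message":"request completed","status":200}',
    "_time": "2024-01-15T10:30:00.123456789Z",
    "_tag": "app.logs",
}
original = restore_original(row)
print(original["status"])
```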
