Skip to content

Commit 9f23410

Browse files
committed
Merge branch 'refs/heads/main' into feat/async-ack
# Conflicts: # Cargo.lock
2 parents 1914b79 + 54087be commit 9f23410

53 files changed

Lines changed: 10861 additions & 293 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

Cargo.lock

Lines changed: 302 additions & 287 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.toml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
[workspace.package]
2-
version = "0.4.0-rc1"
2+
version = "0.5.0"
33
edition = "2021"
44
description = "High-performance Rust flow processing engine"
55
authors = ["chenquan <chenquan.dev@gmail.com>"]
@@ -46,7 +46,7 @@ flume = "=0.11"
4646
# Sql
4747
sqlx = { version = "0.8", features = ["mysql", "postgres", "runtime-tokio", "tls-native-tls"] }
4848

49-
tempfile = "3.22.0"
49+
tempfile = "3.23.0"
5050
mockall = "0.12"
5151
arkflow-core = { path = "crates/arkflow-core" }
5252
arkflow-plugin = { path = "crates/arkflow-plugin" }

crates/arkflow-plugin/Cargo.toml

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -70,14 +70,14 @@ arkflow-core = { workspace = true }
7070
sqlx = { workspace = true }
7171

7272
# Websocket
73-
tokio-tungstenite = { version = "0.27", features = ["native-tls"] }
73+
tokio-tungstenite = { version = "0.28", features = ["native-tls"] }
7474

7575
# NATS
76-
async-nats = "0.43"
76+
async-nats = "0.44"
7777

7878
# Pulsar
79-
pulsar = "6.3"
80-
rand = "0.8"
79+
pulsar = "6.4"
80+
rand = "0.9"
8181

8282

8383
# modbus
Lines changed: 219 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,219 @@
1+
---
2+
sidebar_position: 1
3+
logo: ./logo.svg
4+
---
5+
6+
# Introduction
7+
8+
ArkFlow is a high-performance Rust stream processing engine that provides powerful data stream processing capabilities, supporting various input/output sources and processors.
9+
10+
:::tip
11+
Currently, ArkFlow is **stateless**, but it can still help you solve most data engineering problems. It implements transaction-based resilience and features backpressure.
12+
13+
Therefore, when connected to input and output sources that provide at-least-once semantics, it can guarantee at-least-once delivery without needing to retain messages in transit.
14+
15+
In the future, we will gradually improve the functions of ArkFlow to enable it to have transactional and state management capabilities, so as to better meet various data processing needs.
16+
17+
:::
18+
19+
20+
logo([Logo Usage Guidelines](./about-logo)):
21+
22+
![ArkFlow logo](logo.svg)
23+
24+
25+
## Core Features
26+
27+
- **High Performance**: Built on Rust and Tokio async runtime, delivering exceptional performance and low latency
28+
- **Multiple Data Sources**: Support for Kafka, MQTT, HTTP, files, and other input/output sources
29+
- **Powerful Processing**: Built-in SQL queries, JSON processing, Protobuf encoding/decoding, batch processing, and other processors
30+
- **Extensibility**: Modular design, easy to extend with new input, output, and processor components
31+
32+
## Installation
33+
34+
### Building from Source
35+
36+
```bash
37+
# Clone repository
38+
git clone https://github.com/arkflow-rs/arkflow.git
39+
cd arkflow
40+
41+
# Build project
42+
cargo build --release
43+
44+
# Run tests
45+
cargo test
46+
```
47+
48+
## Quick Start
49+
50+
1. Create a configuration file `config.yaml`:
51+
52+
```yaml
53+
logging:
54+
level: info
55+
streams:
56+
- input:
57+
type: "generate"
58+
context: '{ "timestamp": 1625000000000, "value": 10, "sensor": "temp_1" }'
59+
interval: 1s
60+
batch_size: 10
61+
buffer:
62+
type: "memory"
63+
capacity: 10
64+
timeout: 10s
65+
pipeline:
66+
thread_num: 4
67+
processors:
68+
- type: "json_to_arrow"
69+
- type: "sql"
70+
query: "SELECT * FROM flow WHERE value >= 10"
71+
72+
output:
73+
type: "stdout"
74+
error_output:
75+
type: "stdout"
76+
```
77+
78+
2. Run ArkFlow:
79+
80+
```bash
81+
./target/release/arkflow --config config.yaml
82+
```
83+
84+
## Configuration Guide
85+
86+
ArkFlow uses YAML format configuration files and supports the following main configuration items:
87+
88+
### Top-level Configuration
89+
90+
```yaml
91+
logging:
92+
level: info # Log levels: debug, info, warn, error
93+
94+
streams: # Stream definition list
95+
- input: # Input configuration
96+
# ...
97+
pipeline: # Pipeline configuration
98+
# ...
99+
output: # Output configuration
100+
# ...
101+
error_output: # Error output configuration
102+
# ...
103+
buffer: # Buffer configuration
104+
# ...
105+
```
106+
107+
108+
### Input Components
109+
110+
ArkFlow supports multiple input sources:
111+
112+
- **Kafka**: Read data from Kafka topics
113+
- **MQTT**: Subscribe to messages from MQTT topics
114+
- **HTTP**: Receive data via HTTP
115+
- **File**: Reading data from files(Csv,Json, Parquet, Avro, Arrow) using SQL
116+
- **Generator**: Generate test data
117+
- **Database**: Query data from databases(MySQL, PostgreSQL, SQLite, Duckdb)
118+
- **Nats**: Subscribe to messages from Nats topics
119+
- **Redis**: Subscribe to messages from Redis channels or lists
120+
- **Websocket**: Subscribe to messages from WebSocket connections
121+
122+
Example:
123+
124+
```yaml
125+
input:
126+
type: kafka
127+
brokers:
128+
- localhost:9092
129+
topics:
130+
- test-topic
131+
consumer_group: test-group
132+
client_id: arkflow
133+
start_from_latest: true
134+
```
135+
136+
### Processors
137+
138+
ArkFlow provides multiple data processors:
139+
140+
- **JSON**: JSON data processing and transformation
141+
- **SQL**: Process data using SQL queries
142+
- **Protobuf**: Protobuf encoding/decoding
143+
- **Batch Processing**: Process messages in batches
144+
- **Vrl**: Process data using [VRL](https://vector.dev/docs/reference/vrl/)
145+
146+
Example:
147+
148+
```yaml
149+
pipeline:
150+
thread_num: 4
151+
processors:
152+
- type: json_to_arrow
153+
- type: sql
154+
query: "SELECT * FROM flow WHERE value >= 10"
155+
```
156+
157+
### Output Components
158+
159+
ArkFlow supports multiple output targets:
160+
161+
- **Kafka**: Write data to Kafka topics
162+
- **MQTT**: Publish messages to MQTT topics
163+
- **HTTP**: Send data via HTTP
164+
- **Standard Output**: Output data to the console
165+
- **Drop**: Discard data
166+
- **Nats**: Publish messages to Nats topics
167+
168+
Example:
169+
170+
```yaml
171+
output:
172+
type: kafka
173+
brokers:
174+
- localhost:9092
175+
topic:
176+
type: value
177+
value: test-topic
178+
client_id: arkflow-producer
179+
```
180+
### Error Output Components
181+
ArkFlow supports multiple error output targets:
182+
- **Kafka**: Write error data to Kafka topics
183+
- **MQTT**: Publish error messages to MQTT topics
184+
- **HTTP**: Send error data via HTTP
185+
- **Standard Output**: Output error data to the console
186+
- **Drop**: Discard error data
187+
- **Nats**: Publish messages to Nats topics
188+
189+
Example:
190+
191+
```yaml
192+
error_output:
193+
type: kafka
194+
brokers:
195+
- localhost:9092
196+
topic:
197+
type: value
198+
value: error-topic
199+
client_id: error-arkflow-producer
200+
```
201+
202+
203+
### Buffer Components
204+
205+
ArkFlow provides buffer capabilities to handle backpressure and temporary storage of messages:
206+
207+
- **Memory Buffer**: Memory buffer, for high-throughput scenarios and window aggregation.
208+
- **Session Window**: The Session Window buffer component provides a session-based message grouping mechanism where messages are grouped based on activity gaps. It implements a session window that closes after a configurable period of inactivity.
209+
- **Sliding Window**: The Sliding Window buffer component provides a time-based windowing mechanism for processing message batches. It implements a sliding window algorithm with configurable window size, slide interval and slide size.
210+
- **Tumbling Window**: The Tumbling Window buffer component provides a fixed-size, non-overlapping windowing mechanism for processing message batches. It implements a tumbling window algorithm with configurable interval settings.
211+
212+
Example:
213+
214+
```yaml
215+
buffer:
216+
type: memory
217+
capacity: 10000 # Maximum number of messages to buffer
218+
timeout: 10s # Maximum time to buffer messages
219+
```
Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,68 @@
1+
# ArkFlow Logo Usage Guidelines
2+
3+
## 1. Introduction and Purpose
4+
5+
Welcome to the ArkFlow Logo Usage Guidelines! The ArkFlow Logo (hereinafter referred to as the "Logo") is an important symbol of our project's identity and core values. It represents the innovation and reliability of the ArkFlow project, as well as our commitment to the open-source community.
6+
7+
To help you use the ArkFlow Logo correctly, promote the ArkFlow project, and jointly maintain the integrity and reputation of the ArkFlow brand, we have established the following usage guidelines. We encourage community members, contributors, partners, and the media to use the ArkFlow Logo in compliance with these guidelines.
8+
9+
## 2. Rights Statement
10+
11+
The ArkFlow name and Logo are trademarks of the ArkFlow project team. We reserve all rights associated with this Logo, including but not limited to copyright and trademark rights.
12+
13+
We are happy to grant you permission to use these marks provided you adhere to these guidelines.
14+
15+
## 3. Logo Usage License
16+
17+
We hope the ArkFlow Logo will be used widely and correctly to help promote the project.
18+
19+
**Permitted Uses:**
20+
21+
* On your website, blog, or social media posts to link to the official ArkFlow project address: [https://arkflow-rs.com/](https://arkflow-rs.com/).
22+
* In introductions, reviews, tutorials, news reports, or technical seminars to refer to or discuss the ArkFlow project, provided it does not create confusion or imply official endorsement or sponsorship.
23+
* In non-commercial promotional materials for your open-source project or organization built using ArkFlow, to indicate your project's association with ArkFlow (however, please note that you cannot use the ArkFlow Logo as the primary identifier for your project or organization, or imply that your project is an official ArkFlow product).
24+
* When showcasing your contributions to, support for, or your use of the ArkFlow project.
25+
* In community-organized non-commercial events or meetups related to ArkFlow.
26+
27+
**Prohibited Uses (unless prior written permission is obtained from ArkFlow team):**
28+
29+
* **Do not modify the Logo:** This includes, but is not limited to, changing the Logo's colors, proportions, fonts, design elements, or adding any other text or graphics to it. Always use the official Logo files provided by us.
30+
* **Do not use it as your own primary brand identifier:** Do not use the ArkFlow Logo as the primary logo or core brand element for your own products, services, company, domain name, social media account name, or project.
31+
* **Do not imply official endorsement:** Do not use the Logo in any way that suggests official endorsement, sponsorship, affiliation, or backing of your products, services, events, or commercial activities by ArkFlow or ArkFlow team.
32+
* **Do not use for commercial products or paid services:** Do not use the Logo directly on commercial products (e.g., printing the Logo on products for sale that require payment) or for the brand promotion of paid services. For commercial cooperation, please contact us.
33+
* **Do not use in negative or inappropriate contexts:** Do not use the Logo in connection with any defamatory, vulgar, illegal, misleading, controversial content, or in any situation that could harm the reputation of the ArkFlow project or the image of ArkFlow team.
34+
* **Do not register as a trademark:** You may not attempt to register the ArkFlow Logo or any similar marks as your own trademark.
35+
36+
**Uses Requiring Prior Written Permission:**
37+
38+
* Any commercial use, including but not limited to merchandise sales and promotion of paid services.
39+
* Any declaration of a formal partnership with a third party.
40+
* Any use of the Logo not explicitly permitted by these guidelines that you wish to undertake.
41+
42+
## 4. Correct Usage Specifications for the Logo
43+
44+
To ensure the consistency and recognizability of the Logo, please adhere to the following visual specifications:
45+
46+
* **Official Logo Versions:** Always obtain the latest version of the Logo files from official ArkFlow channels. We typically provide SVG (vector) and PNG (bitmap) formats.
47+
* **Color:** Please use the officially designated colors for the Logo. If you need to use it on a special background, please consult us to see if there are applicable reversed (white) or monochrome versions. Do not change the colors yourself.
48+
* **Clear Space:** Sufficient empty space (clear space) should be maintained around the Logo to avoid overcrowding with other text, graphics, or borders.
49+
50+
* **Avoid Improper Operations:**
51+
* Do not stretch, compress, rotate, or tilt the Logo.
52+
* Do not add strokes, shadows, gradients (unless part of the Logo's original design), or other visual effects to the Logo.
53+
* Do not use the Logo on overly cluttered backgrounds or backgrounds that make the Logo difficult to discern.
54+
55+
## 5. Disclaimer and Reservation of Rights
56+
57+
These ArkFlow Logo Usage Guidelines may be updated from time to time. The latest version will be published on the official ArkFlow website; please check it regularly. Your continued use of the Logo after an update to the guidelines signifies your acceptance of the revised terms.
58+
ArkFlow team reserves the right to the final interpretation of the use of the ArkFlow Logo and has the right to demand correction or cessation of any Logo usage that we deem non-compliant with these guidelines or detrimental to the interests of the ArkFlow project.
59+
The permission granted to you to use the Logo is non-exclusive and may be revoked by ArkFlow team at any time at their sole discretion.
60+
61+
## 6. Contact Information
62+
63+
If you have any questions about the use of the ArkFlow Logo, wish to apply for uses not explicitly permitted by these guidelines, or have any commercial cooperation intentions, please contact Chen Quan via:
64+
Email: chenquan.dev@gmail.com
65+
66+
We thank you for your support and attention to the ArkFlow project! Correct use of the ArkFlow Logo will help us jointly build a strong and respected brand.
67+
68+
---
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
{
2+
"label": "Inputs",
3+
"link": {
4+
"title": "Inputs",
5+
"type": "generated-index",
6+
"description": "Input components are responsible for consuming data from various sources such as Kafka, MQTT, HTTP, and Memory. Each input component has its own configuration options that can be customized according to your needs."
7+
}
8+
}
Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
# Generate
2+
3+
Generate is an input component that generates test data.
4+
5+
## Configuration
6+
7+
### **context**
8+
9+
The context is a JSON object that will be used to generate the data. The JSON object will be serialized to bytes and sent as message content.
10+
11+
type: `string`
12+
13+
optional: `true`
14+
15+
### **count**
16+
17+
The total number of data points to generate. If not specified, the generator will run indefinitely until manually stopped.
18+
19+
type: `integer`
20+
21+
optional: `true`
22+
23+
### **interval**
24+
25+
The interval is the time between each data point.
26+
27+
type: `string`
28+
29+
example: `1ms`, `1s`, `1m`, `1h`, `1d`
30+
31+
optional: `false`
32+
33+
### **batch_size**
34+
35+
The batch size is the number of data points to generate at each interval. If the remaining count is less than batch_size, only the remaining messages will be sent.
36+
37+
type: `integer`
38+
39+
optional: `false`
40+
41+
## Examples
42+
43+
```yaml
44+
- input:
45+
type: "generate"
46+
context: '{ "timestamp": 1625000000000, "value": 10, "sensor": "temp_1" }'
47+
interval: 1ms
48+
batch_size: 1000
49+
count: 10000 # Optional: generate 10000 messages in total
50+
```

0 commit comments

Comments
 (0)