Commit 8fc1d78

Ignacio Van Droogenbroeck and claude committed
Update to MessagePack columnar format and latest performance numbers
- Updated write performance from 2.01M to 2.32M RPS
- Changed MessagePack examples to columnar format (9.7x faster)
- Added batch example for multiple rows
- Updated write benchmarks with columnar vs row format comparison
- Added latency improvements (20x lower p50, 21x lower p95, 26x lower p99)

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 7e69bb5 commit 8fc1d78

2 files changed

Lines changed: 62 additions & 29 deletions

docs/getting-started.md

Lines changed: 44 additions & 19 deletions
@@ -16,7 +16,7 @@ This guide will get you up and running with Arc in 5 minutes.
 
 ### Option 1: Native Deployment (Recommended)
 
-Native deployment provides **3.5x faster performance** than Docker (2.01M RPS vs 570K RPS).
+Native deployment provides **4x faster performance** than Docker (2.32M RPS vs 570K RPS).
 
 ```bash
 # Clone the repository
@@ -77,7 +77,9 @@ export ARC_TOKEN="your-token-here"
 
 ## Write Your First Data
 
-### Using MessagePack (Recommended - 7.9x Faster)
+### Using MessagePack Columnar (Recommended - 9.7x Faster)
+
+MessagePack columnar format provides the best performance (2.32M RPS vs 240K Line Protocol).
 
 ```python
 import msgpack
@@ -87,24 +89,18 @@ import os
 
 token = os.getenv("ARC_TOKEN")
 
-# Prepare data
+# Columnar format - arrange data by columns (fastest)
 data = {
-    "batch": [
-        {
-            "m": "cpu",  # measurement name
-            "t": int(datetime.now().timestamp() * 1000),  # timestamp (milliseconds)
-            "h": "server01",  # host tag
-            "tags": {  # additional tags
-                "region": "us-east",
-                "dc": "aws"
-            },
-            "fields": {  # metric values
-                "usage_idle": 95.0,
-                "usage_user": 3.2,
-                "usage_system": 1.8
-            }
-        }
-    ]
+    "m": "cpu",  # measurement name
+    "columns": {
+        "time": [int(datetime.now().timestamp() * 1000)],  # timestamps
+        "host": ["server01"],  # host tag
+        "region": ["us-east"],  # region tag
+        "dc": ["aws"],  # dc tag
+        "usage_idle": [95.0],  # metric value
+        "usage_user": [3.2],  # metric value
+        "usage_system": [1.8]  # metric value
+    }
 }
 
 # Send data
@@ -123,6 +119,35 @@ else:
     print(f"Error {response.status_code}: {response.text}")
 ```
 
+**Batch multiple rows for even better performance:**
+
+```python
+# Send multiple rows in one request
+data = {
+    "m": "cpu",
+    "columns": {
+        "time": [
+            int(datetime.now().timestamp() * 1000),
+            int(datetime.now().timestamp() * 1000),
+            int(datetime.now().timestamp() * 1000)
+        ],
+        "host": ["server01", "server02", "server03"],
+        "usage_idle": [95.0, 87.5, 92.3],
+        "usage_user": [3.2, 8.1, 5.4],
+        "usage_system": [1.8, 4.4, 2.3]
+    }
+}
+
+response = requests.post(
+    "http://localhost:8000/write/v2/msgpack",
+    headers={
+        "Authorization": f"Bearer {token}",
+        "Content-Type": "application/msgpack"
+    },
+    data=msgpack.packb(data)
+)
+```
+
 ### Using InfluxDB Line Protocol
 
 ```bash

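Reviewer note: the getting-started diff above replaces the row-style `"batch"` payload with a single `"columns"` mapping. For callers that still build row records, the reshaping might look roughly like the sketch below. The helper name `rows_to_columnar` is hypothetical and not part of Arc. The row keys (`t`, `h`, `tags`, `fields`) and the target shape (`m`, `columns`) are taken from the two examples in the diff. Filling missing values with `None` is an assumption, and whether the server accepts nulls in a column is not stated in this commit.

```python
from typing import Any


def rows_to_columnar(measurement: str, rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Reshape legacy row records into one columnar payload (illustrative only)."""
    flat_rows = []
    for row in rows:
        # Flatten each row: timestamp, optional host tag, then tags and fields.
        flat: dict[str, Any] = {"time": row["t"]}
        if "h" in row:
            flat["host"] = row["h"]
        flat.update(row.get("tags", {}))
        flat.update(row.get("fields", {}))
        flat_rows.append(flat)

    # Collect column names in order of first appearance.
    names: list[str] = []
    for flat in flat_rows:
        for name in flat:
            if name not in names:
                names.append(name)

    # Assumption: a value missing from a row becomes None so all columns have equal length.
    columns = {name: [flat.get(name) for flat in flat_rows] for name in names}
    return {"m": measurement, "columns": columns}


rows = [
    {"m": "cpu", "t": 1700000000000, "h": "server01",
     "tags": {"region": "us-east"}, "fields": {"usage_idle": 95.0}},
    {"m": "cpu", "t": 1700000001000, "h": "server02",
     "tags": {"region": "us-east"}, "fields": {"usage_idle": 87.5}},
]
payload = rows_to_columnar("cpu", rows)
```

The resulting `payload` could then be packed with `msgpack.packb` and posted as in the batch example from the diff.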
docs/performance/benchmarks.md

Lines changed: 18 additions & 10 deletions
@@ -187,23 +187,31 @@ Arc excels at:
 
 ## Write Performance
 
-Arc achieves exceptional write throughput through MessagePack binary protocol.
+Arc achieves exceptional write throughput through MessagePack columnar binary protocol.
 
-### Write Benchmarks
+### Write Benchmarks - Format Comparison
 
-| Storage Backend | Throughput | p50 Latency | p95 Latency | p99 Latency |
-|----------------|------------|-------------|-------------|-------------|
-| **Local NVMe** | **2.08M RPS** | **13.4ms** | **136ms** | **280ms** |
-| **MinIO** | **2.01M RPS** | **16.6ms** | **147ms** | **318ms** |
-| **Line Protocol** | **240K RPS** | N/A | N/A | N/A |
+| Wire Format | Throughput | p50 Latency | p95 Latency | p99 Latency | Notes |
+|-------------|------------|-------------|-------------|-------------|-------|
+| **MessagePack Columnar** | **2.32M RPS** | **6.75ms** | **39.46ms** | **59.09ms** | Zero-copy passthrough (RECOMMENDED) |
+| **MessagePack Row** | **908K RPS** | **136.86ms** | **851.71ms** | **1542ms** | Legacy format with conversion overhead |
+| **Line Protocol** | **240K RPS** | N/A | N/A | N/A | InfluxDB compatibility mode |
+
+**Columnar Format Advantages:**
+- **2.55x faster throughput** vs row format (2.32M vs 908K RPS)
+- **20x lower p50 latency** (6.75ms vs 136.86ms)
+- **21x lower p95 latency** (39.46ms vs 851.71ms)
+- **26x lower p99 latency** (59.09ms vs 1542ms)
+- **67x fewer errors** under load (63 vs 4,211 errors at 2.5M target RPS)
 
 **Test Configuration**:
 - Hardware: Apple M3 Max (14 cores)
-- Workers: 42 (3x CPU cores)
-- Protocol: MessagePack binary streaming
+- Workers: 400
+- Protocol: MessagePack columnar binary streaming
 - Deployment: Native mode
+- Storage: MinIO
 
-**MessagePack vs Line Protocol**: 8.4x faster
+**MessagePack Columnar vs Line Protocol**: 9.7x faster
 
 ## Query Format Performance
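Reviewer note: the headline multipliers in this commit can be rechecked from the raw figures in the new benchmark table and the getting-started text. The sketch below only redoes that arithmetic. The variable names are illustrative, and every number is copied from the diff rather than measured.

```python
# Figures copied from the updated docs in this commit.
columnar = {"rps": 2_320_000, "p50_ms": 6.75, "p95_ms": 39.46, "p99_ms": 59.09, "errors": 63}
row = {"rps": 908_000, "p50_ms": 136.86, "p95_ms": 851.71, "p99_ms": 1542.0, "errors": 4211}
line_protocol_rps = 240_000
docker_rps = 570_000

# Throughput ratios: higher is better for columnar.
# Latency and error ratios: row value divided by columnar value.
speedups = {
    "throughput_vs_row": columnar["rps"] / row["rps"],
    "p50_vs_row": row["p50_ms"] / columnar["p50_ms"],
    "p95_vs_row": row["p95_ms"] / columnar["p95_ms"],
    "p99_vs_row": row["p99_ms"] / columnar["p99_ms"],
    "errors_vs_row": row["errors"] / columnar["errors"],
    "throughput_vs_line_protocol": columnar["rps"] / line_protocol_rps,
    "native_vs_docker": columnar["rps"] / docker_rps,
}

for name, value in speedups.items():
    print(f"{name}: {value:.2f}x")
```

The computed ratios come out at about 2.56x, 20.28x, 21.58x, 26.10x, 66.84x, 9.67x and 4.07x. That agrees with the documented claims after rounding, although the docs state 2.55x for throughput and 21x for p95, which look truncated rather than rounded.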
