Commit cac82c3

feat: introduce StoreBase interface to support custom Python-based storage backends
1 parent ce8c625 commit cac82c3

6 files changed

Lines changed: 935 additions & 82 deletions

File tree

README.md

Lines changed: 31 additions & 0 deletions
@@ -110,6 +110,37 @@ if __name__ == "__main__":
110110
asyncio.run(main())
111111
```
112112

113+
## Storage Backends
114+
115+
Tryx supports 3 storage tiers — pick the right one for your use case:
116+
117+
| Tier | Backend | Overhead | Use case |
|------|---------|----------|----------|
| Built-in | `SqliteStore` | Zero | Default / prototyping |
| Native FFI | `FfiStoreProtocol` | Near-zero | Max throughput (Postgres, custom C) |
| Pure Python | `StoreBase` | Low | Redis, MongoDB, or any async DB |
122+
123+
```python
from tryx.client import Tryx

# Tier 1: Built-in SQLite (default)
from tryx.backend import SqliteStore
app = Tryx(SqliteStore("whatsapp.db"))

# Tier 2: Native FFI (e.g. tryx-store-postgres)
# PostgresStore is provided by the separate FFI package; its import is omitted here.
app = Tryx(PostgresStore(lib_path="./libtryx_pg.so", connect_string="..."))

# Tier 3: Pure Python custom backend
from tryx.backend import StoreBase

class RedisStore(StoreBase):
    async def put_identity(self, address: str, key: bytes) -> None:
        # self.redis is assumed to be an async Redis client created in __init__
        await self.redis.set(f"identity:{address}", key)
    # ... implement all abstract methods ...

app = Tryx(RedisStore())
```
141+
142+
For the full API specification, JSON schemas, and implementation guide, see [Storage Backends](docs/core-concepts/storage-backends.md).
143+
113144
## Feature Overview
114145

115146
- Event-based handlers via `@app.on(...)`
Lines changed: 355 additions & 0 deletions
@@ -0,0 +1,355 @@
1+
# Storage Backends
2+
3+
Tryx supports a **3-tier storage architecture** so you can pick the right balance between simplicity, performance, and flexibility for your project.
4+
5+
| Tier | Backend | Language | Overhead | When to use |
|------|---------|----------|----------|-------------|
| 1 | `SqliteStore` | Built-in (Rust) | Zero | Default / prototyping / single-instance |
| 2 | FFI Store | Native (C ABI) | Near-zero | Maximum throughput (Postgres, custom C/Rust) |
| 3 | `StoreBase` | Pure Python | Low | Rapid development / exotic backends (Redis, Mongo) |
10+
11+
```mermaid
flowchart LR
    subgraph Tryx["Tryx Rust Core"]
        Backend["Backend trait\n(SignalStore + AppSyncStore +\nProtocolStore + DeviceStore +\nMsgSecretStore)"]
    end

    SQLite["SqliteStore\n(built-in)"]
    FFI["FfiBridgeStore\n(C ABI .so/.dll)"]
    Python["PythonStore\n(async Python)"]

    Backend --> SQLite
    Backend --> FFI
    Backend --> Python

    style SQLite fill:#22c55e,color:#fff
    style FFI fill:#3b82f6,color:#fff
    style Python fill:#a855f7,color:#fff
```
29+
30+
---
31+
32+
## Tier 1: SqliteStore (Default)
33+
34+
The built-in SQLite backend requires zero configuration. Just pass a file path:
35+
36+
```python
from tryx.backend import SqliteStore
from tryx.client import Tryx

app = Tryx(SqliteStore("whatsapp.db"))
```
42+
43+
Internally, the Rust core opens the database with WAL mode, creates all tables automatically, and handles migrations. This is the recommended backend for development and single-instance production deployments.
44+
45+
---
46+
47+
## Tier 2: FFI Store (Native Shared Library)
48+
49+
For maximum throughput with no Python overhead, implement a storage backend as a native shared library (`.so` / `.dylib` / `.dll`) that exports C-ABI entry points.
50+
51+
### Architecture
52+
53+
```mermaid
sequenceDiagram
    participant Rust as Tryx Rust Core
    participant C as Native Store (.so)
    participant DB as Database (Postgres/etc)

    Rust->>C: tryx_store_connect(dsn)
    C->>DB: Open connection pool
    C-->>Rust: handle ptr

    loop Every operation
        Rust->>C: tryx_put_identity(handle, addr, key, len)
        C->>DB: INSERT INTO identities ...
        C-->>Rust: status code (0 = OK)
    end
```
69+
70+
### Usage
71+
72+
Any object with `lib_path` and `connect_string` attributes is detected as an FFI backend:
73+
74+
```python
from tryx.client import Tryx

# tryx-store-postgres exposes this interface
class PostgresStore:
    lib_path = "./libtryx_pg.so"
    connect_string = "host=localhost dbname=tryx"

app = Tryx(PostgresStore())
```
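
The attribute-based detection described above can be pictured as a structural check. The following is only a sketch: `HasFfiAttrs` and `looks_like_ffi_backend` are hypothetical names, and how Tryx actually performs the detection is not shown in this document.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasFfiAttrs(Protocol):
    """Hypothetical stand-in for FfiStoreProtocol (assumed shape)."""

    lib_path: str
    connect_string: str


class PostgresStore:
    lib_path = "./libtryx_pg.so"
    connect_string = "host=localhost dbname=tryx"


class NotAStore:
    lib_path = "./libtryx_pg.so"  # connect_string is missing


def looks_like_ffi_backend(obj: object) -> bool:
    # isinstance() on a runtime-checkable protocol only checks that the
    # attributes exist; it does not validate their types or values.
    return isinstance(obj, HasFfiAttrs)
```

Because the check is structural, a backend class does not need to inherit from anything to qualify.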
84+
85+
### Type Safety
86+
87+
Use the `FfiStoreProtocol` for type checking:
88+
89+
```python
from tryx.backend import FfiStoreProtocol

def create_backend() -> FfiStoreProtocol:
    return PostgresStore()  # type checker validates attributes
```
95+
96+
### Required C ABI Entry Points
97+
98+
Your shared library must export these symbols:
99+
100+
| Symbol | Signature |
|--------|-----------|
| `tryx_store_connect` | `(dsn: *const c_char, handle: *mut *mut c_void) -> i32` |
| `tryx_store_destroy` | `(handle: *mut c_void)` |
| `tryx_put_identity` | `(handle, addr, key_ptr, key_len) -> i32` |
| `tryx_load_identity` | `(handle, addr, out: *mut TryxBuffer) -> i32` |
| `tryx_delete_identity` | `(handle, addr) -> i32` |
| ... | (see `ffi_bridge.rs` for the full list of ~30 entry points) |
108+
109+
All functions return `0` on success, non-zero on error.
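
When testing a native store from Python (for example through `ctypes`), the status-code convention can be turned into exceptions with a small helper. This is a sketch only: `StoreError` and `check_status` are hypothetical names and are not part of Tryx, and the meaning of individual non-zero codes is not specified here.

```python
class StoreError(RuntimeError):
    """Hypothetical error type carrying the failing symbol and its status."""

    def __init__(self, symbol: str, status: int) -> None:
        super().__init__(f"{symbol} failed with status {status}")
        self.symbol = symbol
        self.status = status


def check_status(symbol: str, status: int) -> None:
    # 0 means success; any other value is treated as an error.
    if status != 0:
        raise StoreError(symbol, status)
```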
110+
111+
---
112+
113+
## Tier 3: PythonStore (Pure Python)
114+
115+
For maximum flexibility, inherit from `StoreBase` and implement all abstract methods using any async Python library.
116+
117+
### Architecture
118+
119+
```mermaid
sequenceDiagram
    participant Rust as Tryx Rust Core (Tokio)
    participant Bridge as PythonStore Bridge
    participant GIL as Python GIL
    participant Py as Your StoreBase subclass

    Rust->>Bridge: put_identity("addr", key_bytes)
    Bridge->>GIL: Python::attach(|py| ...)
    GIL->>Py: await store.put_identity(address="addr", key=b"...")
    Py-->>GIL: None
    GIL-->>Bridge: PyObject
    Bridge-->>Rust: Ok(())

    Note over GIL: GIL is released during<br/>the Python await
```
135+
136+
### Usage
137+
138+
```python
import json
import redis.asyncio as redis
from tryx.backend import StoreBase
from tryx.client import Tryx

class RedisStore(StoreBase):
    def __init__(self, url: str = "redis://localhost"):
        self.r = redis.from_url(url)

    async def put_identity(self, address: str, key: bytes) -> None:
        await self.r.set(f"identity:{address}", key)

    async def load_identity(self, address: str) -> bytes | None:
        return await self.r.get(f"identity:{address}")

    async def delete_identity(self, address: str) -> None:
        await self.r.delete(f"identity:{address}")

    # ... implement ALL abstract methods from StoreBase ...

app = Tryx(RedisStore())
```
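
For unit tests it can help to have a backend that needs no external service. The sketch below covers only the three identity methods shown above; `MemoryStore` is a hypothetical name, and it deliberately does not inherit from the real `StoreBase` so that it runs standalone.

```python
import asyncio
from typing import Optional


class MemoryStore:
    """Hypothetical in-memory stand-in for the identity methods only."""

    def __init__(self) -> None:
        self._identities: dict[str, bytes] = {}

    async def put_identity(self, address: str, key: bytes) -> None:
        self._identities[address] = bytes(key)

    async def load_identity(self, address: str) -> Optional[bytes]:
        return self._identities.get(address)

    async def delete_identity(self, address: str) -> None:
        # Deleting a missing address is treated as a no-op (assumption).
        self._identities.pop(address, None)


async def roundtrip() -> tuple:
    store = MemoryStore()
    await store.put_identity("alice", b"\x01" * 32)
    before = await store.load_identity("alice")
    await store.delete_identity("alice")
    after = await store.load_identity("alice")
    return before, after
```

Calling `asyncio.run(roundtrip())` stores a key, reads it back, deletes it, and reads again.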
161+
162+
### Type Checking
163+
164+
`StoreBase` is an `ABC` with `@abstractmethod` on every required method. If you miss any method:
165+
166+
- **Mypy/Pyright** will report an error wherever the incomplete class is instantiated (defining the subclass alone is not flagged)
- **Runtime** will raise `TypeError: Can't instantiate abstract class` at that same instantiation
168+
169+
```python
class IncompleteStore(StoreBase):
    async def put_identity(self, address: str, key: bytes) -> None:
        pass
    # Missing 50+ methods → IncompleteStore() fails type checking and raises TypeError
```
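
The runtime behaviour can be reproduced without Tryx installed. In this sketch, `MiniStoreBase` is a hypothetical two-method stand-in for `StoreBase`; the point is that the class definition succeeds and the `TypeError` appears only when the class is instantiated.

```python
from abc import ABC, abstractmethod
from typing import Optional


class MiniStoreBase(ABC):
    """Hypothetical two-method stand-in for StoreBase."""

    @abstractmethod
    async def put_identity(self, address: str, key: bytes) -> None: ...

    @abstractmethod
    async def load_identity(self, address: str) -> Optional[bytes]: ...


class IncompleteStore(MiniStoreBase):  # defining the class is fine
    async def put_identity(self, address: str, key: bytes) -> None:
        return None


def try_instantiate() -> str:
    try:
        IncompleteStore()  # the failure happens here
    except TypeError as exc:
        return str(exc)
    return "instantiated"
```

`IncompleteStore.__abstractmethods__` lists the methods that are still missing, which is handy when implementing a large interface incrementally.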
175+
176+
### Performance Characteristics
177+
178+
The PythonStore bridge is designed for low overhead:
179+
180+
| Aspect | Detail |
|--------|--------|
| **GIL acquisition** | Uses `Python::attach()` (PyO3 0.28+), the lightest available mechanism |
| **Async bridging** | `pyo3_async_runtimes::tokio::into_future()`: zero-copy future conversion |
| **Argument passing** | Scalars (`str`, `int`, `bool`) passed natively via kwargs; complex structs as JSON bytes |
| **GIL during I/O** | Released during Python `await`, so other Rust tasks run freely |
| **Cloning** | `Py<T>::clone_ref()`: reference count increment only, no deep copy |
187+
188+
**Typical overhead per call:** ~2-5µs for GIL acquire/release + Python method dispatch. The actual database I/O dominates.
189+
190+
---
191+
192+
## API Specification
193+
194+
All three tiers implement the same underlying Rust traits. Here is the complete method reference grouped by trait:
195+
196+
### SignalStore — End-to-End Encryption
197+
198+
Handles identity keys, sessions, pre-keys, signed pre-keys, and sender keys.
199+
200+
| Method | Args | Returns | Description |
|--------|------|---------|-------------|
| `put_identity` | `address: str, key: bytes` | `None` | Store 32-byte identity key |
| `load_identity` | `address: str` | `bytes \| None` | Load identity key (32 bytes) |
| `delete_identity` | `address: str` | `None` | Delete identity key |
| `get_session` | `address: str` | `bytes \| None` | Get encrypted session record |
| `put_session` | `address: str, session: bytes` | `None` | Store session record |
| `delete_session` | `address: str` | `None` | Delete session |
| `store_prekey` | `id: int, record: bytes, uploaded: bool` | `None` | Store pre-key |
| `load_prekey` | `id: int` | `bytes \| None` | Load pre-key |
| `remove_prekey` | `id: int` | `None` | Remove pre-key |
| `get_max_prekey_id` | | `int` | Max stored pre-key ID (or 0) |
| `store_signed_prekey` | `id: int, record: bytes` | `None` | Store signed pre-key |
| `load_signed_prekey` | `id: int` | `bytes \| None` | Load signed pre-key |
| `load_all_signed_prekeys` | | `list[tuple[int, bytes]]` | All signed pre-keys |
| `remove_signed_prekey` | `id: int` | `None` | Remove signed pre-key |
| `put_sender_key` | `address: str, record: bytes` | `None` | Store group sender key |
| `get_sender_key` | `address: str` | `bytes \| None` | Get group sender key |
| `delete_sender_key` | `address: str` | `None` | Delete group sender key |
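
As an illustration of the pre-key rows above, here is a minimal in-memory sketch. `MemoryPreKeys` is a hypothetical name; it reads "Max stored pre-key ID (or 0)" as "return 0 when nothing is stored", and it keeps the `uploaded` flag without interpreting it.

```python
import asyncio
from typing import Optional


class MemoryPreKeys:
    """Hypothetical in-memory stand-in for the four pre-key methods."""

    def __init__(self) -> None:
        self._prekeys: dict[int, tuple[bytes, bool]] = {}

    async def store_prekey(self, id: int, record: bytes, uploaded: bool) -> None:
        self._prekeys[id] = (bytes(record), uploaded)

    async def load_prekey(self, id: int) -> Optional[bytes]:
        row = self._prekeys.get(id)
        return row[0] if row is not None else None

    async def remove_prekey(self, id: int) -> None:
        self._prekeys.pop(id, None)

    async def get_max_prekey_id(self) -> int:
        # default=0 covers the empty-store case from the table.
        return max(self._prekeys, default=0)


async def demo_prekeys() -> tuple:
    store = MemoryPreKeys()
    empty_max = await store.get_max_prekey_id()
    await store.store_prekey(3, b"r3", False)
    await store.store_prekey(7, b"r7", True)
    await store.remove_prekey(7)
    return empty_max, await store.get_max_prekey_id(), await store.load_prekey(7)
```

The parameter is named `id` to match the table, since the bridge passes scalars as keyword arguments.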
219+
220+
### AppSyncStore — App State Synchronization
221+
222+
| Method | Args | Returns | Description |
|--------|------|---------|-------------|
| `get_sync_key` | `key_id: bytes` | `bytes \| None` | Get sync key (JSON `AppStateSyncKey`) |
| `set_sync_key` | `key_id: bytes, key: bytes` | `None` | Set sync key |
| `get_version` | `name: str` | `bytes` | Get collection version (JSON `HashState`) |
| `set_version` | `name: str, state: bytes` | `None` | Set collection version |
| `put_mutation_macs` | `name: str, version: int, mutations: bytes` | `None` | Store mutation MACs |
| `get_mutation_mac` | `name: str, index_mac: bytes` | `bytes \| None` | Get mutation MAC |
| `delete_mutation_macs` | `name: str, index_macs: bytes` | `None` | Delete mutation MACs |
| `get_latest_sync_key_id` | | `bytes \| None` | Latest sync key ID |
232+
233+
### DeviceStore — Device Persistence
234+
235+
| Method | Args | Returns | Description |
|--------|------|---------|-------------|
| `save` | `device: bytes` | `None` | Save device data (JSON `Device`) |
| `load` | | `bytes \| None` | Load device data |
| `exists` | | `bool` | Check if device exists |
| `create` | | `int` | Create device, return ID |
241+
242+
### ProtocolStore — Protocol Alignment
243+
244+
| Method | Args | Returns | Description |
|--------|------|---------|-------------|
| `get_sender_key_devices` | `group_jid: str` | `list[tuple[str, bool]]` | SKDM device status |
| `set_sender_key_status` | `group_jid: str, entries: bytes` | `None` | Set SKDM status |
| `clear_sender_key_devices` | `group_jid: str` | `None` | Clear group SKDM tracking |
| `delete_sender_key_device_rows` | `device_jids: bytes` | `None` | Delete by device JID |
| `clear_all_sender_key_devices` | | `None` | Clear all SKDM tracking |
| `get_lid_mapping` | `lid: str` | `bytes \| None` | LID→PN mapping (JSON) |
| `get_pn_mapping` | `phone: str` | `bytes \| None` | PN→LID mapping (JSON) |
| `put_lid_mapping` | `entry: bytes` | `None` | Upsert LID-PN mapping |
| `get_all_lid_mappings` | | `list[bytes]` | All mappings (JSON array) |
| `save_base_key` | `address: str, message_id: str, base_key: bytes` | `None` | Retry collision detection |
| `has_same_base_key` | `address: str, message_id: str, current_base_key: bytes` | `bool` | Compare base keys |
| `delete_base_key` | `address: str, message_id: str` | `None` | Delete base key |
| `update_device_list` | `record: bytes` | `None` | Update device registry (JSON) |
| `get_devices` | `user: str` | `bytes \| None` | Get device list (JSON) |
| `delete_devices` | `user: str` | `None` | Delete device list |
| `get_tc_token` | `jid: str` | `bytes \| None` | Get trust token (JSON) |
| `put_tc_token` | `jid: str, entry: bytes` | `None` | Set trust token |
| `delete_tc_token` | `jid: str` | `None` | Delete trust token |
| `get_all_tc_token_jids` | | `list[str]` | All JIDs with tokens |
| `delete_expired_tc_tokens` | `cutoff: int` | `int` | Prune old tokens |
| `store_sent_message` | `chat_jid: str, message_id: str, payload: bytes` | `None` | Store for retry |
| `take_sent_message` | `chat_jid: str, message_id: str` | `bytes \| None` | Atomic take |
| `delete_expired_sent_messages` | `cutoff: int` | `int` | Prune old messages |
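
The "Atomic take" row means a message is returned and removed in one step, so a second take yields `None`. The sketch below is an in-memory illustration with several assumptions: `SentMessageBuffer` is a hypothetical name, `cutoff` is treated as a Unix timestamp in seconds compared against the time of storage, and the `now` parameter exists only to make the example deterministic.

```python
import asyncio
import time
from typing import Callable, Optional


class SentMessageBuffer:
    """Hypothetical in-memory stand-in for the three sent-message methods."""

    def __init__(self, now: Callable[[], int] = lambda: int(time.time())) -> None:
        self._now = now
        self._rows: dict[tuple[str, str], tuple[int, bytes]] = {}

    async def store_sent_message(self, chat_jid: str, message_id: str, payload: bytes) -> None:
        self._rows[(chat_jid, message_id)] = (self._now(), bytes(payload))

    async def take_sent_message(self, chat_jid: str, message_id: str) -> Optional[bytes]:
        # pop() reads and removes in one step with no await in between,
        # so no other coroutine can observe the row half-taken.
        row = self._rows.pop((chat_jid, message_id), None)
        return row[1] if row is not None else None

    async def delete_expired_sent_messages(self, cutoff: int) -> int:
        # Assumption: rows stored strictly before `cutoff` are expired.
        expired = [k for k, (stored_at, _) in self._rows.items() if stored_at < cutoff]
        for key in expired:
            del self._rows[key]
        return len(expired)


async def demo_take() -> tuple:
    buf = SentMessageBuffer(now=lambda: 100)
    await buf.store_sent_message("chat", "m1", b"a")
    await buf.store_sent_message("chat", "m2", b"b")
    first = await buf.take_sent_message("chat", "m1")
    second = await buf.take_sent_message("chat", "m1")
    pruned = await buf.delete_expired_sent_messages(cutoff=200)
    return first, second, pruned
```

A shared database backend would need an equivalent single-step primitive (for example a delete that returns the deleted row) to keep the same guarantee across processes.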
269+
270+
### MsgSecretStore — Message Secret Persistence
271+
272+
| Method | Args | Returns | Description |
|--------|------|---------|-------------|
| `put_msg_secrets` | `entries: bytes` | `int` | Batch upsert (JSON `[MsgSecretEntry]`) |
| `get_msg_secret` | `chat: str, sender: str, msg_id: str` | `bytes \| None` | Fetch secret |
| `delete_expired_msg_secrets` | `cutoff: int` | `int` | Prune expired secrets |
277+
278+
---
279+
280+
## JSON Struct Schemas
281+
282+
Complex types are passed as JSON bytes. Here are the key schemas:
283+
284+
### AppStateSyncKey
285+
286+
```json
{
  "key_data": [/* u8 array */],
  "fingerprint": [/* u8 array */],
  "timestamp": 1717430400
}
```
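
In a Python backend, the `u8 array` fields have to be converted to and from `bytes`. The following sketch assumes each array is a plain JSON list of integers in the range 0 to 255, as the placeholder comments suggest; `encode_sync_key` and `decode_sync_key` are hypothetical helper names.

```python
import json


def encode_sync_key(key_data: bytes, fingerprint: bytes, timestamp: int) -> bytes:
    # list(b"...") yields integers 0..255, which JSON can represent directly.
    payload = {
        "key_data": list(key_data),
        "fingerprint": list(fingerprint),
        "timestamp": timestamp,
    }
    return json.dumps(payload).encode("utf-8")


def decode_sync_key(raw: bytes) -> tuple:
    obj = json.loads(raw)
    # bytes([...]) raises ValueError if any element is outside 0..255.
    return bytes(obj["key_data"]), bytes(obj["fingerprint"]), obj["timestamp"]
```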
293+
294+
### LidPnMappingEntry
295+
296+
```json
{
  "lid": "100000012345678",
  "phone_number": "559980000001",
  "created_at": 1717430400,
  "updated_at": 1717430400,
  "learning_source": "usync"
}
```
305+
306+
### TcTokenEntry
307+
308+
```json
{
  "token": [/* u8 array */],
  "token_timestamp": 1717430400,
  "sender_timestamp": 1717430400
}
```
315+
316+
### MsgSecretEntry
317+
318+
```json
{
  "chat": "5599800001@s.whatsapp.net",
  "sender": "5599800002@s.whatsapp.net",
  "msg_id": "3EB0A1B2C3D4E5F6",
  "secret": [/* 32 u8 bytes */],
  "expires_at": 0,
  "message_ts": 1717430400
}
```
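
A Python backend receives the batch as JSON bytes and has to parse it itself. The sketch below shows one possible in-memory handling of the three `MsgSecretStore` methods. It rests on assumptions that this document does not confirm: `put_msg_secrets` returns the number of entries written, `get_msg_secret` returns the raw secret bytes, and `expires_at == 0` means "never expires". `MemoryMsgSecrets` is a hypothetical name.

```python
import asyncio
import json
from typing import Optional


class MemoryMsgSecrets:
    """Hypothetical in-memory stand-in for the MsgSecretStore methods."""

    def __init__(self) -> None:
        self._rows: dict[tuple[str, str, str], dict] = {}

    async def put_msg_secrets(self, entries: bytes) -> int:
        parsed = json.loads(entries)
        for entry in parsed:
            # Upsert keyed on (chat, sender, msg_id).
            self._rows[(entry["chat"], entry["sender"], entry["msg_id"])] = entry
        return len(parsed)  # assumption: count of entries written

    async def get_msg_secret(self, chat: str, sender: str, msg_id: str) -> Optional[bytes]:
        row = self._rows.get((chat, sender, msg_id))
        return bytes(row["secret"]) if row is not None else None

    async def delete_expired_msg_secrets(self, cutoff: int) -> int:
        # Assumption: expires_at == 0 marks an entry that never expires.
        expired = [k for k, r in self._rows.items() if 0 < r["expires_at"] <= cutoff]
        for key in expired:
            del self._rows[key]
        return len(expired)


async def demo_secrets() -> tuple:
    store = MemoryMsgSecrets()
    batch = json.dumps([
        {"chat": "c", "sender": "s", "msg_id": "m1", "secret": [1] * 32,
         "expires_at": 0, "message_ts": 10},
        {"chat": "c", "sender": "s", "msg_id": "m2", "secret": [2] * 32,
         "expires_at": 50, "message_ts": 10},
    ]).encode("utf-8")
    written = await store.put_msg_secrets(batch)
    pruned = await store.delete_expired_msg_secrets(cutoff=100)
    kept = await store.get_msg_secret("c", "s", "m1")
    gone = await store.get_msg_secret("c", "s", "m2")
    return written, pruned, kept, gone
```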
328+
329+
### DeviceListRecord
330+
331+
```json
{
  "user": "559980000001",
  "devices": [
    {"device_id": 0, "key_index": null},
    {"device_id": 1, "key_index": 42}
  ],
  "timestamp": 1717430400,
  "phash": "abc123",
  "raw_id": 5
}
```
343+
344+
---
345+
346+
## Choosing a Backend
347+
348+
| Criteria | SqliteStore | FFI Store | PythonStore |
|----------|:-----------:|:---------:|:-----------:|
| Setup complexity | ⭐ Zero | ⭐⭐⭐ Requires compilation | ⭐⭐ Moderate |
| Throughput | ⭐⭐ Good | ⭐⭐⭐ Maximum | ⭐⭐ Good |
| Multi-instance | ❌ Single file | ✅ Shared DB | ✅ Shared DB |
| Language | Rust (built-in) | C/Rust | Python |
| Development speed | ⭐⭐⭐ Instant | ⭐ Slow | ⭐⭐⭐ Fast |
| Package decoupling | Built-in | ✅ Fully independent | ✅ Fully independent |

libs/whatsapp-rust
