Commit 9de8327

toondb python sdk....
1 parent 2cdcf32 commit 9de8327

63 files changed, +19265 -2 lines changed

CHANGELOG.md

Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
# Changelog

All notable changes to the ToonDB Python SDK will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.2.3] - 2025-01-xx

### Fixed
- **Platform detection bug**: Fixed binary resolution using Rust target triple format (`aarch64-apple-darwin`) instead of Python platform tag format (`darwin-aarch64`)
- Improved documentation accuracy across all doc files

### Changed
- Updated to match latest Rust SDK API patterns

## [0.2.9] - 2026-01-02

### Added

#### Production-Grade CLI Tools

CLI commands now available globally after `pip install toondb-client`:

```bash
toondb-server # IPC server for multi-process access
toondb-bulk # High-performance vector operations
toondb-grpc-server # gRPC server for remote vector search
```

**toondb-server features:**
- **Stale socket detection** - Auto-cleans orphaned socket files
- **Health checks** - Waits for server ready before returning
- **Graceful shutdown** - Handles SIGTERM/SIGINT/SIGHUP
- **PID tracking** - Writes PID file for process management
- **Permission checks** - Validates directory writable before starting
- **stop/status commands** - Built-in process management
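
Stale socket detection can be illustrated with a small probe: if a socket file exists but nothing accepts connections on it, the file is an orphan and can be removed. This is a minimal sketch for Unix domain sockets, not the SDK's actual implementation; the function name `remove_if_stale` is hypothetical.

```python
import os
import socket
import stat


def remove_if_stale(socket_path: str) -> bool:
    """Return True if socket_path is free (absent, or stale and removed).

    Returns False if a live server accepted the probe connection.
    Hypothetical helper, not part of the ToonDB SDK.
    """
    if not os.path.exists(socket_path):
        return True
    if not stat.S_ISSOCK(os.stat(socket_path).st_mode):
        raise ValueError(f"{socket_path} exists but is not a socket")

    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(socket_path)
    except (ConnectionRefusedError, FileNotFoundError):
        # Nobody is listening: the file was left behind by a dead process.
        if os.path.exists(socket_path):
            os.unlink(socket_path)
        return True
    else:
        return False
    finally:
        probe.close()
```

Probing with a real connection avoids trusting a PID file alone, since a PID can be reused by an unrelated process.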

**toondb-bulk features:**
- **Input validation** - Checks file exists, readable, correct extension
- **Output validation** - Checks directory writable, handles overwrites
- **Progress reporting** - Shows file sizes during operations
- **Structured subcommands** - build-index, query, info, convert

**toondb-grpc-server features:**
- **Port checking** - Verifies port available before binding
- **Process detection** - Identifies what process is using a port
- **Privileged port check** - Warns about ports < 1024 requiring root
- **status command** - Check if server is running
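
The port and privileged-port checks can be sketched with the standard `socket` module. This is an illustrative approximation under the assumption that a trial bind is an acceptable availability test; the helper names are hypothetical, and process detection is not covered here.

```python
import socket

PRIVILEGED_PORT_LIMIT = 1024  # ports below this usually need elevated rights on Unix


def is_port_available(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind the port; a failed bind means it is in use or not permitted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def is_privileged_port(port: int) -> bool:
    """True for ports in the range the changelog warns about (< 1024)."""
    return 0 < port < PRIVILEGED_PORT_LIMIT
```

A trial bind is inherently racy: another process may take the port between the check and the real bind, so the server still has to handle a bind failure.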

#### Consistent Exit Codes

| Code | Name | Description |
|------|------|-------------|
| 0 | SUCCESS | Operation completed |
| 1 | GENERAL_ERROR | General error |
| 2 | BINARY_NOT_FOUND | Native binary not found |
| 3 | PORT/SOCKET_IN_USE | Port or socket in use |
| 4 | PERMISSION_DENIED | Permission denied |
| 5 | STARTUP_FAILED | Server startup failed |
| 130 | INTERRUPTED | Interrupted by Ctrl+C |
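
For scripts that wrap these commands, the table maps naturally onto an `IntEnum`. The class below is a hypothetical mirror of the table, not an import from the SDK; `PORT_OR_SOCKET_IN_USE` is a renamed stand-in because `PORT/SOCKET_IN_USE` is not a valid Python identifier.

```python
import enum


class ExitCode(enum.IntEnum):
    """Hypothetical mirror of the exit code table above."""

    SUCCESS = 0
    GENERAL_ERROR = 1
    BINARY_NOT_FOUND = 2
    PORT_OR_SOCKET_IN_USE = 3  # listed as PORT/SOCKET_IN_USE in the table
    PERMISSION_DENIED = 4
    STARTUP_FAILED = 5
    INTERRUPTED = 130  # 128 + SIGINT (2), the usual shell convention


def describe(returncode: int) -> str:
    """Translate a process return code into the table's name, if known."""
    try:
        return ExitCode(returncode).name
    except ValueError:
        return f"UNKNOWN({returncode})"
```

A caller could pass `subprocess.run(...).returncode` to `describe()` to log a readable failure reason.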

#### Environment Variable Overrides

- `TOONDB_SERVER_PATH` - Override toondb-server binary path
- `TOONDB_BULK_PATH` - Override toondb-bulk binary path
- `TOONDB_GRPC_SERVER_PATH` - Override toondb-grpc-server binary path
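
How an override might take precedence over the default lookup is sketched below. Only the three variable names come from this changelog; the function `find_binary` and the choice to fail loudly on a bad override (rather than silently falling back) are assumptions.

```python
import os
import shutil

# Variable names are taken from the list above.
ENV_OVERRIDES = {
    "toondb-server": "TOONDB_SERVER_PATH",
    "toondb-bulk": "TOONDB_BULK_PATH",
    "toondb-grpc-server": "TOONDB_GRPC_SERVER_PATH",
}


def find_binary(name, environ=None):
    """Return the override path if set and valid, else search PATH.

    Hypothetical helper: an invalid override raises instead of falling back,
    so a typo in the variable is not masked by another installed binary.
    """
    environ = os.environ if environ is None else environ
    override = environ.get(ENV_OVERRIDES[name])
    if override:
        if os.path.isfile(override) and os.access(override, os.X_OK):
            return override
        raise FileNotFoundError(
            f"{ENV_OVERRIDES[name]}={override!r} is not an executable file"
        )
    return shutil.which(name)  # None when the binary is not on PATH
```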

### Changed

- CLI wrappers now provide actionable error messages with fix suggestions
- Binary resolution searches multiple locations with clear fallback chain
- Signal handlers for graceful shutdown on all platforms

## [0.2.3] - 2025-01-xx

### Added

#### Cross-Platform Binary Distribution
- **Zero-compile installation**: Pre-built `toondb-bulk` binaries bundled in wheels
- **Platform support matrix**:
  - `manylinux_2_17_x86_64` - Linux x86_64 (glibc ≥ 2.17)
  - `manylinux_2_17_aarch64` - Linux ARM64 (AWS Graviton, etc.)
  - `macosx_11_0_universal2` - macOS Intel + Apple Silicon
  - `win_amd64` - Windows x64
- **Automatic binary resolution** with fallback chain:
  1. Bundled in wheel (`_bin/<platform>/toondb-bulk`)
  2. System PATH (`which toondb-bulk`)
  3. Cargo target directory (development mode)
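
The three-step fallback chain can be sketched as an ordered candidate list. This is a simplified illustration: the function name and parameters are hypothetical, and the bundled directory is passed in explicitly, whereas an installed package would presumably locate it through `importlib.resources`.

```python
import os
import shutil
from pathlib import Path


def resolve_binary(name, bundled_dir, cargo_target_dir=None):
    """Return (path, source) for the first usable binary in the chain.

    Order: bundled directory, system PATH, Cargo target directory.
    Hypothetical helper, not the SDK's resolver.
    """
    candidates = [("bundled", Path(bundled_dir) / name)]

    on_path = shutil.which(name)
    if on_path:
        candidates.append(("PATH", Path(on_path)))

    if cargo_target_dir is not None:
        # Cargo places binaries under target/<profile>/.
        for profile in ("release", "debug"):
            candidates.append(("cargo", Path(cargo_target_dir) / profile / name))

    for source, path in candidates:
        if path.is_file() and os.access(path, os.X_OK):
            return path, source

    tried = ", ".join(str(path) for _, path in candidates)
    raise FileNotFoundError(f"{name} not found; tried: {tried}")
```

Returning the source alongside the path makes it easy to report which step of the chain succeeded in an error message or a debug log.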

#### Bulk API Enhancements
- `bulk_query_index()` - Query HNSW indexes for k nearest neighbors
- `bulk_info()` - Get index metadata (vector count, dimension, etc.)
- `get_toondb_bulk_path()` - Get resolved path to toondb-bulk binary
- `_get_platform_tag()` - Platform detection (linux-x86_64, darwin-aarch64, etc.)
- `_find_bundled_binary()` - Uses `importlib.resources` for installed packages
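
Platform detection of the `linux-x86_64` / `darwin-aarch64` kind can be approximated with the standard `platform` module. The sketch below is not the body of `_get_platform_tag()`; the alias table, the `windows-x86_64` spelling, and the mapping to Rust target triples are assumptions added for illustration.

```python
import platform

# platform.machine() reports the same CPU under different names per OS.
_ARCH_ALIASES = {
    "amd64": "x86_64",
    "x86_64": "x86_64",
    "arm64": "aarch64",
    "aarch64": "aarch64",
}

# Common Rust target triples for the same platforms (assumed mapping).
_RUST_TRIPLES = {
    "linux-x86_64": "x86_64-unknown-linux-gnu",
    "linux-aarch64": "aarch64-unknown-linux-gnu",
    "darwin-x86_64": "x86_64-apple-darwin",
    "darwin-aarch64": "aarch64-apple-darwin",
    "windows-x86_64": "x86_64-pc-windows-msvc",
}


def platform_tag(system=None, machine=None):
    """Build an '<os>-<arch>' tag such as 'darwin-aarch64'."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    return f"{system}-{_ARCH_ALIASES.get(machine, machine)}"


def rust_triple(tag):
    """Return the Rust target triple for a tag, or None if unmapped."""
    return _RUST_TRIPLES.get(tag)
```

Keeping both spellings in one table makes a mismatch between the directory naming scheme and the detection code easier to spot.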

#### CI/CD Infrastructure
- GitHub Actions workflow for building platform-specific wheels
- cibuildwheel configuration for cross-platform builds
- QEMU emulation for ARM64 Linux builds
- PyPI publishing with trusted publishing

#### Documentation
- [PYTHON_DISTRIBUTION.md](../docs/PYTHON_DISTRIBUTION.md) - Full distribution architecture
- Updated [BULK_OPERATIONS.md](../docs/BULK_OPERATIONS.md) with troubleshooting
- Updated [SDK_DOCUMENTATION.md](docs/SDK_DOCUMENTATION.md) with Bulk API reference
- Updated [ARCHITECTURE.md](../docs/ARCHITECTURE.md) with Python SDK section

### Changed

- Package renamed from `toondb-client` to `toondb`
- Wheel tags changed from `any` to platform-specific (`py3-none-<platform>`)
- Binary resolution now uses `importlib.resources` instead of `__file__` paths

### Technical Details

#### Distribution Model
Follows the "uv-style" approach where:
- Wheels are tagged `py3-none-<platform>` (not CPython-ABI-tied)
- One wheel per platform (not per Python version)
- Artifact count: O(P·A) where P=platforms, A=architectures
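
To make the artifact count concrete, the sketch below builds wheel filenames using the standard `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl` layout. The platform tags are the ones listed in this changelog's support matrix; the helper names and the version string in the usage note are illustrative only.

```python
# Platform tags from the support matrix in this changelog.
PLATFORM_TAGS = [
    "manylinux_2_17_x86_64",
    "manylinux_2_17_aarch64",
    "macosx_11_0_universal2",
    "win_amd64",
]


def wheel_filename(dist, version, platform_tag, python_tag="py3", abi_tag="none"):
    """Compose a wheel filename from its tag components."""
    return f"{dist}-{version}-{python_tag}-{abi_tag}-{platform_tag}.whl"


def release_wheels(dist, version):
    """One wheel per platform tag, independent of the Python minor version."""
    return [wheel_filename(dist, version, tag) for tag in PLATFORM_TAGS]
```

For example, `release_wheels("toondb", "0.2.3")` yields four filenames, whereas ABI-tied wheels would multiply that count by the number of supported Python versions.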

#### Linux Compatibility
- **manylinux_2_17** baseline (glibc ≥ 2.17)
- Covers: CentOS 7+, RHEL 7+, Ubuntu 14.04+, Debian 8+
- Same baseline used by `uv` for production deployments

#### macOS Strategy
- **universal2** fat binaries containing both x86_64 and arm64
- Created with `lipo -create` during build
- Minimum macOS 11.0 (Big Sur)

## [0.1.0] - 2024-12-XX

### Added

- Initial release
- Embedded mode with FFI access to ToonDB
- IPC client mode for multi-process access
- Path-native API with O(|path|) lookups
- ACID transactions with snapshot isolation
- Range scans and prefix queries
- TOON format output for LLM context optimization
- Bulk API for high-throughput vector ingestion
  - `bulk_build_index()` - Build HNSW indexes at ~1,600 vec/s
  - `convert_embeddings_to_raw()` - Convert numpy to raw f32
- Support for raw f32 and NumPy .npy input formats
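
As an illustration of what a "raw f32" file could look like, the sketch below writes vectors as headerless, row-major, little-endian 32-bit floats using only the standard library. The exact layout that `toondb-bulk` expects is not stated here, so the layout and the function name `write_raw_f32` are assumptions.

```python
import array
import sys


def write_raw_f32(rows, path):
    """Write equal-length rows of floats as headerless little-endian float32.

    Returns (vector_count, dimension). Assumed layout, not a documented format.
    """
    rows = [list(row) for row in rows]
    if not rows:
        raise ValueError("no vectors to write")
    dim = len(rows[0])
    if any(len(row) != dim for row in rows):
        raise ValueError("all vectors must have the same dimension")

    buf = array.array("f")  # C float, 4 bytes on common platforms
    for row in rows:
        buf.extend(row)
    if sys.byteorder == "big":
        buf.byteswap()  # normalize to little-endian on big-endian hosts

    with open(path, "wb") as fh:
        buf.tofile(fh)
    return len(rows), dim
```

With NumPy available, `np.asarray(vectors, dtype="<f4").tofile(path)` produces the same bytes for a C-contiguous 2-D array.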

### Performance

| Method | 768D Throughput | Notes |
|--------|-----------------|-------|
| Python FFI | ~130 vec/s | Direct FFI calls |
| Bulk API | ~1,600 vec/s | Subprocess to toondb-bulk |

FFI overhead eliminated by subprocess approach for bulk operations.
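
The throughput figures translate into ingestion time as follows. The rates are the approximate values from the table; the one-million-vector workload in the usage note is a made-up example, and the estimate ignores subprocess startup and file conversion time.

```python
FFI_RATE = 130.0    # vec/s, Python FFI row of the table (approximate)
BULK_RATE = 1600.0  # vec/s, Bulk API row of the table (approximate)


def ingest_seconds(n_vectors, rate):
    """Estimated wall-clock time to ingest n_vectors at a steady rate."""
    return n_vectors / rate


speedup = BULK_RATE / FFI_RATE  # roughly a 12x improvement
```

At these rates, one million 768-dimensional vectors would take about 625 seconds through the Bulk API versus roughly 7,700 seconds (a little over two hours) through direct FFI calls.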
