
Commit 5a99d9b

Add comprehensive README for Rust BSON extension (PYTHON-5683)
- Document implementation history and architecture
- Compare with Copilot POC from PR #2689
- Explain performance analysis (~0.21x vs C extension)
- Clarify why POC appeared faster (incomplete functionality)
- Detail build system differences (cargo vs maturin)
- Provide steps to achieve performance parity
- Include building, testing, and usage instructions
1 parent 6599757 commit 5a99d9b


1 file changed

+277
-0
lines changed

bson/_rbson/README.md

@@ -0,0 +1,277 @@
# Rust BSON Extension Module

This directory contains a Rust-based implementation of BSON encoding/decoding for PyMongo, developed as part of [PYTHON-5683](https://jira.mongodb.org/browse/PYTHON-5683).

## Overview

The Rust extension (`_rbson`) provides the same interface as the C extension (`_cbson`) but is implemented in Rust using:
- **PyO3**: Python bindings for Rust
- **bson crate**: MongoDB's official Rust BSON library
- **Maturin**: Build tool for Rust Python extensions
## Implementation History

This implementation was developed through [PR #2695](https://github.com/mongodb/mongo-python-driver/pull/2695) to investigate using Rust as an alternative to C for Python extension modules.

### Key Milestones

1. **Initial Implementation** - Complete BSON type support with 100% test compatibility (88/88 tests passing)
2. **Performance Optimizations** - Type caching, fast paths for common types, direct byte operations
3. **Architectural Analysis** - Identified fundamental performance differences between Rust and C approaches
## Features

### Supported BSON Types

The Rust extension supports all BSON types:
- **Primitives**: Double, String, Int32, Int64, Boolean, Null
- **Complex Types**: Document, Array, Binary, ObjectId, DateTime
- **Special Types**: Regex, Code, Timestamp, Decimal128, MinKey, MaxKey
- **Deprecated Types**: DBPointer (decodes to DBRef)

### CodecOptions Support

Full support for PyMongo's `CodecOptions`:
- `document_class` - Custom document classes
- `tz_aware` - Timezone-aware datetime handling
- `tzinfo` - Timezone conversion
- `uuid_representation` - UUID encoding/decoding modes
- `datetime_conversion` - DateTime handling modes (AUTO, CLAMP, MS)
- `unicode_decode_error_handler` - UTF-8 error handling
### Runtime Selection

The Rust extension is opt-in and is enabled by setting the `PYMONGO_USE_RUST` environment variable:
```bash
export PYMONGO_USE_RUST=1
python your_script.py
```

Without this variable, PyMongo uses the C extension by default.
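The selection rule can be pictured with a small sketch. This README does not show the actual loader code, so the helper name `choose_bson_backend`, the exact-match check on `"1"`, and the pure-Python fallback are all assumptions made for illustration; only the `PYMONGO_USE_RUST` variable and the `_rbson`/`_cbson` module names come from the text above.

```python
import os


def choose_bson_backend(environ=None, rust_available=True, c_available=True):
    """Return the name of the backend a loader might pick (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # Assumption: only the exact value "1" opts in to the Rust extension.
    if env.get("PYMONGO_USE_RUST") == "1" and rust_available:
        return "_rbson"
    # Default described in this README: prefer the C extension.
    if c_available:
        return "_cbson"
    # Assumption: with no compiled extension, fall back to pure Python.
    return "pure-python"


print(choose_bson_backend({"PYMONGO_USE_RUST": "1"}))
print(choose_bson_backend({}))
```

Passing the environment as a plain mapping keeps the rule easy to test without mutating `os.environ`.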
## Performance Analysis

### Current Performance: ~0.21x the speed of C (about 5x slower)

**Benchmark Results** (from PR #2695, relative to the C extension at 100%):
```
Simple documents: C: 100% | Rust: 21%
Mixed types: C: 100% | Rust: 20%
Nested documents: C: 100% | Rust: 18%
Lists: C: 100% | Rust: 22%
```
### Root Cause: Architectural Difference

The performance gap is due to a fundamental architectural difference:

**C Extension Architecture:**
```
Python objects → BSON bytes (direct)
```
- Writes BSON bytes directly from Python objects
- No intermediate data structures
- Minimal memory allocations

**Rust Extension Architecture:**
```
Python objects → Rust Bson enum → BSON bytes
```
- Converts Python objects to Rust `Bson` enum
- Then serializes `Bson` to bytes
- Extra conversion layer adds overhead
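To make the difference concrete, here is a minimal pure-Python sketch of the two shapes. It is an analogy only: it handles just int32, string, and bool values, the function names are hypothetical, and the list of tuples merely stands in for the Rust `Bson` enum. Both functions emit identical bytes; the second one simply builds and walks an extra structure first.

```python
import struct


def _key(name):
    # BSON element names are NUL-terminated C strings.
    return name.encode("utf-8") + b"\x00"


def _append_element(out, name, value):
    """Append one BSON element (bool, int32, or string only) to `out`."""
    if isinstance(value, bool):  # bool must be tested before int
        out += b"\x08" + _key(name) + (b"\x01" if value else b"\x00")
    elif isinstance(value, int):
        out += b"\x10" + _key(name) + struct.pack("<i", value)
    elif isinstance(value, str):
        raw = value.encode("utf-8") + b"\x00"
        out += b"\x02" + _key(name) + struct.pack("<i", len(raw)) + raw
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")


def _finish(out):
    out += b"\x00"                            # document terminator
    struct.pack_into("<i", out, 0, len(out))  # fill in the length prefix
    return bytes(out)


def encode_direct(doc):
    """One pass: Python dict -> BSON bytes."""
    out = bytearray(4)  # placeholder for the length prefix
    for name, value in doc.items():
        _append_element(out, name, value)
    return _finish(out)


def encode_via_intermediate(doc):
    """Two passes: dict -> list of tagged nodes -> BSON bytes."""
    # Pass 1: build an intermediate tree (stand-in for the Bson enum).
    nodes = [(type(value).__name__, name, value) for name, value in doc.items()]
    # Pass 2: serialize the tree.
    out = bytearray(4)
    for _tag, name, value in nodes:
        _append_element(out, name, value)
    return _finish(out)


doc = {"hello": "world", "n": 1, "ok": True}
assert encode_direct(doc) == encode_via_intermediate(doc)
```

The intermediate list costs one extra allocation per element plus a second traversal, which is the kind of overhead the bullets above describe.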
### Optimization Attempts

Multiple optimization strategies were attempted in PR #2695:

1. **Type Caching** - Cache frequently used Python types (UUID, datetime, etc.)
2. **Fast Paths** - Special handling for common types (int, str, bool, None)
3. **Direct Byte Writing** - Write BSON bytes directly without intermediate `Document`
4. **PyDict Fast Path** - Use `PyDict_Next` for efficient dict iteration

**Result**: These optimizations improved performance from ~0.15x to ~0.21x, but the fundamental architectural difference remains.
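The first two strategies can be illustrated with a short Python analogy. This is not the Rust code from the PR: the names `_FAST_PATH` and `classify` are hypothetical, and in the extension the cached objects would be Python type handles held on the Rust side. The idea is the same, though: resolve type objects once, and check the most common exact types before falling back to slower `isinstance` tests.

```python
import uuid
from datetime import datetime

# "Type caching": look the classes up once at import time,
# not on every value that is encoded.
_UUID_TYPE = uuid.UUID
_DATETIME_TYPE = datetime

# "Fast path": exact-type lookup for the most common values.
_FAST_PATH = {int: "int", str: "string", bool: "bool", type(None): "null"}


def classify(value):
    """Return a label for the BSON encoding path a value would take."""
    kind = _FAST_PATH.get(type(value))  # one dict lookup, no isinstance chain
    if kind is not None:
        return kind
    # Slow path: subclass-aware checks against the cached types.
    if isinstance(value, _UUID_TYPE):
        return "uuid"
    if isinstance(value, _DATETIME_TYPE):
        return "datetime"
    return "other"


print(classify(42), classify("x"), classify(True), classify(None))
```

Because the fast path keys on `type(value)` rather than `isinstance`, `True` maps to `bool` and never falls into the `int` branch.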
## Comparison with Copilot POC (PR #2689)

The current implementation evolved significantly from the initial Copilot-generated proof-of-concept in PR #2689:

### Copilot POC (PR #2689) - Initial Spike
**Status**: 53/88 tests passing (60%)

**Build System**: `cargo build --release` (manual copy of .so file)
- Used raw `cargo` commands
- Manual file copying to project root
- No wheel generation
- Located in `rust/` directory

**What it had:**
- ✅ Basic BSON type support (int, float, string, bool, bytes, dict, list, null)
- ✅ ObjectId, DateTime, Regex encoding/decoding
- ✅ Binary, Code, Timestamp, Decimal128, MinKey, MaxKey support
- ✅ DBRef and DBPointer decoding
- ✅ Int64 type marker support
- ✅ Basic CodecOptions (tz_aware, uuid_representation)
- ✅ Buffer protocol support (memoryview, array)
- ✅ `_id` field ordering at top level
- ✅ Benchmark scripts and performance analysis
- ✅ Comprehensive documentation (RUST_SPIKE_RESULTS.md)
- ✅ **Same Rust architecture**: PyO3 0.27 + bson 2.13 crate (Python → Bson enum → bytes)

**What it lacked:**
- ❌ Only 60% test pass rate (53/88 tests)
- ❌ Incomplete datetime handling (no DATETIME_CLAMP, DATETIME_AUTO, DATETIME_MS modes)
- ❌ Missing unicode_decode_error_handler support
- ❌ No document_class support from CodecOptions
- ❌ No tzinfo conversion support
- ❌ Missing BSON validation (size checks, null terminator)
- ❌ No performance optimizations (type caching, fast paths)
- ❌ Located in `rust/` directory instead of `bson/_rbson/`

**Performance Claims**: 2.89x average speedup over C (from benchmarks in POC)

**Why the POC appeared faster:**
The Copilot POC's claimed 2.89x speedup was likely due to:
1. **Limited test scope** - Benchmarks only tested simple documents that passed (53/88 tests)
2. **Missing validation** - No BSON size checks, null terminator validation, or extra bytes detection
3. **Incomplete CodecOptions** - Skipped expensive operations like:
   - Timezone conversions (`tzinfo` with `astimezone()`)
   - DateTime mode handling (CLAMP, AUTO, MS)
   - Unicode error handler fallbacks to Python
   - Custom document_class instantiation
4. **Optimistic measurements** - May have measured only the fast path without edge cases
5. **Different test methodology** - POC used custom benchmarks vs production testing with full PyMongo test suite

When these missing features were added to achieve 100% compatibility, the true performance cost of the Rust `Bson` enum architecture became apparent.
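As an illustration of the validation work the POC skipped, the sketch below checks only the outer framing of a BSON document: the declared size, the trailing null terminator, and extra bytes after the document. The function name and the error messages are hypothetical and are not the ones used by either extension.

```python
import struct


def check_bson_framing(data: bytes) -> None:
    """Raise ValueError if the outer BSON framing of `data` is invalid."""
    # The smallest valid document is 5 bytes: int32 length + terminator.
    if len(data) < 5:
        raise ValueError("too short to be a BSON document")
    (declared,) = struct.unpack_from("<i", data, 0)
    if declared < 5:
        raise ValueError("declared size is smaller than the minimum document")
    if declared > len(data):
        raise ValueError("declared size exceeds the available bytes")
    if declared < len(data):
        raise ValueError("extra bytes after the end of the document")
    if data[-1] != 0:
        raise ValueError("document is missing its null terminator")


check_bson_framing(b"\x05\x00\x00\x00\x00")  # the empty document passes
```

Each check is cheap on its own, but a decoder has to repeat them for every nested document and array, so they show up in benchmarks once they are implemented.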
### Current Implementation (PR #2695) - Production-Ready
**Status**: 88/88 tests passing (100%)

**Build System**: `maturin build --release` (proper wheel generation)
- Uses Maturin for proper Python packaging
- Generates wheels with correct metadata
- Extracts .so file to `bson/` directory
- Located in `bson/_rbson/` directory (proper module structure)

**Improvements over Copilot POC:**
- **100% test compatibility** (88/88 vs 53/88)
- **Complete CodecOptions support**:
  - `document_class` - Custom document classes
  - `tzinfo` - Timezone conversion with astimezone()
  - `datetime_conversion` - All modes (AUTO, CLAMP, MS)
  - `unicode_decode_error_handler` - Fallback to Python for non-strict handlers
- **BSON validation** (size checks, null terminator, extra bytes detection)
- **Performance optimizations**:
  - Type caching (UUID, datetime, Pattern, etc.)
  - Fast paths for common types (int, str, bool, None)
  - Direct byte operations where possible
  - PyDict fast path with pre-allocation
- **Production-ready error handling** (matches C extension error messages exactly)
- **Proper module structure** (`bson/_rbson/` with build.sh and maturin)
- **Runtime selection** via PYMONGO_USE_RUST environment variable
- **Comprehensive testing** (cross-compatibility tests, performance benchmarks)
- **Same Rust architecture**: PyO3 0.23 + bson 2.13 crate (Python → Bson enum → bytes)

**Performance Reality**: ~0.21x (5x slower than C) - see Performance Analysis section

**Key Insights**:
1. **Same Architecture, Different Results**: Both implementations use the same Rust architecture (PyO3 + bson crate with intermediate `Bson` enum), so the build system (cargo vs maturin) is not the cause of the performance difference.
2. **Incomplete vs Complete**: The POC's speed claims were based on incomplete functionality (60% test pass rate). Achieving 100% compatibility revealed the true performance cost of:
   - Complete CodecOptions handling (timezone conversions, datetime modes, etc.)
   - BSON validation (size checks, null terminators, extra bytes)
   - Production-ready error handling
   - Edge case handling for all 88 tests
3. **The Fundamental Issue**: Both implementations suffer from the same architectural limitation (Python → Bson enum → bytes), but it only becomes a significant bottleneck when you implement all the features required for production use.
## Steps to Achieve Performance Parity with C Extensions

Based on the analysis in PR #2695, here are the steps needed to match C extension performance:

### 1. Eliminate Intermediate Bson Enum (High Impact)
**Current**: Python → Bson → bytes
**Target**: Python → bytes (direct)

Implement direct BSON byte writing from Python objects without converting to `Bson` enum first. This would require:
- Custom serialization logic for each Python type
- Manual BSON format handling (type bytes, length prefixes, etc.)
- Bypassing the `bson` crate's serialization layer

**Estimated Impact**: 3-4x performance improvement

### 2. Optimize Python API Calls (Medium Impact)
- Reduce `getattr()` calls by caching attribute lookups
- Use `PyDict_GetItem` instead of `dict.get_item()`
- Minimize Python exception handling overhead
- Use `PyTuple_GET_ITEM` for tuple access

**Estimated Impact**: 1.2-1.5x performance improvement

### 3. Memory Allocation Optimization (Low-Medium Impact)
- Pre-allocate buffers based on estimated document size
- Reuse buffers across multiple encode operations
- Use arena allocation for temporary objects

**Estimated Impact**: 1.1-1.3x performance improvement
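The pre-allocation idea from step 3 can be sketched in Python as follows. It is illustrative only: it supports just int32 and string values, the function names are hypothetical, and for these two types the size "estimate" happens to be exact. The point is that the output buffer is allocated once and then filled in place, instead of growing as elements are appended.

```python
import struct


def estimated_size(doc):
    """Size in bytes of `doc` as BSON (int32 and string values only)."""
    size = 4 + 1  # length prefix + document terminator
    for name, value in doc.items():
        size += 1 + len(name.encode("utf-8")) + 1  # type byte + key + NUL
        if isinstance(value, str):
            size += 4 + len(value.encode("utf-8")) + 1
        elif isinstance(value, int) and not isinstance(value, bool):
            size += 4
        else:
            raise TypeError(f"unsupported type: {type(value).__name__}")
    return size


def encode_preallocated(doc):
    buf = bytearray(estimated_size(doc))  # single zero-filled allocation
    pos = 4                               # skip the length prefix for now
    for name, value in doc.items():
        key = name.encode("utf-8")
        buf[pos] = 0x02 if isinstance(value, str) else 0x10
        pos += 1
        buf[pos:pos + len(key)] = key
        pos += len(key) + 1               # the NUL byte is already zero
        if isinstance(value, str):
            raw = value.encode("utf-8")
            struct.pack_into("<i", buf, pos, len(raw) + 1)
            pos += 4
            buf[pos:pos + len(raw)] = raw
            pos += len(raw) + 1           # string NUL is already zero
        else:
            struct.pack_into("<i", buf, pos, value)
            pos += 4
    pos += 1                              # document terminator, already zero
    struct.pack_into("<i", buf, 0, pos)
    return bytes(buf)


print(encode_preallocated({"hello": "world"}).hex())
```

A real implementation would also need a growth strategy for cases where the estimate is too small, such as nested documents whose size is not known up front.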
### 4. SIMD Optimizations (Low Impact)
- Use SIMD for byte copying operations
- Vectorize validation checks
- Optimize string encoding/decoding

**Estimated Impact**: 1.05-1.1x performance improvement
### Combined Potential
If all optimizations are implemented successfully, and taking one representative factor from each estimated range above:
- Current: 0.21x (5x slower)
- Target: 0.21x × 3.5 × 1.3 × 1.2 × 1.05 = **~1.20x** (about 20% faster than C)

However, achieving this would require:
- Significant engineering effort (weeks to months)
- Bypassing the `bson` crate (losing its benefits)
- Complex low-level code (harder to maintain)
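The target figure is simply the product of the current ratio and the per-step factors, so it is easy to recompute. The short check below uses the factors from the target line and, as an added assumption, also multiplies the low and high ends of each estimated range to show how wide the projection really is.

```python
from math import prod

current = 0.21  # measured Rust/C throughput ratio from this README

# One representative factor per step, as used in the target line.
chosen = [3.5, 1.3, 1.2, 1.05]
# Low and high ends of each step's estimated range.
low = [3.0, 1.2, 1.1, 1.05]
high = [4.0, 1.5, 1.3, 1.1]

projected = current * prod(chosen)
print(f"projected: {projected:.2f}x")  # about 1.20x
print(f"range:     {current * prod(low):.2f}x to {current * prod(high):.2f}x")
```

Because the individual estimates multiply, small errors in any one of them shift the result noticeably; the low end of the range stays below parity with C.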
## Building

```bash
cd bson/_rbson
./build.sh
```

Or using maturin directly:
```bash
maturin develop --release
```

## Testing

Run the test suite with the Rust extension:
```bash
PYMONGO_USE_RUST=1 python -m pytest test/
```

Run performance benchmarks:
```bash
python test/performance/perf_test.py
```
## Conclusion

The Rust extension demonstrates that:
1. **Rust can provide a complete, production-ready BSON implementation**
2. **100% compatibility with existing tests and APIs is achievable**
3. **Performance parity with C requires bypassing the `bson` crate**
4. **The engineering effort may not justify the benefits**

### Recommendation

The Rust extension is **production-ready** from a correctness standpoint but **not recommended** for performance-critical applications. The C extension remains the better choice for performance.

**Use Cases for Rust Extension:**
- Platforms where C compilation is difficult (e.g., WebAssembly)
- Development environments without C toolchain
- Testing and validation purposes
- Future exploration if `bson` crate performance improves

For more details, see:
- [PYTHON-5683 JIRA ticket](https://jira.mongodb.org/browse/PYTHON-5683)
- [PR #2695](https://github.com/mongodb/mongo-python-driver/pull/2695)
