| 1 | +# Rust Extension Testing in Evergreen |
| 2 | + |
| 3 | +This directory contains configuration and scripts for testing the Rust BSON extension in Evergreen CI. |
| 4 | + |
| 5 | +## Files |
| 6 | + |
| 7 | +### `run-rust-tests.sh` |
| 8 | +Standalone script that: |
| 9 | +1. Installs Rust toolchain if needed |
| 10 | +2. Installs maturin (Rust-Python build tool) |
| 11 | +3. Builds pymongo with Rust extension enabled |
| 12 | +4. Verifies the Rust extension is active |
| 13 | +5. Runs BSON tests with the Rust extension |
| 14 | + |
| 15 | +**Usage:** |
| 16 | +```bash |
| 17 | +cd /path/to/mongo-python-driver |
| 18 | +.evergreen/run-rust-tests.sh |
| 19 | +``` |
| 20 | + |
| 21 | +**Environment Variables:** |
| 22 | +- `PYMONGO_BUILD_RUST=1` - Requires building the Rust extension (build fails if Rust unavailable) |
| 23 | +- `PYMONGO_USE_RUST=1` - Forces runtime to use Rust extension |
| 24 | + |
| 25 | +### `rust-extension.yml` |
| 26 | +Evergreen configuration for Rust extension testing. Defines: |
| 27 | +- **Functions**: `test rust extension` - Runs the Rust test script |
| 28 | +- **Tasks**: Test tasks for different Python versions (3.10, 3.12, 3.14) |
| 29 | +- **Build Variants**: Test configurations for RHEL8, macOS ARM64, and Windows |
| 30 | + |
| 31 | +**To integrate into main config:** |
| 32 | +Add to `.evergreen/config.yml`: |
| 33 | +```yaml |
| 34 | +include: |
| 35 | + - filename: .evergreen/generated_configs/functions.yml |
| 36 | + - filename: .evergreen/generated_configs/tasks.yml |
| 37 | + - filename: .evergreen/generated_configs/variants.yml |
| 38 | + - filename: .evergreen/rust-extension.yml # Add this line |
| 39 | +``` |
| 40 | + |
| 41 | +## Integration with Generated Config |
| 42 | + |
| 43 | +The Rust extension tests can also be integrated into the generated Evergreen configuration. |
| 44 | + |
| 45 | +### Modifications to `scripts/generate_config.py` |
| 46 | + |
| 47 | +Three new functions have been added: |
| 48 | + |
| 49 | +1. **`create_test_rust_tasks()`** - Creates test tasks for Python 3.10, 3.12, and 3.14 |
| 50 | +2. **`create_test_rust_variants()`** - Creates build variants for RHEL8, macOS ARM64, and Windows |
| 51 | +3. **`create_test_rust_func()`** - Creates the function to run Rust tests |
| 52 | + |
| 53 | +### Regenerating Config |
| 54 | + |
| 55 | +To regenerate the Evergreen configuration with Rust tests: |
| 56 | + |
| 57 | +```bash |
| 58 | +cd .evergreen/scripts |
| 59 | +python generate_config.py |
| 60 | +``` |
| 61 | + |
| 62 | +**Note:** Requires the `shrub.py` Python package (imported as `shrub`): |
| 63 | +```bash |
| 64 | +pip install shrub.py |
| 65 | +``` |
| 66 | + |
| 67 | +## Test Coverage |
| 68 | + |
| 69 | +The Rust extension currently passes **every BSON test that runs** (60 tests: 58 passed, 2 skipped): |
| 70 | + |
| 71 | +### Passing Tests |
| 72 | +- Basic BSON encoding/decoding |
| 73 | +- All BSON types (ObjectId, DateTime, Decimal128, Regex, Binary, Code, Timestamp, etc.) |
| 74 | +- Binary data handling (including UUID with all representation modes) |
| 75 | +- Nested documents and arrays |
| 76 | +- Exception handling (InvalidDocument, InvalidBSON, OverflowError) |
| 77 | +- Error message formatting with document property |
| 78 | +- Datetime clamping and timezone handling |
| 79 | +- Custom classes and codec options |
| 80 | +- Buffer protocol support (bytes, bytearray, memoryview, array, mmap) |
| 81 | +- Unicode decode error handlers |
| 82 | +- BSON validation (document structure, string null terminators, size fields) |
| 83 | + |
| 84 | +### Skipped Tests |
| 85 | +- **2 tests** - Require optional numpy dependency |
| 86 | + |
| 87 | +## Platform Support |
| 88 | + |
| 89 | +The Rust extension is tested on: |
| 90 | +- **Linux (RHEL8)** - Primary platform, runs on PRs |
| 91 | +- **macOS ARM64** - Secondary platform |
| 92 | +- **Windows 64-bit** - Secondary platform |
| 93 | + |
| 94 | +## Performance |
| 95 | + |
| 96 | +The Rust extension is currently **slower than the C extension** for both encoding and decoding (figures are Rust speed relative to C): |
| 97 | +- Simple encoding: **0.84x** (16% slower than C) |
| 98 | +- Complex encoding: **0.21x** (about 5x slower than C) |
| 99 | +- Simple decoding: **0.42x** (2.4x slower than C) |
| 100 | +- Complex decoding: **0.29x** (3.4x slower than C) |
| 101 | + |
| 102 | +The main bottleneck is **Python FFI overhead** - creating Python objects from Rust incurs significant performance cost. |
| 103 | + |
| 104 | +**Benefits of Rust implementation:** |
| 105 | +- Memory safety guarantees (prevents buffer overflows and use-after-free bugs) |
| 106 | +- Easier maintenance and debugging with strong type system |
| 107 | +- Cross-platform compatibility via Rust's toolchain |
| 108 | +- 100% test compatibility with C extension |
| 109 | + |
| 110 | +**Recommendation:** The C extension remains the default and recommended choice. The Rust extension demonstrates feasibility and correctness but is not yet performance-competitive for production use. |
| 111 | + |
| 112 | +## Future Work |
| 113 | + |
| 114 | +- Performance optimization (type caching, reduced FFI overhead) |
| 115 | +- Performance benchmarking suite |
| 116 | +- Additional BSON type optimizations |
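
The Performance figures mix two notations: a speed ratio such as `0.21x` and a slowdown such as "5x slower". As a rough sketch of how the two relate, assuming each ratio means Rust speed divided by C speed (the README does not state this explicitly), the slowdown is the reciprocal of the ratio. The names `slowdown_factor`, `summarize`, and `RATIOS` are hypothetical and not part of pymongo or the Evergreen scripts.

```python
def slowdown_factor(relative_speed: float) -> float:
    """Convert a speed ratio (assumed: Rust speed / C speed) into a run-time multiple."""
    if relative_speed <= 0:
        raise ValueError("relative_speed must be positive")
    return 1.0 / relative_speed


# Speed ratios quoted in the Performance section of the README.
RATIOS = {
    "simple encoding": 0.84,
    "complex encoding": 0.21,
    "simple decoding": 0.42,
    "complex decoding": 0.29,
}


def summarize(ratios: dict[str, float]) -> dict[str, float]:
    """Return the run-time multiple for each benchmark, rounded to two decimals."""
    return {name: round(slowdown_factor(ratio), 2) for name, ratio in ratios.items()}


if __name__ == "__main__":
    for name, factor in summarize(RATIOS).items():
        print(f"{name}: {factor:.2f}x the run time of C")
```

Under that assumption, complex encoding at `0.21x` works out to roughly 4.8 times the run time of C, which the README rounds to 5x.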