
Commit 0d76638

Release v0.2.1
* Bump version to 0.2.1 for development
* feat: add configurable storage modes for product quantization
* Add: bincode v2.0.1 dependency for binary serialization
* feat: implement Rust backend support for storage mode configuration
* 📄 docs(changelog): update for v0.2.1 release
* Add: latest uv.lock file for reproducible Python dependency management
* test: add comprehensive storage mode tests for quantization configurations
* 📄 docs(readme): updated the content information
* 📄 docs(readme): updated the content information
1 parent f73a237 commit 0d76638

10 files changed

Lines changed: 907 additions & 44 deletions


CHANGELOG.md

Lines changed: 35 additions & 0 deletions
@@ -7,6 +7,41 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ---
 
+## [0.2.1] - 2025-
+
+### Added
+- Storage mode configuration for product quantization: new `storage_mode` parameter in the quantization config allows users to choose between:
+  - `"quantized_only"` (default): maximum memory efficiency by discarding raw vectors after quantization
+  - `"quantized_with_raw"`: keep both quantized codes and raw vectors for exact reconstruction
+- Case-insensitive storage mode validation: accepts variations like `"Quantized_Only"` and `"QUANTIZED_WITH_RAW"`
+- Automatic memory usage warnings: users are warned when `quantized_with_raw` mode will use significantly more memory
+- Enhanced subvector divisor suggestions: `_suggest_subvector_divisors()` now returns `list[int]` for programmatic use
+- `StorageMode` enum: Rust backend support for `quantized_only` and `quantized_with_raw` storage modes with JSON serialization
+- Storage mode parsing: complete quantization config parsing in the `HNSWIndex` constructor with proper error handling
+- Intelligent vector retrieval: `get_records()` now prioritizes raw vectors over PQ reconstruction when available
+- Enhanced statistics: `get_stats()` now reports storage mode, memory usage breakdown, and storage strategy information
+- Memory usage tracking: real-time memory usage calculations for both raw vectors and quantized codes
+
+### Changed
+- Quantization config validation: now includes comprehensive validation and normalization of all parameters
+- Error messages: improved clarity for storage mode validation, with sorted mode suggestions
+- Defensive programming: added final safety checks to ensure a complete configuration before passing it to the Rust backend
+- `QuantizationConfig` struct: now includes a `storage_mode` field with backward-compatible defaults
+- `add_quantized_vector` logic: respects the storage mode configuration to conditionally store raw vectors
+- `get_stats` output: enhanced with storage strategy descriptions ("memory_optimized" vs "quality_optimized")
+- Vector storage behavior: `quantized_only` mode stops storing raw vectors after PQ training for maximum memory efficiency
+
+### Fixed
+- Configuration completeness: all quantization parameters now have guaranteed defaults to prevent missing-key errors
+- `None` value handling: Python config cleaning now properly removes `None` values before passing the config to the Rust backend
+- Constructor parameter validation: improved error handling for missing or invalid quantization parameters
+- Memory statistics accuracy: corrected memory usage calculations based on actual storage mode behavior
+
+### Removed
+<!-- Add removals/deprecations here -->
+
+---
+
 ## [0.2.0] - 2025-07-28
 
 ### Added
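The case-insensitive validation and sorted error suggestions listed in the 0.2.1 notes above can be sketched in isolation. This is a hypothetical stand-alone sketch (`normalize_storage_mode` is an illustrative name, not the library's API); the real checks live in the Python validator changed elsewhere in this commit.

```python
# Hypothetical sketch of the storage-mode normalization described in the changelog.
VALID_MODES = {"quantized_only", "quantized_with_raw"}

def normalize_storage_mode(value: str = "quantized_only") -> str:
    """Lower-case the mode and reject anything outside the supported set."""
    mode = str(value).lower()
    if mode not in VALID_MODES:
        raise ValueError(
            f"Invalid storage_mode: '{mode}'. "
            f"Supported modes: {', '.join(sorted(VALID_MODES))}"
        )
    return mode

print(normalize_storage_mode("Quantized_Only"))      # quantized_only
print(normalize_storage_mode("QUANTIZED_WITH_RAW"))  # quantized_with_raw
```

Accepting `"Quantized_Only"` but storing the lower-cased form keeps the downstream config canonical, which is why the validator re-assigns the normalized value rather than the user's original string.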

README.md

Lines changed: 41 additions & 3 deletions
@@ -575,12 +575,13 @@ To enable PQ, pass a `quantization_config` dictionary to the `.create()` index m
 | `bits` | `int` | Bits per quantized code (controls centroids per subvector) | 1-8 | `8` |
 | `training_size` | `int` | Minimum vectors needed for stable k-means clustering | ≥ 1000 | `1000` |
 | `max_training_vectors` | `int` | Maximum vectors used during training (optional limit) | ≥ training_size | `None` |
+| `storage_mode` | `str` | Storage strategy: `"quantized_only"` (memory optimized) or `"quantized_with_raw"` (keep raw vectors for exact reconstruction) | `"quantized_only"`, `"quantized_with_raw"` | `"quantized_only"` |
 
 <br/>
 
-### 🔧 Usage Example
+### 🔧 Usage Example 1
 
 ```python
 from zeusdb_vector_database import VectorDatabase
@@ -646,6 +647,36 @@ Results
     {'id': 'doc_8148', 'score': 0.5139288306236267, 'metadata': {'category': 'tech', 'year': 2026}},
     {'id': 'doc_7822', 'score': 0.5151920914649963, 'metadata': {'category': 'tech', 'year': 2026}},
 ]
+```
+<br />
+
+### 🔧 Usage Example 2 - with explicit storage mode
+
+```python
+from zeusdb_vector_database import VectorDatabase
+import numpy as np
+
+# Create a vector database instance
+vdb = VectorDatabase()
+
+# Configure quantization for memory efficiency
+quantization_config = {
+    'type': 'pq',                     # `pq` for Product Quantization
+    'subvectors': 8,                  # Divide 3072-dim vectors into 8 subvectors of 384 dims each
+    'bits': 8,                        # 256 centroids per subvector (2^8)
+    'training_size': 10000,           # Train once 10k vectors are collected
+    'max_training_vectors': 50000,    # Use at most 50k vectors for training
+    'storage_mode': 'quantized_only'  # Explicitly keep only the quantized codes
+}
+
+# Create index with quantization
+# Training is handled automatically once enough vectors are added
+index = vdb.create(
+    index_type="hnsw",
+    dim=3072,                         # OpenAI `text-embedding-3-large` dimension
+    quantization_config=quantization_config  # Add the compression configuration
+)
 ```
 
 <br />
@@ -658,7 +689,8 @@ quantization_config = {
     'type': 'pq',
     'subvectors': 8,          # Balanced: moderate compression, good accuracy
     'bits': 8,                # 256 centroids per subvector (high precision)
-    'training_size': 10000    # Or higher for large datasets
+    'training_size': 10000,   # Or higher for large datasets
+    'storage_mode': 'quantized_only'  # Default, memory efficient
 }
 # Achieves ~16x–32x compression with strong recall for most applications
 ```
@@ -670,7 +702,8 @@ quantization_config = {
     'type': 'pq',
     'subvectors': 16,         # More subvectors = better compression
     'bits': 6,                # Fewer bits = less memory per centroid
-    'training_size': 20000
+    'training_size': 20000,
+    'storage_mode': 'quantized_only'
 }
 # Achieves ~32x compression ratio
 ```
@@ -682,6 +715,7 @@ quantization_config = {
     'subvectors': 4,          # Fewer subvectors = better accuracy
     'bits': 8,                # More bits = more precise quantization
-    'training_size': 50000    # More training data = better centroids
+    'training_size': 50000,   # More training data = better centroids
+    'storage_mode': 'quantized_with_raw'  # Keep raw vectors for exact recall
 }
 # Achieves ~4x compression ratio with minimal accuracy loss
 ```
@@ -695,6 +729,10 @@ quantization_config = {
 
 Quantization is ideal for production deployments with large vector datasets (100k+ vectors) where memory efficiency is critical.
 
+`"quantized_only"` is recommended for most use cases and maximizes memory savings.
+
+`"quantized_with_raw"` keeps both quantized and raw vectors for exact reconstruction, but uses more memory.
+
 
 <br/>
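The trade-off between the two modes can be illustrated with a toy product-quantization round trip. This is a hypothetical sketch, independent of ZeusDB, using random codebooks in place of trained k-means centroids: a stored raw copy (`quantized_with_raw`) reconstructs exactly, while decoding PQ codes alone (`quantized_only`) is approximate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: 8-dim vectors, 4 subvectors of 2 dims, 2 bits -> 4 centroids each
dim, subvectors, bits = 8, 4, 2
sub_dim = dim // subvectors
codebooks = rng.standard_normal((subvectors, 2**bits, sub_dim)).astype(np.float32)

def pq_encode(vec):
    """Nearest-centroid id for each subvector slice."""
    codes = []
    for m in range(subvectors):
        sub = vec[m * sub_dim:(m + 1) * sub_dim]
        dists = np.linalg.norm(codebooks[m] - sub, axis=1)
        codes.append(int(np.argmin(dists)))
    return codes

def pq_decode(codes):
    """Approximate reconstruction: concatenate the chosen centroids."""
    return np.concatenate([codebooks[m][c] for m, c in enumerate(codes)])

raw = rng.standard_normal(dim).astype(np.float32)
codes = pq_encode(raw)

exact = raw               # 'quantized_with_raw': the raw copy is returned as-is
approx = pq_decode(codes)  # 'quantized_only': decode from codes only

print(np.array_equal(exact, raw))   # True - raw path is lossless
print(float(np.linalg.norm(approx - raw)))  # nonzero - PQ decode is lossy
```

The memory numbers follow from the same picture: `quantized_only` stores `subvectors` small codes per vector, while `quantized_with_raw` additionally keeps `dim` float32 values per vector.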

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
+# Usage Examples for Storage Mode Configuration
+
+from zeusdb_vector_database import VectorDatabase
+import numpy as np
+
+# Example 1: Default (Memory Efficient) - quantized_only
+vdb = VectorDatabase()
+index_memory_efficient = vdb.create(
+    "hnsw",
+    dim=768,
+    quantization_config={
+        "type": "pq",
+        "subvectors": 8,
+        "bits": 8,
+        "training_size": 10000
+        # storage_mode defaults to "quantized_only"
+    }
+)
+
+# Example 2: Explicit quantized_only mode
+index_explicit = vdb.create(
+    "hnsw",
+    dim=768,
+    quantization_config={
+        "type": "pq",
+        "subvectors": 8,
+        "bits": 8,
+        "training_size": 10000,
+        "storage_mode": "quantized_only"
+    }
+)
+
+# Example 3: Keep raw vectors for exact reconstruction
+index_with_raw = vdb.create(
+    "hnsw",
+    dim=768,
+    quantization_config={
+        "type": "pq",
+        "subvectors": 8,
+        "bits": 8,
+        "training_size": 10000,
+        "storage_mode": "quantized_with_raw"  # Keep both quantized + raw
+    }
+)
+# This will show a warning about increased memory usage
+
+# Testing the different modes
+def test_storage_modes():
+    # Generate test data
+    vectors = np.random.random((15000, 768)).astype(np.float32)
+
+    # Test quantized_only mode
+    print("=== Testing quantized_only mode ===")
+    index1 = vdb.create("hnsw", dim=768, quantization_config={
+        "type": "pq", "subvectors": 8, "bits": 8,
+        "training_size": 10000, "storage_mode": "quantized_only"
+    })
+
+    # Add vectors (will trigger training)
+    result1 = index1.add(vectors.tolist())
+    print(f"Added: {result1.total_inserted}, Errors: {result1.total_errors}")
+
+    # Check stats
+    stats1 = index1.get_stats()
+    print(f"Storage mode: {stats1['storage_mode']}")
+    print(f"Raw vectors stored: {stats1['raw_vectors_stored']}")
+    print(f"Quantized codes stored: {stats1['quantized_codes_stored']}")
+
+    # Get records (will use PQ reconstruction)
+    records1 = index1.get_records(["vec_1"], return_vector=True)
+    print(f"Vector available: {'vector' in records1[0] if records1 else False}")
+
+    if records1 and 'vector' in records1[0]:
+        print(f"Vector length: {len(records1[0]['vector'])}")
+
+    print("\n=== Testing quantized_with_raw mode ===")
+    index2 = vdb.create("hnsw", dim=768, quantization_config={
+        "type": "pq", "subvectors": 8, "bits": 8,
+        "training_size": 10000, "storage_mode": "quantized_with_raw"
+    })
+
+    # Add vectors (will trigger training)
+    result2 = index2.add(vectors.tolist())
+    print(f"Added: {result2.total_inserted}, Errors: {result2.total_errors}")
+
+    # Check stats
+    stats2 = index2.get_stats()
+    print(f"Storage mode: {stats2['storage_mode']}")
+    print(f"Raw vectors stored: {stats2['raw_vectors_stored']}")
+    print(f"Quantized codes stored: {stats2['quantized_codes_stored']}")
+
+    # Get records (will use exact raw vectors)
+    records2 = index2.get_records(["vec_1"], return_vector=True)
+    print(f"Vector available: {'vector' in records2[0] if records2 else False}")
+
+    if records2 and 'vector' in records2[0]:
+        print(f"Vector length: {len(records2[0]['vector'])}")
+
+    # Compare memory usage
+    print("\nMemory comparison:")
+    print(f"quantized_only - Raw vectors: {stats1['raw_vectors_stored']}")
+    print(f"quantized_with_raw - Raw vectors: {stats2['raw_vectors_stored']}")
+
+# Error handling test
+def test_invalid_storage_mode():
+    try:
+        vdb = VectorDatabase()
+        vdb.create("hnsw", dim=768, quantization_config={
+            "type": "pq",
+            "subvectors": 8,
+            "bits": 8,
+            "training_size": 10000,
+            "storage_mode": "invalid_mode"  # This should fail
+        })
+    except ValueError as e:
+        print(f"Expected error: {e}")
+
+if __name__ == "__main__":
+    test_storage_modes()
+    test_invalid_storage_mode()
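The comment in Example 3 above ("This will show a warning about increased memory usage") can also be asserted in tests with the standard `warnings` machinery. This sketch uses a hypothetical stand-in function, `warn_if_raw_kept`, that mirrors the validator's warning path, so the pattern runs without the zeusdb backend installed.

```python
import warnings

# Hypothetical stand-in for the validator's warning path (not the ZeusDB API).
def warn_if_raw_kept(storage_mode: str, compression_ratio: float = 4.0) -> None:
    if storage_mode == "quantized_with_raw":
        warnings.warn(
            f"storage_mode='quantized_with_raw' will use ~{compression_ratio:.1f}x "
            f"more memory than 'quantized_only' but enables exact reconstruction.",
            UserWarning,
            stacklevel=2,
        )

# Capture warnings instead of printing them, as a unit test would
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_if_raw_kept("quantized_with_raw")  # should warn
    warn_if_raw_kept("quantized_only")      # should stay silent

print(len(caught))  # 1
print(caught[0].category is UserWarning)  # True
```

`record=True` collects the emitted warnings in a list, and `simplefilter("always")` keeps duplicate warnings from being suppressed across repeated calls.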

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "zeusdb-vector-database"
-version = "0.2.0"
+version = "0.2.1"
 description = "Blazing-fast vector DB with real-time similarity search and metadata filtering."
 readme = "README.md"
 authors = [

src/zeusdb_vector_database/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 """
 ZeusDB Vector Database Module
 """
-__version__ = "0.2.0"
+__version__ = "0.2.1"
 
 from .vector_database import VectorDatabase  # imports the VectorDatabase class from the vector_database.py file

src/zeusdb_vector_database/vector_database.py

Lines changed: 47 additions & 14 deletions
@@ -56,7 +56,8 @@ def create(self, index_type: str = "hnsw", quantization_config: Optional[Dict[st
         'subvectors': 8,                  # Number of subvectors (must divide dim evenly, default: 8)
         'bits': 8,                        # Bits per subvector (1-8, controls centroids, default: 8)
         'training_size': None,            # Auto-calculated based on subvectors & bits (or specify manually)
-        'max_training_vectors': None      # Optional limit on training vectors used
+        'max_training_vectors': None,     # Optional limit on training vectors used
+        'storage_mode': 'quantized_only'  # Storage mode for quantized vectors (or 'quantized_with_raw')
     }
 
     Note: Quantization reduces memory usage (typically 4-32x compression) but may
@@ -88,7 +89,8 @@ def create(self, index_type: str = "hnsw", quantization_config: Optional[Dict[st
         'type': 'pq',
         'subvectors': 16,        # More subvectors = better compression
         'bits': 6,               # Fewer bits = less memory per centroid
-        'training_size': 75000   # Override auto-calculation
+        'training_size': 75000,  # Override auto-calculation
+        'storage_mode': 'quantized_only'  # Only store quantized vectors
     }
     index = vdb.create(
         index_type="hnsw",
@@ -126,11 +128,12 @@ def create(self, index_type: str = "hnsw", quantization_config: Optional[Dict[st
 
         try:
             # Always pass quantization_config parameter
-            clean_config = None
             if quantization_config is not None:
-                # Clean quantization_config before passing to Rust (remove internal keys)
-                clean_config = {k: v for k, v in quantization_config.items() if not k.startswith('_')}
-
+                # Remove keys with None values and internal keys
+                clean_config = {k: v for k, v in quantization_config.items() if not k.startswith('_') and v is not None}
+            else:
+                clean_config = None
+
             return constructor(quantization_config=clean_config, **kwargs)
         except Exception as e:
             raise RuntimeError(f"Failed to create {index_type.upper()} index: {e}") from e
@@ -172,7 +175,7 @@ def _validate_quantization_config(self, config: Dict[str, Any], dim: int) -> Dic
         if dim % subvectors != 0:
             raise ValueError(
                 f"subvectors ({subvectors}) must divide dimension ({dim}) evenly. "
-                f"Consider using subvectors: {self._suggest_subvector_divisors(dim)}"
+                f"Consider using subvectors: {', '.join(map(str, self._suggest_subvector_divisors(dim)))}"
             )
 
         if subvectors > dim:
@@ -206,9 +209,38 @@ def _validate_quantization_config(self, config: Dict[str, Any], dim: int) -> Dic
             )
             validated_config['max_training_vectors'] = max_training_vectors
 
+        # Validate storage mode
+        storage_mode = str(validated_config.get('storage_mode', 'quantized_only')).lower()
+        valid_modes = {'quantized_only', 'quantized_with_raw'}
+        if storage_mode not in valid_modes:
+            raise ValueError(
+                f"Invalid storage_mode: '{storage_mode}'. Supported modes: {', '.join(sorted(valid_modes))}"
+            )
+
+        validated_config['storage_mode'] = storage_mode
+
         # Calculate and warn about memory usage
         self._check_memory_usage(validated_config, dim)
+
+        # Warn about the extra memory cost of keeping raw vectors
+        if storage_mode == 'quantized_with_raw':
+            import warnings
+            compression_ratio = validated_config.get('__memory_info__', {}).get('compression_ratio', 1.0)
+            warnings.warn(
+                f"storage_mode='quantized_with_raw' will use ~{compression_ratio:.1f}x more memory "
+                f"than 'quantized_only' but enables exact vector reconstruction.",
+                UserWarning,
+                stacklevel=2
+            )
 
+        # Final safety check: ensure all expected keys are present.
+        # Defensive programming only - these keys should already be set above.
+        validated_config.setdefault('type', 'pq')
+        validated_config.setdefault('subvectors', 8)
+        validated_config.setdefault('bits', 8)
+        validated_config.setdefault('max_training_vectors', None)
+        validated_config.setdefault('storage_mode', 'quantized_only')
+
         return validated_config
 
     def _calculate_smart_training_size(self, subvectors: int, bits: int) -> int:
@@ -236,13 +268,14 @@ def _calculate_smart_training_size(self, subvectors: int, bits: int) -> int:
 
         return min(max(statistical_minimum, reasonable_minimum), reasonable_maximum)
 
-    def _suggest_subvector_divisors(self, dim: int) -> str:
-        """Suggest valid subvector counts that divide the dimension evenly."""
-        divisors = []
-        for i in range(1, min(33, dim + 1)):  # Common subvector counts up to 32
-            if dim % i == 0:
-                divisors.append(str(i))
-        return ', '.join(divisors[:8])  # Show first 8 suggestions
+    def _suggest_subvector_divisors(self, dim: int) -> list[int]:
+        """Return valid subvector counts that divide the dimension evenly (up to 32)."""
+        return [i for i in range(1, min(33, dim + 1)) if dim % i == 0]
 
     def _check_memory_usage(self, config: Dict[str, Any], dim: int) -> None:
         """
