| 1 | +# API Stability Guarantees (v1.0) |
| 2 | + |
| 3 | +**Date:** 2025-12-02 |
| 4 | +**Version:** 1.0.0 |
| 5 | +**Status:** OFFICIAL |
| 6 | + |
| 7 | +This document defines the API stability guarantees for `binary_semantic_cache` v1.x releases. It sorts every name the package exposes into three tiers: **Stable**, **Deprecated**, and **Unstable/Internal**. Only the first two tiers are part of the public API.
| 8 | + |
| 9 | +--- |
| 10 | + |
| 11 | +## Stability Tiers |
| 12 | + |
| 13 | +| Tier | Meaning | |
| 14 | +| :--- | :--- | |
| 15 | +| **Stable** | Will not have breaking changes in any v1.x release. Safe to depend on. | |
| 16 | +| **Deprecated** | Still works in v1.x but emits `DeprecationWarning`. Will be removed in v2.0.0. Use the documented replacement. |
| 17 | +| **Unstable/Internal** | May change or be removed in any release without notice. Do not depend on. | |
| 18 | + |
| 19 | +--- |
| 20 | + |
| 21 | +## 1. Stable APIs (Will Not Break in v1.x) |
| 22 | + |
| 23 | +### 1.1 `BinarySemanticCache` |
| 24 | + |
| 25 | +The primary cache class. All methods and properties listed below are stable. |
| 26 | + |
| 27 | +| Member | Signature | Notes | |
| 28 | +| :--- | :--- | :--- | |
| 29 | +| `__init__` | `(encoder, max_entries=100000, similarity_threshold=0.80)` | Constructor | |
| 30 | +| `get` | `(embedding: np.ndarray) -> Optional[CacheEntry]` | Lookup by similarity | |
| 31 | +| `put` | `(embedding: np.ndarray, response: Any, store_embedding: bool = False) -> int` | Store entry, returns index | |
| 32 | +| `delete` | `(entry_id: int) -> bool` | Delete by index | |
| 33 | +| `clear` | `() -> None` | Remove all entries | |
| 34 | +| `stats` | `() -> CacheStats` | Get statistics | |
| 35 | +| `memory_bytes` | `() -> int` | Estimate memory usage | |
| 36 | +| `get_all_entries` | `() -> List[CacheEntry]` | Get all entries | |
| 37 | +| `save_mmap_v3` | `(path: str) -> None` | Save to v3 format | |
| 38 | +| `load_mmap_v3` | `(path: str, skip_checksum: bool = False) -> None` | Load from v3 format | |
| 39 | +| `__len__` | `() -> int` | Number of entries | |
| 40 | +| `__repr__` | `() -> str` | String representation | |
| 41 | +| `encoder` | `property -> Encoder` | Get encoder instance | |
| 42 | +| `max_entries` | `property -> int` | Maximum capacity | |
| 43 | +| `similarity_threshold` | `property -> float` | Hit threshold | |
| 44 | + |
| 45 | +### 1.2 `CacheEntry` |
| 46 | + |
| 47 | +Immutable result object returned by `get()` and, as list elements, by `get_all_entries()`. All fields are stable.
| 48 | + |
| 49 | +| Field | Type | Notes | |
| 50 | +| :--- | :--- | :--- | |
| 51 | +| `id` | `int` | Entry index | |
| 52 | +| `code` | `np.ndarray` | Binary code (uint64) | |
| 53 | +| `response` | `Any` | Cached response object | |
| 54 | +| `created_at` | `float` | Unix timestamp (creation) | |
| 55 | +| `last_accessed` | `float` | Unix timestamp (last access) | |
| 56 | +| `access_count` | `int` | Number of accesses | |
| 57 | +| `similarity` | `float` | Similarity score (default 1.0) | |
| 58 | + |
| 59 | +### 1.3 `CacheStats` |
| 60 | + |
| 61 | +Statistics dataclass returned by `stats()`. All fields and properties are stable. |
| 62 | + |
| 63 | +| Member | Type | Notes | |
| 64 | +| :--- | :--- | :--- | |
| 65 | +| `size` | `int` | Current entry count | |
| 66 | +| `max_size` | `int` | Maximum capacity | |
| 67 | +| `hits` | `int` | Total cache hits | |
| 68 | +| `misses` | `int` | Total cache misses | |
| 69 | +| `evictions` | `int` | Total evictions | |
| 70 | +| `memory_bytes` | `int` | Estimated memory usage | |
| 71 | +| `hit_rate` | `property -> float` | hits / (hits + misses) | |
| 72 | +| `memory_mb` | `property -> float` | memory_bytes / 1MB | |
| 73 | + |
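The two derived properties can be illustrated with a minimal sketch. `CacheStatsSketch` is a hypothetical stand-in that only mirrors the documented field names, not the library's class. Returning `0.0` before any lookup and treating 1 MB as 1024 * 1024 bytes are assumptions, since the table does not specify either.

```python
from dataclasses import dataclass


@dataclass
class CacheStatsSketch:
    """Hypothetical stand-in mirroring the documented CacheStats fields."""

    size: int
    max_size: int
    hits: int
    misses: int
    evictions: int
    memory_bytes: int

    @property
    def hit_rate(self) -> float:
        # hits / (hits + misses); returning 0.0 with no lookups is an assumption.
        lookups = self.hits + self.misses
        return self.hits / lookups if lookups else 0.0

    @property
    def memory_mb(self) -> float:
        # Assumes 1 MB = 1024 * 1024 bytes; the real divisor is not documented here.
        return self.memory_bytes / (1024 * 1024)


stats = CacheStatsSketch(
    size=3,
    max_size=100_000,
    hits=30,
    misses=10,
    evictions=0,
    memory_bytes=2 * 1024 * 1024,
)
```

With these sample values, `stats.hit_rate` is `0.75` and `stats.memory_mb` is `2.0`.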
| 74 | +### 1.4 `RustBinaryEncoder` |
| 75 | + |
| 76 | +The production encoder (Rust backend). All methods and properties listed below are stable. |
| 77 | + |
| 78 | +| Member | Signature | Notes | |
| 79 | +| :--- | :--- | :--- | |
| 80 | +| `__init__` | `(embedding_dim: int, code_bits: int = 256, seed: int = 42)` | Constructor | |
| 81 | +| `encode` | `(embedding: np.ndarray) -> np.ndarray` | Encode single/batch | |
| 82 | +| `embedding_dim` | `property -> int` | Input dimension | |
| 83 | +| `code_bits` | `property -> int` | Output bits (256) | |
| 84 | +| `n_words` | `property -> int` | Number of uint64 words | |
| 85 | + |
| 86 | +### 1.5 `PythonBinaryEncoder` (Test Oracle) |
| 87 | + |
| 88 | +The Python encoder retained for testing. Same interface as `RustBinaryEncoder`. |
| 89 | + |
| 90 | +**Note:** This class is stable for testing purposes only. Production code should use `RustBinaryEncoder`. |
| 91 | + |
| 92 | +### 1.6 Exception Classes |
| 93 | + |
| 94 | +All exception classes in the error hierarchy are stable. |
| 95 | + |
| 96 | +| Exception | Base | Purpose | |
| 97 | +| :--- | :--- | :--- | |
| 98 | +| `CacheError` | `Exception` | Base class for all cache errors | |
| 99 | +| `ChecksumError` | `CacheError` | SHA-256 checksum mismatch | |
| 100 | +| `FormatVersionError` | `CacheError` | Unsupported persistence format | |
| 101 | +| `CorruptFileError` | `CacheError` | Invalid or truncated cache file | |
| 102 | +| `UnsupportedPlatformError` | `CacheError` | Platform incompatibility (e.g., endianness) | |
| 103 | + |
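Because every exception derives from `CacheError`, callers can handle specific failures first and fall back to the base class. The sketch below redefines the hierarchy locally so that it runs on its own. In real code the classes would be imported from the package, and `classify_load_failure` and `fake_load_with_bad_checksum` are hypothetical helpers, not library functions.

```python
class CacheError(Exception):
    """Local stand-in for the documented base class."""


class ChecksumError(CacheError):
    pass


class FormatVersionError(CacheError):
    pass


class CorruptFileError(CacheError):
    pass


class UnsupportedPlatformError(CacheError):
    pass


def classify_load_failure(load):
    """Run a zero-argument loader and report which kind of error it raised."""
    try:
        load()
    except ChecksumError:
        return "checksum"
    except (FormatVersionError, CorruptFileError):
        return "unreadable"
    except CacheError:
        # Catch-all for any other error in the hierarchy; must come last.
        return "other-cache-error"
    return "ok"


def fake_load_with_bad_checksum():
    # Hypothetical loader used only to exercise the handler above.
    raise ChecksumError("SHA-256 mismatch")


result = classify_load_failure(fake_load_with_bad_checksum)
```

Ordering matters: placing `except CacheError` first would swallow the more specific subclasses.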
| 104 | +### 1.7 Utility Functions |
| 105 | + |
| 106 | +| Function | Signature | Notes | |
| 107 | +| :--- | :--- | :--- | |
| 108 | +| `detect_format_version` | `(path: str) -> int` | Returns 2 (v2) or 3 (v3) | |
| 109 | + |
| 110 | +### 1.8 Constants |
| 111 | + |
| 112 | +| Constant | Value | Notes | |
| 113 | +| :--- | :--- | :--- | |
| 114 | +| `DEFAULT_MAX_ENTRIES` | `100_000` | Default cache capacity | |
| 115 | +| `DEFAULT_THRESHOLD` | `0.80` | Default similarity threshold | |
| 116 | +| `DEFAULT_CODE_BITS` | `256` | Fixed binary code size | |
| 117 | +| `MMAP_FORMAT_VERSION` | `2` | v2 format identifier | |
| 118 | +| `MMAP_FORMAT_VERSION_V3` | `3` | v3 format identifier | |
| 119 | + |
| 120 | +--- |
| 121 | + |
| 122 | +## 2. Deprecated APIs (Will Be Removed in v2.0) |
| 123 | + |
| 124 | +These methods emit `DeprecationWarning` when called. Use the documented replacements. |
| 125 | + |
| 126 | +| Method | Replacement | Removal Version | |
| 127 | +| :--- | :--- | :--- | |
| 128 | +| `BinarySemanticCache.save(path)` | `save_mmap_v3(path)` | v2.0.0 | |
| 129 | +| `BinarySemanticCache.load(path)` | `load_mmap_v3(path)` | v2.0.0 | |
| 130 | +| `BinarySemanticCache.save_mmap(path)` | `save_mmap_v3(path)` | v2.0.0 | |
| 131 | +| `BinarySemanticCache.load_mmap(path)` | `load_mmap_v3(path)` | v2.0.0 | |
| 132 | + |
| 133 | +**Migration Example:** |
| 134 | + |
| 135 | +```python |
| 136 | +# Old (deprecated) |
| 137 | +cache.save("cache.npz") |
| 138 | +cache.load("cache.npz") |
| 139 | + |
| 140 | +# New (stable) |
| 141 | +cache.save_mmap_v3("cache_v3/") |
| 142 | +cache.load_mmap_v3("cache_v3/") |
| 143 | +``` |
| 144 | + |
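One way to find remaining deprecated calls before upgrading is to record `DeprecationWarning`s in a test run using the standard `warnings` module. In this sketch `old_save` is a hypothetical stand-in for a deprecated method such as `BinarySemanticCache.save()`. The warning text is invented for illustration and is not the library's actual message.

```python
import warnings


def old_save(path):
    # Hypothetical stand-in for a deprecated method; the message is illustrative.
    warnings.warn(
        "save() is deprecated; use save_mmap_v3()",
        DeprecationWarning,
        stacklevel=2,
    )
    return path


def deprecated_calls(func, *args):
    """Call func and return the DeprecationWarning messages it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # Override default filters, which can hide DeprecationWarning.
        warnings.simplefilter("always")
        func(*args)
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


messages = deprecated_calls(old_save, "cache.npz")
```

Running a test suite with `python -W error::DeprecationWarning` is a stricter alternative: it turns each such warning into an exception.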
| 145 | +--- |
| 146 | + |
| 147 | +## 3. Unstable/Internal APIs (May Change) |
| 148 | + |
| 149 | +The following are internal implementation details and are **not** part of the public API. They may change or be removed without notice. |
| 150 | + |
| 151 | +### 3.1 Internal Methods (Prefixed with `_`) |
| 152 | + |
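| 153 | +All methods with a single leading underscore are internal. Dunder methods listed in Section 1, such as `__init__`, `__len__`, and `__repr__`, remain stable. Internal methods include: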
| 154 | + |
| 155 | +- `_set_response(idx, response)` |
| 156 | +- `_get_response(idx)` |
| 157 | +- `_delete_response(idx)` |
| 158 | +- `_compute_checksum(data)` |
| 159 | +- `_validate_single(embedding)` |
| 160 | +- `_validate_batch(embeddings)` |
| 161 | +- `_encode_single(embedding)` |
| 162 | +- `_encode_batch(embeddings)` |
| 163 | + |
| 164 | +### 3.2 Internal Attributes |
| 165 | + |
| 166 | +- `_encoder` |
| 167 | +- `_storage` (RustCacheStorage instance) |
| 168 | +- `_responses` (Python list) |
| 169 | +- `_lock` (RLock) |
| 170 | +- `_hits`, `_misses`, `_evictions` |
| 171 | + |
| 172 | +### 3.3 Rust Internals |
| 173 | + |
| 174 | +The following Rust bindings are internal and may change: |
| 175 | + |
| 176 | +- `RustCacheStorage` (use `BinarySemanticCache` instead) |
| 177 | +- `HammingSimilarity` (use `BinarySemanticCache.get()` instead) |
| 178 | +- `hamming_distance` (internal utility) |
| 179 | +- `rust_version` (informational only) |
| 180 | + |
| 181 | +### 3.4 Protocol Classes |
| 182 | + |
| 183 | +- `EncoderProtocol` — Type hint only, not for subclassing. |
| 184 | + |
| 185 | +### 3.5 File Format Internals |
| 186 | + |
| 187 | +The following constants describe the v3 file format. The on-disk format they describe is stable, but the constants themselves are internal and should not be used directly:
| 188 | + |
| 189 | +- `V3_HEADER_FILE`, `V3_ENTRIES_FILE`, `V3_RESPONSES_FILE` |
| 190 | +- `V3_ENTRY_SIZE` (44 bytes) |
| 191 | +- `EPOCH_2020` |
| 192 | + |
| 193 | +--- |
| 194 | + |
| 195 | +## 4. Semantic Contracts (Frozen) |
| 196 | + |
| 197 | +The following semantic behaviors are guaranteed and will not change in v1.x: |
| 198 | + |
| 199 | +| Contract | Definition | |
| 200 | +| :--- | :--- | |
| 201 | +| **Encoder Determinism** | `RustBinaryEncoder(seed=42)` produces identical codes for identical inputs across all v1.x releases. | |
| 202 | +| **Threshold Semantics** | `HIT` if and only if `similarity >= threshold`. | |
| 203 | +| **Similarity Formula** | `similarity = 1.0 - (hamming_distance / code_bits)` | |
| 204 | +| **LRU Eviction** | When a new entry is stored while `len(cache) >= max_entries`, the least-recently-used entry is evicted. |
| 205 | + |
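The similarity, threshold, and eviction contracts above can be sketched in pure Python. This is an illustration under stated assumptions, not the Rust implementation. Codes are modelled as sequences of four Python integers standing in for `uint64` words, on the assumption that `n_words` is `code_bits // 64`. `LRUSketch` is a hypothetical toy class, and treating a read as a "use" is an assumption about how recency is tracked.

```python
from collections import OrderedDict

CODE_BITS = 256             # DEFAULT_CODE_BITS from the constants table
N_WORDS = CODE_BITS // 64   # assumed: four 64-bit words per code


def hamming_distance(a, b):
    """Count differing bits between two codes given as sequences of 64-bit words."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def similarity(a, b, code_bits=CODE_BITS):
    # similarity = 1.0 - (hamming_distance / code_bits)
    return 1.0 - hamming_distance(a, b) / code_bits


def is_hit(sim, threshold=0.80):
    # HIT if and only if similarity >= threshold (the boundary counts as a hit).
    return sim >= threshold


class LRUSketch:
    """Toy store that evicts the least-recently-used key once it is full."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._items = OrderedDict()

    def put(self, key, value):
        if key not in self._items and len(self._items) >= self.max_entries:
            self._items.popitem(last=False)  # least recently used sits at the front
        self._items[key] = value
        self._items.move_to_end(key)

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # assumption: a read counts as a use
        return self._items[key]

    def __len__(self):
        return len(self._items)


zero = [0] * N_WORDS
near = [0xFF, 0, 0, 0]          # differs from zero in 8 bits
far = [2**64 - 1, 0, 0, 0]      # differs from zero in 64 bits
```

Here `similarity(zero, near)` is `1 - 8/256 = 0.96875`, a hit at the default threshold. `similarity(zero, far)` is `1 - 64/256 = 0.75`, a miss at `0.80` but a hit at a threshold of exactly `0.75`.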
| 206 | +--- |
| 207 | + |
| 208 | +## 5. Breaking Change Policy |
| 209 | + |
| 210 | +For v1.x releases: |
| 211 | + |
| 212 | +1. **Stable APIs** will not have breaking changes. |
| 213 | +2. **Deprecated APIs** will continue to work but emit warnings. |
| 214 | +3. **Unstable APIs** may change at any time. |
| 215 | + |
| 216 | +For v2.0.0: |
| 217 | + |
| 218 | +1. **Deprecated APIs** will be removed. |
| 219 | +2. **Stable APIs** may have breaking changes (with migration guide). |
| 220 | + |
| 221 | +--- |
| 222 | + |
| 223 | +*This document is the authoritative source for API stability in v1.x.* |
| 224 | + |