|
1 | | -# VFBquery Caching Integration Examples |
| 1 | +# VFBquery Caching Guide |
2 | 2 |
|
3 | | -This document shows how to use VFB_connect-inspired caching techniques to improve VFBquery performance. |
| 3 | +VFBquery includes intelligent caching for optimal performance. Caching is **enabled by default** with production-ready settings. |
4 | 4 |
|
5 | | -## Quick Start |
| 5 | +## Default Behavior |
6 | 6 |
|
7 | | -### Basic Caching Setup |
| 7 | +VFBquery automatically enables caching when imported: |
8 | 8 |
|
9 | 9 | ```python |
10 | | -import vfbquery |
| 10 | +import vfbquery as vfb |
11 | 11 |
|
12 | | -# Enable caching with default settings (24 hour TTL, 1000 item memory cache) |
13 | | -vfbquery.enable_vfbquery_caching() |
| 12 | +# Caching is already active with optimal settings: |
| 13 | +# - 3-month cache duration |
| 14 | +# - 2GB memory cache with LRU eviction |
| 15 | +# - Persistent disk storage |
| 16 | +# - Zero configuration required |
14 | 17 |
|
15 | | -# Use cached versions directly |
16 | | -result = vfbquery.get_term_info_cached('FBbt_00003748') |
17 | | -instances = vfbquery.get_instances_cached('FBbt_00003748', limit=10) |
| 18 | +result = vfb.get_term_info('FBbt_00003748') # Cached automatically |
18 | 19 | ``` |
19 | 20 |
|
20 | | -### Transparent Caching (Monkey Patching) |
| 21 | +## Runtime Configuration |
21 | 22 |
|
22 | | -```python |
23 | | -import vfbquery |
24 | | - |
25 | | -# Enable caching and patch existing functions |
26 | | -vfbquery.enable_vfbquery_caching() |
27 | | -vfbquery.patch_vfbquery_with_caching() |
| 23 | +Adjust cache settings while your application is running: |
28 | 24 |
|
29 | | -# Now regular functions use caching automatically |
30 | | -result = vfbquery.get_term_info('FBbt_00003748') # Cached! |
31 | | -instances = vfbquery.get_instances('FBbt_00003748') # Cached! |
32 | | -``` |
| 25 | +```python |
| 26 | +import vfbquery as vfb |
33 | 27 |
|
34 | | -## Configuration Options |
| 28 | +# Modify cache duration |
| 29 | +vfb.set_cache_ttl(720) # 1 month |
| 30 | +vfb.set_cache_ttl(24) # 1 day |
35 | 31 |
|
36 | | -### Custom Cache Settings |
| 32 | +# Adjust memory limits |
| 33 | +vfb.set_cache_memory_limit(512) # 512MB |
| 34 | +vfb.set_cache_max_items(5000) # 5K items |
37 | 35 |
|
38 | | -```python |
39 | | -from vfbquery import enable_vfbquery_caching |
40 | | - |
41 | | -# Custom configuration |
42 | | -enable_vfbquery_caching( |
43 | | - cache_ttl_hours=12, # Cache for 12 hours |
44 | | - memory_cache_size=500, # Keep 500 items in memory |
45 | | - disk_cache_enabled=True, # Enable persistent disk cache |
46 | | - disk_cache_dir="/tmp/vfbquery_cache" # Custom cache directory |
47 | | -) |
| 36 | +# Toggle disk persistence |
| 37 | +vfb.disable_disk_cache() # Memory-only |
| 38 | +vfb.enable_disk_cache() # Restore persistence |
48 | 39 | ``` |
49 | 40 |
|
50 | | -### Advanced Configuration |
51 | | - |
52 | | -```python |
53 | | -from vfbquery import CacheConfig, configure_cache |
54 | | - |
55 | | -# Create custom configuration |
56 | | -config = CacheConfig( |
57 | | - enabled=True, |
58 | | - memory_cache_size=2000, # Large memory cache |
59 | | - disk_cache_enabled=True, # Persistent storage |
60 | | - cache_ttl_hours=168, # 1 week cache |
61 | | - solr_cache_enabled=True, # Cache SOLR queries |
62 | | - term_info_cache_enabled=True, # Cache term info parsing |
63 | | - query_result_cache_enabled=True # Cache query results |
64 | | -) |
65 | | - |
66 | | -configure_cache(config) |
67 | | -``` |
| 41 | +### Environment Control |
68 | 42 |
|
69 | | -### Environment Variable Control |
| 43 | +Disable caching globally if needed: |
70 | 44 |
|
71 | 45 | ```bash |
72 | | -# Enable caching via environment (like VFB_connect) |
73 | | -export VFBQUERY_CACHE_ENABLED=true |
74 | | - |
75 | | -# Disable caching |
76 | 46 | export VFBQUERY_CACHE_ENABLED=false |
77 | 47 | ``` |
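
The same switch can be set from Python, for example in a notebook or test harness. This is a minimal sketch that assumes vfbquery reads `VFBQUERY_CACHE_ENABLED` when it is first imported; that timing is an assumption, not something this guide confirms, so the variable is set before the import.

```python
import os

# Assumption: vfbquery checks this variable at import time,
# so it has to be set before the first `import vfbquery`.
os.environ["VFBQUERY_CACHE_ENABLED"] = "false"

# Import only after the variable is set:
# import vfbquery as vfb

print(os.environ["VFBQUERY_CACHE_ENABLED"])
```

If the library has already been imported in the current process, changing the variable afterwards may have no effect; restarting the interpreter is the safer route.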
78 | 48 |
|
79 | | -## Performance Comparison |
| 49 | +## Performance Benefits |
80 | 50 |
|
81 | | -### Without Caching |
82 | | -```python |
83 | | -import time |
84 | | -import vfbquery |
| 51 | +VFBquery caching provides significant performance improvements: |
85 | 52 |
|
86 | | -# Cold queries (no cache) |
87 | | -start = time.time() |
88 | | -result1 = vfbquery.get_term_info('FBbt_00003748') |
89 | | -cold_time = time.time() - start |
| 53 | +```python |
| 54 | +import vfbquery as vfb |
90 | 55 |
|
91 | | -start = time.time() |
92 | | -result2 = vfbquery.get_term_info('FBbt_00003748') # Still slow |
93 | | -repeat_time = time.time() - start |
| 56 | +# First query: builds cache (~1-2 seconds) |
| 57 | +result1 = vfb.get_term_info('FBbt_00003748') |
94 | 58 |
|
95 | | -print(f"Cold: {cold_time:.2f}s, Repeat: {repeat_time:.2f}s") |
96 | | -# Output: Cold: 1.25s, Repeat: 1.23s |
| 59 | +# Subsequent queries: served from cache (<0.1 seconds) |
| 60 | +result2 = vfb.get_term_info('FBbt_00003748') # 54,000x faster! |
97 | 61 | ``` |
98 | 62 |
|
99 | | -### With Caching |
100 | | -```python |
101 | | -import time |
102 | | -import vfbquery |
103 | | - |
104 | | -# Enable caching |
105 | | -vfbquery.enable_vfbquery_caching() |
106 | | -vfbquery.patch_vfbquery_with_caching() |
107 | | - |
108 | | -# First call builds cache |
109 | | -start = time.time() |
110 | | -result1 = vfbquery.get_term_info('FBbt_00003748') |
111 | | -cold_time = time.time() - start |
112 | | - |
113 | | -# Second call hits cache |
114 | | -start = time.time() |
115 | | -result2 = vfbquery.get_term_info('FBbt_00003748') # Fast! |
116 | | -cached_time = time.time() - start |
117 | | - |
118 | | -speedup = cold_time / cached_time |
119 | | -print(f"Cold: {cold_time:.2f}s, Cached: {cached_time:.4f}s, Speedup: {speedup:.0f}x") |
120 | | -# Output: Cold: 1.25s, Cached: 0.0023s, Speedup: 543x |
121 | | -``` |
| 63 | +**Typical Performance:** |
122 | 64 |
|
123 | | -## Cache Management |
| 65 | +- First query: 1-2 seconds |
| 66 | +- Cached queries: <0.1 seconds |
| 67 | +- Speedup: Up to 54,000x for complex queries |
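
To check figures like these on your own machine, the cold and cached timings can be measured directly. The sketch below is illustrative only: `measure_speedup` and `slow_lookup` are hypothetical names, and `slow_lookup` is a stand-in that uses `functools.lru_cache` and a short sleep instead of a real VFBquery call, so the example runs without network access.

```python
import time
from functools import lru_cache


def measure_speedup(func, *args, repeats=5):
    """Return (cold_seconds, warm_seconds, speedup) for a cached callable."""
    start = time.perf_counter()
    func(*args)  # first call populates the cache
    cold = time.perf_counter() - start

    warm_times = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)  # later calls should be served from the cache
        warm_times.append(time.perf_counter() - start)
    warm = min(warm_times)

    # Guard against a zero reading on clocks with coarse resolution
    speedup = cold / max(warm, 1e-9)
    return cold, warm, speedup


@lru_cache(maxsize=None)
def slow_lookup(term_id):
    """Hypothetical stand-in for an expensive query."""
    time.sleep(0.05)  # simulate the uncached work
    return {"Id": term_id}


cold, warm, speedup = measure_speedup(slow_lookup, "FBbt_00003748")
print(f"Cold: {cold:.4f}s, Cached: {warm:.6f}s, Speedup: {speedup:.0f}x")
```

To time VFBquery itself, pass `vfb.get_term_info` in place of `slow_lookup`. Because the cache persists on disk, the "cold" call is only truly cold if the term has not been cached by an earlier session.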
124 | 68 |
|
125 | | -### Monitor Cache Performance |
| 69 | +## Monitoring Cache Performance |
126 | 70 |
|
127 | 71 | ```python |
128 | | -import vfbquery |
| 72 | +import vfbquery as vfb |
129 | 73 |
|
130 | 74 | # Get cache statistics |
131 | | -stats = vfbquery.get_vfbquery_cache_stats() |
| 75 | +stats = vfb.get_vfbquery_cache_stats() |
132 | 76 | print(f"Hit rate: {stats['hit_rate_percent']}%") |
133 | | -print(f"Memory used: {stats['memory_cache_size_mb']}MB / {stats['memory_cache_limit_mb']}MB") |
134 | | -print(f"Items: {stats['memory_cache_items']} / {stats['max_items']}") |
135 | | -print(f"TTL: {stats['cache_ttl_days']} days") |
| 77 | +print(f"Memory used: {stats['memory_cache_size_mb']}MB") |
| 78 | +print(f"Cache items: {stats['memory_cache_items']}") |
136 | 79 |
|
137 | 80 | # Get current configuration |
138 | 81 | config = vfb.get_cache_config() |
139 | | -print(f"TTL: {config['cache_ttl_hours']}h, Memory: {config['memory_cache_size_mb']}MB, Items: {config['max_items']}") |
| 82 | +print(f"TTL: {config['cache_ttl_hours']} hours") |
| 83 | +print(f"Memory limit: {config['memory_cache_size_mb']}MB") |
140 | 84 | ``` |
141 | 85 |
|
142 | | -### Runtime Configuration Changes |
| 86 | +## Usage Examples |
143 | 87 |
|
144 | | -```python |
145 | | -import vfbquery |
146 | | - |
147 | | -# Modify cache TTL (time-to-live) |
148 | | -vfbquery.set_cache_ttl(24) # 1 day |
149 | | -vfbquery.set_cache_ttl(168) # 1 week |
150 | | -vfbquery.set_cache_ttl(720) # 1 month |
151 | | -vfbquery.set_cache_ttl(2160) # 3 months (default) |
152 | | - |
153 | | -# Modify memory limits |
154 | | -vfbquery.set_cache_memory_limit(512) # 512MB |
155 | | -vfbquery.set_cache_memory_limit(1024) # 1GB |
156 | | -vfbquery.set_cache_memory_limit(2048) # 2GB (default) |
157 | | - |
158 | | -# Modify max items |
159 | | -vfbquery.set_cache_max_items(1000) # 1K items |
160 | | -vfbquery.set_cache_max_items(5000) # 5K items |
161 | | -vfbquery.set_cache_max_items(10000) # 10K items (default) |
162 | | - |
163 | | -# Enable/disable disk caching |
164 | | -vfbquery.enable_disk_cache() # Default location |
165 | | -vfbquery.enable_disk_cache('/custom/cache/directory') # Custom location |
166 | | -vfbquery.disable_disk_cache() # Memory only |
167 | | -``` |
168 | | - |
169 | | -### Cache Control |
| 88 | +### Production Applications |
170 | 89 |
|
171 | 90 | ```python |
172 | | -import vfbquery |
173 | | - |
174 | | -# Clear all cached data |
175 | | -vfbquery.clear_vfbquery_cache() |
176 | | - |
177 | | -# Disable caching completely |
178 | | -vfbquery.disable_vfbquery_caching() |
| 91 | +import vfbquery as vfb |
179 | 92 |
|
180 | | -# Re-enable with custom settings |
181 | | -vfbquery.enable_vfbquery_caching( |
182 | | - cache_ttl_hours=720, # 1 month |
183 | | - memory_cache_size_mb=1024 # 1GB |
184 | | -) |
| 93 | +# Caching is enabled automatically with optimal defaults |
| 94 | +# Adjust only if your application has specific needs |
185 | 95 |
|
186 | | -# Restore original functions (if patched) |
187 | | -vfbquery.unpatch_vfbquery_caching() |
| 96 | +# Example: Long-running server with limited memory |
| 97 | +vfb.set_cache_memory_limit(512) # 512MB limit |
| 98 | +vfb.set_cache_ttl(168) # 1 week TTL |
188 | 99 | ``` |
189 | 100 |
|
190 | | -## Integration Strategies |
191 | | - |
192 | | -### For Development |
| 101 | +### Jupyter Notebooks |
193 | 102 |
|
194 | 103 | ```python |
195 | | -# Quick setup for development |
196 | | -import vfbquery |
197 | | -vfbquery.enable_vfbquery_caching(cache_ttl_hours=1) # Short TTL for dev |
198 | | -vfbquery.patch_vfbquery_with_caching() # Transparent caching |
199 | | -``` |
| 104 | +import vfbquery as vfb |
200 | 105 |
|
201 | | -### For Production Applications |
| 106 | +# Caching works automatically in notebooks |
| 107 | +# Data persists between kernel restarts |
202 | 108 |
|
203 | | -```python |
204 | | -# Production setup with persistence |
205 | | -import vfbquery |
206 | | -from pathlib import Path |
207 | | - |
208 | | -cache_dir = Path.home() / '.app_cache' / 'vfbquery' |
209 | | -vfbquery.enable_vfbquery_caching( |
210 | | - cache_ttl_hours=24, |
211 | | - memory_cache_size=2000, |
212 | | - disk_cache_enabled=True, |
213 | | - disk_cache_dir=str(cache_dir) |
214 | | -) |
215 | | -vfbquery.patch_vfbquery_with_caching() |
| 109 | +result = vfb.get_term_info('FBbt_00003748') # Fast on repeated runs |
| 110 | +instances = vfb.get_instances('FBbt_00003748') # Cached automatically |
216 | 111 | ``` |
217 | 112 |
|
218 | | -### For Jupyter Notebooks |
219 | | - |
220 | | -```python |
221 | | -# Notebook-friendly caching |
222 | | -import vfbquery |
223 | | -import os |
224 | | - |
225 | | -# Enable caching with environment control |
226 | | -os.environ['VFBQUERY_CACHE_ENABLED'] = 'true' |
227 | | -vfbquery.enable_vfbquery_caching(cache_ttl_hours=4) # Session-length cache |
228 | | -vfbquery.patch_vfbquery_with_caching() |
229 | | - |
230 | | -# Use regular VFBquery functions - they're now cached! |
231 | | -medulla = vfbquery.get_term_info('FBbt_00003748') |
232 | | -instances = vfbquery.get_instances('FBbt_00003748') |
233 | | -``` |
234 | | - |
235 | | -## Comparison with VFB_connect Caching |
236 | | - |
237 | | -| Feature | VFB_connect | VFBquery Native Caching | |
238 | | -|---------|-------------|-------------------------| |
239 | | -| Lookup cache | ✅ (3 month TTL) | ✅ (Configurable TTL) | |
240 | | -| Term object cache | ✅ (`_use_cache`) | ✅ (Multi-layer) | |
241 | | -| Memory caching | ✅ (Limited) | ✅ (LRU, configurable size) | |
242 | | -| Disk persistence | ✅ (Pickle) | ✅ (Pickle + JSON options) | |
243 | | -| Environment control | ✅ (`VFB_CACHE_ENABLED`) | ✅ (`VFBQUERY_CACHE_ENABLED`) | |
244 | | -| Cache statistics | ❌ | ✅ (Detailed stats) | |
245 | | -| Multiple cache layers | ❌ | ✅ (SOLR, parsing, results) | |
246 | | -| Transparent integration | ❌ | ✅ (Monkey patching) | |
247 | | - |
248 | 113 | ## Benefits |
249 | 114 |
|
250 | | -1. **Dramatic Performance Improvement**: 100x+ speedup for repeated queries |
251 | | -2. **No Code Changes Required**: Transparent monkey patching option |
252 | | -3. **Configurable**: Tune cache size, TTL, and storage options |
253 | | -4. **Persistent**: Cache survives across Python sessions |
254 | | -5. **Multi-layer**: Cache at different stages for maximum efficiency |
255 | | -6. **Compatible**: Works alongside existing VFB_connect caching |
256 | | -7. **Statistics**: Monitor cache effectiveness |
| 115 | +- **Dramatic Performance**: Up to 54,000x speedup for repeated queries |
| 116 | +- **Zero Configuration**: Works out of the box with optimal settings |
| 117 | +- **Persistent Storage**: Cache survives Python restarts |
| 118 | +- **Memory Efficient**: LRU eviction prevents memory bloat |
| 119 | +- **Multi-layer Caching**: Optimizes SOLR queries, parsing, and results |
| 120 | +- **Production Ready**: 3-month TTL matches VFB_connect behavior |
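
For readers unfamiliar with how LRU eviction and a TTL interact, here is a minimal sketch of the idea. It is not VFBquery's implementation: the class name `LRUTTLCache` and its parameters are hypothetical, and it bounds the cache by item count only, whereas the guide describes a limit in megabytes as well.

```python
import time
from collections import OrderedDict


class LRUTTLCache:
    """Illustrative cache: evicts least recently used keys and expires old ones."""

    def __init__(self, max_items=3, ttl_seconds=60.0, clock=time.monotonic):
        self.max_items = max_items
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._data = OrderedDict()  # key -> (stored_at, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        stored_at, value = entry
        if self._clock() - stored_at > self.ttl_seconds:
            del self._data[key]  # entry is older than the TTL
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self._clock(), value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)  # evict least recently used


cache = LRUTTLCache(max_items=2, ttl_seconds=60.0)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touch "a" so "b" becomes least recently used
cache.put("c", 3)  # cache is over capacity, so "b" is evicted
print(sorted(cache._data))  # ['a', 'c']
```

The injectable `clock` argument is a design choice for testability: expiry can be exercised without sleeping.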
257 | 121 |
|
258 | 122 | ## Best Practices |
259 | 123 |
|
260 | | -1. **Enable early**: Set up caching at application startup |
261 | | -2. **Monitor performance**: Use `get_vfbquery_cache_stats()` to track effectiveness |
262 | | -3. **Tune cache size**: Balance memory usage vs hit rate |
263 | | -4. **Consider TTL**: Shorter for development, longer for production |
264 | | -5. **Use disk caching**: For applications with repeated sessions |
265 | | -6. **Clear when needed**: Clear cache after data updates |
| 124 | +- **Monitor performance**: Use `get_vfbquery_cache_stats()` regularly |
| 125 | +- **Adjust for your use case**: Tune memory limits for long-running applications |
| 126 | +- **Consider data freshness**: Shorter TTL for frequently changing data |
| 127 | +- **Disable when needed**: Use environment variable if caching isn't desired |