Commit 0977fb4

refactor: update caching documentation and remove outdated summary file
1 parent 8927745 commit 0977fb4

4 files changed

Lines changed: 109 additions & 498 deletions

File tree

CACHING.md

Lines changed: 69 additions & 207 deletions
@@ -1,265 +1,127 @@
1-
# VFBquery Caching Integration Examples
1+
# VFBquery Caching Guide
22

3-
This document shows how to use VFB_connect-inspired caching techniques to improve VFBquery performance.
3+
VFBquery includes intelligent caching for optimal performance. Caching is **enabled by default** with production-ready settings.
44

5-
## Quick Start
5+
## Default Behavior
66

7-
### Basic Caching Setup
7+
VFBquery automatically enables caching when imported:
88

99
```python
10-
import vfbquery
10+
import vfbquery as vfb
1111

12-
# Enable caching with default settings (24 hour TTL, 1000 item memory cache)
13-
vfbquery.enable_vfbquery_caching()
12+
# Caching is already active with optimal settings:
13+
# - 3-month cache duration
14+
# - 2GB memory cache with LRU eviction
15+
# - Persistent disk storage
16+
# - Zero configuration required
1417

15-
# Use cached versions directly
16-
result = vfbquery.get_term_info_cached('FBbt_00003748')
17-
instances = vfbquery.get_instances_cached('FBbt_00003748', limit=10)
18+
result = vfb.get_term_info('FBbt_00003748') # Cached automatically
1819
```
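The defaults listed above combine a time-to-live (TTL) with LRU eviction. As a rough illustration of how those two policies interact, here is a minimal, self-contained sketch. The class `TTLLRUCache` and its parameters are hypothetical names invented for this example; the diff does not show VFBquery's actual cache implementation, which may differ (for instance, it limits memory by size as well as by item count).

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Hypothetical illustration only: LRU eviction plus per-entry expiry."""

    def __init__(self, max_items, ttl_seconds, clock=time.monotonic):
        self.max_items = max_items
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._data = OrderedDict()  # key -> (stored_at, value), oldest use first

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl_seconds:
            del self._data[key]  # entry outlived its TTL
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self.clock(), value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)  # evict the least recently used entry
```

A read refreshes an entry's position in the LRU order but not its timestamp, so a frequently used entry still expires once its TTL has passed.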
1920

20-
### Transparent Caching (Monkey Patching)
21+
## Runtime Configuration
2122

22-
```python
23-
import vfbquery
24-
25-
# Enable caching and patch existing functions
26-
vfbquery.enable_vfbquery_caching()
27-
vfbquery.patch_vfbquery_with_caching()
23+
Adjust cache settings while your application is running:
2824

29-
# Now regular functions use caching automatically
30-
result = vfbquery.get_term_info('FBbt_00003748') # Cached!
31-
instances = vfbquery.get_instances('FBbt_00003748') # Cached!
32-
```
25+
```python
26+
import vfbquery as vfb
3327

34-
## Configuration Options
28+
# Modify cache duration
29+
vfb.set_cache_ttl(720) # 1 month
30+
vfb.set_cache_ttl(24) # 1 day
3531

36-
### Custom Cache Settings
32+
# Adjust memory limits
33+
vfb.set_cache_memory_limit(512) # 512MB
34+
vfb.set_cache_max_items(5000) # 5K items
3735

38-
```python
39-
from vfbquery import enable_vfbquery_caching
40-
41-
# Custom configuration
42-
enable_vfbquery_caching(
43-
cache_ttl_hours=12, # Cache for 12 hours
44-
memory_cache_size=500, # Keep 500 items in memory
45-
disk_cache_enabled=True, # Enable persistent disk cache
46-
disk_cache_dir="/tmp/vfbquery_cache" # Custom cache directory
47-
)
36+
# Toggle disk persistence
37+
vfb.disable_disk_cache() # Memory-only
38+
vfb.enable_disk_cache() # Restore persistence
4839
```
4940

50-
### Advanced Configuration
51-
52-
```python
53-
from vfbquery import CacheConfig, configure_cache
54-
55-
# Create custom configuration
56-
config = CacheConfig(
57-
enabled=True,
58-
memory_cache_size=2000, # Large memory cache
59-
disk_cache_enabled=True, # Persistent storage
60-
cache_ttl_hours=168, # 1 week cache
61-
solr_cache_enabled=True, # Cache SOLR queries
62-
term_info_cache_enabled=True, # Cache term info parsing
63-
query_result_cache_enabled=True # Cache query results
64-
)
65-
66-
configure_cache(config)
67-
```
41+
### Environment Control
6842

69-
### Environment Variable Control
43+
Disable caching globally if needed:
7044

7145
```bash
72-
# Enable caching via environment (like VFB_connect)
73-
export VFBQUERY_CACHE_ENABLED=true
74-
75-
# Disable caching
7646
export VFBQUERY_CACHE_ENABLED=false
7747
```
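For readers wondering how a library might honor such a variable, the following is a minimal sketch of an environment toggle. The function `cache_enabled` and the set of accepted "false" spellings are assumptions made for illustration; the diff only shows the value `false` and does not show how VFBquery parses it.

```python
import os

# Assumed spellings that switch caching off; only "false" appears in the source.
_FALSE_VALUES = {"0", "false", "no", "off"}


def cache_enabled(environ=None, default=True):
    """Return True unless VFBQUERY_CACHE_ENABLED is set to a false-like value."""
    env = os.environ if environ is None else environ
    raw = env.get("VFBQUERY_CACHE_ENABLED")
    if raw is None:
        return default  # variable unset: keep the enabled-by-default behavior
    return raw.strip().lower() not in _FALSE_VALUES
```

Passing a plain dictionary as `environ` makes the toggle easy to check without modifying the real process environment.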
7848

79-
## Performance Comparison
49+
## Performance Benefits
8050

81-
### Without Caching
82-
```python
83-
import time
84-
import vfbquery
51+
VFBquery caching provides significant performance improvements:
8552

86-
# Cold queries (no cache)
87-
start = time.time()
88-
result1 = vfbquery.get_term_info('FBbt_00003748')
89-
cold_time = time.time() - start
53+
```python
54+
import vfbquery as vfb
9055

91-
start = time.time()
92-
result2 = vfbquery.get_term_info('FBbt_00003748') # Still slow
93-
repeat_time = time.time() - start
56+
# First query: builds cache (~1-2 seconds)
57+
result1 = vfb.get_term_info('FBbt_00003748')
9458

95-
print(f"Cold: {cold_time:.2f}s, Repeat: {repeat_time:.2f}s")
96-
# Output: Cold: 1.25s, Repeat: 1.23s
59+
# Subsequent queries: served from cache (<0.1 seconds)
60+
result2 = vfb.get_term_info('FBbt_00003748') # Up to 54,000x faster
9761
```
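The speedup can be measured rather than taken on trust. The sketch below times a cold and a warm call using a stand-in function; `slow_lookup` and `timed` are hypothetical helpers, and `functools.lru_cache` stands in for VFBquery's cache so the example runs without network access. To measure the real library, one would pass `vfb.get_term_info` to `timed` instead.

```python
import time
from functools import lru_cache

calls = {"count": 0}


@lru_cache(maxsize=None)
def slow_lookup(term_id):
    """Stand-in for a slow remote query (not a VFBquery function)."""
    calls["count"] += 1
    time.sleep(0.05)  # simulate network and parsing latency
    return {"id": term_id}


def timed(func, *args):
    """Call func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


first, cold_time = timed(slow_lookup, "FBbt_00003748")
second, warm_time = timed(slow_lookup, "FBbt_00003748")
print(f"cold: {cold_time:.4f}s, warm: {warm_time:.6f}s")
```

`time.perf_counter()` is used instead of `time.time()` because it is a monotonic, high-resolution clock intended for measuring short intervals.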
9862

99-
### With Caching
100-
```python
101-
import time
102-
import vfbquery
103-
104-
# Enable caching
105-
vfbquery.enable_vfbquery_caching()
106-
vfbquery.patch_vfbquery_with_caching()
107-
108-
# First call builds cache
109-
start = time.time()
110-
result1 = vfbquery.get_term_info('FBbt_00003748')
111-
cold_time = time.time() - start
112-
113-
# Second call hits cache
114-
start = time.time()
115-
result2 = vfbquery.get_term_info('FBbt_00003748') # Fast!
116-
cached_time = time.time() - start
117-
118-
speedup = cold_time / cached_time
119-
print(f"Cold: {cold_time:.2f}s, Cached: {cached_time:.4f}s, Speedup: {speedup:.0f}x")
120-
# Output: Cold: 1.25s, Cached: 0.0023s, Speedup: 543x
121-
```
63+
**Typical Performance:**
12264

123-
## Cache Management
65+
- First query: 1-2 seconds
66+
- Cached queries: <0.1 seconds
67+
- Speedup: Up to 54,000x for complex queries
12468

125-
### Monitor Cache Performance
69+
## Monitoring Cache Performance
12670

12771
```python
128-
import vfbquery
72+
import vfbquery as vfb
12973

13074
# Get cache statistics
131-
stats = vfbquery.get_vfbquery_cache_stats()
75+
stats = vfb.get_vfbquery_cache_stats()
13276
print(f"Hit rate: {stats['hit_rate_percent']}%")
133-
print(f"Memory used: {stats['memory_cache_size_mb']}MB / {stats['memory_cache_limit_mb']}MB")
134-
print(f"Items: {stats['memory_cache_items']} / {stats['max_items']}")
135-
print(f"TTL: {stats['cache_ttl_days']} days")
77+
print(f"Memory used: {stats['memory_cache_size_mb']}MB")
78+
print(f"Cache items: {stats['memory_cache_items']}")
13679

13780
# Get current configuration
13881
config = vfb.get_cache_config()
139-
print(f"TTL: {config['cache_ttl_hours']}h, Memory: {config['memory_cache_size_mb']}MB, Items: {config['max_items']}")
82+
print(f"TTL: {config['cache_ttl_hours']} hours")
83+
print(f"Memory limit: {config['memory_cache_size_mb']}MB")
14084
```
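A hit rate such as `hit_rate_percent` is conventionally the share of lookups answered from the cache. The helper below shows that arithmetic; the function and its rounding are assumptions for illustration, since the diff does not show how VFBquery derives the statistic.

```python
def hit_rate_percent(hits, misses):
    """Percentage of lookups served from cache; 0.0 when nothing was looked up."""
    total = hits + misses
    if total == 0:
        return 0.0  # avoid dividing by zero on a fresh cache
    return round(100.0 * hits / total, 2)
```

A low hit rate on a long-running process usually means the working set is larger than the cache or that entries expire before they are reused.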
14185

142-
### Runtime Configuration Changes
86+
## Usage Examples
14387

144-
```python
145-
import vfbquery
146-
147-
# Modify cache TTL (time-to-live)
148-
vfbquery.set_cache_ttl(24) # 1 day
149-
vfbquery.set_cache_ttl(168) # 1 week
150-
vfbquery.set_cache_ttl(720) # 1 month
151-
vfbquery.set_cache_ttl(2160) # 3 months (default)
152-
153-
# Modify memory limits
154-
vfbquery.set_cache_memory_limit(512) # 512MB
155-
vfbquery.set_cache_memory_limit(1024) # 1GB
156-
vfbquery.set_cache_memory_limit(2048) # 2GB (default)
157-
158-
# Modify max items
159-
vfbquery.set_cache_max_items(1000) # 1K items
160-
vfbquery.set_cache_max_items(5000) # 5K items
161-
vfbquery.set_cache_max_items(10000) # 10K items (default)
162-
163-
# Enable/disable disk caching
164-
vfbquery.enable_disk_cache() # Default location
165-
vfbquery.enable_disk_cache('/custom/cache/directory') # Custom location
166-
vfbquery.disable_disk_cache() # Memory only
167-
```
168-
169-
### Cache Control
88+
### Production Applications
17089

17190
```python
172-
import vfbquery
173-
174-
# Clear all cached data
175-
vfbquery.clear_vfbquery_cache()
176-
177-
# Disable caching completely
178-
vfbquery.disable_vfbquery_caching()
91+
import vfbquery as vfb
17992

180-
# Re-enable with custom settings
181-
vfbquery.enable_vfbquery_caching(
182-
cache_ttl_hours=720, # 1 month
183-
memory_cache_size_mb=1024 # 1GB
184-
)
93+
# Caching is enabled automatically with optimal defaults
94+
# Adjust only if your application has specific needs
18595

186-
# Restore original functions (if patched)
187-
vfbquery.unpatch_vfbquery_caching()
96+
# Example: Long-running server with limited memory
97+
vfb.set_cache_memory_limit(512) # 512MB limit
98+
vfb.set_cache_ttl(168) # 1 week TTL
18899
```
189100

190-
## Integration Strategies
191-
192-
### For Development
101+
### Jupyter Notebooks
193102

194103
```python
195-
# Quick setup for development
196-
import vfbquery
197-
vfbquery.enable_vfbquery_caching(cache_ttl_hours=1) # Short TTL for dev
198-
vfbquery.patch_vfbquery_with_caching() # Transparent caching
199-
```
104+
import vfbquery as vfb
200105

201-
### For Production Applications
106+
# Caching works automatically in notebooks
107+
# Data persists between kernel restarts
202108

203-
```python
204-
# Production setup with persistence
205-
import vfbquery
206-
from pathlib import Path
207-
208-
cache_dir = Path.home() / '.app_cache' / 'vfbquery'
209-
vfbquery.enable_vfbquery_caching(
210-
cache_ttl_hours=24,
211-
memory_cache_size=2000,
212-
disk_cache_enabled=True,
213-
disk_cache_dir=str(cache_dir)
214-
)
215-
vfbquery.patch_vfbquery_with_caching()
109+
result = vfb.get_term_info('FBbt_00003748') # Fast on repeated runs
110+
instances = vfb.get_instances('FBbt_00003748') # Cached automatically
216111
```
217112

218-
### For Jupyter Notebooks
219-
220-
```python
221-
# Notebook-friendly caching
222-
import vfbquery
223-
import os
224-
225-
# Enable caching with environment control
226-
os.environ['VFBQUERY_CACHE_ENABLED'] = 'true'
227-
vfbquery.enable_vfbquery_caching(cache_ttl_hours=4) # Session-length cache
228-
vfbquery.patch_vfbquery_with_caching()
229-
230-
# Use regular VFBquery functions - they're now cached!
231-
medulla = vfbquery.get_term_info('FBbt_00003748')
232-
instances = vfbquery.get_instances('FBbt_00003748')
233-
```
234-
235-
## Comparison with VFB_connect Caching
236-
237-
| Feature | VFB_connect | VFBquery Native Caching |
238-
|---------|-------------|-------------------------|
239-
| Lookup cache | ✅ (3 month TTL) | ✅ (Configurable TTL) |
240-
| Term object cache | ✅ (`_use_cache`) | ✅ (Multi-layer) |
241-
| Memory caching | ✅ (Limited) | ✅ (LRU, configurable size) |
242-
| Disk persistence | ✅ (Pickle) | ✅ (Pickle + JSON options) |
243-
| Environment control | ✅ (`VFB_CACHE_ENABLED`) | ✅ (`VFBQUERY_CACHE_ENABLED`) |
244-
| Cache statistics || ✅ (Detailed stats) |
245-
| Multiple cache layers || ✅ (SOLR, parsing, results) |
246-
| Transparent integration || ✅ (Monkey patching) |
247-
248113
## Benefits
249114

250-
1. **Dramatic Performance Improvement**: 100x+ speedup for repeated queries
251-
2. **No Code Changes Required**: Transparent monkey patching option
252-
3. **Configurable**: Tune cache size, TTL, and storage options
253-
4. **Persistent**: Cache survives across Python sessions
254-
5. **Multi-layer**: Cache at different stages for maximum efficiency
255-
6. **Compatible**: Works alongside existing VFB_connect caching
256-
7. **Statistics**: Monitor cache effectiveness
115+
- **Dramatic Performance**: Up to 54,000x speedup for repeated queries
116+
- **Zero Configuration**: Works out of the box with optimal settings
117+
- **Persistent Storage**: Cache survives Python restarts
118+
- **Memory Efficient**: LRU eviction prevents memory bloat
119+
- **Multi-layer Caching**: Optimizes SOLR queries, parsing, and results
120+
- **Production Ready**: 3-month TTL matches VFB_connect behavior
257121

258122
## Best Practices
259123

260-
1. **Enable early**: Set up caching at application startup
261-
2. **Monitor performance**: Use `get_vfbquery_cache_stats()` to track effectiveness
262-
3. **Tune cache size**: Balance memory usage vs hit rate
263-
4. **Consider TTL**: Shorter for development, longer for production
264-
5. **Use disk caching**: For applications with repeated sessions
265-
6. **Clear when needed**: Clear cache after data updates
124+
- **Monitor performance**: Use `get_vfbquery_cache_stats()` regularly
125+
- **Adjust for your use case**: Tune memory limits for long-running applications
126+
- **Consider data freshness**: Shorter TTL for frequently changing data
127+
- **Disable when needed**: Use environment variable if caching isn't desired
