This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Bitcoin Analysis Stack - Optimized Edition is a storage-optimized blockchain analysis platform that uses a single shared blockchain volume across all services. This reduces storage from ~2TB to ~1.5TB (25% savings) while maintaining full analysis capabilities through read-only mounts, Redis caching, and batch processing.
Single Source of Truth: One Bitcoin Core instance maintains the blockchain (bitcoin_data volume). All other services mount this volume as read-only and build their own indexes/data separately. This is the fundamental difference from the original stack.
# Start all services
docker-compose up -d
# Start specific services
docker-compose up -d bitcoin neo4j redis
# Stop all services
docker-compose down
# View logs
docker-compose logs -f bitcoin
docker-compose logs -f btc-importer
docker-compose logs -f graphql
# Restart service
docker-compose restart btc-importer

# Check blockchain sync status
docker-compose exec bitcoin bitcoin-cli getblockchaininfo
# Get block count
docker-compose exec bitcoin bitcoin-cli getblockcount
# Get specific transaction
docker-compose exec bitcoin bitcoin-cli getrawtransaction <txid> true

# Access Cypher shell
docker-compose exec neo4j cypher-shell -u neo4j -p bitcoin123
# Backup database
docker-compose exec neo4j neo4j-admin dump --database=neo4j --to=/data/neo4j-backup.dump

# Access Redis CLI
docker-compose exec redis redis-cli
# Check memory usage
docker-compose exec redis redis-cli INFO memory
# Check cache size (DBSIZE and FLUSHDB act only on the selected DB: 0 by default, add -n 1 for the GraphQL cache)
docker-compose exec redis redis-cli DBSIZE
# Flush cache (useful for debugging)
docker-compose exec redis redis-cli FLUSHDB

# Health check
curl http://localhost:8000/health
# Shows cache status: {"status": "healthy", "cache": "enabled"}

# Check volume mounts
docker volume ls | grep bitcoin
docker inspect bitcoin_node | grep -A 10 Mounts
docker inspect electrs_indexer | grep -A 10 Mounts
# Should show bitcoin_data volume with:
# - bitcoin_node: Mode "rw" (read-write)
# - electrs_indexer: Mode "ro" (read-only)

The core optimization is volume sharing:
- Bitcoin Core (bitcoin service):
  - Mounts bitcoin_data:/data/.bitcoin with RW (read-write)
  - Only service that writes blockchain data
  - Serves RPC requests to other services
- Electrs (electrs service):
  - Mounts bitcoin_data:/bitcoin:ro (read-only)
  - Reads blockchain files directly from shared volume
  - Writes its own index to separate electrs_data volume
  - Saves ~500GB by not duplicating blockchain
- BlockSci (blocksci service):
  - Mounts bitcoin_data:/data/bitcoin:ro (read-only)
  - Parses blockchain from shared volume
  - Writes parsed data to separate blocksci_data volume
- Jupyter (jupyter service):
  - Mounts bitcoin_data:/data/bitcoin:ro (read-only)
  - Direct read access to blockchain files for analysis
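The read-write versus read-only split described above can be checked programmatically. The following is a minimal sketch, assuming the JSON printed by `docker inspect` is available as a string; the helper names `shared_volume_modes` and `writers` and the sample data are hypothetical and not part of this repository.

```python
import json

def shared_volume_modes(inspect_json: str, volume_suffix: str = "bitcoin_data") -> dict:
    """Map container name -> "rw" or "ro" for the shared blockchain volume."""
    modes = {}
    for container in json.loads(inspect_json):
        name = container["Name"].lstrip("/")  # docker inspect prefixes names with "/"
        for mount in container.get("Mounts", []):
            is_volume = mount.get("Type") == "volume"
            # Compose prefixes volume names with the project name, so match the suffix.
            if is_volume and mount.get("Name", "").endswith(volume_suffix):
                modes[name] = "rw" if mount.get("RW") else "ro"
    return modes

def writers(modes: dict) -> list:
    """Containers that can write to the shared volume; ideally only bitcoin_node."""
    return sorted(name for name, mode in modes.items() if mode == "rw")

# Hypothetical sample shaped like `docker inspect bitcoin_node electrs_indexer`.
SAMPLE = json.dumps([
    {"Name": "/bitcoin_node", "Mounts": [
        {"Type": "volume", "Name": "bitcoin-analysis-stack-optimized_bitcoin_data",
         "Destination": "/data/.bitcoin", "RW": True},
    ]},
    {"Name": "/electrs_indexer", "Mounts": [
        {"Type": "volume", "Name": "bitcoin-analysis-stack-optimized_bitcoin_data",
         "Destination": "/bitcoin", "RW": False},
        {"Type": "volume", "Name": "bitcoin-analysis-stack-optimized_electrs_data",
         "Destination": "/data", "RW": True},
    ]},
])
```

If `writers(shared_volume_modes(...))` returns anything other than `["bitcoin_node"]`, a service has probably been given a writable mount of the blockchain volume by mistake.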
Redis serves two primary functions:
- GraphQL API Caching (DB 1):
  - Caches query results to reduce Bitcoin RPC load
  - TTLs: blockchain info (1min), blocks (10min), transactions (30min), addresses (5min)
  - Cache keys: MD5 hash of query parameters
  - Implementation: services/graphql/server.py - get_cached() / set_cached()
- Importer Block Caching (DB 0):
  - Caches fetched block data to survive restarts
  - TTL: 1 hour
  - Implementation: services/importer/importer.py - get_cached_block() / cache_block()
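The key scheme and TTLs listed above can be illustrated with a short sketch. It assumes the query arguments are JSON-encoded before hashing; the real cache_key() in services/graphql/server.py may serialize them differently, and the category names in `TTL_SECONDS` are hypothetical labels for the four TTLs.

```python
import hashlib
import json

# TTLs in seconds, taken from the list above; the key names are illustrative.
TTL_SECONDS = {
    "blockchain_info": 60,   # 1 min
    "block": 600,            # 10 min
    "transaction": 1800,     # 30 min
    "address": 300,          # 5 min
}

def cache_key(*args) -> str:
    """Hypothetical stand-in: MD5 hex digest of the JSON-encoded arguments."""
    payload = json.dumps(args, sort_keys=True, default=str)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()
```

Identical arguments always map to the same 32-character key, so a repeated query for the same block or address lands on the same cache entry.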
Key optimizations in services/importer/importer.py:
- Redis Caching:
  - Block data cached after fetching
  - Reduces re-fetching on restarts
  - ENABLE_CACHING=true environment variable
- Batch Processing:
  - Groups transactions into Neo4j batches
  - IMPORT_BATCH_SIZE controls batch size (default 100)
  - Single transaction per batch for atomicity
- UNWIND Queries:
  - Bulk insert outputs using UNWIND
  - Reduces individual INSERT overhead by ~60%
  - See _import_transaction() method
- Connection Pooling:
  - max_connection_pool_size=50 for Neo4j driver
  - Reuses connections across batches
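The batch grouping described above can be sketched without a database. This is a minimal illustration under stated assumptions: `batched` and `batch_size_from_env` are hypothetical helpers, not functions from services/importer/importer.py, and the Neo4j write for each batch is left out.

```python
import os
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def batch_size_from_env(default: int = 100) -> int:
    """Read IMPORT_BATCH_SIZE, falling back to the documented default of 100."""
    return int(os.environ.get("IMPORT_BATCH_SIZE", default))

def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items; the last may be shorter."""
    if size < 1:
        raise ValueError("size must be >= 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each yielded batch would then be written inside one Neo4j transaction, so a failure rolls back that batch only.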
Key features in services/graphql/server.py:
- Per-Query Caching:
  - Each resolver checks cache before querying
  - Different TTLs based on data volatility
  - Cache key generation: cache_key(*args) function
- Health Endpoint:
  - Reports cache status: enabled/disabled
  - Use for monitoring cache availability
- Graceful Cache Degradation:
  - If Redis unavailable, queries still work (direct to source)
  - Logs warnings but continues operation
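Graceful degradation can be pictured as cache helpers that never raise. The sketch below is one possible shape, not the code in services/graphql/server.py: it passes the Redis client explicitly (the real get_cached() / set_cached() may hold it globally), and `resolve` is a hypothetical wrapper. The client only needs `get(key)` and `setex(key, ttl, value)`, which redis-py provides.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("graphql.cache")

def get_cached(client: Any, key: str) -> Optional[str]:
    """Return the cached value, or None on a miss or when Redis is unavailable."""
    if client is None:  # caching disabled
        return None
    try:
        return client.get(key)
    except Exception as exc:  # connection errors, timeouts, ...
        logger.warning("cache read failed, querying source directly: %s", exc)
        return None

def set_cached(client: Any, key: str, value: str, ttl: int) -> None:
    """Store a value with a TTL; failures are logged and ignored."""
    if client is None:
        return
    try:
        client.setex(key, ttl, value)
    except Exception as exc:
        logger.warning("cache write failed, continuing without cache: %s", exc)

def resolve(client: Any, key: str, fetch: Callable[[], str], ttl: int = 300) -> str:
    """Check the cache first, otherwise fetch from the source and cache the result."""
    cached = get_cached(client, key)
    if cached is not None:
        return cached
    value = fetch()
    set_cached(client, key, value, ttl)
    return value
```

With this shape a Redis outage costs only the extra load on the source; resolvers keep returning data and the warnings show up in the logs.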
Critical settings for optimizations:
# Importer caching
ENABLE_CACHING=true # Enable Redis cache for importer
IMPORT_BATCH_SIZE=100 # Blocks per Neo4j transaction
# GraphQL caching
ENABLE_CACHE=true # Enable Redis cache for API
CACHE_TTL=300 # Default cache TTL (seconds)
# Neo4j optimizations
NEO4J_HEAP_SIZE=4G # Heap size (increase for better performance)
NEO4J_PAGECACHE=2G # Page cache (increase for large graphs)

Optimizations for shared access:
# Required for analysis
txindex=1 # Full transaction index
prune=0 # No pruning (required for shared access)
# Optimized for multiple readers
dbcache=2048 # Increase for faster sync
maxmempool=300 # Reduced mempool size
maxorphantx=100 # Reduced orphan tx memory

Understanding the volume layout:
bitcoin_data (600GB) # SINGLE SHARED VOLUME
├── bitcoin_node (RW) # Writes blockchain
├── electrs (RO) # Reads blockchain
├── blocksci (RO) # Reads blockchain
└── jupyter (RO) # Reads blockchain
neo4j_data (600GB) # Separate Neo4j storage
electrs_data (100GB) # Electrs index only
blocksci_data (200GB) # BlockSci parsed data
redis_data (2GB) # Cache storage
importer_state (1MB) # Import state
importer_cache (10GB) # Block cache
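The headline storage figures can be reproduced from this layout. The arithmetic below is a rough check, assuming the listed sizes are accurate and that the original stack differed only by the ~500GB duplicate blockchain copy mentioned earlier.

```python
# Volume sizes in GB, copied from the layout above.
VOLUMES_GB = {
    "bitcoin_data": 600,
    "neo4j_data": 600,
    "electrs_data": 100,
    "blocksci_data": 200,
    "redis_data": 2,
    "importer_state": 0.001,  # 1MB
    "importer_cache": 10,
}
DUPLICATE_SAVED_GB = 500  # blockchain copy no longer duplicated

optimized_gb = sum(VOLUMES_GB.values())           # about 1512 GB, i.e. ~1.5TB
original_gb = optimized_gb + DUPLICATE_SAVED_GB   # about 2012 GB, i.e. ~2TB
savings_pct = 100 * DUPLICATE_SAVED_GB / original_gb  # about 24.9, i.e. ~25%
```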
Same schema as original, with optimizations:
# Instead of individual inserts per output:
for output in outputs:
    session.run("MERGE (a:Address {...})")

# Use UNWIND for bulk insert:
# ON CREATE SET keeps first_seen from being overwritten when the address already exists
session.run("""
    UNWIND $outputs as output
    MERGE (a:Address {address: output.address})
    ON CREATE SET a.first_seen = output.time
""", outputs=output_list)

NEO4J_dbms_memory_transaction_total_max=1G
NEO4J_dbms_tx__log_rotation_retention__policy=1G size
Same as original stack - see original CLAUDE.md or README.md for Cypher queries.
- Generate cache key:
  cache_key_str = cache_key("query_name", param1, param2)
- Check cache:
  cached = get_cached(cache_key_str)
  if cached:
      return parse_cached_data(cached)
- Query and cache:
  result = fetch_data()
  set_cached(cache_key_str, json.dumps(result), ttl=600)
  return result

Edit .env:
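IMPORT_BATCH_SIZE=500 # Larger batches = faster import, more memory

Recreate the importer so it picks up the new value (docker-compose restart does not apply changed environment variables):
docker-compose up -d btc-importer

# Check Redis memory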
docker-compose exec redis redis-cli INFO memory
# Check cache hit rate (if keys exist)
docker-compose exec redis redis-cli INFO stats
# Clear cache
docker-compose exec redis redis-cli FLUSHDB
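# Disable cache for testing (set in .env, then recreate the containers; docker-compose restart does not apply changed environment variables)
ENABLE_CACHING=false
ENABLE_CACHE=false
docker-compose up -d btc-importer graphql

To add another service that reads blockchain: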
- Add to docker-compose.yml:
  my-service:
    image: my-image
    volumes:
      - bitcoin_data:/bitcoin:ro # Read-only mount
      - my_service_data:/data # Service-specific storage
- Ensure service waits for Bitcoin:
  depends_on:
    bitcoin:
      condition: service_healthy
- Configure service to use shared path:
  environment:
    - BITCOIN_DATA_DIR=/bitcoin/.bitcoin

- Cause: Trying to write to read-only volume
- Fix: Ensure Electrs writes index to separate volume, not shared bitcoin_data
- Verify: Check ELECTRS_DB_DIR=/data points to electrs_data volume

- Cause: Cache size exceeded maxmemory limit
- Fix: Increase Redis maxmemory in docker-compose.yml
  command: redis-server --maxmemory 4gb --maxmemory-policy allkeys-lru

- Cause: Cache cold or disabled
- Fix: Ensure ENABLE_CACHING=true and Redis is running
- Verify: Check Redis connection in importer logs

- Cause: Cache TTL too long
- Fix: Reduce CACHE_TTL or flush cache manually (the GraphQL cache lives in DB 1, so select it with -n 1)
  docker-compose exec redis redis-cli -n 1 FLUSHDB

- Cause: Batch size too small or insufficient heap
- Fix:
  - Increase IMPORT_BATCH_SIZE to 500-1000
  - Increase NEO4J_HEAP_SIZE to 8G or 16G

- Cause: Multiple containers have volume mounted
- Fix: Stop all services before removing volumes
  docker-compose down
  docker volume rm bitcoin-analysis-stack-optimized_bitcoin_data

Expected performance improvements over original stack:
- Storage: -25% (500GB saved, from ~2TB to ~1.5TB)
- Bitcoin RPC calls: -70% (Redis caching)
- GraphQL response time: -50% (cached queries)
- Neo4j import speed: +30% (UNWIND batch inserts)
- Read-Only Safety: Services cannot corrupt blockchain data accidentally
- Cache Consistency: Clear Redis cache after Bitcoin rollbacks/reorgs
- Horizontal Scaling: Multiple analysis services can mount same volume
- Backup Strategy: Only need to backup bitcoin_data once, not per service
- Network Access: All services communicate via Docker network, not shared filesystem
- Read-only mounts provide additional safety layer
- Redis cache should not be exposed publicly (port 6379)
- Same password security considerations as original stack
- Cache may contain sensitive address queries - secure Redis accordingly
- Implement cache warming on startup
- Add cache hit/miss metrics to GraphQL
- Explore Neo4j query result caching
- Consider blockchain file deduplication with ZFS/BTRFS
- Add Prometheus metrics for cache performance