Detailed technical information for the BANC data processing pipeline.
VFB Database (neuron IDs) → BANC Public Data (skeleton/mesh) → Coordinate Transform (JRC2018U/VNC space) → VFB File Structure (volume.[swc|obj|nrrd])
- Input: BANC space (nanometers)
- Intermediate: JRC2018F space (micrometers)
- Output: JRC2018U (brain) or JRCVNC2018U (VNC) space (micrometers)
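The unit change between the input and the later spaces can be sketched as below. This covers only the nanometer-to-micrometer scaling; the spatial registration from BANC through JRC2018F to the final template is performed by the transformation tools and is not reproduced here. The function name and the sample point are illustrative assumptions.

```python
NM_PER_UM = 1000.0  # 1 micrometer = 1000 nanometers


def nm_to_um(points_nm):
    """Scale (x, y, z) tuples from nanometers to micrometers.

    Illustrative helper only: this changes units, it does not register
    the points to any template space.
    """
    return [(x / NM_PER_UM, y / NM_PER_UM, z / NM_PER_UM) for x, y, z in points_nm]


# Hypothetical BANC-space point, in nanometers.
points_um = nm_to_um([(512000.0, 256000.0, 40000.0)])
```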
Neurons are automatically classified as brain or VNC based on:
- Coordinate analysis of neuron extent
- VFB database template mappings
- Automatic selection of appropriate template space
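A minimal sketch of the coordinate-analysis part of this classification, assuming a single boundary plane separates brain from VNC. The boundary value, the axis, and the rule that the majority side wins are all hypothetical; the pipeline also consults the VFB template mappings, which this sketch ignores.

```python
TEMPLATE_FOR_REGION = {"brain": "JRC2018U", "vnc": "JRCVNC2018U"}


def classify_region(points, boundary, axis=1):
    """Label a neuron 'brain' or 'vnc' by where most of its points lie.

    Assumption: points with a coordinate below `boundary` along `axis`
    belong to the brain. Both values are placeholders, not pipeline settings.
    """
    if not points:
        raise ValueError("cannot classify a neuron with no points")
    brain_side = sum(1 for p in points if p[axis] < boundary)
    return "brain" if brain_side >= len(points) - brain_side else "vnc"


def select_template(points, boundary, axis=1):
    """Pick the output template space for the classified region."""
    return TEMPLATE_FOR_REGION[classify_region(points, boundary, axis)]
```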
- Base URL: `gs://lee-lab_brain-and-nerve-cord-fly-connectome/`
- Skeletons: `neuron_skeletons/swcs-from-pcg-skel/`
- Meshes: `neuron_meshes/meshes/`
- Format: Neuroglancer precomputed mesh format (JSON + binary)
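The paths above can be combined into per-neuron object URIs roughly as follows. The file naming is an assumption: one `<segment_id>.swc` per skeleton, and the `<segment_id>:0` manifest name used by the legacy Neuroglancer precomputed mesh layout. Check the bucket listing before relying on either.

```python
BASE_URL = "gs://lee-lab_brain-and-nerve-cord-fly-connectome/"
SKELETON_PREFIX = "neuron_skeletons/swcs-from-pcg-skel/"
MESH_PREFIX = "neuron_meshes/meshes/"


def skeleton_uri(segment_id):
    # Assumption: skeletons are stored as "<segment_id>.swc".
    return f"{BASE_URL}{SKELETON_PREFIX}{segment_id}.swc"


def mesh_manifest_uri(segment_id):
    # Assumption: legacy precomputed layout, manifest named "<segment_id>:0".
    return f"{BASE_URL}{MESH_PREFIX}{segment_id}:0"
```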
- Host: kbw.virtualflybrain.org
- Type: Neo4j graph database
- Query: Neurons with `EXISTS(r.folder)` condition
- Data: Neuron organization, template mapping, VFB identifiers
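A hedged sketch of how such a query string might be assembled. Only the `EXISTS(r.folder)` condition comes from this document; the match pattern, the returned fields, and the function name are placeholders, since the actual graph schema is not described here. Note that Neo4j 5 removed the `EXISTS(property)` form in favour of `r.folder IS NOT NULL`, so the syntax below assumes an older server.

```python
def build_neuron_query(limit=None):
    """Build a Cypher query for neurons whose relationship has a folder.

    The pattern (n)-[r]->(t) is a generic placeholder, not the real schema.
    """
    query = (
        "MATCH (n)-[r]->(t) "
        "WHERE EXISTS(r.folder) "
        "RETURN n, r.folder AS folder, t"
    )
    if limit is not None:
        # int() guards against injecting arbitrary text into the query.
        query += f" LIMIT {int(limit)}"
    return query
```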
- Format: Standard SWC (Space-separated values)
- Units: Micrometers
- Coordinate System: Template space (JRC2018U or JRCVNC2018U)
- Processing: Direct coordinate transformation from BANC skeleton data
- Typical Size: 4KB per neuron
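For reference, a standard SWC row holds seven space-separated fields: node id, type, x, y, z, radius, and parent id (`-1` for the root). A minimal writer under that convention might look like this; the three-decimal precision and the sample nodes are illustrative choices, not pipeline settings.

```python
def format_swc(nodes):
    """Serialize nodes as SWC text.

    Each node is (id, type, x, y, z, radius, parent); coordinates are
    expected to be in micrometers already.
    """
    lines = ["# id type x y z radius parent"]
    for node_id, node_type, x, y, z, radius, parent in nodes:
        lines.append(
            f"{node_id} {node_type} {x:.3f} {y:.3f} {z:.3f} {radius:.3f} {parent}"
        )
    return "\n".join(lines) + "\n"


# Two hypothetical nodes: a root and one child.
swc_text = format_swc([
    (1, 0, 310.5, 150.25, 42.0, 0.5, -1),
    (2, 0, 311.0, 151.0, 42.5, 0.4, 1),
])
```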
- Format: Wavefront OBJ mesh format
- Source: BANC precomputed mesh fragments
- Processing Steps:
  - Download JSON manifest and binary mesh data
  - Parse binary mesh format (vertices + triangles)
  - Apply coordinate transformation
  - Generate OBJ with vertex normals
- Quality: High-detail meshes (70K+ vertices, 150K+ triangles)
- Typical Size: 5-10MB per neuron
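The parse and OBJ-generation steps can be sketched as below, assuming the fragments use the legacy Neuroglancer single-resolution layout: a little-endian `uint32` vertex count, then `float32` x, y, z triples, then `uint32` triangle indices. Whether the BANC meshes use exactly this layout is an assumption, the function names are illustrative, and the coordinate transformation that would sit between parsing and writing is omitted.

```python
import math
import struct


def parse_fragment(data):
    """Parse one legacy-format mesh fragment into vertices and triangles."""
    (num_vertices,) = struct.unpack_from("<I", data, 0)
    floats = struct.unpack_from(f"<{3 * num_vertices}f", data, 4)
    vertices = [floats[i:i + 3] for i in range(0, len(floats), 3)]
    offset = 4 + 12 * num_vertices
    num_indices = (len(data) - offset) // 4
    indices = struct.unpack_from(f"<{num_indices}I", data, offset)
    triangles = [indices[i:i + 3] for i in range(0, num_indices - 2, 3)]
    return vertices, triangles


def vertex_normals(vertices, triangles):
    """Sum each triangle's face normal into its corners, then normalize."""
    sums = [[0.0, 0.0, 0.0] for _ in vertices]
    for a, b, c in triangles:
        ux, uy, uz = (vertices[b][k] - vertices[a][k] for k in range(3))
        vx, vy, vz = (vertices[c][k] - vertices[a][k] for k in range(3))
        face = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
        for idx in (a, b, c):
            for k in range(3):
                sums[idx][k] += face[k]
    normals = []
    for n in sums:
        length = math.sqrt(sum(c * c for c in n)) or 1.0  # avoid divide-by-zero
        normals.append(tuple(c / length for c in n))
    return normals


def to_obj(vertices, triangles):
    """Write OBJ text with one normal per vertex (OBJ indices are 1-based)."""
    lines = [f"v {x:.4f} {y:.4f} {z:.4f}" for x, y, z in vertices]
    lines += [f"vn {x:.4f} {y:.4f} {z:.4f}" for x, y, z in vertex_normals(vertices, triangles)]
    lines += [f"f {a + 1}//{a + 1} {b + 1}//{b + 1} {c + 1}//{c + 1}" for a, b, c in triangles]
    return "\n".join(lines) + "\n"
```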
- Format: NRRD (Nearly Raw Raster Data)
- Source: Generated from transformed OBJ meshes
- Processing Steps:
  - Mesh voxelization using template-specific resolution
  - Template metadata injection
  - NRRD header generation with coordinate system info
- Voxel Size: 0.622µm (JRC2018U) or 0.4µm (JRCVNC2018U)
- Typical Size: 200-500KB per neuron
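As a rough illustration of how the voxel size feeds both voxelization and the NRRD header, the sketch below maps a point to a voxel index and builds a header with `space directions`. It assumes isotropic voxels, a zero origin, and a `uint8` gzip-encoded volume; the volume dimensions passed in are hypothetical, and the real pipeline takes them from the template metadata.

```python
import math

# Voxel sizes in micrometers, as listed above.
VOXEL_SIZE_UM = {"JRC2018U": 0.622, "JRCVNC2018U": 0.4}


def voxel_index(point_um, template):
    """Map a point in template space (micrometers) to integer voxel indices."""
    size = VOXEL_SIZE_UM[template]
    return tuple(int(math.floor(c / size)) for c in point_um)


def nrrd_header(template, sizes):
    """Build an NRRD header string; `sizes` is the (x, y, z) voxel count."""
    s = VOXEL_SIZE_UM[template]
    return "\n".join([
        "NRRD0004",
        "type: uint8",
        "dimension: 3",
        "space dimension: 3",
        f"sizes: {sizes[0]} {sizes[1]} {sizes[2]}",
        f"space directions: ({s},0,0) (0,{s},0) (0,0,{s})",
        'space units: "microns" "microns" "microns"',
        "encoding: gzip",
        "",  # the header is terminated by a blank line
        "",
    ])
```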
- Missing Mesh Data: Falls back to skeleton-based mesh generation
- Network Failures: Retry logic with exponential backoff
- Coordinate Transform Errors: Skip neuron with detailed logging
- File Write Errors: Cleanup partial files and retry
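Retry with exponential backoff, as used for network failures, can be sketched like this. The attempt count, the base delay, and the choice to retry only on `OSError` are assumptions for illustration; the `sleep` parameter exists so the delay can be stubbed in tests.

```python
import time


def retry_with_backoff(func, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(), doubling the wait after each failure.

    Delays are base_delay * 2**attempt (1s, 2s, 4s, ... by default).
    The final failure is re-raised so the caller can log and skip.
    """
    for attempt in range(attempts):
        try:
            return func()
        except OSError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```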
- State File: `processing_state.json` tracks completion status
- Resume Logic: Automatically skips completed neurons
- Force Reprocess: Use `--no-skip-existing` flag
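The resume behaviour can be sketched as follows. The layout of `processing_state.json` is not specified here, so the mapping of neuron ID to a `"complete"` status string is an assumption, as are the helper names.

```python
import json
import os


def load_state(path="processing_state.json"):
    """Return the saved state, or an empty dict on a first run."""
    if not os.path.exists(path):
        return {}
    with open(path) as fh:
        return json.load(fh)


def mark_done(state, neuron_id, path="processing_state.json"):
    """Record completion and write the file atomically via a temp file."""
    state[str(neuron_id)] = "complete"
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp_path, path)


def should_process(state, neuron_id, skip_existing=True):
    """False for completed neurons unless skipping is disabled."""
    return not (skip_existing and state.get(str(neuron_id)) == "complete")
```

Passing `skip_existing=False` mirrors what the `--no-skip-existing` flag is described as doing.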
- Log Format: Timestamped with progress indicators
- Log Levels: DEBUG, INFO, WARNING, ERROR
- Progress Tracking: Emoji indicators for visual status
- Error Context: Full stack traces for debugging
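A logging setup along these lines would give timestamped output at the listed levels. The logger name, the exact format string, and the mapping of a verbose option to `DEBUG` are assumptions, not the pipeline's actual configuration.

```python
import logging


def configure_logging(verbose=False):
    """Set up timestamped logging; verbose=True enables DEBUG output."""
    logging.basicConfig(
        level=logging.DEBUG if verbose else logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        force=True,  # replace any handlers installed earlier
    )
    return logging.getLogger("banc_pipeline")  # hypothetical logger name


logger = configure_logging(verbose=True)
try:
    raise ValueError("transform failed")  # stand-in for a real error
except ValueError:
    # logger.exception logs at ERROR level and appends the stack trace.
    logger.exception("Skipping neuron after transform error")
```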
- Average: ~15 seconds per neuron (including all formats)
- Bottlenecks: Mesh download and coordinate transformation
- Parallelization: Configurable worker processes
- Memory: ~500MB per worker process
- Disk I/O: Sequential write patterns
- Network: Burst downloads from Google Cloud Storage
- Worker Processes: Linear scaling up to I/O limits
- Batch Processing: Processes neurons in database query order
- Memory Management: Automatic cleanup between neurons
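The figures above allow a back-of-envelope estimate of run time and memory for a given worker count. This sketch assumes perfectly linear scaling, which by the notes above only holds until I/O limits are reached, so treat the result as a lower bound on time.

```python
def estimate_run(num_neurons, workers, seconds_per_neuron=15.0, mb_per_worker=500):
    """Estimate (hours, memory in GB) for a run.

    Defaults use the ~15 s per neuron and ~500 MB per worker quoted above.
    """
    hours = num_neurons * seconds_per_neuron / workers / 3600
    memory_gb = workers * mb_per_worker / 1000
    return hours, memory_gb


# Hypothetical run: 9600 neurons on 8 workers.
hours, memory_gb = estimate_run(9600, 8)
```

For this hypothetical run the estimate works out to 5 hours and 4 GB of memory.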
- Coordinate Bounds: Verify transforms produce reasonable coordinates
- File Integrity: Check file sizes and format validity
- Template Alignment: Validate against known anatomical landmarks
- Mesh Quality: Verify vertex counts and triangle connectivity
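Two of these checks, coordinate bounds and triangle connectivity, can be sketched as below. The bounds passed in would come from the template dimensions; the values used in any call are up to the caller and are not defined in this document.

```python
def within_bounds(points, lower, upper):
    """True if every coordinate of every point lies inside [lower, upper]."""
    return all(
        lo <= c <= hi
        for p in points
        for c, lo, hi in zip(p, lower, upper)
    )


def triangles_valid(num_vertices, triangles):
    """True if each triangle uses three distinct, in-range vertex indices."""
    return all(
        len(set(t)) == 3 and all(0 <= i < num_vertices for i in t)
        for t in triangles
    )
```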
- Coordinate Precision: Limited by source data precision
- Template Coverage: Some neurons may fall outside template bounds
- Mesh Artifacts: Occasional gaps in BANC mesh data
- Processing Time: Large meshes can take considerably longer than the ~15 second average
# Limited test run
python run_full_banc_production.py --limit 5 --dry-run
# Single format testing
python run_full_banc_production.py --limit 10 --formats swc
# Debug mode
python run_full_banc_production.py --limit 1 --verbose

- `process.py`: Core processing logic and coordinate transformations
- `run_full_banc_production.py`: Main pipeline orchestration
- `requirements.txt`: Python dependencies
- `install_banc_transforms.sh`: Setup script for transformation tools
- Custom Transforms: Add new coordinate transformation methods
- Output Formats: Implement additional file format generators
- Quality Filters: Add neuron filtering based on quality metrics
- Batch Processing: Modify for different batching strategies