COMPASS is a deep learning project dedicated to accurately predicting protein-ligand binding affinities. It leverages a state-of-the-art Graph Neural Network (GNN), ViSNet, to learn from the intricate 3D geometry of molecular complexes, aiming to accelerate the process of drug discovery.
This project is built not just on a powerful model, but on a core philosophy of extreme data robustness and a highly efficient, mode-driven development workflow. Its name, COMPASS, reflects its ability to navigate the often-problematic landscape of real-world structural biology data, ensuring reliable and reproducible results.
- State-of-the-Art Model: Implements the ViSNet architecture for high-precision, geometry-aware predictions.
- Robust Data Pipeline: The cornerstone of COMPASS. The data processing pipeline is meticulously designed to handle common and obscure issues found in PDB data.
- Automated Hardware Optimization: A built-in tool that searches for the best-performing configuration for your specific hardware, helping to avoid out-of-memory errors.
- Mode-Driven Workflow: A four-stage development process (smoke_test, prototyping, validation, production) that allows for seamless switching between quick checks, rapid experimentation, and full-scale training.
- High-Performance Training: Utilizes Automatic Mixed Precision (AMP) for significant speed-ups.
- Resilient & Manageable: Features robust checkpointing, graceful exit handling, and automated log/checkpoint organization.
- Important ViSNet Constraint: The ViSNet model requires that the number of hidden channels be divisible by the number of attention heads. This is a key consideration when configuring the model.
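The divisibility constraint above can be checked before a model is ever built. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the two settings are exposed as `hidden_channels` and `num_heads` (the names used by PyTorch Geometric's ViSNet), which may differ from how compass/config.py names them.

```python
def check_visnet_channels(hidden_channels: int, num_heads: int) -> None:
    """Raise ValueError unless hidden_channels splits evenly across heads."""
    if num_heads <= 0:
        raise ValueError("num_heads must be a positive integer")
    if hidden_channels % num_heads != 0:
        raise ValueError(
            f"hidden_channels={hidden_channels} is not divisible by "
            f"num_heads={num_heads}"
        )


def valid_head_counts(hidden_channels: int) -> list[int]:
    """List every head count that divides hidden_channels evenly."""
    return [h for h in range(1, hidden_channels + 1) if hidden_channels % h == 0]


if __name__ == "__main__":
    check_visnet_channels(128, 8)   # passes: 16 channels per head
    print(valid_head_counts(12))    # [1, 2, 3, 4, 6, 12]
```

Running a check like this when the configuration is loaded surfaces an invalid pairing immediately instead of partway through model construction.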
- Clone the repository:
git clone https://github.com/Sign-up-admin/AIDD-TRAIN.git
cd AIDD-TRAIN
- Create a virtual environment (recommended):
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
- Install dependencies: This project relies on PyTorch and PyTorch Geometric. Please follow their official installation instructions for your specific CUDA version first: https://pytorch.org/get-started/locally/
# Example for CUDA 12.8 on Windows 11
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu128
pip install torch_geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.8.0+cu128.html
pip install rdkit-pypi biopython tqdm scipy
- Prepare Data:
  - Download the PDBbind dataset (v.2020 or other).
  - Open compass/config.py and update the dataset_path and index_file variables to point to your dataset location.
This project contains an intelligent hardware optimizer, hardware_optimizer.py, designed to find the perfect model configuration for different stages of the development lifecycle.
This optimizer was built upon a clear, hierarchical development philosophy defined by its architect, in collaboration with the Gemini agent. It recognizes that the "best" configuration is not a single setting, but a set of trade-offs tailored to the specific goal of each development phase. The optimizer's intelligence lies in its ability to weigh these trade-offs, using real-time performance estimation.
The optimizer targets three core stages, each with a unique goal:
- prototyping (Soft Target: ~20 min/cycle)
  - Philosophy: Time is the ruler. This stage is for rapid trial and error. The configuration must be fast enough to allow developers to quickly test ideas. The optimizer targets a ~20 minute cycle time (for a fixed 450-batch run) but allows for a 20-minute flexibility window.
  - Strategy: It searches its dedicated small model space to find the configuration with the highest throughput (max batch size) that fits within this flexible time budget. This ensures the fastest possible iteration speed without prematurely discarding a slightly slower but much more powerful configuration.
- validation (Soft Target: ~90 min/cycle)
  - Philosophy: Balance is the key. This stage acts as the crucial bridge between a promising prototype and a full-scale production run. It must be close enough to production quality to give meaningful results, but fast enough to not halt the development flow. It serves to seriously validate the findings from the prototyping stage.
  - Strategy: It targets a ~90 minute cycle time, also with a 20-minute flexibility window. It searches its dedicated large model space for the configuration with the highest throughput, striking a balance between speed and quality.
- production (Goal: Time-unlimited)
  - Philosophy: Quality is the ultimate goal. Time is no longer the primary constraint. This stage is for building the best possible model that the hardware and data can support, ready for deployment.
  - Strategy: It employs a two-stage optimization: first, it finds the highest-quality (largest) model that respects both data and hardware limits. Second, it squeezes all remaining performance out of the hardware by finding the maximum possible batch size for that single best model.
This structured approach ensures that from the earliest idea to the final deployment, there is a perfectly optimized configuration to support the task at hand.
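The time-budgeted selection used for the prototyping and validation stages can be sketched roughly as follows. This is a minimal illustration and not the actual optimizer code: the `Candidate` class, `pick_config` function, and sample timings are hypothetical, and the flexibility window is assumed to mean that the target may be exceeded by up to 20 minutes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One hypothetical model/batch-size combination to evaluate."""
    name: str
    batch_size: int
    seconds_per_batch: float  # measured or estimated on the target hardware


def estimated_cycle_minutes(c: Candidate, num_batches: int = 450) -> float:
    """Estimate the wall-clock time of a fixed 450-batch run."""
    return c.seconds_per_batch * num_batches / 60.0


def pick_config(
    candidates: Sequence[Candidate],
    target_minutes: float,
    flexibility_minutes: float = 20.0,
) -> Optional[Candidate]:
    """Return the largest-batch candidate that fits the flexible time budget."""
    budget = target_minutes + flexibility_minutes  # assumed interpretation
    feasible = [c for c in candidates if estimated_cycle_minutes(c) <= budget]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.batch_size)


if __name__ == "__main__":
    space = [
        Candidate("small", 16, 2.0),   # 15 min per cycle
        Candidate("medium", 32, 4.0),  # 30 min per cycle
        Candidate("large", 64, 8.0),   # 60 min per cycle
    ]
    print(pick_config(space, target_minutes=20).name)  # medium
    print(pick_config(space, target_minutes=90).name)  # large
```

The production stage is not covered here, since it drops the time budget and instead searches for the largest model first and the largest batch size second.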
- Open your terminal.
- Navigate to the project's root directory (AIDD-TRAIN).
- Run the optimizer module with the following command:
python -m compass.optimizer
This command will optimize for all four modes (production, validation, prototyping, and smoke_test) in the correct order. The process may take some time.
The script will create or update a hardware_profile.json file in the project root. The main training script will automatically load the appropriate settings from this file based on the DEVELOPMENT_MODE you select in compass/config.py.
If you wish to re-run the optimization for only specific modes, you can use the --modes argument:
# Example: Optimize only for production and validation
python -m compass.optimizer --modes production validation
Once the one-time setup and optimization are complete, your daily workflow is very simple.
- Select Your Mode: Open compass/config.py and set the DEVELOPMENT_MODE variable to one of the four modes: 'smoke_test', 'prototyping', 'validation', or 'production'.
- Run Training: Execute the main script from your terminal.
python -m compass
The script will automatically use the best settings for your chosen mode: the optimized parameters from hardware_profile.json if they exist, or the default settings if no optimization was run for that mode.
All logs and model checkpoints will be saved into a uniquely named directory (e.g., checkpoints/visnet_prototyping_.../).
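The fallback behaviour described above (optimized settings when available, defaults otherwise) might look something like the sketch below. It is an assumption-heavy illustration: the layout of hardware_profile.json is assumed to be one top-level key per mode, and `DEFAULT_SETTINGS` and `load_mode_settings` are hypothetical names, not the project's actual API.

```python
import json
from pathlib import Path

VALID_MODES = ("smoke_test", "prototyping", "validation", "production")

# Hypothetical defaults; the real values live in compass/config.py.
DEFAULT_SETTINGS = {"batch_size": 1, "hidden_channels": 64, "num_heads": 4}


def load_mode_settings(mode: str, profile_path: str = "hardware_profile.json") -> dict:
    """Merge optimized settings for `mode` over the defaults, if a profile exists."""
    if mode not in VALID_MODES:
        raise ValueError(f"Unknown DEVELOPMENT_MODE: {mode!r}")
    settings = dict(DEFAULT_SETTINGS)
    path = Path(profile_path)
    if path.is_file():
        profile = json.loads(path.read_text(encoding="utf-8"))
        # Assumed layout: {"prototyping": {...}, "validation": {...}, ...}
        settings.update(profile.get(mode, {}))
    return settings
```

With this shape, a mode that was never optimized simply receives the defaults, which matches the behaviour the training script is described as having.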
COMPASS implements a phased workflow to balance speed and rigor. You can switch between modes by changing a single variable in compass/config.py.
- smoke_test: "Does the code run?" A minimal check that runs in minutes.
- prototyping: "Is my idea promising?" A lightweight configuration for rapid experimentation.
- validation: "How does my idea perform under realistic conditions?" A medium-sized configuration for pre-production validation.
- production: "What are the final, best-effort results?" The full-scale configuration for generating final results.
This project was forged through a deep-dive debugging session to solve the sudden appearance of NaN (Not a Number) values during training. Instead of simply skipping problematic data, we developed a strategy of "Pause and Autopsy".
This journey underscores the COMPASS philosophy: true progress in scientific machine learning comes not just from powerful architectures, but from a relentless commitment to understanding and purifying the data that fuels them.
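One way to picture the "Pause and Autopsy" idea is a guard that stops training at the first non-finite loss and records which samples were in the batch, so they can be inspected rather than silently skipped. The sketch below is only a guess at the general shape; the function name, exception class, and report format are hypothetical and are not taken from the COMPASS code.

```python
import json
import math
from pathlib import Path
from typing import Sequence


class NonFiniteLossError(RuntimeError):
    """Raised to pause training when the loss becomes NaN or infinite."""


def check_loss_or_pause(
    loss: float,
    step: int,
    batch_ids: Sequence[str],
    report_dir: str = "autopsy",
) -> None:
    """Return quietly for a finite loss; otherwise write a report and stop."""
    if math.isfinite(loss):
        return
    out_dir = Path(report_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    report = {"step": step, "loss": repr(loss), "batch_ids": list(batch_ids)}
    report_path = out_dir / f"step_{step}.json"
    report_path.write_text(json.dumps(report, indent=2), encoding="utf-8")
    raise NonFiniteLossError(
        f"Non-finite loss at step {step}; batch recorded in {report_path}"
    )
```

In a training loop, this would be called right after the loss is computed, with `batch_ids` holding whatever identifiers (for example PDB codes) the data loader exposes.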
For production deployments, the following security measures should be configured:
Set the CORS_ORIGINS environment variable to specify allowed origins:
# Production example
export CORS_ORIGINS="https://yourdomain.com,https://api.yourdomain.com"
# Development (default)
# Uses: http://localhost:8501,http://127.0.0.1:8501,http://localhost:3000,http://127.0.0.1:3000
Important: Never use wildcard (*) origins in production. The system will reject wildcard origins for security.
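To make the CORS_ORIGINS format concrete, the sketch below parses a comma-separated value, falls back to the development defaults listed above, and refuses wildcards. It is an illustration only: `parse_cors_origins` is a hypothetical helper, and rejecting wildcards in every environment (not just production) is an assumption.

```python
from typing import Optional

# Development defaults as listed in the configuration example.
DEV_ORIGINS = [
    "http://localhost:8501",
    "http://127.0.0.1:8501",
    "http://localhost:3000",
    "http://127.0.0.1:3000",
]


def parse_cors_origins(raw: Optional[str]) -> list[str]:
    """Split a comma-separated CORS_ORIGINS value and reject wildcards."""
    if raw is None or not raw.strip():
        return list(DEV_ORIGINS)
    origins = [item.strip() for item in raw.split(",") if item.strip()]
    for origin in origins:
        if "*" in origin:
            raise ValueError(f"Wildcard CORS origin is not allowed: {origin!r}")
    return origins


if __name__ == "__main__":
    print(parse_cors_origins("https://yourdomain.com,https://api.yourdomain.com"))
    # ['https://yourdomain.com', 'https://api.yourdomain.com']
```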
Enable API key authentication for production:
# Enable authentication
export AUTH_ENABLED="true"
# Set API key (single key)
export API_KEY="your-secure-api-key-here"
# Or set multiple API keys (comma-separated) for key rotation
export API_KEYS="key1,key2,key3"
# Force authentication for critical endpoints in production
export FORCE_AUTH_CRITICAL="true"
Critical endpoints that require authentication in production:
- /api/v1/training/tasks - Training task management
- /api/v1/data/upload - Dataset uploads
- /api/v1/data/datasets - Dataset management
- /api/v1/inference - Model inference
- /api/v1/models - Model management
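A key check along these lines could be sketched as below. The header formats (X-API-Key and Authorization: Bearer) are the ones this README documents; everything else is an assumption for illustration, including the helper names and the choice to accept keys from both API_KEY and API_KEYS at once.

```python
import hmac
from typing import Mapping, Optional, Sequence


def load_api_keys(env: Mapping[str, str]) -> list[str]:
    """Collect keys from API_KEY and the comma-separated API_KEYS variable."""
    keys = []
    single = env.get("API_KEY", "").strip()
    if single:
        keys.append(single)
    keys.extend(k.strip() for k in env.get("API_KEYS", "").split(",") if k.strip())
    return keys


def extract_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Read the key from X-API-Key or from an Authorization: Bearer header."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if "x-api-key" in lowered:
        return lowered["x-api-key"].strip()
    auth = lowered.get("authorization", "")
    if auth.lower().startswith("bearer "):
        return auth[len("bearer "):].strip()
    return None


def is_authorized(headers: Mapping[str, str], valid_keys: Sequence[str]) -> bool:
    """Compare the presented key against every valid key in constant time."""
    presented = extract_api_key(headers)
    if not presented:
        return False
    matched = False
    for key in valid_keys:
        # No early exit, so timing does not reveal which key matched.
        if hmac.compare_digest(presented.encode(), key.encode()):
            matched = True
    return matched
```

Accepting several keys at once is what makes rotation possible: a new key can be added to API_KEYS, clients migrated, and the old key removed afterwards.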
Configure rate limits to prevent abuse:
# Default rate limit (requests per minute)
export RATE_LIMIT_DEFAULT="100"
# Training endpoints (more restrictive)
export RATE_LIMIT_TRAINING="10"
export RATE_LIMIT_TRAINING_WINDOW="60"
# Upload endpoints (very restrictive)
export RATE_LIMIT_UPLOAD="3"
export RATE_LIMIT_UPLOAD_WINDOW="60"
# Inference endpoints
export RATE_LIMIT_INFERENCE="20"
export RATE_LIMIT_INFERENCE_WINDOW="60"
Set the environment to production:
export ENVIRONMENT="production"
This enables:
- Stricter security checks
- Enhanced error message sanitization
- Production-optimized logging
- Required authentication for critical endpoints
Configure database connection timeouts:
# Database connection timeout (seconds)
export DB_CONNECTION_TIMEOUT="10.0"
# Database busy timeout (milliseconds)
export DB_BUSY_TIMEOUT="5000"
# Database cache size (negative = KB)
export DB_CACHE_SIZE="-2000"
The service automatically adds security headers to all responses:
- Content-Security-Policy (CSP) - Strict policy for API endpoints
- X-Frame-Options: DENY
- X-Content-Type-Options: nosniff
- X-XSS-Protection: 1; mode=block
- Referrer-Policy: strict-origin-when-cross-origin
All user inputs are automatically sanitized to prevent:
- XSS (Cross-Site Scripting) attacks
- SQL injection (via parameterized queries)
- Path traversal attacks
- Command injection
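As an example of one item in this list, a path traversal guard typically resolves the requested path and confirms it still lies inside an allowed base directory. The sketch below shows that general technique; `safe_join` is a hypothetical helper and is not claimed to be how COMPASS implements its sanitization.

```python
from pathlib import Path


def safe_join(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    # An absolute user_path replaces base entirely, so it fails this check too.
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"Path escapes the allowed directory: {user_path!r}")
    return candidate
```

Resolving before comparing matters: a plain string prefix check would accept inputs such as `a/../../b` that only leave the directory once `..` segments are collapsed.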
File uploads are protected by:
- File type validation (only .zip, .tar, .tar.gz allowed)
- File size limits (configurable via COMPASS_UPLOAD_MAX_SIZE)
- Zip bomb detection
- Upload queue management (prevents resource exhaustion)
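Two of these checks can be illustrated with the standard library alone. The sketch below is a simplified example, not the service's implementation: the function names and both thresholds are hypothetical, and the zip bomb heuristic only covers .zip archives and trusts the sizes declared in the archive headers.

```python
import zipfile

# Extensions the README lists as allowed for uploads.
ALLOWED_SUFFIXES = (".zip", ".tar", ".tar.gz")


def has_allowed_extension(filename: str) -> bool:
    """Case-insensitive check against the allowed archive extensions."""
    return filename.lower().endswith(ALLOWED_SUFFIXES)


def looks_like_zip_bomb(
    path: str,
    max_total_bytes: int = 1024 ** 3,  # hypothetical 1 GiB cap
    max_ratio: float = 100.0,          # hypothetical compression-ratio cap
) -> bool:
    """Flag archives whose declared uncompressed size is suspiciously large."""
    with zipfile.ZipFile(path) as archive:
        entries = archive.infolist()
    uncompressed = sum(entry.file_size for entry in entries)
    compressed = sum(entry.compress_size for entry in entries)
    if uncompressed > max_total_bytes:
        return True
    if compressed > 0 and uncompressed / compressed > max_ratio:
        return True
    return False
```

Because the check reads only the archive's metadata, it is cheap to run before extraction; a stricter variant would also cap the number of bytes actually written while extracting.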
- Python 3.12+
- CUDA-capable GPU (recommended)
- Sufficient disk space for datasets and checkpoints
- Network access for service registry (if using)
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
pip install -r requirements_service.txt
# Install PyTorch with CUDA support
# See: https://pytorch.org/get-started/locally/
Create a .env file or set environment variables:
# Service configuration
export ENVIRONMENT="production"
export COMPASS_HOST="0.0.0.0"
export COMPASS_PORT="8080"
# Security
export AUTH_ENABLED="true"
export API_KEY="your-secure-api-key"
export CORS_ORIGINS="https://yourdomain.com"
# Resource limits
export COMPASS_MAX_WORKERS="4"
export MAX_CONCURRENT_UPLOADS="2"
export MAX_CONCURRENT_TASKS="4"
# Database
export REGISTRY_DB_PATH="./registry.db"
export DB_CONNECTION_TIMEOUT="10.0"
# Logging
export LOG_LEVEL="INFO"
export COMPASS_LOG_DIR="./logs"
# Start COMPASS service
python -m compass.service
# Or use the service startup script
python compass/service_main.py
Monitor service health:
# Basic health check
curl http://localhost:8080/health
# Readiness check
curl http://localhost:8080/health/ready
# Metrics endpoint
curl http://localhost:8080/metrics
Example Nginx configuration:
server {
    listen 80;
    server_name yourdomain.com;
    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
For production, use a process manager like systemd or supervisord:
systemd example (/etc/systemd/system/compass.service):
[Unit]
Description=COMPASS Service
After=network.target
[Service]
Type=simple
User=compass
WorkingDirectory=/path/to/AIDD-TRAIN
Environment="PATH=/path/to/venv/bin"
ExecStart=/path/to/venv/bin/python -m compass.service
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
- Service won't start
  - Check port availability: netstat -an | grep 8080
  - Verify environment variables are set correctly
  - Check logs in the logs/ directory
- Authentication failures
  - Verify AUTH_ENABLED and API_KEY are set
  - Check API key format in request headers: X-API-Key: your-key or Authorization: Bearer your-key
- Rate limiting issues
  - Check rate limit statistics: curl http://localhost:8080/metrics
  - Adjust rate limits via environment variables if needed
- Database connection errors
  - Verify database file permissions
  - Check the DB_CONNECTION_TIMEOUT setting
  - Ensure sufficient disk space
- File upload failures
  - Check file size limits
  - Verify file type is allowed
  - Check upload queue capacity
Monitor service health and performance:
# Get metrics
curl http://localhost:8080/metrics
# Check rate limiting stats
curl http://localhost:8080/metrics | jq '.rate_limiting'
# View authentication failures (check logs)
tail -f logs/compass-service.log | grep "Authentication failed"
This project maintains high code quality standards using automated tools:
Python Code Quality:
# Install development dependencies
pip install -r requirements-dev.txt
# Run all code quality checks
python scripts/run_all_checks.py
# Auto-fix formatting issues
python scripts/run_all_checks.py --format
# Run tests
python scripts/run_tests.bat
Frontend Code Quality:
Option 1: Using Docker (Recommended, no Node.js installation needed)
# Install Docker Desktop first
# Then run checks using Docker
python scripts/check_frontend_docker.py
Option 2: Using Local Node.js
# Install Node.js dependencies (requires Node.js >= 14.0.0)
npm install
# Extract frontend code from Python files
python scripts/extract_frontend_code.py FLASH_DOCK-main
# Run frontend code checks
python scripts/check_frontend.py
# Or use npm scripts directly
npm run lint:all
npm run format
Python Code Quality:
- Black: Code formatting (PEP 8)
- Flake8: Code style and complexity checking
- Pylint: Code quality analysis
- MyPy: Static type checking
- Bandit: Security vulnerability scanning
- Pytest: Unit testing and coverage
Frontend Code Quality:
- ESLint: JavaScript code linting
- Stylelint: CSS/SCSS code linting
- Prettier: Code formatting (HTML/CSS/JS)
- HTMLHint: HTML code quality checking
All quality check reports are saved to the lint_reports/ directory.
This project is licensed under the GNU AGPLv3 License. See the LICENSE file for details.
This project utilizes the PDBbind dataset. We gratefully acknowledge the creators and maintainers of this valuable resource.