| 1 | +# Deploy to Production |
| 2 | + |
| 3 | +Configure DataJoint for production environments with controlled schema changes and project isolation. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +Development and production environments have different requirements: |
| 8 | + |
| 9 | +| Concern | Development | Production | |
| 10 | +|---------|-------------|------------| |
| 11 | +| Schema changes | Automatic table creation | Controlled, explicit changes only | |
| 12 | +| Naming | Ad-hoc schema names | Consistent project prefixes | |
| 13 | +| Configuration | Local settings | Environment-based | |
| 14 | + |
| 15 | +DataJoint 2.0 provides settings, including `database.create_tables` and `database.schema_prefix`, to enforce this production discipline. |
| 16 | + |
| 17 | +## Prevent Automatic Table Creation |
| 18 | + |
| 19 | +By default, DataJoint creates tables automatically when you first access them. This is convenient during development but dangerous in production: a typo or a bug in pipeline code could create unintended tables. |
| 20 | + |
| 21 | +### Enable Production Mode |
| 22 | + |
| 23 | +Set `create_tables=False` to prevent automatic table creation: |
| 24 | + |
| 25 | +```python |
| 26 | +import datajoint as dj |
| 27 | + |
| 28 | +# Production mode: no automatic table creation |
| 29 | +dj.config.database.create_tables = False |
| 30 | +``` |
| 31 | + |
| 32 | +Or via environment variable: |
| 33 | + |
| 34 | +```bash |
| 35 | +export DJ_CREATE_TABLES=false |
| 36 | +``` |
| 37 | + |
| 38 | +Or in `datajoint.json`: |
| 39 | + |
| 40 | +```json |
| 41 | +{ |
| 42 | +  "database": { |
| 43 | +    "create_tables": false |
| 44 | +  } |
| 45 | +} |
| 46 | +``` |
| 47 | + |
| 48 | +### What Changes |
| 49 | + |
| 50 | +With `create_tables=False`: |
| 51 | + |
| 52 | +| Action | Development (True) | Production (False) | |
| 53 | +|--------|-------------------|-------------------| |
| 54 | +| Access existing table | Works | Works | |
| 55 | +| Access missing table | Creates it | **Raises error** | |
| 56 | +| Explicit `Schema(create_tables=True)` | Creates | Creates (override) | |
| 57 | + |
| 58 | +### Example: Production Safety |
| 59 | + |
| 60 | +```python |
| 61 | +import datajoint as dj |
| 62 | + |
| 63 | +dj.config.database.create_tables = False |
| 64 | +schema = dj.Schema('myproject_ephys') |
| 65 | + |
| 66 | +@schema |
| 67 | +class Recording(dj.Manual): |
| 68 | +    definition = """ |
| 69 | +    recording_id : int |
| 70 | +    --- |
| 71 | +    path : varchar(255) |
| 72 | +    """ |
| 73 | + |
| 74 | +# If table doesn't exist in database: |
| 75 | +Recording() # Raises DataJointError: Table not found |
| 76 | +``` |
| 77 | + |
| 78 | +### Override for Migrations |
| 79 | + |
| 80 | +When you need to create tables during a controlled migration: |
| 81 | + |
| 82 | +```python |
| 83 | +# Explicit override for this schema only |
| 84 | +schema = dj.Schema('myproject_ephys', create_tables=True) |
| 85 | + |
| 86 | +@schema |
| 87 | +class NewTable(dj.Manual): |
| 88 | +    definition = """...""" |
| 89 | + |
| 90 | +NewTable() # Creates the table |
| 91 | +``` |
| 92 | + |
| 93 | +## Use Schema Prefixes |
| 94 | + |
| 95 | +When multiple projects share a database server, use prefixes to avoid naming collisions and organize schemas. |
| 96 | + |
| 97 | +### Configure Project Prefix |
| 98 | + |
| 99 | +```python |
| 100 | +import datajoint as dj |
| 101 | + |
| 102 | +dj.config.database.schema_prefix = 'myproject_' |
| 103 | +``` |
| 104 | + |
| 105 | +Or via environment variable: |
| 106 | + |
| 107 | +```bash |
| 108 | +export DJ_SCHEMA_PREFIX=myproject_ |
| 109 | +``` |
| 110 | + |
| 111 | +Or in `datajoint.json`: |
| 112 | + |
| 113 | +```json |
| 114 | +{ |
| 115 | +  "database": { |
| 116 | +    "schema_prefix": "myproject_" |
| 117 | +  } |
| 118 | +} |
| 119 | +``` |
| 120 | + |
| 121 | +### Apply Prefix to Schemas |
| 122 | + |
| 123 | +Prepend the configured prefix when creating each schema: |
| 124 | + |
| 125 | +```python |
| 126 | +import datajoint as dj |
| 127 | + |
| 128 | +prefix = dj.config.database.schema_prefix # 'myproject_' |
| 129 | + |
| 130 | +# Schema names include prefix |
| 131 | +subject_schema = dj.Schema(prefix + 'subject') # myproject_subject |
| 132 | +session_schema = dj.Schema(prefix + 'session') # myproject_session |
| 133 | +ephys_schema = dj.Schema(prefix + 'ephys') # myproject_ephys |
| 134 | +``` |
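
The naming convention above can be wrapped in a small helper. The sketch below is illustrative only and does not use DataJoint: `prefixed_schema_name` is a hypothetical function, and the length check assumes a MySQL backend, where database (schema) names are limited to 64 characters.

```python
MAX_IDENTIFIER_LENGTH = 64  # MySQL limit for database (schema) names


def prefixed_schema_name(prefix: str, base: str) -> str:
    """Join a project prefix and a base name, rejecting names MySQL would refuse."""
    if not base:
        raise ValueError("base schema name must not be empty")
    name = f"{prefix}{base}"
    if len(name) > MAX_IDENTIFIER_LENGTH:
        raise ValueError(
            f"schema name {name!r} is {len(name)} characters; "
            f"the limit is {MAX_IDENTIFIER_LENGTH}"
        )
    return name


# Build the three schema names used in this guide from one prefix.
names = {
    base: prefixed_schema_name("myproject_", base)
    for base in ("subject", "session", "ephys")
}
print(names["ephys"])
```

Building every name through one function keeps the prefix in a single place, so switching from a development prefix to a production prefix does not require touching each schema module.
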
| 135 | + |
| 136 | +### Benefits |
| 137 | + |
| 138 | +- **Isolation**: Multiple projects coexist without conflicts |
| 139 | +- **Visibility**: Easy to identify which schemas belong to which project |
| 140 | +- **Permissions**: Grant access by prefix pattern (`myproject\_%` in MySQL `GRANT` syntax) |
| 141 | +- **Cleanup**: Drop all project schemas by prefix |
| 142 | + |
| 143 | +### Database Permissions by Prefix |
| 144 | + |
| 145 | +```sql |
| 146 | +-- Grant access to all schemas with prefix |
| 147 | +GRANT ALL PRIVILEGES ON `myproject\_%`.* TO 'developer'@'%'; |
| 148 | + |
| 149 | +-- Read-only access to another project |
| 150 | +GRANT SELECT ON `otherproject\_%`.* TO 'developer'@'%'; |
| 151 | +``` |
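
In these `GRANT` statements, `%` matches any run of characters and an unescaped `_` matches any single character, which is why the underscore in the prefix is written as `\_`. The sketch below is a rough model of that matching in plain Python, useful for checking which schema names a pattern would cover. The helpers `grant_pattern_to_regex` and `matches` are hypothetical names; they are not part of MySQL or DataJoint, and the model ignores case-sensitivity settings on the server.

```python
import re


def grant_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a MySQL database-level GRANT pattern into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash makes the next character literal (e.g. \_ or \%).
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")  # any run of characters, including none
        elif ch == "_":
            parts.append(".")  # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern: str, schema_name: str) -> bool:
    """Return True if the whole schema name is covered by the pattern."""
    return grant_pattern_to_regex(pattern).fullmatch(schema_name) is not None


print(matches(r"myproject\_%", "myproject_ephys"))  # escaped underscore is literal
print(matches("myproject_%", "myprojectXephys"))    # unescaped underscore is a wildcard
```

The second call shows the risk of leaving the underscore unescaped: the pattern would also cover schemas that merely start with `myproject` followed by any character.
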
| 152 | + |
| 153 | +## Environment-Based Configuration |
| 154 | + |
| 155 | +Use different configurations for development, staging, and production. |
| 156 | + |
| 157 | +### Configuration Hierarchy |
| 158 | + |
| 159 | +DataJoint loads settings in priority order: |
| 160 | + |
| 161 | +1. **Environment variables** (highest priority) |
| 162 | +2. **Secrets directory** (`.secrets/`) |
| 163 | +3. **Config file** (`datajoint.json`) |
| 164 | +4. **Defaults** (lowest priority) |
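
The priority order above can be pictured as a lookup that stops at the first source containing the setting. The following is a minimal sketch of that idea in plain Python, not DataJoint's actual loader: the `resolve` function, the `ENV_NAMES` mapping, and the `DEFAULTS` values are assumptions made for illustration, and the secrets layout (one file per setting, named after the setting) follows the `.secrets/database.user` example in this guide.

```python
import json
from pathlib import Path

# Illustrative defaults; not DataJoint's real default values.
DEFAULTS = {"database.host": "localhost", "database.create_tables": True}

# Hypothetical mapping from setting name to environment variable.
ENV_NAMES = {"database.host": "DJ_HOST", "database.user": "DJ_USER"}


def resolve(key, env, secrets_dir, config_file):
    """Return (value, source) for key, checking sources from highest to lowest priority."""
    # 1. Environment variables
    env_name = ENV_NAMES.get(key)
    if env_name and env_name in env:
        return env[env_name], "environment"

    # 2. Secrets directory: one file per setting
    secret = Path(secrets_dir) / key
    if secret.is_file():
        return secret.read_text().strip(), "secrets"

    # 3. Config file: walk the nested JSON using the dotted key
    config_path = Path(config_file)
    if config_path.is_file():
        node = json.loads(config_path.read_text())
        for part in key.split("."):
            if not isinstance(node, dict) or part not in node:
                break
            node = node[part]
        else:
            return node, "config file"

    # 4. Defaults
    if key in DEFAULTS:
        return DEFAULTS[key], "defaults"
    raise KeyError(key)
```

Calling `resolve("database.host", os.environ, ".secrets", "datajoint.json")` would report which source supplied the host. Values read from the environment or from secrets files are returned as strings in this sketch; no type conversion is attempted.
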
| 165 | + |
| 166 | +### Development Setup |
| 167 | + |
| 168 | +**datajoint.json** (committed): |
| 169 | +```json |
| 170 | +{ |
| 171 | +  "database": { |
| 172 | +    "host": "localhost", |
| 173 | +    "create_tables": true |
| 174 | +  } |
| 175 | +} |
| 176 | +``` |
| 177 | + |
| 178 | +**.secrets/database.user**: |
| 179 | +``` |
| 180 | +dev_user |
| 181 | +``` |
| 182 | + |
| 183 | +### Production Setup |
| 184 | + |
| 185 | +Override via environment: |
| 186 | + |
| 187 | +```bash |
| 188 | +# Production database |
| 189 | +export DJ_HOST=prod-db.example.com |
| 190 | +export DJ_USER=prod_user |
| 191 | +export DJ_PASS=prod_password |
| 192 | + |
| 193 | +# Production mode |
| 194 | +export DJ_CREATE_TABLES=false |
| 195 | +export DJ_SCHEMA_PREFIX=myproject_ |
| 196 | + |
| 197 | +# Disable interactive prompts |
| 198 | +export DJ_SAFEMODE=false |
| 199 | +``` |
| 200 | + |
| 201 | +### Docker/Kubernetes Example |
| 202 | + |
| 203 | +```yaml |
| 204 | +# docker-compose.yaml |
| 205 | +services: |
| 206 | +  worker: |
| 207 | +    image: my-pipeline:latest |
| 208 | +    environment: |
| 209 | +      - DJ_HOST=db.example.com |
| 210 | +      - DJ_USER_FILE=/run/secrets/db_user |
| 211 | +      - DJ_PASS_FILE=/run/secrets/db_password |
| 212 | +      - DJ_CREATE_TABLES=false |
| 213 | +      - DJ_SCHEMA_PREFIX=prod_ |
| 214 | +    secrets: |
| 215 | +      - db_user |
| 216 | +      - db_password |
| 217 | +``` |
| 218 | + |
| 219 | +## Complete Production Configuration |
| 220 | + |
| 221 | +### datajoint.json (committed) |
| 222 | + |
| 223 | +```json |
| 224 | +{ |
| 225 | +  "database": { |
| 226 | +    "host": "localhost", |
| 227 | +    "port": 3306 |
| 228 | +  }, |
| 229 | +  "stores": { |
| 230 | +    "default": "main", |
| 231 | +    "main": { |
| 232 | +      "protocol": "s3", |
| 233 | +      "endpoint": "s3.amazonaws.com", |
| 234 | +      "bucket": "my-org-data", |
| 235 | +      "location": "myproject" |
| 236 | +    } |
| 237 | +  } |
| 238 | +} |
| 239 | +``` |
| 240 | + |
| 241 | +### Production Environment Variables |
| 242 | + |
| 243 | +```bash |
| 244 | +# Database |
| 245 | +export DJ_HOST=prod-mysql.example.com |
| 246 | +export DJ_USER=prod_service |
| 247 | +export DJ_PASS=<from-secret-manager> |
| 248 | + |
| 249 | +# Production behavior |
| 250 | +export DJ_CREATE_TABLES=false |
| 251 | +export DJ_SCHEMA_PREFIX=prod_ |
| 252 | +export DJ_SAFEMODE=false |
| 253 | + |
| 254 | +# Logging |
| 255 | +export DJ_LOG_LEVEL=WARNING |
| 256 | +``` |
| 257 | + |
| 258 | +### Verification Script |
| 259 | + |
| 260 | +```python |
| 261 | +#!/usr/bin/env python |
| 262 | +"""Verify production configuration before deployment.""" |
| 263 | +import datajoint as dj |
| 264 | + |
| 265 | +def verify_production_config(): |
| 266 | +    """Check that production settings are correctly applied.""" |
| 267 | +    errors = [] |
| 268 | + |
| 269 | +    # Check create_tables is disabled |
| 270 | +    if dj.config.database.create_tables: |
| 271 | +        errors.append("create_tables should be False in production") |
| 272 | + |
| 273 | +    # Check schema prefix is set |
| 274 | +    if not dj.config.database.schema_prefix: |
| 275 | +        errors.append("schema_prefix should be set in production") |
| 276 | + |
| 277 | +    # Check not pointing to localhost |
| 278 | +    if dj.config.database.host == 'localhost': |
| 279 | +        errors.append("database.host is localhost - expected production host") |
| 280 | + |
| 281 | +    if errors: |
| 282 | +        for e in errors: |
| 283 | +            print(f"ERROR: {e}") |
| 284 | +        return False |
| 285 | + |
| 286 | +    print("Production configuration verified") |
| 287 | +    return True |
| 288 | + |
| 289 | +if __name__ == '__main__': |
| 290 | +    import sys |
| 291 | +    sys.exit(0 if verify_production_config() else 1) |
| 292 | +``` |
| 293 | + |
| 294 | +## Summary |
| 295 | + |
| 296 | +| Setting | Development | Production | |
| 297 | +|---------|-------------|------------| |
| 298 | +| `database.create_tables` | `true` | `false` | |
| 299 | +| `database.schema_prefix` | `""` or `dev_` | `prod_` | |
| 300 | +| `safemode` | `true` | `false` (automated) | |
| 301 | +| `loglevel` | `DEBUG` | `WARNING` | |
| 302 | + |
| 303 | +## See Also |
| 304 | + |
| 305 | +- [Manage Pipeline Project](manage-pipeline-project.md) — Project organization |
| 306 | +- [Configuration Reference](../reference/configuration.md) — All settings |
| 307 | +- [Manage Secrets](manage-secrets.md) — Credential management |