Commit a09067c
1168780 docs: [#310] add lessons learned from backup POC implementation (Jose Celano)
74a082d docs: [#310] update issue spec progress - all tasks complete (Jose Celano)
c92c199 feat: [#310] add SQLite backup support to backup container (Jose Celano)
c7cc2c5 docs: [#310] recommend maintenance-window as primary backup solution (Jose Celano)
2d97bf1 docs: [#310] add unit tests and improve documentation clarity (Jose Celano)
1c63945 docs: [#310] add maintenance-window solution artifacts and update default backup interval (Jose Celano)
b9fd407 docs: [#310] add Torrust Demo database analysis for exclude-statistics (Jose Celano)
717b4b8 docs: [#310] add alternative backup solutions for large databases (Jose Celano)
6cff3b6 docs: [#310] reorganize backup-strategies folder structure (Jose Celano)
7b3fb8a docs: [#310] complete Phase 7 documentation and update conclusions (Jose Celano)
7ea89cf docs: [#310] add SQLite large database backup findings (Jose Celano)
d040367 docs: [#310] Phase 6 - Restore validation complete (Jose Celano)
e370b48 test: [#310] add unit tests for backup script with bats-core (Jose Celano)
e9b506c refactor: [#310] add Rust-style documentation to backup script (Jose Celano)
bf7e719 docs: [#310] implement Phase 5 - backup maintenance (packaging & retention) (Jose Celano)
4eb29fe docs: [#310] update Phase 5 plan with maintenance approach (Jose Celano)
7076964 docs: [#310] mark Phase 4 as complete in progress table (Jose Celano)
672fafc docs: [#310] reference issue #313 for Ansible ownership fix (Jose Celano)
cb36c17 docs: [#310] add production considerations and fix non-root container (Jose Celano)
745b102 docs: [#310] update Phase 4 documentation with implementation details (Jose Celano)
fd8c19a docs: [#310] implement Phase 4 - config files backup (Jose Celano)
550d382 docs: [#310] add backup integrity verification tests to Phase 3 (Jose Celano)
bfbedd9 docs: [#310] reorganize artifacts with backup-container folder (Jose Celano)
270ef7c docs: [#310] complete Phase 3 - MySQL backup with mariadb-dump (Jose Celano)
7b661aa docs: [#310] complete Phase 2 - minimal backup container (Jose Celano)
37fca8a docs: [#310] reorganize PoC into structured folder with phase docs (Jose Celano)
1406df4 docs: [#310] complete Phase 1 of sidecar backup PoC (Jose Celano)
8b8aa51 docs: [#310] add PoC plan and answer backup requirements questions (Jose Celano)
2fcf835 docs: [#310] add MySQL backup research and sidecar container solution (Jose Celano)
17b15a8 docs: [#310] add preliminary conclusions for backup research (Jose Celano)
848dbde docs: [#310] research database backup strategies (Jose Celano)
Pull request description:
## Summary
Comprehensive research documentation for database backup strategies as part of Epic #309 (Add backup support).
This PR includes **complete research** for SQLite and MySQL backup strategies, backup tools evaluation, container backup architectures, a **working proof-of-concept backup container** with **58 bats-core unit tests**, and a **recommended solution** (Maintenance Window Hybrid approach).
## What's Included
### Database Backup Strategies
#### SQLite
- **Backup approaches**: `.backup` command (Online Backup API), `VACUUM INTO`, file copy risks
- **WAL mode analysis**: Checkpointing behavior, persistence, pros/cons
- **Backup verification and restore procedures**: Integrity checks, recovery steps
- **Torrust Live Demo analysis**: Current implementation (unsafe `cp`), proposed improvements
- **⚠️ Critical Large Database Finding**: SQLite `.backup` stalled at 10% after 16+ hours on a 17GB database (~37 MB/hour effective rate). The maintenance window backup of the same database completes in 72 seconds.
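
For small databases, the `.backup` approach plus an integrity check can be sketched as below. This is a minimal illustration, not the code in `backup.sh`: the function name `backup_sqlite` and the paths are hypothetical, and it assumes the `sqlite3` CLI is installed.

```shell
#!/bin/sh
# Sketch: online SQLite backup followed by an integrity check.
# Assumption: paths do not contain single quotes (they are embedded
# in the sqlite3 dot-command below).

# backup_sqlite SRC DEST
# Copies SRC to DEST with the sqlite3 shell's .backup command
# (Online Backup API), then verifies the copy.
backup_sqlite() {
    src=$1
    dest=$2
    sqlite3 "$src" ".backup '$dest'" || return 1
    # PRAGMA integrity_check prints the single line "ok" for a healthy database.
    result=$(sqlite3 "$dest" "PRAGMA integrity_check;") || return 1
    [ "$result" = "ok" ]
}
```

Usage: `backup_sqlite /data/tracker.db /backups/tracker.db` returns non-zero if either the copy or the check fails. Given the large-database finding above, this only makes sense for databases well below the size where `.backup` stalls.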
#### MySQL
- **Backup approaches**: `mysqldump`, physical backups, binary log backups
- **Container-specific considerations**: Accessing MySQL in Docker containers
- **Backup verification and restore procedures**
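
A logical MySQL dump with a basic verification step might look like the following sketch. The function name `dump_mysql` is hypothetical, and it assumes host and credentials come from a client option file or the environment, so none are passed on the command line.

```shell
#!/bin/sh
# Sketch: compressed logical dump with mysqldump.
# Assumption: connection settings are supplied by a client option file.

# dump_mysql DATABASE OUTFILE
dump_mysql() {
    db=$1
    out=$2
    tmp="$out.partial"
    # --single-transaction dumps InnoDB tables from a consistent snapshot
    # without locking them for the duration of the dump.
    mysqldump --single-transaction "$db" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Compress in a separate step so a mysqldump failure is not hidden
    # behind the exit status of a pipeline.
    gzip -c "$tmp" > "$out" || { rm -f "$tmp" "$out"; return 1; }
    rm -f "$tmp"
    # gzip -t checks that the archive is readable end to end.
    gzip -t "$out"
}
```

Writing to a `.partial` file first means a failed dump never leaves behind something that looks like a finished backup.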
### Container Backup Architectures
- 5 patterns documented: Host Crontab, Centralized, Sidecar, Orchestrator, External Tool
- Comparison matrix with pros/cons
- Decision flowchart for pattern selection
### Backup Tools Evaluation
- ✅ **Restic**: Recommended - mature, encrypted, deduplicated, Docker support
- ⚠️ **Kopia**: Alternative - newer, more features (GUI, ECC, server mode), less mature
- ❌ **Rustic**: Discarded - beta status, not production-ready
- Two-phase backup approach documented (DB dump → file backup)
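
The two-phase approach (DB dump, then file backup) can be sketched as follows. The function `two_phase_backup` and the dump callback are hypothetical names for illustration; `restic backup` is the real subcommand, and restic is assumed to find its repository and password through `RESTIC_REPOSITORY` and `RESTIC_PASSWORD_FILE`.

```shell
#!/bin/sh
# Sketch: two-phase backup (database dump, then file-level snapshot).

# two_phase_backup DUMP_COMMAND STAGING_DIR
# DUMP_COMMAND is any command or function that takes the staging
# directory as its only argument and writes a dump into it.
two_phase_backup() {
    dump_command=$1
    staging=$2
    mkdir -p "$staging" || return 1
    # Phase 1: produce a consistent database dump inside the staging directory.
    "$dump_command" "$staging" || return 1
    # Phase 2: snapshot the staging directory with restic.
    restic backup "$staging"
}
```

The file-backup tool never reads the live database files; it only sees the dump produced in phase 1.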
### Solution Comparison (NEW)
Four backup solutions evaluated with detailed trade-off analysis:
| Solution | Best For | Complexity |
|----------|----------|------------|
| Continuous Sidecar | Hot backups, simple setup | Low |
| **Maintenance Window** | Large DBs, complete consistency | Medium |
| External Scheduler | Multi-service environments | High |
| Native Database | WAL-enabled SQLite | Low |
**Recommended Solution**: Maintenance Window Hybrid (95% container, 5% host script)
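
The host-side part of a maintenance-window backup boils down to stop, back up, restart. The sketch below is an illustration under assumptions, not the contents of `maintenance-backup.sh`: the Compose service names `tracker` and `backup` and the function name are hypothetical.

```shell
#!/bin/sh
# Sketch: host orchestration for a maintenance-window backup.
# Assumption: the Compose file defines services named "tracker" and "backup".

# run_maintenance_backup COMPOSE_FILE
run_maintenance_backup() {
    compose_file=$1
    status=0
    # Stop the writer so the database files are quiescent.
    docker compose -f "$compose_file" stop tracker || return 1
    # Run the backup container once; it exits when the backup is done.
    docker compose -f "$compose_file" run --rm backup || status=$?
    # Always restart the service, even if the backup failed.
    docker compose -f "$compose_file" start tracker || return 1
    # Report the backup's exit status to the caller (for example, cron).
    return "$status"
}
```

Restarting the service regardless of the backup result keeps a failed backup from turning into extended downtime.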
### Maintenance Window Backup POC (Complete - NEW)
A working proof-of-concept with **58 bats-core unit tests** supporting both MySQL and SQLite:
| Feature | Status |
|---------|--------|
| MySQL backup with mysqldump | ✅ Complete |
| SQLite backup with sqlite3 | ✅ Complete |
| Config file backup | ✅ Complete |
| Retention policy (delete old backups) | ✅ Complete |
| Single mode (run once, exit) | ✅ Complete |
| Continuous mode (loop) | ✅ Complete |
| Host orchestration script | ✅ Complete |
| Crontab configuration | ✅ Complete |
| 58 unit tests | ✅ All passing |
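
The difference between single and continuous mode can be sketched as a small dispatcher. The names `run_backups`, `single` and `continuous` are assumptions chosen for this example and may not match the option names used by `backup.sh`.

```shell
#!/bin/sh
# Sketch: single mode (run once, exit) versus continuous mode (loop).

# run_backups MODE INTERVAL_SECONDS BACKUP_COMMAND
run_backups() {
    mode=$1
    interval=$2
    backup_command=$3
    case $mode in
        single)
            # Run once and return the backup's exit status.
            "$backup_command"
            ;;
        continuous)
            # Loop forever; a failed run is logged and retried next cycle.
            while :; do
                "$backup_command" || echo "backup failed; next attempt in ${interval}s" >&2
                sleep "$interval"
            done
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 2
            ;;
    esac
}
```

In single mode the process ends after one run, so the container exits as well, which is what makes it suitable for a host-scheduled maintenance window.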
**POC Artifacts**:
- Multi-stage Dockerfile with MySQL and SQLite support
- `backup.sh` script with modular functions
- `maintenance-backup.sh` host orchestration script
- Docker Compose examples for MySQL and SQLite
- Production and test crontab configurations
- Lessons learned document with implementation concerns
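
A retention policy that deletes old backups can be as small as one `find` call. In this sketch the function name `prune_backups` and the `backup_*` file-name pattern are assumptions, not names taken from the POC.

```shell
#!/bin/sh
# Sketch: delete backup files older than a retention period.
# Assumption: backup files are named backup_* and live in one directory tree.

# prune_backups DIR DAYS
prune_backups() {
    dir=$1
    days=$2
    # -mtime +N matches files whose age, counted in whole days, is greater
    # than N. Files that do not match the name pattern are left alone.
    find "$dir" -type f -name 'backup_*' -mtime +"$days" -exec rm -f {} +
}
```

Usage: `prune_backups /backups 7` removes matching files older than about a week and leaves everything else in place.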
## Key Findings
| Finding | Details |
|---------|---------|
| SQLite Safe Backup | Use `.backup` command (Online Backup API) - safe during concurrent writes |
| SQLite Large DB Limitation | `.backup` impractical for DBs > 1GB due to locking overhead (~37 MB/hour) |
| Maintenance Window Backup | 72 seconds for 17GB SQLite (vs ~17 days with `.backup`) |
| Disk I/O Capacity | 445 MB/s measured - SQLite locking is the bottleneck, not the disk |
| MySQL Backup | `mysqldump` works reliably for containerized deployments |
| WAL Mode | Optional for safe backups, useful for read performance under high load |
| Recommended Tool | Restic - battle-tested, simple, Docker-native, sufficient features |
| Recommended Solution | Maintenance Window Hybrid - container + host crontab |
| Sidecar Pattern | Best for single-server deployments with few services |
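
The orders of magnitude in the table can be rechecked with integer arithmetic. This sketch takes 17GB as 17,000 MB, which is an assumption since the unit convention is not stated; the helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: back-of-the-envelope checks for the figures above.
# Assumption: 17GB is treated as 17,000 MB.

# projected_days SIZE_MB RATE_MB_PER_HOUR
# Whole days needed to copy SIZE_MB at a fixed hourly rate.
projected_days() {
    echo $(( $1 / $2 / 24 ))
}

# throughput_mb_per_s SIZE_MB SECONDS
# Average throughput, rounded down to whole MB/s.
throughput_mb_per_s() {
    echo $(( $1 / $2 ))
}

projected_days 17000 37        # prints 19
throughput_mb_per_s 17000 72   # prints 236
```

At ~37 MB/hour the copy would take roughly 19 days, the same order as the ~17 days quoted above; the exact figure depends on rounding and on the unit convention. The 72-second run averages about 236 MB/s, which is within the 445 MB/s disk capacity.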
## Lessons Learned (Implementation Concerns)
Key pain points discovered during POC that affect future implementation:
| Pain Point | Severity | Notes |
|------------|----------|-------|
| Template conditionals for DB type | Medium | Docker Compose env vars differ for MySQL vs SQLite |
| Path translation (host/container) | Medium | Multiple representations of same path |
| SSH agent key selection | Low | Use `IdentitiesOnly=yes` |
| Container exits in single mode | Low | Expected behavior, just surprising |
| Log rotation missing | Low | Easy to add, often forgotten |
| Backup verification missing | Medium | Important for production |
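
The path translation pain point comes from the same file having one path on the host and another inside the container. A helper along these lines keeps the mapping in one place; the function name and the example mount (`/opt/torrust/storage` on the host, `/data` in the container) are hypothetical.

```shell
#!/bin/sh
# Sketch: translate a host path into its path inside the container.
# Assumption: a single bind mount maps HOST_ROOT to CONTAINER_ROOT.

# host_to_container_path HOST_PATH HOST_ROOT CONTAINER_ROOT
host_to_container_path() {
    host_path=$1
    host_root=${2%/}        # drop one trailing slash, if present
    container_root=${3%/}
    case $host_path in
        "$host_root")
            printf '%s\n' "$container_root"
            ;;
        "$host_root"/*)
            # Keep everything after the host root and re-anchor it.
            suffix=${host_path#"$host_root"}
            printf '%s\n' "$container_root$suffix"
            ;;
        *)
            echo "path is outside the mounted root: $host_path" >&2
            return 1
            ;;
    esac
}
```

For example, `host_to_container_path /opt/torrust/storage/tracker/lib/tracker.db /opt/torrust/storage /data` prints `/data/tracker/lib/tracker.db`, and a path outside the mount is rejected instead of being silently passed through.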
## Related Issues
- Closes #310 (Research Database Backup Strategies)
- Part of Epic #309
- Created on torrust-demo repo:
  - Issue #85: Use `.backup` instead of `cp`
  - Issue #86: Evaluate WAL mode for high-traffic scenario
## Checklist
### Research Complete
- [x] SQLite backup approaches documented
- [x] SQLite large database findings (17GB test)
- [x] MySQL backup approaches documented
- [x] WAL mode analysis with checkpointing behavior
- [x] Backup verification and restore procedures
- [x] Torrust Live Demo analysis
- [x] Container backup architectures (5 patterns)
- [x] Backup tools evaluation (Restic, Kopia, Rustic)
- [x] Solution comparison (4 approaches)
- [x] Recommended solution documented
### POC Complete
- [x] Multi-stage Dockerfile with MySQL and SQLite support
- [x] 58 bats-core unit tests (all passing)
- [x] MySQL backup/restore validated
- [x] SQLite backup/restore validated
- [x] Config file backup
- [x] Retention policy (delete expired backups)
- [x] Single mode (run once, exit)
- [x] Continuous mode (loop with interval)
- [x] Host orchestration script
- [x] Crontab configurations (production + test)
- [x] Docker Compose examples (MySQL + SQLite)
- [x] Lessons learned document
- [x] Issue spec progress updated (all tasks complete)
### Future Work (out of scope for this PR)
- [ ] Implement backup command in deployer
- [ ] Off-site transfer automation (S3, Backblaze B2)
- [ ] Backup encryption
- [ ] Backup verification command
## Documentation Structure
```
docs/research/backup-strategies/
├── README.md                           # Overview and navigation
├── conclusions.md                      # Key findings and recommendations
├── requirements.md                     # Design preferences
├── architectures/
│   └── container-patterns.md           # 5 architecture patterns
├── databases/
│   ├── mysql/
│   │   ├── README.md
│   │   └── backup-approaches.md
│   └── sqlite/
│       ├── README.md
│       ├── backup-approaches.md
│       ├── large-database-backup.md    # Critical 17GB findings
│       └── torrust-live-demo/
│           ├── README.md
│           ├── current-implementation.md
│           └── proposed-improvements.md
├── tools/
│   ├── README.md                       # Tools overview
│   ├── restic.md                       # Detailed Restic evaluation
│   └── restic-vs-kopia.md              # Comparison document
└── solutions/
    ├── README.md                       # Solution comparison (NEW)
    ├── sidecar-container/              # Original sidecar POC
    └── maintenance-window/             # Recommended solution (NEW)
        ├── README.md                   # Architecture and workflow
        ├── implementation-recommendations.md  # Lessons learned
        └── artifacts/
            ├── backup-container/
            │   ├── Dockerfile
            │   ├── backup.sh
            │   └── backup_test.bats    # 58 tests
            ├── docker-compose-with-backup-mysql.yml
            ├── docker-compose-with-backup-sqlite.yml
            ├── maintenance-backup.sh
            ├── maintenance-backup.cron
            └── maintenance-backup-test.cron
```
ACKs for top commit:
josecelano:
ACK 1168780
Tree-SHA512: 90338487494b44eefe56fc6943497a2caa2715e9e459a24d36f287d5ff50938d48ce33c9d2dfb3eef08929dfb79ec80fbf73b228c570ee16b0e26d2034939c98
53 files changed
Lines changed: 11249 additions & 74 deletions
File tree
- docs
- issues
- research/backup-strategies
- architectures
- databases
- mysql
- sqlite
- torrust-live-demo
- solutions
- exclude-statistics
- maintenance-window
- artifacts
- backup-container
- backup-storage/etc
- sidecar-container
- artifacts
- backup-container
- backup-storage/etc
- phases
- tools