Commit 29f5d17

Merge #406: feat: [#405] deploy Hetzner demo tracker and document process (in progress)
675b826 docs: add improvements index for hetzner demo tracker deployment (Jose Celano)
11c020e docs: add bugs index for hetzner demo tracker deployment (Jose Celano)
81a31b9 docs: add tracker registry guide for newTrackon submission (Jose Celano)
4c79724 docs(maintenance): add uptime monitoring guide (Jose Celano)
9b11d27 docs: record successful reboot, tick API/SSH checklist, add public REST health check endpoint (Jose Celano)
4b55ad0 docs(maintenance): add step 4g - delete old SSH key from Hetzner console (done) (Jose Celano)
954a6a8 docs(maintenance): mark local file archival done (step 7) and secrets rotation complete (Jose Celano)
fda93d0 docs(maintenance): mark SSH deployer key rotation as done (step 4) (Jose Celano)
d3d1401 docs(maintenance): mark MySQL torrust and root password rotation as done (step 2) (Jose Celano)
c55763c docs(maintenance): mark tracker admin token rotation as done (step 1) (Jose Celano)
c684920 docs(maintenance): fix restart vs recreate for env var changes in step 1 (Jose Celano)
a3775f1 docs(maintenance): add OS updates guide with apply/verify/reboot procedure (Jose Celano)
8a0064d docs(maintenance): mark Grafana admin password as rotated (step 3 done) (Jose Celano)
96d1628 docs(maintenance): mark Hetzner Cloud and DNS API tokens as deleted (Jose Celano)
0894d85 docs(maintenance): add secrets rotation guide for post-AI-agent deployment (Jose Celano)
ea7ea22 docs(post-provision): add Hetzner backups step to ToC and post-provision index (Jose Celano)
36af759 docs(post-provision): add screenshot of Hetzner backups enabled state (Jose Celano)
7001123 docs(deploy): update progress — all 9 services verified, fill in service endpoints table (Jose Celano)
7a4f714 docs(verify): add Torrust tracker client announce tests for HTTP and UDP trackers (Jose Celano)
487c1b5 docs(verify): add backup verification and document credentials oversight (Jose Celano)
6870512 docs(verify): add actual tree output to storage verification (Jose Celano)
31c4869 docs(verify): add storage volume mount verification (Jose Celano)
8f97bf8 docs(verify): add MySQL database connectivity verification (Jose Celano)
c5ab1db docs(verify): add Docker services health and log verification (Jose Celano)
7c3d68b docs(verify): add UDP tracker verification results (Jose Celano)
c0dd03a docs(verify): add Grafana verification results (Jose Celano)
a2d454c docs(verify): fix corrupted results table in api.md (Jose Celano)
96224e0 docs(verify): add API, HTTP tracker, and health check verification results (Jose Celano)
fb288b3 docs: add test command output, verify guides, floating IP improvement (Jose Celano)
45f06bf docs(hetzner-demo): populate run README and add test command docs (Jose Celano)
16167cc docs(hetzner-demo): add Bug 3 (URL encoding) and run improvements doc (Jose Celano)
d3d6c64 docs(hetzner-demo): clarify Bug 2 root password was never implemented (Jose Celano)
b136e5c docs(hetzner-demo): document run command MySQL bugs and failure (Jose Celano)
a5c7913 docs: document successful release command (task 3.3 done) (Jose Celano)
fb7a7ff fix: skip docker-compose local validation when docker is not in PATH (Jose Celano)
e874e96 docs: document release fails when deployer runs inside Docker (docker not in PATH) (Jose Celano)
beaf2f9 docs: add observations file and fill in missing ToC entries (Jose Celano)
31d724c docs: document volume/IP setup sequencing tradeoffs in post-provision (Jose Celano)
650936f docs: document configure command execution (task 3.2 done) (Jose Celano)
1422f83 docs: fix misleading volume snapshot claim; document limitation (Jose Celano)
4fba745 docs: document volume setup via Hetzner Cloud API (Jose Celano)
4a4914b docs: document DNS record creation via Hetzner Cloud API (Jose Celano)
837057d docs: [#405] configure floating IPs permanently on VM via netplan (Jose Celano)
db6a702 docs: [#405] add post-provision guides (DNS + volume setup) and assign floating IPs (Jose Celano)
3872406 docs: [#405] mark provision task as complete in issue tracker (Jose Celano)
fda04f1 docs: [#405] add attempt-4 screenshot and document Hetzner activity log confusion (Jose Celano)
9ba435f docs: [#405] document provision success, passphrase bug, IPv6 omission, and UDP domains bug (Jose Celano)
642d043 docs: [#405] add cleanup-between-attempts guide and update provision README (Jose Celano)
a28016b feat: [#405] increase SSH retry budget to 5 minutes with 5s interval (Jose Celano)
182e33b feat: [#405] log full SSH stderr in wait_for_connectivity retry messages (Jose Celano)
3248a63 fix: [#405] add IdentitiesOnly=yes to Ansible ssh_args (Jose Celano)
019e39c fix: [#405] add IdentitiesOnly=yes to default SSH options (Jose Celano)
f4c5e8f docs: [#405] add provision improvements document with deployer enhancement recommendations (Jose Celano)
010a053 docs: [#405] refine Problem 5 root cause with precise log-based evidence (Jose Celano)
3ffb1d4 docs: [#405] add debug-command-failure skill for investigating deployer errors (Jose Celano)
2034809 docs: [#405] refactor hetzner-demo-tracker docs into per-command subdirectories (Jose Celano)
61e24fe docs: [#405] configure environment and create for Hetzner demo tracker (Phase 2) (Jose Celano)
739f003 docs: [#405] document and complete prerequisites for Hetzner demo tracker (Jose Celano)
e5ad25c docs: [#405] create deployment journal directory structure for Hetzner demo tracker (Jose Celano)

Pull request description:

Closes #405

## Summary

Real-world deployment of a Torrust Tracker demo instance to Hetzner Cloud using the deployer tool, with full documentation of every step, decision, and problem encountered. The documentation will serve as both an internal reference and a source for a blog post on [torrust.com](https://torrust.com).

## Progress

- [x] Phase 1: Prerequisites documented
- [x] Phase 2: Environment created and configured
- [x] Phase 3.1: Infrastructure provisioned (Hetzner `ccx23` at `nbg1`)
- [x] Phase 3.2: Post-provision manual steps (DNS, volume, Hetzner backups)
- [x] Phase 3.3: Configure instance (Docker 28.2.2, Docker Compose v2.29.2)
- [x] Phase 3.4: Release application (all Docker images staged)
- [x] Phase 3.5: Run services (all 5 services healthy)
- [x] Phase 4: Verify and document (HTTP tracker, UDP tracker, API, Grafana, health check, MySQL, storage, backup)
- [x] Phase 5: Post-deployment maintenance
  - [x] Secrets rotation (all 7 secrets rotated)
  - [x] OS updates (59 packages incl. 37 security, reboot, all services healthy)
  - [x] Uptime monitoring documented (Hetzner has no native monitoring — external tools listed)
  - [x] Tracker registry — `udp1` submitted to newTrackon (2026-03-04)
- [x] Phase 6: Bug and improvement indexes
  - [x] [bugs.md](docs/deployments/hetzner-demo-tracker/bugs.md) — 11 bugs, 1 fixed
  - [x] [improvements.md](docs/deployments/hetzner-demo-tracker/improvements.md) — 13 recommendations

## What's in this PR

### Documentation (`docs/deployments/hetzner-demo-tracker/`)

New deployment journal with per-command subdirectories:

- **prerequisites.md** — Hetzner account, API token, SSH key setup, tool versions
- **deployment-spec.md** — Environment config decisions and sanitized config
- **commands/provision/** — Command walkthrough, problems (5), improvements (7), bugs, cleanup procedure
- **commands/configure/** — Docker installation walkthrough
- **commands/release/** — Image pull walkthrough
- **commands/run/** — Service start walkthrough, problems, improvements, bugs (3)
- **post-provision/** — DNS setup, volume setup, Hetzner backups
- **verify/** — Full verification index: HTTP tracker, UDP tracker, API, Grafana, health check, Docker services, MySQL, storage, backup
- **maintenance/**
  - `secrets-rotation.md` — Complete procedure for rotating all 7 secrets; records first rotation (2026-03-04)
  - `os-updates.md` — `apt` update procedure with log of first run (59 updates, reboot, all services healthy)
  - `uptime-monitoring.md` — Documents Hetzner monitoring gap vs DigitalOcean; lists external tools (UptimeRobot, Freshping, etc.)
- **tracker-registry.md** — newTrackon submission; explains why only `udp1` is listed (`udp2` kept quiet for production debugging)
- **bugs.md** — All 11 deployer bugs found, with severity, status, and links to full descriptions
- **improvements.md** — All 13 improvement recommendations, with links to full descriptions
- **observations.md** — Cross-cutting insights and deployer learnings

### Code fixes (`src/`, `templates/`)

- `fix: IdentitiesOnly=yes` added to default SSH options (`src/adapters/ssh/client.rs`)
- `fix: IdentitiesOnly=yes` added to Ansible `ssh_args` (`templates/ansible/ansible.cfg`)
- `feat:` SSH retry budget increased to 300s (60 × 5s)
- `feat:` Full SSH stderr logged in retry messages for easier diagnosis
- `fix:` `release` no longer hard-fails when `docker` is not in PATH inside Docker container

### New skill

- `.github/skills/usage/operations/debug-command-failure/skill.md`

## Service Endpoints

| Service        | URL                                               | Status     |
| -------------- | ------------------------------------------------- | ---------- |
| HTTP Tracker 1 | `https://http1.torrust-tracker-demo.com/announce` | ✅ Running |
| HTTP Tracker 2 | `https://http2.torrust-tracker-demo.com/announce` | ✅ Running |
| UDP Tracker 1  | `udp://udp1.torrust-tracker-demo.com:6969`        | ✅ Running |
| UDP Tracker 2  | `udp://udp2.torrust-tracker-demo.com:6868`        | ✅ Running |
| Tracker API    | `https://api.torrust-tracker-demo.com/api/v1`     | ✅ Running |
| Grafana        | `https://grafana.torrust-tracker-demo.com`        | ✅ Running |

## Bugs Found (11 total, 1 fixed)

| ID   | Command     | Description                                           | Severity | Status   |
| ---- | ----------- | ----------------------------------------------------- | -------- | -------- |
| B-01 | `create`    | Template binds to `0.0.0.0` (IPv4 only)               | Major    | 🔴 Open  |
| B-02 | `create`    | Template defaults to SQLite silently                  | Major    | 🔴 Open  |
| B-03 | `create`    | `instance_name: null` unexplained                     | Minor    | 🔴 Open  |
| B-04 | `provision` | SSH probe budget too short for Hetzner (120 s)        | Major    | 🔴 Open  |
| B-05 | `provision` | Passphrase-protected SSH keys fail silently in Docker | Major    | 🔴 Open  |
| B-06 | `provision` | UDP tracker domains missing from output               | Minor    | 🔴 Open  |
| B-07 | `release`   | Fails when `docker` not in PATH                       | High     | 🟢 Fixed |
| B-08 | `run`       | MySQL `"root"` username not rejected at creation time | High     | 🔴 Open  |
| B-09 | `run`       | MySQL root password silently diverges                 | Medium   | 🔴 Open  |
| B-10 | `run`       | MySQL password not URL-encoded in connection string   | High     | 🔴 Open  |
| B-11 | `test`      | DNS check false positives with floating IP            | Minor    | 🔴 Open  |

## Key learnings documented

1. **Passphrase-protected SSH keys fail silently in Docker** — no agent, no TTY, signing fails. Root cause of all provision failures. Fix: remove passphrase from deployment keys.
2. **`docker compose restart` does not re-read env vars** — must use `up -d --force-recreate` after rotating secrets in `.env`.
3. **MySQL password URL-encoding in `tracker.toml`** — `/` in password must be encoded as `%2F` in the connection string.
4. **Hetzner has no native uptime monitoring** — requires an external service such as UptimeRobot.
5. **UDP Tracker 2 kept off public tracker lists** — public registration causes constant announce noise that makes production log debugging impractical.

ACKs for top commit:
  josecelano: ACK 675b826

Tree-SHA512: 7698d2888be422336c210c57e60dd06d906d95fff48ae64bdf9b98ccd92fb6c04ee29c7e167cacdff273f1fbdff07c1c9e3ca04949b4876b3552617e582acc21
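Key learning 3 (bug B-10) is easy to reproduce in isolation. The sketch below shows how a password containing `/` could be percent-encoded before it is placed in a connection string. The `mysql://user:password@host:port/database` layout, the host name `mysql`, and the database name `torrust_tracker` are illustrative assumptions, not values taken from the deployer.

```python
from urllib.parse import quote


def encode_credential(value: str) -> str:
    """Percent-encode every reserved character, including '/' (safe="" disables the default exemption)."""
    return quote(value, safe="")


def mysql_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Build a connection URL with encoded credentials (URL layout is an assumption)."""
    return (
        f"mysql://{encode_credential(user)}:{encode_credential(password)}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    print(mysql_url("torrust", "p/w@1", "mysql", 3306, "torrust_tracker"))
```

Passing `safe=""` matters here: by default `quote` leaves `/` untouched, which is exactly the character the learning calls out.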
2 parents 2c078ba + 675b826 commit 29f5d17

70 files changed

Lines changed: 6267 additions & 164 deletions

Lines changed: 231 additions & 0 deletions
@@ -0,0 +1,231 @@
---
name: debug-command-failure
description: Guide for debugging and investigating deployer command failures. Covers reading error output, locating trace files, inspecting environment state, examining build artifacts, and running manual verification steps. Use when any deployer command (provision, configure, release, run, etc.) fails. Triggers on "command failed", "debug failure", "investigate error", "why did it fail", "trace", "deployer error", or "command error".
metadata:
  author: torrust
  version: "1.0"
---

# Debugging Deployer Command Failures

This skill walks through collecting and interpreting diagnostic information when any deployer
command fails.

## Investigation Layers (in order)

```text
1. Console error output    → immediate symptom + tip
2. Environment state       → data/{env-name}/environment.json
3. Trace log               → data/{env-name}/traces/{timestamp}-{command}.log
4. Build artifacts         → build/{env-name}/
5. Manual verification     → SSH, curl, provider console
```

Work top-to-bottom. Each layer provides richer context than the previous.

---

## Layer 1 — Console Error Output

A failed command prints:

```text
❌ <command> command failed: <error summary>
Tip: <actionable hint>
Tip: Check logs and try running with --log-output file-and-stderr for more details
```

Note the **error summary** and the **tip** lines. The summary often names the failed step and the
kind of error.
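
When console output has been captured to a file or a CI log, the summary and tips can be pulled out mechanically. The following is a minimal sketch that assumes the output follows exactly the three-line shape shown above; the function name and the sample text are hypothetical.

```python
import re

# Matches the first line of the failure banner shown in the skill.
FAILURE_RE = re.compile(r"^❌\s+(?P<command>.+?) command failed: (?P<summary>.*)$")


def parse_console_failure(output: str) -> dict:
    """Extract the failed command, the error summary, and all 'Tip:' lines."""
    result = {"command": None, "summary": None, "tips": []}
    for raw in output.splitlines():
        line = raw.strip()
        match = FAILURE_RE.match(line)
        if match:
            result["command"] = match.group("command")
            result["summary"] = match.group("summary")
        elif line.startswith("Tip:"):
            result["tips"].append(line[len("Tip:"):].strip())
    return result


if __name__ == "__main__":
    sample = (
        "❌ provision command failed: SSH connectivity failed\n"
        "Tip: Verify the server finished booting\n"
        "Tip: Check logs and try running with --log-output file-and-stderr for more details\n"
    )
    print(parse_console_failure(sample))
```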

---

## Layer 2 — Environment State

After any command failure, the deployer writes machine-readable state:

```text
data/{env-name}/environment.json
```

Key fields to inspect:

```json
{
  "state": {
    "context": {
      "failed_step": "WaitSshConnectivity",
      "error_kind": "NetworkConnectivity",
      "error_summary": "SSH connectivity failed: ...",
      "failed_at": "2026-03-03T15:33:32Z",
      "execution_started_at": "2026-03-03T15:30:00Z",
      "execution_duration": { "secs": 212, "nanos": 885591647 },
      "trace_id": "bcba0ee9-b2cf-4302-be0e-5ed04c665141",
      "trace_file_path": "./data/{env-name}/traces/20260303-153332-provision.log"
    }
  }
}
```

| Field                | What it tells you                                          |
| -------------------- | ---------------------------------------------------------- |
| `failed_step`        | Which internal step failed (maps to deployer source code)  |
| `error_kind`         | Category: `NetworkConnectivity`, `TemplateRendering`, etc. |
| `error_summary`      | Human-readable description of the error                    |
| `execution_duration` | How long the command ran before failing                    |
| `trace_file_path`    | Exact path to the full trace log                           |

```bash
# Quick inspection (pretty-print the whole file)
python3 -m json.tool data/{env-name}/environment.json
# or show only the failure context
jq '.state.context' data/{env-name}/environment.json
```
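
For scripted triage, the same fields can be read with a few lines of Python. This is a minimal sketch that assumes the `state.context` layout shown in the JSON sample above; the function name `summarize_failure` is hypothetical, and missing fields are reported as `None` rather than raising.

```python
import json


def summarize_failure(environment_json: str) -> dict:
    """Return the failure fields from environment.json text, with the duration in seconds."""
    context = json.loads(environment_json).get("state", {}).get("context", {})
    duration = context.get("execution_duration") or {}
    seconds = duration.get("secs", 0) + duration.get("nanos", 0) / 1e9
    return {
        "failed_step": context.get("failed_step"),
        "error_kind": context.get("error_kind"),
        "error_summary": context.get("error_summary"),
        "duration_seconds": round(seconds, 1),
        "trace_file_path": context.get("trace_file_path"),
    }


if __name__ == "__main__":
    import sys

    # Usage: python3 summarize.py data/{env-name}/environment.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(json.dumps(summarize_failure(handle.read()), indent=2))
```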

---

## Layer 3 — Trace Log

The trace log records every step, sub-step, and decision the deployer made:

```text
data/{env-name}/traces/{YYYYMMDD-HHMMSS}-{command}.log
```

The exact path is in `environment.json → state.context.trace_file_path`.

```bash
# Read the full log
cat data/{env-name}/traces/20260303-153332-provision.log

# Focus on errors and warnings
grep -E 'ERROR|WARN|failed|error' data/{env-name}/traces/20260303-153332-provision.log

# Show the last 50 lines (where failures are usually recorded)
tail -50 data/{env-name}/traces/20260303-153332-provision.log
```

The trace contains structured log lines with timestamps, log levels, and context fields. Look for
`ERROR` lines and the step names that precede them.
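
Because the step name usually appears a line or two before the error, it helps to keep some leading context with each match. The sketch below applies the same pattern as the `grep` command above and keeps the preceding lines; the exact trace line format is not specified here, so the sample lines are hypothetical.

```python
import re
from collections import deque

# Same alternation as the grep example in this section.
PATTERN = re.compile(r"ERROR|WARN|failed|error")


def errors_with_context(lines, before=2):
    """Return matching lines together with up to `before` lines that preceded them."""
    recent = deque(maxlen=before)
    hits = []
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            hits.append({"line": number, "text": line, "context": list(recent)})
        recent.append(line)
    return hits


if __name__ == "__main__":
    # Hypothetical trace lines, for illustration only.
    sample = [
        "INFO step=RenderOpenTofuTemplates done",
        "INFO step=WaitSshConnectivity attempt=60",
        "ERROR ssh connection refused",
    ]
    for hit in errors_with_context(sample):
        print(hit["line"], hit["text"], hit["context"])
```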

---

## Layer 4 — Build Artifacts

The `build/` directory holds rendered templates and intermediate files generated before
infrastructure is touched:

```text
build/{env-name}/
├── tofu/
│   └── hetzner/            (or lxd/)
│       ├── main.tf         # OpenTofu infrastructure definition
│       ├── cloud-init.yml  # cloud-init script run on first boot
│       └── *.tf            # Other Terraform/OpenTofu files
└── ansible/
    ├── inventory.ini       # Ansible inventory
    └── playbooks/          # Ansible playbooks
```

Common inspections:

```bash
# Verify SSH public key was correctly injected into cloud-init
grep -A3 'ssh_authorized_keys' build/{env-name}/tofu/hetzner/cloud-init.yml

# Compare with the actual public key
cat ~/.ssh/torrust_tracker_deployer_ed25519.pub

# Inspect the infrastructure definition
cat build/{env-name}/tofu/hetzner/main.tf
```

**Why this matters**: Build artifacts are generated from your config file without touching the
cloud provider. If the artifact is wrong, the root cause is in the environment config or a
template bug — not in the network or provider.
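
The visual comparison between `cloud-init.yml` and the `.pub` file can also be done programmatically. This is a minimal sketch that assumes the rendered file contains the key type and base64 body on a single line; the key comment (the trailing `user@host` part) is ignored because it may legitimately differ. The function names and the sample key are hypothetical.

```python
def key_material(public_key_line: str) -> tuple:
    """Split an OpenSSH public key line into (key type, base64 body), dropping the comment."""
    parts = public_key_line.split()
    if len(parts) < 2:
        raise ValueError("not an OpenSSH public key line")
    return parts[0], parts[1]


def cloud_init_has_key(cloud_init_text: str, public_key_line: str) -> bool:
    """True if some line of the rendered cloud-init file carries the same key type and body."""
    key_type, body = key_material(public_key_line)
    return any(
        key_type in line and body in line
        for line in cloud_init_text.splitlines()
    )


if __name__ == "__main__":
    # Hypothetical key and rendered snippet, for illustration only.
    pub = "ssh-ed25519 AAAAC3NzaEXAMPLEKEYBODY deployer@laptop"
    rendered = "ssh_authorized_keys:\n  - ssh-ed25519 AAAAC3NzaEXAMPLEKEYBODY\n"
    print(cloud_init_has_key(rendered, pub))
```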

---

## Layer 5 — Manual Verification

When the deployer fails but the cloud resource appears to be up, verify the resource directly.

### SSH connectivity

```bash
# Test SSH manually with verbose output (-v shows handshake details)
ssh -v -i ~/.ssh/torrust_tracker_deployer_ed25519 torrust@{server-ip} "whoami && cloud-init status"
```

A successful response looks like:

```text
torrust
status: done
```

If `cloud-init status` returns `status: running`, cloud-init is still executing — wait and retry.
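
The "wait and retry" advice can be expressed as a bounded polling loop. The sketch below takes the status lookup as a callable so it stays independent of how the SSH command is run; the defaults of 60 attempts at a 5 second interval mirror the 300 s retry budget mentioned in this PR, and the function name is hypothetical.

```python
import time


def wait_for_cloud_init(get_status, attempts=60, interval=5.0, sleep=time.sleep):
    """Poll until cloud-init reports done.

    Returns the attempt number on success, or None when the budget is exhausted.
    Any status other than 'running' or 'done' is treated as a failure.
    """
    for attempt in range(1, attempts + 1):
        status = get_status().strip()
        if status == "status: done":
            return attempt
        if status != "status: running":
            raise RuntimeError(f"unexpected cloud-init status: {status!r}")
        if attempt < attempts:
            sleep(interval)
    return None


if __name__ == "__main__":
    # Simulated status sequence; a real caller would run `cloud-init status` over SSH.
    statuses = iter(["status: running", "status: running", "status: done"])
    print(wait_for_cloud_init(lambda: next(statuses), sleep=lambda _: None))
```

Injecting `sleep` keeps the loop testable without real delays.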

### Cloud-init timing

```bash
# Check cloud-init completion and timing
ssh -i ~/.ssh/torrust_tracker_deployer_ed25519 torrust@{server-ip} \
  "cloud-init status --long && sudo journalctl -u ssh --since '5 minutes ago' | tail -20"
```

**Note**: If the clock timestamp shows `1970-01-01`, the system clock was not yet NTP-synced when
cloud-init completed — this is normal and does not indicate a failure.

### Port availability

```bash
# Check if SSH port is open (fails fast with "connection refused" if nothing is listening;
# hangs until the timeout if a firewall silently drops the packets)
nc -zv {server-ip} 22

# Check the tracker port. Port 6969 is UDP Tracker 1 in the Service Endpoints table of this PR,
# so probe with UDP (-u). A UDP probe is only a hint: it reports success whenever no ICMP
# rejection comes back.
nc -zvu {server-ip} 6969
```
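
Where `nc` is not installed, a TCP reachability check is a few lines with the standard library. This sketch covers TCP only (for example the SSH port); it does not attempt a UDP probe, and the function name is hypothetical.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    import sys

    # Usage: python3 port_check.py {server-ip} 22
    host, port = sys.argv[1], int(sys.argv[2])
    print("open" if tcp_port_open(host, port) else "closed or filtered")
```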

---

## Common Error Patterns

| `failed_step`             | `error_kind`          | Likely Cause                                                         |
| ------------------------- | --------------------- | -------------------------------------------------------------------- |
| `RenderOpenTofuTemplates` | `TemplateRendering`   | SSH key path not found — check container vs host path in config      |
| `WaitSshConnectivity`     | `NetworkConnectivity` | Server SSH not ready within timeout — server may need more boot time |
| `RunAnsiblePlaybook`      | `Ansible`             | SSH key rejected or host unreachable — verify `~/.ssh/known_hosts`   |
| `CreateServer`            | `ProviderApi`         | API token invalid or quota exceeded — check Hetzner console          |
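
The table above can be used as a lookup keyed on the two fields from `state.context`. The following sketch simply transcribes the four rows; it is not part of the deployer, and an unknown combination returns `None` so the caller falls back to the trace log.

```python
# Rows transcribed from the "Common Error Patterns" table, keyed by (failed_step, error_kind).
LIKELY_CAUSES = {
    ("RenderOpenTofuTemplates", "TemplateRendering"):
        "SSH key path not found; check container vs host path in config",
    ("WaitSshConnectivity", "NetworkConnectivity"):
        "Server SSH not ready within timeout; server may need more boot time",
    ("RunAnsiblePlaybook", "Ansible"):
        "SSH key rejected or host unreachable; verify ~/.ssh/known_hosts",
    ("CreateServer", "ProviderApi"):
        "API token invalid or quota exceeded; check Hetzner console",
}


def likely_cause(context: dict):
    """Map a state.context dict to the likely cause, or None if the pair is not in the table."""
    key = (context.get("failed_step"), context.get("error_kind"))
    return LIKELY_CAUSES.get(key)


if __name__ == "__main__":
    print(likely_cause({"failed_step": "CreateServer", "error_kind": "ProviderApi"}))
```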

---

## After Investigation

Once the root cause is identified, the recovery path depends on how far the command progressed:

- **Failed before any cloud resources were created** (e.g., `TemplateRendering`): fix the config,
  `purge --force`, `create environment`, retry command.

- **Failed after cloud resources were created** (e.g., `WaitSshConnectivity`): the deployer state
  is `ProvisionFailed` or `ConfigureFailed`. Resources exist in the cloud. Must `destroy` to clean
  up both cloud resources and local state, then `create environment` and retry.

```bash
# Destroy cloud resources + local state
# ("$(pwd)" is quoted so paths containing spaces still mount correctly)
docker run --rm \
  -v "$(pwd)/data:/var/lib/torrust/deployer/data" \
  -v "$(pwd)/build:/var/lib/torrust/deployer/build" \
  -v "$(pwd)/envs:/var/lib/torrust/deployer/envs" \
  -v ~/.ssh:/home/deployer/.ssh:ro \
  torrust/tracker-deployer:latest \
  destroy {env-name}

# Recreate local environment
docker run --rm \
  -v "$(pwd)/data:/var/lib/torrust/deployer/data" \
  -v "$(pwd)/build:/var/lib/torrust/deployer/build" \
  -v "$(pwd)/envs:/var/lib/torrust/deployer/envs" \
  torrust/tracker-deployer:latest \
  create environment --env-file envs/{env-name}.json
```
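
The two recovery paths above reduce to a single decision: did the failed command get far enough to create cloud resources? A minimal sketch of that decision follows. The step strings are copied from this section as written (placeholders such as `{env-name}` are left unfilled), and the function name is hypothetical.

```python
def recovery_steps(cloud_resources_created: bool) -> list:
    """Return the ordered recovery steps described in 'After Investigation'."""
    if cloud_resources_created:
        # State is ProvisionFailed or ConfigureFailed: resources exist and must be destroyed.
        return [
            "destroy {env-name}",
            "create environment --env-file envs/{env-name}.json",
            "retry the failed command",
        ]
    # Nothing was created in the cloud (e.g. TemplateRendering): only local state to reset.
    return [
        "fix the config",
        "purge --force",
        "create environment",
        "retry the failed command",
    ]


if __name__ == "__main__":
    for step in recovery_steps(cloud_resources_created=True):
        print(step)
```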

AGENTS.md

Lines changed: 1 addition & 0 deletions
@@ -179,6 +179,7 @@ Available skills:
 | Creating issues | `.github/skills/dev/planning/create-issue/skill.md` |
 | Creating new skills | `.github/skills/add-new-skill/skill.md` |
 | Creating refactor plans | `.github/skills/dev/planning/create-refactor-plan/skill.md` |
+| Debugging command failures | `.github/skills/usage/operations/debug-command-failure/skill.md` |
 | Debugging test errors | `.github/skills/dev/testing/debug-test-errors/skill.md` |
 | Handling errors in code | `.github/skills/dev/rust-code-quality/handle-errors-in-code/skill.md` |
 | Handling secrets | `.github/skills/dev/rust-code-quality/handle-secrets/skill.md` |
Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
# Deployment Journal: Hetzner Demo Tracker

**Issue**: [#405](https://github.com/torrust/torrust-tracker-deployer/issues/405)
**Date started**: 2026-03-03
**Domain**: `torrust-tracker-demo.com`
**Provider**: Hetzner Cloud

## Purpose

Deploy a public Torrust Tracker demo instance to Hetzner Cloud and document every step of the process. This journal will serve as the source material for a blog post on [torrust.com](https://torrust.com).

## Table of Contents

1. [Prerequisites](prerequisites.md) — Account setup, tools, SSH keys
2. [Deployment Specification](deployment-spec.md) — What we want to deploy: config decisions,
   endpoints, sanitized config
3. Deployment commands — step-by-step per deployer command:
   - [create](commands/create/README.md) — generate template, validate, create environment
   - [provision](commands/provision/README.md) — create the Hetzner VM
   - [configure](commands/configure/README.md) — install Docker and Docker Compose on the server
   - [release](commands/release/README.md) — pull and stage Docker images
   - [run](commands/run/README.md) — start all services
4. Post-provision manual steps (done once, before `configure`):
   - [DNS setup](post-provision/dns-setup.md) — assign floating IPs, create DNS records, verify
   - [Volume setup](post-provision/volume-setup.md) — create and mount Hetzner volume for storage
   - [Hetzner Backups](post-provision/hetzner-backups.md) — enable automated server backups (can be done any time after provisioning)
5. [Service Verification](verify/README.md) — verifying all services after deployment:
   - [HTTP Tracker](verify/http-tracker.md)
   - [UDP Tracker](verify/udp-tracker.md)
   - [Tracker API](verify/api.md)
   - [Grafana](verify/grafana.md)
   - [Health Check](verify/health-check.md)
   - [Docker Services](verify/docker-services.md)
   - [MySQL Database](verify/mysql.md)
   - [Storage Volume](verify/storage.md)
   - [Backup](verify/backup.md)
6. Problems — issues encountered, per command:
   - [create problems](commands/create/problems.md)
   - [provision problems](commands/provision/problems.md)
7. Improvements — recommended deployer improvements found during this deployment:
   - [provision improvements](commands/provision/improvements.md)
8. [Observations](observations.md) — cross-cutting insights and learnings about the deployer
9. [Maintenance](maintenance/README.md) — post-deployment operational tasks:
   - [Secrets rotation](maintenance/secrets-rotation.md) — rotate all secrets after AI-assisted deployment
10. [Tracker Registry](tracker-registry.md) — submit the tracker to public registries (newTrackon)
11. [Bugs](bugs.md) — all deployer bugs discovered during this deployment (11 bugs, 1 fixed)
12. [Improvements](improvements.md) — all improvement recommendations collected in one place (13 items)

## Deployment

> This section will be filled in as we execute each deployment phase.

### Phase 1: Setup and Prerequisites

See [prerequisites.md](prerequisites.md) for the complete checklist.

### Phase 2: Create and Configure Environment

See [deployment-spec.md](deployment-spec.md) for config decisions and the sanitized config.
See [commands/create/README.md](commands/create/README.md) for running the `create template`, `validate`, and
`create environment` commands.

### Phase 3: Provision Infrastructure

See [commands/provision/README.md](commands/provision/README.md) for running the `provision` command and server
details.

### Phase 3.5: Post-Provision Setup

Manual steps done once after provisioning, required before `configure`:

1. [DNS setup](post-provision/dns-setup.md) — assign floating IPs to the server and create DNS
   records for all six domains.
2. [Volume setup](post-provision/volume-setup.md) — create a 50 GB Hetzner volume and mount it
   at `/opt/torrust/storage` so persistent data lives on a separate disk.
3. [Hetzner Backups](post-provision/hetzner-backups.md) — enable automated daily server backups
   via the Hetzner Console (can be done at any time after provisioning).

See [post-provision/README.md](post-provision/README.md) for the full overview.

### Phase 4: Configure Instance

See [commands/configure/README.md](commands/configure/README.md) for running the `configure`
command. Installs Docker 28.2.2 and Docker Compose v2.29.2.

### Phase 5: Release Application

See [commands/release/README.md](commands/release/README.md) for running the `release`
command. Pulled and staged all Docker images (~134 s, state=`Released`).

### Phase 6: Run Services

See [commands/run/README.md](commands/run/README.md) for running the `run`
command. All services started successfully (state=`Running`).

### Phase 7: Verify Deployment

See [verify/README.md](verify/README.md) for the full verification index.
All 9 services verified — HTTP tracker, UDP tracker, Tracker API, Grafana,
health check, Docker services, MySQL database, storage volume, and backup.
Verification included end-to-end announce tests using the Torrust reference
client (`http_tracker_client` and `udp_tracker_client`).

## Service Endpoints

> Will be filled after deployment.

| Service        | URL                                               | Status     |
| -------------- | ------------------------------------------------- | ---------- |
| HTTP Tracker 1 | `https://http1.torrust-tracker-demo.com/announce` | ✅ Running |
| HTTP Tracker 2 | `https://http2.torrust-tracker-demo.com/announce` | ✅ Running |
| UDP Tracker 1  | `udp://udp1.torrust-tracker-demo.com:6969`        | ✅ Running |
| UDP Tracker 2  | `udp://udp2.torrust-tracker-demo.com:6868`        | ✅ Running |
| Tracker API    | `https://api.torrust-tracker-demo.com/api/v1`     | ✅ Running |
| Health Check   | `http://127.0.0.1:1313/health_check` (internal)   | ✅ Running |
| Grafana        | `https://grafana.torrust-tracker-demo.com`        | ✅ Running |

## Cost

> Will be documented after choosing server type.

| Resource | Monthly Cost (EUR) |
| -------- | ------------------ |
| Server   | TBD                |
| Total    | TBD                |
