feat(issue-21): scale up server CCX23 → CCX33 for better UDP uptime #22
Draft
josecelano wants to merge 7 commits into main
Documents the full resize workflow: pre-resize baseline capture, graceful shutdown, Hetzner panel action by human operator, post-resize recovery and validation, evidence capture, and 7-day observation period. Notes the key behaviour that Hetzner in-place resizes preserve all IP addresses (public, private, and Floating IPs), so no DNS or IP reassignment is needed.

Refs: #21
The default conntrack table (262144 entries) fills up under sustained UDP tracker load, causing "nf_conntrack: table full, dropping packet" kernel errors and intermittent UDP timeouts on uptime monitors.

Applied kernel tunables:
- nf_conntrack_max: 262144 → 1048576 (4x increase)
- nf_conntrack_udp_timeout_stream: 120 s → 15 s (8x reduction)
- nf_conntrack_udp_timeout: 30 s → 10 s

Added /etc/modules-load.d/conntrack.conf to pre-load the nf_conntrack module at boot so sysctl settings are applied before Docker starts. Without this, net.netfilter.* keys don't exist when sysctl runs and the settings are silently skipped after a reboot.

Refs: #21
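To watch for the table-full condition described above, conntrack table usage can be read from procfs. The following is a minimal sketch, not part of this PR: the helper names (`conntrack_utilization`, `read_int`, `check`) and the 80% warning threshold are assumptions chosen for illustration. It also reports the case where the `net.netfilter.*` keys are absent because the `nf_conntrack` module is not loaded.

```python
from pathlib import Path

# procfs files exposed by the kernel once the nf_conntrack module is loaded.
COUNT_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_count")
MAX_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_max")


def conntrack_utilization(count: int, maximum: int) -> float:
    """Return the fraction of the conntrack table currently in use."""
    if maximum <= 0:
        raise ValueError("nf_conntrack_max must be positive")
    return count / maximum


def read_int(path: Path) -> int:
    """Read a single integer value from a procfs file."""
    return int(path.read_text().strip())


def check(warn_at: float = 0.8) -> str:
    """Hypothetical helper: summarize table usage, flagging it at or above warn_at."""
    if not MAX_PATH.exists():
        # Same symptom as the silently skipped sysctl settings: keys are missing.
        return "nf_conntrack not loaded: net.netfilter.* keys are absent"
    count, maximum = read_int(COUNT_PATH), read_int(MAX_PATH)
    used = conntrack_utilization(count, maximum)
    status = "WARN" if used >= warn_at else "ok"
    return f"{status}: {count}/{maximum} entries ({used:.1%})"


if __name__ == "__main__":
    print(check())
```

Running this periodically (for example from cron) would give an early signal before the kernel starts dropping packets; the threshold is arbitrary and would need tuning against real traffic.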
Fill in the D+1 row (2026-04-20) in the daily checks log:
- HTTP: ~1564 req/s, UDP: ~1015 req/s, total ~2579 req/s (~322/vCPU)
- Host load: 6.05/5.49/4.80
- UDP newTrackon uptime: 83.9% (includes resize downtime + conntrack overflow period; fix applied same day)

Update the pre/post comparison table with available metrics and mark the decision as "partial": resize alone was insufficient, conntrack overflow was the actual bottleneck. Follow-up plan added.

Refs: #21
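The total and per-vCPU figures in the D+1 row follow from simple arithmetic. As a quick check, here is a small sketch using the approximate rates quoted in the commit message and the 8 vCPUs of the CCX33 stated in the PR summary; the function name `per_vcpu` is an assumption for illustration.

```python
def per_vcpu(total_req_s: float, vcpus: int) -> float:
    """Return the request rate carried by each vCPU."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return total_req_s / vcpus


# Approximate D+1 values from the daily checks log.
http_req_s = 1564
udp_req_s = 1015
ccx33_vcpus = 8  # CCX33 size as stated in the PR summary

total_req_s = http_req_s + udp_req_s
print(f"total: ~{total_req_s} req/s")
print(f"per vCPU: ~{per_vcpu(total_req_s, ccx33_vcpus):.0f} req/s")
```

This reproduces the ~2579 req/s total and ~322 req/s per vCPU recorded in the log.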
Summary
Scales the Hetzner server from CCX23 (4 vCPU, 16 GB RAM) to CCX33 (8 vCPU, 32 GB RAM) to address the UDP uptime issues tracked in #19. This PR contains the full evidence trail for the resize experiment, including pre-resize baseline, execution log, and daily observation templates.
Changes
- docs/issues/ISSUE-21-scale-up-server-for-udp-uptime.md: issue spec with acceptance criteria
- docs/infrastructure-resize-history.md: new file tracking server resize events with req/s load
- docs/infrastructure.md: updated with traffic, price, and resize history link
- docs/issues/evidence/ISSUE-21/00-pre-resize-baseline.md: measured Prometheus values before resize (HTTP ~1350 req/s, UDP ~1507 req/s)
- docs/issues/evidence/ISSUE-21/01-resize-execution.md: full resize execution log with commands, outputs, and external health checks
- docs/issues/evidence/ISSUE-21/02-post-resize-daily-checks.md: 7-day daily observation template (to be filled over the next 7 days)
- docs/issues/evidence/ISSUE-21/03-pre-post-comparison.md: final comparison template (to be filled after observation window)
Resize Summary
Observation Window
The resize was executed on 2026-04-13. This PR will be merged after a 7-day observation window ending around 2026-04-20, once daily checks have been collected and the pre/post comparison is complete.
Acceptance Criteria
03-pre-post-comparison.md
Refs: #21