- ADR: ADR 0040
- Title: Docker runtime container telemetry
- Status: live_applied
- Branch: `codex/adr-0040-runtime-container-telemetry`
- Worktree: `../proxmox-host_server-runtime-container-telemetry`
- Owner: codex
- Depends On: ADR 0011
- Conflicts With: none
- Shared Surfaces: `docker-runtime`, `playbooks/monitoring-stack.yml`, `roles/docker_runtime_observability`, managed Grafana dashboards
- collect Docker container metrics from `docker-runtime` through Telegraf's Docker input plugin
- ship container telemetry into the existing InfluxDB bucket on `monitoring`
- extend the managed `LV3 docker-runtime Detail` dashboard with container-level panels and a runtime snapshot
- document the convergence and verification path for operators
- changing the Docker runtime VM software baseline from ADR 0023
- changing public publication for runtime-hosted services
- adding alert rules or notification routing for container health
- changing protected integration files such as `VERSION`, `changelog.md`, `README.md`, or `versions/stack.yaml` on the workstream branch
- `playbooks/monitoring-stack.yml`
- `roles/docker_runtime_observability/`
- `roles/monitoring_vm/templates/_grafana_dashboard_macros.j2`
- `roles/monitoring_vm/templates/lv3-vm-detail.json.j2`
- `inventory/host_vars/proxmox-host.yml`
- `docs/runbooks/monitoring-stack.md`
- `docs/repository-map.md`
- `docs/adr/0040-docker-runtime-container-telemetry-via-telegraf-docker-input.md`
- `workstreams.yaml`
- `docker-runtime` runs `telegraf` with Docker socket access
- InfluxDB receives `docker_container_*` measurements from `docker-runtime`
- `LV3 docker-runtime Detail` shows container-level runtime data
- `make syntax-check-monitoring`
- `ansible -i /Users/live/Documents/GITHUB_PROJECTS/proxmox-host_server/inventory/hosts.yml docker-runtime -m shell -a 'systemctl is-active telegraf && id -nG telegraf' --private-key /Users/live/Documents/GITHUB_PROJECTS/proxmox-host_server/.local/ssh/hetzner_llm_agents_ed25519 -e proxmox_guest_ssh_connection_mode=proxmox_host_jump`
- `ssh -i /Users/live/Documents/GITHUB_PROJECTS/proxmox-host_server/.local/ssh/hetzner_llm_agents_ed25519 -o IdentitiesOnly=yes -J ops@100.118.189.95 ops@10.10.10.40 'sudo influx query --host http://127.0.0.1:8086 --org lv3 --token "$(sudo cat /etc/lv3/monitoring/influxdb-operator.token)" '\''from(bucket: "proxmox") |> range(start: -15m) |> filter(fn: (r) => r.host == "docker-runtime" and (r._measurement == "docker_container_status" or r._measurement == "docker_container_cpu" or r._measurement == "docker_container_mem" or r._measurement == "docker_container_net" or r._measurement == "docker_container_health")) |> limit(n: 20)'\'''`
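
The Flux query in the last verification command is long enough that a typo in one measurement name is easy to miss. As a minimal sketch, the same query text can be assembled from a list of measurement names. The helper name `build_container_query` and its parameters are hypothetical and not part of the repository; the bucket, host, window, limit, and measurement names are taken from the command above.

```python
def build_container_query(bucket, host, measurements, window="-15m", limit=20):
    """Return a Flux query selecting the given measurements for one host.

    Hypothetical helper for illustration; not part of the repository.
    """
    if not measurements:
        raise ValueError("at least one measurement is required")
    # OR together one equality test per measurement name.
    measurement_filter = " or ".join(
        f'r._measurement == "{name}"' for name in measurements
    )
    return (
        f'from(bucket: "{bucket}") '
        f"|> range(start: {window}) "
        f'|> filter(fn: (r) => r.host == "{host}" and ({measurement_filter})) '
        f"|> limit(n: {limit})"
    )


# Measurement names copied from the verification command.
MEASUREMENTS = [
    "docker_container_status",
    "docker_container_cpu",
    "docker_container_mem",
    "docker_container_net",
    "docker_container_health",
]

query = build_container_query("proxmox", "docker-runtime", MEASUREMENTS)
print(query)
```

The printed string is meant to be passed as the single query argument to `influx query`; shell quoting around it still has to be handled by the caller.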
- runtime telemetry convergence is idempotent
- the managed runtime detail dashboard is provisioned from repo state
- the workstream registry and this document are current
- protected integration files are reconciled only during integration on `main`
- Live apply completed on `2026-03-22` through `make converge-monitoring`.
- Verification confirmed `telegraf` is active on `docker-runtime`, InfluxDB is receiving `docker_container_*` rows for `uptime-kuma`, and the runtime detail dashboard now contains `16` panels.
- The first live rerun failed because `roles/monitoring_vm/tasks/main.yml` mixed `Restart Grafana` and `restart grafana`; commit `aad25fe` fixed the handler names before the final idempotent rerun.
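
The handler-name failure in the last note could plausibly be caught before a live run. The sketch below assumes the failure came from a `notify` string that did not exactly match a handler name, and that the notified names and handler names have already been extracted from the role into plain lists. Both function names are hypothetical, and which of the two spellings was the handler is an assumption made only for the example.

```python
def unmatched_notifications(notified, handlers):
    """Return notified names that have no exactly matching handler name."""
    known = set(handlers)
    return sorted({name for name in notified if name not in known})


def case_only_mismatches(notified, handlers):
    """Map each unmatched notification to handlers that differ only by case."""
    by_folded = {}
    for handler in handlers:
        by_folded.setdefault(handler.casefold(), []).append(handler)
    return {
        name: by_folded[name.casefold()]
        for name in unmatched_notifications(notified, handlers)
        if name.casefold() in by_folded
    }


# Assumed example data: one handler, notified under two spellings.
handlers = ["Restart Grafana"]
notified = ["Restart Grafana", "restart grafana"]

print(unmatched_notifications(notified, handlers))  # ['restart grafana']
print(case_only_mismatches(notified, handlers))     # {'restart grafana': ['Restart Grafana']}
```

A check like this only compares strings; it does not load the role or resolve variables inside handler names, so it would complement rather than replace `make syntax-check-monitoring`.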