Add max_retries and retry_interval to [(api_)database] conf

melwitt · melwitt · commit b8071eb0dce6 · 2026-06-16T04:51:20.000-07:00
Nova's default database connection retry settings (max_retries=10,
retry_interval=10) mean nova-api can spend up to 100 seconds retrying a
connection to an unavailable database. This exceeds the Kubernetes
startup probe window (60 seconds), so the pod gets killed before nova
finishes retrying, leading to a CrashLoopBackOff when a cell database
is down.

In RHOSO, Kubernetes provides its own higher-level retry mechanism
by killing and recreating pods that fail to start. This is
preferable to Nova retrying internally because it reports the
situation clearly via CR status fields and events, and allows
Kubernetes to reschedule the pod to another worker if needed.

Set max_retries to 3 and retry_interval to 1 second for both
[database] and [api_database] so that Nova gives up on an
unreachable database quickly and lets Kubernetes handle the
recovery.
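
As a rough check of the numbers above, here is a minimal sketch of the
worst-case retry budget. It assumes the budget is simply
max_retries * retry_interval and ignores the time each connection
attempt itself takes. The helper name and the constant are
illustrative only and are not part of Nova or oslo.db.

```python
# Hypothetical helper, for illustration only (not a Nova/oslo.db API).
def worst_case_retry_seconds(max_retries: int, retry_interval: int) -> int:
    """Upper bound on seconds spent waiting between connection attempts."""
    return max_retries * retry_interval


# Startup probe window quoted in this commit message, in seconds.
STARTUP_PROBE_WINDOW = 60

old_budget = worst_case_retry_seconds(max_retries=10, retry_interval=10)
new_budget = worst_case_retry_seconds(max_retries=3, retry_interval=1)

# Old defaults outlast the probe window; the new values fit well within it.
print(old_budget, old_budget > STARTUP_PROBE_WINDOW)
print(new_budget, new_budget < STARTUP_PROBE_WINDOW)
```

Under these assumptions the old settings wait about 100 seconds, which
is longer than the probe window, while the new settings wait about 3
seconds before Nova gives up.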

Resolves: OSPRH-30130
Signed-off-by: melanie witt <melwittt@gmail.com>
diff --git a/templates/nova/nova.conf b/templates/nova/nova.conf
@@ -241,12 +241,16 @@ cpu_power_management=false
 
 {{if (index . "cell_db_address")}}
 [database]
+max_retries = 3
+retry_interval = 1
 connection = mysql+pymysql://{{ .cell_db_user }}:{{ .cell_db_password}}@{{ .cell_db_address }}/{{ .cell_db_name }}?read_default_file=/etc/my.cnf
 {{end}}
 
 
 {{if (index . "api_db_address")}}
 [api_database]
+max_retries = 3
+retry_interval = 1
 connection = mysql+pymysql://{{ .api_db_user }}:{{ .api_db_password }}@{{ .api_db_address }}/{{ .api_db_name }}?read_default_file=/etc/my.cnf
 {{end}}