Commit a43c3dd

fix(wait-for-grafana): remove fail-fast from Phase 1
Any curl error during startup is a transient condition for this action's use case (localhost Grafana container). Enumerating specific exit codes to fail-fast on is fragile — exit 56 broke a real CI run immediately after #213 merged. Simplify Phase 1: keep waiting on any curl error until startup_timeout expires or a non-000 response is received. Only Phase 2 fails fast (on 4xx), where a bad response genuinely indicates misconfiguration.
1 parent df34a21 commit a43c3dd

1 file changed

Lines changed: 9 additions & 18 deletions

wait-for-grafana/wait-for-grafana.sh

@@ -12,42 +12,33 @@ echo "Startup timeout (TCP bind): $startup_timeout seconds"
 echo "Health timeout (after bind): $timeout seconds"
 echo "Interval: $interval seconds"
 
-# Phase 1: wait for TCP port to bind.
-# Fail fast only on codes that indicate misconfiguration and will never self-resolve:
-# exit 3 = URL malformed, exit 6 = could not resolve host.
-# All other non-zero exits are transient startup conditions — keep waiting:
-# exit 7 = ECONNREFUSED (port not yet bound)
-# exit 52 = got nothing (port open but server not yet responding)
-# exit 56 = recv error (connection accepted then reset during startup)
+# Phase 1: wait for the server to return any HTTP status.
+# Any curl error is treated as a transient startup condition — keep waiting.
+# This covers ECONNREFUSED, recv errors, connection resets, and any other
+# transient state that can occur while the process is starting up.
 startup_end=$((SECONDS + startup_timeout))
 port_bound=false
 
 while [ $SECONDS -lt $startup_end ]; do
   response=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 2 "$url")
-  curl_exit=$?
-
-  if [ $curl_exit -eq 3 ] || [ $curl_exit -eq 6 ]; then
-    echo "curl failed with exit code $curl_exit — misconfiguration error, failing fast"
-    exit 1
-  fi
 
   if [ "$response" != "000" ]; then
     port_bound=true
     break
   fi
 
-  echo "Waiting for TCP bind (curl exit: $curl_exit). Current status: $response"
+  echo "Waiting for server to start. Current status: $response"
   sleep 5
 done
 
 if [ "$port_bound" = false ]; then
-  echo "Startup timeout reached. Server TCP port did not bind within $startup_timeout seconds"
+  echo "Startup timeout reached. Server did not respond within $startup_timeout seconds"
   exit 1
 fi
 
-echo "TCP port bound. Waiting for server to respond with status code $expected_response_code..."
+echo "Server is responding. Waiting for status code $expected_response_code..."
 
-# Phase 2: port is open, wait for a healthy response.
+# Phase 2: server is up, wait for a healthy response.
 # --connect-timeout and --max-time bound each curl call so a stalled connection
 # cannot outlast the health window.
 # Fail fast on 4xx — indicates a URL misconfiguration, not a timing issue.
@@ -70,5 +61,5 @@ while [ $SECONDS -lt $health_end ]; do
   sleep "$interval"
 done
 
-echo "Timeout reached. Server did not respond with status code $expected_response_code within $timeout seconds after TCP bind"
+echo "Timeout reached. Server did not respond with status code $expected_response_code within $timeout seconds"
 exit 1
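
The Phase 2 loop body is not part of this diff, so the following is only a minimal sketch of the two decisions the commit message describes, not the script's actual code. The function names `server_is_responding` and `classify_health_response` are hypothetical, and the sketch assumes that a match with the expected status code takes precedence over the 4xx check.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; these names do not appear in
# wait-for-grafana.sh.

# Phase 1 decision: curl prints 000 for %{http_code} when no HTTP response
# was received, so any other value means the server has answered.
# The curl exit code is deliberately not consulted.
server_is_responding() {
  [ "$1" != "000" ]
}

# Phase 2 decision: prints "healthy", "fail_fast", or "retry".
#   $1 = status code returned by curl
#   $2 = expected status code
classify_health_response() {
  response=$1
  expected=$2

  # Assumption: an exact match wins, even if the expected code is a 4xx.
  if [ "$response" = "$expected" ]; then
    echo "healthy"
    return
  fi

  case "$response" in
    4[0-9][0-9]) echo "fail_fast" ;;  # likely misconfiguration, will not self-resolve
    *)           echo "retry" ;;      # 000, 5xx, redirects, etc.: keep polling
  esac
}
```

With these definitions, `classify_health_response 404 200` prints `fail_fast` and `classify_health_response 503 200` prints `retry`, while `server_is_responding 000` returns a non-zero status regardless of which curl error produced the `000`.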
