Commit 1388cec

fix(wait-for-hydra): add periodic re-checks during SSE stream
CDN proxies like Cloudflare may buffer SSE events, preventing real-time delivery of status updates to the client. This caused the script to hang indefinitely on the SSE stream even after the build completed.

Three fixes:

1. Periodic one-shot re-check: every 120s, poll the bridge's cached /status endpoint directly. This catches status changes even when SSE events are buffered or lost.
2. Reserve time for fallback: set curl --max-time to TIMEOUT - 120s so there's always 2 minutes left for one-shot re-check + polling.
3. Final one-shot re-check after SSE stream ends, before entering the polling fallback loop.
1 parent afaa453 commit 1388cec
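
The time budget described in fix 2 can be sketched in isolation. This is a minimal illustration of the arithmetic in the diff, not the script's actual structure: the function name `compute_sse_max_time` and its two arguments are hypothetical, and the constants (120s reserve, 60s floor, 24h cap) are taken from the changed lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the budget arithmetic from the diff.
# $1: total timeout in seconds (0 means "no timeout")
# $2: seconds already elapsed (the script itself uses bash's SECONDS)
compute_sse_max_time() {
  local timeout="$1" elapsed="$2"
  local reserve=120   # kept back for the one-shot re-check and polling fallback
  local floor=60      # minimum time granted to the SSE stream
  local budget
  if [ "$timeout" -gt 0 ]; then
    budget=$((timeout - elapsed - reserve))
    if [ "$budget" -lt "$floor" ]; then
      budget="$floor"
    fi
  else
    budget=86400      # no timeout configured: cap at 24h, as in the diff
  fi
  echo "$budget"
}

compute_sse_max_time 600 30   # prints 450
compute_sse_max_time 150 0    # prints 60 (the floor applies)
compute_sse_max_time 0 0      # prints 86400
```

Note that when the floor applies, the full 2-minute reserve is no longer available: with a 150s timeout, the stream gets 60s and only 90s remain for the fallback.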

1 file changed

Lines changed: 38 additions & 2 deletions


wait-for-hydra/support/wait.sh

@@ -114,9 +114,45 @@ sse_wait() {
   # when the SSE stream is idle (Cloudflare buffering, no new events).
   # Without this, read blocks indefinitely and the script's TIMEOUT is
   # never enforced. In bash, read -t returns >128 on timeout, 1 on EOF.
-  echo "SSE: Connecting to ${base_url}/events (filtering for '$HYDRA_JOB')"
+  # Cap the SSE stream time to leave room for one-shot re-check and
+  # polling fallback if the stream ends without delivering our event.
+  # Reserve 2 minutes for fallback; minimum SSE time is 60 seconds.
+  local sse_max_time
+  if [ "$TIMEOUT" -gt 0 ]; then
+    sse_max_time=$((TIMEOUT - SECONDS - 120))
+    [ "$sse_max_time" -lt 60 ] && sse_max_time=60
+  else
+    sse_max_time=86400  # no timeout: cap at 24h (matches cache TTL)
+  fi
+
+  # Track when we last did a one-shot re-check so we can poll the
+  # cached endpoint periodically. CDN proxies (e.g. Cloudflare) may
+  # buffer SSE events, so we re-check every 120s as a safety net.
+  local last_recheck="$SECONDS"
+
+  echo "SSE: Connecting to ${base_url}/events (max ${sse_max_time}s, filtering for '$HYDRA_JOB')"
   while true; do
     check_timeout
+
+    # Periodic one-shot re-check: catch status changes that the SSE
+    # stream failed to deliver (CDN buffering, lost events, etc.).
+    if [ $((SECONDS - last_recheck)) -ge 120 ]; then
+      last_recheck="$SECONDS"
+      local poll_state
+      poll_state=$(sse_get_current_status "$base_url") || true
+      if [ -n "$poll_state" ]; then
+        local poll_result
+        poll_result=$(classify_status "$poll_state")
+        if [ "$poll_result" = "success" ]; then
+          echo "$HYDRA_JOB succeeded (from periodic re-check at ${SECONDS}s)"
+          exit 0
+        elif [ "$poll_result" = "failure" ]; then
+          echo "$HYDRA_JOB failed (from periodic re-check at ${SECONDS}s)"
+          exit 1
+        fi
+      fi
+    fi
+
     local read_rc=0
     IFS= read -r -t 60 line || read_rc=$?
     if [ "$read_rc" -gt 128 ]; then
@@ -146,7 +182,7 @@ sse_wait() {
         fi
         ;;
     esac
-  done < <(curl -Nsf --max-time "$((TIMEOUT > 0 ? TIMEOUT - SECONDS : 0))" \
+  done < <(curl -Nsf --max-time "$sse_max_time" \
     "${base_url}/events" 2>/dev/null)

   # SSE stream ended without a terminal event for HYDRA_JOB. This can
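
The interplay between the `read -t` timeout and the periodic re-check in the first hunk can be reproduced with a small standalone sketch. Everything named here is an assumption for illustration: `get_status` and `STATUS_FILE` stand in for the cached /status lookup, `wait_with_recheck` is a hypothetical wrapper, and the intervals are shortened to one second so the demo finishes quickly. Unlike the diff, the sketch uses `return` rather than `exit` so it can be called from a test.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the one-shot status lookup; the real script
# calls sse_get_current_status against the bridge instead.
get_status() { cat "$STATUS_FILE" 2>/dev/null; }

# $1: seconds between re-checks, $2: per-read timeout in seconds.
# Reads the event stream from stdin.
wait_with_recheck() {
  local recheck_every="$1" read_timeout="$2"
  local last_recheck="$SECONDS" line read_rc state
  while true; do
    # Re-check the status out of band once enough time has passed.
    if [ $((SECONDS - last_recheck)) -ge "$recheck_every" ]; then
      last_recheck="$SECONDS"
      state=$(get_status) || true
      case "$state" in
        success) echo "succeeded (periodic re-check)"; return 0 ;;
        failure) echo "failed (periodic re-check)"; return 1 ;;
      esac
    fi
    read_rc=0
    IFS= read -r -t "$read_timeout" line || read_rc=$?
    if [ "$read_rc" -gt 128 ]; then
      continue    # read timed out: loop back so the re-check can run
    elif [ "$read_rc" -ne 0 ]; then
      return 2    # EOF: stream ended without a terminal event
    fi
    # A real script would parse "$line" as an SSE event here.
  done
}

# Demo: the stream stays silent (as if a proxy buffered every event),
# while the status source already reports success.
STATUS_FILE=$(mktemp)
echo success > "$STATUS_FILE"
wait_with_recheck 1 1 < <(sleep 3)
```

Because the read timeout bounds how long the loop can block, the re-check interval is only approximate: a re-check fires on the first loop iteration after the interval has elapsed.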
