Skip to content

Commit e700ff2

Browse files
committed
Fix bg fill teardown crash in update_size_and_time_stats
Production crash logs show ink_assert(0) firing in HttpTransact::update_size_and_time_stats from HttpSM::update_stats via HttpSM::kill_this, triggered by an Http2Stream event re-entering the state machine while background_fill was still in the STARTED state. The background_fill state is normally driven to a terminal value (COMPLETED or ABORTED) by tunnel_handler_server, but when the SM is torn down before that handler runs the state stays STARTED and the unreachable default branch in update_size_and_time_stats asserts. The same path also leaks proxy.process.http.background_fill_current_count because tunnel_handler_server is the only site that decrements the gauge after tunnel_handler_ua incremented it. This normalizes STARTED to ABORTED at the top of HttpSM::update_stats and decrements the gauge there, so the accounting balances and the downstream stats helper sees a terminal state. As a defensive backstop it also folds the STARTED case into the ABORTED branch of update_size_and_time_stats so a future caller that reaches this point mid-fill records the bytes against background_fill_bytes_aborted instead of crashing the server.
1 parent 450d3f3 commit e700ff2

2 files changed

Lines changed: 21 additions & 1 deletion

File tree

src/proxy/http/HttpSM.cc

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7777,6 +7777,21 @@ HttpSM::update_stats()
77777777
ATS_PROBE2(milestone_sm_finish, sm_id, static_cast<int>(status_code));
77787778
milestones[TS_MILESTONE_SM_FINISH] = ink_get_hrtime();
77797779

7780+
// The background fill state is normally driven to a terminal value
7781+
// (COMPLETED or ABORTED) by tunnel_handler_server when the server-side
7782+
// tunnel finishes. If the SM is torn down before that handler runs (for
7783+
// example, when an Http2Stream event re-enters the SM and drives kill_this
7784+
// while the bg fill is still in flight), the state can remain STARTED. In
7785+
// that case the background_fill_current_count gauge would also leak,
7786+
// because tunnel_handler_server is the only place that decrements it after
7787+
// tunnel_handler_ua incremented it. Normalize the state here so the
7788+
// accounting balances and update_size_and_time_stats does not see an
7789+
// unexpected value.
7790+
if (background_fill == BackgroundFill_t::STARTED) {
7791+
background_fill = BackgroundFill_t::ABORTED;
7792+
Metrics::Gauge::decrement(http_rsb.background_fill_current_count);
7793+
}
7794+
77807795
if (is_action_tag_set("bad_length_state_dump")) {
77817796
if (status_code == HTTPStatus::OK) {
77827797
int64_t p_resp_cl = t_state.hdr_info.client_response.get_content_length();

src/proxy/http/HttpTransact.cc

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8957,6 +8957,12 @@ HttpTransact::update_size_and_time_stats(State *s, ink_hrtime total_time, ink_hr
89578957
Metrics::Counter::increment(http_rsb.background_fill_bytes_completed, bg_size);
89588958
break;
89598959
}
8960+
// STARTED is normally transitioned to COMPLETED or ABORTED by
8961+
// tunnel_handler_server. HttpSM::update_stats() normalizes any leftover
8962+
// STARTED state to ABORTED before calling here, but treat STARTED the same
8963+
// as ABORTED defensively in case a future code path reaches this point with
8964+
// the bg fill still in flight, so we record the bytes rather than abort.
8965+
case BackgroundFill_t::STARTED:
89608966
case BackgroundFill_t::ABORTED: {
89618967
int64_t bg_size = origin_server_response_body_size - user_agent_response_body_size;
89628968

@@ -8968,7 +8974,6 @@ HttpTransact::update_size_and_time_stats(State *s, ink_hrtime total_time, ink_hr
89688974
}
89698975
case BackgroundFill_t::NONE:
89708976
break;
8971-
case BackgroundFill_t::STARTED:
89728977
default:
89738978
ink_assert(0);
89748979
}

0 commit comments

Comments
 (0)