
feat(ai-proxy): abort upstream read on client disconnect during streaming#13254

Merged
nic-6443 merged 7 commits into apache:master from nic-6443:fix/ai-proxy-client-disconnect
Apr 20, 2026

Conversation

@nic-6443
Member

Summary

When a downstream client disconnects mid-stream (browser tab closed, Ctrl+C, request cancelled), the proxy continues reading all remaining chunks from the LLM and performing SSE parsing, token counting, and protocol conversion — burning CPU and LLM API quota for no benefit.

Root Cause

lua_response_filter used ngx.flush() (async, no wait), which never surfaces client disconnection errors. There was no mechanism to detect a dead downstream in the streaming loop.

Fix

Add an optional wait parameter to lua_response_filter. When wait=true, it uses ngx.flush(true) (synchronous flush) which returns an error if the client connection is gone. The streaming path in parse_streaming_response now passes wait=true and on flush failure immediately closes the upstream connection and exits the read loop.

The wait parameter defaults to false so all existing callers are unaffected.
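A minimal sketch of what the wait-aware flush inside lua_response_filter could look like (the real function body and signature are elided here; this only illustrates the flush branch described above, using OpenResty's ngx API):

```lua
-- Sketch only: the surrounding filter logic is omitted.
local ngx_flush = ngx.flush

local function flush_body(wait)
    local ok, err
    if wait then
        -- synchronous flush: returns an error once the client connection is gone
        ok, err = ngx_flush(true)
    else
        -- async flush: returns immediately and never surfaces a disconnect
        ok, err = ngx_flush()
    end
    return ok, err
end
```

Calling `ngx_flush(true)` explicitly rather than forwarding a possibly-nil `wait` keeps the call unambiguous, matching the follow-up refactor in this PR.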

Changes

  • apisix/plugin.lua: add optional wait param to lua_response_filter; return (ok, err) for disconnect detection
  • apisix/plugins/ai-providers/base.lua: use wait=true in parse_streaming_response output loop; on flush failure close upstream and return
  • t/plugin/ai-proxy-client-disconnect.t: integration test verifying upstream is aborted after client disconnect
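Under these changes, the output loop in parse_streaming_response might be shaped roughly like the following sketch (not the actual diff; `reader`, `httpc`, `handle_chunk`, and `ctx` are illustrative placeholders):

```lua
-- Sketch of the streaming read loop with disconnect detection.
while true do
    local chunk, read_err = reader()
    if not chunk or read_err then
        break
    end

    -- SSE parsing, token counting, protocol conversion (placeholder)
    local body = handle_chunk(chunk)

    -- wait=true: synchronous flush surfaces a dead downstream as an error
    local ok, err = plugin.lua_response_filter(ctx, body, true)
    if not ok then
        core.log.warn("client disconnected, abort upstream read: ", err)
        httpc:close()  -- stop consuming the LLM response immediately
        return
    end
end
```

The key point is that the loop exits on the first failed flush instead of draining the remaining upstream chunks.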

Test

The new test sets up a slow SSE mock (30 ms per chunk, up to 2000 chunks) and a client that reads 3 chunks and then closes. It verifies via a shared dict that the upstream served well under 50 chunks in total, i.e. streaming stopped shortly after the disconnect.

…ming

When a downstream client disconnects mid-stream, the proxy was continuing
to read all remaining chunks from the LLM, performing SSE parsing, token
counting, and protocol conversion unnecessarily.

Fix by passing wait=true to lua_response_filter in the streaming path.
ngx.flush(true) returns an error when the client connection is gone, at
which point we close the upstream httpc connection and return early.

Changes:
- plugin.lua: add optional wait param to lua_response_filter; return
  (ok, err) so callers can detect client disconnection
- ai-providers/base.lua: use wait=true in parse_streaming_response output
  loop; on flush failure close upstream and return immediately
Copilot AI review requested due to automatic review settings April 18, 2026 06:53
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Apr 18, 2026

Copilot AI left a comment


Pull request overview

This PR improves APISIX’s AI proxy streaming behavior by detecting downstream client disconnects and promptly aborting the upstream streaming read, avoiding wasted CPU and LLM quota.

Changes:

  • Extend apisix.plugin.lua_response_filter with an optional wait parameter to allow synchronous flushing and surface disconnect errors via (ok, err) returns.
  • Update the AI streaming loop to flush synchronously, detect disconnects, and close the upstream HTTP client immediately.
  • Add an integration test that simulates a client disconnect and asserts upstream streaming stops early.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File changes:

  • apisix/plugin.lua: adds the wait option and return values to enable disconnect detection during streaming flush.
  • apisix/plugins/ai-providers/base.lua: uses wait=true during streaming output and closes the upstream on flush failure.
  • t/plugin/ai-proxy-client-disconnect.t: new integration test validating upstream abort behavior after downstream disconnect.


Comment thread apisix/plugin.lua Outdated
Comment thread apisix/plugin.lua
Comment thread apisix/plugins/ai-providers/base.lua
nic-6443 and others added 3 commits April 18, 2026 15:07
…ct handler

- update docstring to accurately reflect that lua_response_filter always
  returns (ok, err) regardless of wait parameter
- avoid passing nil to ngx_flush: explicitly call ngx_flush(true) when
  wait==true, ngx_flush() otherwise
- extract abort_on_disconnect local helper in parse_streaming_response
  to deduplicate the log+close+mark-done pattern

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
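The abort_on_disconnect helper extracted in this commit might look roughly like the following (a sketch under assumed names; `err`, `httpc`, and the done flag are not taken from the actual diff):

```lua
-- Sketch of the deduplicated log+close+mark-done pattern.
local function abort_on_disconnect(err, httpc, stream_ctx)
    core.log.warn("downstream client disconnected: ", err)
    httpc:close()           -- drop the upstream LLM connection
    stream_ctx.done = true  -- mark the stream finished so the read loop exits
end
```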
…sconnect test

Lua-style comments (--) placed outside content_by_lua_block are parsed
as nginx directives, causing nginx to fail to start with
'unknown directive "--"'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Lua table serialization via cjson does not guarantee field order, so
the response may have output_tokens before input_tokens. Use a lookahead
regex to match both fields regardless of order.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@nic-6443
Member Author

Also fixed a pre-existing test failure in ai-proxy-anthropic.t TEST 6: the regex /"input_tokens":10.*"output_tokens":5/ was assuming a specific JSON field order, but Lua's cjson serializes hash tables without guaranteed ordering. Updated to a lookahead regex that matches both fields regardless of order.
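An order-insensitive check can be written with two PCRE lookaheads anchored at the same position, each scanning for one field. A sketch using OpenResty's ngx.re (the exact regex in the test may differ):

```lua
-- Sketch: both lookaheads must succeed, so the match works whether
-- input_tokens or output_tokens appears first in the serialized JSON.
local re = [[(?=.*"input_tokens":10)(?=.*"output_tokens":5)]]
local m = ngx.re.match(body, re, "jo")
```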

The curl to port 9100 was firing before the etcd stream-route change
propagated to the stream workers, causing 'matched route: null' and
skipping DNS resolution. Add a 1s sleep after the admin PUT to let
the config sync complete before making the test request.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@nic-6443
Member Author

Fixed the linux_apisix_current_luarocks stream DNS test failure. The root cause was a race condition: the curl http://127.0.0.1:9100 was firing immediately after the admin PUT, before the stream route had propagated from etcd to the stream workers. The APISIX log showed matched route: null followed by the etcd sync insert data by key: 1 happening afterwards. Added a 1-second sleep between the route creation and the test request to let the config sync complete.

moonming
moonming previously approved these changes Apr 20, 2026
membphis
membphis previously approved these changes Apr 20, 2026
# Conflicts:
#	apisix/plugins/ai-providers/base.lua
@nic-6443 nic-6443 dismissed stale reviews from membphis and moonming via d57f165 April 20, 2026 02:54
@nic-6443 nic-6443 requested review from membphis and moonming April 20, 2026 05:57
@nic-6443 nic-6443 merged commit a7e327e into apache:master Apr 20, 2026
22 of 25 checks passed
@nic-6443 nic-6443 deleted the fix/ai-proxy-client-disconnect branch April 20, 2026 06:53

Labels

enhancement New feature or request size:L This PR changes 100-499 lines, ignoring generated files.


5 participants