
Commit 1008af0

fix(ai-proxy): yield to scheduler in streaming SSE loop to avoid worker CPU starvation
When an upstream LLM emits SSE chunks in a tight burst (e.g. a model hallucinating and producing tokens at 100+ per second), the streaming loop in parse_streaming_response can run for an extended period without yielding to the nginx scheduler.

body_reader() (cosocket recv) only yields when the recv buffer is empty; if the kernel has already buffered several chunks, successive calls return immediately. ngx.flush(true) only yields when the downstream send buffer is full; a fast client drains immediately. So neither end of the loop guarantees a yield, and the SSE coroutine ends up monopolizing the worker, starving health checks, concurrent requests, and timer callbacks on the same worker.

Add an explicit ngx.sleep(0) at the end of each loop iteration. This is a no-op timer that just yields the current coroutine, allowing other ready coroutines to run.

The cost is negligible: in normal AI traffic chunks already arrive with inter-chunk gaps, so an extra yield per chunk is invisible; in burst scenarios it caps per-coroutine runtime to one chunk's worth of work.
1 parent 4223a07 commit 1008af0

1 file changed

Lines changed: 7 additions & 0 deletions

apisix/plugins/ai-providers/base.lua

@@ -361,6 +361,13 @@ function _M.parse_streaming_response(self, ctx, res, target_proto, converter)
             else
                 plugin.lua_response_filter(ctx, res.headers, chunk)
             end
+
+            -- Yield to the nginx scheduler so other coroutines on this worker
+            -- (health checks, concurrent requests) can run. body_reader() and
+            -- ngx.flush() do not yield when the upstream socket already has data
+            -- buffered or the downstream client drains immediately, so under
+            -- bursty SSE upstreams this loop can monopolize the worker CPU.
+            ngx.sleep(0)
         end
     end
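
The starvation described in the commit message can be reproduced outside nginx with plain Lua coroutines. The sketch below is a toy model only: run_all, make_stream, and health_check are hypothetical names invented for illustration, the round-robin queue is not how OpenResty schedules work, and coroutine.yield() merely stands in for the role ngx.sleep(0) plays in the patch.

```lua
-- Toy round-robin scheduler: each task runs until it yields or finishes.
-- This only models cooperative scheduling; it is NOT the nginx event loop.
local function run_all(tasks)
    local log, queue = {}, {}
    for _, t in ipairs(tasks) do
        local function emit(msg)
            log[#log + 1] = t.name .. ":" .. msg
        end
        queue[#queue + 1] = coroutine.create(function() t.fn(emit) end)
    end
    while #queue > 0 do
        local co = table.remove(queue, 1)
        local ok, err = coroutine.resume(co)
        assert(ok, err)
        -- A task that yielded goes to the back of the queue.
        if coroutine.status(co) ~= "dead" then
            queue[#queue + 1] = co
        end
    end
    return log
end

-- Hypothetical stand-in for the SSE loop. All chunks are "already buffered",
-- so processing one never blocks. yield_each_chunk plays the role that
-- ngx.sleep(0) plays in the patch.
local function make_stream(chunks, yield_each_chunk)
    return function(emit)
        for i = 1, chunks do
            emit("chunk" .. i)
            if yield_each_chunk then
                coroutine.yield()
            end
        end
    end
end

-- Hypothetical stand-in for other work on the same worker.
local function health_check(emit)
    emit("ok")
end

local function demo(yield_each_chunk)
    return table.concat(run_all({
        { name = "sse", fn = make_stream(3, yield_each_chunk) },
        { name = "health", fn = health_check },
    }), " ")
end

local without_yield = demo(false)
local with_yield = demo(true)
print(without_yield) -- sse:chunk1 sse:chunk2 sse:chunk3 health:ok
print(with_yield)    -- sse:chunk1 health:ok sse:chunk2 sse:chunk3
```

Without the yield, the health check runs only after the whole burst has been processed. With a yield per chunk, it runs after the first chunk, which is the "one chunk's worth of work" bound the commit message describes.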
