Fix: single Ctrl+C shutdown hang after image generation #112
lstein
left a comment
Shutdown no longer hangs, but I'm getting a series of warnings at the end:
^C[2026-03-02 09:40:26,173]::[ModelInstallService]::INFO --> Installer thread 139743997077184 exiting
[2026-03-02 09:40:27,177]::[InvokeAI]::INFO --> InvokeAI shutting down...
Task was destroyed but it is pending!
task: <Task pending name='Task-472' coro=<AsyncSocket._send_ping() running at /home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/engineio/async_socket.py:135> wait_for=<Future pending cb=[Task.task_wakeup()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-845' coro=<AsyncSocket._send_ping() running at /home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/engineio/async_socket.py:135> wait_for=<Future pending cb=[Task.task_wakeup()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-847' coro=<AsyncSocket._send_ping() running at /home/lstein/invokeai-lstein/.venv/lib/python3.12/site-packages/engineio/async_socket.py:135> wait_for=<Future pending cb=[Task.task_wakeup()]>>
Fixed in 01a83c5. The warnings were caused by |
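For context: asyncio prints "Task was destroyed but it is pending!" when a task that never finished is garbage-collected, which typically happens once its loop has been closed underneath it. The commit title for this fix describes cancelling pending tasks before loop.close(). A minimal sketch of that pattern follows; it is not the code from 01a83c5, and close_loop_cleanly, _ping_forever and demo are hypothetical names used only for illustration.

```python
import asyncio


def close_loop_cleanly(loop: asyncio.AbstractEventLoop) -> int:
    """Cancel every unfinished task on `loop`, let the cancellations run, then close it.

    Hypothetical helper for illustration. Returns the number of cancelled tasks.
    """
    pending = [task for task in asyncio.all_tasks(loop) if not task.done()]
    for task in pending:
        task.cancel()
    if pending:
        # return_exceptions=True collects each CancelledError as a result instead
        # of raising, so every task gets a chance to unwind before the loop closes.
        loop.run_until_complete(asyncio.gather(*pending, return_exceptions=True))
    loop.close()
    return len(pending)


async def _ping_forever() -> None:
    # Stand-in for a long-lived background task such as a periodic ping.
    while True:
        await asyncio.sleep(3600)


async def _serve_briefly(tasks: list) -> None:
    tasks.append(asyncio.create_task(_ping_forever()))
    await asyncio.sleep(0)  # yield once so the background task starts running


def demo() -> tuple[int, bool]:
    loop = asyncio.new_event_loop()
    tasks: list = []
    loop.run_until_complete(_serve_briefly(tasks))  # returns with one task still pending
    cancelled = close_loop_cleanly(loop)
    return cancelled, tasks[0].cancelled()
```

Calling loop.close() directly in demo(), without the cancel-and-gather step, is the situation in which the warning would be expected once the pending task is collected.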
lstein
left a comment
Great improvement. However, when I send a keyboard interrupt while a generation is in progress (e.g. during denoising), I get an ugly stack trace:
Traceback (most recent call last):
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_processor/session_processor_default.py", line 459, in _process
    self.session_runner.run(queue_item=self._queue_item)
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_processor/session_processor_default.py", line 98, in run
    self.run_node(invocation, queue_item)
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_processor/session_processor_default.py", line 151, in run_node
    self._on_node_error(
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_processor/session_processor_default.py", line 288, in _on_node_error
    queue_item = self._services.session_queue.fail_queue_item(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_queue/session_queue_sqlite.py", line 393, in fail_queue_item
    queue_item = self._set_queue_item_status(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/session_queue/session_queue_sqlite.py", line 265, in _set_queue_item_status
    self.__invoker.services.events.emit_queue_item_status_changed(queue_item, batch_status, queue_status)
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/events/events_base.py", line 101, in emit_queue_item_status_changed
    self.dispatch(QueueItemStatusChangedEvent.build(queue_item, batch_status, queue_status))
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/events/events_fastapievents.py", line 31, in dispatch
    self._loop.call_soon_threadsafe(self._queue.put_nowait, event)
  File "/home/lstein/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 844, in call_soon_threadsafe
    self._check_closed()
  File "/home/lstein/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 545, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Could the session processor catch the "Event loop is closed" error and issue a friendlier message, such as "Event loop terminated; generation session aborted"?
Fixed in fe31ac4. Added a |
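The traceback above ends in call_soon_threadsafe() raising because the loop had already been closed. One way to guard a dispatch method against that is sketched below. This is an illustration under assumptions, not the code from fe31ac4: EventDispatcherSketch, its dropped counter and the boolean return value are hypothetical, and the extra except RuntimeError is an addition that covers the loop closing between the check and the call.

```python
import asyncio
from typing import Any


class EventDispatcherSketch:
    """Hypothetical stand-in for a service that forwards events to an asyncio queue."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop
        self._queue = asyncio.Queue()
        self.dropped = 0  # illustration only: counts events discarded during shutdown

    def dispatch(self, event: Any) -> bool:
        """Schedule `event` onto the loop; return False if it had to be dropped."""
        if self._loop.is_closed():
            # Shutdown already closed the loop: drop the event instead of raising.
            self.dropped += 1
            return False
        try:
            self._loop.call_soon_threadsafe(self._queue.put_nowait, event)
        except RuntimeError:
            # The loop can still close between the check above and this call
            # when dispatch() runs on a worker thread.
            self.dropped += 1
            return False
        return True


def demo() -> tuple[bool, bool, int, int]:
    loop = asyncio.new_event_loop()
    service = EventDispatcherSketch(loop)
    delivered = service.dispatch("status changed")  # loop open: scheduled
    loop.run_until_complete(asyncio.sleep(0))  # let the scheduled put_nowait run
    queued = service._queue.qsize()
    loop.close()
    late = service.dispatch("late event")  # loop closed: dropped quietly
    return delivered, late, service.dropped, queued
```

Dropping events silently is only reasonable during shutdown, when nothing is left to consume them; a real service might prefer to log the drop at debug level.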
lstein
left a comment
The last commit results in a core dump when the user interrupts a generation in mid-flight:
[2026-03-02 10:01:09,996]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '5e0fc394-89d3-4bce-bb7e-782413fede38:scheduler' (DDPMScheduler) onto cuda device in 0.00s. Total model size: 0.00MB, VRAM: 0.00MB (0.0%)
20%|█████████████████████████████████▌ | 6/30 [00:02<00:08, 2.79it/s]
23%|███████████████████████████████████████▏ | 7/30 [00:02<00:08, 2.80it/s]
[2026-03-02 10:01:12,858]::[ModelInstallService]::INFO --> Installer thread 125658999027392 exiting
[2026-03-02 10:01:12,860]::[InvokeAI]::INFO --> InvokeAI shutting down...
27%|████████████████████████████████████████████▊ | 8/30 [00:02<00:07, 2.80it/s]
terminate called without an active exception
Aborted (core dumped)
Fixed in aaaf3fe. The core dump was caused by a race between CUDA denoising and Python teardown: our The fix sets |
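The commit title for this fix is "cancel in-progress generation on stop()". A minimal sketch of that idea with two threading.Event flags follows: one ends the worker loop, the other interrupts the job currently running at its next step boundary. ProcessorSketch, its step timings and the was_cancelled flag are hypothetical; time.sleep stands in for one denoising step, and this is not the code from aaaf3fe.

```python
import threading
import time


class CanceledException(Exception):
    """Raised at a step boundary when the current job has been cancelled."""


class ProcessorSketch:
    """Hypothetical worker that runs a long, step-wise job on a background thread."""

    def __init__(self, total_steps: int = 1000, step_seconds: float = 0.01) -> None:
        self._stop_event = threading.Event()    # ends the outer processing loop
        self._cancel_event = threading.Event()  # interrupts the job in progress
        self._total_steps = total_steps
        self._step_seconds = step_seconds
        self.steps_done = 0
        self.was_cancelled = False
        self._thread = threading.Thread(target=self._process, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def _run_job(self) -> None:
        for _ in range(self._total_steps):
            if self._cancel_event.is_set():
                raise CanceledException
            time.sleep(self._step_seconds)  # stand-in for one denoising step
            self.steps_done += 1

    def _process(self) -> None:
        while not self._stop_event.is_set():
            try:
                self._run_job()
            except CanceledException:
                self.was_cancelled = True

    def stop(self, timeout: float = 5.0) -> bool:
        """Signal both events and wait for the worker; True if it exited in time."""
        self._stop_event.set()
        # Without this second flag the job would run all remaining steps
        # before the loop ever looked at _stop_event again.
        self._cancel_event.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()


def demo() -> tuple[bool, bool, bool]:
    processor = ProcessorSketch()
    processor.start()
    time.sleep(0.05)  # let a few steps run
    stopped = processor.stop()
    return stopped, processor.was_cancelled, processor.steps_done < 1000
```

Joining the worker inside stop() is a choice made for this sketch: it ensures the job has actually left its step loop before the caller continues with teardown.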
aaaf3fe to
bc509eb
…generation hang
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix: cancel pending asyncio tasks before loop.close() to suppress destroyed-task warnings
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix: suppress stack trace when dispatching events after event loop is closed on shutdown
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix: cancel in-progress generation on stop() to prevent core dump during mid-flight Ctrl+C
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
1bf5121 to
5b536c7
…i#8936)

* Fix: Kill the server with one keyboard interrupt (#94)

* Initial plan

* Handle KeyboardInterrupt in run_app to allow single Ctrl+C shutdown

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

* Force os._exit(0) on KeyboardInterrupt to avoid hanging on background threads

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix graceful shutdown to wait for download/install worker threads (#102)

* Initial plan

* Replace os._exit(0) with ApiDependencies.shutdown() on KeyboardInterrupt

Instead of immediately force-exiting the process on CTRL+C, call ApiDependencies.shutdown() to gracefully stop the download and install manager services, allowing active work to complete or cancel cleanly before the process exits.

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

* Make stop() idempotent in download and model install services

When CTRL+C is pressed, uvicorn's graceful shutdown triggers the FastAPI lifespan which calls ApiDependencies.shutdown(), then a KeyboardInterrupt propagates from run_until_complete() hitting the except block which tries to call ApiDependencies.shutdown() a second time. Change both stop() methods to return silently (instead of raising) when the service is not running.

This handles:
- Double-shutdown: lifespan already stopped the services
- Early interrupt: services were never fully started

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

Fix shutdown hang on session processor thread lock (#108)

* Initial plan

* Fix shutdown hang: wake session processor thread on stop() and mark daemon

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

* Fix: shut down asyncio executor on KeyboardInterrupt to prevent post-generation hang (#112)

Fix: cancel pending asyncio tasks before loop.close() to suppress destroyed-task warnings

Fix: suppress stack trace when dispatching events after event loop is closed on shutdown

Fix: cancel in-progress generation on stop() to prevent core dump during mid-flight Ctrl+C

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
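The last commit title above, "shut down asyncio executor on KeyboardInterrupt", can be illustrated with a small standalone script. It is a sketch under assumptions rather than the project's run_app.py: _blocking_query, executor_threads and run_and_shutdown are hypothetical names, and the worker threads are recognised by the "asyncio" name prefix that CPython gives the loop's default executor.

```python
import asyncio
import threading


def _blocking_query() -> str:
    # Stand-in for a blocking call (for example a database query).
    return threading.current_thread().name


async def _serve() -> str:
    # asyncio.to_thread() runs the call on the loop's default ThreadPoolExecutor.
    return await asyncio.to_thread(_blocking_query)


def executor_threads() -> list[threading.Thread]:
    """Live worker threads of asyncio's default executor (CPython names them asyncio_N)."""
    return [t for t in threading.enumerate() if t.name.startswith("asyncio_")]


def run_and_shutdown() -> tuple[int, int]:
    loop = asyncio.new_event_loop()
    try:
        loop.run_until_complete(_serve())
    except KeyboardInterrupt:
        pass  # a Ctrl+C would land here; cleanup below runs either way
    alive_before = len(executor_threads())  # the idle worker is still alive
    # Drain the default executor and join its threads, then close the loop,
    # so nothing is left for interpreter shutdown to wait on.
    loop.run_until_complete(loop.shutdown_default_executor())
    loop.close()
    alive_after = len(executor_threads())
    return alive_before, alive_after
```

Skipping the shutdown_default_executor() call leaves the worker thread alive after the loop is gone, and interpreter exit then has to wait for it.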
Summary
After any image generation, pressing Ctrl+C once caused an indefinite hang requiring a second Ctrl+C to exit, producing a KeyboardInterrupt inside threading._shutdown(). Additionally, pressing Ctrl+C during an active generation produced either an ugly RuntimeError: Event loop is closed stack trace or a C++ std::terminate() core dump.

Root cause: asyncio.to_thread() in session_queue_sqlite.py runs SQLite operations during generation using the asyncio event loop's default ThreadPoolExecutor. In Python 3.12, ThreadPoolExecutor threads have daemon=False. When KeyboardInterrupt interrupts loop.run_until_complete(server.serve()), the loop exits without being shut down, so those non-daemon executor threads persist and block threading._shutdown() indefinitely. This is generation-specific because asyncio.to_thread() is never called until a queue item is processed.

invokeai/app/run_app.py
- Cancels pending asyncio tasks before loop.close() to suppress "Task was destroyed but it is pending!" warnings on shutdown.
- After ApiDependencies.shutdown(), calls loop.run_until_complete(loop.shutdown_default_executor()) then loop.close() to drain and terminate the executor's non-daemon threads before Python teardown begins.

invokeai/app/services/events/events_fastapievents.py
- Adds a loop.is_closed() guard in FastAPIEventService.dispatch(). When Ctrl+C is pressed mid-generation, the shutdown code closes the event loop while the generation thread is still winding down and trying to emit status events. Events are silently dropped when the loop is already closed, preventing a RuntimeError: Event loop is closed stack trace.

invokeai/app/services/session_processor/session_processor_default.py
- DefaultSessionProcessor.stop() now sets _cancel_event in addition to _stop_event. This signals any in-progress generation (e.g. denoising) to stop at the next step boundary via CanceledException, rather than running to completion. Without this, the generation thread could still be executing CUDA operations while Python teardown begins, causing a C++ std::terminate() core dump (terminate called without an active exception).

tests/test_asyncio_shutdown.py
Four isolated subprocess tests:
- asyncio.to_thread() leaves a non-daemon thread alive (reproduces the bug).
- shutdown_default_executor() + loop.close() eliminates it.
- … KeyboardInterrupt.
- Cancelling pending tasks before loop.close() suppresses the "Task was destroyed but it is pending!" warnings.

Related Issues / Discussions
QA Instructions
… "Task was destroyed but it is pending!" warnings, no RuntimeError: Event loop is closed stack trace, and no core dump.
Merge Plan
Checklist
What's New copy (if doing a release after this PR)