Make WebSocket policy inference nonblocking #938

Open
taivu1998 wants to merge 1 commit into Physical-Intelligence:main from taivu1998:tdv/issue-717-websocket-nonblocking

@taivu1998

Summary

Fixes #717.

This changes the WebSocket policy server so blocking policy.infer() calls no longer run directly on the asyncio event-loop thread. The server now offloads inference to a bounded ThreadPoolExecutor, with --inference-workers=1 as the default so calls into the shared policy object remain serialized unless users explicitly opt into more concurrency.
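The offloading pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakePolicy`, `infer_offloaded`, and `main` are hypothetical names, and the sleep stands in for a real model call.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakePolicy:
    """Hypothetical stand-in for a policy whose infer() blocks the calling thread."""

    def infer(self, obs: dict) -> dict:
        time.sleep(0.05)  # simulate slow model inference
        return {"actions": obs["x"] * 2, "thread": threading.current_thread().name}


async def infer_offloaded(policy: FakePolicy, executor: ThreadPoolExecutor, obs: dict) -> dict:
    # run_in_executor returns an awaitable future, so the event loop stays
    # free to serve other coroutines while the worker thread runs infer().
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, policy.infer, obs)


async def main() -> list[dict]:
    policy = FakePolicy()
    # max_workers=1 mirrors the conservative default: calls into the shared
    # policy object are serialized on a single worker thread.
    executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="inference")
    try:
        requests = (infer_offloaded(policy, executor, {"x": i}) for i in range(3))
        return await asyncio.gather(*requests)
    finally:
        executor.shutdown(wait=True)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With one worker, the three requests still execute one at a time, but they wait in the executor queue rather than on the event-loop thread.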

Root Cause

WebsocketPolicyServer._handler() awaited the WebSocket receive/send operations but called self._policy.infer(obs) synchronously between those awaits. A slow JAX/PyTorch inference could therefore occupy the event-loop thread and stall other clients that were connecting, receiving metadata, or making progress through WebSocket I/O.
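The effect can be reproduced in isolation with a small sketch. Everything here is hypothetical: `slow_infer` stands in for `policy.infer`, and the `heartbeat` task stands in for other clients' I/O, which only advances when the event loop is free.

```python
import asyncio
import time


def slow_infer(obs: dict) -> dict:
    """Hypothetical blocking inference call (stands in for policy.infer)."""
    time.sleep(0.2)
    return {"actions": obs["x"]}


async def heartbeat(ticks: list) -> None:
    # Can only record a tick when the event loop gets a chance to run it.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def ticks_during_inference(offload: bool) -> int:
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0.03)  # let the heartbeat start
    before = len(ticks)
    if offload:
        # Runs in a worker thread; the event loop keeps scheduling heartbeat.
        await asyncio.to_thread(slow_infer, {"x": 1})
    else:
        # Runs on the event-loop thread; nothing else is scheduled meanwhile.
        slow_infer({"x": 1})
    after = len(ticks)
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return after - before


if __name__ == "__main__":
    print("sync call:", asyncio.run(ticks_during_inference(offload=False)))
    print("offloaded:", asyncio.run(ticks_during_inference(offload=True)))
```

The synchronous branch records no heartbeat ticks while inference runs, because there is no await between the two counts; the offloaded branch records several.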

Changes

  • Add a bounded inference executor and semaphore to WebsocketPolicyServer.
  • Keep default behavior semantically conservative by serializing policy calls with one worker.
  • Add cancellation handling so a cancelled websocket request does not release the inference slot until the worker call actually finishes.
  • Add an inference_workers server/CLI option for explicitly thread-safe policies that can run concurrent inference calls.
  • Document the remote-inference behavior and the safety caveat for increasing worker count.
  • Add fake-policy websocket tests for responsive handshakes, worker limiting, error propagation, timing metadata, constructor validation, and cancellation slot accounting.
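One way the cancellation slot accounting listed above could work is sketched below. This is an assumption about the approach, not the PR's actual code: `BoundedInference`, `busy`, and `demo` are hypothetical names. The idea is to release the semaphore from the worker future's done callback and to shield that future from the caller's cancellation.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedInference:
    """Hypothetical helper: caps in-flight infer() calls; slots survive cancellation."""

    def __init__(self, infer_fn, workers: int = 1) -> None:
        if workers < 1:
            raise ValueError("workers must be >= 1")
        self._infer_fn = infer_fn
        self._executor = ThreadPoolExecutor(max_workers=workers)
        self._slots = asyncio.Semaphore(workers)

    def busy(self) -> bool:
        return self._slots.locked()  # True when every slot is taken

    def close(self) -> None:
        self._executor.shutdown(wait=True)

    async def infer(self, obs):
        await self._slots.acquire()
        loop = asyncio.get_running_loop()
        try:
            future = loop.run_in_executor(self._executor, self._infer_fn, obs)
        except BaseException:
            self._slots.release()  # submission failed, so no worker holds the slot
            raise
        # Release when the worker finishes, not when the awaiting task is cancelled.
        future.add_done_callback(lambda _f: self._slots.release())
        # shield() stops the caller's cancellation from cancelling the inner future.
        return await asyncio.shield(future)


async def demo() -> tuple:
    gate = threading.Event()

    def blocking_infer(obs):
        gate.wait(timeout=5)  # pretend the model is still computing
        return obs

    bounded = BoundedInference(blocking_infer, workers=1)
    request = asyncio.create_task(bounded.infer({"x": 1}))
    await asyncio.sleep(0.05)  # let the worker thread start
    request.cancel()
    await asyncio.gather(request, return_exceptions=True)
    held_after_cancel = bounded.busy()  # worker is still running

    gate.set()  # let the worker finish
    for _ in range(100):
        if not bounded.busy():
            break
        await asyncio.sleep(0.01)
    released_after_finish = not bounded.busy()
    bounded.close()
    return held_after_cancel, released_after_finish


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the shield, cancelling the request would also cancel the awaited future and fire its done callback at once, freeing the slot while the worker thread is still inside the policy call.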

Validation

  • git diff --check
  • uvx ruff format src/openpi/serving/websocket_policy_server.py src/openpi/serving/websocket_policy_server_test.py scripts/serve_policy.py
  • uvx ruff check src/openpi/serving/websocket_policy_server.py src/openpi/serving/websocket_policy_server_test.py scripts/serve_policy.py docs/remote_inference.md
  • PYTHONPATH=src:packages/openpi-client/src uvx --python 3.11 --with websockets --with msgpack --with numpy --with pynvml pytest src/openpi/serving/websocket_policy_server_test.py

The focused tests passed: 7 passed, 1 warning.

The full repo-native uv run pytest --strict-markers -m "not manual" could not be run on this macOS arm64 host because jax-cuda12-plugin==0.5.3 only provides Linux wheels.

@taivu1998 taivu1998 marked this pull request as ready for review May 11, 2026 03:37
@jimmyt857 jimmyt857 removed their request for review May 11, 2026 04:08

Development

Successfully merging this pull request may close these issues.

Potential blocking behavior in WebSocket policy server despite async design
