Skip to content

fix: resolve Windows stdio deadlocks caused by grpc.aio and google.auth#139

Open
gitrishiom wants to merge 2 commits intogoogleanalytics:mainfrom
gitrishiom:fix/windows-grpc-stdio-deadlocks-v2
Open

fix: resolve Windows stdio deadlocks caused by grpc.aio and google.auth#139
gitrishiom wants to merge 2 commits intogoogleanalytics:mainfrom
gitrishiom:fix/windows-grpc-stdio-deadlocks-v2

Conversation

@gitrishiom
Copy link
Copy Markdown

@gitrishiom gitrishiom commented Apr 19, 2026

Description

This PR resolves a critical starvation/deadlock issue when running the MCP server over standard I/O (stdio) on Windows.

The Problem

  1. Using google.auth.default() spawns subprocesses to resolve credentials, which can interfere with the standard I/O pipes used by the MCP JSON-RPC protocol.
  2. The grpc.aio async clients are known to cause event loop stalls and thread exhaustion on Windows when run concurrently with standard I/O streams.

The Solution

  1. Updated client factory methods in utils.py to use the synchronous AnalyticsAdminServiceClient and BetaAnalyticsDataClient.
  2. Offloaded blocking API calls inside the MCP tool functions (admin/info.py, reporting/*.py) using asyncio.to_thread(_fetch) to maintain non-blocking concurrency.
  3. Added explicit file-loading logic (Credentials.from_authorized_user_file) in utils.py to bypass subprocess spawning during authentication.

This has been tested locally with Google Antigravity on Windows 11 with the analytics-mcp connected to a Gemini Code Assist instance, resulting in perfect stability where it previously hung indefinitely.

- Replaced grpc.aio async clients with synchronous variants offloaded via asyncio.to_thread to prevent transport starvation on Windows pipes.
- Provided fallback logic for google.auth.default() to prevent blocking subprocess spawns during credential resolution.
@gitrishiom gitrishiom changed the title fix: resolve Windows stdio deadlocks caused by grpc.aio and google.auth fix: resolve Windows stdio deadlocks caused by grpc.aio and google.auth in MCP used in Antigravity Apr 19, 2026
@gitrishiom gitrishiom changed the title fix: resolve Windows stdio deadlocks caused by grpc.aio and google.auth in MCP used in Antigravity fix: resolve Windows stdio deadlocks caused by grpc.aio and google.auth Apr 19, 2026
Copy link
Copy Markdown

@ZLeventer ZLeventer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a real issue — grpc.aio and the Google auth library both try to manage their own event loops, and on Windows this creates deadlocks when the MCP server's stdio transport is also running an event loop.

The approach of dropping to sync clients and wrapping in asyncio.to_thread() is a pragmatic fix. It avoids the event loop collision by keeping gRPC off the async event loop entirely, while still presenting async interfaces to the MCP framework.

A few observations:

  1. Import placementimport asyncio is repeated inside each function body. Since it's a stdlib module with negligible import cost, moving it to the top-level import block would be cleaner. Not a blocker, just style.

  2. Client creation per callcreate_admin_api_client_sync() is called fresh inside each _fetch() closure. If the sync client creation is expensive (auth token fetch, channel setup), consider caching the sync client the same way the async client was cached at module level. This matters for repeated tool calls in a single MCP session.

  3. Thread pool sizingto_thread uses the default executor. For an MCP server handling one request at a time this is fine, but worth noting if concurrent tool calls are ever supported.

The core fix is sound — this unblocks Windows users who were getting hung MCP servers.

Copy link
Copy Markdown

@ZLeventer ZLeventer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The asyncio.to_thread + sync client approach is a legitimate solution to the grpc.aio / Windows stdio deadlock. A few things worth addressing before merge:

Debug logging left in production code (blocker)
_create_credentials now contains multiple print(..., file=sys.stderr) statements that will write noise to every MCP client's stderr on every tool call:

print("_create_credentials started", file=sys.stderr)
print(f"Loading credentials directly from {path}", file=sys.stderr)

These should be removed before merge.

Module-level imports moved inside function
import sys, import os, and import google.oauth2.credentials are now inside _create_credentials. Python caches imports so it's not a performance issue, but it's inconsistent with the rest of the codebase and harder to spot during review. Worth moving to module level.

Docstrings removed from client factories
The sync client factory functions dropped their docstrings. The originals documented the credential scope used — worth keeping.

Alternative worth considering
Rather than converting all clients to sync + asyncio.to_thread unconditionally, a targeted approach would only apply the workaround on Windows:

import sys
if sys.platform == 'win32':
    return await asyncio.to_thread(_fetch_sync)
else:
    response = await async_client.run_report(request)

This avoids adding thread-pool overhead on Linux/macOS where grpc.aio works fine, and makes the Windows-specific workaround more visible.

The underlying diagnosis (grpc.aio + stdio deadlock on Windows) is correct and this is a real user-affecting bug. Happy to see it fixed — the approach just needs cleanup before it's mergeable.

- Remove all debug print() statements from _create_credentials (blocker)
- Move import asyncio, sys, os, google.oauth2.credentials to top-level imports
- Restore docstrings on all client factory functions
- Keep both sync and async client factories; async clients remain for Linux/macOS
- Apply Windows-only workaround via sys.platform == 'win32' conditional:
  uses sync client + asyncio.to_thread on Windows, native async on other platforms
- Avoids thread-pool overhead on Linux/macOS where grpc.aio works correctly
@gitrishiom
Copy link
Copy Markdown
Author

Thanks for the detailed review @ZLeventer! I've pushed a follow-up commit addressing all your feedback:

  1. Debug prints removed - All print(..., file=sys.stderr) statements have been removed from _create_credentials.\n2. Imports moved to top-level - import asyncio, import sys, import os, and import google.oauth2.credentials are now at the module level, consistent with the rest of the codebase.\n3. Docstrings restored - All three client factory functions now have proper docstrings documenting the credential scope.\n4. Platform-conditional workaround - The sync client + asyncio.to_thread path is now gated behind sys.platform == 'win32'. Linux/macOS users continue to use the native async grpc.aio clients with zero overhead. Both sync and async factory functions are kept in utils.py.\n\nPlease let me know if there's anything else that needs adjustment!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants