fix: resolve OAuth port conflict on re-auth in stdio mode#606
fix: resolve OAuth port conflict on re-auth in stdio mode#606Alexheath21 wants to merge 1 commit into
Conversation
…inst itself In stdio mode, when a token expires and the MCP server needs to re-authenticate, ensure_oauth_callback_available() calls start() on MinimalOAuthServer. The server thread may still be alive and holding the port even though is_running was set to False by an exception handler — causing start() to fail with "port already in use" against its own process. Three fixes: - start() now checks server_thread.is_alive() before the is_running flag, detecting and reusing a server whose flag is stale - The socket bind fallback now probes is_actually_running() before returning an error, recovering when the port is held by our own server - stop() no longer short-circuits on is_running=False, ensuring the uvicorn server and thread are properly cleaned up even when the flag is stale Fixes taylorwilsdon#546 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughModified Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@auth/oauth_callback_server.py`:
- Around line 240-249: The except block currently catches all OSError instances;
update it to catch OSError as exc and only perform the "reuse" fallback when
exc.errno == errno.EADDRINUSE (matching the logic in is_actually_running()). If
errno matches, log the reuse message, set self.is_running = True and return
True, ""; otherwise handle the other errors normally (e.g., build the error_msg
using self.port and hostname or re-raise/return failure). Ensure errno is
imported where oauth_callback_server.py defines this logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1c014115-734c-4b27-8581-38a8c3d4cc10
📒 Files selected for processing (2)
auth/oauth_callback_server.pytests/auth/test_oauth_callback_server.py
| except OSError: | ||
| # Before failing, check if the port is held by a previous server | ||
| # instance in our own process that we can still use for callbacks. | ||
| if self.is_actually_running(): | ||
| logger.info( | ||
| f"Port {self.port} is in use by our own OAuth server, reusing" | ||
| ) | ||
| self.is_running = True | ||
| return True, "" | ||
| error_msg = f"Port {self.port} is already in use on {hostname}. Cannot start minimal OAuth server." |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd oauth_callback_server.pyRepository: taylorwilsdon/google_workspace_mcp
Length of output: 148
🏁 Script executed:
head -20 auth/oauth_callback_server.py && wc -l auth/oauth_callback_server.pyRepository: taylorwilsdon/google_workspace_mcp
Length of output: 548
🏁 Script executed:
sed -n '230,260p' auth/oauth_callback_server.pyRepository: taylorwilsdon/google_workspace_mcp
Length of output: 1364
🏁 Script executed:
grep -n "import errno\|from errno" auth/oauth_callback_server.pyRepository: taylorwilsdon/google_workspace_mcp
Length of output: 92
🏁 Script executed:
sed -n '1,400p' auth/oauth_callback_server.py | grep -n "is_actually_running" -A 10 -B 2Repository: taylorwilsdon/google_workspace_mcp
Length of output: 3314
🏁 Script executed:
sed -n '200,280p' auth/oauth_callback_server.pyRepository: taylorwilsdon/google_workspace_mcp
Length of output: 3461
🏁 Script executed:
grep -n "except OSError\|except " auth/oauth_callback_server.pyRepository: taylorwilsdon/google_workspace_mcp
Length of output: 407
🏁 Script executed:
sed -n '155,175p' auth/oauth_callback_server.pyRepository: taylorwilsdon/google_workspace_mcp
Length of output: 772
🏁 Script executed:
sed -n '135,170p' auth/oauth_callback_server.pyRepository: taylorwilsdon/google_workspace_mcp
Length of output: 1491
Narrow bind-recovery fallback to EADDRINUSE only.
Line 240 currently catches any OSError, but the reuse fallback should only apply to address-in-use conflicts. Other bind failures (e.g., permission denied, invalid address) should not be treated as reusable-running. This is inconsistent with the is_actually_running() method (line 163), which already checks if exc.errno == errno.EADDRINUSE for the same scenario.
💡 Suggested fix
- except OSError:
+ except OSError as exc:
# Before failing, check if the port is held by a previous server
# instance in our own process that we can still use for callbacks.
- if self.is_actually_running():
+ if exc.errno == errno.EADDRINUSE and self.is_actually_running():
logger.info(
f"Port {self.port} is in use by our own OAuth server, reusing"
)
self.is_running = True
return True, ""
- error_msg = f"Port {self.port} is already in use on {hostname}. Cannot start minimal OAuth server."
+ if exc.errno == errno.EADDRINUSE:
+ error_msg = (
+ f"Port {self.port} is already in use on {hostname}. "
+ "Cannot start minimal OAuth server."
+ )
+ else:
+ error_msg = (
+ f"Failed to bind minimal OAuth server on {hostname}:{self.port}: {exc}"
+ )
logger.error(error_msg)
return False, error_msg🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@auth/oauth_callback_server.py` around lines 240 - 249, The except block
currently catches all OSError instances; update it to catch OSError as exc and
only perform the "reuse" fallback when exc.errno == errno.EADDRINUSE (matching
the logic in is_actually_running()). If errno matches, log the reuse message,
set self.is_running = True and return True, ""; otherwise handle the other
errors normally (e.g., build the error_msg using self.port and hostname or
re-raise/return failure). Ensure errno is imported where
oauth_callback_server.py defines this logic.
|
There was a PR that the author closed looking to do something similar in #590 Will take a look. Some of the coderabbit feedback looks actionable before merge! |
Summary
MinimalOAuthServerfails with "port already in use" when re-authenticating against its own process in stdio modestart()now detects a live server thread even whenis_runningis stale (set toFalseby an exception handler while the thread still holds the port)is_actually_running()before erroring, recovering when the port is held by our own serverstop()no longer short-circuits onis_running=False, ensuring proper cleanup of the uvicorn server and threadFixes #546
Root Cause
In stdio mode,
MinimalOAuthServerstarts a uvicorn server on a daemon thread. When the server thread encounters an error, it setsis_running = Falsebut the thread and port binding persist. On the next re-auth attempt:start()seesis_running=Falseand skips the "already running" checkstart()returns"Port 8000 is already in use"— blocking all subsequent MCP calls until restartFix Details
Three targeted changes to
auth/oauth_callback_server.py:start()— thread-alive check beforeis_runningflag: Checksserver_thread.is_alive()first, which catches the stale-flag scenario. If the thread is alive and the port responds, reuses the server.start()— bind failure recovery: When the socket bind fails withEADDRINUSE, probesis_actually_running()before giving up. If the port is responding (our own server), marks it as running and succeeds.stop()— unconditional cleanup: Removed theif not self.is_running: returnguard sostop()always signalsshould_exitand joins the thread, preventing zombie server threads.Test Plan
test_start_reuses_server_when_thread_alive_and_port_responding— reproduces the exact re-auth bug scenariotest_start_recovers_when_port_in_use_by_own_server— covers the bind-failure recovery pathtest_stop_cleans_up_even_when_is_running_false— verifies stop() works with stale flag🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests