Commit ebbbe94

5queezer and claude authored
feat: GCS-backed auth state persistence for Cloud Run (#9)
* feat(config): add StorageConfig for external auth-state storage

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(config): load AUTH_STORAGE_* env vars

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(storage): add StorageBackend protocol and LocalBackend

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(storage): add sync orchestration tests with in-memory backend

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(storage): add GCSBackend implementation

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: add [gcs] optional dependency for google-cloud-storage

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(startup): sync auth state from remote storage before auth check

* feat(login): sync auth state to remote storage after login

* feat(shutdown): sync auth state to remote storage after cookie export

* feat(logout): delete remote auth state when storage backend is configured

* fix: address code review findings for storage module

  - Guard against None bucket in get_storage_backend()
  - Replace assert with `or ""` fallback for username type narrowing
  - Skip storage sync when OAuth is enabled (no cookie auth needed)
  - Add missing delete_remote failure test

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add GCS auth storage design doc and implementation plan

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: include [gcs] extra in Docker image

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: run GCS sync even when OAuth is enabled

  Cloud Run needs both: OAuth protects the MCP endpoint, AND cookie-based auth is needed for LinkedIn scraping. The storage sync must run regardless of OAuth mode.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: skip profile dir check when remote storage provides auth

  When AUTH_STORAGE_BACKEND=gcs, only cookies.json and source-state.json are synced — the Chromium profile directory doesn't exist locally. Relax the startup check to only require source state + cookies when remote storage is configured, since Cloud Run always bridges from cookies as a foreign runtime.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 99774bd commit ebbbe94

16 files changed

Lines changed: 2001 additions & 3 deletions

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends git && rm -rf /
 # Set browser install location (Patchright reads PLAYWRIGHT_BROWSERS_PATH internally)
 ENV PLAYWRIGHT_BROWSERS_PATH=/opt/patchright
 # Install dependencies, system libs for Chromium, and patched Chromium binary
-RUN uv sync --frozen && \
+RUN uv sync --frozen --extra gcs && \
     uv run patchright install-deps chromium && \
     uv run patchright install chromium && \
     chmod -R 755 /opt/patchright

docs/design-gcs-auth-storage.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
# Design: GCS-backed auth state persistence

Resolves: [#7](https://github.com/5queezer/linkedin-mcp-server/issues/7)

## Problem

Cloud Run cold starts wipe the filesystem. The server loses `cookies.json` and `source-state.json`, making LinkedIn auth unrecoverable without manual `--login`.

## Key finding

LinkedIn OAuth does not cover messaging/inbox access. The existing browser-based auth (cookie extraction) is the only viable approach. The repo's OAuth support protects the MCP endpoint only — it is separate from LinkedIn auth.

## Solution

Persist portable auth artifacts (`cookies.json` + `source-state.json`) to Google Cloud Storage. Restore on startup, re-sync after login and on shutdown.

Cloud Run always runs as a **foreign runtime** — it bridges from cookies on every cold start. No full browser profile is persisted (avoids 50-200MB transfers and cross-platform issues).

## Configuration

```
AUTH_STORAGE_BACKEND=local|gcs        # default: local (no-op)
AUTH_STORAGE_GCS_BUCKET=my-bucket     # required when backend=gcs
AUTH_STORAGE_GCS_PREFIX=linkedin-mcp  # optional, default: empty
AUTH_STORAGE_USERNAME=williamhgates   # required when backend != local
```

GCS object layout:

```
gs://{bucket}/{prefix}/{username}/cookies.json
gs://{bucket}/{prefix}/{username}/source-state.json
```
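
As a rough illustration of the configuration and object layout above, the sketch below loads the four `AUTH_STORAGE_*` variables, applies the fail-fast rule for `backend=gcs`, and builds the object key under the bucket. The `StorageConfig` field names, `load_storage_config()`, and `remote_key()` are assumptions made for this example, not the shapes used in the commit.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StorageConfig:
    """Hypothetical config shape; fields mirror the AUTH_STORAGE_* variables."""

    backend: str = "local"
    gcs_bucket: Optional[str] = None
    gcs_prefix: str = ""
    username: Optional[str] = None

    def validate(self) -> None:
        """Reject unknown backends and incomplete GCS settings."""
        if self.backend not in ("local", "gcs"):
            raise ValueError(f"unknown AUTH_STORAGE_BACKEND: {self.backend!r}")
        if self.backend == "gcs":
            required = {
                "AUTH_STORAGE_GCS_BUCKET": self.gcs_bucket,
                "AUTH_STORAGE_USERNAME": self.username,
            }
            missing = [name for name, value in required.items() if not value]
            if missing:
                raise ValueError("missing for backend=gcs: " + ", ".join(missing))

    def remote_key(self, filename: str) -> str:
        """Build {prefix}/{username}/{filename}, skipping an empty prefix."""
        parts = [self.gcs_prefix.strip("/"), self.username or "", filename]
        return "/".join(part for part in parts if part)


def load_storage_config(env: Optional[Mapping[str, str]] = None) -> StorageConfig:
    """Read the environment (or a supplied mapping) and validate the result."""
    source = os.environ if env is None else env
    config = StorageConfig(
        backend=source.get("AUTH_STORAGE_BACKEND", "local").strip().lower(),
        gcs_bucket=source.get("AUTH_STORAGE_GCS_BUCKET") or None,
        gcs_prefix=source.get("AUTH_STORAGE_GCS_PREFIX", ""),
        username=source.get("AUTH_STORAGE_USERNAME") or None,
    )
    config.validate()
    return config
```

Passing a plain dict instead of `os.environ` keeps the validation testable without touching the process environment.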

## Architecture

### StorageBackend protocol

```python
class StorageBackend(Protocol):
    def download(self, remote_key: str, local_path: Path) -> bool: ...
    def upload(self, local_path: Path, remote_key: str) -> bool: ...
    def delete(self, remote_key: str) -> bool: ...
```

Implementations:
- `LocalBackend`: no-op, all methods return `True`
- `GCSBackend`: uses `google-cloud-storage`, authenticates via ADC (automatic on Cloud Run)
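
A minimal sketch of the no-op implementation, written against the protocol as stated above. The `@runtime_checkable` decorator is an addition for this example only, so that the structural match can be checked with `isinstance()`; the design does not say the real protocol carries it.

```python
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable  # assumption: added here only to allow the isinstance() check
class StorageBackend(Protocol):
    def download(self, remote_key: str, local_path: Path) -> bool: ...
    def upload(self, local_path: Path, remote_key: str) -> bool: ...
    def delete(self, remote_key: str) -> bool: ...


class LocalBackend:
    """No-op backend: auth files already live on local disk."""

    def download(self, remote_key: str, local_path: Path) -> bool:
        return True  # nothing to fetch

    def upload(self, local_path: Path, remote_key: str) -> bool:
        return True  # nothing to push

    def delete(self, remote_key: str) -> bool:
        return True  # nothing to remove


# LocalBackend never inherits from StorageBackend; matching method names suffice.
backend: StorageBackend = LocalBackend()
```

Because the protocol is structural, a test double or a future backend only needs the same three methods.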

### Package structure

```
linkedin_mcp_server/
├── storage/
│   ├── __init__.py     # exports public API
│   ├── backend.py      # StorageBackend protocol, LocalBackend, StorageSyncError
│   └── gcs.py          # GCSBackend (lazy import)
```

`google-cloud-storage` is an optional `[gcs]` extra dependency.
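
One way the lazy import could look, sketched under assumptions: `load_gcs_module()` and the constructor signature are hypothetical, and the error-handling policy (return `False` only for a missing object, let API errors propagate) is a guess rather than the committed behavior. The `google-cloud-storage` calls used (`Client()`, `bucket()`, `blob()`, `exists()`, `download_to_filename()`, `upload_from_filename()`, `delete()`) are part of that library's public API.

```python
import importlib
from pathlib import Path
from types import ModuleType


class StorageSyncError(RuntimeError):
    """Raised when remote auth state cannot be reached or restored."""


def load_gcs_module() -> ModuleType:
    """Import google.cloud.storage only when the gcs backend is selected."""
    try:
        return importlib.import_module("google.cloud.storage")
    except ImportError as exc:
        raise StorageSyncError(
            "AUTH_STORAGE_BACKEND=gcs needs google-cloud-storage; "
            "install the optional [gcs] extra"
        ) from exc


class GCSBackend:
    """Sketch of a GCS-backed StorageBackend (hypothetical constructor)."""

    def __init__(self, bucket: str) -> None:
        storage = load_gcs_module()
        # Client() without arguments resolves Application Default Credentials.
        self._bucket = storage.Client().bucket(bucket)

    def download(self, remote_key: str, local_path: Path) -> bool:
        blob = self._bucket.blob(remote_key)
        if not blob.exists():
            return False  # assumption: a missing object is reported, not raised
        local_path.parent.mkdir(parents=True, exist_ok=True)
        blob.download_to_filename(str(local_path))
        return True

    def upload(self, local_path: Path, remote_key: str) -> bool:
        self._bucket.blob(remote_key).upload_from_filename(str(local_path))
        return True

    def delete(self, remote_key: str) -> bool:
        # Raises google.api_core.exceptions.NotFound if the object is absent.
        self._bucket.blob(remote_key).delete()
        return True
```

Deferring the import to construction time means a `local` deployment never touches the library, which is what lets the dependency stay optional.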

### Sync operations

| Function | When called | Behavior on failure |
|----------|-------------|---------------------|
| `sync_from_remote()` | Startup, before auth validation | **Fail hard** — raise StorageSyncError |
| `sync_to_remote()` | After `--login`, on `close_browser()` shutdown | **Best-effort** — log warning, don't crash |
| `delete_remote()` | During `--logout` | Log warning on failure |
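
The different failure policies in the table can be sketched as follows. The function names come from the design, but the parameters (`backend`, `key_prefix`, `auth_dir`) and the boolean return values are assumptions chosen for illustration; the real signatures are not shown in this document.

```python
import logging
from pathlib import Path
from typing import Protocol

logger = logging.getLogger(__name__)

# The two portable artifacts named in the design.
AUTH_FILES = ("cookies.json", "source-state.json")


class StorageSyncError(RuntimeError):
    """Raised when auth state cannot be restored at startup."""


class StorageBackend(Protocol):
    def download(self, remote_key: str, local_path: Path) -> bool: ...
    def upload(self, local_path: Path, remote_key: str) -> bool: ...
    def delete(self, remote_key: str) -> bool: ...


def sync_from_remote(backend: StorageBackend, key_prefix: str, auth_dir: Path) -> None:
    """Startup path: the first failed download aborts startup."""
    for name in AUTH_FILES:
        if not backend.download(f"{key_prefix}/{name}", auth_dir / name):
            raise StorageSyncError(f"failed to restore {name} from remote storage")


def sync_to_remote(backend: StorageBackend, key_prefix: str, auth_dir: Path) -> bool:
    """Login/shutdown path: log failures, never raise."""
    all_ok = True
    for name in AUTH_FILES:
        try:
            ok = backend.upload(auth_dir / name, f"{key_prefix}/{name}")
        except Exception as exc:  # best-effort: swallow backend errors
            logger.warning("upload of %s raised: %s", name, exc)
            ok = False
        else:
            if not ok:
                logger.warning("upload of %s reported failure", name)
        all_ok = all_ok and ok
    return all_ok


def delete_remote(backend: StorageBackend, key_prefix: str) -> bool:
    """Logout path: attempt every delete and warn on each failure."""
    all_ok = True
    for name in AUTH_FILES:
        if not backend.delete(f"{key_prefix}/{name}"):
            logger.warning("could not delete remote %s", name)
            all_ok = False
    return all_ok
```

Only the startup path raises, because remote state is treated as the source of truth there; the other two paths run when the process is about to exit or the user is logging out, where a crash would help nobody.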

### Hook points

**`cli_main.py` — startup:** Call `sync_from_remote()` before `ensure_authentication_ready()`.

**`setup.py` — post-login:** Call `sync_to_remote()` after `write_source_state()` and `export_cookies()`.

**`browser.py` — shutdown:** Call `sync_to_remote()` after `export_cookies()` in `close_browser()`.

## Constraints

- 10s Cloud Run shutdown grace period is sufficient for cookie export + KB-sized GCS upload
- GCS default encryption (Google-managed AES-256) — no KMS
- `AUTH_STORAGE_USERNAME` env var required because at cold start there is no local state to extract a username from
- Config validation fails fast if `backend=gcs` but `gcs_bucket` or `username` is missing

## Testing

- Unit tests with mock `StorageBackend` (in-memory dict)
- Test sync_from raises on download failure
- Test sync_to logs but doesn't raise on upload failure
- Test config validation rejects incomplete GCS config
- No live GCS tests in CI
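
The in-memory test double mentioned in the list above might look like the sketch below. `InMemoryBackend`, its `fail` switch, and the two pytest-style test functions are hypothetical names for this example; `tmp_path` is pytest's built-in temporary-directory fixture.

```python
from pathlib import Path


class InMemoryBackend:
    """Dict-backed stand-in for a StorageBackend; `fail` forces every call to fail."""

    def __init__(self, fail: bool = False) -> None:
        self.objects: dict[str, bytes] = {}
        self.fail = fail

    def download(self, remote_key: str, local_path: Path) -> bool:
        if self.fail or remote_key not in self.objects:
            return False
        local_path.parent.mkdir(parents=True, exist_ok=True)
        local_path.write_bytes(self.objects[remote_key])
        return True

    def upload(self, local_path: Path, remote_key: str) -> bool:
        if self.fail or not local_path.is_file():
            return False
        self.objects[remote_key] = local_path.read_bytes()
        return True

    def delete(self, remote_key: str) -> bool:
        if self.fail:
            return False
        self.objects.pop(remote_key, None)
        return True


def test_round_trip(tmp_path: Path) -> None:
    """Uploaded bytes come back unchanged after a simulated cold start."""
    backend = InMemoryBackend()
    source = tmp_path / "cookies.json"
    source.write_text('{"li_at": "example"}')
    assert backend.upload(source, "prefix/user/cookies.json")

    restored = tmp_path / "restored" / "cookies.json"
    assert backend.download("prefix/user/cookies.json", restored)
    assert restored.read_text() == source.read_text()


def test_failures_return_false(tmp_path: Path) -> None:
    """A failing backend reports False instead of raising."""
    backend = InMemoryBackend(fail=True)
    source = tmp_path / "source-state.json"
    source.write_text("{}")
    assert backend.upload(source, "k") is False
    assert backend.download("k", tmp_path / "out.json") is False
    assert backend.delete("k") is False
```

Keeping the double free of network and credential concerns is what makes the "no live GCS tests in CI" rule workable.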

## Decision log

| # | Decision | Alternatives | Rationale |
|---|----------|-------------|-----------|
| 1 | Portable artifacts only | Full auth root, +derived snapshot | KB transfers, no cross-platform issues, bridge path works |
| 2 | StorageBackend protocol | No abstraction, lifecycle manager | Low overhead, extensible, testable via mock |
| 3 | Optional `[gcs]` extra | Required dependency | Keep base package lightweight |
| 4 | Sync on shutdown + after login | +periodic timer | Covers both flows without background complexity |
| 5 | Fail startup if GCS download fails | Fall back to local | Remote state is source of truth when configured |
| 6 | Best-effort upload on shutdown | Fail hard | Don't crash server over transient GCS error |
| 7 | Username via env var | Extract from cookies | No local state at cold start to extract from |
| 8 | User-keyed by LinkedIn username | SHA256 hash | Human-readable, auditable |
| 9 | GCS default encryption | Customer-managed KMS | Sufficient, no extra config |
| 10 | 10s shutdown grace | 30s | KB upload well within window |
