| 1 | +# Pyth MCP Code Mode Adoption Plan |
| 2 | + |
| 3 | +## Decision |
| 4 | + |
| 5 | +Adopt Code Mode and host on Cloudflare Workers by default. |
| 6 | + |
| 7 | +This is the shortest path to: |
| 8 | +- Stable `search` + `execute` tool surface as APIs evolve |
| 9 | +- Strong token security via server-side injection |
| 10 | +- Full observability with low operational overhead |
| 11 | + |
| 12 | +## Why This Direction |
| 13 | + |
| 14 | +- Code Mode keeps MCP tool count fixed while backend APIs grow. |
| 15 | +- Server-side token injection avoids passing `access_token` in model-visible calls. |
| 16 | +- Cloudflare execution sandboxing is production-ready for generated code workloads. |
| 17 | +- Cloudflare access controls and centralized logging align with security requirements. |
| 18 | + |
| 19 | +## Architecture (Target) |
| 20 | + |
| 21 | +- Expose Code Mode tools: |
| 22 | + - `search` |
| 23 | + - `execute` |
| 24 | +- Keep existing traditional tools as fallback during rollout. |
| 25 | +- Route `get_latest_price` through a wrapper that injects a server-managed token. |
| 26 | +- Never expose the token in tool schema, prompt context, or user-provided arguments. |
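A minimal sketch of the token-injecting wrapper described above. Everything specific here is an assumption for illustration: the argument shape, the `Bearer` authorization scheme, the placeholder base URL, and the helper names `stripCredentialArgs` and `buildGetLatestPriceRequest` are hypothetical and are not taken from the Pyth API. The point it shows is that the token is attached server-side and any credential-like key in model-provided arguments is dropped before the request is built.

```typescript
// Hypothetical shapes: the real get_latest_price arguments and upstream
// endpoint are not specified in this plan.
type ModelArgs = Record<string, unknown>;

interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Argument names the model must never be able to set (assumed list).
const FORBIDDEN_ARG_NAMES = new Set(["access_token", "token", "authorization"]);

// Drop credential-like keys that arrive in model- or user-provided arguments.
function stripCredentialArgs(args: ModelArgs): ModelArgs {
  const clean: ModelArgs = {};
  for (const [key, value] of Object.entries(args)) {
    if (!FORBIDDEN_ARG_NAMES.has(key.toLowerCase())) {
      clean[key] = value;
    }
  }
  return clean;
}

// Build the upstream request on the server. The token comes from server
// configuration only and is never part of the tool schema or arguments.
function buildGetLatestPriceRequest(
  args: ModelArgs,
  serverToken: string,
  baseUrl: string = "https://upstream.example.invalid", // placeholder, not a real endpoint
): UpstreamRequest {
  return {
    url: `${baseUrl}/get_latest_price`,
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${serverToken}`, // assumed auth scheme
    },
    body: JSON.stringify(stripCredentialArgs(args)),
  };
}
```

Keeping request construction as a pure function makes the "token injection" test in the rollout plan straightforward: assert on the returned headers and body without any network call.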
| 27 | + |
| 28 | +## Hosting Recommendation |
| 29 | + |
| 30 | +Primary: |
| 31 | +- Cloudflare Workers + Dynamic Worker Loader |
| 32 | + |
| 33 | +Fallback (only if Cloudflare is not allowed): |
| 34 | +- Kubernetes + Node + `isolated-vm` + OpenTelemetry |
| 35 | + |
| 36 | +## Security Requirements |
| 37 | + |
| 38 | +- Use one server-managed Pyth Pro token from a secret manager. |
| 39 | +- Inject the token only inside the execution boundary. |
| 40 | +- Block outbound network access from generated code, except via the approved tool proxy path. |
| 41 | +- Enforce per-request timeouts and rate limits. |
| 42 | +- Redact secrets from all logs and error payloads. |
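The redaction requirement above could be sketched roughly as follows. This is an illustrative helper, not an existing implementation: `redactSecrets` and the `[REDACTED]` marker are hypothetical names, and the approach (literal string replacement of known secret values across a nested payload) is one assumption among several possible designs.

```typescript
const REDACTED = "[REDACTED]"; // assumed marker text

// Recursively replace every occurrence of a known secret value in a log or
// error payload. Only exact literal matches are handled in this sketch.
function redactSecrets(value: unknown, secrets: string[]): unknown {
  if (typeof value === "string") {
    let out = value;
    for (const secret of secrets) {
      // Skip empty strings: splitting on "" would corrupt the text.
      if (secret.length > 0) {
        out = out.split(secret).join(REDACTED);
      }
    }
    return out;
  }
  if (Array.isArray(value)) {
    return value.map((item) => redactSecrets(item, secrets));
  }
  if (value instanceof Error) {
    // Errors do not serialize usefully by default, so keep name and message.
    return { name: value.name, message: redactSecrets(value.message, secrets) };
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = redactSecrets(inner, secrets);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through unchanged
}
```

Applying this at the single point where logs and error payloads leave the process is easier to audit than redacting at each call site. It does not catch encoded or partially truncated secrets, which would need separate handling.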
| 43 | + |
| 44 | +## Observability Requirements |
| 45 | + |
| 46 | +- Traces: |
| 47 | + - MCP request span |
| 48 | + - code execution span |
| 49 | + - upstream API spans |
| 50 | +- Metrics: |
| 51 | + - execution latency (p50/p95/p99) |
| 52 | + - sandbox timeout/error rates |
| 53 | + - upstream error rates |
| 54 | + - tool calls per execution |
| 55 | + - response size |
| 56 | +- Structured logs: |
| 57 | + - `requestId`, `sessionId`, `clientName`, `toolsCalled`, `executionTimeMs` |
| 58 | +- Dashboards: |
| 59 | + - reliability |
| 60 | + - security events |
| 61 | + - Code Mode adoption and efficiency |
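As a rough illustration of the structured log fields and latency percentiles listed above, the sketch below emits one JSON line per execution and computes p50/p95/p99 with the nearest-rank method. The field names come from the plan; the field types, the helper names `formatLogLine` and `percentile`, and the choice of nearest-rank are assumptions. A real deployment would more likely rely on its metrics backend for percentiles.

```typescript
// Field names are taken from the plan; the types are assumed.
interface ExecutionLog {
  requestId: string;
  sessionId: string;
  clientName: string;
  toolsCalled: string[];
  executionTimeMs: number;
}

// Serialize one execution as a single JSON line for centralized logging.
function formatLogLine(entry: ExecutionLog): string {
  return JSON.stringify(entry);
}

// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile requires at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.min(sorted.length, Math.max(1, Math.ceil((p / 100) * sorted.length)));
  return sorted[rank - 1] as number;
}

// Summarize execution latency for the reliability dashboard.
function latencySummary(samples: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

Emitting a fixed, flat set of keys per execution keeps log queries simple and makes it easy to check that no token-bearing field is ever logged.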
| 62 | + |
| 63 | +## Rollout Plan |
| 64 | + |
| 65 | +1. Add feature flag: `ENABLE_CODE_MODE`. |
| 66 | +2. Implement `search` and `execute` with the token-injecting wrapper. |
| 67 | +3. Add tests for: |
| 68 | + - token injection |
| 69 | + - sandbox timeout/network blocking |
| 70 | + - multi-step one-roundtrip execution |
| 71 | +4. Launch internal beta with fallback tools enabled. |
| 72 | +5. Make Code Mode default after stability and observability targets are met. |
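The flag-gated rollout in steps 1 and 4 might look something like the sketch below. The flag name `ENABLE_CODE_MODE` and the tool names come from the plan; the accepted flag values (`"true"` or `"1"`), the shape of `env`, and the helper names are assumptions. `get_latest_price` stands in for the traditional tool set because it is the only traditional tool the plan names.

```typescript
const CODE_MODE_TOOLS = ["search", "execute"];
// Only get_latest_price is named in the plan; other traditional tools would be listed here.
const TRADITIONAL_TOOLS = ["get_latest_price"];

type Env = Record<string, string | undefined>;

// Treat "true" or "1" (case-insensitive, trimmed) as enabled; anything else is off.
function isCodeModeEnabled(env: Env): boolean {
  const raw = (env.ENABLE_CODE_MODE ?? "").trim().toLowerCase();
  return raw === "true" || raw === "1";
}

// During rollout the traditional tools stay exposed either way, so clients
// that do not use Code Mode keep working.
function exposedTools(env: Env): string[] {
  return isCodeModeEnabled(env)
    ? [...CODE_MODE_TOOLS, ...TRADITIONAL_TOOLS]
    : [...TRADITIONAL_TOOLS];
}
```

Defaulting to off when the flag is missing or unrecognized means a misconfigured deployment falls back to the existing tool surface rather than exposing `execute` unintentionally.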
| 73 | + |
| 74 | +## Exit Criteria for Default-On |
| 75 | + |
| 76 | +- No token leakage in request/response/log pipelines. |
| 77 | +- Sandbox timeout and error rates within SLO. |
| 78 | +- Code Mode handles the majority of complex multi-step queries. |
| 79 | +- Traditional fallback remains available for client compatibility. |