26 changes: 24 additions & 2 deletions .github/workflows/release.yml
@@ -112,7 +112,16 @@ jobs:
           echo "Version: $VERSION"
 
       - name: Install Python dependencies into package directory
-        run: uv pip install --target ./package -r requirements.txt
+        # Pin --python-version and --python-platform so wheel resolution is
+        # independent of the runner's ambient interpreter and OS. Matches
+        # the uv path in scripts/deploy.sh and the python3.11 runtime pinned
+        # in terraform/aws/main.tf.
+        run: |
+          uv pip install -r requirements.txt \
+            --target ./package \
+            --python-platform x86_64-manylinux2014 \
+            --python-version 3.11 \
+            --no-compile
 
       - name: Copy application code into package
         run: |
@@ -122,9 +131,22 @@ jobs:
           cp examples/boston-opendata/config.yaml package/config.yaml
 
       - name: Create Lambda ZIP
+        # Use Python stdlib zipfile to match scripts/deploy.sh and
+        # .github/workflows/infra.yml — avoids depending on the `zip` binary.
         run: |
           cd package
-          zip -r ../opencontext-lambda-${{ steps.get_version.outputs.version }}.zip .
+          python - "../opencontext-lambda-${{ steps.get_version.outputs.version }}.zip" <<'PY'
+          import os
+          import sys
+          import zipfile
+
+          zip_path = sys.argv[1]
+          with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
+              for root, _, files in os.walk("."):
+                  for name in files:
+                      path = os.path.join(root, name)
+                      z.write(path, os.path.relpath(path, "."))
+          PY
 
       - name: Upload Lambda ZIP artifact
         uses: actions/upload-artifact@v4
4 changes: 4 additions & 0 deletions .gitignore
@@ -217,6 +217,10 @@ terraform/*/lambda-deployment.zip
 **/*.tfstate
 **/*.tfstate.*
 
+# Terraform backend config — contains the deployer's AWS account ID in the
+# S3 bucket name. Each fork ships its own. See terraform/aws/backend.tf.example.
+terraform/aws/backend.tf
+
 # OpenContext client binaries
 opencontext-client
 opencontext-client-*
72 changes: 72 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,72 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Build & Development Commands

```bash
# Install dependencies (uv preferred, pip fallback)
uv sync # or: pip install -r requirements.txt

# Run local MCP server (no Lambda needed)
python3 scripts/local_server.py # Serves on http://localhost:8000/mcp
# Or: python3 local_server.py # Alternate entry point, serves on / and /mcp

# Validate config
python3 -c "from core.validators import load_and_validate_config; load_and_validate_config('config.yaml')"

# Tests
uv run pytest tests/ -n auto # All tests, parallel
uv run pytest tests/test_ckan_plugin.py -v # Single file
uv run pytest tests/test_ckan_plugin.py::TestClass::test_name -v # Single test
uv run pytest tests/ --cov=core --cov=plugins --cov-report=term-missing # With coverage (80% minimum)

# Linting (ruff)
uv run ruff check core/ plugins/ server/ tests/ # Check
uv run ruff check core/ plugins/ server/ tests/ --fix # Auto-fix
uv run ruff format core/ plugins/ server/ tests/ # Format

# Pre-commit hooks
pre-commit run --all-files

# Go client (requires Go 1.21+)
cd client && make build

# Deploy to AWS
./scripts/deploy.sh --environment staging
```

## Architecture

**Core rule: One Fork = One MCP Server.** Each deployment runs exactly ONE plugin. This is enforced at config validation time (`core/validators.py`) and at runtime (`PluginManager.load_plugins()`). To deploy multiple MCP servers, fork the repo per plugin.

**Request flow:**
```
Claude (stdio) → Go client (client/) or stdio_bridge.py → HTTP POST /mcp
→ Lambda (server/adapters/aws_lambda.py) or local_server.py
→ server/http_handler.py → core/mcp_server.py (JSON-RPC 2.0)
→ core/plugin_manager.py → Plugin → External API
```
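
As a rough illustration of the first hop in the flow above, the sketch below builds a JSON-RPC 2.0 envelope and posts it to the local server. The URL comes from the commands section; the helper names `build_request` and `post_mcp`, and the assumption that the server answers a plain HTTP POST with a single JSON body, are illustrative and not taken from the repository.

```python
import json
import urllib.request

MCP_URL = "http://localhost:8000/mcp"  # local server URL from the commands section


def build_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        request["params"] = params
    return request


def post_mcp(request, url=MCP_URL, timeout=10):
    """POST the envelope as JSON and return the decoded response body.

    Assumes the server replies with one JSON document; not verified
    against the actual handler.
    """
    body = json.dumps(request).encode("utf-8")
    http_request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(http_request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the local server running, `post_mcp(build_request("tools/list"))` would be the smallest end-to-end check of the path.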

**Key modules:**
- `core/interfaces.py` — Abstract bases: `MCPPlugin`, `DataPlugin`, plus `ToolDefinition`, `ToolResult`, `PluginType` enum
- `core/plugin_manager.py` — Discovers plugins by scanning `plugins/` and `custom_plugins/` for `plugin.py` files. Registers tools with `pluginname__toolname` prefix. Routes `tools/call` to the correct plugin.
- `core/mcp_server.py` — Handles MCP JSON-RPC methods: `initialize`, `tools/list`, `tools/call`, `ping`
- `core/validators.py` — Loads config from `config.yaml` (local) or `OPENCONTEXT_CONFIG` env var (Lambda). Enforces single-plugin rule.
- `server/adapters/aws_lambda.py` — AWS Lambda entry point (handler: `server.adapters.aws_lambda.lambda_handler`). Also `server/lambda_handler.py` as legacy entry point.
- `server/http_handler.py` — Cloud-agnostic HTTP handler shared by Lambda and local server
- `stdio_bridge.py` — Python stdio-to-HTTP bridge for connecting Claude Desktop/Code to the local server (alternative to Go client)
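
The `pluginname__toolname` convention mentioned for `core/plugin_manager.py` can be sketched as a pair of helpers. This is a minimal illustration of the naming scheme only; the function names are hypothetical and the real plugin manager may validate or route differently.

```python
SEPARATOR = "__"  # separator from the `pluginname__toolname` convention


def qualify(plugin_name, tool_name):
    """Return the prefixed tool name exposed to MCP clients (hypothetical helper)."""
    return f"{plugin_name}{SEPARATOR}{tool_name}"


def split_qualified(qualified_name):
    """Split a prefixed tool name back into (plugin_name, tool_name).

    Splits at the first separator, so a bare tool name that itself
    contains "__" still round-trips.
    """
    plugin_name, sep, tool_name = qualified_name.partition(SEPARATOR)
    if not sep or not plugin_name or not tool_name:
        raise ValueError(f"not a prefixed tool name: {qualified_name!r}")
    return plugin_name, tool_name
```

A router built on this would look up `plugin_name` among the loaded plugins and pass the bare `tool_name` to that plugin.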

**Built-in plugins** (`plugins/`): `ckan`, `arcgis`, `socrata` — each implements `DataPlugin` with `search_datasets`, `get_dataset`, `query_data`. Custom plugins go in `custom_plugins/` and are auto-discovered.

## Plugin Development

New plugins must implement `MCPPlugin` (or `DataPlugin` for data sources). Place in `custom_plugins/<name>/plugin.py`. The class must define `plugin_name`, `plugin_type`, `plugin_version` and implement `initialize()`, `shutdown()`, `get_tools()`, `execute_tool()`, `health_check()`. Tool names are auto-prefixed — return bare names from `get_tools()`.
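
A minimal sketch of the shape such a plugin takes is shown below. To stay self-contained it defines a local stand-in base class instead of importing `core.interfaces.MCPPlugin`, so the method signatures, return types, and the use of synchronous methods are all assumptions; check `core/interfaces.py` for the real contract before copying this.

```python
from abc import ABC, abstractmethod


class PluginBase(ABC):
    """Hypothetical stand-in for core.interfaces.MCPPlugin (signatures assumed)."""

    plugin_name = ""
    plugin_type = ""
    plugin_version = ""

    @abstractmethod
    def initialize(self): ...

    @abstractmethod
    def shutdown(self): ...

    @abstractmethod
    def get_tools(self): ...

    @abstractmethod
    def execute_tool(self, tool_name, arguments): ...

    @abstractmethod
    def health_check(self): ...


class EchoPlugin(PluginBase):
    """Toy plugin that returns its input; would live in custom_plugins/echo/plugin.py."""

    plugin_name = "echo"
    plugin_type = "custom"  # assumed value; the real code uses a PluginType enum
    plugin_version = "0.1.0"

    def __init__(self):
        self._ready = False

    def initialize(self):
        self._ready = True
        return True

    def shutdown(self):
        self._ready = False

    def get_tools(self):
        # Bare names only: the plugin manager adds the "echo__" prefix.
        return [{"name": "echo", "description": "Return the input text unchanged."}]

    def execute_tool(self, tool_name, arguments):
        if tool_name != "echo":
            raise ValueError(f"unknown tool: {tool_name}")
        return {"text": arguments.get("text", "")}

    def health_check(self):
        return self._ready
```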

## Configuration

Copy `config-example.yaml` to `config.yaml`. Enable exactly one plugin. Config supports `${ENV_VAR}` substitution. For Lambda, config is serialized to the `OPENCONTEXT_CONFIG` env var by Terraform.
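
The `${ENV_VAR}` substitution can be pictured with the sketch below, which walks an already-parsed config and expands placeholders in every string value. It is an illustration of the idea, not the repository's loader: the function name is hypothetical, and raising on an unset variable is an assumed policy.

```python
import os
import re

_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(value, env=None):
    """Recursively expand ${NAME} placeholders in strings, dicts, and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        def _replace(match):
            name = match.group(1)
            if name not in env:
                # Assumed policy: fail loudly rather than leave the placeholder.
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _ENV_PATTERN.sub(_replace, value)
    if isinstance(value, dict):
        return {key: substitute_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute_env(item, env) for item in value]
    return value  # numbers, booleans, None pass through untouched
```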

## CI

GitHub Actions (`.github/workflows/ci.yml`) runs ruff lint/format, pip-audit, pytest with coverage, and Go tests on push to main/develop and on PRs.
2 changes: 2 additions & 0 deletions README.md
@@ -45,6 +45,8 @@ See [Getting Started](docs/GETTING_STARTED.md) for full setup.
 | [Getting Started](docs/GETTING_STARTED.md) | Setup and usage |
 | [Architecture](docs/ARCHITECTURE.md) | System design and plugins |
 | [Deployment](docs/DEPLOYMENT.md) | AWS, Terraform, monitoring |
+| [AWS Deployment (Boston)](docs/AWS_DEPLOYMENT.md) | Boston fork: region, concurrency, domain, packaging |
+| [Security](docs/SECURITY.md) | SQL hardening, rate limits, upstream-portal protection |
 | [Testing](docs/TESTING.md) | Local testing (Terminal, Claude, MCP Inspector) |
 
 
29 changes: 28 additions & 1 deletion config.yaml
185 changes: 185 additions & 0 deletions docs/AWS_DEPLOYMENT.md
@@ -0,0 +1,185 @@
# AWS Deployment (Boston fork)

This document describes how the Boston-specific deployment of OpenContext is hosted on AWS, what changed relative to the upstream defaults, and how to operate the stack. It complements [DEPLOYMENT.md](DEPLOYMENT.md), which covers the upstream single-Lambda/API-Gateway architecture.

- **Public endpoint (prod):** `https://boston-data.codeforanchorage.org`
- **Upstream data source:** Boston CKAN portal at `https://data.boston.gov/`
- **Runtime:** AWS Lambda (Python 3.11) behind API Gateway, us-west-2

> **Design constraint:** this fork's top operational priority is **not overwhelming `data.boston.gov`**. It is a shared civic resource, not our infrastructure. Every defensive control below — reserved Lambda concurrency, API Gateway rate limits and daily quota, enforced `LIMIT` on SQL, clamped aggregation limits, body-size caps — exists to keep this MCP server from becoming the noisiest client on that portal. See [SECURITY.md §1](SECURITY.md#1-protecting-the-upstream-data-portal) for the full rationale.

---

## 1. What changed in this fork

The upstream deployment assumes a single-region (us-east-1) Lambda with a standard rate-limited API Gateway in front of it. This fork makes the following operational changes:

### 1.1 Region moved to us-west-2

Terraform variables and the deploy script default to `us-west-2`:

- `terraform/aws/prod.tfvars`, `terraform/aws/staging.tfvars`: `aws_region = "us-west-2"`
- `config.yaml`: `aws.region: "us-west-2"`

The move is for co-location with other Code for Anchorage infrastructure and has no functional effect on the Lambda. Cost numbers in [DEPLOYMENT.md](DEPLOYMENT.md#cost-us-east-1) still apply; us-west-2 pricing is effectively identical for Lambda and API Gateway.

### 1.2 Terraform backend extracted and renamed

The upstream `main.tf` hard-coded an `opencontext-terraform-state` bucket in us-east-1. This fork moves the backend into its own file so the bootstrap account+region+bucket are explicit, and renames the bucket to the convention used by `scripts/setup-backend.sh`:

`terraform/aws/backend.tf` (new file):

```hcl
terraform {
  backend "s3" {
    bucket         = "boston-opencontext-tfstate-<AWS_ACCOUNT_ID>-us-west-2"
    key            = "terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}
```

The actual `backend.tf` in this repo hardcodes the Code for Anchorage AWS account ID. Terraform cannot interpolate variables into a backend block, so the literal value has to live in the file. A DynamoDB table (`terraform-state-lock`) is used for state locking. Forked deployments should run `scripts/setup-backend.sh` to create both the bucket and the lock table, update the account ID in `backend.tf`, then run `terraform init` in `terraform/aws/`.

### 1.3 Reserved Lambda concurrency

A new `lambda_reserved_concurrency` variable caps the number of concurrent Lambda invocations. Default is **10**, set in both staging and prod `.tfvars`.

```hcl
# terraform/aws/variables.tf
variable "lambda_reserved_concurrency" {
  default = 10
}
```

This serves two purposes. The first is cost containment: a surprise traffic spike can't run the bill away. The second, more important one, is **protecting the upstream open-data portal**. Boston's CKAN portal is a shared civic resource; if a misbehaving client fans out into thousands of parallel SQL queries, reserved concurrency bounds how much of that load we can relay. See [SECURITY.md](SECURITY.md#3-upstream-portal-protection) for the full threat model.

Set to `-1` to disable the cap (fall back to the account-wide concurrency limit). Don't do this in prod without a reason.

### 1.4 API Gateway quota raised, rate limits unchanged

```
api_quota_limit = 3000 # was 1000 upstream (requests per day)
api_rate_limit = 5 # unchanged: sustained requests per second
api_burst_limit = 10 # unchanged: burst capacity in requests (token-bucket size)
```

The daily quota was raised to 3000 after staging traffic showed legitimate per-connector usage (tool discovery + a handful of queries per conversation) could brush against 1000/day for a single user. The per-second rate is kept low deliberately — see [SECURITY.md §2](SECURITY.md#2-rate-limiting-and-body-size).
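
API Gateway describes its throttling as a token bucket, so the interaction between the rate of 5 and the burst of 10 can be sketched as below. This is a simplified single-client model for intuition only: the class is hypothetical, time is passed in explicitly, and the daily quota is not modeled.

```python
class TokenBucket:
    """Simplified token bucket: `rate` tokens per second, at most `burst` stored."""

    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full, so an initial burst is allowed
        self.last = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is admitted."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the gateway would answer 429 here
```

In this model a client can send 10 requests at once, after which it is held to roughly 5 per second until the bucket refills.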

### 1.5 Custom domain

Prod now fronts the API Gateway with an ACM cert and the custom domain `boston-data.codeforanchorage.org`. Staging has no custom domain (`custom_domain = ""`) — use the raw API Gateway URL from `terraform output`.

### 1.6 Cross-platform, 3.11-pinned packaging

Both `scripts/deploy.sh` and `.github/workflows/release.yml` were updated so the Lambda ZIP matches the runtime regardless of the build host.

- Detects `python3` or falls back to `python` (Windows build hosts).
- Forces cp311 manylinux wheels on every dependency install:
  ```bash
  pip install -r requirements.txt -t ./package \
    --platform manylinux2014_x86_64 \
    --python-version 3.11 \
    --implementation cp \
    --abi cp311 \
    --only-binary :all: \
    --no-compile
  ```
  Without the pin, a build host running Python 3.14 will pull cp314 wheels that fail to import at Lambda cold start with a 502 `InternalServerErrorException`.
- Builds the ZIP with Python's stdlib `zipfile` module instead of the `zip` binary, which isn't present on every runner (notably the staging CI image and Windows).
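
One way to catch a mismatched build before it reaches Lambda is to read the `Tag:` lines that installed wheels record in their `*.dist-info/WHEEL` files. The sketch below is a hypothetical pre-upload check, not something the deploy script is known to do; it only inspects the interpreter tag and does not validate the platform tag.

```python
from pathlib import Path


def incompatible_wheel_tags(package_dir, wanted="cp311"):
    """Return (dist-info name, tag) pairs built for a different CPython version."""
    problems = []
    for wheel_file in sorted(Path(package_dir).glob("*.dist-info/WHEEL")):
        for line in wheel_file.read_text(encoding="utf-8").splitlines():
            if not line.startswith("Tag:"):
                continue
            tag = line.split(":", 1)[1].strip()
            parts = tag.split("-")
            if len(parts) != 3:
                continue  # not an interpreter-abi-platform triple; ignore
            interpreter, abi, _platform = parts
            # Pure-Python (py3) and stable-ABI (abi3) wheels are not tied to
            # one CPython minor version, so they are let through.
            if interpreter.startswith("cp") and abi != "abi3" and interpreter != wanted:
                problems.append((wheel_file.parent.name, tag))
    return problems
```

Calling `incompatible_wheel_tags("./package")` after the install step and failing the build on a non-empty result would surface a cp314 wheel at build time instead of at cold start.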

### 1.7 `local_server.py` serves both `/` and `/mcp`

The Claude Desktop stdio bridge posts to `/mcp`; some earlier testing tools post to `/`. The local dev server now accepts both so you can point Claude Desktop and MCP Inspector at the same endpoint without editing routes.

### 1.8 Concrete Boston CKAN `config.yaml`

Upstream `config.yaml` is a symlink to the DC ArcGIS example. This fork replaces it with a concrete CKAN config targeting `data.boston.gov`. ArcGIS is kept `enabled: false` in the file for reference (Boston's ArcGIS hub at `data-boston.hub.arcgis.com` returns 401 without auth; CKAN is the public entry point).

```yaml
plugins:
  ckan:
    enabled: true
    base_url: "https://data.boston.gov/"
    portal_url: "https://data.boston.gov/"
    city_name: "Boston"
    timeout: 120
  arcgis:
    enabled: false

---

## 2. Operator reference

### 2.1 First-time bootstrap

```bash
# 1. Create the state bucket + lock table (once per account/region)
export AWS_REGION=us-west-2
./scripts/setup-backend.sh

# 2. Initialize Terraform against the S3 backend
cd terraform/aws
terraform init
```

### 2.2 Deploying changes

The deploy script validates `config.yaml`, builds a cp311/manylinux Lambda ZIP, and runs `terraform apply`:

```bash
# Staging
./scripts/deploy.sh --environment staging

# Prod
./scripts/deploy.sh --environment prod
```

Under the hood:

1. Counts enabled plugins (must be exactly one — enforced by `core/validators.py`).
2. Builds `lambda-deployment.zip` with dependencies forced to cp311 manylinux wheels.
3. `terraform apply -var-file=<env>.tfvars` against `terraform/aws/`.
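
Step 1 can be sketched against the `plugins:` mapping shape used in this fork's `config.yaml`. This is an illustration of the rule rather than the code in `core/validators.py`; the function names are hypothetical and the config is assumed to be already parsed into a dict.

```python
def enabled_plugins(config):
    """Return the sorted names of plugins whose settings have enabled: true."""
    plugins = config.get("plugins") or {}
    return sorted(
        name
        for name, settings in plugins.items()
        if isinstance(settings, dict) and settings.get("enabled") is True
    )


def require_single_plugin(config):
    """Return the one enabled plugin name, or raise if the count is not exactly one."""
    enabled = enabled_plugins(config)
    if len(enabled) != 1:
        raise ValueError(
            f"exactly one plugin must be enabled, found {len(enabled)}: {enabled}"
        )
    return enabled[0]
```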

### 2.3 Environment configuration

| Variable | Staging | Prod |
| ------------------------------- | ---------------------------- | ------------------------------------------ |
| `lambda_name` | `boston-ckan-mcp-staging` | `boston-opencontext-mcp-prod` |
| `aws_region` | `us-west-2` | `us-west-2` |
| `lambda_memory` | 512 MB | 512 MB |
| `lambda_timeout` | 120 s | 120 s |
| `lambda_reserved_concurrency` | 10 | 10 |
| `api_quota_limit` | 3000 / day | 3000 / day |
| `api_rate_limit` / `api_burst_limit` | 5 req/s sustained / burst of 10 | 5 req/s sustained / burst of 10 |
| `custom_domain` | *(none)* | `boston-data.codeforanchorage.org` |

### 2.4 Getting the endpoint URL

```bash
cd terraform/aws
terraform output -raw api_gateway_url # Custom domain on prod, exec-api URL on staging
```

### 2.5 Monitoring

CloudWatch log group `/aws/lambda/<lambda_name>`, 14-day retention. Logs are JSON-structured (`logging.format: json` in `config.yaml`) and include a `request_id` field you can join against API Gateway access logs.

```bash
aws logs tail /aws/lambda/boston-opencontext-mcp-prod --follow --region us-west-2
```
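
Because the logs are JSON-structured, following one request through the output of `aws logs tail` is a small filtering job. The sketch below assumes each application log line contains one JSON object with a `request_id` key, possibly preceded by the timestamp and stream name that the CLI prints; the other field names are not known and the function name is hypothetical.

```python
import json


def filter_by_request_id(lines, request_id):
    """Return the parsed JSON log records whose request_id matches."""
    matches = []
    for line in lines:
        start = line.find("{")
        if start == -1:
            continue  # e.g. Lambda's START / END / REPORT lines
        try:
            record = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # truncated or non-JSON payload
        if isinstance(record, dict) and record.get("request_id") == request_id:
            matches.append(record)
    return matches
```

Piping the tail output into a script that calls this with `sys.stdin` would give a per-request view without leaving the terminal.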

### 2.6 Cost

Expected steady-state cost at current quota is well under \$5/month: at 3000 requests/day × 30 days × 512 MB × ~1 s, Lambda runs roughly \$1–2/month. API Gateway REST API adds ~\$3.50 per million requests; at 100k/month that is ~\$0.35. Route 53 hosted zone + ACM cert are the fixed floor (~\$0.50/month).
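
The arithmetic behind that estimate can be reproduced with the sketch below. The unit prices are assumptions based on published on-demand list prices for x86 Lambda (\$0.0000166667 per GB-second, \$0.20 per million requests), REST API Gateway (\$3.50 per million requests), and one Route 53 hosted zone (\$0.50/month); they ignore the free tier and may change.

```python
def monthly_cost(
    requests_per_day=3000,
    days=30,
    memory_mb=512,
    seconds_per_request=1.0,
    lambda_gb_second_usd=0.0000166667,  # assumed list price, x86
    lambda_per_million_usd=0.20,        # assumed list price
    api_gateway_per_million_usd=3.50,   # assumed list price, REST API
    fixed_usd=0.50,                     # assumed: one Route 53 hosted zone
):
    """Estimate monthly cost in USD when the daily quota is fully used."""
    requests = requests_per_day * days
    gb_seconds = requests * (memory_mb / 1024) * seconds_per_request
    lambda_usd = (
        gb_seconds * lambda_gb_second_usd
        + requests / 1_000_000 * lambda_per_million_usd
    )
    api_gateway_usd = requests / 1_000_000 * api_gateway_per_million_usd
    return {
        "requests": requests,
        "lambda_usd": lambda_usd,
        "api_gateway_usd": api_gateway_usd,
        "fixed_usd": fixed_usd,
        "total_usd": lambda_usd + api_gateway_usd + fixed_usd,
    }
```

With the defaults this gives 90,000 requests, about \$0.77 for Lambda, and about \$1.58 in total. That is somewhat below the rough Lambda figure quoted above and still well under \$5/month.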

---

## 3. Known limitations

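- **Single-region.** No cross-region failover. Lambda and API Gateway are regional services that AWS already spreads across Availability Zones, so the exposure is a regional outage rather than a single-AZ one. Fine for a civic-data read proxy; not for critical services.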
- **Reserved concurrency is a fuse, not a queue.** Beyond 10 in-flight requests, API Gateway returns 429. Clients must retry with backoff.
- **ArcGIS plugin is disabled.** Enabling it requires an authenticated portal; Boston's hub returns 401 without auth.
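
For the retry-with-backoff requirement above, a client-side loop might look like the sketch below. It is a generic exponential backoff with jitter, not code from this repository: `send` is a hypothetical callable returning `(status, body)`, and the delay parameters are arbitrary starting points.

```python
import random
import time


def call_with_backoff(
    send,
    max_attempts=5,
    base_delay=0.5,
    max_delay=8.0,
    sleep=time.sleep,
    rng=random.random,
):
    """Call send() until it returns a non-429 status or attempts run out."""
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; hand the last 429 back to the caller
        delay = min(max_delay, base_delay * (2 ** attempt))
        # Jitter: sleep for 50-100% of the computed delay so that many
        # clients throttled together do not retry in lockstep.
        sleep(delay * (0.5 + rng() / 2))
    return status, body
```

Keeping `max_attempts` small matters here: aggressive retries would work against the goal of limiting load on the upstream portal.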