Boston fork: security hardening, upstream-portal protection, AWS hosting docs #1

Open
brendanbabb wants to merge 4 commits into main from boston/security-hardening-and-aws-docs

Summary

Three logical changes, committed separately so they can be reviewed
independently:

  1. Infra: move to us-west-2, add reserved Lambda concurrency as a hard
    cap on fan-out into the upstream CKAN portal, pin Lambda packaging to
    cp311/manylinux wheels so the ZIP imports correctly regardless of the
    build host's Python version, and extract the S3 backend config so the
    deployer's AWS account ID does not live in the public repo.
  2. Security: rewrite the SQL validator and the aggregate_data SQL
    builder to be allowlist-only and AST-aware, so that caller-supplied
    strings cannot escape into the generated query; cap HTTP request bodies
    at 64 KB with a 413 response before JSON parsing.
  3. Docs: docs/AWS_DEPLOYMENT.md and docs/SECURITY.md describe
    the above for future operators and for anyone forking this to run
    against another open-data portal. A Python stdio_bridge.py and a
    CLAUDE.md for repo guidance come along for the ride.
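
For reviewers, the body-size cap in item 2 can be pictured roughly as follows. This is a minimal sketch, not the code in server/http_handler.py: the names `MAX_BODY_BYTES` and `handle_request` and the error payloads are assumptions made for illustration. The point it shows is that the length check runs before any JSON parsing.

```python
import json

# Assumed constant name; the 64 KB figure comes from the summary above.
MAX_BODY_BYTES = 64 * 1024


def handle_request(raw_body: bytes) -> tuple[int, dict]:
    """Return (status, payload); oversized bodies are rejected before parsing."""
    if len(raw_body) > MAX_BODY_BYTES:
        # Cheap length check first, so a huge payload never reaches json.loads.
        return 413, {"error": "Request body too large"}
    try:
        message = json.loads(raw_body)
    except ValueError:
        # Covers json.JSONDecodeError and invalid UTF-8 (UnicodeDecodeError).
        return 400, {"error": "Invalid JSON"}
    method = message.get("method") if isinstance(message, dict) else None
    return 200, {"received": method}
```

In this sketch a body of exactly `MAX_BODY_BYTES` bytes is still parsed; only larger bodies get the 413.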

Why

The top design constraint is not overwhelming the upstream open-data
portal. data.boston.gov is a shared civic resource, and an MCP server
in front of it can easily become its noisiest client — one Claude
conversation can translate into dozens of SQL queries, each hitting CKAN's
DataStore. Four overlapping layers protect the portal:

  • Reserved Lambda concurrency (default 10): even if a client bypasses
    the API Gateway rate limit via the Lambda Function URL, they cannot drive
    more than 10 parallel upstream queries.
  • API Gateway rate limit (5 rps sustained / 10 burst) and daily quota
    (3000 requests/day per API key).
  • Enforced LIMIT on execute_sql: SQLValidator.enforce_row_limit
    appends LIMIT 10000 to any validated SQL lacking a top-level LIMIT.
  • Clamped LIMIT on aggregate_data: SafeSQLBuilder.clamp_limit
    enforces MAX_LIMIT = 10000.
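
The two LIMIT layers can be illustrated with a small sketch. It is a simplification under stated assumptions: the real validator walks the sqlparse AST, whereas this version tracks parenthesis depth and quotes by hand. Stripping a trailing semicolon and the lower bound of 1 in `clamp_limit` are illustrative choices, not documented behavior.

```python
import re

MAX_LIMIT = 10000  # value taken from the PR description
LIMIT_RE = re.compile(r"limit\b", re.IGNORECASE)


def has_top_level_limit(sql: str) -> bool:
    """True if a LIMIT keyword appears outside parentheses and quotes."""
    depth = 0
    quote = None  # the active quote character while inside a literal/identifier
    for i, ch in enumerate(sql):
        if quote:
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and LIMIT_RE.match(sql, i):
            # Require a word boundary on the left too (so "xlimit" is ignored).
            if i == 0 or not (sql[i - 1].isalnum() or sql[i - 1] == "_"):
                return True
    return False


def enforce_row_limit(sql: str) -> str:
    """Append LIMIT MAX_LIMIT when the outermost query has no LIMIT."""
    stripped = sql.rstrip().rstrip(";").rstrip()  # assumption: drop a trailing ';'
    if has_top_level_limit(stripped):
        return stripped
    return f"{stripped} LIMIT {MAX_LIMIT}"


def clamp_limit(requested: int) -> int:
    """Clamp a caller-supplied limit into [1, MAX_LIMIT]."""
    return max(1, min(int(requested), MAX_LIMIT))
```

A LIMIT inside a subquery does not count as top-level, so `SELECT * FROM (SELECT ... LIMIT 5) t` still gets the outer cap appended.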

The SQL surface was also tightened against injection and DoS-via-expensive-
query regardless of portal load:

  • Comments are stripped before scanning (defeats /* */ obfuscation).
  • Forbidden keyword list expanded (PREPARE, COPY, LISTEN,
    NOTIFY, VACUUM, ANALYZE, CLUSTER, REINDEX, LOAD,
    DO).
  • New forbidden-function list (xp_cmdshell, pg_sleep, pg_read_file,
    pg_ls_dir, pg_stat_file, lo_import, lo_export,
    current_setting, set_config, dblink).
  • FROM/JOIN targets are AST-validated — must be a UUID-quoted
    resource or a CTE alias; schema-qualified targets like
    pg_catalog.pg_class are rejected.
  • aggregate_data path no longer builds SQL by string concatenation —
    every identifier, metric expression, filter value, and LIMIT goes through
    SafeSQLBuilder allowlist validation.
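
The first three bullets can be approximated as below. This is a hedged sketch, not the contents of plugins/ckan/sql_validator.py: `strip_comments` and `find_violations` are hypothetical names, and the keyword set shows only the additions listed above plus DROP (inferred from the test plan), since the pre-existing list is not reproduced in this PR.

```python
import re

# Additions named in this PR, plus DROP from the test plan (assumed pre-existing).
FORBIDDEN_KEYWORDS = {
    "DROP", "PREPARE", "COPY", "LISTEN", "NOTIFY", "VACUUM",
    "ANALYZE", "CLUSTER", "REINDEX", "LOAD", "DO",
}
FORBIDDEN_FUNCTIONS = {
    "xp_cmdshell", "pg_sleep", "pg_read_file", "pg_ls_dir", "pg_stat_file",
    "lo_import", "lo_export", "current_setting", "set_config", "dblink",
}
# Block comments (non-greedy, may span lines) and line comments.
COMMENT_RE = re.compile(r"/\*.*?\*/|--[^\n]*", re.DOTALL)


def strip_comments(sql: str) -> str:
    """Replace each comment with a space, as the SQL lexer would treat it."""
    return COMMENT_RE.sub(" ", sql)


def find_violations(sql: str) -> list[str]:
    """Return forbidden keywords, then forbidden function calls, found in sql."""
    cleaned = strip_comments(sql)
    words = set(re.findall(r"[A-Za-z_]\w*", cleaned.upper()))
    hits = sorted(words & FORBIDDEN_KEYWORDS)
    lowered = cleaned.lower()
    for fn in sorted(FORBIDDEN_FUNCTIONS):
        # A function counts only when followed by an opening parenthesis.
        if re.search(rf"\b{fn}\s*\(", lowered):
            hits.append(fn)
    return hits
```

The scan is deliberately conservative: a column named `do`, or a string literal containing a forbidden word, would also be flagged in this sketch. Stripping comments first is what lets `pg_sleep/**/(1)` be caught by the function pattern.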

Privacy posture

Stateless: no database, no accounts, no sessions. CloudWatch logs retain
request_id, method/path, duration, status, and truncated SQL (500 chars)
for 14 days. All data returned is public open data from data.boston.gov.
See docs/SECURITY.md §4 for the full rationale.

Test plan

  • CI green on the PR (ruff, pip-audit, pytest, Go tests)
  • ./scripts/deploy.sh --environment staging produces a cp311 ZIP
    that cold-starts without 502
  • terraform plan in terraform/aws/ shows
    reserved_concurrent_executions = 10 as the only Lambda diff
    for an already-deployed stack
  • Manual smoke: oversized body returns 413; DROP TABLE rejected;
    comment-obfuscated DROP rejected; schema-qualified FROM rejected
  • SELECT * FROM "<uuid>" without an explicit LIMIT is automatically
    clamped to 10000 rows

🤖 Generated with Claude Code

brendanbabb and others added 4 commits April 16, 2026 13:01
… pin

Move the deployment to us-west-2, add reserved Lambda concurrency as the
primary brake on fan-out into the upstream CKAN portal, and pin Lambda
packaging to cp311/manylinux wheels so the ZIP works regardless of build
host Python version.

- terraform/aws: add lambda_reserved_concurrency (default 10) wired to
  aws_lambda_function.reserved_concurrent_executions. Extract the S3
  backend config out of main.tf; real backend.tf is gitignored because
  the bucket name embeds the deployer's AWS account ID. Ship
  backend.tf.example as the template.
- prod/staging tfvars: aws_region=us-west-2, api_quota_limit=3000
  (was 1000), lambda_reserved_concurrency=10. Prod custom domain is
  boston-data.codeforanchorage.org; staging has no custom domain.
- scripts/deploy.sh + .github/workflows/release.yml: force cp311
  manylinux wheel resolution on every pip/uv install (without this,
  a Python 3.14 build host produces a ZIP that 502s at Lambda cold
  start). Detect python3/python cross-platform. Build the ZIP with
  stdlib zipfile instead of the `zip` binary so the packaging step
  works on CI images and Windows.
- scripts/setup-backend.sh: fix malformed bucket name
  (boston-opencontext-opendataterraform-state-... → boston-opencontext-
  tfstate-...).
- config.yaml: replace symlink-to-example with a concrete Boston CKAN
  config targeting data.boston.gov. ArcGIS kept disabled for reference.
- local_server.py: accept POSTs on both / and /mcp so the same local
  server works with Claude Desktop stdio bridges and MCP Inspector.
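
A minimal sketch of the stdlib-zipfile packaging step, assuming the dependencies and handler have already been installed into a staging directory. The function name `build_lambda_zip` and its arguments are hypothetical and do not come from scripts/deploy.sh.

```python
import zipfile
from pathlib import Path


def build_lambda_zip(package_dir: Path, zip_path: Path) -> int:
    """Zip every file under package_dir; return the number of files added."""
    count = 0
    target = zip_path.resolve()
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(package_dir.rglob("*")):
            # Skip directories and the archive itself if it lives in package_dir.
            if not path.is_file() or path.resolve() == target:
                continue
            # Forward-slash archive names keep the ZIP importable when built on Windows.
            zf.write(path, arcname=path.relative_to(package_dir).as_posix())
            count += 1
    return count
```

Using `zipfile` avoids depending on a `zip` binary, which is the portability point made in the commit message.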

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…size

Tighten the two attack surfaces that directly forward user-controlled
input into upstream CKAN: the execute_sql path and the aggregate_data
path. Add a request-body-size cap on the HTTP handler to bound the
work a single JSON-RPC call can cost. See docs/SECURITY.md for the
full threat model.

- plugins/ckan/sql_validator.py
  * SQLValidator: shrink MAX_SQL_LENGTH 50000 → 8192; strip SQL
    comments before keyword/function scans so /* ... */ and -- ...
    obfuscation can't smuggle forbidden tokens past the checks;
    expand FORBIDDEN_KEYWORDS with PREPARE/COPY/LISTEN/NOTIFY/VACUUM/
    ANALYZE/CLUSTER/REINDEX/LOAD/DO; add FORBIDDEN_FUNCTIONS
    (xp_cmdshell, pg_sleep, pg_read_file, pg_ls_dir, pg_stat_file,
    lo_import, lo_export, current_setting, set_config, dblink);
    walk the sqlparse AST to require every FROM/JOIN target to be a
    UUID-quoted resource or a CTE alias (rejects schema-qualified
    targets like pg_catalog.pg_class); match INTO OUTFILE/DUMPFILE.
  * New enforce_row_limit: appends LIMIT 10000 to any validated SQL
    that lacks a top-level LIMIT so a caller can't trigger an
    unbounded scan on a multi-million-row CKAN DataStore table.
  * New SafeSQLBuilder: typed, allowlist-only builder for the
    aggregate_data path. Identifiers must match [A-Za-z_]\w*, metric
    expressions must be count(*) or {count|sum|avg|min|max|stddev}
    ([DISTINCT] <ident>), filter values coerced per type with '
    escaping, order_by parsed and quoted, limit clamped to 10000,
    HAVING values must be numeric.
- plugins/ckan/plugin.py: route aggregate_data through
  SafeSQLBuilder (was string concatenation); call
  SQLValidator.enforce_row_limit after validate_query.
- server/http_handler.py: reject JSON-RPC bodies > 64 KB with
  HTTP 413 before parsing. The MCP surface fits in a few KB; a
  megabyte payload is either a bug or abuse.
- tests: cover body-size cap at and over the boundary, each new
  forbidden keyword/function, comment obfuscation, schema-qualified
  FROM rejection, enforce_row_limit behavior, and every
  SafeSQLBuilder method.
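
The allowlist rules described for SafeSQLBuilder can be sketched as follows. This is an illustration under assumptions rather than the class itself: `quote_ident`, `build_metric`, and `quote_string` are hypothetical helper names, and the output casing is an arbitrary choice.

```python
import re

# Identifier rule from the commit message: [A-Za-z_]\w* (ASCII-only here).
IDENT_RE = re.compile(r"[A-Za-z_]\w*", re.ASCII)
COUNT_STAR_RE = re.compile(r"count\(\s*\*\s*\)", re.IGNORECASE)
METRIC_RE = re.compile(
    r"(count|sum|avg|min|max|stddev)\(\s*(distinct\s+)?([A-Za-z_]\w*)\s*\)",
    re.IGNORECASE | re.ASCII,
)


def quote_ident(name: str) -> str:
    """Validate an identifier against the allowlist pattern, then double-quote it."""
    if not IDENT_RE.fullmatch(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return f'"{name}"'


def build_metric(expr: str) -> str:
    """Accept only count(*) or func([DISTINCT] ident); rebuild it from parsed parts."""
    text = expr.strip()
    if COUNT_STAR_RE.fullmatch(text):
        return "COUNT(*)"
    match = METRIC_RE.fullmatch(text)
    if not match:
        raise ValueError(f"invalid metric expression: {expr!r}")
    func, distinct, column = match.groups()
    prefix = "DISTINCT " if distinct else ""
    return f"{func.upper()}({prefix}{quote_ident(column)})"


def quote_string(value: str) -> str:
    """Escape a string filter value by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"
```

The output is rebuilt from the matched groups rather than echoing the caller's text, so anything outside the allowed grammar never reaches the generated query.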

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Document the Boston fork's AWS hosting and security posture, with
portal protection as the top design constraint. Add stdio_bridge.py
as a Python alternative to the Go stdio client for Claude Desktop.

- docs/AWS_DEPLOYMENT.md: how this fork is hosted (us-west-2, custom
  domain, reserved concurrency, cp311 packaging), what changed vs.
  upstream's single-region default, and how to operate the stack.
  Leads with the portal-protection design constraint.
- docs/SECURITY.md: the full rationale behind the hardening changes,
  organized around who is being protected — upstream portal first,
  deployment second, end users third. Covers the SQL validator and
  SafeSQLBuilder surface, rate limits and body-size cap, privacy
  posture (stateless, no PII, 14-day log retention, SQL truncated
  to 500 chars), and known gaps.
- README.md: link both new docs from the documentation table.
- stdio_bridge.py: minimal Python stdio-to-HTTP bridge. Reads
  JSON-RPC messages from stdin, POSTs them to the local/remote MCP
  server, writes responses to stdout. Useful where the Go client
  is impractical (Windows, no Go toolchain).
- CLAUDE.md: repo guidance for Claude Code sessions — commands,
  request flow, architecture notes, single-plugin rule.
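
The bridge's read, POST, write loop can be sketched roughly like this. It is a minimal sketch, not the shipped stdio_bridge.py: it assumes one JSON-RPC message per line on stdin, and `run_bridge` and `post_json` are hypothetical names. The sender is injected so the loop can be exercised without a running server.

```python
import json
import urllib.request
from typing import Callable, Optional, TextIO


def post_json(url: str, message: dict, timeout: float = 30.0) -> Optional[dict]:
    """POST one JSON-RPC message; return the decoded reply, or None if empty."""
    request = urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = resp.read()
    return json.loads(body) if body.strip() else None


def run_bridge(
    stdin: TextIO,
    stdout: TextIO,
    send: Callable[[dict], Optional[dict]],
) -> None:
    """Forward each newline-delimited JSON message and echo any response."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between messages
        response = send(json.loads(line))
        if response is not None:
            stdout.write(json.dumps(response) + "\n")
            stdout.flush()  # the client reads responses as they arrive


# Example wiring (the URL is an assumption; substitute the local server's address):
#   import sys
#   run_bridge(sys.stdin, sys.stdout, lambda m: post_json("http://localhost:8000/mcp", m))
```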

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds Method 4 to docs/TESTING.md covering how to wire the local HTTP
server to Claude Desktop (claude_desktop_config.json) and Claude Code
(.mcp.json) via stdio_bridge.py. The bridge was previously only mentioned
in CLAUDE.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>