Commit 1e3a73b
Fix 11 bugs from audit (security + correctness) (#1473)
* Fix indentation bug in flatten_variables_from_household
Move flattened_variables.append(new_pair) inside the period loop so
every (entity, variable, period) tuple is captured. Previously only
the last period's entry was appended.
Fixes #1462
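A minimal sketch of this bug class, assuming a simplified household
shaped as {entity: {variable: {period: value}}}. The function names
and the sample data are illustrative, not the repository's code.

```python
def flatten_buggy(household):
    """Append sits outside the period loop, so only the last period survives."""
    flattened = []
    for entity, variables in household.items():
        for variable, periods in variables.items():
            for period in periods:
                new_pair = (entity, variable, period)
            # Runs once per variable, not once per period.
            flattened.append(new_pair)
    return flattened


def flatten_fixed(household):
    """Append inside the period loop, so every tuple is captured."""
    flattened = []
    for entity, variables in household.items():
        for variable, periods in variables.items():
            for period in periods:
                flattened.append((entity, variable, period))
    return flattened


# Hypothetical sample input with two periods for one variable.
household = {"person": {"income": {"2023": 1, "2024": 2}}}
```

With this input the buggy version yields a single tuple for 2024,
while the fixed version yields one tuple per period.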
* Tighten rate limit on public /calculate_demo endpoint
The demo endpoint is intentionally public (documented in
config/README.md). Its previous "1 per second" limit allowed up to
86,400 free calls per IP per day, which makes abuse cheap. Tighten to
1 per 10 seconds (at most 8,640 calls per IP per day). Auth protection
is still applied to /calculate and /ai-analysis via
@require_auth_if_enabled().
Fixes #1463
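The daily ceiling implied by a fixed rate limit can be checked with a
few lines of arithmetic. The helper name below is hypothetical and the
sketch ignores burst behavior of any particular limiter.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_calls_per_day(calls: int, window_seconds: int) -> int:
    """Upper bound on calls per day for a limit of `calls` per `window_seconds`."""
    return SECONDS_PER_DAY * calls // window_seconds


old_ceiling = max_calls_per_day(1, 1)    # "1 per second"
new_ceiling = max_calls_per_day(1, 10)   # "1 per 10 seconds"
```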
* Fix sqlite URI and protect analytics DB from wipe on init
Two related issues in debug-mode analytics init:
1. The SQLALCHEMY_DATABASE_URI was built with five slashes
("sqlite:////" + "/private/...") which breaks SQLAlchemy URL
parsing. Use f"sqlite:///{db_url}" so absolute paths produce the
canonical four-slash URI.
2. The unconditional Path(db_url).unlink() wiped the captured
analytics on every process start in debug mode. Gate that on
RESET_ANALYTICS=1 (or analytics.reset config) so a developer must
explicitly opt in to clearing the SQLite file.
Fixes #1464
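A small sketch of both points, assuming an absolute database path and
an environment-only reset gate. The helper names and the example path
are hypothetical, and the analytics.reset config fallback is left out.

```python
import os


def sqlite_uri(db_path: str) -> str:
    """Build a SQLAlchemy SQLite URI; an absolute path yields four slashes."""
    return f"sqlite:///{db_path}"


def should_reset_analytics(environ=os.environ) -> bool:
    """Only wipe the SQLite file when the developer opts in explicitly."""
    return environ.get("RESET_ANALYTICS") == "1"


example_path = "/tmp/example.db"            # hypothetical path
broken = "sqlite:////" + example_path       # five slashes
fixed = sqlite_uri(example_path)            # four slashes
```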
* Restrict CORS to PolicyEngine origins by default
CORS(app) allowed every origin, so any third-party site could invoke
/calculate and (when auth is disabled) read user-supplied household
responses. Default to the PolicyEngine production domains and allow
overrides via CORS_ALLOWED_ORIGINS (comma-separated env) or the new
cors.allowed_origins config entry. Regex patterns are used so that
*.policyengine.org subdomains match without extra glue.
Fixes #1465
* Replace invalid ConnectionError kwargs with GCPError
Python's built-in Exception/ConnectionError do not accept a
`description` keyword, so every error path in GoogleCloudStorageManager
raised `TypeError: __init__() got an unexpected keyword argument
'description'` instead of the intended GCP error. Introduce a small
GCPError class that preserves `.description` as an attribute and use
it in the four call sites that were previously broken. No call site
reads `.args[0]` so switching exception class is safe.
Fixes #1466
* Stop coercing "0"/"1" env vars to booleans
The bool branch in ConfigLoader._convert_value ran first and included
"0" and "1", so every numeric config set via env var (PORT, pool
sizes, counts, ...) silently collapsed into True/False. Reorder to
try int -> float -> bool words, and only treat the literal words
true/false/yes/no as booleans. "0" now stays int 0, "1" stays int 1.
Fixture expectations and FLASK_DEBUG test value are updated to match.
Fixes #1467
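The reordered conversion can be sketched as a standalone function.
This is an illustration of the int -> float -> bool-words order
described above, not the actual ConfigLoader._convert_value body; the
fall-through to the raw string is an assumption.

```python
def convert_value(raw: str):
    """Convert an env-var string: try int, then float, then boolean words."""
    text = raw.strip()
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        pass
    lowered = text.lower()
    if lowered in ("true", "yes"):
        return True
    if lowered in ("false", "no"):
        return False
    # Anything else stays a string (assumed behavior).
    return raw
```

Because int is tried first, "0" and "1" never reach the boolean
branch, so a value such as PORT=1 stays numeric.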
* Verify JWT signature before trusting sub claim in analytics
The analytics decorator decoded incoming bearer tokens with
options={"verify_signature": False} and stored the resulting sub
claim directly as Visit.client_id. That meant anyone could set the
Authorization header to a self-signed JWT and have their chosen
identity recorded against every request. Also datetime.utcnow() is
deprecated in Python 3.12+ and returns naive UTC.
Changes:
* Add _verified_sub_claim() that fetches the Auth0 JWKS via PyJWKClient
and runs jwt.decode with verify_signature=True.
* If Auth0 isn't configured, or verification fails, store None for
client_id instead of an attacker-controlled string.
* Replace datetime.utcnow() with datetime.now(timezone.utc).
* Update analytics fixtures to mock _verified_sub_claim (verified path)
and add a regression test for the unverified-signature case.
Fixes #1468
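To illustrate why the unverified decode is unsafe, here is a
stdlib-only sketch. It deliberately swaps in HS256 with a shared
secret instead of the Auth0 RS256/JWKS flow via PyJWKClient used in
the actual change, and it skips exp/aud/alg checks. All function names
and keys are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, secret: bytes) -> str:
    """Build a compact HS256 token (illustration only)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url(sig)}"


def unverified_sub(token: str):
    """Trusts whatever the caller put in the payload: the unsafe pattern."""
    return json.loads(_b64url_decode(token.split(".")[1])).get("sub")


def verified_sub(token: str, secret: bytes):
    """Return sub only if the signature checks out, else None."""
    try:
        header, body, sig = token.split(".")
        expected = hmac.new(
            secret, f"{header}.{body}".encode(), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig)):
            return None
        return json.loads(_b64url_decode(body)).get("sub")
    except (ValueError, TypeError):
        return None


# A token signed with a key the server does not trust.
forged = sign_hs256({"sub": "admin"}, b"attacker-key")
```

The forged token yields "admin" from unverified_sub but None from
verified_sub when checked against the server's own key.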
* Re-raise tracer failures instead of implicit None return
PolicyEngineCountry.calculate() swallowed exceptions from the tracer
block with a bare `except Exception as e: print(...)`, after which
the function fell through and implicitly returned None. Callers in
endpoints/household.py unpack the return value into
(result, computation_tree_uuid), so a tracer failure surfaced as
`TypeError: cannot unpack non-iterable NoneType` instead of a
meaningful 500. Re-raise so the endpoint's own try/except can return
a proper error response.
Fixes #1469
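The difference between swallowing and re-raising can be reproduced in
isolation. The functions below are stand-ins for the tracer block, not
the real PolicyEngineCountry.calculate() implementation.

```python
def calculate_swallowing(run_tracer):
    """Logs the error, then falls off the end and returns None implicitly."""
    try:
        return run_tracer()
    except Exception as e:
        print(f"tracer failed: {e}")


def calculate_reraising(run_tracer):
    """Logs the error and re-raises so the caller sees the real failure."""
    try:
        return run_tracer()
    except Exception as e:
        print(f"tracer failed: {e}")
        raise


def failing_tracer():
    # Hypothetical tracer that always fails.
    raise RuntimeError("tracer exploded")
```

A caller that unpacks the return value into two names gets a TypeError
from the first variant and the original RuntimeError from the second.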
* Validate /calculate payloads and cap axes scans
The /calculate endpoint accepted arbitrary JSON and passed it
straight to the simulation, exposing three abuse vectors:
* Malformed household payloads reached the compute layer and
produced opaque 500s instead of clear 400s.
* `axes` scans multiply cost by count_0 * count_1 * ..., so an
unbounded axes list could pin the compute pool.
* Per-endpoint rate limiting was absent.
Fixes:
* Validate the household payload against the country-specific
HouseholdModel{US,UK,Generic} before calling country.calculate().
* Cap `axes` at MAX_AXES_ENTRIES=10 and each `count` at
MAX_AXES_COUNT=100, returning 400 on violation.
* Add `@limiter.limit("60 per minute")` to /calculate.
Fixes #1470
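A sketch of the axes cap, assuming for simplicity that `axes` is a
flat list of dicts each carrying a `count`. The real payload shape and
the helper names here are assumptions; only the two limits come from
this commit.

```python
from math import prod

MAX_AXES_ENTRIES = 10
MAX_AXES_COUNT = 100


def scan_cost(axes) -> int:
    """Number of simulation points: count_0 * count_1 * ..."""
    return prod(axis.get("count", 1) for axis in axes)


def validate_axes(axes):
    """Return an error message for a 400 response, or None if acceptable."""
    if len(axes) > MAX_AXES_ENTRIES:
        return f"axes has {len(axes)} entries; maximum is {MAX_AXES_ENTRIES}"
    for axis in axes:
        count = axis.get("count", 1)
        if isinstance(count, bool) or not isinstance(count, int) or count < 1:
            return "axes count must be a positive integer"
        if count > MAX_AXES_COUNT:
            return f"axes count {count} exceeds maximum {MAX_AXES_COUNT}"
    return None
```

Even within the caps, two axes at the maximum count already produce
10,000 points, which is why both limits are enforced.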
* Time-bound and lazy-load the Auth0 JWKS fetch
Auth0JWTBearerTokenValidator called urlopen() on the JWKS URL at
construction time with no timeout and no error handling. In practice:
* Any Auth0 outage at boot bricked the API (process wedged or
crashed, depending on the failure mode).
* Module-level construction made every import pay the network cost.
Refactor to:
* Wrap urlopen in a try/except that logs on failure and returns None.
* Always pass `timeout=JWKS_FETCH_TIMEOUT` (10s).
* Cache successful fetches via functools.lru_cache so repeat
validator constructions don't re-fetch the key set.
* Override authenticate_token() to lazily retry the JWKS fetch when
the initial load failed.
Fixes #1471
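A minimal sketch of a time-bounded fetch that logs and returns None on
failure. The function name and the injectable `opener` parameter are
assumptions for illustration; caching and the lazy retry are left out.

```python
import json
import logging
from urllib.request import urlopen

logger = logging.getLogger(__name__)

JWKS_FETCH_TIMEOUT = 10  # seconds


def fetch_jwks(url: str, opener=urlopen):
    """Fetch and parse a JWKS document; return None instead of raising."""
    try:
        with opener(url, timeout=JWKS_FETCH_TIMEOUT) as response:
            return json.loads(response.read())
    except Exception:
        logger.exception("JWKS fetch failed for %s", url)
        return None
```

Returning None keeps construction non-blocking, leaving the decision
to retry with the caller.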
* Use dpath.search instead of deprecated dpath.util.search
The dpath.util namespace emits DeprecationWarning and is slated for
removal in dpath 3.x. All util functions are accessible at the dpath
package top level. Switch country.get_requested_computations over to
the new import path.
Fixes #1472
* Add changelog entry for bug-audit batch
* Anchor default CORS regex with $ to block suffix-match bypass
Flask-CORS matches origins with re.match (prefix match), so
https://.*\.policyengine\.org without a trailing anchor admits
hostile origins like https://evil.policyengine.org.attacker.com.
Anchor the wildcard with $, add localhost / 127.0.0.1 regexes for
dev, and document CORS_ALLOWED_ORIGINS and RESET_ANALYTICS in
.env-example and config/README.md. Adds unit tests replicating
flask-cors' re.match/probably_regex semantics to guard the bypass.
Refs #1465, #1464.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
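The suffix-match bypass is easy to reproduce with re.match alone. The
helper below mimics prefix matching and is not flask-cors itself; the
"app" subdomain is a hypothetical legitimate origin.

```python
import re

UNANCHORED = r"https://.*\.policyengine\.org"
ANCHORED = r"https://.*\.policyengine\.org$"


def origin_allowed(pattern: str, origin: str) -> bool:
    """re.match anchors only at the start, so the tail is unconstrained."""
    return re.match(pattern, origin) is not None


hostile = "https://evil.policyengine.org.attacker.com"
legit = "https://app.policyengine.org"
```

The unanchored pattern admits the hostile origin; the anchored one
rejects it while still accepting the legitimate subdomain.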
* Cache JWKS successes only so the lazy retry actually retries
The previous fix wrapped _fetch_jwks in functools.lru_cache, which
memoised the None return on failure. The "lazy retry" in
Auth0JWTBearerTokenValidator.authenticate_token therefore kept
getting the cached None back and never hit the network again — so a
transient Auth0 outage at import time bricked the validator until
the process restarted.
Replace with a module-level success cache plus a per-issuer
last-failure timestamp. On failure we do not memoise the None; on
the next call we re-fetch, throttled by JWKS_RETRY_INTERVAL_SECONDS
so we don't hammer Auth0 while it is degraded. Construction is
still non-blocking.
Add a regression test that patches _fetch_jwks_uncached to return
None once and then a sentinel, and asserts the lazy retry actually
calls the network twice and swaps in the sentinel key.
Refs #1471.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
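A sketch of the success-only cache with a throttled retry. The
function name, the injectable `fetch` and `now` parameters, and the
interval value are assumptions; only the overall shape follows the
description above.

```python
import time

JWKS_RETRY_INTERVAL_SECONDS = 30  # hypothetical value

_jwks_cache = {}     # issuer -> key set (successes only)
_last_failure = {}   # issuer -> timestamp of the last failed fetch


def get_jwks(issuer, fetch, now=time.monotonic):
    """Return cached keys, or re-fetch unless a recent failure throttles us."""
    if issuer in _jwks_cache:
        return _jwks_cache[issuer]
    failed_at = _last_failure.get(issuer)
    if failed_at is not None and now() - failed_at < JWKS_RETRY_INTERVAL_SECONDS:
        return None  # still inside the back-off window
    keys = fetch(issuer)
    if keys is None:
        # Record the failure time but never memoise the None itself.
        _last_failure[issuer] = now()
        return None
    _jwks_cache[issuer] = keys
    _last_failure.pop(issuer, None)
    return keys
```

Unlike functools.lru_cache, a failed fetch here leaves the cache
empty, so the next call after the back-off window reaches the network
again.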
* Align analytics error path to NULL client_id
Missing Authorization and JWT-decode errors previously set
client_id to the sentinel string "unknown", diverging from the
signature-verification-fail path which already stored None. The
downstream analytics layer has to filter one or the other out, and
inconsistency is a bug magnet. Collapse both paths to None so any
unverifiable identity is a SQL NULL. Updates the two affected tests
to assert NULL.
Refs #1468.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Drop backward-looking "previously..." comments
Code comments describe current behavior, not history — git log is
the history. Trim the narrative lead-ins from the tracer re-raise
and the analytics-reset gate so each comment states only what the
code does today.
Refs #1469.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Require people on HouseholdModel and cap JSON body size
A request with no people has nothing to compute against; making
people required in HouseholdModelGeneric surfaces this at the
validation layer instead of deep in the solver. households stays
optional to keep axes-scan payloads (which omit it) valid.
Also cap total request body size via MAX_CONTENT_LENGTH (default
10 MiB, overridable by env var) so a single oversize payload cannot
force the API to allocate unbounded memory before any view runs.
Refs #1470.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Cover zero-period variable in flatten_variables_from_household
Regression guard for the indentation-era bug where new_pair was
referenced outside the period loop. With an empty period dict the
inner loop never runs; the test asserts we return no entries and
do not raise UnboundLocalError or leak a previous variable's tuple.
Refs #1462.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 0f9fe32 commit 1e3a73b
24 files changed
Lines changed: 964 additions & 67 deletions
File tree
- config
- policyengine_household_api
- auth
- data
- decorators
- endpoints
- models
- utils
- tests
- fixtures
- decorators
- utils
- to_refactor/python/data
- unit
- auth
- decorators
- endpoints
- utils
0 commit comments