- field_valid_values table in database as a source of truth for fields that have semantic meaning external to the database hierarchy (variable, constraint_variable, period, operation, active).
- Event listeners that raise an error when an insert would introduce inconsistent operations or parent-child relationships into the database.

**field_valid_values** - Lookup table that defines which values are allowed for specific fields. SQL triggers on `stratum_constraints` and `targets` check incoming rows against this table and reject invalid values with `RAISE(ABORT)`.

Validated fields:

| Field | Table | Examples | How populated |
|-------|-------|----------|---------------|
| `operation` | stratum_constraints | `==`, `>`, `<=` | Static list in `create_field_valid_values.py` |
| `variable` | targets | `eitc`, `person_count` | Dynamic from `policyengine-us` variables |
| `active` | targets | `0`, `1` | Static list |
| `period` | targets | `2022`, `2023`, `2024`, `2025` | Static list |
| `source` | targets | `IRS SOI`, `Census ACS S0101` | Static list (see below) |

**Adding new values**: If you introduce a new data source, time period, or constraint operation, you must register it in `create_field_valid_values.py` before any ETL script can use it. Otherwise the SQL trigger will reject the row at insert time. PolicyEngine variables are registered automatically at database creation time.

### Data Integrity Enforcement

The database enforces consistency through three mechanisms in `create_database_tables.py`:

**1. SQL trigger validation** - Before every INSERT/UPDATE on `targets` and `stratum_constraints`, triggers verify that field values exist in `field_valid_values`. Invalid values are rejected immediately. The `source` field is optional (NULL allowed), but if set, must match a registered value.
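
The trigger pattern described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not the project's actual schema: the column names (`field_name`, `valid_value`), the trigger names, and the `try_insert` helper are assumptions made for the example, and only INSERT triggers on `period` and `source` are shown (UPDATE would need matching triggers).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified schema: column names are assumptions.
CREATE TABLE field_valid_values (
    field_name  TEXT NOT NULL,
    valid_value TEXT NOT NULL,
    PRIMARY KEY (field_name, valid_value)
);

CREATE TABLE targets (
    target_id INTEGER PRIMARY KEY,
    variable  TEXT NOT NULL,
    period    INTEGER NOT NULL,
    source    TEXT              -- optional: NULL is allowed
);

-- Reject rows whose period is not registered in the lookup table.
CREATE TRIGGER validate_targets_period_insert
BEFORE INSERT ON targets
WHEN NOT EXISTS (
    SELECT 1 FROM field_valid_values
    WHERE field_name = 'period'
      AND valid_value = CAST(NEW.period AS TEXT)
)
BEGIN
    SELECT RAISE(ABORT, 'Invalid value for targets.period');
END;

-- source may be NULL, but a non-NULL source must be registered.
CREATE TRIGGER validate_targets_source_insert
BEFORE INSERT ON targets
WHEN NEW.source IS NOT NULL AND NOT EXISTS (
    SELECT 1 FROM field_valid_values
    WHERE field_name = 'source' AND valid_value = NEW.source
)
BEGIN
    SELECT RAISE(ABORT, 'Invalid value for targets.source');
END;
""")

# Register the allowed values before any row can reference them.
conn.executemany(
    "INSERT INTO field_valid_values (field_name, valid_value) VALUES (?, ?)",
    [("period", "2024"), ("source", "IRS SOI")],
)
conn.commit()


def try_insert(variable, period, source):
    """Return True if the row is accepted, False if a trigger aborts it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO targets (variable, period, source) VALUES (?, ?, ?)",
                (variable, period, source),
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

With this setup, `try_insert("eitc", 2024, "IRS SOI")` and `try_insert("eitc", 2024, None)` are accepted, while an unregistered period or source is rejected at insert time.
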

**2. Constraint consistency** - A SQLAlchemy `before_insert`/`before_update` listener on `Stratum` calls `ensure_consistent_constraint_set()` to verify that a stratum's constraints are logically compatible (e.g., no contradictory bounds like `age > 50` and `age < 30` on the same stratum).
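
To make "logically compatible" concrete, here is one way such a check could work. It is a sketch only and does not reproduce `ensure_consistent_constraint_set()`: the function name `find_contradictions` and the `(variable, operation, value)` tuple format are hypothetical, values are treated as real numbers, and operations other than `==`, `>`, `>=`, `<`, `<=` are ignored.

```python
from collections import defaultdict


def find_contradictions(constraints):
    """Return sorted names of variables whose bounds cannot all hold.

    `constraints` is an iterable of (variable, operation, value) tuples.
    Each variable's constraints are intersected into a single interval;
    an empty interval means the constraints contradict each other.
    """
    by_variable = defaultdict(list)
    for variable, op, value in constraints:
        by_variable[variable].append((op, float(value)))

    contradictory = []
    for variable, items in by_variable.items():
        lo, lo_strict = float("-inf"), False
        hi, hi_strict = float("inf"), False
        for op, v in items:
            # "==" tightens both ends of the interval to the same value.
            if op in (">", ">=", "=="):
                strict = op == ">"
                if v > lo or (v == lo and strict):
                    lo, lo_strict = v, strict
            if op in ("<", "<=", "=="):
                strict = op == "<"
                if v < hi or (v == hi and strict):
                    hi, hi_strict = v, strict
        # Empty if the bounds cross, or touch with a strict end.
        if lo > hi or (lo == hi and (lo_strict or hi_strict)):
            contradictory.append(variable)
    return sorted(contradictory)
```

For the example in the text, `find_contradictions([("age", ">", 50), ("age", "<", 30)])` reports `age`, whereas `age >= 18` combined with `age < 65` passes.
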

**3. Parent-child constraint inheritance** - A SQLAlchemy listener ensures child strata include all parent constraints. This prevents a child from claiming to be in a different geographic or demographic scope than its parent. Two cases:
- **Geographic-to-geographic** (e.g., state to CD): Instead of requiring literal constraint duplication, the validator checks geographic containment. A CD's `congressional_district_geoid` must encode the parent's `state_fips` (i.e., `geoid // 100 == state_fips`). Geographic variables are compared as integers to handle zero-padding differences (`"1"` vs `"01"`).
- **Demographic children** (e.g., state to age group): The child must include all parent constraints verbatim (e.g., a child under `state_fips == 6` must also have `state_fips == 6` in its own constraints).
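
The two cases above can be sketched as a single check. This is an illustration under simplifying assumptions, not the listener's actual code: constraints are modeled as plain `{variable: value}` dictionaries of equality constraints, and the names `GEOGRAPHIC_VARIABLES`, `normalize`, and `child_respects_parent` are hypothetical.

```python
# Variables compared as integers so that "1" and "01" are treated as equal.
GEOGRAPHIC_VARIABLES = {"state_fips", "congressional_district_geoid"}


def normalize(variable, value):
    """Compare geographic codes as integers, everything else as strings."""
    if variable in GEOGRAPHIC_VARIABLES:
        return int(value)
    return str(value)


def child_respects_parent(parent, child):
    """Return True if the child's constraints stay inside the parent's scope.

    `parent` and `child` map variable names to equality-constraint values.
    """
    for variable, value in parent.items():
        expected = normalize(variable, value)

        # Demographic case: the child repeats the parent constraint verbatim.
        if variable in child:
            if normalize(variable, child[variable]) != expected:
                return False
            continue

        # Geographic case: a congressional district need not repeat
        # state_fips, but its geoid must encode the parent state.
        if variable == "state_fips" and "congressional_district_geoid" in child:
            geoid = int(child["congressional_district_geoid"])
            if geoid // 100 != expected:
                return False
            continue

        # The parent constraint is neither repeated nor geographically implied.
        return False
    return True
```

For example, a parent with `state_fips` of `"06"` accepts a child whose `congressional_district_geoid` is `"0612"` (since `612 // 100 == 6`), but rejects a child that carries only an age-group constraint and omits `state_fips`.
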