PolicyEngine · MaxGhenis · May 21, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -23,6 +23,9 @@ read `docs/engineering/skills/pipeline_operations.md`.
 When adding, changing, or reviewing calibration target definitions, read
 `docs/engineering/skills/calibration_targets.md`.
 
+When adding, changing, or reviewing donor-survey imputations, read
+`docs/engineering/skills/imputation.md`.
+
 ## Calibration targets
 
 Manually sourced national or local-file calibration targets must be registered

diff --git a/changelog.d/1103.changed b/changelog.d/1103.changed
@@ -0,0 +1 @@
+Use target-specific source-quality filters for ACS and SIPP imputations.
diff --git a/docs/engineering/skills/README.md b/docs/engineering/skills/README.md
@@ -14,6 +14,8 @@ Current skills:
   notes.
 - `github-prs.md`: same-repository PR workflow, PR head verification, and title
   conventions.
+- `imputation.md`: donor-survey imputation provenance rules, including
+  target-level exclusion of allocated source values.
 - `pipeline_docs.md`: decorator-backed pipeline map maintenance and generated
   pydoc-style artifacts.
 - `pipeline_operations.md`: model-neutral workflow for diagnosing deployed Modal

diff --git a/docs/engineering/skills/imputation.md b/docs/engineering/skills/imputation.md
@@ -0,0 +1,36 @@
+# Imputation
+
+Use this guide when adding, changing, or reviewing donor-survey imputations.
+
+## Source Provenance
+
+Do not train an imputation target on donor rows whose source value for that
+target is itself allocated, hot-decked, edited, or imputed by the source survey.
+Wire source-survey allocation or quality flags into the training frame whenever
+the donor file exposes them.
+
+Apply this rule at the target-variable level, not the donor-row level. A donor
+row with observed tip income but allocated bank-account assets can train
+`tip_income`; the same row must be excluded from the `bank_account_assets`
+training target. Use `policyengine_us_data.utils.source_quality` to build
+target masks, then pass them to `microimpute` through `target_filters` or
+`row_filter` so the filtering logic lives in the imputation library rather than
+in one-off model wrappers.
+
+Do not drop final CPS, ECPS, or calibration records solely because a donor
+survey target was excluded from training. The exclusion applies to donor
+training rows only; recipient datasets should remain complete.
+
+When a donor source lacks target-level quality flags, document that limitation
+near the imputation code and keep the training surface structured so flags can
+be added later.
+
+## Tests
+
+Add focused regression tests when adding a donor imputation or a source-quality
+flag:
+
+- allocation flags are read from the donor source,
+- allocated source values are excluded for the affected target,
+- unrelated observed targets from the same row can still train, and
+- legacy and current imputation surfaces use the same target provenance rule.
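
The target-level rule in `imputation.md` can be illustrated with a minimal sketch. It does not show the actual `policyengine_us_data.utils.source_quality` or `microimpute` APIs; the `build_target_masks` helper, the `_allocated` flag suffix, and the donor columns are all hypothetical and exist only to show why masks are built per target rather than per row.

```python
import numpy as np

# Hypothetical donor frame: two targets plus one allocation flag per target.
# Column and flag names are illustrative, not real ACS or SIPP names.
donor = {
    "age": np.array([34, 51, 28, 60]),
    "tip_income": np.array([1200.0, 0.0, 300.0, 0.0]),
    "bank_account_assets": np.array([5000.0, 800.0, 150.0, 20000.0]),
    "tip_income_allocated": np.array([False, False, True, False]),
    "bank_account_assets_allocated": np.array([True, False, False, False]),
}


def build_target_masks(frame, targets, flag_suffix="_allocated"):
    """Return {target: boolean mask of rows whose source value was observed}."""
    n_rows = len(next(iter(frame.values())))
    masks = {}
    for target in targets:
        flag = frame.get(target + flag_suffix)
        # If the donor file exposes no flag for this target, keep every row;
        # that limitation should be documented near the imputation code.
        masks[target] = np.ones(n_rows, dtype=bool) if flag is None else ~flag
    return masks


masks = build_target_masks(donor, ["tip_income", "bank_account_assets"])

# Row 0 can train tip_income but not bank_account_assets; row 2 is the reverse.
tip_training_rows = np.flatnonzero(masks["tip_income"])
assets_training_rows = np.flatnonzero(masks["bank_account_assets"])

# A row-level filter would keep only rows clean for every target.
row_level_rows = np.flatnonzero(masks["tip_income"] & masks["bank_account_assets"])
```

In this toy frame each target keeps three of four donor rows, while a row-level filter would keep only two, which is the loss of training data the target-level rule avoids.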
diff --git a/policyengine_us_data/calibration/puf_impute.py b/policyengine_us_data/calibration/puf_impute.py
@@ -325,6 +325,29 @@ def _qrf_ss_shares(
     for sub in shares:
         shares[sub] = np.where(total > 0, shares[sub] / total, 0.0)
 
+    if (
+        "age" in data
+        and "social_security_retirement" in shares
+        and "social_security_disability" in shares
+    ):
+        # Preserve QRF survivor/dependent predictions, but anchor the
+        # retirement-vs-disability split to the same age rule as the fallback.
+        age = data["age"][time_period][:n_cps][puf_has_ss]
+        is_old = age >= MINIMUM_RETIREMENT_AGE
+        retirement_or_disability = (
+            shares["social_security_retirement"] + shares["social_security_disability"]
+        )
+        shares["social_security_retirement"] = np.where(
+            is_old,
+            retirement_or_disability,
+            0.0,
+        )
+        shares["social_security_disability"] = np.where(
+            is_old,
+            0.0,
+            retirement_or_disability,
+        )
+
     del fitted, predictions
     gc.collect()
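
The effect of the new block in `_qrf_ss_shares` can be seen on a toy example. This is a sketch under stated assumptions, not the function itself: the value of `MINIMUM_RETIREMENT_AGE`, the `social_security_survivors` key, and the sample ages and shares are made up for illustration.

```python
import numpy as np

MINIMUM_RETIREMENT_AGE = 62  # assumed value, for illustration only

# Toy normalized shares for four records; each record's shares sum to 1.
age = np.array([70, 45, 66, 30])
shares = {
    "social_security_retirement": np.array([0.6, 0.5, 0.2, 0.0]),
    "social_security_disability": np.array([0.2, 0.3, 0.8, 0.9]),
    "social_security_survivors": np.array([0.2, 0.2, 0.0, 0.1]),  # hypothetical key
}

is_old = age >= MINIMUM_RETIREMENT_AGE
retirement_or_disability = (
    shares["social_security_retirement"] + shares["social_security_disability"]
)

# Assign the combined retirement-plus-disability mass to exactly one category
# by age, leaving the other predicted shares untouched.
shares["social_security_retirement"] = np.where(is_old, retirement_or_disability, 0.0)
shares["social_security_disability"] = np.where(is_old, 0.0, retirement_or_disability)

total = sum(shares.values())
```

Because the combined mass is moved rather than rescaled, each record's shares still sum to the same total as before, and the survivor share is unchanged.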