Skip to content

Salary sacrifice pension variable#214

Merged
vahid-ahmadi merged 4 commits into
mainfrom
salary-sacrifice
Nov 20, 2025
Merged

Salary sacrifice pension variable#214
vahid-ahmadi merged 4 commits into
mainfrom
salary-sacrifice

Conversation

@vahid-ahmadi

@vahid-ahmadi vahid-ahmadi commented Nov 19, 2025

Copy link
Copy Markdown
Collaborator
  • Link to reform: link
  • Link to FRS variable: link

What was added

1. New variable: pension_contributions_via_salary_sacrifice

Location: policyengine_uk_data/datasets/frs.py (lines 634-638)

Source data: FRS job table, column SPNAMT ("Amount received for Salary Sacrifice Pension")

Code:

pe_person["pension_contributions_via_salary_sacrifice"] = np.maximum(
    0,
    sum_to_entity(job.spnamt.fillna(0), job.person_id, person.person_id)
    * WEEKS_IN_YEAR,
)

What it does:

  1. Reads SPNAMT from FRS job.tab file
  2. Handles missing values (fills with 0)
  3. Sums across multiple jobs per person
  4. Converts from weekly to annual amounts (multiplies by 52.18)
  5. Ensures no negative values

2. Uprating factors

Files updated:

  • uprating_factors.csv (line 24)
  • uprating_growth_factors.csv (line 24)

@vahid-ahmadi vahid-ahmadi self-assigned this Nov 19, 2025
@vahid-ahmadi vahid-ahmadi changed the title Salary sacrifice pension Salary sacrifice pension variable Nov 19, 2025
@MaxGhenis

MaxGhenis commented Nov 19, 2025

Copy link
Copy Markdown
Contributor

Comprehensive Review of PR #214: Salary Sacrifice Pension Variable

I've completed a thorough validation of this PR. Overall, the implementation is structurally sound and follows established patterns, but there are some important considerations before merging.


✅ What's Working Well

1. Companion PR Coordination

  • ✅ Variable exists in policyengine-uk PR #1372
  • ✅ Proper definition with documentation and reference
  • ✅ Uses correct uprating (average earnings)

2. Implementation Quality

  • ✅ Follows exact pattern of employee_pension_contributions
  • ✅ Correct entity aggregation (job → person via sum_to_entity)
  • ✅ No hard-coded values (uses WEEKS_IN_YEAR constant appropriately)
  • ✅ Handles missing values (.fillna(0))
  • ✅ Prevents negative values (np.maximum(0, ...))
  • ✅ Proper placement in code (grouped with pension variables)

3. Supporting Files

  • ✅ Uprating factors properly configured
  • ✅ Changelog entry correct (minor bump, added category)
  • ✅ All CI checks passing

⚠️ Key Considerations

1. Outlier Detection Decision Needed

Observation: There's an inconsistency in how pension variables handle outliers:

With Clipping (pen_prov table data):

# personal_pension_contributions - clips to 95th percentile
pe_person["personal_pension_contributions"] = np.maximum(
    0,
    sum_to_entity(...).clip(0, pen_prov.penamt.quantile(0.95))
    * WEEKS_IN_YEAR,
)

Without Clipping (job table data):

# employee_pension_contributions - no clipping
pe_person["employee_pension_contributions"] = np.maximum(
    0,
    sum_to_entity(job.deduc1.fillna(0), ...)
    * WEEKS_IN_YEAR,
)

# pension_contributions_via_salary_sacrifice - no clipping (matches above)
pe_person["pension_contributions_via_salary_sacrifice"] = np.maximum(
    0,
    sum_to_entity(job.spnamt.fillna(0), ...)
    * WEEKS_IN_YEAR,
)

Pattern Analysis: The codebase appears to trust job table data without clipping, but clips pen_prov (pension provider) data due to known quality issues.

Recommendation: The current implementation (no clipping) is consistent with employee_pension_contributions, so it's acceptable. However, consider adding a comment explaining this pattern:

# Salary sacrifice pension contributions from job data
# Uses same pattern as employee_pension_contributions (deduc1) without 
# outlier clipping, as job-level data is generally cleaner than pension 
# provider data (penamt).
pe_person["pension_contributions_via_salary_sacrifice"] = np.maximum(
    0,
    sum_to_entity(job.spnamt.fillna(0), job.person_id, person.person_id)
    * WEEKS_IN_YEAR,
)

2. Test Coverage

Issue: No tests for the new variable.

Recommendation: Add a basic test to test_aggregates.py or create a smoke test:

def test_pension_contributions_via_salary_sacrifice_loads(baseline):
    """Ensure salary sacrifice pension variable loads without errors."""
    values = baseline.calculate(
        "pension_contributions_via_salary_sacrifice", 
        period=2025
    )
    # Basic validation
    assert (values >= 0).all(), "Salary sacrifice amounts must be non-negative"
    # Should have some non-zero values
    assert values.sum() > 0, "Expected some salary sacrifice contributions"

If aggregate statistics are available from HMRC, add them to AGGREGATES dict.

3. Documentation Enhancement

Current:

# link to the variable: https://datacatalogue.ukdataservice.ac.uk/datasets/dataset/630d4a8d-ba6a-82b3-f33d-c713c66efcb3

Suggested Enhancement:

# Salary sacrifice pension contributions from FRS Job table (SPNAMT field)
# SPNAMT represents employer pension contributions made via salary sacrifice
# arrangements where employees forego salary in exchange for increased pension
# contributions. This is separate from regular employee pension contributions
# (deduc1) and provides tax advantages.
# Source: https://datacatalogue.ukdataservice.ac.uk/datasets/dataset/630d4a8d-ba6a-82b3-f33d-c713c66efcb3
# Note: Values are annualized from weekly amounts reported in the survey.

4. Double-Counting Check

Important: I was unable to access the FRS data dictionary to verify that SPNAMT and DEDUC1 (employee contributions) don't overlap. This should be confirmed before merge to avoid double-counting pension contributions.

Action needed: Can you confirm from FRS documentation that:

  • SPNAMT is distinct from DEDUC1
  • These fields are mutually exclusive or properly additive
  • No risk of counting the same contribution twice

Recommendations Priority

Must Address:

  • Confirm SPNAMT/DEDUC1 are distinct (no double-counting)

Should Address:

  • Add basic test coverage
  • Enhance inline documentation

Nice to Have:

  • Add explanatory comment about outlier handling pattern
  • Validate against HMRC aggregate statistics if available

Validation Summary

Check Status Notes
Variable exists in model ✅ PASS policyengine-uk PR #1372
No hard-coded values ✅ PASS Uses constants appropriately
Pattern consistency ✅ PASS Matches employee_pension_contributions
Entity aggregation ✅ PASS Correct job → person mapping
Uprating factors ✅ PASS Properly configured
Changelog ✅ PASS Correct format
CI checks ✅ PASS All passing
Outlier handling ⚠️ ACCEPTABLE Consistent with job table pattern
Test coverage ❌ MISSING No tests
Documentation ⚠️ MINIMAL Has link, needs context
Double-counting check ⚠️ UNVERIFIED Needs confirmation

Conclusion

This PR is nearly ready to merge. The implementation is clean and follows established patterns. The main concern is confirming that SPNAMT doesn't overlap with DEDUC1 to prevent double-counting.

Once you address the "Must Address" item (and ideally the "Should Address" items), this can be safely merged.

Great work on the implementation! 🎉

…ions

- Add comprehensive smoke test for pension_contributions_via_salary_sacrifice
  - Validates non-negative values
  - Checks for presence of data
  - Ensures reasonableness against total employment income
- Enhance inline documentation with detailed explanation:
  - What SPNAMT represents
  - Relationship to other pension variables
  - Rationale for not using outlier clipping (follows job table pattern)
  - Source attribution and data transformation notes

Addresses review feedback on PR #214

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@MaxGhenis

Copy link
Copy Markdown
Contributor

Updates Applied

I've proactively addressed the "Should Address" recommendations from my review:

✅ Changes Made

  1. Added Test Coverage (test_pension_contributions_via_salary_sacrifice.py)

    • Validates that all values are non-negative
    • Checks that there are some non-zero salary sacrifice contributions
    • Ensures total contributions don't exceed total employment income (sanity check)
  2. Enhanced Documentation (in frs.py)

    • Added comprehensive inline comments explaining:
      • What SPNAMT represents (salary sacrifice pension contributions)
      • Why it's separate from regular employee contributions (deduc1)
      • Rationale for not using outlier clipping (follows job table pattern)
      • Data transformation notes (weekly to annual conversion)
    • Improved source attribution with context

📊 CI Status

Waiting for CI to run on the new changes. The test should pass as it follows the same pattern as existing tests and uses the baseline fixture.

🔄 Still Needs Confirmation

The key question from the original review remains:

  • Confirm SPNAMT/DEDUC1 are distinct (no double-counting risk)

This requires access to FRS data dictionary or confirmation from someone familiar with the FRS fields.

Once that's confirmed, this PR should be ready to merge! 🚀

@vahid-ahmadi

Copy link
Copy Markdown
Collaborator Author

I can confirm that SPNAMT and DEDUC1 are distinct fields that measure different types of pension contributions:

  • DEDUC1 captures money the employee chooses to contribute from their net earnings
  • SPNAMT captures employer contributions funded by salary reduction (never appears as employee income)

@vahid-ahmadi vahid-ahmadi merged commit a24b284 into main Nov 20, 2025
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants