Fix #86: normalize promoted dependent attributes for age realism by RandomOscillations · Pull Request #93 · exaforge/extropy

RandomOscillations · 2026-02-17T06:08:13Z

Summary

seed dependent promotion sampling with fixed age/gender so downstream sampling evaluates with child context
add age-aware normalization for promoted minors (education/employment/occupation/income) using spec-driven attribute/options heuristics
keep household-scoped economics inherited from household context
add focused regression tests for promoted minor realism

Validation

pytest -q tests/test_agent_focus.py::TestPromotedDependentQuality

Closes #86

DeveshParagiri

Code Review

Verdict: ✅ Ready to merge

Summary

Comprehensive fix that normalizes age-inappropriate attributes for promoted minors using spec-driven heuristics.

Normalization Applied

Attribute	Logic
Education	Capped by age (≤10: elementary, ≤13: middle, ≤17: high school)
Employment	Restricted to student/part_time/intern/none; <16 prefers student/none
Occupation	Forced to student/none for minors
Income	personal_income=0, household_income preserved from parent

Edge Cases

Age	Max Education	Employment
5-10	elementary	student/none
11-13	middle school	student/none
14-15	high school	student/none
16-17	high school	student/part_time/intern
18+	Not normalized	Not normalized

Minor Suggestion (non-blocking)

Consider adding a boundary-age test (e.g., 16-year-old can have part_time but 10-year-old cannot).

No blocking changes required.

DeveshParagiri

Code Review

Verdict: ❌ Needs changes - fragile keyword matching

Problem

The attribute type detection uses substring matching against hardcoded keywords:

_INCOME_KEYWORDS = ("income", "salary", "wage", "earnings", "compensation")
_EDUCATION_KEYWORDS = ("education", "degree")
_EMPLOYMENT_KEYWORDS = ("employment", "work_status", "labor_status")

if any(token in key for token in _INCOME_KEYWORDS):
    # normalize income for minors

This silently fails if the LLM names attributes differently:

Attribute Name	Matches?	Result
`household_income`	✅	Normalized correctly
`annual_earnings`	✅	Normalized correctly
`earning`	❌	10-year-old keeps $150k income
`pay`	❌	10-year-old keeps adult pay
`annual_pay`	❌	Silent failure
`diploma`	❌	10-year-old keeps "masters"
`schooling`	❌	Silent failure
`work`	❌	10-year-old keeps "full_time"

Suggested Fix

Add a semantic_type field to AttributeSpec that explicitly declares the attribute's category:

# population.v1.yaml
attributes:
  - name: annual_pay
    type: numeric
    semantic_type: income  # explicit declaration
    
  - name: schooling_level
    type: categorical
    semantic_type: education
    options: ["none", "elementary", "high_school", "bachelors", "masters"]

Then in the sampler:

for attr in spec.attributes:
    if attr.semantic_type == "income":
        agent[attr.name] = _coerce_minor_income(agent[attr.name], age)
    elif attr.semantic_type == "education":
        agent[attr.name] = _coerce_minor_education(agent[attr.name], age, attr.options)
    # etc.

Fallback: Keep the keyword heuristics as a fallback for specs without semantic_type, but log a warning so users know to add explicit types.

This approach:

Explicit > heuristic
No silent failures
Spec authors control the behavior
Backward compatible with warning

DeveshParagiri

Additional Issue: _education_stage() also uses fragile keyword matching

The _education_stage() function at line 434-450 has the same problem for education values:

stage_map = [
    (0, ("none", "home", "preschool", "pre_k", "kindergarten")),
    (1, ("elementary", "primary", "grade_school")),
    (2, ("middle_school", "middle", "junior_high")),
    (3, ("high_school", "secondary", "ged")),
    (5, ("bachelor", "college", "undergraduate", "university")),
    # ...
]
if any(marker in v for marker in markers):
    return stage

Failure cases:

Education Value	Expected Stage	Actual	Why
`high_school`	3	3	✅
`12th_grade`	3	None	❌ No keywords
`completed_12`	3	None	❌ No keywords
`bachelors`	5	5	✅ "bachelor" matches
`4_year_degree`	5	None	❌ No keywords
`BA`	5	None	❌ No keywords
`BS`	5	None	❌ No keywords

When _education_stage() returns None, the coercion logic may not cap education properly.

Suggested fix: Use the spec's options list order as the canonical stage mapping, or add education_stage: int to each option in the spec.

fix(sampler): make promoted dependent attributes age-appropriate

2b61c1c

DeveshParagiri reviewed Feb 17, 2026

View reviewed changes

DeveshParagiri requested changes Feb 17, 2026

View reviewed changes

DeveshParagiri reviewed Feb 17, 2026

View reviewed changes

DeveshParagiri deleted the branch codex/issue-85-dependent-names February 17, 2026 18:56

DeveshParagiri closed this Feb 17, 2026

DeveshParagiri mentioned this pull request Feb 17, 2026

Fix #86: normalize promoted dependent attributes for age realism #103

Closed

DeveshParagiri deleted the codex/issue-86-dependent-attributes branch February 23, 2026 01:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #86: normalize promoted dependent attributes for age realism#93

Fix #86: normalize promoted dependent attributes for age realism#93
RandomOscillations wants to merge 1 commit intocodex/issue-85-dependent-namesfrom
codex/issue-86-dependent-attributes

RandomOscillations commented Feb 17, 2026

Uh oh!

DeveshParagiri left a comment

Uh oh!

DeveshParagiri left a comment

Uh oh!

DeveshParagiri left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

RandomOscillations commented Feb 17, 2026

Summary

Validation

Uh oh!

DeveshParagiri left a comment

Choose a reason for hiding this comment

Code Review

Summary

Normalization Applied

Edge Cases

Minor Suggestion (non-blocking)

Uh oh!

DeveshParagiri left a comment

Choose a reason for hiding this comment

Code Review

Problem

Suggested Fix

Uh oh!

DeveshParagiri left a comment

Choose a reason for hiding this comment

Additional Issue: _education_stage() also uses fragile keyword matching

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants