diff --git a/docs/capabilities.md b/docs/capabilities.md new file mode 100644 index 0000000..d3684b0 --- /dev/null +++ b/docs/capabilities.md @@ -0,0 +1,260 @@ +# What Extropy Can Simulate + +This document walks through Extropy's capabilities by example. Each section describes a type of scenario you can run today, with the underlying mechanics that make it work. + +--- + +## Any Country, Any Culture + +Extropy isn't locked to US demographics. You can simulate populations anywhere. + +**US population responding to a Netflix price hike**: Out of the box. Names come from bundled SSA baby name data (1940-2010 birth decades) and Census surname data by ethnicity. A 45-year-old Black woman born in 1980 gets a name that reflects naming trends for Black girls in that era. Her white husband gets an appropriately correlated name. Their kids get names consistent with 2010s trends. + +**Japanese employees reacting to a remote work policy**: Works. You provide a `NameConfig` with Japanese naming conventions (or let the LLM research it), and Extropy generates culturally appropriate names. The simulation mechanics are identical - the agents reason in first-person, form opinions, share with their network. + +**Indian consumers across Mumbai, Delhi, and Bangalore responding to a fintech launch**: Works. Define city as an attribute with your desired distribution. Agents get sampled across cities, connected via network edges that respect geography, and exposed to marketing through channels you define. + +**Brazilian families deciding whether to migrate for work**: Works. Household sampling gives you family units with correlated attributes. Partners share socioeconomic status, have age gaps that reflect real patterns, and kids are generated as dependents with appropriate ages. + +The pattern: define your population's attributes and distributions, optionally provide country-specific name data, and run the pipeline. 
The simulation engine doesn't care about geography - it cares about attributes, networks, and reasoning. + +--- + +## Individuals or Households + +You control whether agents exist as isolated individuals or as family units. + +**Individual professionals responding to an industry disruption**: Set `household_mode: false`. Each agent is independent. Good for workplace scenarios, B2B decisions, or any context where family structure doesn't matter. + +**Households deciding whether to adopt rooftop solar**: Set `household_mode: true`. Now you get family units. A married couple shares a `household_id` and `last_name`. They have correlated attributes - similar education levels, aligned political views (with configurable assortative mating rates), compatible religious backgrounds. + +The `agent_focus` field in your population spec controls who reasons vs. who exists as context: + +**Primary only (default)**: Only the primary adult in each household is a reasoning agent. Partners and children exist as NPC data attached to that agent - named, with attributes, but not making decisions. Use this when you care about one decision-maker per household (e.g., "the subscriber", "the homeowner"). + +**Couples**: Both adults in a household are reasoning agents. Children are NPCs. Partners influence each other through network edges (weight 1.0). Use this when both adults' opinions matter (e.g., couples deciding on a major purchase, spouses with different political views). + +**All (families)**: Everyone in the household is a reasoning agent, including children old enough to have opinions. Use this when family dynamics matter (e.g., teens influencing parents on tech adoption, multi-generational disagreements). + +The mechanism: set `agent_focus` in your population spec metadata. Values like "families", "households", or "everyone" trigger the "all" mode. Values like "couples", "partners", or "spouses" trigger couples mode. Everything else defaults to primary-only. 
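The `agent_focus` mapping described above can be sketched as a small keyword lookup. This is a minimal illustration, not Extropy's actual implementation: the function name `resolve_agent_focus_mode`, the returned mode labels, and the substring-matching rule are assumptions. Only the trigger words come from the text.

```python
# Hypothetical sketch: how a free-text agent_focus value might map to a mode.
# The function name, mode labels, and matching rule are assumptions.
from typing import Optional

ALL_KEYWORDS = ("families", "households", "everyone")
COUPLES_KEYWORDS = ("couples", "partners", "spouses")


def resolve_agent_focus_mode(agent_focus: Optional[str]) -> str:
    """Return "all", "couples", or "primary_only" for an agent_focus value."""
    text = (agent_focus or "").lower()
    if any(word in text for word in ALL_KEYWORDS):
        return "all"
    if any(word in text for word in COUPLES_KEYWORDS):
        return "couples"
    # Anything unrecognized falls back to one reasoning agent per household.
    return "primary_only"
```

Under these assumptions, a value such as "subscribers" or a missing field resolves to primary-only, matching the documented default.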
+ +**Household types are sampled by age bracket**: +- Singles (one adult, no kids) +- Couples (two adults, no kids) +- Single parents (one adult with kids) +- Couples with kids (two adults with kids) +- Multi-generational (extended family) + +Each age bracket has configurable weights. Young adults skew toward singles and couples. Middle-aged adults skew toward families. Elderly skew toward couples and singles again. + +**Single adults**: Sampled naturally from the "single" household type. No partner, no dependents. Network edges come from coworkers, neighbors, congregation - not family. + +**Childless couples**: Sampled from the "couple" household type. Two adults linked by `partner_id`, sharing `household_id` and `last_name`. No dependents generated. + +--- + +## Realistic Social Networks + +Agents don't exist in isolation. They're connected in networks that reflect real social structures. + +**Partners influence each other most strongly**: Partner edges have weight 1.0. When one spouse changes their opinion, the other is heavily exposed. This models the reality that intimate partners shape each other's views more than anyone else. + +**Coworkers share industry-specific information**: Agents in the same occupation category get connected with `coworker` edges (weight 0.6, capped at 8 connections). An accountant hears about regulatory changes from other accountants, not from nurses. + +**Neighbors observe each other's behavior**: Agents in the same region with similar ages get `neighbor` edges (weight 0.4, capped at 4). When your neighbor installs solar panels, you notice. This models the "keeping up with the Joneses" dynamic. + +**Religious communities spread information through congregations**: Agents sharing religious affiliation and high religiosity get `congregation` edges (weight 0.4, capped at 4). Church announcements, mosque discussions, temple gatherings - information flows through these communities. 
+ +**Parents of school-age children form their own network**: Agents with kids in school, in the same region, get `school_parent` edges (weight 0.35, capped at 3). PTA meetings, school pickup conversations, parent group chats - this captures that social layer. + +**The rest is similarity-based**: After structural edges are placed, remaining connections fill based on attribute similarity. People befriend others like themselves. The overall degree distribution follows a power law - most people have a handful of connections, a few have many. + +--- + +## Static Events or Evolving Timelines + +Some scenarios are a single shock. Others unfold over time. + +**Static: Netflix announces a price increase**: One event, one moment. Netflix raises prices by $3/month. Agents hear about it through news, social media, or email. They form opinions - cancel, keep, or downgrade. They share with their network. Over a few timesteps, information propagates, opinions stabilize, and you see the final distribution. The event itself doesn't change; what evolves is awareness and social influence. + +This is the right model when: +- The event is a discrete announcement or decision +- What matters is how the population responds and influences each other +- There's no new information after the initial shock + +**Evolving: Netflix password crackdown unfolds over months**: +- Month 1: Netflix announces upcoming password-sharing restrictions +- Month 2: Enforcement begins in select markets +- Month 3: Full rollout, first reports of account lockouts +- Month 4: Netflix offers discounted "extra member" add-on +- Month 5: Competitor promotions target frustrated users + +Each timestep, agents see what's happened so far. Their prompts include a recap: "Over the past few months, Netflix first announced the crackdown, then started enforcing it. Last month they offered a cheaper add-on option." The timeline creates a narrative arc where agent reasoning evolves with new information. 
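The recap mechanic can be sketched as splitting the timeline into "what already happened" and "what is new this timestep". The names `TimelineEntry` and `build_recap` and the label format are hypothetical, chosen for illustration. The real prompt rendering may differ.

```python
# Hypothetical sketch of building a per-timestep recap from timeline entries.
# TimelineEntry and build_recap are illustrative names, not Extropy's API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimelineEntry:
    timestep: int
    description: str


def build_recap(
    timeline: list[TimelineEntry], current_timestep: int, unit: str = "Month"
) -> tuple[list[str], Optional[str]]:
    """Split the timeline into past developments and the current one."""
    recap: list[str] = []
    current: Optional[str] = None
    for entry in sorted(timeline, key=lambda e: e.timestep):
        if entry.timestep < current_timestep:
            # Earlier developments become recap bullets.
            recap.append(f"{unit} {entry.timestep}: {entry.description}")
        elif entry.timestep == current_timestep:
            # The entry for this timestep is the new development.
            current = entry.description
    return recap, current


timeline = [
    TimelineEntry(1, "Netflix announces upcoming password-sharing restrictions"),
    TimelineEntry(2, "Enforcement begins in select markets"),
    TimelineEntry(3, "Full rollout, first reports of account lockouts"),
]
recap, current = build_recap(timeline, current_timestep=3)
```

Entries with a later timestep are simply ignored, so agents never see developments that have not happened yet.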
+ +This is the right model when: +- The situation develops with new facts over time +- Agent responses to Week 1 should differ from Week 5 +- You want to model how opinions shift as circumstances change + +**Evolving: A crisis that develops and resolves**: +- Day 1: Initial reports of data breach, uncertainty about scope +- Day 2: Company confirms breach, announces investigation +- Day 3: Details emerge - 10 million accounts affected +- Day 5: Company offers free credit monitoring +- Day 7: CEO resigns +- Day 10: New security measures announced + +Agents experience the crisis as it unfolds. Early timesteps have high uncertainty and speculation. Later timesteps have concrete information. Memory traces let agents reference their earlier reactions: "Last week I was panicking about my data. Now that they're offering monitoring, I'm less worried but still annoyed." + +**Automatic detection**: If you define a single event with no timeline, Extropy treats it as static. If you provide multiple events or explicit timeline entries, it switches to evolving mode. You can override with `timeline_mode: static` or `timeline_mode: evolving`. + +**Timestep units are configurable**: Days, weeks, months - whatever fits your scenario. A crisis might unfold over days. A policy change might play out over months. A generational shift might span years. + +--- + +## Multiple Exposure Channels + +People hear about things through different channels, and the channel matters. + +**Mainstream media reaches broadly but impersonally**: High reach probability, but agents process it as "something I saw on the news." Good for initial awareness, less effective for deep persuasion. + +**Social media spreads through networks**: Reach follows network edges. If your connections are sharing something, you see it. The viral dynamic emerges naturally - well-connected agents amplify information. + +**Word of mouth is personal and trusted**: Exposure happens through direct network edges only. 
Lower reach, higher impact. When your brother tells you something, it carries more weight than a headline. + +**Official communication targets specific groups**: An employer announcement reaches employees. A utility notice reaches customers. Channel targeting filters by attributes - only relevant agents get exposed. + +**Observation lets agents notice behavior**: Agents can witness what others do, not just hear what they say. When a neighbor buys an electric car, agents on that network edge might get exposed through observation. This models the "seeing is believing" dynamic. + +Each channel has: +- Reach probability (what fraction of eligible agents get exposed) +- Targeting rules (which agents are eligible) +- Experience template (how the agent encounters the information) + +You can mix channels. A scenario might start with mainstream media coverage, then spread through social media, then deepen through word of mouth as people discuss it with family. + +--- + +## Any Kind of Outcome + +You decide what you're measuring. + +**Categorical choices**: "Will you support, oppose, or remain neutral?" "Will you buy, wait, or skip?" Any discrete set of options. The first required categorical outcome becomes the agent's "position" - the headline metric for aggregation. + +**Boolean decisions**: "Will you share this with others?" "Will you attend the event?" Yes or no. + +**Continuous scales**: "How price-sensitive are you on a scale of 0 to 1?" "What's your trust level from 0 to 100?" Useful for measuring intensity, not just direction. + +**Open-ended responses**: "What are your main concerns?" Free text. The agent reasons naturally without being forced into categories. These skip the classification pass entirely - the reasoning itself is the outcome. + +You can mix outcome types. 
A scenario might have: +- A categorical position (support/oppose/neutral) +- A boolean share intention +- A continuous intensity score +- An open-ended elaboration + +All get captured in the same simulation run. + +--- + +## Two-Pass or Merged Reasoning + +You control the tradeoff between cost and reasoning quality. + +**Two-pass (default)**: +1. Pass 1 asks the agent to reason freely in first-person. No outcome categories in sight. Just "You're this person, this happened, how do you feel?" +2. Pass 2 takes that reasoning and classifies it into your defined outcomes using a faster, cheaper model. + +This separation prevents the central tendency problem where agents gravitate to safe middle options when they see the categories upfront. Reasoning quality is higher because the agent isn't gaming the schema. + +**Merged pass** (`--merged-pass`): +Single call with both reasoning and outcomes in one schema. Cheaper - one API call instead of two. Faster - no round-trip between passes. But the agent sees the outcome categories while reasoning, which can bias responses toward the middle. + +Use merged pass for: +- Cost-sensitive runs with many agents +- Quick exploratory simulations +- Scenarios where you trust the model to reason past the schema + +Use two-pass for: +- Final production runs where quality matters +- Scenarios with polarizing topics where central tendency is a real risk +- Research where reasoning traces need to be unbiased + +--- + +## Memory and Temporal Awareness + +Agents aren't goldfish. They remember. + +**Full memory traces**: Every timestep, agents get their complete history. "In Week 1, I was skeptical. By Week 3, I was coming around. Last week I committed to trying it." Memories include the reasoning summary, sentiment, and conviction at each point. + +**Temporal labeling**: Prompts explicitly state the current timestep. "It's now Week 5 of this situation." 
Agents can reason about time - how long something has been going on, whether their views have been stable or shifting. + +**Emotional trajectory**: The system detects sentiment trends. "You started skeptical but have been warming up" or "Your enthusiasm has been fading over the past few weeks." This shapes agent self-awareness. + +**Intent accountability**: If an agent said they'd do something, they get reminded. "Last week you said you were going to look into alternatives. Has anything changed?" This prevents agents from making bold claims they never follow through on. + +**Conviction decay**: Strong opinions fade without reinforcement. A conviction score of 0.9 doesn't stay at 0.9 forever. Configurable decay rate means you can model how quickly certainty erodes. + +**Flip resistance**: High-conviction agents are harder to move. If someone is absolutely certain, new information needs to be compelling to shift them. This prevents unrealistic opinion swings. + +--- + +## Social Dynamics That Emerge + +You don't program social behavior explicitly. It emerges from the mechanics. + +**Peer pressure**: Agents see what their network neighbors think. "My coworker Darnell is strongly opposed. My neighbor Maria is on board." This named, specific peer pressure is more realistic than abstract statistics. + +**Conformity variation**: Agents have a `conformity` attribute (0-1). High-conformity agents get prompted with "I tend to go along with what most people around me are doing." Low-conformity agents get "I tend to form my own opinion regardless of what others think." This shapes how they weight peer opinions. + +**Local mood**: Agents sense the aggregate sentiment of their network. "Most people around me seem worried." This is vibes, not statistics - realistic ambient social pressure. + +**Macro trends as context**: Population-level shifts get injected as background. "The general mood is shifting toward acceptance." "More and more people are taking action." 
Agents sense the zeitgeist without knowing exact numbers. + +**Viral sharing**: Agents with high conviction are more likely to share. When they share, their network neighbors get exposed. Popular opinions spread; unpopular ones don't. Network structure determines what goes viral - well-connected agents amplify. + +--- + +## What You Get Out + +After simulation runs, you have: + +**Position distributions**: What fraction of the population supports, opposes, or remains neutral? Segmented by any attribute - how do young people differ from old? Urban from rural? High-income from low-income? + +**Sentiment trajectories**: How did emotional response evolve over time? Did initial negativity soften? Did enthusiasm fade? + +**Conviction patterns**: Where are the true believers vs. the persuadable middle? How does certainty correlate with position? + +**Sharing behavior**: Who's talking about this? Which demographics amplify vs. stay silent? + +**Reasoning traces**: The actual first-person reasoning each agent produced. Qualitative insight into why people think what they think. + +**Network effects**: How did information flow? Which communities adopted early? Where did resistance cluster? 
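As a rough illustration of the segmented position distributions described above, the aggregation could look like the sketch below. The record layout (one dict per agent with a `position` key plus attributes) and the function name `position_distribution` are assumptions, not the format Extropy actually writes.

```python
# Hypothetical sketch: position shares overall or per segment.
# The record layout (dicts with a "position" key) is an assumption.
from collections import Counter, defaultdict
from typing import Optional


def position_distribution(
    agents: list[dict], segment_by: Optional[str] = None
) -> dict[str, dict[str, float]]:
    """Return {segment: {position: fraction}}; one "all" segment if unsegmented."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for agent in agents:
        segment = str(agent.get(segment_by, "unknown")) if segment_by else "all"
        counts[segment][agent["position"]] += 1
    return {
        segment: {pos: n / sum(c.values()) for pos, n in c.items()}
        for segment, c in counts.items()
    }
```

Calling it with `segment_by="age_group"`, for example, would give one distribution per age group instead of a single population-wide one.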
+ +--- + +## Scenarios You Can Run Today + +To make it concrete, here are scenarios that work right now with no additional development: + +- US households responding to a streaming service price increase +- Japanese employees adapting to return-to-office mandates +- Indian consumers in multiple cities evaluating a new fintech app +- Brazilian families weighing migration decisions +- UK residents responding to congestion pricing expansion +- German citizens reacting to energy policy changes +- Mixed urban/rural populations facing a natural disaster +- Multi-generational households navigating technology adoption +- Professional networks processing industry disruption news +- Religious communities responding to doctrinal changes +- Parent networks reacting to school policy updates +- Any population, any country, any event, any outcome structure + +The constraints are: +- No agent-to-agent conversations (yet) +- No agents creating public social posts (yet) +- No runtime fidelity/cost tradeoffs beyond merged pass (yet) +- No validation against historical ground truth (yet) + +Those are Phases D, E, F, and G. What's here now is Phases A, B, and C - the core simulation engine with households, networks, timelines, and reasoning. diff --git a/docs/simulation-v2-architecture.md b/docs/simulation-v2-architecture.md index 2311b23..6b7384a 100644 --- a/docs/simulation-v2-architecture.md +++ b/docs/simulation-v2-architecture.md @@ -34,7 +34,7 @@ Decisions confirmed before implementation. These override any conflicting detail | 5 | Timeline merge semantics | Timeline entry overrides base event for that timestep. | | 6 | DB schema for new artifacts | Define conversations/posts/action_history tables before Phase D. | | 7 | Name data | Local SSA baby names + Census surnames, bundled CSVs (~500KB), US-only. Non-US via country-specific CSVs later behind same interface: `generate_name(gender, ethnicity, birth_decade, country="US")`. 
| -| 8 | Conformity/threshold mechanics | Soft prompt signal only (inject local adoption ratio + conformity phrasing). No hard numeric gate. | +| 8 | Conformity/threshold mechanics | Soft prompt signal only (conformity self-awareness + peer opinions + mood rendering). No explicit ratios or hard numeric gates. | | 9 | Backtesting ground-truth | Define one validation dataset schema before Phase G. | ### Phase-Specific Decisions @@ -1034,12 +1034,12 @@ The 12 tenets below define what a high-fidelity simulation must satisfy. Tenets | 3 | Social hierarchy & influence topology | **Partial** | Structural role edges, degree multipliers in network config, edge weight hierarchy | No explicit power-law degree enforcement; no hub/opinion-leader generation. Need scenario-dependent centrality targets | | 4 | Behavioral heterogeneity | **Partial** | Big Five personality, risk tolerance, institutional trust, cognitive attributes vary per agent | Decision-policy heterogeneity relies entirely on LLM interpretation of persona. Need explicit behavioral parameters (conformity threshold, action inertia) as agent attributes | | 5 | Temporal dynamics & decay | **Strong** | Conviction decay, temporal prompt awareness, emotional trajectory, memory history, scenario timeline with evolving events | Need intent→action accountability loop (surface prior action_intent, ask about follow-through) | -| 6 | Social contagion & network effects | **Partial** | Network propagation, share modifiers, conversation system, aggregate mood, peer opinions | No explicit threshold/complex contagion. Need per-agent conformity parameter + local adoption ratio injection | +| 6 | Social contagion & network effects | **Partial** | Network propagation, share modifiers, conversation system, aggregate mood, peer opinions | No explicit threshold/complex contagion. 
Need per-agent conformity parameter + conformity-aware prompt phrasing | | 7 | Friction & transaction costs | **Weak** | option_friction on outcomes, bounded confidence mechanics | Biggest gap. Need explicit intent→behavior pipeline: surface what agent planned vs what they actually did. Friction emerges from agent constraints but isn't tracked or measured | | 8 | Bounded rationality & heuristics | **Strong** | LLM is inherently a bounded rationality engine. Persona attributes (education, digital literacy, neuroticism) shape heuristic use. Agents satisfice, anchor, exhibit status quo bias naturally | Could strengthen with explicit bias nudges in prompts for specific attributes | | 9 | Environmental & contextual sensitivity | **Partial** | Scenario timeline handles exogenous shocks. Channel templates adapt to agent demographics | Need ambient macro context in every prompt (economic conditions, cultural moment). Need previous-timestep macro summary injection | | 10 | Identity & group membership | **Partial** | race_ethnicity, political_orientation, religious_affiliation in persona. Social role edges create in-group connections | Need identity-threat framing: when the scenario threatens a group identity, persona rendering should explicitly flag it as identity-relevant | -| 11 | Preference interdependence | **Partial** | Aggregate mood rendering ("most people I know are doing X"), peer opinions, social posts. Bandwagon/FOMO effects emerge from context | Need explicit local adoption ratio in prompt: "X% of people you know have already done Y." Makes interdependence concrete, not just vibes | +| 11 | Preference interdependence | **Partial** | Aggregate mood rendering ("most people I know are doing X"), peer opinions, social posts. 
Bandwagon/FOMO effects emerge from context | Named peer opinions + local mood + macro summary provide social pressure without omniscient ratio framing | | 12 | Macro-micro feedback loops | **Partial** | Micro→macro works (agent decisions → aggregate stats). Timeline handles exogenous macro shifts | No endogenous macro: agent behavior doesn't produce emergent macro variables that feed back. Need at minimum: inject previous timestep aggregates as ambient context | ### Concrete Fixes to Close Gaps @@ -1087,16 +1087,19 @@ These are the minimum changes needed to move every tenet to **Strong**. Listed i **Soft conformity/threshold behavior:** - Add `conformity` as a standard personality attribute (0-1 scale, correlated with agreeableness). Sampled at population creation time. -- At prompt build time, compute **local adoption ratio**: what fraction of this agent's network has already taken action (changed position, shared, etc.) -- Inject into prompt: "About 7 out of 10 people you know have already started making changes. You tend to [wait until most people around you have acted / act independently of what others are doing]." (phrasing depends on conformity level) -- This gives the LLM explicit threshold context without hardcoding a threshold formula. A high-conformity agent seeing 70% adoption will likely act. A low-conformity agent seeing the same might resist specifically because everyone else is doing it (contrarian behavior). +- Inject conformity self-awareness into prompt: "I tend to go along with what most people around me are doing" (high) or "I tend to form my own opinion regardless of what others think" (low). Mid-range agents get no explicit phrasing. 
+- Social pressure is conveyed through **existing mechanisms**, not explicit ratios: + - Named peer opinions: "My coworker Darnell thinks X" + - Local mood rendering: "Most people around me seem worried" + - Macro summary: "The general mood is shifting toward X" +- **Rationale:** People don't actually know "7 out of 10 contacts did X" — that's omniscient narrator framing. Real social pressure comes from specific conversations and vague impressions, which the peer opinion and mood systems already capture. **Macro state feedback:** - After each timestep, compute macro summary from TimestepSummary data: - Position distribution rendered as "Most people are choosing X. A growing minority is doing Y." - Sentiment trend: "The general mood is getting worse / stabilizing / improving." - Exposure saturation: "Almost everyone has heard about this now." - - Action adoption rate: "About X% of people have already taken concrete action." + - Action momentum: "More and more people are taking action" / "Most people are still waiting" - Inject this into every agent's next-timestep prompt as ambient context, rendered as what the agent would sense from media/social feeds, not raw numbers. - This closes the macro→micro loop: agent decisions → aggregate stats → rendered as ambient context → influences next round of agent decisions. @@ -1203,7 +1206,7 @@ Ship this alone. 
Every simulation immediately feels more human, and the accounta
 - Scenario timeline: sequence of events at specified timesteps
 - Timeline injection into agent prompts as "what's happened since last time"
-- Local adoption ratio computed per agent ("7 out of 10 people you know have acted")
+- Named peer opinions + local mood convey social pressure without explicit ratios
 - Conformity-aware prompt rendering ("You tend to wait for others / act independently")
 - Ambient scenario context field (`background_context` in ScenarioSpec)
 - Macro state feedback: timestep aggregates rendered as ambient vibes in next prompt
diff --git a/extropy/cli/commands/extend.py b/extropy/cli/commands/extend.py
index d81efb6..d0c441b 100644
--- a/extropy/cli/commands/extend.py
+++ b/extropy/cli/commands/extend.py
@@ -233,6 +233,8 @@ def do_hydration():
             attributes=bound_attrs,
             sampling_order=sampling_order,
             sources=sources,
+            household_config=household_config,
+            name_config=name_config,
         )
         merged_spec = base.merge(extension_spec)
diff --git a/extropy/cli/commands/scenario.py b/extropy/cli/commands/scenario.py
index 6a5d99d..164f037 100644
--- a/extropy/cli/commands/scenario.py
+++ b/extropy/cli/commands/scenario.py
@@ -36,6 +36,11 @@ def scenario_command(
         "-o",
         help="Output path (defaults to {population_stem}.scenario.yaml)",
     ),
+    timeline: str = typer.Option(
+        "auto",
+        "--timeline",
+        help="Timeline mode: auto (LLM decides), static (single event), evolving (multi-event)",
+    ),
     yes: bool = typer.Option(False, "--yes", "-y", help="Skip confirmation prompts"),
 ):
     """
@@ -116,6 +121,8 @@ def on_progress(step: str, status: str):
     def run_pipeline():
         nonlocal result_spec, validation_result, pipeline_error
         try:
+            # Convert timeline mode (auto -> None for LLM decision)
+            timeline_mode = None if timeline == "auto" else timeline
             result_spec, validation_result = create_scenario(
                 description=scenario_desc,
                 population_spec_path=population,
@@ -124,6 +131,7 @@ def run_pipeline():
                 network_id=network_id,
                 output_path=None,  # Don't save yet
                 on_progress=on_progress,
+                timeline_mode=timeline_mode,
             )
         except Exception as e:
             pipeline_error = e
@@ -214,6 +222,29 @@ def run_pipeline():
         )
         console.print()
+    # Timeline info (Phase C)
+    if result_spec.timeline:
+        console.print(f"[bold]Timeline:[/bold] {len(result_spec.timeline)} events")
+        for te in result_spec.timeline[:3]:
+            desc = te.description or te.event.content[:40]
+            console.print(f" • t={te.timestep}: {desc}")
+        if len(result_spec.timeline) > 3:
+            console.print(f" [dim]... and {len(result_spec.timeline) - 3} more[/dim]")
+        console.print()
+    else:
+        console.print("[bold]Timeline:[/bold] static (single event)")
+        console.print()
+
+    # Background context (Phase C)
+    if result_spec.background_context:
+        ctx_preview = (
+            result_spec.background_context[:60] + "..."
+            if len(result_spec.background_context) > 60
+            else result_spec.background_context
+        )
+        console.print(f"[bold]Background:[/bold] {ctx_preview}")
+        console.print()
+
     # Validation Results
     if validation_result.errors:
         console.print(
diff --git a/extropy/cli/commands/simulate.py b/extropy/cli/commands/simulate.py
index 3b7b883..8cd802b 100644
--- a/extropy/cli/commands/simulate.py
+++ b/extropy/cli/commands/simulate.py
@@ -186,6 +186,11 @@ def simulate_command(
         "-p",
         help="PersonaConfig YAML for embodied personas (auto-detected if not specified)",
     ),
+    merged_pass: bool = typer.Option(
+        False,
+        "--merged-pass",
+        help="Use single merged reasoning pass instead of two-pass (experimental)",
+    ),
     quiet: bool = typer.Option(False, "--quiet", "-q", help="Suppress progress output"),
     verbose: bool = typer.Option(False, "--verbose", "-v", help="Show detailed logs"),
     debug: bool = typer.Option(
@@ -316,6 +321,7 @@ def on_progress(timestep: int, max_timesteps: int, status: str):
                 writer_queue_size=writer_queue_size,
                 db_write_batch_size=db_write_batch_size,
                 resource_governor=governor,
+                merged_pass=merged_pass,
             )
             simulation_error = None
         except Exception as e:
@@ -352,6 +358,7 @@ def do_simulation():
                 writer_queue_size=writer_queue_size,
                 db_write_batch_size=db_write_batch_size,
                 resource_governor=governor,
+                merged_pass=merged_pass,
             )
         except Exception as e:
             simulation_error = e
diff --git a/extropy/core/models/__init__.py b/extropy/core/models/__init__.py
index 3a09c9b..7b558a8 100644
--- a/extropy/core/models/__init__.py
+++ b/extropy/core/models/__init__.py
@@ -57,6 +57,8 @@
     # Event
     EventType,
     Event,
+    # Timeline
+    TimelineEvent,
     # Exposure
     ExposureChannel,
     ExposureRule,
@@ -160,6 +162,8 @@
     # Scenario - Event
     "EventType",
     "Event",
+    # Scenario - Timeline
+    "TimelineEvent",
     # Scenario - Exposure
     "ExposureChannel",
     "ExposureRule",
diff --git a/extropy/core/models/scenario.py b/extropy/core/models/scenario.py
index 17d4ef6..d7415fe 100644
--- a/extropy/core/models/scenario.py
+++ b/extropy/core/models/scenario.py
@@ -114,6 +114,26 @@ class SeedExposure(BaseModel):
     )
+
+class TimelineEvent(BaseModel):
+    """A development in the scenario timeline.
+
+    Timeline events represent how a scenario evolves over time. For evolving
+    scenarios (crises, campaigns), multiple events occur at different timesteps.
+    Static scenarios (policy announcements) have no timeline events.
+    """
+
+    timestep: int = Field(ge=0, description="When this development occurs")
+    event: Event = Field(description="The event content at this timestep")
+    exposure_rules: list[ExposureRule] | None = Field(
+        default=None,
+        description="Custom exposure rules; if None, reuses seed_exposure.rules with updated content",
+    )
+    description: str | None = Field(
+        default=None,
+        description="Human-readable context for this development",
+    )
+
+
 # =============================================================================
 # Interaction Model
 # =============================================================================
@@ -285,6 +305,10 @@ class ScenarioSpec(BaseModel):
     meta: ScenarioMeta
     event: Event
+    timeline: list[TimelineEvent] | None = Field(
+        default=None,
+        description="Subsequent developments; None or empty = static scenario",
+    )
     seed_exposure: SeedExposure
     interaction: InteractionConfig
     spread: SpreadConfig
diff --git a/extropy/core/models/simulation.py b/extropy/core/models/simulation.py
index a226f63..134a5d7 100644
--- a/extropy/core/models/simulation.py
+++ b/extropy/core/models/simulation.py
@@ -314,6 +314,23 @@ class ReasoningContext(BaseModel):
         default_factory=dict,
         description="Mapping of agent_id → first name for resolving peer references",
     )
+    # Phase C additions
+    timeline_recap: list[str] | None = Field(
+        default=None,
+        description="Bullet list of what's happened so far in the scenario",
+    )
+    current_development: str | None = Field(
+        default=None,
+        description="This timestep's new development (if any)",
+    )
+    observable_peer_actions: int | None = Field(
+        default=None,
+        description="Count of neighbors who visibly acted (shared/posted)",
+    )
+    conformity: float | None = Field(
+        default=None,
+        description="Agent's conformity attribute (0-1)",
+    )
 # =============================================================================
@@ -390,6 +407,10 @@ class SimulationRunConfig(BaseModel):
         default=None,
         description="Max concurrent async reasoning calls (None = auto from RPM)",
     )
+    merged_pass: bool = Field(
+        default=False,
+        description="Use single merged pass instead of two-pass reasoning (experimental)",
+    )
     # Backward compat aliases
     @property
diff --git a/extropy/core/providers/anthropic.py b/extropy/core/providers/anthropic.py
index 3363fbf..a987b47 100644
--- a/extropy/core/providers/anthropic.py
+++ b/extropy/core/providers/anthropic.py
@@ -29,13 +29,27 @@
 def _clean_schema_for_tool(schema: dict) -> dict:
     """Clean a JSON schema for use as a tool input_schema.
-    Removes fields that aren't valid in tool input schemas
-    (like 'additionalProperties' in nested objects that Claude
-    doesn't support in tool definitions).
+    Anthropic structured outputs support additionalProperties: false but NOT
+    schema-valued additionalProperties (e.g. {"type": "number"}).
+
+    This function:
+    - Keeps additionalProperties: false (valid and useful)
+    - Strips additionalProperties when it's a dict/schema (not supported)
+    - Logs a warning when stripping schema-valued additionalProperties
     """
     cleaned = {}
     for key, value in schema.items():
         if key == "additionalProperties":
+            if value is False:
+                # Keep additionalProperties: false - it's valid
+                cleaned[key] = value
+            elif isinstance(value, dict):
+                # Schema-valued additionalProperties not supported - strip with warning
+                logger.warning(
+                    "Stripping schema-valued additionalProperties from tool schema "
+                    "(not supported by Anthropic structured outputs)"
+                )
+            # Skip other truthy values (True, etc.)
             continue
         if isinstance(value, dict):
             cleaned[key] = _clean_schema_for_tool(value)
diff --git a/extropy/population/persona/renderer.py b/extropy/population/persona/renderer.py
index cfc0966..82fb87e 100644
--- a/extropy/population/persona/renderer.py
+++ b/extropy/population/persona/renderer.py
@@ -297,6 +297,180 @@ def render_intro(agent: dict[str, Any], config: PersonaConfig) -> str:
         return f"## Who I Am\n\n[Error rendering intro: {e}]"
+# =============================================================================
+# Household Section Rendering
+# =============================================================================
+
+# Templates for partner phrases, keyed by (has_kids, partner_gender)
+_PARTNER_TEMPLATES = [
+    "My {title} {name} is {age}.",
+    "{name} and I have been together for a while now — {pronoun}'s {age}.",
+    "I live with {name} ({age}), my {title}.",
+]
+
+# Templates for kids, keyed by count
+_KIDS_TEMPLATES_SINGLE = [
+    "Our {relationship} {name} is {age}{school_phrase}.",
+    "We have a {relationship}, {name}, who's {age}{school_phrase}.",
+]
+
+_KIDS_TEMPLATES_MULTI = [
+    "We have {count} kids: {kid_list}.",
+    "Our children are {kid_list}.",
+]
+
+# Templates for elderly dependents
+_ELDERLY_TEMPLATES = [
+    "My {relationship} {name} ({age}) also lives with us.",
+    "{name}, my {relationship}, lives with us at {age}.",
+]
+
+# Single parent templates
+_SINGLE_PARENT_TEMPLATES = [
+    "It's just me and {kid_summary}.",
+    "I'm raising {kid_summary} on my own.",
+]
+
+
+def _format_age(age: int | str) -> str:
+    """Format age, handling babies specially."""
+    try:
+        age_int = int(age)
+        if age_int == 0:
+            return "less than a year old"
+        if age_int == 1:
+            return "1 year old"
+        return str(age_int)
+    except (ValueError, TypeError):
+        return str(age)
+
+
+def _format_kid(dep: dict[str, Any]) -> str:
+    """Format a single kid for listing."""
+    name = dep.get("name", "")
+    age = dep.get("age", "")
+    school = dep.get("school_status")
+
+    age_str =
_format_age(age) + + if school and school not in ("adult", "working_adult", "home"): + return f"{name} ({age_str}, {school.replace('_', ' ')})" + return f"{name} ({age_str})" + + +def _get_school_phrase(dep: dict[str, Any]) -> str: + """Get school status phrase for a dependent.""" + school = dep.get("school_status") + if not school or school in ("adult", "working_adult", "home"): + return "" + return f", in {school.replace('_', ' ')}" + + +def render_household_section(agent: dict[str, Any], rng: Any = None) -> str: + """Render the household section with partner and dependents. + + Args: + agent: Agent dict with optional partner_npc and dependents + rng: Optional random source for template variation (defaults to an RNG seeded from the agent ID) + + Returns: + Rendered household section, or empty string if no household context + """ + partner = agent.get("partner_npc") + dependents = agent.get("dependents", []) + partner_id = agent.get("partner_id") # If partner is also an agent + + # Skip if no household context + if not partner and not dependents and not partner_id: + return "" + + # Seed from the agent ID string: built-in hash() of a str is salted per process, so it is not reproducible across runs + if rng is None: + import random + + seed = str(agent.get("_id", "")) + rng = random.Random(seed) + + phrases = [] + + # Separate kids from elderly + kids = [d for d in dependents if d.get("relationship") in ("son", "daughter")] + elderly = [ + d + for d in dependents + if d.get("relationship") in ("mother", "father", "grandmother", "grandfather") + ] + + # Partner phrase + if partner: + name = partner.get("first_name", "my partner") + age = partner.get("age", "") + gender = partner.get("gender", "") + + # Title based on gender + title = ( + "husband" + if gender == "male" + else "wife" + if gender == "female" + else "partner" + ) + pronoun = "he" if gender == "male" else "she" if gender == "female" else "they" + + template = rng.choice(_PARTNER_TEMPLATES) + phrase = template.format(name=name, age=age, title=title, pronoun=pronoun) +
phrases.append(phrase) + elif partner_id: + # Partner is an agent, just note we have one + phrases.append("I live with my partner.") + + # Kids phrase + if kids: + is_single_parent = not partner and not partner_id + + if len(kids) == 1: + kid = kids[0] + name = kid.get("name", "my child") + age = _format_age(kid.get("age", "")) + rel = kid.get("relationship", "child") + school_phrase = _get_school_phrase(kid) + + if is_single_parent: + template = rng.choice(_SINGLE_PARENT_TEMPLATES) + phrase = template.format(kid_summary=f"my {rel} {name} ({age})") + else: + template = rng.choice(_KIDS_TEMPLATES_SINGLE) + phrase = template.format( + name=name, age=age, relationship=rel, school_phrase=school_phrase + ) + phrases.append(phrase) + else: + kid_list = ", ".join(_format_kid(k) for k in kids[:-1]) + kid_list += f" and {_format_kid(kids[-1])}" + + if is_single_parent: + template = rng.choice(_SINGLE_PARENT_TEMPLATES) + phrases.append(template.format(kid_summary=f"my {len(kids)} kids")) + phrases.append(f"That's {kid_list}.") + else: + template = rng.choice(_KIDS_TEMPLATES_MULTI) + phrases.append(template.format(count=len(kids), kid_list=kid_list)) + + # Elderly phrase + for dep in elderly: + name = dep.get("name", "") + age = dep.get("age", "") + rel = dep.get("relationship", "parent") + + template = rng.choice(_ELDERLY_TEMPLATES) + phrases.append(template.format(name=name, age=age, relationship=rel)) + + if not phrases: + return "" + + return "## My Household\n\n" + " ".join(phrases) + + def render_persona( agent: dict[str, Any], config: PersonaConfig, @@ -321,6 +495,11 @@ def render_persona( if intro: sections.append(intro) + # Render household section (partner, kids, elderly) + household = render_household_section(agent) + if household: + sections.append(household) + decision_set = set(decision_relevant_attributes or []) # Render decision-relevant attributes first if specified diff --git a/extropy/population/sampler/core.py b/extropy/population/sampler/core.py index 
a6ccf2a..dc03686 100644 --- a/extropy/population/sampler/core.py +++ b/extropy/population/sampler/core.py @@ -22,6 +22,7 @@ SamplingStats, SamplingResult, HouseholdConfig, + NameConfig, ) from ...utils.callbacks import ItemProgressCallback from .distributions import sample_distribution, coerce_to_type @@ -217,6 +218,7 @@ def _generate_npc_partner( categorical_options: dict[str, list[str]], rng: random.Random, config: HouseholdConfig, + name_config: NameConfig | None = None, ) -> dict[str, Any]: """Generate a lightweight NPC partner profile for context. @@ -246,6 +248,18 @@ def _generate_npc_partner( if attr in primary: partner[attr] = primary[attr] + # Generate name for partner + partner_age = partner.get("age") + birth_decade = age_to_birth_decade(partner_age) if partner_age is not None else None + first_name, _ = generate_name( + gender=partner["gender"], + ethnicity=partner.get("race_ethnicity"), + birth_decade=birth_decade, + seed=rng.randint(0, 2**31), + name_config=name_config, + ) + partner["first_name"] = first_name + if primary.get("last_name"): partner["last_name"] = primary["last_name"] @@ -391,7 +405,12 @@ def _sample_population_households( else: # Partner is NPC context on the primary agent npc_partner = _generate_npc_partner( - adult1, household_attrs, categorical_options, rng, config + adult1, + household_attrs, + categorical_options, + rng, + config, + name_config=spec.meta.name_config, ) adult1["partner_npc"] = npc_partner adult1["partner_id"] = None diff --git a/extropy/population/spec_builder/hydrator.py b/extropy/population/spec_builder/hydrator.py index 2ff4614..b75461a 100644 --- a/extropy/population/spec_builder/hydrator.py +++ b/extropy/population/spec_builder/hydrator.py @@ -14,6 +14,8 @@ unterminated strings, invalid formulas, etc. before proceeding. 
""" +import logging + from ...core.llm import RetryCallback from ...core.models import ( AttributeSpec, @@ -32,6 +34,83 @@ hydrate_name_config, ) +logger = logging.getLogger(__name__) + +# Keywords that suggest household context is relevant +_HOUSEHOLD_KEYWORDS = frozenset( + [ + # Population keywords + "family", + "families", + "couple", + "couples", + "household", + "households", + "parent", + "parents", + "retired", + "retiree", + "retirees", + "spouse", + "spouses", + "married", + "cohabit", + "living together", + "home owner", + "homeowner", + # Scenario keywords that imply household context + "childcare", + "child care", + "parental", + "maternity", + "paternity", + "housing", + "mortgage", + "rent", + "school district", + "relocation", + "moving", + ] +) + + +def _should_hydrate_household_config( + description: str, + attributes: list[DiscoveredAttribute], +) -> bool: + """Check if this population/scenario needs household config research. + + Household config is expensive (agentic research with web search). + Only hydrate it when: + 1. Any discovered attribute has scope="household", OR + 2. Description (population or scenario) mentions household-related keywords + + At spec time, description is just the population ("2000 physicians"). + At extend time, description is "population + scenario" so scenario + keywords like "childcare policy" will trigger household research. 
+ + Args: + description: Population description OR "population + scenario" for extend + attributes: Discovered attributes from selector + + Returns: + True if household config should be researched + """ + # Check if any attribute has household scope + has_household_attrs = any( + getattr(attr, "scope", "individual") == "household" for attr in attributes + ) + if has_household_attrs: + return True + + # Check if description implies household context + desc_lower = description.lower() + for keyword in _HOUSEHOLD_KEYWORDS: + if keyword in desc_lower: + return True + + return False + # ============================================================================= # Main Orchestrator @@ -223,17 +302,25 @@ def on_retry(attempt: int, max_retries: int, error_summary: str): len(modifier_sources), ) - # Step 2e: Household config - report("2e", "Researching household composition...") - household_config, hh_sources = hydrate_household_config( - population=population, - geography=geography, - model=model, - reasoning_effort=reasoning_effort, - on_retry=make_retry_callback("2e"), - ) - all_sources.extend(hh_sources) - report("2e", "Household config researched", len(hh_sources)) + # Step 2e: Household config (conditional - skip for purely professional populations) + if _should_hydrate_household_config(population, attributes): + report("2e", "Researching household composition...") + household_config, hh_sources = hydrate_household_config( + population=population, + geography=geography, + model=model, + reasoning_effort=reasoning_effort, + on_retry=make_retry_callback("2e"), + ) + all_sources.extend(hh_sources) + report("2e", "Household config researched", len(hh_sources)) + else: + logger.info( + "Skipping household config hydration - no household-scoped attributes " + "or household keywords in population description" + ) + household_config = HouseholdConfig() + report("2e", "Skipped (no household context)", 0) # Step 2f: Name config report("2f", "Researching population-appropriate 
names...") diff --git a/extropy/population/spec_builder/hydrators/household.py b/extropy/population/spec_builder/hydrators/household.py index 4222fee..4a8a02c 100644 --- a/extropy/population/spec_builder/hydrators/household.py +++ b/extropy/population/spec_builder/hydrators/household.py @@ -42,15 +42,20 @@ def hydrate_household_config( For this population, provide statistically grounded values for: -1. **Age brackets**: List of [upper_bound_exclusive, bracket_label] pairs that partition the adult age range. E.g. [[30, "18-29"], [45, "30-44"], [65, "45-64"], [999, "65+"]] +1. **Age brackets**: Array of objects with `upper_bound` (exclusive integer) and `label` (string). + Example: [{{"upper_bound": 30, "label": "18-29"}}, {{"upper_bound": 45, "label": "30-44"}}, {{"upper_bound": 65, "label": "45-64"}}, {{"upper_bound": 999, "label": "65+"}}] -2. **Household type weights**: For each age bracket, the probability distribution over household types: "single", "couple", "single_parent", "couple_with_kids", "multi_generational". Weights must sum to ~1.0 per bracket. +2. **Household type weights**: Array of objects, each with `bracket` (label from age_brackets) and `types` (array of {{"type": string, "weight": number}}). + Valid types: "single", "couple", "single_parent", "couple_with_kids", "multi_generational". Weights must sum to ~1.0 per bracket. + Example: [{{"bracket": "18-29", "types": [{{"type": "single", "weight": 0.4}}, {{"type": "couple", "weight": 0.3}}, ...]}}] -3. **Same-group partner rates**: Probability that a partner shares the same race/ethnicity group. Provide rates by group name (e.g. {{"white": 0.90, "black": 0.82}}). +3. **Same-group partner rates**: Array of objects with `group` (ethnicity name) and `rate` (probability 0-1). + Example: [{{"group": "white", "rate": 0.90}}, {{"group": "black", "rate": 0.82}}] 4. **Default same-group rate**: Fallback rate for groups not explicitly listed. -5. 
**Assortative mating coefficients**: Probability that partners share the same value for correlated attributes. Keys are attribute names like "education_level", "religious_affiliation", "political_orientation". +5. **Assortative mating**: Array of objects with `attribute` (attribute name) and `correlation` (probability 0-1). + Example: [{{"attribute": "education_level", "correlation": 0.65}}, {{"attribute": "religious_affiliation", "correlation": 0.70}}] 6. **Partner age gap**: Mean offset (partner_age - primary_age, negative means younger) and standard deviation. @@ -65,7 +70,8 @@ def hydrate_household_config( - elderly_min_offset: Minimum age gap between primary adult and elderly dependent - elderly_max_offset: Maximum age gap between primary adult and elderly dependent -10. **Life stages**: Age thresholds for dependent life stages based on the education system. Each entry has max_age (exclusive) and label. E.g. [{{"max_age": 6, "label": "preschool"}}, {{"max_age": 12, "label": "primary"}}] +10. **Life stages**: Age thresholds for dependent life stages based on the education system. Each entry has max_age (exclusive) and label. + Example: [{{"max_age": 6, "label": "preschool"}}, {{"max_age": 12, "label": "primary"}}] 11. **Adult stage label**: Label for post-school adults (e.g. "adult"). @@ -113,36 +119,51 @@ def hydrate_household_config( def _parse_household_config(data: dict) -> HouseholdConfig: - """Parse LLM response into a HouseholdConfig, falling back to defaults for bad fields.""" + """Parse LLM response into a HouseholdConfig, falling back to defaults for bad fields. 
+ + Converts array-of-objects LLM output back to the dict/tuple structures that + HouseholdConfig expects: + - age_brackets: [{upper_bound, label}] -> [(int, str)] + - household_type_weights: [{bracket, types: [{type, weight}]}] -> nested dict + - same_group_rates: [{group, rate}] -> {str: float} + - assortative_mating: [{attribute, correlation}] -> {str: float} + """ kwargs: dict = {} - # Age brackets: list of [int, str] tuples + # Age brackets: [{upper_bound, label}] -> [(int, str)] if "age_brackets" in data and isinstance(data["age_brackets"], list): brackets = [] for item in data["age_brackets"]: - if isinstance(item, (list, tuple)) and len(item) == 2: - brackets.append((int(item[0]), str(item[1]))) + if isinstance(item, dict) and "upper_bound" in item and "label" in item: + brackets.append((int(item["upper_bound"]), str(item["label"]))) if brackets: kwargs["age_brackets"] = brackets - # Household type weights + # Household type weights: [{bracket, types: [{type, weight}]}] -> nested dict if "household_type_weights" in data and isinstance( - data["household_type_weights"], dict + data["household_type_weights"], list ): weights = {} - for bracket_label, type_weights in data["household_type_weights"].items(): - if isinstance(type_weights, dict): - weights[str(bracket_label)] = { - str(k): float(v) for k, v in type_weights.items() - } + for entry in data["household_type_weights"]: + if isinstance(entry, dict) and "bracket" in entry and "types" in entry: + bracket_label = str(entry["bracket"]) + type_weights = {} + for tw in entry.get("types", []): + if isinstance(tw, dict) and "type" in tw and "weight" in tw: + type_weights[str(tw["type"])] = float(tw["weight"]) + if type_weights: + weights[bracket_label] = type_weights if weights: kwargs["household_type_weights"] = weights - # Same-group rates - if "same_group_rates" in data and isinstance(data["same_group_rates"], dict): - kwargs["same_group_rates"] = { - str(k): float(v) for k, v in 
data["same_group_rates"].items() - } + # Same-group rates: [{group, rate}] -> {str: float} + if "same_group_rates" in data and isinstance(data["same_group_rates"], list): + rates = {} + for item in data["same_group_rates"]: + if isinstance(item, dict) and "group" in item and "rate" in item: + rates[str(item["group"])] = float(item["rate"]) + if rates: + kwargs["same_group_rates"] = rates # Scalar fields for field in ( @@ -165,11 +186,14 @@ def _parse_household_config(data: dict) -> HouseholdConfig: if field in data and data[field] is not None: kwargs[field] = int(data[field]) - # Assortative mating - if "assortative_mating" in data and isinstance(data["assortative_mating"], dict): - kwargs["assortative_mating"] = { - str(k): float(v) for k, v in data["assortative_mating"].items() - } + # Assortative mating: [{attribute, correlation}] -> {str: float} + if "assortative_mating" in data and isinstance(data["assortative_mating"], list): + mating = {} + for item in data["assortative_mating"]: + if isinstance(item, dict) and "attribute" in item and "correlation" in item: + mating[str(item["attribute"])] = float(item["correlation"]) + if mating: + kwargs["assortative_mating"] = mating # Life stages if "life_stages" in data and isinstance(data["life_stages"], list): diff --git a/extropy/population/spec_builder/schemas.py b/extropy/population/spec_builder/schemas.py index ea9eb89..fde105a 100644 --- a/extropy/population/spec_builder/schemas.py +++ b/extropy/population/spec_builder/schemas.py @@ -246,35 +246,78 @@ def build_conditional_base_schema() -> dict: def build_household_config_schema() -> dict: - """Build JSON schema for household config hydration.""" + """Build JSON schema for household config hydration. + + Uses array-of-objects patterns instead of dict/tuple schemas for LLM compatibility. + Both Anthropic and OpenAI structured outputs require additionalProperties: false + (not a schema) and don't support tuple-style array items. 
+ """ return { "type": "object", "properties": { + # Array of {upper_bound, label} instead of tuple [int, str] "age_brackets": { "type": "array", "items": { - "type": "array", - "items": [ - {"type": "integer"}, - {"type": "string"}, - ], + "type": "object", + "properties": { + "upper_bound": {"type": "integer"}, + "label": {"type": "string"}, + }, + "required": ["upper_bound", "label"], + "additionalProperties": False, }, }, + # Array of {bracket, types: [{type, weight}]} instead of nested dict "household_type_weights": { - "type": "object", - "additionalProperties": { + "type": "array", + "items": { "type": "object", - "additionalProperties": {"type": "number"}, + "properties": { + "bracket": {"type": "string"}, + "types": { + "type": "array", + "items": { + "type": "object", + "properties": { + "type": {"type": "string"}, + "weight": {"type": "number"}, + }, + "required": ["type", "weight"], + "additionalProperties": False, + }, + }, + }, + "required": ["bracket", "types"], + "additionalProperties": False, }, }, + # Array of {group, rate} instead of dict "same_group_rates": { - "type": "object", - "additionalProperties": {"type": "number"}, + "type": "array", + "items": { + "type": "object", + "properties": { + "group": {"type": "string"}, + "rate": {"type": "number"}, + }, + "required": ["group", "rate"], + "additionalProperties": False, + }, }, "default_same_group_rate": {"type": "number"}, + # Array of {attribute, correlation} instead of dict "assortative_mating": { - "type": "object", - "additionalProperties": {"type": "number"}, + "type": "array", + "items": { + "type": "object", + "properties": { + "attribute": {"type": "string"}, + "correlation": {"type": "number"}, + }, + "required": ["attribute", "correlation"], + "additionalProperties": False, + }, }, "partner_age_gap_mean": {"type": "number"}, "partner_age_gap_std": {"type": "number"}, diff --git a/extropy/scenario/__init__.py b/extropy/scenario/__init__.py index 31c8893..0172a85 100644 --- 
a/extropy/scenario/__init__.py +++ b/extropy/scenario/__init__.py @@ -29,6 +29,8 @@ # Event EventType, Event, + # Timeline + TimelineEvent, # Exposure ExposureChannel, ExposureRule, @@ -58,6 +60,7 @@ from .exposure import generate_seed_exposure from .interaction import determine_interaction_model from .outcomes import define_outcomes +from .timeline import generate_timeline from .compiler import create_scenario, compile_scenario_from_files from .validator import validate_scenario, load_and_validate_scenario @@ -66,6 +69,8 @@ # Models - Event "EventType", "Event", + # Models - Timeline + "TimelineEvent", # Models - Exposure "ExposureChannel", "ExposureRule", @@ -94,6 +99,7 @@ "generate_seed_exposure", "determine_interaction_model", "define_outcomes", + "generate_timeline", "create_scenario", "compile_scenario_from_files", "validate_scenario", diff --git a/extropy/scenario/compiler.py b/extropy/scenario/compiler.py index bb108f2..7f2f7b6 100644 --- a/extropy/scenario/compiler.py +++ b/extropy/scenario/compiler.py @@ -24,6 +24,7 @@ from .exposure import generate_seed_exposure from .interaction import determine_interaction_model from .outcomes import define_outcomes +from .timeline import generate_timeline from ..utils.callbacks import StepProgressCallback from .validator import validate_scenario from ..storage import open_study_db @@ -91,20 +92,18 @@ def create_scenario( network_id: str = "default", output_path: str | Path | None = None, on_progress: StepProgressCallback | None = None, + timeline_mode: str | None = None, ) -> tuple[ScenarioSpec, ValidationResult]: """ Create a complete scenario spec from a description. Orchestrates the full pipeline: - 1. Load population spec - 2. Parse scenario description into Event - 3. Generate seed exposure rules - 4. Determine interaction model and spread config - 5. Define outcomes - 6. Generate simulation config - 7. Assemble ScenarioSpec - 8. Validate - 9. Optionally save to YAML + 1. Load population spec and parse event + 2. 
Generate seed exposure rules + 3. Determine interaction model and spread config + 4. Define outcomes + 5. Generate timeline and background context + 6. Assemble ScenarioSpec and validate Args: description: Natural language scenario description @@ -114,6 +113,8 @@ def create_scenario( network_id: Network ID in study DB output_path: Optional path to save scenario YAML on_progress: Optional callback(step, status) for progress updates + timeline_mode: Timeline mode override. None = auto-detect, "static" = single event, + "evolving" = multi-event timeline. Returns: Tuple of (ScenarioSpec, ValidationResult) @@ -145,7 +146,7 @@ def progress(step: str, status: str): # Load inputs # ========================================================================= - progress("1/5", "Loading population spec...") + progress("1/6", "Loading population spec...") if not population_spec_path.exists(): raise FileNotFoundError(f"Population spec not found: {population_spec_path}") @@ -165,7 +166,7 @@ def progress(step: str, status: str): # Step 1: Parse scenario description # ========================================================================= - progress("1/5", "Parsing event definition...") + progress("1/6", "Parsing event definition...") event = parse_scenario(description, population_spec) @@ -173,7 +174,7 @@ def progress(step: str, status: str): # Step 2: Generate seed exposure # ========================================================================= - progress("2/5", "Generating seed exposure rules...") + progress("2/6", "Generating seed exposure rules...") seed_exposure = generate_seed_exposure( event, @@ -185,7 +186,7 @@ def progress(step: str, status: str): # Step 3: Determine interaction model # ========================================================================= - progress("3/5", "Determining interaction model...") + progress("3/6", "Determining interaction model...") interaction_config, spread_config = determine_interaction_model( event, @@ -197,7 +198,7 @@ def 
progress(step: str, status: str): # Step 4: Define outcomes # ========================================================================= - progress("4/5", "Defining outcomes...") + progress("4/6", "Defining outcomes...") outcome_config = define_outcomes( event, @@ -206,14 +207,27 @@ def progress(step: str, status: str): ) # ========================================================================= - # Step 5: Assemble scenario spec + # Step 5: Generate timeline + background context # ========================================================================= - progress("5/5", "Assembling scenario spec...") + progress("5/6", "Generating timeline...") # Generate simulation config based on population size simulation_config = _determine_simulation_config(population_spec.meta.size) + timeline_events, background_context = generate_timeline( + scenario_description=description, + base_event=event, + simulation_config=simulation_config, + timeline_mode=timeline_mode, + ) + + # ========================================================================= + # Step 6: Assemble scenario spec + # ========================================================================= + + progress("6/6", "Assembling scenario spec...") + # Generate scenario name scenario_name = _generate_scenario_name(description) @@ -232,11 +246,13 @@ def progress(step: str, status: str): spec = ScenarioSpec( meta=meta, event=event, + timeline=timeline_events if timeline_events else None, seed_exposure=seed_exposure, interaction=interaction_config, spread=spread_config, outcomes=outcome_config, simulation=simulation_config, + background_context=background_context, ) # ========================================================================= diff --git a/extropy/scenario/timeline.py b/extropy/scenario/timeline.py new file mode 100644 index 0000000..5b720dd --- /dev/null +++ b/extropy/scenario/timeline.py @@ -0,0 +1,258 @@ +"""Timeline generation for evolving scenarios. 
+ +Determines whether a scenario is static (single event) or evolving (multi-event), +and generates timeline events + background context via LLM. +""" + +import logging +from typing import Any + +from ..core.llm import reasoning_call +from ..core.models import Event, SimulationConfig, TimelineEvent + +logger = logging.getLogger(__name__) + +TIMELINE_SCHEMA: dict[str, Any] = { + "type": "object", + "properties": { + "scenario_type": { + "type": "string", + "enum": ["static", "evolving"], + "description": ( + "static = single event (product change, policy announcement), " + "evolving = developments over time (crisis, campaign, adoption)" + ), + }, + "background_context": { + "type": "string", + "description": ( + "Ambient context injected into every prompt (economic conditions, " + "cultural moment, season). 1-2 sentences." + ), + }, + "timeline_events": { + "type": "array", + "items": { + "type": "object", + "properties": { + "timestep": { + "type": "integer", + "description": "When this development occurs (0 = immediate)", + }, + "description": { + "type": "string", + "description": "Human-readable summary of the development", + }, + "content": { + "type": "string", + "description": "The announcement/news content for this timestep", + }, + "source": { + "type": "string", + "description": "Who/what announces this development", + }, + "credibility": { + "type": "number", + "minimum": 0, + "maximum": 1, + "description": "Source credibility (0-1)", + }, + "emotional_valence": { + "type": "number", + "minimum": -1, + "maximum": 1, + "description": "Emotional framing (-1 to 1)", + }, + }, + "required": [ + "timestep", + "description", + "content", + "source", + "credibility", + "emotional_valence", + ], + }, + "description": ( + "Only populated if scenario_type=evolving. " + "3-6 developments at meaningful intervals." 
+ ), + }, + }, + "required": ["scenario_type", "background_context"], +} + + +def _build_timeline_prompt( + scenario_description: str, + base_event: Event, + simulation_config: SimulationConfig, + timeline_mode: str | None, +) -> str: + """Build the LLM prompt for timeline generation.""" + parts = [ + "You are designing how a scenario unfolds over time for a population simulation.", + "", + "## Scenario Description", + "", + scenario_description, + "", + "## Initial Event (t=0)", + "", + f"Type: {base_event.type.value}", + f"Source: {base_event.source}", + f"Content: {base_event.content}", + f"Credibility: {base_event.credibility}", + f"Emotional valence: {base_event.emotional_valence}", + "", + "## Simulation Parameters", + "", + f"Duration: {simulation_config.max_timesteps} {simulation_config.timestep_unit.value}s", + "", + "## Your Task", + "", + ] + + if timeline_mode == "static": + parts.extend( + [ + "This is a STATIC scenario. Generate only background_context.", + "Set scenario_type to 'static' and leave timeline_events empty.", + ] + ) + elif timeline_mode == "evolving": + parts.extend( + [ + "This is an EVOLVING scenario. 
Generate 3-6 timeline events.", + "Set scenario_type to 'evolving'.", + "", + "Timeline event guidelines:", + "- Space events at meaningful intervals (not every timestep)", + "- Each event should escalate, complicate, or resolve the situation", + "- Include reactions, developments, or new information", + "- Vary sources (officials, media, social, leaked info)", + ] + ) + else: + parts.extend( + [ + "Determine if this is a STATIC or EVOLVING scenario:", + "", + "STATIC scenarios (scenario_type='static'):", + "- One-time announcements (price changes, policy updates)", + "- Product launches with no expected developments", + "- Simple changes with immediate, stable reactions", + "", + "EVOLVING scenarios (scenario_type='evolving'):", + "- Crises that unfold over time (safety issues, scandals)", + "- Campaigns with multiple phases", + "- Situations where new information emerges", + "- Events that trigger reactions, counter-reactions", + "", + "For EVOLVING scenarios, generate 3-6 timeline events:", + "- Space events at meaningful intervals", + "- Each event should escalate, complicate, or resolve", + "- Vary sources appropriately", + ] + ) + + parts.extend( + [ + "", + "For ALL scenarios, generate background_context:", + "- 1-2 sentences of ambient framing", + "- Economic conditions, cultural moment, season if relevant", + "- This appears in every agent's reasoning prompt", + ] + ) + + return "\n".join(parts) + + +def generate_timeline( + scenario_description: str, + base_event: Event, + simulation_config: SimulationConfig, + timeline_mode: str | None = None, +) -> tuple[list[TimelineEvent], str | None]: + """Generate timeline events and background context. + + Args: + scenario_description: Natural language scenario description + base_event: Parsed t=0 event + simulation_config: Simulation parameters (timesteps, unit) + timeline_mode: Explicit mode override. If None, LLM decides based on scenario. 
+ - "static": Single event, no timeline (Netflix-style) + - "evolving": Multi-event timeline (ASI-style) + + Returns: + Tuple of (timeline_events, background_context) + timeline_events will be empty for static scenarios + """ + prompt = _build_timeline_prompt( + scenario_description, + base_event, + simulation_config, + timeline_mode, + ) + + logger.info("[TIMELINE] Generating timeline and background context...") + + response = reasoning_call( + prompt=prompt, + response_schema=TIMELINE_SCHEMA, + schema_name="timeline_generation", + ) + + if not response: + logger.warning("[TIMELINE] LLM returned empty response, using defaults") + return [], None + + scenario_type = response.get("scenario_type", "static") + background_context = response.get("background_context") + raw_events = response.get("timeline_events", []) + + # Honor explicit mode override + if timeline_mode == "static": + scenario_type = "static" + raw_events = [] + elif timeline_mode == "evolving" and not raw_events: + logger.warning("[TIMELINE] Evolving mode requested but no events generated") + + logger.info( + f"[TIMELINE] Type: {scenario_type}, " + f"Events: {len(raw_events)}, " + f"Background: {background_context[:50] + '...' 
if background_context else 'None'}" + ) + + # Convert raw events to TimelineEvent models + timeline_events: list[TimelineEvent] = [] + + if scenario_type == "evolving" and raw_events: + for raw in raw_events: + timestep = raw.get("timestep", 0) + # Skip t=0 events (that's the base event) + if timestep == 0: + continue + + event = Event( + type=base_event.type, # Inherit type from base + content=raw.get("content", ""), + source=raw.get("source", base_event.source), + credibility=raw.get("credibility", base_event.credibility), + ambiguity=base_event.ambiguity, # Inherit + emotional_valence=raw.get("emotional_valence", 0.0), + ) + + timeline_event = TimelineEvent( + timestep=timestep, + event=event, + exposure_rules=None, # Reuse seed exposure rules + description=raw.get("description"), + ) + timeline_events.append(timeline_event) + + # Sort by timestep + timeline_events.sort(key=lambda te: te.timestep) + + return timeline_events, background_context diff --git a/extropy/simulation/engine.py b/extropy/simulation/engine.py index 1ab1d16..0b95faf 100644 --- a/extropy/simulation/engine.py +++ b/extropy/simulation/engine.py @@ -49,7 +49,11 @@ batch_reason_agents_async, create_reasoning_context, ) -from .propagation import apply_seed_exposures, propagate_through_network +from .propagation import ( + apply_seed_exposures, + apply_timeline_exposures, + propagate_through_network, +) from .stopping import evaluate_stopping_conditions from ..utils.callbacks import TimestepProgressCallback from ..utils.resource_governor import ResourceGovernor @@ -296,6 +300,9 @@ def __init__( self.total_reasoning_calls = 0 self.total_exposures = 0 + # Timeline state (active event for current timestep, if any) + self._active_timeline_event: Any = None + # Token usage tracking self.pivotal_input_tokens = 0 self.pivotal_output_tokens = 0 @@ -569,7 +576,7 @@ def _execute_timestep(self, timestep: int) -> TimestepSummary: return summary def _apply_exposures(self, timestep: int) -> int: - """Apply seed 
and network exposures for this timestep. + """Apply seed, timeline, and network exposures for this timestep. Returns: Total new exposures this timestep. @@ -583,6 +590,20 @@ def _apply_exposures(self, timestep: int) -> int: ) logger.info(f"[TIMESTEP {timestep}] Seed exposures: {new_seed}") + # Apply timeline event exposures (if any timeline event fires this timestep) + new_timeline, active_event = apply_timeline_exposures( + timestep, + self.scenario, + self.agents, + self.state_manager, + self.rng, + ) + if new_timeline > 0: + logger.info(f"[TIMESTEP {timestep}] Timeline exposures: {new_timeline}") + + # Store active timeline event for prompt rendering + self._active_timeline_event = active_event + new_network = propagate_through_network( timestep, self.scenario, @@ -595,7 +616,7 @@ def _apply_exposures(self, timestep: int) -> int: ) logger.info(f"[TIMESTEP {timestep}] Network exposures: {new_network}") - return new_seed + new_network + return new_seed + new_timeline + new_network def _reason_agents(self, timestep: int) -> tuple[int, int, int]: """Identify agents needing reasoning, run in chunks, commit per-chunk. 
@@ -1141,19 +1162,39 @@ def _build_reasoning_context( ctx.local_mood_summary = local_mood_summary ctx.background_context = self.scenario.background_context ctx.agent_names = self._agent_names + + # Populate Phase C fields + ctx.observable_peer_actions = self._compute_observable_adoption(agent_id) + ctx.conformity = agent.get("conformity") + + # Build timeline recap (accumulated events up to current timestep) + if self.scenario.timeline: + recap = [] + current_dev = None + unit = self.scenario.simulation.timestep_unit.value + for te in self.scenario.timeline: + if te.timestep < timestep: + desc = te.description or te.event.content[:80] + recap.append(f"{unit} {te.timestep + 1}: {desc}") + elif te.timestep == timestep: + current_dev = te.event.content + ctx.timeline_recap = recap if recap else None + ctx.current_development = current_dev + return ctx def _get_peer_opinions(self, agent_id: str) -> list[PeerOpinion]: - """Get opinions of connected peers. + """Get opinions of connected peers who have visibly shared. - In the redesigned simulation, peers share public_statement + sentiment, - NOT position labels. Position is output-only (Pass 2). + Only includes peers who have will_share=True — this models real-world + observability where agents can only perceive what peers have explicitly + shared or posted. Silent position changes are invisible. 
Args: agent_id: Agent ID Returns: - List of peer opinions + List of peer opinions (only from peers who shared) """ neighbors = self.adjacency.get(agent_id, []) opinions = [] @@ -1161,6 +1202,10 @@ def _get_peer_opinions(self, agent_id: str) -> list[PeerOpinion]: for neighbor_id, edge_data in neighbors[:5]: # Limit to 5 peers neighbor_state = self.state_manager.get_agent_state(neighbor_id) + # Only include peer if they actively shared (observable behavior) + if not neighbor_state.will_share: + continue + peer_sentiment = ( neighbor_state.public_sentiment if neighbor_state.public_sentiment is not None @@ -1168,7 +1213,7 @@ def _get_peer_opinions(self, agent_id: str) -> list[PeerOpinion]: ) peer_position = neighbor_state.public_position or neighbor_state.position - # Include peer if they have formed any public opinion. + # Include peer if they have formed any public opinion AND shared if peer_sentiment is not None or neighbor_state.public_statement: peer_data = self.agent_map.get(neighbor_id, {}) opinions.append( @@ -1176,15 +1221,40 @@ def _get_peer_opinions(self, agent_id: str) -> list[PeerOpinion]: agent_id=neighbor_id, peer_name=peer_data.get("first_name"), relationship=edge_data.get("type", "contact"), - position=peer_position, # kept for backwards compat + position=peer_position, sentiment=peer_sentiment, public_statement=neighbor_state.public_statement, - credibility=0.85, # Phase 2 will make this dynamic + credibility=0.85, ) ) return opinions + def _compute_observable_adoption(self, agent_id: str) -> int | None: + """Count neighbors who have visibly acted (will_share=True). + + Real-world model: agents can only perceive what peers have + explicitly shared or posted. Silent position changes are invisible. 
+ + Args: + agent_id: Agent ID + + Returns: + Count of neighbors who shared, or None if no neighbors + """ + neighbors = self.adjacency.get(agent_id, []) + if not neighbors: + return None + + visible_actors = 0 + for neighbor_id, _ in neighbors: + ns = self.state_manager.get_agent_state(neighbor_id) + # Only count peers who shared/posted — observable behavior + if ns.will_share: + visible_actors += 1 + + return visible_actors + def _render_macro_summary(self, summary: TimestepSummary) -> str: """Convert a TimestepSummary into a human-readable vibes sentence. @@ -1508,6 +1578,7 @@ def run_simulation( writer_queue_size: int = 256, db_write_batch_size: int = 100, resource_governor: ResourceGovernor | None = None, + merged_pass: bool = False, ) -> SimulationSummary: """Run a simulation from a scenario file. @@ -1534,6 +1605,7 @@ def run_simulation( writer_queue_size: Maximum buffered chunks waiting for DB writer db_write_batch_size: Number of chunks applied per DB writer transaction resource_governor: Optional governor for runtime downshift guardrails + merged_pass: Use single merged reasoning pass instead of two-pass (experimental) Returns: SimulationSummary with results @@ -1665,6 +1737,7 @@ def _reset_runtime_tables(path: Path, run_key: str) -> None: multi_touch_threshold=multi_touch_threshold, random_seed=random_seed, max_concurrent=entropy_config.simulation.max_concurrent, + merged_pass=merged_pass, ) effective_strong = strong or entropy_config.resolve_sim_strong() effective_fast = fast or entropy_config.resolve_sim_fast() diff --git a/extropy/simulation/propagation.py b/extropy/simulation/propagation.py index 5b71349..989b78b 100644 --- a/extropy/simulation/propagation.py +++ b/extropy/simulation/propagation.py @@ -12,11 +12,13 @@ from ..core.models import ( ScenarioSpec, + Event, ExposureRule, SpreadConfig, ExposureRecord, SimulationEvent, SimulationEventType, + TimelineEvent, ) from ..population.sampler import eval_condition, ConditionError from .state import 
StateManager @@ -149,6 +151,110 @@ def apply_seed_exposures( return new_exposures +def apply_timeline_exposures( + timestep: int, + scenario: ScenarioSpec, + agents: list[dict[str, Any]], + state_manager: StateManager, + rng: random.Random, +) -> tuple[int, Event | None]: + """Apply timeline event exposures for this timestep. + + Timeline events represent scenario developments (new information, escalations, + resolutions) that occur at specific timesteps. Each timeline event can have + custom exposure rules or reuse the seed exposure rules with updated content. + + Args: + timestep: Current timestep + scenario: Scenario specification + agents: List of all agents + state_manager: State manager for recording exposures + rng: Random number generator + + Returns: + Tuple of (new_exposure_count, active_timeline_event_or_none) + """ + if not scenario.timeline: + return 0, None + + # Find timeline event for this timestep + active_event: TimelineEvent | None = None + for te in scenario.timeline: + if te.timestep == timestep: + active_event = te + break + + if active_event is None: + return 0, None + + logger.info( + f"[TIMELINE] Timestep {timestep}: Applying timeline event - " + f"{active_event.description or active_event.event.content[:50]}" + ) + + # Determine which exposure rules to use + if active_event.exposure_rules is not None: + rules = active_event.exposure_rules + else: + # Reuse seed exposure rules but substitute with timeline event content + rules = scenario.seed_exposure.rules + + new_exposures = 0 + event_content = active_event.event.content + event_credibility = active_event.event.credibility + + for rule in rules: + # For timeline events, ignore the rule's timestep field — we're applying now + # (Rules are designed for t=0 seed exposure but we reuse them for timeline) + channel_credibility = get_channel_credibility(scenario, rule.channel) + + for i, agent in enumerate(agents): + agent_id = agent.get("_id", str(i)) + + # Evaluate the "when" condition (skip 
timestep check since we're applying now) + when_cond = rule.when.lower() + if when_cond != "true" and when_cond != "1": + try: + if not eval_condition(rule.when, agent, raise_on_error=True): + continue + except ConditionError as e: + logger.warning( + f"Failed to evaluate timeline exposure rule '{rule.when}': {e}" + ) + continue + + # Probabilistic exposure + if rng.random() > rule.probability: + continue + + exposure = ExposureRecord( + timestep=timestep, + channel=rule.channel, + source_agent_id=None, + content=event_content, + credibility=min(1.0, event_credibility * channel_credibility), + ) + + state_manager.record_exposure(agent_id, exposure) + state_manager.log_event( + SimulationEvent( + timestep=timestep, + event_type=SimulationEventType.SEED_EXPOSURE, + agent_id=agent_id, + details={ + "channel": rule.channel, + "timeline_event": True, + "description": active_event.description, + }, + ) + ) + new_exposures += 1 + + logger.info(f"[TIMELINE] Timestep {timestep}: {new_exposures} new exposures") + + return new_exposures, active_event.event + + def get_neighbors( network: dict[str, Any], agent_id: str, diff --git a/extropy/simulation/reasoning.py b/extropy/simulation/reasoning.py index 55d1535..47c83b0 100644 --- a/extropy/simulation/reasoning.py +++ b/extropy/simulation/reasoning.py @@ -176,6 +176,36 @@ def build_pass1_prompt( if context.macro_summary: prompt_parts.extend(["", context.macro_summary]) + # --- Timeline recap (Phase C) --- + if context.timeline_recap: + prompt_parts.extend(["", "## What's Happened So Far", ""]) + for entry in context.timeline_recap: + prompt_parts.append(f"- {entry}") + + # --- Current development (Phase C) --- + if context.current_development: + prompt_parts.extend( + [ + "", + f"## This {context.timestep_unit}'s Development", + "", + context.current_development, + ] + ) + + # --- Conformity self-awareness (Phase C) --- + if context.conformity is not None: + prompt_parts.append("") + if context.conformity >= 0.7: + 
prompt_parts.append( + "I tend to go along with what most people around me are doing." + ) + elif context.conformity <= 0.3: + prompt_parts.append( + "I tend to form my own opinion regardless of what others think." + ) + # Mid-range (0.3-0.7): no explicit phrasing (neutral) + # --- Memory trace (full, uncapped, fidelity-gated) --- if context.memory_trace: prompt_parts.extend(["", "## What I've Been Thinking", ""]) @@ -376,18 +406,22 @@ def build_pass2_schema(outcomes: OutcomeConfig) -> dict[str, Any] | None: """Build JSON schema for Pass 2 (classification) from scenario outcomes. Only includes categorical, boolean, and float outcomes — - these are the ones that need classification. + open_ended outcomes are captured in Pass 1 free text and skipped here. Args: outcomes: Outcome configuration from scenario Returns: - JSON schema dictionary, or None if no classifiable outcomes + JSON schema dictionary, or None if all outcomes are open_ended (skip Pass 2) """ properties: dict[str, Any] = {} required: list[str] = [] for outcome in outcomes.suggested_outcomes: + # Skip open_ended outcomes — they're captured in Pass 1 free text + if outcome.type == OutcomeType.OPEN_ENDED: + continue + outcome_prop: dict[str, Any] = { "description": outcome.description, } @@ -401,8 +435,6 @@ def build_pass2_schema(outcomes: OutcomeConfig) -> dict[str, Any] | None: outcome_prop["type"] = "number" outcome_prop["minimum"] = outcome.range[0] outcome_prop["maximum"] = outcome.range[1] - elif outcome.type == OutcomeType.OPEN_ENDED: - outcome_prop["type"] = "string" else: outcome_prop["type"] = "string" @@ -410,6 +442,7 @@ def build_pass2_schema(outcomes: OutcomeConfig) -> dict[str, Any] | None: if outcome.required: required.append(outcome.name) + # If no classifiable outcomes remain (all were open_ended), skip Pass 2 if not properties: return None @@ -421,6 +454,96 @@ def build_pass2_schema(outcomes: OutcomeConfig) -> dict[str, Any] | None: } +# 
============================================================================= +# Merged pass: Combined role-play + classification schema +# ============================================================================= + + +def build_merged_schema(outcomes: OutcomeConfig) -> dict[str, Any]: + """Build JSON schema for merged single-pass reasoning. + + Combines Pass 1 (role-play) fields with Pass 2 (classification) outcome fields + into a single schema. Used when merged_pass=True for cheaper/faster reasoning. + + Args: + outcomes: Outcome configuration from scenario + + Returns: + JSON schema dictionary with both reasoning and outcome fields + """ + # Start with Pass 1 fields + properties: dict[str, Any] = { + "reasoning": { + "type": "string", + "description": "Your honest first reaction in 2-4 sentences. Be direct — state what you think, not both sides.", + }, + "public_statement": { + "type": "string", + "description": "What would you bluntly tell a friend about this? One strong sentence.", + }, + "reasoning_summary": { + "type": "string", + "description": "A single sentence capturing your core reaction (for your own memory).", + }, + "sentiment": { + "type": "number", + "minimum": -1.0, + "maximum": 1.0, + "description": "Your emotional reaction: -1 = very negative, 0 = neutral, 1 = very positive.", + }, + "conviction": { + "type": "integer", + "minimum": 0, + "maximum": 100, + "description": "How sure are you? 
0 = genuinely no idea what to think, 25 = starting to lean one way, 50 = clear opinion, 75 = quite sure and hard to change your mind, 100 = absolutely certain.", + }, + "will_share": { + "type": "boolean", + "description": "Will you actively discuss or share this with others?", + }, + } + required = [ + "reasoning", + "public_statement", + "reasoning_summary", + "sentiment", + "conviction", + "will_share", + ] + + # Add Pass 2 outcome fields (skip open_ended) + for outcome in outcomes.suggested_outcomes: + if outcome.type == OutcomeType.OPEN_ENDED: + continue + + outcome_prop: dict[str, Any] = { + "description": outcome.description, + } + + if outcome.type == OutcomeType.CATEGORICAL and outcome.options: + outcome_prop["type"] = "string" + outcome_prop["enum"] = outcome.options + elif outcome.type == OutcomeType.BOOLEAN: + outcome_prop["type"] = "boolean" + elif outcome.type == OutcomeType.FLOAT and outcome.range: + outcome_prop["type"] = "number" + outcome_prop["minimum"] = outcome.range[0] + outcome_prop["maximum"] = outcome.range[1] + else: + outcome_prop["type"] = "string" + + properties[outcome.name] = outcome_prop + if outcome.required: + required.append(outcome.name) + + return { + "type": "object", + "properties": properties, + "required": required, + "additionalProperties": False, + } + + # ============================================================================= # Primary position outcome extraction # ============================================================================= @@ -644,6 +767,131 @@ async def _reason_agent_two_pass_async( ) +# ============================================================================= +# Merged pass reasoning (async) +# ============================================================================= + + +async def _reason_agent_merged_async( + context: ReasoningContext, + scenario: ScenarioSpec, + config: SimulationRunConfig, + rate_limiter: Any = None, +) -> ReasoningResponse | None: + """Single-pass async reasoning 
for an agent. + + Combines role-play and classification into one LLM call with a merged schema. + Cheaper/faster than two-pass but may produce less nuanced reasoning. + + Args: + context: Reasoning context + scenario: Scenario specification + config: Simulation run configuration + rate_limiter: Optional DualRateLimiter for API pacing + + Returns: + ReasoningResponse, or None if failed + """ + prompt = build_pass1_prompt(context, scenario) + schema = build_merged_schema(scenario.outcomes) + position_outcome = _get_primary_position_outcome(scenario) + + # Use main model for merged pass + model = config.strong or None + + usage = TokenUsage() + for attempt in range(config.max_retries): + try: + if rate_limiter: + estimated_input = len(prompt) // 4 + estimated_output = 400 # slightly larger for combined response + await rate_limiter.pivotal.acquire( + estimated_input_tokens=estimated_input, + estimated_output_tokens=estimated_output, + ) + + call_start = time.time() + response, usage = await asyncio.wait_for( + simple_call_async( + prompt=prompt, + response_schema=schema, + schema_name="agent_reasoning", + model=model, + ), + timeout=30.0, + ) + call_elapsed = time.time() - call_start + + logger.info(f"[MERGED] Agent {context.agent_id} - {call_elapsed:.2f}s") + + if not response: + continue + + break + except asyncio.TimeoutError: + logger.warning( + f"[MERGED] Agent {context.agent_id} - attempt {attempt + 1} timed out after 30s" + ) + if attempt == config.max_retries - 1: + return None + except Exception as e: + logger.warning( + f"[MERGED] Agent {context.agent_id} - attempt {attempt + 1} failed: {e}" + ) + if attempt == config.max_retries - 1: + return None + else: + return None + + # Extract fields + reasoning = response.get("reasoning", "") + public_statement = response.get("public_statement", "") + reasoning_summary = response.get("reasoning_summary", "") + sentiment = response.get("sentiment") + if sentiment is not None: + sentiment = max(-1.0, min(1.0, 
float(sentiment))) + conviction_score = response.get("conviction") + will_share = response.get("will_share", False) + + conviction_float = score_to_conviction_float(conviction_score) + + # Extract position from outcomes + position = None + if position_outcome and position_outcome in response: + position = response[position_outcome] + + # Build outcomes dict (everything except the Pass 1 fields) + pass1_fields = { + "reasoning", + "public_statement", + "reasoning_summary", + "sentiment", + "conviction", + "will_share", + } + outcomes = {k: v for k, v in response.items() if k not in pass1_fields} + + # Merge sentiment into outcomes for backwards compat + if sentiment is not None: + outcomes["sentiment"] = sentiment + + return ReasoningResponse( + position=position, + sentiment=sentiment, + conviction=conviction_float, + public_statement=public_statement, + reasoning_summary=reasoning_summary, + action_intent=outcomes.get("action_intent"), + will_share=will_share, + reasoning=reasoning, + outcomes=outcomes, + pass1_input_tokens=usage.input_tokens, + pass1_output_tokens=usage.output_tokens, + pass2_input_tokens=0, + pass2_output_tokens=0, + ) + + # ============================================================================= # Synchronous reasoning (kept for backwards compatibility / testing) # ============================================================================= @@ -801,7 +1049,10 @@ async def batch_reason_agents_async( rate_limiter: Any = None, on_agent_done: Callable[[str, ReasoningResponse | None], None] | None = None, ) -> tuple[list[tuple[str, ReasoningResponse | None]], BatchTokenUsage]: - """Reason multiple agents concurrently with two-pass reasoning. + """Reason multiple agents concurrently. + + Uses two-pass reasoning by default (Pass 1 role-play, Pass 2 classification). + When config.merged_pass=True, uses single-pass with combined schema. This is an async coroutine — call from within an existing event loop. 
The caller is responsible for provider cleanup when the loop ends. @@ -809,7 +1060,7 @@ async def batch_reason_agents_async( Args: contexts: List of reasoning contexts scenario: Scenario specification - config: Simulation run configuration + config: Simulation run configuration (merged_pass controls mode) max_concurrency: Max concurrent API calls (None/0 = auto from rate limiter) rate_limiter: Optional DualRateLimiter instance for API pacing on_agent_done: Optional callback(agent_id, response) called per agent after reasoning @@ -822,7 +1073,8 @@ async def batch_reason_agents_async( return [], BatchTokenUsage() total = len(contexts) - logger.info(f"[REASONING] Starting two-pass async reasoning for {total} agents") + mode = "merged" if config.merged_pass else "two-pass" + logger.info(f"[REASONING] Starting {mode} async reasoning for {total} agents") if rate_limiter: rpm_derived = rate_limiter.max_safe_concurrent @@ -846,7 +1098,14 @@ async def reason_with_pacing( ctx: ReasoningContext, ) -> tuple[int, str, ReasoningResponse | None, float]: start = time.time() - result = await _reason_agent_two_pass_async(ctx, scenario, config, rate_limiter) + if config.merged_pass: + result = await _reason_agent_merged_async( + ctx, scenario, config, rate_limiter + ) + else: + result = await _reason_agent_two_pass_async( + ctx, scenario, config, rate_limiter + ) elapsed = time.time() - start completed[0] += 1 diff --git a/tests/test_agent_focus.py b/tests/test_agent_focus.py index 833f381..a4a661b 100644 --- a/tests/test_agent_focus.py +++ b/tests/test_agent_focus.py @@ -207,9 +207,9 @@ def test_partners_are_npc(self): # Check that partner is NPC for agent in primaries_with_partners: assert "partner_npc" in agent, "Partner should be in partner_npc field" - assert ( - agent.get("partner_id") is None - ), "partner_id should be None for NPC partners" + assert agent.get("partner_id") is None, ( + "partner_id should be None for NPC partners" + ) # Check NPC partner has expected fields 
npc = agent["partner_npc"] @@ -228,18 +228,18 @@ def test_no_partner_agents_in_result(self): a for a in agents if a.get("household_role") == "adult_secondary" ] - assert ( - len(secondary_adults) == 0 - ), "In primary_only mode, partners should be NPCs, not agents" + assert len(secondary_adults) == 0, ( + "In primary_only mode, partners should be NPCs, not agents" + ) def test_total_agent_count_matches(self): spec = _make_household_spec(size=100, agent_focus="surgeons") result = sample_population(spec, count=100, seed=42) # Should produce at most the requested count - assert ( - len(result.agents) <= 100 - ), "Agent count should not exceed requested count" + assert len(result.agents) <= 100, ( + "Agent count should not exceed requested count" + ) def test_npc_partner_has_correlated_demographics(self): spec = _make_household_spec(size=300, agent_focus="surgeons") @@ -271,9 +271,9 @@ def test_npc_partner_shares_last_name(self): for agent in result.agents: npc = agent.get("partner_npc") if npc and agent.get("last_name"): - assert ( - npc.get("last_name") == agent["last_name"] - ), "NPC partner should share last name" + assert npc.get("last_name") == agent["last_name"], ( + "NPC partner should share last name" + ) class TestAgentFocusCouples: @@ -306,12 +306,12 @@ def test_both_partners_are_agents(self): assert pid in id_map, f"Partner {pid} should be a full agent" partner = id_map[pid] - assert ( - partner.get("household_role") == "adult_secondary" - ), "Partner should be adult_secondary" - assert ( - partner.get("partner_id") == agent["_id"] - ), "Partner should link back to primary" + assert partner.get("household_role") == "adult_secondary", ( + "Partner should be adult_secondary" + ) + assert partner.get("partner_id") == agent["_id"], ( + "Partner should link back to primary" + ) def test_no_npc_partners(self): spec = _make_household_spec(size=200, agent_focus="retired couples") @@ -319,9 +319,9 @@ def test_no_npc_partners(self): # No agents should have 
partner_npc field agents_with_npc = [a for a in result.agents if "partner_npc" in a] - assert ( - len(agents_with_npc) == 0 - ), "In couples mode, partners should be full agents, not NPCs" + assert len(agents_with_npc) == 0, ( + "In couples mode, partners should be full agents, not NPCs" + ) def test_partners_share_household_id(self): spec = _make_household_spec(size=200, agent_focus="retired couples") @@ -333,9 +333,9 @@ def test_partners_share_household_id(self): if pid: partner = id_map.get(pid) assert partner is not None - assert ( - partner["household_id"] == agent["household_id"] - ), "Partners should share household_id" + assert partner["household_id"] == agent["household_id"], ( + "Partners should share household_id" + ) def test_partners_share_household_scoped_attrs(self): spec = _make_household_spec(size=200, agent_focus="retired couples") @@ -348,9 +348,9 @@ def test_partners_share_household_scoped_attrs(self): partner = id_map.get(pid) assert partner is not None # 'state' is household-scoped - assert ( - agent["state"] == partner["state"] - ), "Partners should share household-scoped attrs" + assert agent["state"] == partner["state"], ( + "Partners should share household-scoped attrs" + ) def test_kids_are_npcs(self): spec = _make_household_spec(size=200, agent_focus="retired couples") @@ -363,15 +363,13 @@ def test_kids_are_npcs(self): if a.get("household_role", "").startswith("dependent_") ] - assert ( - len(kids_as_agents) == 0 - ), "In couples mode, kids should be NPCs, not agents" + assert len(kids_as_agents) == 0, ( + "In couples mode, kids should be NPCs, not agents" + ) # Check that some agents have dependents agents_with_kids = [a for a in result.agents if a.get("dependents")] - assert ( - len(agents_with_kids) > 0 - ), "Some households should have kids as NPCs" + assert len(agents_with_kids) > 0, "Some households should have kids as NPCs" class TestAgentFocusFamilies: @@ -409,22 +407,19 @@ def test_kid_agents_have_correct_structure(self): for 
kid in kid_agents:
             assert "_id" in kid, "Kid agent should have _id"
             assert "household_id" in kid, "Kid agent should have household_id"
-            assert (
-                "household_role" in kid
-            ), "Kid agent should have household_role"
-            assert kid["household_role"].startswith(
-                "dependent_"
-            ), "Kid role should start with dependent_"
-            assert (
-                "relationship_to_primary" in kid
-            ), "Kid should have relationship_to_primary"
+            assert "household_role" in kid, "Kid agent should have household_role"
+            assert kid["household_role"].startswith("dependent_"), (
+                "Kid role should start with dependent_"
+            )
+            assert "relationship_to_primary" in kid, (
+                "Kid should have relationship_to_primary"
+            )
             assert "age" in kid, "Kid should have age"
             assert "gender" in kid, "Kid should have gender"
 
     def test_kid_agents_inherit_household_attrs(self):
         spec = _make_household_spec(size=200, agent_focus="families")
         result = sample_population(spec, count=200, seed=42)
-
         id_map = {a["_id"]: a for a in result.agents}
         kid_agents = [
             a
@@ -447,9 +442,9 @@ def test_kid_agents_inherit_household_attrs(self):
 
             if parent:
                 # 'state' is household-scoped, should be inherited
-                assert (
-                    kid["state"] == parent["state"]
-                ), "Kid should inherit household-scoped attrs from parent"
+                assert kid["state"] == parent["state"], (
+                    "Kid should inherit household-scoped attrs from parent"
+                )
 
     def test_both_partners_still_agents(self):
         spec = _make_household_spec(size=200, agent_focus="families")
@@ -460,21 +455,20 @@ def test_both_partners_still_agents(self):
             a for a in result.agents if a.get("household_role") == "adult_secondary"
         ]
 
-        assert (
-            len(secondary_adults) > 0
-        ), "In families mode, both partners should be agents"
+        assert len(secondary_adults) > 0, (
+            "In families mode, both partners should be agents"
+        )
 
     def test_overflow_kids_remain_npc(self):
         """If we hit the agent count limit, remaining kids should be NPCs."""
         spec = _make_household_spec(size=50, agent_focus="families")
         result = sample_population(spec, count=50, seed=42)
 
-        # Check if any primary has NPC dependents even in families mode
-        # (happens when agent count is reached)
-        agents_with_npc_deps = [a for a in result.agents if a.get("dependents")]
+        # Should not exceed requested count
+        assert len(result.agents) <= 50, "Should not exceed requested agent count"
 
-        # This is ok - overflow protection should keep some kids as NPCs
-        # if we hit the agent limit
+        # Some households may have NPC dependents if we hit the limit
+        # This is expected overflow protection behavior
 
 
 class TestAgentFocusDefault:
@@ -492,9 +486,9 @@ def test_none_defaults_to_primary_only(self):
         ]
 
         # Should have some NPC partners
-        assert (
-            len(primaries_with_npc) > 0
-        ), "Default (None) should behave like primary_only"
+        assert len(primaries_with_npc) > 0, (
+            "Default (None) should behave like primary_only"
+        )
 
     def test_no_secondary_agents_by_default(self):
         spec = _make_household_spec(size=200, agent_focus=None)
@@ -504,9 +498,9 @@ def test_no_secondary_agents_by_default(self):
             a for a in result.agents if a.get("household_role") == "adult_secondary"
         ]
 
-        assert (
-            len(secondary_adults) == 0
-        ), "Default should be primary_only (no partner agents)"
+        assert len(secondary_adults) == 0, (
+            "Default should be primary_only (no partner agents)"
+        )
 
 
 class TestAgentFocusMetadata:
diff --git a/tests/test_compiler.py b/tests/test_compiler.py
index eec5597..ab34cf8 100644
--- a/tests/test_compiler.py
+++ b/tests/test_compiler.py
@@ -123,8 +123,10 @@ def mock_files(self, minimal_population_spec, tmp_path):
     @patch("extropy.scenario.compiler.generate_seed_exposure")
     @patch("extropy.scenario.compiler.determine_interaction_model")
     @patch("extropy.scenario.compiler.define_outcomes")
+    @patch("extropy.scenario.compiler.generate_timeline")
     def test_creates_valid_scenario(
         self,
+        mock_timeline,
         mock_outcomes,
         mock_interaction,
         mock_exposure,
@@ -193,6 +195,8 @@ def test_creates_valid_scenario(
             ],
         )
 
+        mock_timeline.return_value = ([], None)  # No timeline events, no background
+
         spec, validation_result = create_scenario(
             description="Test product launch scenario",
             population_spec_path=pop_path,
@@ -210,8 +214,10 @@ def test_creates_valid_scenario(
     @patch("extropy.scenario.compiler.generate_seed_exposure")
     @patch("extropy.scenario.compiler.determine_interaction_model")
     @patch("extropy.scenario.compiler.define_outcomes")
+    @patch("extropy.scenario.compiler.generate_timeline")
     def test_progress_callback_called(
         self,
+        mock_timeline,
         mock_outcomes,
         mock_interaction,
         mock_exposure,
@@ -271,6 +277,8 @@ def test_progress_callback_called(
             ],
         )
 
+        mock_timeline.return_value = ([], None)  # No timeline events, no background
+
         progress_calls = []
 
         def on_progress(step, status):
@@ -285,8 +293,8 @@ def on_progress(step, status):
             on_progress=on_progress,
         )
 
-        # Should get 5 progress calls (steps 1/5 through 5/5)
-        # Note: step 1/5 is called twice (once for loading, once for parsing)
-        assert len(progress_calls) >= 5
+        # Should get 6 progress calls (steps 1/6 through 6/6)
+        # Note: step 1/6 is called twice (once for loading, once for parsing)
+        assert len(progress_calls) >= 6
         steps = [call[0] for call in progress_calls]
-        assert "5/5" in steps
+        assert "6/6" in steps
diff --git a/tests/test_reasoning_prompts.py b/tests/test_reasoning_prompts.py
index f17f06a..630e36d 100644
--- a/tests/test_reasoning_prompts.py
+++ b/tests/test_reasoning_prompts.py
@@ -672,6 +672,7 @@ def test_float_outcome(self):
         assert "satisfaction" not in schema["required"]
 
     def test_open_ended_outcome(self):
+        # Open-ended outcomes are skipped in Pass 2 — they're captured in Pass 1 free text
         outcomes = OutcomeConfig(
             suggested_outcomes=[
                 OutcomeDefinition(
@@ -683,7 +684,8 @@ def test_open_ended_outcome(self):
             ]
         )
         schema = build_pass2_schema(outcomes)
-        assert schema["properties"]["feedback"]["type"] == "string"
+        # Schema should be None when all outcomes are open_ended (skip Pass 2 entirely)
+        assert schema is None
 
     def test_multiple_outcomes(self):
         outcomes = OutcomeConfig(