Commit ee0805f

Tighten sim re-reasoning caps and refresh simulate docs
1 parent 5af82ac commit ee0805f

7 files changed

Lines changed: 541 additions & 29 deletions

docs/commands.md

Lines changed: 4 additions & 2 deletions
@@ -376,8 +376,8 @@ extropy simulate -s asi-announcement --early-convergence off
 4. Resolves effective models/rate limits from CLI overrides then config defaults.
 5. Runs simulation loop:
    - seed + timeline + network exposures,
-   - chunked reasoning (two-pass by default, merged with `--merged-pass`),
-   - medium/high conversation interleaving,
+   - chunked reasoning (two-pass by default, merged with `--merged-pass`) with per-timestep reasoning budget,
+   - medium/high conversation interleaving with novelty + per-timestep conversation budget,
    - timestep summary + stopping checks.
 6. Persists run state to canonical `study.db` and writes results artifacts to `results/{scenario}/` (or `--output`).

@@ -462,6 +462,8 @@ Precedence:
 - `--early-convergence auto` uses scenario YAML value when set; otherwise runtime auto-rule applies (do not early-stop while future timeline events remain).
 - `low` fidelity skips conversations; `medium` and `high` enable conversations with stricter per-agent caps at lower fidelity.
 - `--retention-lite` drops full raw reasoning payload retention to reduce DB/storage volume.
+- Timeline events without explicit `exposure_rules` use bounded fallback filtering (not full-seed-rule replay).
+- `extreme` re-reasoning is bounded to a high-salience subset, not all aware agents.

 ### Failure Behavior

docs/pipeline/simulate.md

Lines changed: 28 additions & 3 deletions
@@ -17,13 +17,22 @@ This is not a patch plan. It is a contract + current-state reality map.
 - Simulation is now scenario-centric in lookup order (`scenario_id` first, legacy IDs as fallback).
 - Runtime loop supports DB-backed resume/checkpoint behavior at timestep and chunk levels.
 - Timeline events support provenance epochs (`info_epoch`) and forced re-reasoning escalation.
+- Timeline fallback exposure no longer reuses full seed rule sets blindly:
+  - fallback rules are filtered to representative channel rules,
+  - direct timeline exposure is deduplicated per-agent per-event,
+  - direct timeline exposure is capped by intensity (`normal|high|extreme`).
 - Reasoning supports two modes:
   - default two-pass (role-play + classification),
   - optional merged single-pass (`--merged-pass`).
 - Fidelity modes actively control conversation behavior:
   - `low`: no conversations,
   - `medium`: bounded conversations after chunks,
   - `high`: deeper conversations + extra public-text classification.
+- Runtime now applies deterministic per-timestep budgets:
+  - bounded reasoning candidate set,
+  - bounded conversation count,
+  - novelty gate before conversation execution.
+- Reasoning prompt payload is now trimmed by fidelity (recent exposures + recent memory only).
 - Early convergence behavior is now explicit and overrideable (`auto|on|off`).
 - Runtime exports include `by_timestep.json`, `meta.json`, and conditional conversation/social artifacts.
 - State manager now auto-upgrades legacy simulation tables/columns (including timeline-provenance exposure fields) on startup.
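
Review note: the deterministic per-timestep budgets added in this hunk (bounded reasoning candidates, bounded conversation count, novelty gate) can be pictured with a minimal Python sketch. Everything named here is hypothetical and not taken from the Extropy codebase: `Candidate`, the `salience` score, `cap_candidates`, `passes_novelty_gate`, `plan_conversations`, and both thresholds are illustrative assumptions. The docs do not say how candidates are ranked or what counts as a material shift.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    agent_id: str
    salience: float  # hypothetical ranking score; not defined in the docs


def cap_candidates(candidates, budget):
    """Keep at most `budget` candidates, highest salience first.

    Ties are broken by agent_id so the same inputs always yield the same
    selection, which is one way to keep a budget deterministic.
    """
    ranked = sorted(candidates, key=lambda c: (-c.salience, c.agent_id))
    return ranked[: max(budget, 0)]


def passes_novelty_gate(sentiment_shift, uncertainty, will_share,
                        shift_threshold=0.2, uncertainty_threshold=0.5):
    """Assumed gate: pass on a material shift, notable uncertainty, or a share signal."""
    return (abs(sentiment_shift) >= shift_threshold
            or uncertainty >= uncertainty_threshold
            or will_share)


def plan_conversations(requests, budget):
    """Admit gated requests in order until the global conversation budget is spent."""
    admitted = []
    for agent_id, shift, uncertainty, will_share in requests:
        if len(admitted) >= budget:
            break
        if passes_novelty_gate(shift, uncertainty, will_share):
            admitted.append(agent_id)
    return admitted
```

Sorting with an explicit tie-breaker instead of sampling is what makes the cap reproducible across resumed runs in this sketch; the real selection policy may differ.
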
@@ -273,8 +282,8 @@ Each timestep does:
    - seed,
    - timeline event (if this timestep has one),
    - network propagation,
-3. reason selected agents in chunks,
-4. interleave conversations after chunks (medium/high fidelity),
+3. select and cap reasoning candidates, then reason in chunks,
+4. interleave conversations after chunks (medium/high fidelity, novelty + budget gated),
 5. record social posts from sharers,
 6. apply conviction decay to non-reasoned agents,
 7. compute and save timestep summary,
@@ -288,6 +297,12 @@ Timeline exposures carry:
 - `info_epoch`,
 - optional `re_reasoning_intensity` (`normal|high|extreme`).

+When a timeline event does not provide explicit `exposure_rules`, runtime fallback is now bounded:
+- prefers rules authored for current timestep (else earliest authored seed step),
+- keeps representative rule(s) per channel rather than all overlapping seed rules,
+- deduplicates direct event exposure per agent,
+- applies an intensity-based direct-exposure cap to prevent saturation spikes.
+
 ### Reasoning behavior
 Default reasoning mode is two-pass:
 1. Pass 1 role-play response (free text + sentiment/conviction/share/action fields).
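
Review note: the bounded fallback described in the added lines of this hunk (prefer current-timestep rules, keep a representative rule per channel, deduplicate per agent, cap by intensity) might look roughly like the Python sketch below. This is an illustration under stated assumptions, not the project's implementation: `select_fallback_rules`, `cap_direct_exposure`, and the `DIRECT_EXPOSURE_CAP` fractions are hypothetical, and choosing the highest-probability rule as the channel representative is my assumption. Rules are modeled as dicts with the `channel`, `probability`, and `timestep` keys seen in the scenario YAML.

```python
# Hypothetical caps on the share of the population directly exposed, per intensity.
DIRECT_EXPOSURE_CAP = {"normal": 0.25, "high": 0.5, "extreme": 0.75}


def select_fallback_rules(seed_rules, timestep):
    """Pick a bounded rule set for an event that has no explicit exposure_rules."""
    if not seed_rules:
        return []
    # Prefer rules authored for this timestep, else the earliest authored step.
    steps = {rule["timestep"] for rule in seed_rules}
    chosen_step = timestep if timestep in steps else min(steps)
    at_step = [r for r in seed_rules if r["timestep"] == chosen_step]
    # Keep one representative per channel (assumed: the highest probability).
    best = {}
    for rule in at_step:
        current = best.get(rule["channel"])
        if current is None or rule["probability"] > current["probability"]:
            best[rule["channel"]] = rule
    return list(best.values())


def cap_direct_exposure(agent_ids, population_size, intensity):
    """Deduplicate exposed agents, then trim to the intensity-based cap."""
    unique = list(dict.fromkeys(agent_ids))  # preserves first-seen order
    limit = int(population_size * DIRECT_EXPOSURE_CAP[intensity])
    return unique[:limit]
```

Collapsing overlapping rules before any probability draw is what keeps several same-channel seed rules from stacking into near-certain exposure, which is the saturation spike the docs mention.
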
@@ -298,6 +313,7 @@ Merged mode (`--merged-pass`) combines both in one schema/call.
 Important current behavior:
 - open-ended outcomes are not classified in Pass 2; they remain in free-text reasoning.
 - high fidelity adds separate public-text classification pass for public position.
+- reasoning context is trimmed to recent exposure/memory windows by fidelity to bound token growth.

 Primary file:
 - `extropy/simulation/reasoning.py`
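
Review note: the fidelity-based context trim added in this hunk can be illustrated with a short Python sketch. The window sizes and the `trim_context` helper are hypothetical; the docs only state that recent exposure and memory windows depend on fidelity, not what the sizes are.

```python
# Hypothetical window sizes; the docs do not give the actual numbers.
CONTEXT_WINDOWS = {
    "low": {"exposures": 3, "memory": 2},
    "medium": {"exposures": 5, "memory": 4},
    "high": {"exposures": 8, "memory": 6},
}


def trim_context(exposures, memory, fidelity):
    """Keep only the most recent exposures and memory entries.

    Both lists are assumed to be ordered oldest-first, so the tail of each
    list is the most recent window.
    """
    window = CONTEXT_WINDOWS[fidelity]
    return {
        "exposures": exposures[-window["exposures"]:],
        "memory": memory[-window["memory"]:],
    }
```

With fixed windows the prompt size per agent stops growing with run length, which matches the stated goal of bounding token growth.
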
@@ -311,13 +327,22 @@ Primary file:

 Committed agents are protected from routine multi-touch re-reasoning unless forced by timeline policy.

+Additional runtime gating now applies before calls:
+- explicit forced IDs are no longer expanded to all aware agents under `extreme`,
+- forced expansion uses a bounded high-salience subset (directly impacted + nearby connected + sharers),
+- per-timestep reasoning budget caps total LLM calls.
+
 ### Conversation behavior by fidelity
 - `low`: no conversations executed.
 - `medium`: conversations enabled; per-agent cap 1 per timestep; 4-message conversations.
 - `high`: per-agent cap 2 per timestep; 6-message conversations.

 Conversation outcomes can override provisional reasoning state.

+Conversation execution now also requires:
+- per-timestep global conversation budget not exhausted,
+- novelty gate pass (material shift/uncertainty/share signal from fresh reasoning).
+
 Primary file:
 - `extropy/simulation/conversation.py`

@@ -399,7 +424,7 @@ Open-ended outcomes live in reasoning text, not in structured Pass 2 outcome pay
 `--resume` requires `--run-id`; incorrect run-id usage can lead to user confusion about expected continuation behavior.

 ### 6) Prompt/context size grows with memory and history
-High-fidelity prompts include richer memory and context windows, which improves realism but increases token load and cost volatility.
+Prompt growth is now bounded by fidelity-window trims, but high-fidelity runs still carry materially higher token/cost load than low fidelity.

 ### 7) Legacy fallback paths still exist
 Simulation still includes legacy lookup fallbacks (`population_id`, `network_id`) which can mask schema drift in older studies.

examples/study-03/scenario/iran-strikes/scenario.v1.yaml

Lines changed: 94 additions & 6 deletions
@@ -290,7 +290,26 @@ timeline:
 credibility: 0.95
 ambiguity: 0.28
 emotional_valence: -0.72
-exposure_rules: null
+exposure_rules:
+- channel: military_official_channels
+  when: military_connection in ('Active duty', 'Veteran', 'Immediate family member')
+    or reservist_or_national_guard == True
+  probability: 1.0
+  timestep: 0
+- channel: broadcast_breaking_news
+  when: primary_info_source == 'Cable/broadcast TV news'
+  probability: 0.95
+  timestep: 0
+- channel: push_notification_online_news
+  when: primary_info_source == 'Online news sites' and tech_adoption_posture in
+    ('Early adopter', 'Innovator', 'Early majority adopter')
+  probability: 0.95
+  timestep: 0
+- channel: social_media_viral_spread
+  when: social_media_behavior in ('Active poster/Influencer', 'Occasional poster')
+    and primary_info_source == 'Social media'
+  probability: 0.92
+  timestep: 0
 description: 'Week 1: Iranian missile retaliation kills US troops; gas prices surge
   and markets crash'
 re_reasoning_intensity: extreme
@@ -309,7 +328,29 @@ timeline:
 credibility: 0.94
 ambiguity: 0.28
 emotional_valence: -0.8
-exposure_rules: null
+exposure_rules:
+- channel: push_notification_online_news
+  when: primary_info_source in ('Podcasts', 'Print media') and tech_adoption_posture
+    not in ('Technophobe/Avoider', 'Late majority adopter')
+  probability: 0.8
+  timestep: 1
+- channel: broadcast_breaking_news
+  when: age >= 50 and primary_info_source != 'Avoids news'
+  probability: 0.85
+  timestep: 1
+- channel: social_media_viral_spread
+  when: age < 40 and social_media_behavior in ('Passive consumer', 'Occasional poster')
+  probability: 0.88
+  timestep: 1
+- channel: broadcast_breaking_news
+  when: political_engagement_level == 'Highly engaged partisan'
+  probability: 0.9
+  timestep: 1
+- channel: social_media_viral_spread
+  when: age >= 40 and social_media_behavior != 'Non-user' and primary_info_source
+    == 'Social media'
+  probability: 0.82
+  timestep: 1
 description: 'Week 3: Iran blockades the Strait of Hormuz; oil hits $140/barrel;
   gas tops $6/gallon nationally'
 re_reasoning_intensity: extreme
@@ -329,7 +370,24 @@ timeline:
 credibility: 0.93
 ambiguity: 0.28
 emotional_valence: -0.85
-exposure_rules: null
+exposure_rules:
+- channel: personal_network_word_of_mouth
+  when: primary_info_source == 'Local news' or social_media_behavior == 'Non-user'
+  probability: 0.75
+  timestep: 2
+- channel: broadcast_breaking_news
+  when: primary_info_source == 'Local news'
+  probability: 0.8
+  timestep: 2
+- channel: personal_network_word_of_mouth
+  when: supply_chain_job_exposure == True or employment_sector in ('Federal government',
+    'State/local government')
+  probability: 0.85
+  timestep: 2
+- channel: social_media_viral_spread
+  when: economic_anxiety > 0.6 and social_media_behavior != 'Non-user'
+  probability: 0.78
+  timestep: 2
 description: 'Week 5: Economic cascade — unemployment spikes, food prices surge,
   oil briefly hits $180/barrel after Iran strikes Saudi oil facilities'
 re_reasoning_intensity: extreme
@@ -350,7 +408,24 @@ timeline:
 credibility: 0.91
 ambiguity: 0.28
 emotional_valence: -0.6
-exposure_rules: null
+exposure_rules:
+- channel: personal_network_word_of_mouth
+  when: primary_info_source == 'Avoids news' and extraversion > 0.5
+  probability: 0.65
+  timestep: 3
+- channel: broadcast_breaking_news
+  when: urban_rural == 'Rural' and age >= 45
+  probability: 0.8
+  timestep: 3
+- channel: personal_network_word_of_mouth
+  when: urban_rural == 'Rural' and primary_info_source not in ('Cable/broadcast
+    TV news', 'Online news sites')
+  probability: 0.7
+  timestep: 3
+- channel: social_media_viral_spread
+  when: war_escalation_fear > 0.7 and social_media_behavior != 'Non-user'
+  probability: 0.85
+  timestep: 3
 description: 'Week 8: Strait partially reopened; Russia and China back Iran; 400,000
   US jobs lost; midterm framing intensifies'
 re_reasoning_intensity: high
@@ -371,7 +446,16 @@ timeline:
 credibility: 0.82
 ambiguity: 0.28
 emotional_valence: -0.7
-exposure_rules: null
+exposure_rules:
+- channel: personal_network_word_of_mouth
+  when: primary_info_source == 'Avoids news' and extraversion <= 0.5
+  probability: 0.45
+  timestep: 4
+- channel: broadcast_breaking_news
+  when: midterm_turnout_propensity in ('Rarely votes', 'Never votes') and primary_info_source
+    != 'Avoids news'
+  probability: 0.6
+  timestep: 4
 description: 'Week 10: Leaked intelligence report suggests Iran is reconstituting
   uranium enrichment at undisclosed backup sites; ceasefire talks stall'
 re_reasoning_intensity: high
@@ -392,7 +476,11 @@ timeline:
 credibility: 0.93
 ambiguity: 0.28
 emotional_valence: -0.45
-exposure_rules: null
+exposure_rules:
+- channel: personal_network_word_of_mouth
+  when: 'true'
+  probability: 0.55
+  timestep: 5
 description: 'Week 12: Fragile ceasefire reached through Omani/Qatari mediation;
   oil drops to $105; total US economic cost exceeds $800 billion'
 re_reasoning_intensity: high
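
Review note: to make the new explicit `exposure_rules` easier to reason about, here is a minimal Python sketch of how one rule could be applied to one agent. It is not the Extropy evaluator, which this commit does not show. `rule_matches` and `exposed_channels` are hypothetical helpers, and the sketch assumes that `when` is a Python-style boolean expression over agent attributes, that the literal `true` always matches, and that `probability` is an independent draw per matching rule.

```python
import random


def rule_matches(when, agent):
    """Evaluate a rule's `when` expression against one agent's attributes.

    Uses eval() with builtins disabled. That is only acceptable for trusted
    scenario files; it is not a sandbox.
    """
    names = {"true": True, "false": False, **agent}
    return bool(eval(when, {"__builtins__": {}}, names))


def exposed_channels(rules, agent, rng):
    """Return channels that reach the agent: the rule matches, then a probability draw."""
    channels = []
    for rule in rules:
        if rule_matches(rule["when"], agent) and rng.random() < rule["probability"]:
            channels.append(rule["channel"])
    return channels


# Example agent using attribute names that appear in the scenario rules.
agent = {"age": 62, "primary_info_source": "Local news",
         "social_media_behavior": "Non-user"}
rules = [
    {"channel": "broadcast_breaking_news",
     "when": "age >= 50 and primary_info_source != 'Avoids news'",
     "probability": 0.85},
    {"channel": "social_media_viral_spread",
     "when": "age < 40 and social_media_behavior in ('Passive consumer', 'Occasional poster')",
     "probability": 0.88},
]
reached = exposed_channels(rules, agent, random.Random(0))
```

Passing a seeded `random.Random` keeps the draws reproducible. Under these assumptions an agent matching two rules on the same channel gets two independent draws, so overlapping rules such as the two `broadcast_breaking_news` entries at timestep 1 raise that agent's effective exposure probability.
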
