Commit 09aba60

Merge pull request #62 from exaforge/codex/network-fast-checkpointing
DB-first runtime + network checkpointing + config-provider merge
2 parents 9f0547c + be8d3de commit 09aba60

70 files changed

Lines changed: 8012 additions & 1419 deletions

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
# Analysis Playbook

Use this after runs complete.

## Primary Artifacts

- `results/meta.json`
- `results/by_timestep.json`
- `results/outcome_distributions.json`
- `results/agent_states.json`
- `results/timeline.jsonl`
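
The artifacts listed above can be loaded for ad hoc analysis with a few lines of Python. The following is a minimal sketch, not part of Extropy itself: it assumes only that each `.json` file holds one JSON document and that `timeline.jsonl` holds one JSON value per line (the usual JSON Lines convention). It makes no assumption about the fields inside each file, and the helper names `load_json`, `load_jsonl`, and `load_artifacts` are hypothetical.

```python
import json
from pathlib import Path


def load_json(path):
    """Load a whole-file JSON artifact such as meta.json."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def load_jsonl(path):
    """Load a JSON Lines artifact: one JSON value per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def load_artifacts(results_dir):
    """Return the five primary artifacts keyed by file stem."""
    root = Path(results_dir)
    names = ["meta", "by_timestep", "outcome_distributions", "agent_states"]
    artifacts = {name: load_json(root / f"{name}.json") for name in names}
    artifacts["timeline"] = load_jsonl(root / "timeline.jsonl")
    return artifacts
```

Calling `load_artifacts("results")` returns a dictionary with the keys `meta`, `by_timestep`, `outcome_distributions`, `agent_states`, and `timeline`; inspect the loaded values to learn their structure before relying on any field name.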

## Analysis Sequence

1. Run-level summary
- population size
- timesteps completed
- stop reason
- total reasoning calls
- token/cost summary

2. Dynamics
- exposure curve shape over time
- when state changes plateau
- whether the run stopped by condition or quiescence

3. Outcomes
- final distribution by outcome
- concentration/polarization patterns
- comparison to baseline variants

4. Segments
Use:
```bash
extropy results <results_dir> --segment <attribute>
```
Evaluate heterogeneous effects by segment.

5. Agent-level deep dive
Use:
```bash
extropy results <results_dir> --agent <agent_id>
```
Look for representative or anomalous trajectories.

6. Convergence + uncertainty across runs
- Run the same scenario across multiple seeds.
- Report central tendency + spread for key outcomes (mean, min/max, std where possible).
- Flag unstable outcomes where between-seed variance is decision-relevant.
- Do not present one run as a definitive forecast.
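
The cross-seed reporting in step 6 can be sketched as below. This is an illustration under stated assumptions, not Extropy's own tooling: the input is a plain list holding one value of a key outcome per seed (how that value is extracted from the artifacts is left open), and "decision-relevant" is interpreted here as the between-seed range straddling a decision threshold that you choose. The function names and the example numbers are hypothetical.

```python
from statistics import mean, stdev


def summarize_across_seeds(values):
    """Central tendency and spread for one outcome measured once per seed."""
    if not values:
        raise ValueError("need at least one seed")
    return {
        "n_seeds": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        # Sample standard deviation is only defined for two or more seeds.
        "std": stdev(values) if len(values) > 1 else None,
    }


def is_unstable(summary, decision_threshold):
    """True when seeds land on both sides of the decision threshold."""
    return summary["min"] < decision_threshold < summary["max"]


# Hypothetical adoption share from three seeds of the same scenario.
adoption_by_seed = [0.42, 0.55, 0.48]
summary = summarize_across_seeds(adoption_by_seed)
print(summary)
print("unstable:", is_unstable(summary, decision_threshold=0.5))
```

With a single seed the sketch reports `std` as `None` rather than zero, which keeps a one-run result from looking more certain than it is.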

## Comparative Analysis Template

When comparing runs, report:
1. What changed (single axis)
2. Delta in exposure speed
3. Delta in key outcomes
4. Delta in cost/runtime
5. Confidence assessment (stable, or needs more seeds)
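
One way to compute the deltas in this template is sketched below. Everything about the input shape is an assumption made for illustration: each run is a plain dictionary with a cumulative exposed-share series (`exposure`), final outcome shares (`outcomes`), and a scalar `cost`. "Exposure speed" is taken here to mean the first timestep at which the exposed share reaches a target (0.5 by default); other definitions are equally reasonable.

```python
def timesteps_to_exposure(exposure_by_timestep, target=0.5):
    """Index of the first timestep whose exposed share reaches target, else None."""
    for t, share in enumerate(exposure_by_timestep):
        if share >= target:
            return t
    return None


def compare_runs(baseline, variant, target=0.5):
    """Variant-minus-baseline deltas for exposure speed, outcomes, and cost."""
    base_t = timesteps_to_exposure(baseline["exposure"], target)
    var_t = timesteps_to_exposure(variant["exposure"], target)
    speed_delta = None if base_t is None or var_t is None else var_t - base_t
    names = sorted(set(baseline["outcomes"]) | set(variant["outcomes"]))
    return {
        # Negative means the variant reached the target exposure sooner.
        "exposure_speed_delta": speed_delta,
        "outcome_deltas": {
            name: variant["outcomes"].get(name, 0.0) - baseline["outcomes"].get(name, 0.0)
            for name in names
        },
        "cost_delta": variant["cost"] - baseline["cost"],
    }


# Hypothetical runs that differ on a single axis.
baseline = {"exposure": [0.1, 0.3, 0.6, 0.8], "outcomes": {"adopt": 0.40, "reject": 0.60}, "cost": 12.0}
variant = {"exposure": [0.2, 0.5, 0.7, 0.9], "outcomes": {"adopt": 0.55, "reject": 0.45}, "cost": 15.0}
print(compare_runs(baseline, variant))
```

The confidence assessment in item 5 is deliberately left out of the sketch, since it depends on repeating the comparison across seeds rather than on a single pair of runs.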
Lines changed: 142 additions & 0 deletions
@@ -0,0 +1,142 @@
# Capabilities and Examples

Use this file to map user intent to what Extropy can model and how to execute it.

## 1) Core Capability Classes

1. Population synthesis
- Build statistically grounded synthetic populations from natural-language scope.
- Add scenario-specific behavioral/psychographic attributes.

2. Social graph simulation
- Generate network structures and influence pathways.
- Model diffusion and exposure propagation.

3. Scenario compilation
- Translate events/policies/product changes into executable exposure + outcome logic.

4. Agent reasoning dynamics
- Simulate iterative belief updates, memory effects, and classification outcomes.

5. Outcome analytics
- Produce timeline dynamics, final distributions, segment deltas, and agent-level traces.

6. Experiment operations
- Support estimation, batching, sweeps, versioning, triage, and reporting.

## 2) Decision Domains

- Public policy and governance
- Market/pricing strategy
- Product launch and diffusion
- Crisis and reputation response
- Messaging and political strategy
- Community and urban planning
- Healthcare behavior change
- B2B and enterprise transformation

## 3) Advanced Study Patterns

1. Counterfactual suites
- Baseline vs alternatives under fixed population/config.

2. Sensitivity analysis
- Sweep one axis at a time around baseline assumptions.

3. Confidence sweeps
- Multi-seed reruns for stability/variance analysis.

4. Segment stress tests
- Identify cohorts with fragile or highly parameter-sensitive outcomes.

5. Mechanism-first analysis
- Explain outcomes from exposure paths and agent state traces.
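
The one-axis-at-a-time sweep from the sensitivity-analysis pattern above can be planned with a small generator. This is a sketch under assumptions: configurations are modeled as flat dictionaries, and the keys `price_increase_pct` and `message_frame` are made up for the example rather than taken from Extropy's configuration format.

```python
def one_axis_sweep(baseline, axes):
    """Yield (label, config) pairs that differ from baseline on exactly one axis."""
    yield "baseline", dict(baseline)
    for axis, values in axes.items():
        for value in values:
            if value == baseline.get(axis):
                continue  # same as baseline, so it would be a duplicate run
            config = dict(baseline)
            config[axis] = value
            yield f"{axis}={value}", config


# Hypothetical baseline assumptions and the values to sweep on each axis.
baseline = {"price_increase_pct": 10, "message_frame": "neutral"}
axes = {
    "price_increase_pct": [10, 20, 30],
    "message_frame": ["neutral", "value"],
}

for label, config in one_axis_sweep(baseline, axes):
    print(label, config)
```

Because every variant changes a single key, any difference from the baseline run can be attributed to that axis, which is the point of sweeping one axis at a time instead of running the full grid.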

## 4) Practical Boundaries

- Best for social-behavioral dynamics, not physics/logistics optimization.
- Multi-event cascades are better modeled as staged runs.
- Outputs are simulation-informed forecasts, not guaranteed outcomes.

## 5) Trigger Phrases

Use this skill when users ask things like:
- "simulate how people will respond to..."
- "what happens if we raise price by..."
- "which segments will churn/adopt/protest"
- "test these message variants before launch"
- "run scenario analysis with uncertainty"
- "why did this segment flip in simulation"

## 6) Example Requests (Illustrative Only)

These are examples, not defaults.

1. Policy: congestion pricing alternatives
- Ask: estimate compliance/backlash across income and commute-access segments.
- Shape: baseline + alternatives + equity cuts.

2. Public health messaging
- Ask: find the least responsive groups and the best message frame.
- Shape: same population, multiple message scenarios, compare adoption/sentiment.

3. SaaS pricing
- Ask: estimate churn/downgrade/stay under +10/+20/+30 price shifts.
- Shape: counterfactual suite + revenue-risk tradeoff.

4. Product launch
- Ask: predict enable/disable behavior for a default-on AI feature.
- Shape: adoption outcomes + trust/privacy sensitivity sweep.

5. Crisis response
- Ask: compare apology-only vs refund vs policy-change response.
- Shape: trust recovery and negative word-of-mouth (WOM) dynamics by segment.

6. Political messaging
- Ask: compare message resonance/backlash by ideology/economic exposure.
- Shape: frame variants + propagation differences.

7. Community planning
- Ask: simulate support/neutral/oppose response to a development proposal.
- Shape: concern taxonomy + coalition risk.

8. Healthcare adoption
- Ask: model clinician switching under reimbursement changes.
- Shape: policy variants + adoption friction analysis.

9. Enterprise change
- Ask: simulate compliance/disengagement/attrition intent under a policy shift.
- Shape: role/commute/trust segment breakdown.

10. Deep triage
- Ask: debug a flat exposure curve or unstable seed outcomes.
- Shape: evidence-led root cause, minimal fix, rerun command.

## 7) Quick Execution Templates

1. Baseline + sensitivity
- 1 baseline
- 3 variants
- 3 seeds each
- 2 to 3 key segment cuts

2. Message shootout
- 1 population
- 3 to 5 message scenarios
- fixed config + seeds
- rank by primary KPI + stability

3. Decision brief inputs
- decision objective
- top findings
- segment impacts
- confidence/stability
- recommendation + caveats
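
The first two templates above reduce to simple bookkeeping, sketched here. The sketch assumes a higher KPI is better and reads "stability" as lower between-seed standard deviation, used only to break ties on the mean; both are assumptions, as are the scenario names, seeds, and adopter counts in the example.

```python
from itertools import product
from statistics import mean, stdev


def run_matrix(scenarios, seeds):
    """Every (scenario, seed) pair that has to be executed."""
    return list(product(scenarios, seeds))


def rank_scenarios(kpi_by_scenario):
    """Sort by mean KPI (higher first), then by between-seed std (lower first)."""
    rows = []
    for name, values in kpi_by_scenario.items():
        spread = stdev(values) if len(values) > 1 else 0.0
        rows.append((name, mean(values), spread))
    return sorted(rows, key=lambda row: (-row[1], row[2]))


# Baseline + sensitivity: 1 baseline and 3 variants, 3 seeds each.
runs = run_matrix(["baseline", "variant_1", "variant_2", "variant_3"], [101, 102, 103])
print(len(runs), "runs planned")

# Message shootout: hypothetical adopter counts per seed for each message frame.
kpi = {
    "frame_a": [50, 52, 54],
    "frame_b": [40, 60, 56],
    "frame_c": [30, 31, 32],
}
for name, avg, spread in rank_scenarios(kpi):
    print(name, avg, round(spread, 2))
```

In the example, `frame_a` and `frame_b` share the same mean, so the tighter between-seed spread of `frame_a` puts it first.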

## 8) Capability to File Map

- Can Extropy model this? -> this file
- How to run it? -> `OPERATIONS.md`
- How to validate/fix/escalate? -> `QUALITY_TRIAGE_ESCALATION.md`
- How to analyze outcomes? -> `ANALYSIS_PLAYBOOK.md`
- How to write the decision report? -> `EXPERIMENT_REPORT_TEMPLATE.md`
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
# Experiment Report Template

Use this template for every completed experiment batch.

## 1. Decision Context

- Study name:
- Decision to support:
- Primary KPI/outcome:
- Constraints (budget, timeline, policy limits):

## 2. Experiment Setup

- Population description:
- Scenario description:
- Run set ID:
- Variants included:
- Seed policy (single seed or multi-seed):
- Model/provider config:

## 3. Headline Results

- Baseline outcome distribution:
- Most important segment deltas:
- Exposure dynamics summary:
- Stop condition and total timesteps:

## 4. Confidence and Stability

- Number of seeds:
- Between-seed variance for key outcomes:
- Stable findings (low variance):
- Unstable findings (high variance):
- Confidence statement (high/medium/low):

## 5. Why It Happened (Mechanism)

- Dominant drivers inferred from traces:
- Key peer influence patterns:
- Conviction/memory effects observed:
- Outlier trajectories and interpretation:

## 6. Cost and Operations

- Total token usage (pivotal/routine):
- Estimated cost:
- Runtime and bottlenecks:
- Any retries/errors/resume events:

## 7. Recommendations

1. Immediate decision recommendation
2. Risk caveats
3. Next experiment(s) to run
4. What would change your recommendation

## 8. Evidence Files

- `results/meta.json`
- `results/outcome_distributions.json`
- `results/by_timestep.json`
- `results/agent_states.json`
- `results/timeline.jsonl`
