Merged
17 commits
4b618ce
feat(db-first): add study.db storage, strict cli contracts, and new d…
DeveshParagiri Feb 15, 2026
cb45139
fix(tests+checkpoint): handle partial chunk resume and update DB-firs…
DeveshParagiri Feb 15, 2026
bcf0810
feat(sim): run-scoped state tables and writer-queue checkpoints
DeveshParagiri Feb 15, 2026
d07f6be
feat(cli): make inspect/results/report/export/chat run-aware
DeveshParagiri Feb 15, 2026
db3d364
feat(config): redesign config to 2-tier fast/strong with provider/mod…
DeveshParagiri Feb 15, 2026
093bc93
feat(providers): add provider registry, OpenAICompatProvider, rename …
DeveshParagiri Feb 15, 2026
ac4c470
feat(routing): wire 2-tier config through LLM routing, engine, and es…
DeveshParagiri Feb 15, 2026
36d0032
feat(cli): update config/simulate/estimate commands for fast/strong f…
DeveshParagiri Feb 15, 2026
ce7b2b6
feat(cost): add cost tracking package, wire providers, CLI --cost flag
DeveshParagiri Feb 15, 2026
d992aef
refactor(config): remove all legacy/deprecated code from config module
DeveshParagiri Feb 15, 2026
f87dacd
feat(network): enforce DB-only similarity checkpoints and resume
DeveshParagiri Feb 15, 2026
3bf939a
feat(sim): add runtime memory guardrails and compact default exports
DeveshParagiri Feb 15, 2026
81016f3
merge: integrate cost-tracking into config-provider-redesign
DeveshParagiri Feb 15, 2026
c57449d
feat(persona): add study-db agent source and db-first preview tests
DeveshParagiri Feb 15, 2026
cd8a23a
refactor(config): remove DefaultsConfig, move show_cost to top-level
DeveshParagiri Feb 15, 2026
21b0493
merge(config-provider-redesign): integrate provider/config refactor w…
DeveshParagiri Feb 15, 2026
be8d3de
chore: trigger claude code review
DeveshParagiri Feb 15, 2026
59 changes: 59 additions & 0 deletions .claude/skills/extropy/references/ANALYSIS_PLAYBOOK.md
@@ -0,0 +1,59 @@
# Analysis Playbook

Use this after runs complete.

## Primary Artifacts

- `results/meta.json`
- `results/by_timestep.json`
- `results/outcome_distributions.json`
- `results/agent_states.json`
- `results/timeline.jsonl`
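
A minimal sketch of loading these artifacts for analysis, assuming only what the file extensions imply: `.json` files hold one JSON document and `.jsonl` files hold one JSON value per line. The helper name `load_artifacts` is hypothetical and not part of Extropy, and no particular schema inside the files is assumed.

```python
import json
from pathlib import Path


def load_artifacts(results_dir):
    """Load every .json and .jsonl file in results_dir into a dict keyed by file name.

    Hypothetical helper: it assumes nothing about the schema inside each file.
    """
    artifacts = {}
    for path in sorted(Path(results_dir).iterdir()):
        if path.suffix == ".json":
            # Regular JSON: one document per file.
            artifacts[path.name] = json.loads(path.read_text(encoding="utf-8"))
        elif path.suffix == ".jsonl":
            # JSON Lines: one JSON value per non-empty line.
            with path.open(encoding="utf-8") as handle:
                artifacts[path.name] = [
                    json.loads(line) for line in handle if line.strip()
                ]
    return artifacts
```

Calling `load_artifacts("results")` would return a dict such as `{"meta.json": ..., "timeline.jsonl": [...]}`, which can then be inspected interactively before committing to a fixed analysis script.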

## Analysis Sequence

1. Run-level summary
- population size
- timesteps completed
- stop reason
- total reasoning calls
- token/cost summary

2. Dynamics
- exposure curve shape over time
- when state changes plateau
- whether the run stopped on a stop condition or by reaching quiescence

3. Outcomes
- final distribution by outcome
- concentration/polarization patterns
- compare to baseline variants

4. Segments
Use:
```bash
extropy results <results_dir> --segment <attribute>
```
Evaluate heterogeneous effects by segment.

5. Agent-level deep dive
Use:
```bash
extropy results <results_dir> --agent <agent_id>
```
Look for representative or anomalous trajectories.

6. Convergence + uncertainty across runs
- Run the same scenario across multiple seeds.
- Report central tendency + spread for key outcomes (mean, min/max, standard deviation where possible).
- Flag unstable outcomes where between-seed variance is decision-relevant.
- Do not present one run as a definitive forecast.
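
The multi-seed summary in step 6 can be sketched as follows. This is an illustration only: the function name `summarize_seeds` and the adoption shares are hypothetical, and the values would in practice be read from each run's results.

```python
from statistics import mean, stdev


def summarize_seeds(values):
    """Return count, mean, min, max, and sample std for one outcome across seeds."""
    if not values:
        raise ValueError("need at least one seed")
    return {
        "n": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        # Sample standard deviation is undefined for a single seed.
        "std": stdev(values) if len(values) > 1 else None,
    }


# Hypothetical share of agents ending in an "adopt" outcome, one value per seed.
adoption_by_seed = [0.40, 0.50, 0.60]
summary = summarize_seeds(adoption_by_seed)
print(summary)
```

Returning `None` for the standard deviation of a single run keeps the report honest about the fact that one seed carries no spread information.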

## Comparative Analysis Template

When comparing runs, report:
1. What changed (single axis)
2. Delta in exposure speed
3. Delta in key outcomes
4. Delta in cost/runtime
5. Confidence assessment (stable, or needs more seeds)
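
As a rough illustration of the "delta in key outcomes" item, the sketch below subtracts a baseline outcome distribution from a variant's. The outcome names and shares are hypothetical, and `outcome_deltas` is not an Extropy function.

```python
def outcome_deltas(baseline, variant):
    """Return variant minus baseline for every outcome present in either distribution."""
    outcomes = sorted(set(baseline) | set(variant))
    # An outcome missing from one distribution is treated as a share of 0.0.
    return {
        name: variant.get(name, 0.0) - baseline.get(name, 0.0)
        for name in outcomes
    }


# Hypothetical final outcome shares for two runs that differ on a single axis.
baseline = {"adopt": 0.50, "reject": 0.30, "undecided": 0.20}
variant = {"adopt": 0.75, "reject": 0.25}

deltas = outcome_deltas(baseline, variant)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

Deltas from a single pair of runs should be read alongside the between-seed spread, since a difference smaller than the seed-to-seed variation is not evidence of an effect.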
142 changes: 142 additions & 0 deletions .claude/skills/extropy/references/CAPABILITIES_AND_EXAMPLES.md
@@ -0,0 +1,142 @@
# Capabilities and Examples

Use this file to map user intent to what Extropy can model and how to execute it.

## 1) Core Capability Classes

1. Population synthesis
- Build statistically grounded synthetic populations from natural-language scope.
- Add scenario-specific behavioral/psychographic attributes.

2. Social graph simulation
- Generate network structures and influence pathways.
- Model diffusion and exposure propagation.

3. Scenario compilation
- Translate events/policies/product changes into executable exposure + outcome logic.

4. Agent reasoning dynamics
- Simulate iterative belief updates, memory effects, and classification outcomes.

5. Outcome analytics
- Produce timeline dynamics, final distributions, segment deltas, and agent-level traces.

6. Experiment operations
- Support estimation, batching, sweeps, versioning, triage, and reporting.

## 2) Decision Domains

- Public policy and governance
- Market/pricing strategy
- Product launch and diffusion
- Crisis and reputation response
- Messaging and political strategy
- Community and urban planning
- Healthcare behavior change
- B2B and enterprise transformation

## 3) Advanced Study Patterns

1. Counterfactual suites
- Baseline vs alternatives under fixed population/config.

2. Sensitivity analysis
- Sweep one axis at a time around baseline assumptions.

3. Confidence sweeps
- Multi-seed reruns for stability/variance analysis.

4. Segment stress tests
- Identify cohorts with fragile or highly parameter-sensitive outcomes.

5. Mechanism-first analysis
- Explain outcomes from exposure paths and agent state traces.
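
The one-axis-at-a-time sweep in pattern 2 can be sketched like this. The parameter names (`price_increase_pct`, `message_frame`) and the helper `one_at_a_time` are hypothetical and chosen only for illustration; they are not Extropy configuration keys.

```python
def one_at_a_time(baseline, sweeps):
    """Build variants that each change exactly one parameter of the baseline."""
    variants = []
    for axis, values in sweeps.items():
        for value in values:
            if value == baseline[axis]:
                continue  # Skip the value that reproduces the baseline itself.
            variant = dict(baseline)  # Copy so the baseline is never mutated.
            variant[axis] = value
            variants.append((axis, variant))
    return variants


# Hypothetical baseline assumptions and the values to sweep on each axis.
baseline = {"price_increase_pct": 10, "message_frame": "neutral"}
sweeps = {
    "price_increase_pct": [10, 20, 30],
    "message_frame": ["neutral", "benefit"],
}

variants = one_at_a_time(baseline, sweeps)
for axis, variant in variants:
    print(axis, variant)
```

Because every variant differs from the baseline on a single axis, any outcome delta can be attributed to that axis, which is the point of sweeping one axis at a time.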

## 4) Practical Boundaries

- Best for social-behavioral dynamics, not physics/logistics optimization.
- Multi-event cascades are better modeled as staged runs.
- Outputs are simulation-informed forecasts, not guaranteed outcomes.

## 5) Trigger Phrases

Use this skill when users ask things like:
- "simulate how people will respond to..."
- "what happens if we raise price by..."
- "which segments will churn/adopt/protest"
- "test these message variants before launch"
- "run scenario analysis with uncertainty"
- "why did this segment flip in simulation"

## 6) Example Requests (Illustrative Only)

These are examples, not defaults.

1. Policy: congestion pricing alternatives
- Ask: estimate compliance/backlash across income and commute-access segments.
- Shape: baseline + alternatives + equity cuts.

2. Public health messaging
- Ask: find least responsive groups and best message frame.
- Shape: same population, multiple message scenarios, compare adoption/sentiment.

3. SaaS pricing
- Ask: estimate churn/downgrade/stay under +10/+20/+30 price shifts.
- Shape: counterfactual suite + revenue-risk tradeoff.

4. Product launch
- Ask: predict enable/disable behavior for default-on AI feature.
- Shape: adoption outcomes + trust/privacy sensitivity sweep.

5. Crisis response
- Ask: compare apology-only vs refund vs policy-change response.
- Shape: trust recovery and negative word-of-mouth dynamics by segment.

6. Political messaging
- Ask: compare message resonance/backlash by ideology/economic exposure.
- Shape: frame variants + propagation differences.

7. Community planning
- Ask: simulate support/neutral/oppose response to development proposal.
- Shape: concern taxonomy + coalition risk.

8. Healthcare adoption
- Ask: model clinician switching under reimbursement changes.
- Shape: policy variants + adoption friction analysis.

9. Enterprise change
- Ask: simulate compliance/disengagement/attrition intent under policy shift.
- Shape: role/commute/trust segment breakdown.

10. Deep triage
- Ask: debug flat exposure curve or unstable seed outcomes.
- Shape: evidence-led root cause, minimal fix, rerun command.

## 7) Quick Execution Templates

1. Baseline + sensitivity
- 1 baseline
- 3 variants
- 3 seeds each
- 2 to 3 key segment cuts

2. Message shootout
- 1 population
- 3 to 5 message scenarios
- fixed config + seeds
- rank by primary KPI + stability

3. Decision brief inputs
- decision objective
- top findings
- segment impacts
- confidence/stability
- recommendation + caveats
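
For the "Baseline + sensitivity" template, 1 baseline plus 3 variants at 3 seeds each works out to 12 runs. A small sketch of enumerating that run matrix, with hypothetical scenario labels and seed values:

```python
from itertools import product

# Hypothetical labels: one baseline plus three variants.
scenarios = ["baseline", "variant_a", "variant_b", "variant_c"]
# Hypothetical seed values, reused across every scenario so runs are comparable.
seeds = [1, 2, 3]

# One entry per (scenario, seed) pair.
run_matrix = [
    {"scenario": scenario, "seed": seed}
    for scenario, seed in product(scenarios, seeds)
]

print(len(run_matrix))
```

Writing the matrix out before launching makes the run count, and therefore the cost, explicit up front.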

## 8) Capability to File Map

- Can Extropy model this? -> this file
- How to run it? -> `OPERATIONS.md`
- How to validate/fix/escalate? -> `QUALITY_TRIAGE_ESCALATION.md`
- How to analyze outcomes? -> `ANALYSIS_PLAYBOOK.md`
- How to write a decision report? -> `EXPERIMENT_REPORT_TEMPLATE.md`
63 changes: 63 additions & 0 deletions .claude/skills/extropy/references/EXPERIMENT_REPORT_TEMPLATE.md
@@ -0,0 +1,63 @@
# Experiment Report Template

Use this template for every completed experiment batch.

## 1. Decision Context

- Study name:
- Decision to support:
- Primary KPI/outcome:
- Constraints (budget, timeline, policy limits):

## 2. Experiment Setup

- Population description:
- Scenario description:
- Run set ID:
- Variants included:
- Seed policy (single seed or multi-seed):
- Model/provider config:

## 3. Headline Results

- Baseline outcome distribution:
- Most important segment deltas:
- Exposure dynamics summary:
- Stop condition and total timesteps:

## 4. Confidence and Stability

- Number of seeds:
- Between-seed variance for key outcomes:
- Stable findings (low variance):
- Unstable findings (high variance):
- Confidence statement (high/medium/low):

## 5. Why It Happened (Mechanism)

- Dominant drivers inferred from traces:
- Key peer influence patterns:
- Conviction/memory effects observed:
- Outlier trajectories and interpretation:

## 6. Cost and Operations

- Total token usage (pivotal/routine):
- Estimated cost:
- Runtime and bottlenecks:
- Any retries/errors/resume events:

## 7. Recommendations

1. Immediate decision recommendation
2. Risk caveats
3. Next experiment(s) to run
4. What would change your recommendation

## 8. Evidence Files

- `results/meta.json`
- `results/outcome_distributions.json`
- `results/by_timestep.json`
- `results/agent_states.json`
- `results/timeline.jsonl`