docs/capabilities.md: 70 additions & 27 deletions
@@ -108,6 +108,8 @@ Agents experience the crisis as it unfolds. Early timesteps have high uncertaint

 **Automatic detection**: If you define a single event with no timeline, Extropy treats it as static. If you provide multiple events or explicit timeline entries, it switches to evolving mode. You can override with `timeline_mode: static` or `timeline_mode: evolving`.

+**Background context**: Scenarios can include ambient framing that appears in every agent's prompt. "The US economy is in a mild recession. Unemployment was at 4.5% before the AI announcement. It's early spring." This shapes reasoning without being the focal event.
+
 **Timestep units are configurable**: Days, weeks, months - whatever fits your scenario. A crisis might unfold over days. A policy change might play out over months. A generational shift might span years.

 ---
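The automatic-detection rule in this hunk can be sketched in a few lines of Python. This is only an illustration of the rule as the text states it, not Extropy's implementation: the function name `resolve_timeline_mode`, its parameters, and the choice to treat an empty event list as static are all assumptions.

```python
from typing import Optional, Sequence


def resolve_timeline_mode(
    events: Sequence[dict],
    timeline: Sequence[dict] = (),
    override: Optional[str] = None,
) -> str:
    """Return "static" or "evolving" (hypothetical helper, not Extropy's API)."""
    # An explicit `timeline_mode` setting always wins over detection.
    if override is not None:
        if override not in ("static", "evolving"):
            raise ValueError(f"unknown timeline_mode: {override!r}")
        return override
    # A single event with no timeline entries is treated as static.
    # Assumption: zero events is also treated as static.
    if len(events) <= 1 and not timeline:
        return "static"
    # Multiple events or any explicit timeline entry means evolving mode.
    return "evolving"
```

Putting the override check first mirrors the text's ordering: detection is the default and the explicit setting is the escape hatch.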
@@ -190,10 +192,14 @@ Agents aren't goldfish. They remember.

 **Temporal labeling**: Prompts explicitly state the current timestep. "It's now Week 5 of this situation." Agents can reason about time - how long something has been going on, whether their views have been stable or shifting.

-**Emotional trajectory**: The system detects sentiment trends. "You started skeptical but have been warming up" or "Your enthusiasm has been fading over the past few weeks." This shapes agent self-awareness.
+**Emotional trajectory**: The system detects sentiment trends and renders them as narrative. "I've been feeling increasingly negative since this started" or "My mood has been fairly steady." This gives agents emotional continuity between timesteps instead of starting fresh every time.
+
+**Conviction self-awareness**: Agents know how firm they've been. "I've been firm about this since Week 1" or "I started certain but my certainty has been slipping." This enables commitment bias (consistent agents resist change) and openness (wavering agents are more receptive).

 **Intent accountability**: If an agent said they'd do something, they get reminded. "Last week you said you were going to look into alternatives. Has anything changed?" This prevents agents from making bold claims they never follow through on.

+**Repetition detection**: When agents keep thinking the same thing for multiple timesteps, they get nudged: "You've been thinking the same things for a while now. Has anything actually changed? Are you starting to doubt yourself? Have you done anything about it, or just thought about it?" This prevents stale convergence where agents repeat identical reasoning.
+
 **Conviction decay**: Strong opinions fade without reinforcement. A conviction score of 0.9 doesn't stay at 0.9 forever. Configurable decay rate means you can model how quickly certainty erodes.

 **Flip resistance**: High-conviction agents are harder to move. If someone is absolutely certain, new information needs to be compelling to shift them. This prevents unrealistic opinion swings.
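One possible reading of conviction decay and flip resistance, as a minimal Python sketch. The diff does not say which decay curve or flip rule Extropy uses, so the geometric decay, the simple threshold comparison, and the names `decay_conviction`, `should_flip`, and `evidence_strength` are all assumptions made for illustration.

```python
def decay_conviction(conviction: float, decay_rate: float, steps: int = 1) -> float:
    """Shrink a conviction score by `decay_rate` per unreinforced timestep.

    Assumption: geometric decay; the real curve is not specified in the source.
    """
    return conviction * (1.0 - decay_rate) ** steps


def should_flip(conviction: float, evidence_strength: float) -> bool:
    """Allow a position change only when the evidence outweighs the conviction.

    Assumption: both values share a 0..1 scale and are compared directly.
    """
    return evidence_strength > conviction
```

With a decay rate of 0.1, a conviction of 0.9 falls to about 0.81 after one unreinforced timestep, so an agent who stops hearing supporting arguments gradually becomes easier to move.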
@@ -216,24 +222,6 @@ You don't program social behavior explicitly. It emerges from the mechanics.

 ---

-## What You Get Out
-
-After simulation runs, you have:
-
-**Position distributions**: What fraction of the population supports, opposes, or remains neutral? Segmented by any attribute - how do young people differ from old? Urban from rural? High-income from low-income?
-
-**Sentiment trajectories**: How did emotional response evolve over time? Did initial negativity soften? Did enthusiasm fade?
-
-**Conviction patterns**: Where are the true believers vs. the persuadable middle? How does certainty correlate with position?
-
-**Sharing behavior**: Who's talking about this? Which demographics amplify vs. stay silent?
-
-**Reasoning traces**: The actual first-person reasoning each agent produced. Qualitative insight into why people think what they think.
-
-**Network effects**: How did information flow? Which communities adopted early? Where did resistance cluster?
-
----
-
 ## Agent Conversations

 Agents can talk to each other. When reasoning, an agent can choose to initiate a conversation with someone in their network.
@@ -248,10 +236,10 @@ Agents can talk to each other. When reasoning, an agent can choose to initiate a

 **Conflict resolution**: When multiple agents want to talk to the same target, priority determines who wins. Higher relationship weight wins - a partner request beats a coworker request. Deferred requests can execute in later timesteps.

-**Fidelity control**: The `--fidelity` flag controls conversation depth:
-- `low`: No conversations at all - just reasoning
-- `medium` (default): 2 turns (4 messages), top 1 conversation per agent
-- `high`: 3 turns (6 messages), up to 2 conversations per agent
+**Fidelity control**: The `--fidelity` flag controls conversation depth and prompt richness:
+- `low`: No conversations, last 5 memory traces, basic prompts
+- `medium` (default): 2 turns (4 messages), 1 conversation per agent, full memory traces
+- `high`: 3 turns (6 messages), up to 2 conversations per agent, explicit THINK vs SAY separation, repetition detection

 ---

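The conflict-resolution rule in this hunk (highest relationship weight wins, the rest are deferred) might look roughly like the following. This is a sketch under stated assumptions: `ConversationRequest`, `resolve_conflicts`, the numeric weights, and the tie-breaking rule (earliest request wins) are hypothetical and not taken from the Extropy codebase.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ConversationRequest:
    initiator: str
    target: str
    relationship_weight: float  # assumed scale, e.g. partner > coworker


def resolve_conflicts(
    requests: List[ConversationRequest],
) -> Tuple[List[ConversationRequest], List[ConversationRequest]]:
    """Pick one winning request per target; return (accepted, deferred)."""
    best: Dict[str, ConversationRequest] = {}
    for req in requests:
        current = best.get(req.target)
        # Strict ">" means the earliest request wins a tie (an assumption).
        if current is None or req.relationship_weight > current.relationship_weight:
            best[req.target] = req
    accepted = list(best.values())
    # Losing requests are kept so they can be retried in a later timestep.
    deferred = [req for req in requests if best[req.target] is not req]
    return accepted, deferred
```

For example, if a partner (weight 1.0) and a coworker (weight 0.4) both request the same target, the partner's request is accepted and the coworker's lands in the deferred list.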
@@ -289,8 +277,63 @@ To make it concrete, here are scenarios that work right now with no additional d
 - Social media dynamics where public discourse influences individuals
 - Any population, any country, any event, any outcome structure

-The constraints are:
-- No runtime fidelity/cost tradeoffs beyond merged pass and fidelity levels (yet)
-- No validation against historical ground truth (yet)
+---
+
+## Cognitive Depth at High Fidelity
+
+At `--fidelity high`, agents get additional cognitive architecture features:
+
+**THINK vs SAY separation**: Prompts explicitly distinguish between internal monologue (raw, honest thoughts) and public statement (socially filtered). Agents with high agreeableness might have a large gap between what they think and what they say - that's interesting data.
+
+**Repetition detection**: If an agent's reasoning is too similar to their previous timestep (>70% trigram overlap), they get a prompt nudge forcing them to go deeper. This prevents the stale convergence where agents just repeat "save money, learn AI, find backup work" verbatim for 5 timesteps.
+
+These features are always-on at high fidelity, adding cognitive realism without additional configuration.
+
+---
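The ">70% trigram overlap" check described for repetition detection can be illustrated as follows. The diff does not say whether trigrams are word-level or character-level, nor how overlap is normalized, so this sketch assumes word trigrams and measures the share of the current text's trigrams that already appeared in the previous timestep. The function names are hypothetical.

```python
from typing import Set, Tuple


def word_trigrams(text: str) -> Set[Tuple[str, ...]]:
    """Set of consecutive three-word windows (assumption: word-level trigrams)."""
    words = text.lower().split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def trigram_overlap(current: str, previous: str) -> float:
    """Fraction of the current text's trigrams also present in the previous text."""
    current_grams = word_trigrams(current)
    if not current_grams:
        return 0.0  # fewer than three words: nothing to compare
    return len(current_grams & word_trigrams(previous)) / len(current_grams)


def needs_nudge(current: str, previous: str, threshold: float = 0.7) -> bool:
    """True when the reasoning repeats enough to warrant a prompt nudge."""
    return trigram_overlap(current, previous) > threshold
```

Normalizing by the current text rather than using a symmetric measure such as Jaccard similarity is a design choice here: it flags an agent that repeats itself even if the earlier reasoning was much longer.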
+
+## Scenarios You Can Run Today
+
+To make it concrete, here are scenarios that work right now with no additional development:
+
+- US households responding to a streaming service price increase
+- Japanese employees adapting to return-to-office mandates
+- Indian consumers in multiple cities evaluating a new fintech app
+- Brazilian families weighing migration decisions
+- UK residents responding to congestion pricing expansion
+- German citizens reacting to energy policy changes
+- Mixed urban/rural populations facing a natural disaster
+- Professional networks processing industry disruption news
+- Religious communities responding to doctrinal changes
+- Parent networks reacting to school policy updates
+- Couples having conversations that shift their positions
+- Workplace discussions that change minds
+- Social media dynamics where public discourse influences individuals
+- Crisis scenarios that evolve over days/weeks with new developments
+- Any population, any country, any event, any outcome structure
+
+The core simulation engine is complete through Phase F - households, networks, timelines, conversations, social posts, cognitive architecture, fidelity tiers, and results export.
+
+---
+
+## What You Get Out
+
+After simulation runs, you have:
+
+**Position distributions**: What fraction of the population supports, opposes, or remains neutral? Segmented by any attribute - how do young people differ from old? Urban from rural? High-income from low-income?
+
+**Sentiment trajectories**: How did emotional response evolve over time? Did initial negativity soften? Did enthusiasm fade?
+
+**Conviction patterns**: Where are the true believers vs. the persuadable middle? How does certainty correlate with position?
+
+**Sharing behavior**: Who's talking about this? Which demographics amplify vs. stay silent?
+
+**Reasoning traces**: The actual first-person reasoning each agent produced. Qualitative insight into why people think what they think.
+
+**Network effects**: How did information flow? Which communities adopted early? Where did resistance cluster?
+
+**Conversation impact**: Which conversations changed minds most? Ranked by sentiment and conviction delta. See exactly what was said that moved people.
+
+**Elaborations export**: Flattened CSV with every agent's demographics, outcomes, and reasoning - ready for downstream analysis, clustering, or visualization.

-Those are Phases E and F. What's here now is Phases A through D - the core simulation engine with households, networks, timelines, conversations, and social posts.
+**Social posts timeline**: Every public statement made during the simulation, with agent name, position, and sentiment.
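To make the "ranked by sentiment and conviction delta" idea concrete, here is a small Python sketch. It is not the project's `compute_most_impactful_conversations`, whose signature is not shown in this diff. The record fields, the scoring formula (sum of absolute deltas), and the names `ConversationRecord`, `impact_score`, and `rank_by_impact` are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConversationRecord:
    conversation_id: str
    sentiment_before: float
    sentiment_after: float
    conviction_before: float
    conviction_after: float


def impact_score(record: ConversationRecord) -> float:
    """Sum of absolute sentiment and conviction change (assumed equal weighting)."""
    sentiment_delta = abs(record.sentiment_after - record.sentiment_before)
    conviction_delta = abs(record.conviction_after - record.conviction_before)
    return sentiment_delta + conviction_delta


def rank_by_impact(records: List[ConversationRecord], top_n: int = 10) -> List[ConversationRecord]:
    """Return the `top_n` conversations with the largest combined shift."""
    return sorted(records, key=impact_score, reverse=True)[:top_n]
```

Using absolute deltas treats a conversation that hardens someone's view as just as impactful as one that softens it. A signed score would be the alternative if only movement in one direction matters.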
 - ✅ Exploratory outcome export (`elaborations.csv` with agent demographics + all outcomes)
+- ✅ Conversation analysis (`compute_most_impactful_conversations` - ranks by sentiment+conviction delta)
+- ✅ Social posts export (`social_posts.json`)

-### Phase G: Backtesting Harness (~2 weeks)
+### Phase G: Backtesting Harness - DEFERRED

 **Files:** `tests/`, new `extropy/validation/` module

@@ -1199,4 +1199,4 @@ Ship this alone. Every simulation immediately feels more human, and the accounta

 **Total estimated: ~13 weeks for full v2.**

-Phase A alone (~1.5 weeks) is a massive improvement with zero risk - named agents, temporal awareness, full memory, intent accountability, and macro feedback. Each subsequent phase is independently shippable and testable.
+**Status: Phases A-F COMPLETE.** Phase G (backtesting) is deferred - requires curating ground-truth datasets for historical scenarios.