194 changes: 194 additions & 0 deletions llm/climate_negotiation/README.md
# Climate Negotiation - Mesa-LLM Example

A multi-agent simulation of international climate treaty negotiations in which six
country agents, each powered by an LLM, negotiate a shared emissions-reduction
target over multiple rounds.

## What This Model Demonstrates

| Mesa-LLM feature | How it appears in this model |
|---|---|
| `STLTMemory` | Short-term memory stores recent proposals and messages; long-term memory consolidates committed positions across rounds |
| `ReActReasoning` | Agents reason about their economic interests and negotiating position, then act |
| `speak_to` (inbuilt tool) | Direct diplomatic messaging between specific countries |
| Custom `@tool` functions | `make_proposal`, `accept_proposal`, `form_coalition`, `reject_and_counter` |
| `vision=-1` | Each agent observes all others, modelling a negotiating room with no spatial grid |

## Countries and Their Profiles

Data sources: IEA 2022 (emissions), World Bank 2023 (GDP).

| Country | Emissions (tCO₂/capita) | GDP/capita | Stance |
|---------|------------------------|------------|--------|
| USA | 14.0 | $76,000 | Supports action; insists that all major economies, especially China and India, match commitments; prefers market-based mechanisms |
| EU | 6.0 | $37,000 | High ambition (Fit for 55, 55% by 2030); pushes legally binding targets and developing-nation finance |
| China | 8.0 | $12,700 | Argues developed nations bear historical responsibility; supports long-term goals contingent on tech transfer and green finance |
| India | 2.0 | $2,500 | Defends common but differentiated responsibilities; energy access for 1.4 billion people is non-negotiable |
| Brazil | 2.8 | $10,400 | Emissions driven by deforestation, not fossil fuels; demands forest conservation credits count in treaty text |
| Russia | 12.5 | $15,000 | Accepts climate science; resists near-term targets that threaten fossil fuel revenues; open to long timelines |

## Negotiation Tools

```
speak_to(listener_ids, message) - targeted diplomatic message to specific parties
make_proposal(reduction%, year, reason) - formal proposal broadcast to all
accept_proposal(proposer_id, %, message) - formal acceptance; marks agent as treaty signatory
form_coalition(partner_ids, name) - build or expand an alliance
reject_and_counter(proposer_id, %, reason) - reject a proposal and broadcast a counter-offer
```
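As an illustration of what one of these tools does internally, here is a sketch of `make_proposal`'s core bookkeeping, written as a plain function. The attribute names (`current_pledge`, `proposals_made`, `total_proposals`) and the return format are assumptions for illustration; in `tools.py` the function would be registered via mesa-llm's `@tool` decorator rather than called directly.

```python
# Illustrative sketch only -- not the actual tools.py implementation.
def make_proposal(agent, reduction_pct: float, target_year: int, reason: str) -> str:
    """Record a formal proposal on the agent and model, then return the broadcast text."""
    agent.current_pledge = reduction_pct       # assumed agent attribute
    agent.proposals_made += 1                  # assumed per-agent counter
    agent.model.total_proposals += 1           # assumed model-level counter
    return (
        f"{agent.country_name} proposes a {reduction_pct:.0f}% reduction "
        f"by {target_year}: {reason}"
    )
```

The returned string is what the other agents would see broadcast into their memories.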

A **treaty is reached** when at least 2/3 of countries have formally called `accept_proposal`.
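The 2/3 rule can be sketched as a small predicate (a minimal sketch of the idea, not the actual `_treaty_reached()` from `model.py`):

```python
def treaty_reached(accepted_flags: list[bool], threshold: float = 2 / 3) -> bool:
    """True once the share of countries that formally accepted meets the threshold."""
    if not accepted_flags:
        return False
    return sum(accepted_flags) / len(accepted_flags) >= threshold
```

With six countries, four acceptances cross the line (4/6 >= 2/3), which matches the sample run below.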

## Sample Run Results

**With `openai/gpt-4o` - treaty reached in 6 rounds (~4 minutes)**

```
Round 1 Coalition-building. India forms "Developing Nations Unity" (China, Brazil).
EU builds a cross-bloc coalition. USA anchors EU + Russia.
Round 2 First proposals. USA: 30% (market mechanisms). EU: 40% by 2040. India: 20%.
Round 3 Counters. India rejects EU 40% -> 20%. EU meets India at 30%.
China: 20% contingent on tech transfer. Brazil rejects USA -> 25% + forest credits.
Round 4 EU accepts Brazil (25%). Russia accepts China (20%). 2/6 accepted.
Round 5 India accepts Brazil (25%). China updates to 25%. 3/6 accepted.
Round 6 USA accepts EU. Russia upgrades to 25%. TREATY REACHED - 4/6 ≥ 2/3

Final: USA 30% EU 30% India 25% Russia 25% (accepted)
China 25% Brazil 25% (pledged but held out for concessions)
```

**With `ollama/llama3.2` (local 3B model)**

The simulation loop runs without errors, but smaller local models produce weaker
emergent behaviour: repeated proposals with empty justifications and attempts to
use non-existent agent IDs. The code guards against both, but `llama3.2` is best
used for testing the simulation loop rather than for observing realistic diplomacy.
Use `gpt-4o-mini` or `gemini/gemini-2.0-flash` for meaningful negotiations.

## Robustness Against LLM Hallucinations

Two common failure modes are guarded in code:

- **Phantom agent IDs in `form_coalition`** - `partner_ids` are filtered against
the live agent set before being stored. Invalid IDs are dropped and logged as
`WARNING` in `climate_negotiation.log`.
- **Invalid `proposer_id` in `accept_proposal`** - the ID is validated before
recording an acceptance. If it doesn't match any agent, an error string listing
valid IDs is returned to the LLM so it can self-correct on its next step.

Every agent's step prompt also includes an explicit `VALID COUNTRY IDs` block so
models are less likely to invent IDs in the first place.
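The first guard amounts to filtering the requested IDs against the live agent set. A minimal sketch (function name and signature are illustrative, not the actual `tools.py` code):

```python
import logging

def filter_partner_ids(partner_ids, valid_ids, logger=None):
    """Keep only IDs that belong to live agents; log each dropped phantom ID."""
    logger = logger or logging.getLogger("climate_negotiation")
    kept = []
    for pid in partner_ids:
        if pid in valid_ids:
            kept.append(pid)
        else:
            logger.warning("form_coalition: dropped phantom agent id %r", pid)
    return kept
```

Returning the filtered list (rather than raising) lets the coalition still form with the valid partners.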

## Run Log

Each run writes a structured trace to `climate_negotiation.log` (configurable via
the `CLIMATE_LOG_FILE` environment variable). The log records:

- Round start/end with average pledge, total proposals, and treaty status
- Per-agent state (pledge, accepted, coalition) at the start and end of each round
- Every tool call with its arguments and outcome
- `WARNING` entries for any hallucinated IDs that were dropped
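A configurable log destination like this is typically wired up with a `FileHandler`. The environment variable name comes from this README; the handler and format wiring below are an assumption, not the example's actual code:

```python
import logging
import os

# Fall back to the default log file when CLIMATE_LOG_FILE is unset.
log_path = os.environ.get("CLIMATE_LOG_FILE", "climate_negotiation.log")

handler = logging.FileHandler(log_path)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("climate_negotiation")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
```

`WARNING` entries (e.g. dropped phantom IDs) then land in the same file as the round-by-round trace.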

## Setup

### 1. Install dependencies

```bash
pip install mesa-llm mesa solara python-dotenv rich
```

### 2. Set your API key

Create a `.env` file in `llm/climate_negotiation/`:

```
# For Gemini (free tier available)
GEMINI_API_KEY=your_key_here

# OR for OpenAI
OPENAI_API_KEY=your_key_here

# OR for Anthropic
ANTHROPIC_API_KEY=your_key_here

# OR for a local model via Ollama (loop testing only)
OLLAMA_API_BASE=http://localhost:11434
```

### 3. Run with Solara visualisation

```bash
cd llm/climate_negotiation
solara run app.py
```

### 4. Run headless (terminal only)

```bash
cd llm/climate_negotiation
python -m climate_negotiation.model
```

## Supported LLMs

Works with any LiteLLM-compatible model string:

| Model string | Notes |
|---|---|
| `gemini/gemini-2.0-flash` | Default; free tier, fast |
| `openai/gpt-4o` | Best emergent behaviour (tested ✓) |
| `openai/gpt-4o-mini` | Good balance of quality and cost |
| `anthropic/claude-haiku-4-5-20251001` | Capable, low latency |
| `ollama/llama3.2` | Local; suitable for loop testing only |

## File Structure

```
llm/climate_negotiation/ # example root
├── app.py # Solara visualisation entry point
├── README.md
└── climate_negotiation/ # Python package
├── __init__.py # triggers tool registration on import
├── agents.py # CountryAgent, country_tool_manager
├── tools.py # four custom @tool functions
└── model.py # ClimateNegotiationModel + COUNTRIES configs
```

## Visualisation

The Solara dashboard shows:

- **Pledge bar chart** - each country's current reduction commitment; bars turn green when a country accepts the treaty
- **Coalition status panel** - live table of pledge, acceptance status, coalition members, and proposals made
- **Pledge trajectories** - line chart of all countries' pledges across rounds
- **Time-series plots** - TotalProposals, AveragePledge, LargestCoalitionSize

## What to Watch For

- **Round 1–2**: Coalition-building; agents probe positions before making formal proposals
- **Round 3–4**: Formal proposals emerge; developing nations counter with differentiated targets and conditions
- **Round 5+**: Coalition pressure on holdouts; some countries accept, others counter-propose
- **Treaty achieved**: Green bars in the visualisation, `treaty_reached=True` in the log

## Extending This Example

**Try different LLMs**: Change `llm_model` in `app.py`. `gpt-4o` produces richer diplomatic
language; `gemini-2.0-flash` is faster and free.

**Add more countries**: Add a new dict to the `COUNTRIES` list in `model.py` and assign it
a system prompt encoding that country's real-world stance.

**Change the treaty threshold**: Edit `_treaty_reached()` in `model.py`
(currently requires a 2/3 majority).

**Use CoTReasoning**: Replace `ReActReasoning` with `CoTReasoning` in `app.py` to see
step-by-step chain-of-thought reasoning printed alongside each agent action.

**Swap memory type**: Replace `STLTMemory` (default) with `EpisodicMemory` for
importance-scored memory retrieval - useful to observe which proposals agents
consider most significant across many rounds.

## Related Work

- Deffuant, G. & Weisbuch, G. (2002). *Bounded confidence and social networks.*
- The `deffuant_weisbuch` example shows opinion convergence without LLM reasoning.
Compare its convergence speed with this model's negotiated consensus.
182 changes: 182 additions & 0 deletions llm/climate_negotiation/app.py
import logging
import warnings

import matplotlib.pyplot as plt
import pandas as pd
import solara
from climate_negotiation.agents import CountryAgent
from climate_negotiation.model import ClimateNegotiationModel
from dotenv import load_dotenv
from mesa.visualization import SolaraViz, make_plot_component
from mesa.visualization.utils import update_counter
from mesa_llm.reasoning.react import ReActReasoning

warnings.filterwarnings("ignore", category=UserWarning, module="pydantic.main")
logging.getLogger("pydantic").setLevel(logging.ERROR)

load_dotenv()

model_params = {
"rng": {
"type": "InputText",
"value": 42,
"label": "Random Seed",
},
"llm_model": {
"type": "Select",
"value": "gemini/gemini-2.0-flash",
"values": [
"gemini/gemini-2.0-flash",
"openai/gpt-4o-mini",
"openai/gpt-4o",
"anthropic/claude-haiku-4-5-20251001",
"ollama/llama3.2",
],
"label": "LLM Model",
},
"reasoning": ReActReasoning,
}

model = ClimateNegotiationModel(
reasoning=model_params["reasoning"],
llm_model=model_params["llm_model"]["value"],
rng=model_params["rng"]["value"],
)


def PledgeBarChart(model):
"""Bar chart of each country's current reduction pledge."""
update_counter.get()

countries = [a for a in model.agents if isinstance(a, CountryAgent)]

fig, ax = plt.subplots(figsize=(8, 4))

if not countries or all(a.current_pledge == 0 for a in countries):
ax.set_title("No pledges yet — click Step to begin")
ax.set_ylim(0, 100)
return solara.FigureMatplotlib(fig)

names = [a.country_name for a in countries]
pledges = [a.current_pledge for a in countries]
colors = ["#27ae60" if a.accepted_treaty else "#2980b9" for a in countries]

bars = ax.bar(names, pledges, color=colors, edgecolor="white", linewidth=0.8)
ax.axhline(y=30, color="#e67e22", linestyle="--", linewidth=1.4, label="30% target")
ax.axhline(y=50, color="#e74c3c", linestyle="--", linewidth=1.4, label="50% target")
ax.set_ylabel("Reduction Pledge (%)", fontsize=11)
ax.set_title(
f"Country Pledges (green = accepted treaty) — Round {model.steps}", fontsize=12
)
ax.set_ylim(0, 100)
ax.legend(loc="upper right", fontsize=9)

for bar, pledge in zip(bars, pledges):
if pledge > 0:
ax.text(
bar.get_x() + bar.get_width() / 2,
bar.get_height() + 1.2,
f"{pledge:.0f}%",
ha="center",
va="bottom",
fontsize=9,
fontweight="bold",
)

plt.tight_layout()
return solara.FigureMatplotlib(fig)


@solara.component
def CoalitionStatusPanel(model):
update_counter.get()

countries = [a for a in model.agents if isinstance(a, CountryAgent)]
id_to_name = {a.unique_id: a.country_name for a in countries}
treaty_count = sum(1 for a in countries if a.accepted_treaty)
treaty_reached = model._treaty_reached()

solara.Text(
f"Round {model.steps} · "
f"Accepted: {treaty_count}/{len(countries)} · "
f"Treaty: {'YES ✓' if treaty_reached else 'not yet'} · "
f"Proposals: {model.total_proposals} · "
f"Avg pledge: {model._average_pledge():.1f}%"
)

rows = []
for a in sorted(countries, key=lambda x: x.country_name):
coalition = [id_to_name.get(i, str(i)) for i in a.coalition_members]
rows.append(
{
"Country": a.country_name,
"Pledge": f"{a.current_pledge:.1f}%",
"Accepted": "✓" if a.accepted_treaty else "—",
"Coalition": ", ".join(coalition) or "—",
"Proposals": a.proposals_made,
}
)

solara.DataFrame(pd.DataFrame(rows))


def PledgeTrajectoriesChart(model):
"""Line chart of pledge trajectories over rounds."""
update_counter.get()

fig, ax = plt.subplots(figsize=(8, 4))

try:
df = model.datacollector.get_agent_vars_dataframe()
except Exception:
ax.set_title("No trajectory data yet")
return solara.FigureMatplotlib(fig)

if df.empty or "CurrentPledge" not in df.columns:
ax.set_title("No trajectory data yet — run a few steps")
return solara.FigureMatplotlib(fig)

id_to_name = {
a.unique_id: a.country_name for a in model.agents if isinstance(a, CountryAgent)
}

if isinstance(df.index, pd.MultiIndex):
pledge_df = df["CurrentPledge"].unstack(level=1)
pledge_df.columns = [id_to_name.get(c, str(c)) for c in pledge_df.columns]
else:
ax.set_title("Run more steps to see trajectories")
return solara.FigureMatplotlib(fig)

for country in pledge_df.columns:
ax.plot(
pledge_df.index, pledge_df[country], marker="o", label=country, linewidth=2
)

ax.set_xlabel("Round", fontsize=11)
ax.set_ylabel("Reduction Pledge (%)", fontsize=11)
ax.set_title("Pledge Trajectories by Country", fontsize=12)
ax.legend(loc="upper left", fontsize=9)
ax.set_ylim(0, 100)
plt.tight_layout()
return solara.FigureMatplotlib(fig)


TotalProposalsPlot = make_plot_component("TotalProposals")
AveragePledgePlot = make_plot_component("AveragePledge")
LargestCoalitionPlot = make_plot_component("LargestCoalitionSize")

# renderer=None: no spatial grid in this model, so we skip the default space view
page = SolaraViz(
model,
renderer=None,
components=[
PledgeBarChart,
CoalitionStatusPanel,
PledgeTrajectoriesChart,
TotalProposalsPlot,
AveragePledgePlot,
LargestCoalitionPlot,
],
model_params=model_params,
name="Climate Negotiation - Mesa-LLM",
)
3 changes: 3 additions & 0 deletions llm/climate_negotiation/climate_negotiation/__init__.py
from . import tools

__all__ = ["tools"]