Commit 3f1496c

vahid-ahmadi, nikhilwoodruff, MaxGhenis, and claude authored
Add AI agent blog post (#2766)
* Add AI agent blog post
* update
* edit image
* Apply review changes
* Rewrite AI agent blog post with results-focused narrative

  - Restructure post to lead with findings instead of tutorial format
  - Update title and description to be more specific
  - Rename slug from ai-agent-post to multi-agent-workflows-policy-research
  - Organize images with descriptive names (claude-code-interface.png, etc.)
  - Add section on addressing shortcomings with Skills and plugin ecosystem
  - Link to policyengine-claude plugin and PolicyEngine 2.0 event
  - Convert to UK spelling throughout
  - Strengthen active voice consistently
  - Remove hedging language and casual asides

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* Enhance writing-style-checker agent with blog post learnings

  Add specific patterns caught during AI agent post rewrite:
  - Hedging language examples ("probably", "planning to")
  - Weak intro phrases ("The interesting part:")
  - Subtle passive voice patterns ("showed up" vs active subjects)

  These patterns complement existing comprehensive style guidelines.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Nikhil Woodruff <nikhil.woodruff@outlook.com>
Co-authored-by: Max Ghenis <mghenis@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
1 parent 9886f8b commit 3f1496c

8 files changed

Lines changed: 75 additions & 0 deletions

.claude/agents/writing-style-checker.md

Lines changed: 15 additions & 0 deletions
@@ -281,6 +281,21 @@ grep "^#" src/posts/articles/[filename].md
 **Problem**: "The regressive policy burdens the poor"
 **Fix**: "The policy increases taxes for lower-income households by an average of £340"

+### Issue: Hedging Language
+
+**Problem**: "That's probably the right approach" or "We're planning to deploy"
+**Fix**: "That's the right approach" or "We're deploying" (remove unnecessary hedging)
+
+### Issue: Weak Intro Phrases
+
+**Problem**: "The interesting part: the system handles coordination"
+**Fix**: "The system handles coordination" (remove editorial labels)
+
+### Issue: Subtle Passive Voice
+
+**Problem**: "The time advantage showed up on repetitive tasks"
+**Fix**: "The system saved the most time on repetitive tasks" (clear active subject)
+
 ## Review Response Template

 ### For Policy Reports
Five binary files (92.3 KB, 39.3 KB, 34.1 KB, 69.8 KB, 32.7 MB); previews not shown.
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+When we finished our [UK carbon dividend analysis](https://policyengine.org/uk/research/uk-carbon-tax-dividend) this autumn, we wondered whether [Claude Code's](https://claude.com/claude-code) multi-agent system could have automated parts of the research workflow. We tested it on a similar policy analysis to find out.
+
+## Building a multi-agent research workflow
+
+Using Claude Code, we configured three [specialised agents](https://docs.claude.com/en/docs/claude-code/agents) to handle different parts of a policy analysis pipeline. The first agent fetches PolicyEngine microsimulation data from our GitHub repositories and documentation. The second writes analysis scripts using our [Python package](https://github.com/PolicyEngine/policyengine-uk). The third generates formatted research output from the results.
+
+We stored each agent as a markdown file in [`.claude/agents/`](https://docs.claude.com/en/docs/claude-code/agents#creating-custom-agents), defining their specific roles and tools. The data fetching agent got access to GitHub APIs and PolicyEngine documentation. The script writer could execute Python and access our policyengine-uk package. The report generator worked with markdown formatting and [Plotly](https://plotly.com/) visualisations.
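Reviewer note: the per-agent markdown files described in this paragraph can be sketched as follows. This is a minimal illustration, assuming the frontmatter layout (`name`, `description`, `tools`) described in the linked Claude Code agent documentation. The agent name, description, tool list, and prompt are hypothetical and are not the files used in this commit.

```python
from pathlib import Path
import tempfile


def render_agent_file(name, description, tools, prompt):
    """Build the markdown text for one agent: YAML frontmatter, then the prompt."""
    frontmatter = [
        "---",
        f"name: {name}",
        f"description: {description}",
        f"tools: {', '.join(tools)}",  # comma-separated tool names
        "---",
    ]
    return "\n".join(frontmatter) + "\n\n" + prompt.strip() + "\n"


def write_agent(project_root, name, description, tools, prompt):
    """Write the agent file to <project_root>/.claude/agents/<name>.md and return its path."""
    agents_dir = Path(project_root) / ".claude" / "agents"
    agents_dir.mkdir(parents=True, exist_ok=True)
    path = agents_dir / f"{name}.md"
    path.write_text(render_agent_file(name, description, tools, prompt), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical agent definition, written to a temporary directory
    # so that running the sketch does not touch a real project.
    with tempfile.TemporaryDirectory() as tmp:
        agent_path = write_agent(
            tmp,
            name="data-fetcher",
            description="Fetches PolicyEngine data and documentation for an analysis.",
            tools=["Read", "Grep", "WebFetch"],
            prompt="You collect the inputs a policy analysis needs and report where each came from.",
        )
        print(agent_path.read_text(encoding="utf-8"))
```

Keeping one file per agent means each role's tool access can be reviewed and versioned separately, which matches the split of responsibilities the post describes.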
+
+![](/images/posts/multi-agent-workflows-policy-research/claude-code-interface.png)
+
+Claude Code automatically routes tasks to the appropriate agent based on the request. When we asked it to analyse a carbon tax scenario, it handled the coordination without explicit instructions about which agent should do what.
+
+![](/images/posts/multi-agent-workflows-policy-research/create-agent-dialog.png)
+
+## What worked
+
+For standard distributional analyses (calculating poverty rates, Gini coefficients, and decile-level impacts), the workflow produced output matching our manual approach. The agents correctly structured PolicyEngine API calls and generated properly formatted charts.
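Reviewer note: the three measures named in this paragraph can be sketched without PolicyEngine itself. The functions below are illustrative and unweighted, whereas microsimulation output normally carries household weights, which this sketch ignores. The function names, the synthetic incomes, and the flat dividend are assumptions made for the example, not results from the post.

```python
def gini(incomes):
    """Unweighted Gini coefficient from the Lorenz curve (trapezoid rule)."""
    values = sorted(float(v) for v in incomes)
    n, total = len(values), sum(values)
    cumulative_share, twice_area = 0.0, 0.0
    for v in values:
        previous = cumulative_share
        cumulative_share += v / total
        twice_area += (previous + cumulative_share) / n
    return 1.0 - twice_area


def poverty_rate(incomes, poverty_line):
    """Share of households with income below the poverty line."""
    incomes = list(incomes)
    return sum(1 for v in incomes if v < poverty_line) / len(incomes)


def relative_change_by_decile(baseline, reform):
    """Relative change in total net income per baseline-income decile (1 = lowest).

    Assumes at least ten households and positive total income in every decile.
    """
    n = len(baseline)
    order = sorted(range(n), key=lambda i: baseline[i])
    changes = {}
    for d in range(10):
        members = order[d * n // 10:(d + 1) * n // 10]
        base = sum(baseline[i] for i in members)
        changes[d + 1] = sum(reform[i] for i in members) / base - 1.0
    return changes


if __name__ == "__main__":
    baseline = [1000.0 * k for k in range(1, 101)]    # 100 hypothetical households
    reform = [income + 500.0 for income in baseline]  # hypothetical flat dividend
    print(round(gini(baseline), 3), round(gini(reform), 3))
    print(poverty_rate(baseline, 20000.0), poverty_rate(reform, 20000.0))
    print(relative_change_by_decile(baseline, reform))
```

A flat payment raises lower deciles proportionally more than higher ones, which is the kind of decile-level pattern such an analysis reports.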
+
+The system saved the most time on repetitive tasks. Once we configured the agents for UK distributional analysis, we needed minimal adjustments to run similar analyses on other policies. This matters for our work, where we often model multiple versions of the same reform.
+
+The report generator consistently matched PolicyEngine's house style at least as well as our human writers typically do on a first attempt. It maintained active voice, avoided subjective adjectives, and led with quantitative findings.
+
+![](/images/posts/multi-agent-workflows-policy-research/policyengine-agents-list.png)
+
+## What didn't work
+
+Complex policy modelling exposed the limits. When calculations required understanding interactions between multiple benefit programmes, such as how Universal Credit's taper interacts with Housing Benefit phase-outs, the agents struggled. They would implement each programme correctly in isolation but miss the coordination logic.
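Reviewer note: a stylised sketch of how two programmes can each be correct in isolation yet give a different combined result once they are coordinated. It assumes one simple form of interaction, in which the second programme counts the first programme's award as income. All parameters and names are hypothetical and do not represent actual Universal Credit or Housing Benefit rules.

```python
def tapered_award(max_award, income, disregard, taper_rate):
    """Award that falls by `taper_rate` per unit of income above `disregard`, never below zero."""
    return max(0.0, max_award - taper_rate * max(0.0, income - disregard))


# Hypothetical monthly parameters for two stylised programmes (not actual UK rules).
FIRST = {"max_award": 400.0, "disregard": 100.0, "taper_rate": 0.55}
SECOND = {"max_award": 300.0, "disregard": 50.0, "taper_rate": 0.65}


def awards_in_isolation(earnings):
    """Each programme tapers against earnings only, ignoring the other."""
    first = tapered_award(income=earnings, **FIRST)
    second = tapered_award(income=earnings, **SECOND)
    return first, second


def awards_with_interaction(earnings):
    """Assumed coordination rule: the second programme counts the first award as income."""
    first = tapered_award(income=earnings, **FIRST)
    second = tapered_award(income=earnings + first, **SECOND)
    return first, second


if __name__ == "__main__":
    for earnings in (0.0, 200.0, 600.0):
        isolated = sum(awards_in_isolation(earnings))
        coordinated = sum(awards_with_interaction(earnings))
        print(earnings, round(isolated, 2), round(coordinated, 2))
```

Both versions reuse the same per-programme function, so unit tests on each programme alone would pass in either case. Only a test on the combined award would expose a missing coordination rule.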
+
+We found prompt precision mattered more than expected. An instruction to "analyse carbon tax impacts" produced generic output. We needed specifics: "Calculate relative change in net income by decile, poverty rates by demographic group, and constituency-level winners and losers." The agents don't infer methodological standards the way human researchers do.
+
+We hit an unexpected problem with code review. Our first approach used the script-writing agent to review its own code. This caught syntax errors but missed logical problems. We had to create a separate reviewer agent with different prompting to get meaningful code critique.
+
+![](/images/posts/multi-agent-workflows-policy-research/agents-in-progress.png)
+
+## What we learned
+
+Multi-agent workflows work best when the task has clear boundaries and established patterns. Our monthly policy briefs analysing government reforms fit this pattern. Research exploring new methodologies doesn't.
+
+The technology changes how we think about research automation. Instead of one AI that does everything, specialised agents with defined responsibilities produce more reliable output. This mirrors how a streamlined policy analysis shop might operate: one person pulls data, another writes Python, a third formats the report.
+
+These limitations reveal where human judgement remains essential. When policy interactions get complex, when methodological choices require expertise, or when political context matters for framing, the agents defer to human researchers. That's the right division of labour.
+
+## Addressing the shortcomings
+
+We're addressing these limitations by improving agent configurations and using Anthropic's recently launched [Skills](https://www.anthropic.com/news/skills) and [plugin ecosystem](https://www.anthropic.com/news/claude-code-plugins). Skills let us package specific PolicyEngine expertise, such as properly structuring benefit programme interactions or applying our methodological standards, into reusable components that Claude Code loads when needed.
+
+We're packaging these insights into the [policyengine-claude plugin](https://github.com/PolicyEngine/policyengine-mcp), which enables other researchers to use Claude Code with PolicyEngine more effectively. The plugin includes specialised agents, pre-configured skills for common policy analyses, and direct database access through the Model Context Protocol.
+
+We'll demonstrate this workflow at our [PolicyEngine 2.0 event](https://www.eventbrite.co.uk/e/policyengine-20-and-the-future-of-public-policy-analysis-tickets-1673065246189?aff=oddtdtcreator) on 3 November in London, where we'll show how these tools work together for real policy analysis. If you're interested in seeing multi-agent workflows in action for policy research, we'd encourage you to attend.
+
+We're deploying this workflow for routine policy updates where the analysis structure stays constant. It won't replace human researchers, but it will free them to focus on questions that require genuine policy expertise rather than technical execution.

src/posts/posts.json

Lines changed: 9 additions & 0 deletions
@@ -1495,5 +1495,14 @@
     "authors": ["vahid-ahmadi"],
     "filename": "uk-vat-thresholds.md",
     "image": "uk-vat-reeves.jpg"
+  },
+  {
+    "title": "Testing multi-agent AI workflows for policy research",
+    "description": "A multi-agent Claude Code system we built fared better on distributional analysis than benefit interactions; we're using these insights to enhance our forthcoming plugin.",
+    "date": "2025-10-28",
+    "tags": ["uk", "featured", "technology", "ai"],
+    "authors": ["vahid-ahmadi"],
+    "filename": "multi-agent-workflows-policy-research.md",
+    "image": "multi-agent-workflows-policy-research.png"
   }
 ]
