| 1 | +--- |
| 2 | +title: Demo 3 — Trove mandate (spend-policy layer) |
| 3 | +description: >- |
| 4 | + Demo 2's multi-merchant agent with guardrails. An |
| 5 | + onBeforePaymentCreation hook runs every outbound x402 payment |
| 6 | + through a plain-JS spend policy — merchant allowlist, per-call and |
| 7 | + per-merchant caps, session total, rate limit — before any MUSD |
| 8 | + moves on chain. |
| 9 | +topic: developers |
| 10 | +--- |
| 11 | + |
| 12 | +import { Aside, Steps, Tabs, TabItem } from '@astrojs/starlight/components'; |
| 13 | + |
| 14 | +[Demo 2](./trove-advisor/) showed a Claude tool-use agent paying three |
| 15 | +distinct merchants at per-request dynamic prices. It gave the agent |
| 16 | +real spending power without asking the agent, the model, or the |
| 17 | +reader to reason about **how much** the agent is allowed to spend. |
| 18 | + |
| 19 | +Demo 3 adds exactly that. The server, endpoints, merchants, prices, |
| 20 | +and agent prompt are unchanged from Demo 2. The one addition: a |
| 21 | +**single hook on the x402 client** — `onBeforePaymentCreation` — that |
| 22 | +runs every outbound payment through a plain-JS spend policy before |
| 23 | +`@x402/core` creates a signature. If the policy denies the payment, |
| 24 | +no permit2 signature is produced, **no MUSD moves on chain**, and the |
| 25 | +agent sees a structured tool error it can react to. |
| 26 | + |
| 27 | +The headline for readers: *add guardrails to an agentic buyer in one |
| 28 | +well-defined place.* |
| 29 | + |
| 30 | +## What you will build |
| 31 | + |
| 32 | +You will run the reference server + agent that ship in |
| 33 | +[`vativ/mezo-hack/apps/trove-advisor-mandate`](https://github.com/vativ/mezo-hack/tree/feat/agentic-3-trove-mandate/apps/trove-advisor-mandate) |
| 34 | +(currently on branch `feat/agentic-3-trove-mandate`). |
| 35 | + |
| 36 | +Same three paywalled endpoints as Demo 2 (`GET /oracle/btc`, |
| 37 | +`POST /risk/trove-assessment`, `GET /liquidations/queue`) with the |
| 38 | +same merchants and dynamic pricing. The new piece is one line on the |
| 39 | +x402 client: |
| 40 | + |
| 41 | +```typescript |
| 42 | +// apps/trove-advisor-mandate/src/agent.ts |
| 43 | +const xClient = new x402Client() |
| 44 | + .register('eip155:*', new ExactEvmScheme(signer)) |
| 45 | + .onBeforePaymentCreation(policy.asHook()); |
| 46 | +``` |
| 47 | + |
| 48 | +The hook receives `{ paymentRequired, selectedRequirements }` — the |
| 49 | +full 402 response and the single `accepts[]` entry the client chose |
| 50 | +to sign. It runs the policy, then returns either `undefined` (approve) |
| 51 | +or `{ abort: true, reason }` (deny). An aborted payment stops at the |
| 52 | +client — no signature, no on-chain transaction, no facilitator round |
| 53 | +trip. |
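
The approve/deny contract described above can be sketched in isolation. This is a minimal illustration, not the code from `src/policy.ts`: the `HookContext` and `HookResult` types are hypothetical local stand-ins (only the field names `paymentRequired`, `selectedRequirements`, `abort`, and `reason` come from the text), and the real hook may be asynchronous.

```typescript
// Hypothetical local types; the real ones live in @x402/core and may differ.
interface Requirements {
  payTo: string;
  amount: string; // assumed: amount in base units as a decimal string
}

interface HookContext {
  paymentRequired: { accepts: Requirements[] }; // the full 402 response (simplified)
  selectedRequirements: Requirements; // the accepts[] entry chosen for signing
}

type HookResult = undefined | { abort: true; reason: string };

// Build a hook that approves payments to known merchants and denies the rest.
function makeAllowlistHook(allowed: string[]) {
  const allowlist = new Set(allowed);
  return (ctx: HookContext): HookResult => {
    const payTo = ctx.selectedRequirements.payTo;
    if (!allowlist.has(payTo)) {
      // Deny: the client stops here, before any signature is created.
      return { abort: true, reason: `not_on_allowlist: ${payTo}` };
    }
    return undefined; // Approve: the client goes on to sign.
  };
}
```

A function of this shape is what `policy.asHook()` would be expected to hand to `onBeforePaymentCreation`; the demo's actual policy runs more checks than this single allowlist test.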
| 54 | + |
| 55 | +<Aside type="tip" title="Why this matters"> |
| 56 | +Agentic payments put signing authority behind an LLM loop. The |
| 57 | +model's reasoning can drift, its tools can be misused, its inputs |
| 58 | +can be adversarial. The policy layer is where a human-authored |
| 59 | +budget turns the spending surface from "whatever the model asks |
| 60 | +for" into "whatever the model asks for, within these rules." |
| 61 | +</Aside> |
| 62 | + |
| 63 | +## The default policy |
| 64 | + |
| 65 | +`src/policy.ts` defines `SpendPolicy` with six checks applied in |
| 66 | +order. The default instance wired into the demo: |
| 67 | + |
| 68 | +```typescript |
| 69 | +new SpendPolicy({ |
| 70 | + maxPerCall: { |
| 71 | + oracle: 0.001, // 2× actual 0.0005 — normal calls pass |
| 72 | + risk: 0.005, // up to ~6 stress scenarios |
| 73 | + liquidations: 0.002, // DELIBERATELY TIGHT — a limit=5 call (0.0025) is denied |
| 74 | + }, |
| 75 | + maxPerMerchant: { |
| 76 | + [ORACLE_PAYTO]: 0.005, |
| 77 | + [RISK_PAYTO]: 0.02, |
| 78 | + [HUNTER_PAYTO]: 0.01, |
| 79 | + }, |
| 80 | + maxTotal: 0.05, // 0.05 MUSD session cap |
| 81 | + merchantAllowlist: [ORACLE_PAYTO, RISK_PAYTO, HUNTER_PAYTO], |
| 82 | + timeWindow: 5 * 60_000, // 5-minute rolling window |
| 83 | + rateLimit: { |
| 84 | + liquidations: { max: 2, perMs: 60_000 }, // ≤ 2 liquidations calls per minute |
| 85 | + }, |
| 86 | +}); |
| 87 | +``` |
| 88 | + |
| 89 | +Applied to every outbound payment, in order: |
| 90 | + |
| 91 | +1. **`merchantAllowlist`** — hard reject if `payTo` is not in the |
| 92 | + allowlist. |
| 93 | +2. **`maxPerCall[endpoint]`** — reject if this single call's amount |
| 94 | + exceeds the per-endpoint cap. The endpoint is inferred from |
| 95 | + `resource.url`: `/oracle/…` → `oracle`, `/risk/…` → `risk`, |
| 96 | + `/liquidations/…` → `liquidations`. |
| 97 | +3. **`maxPerMerchant[payTo]`** — reject if the running cumulative |
| 98 | + spend to that merchant would exceed its cap. |
| 99 | +4. **`maxTotal`** — reject if the running session total would exceed |
| 100 | + the session cap. |
| 101 | +5. **`rateLimit[endpoint]`** — sliding window per endpoint; reject if |
| 102 | + recent-call count would exceed the limit. |
| 103 | +6. **`timeWindow`** — session counters auto-reset if the agent has |
| 104 | + been idle past the window. |
| 105 | + |
| 106 | +If all pass, the hook commits the spend intent (increments the |
| 107 | +cumulative counters + appends a timestamp) and returns `undefined` |
| 108 | +to approve. |
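
The ordering above can be sketched as a small self-contained class. This is an illustration under stated assumptions, not the `SpendPolicy` source: the class name `SketchPolicy`, the `check` signature, the `unknown_endpoint` reason, and the integer micro-MUSD arithmetic are all choices of this sketch. The other reason codes are the ones listed in the Troubleshooting table, and the idle reset is applied up front here for simplicity.

```typescript
type Endpoint = "oracle" | "risk" | "liquidations";
type Decision = { ok: true } | { ok: false; reason: string };

interface PolicyConfig {
  maxPerCall: Record<Endpoint, number>;
  maxPerMerchant: Record<string, number>;
  maxTotal: number;
  merchantAllowlist: string[];
  timeWindow: number;
  rateLimit: Partial<Record<Endpoint, { max: number; perMs: number }>>;
}

// Compare amounts as integer micro-MUSD to avoid floating-point drift.
const micro = (musd: number): number => Math.round(musd * 1e6);
const deny = (reason: string): Decision => ({ ok: false, reason });

// Infer the endpoint from the first path segment of the resource URL.
function endpointOf(url: string): Endpoint | undefined {
  const match = /^[a-z]+:\/\/[^/]+\/(oracle|risk|liquidations)\//.exec(url);
  return match ? (match[1] as Endpoint) : undefined;
}

class SketchPolicy {
  private cfg: PolicyConfig;
  private spentByMerchant = new Map<string, number>();
  private spentTotal = 0;
  private callTimes = new Map<Endpoint, number[]>();
  private lastActivity = 0;

  constructor(cfg: PolicyConfig) {
    this.cfg = cfg;
  }

  check(payTo: string, url: string, amountMusd: number, now: number): Decision {
    // timeWindow: reset session counters after an idle gap.
    if (now - this.lastActivity > this.cfg.timeWindow) {
      this.spentByMerchant.clear();
      this.spentTotal = 0;
    }
    this.lastActivity = now;

    const amount = micro(amountMusd);
    const endpoint = endpointOf(url);

    if (!this.cfg.merchantAllowlist.includes(payTo)) return deny("not_on_allowlist");
    if (endpoint === undefined) return deny("unknown_endpoint"); // sketch-only reason
    if (amount > micro(this.cfg.maxPerCall[endpoint])) return deny("per_call_cap_exceeded");

    const merchantSpent = this.spentByMerchant.get(payTo) ?? 0;
    const merchantCap = this.cfg.maxPerMerchant[payTo];
    if (merchantCap !== undefined && merchantSpent + amount > micro(merchantCap)) {
      return deny("merchant_cap_exceeded");
    }
    if (this.spentTotal + amount > micro(this.cfg.maxTotal)) return deny("session_total_exceeded");

    // Sliding window: count only calls newer than perMs.
    const limit = this.cfg.rateLimit[endpoint];
    const recent = (this.callTimes.get(endpoint) ?? []).filter(
      (t) => limit !== undefined && now - t < limit.perMs,
    );
    if (limit !== undefined && recent.length + 1 > limit.max) return deny("rate_limit_exceeded");

    // All checks passed: commit the spend intent.
    this.spentByMerchant.set(payTo, merchantSpent + amount);
    this.spentTotal += amount;
    this.callTimes.set(endpoint, [...recent, now]);
    return { ok: true };
  }
}
```

A denied call returns before the commit step, so it never touches the counters. That matches the demo summary, where the denied liquidations call leaves the policy spend at the total of the two approved calls.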
| 109 | + |
| 110 | +The `liquidations` per-call cap is deliberately tight: the default |
| 111 | +prompt asks for the top 5 troves, which costs `0.0005 × 5 = 0.0025` |
| 112 | +MUSD — over the `0.002` cap. That's what makes the policy fire |
| 113 | +visibly when you run the demo. |
| 114 | + |
| 115 | +## Prerequisites |
| 116 | + |
| 117 | +- **Node.js 20+** and **pnpm 9+**. Same as Demo 2. |
| 118 | +- **git**. |
| 119 | +- **Account A (Buyer) from the [Quickstart](./x402-quickstart/)** — |
| 120 | + funded with ≥ `0.003` MUSD to cover the default run (2 paid tool |
| 121 | + calls totalling `0.0029` MUSD). A 1,800 MUSD borrow from Quickstart |
| 122 | + Step 2 covers far more runs than you'll do. |
| 123 | +- **Permit2 approved for MUSD** on Account A. Already done if you |
| 124 | + completed [Demo 1 Step 1](./agentic-joke-buyer/#step-1-one-time-permit2-approval) |
| 125 | + or Demo 2. |
| 126 | +- **Account A's private key.** |
| 127 | +- **An Anthropic API key** for the Claude tool-use loop. |
| 128 | + |
| 129 | +## Step 1: Clone the demo and install |
| 130 | + |
| 131 | +```bash |
| 132 | +# Fresh clone: |
| 133 | +git clone https://github.com/vativ/mezo-hack.git |
| 134 | +cd mezo-hack |
| 135 | +git checkout feat/agentic-3-trove-mandate |
| 136 | +cd apps/trove-advisor-mandate |
| 137 | +pnpm install |
| 138 | +``` |
| 139 | + |
| 140 | +```bash |
| 141 | +# Or, reuse the existing clone from Demo 1 or Demo 2: |
| 142 | +cd path/to/mezo-hack |
| 143 | +git fetch && git checkout feat/agentic-3-trove-mandate |
| 144 | +cd apps/trove-advisor-mandate |
| 145 | +pnpm install |
| 146 | +``` |
| 147 | + |
| 148 | +<Aside type="note" title="Directory naming"> |
| 149 | +The **branch slug** in the bead is `demo-3-trove-mandate` and the |
| 150 | +page URL follows that. The **app directory** inside the repo is |
| 151 | +`apps/trove-advisor-mandate/` — reflecting that Demo 3 is Demo 2's |
| 152 | +advisor with a mandate layer added. Use the path above verbatim. |
| 153 | +</Aside> |
| 154 | + |
| 155 | +Demo 3's `package.json` pins `@x402/evm@2.10.0-mezo.7` through
| 156 | +`pnpm.overrides`, the same tarball as Demo 2. Only `@x402/evm`
| 157 | +needs the override, because the `onBeforePaymentCreation` hook is
| 158 | +part of `@x402/core`, not the EVM scheme.
| 159 | + |
| 160 | +## Step 2: Configure `.env` |
| 161 | + |
| 162 | +```bash |
| 163 | +cp .env.example .env |
| 164 | +``` |
| 165 | + |
| 166 | +Fill in the two secrets at the bottom: |
| 167 | + |
| 168 | +```bash |
| 169 | +CLIENT_PRIVATE_KEY=0xYOUR_ACCOUNT_A_PRIVATE_KEY_HERE |
| 170 | +ANTHROPIC_API_KEY=sk-ant-YOUR_KEY_HERE |
| 171 | +``` |
| 172 | + |
| 173 | +Everything else in `.env.example` is pre-filled for Mezo Testnet — |
| 174 | +the RPC, MUSD contract, facilitator, and the same three merchant |
| 175 | +addresses as Demo 2 (`ORACLE_PAYTO`, `RISK_PAYTO`, `HUNTER_PAYTO`). |
| 176 | + |
| 177 | +<Aside type="caution" title="Create a .gitignore before any commit"> |
| 178 | +`vativ/mezo-hack` has no root `.gitignore`. The Demo 3 app ships its |
| 179 | +own `.gitignore`, but if you plan to commit anything at the repo |
| 180 | +level, confirm `.env` is covered: |
| 181 | + |
| 182 | +```bash |
| 183 | +cat apps/trove-advisor-mandate/.gitignore # verify .env is listed |
| 184 | +# or, at the repo root: |
| 185 | +echo ".env" >> .gitignore   # append, so an existing file is never overwritten
| 186 | +``` |
| 187 | +</Aside> |
| 188 | + |
| 189 | +## Step 3: Start the server (Terminal 1) |
| 190 | + |
| 191 | +```bash |
| 192 | +pnpm server |
| 193 | +``` |
| 194 | + |
| 195 | +Demo 3's server is a byte-for-byte copy of Demo 2's server: the same
| 196 | +three endpoints, prices, merchants, port (`4402`), and startup
| 197 | +banner. It is copied into the app so that Demo 3 stands alone
| 198 | +without cross-app imports; see
| 199 | +[Demo 2](./trove-advisor/#step-3-start-the-server-terminal-1) for
| 200 | +the server-side explainer.
| 201 | + |
| 202 | +## Step 4: Run the agent (Terminal 2) and watch the policy fire |
| 203 | + |
| 204 | +```bash |
| 205 | +pnpm agent |
| 206 | +``` |
| 207 | + |
| 208 | +Expected output (condensed): |
| 209 | + |
| 210 | +``` |
| 211 | +[tool] get_btc_price({}) |
| 212 | + [paid] get_btc_price → tx 0xca055b…9baf (0.0005 MUSD → Merchant A) |
| 213 | +
| 214 | +[tool] assess_trove_risk({"collateralBtc":0.5,"debtMusd":20000,"scenarios":[10,20,30]}) |
| 215 | + [paid] assess_trove_risk → tx 0x285ebc…fb08 (0.0024 MUSD → Merchant B) |
| 216 | +
| 217 | +[tool] get_liquidation_queue({"limit":5}) |
| 218 | + [denied] get_liquidation_queue — policy blocked: |
| 219 | + Payment creation aborted: per_call_cap_exceeded: |
| 220 | + {"endpoint":"liquidations","amountMusd":0.0025,"capMusd":0.002} |
| 221 | +
| 222 | +=== Summary === |
| 223 | +Paid tool calls: 2 |
| 224 | +Denied tool calls: 1 |
| 225 | +Policy spend: 0.0029 MUSD total |
| 226 | +``` |
| 227 | + |
| 228 | +Two on-chain transactions settle (Merchant A + Merchant B). The |
| 229 | +third call — `get_liquidation_queue({ limit: 5 })` — is denied by |
| 230 | +the policy **before** any permit2 signature is created, so there is |
| 231 | +**no third transaction** on chain. The agent's running spend stops |
| 232 | +at `0.0029` MUSD. |
| 233 | + |
| 234 | +Claude sees the denial as a structured tool error and typically |
| 235 | +reports what it got from the two approved tools, often suggesting a |
| 236 | +workaround within budget — e.g. retry `get_liquidation_queue` with |
| 237 | +`limit=4` for a `0.002` MUSD call that the policy would approve. |
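
The workaround arithmetic can be checked with a short sketch. It assumes the liquidations price scales linearly at `0.0005` MUSD per trove (the figure used in the default-policy section) and that a call whose cost equals the cap is approved, since the policy rejects only amounts that exceed it. The constant and function names are hypothetical, and amounts are held as integer micro-MUSD so the comparison is exact.

```typescript
// Integer micro-MUSD (1e-6 MUSD) keeps the cap comparison exact.
const PRICE_PER_TROVE_MICRO = 500; // 0.0005 MUSD per trove (assumed linear pricing)
const LIQUIDATIONS_CAP_MICRO = 2000; // 0.002 MUSD, the default maxPerCall.liquidations

// Cost of one get_liquidation_queue call for a given limit.
function callCostMicro(limit: number): number {
  return limit * PRICE_PER_TROVE_MICRO;
}

// A call passes the per-call check when its cost does not exceed the cap.
function fitsUnderCap(limit: number): boolean {
  return callCostMicro(limit) <= LIQUIDATIONS_CAP_MICRO;
}

// Largest limit the default per-call cap allows.
function largestAffordableLimit(): number {
  return Math.floor(LIQUIDATIONS_CAP_MICRO / PRICE_PER_TROVE_MICRO);
}
```

With the default numbers, `limit=5` costs 2500 micro-MUSD and fails the check, while `limit=4` costs exactly 2000 and passes.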
| 238 | + |
| 239 | +## Verifying the policy actually blocks on chain |
| 240 | + |
| 241 | +The "no MUSD moves for the denied call" claim is what this demo |
| 242 | +lives or dies on. Verify independently: |
| 243 | + |
| 244 | +<Steps> |
| 245 | + |
| 246 | +1. Before the run, note Account A's MUSD balance on chain: |
| 247 | + |
| 248 | + ```bash |
| 249 | + cast call 0x118917a40FAF1CD7a13dB0Ef56C86De7973Ac503 \ |
| 250 | + "balanceOf(address)(uint256)" <account-A-address> \ |
| 251 | + --rpc-url https://rpc.test.mezo.org |
| 252 | + ``` |
| 253 | + |
| 254 | +2. Run `pnpm agent`. |
| 255 | + |
| 256 | +3. Re-check the balance. The delta should be **exactly** `0.0029`
| 257 | +   MUSD (= `500000000000000` oracle + `2400000000000000` risk = `2.9e15`
| 258 | +   wei), **not** `0.0054` MUSD, which is what you'd see if the
| 259 | +   denied call had also settled.
| 260 | + |
| 261 | +4. The two printed tx hashes should both resolve on |
| 262 | + [`explorer.test.mezo.org`](https://explorer.test.mezo.org) with |
| 263 | + MUSD `Transfer` logs pointing at Merchants A and B. Search |
| 264 | + Account A's address on the explorer; you should see **no third |
| 265 | + Transfer** from Account A corresponding to the denied call — |
| 266 | + because no transaction was ever submitted. |
| 267 | + |
| 268 | +</Steps> |
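
The balance comparison in step 3 can also be done programmatically. The sketch below is a hypothetical helper, not part of the demo repo. It assumes MUSD uses 18 decimals (consistent with the wei figures above) and that you paste in the two `balanceOf` readings yourself.

```typescript
// Assumes 18 decimals for MUSD.
const WEI_PER_MUSD = BigInt("1000000000000000000");

// Convert a decimal MUSD string such as "0.0005" to base units.
function toWei(musd: string): bigint {
  const [whole = "0", frac = ""] = musd.split(".");
  const fracWei = BigInt(frac.padEnd(18, "0").slice(0, 18));
  return BigInt(whole) * WEI_PER_MUSD + fracWei;
}

// True when the balance dropped by exactly the sum of the paid amounts.
function deltaMatches(before: bigint, after: bigint, paid: string[]): boolean {
  const expected = paid.reduce((sum, amount) => sum + toWei(amount), BigInt(0));
  return before - after === expected;
}
```

For the default run, `deltaMatches(before, after, ["0.0005", "0.0024"])` should hold. Adding `"0.0025"` to the list models the case where the denied call had settled, and it should not match.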
| 269 | + |
| 270 | +## Extending the policy |
| 271 | + |
| 272 | +`SpendPolicy` is plain data. Add new checks by dropping them into |
| 273 | +`SpendPolicy.check(…)` in the same shape as the existing six. |
| 274 | +Examples that fit naturally: |
| 275 | + |
| 276 | +- **Time-of-day windows.** Deny agentic spending between `00:00–06:00` |
| 277 | + UTC. |
| 278 | +- **Per-merchant velocity.** Slow an endpoint down after a spike. |
| 279 | +- **Per-resource cost caps.** No more than `0.01` MUSD per unique URL per
| 280 | +  session.
| 281 | +- **External allowlist lookup.** Call an in-house service to |
| 282 | + verify the merchant is still approved; cache with TTL. |
| 283 | +- **Escalation to a human.** Raise an approval prompt (Slack, email, |
| 284 | + CLI) if the projected running total would cross a threshold. |
| 285 | + |
| 286 | +All of these wire into the same single seam — the |
| 287 | +`onBeforePaymentCreation` hook. The signing pipeline, the facilitator, |
| 288 | +the on-chain settlement don't change. |
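
As one illustration of the first bullet, a time-of-day check might look like the sketch below. The `CheckResult` type, the function name, and the `outside_spending_hours` reason code are hypothetical; how the result is threaded into `SpendPolicy.check(…)` depends on the shape of the existing checks in `src/policy.ts`.

```typescript
type CheckResult = { ok: true } | { ok: false; reason: string };

// Deny when the UTC hour falls inside [startHour, endHour).
// Defaults mirror the 00:00 to 06:00 UTC example above.
function timeOfDayCheck(now: Date, startHour = 0, endHour = 6): CheckResult {
  const hour = now.getUTCHours();
  if (hour >= startHour && hour < endHour) {
    return { ok: false, reason: "outside_spending_hours" };
  }
  return { ok: true };
}
```

Passing `now` in as a parameter rather than calling `new Date()` inside the check keeps the function easy to test with fixed timestamps.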
| 289 | + |
| 290 | +## Troubleshooting |
| 291 | + |
| 292 | +| Symptom | Cause | Fix | |
| 293 | +|---|---|---| |
| 294 | +| `Payment creation aborted: not_on_allowlist` | Server's `accepts[].payTo` isn't in your `merchantAllowlist` | Expected for a merchant you explicitly didn't approve. If unexpected, double-check `.env` matches `policy.ts`'s address list (case-sensitive hex) | |
| 295 | +| `Payment creation aborted: per_call_cap_exceeded` | This single call's amount > the endpoint's `maxPerCall` | If intended (like the default `liquidations` block), no action — that's the demo story. If unintended, raise the cap in `policy.ts` or ask the agent for a smaller `limit`/`scenarios.length` | |
| 296 | +| `Payment creation aborted: merchant_cap_exceeded` | Running cumulative spend to one merchant crossed `maxPerMerchant` | Raise that merchant's cap, or let `timeWindow` (5 min) reset the counters | |
| 297 | +| `Payment creation aborted: session_total_exceeded` | Running session total crossed `maxTotal` | Raise `maxTotal`, wait out `timeWindow`, or restart the agent process to reset counters | |
| 298 | +| `Payment creation aborted: rate_limit_exceeded` | More than `max` calls to an endpoint within `perMs` | Back off; the sliding window releases the lock naturally | |
| 299 | +| `CLIENT_PRIVATE_KEY environment variable is required` | `.env` missing or in the wrong directory | `cp .env.example .env` inside `apps/trove-advisor-mandate/` | |
| 300 | +| `WARNING: mUSD not approved for Permit2` | Account A hasn't granted Permit2 allowance | Apply [Demo 1 Step 1](./agentic-joke-buyer/#step-1-one-time-permit2-approval) | |
| 301 | +| `does not provide an export named 'DEFAULT_STABLECOINS'` | `pnpm.overrides` not forcing `@x402/evm` through the preview tarball | Confirm the overrides block pins `@x402/evm@2.10.0-mezo.7`; rerun `pnpm install` | |
| 302 | +| Two paid txs in Summary but three on the explorer | The denied call somehow still settled | Something is bypassing the hook. Confirm `onBeforePaymentCreation(policy.asHook())` is chained on the `x402Client` you actually pass to `wrapFetchWithPayment`, not on a second client instance | |
| 303 | + |
| 304 | +## Security |
| 305 | + |
| 306 | +- **Throwaway testnet key only.** The policy enforces a spending |
| 307 | + ceiling but the key itself is still a signing key — rotate it if |
| 308 | + the clone ends up on a shared machine. |
| 309 | +- **The policy is not a sandbox.** It runs in the same process as |
| 310 | + the agent and the signer. Anyone with code execution on that host |
| 311 | + can bypass it. Use OS-level guards (separate user, container, |
| 312 | + VM) for anything beyond a local demo. |
| 313 | +- **Policy changes are authoritative.** Edits to `policy.ts` take |
| 314 | + effect on the next agent start with no audit trail. For |
| 315 | + production, load the policy from a signed config, a KMS-guarded |
| 316 | + store, or a remote approval service instead of a local `.ts` file. |
| 317 | +- **Anthropic API key.** Treat it like any paid API credential. |
| 318 | + |
| 319 | +## See also |
| 320 | + |
| 321 | +- [Demo 2 — Trove advisor](./trove-advisor/). Same server + agent, |
| 322 | + **without** the policy hook. Start here if you want to see the |
| 323 | + full spend happen first. |
| 324 | +- [Demo 1 — Agentic joke buyer](./agentic-joke-buyer/). Single |
| 325 | + merchant, flat price, no LLM. |
| 326 | +- [Quickstart](./x402-quickstart/). Account A setup + MUSD minting. |
| 327 | +- [`vativ/mezo-hack/apps/trove-advisor-mandate`](https://github.com/vativ/mezo-hack/tree/feat/agentic-3-trove-mandate/apps/trove-advisor-mandate). |
| 328 | + Full source — server, agent, policy, ABIs, `.env.example`. |
| 329 | +- [Uniswap Permit2](https://github.com/Uniswap/permit2). The |
| 330 | + allowance contract the EVM x402 scheme uses for MUSD transfers. |