Commit fd3c68c

feat(a2a): unify on Pro 2.5 / us-central1 + appspot runtime SA (W9.2)
Fixes the 2026-04-30 Telegram failure where the A2A orchestrator returned a brief whose every consult had failed with 403 PERMISSION_DENIED. Root cause: every A2A engine was deployed with the default Reasoning Engine Service Agent, which lacks aiplatform.user, so engine→engine calls 403d. Bot-as-caller worked (appspot SA on Cloud Run); engine-as-caller didn't (different runtime identity).

Bundles three coupled changes:

deploy_a2a.py
- New --service_account flag; defaults to APPSPOT_SA constant (gcp-cits-ccat-poc-d4d2@appspot.gserviceaccount.com, the only enabled SA on this project with roles/editor).
- --region default flipped from asia-southeast1 to us-central1.
- Lifted the env-var auto-forward block to a module-level AUTO_FORWARD_ENV_VARS constant. Added LEVEL_2_*, LEVEL_2B_*, LEVEL_3_*, LEVEL_4_* + LEVEL_REGION (orchestrator routes all 5).
- truststore.inject_into_ssl() at module top so deploys work on NTU's TLS-inspecting network where certifi's bundle isn't trusted.

Model bumps Flash 2.5 -> Pro 2.5 across every A2A sub-agent
- level_1_agent: root.
- level_2_agent: classify, quick_answerer, task_planner, researcher, schedule_writer.
- level_2b_agent: classify, greet_user, bug_handler, billing_handler, feature_handler.
- level_3_agent: search_agent, analyst_agent, writer_agent, root.
- level_4_agent: data_fetcher, analyst (code-executor; gotcha #21 override, verified locally not to hang), report_writer, agent_creator (BuiltInPlanner block removed; Pro has native thinking on by default), root.
- a2a_orchestrator: chart_agent (code-executor; gotcha #21 override), writer_agent. Root was already Pro 2.5.

remote_tools.py defaults
- a2a_orchestrator/remote_tools.py: _LEVEL_REGION default asia-southeast1 -> us-central1.
- level_4_agent/remote_tools.py: _LEVEL_1_REGION default asia-southeast1 -> us-central1.

scripts/local_smoke.py
- New file. InMemoryRunner-based local probe for any A2A agent.
- 90s hang threshold catches the Pro+BuiltInCodeExecutor signature from CLAUDE.md gotcha #21. All 6 W9.2 probes ran < 60s; gotcha did NOT reproduce on this ADK 2.0 path.

3.x Gemini models (gemini-3.x-pro-preview, gemini-3.x-flash-preview, gemini-3.1-flash-lite-preview) verified gated on this project as of 2026-05-01: they return 404 NOT_FOUND for generateContent in both us-central1 and asia-southeast1. Pinned to 2.5 family until per-project preview access is granted.

Resource IDs are unchanged at this commit. The 6 engines themselves are redeployed in a follow-up step; the W9.2 plan documents the delete-then-redeploy sequence at new features/17-a2a-orchestrator-403-fix.md in the swarm repo.
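The 90s hang threshold described for scripts/local_smoke.py can be pictured as a timeout wrapped around a single agent turn. The sketch below is not the contents of that file, which this page does not show. `probe_with_hang_threshold` and `fake_agent_turn` are hypothetical names, and the stub coroutine stands in for whatever the real script does with an ADK InMemoryRunner.

```python
import asyncio
import time
from typing import Awaitable, Callable

HANG_THRESHOLD_S = 90.0  # threshold quoted in the commit message


async def probe_with_hang_threshold(
    run_turn: Callable[[], Awaitable[str]],
    threshold_s: float = HANG_THRESHOLD_S,
) -> dict:
    """Run one agent turn and report a hang if it exceeds threshold_s."""
    started = time.monotonic()
    try:
        reply = await asyncio.wait_for(run_turn(), timeout=threshold_s)
    except asyncio.TimeoutError:
        # The turn was cancelled after threshold_s: treat it as the hang signature.
        return {"status": "hang", "elapsed_s": time.monotonic() - started, "reply": None}
    return {"status": "ok", "elapsed_s": time.monotonic() - started, "reply": reply}


async def fake_agent_turn(delay_s: float = 0.01) -> str:
    # Hypothetical stand-in for a real runner call; it only sleeps.
    await asyncio.sleep(delay_s)
    return "pong"


if __name__ == "__main__":
    print(asyncio.run(probe_with_hang_threshold(fake_agent_turn)))
```

A probe of this shape fails fast before a cloud deploy: a turn that never returns is reported as `hang` once the threshold passes, rather than blocking the shell for minutes.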
1 parent cadedd4 commit fd3c68c

10 files changed

Lines changed: 202 additions & 65 deletions

a2a_orchestrator/agent.py

Lines changed: 6 additions & 2 deletions
@@ -149,7 +149,10 @@ class WriterInput(BaseModel):
     # Flash + BuiltInPlanner per Level 4 line ~252 ("the proven combination"
     # for Flash on tool-heavy code-execution tasks). Pro on a
     # BuiltInCodeExecutor leaf can hang 6+ min under AFC — Level 4 gotcha #21.
-    model="gemini-2.5-flash",
+    # W9.2 (Simon 2026-05-01): override the gotcha — try Pro 2.5 here. Local
+    # smoke test in plan §5.4 catches the hang signature before deploy; if
+    # it reproduces, revert this line to "gemini-2.5-flash".
+    model="gemini-2.5-pro",
     planner=BuiltInPlanner(
         thinking_config=types.ThinkingConfig(include_thoughts=True),
     ),
@@ -231,7 +234,8 @@ class WriterInput(BaseModel):
 
 writer_agent = Agent(
     name="writer_agent",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — all A2A sub-agents on Pro per Simon 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Synthesises consulted findings (and optionally a chart "
         "description) into a Markdown-formatted report. Final node — "

a2a_orchestrator/remote_tools.py

Lines changed: 2 additions & 2 deletions
@@ -25,8 +25,8 @@
 
 logger = logging.getLogger(__name__)
 
-# All five Level engines live in asia-southeast1.
-_LEVEL_REGION = os.environ.get("LEVEL_REGION", "asia-southeast1")
+# All five Level engines live in us-central1 (W9.2 — Pro 2.5 unification).
+_LEVEL_REGION = os.environ.get("LEVEL_REGION", "us-central1")
 _PROJECT_NUMBER = os.environ.get("LEVEL_PROJECT_NUMBER", "888142536377")
 
 # Defaults are the post-Phase-A engine IDs (verified 2026-04-28). Override
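The region and project-number defaults in this hunk presumably end up in a full engine resource name. The sketch below assumes the standard Vertex AI `projects/.../locations/.../reasoningEngines/...` layout. `engine_resource_name` is a hypothetical helper, not a function from remote_tools.py, and it reads the environment at call time, whereas the module reads it once at import.

```python
import os
from typing import Mapping, Optional


def engine_resource_name(
    engine_id: str,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Build a reasoning-engine resource name from env overrides or defaults."""
    env = os.environ if environ is None else environ
    # Same env var names and fallbacks as the diff above.
    region = env.get("LEVEL_REGION", "us-central1")
    project_number = env.get("LEVEL_PROJECT_NUMBER", "888142536377")
    return (
        f"projects/{project_number}/locations/{region}"
        f"/reasoningEngines/{engine_id}"
    )


if __name__ == "__main__":
    # "1234567890" is a placeholder engine ID, not a real deployment.
    print(engine_resource_name("1234567890", {}))
    print(engine_resource_name("1234567890", {"LEVEL_REGION": "asia-southeast1"}))
```

Because the region is part of the resource name, a caller still defaulting to asia-southeast1 would address an engine that no longer exists there after the move. That is why the default flips in the same commit as the deploy region.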

deploy_a2a.py

Lines changed: 54 additions & 21 deletions
@@ -27,6 +27,19 @@
 """
 from __future__ import annotations
 
+# truststore must be injected BEFORE any HTTPS-using import (vertexai,
+# google-auth, requests, etc.) so all SSL goes through the Windows trust
+# store. Required on NTU's network where TLS inspection injects a
+# corporate root CA that's not in certifi's bundle.
+try:
+    import truststore  # type: ignore
+    truststore.inject_into_ssl()
+except ImportError:
+    # truststore is optional — required only when running against a
+    # network that does TLS inspection (e.g., NTU). On other networks
+    # certifi handles validation fine.
+    pass
+
 import argparse
 import importlib
 import os
@@ -43,6 +56,27 @@
 from vertexai.preview.reasoning_engines.templates.a2a import create_agent_card
 
 PROJECT = "gcp-cits-ccat-poc-d4d2"
+APPSPOT_SA = f"{PROJECT}@appspot.gserviceaccount.com"
+
+# Env vars auto-forwarded into the deployed engine's runtime container if
+# present in the deploy shell. Lifted from main()'s body to a module
+# constant so tests can import + assert against it (W9.2 §6.3.5).
+AUTO_FORWARD_ENV_VARS = (
+    # gahmen-mcp toolset (level_4_agent's data_fetcher_agent reads at import).
+    "SMITHERY_API_KEY",
+    "SMITHERY_GAHMEN_URL",
+    # Per-Level A2A peer routing (orchestrator + level_4 consume these).
+    # All Levels live in us-central1 post-W9.2 (was asia-southeast1).
+    "LEVEL_1_A2A_ENGINE_ID", "LEVEL_1_A2A_REGION",
+    "LEVEL_2_A2A_ENGINE_ID", "LEVEL_2_A2A_REGION",
+    "LEVEL_2B_A2A_ENGINE_ID", "LEVEL_2B_A2A_REGION",
+    "LEVEL_3_A2A_ENGINE_ID", "LEVEL_3_A2A_REGION",
+    "LEVEL_4_A2A_ENGINE_ID", "LEVEL_4_A2A_REGION",
+    # Generic project / region overrides (orchestrator's remote_tools reads
+    # LEVEL_REGION as the cross-Level default).
+    "LEVEL_PROJECT_NUMBER",
+    "LEVEL_REGION",
+)
 
 
 def _executor_builder(root_agent):
@@ -71,7 +105,19 @@ def main() -> None:
     parser = argparse.ArgumentParser()
     parser.add_argument("module", help="Agent package, e.g. level_1_agent")
     parser.add_argument("--display", required=True, help='Display name, e.g. "Level 1 (A2A)"')
-    parser.add_argument("--region", default="asia-southeast1")
+    # W9.2 default: us-central1 (Pro 2.5 lives here; was asia-southeast1).
+    parser.add_argument("--region", default="us-central1")
+    parser.add_argument(
+        "--service_account",
+        default=APPSPOT_SA,
+        help=(
+            "Runtime SA for the deployed engine. Default: appspot SA "
+            "(the only enabled SA on this project with roles/editor). "
+            "Without this, Vertex assigns the default Reasoning Engine "
+            "Service Agent — which lacks aiplatform.user, so any "
+            "engine→engine agent_engines.get() call 403s. See W9.2 plan §3."
+        ),
+    )
     parser.add_argument(
         "--description",
         default="ADK agent exposed via A2A on Vertex Agent Engine.",
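The flag defaults in this hunk can be exercised in isolation. The sketch below rebuilds only the arguments relevant to W9.2. `build_parser` is a hypothetical wrapper (deploy_a2a.py builds its parser inline in `main()`), and the override address is a made-up placeholder, not a service account known to exist on the project.

```python
import argparse

PROJECT = "gcp-cits-ccat-poc-d4d2"
APPSPOT_SA = f"{PROJECT}@appspot.gserviceaccount.com"


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical factory holding the W9.2-relevant flags only."""
    parser = argparse.ArgumentParser()
    parser.add_argument("module")
    parser.add_argument("--display", required=True)
    parser.add_argument("--region", default="us-central1")
    parser.add_argument("--service_account", default=APPSPOT_SA)
    return parser


# No --service_account given: the appspot SA is used.
default_args = build_parser().parse_args(
    ["level_1_agent", "--display", "Level 1 (A2A)"]
)

# Explicit override with a placeholder address.
custom_args = build_parser().parse_args(
    [
        "level_1_agent",
        "--display", "Level 1 (A2A)",
        "--service_account", "deployer@example.iam.gserviceaccount.com",
    ]
)

if __name__ == "__main__":
    print(default_args.service_account, default_args.region)
    print(custom_args.service_account)
```

Defaulting the flag, rather than requiring it, means an operator who forgets the flag still gets a runtime identity that can call peer engines.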
@@ -160,40 +206,27 @@ def main() -> None:
         "google-adk[a2a]>=2.0.0b1,<3.0.0",
     ]
 
-    # Auto-forward env vars that the agent might need at runtime.
-    #   SMITHERY_API_KEY     — gahmen-mcp toolset gate (level_4_agent's
-    #                          data_fetcher_agent reads it at import time).
-    #   SMITHERY_GAHMEN_URL  — override for the Smithery server URL.
-    #   LEVEL_1_A2A_*        — Level 1 peer-A2A target for level_4_agent's
-    #                          consult_level_1 tool. The defaults baked
-    #                          into remote_tools.py work for the canonical
-    #                          asia-southeast1 Phase 7 deploy; override
-    #                          via these env vars if Level 1 has been
-    #                          redeployed to a new ID/region.
-    # If the deploy shell has any of these set, bake them into the
-    # deployed engine's container. Anything not set falls back to the
-    # in-code defaults (or the agent runs without that capability).
+    # Auto-forward env vars that the agent might need at runtime. Source of
+    # truth is AUTO_FORWARD_ENV_VARS at module top — single place to add new
+    # ones. Anything not set in the deploy shell falls back to the in-code
+    # defaults (or the agent runs without that capability).
     env_vars: dict[str, str] = {}
-    for name in (
-        "SMITHERY_API_KEY",
-        "SMITHERY_GAHMEN_URL",
-        "LEVEL_1_A2A_ENGINE_ID",
-        "LEVEL_1_A2A_REGION",
-        "LEVEL_1_A2A_PROJECT_NUMBER",
-    ):
+    for name in AUTO_FORWARD_ENV_VARS:
         value = os.environ.get(name)
         if value:
             env_vars[name] = value
     if env_vars:
         print(f"Forwarding {len(env_vars)} env var(s) to engine: {sorted(env_vars)}")
 
     print(f"Deploying {args.module} to {args.region} as {args.display!r} ...")
+    print(f"Runtime SA: {args.service_account}")
     remote = agent_engines.create(
         agent_engine=a2a_app,
         requirements=requirements,
         extra_packages=[args.module],  # uploads e.g. ./level_1_agent
         display_name=args.display,
         env_vars=env_vars or None,
+        service_account=args.service_account,
     )
     print(f"\n✅ Deployed: {remote.resource_name}")
     print(
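The comment on AUTO_FORWARD_ENV_VARS says the tuple was lifted to module level so tests can import and assert against it. The sketch below shows one possible shape for such a check. It copies the tuple instead of importing deploy_a2a so that it stays self-contained. `collect_forwarded_env` and `test_all_levels_are_routable` are hypothetical names that mirror the loop in `main()`, not code from the W9.2 plan.

```python
from typing import Mapping

# Copied from the diff above; a real test would import it from deploy_a2a.
AUTO_FORWARD_ENV_VARS = (
    "SMITHERY_API_KEY",
    "SMITHERY_GAHMEN_URL",
    "LEVEL_1_A2A_ENGINE_ID", "LEVEL_1_A2A_REGION",
    "LEVEL_2_A2A_ENGINE_ID", "LEVEL_2_A2A_REGION",
    "LEVEL_2B_A2A_ENGINE_ID", "LEVEL_2B_A2A_REGION",
    "LEVEL_3_A2A_ENGINE_ID", "LEVEL_3_A2A_REGION",
    "LEVEL_4_A2A_ENGINE_ID", "LEVEL_4_A2A_REGION",
    "LEVEL_PROJECT_NUMBER",
    "LEVEL_REGION",
)


def collect_forwarded_env(environ: Mapping[str, str]) -> dict[str, str]:
    """Keep only allow-listed variables that are set and non-empty."""
    return {
        name: environ[name]
        for name in AUTO_FORWARD_ENV_VARS
        if environ.get(name)
    }


def test_all_levels_are_routable() -> None:
    # The orchestrator routes to five Levels; each needs an ID and a region.
    for level in ("1", "2", "2B", "3", "4"):
        assert f"LEVEL_{level}_A2A_ENGINE_ID" in AUTO_FORWARD_ENV_VARS
        assert f"LEVEL_{level}_A2A_REGION" in AUTO_FORWARD_ENV_VARS


if __name__ == "__main__":
    test_all_levels_are_routable()
    shell = {"LEVEL_REGION": "us-central1", "SMITHERY_API_KEY": "", "HOME": "/root"}
    print(collect_forwarded_env(shell))
```

The removed inline tuple forwarded LEVEL_1_A2A_PROJECT_NUMBER, and the new constant does not list it. A test of this kind would be a natural place to record whether that omission is intended.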

level_1_agent/agent.py

Lines changed: 3 additions & 8 deletions
@@ -49,14 +49,9 @@
 
 root_agent = Agent(
     name="level_1_agent",
-    # Was `gemini-3.1-flash-lite-preview` (preview alias resolves only via
-    # GOOGLE_CLOUD_LOCATION=global). Switched to `gemini-2.5-flash` because
-    # Vertex Agent Engine deploys force-overwrite the location to the
-    # engine's region (templates/a2a.py:241-245) and the preview alias 404s
-    # in regional endpoints like asia-southeast1. `gemini-2.5-flash` works
-    # in both `global` and `asia-southeast1`, so the local `adk run` path
-    # is unaffected. See DEPLOYMENT_NOTES.md "Phase 7" for context.
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated on this project per
+    # audit 2026-05-01). Was Flash 2.5 in asia-southeast1 (W9 Phase A).
+    model="gemini-2.5-pro",
     description=(
         "A connected problem-solver that uses Google Search to answer"
         " questions requiring real-time information."

level_2_agent/agent.py

Lines changed: 10 additions & 5 deletions
@@ -183,7 +183,8 @@ def anchor_today(ctx: Context):
 
 classify = Agent(
     name="classify",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     instruction=(
         "Classify the user's input as 'quick' or 'plan'."
         "\n\n QUICK: greetings ('hi', 'hello', 'what can you do?',"
@@ -211,7 +212,8 @@ def anchor_today(ctx: Context):
 # only when current information is needed.
 quick_answerer = Agent(
     name="quick_answerer",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Greets the user and answers single-step factual questions"
         " without going through the full planning pipeline."
@@ -251,7 +253,8 @@ def anchor_today(ctx: Context):
 
 task_planner = Agent(
     name="task_planner",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     instruction=(
         "Today is {today_human?} ({today?}).\n\n"
         'The user request: "{request}"\n\n'
@@ -272,7 +275,8 @@ def anchor_today(ctx: Context):
 # `fan_out_research` below.
 researcher = Agent(
     name="researcher",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     instruction=(
         "Use google_search to gather a 2-3 sentence study brief on the"
         " topic in the user message. Focus on: key concepts to review,"
@@ -285,7 +289,8 @@ def anchor_today(ctx: Context):
 
 schedule_writer = Agent(
     name="schedule_writer",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     instruction=(
         "Today is {today_human?} ({today?}). Produce a markdown"
         " timetable for the user.\n\n"

level_2b_agent/agent.py

Lines changed: 10 additions & 5 deletions
@@ -163,7 +163,8 @@ def route_input(node_input: dict):
 
 classify = Agent(
     name="classify",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Classifies an inbound support message into one of four"
         " categories. Pure routing logic — does not respond to the"
@@ -196,7 +197,8 @@ def route_input(node_input: dict):
 
 greet_user = Agent(
     name="greet_user",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Handles greeting / capability-question routes by introducing"
         " the agent and suggesting example queries."
@@ -224,7 +226,8 @@ def route_input(node_input: dict):
 
 bug_handler = Agent(
     name="bug_handler",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Handles bug reports — captures repro steps, severity, and"
         " recent changes."
@@ -250,7 +253,8 @@ def route_input(node_input: dict):
 
 billing_handler = Agent(
     name="billing_handler",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Handles billing and pricing questions for the (mock) product"
         " plans."
@@ -277,7 +281,8 @@ def route_input(node_input: dict):
 
 feature_handler = Agent(
     name="feature_handler",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Handles feature requests — captures use case, logs to the"
         " product backlog, sets expectations."

level_3_agent/agent.py

Lines changed: 8 additions & 4 deletions
@@ -213,7 +213,8 @@ class Brief(BaseModel):
 
 search_agent = Agent(
     name="search_agent",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Searches the web for one focused sub-question and returns a"
         " plain-text finding with source domains cited inline."
@@ -248,7 +249,8 @@ class Brief(BaseModel):
 
 analyst_agent = Agent(
     name="analyst_agent",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Reviews accumulated search findings for patterns, contradictions,"
         " and gaps. Pure LLM reasoning — no tools."
@@ -274,7 +276,8 @@ class Brief(BaseModel):
 
 writer_agent = Agent(
     name="writer_agent",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Synthesises findings + analysis into the final structured Brief"
         " for the user. Pure LLM reasoning — no tools."
@@ -305,7 +308,8 @@ class Brief(BaseModel):
 
 root_agent = Agent(
     name="level_3_agent",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — 3.x preview gated per audit 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Research coordinator that delegates to search, analyst, and"
         " writer specialists and returns a structured brief. Routing"

level_4_agent/agent.py

Lines changed: 17 additions & 17 deletions
@@ -353,7 +353,8 @@ class Brief(BaseModel):
 
 data_fetcher_agent = Agent(
     name="data_fetcher_agent",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — orchestration-shaped: routes between MCP/A2A).
+    model="gemini-2.5-pro",
     description=(
         "Fetches public business data via two inter-system protocols:"
         " A2A peer consultation (Level 1 over on_message_send) for"
@@ -379,13 +380,17 @@ class Brief(BaseModel):
 
 analyst_agent = Agent(
     name="analyst_agent",
+    # W9.2 (Simon 2026-05-01): override gotcha #21 — Pro 2.5 on
+    # BuiltInCodeExecutor. Local smoke test in plan §5.4 catches the
+    # hang signature before any cloud deploy; if reproducible, revert
+    # this to "gemini-2.5-flash". Original gotcha:
     # Flash + BuiltInPlanner per AGENTS.md gotcha #21: Pro on a
     # BuiltInCodeExecutor leaf can hang 6+ min under AFC. The planner
     # turns Gemini's native thinking on for Flash so the model plans
     # cell layout before writing code — directly addressing gotcha #20
     # ("a later code cell closes/re-saves the figure → blank Version 1
     # overwrite hides the real chart at Version 0").
-    model="gemini-2.5-flash",
+    model="gemini-2.5-pro",
     planner=BuiltInPlanner(
         thinking_config=types.ThinkingConfig(include_thoughts=True)
     ),
@@ -449,7 +454,8 @@ class Brief(BaseModel):
 
 report_writer_agent = Agent(
     name="report_writer_agent",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — all A2A sub-agents on Pro per Simon 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Formats accumulated findings into a structured BI brief."
         " Output is the final answer — do not re-paraphrase."
@@ -510,19 +516,12 @@ class Brief(BaseModel):
     # output, which directly addresses the conflated-decision empty
     # STOP. Trade-off: ~30-50% more latency on creator turns vs. Pro,
     # but creator runs rarely so absolute cost stays low.
-    model="gemini-2.5-flash",
-    # Native thinking on Flash. Same shape as analyst_agent — confirmed
-    # working pattern. Don't replace with PlanReActPlanner: that's a
-    # prompt-level text scaffold (forces the LLM to TYPE planning
-    # sections), whereas BuiltInPlanner activates Gemini's native
-    # thinking compute (separate token budget, runs BEFORE tool-choice
-    # is committed). For multi-turn HITL with chained tool calls,
-    # native thinking is the right primitive. Only one `planner` field
-    # is supported per LlmAgent; the two planners are mutually
-    # exclusive.
-    planner=BuiltInPlanner(
-        thinking_config=types.ThinkingConfig(include_thoughts=True)
-    ),
+    # W9.2 (Simon 2026-05-01): flipped Flash → Pro 2.5. The downgrade
+    # to Flash + BuiltInPlanner was specifically because asia-southeast1
+    # didn't serve Pro; us-central1 does, so the original Pro choice is
+    # restored. Native thinking is on by default on Pro for compositional
+    # function calls — explicit BuiltInPlanner block removed.
+    model="gemini-2.5-pro",
     description=(
         "Synthesises a new specialist agent when the BI team lacks a"
         " capability. Use when the user's request cannot be served by"
@@ -623,7 +622,8 @@ def _rehydrate_runtime_tools(callback_context: CallbackContext):
 
 root_agent = Agent(
     name="level_4_agent",
-    model="gemini-2.5-flash",
+    # us-central1 + Pro 2.5 (W9.2 — orchestrator role, all A2A on Pro per Simon 2026-05-01).
+    model="gemini-2.5-pro",
     description=(
         "Self-evolving Business Intelligence coordinator. Routes"
         " analytical business questions to a fixed team (data_fetcher,"
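Since the same one-line model swap recurs across every agent file, a small scan can confirm that no `model=` keyword still points at Flash. The sketch below is an illustrative check, not part of this commit. `model_ids`, `MODEL_KWARG` and `SAMPLE` are hypothetical names, and the regex assumes each model is set as a string literal on its own line, as in the diffs above.

```python
import re
from collections import Counter

# Matches lines such as `    model="gemini-2.5-pro",` but not comment lines,
# because a leading "#" prevents the match at the start of the line.
MODEL_KWARG = re.compile(r'^\s*model\s*=\s*"([^"]+)"', re.MULTILINE)


def model_ids(source: str) -> Counter:
    """Count model IDs assigned through model="..." keyword lines."""
    return Counter(MODEL_KWARG.findall(source))


# Trimmed stand-in for an agent.py file after the W9.2 change.
SAMPLE = '''
analyst_agent = Agent(
    name="analyst_agent",
    # if reproducible, revert this to "gemini-2.5-flash".
    model="gemini-2.5-pro",
)
root_agent = Agent(
    name="level_4_agent",
    model="gemini-2.5-pro",
)
'''

if __name__ == "__main__":
    print(model_ids(SAMPLE))
```

Run over each agent.py (for example via `pathlib.Path.read_text`), the counter makes it easy to see whether any sub-agent was missed. Model names mentioned in comments, such as the revert hints, are ignored.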

level_4_agent/remote_tools.py

Lines changed: 2 additions & 1 deletion
@@ -41,7 +41,8 @@
 # Where Level 1's A2A engine lives. Defaults to the resource ID minted
 # by Phase 7 (the level_1_agent A2A redeploy with `gemini-2.5-flash`).
 # Override via env var for re-deploys without rebuilding Level 4.
-_LEVEL_1_REGION = os.environ.get("LEVEL_1_A2A_REGION", "asia-southeast1")
+# W9.2 — Level 1 lives in us-central1 (was asia-southeast1).
+_LEVEL_1_REGION = os.environ.get("LEVEL_1_A2A_REGION", "us-central1")
 _LEVEL_1_RESOURCE_ID = os.environ.get(
     "LEVEL_1_A2A_ENGINE_ID",
     "2134899737420103680",
