GoogleCloudPlatform
diff --git a/‎examples/migration_v5/periodic_materialization/README.md‎
Lines changed: 145 additions & 0 deletions b/‎examples/migration_v5/periodic_materialization/README.md‎
Lines changed: 145 additions & 0 deletions
@@ -56,6 +56,15 @@ full prose.
   tables and the state/checkpoint table all live here.
 * `gcloud` authenticated with permissions to deploy Cloud Run
   Jobs, create scheduler triggers, and grant IAM bindings.
+* `python3` on PATH. The deploy script uses Python to apply
+  dataset-level IAM via the BigQuery client's `AccessEntry`
+  API (since `bq add-iam-policy-binding` requires project
+  allowlisting in some environments). If your `python3` doesn't
+  have `google-cloud-bigquery` installed, the script
+  transparently creates a one-shot temp venv with it — no
+  manual install required. If it does (e.g., you ran
+  `pip install -e .` from the repo root), the script reuses
+  that directly.
 
 ## Local dry-run
 
@@ -127,6 +136,14 @@ The script:
 3. **Grants narrow IAM** to the SA:
    * Project-level `roles/bigquery.jobUser` —
      `bigquery.jobs.create` only.
+   * Project-level `roles/aiplatform.user` — required because
+     the MAKO demo's extraction path calls BigQuery's
+     `AI.GENERATE` function (Gemini-backed entity extraction).
+     Without this grant, the AI call returns "user does not
+     have the permission to access resources used by
+     AI.GENERATE" and the orchestrator silently extracts an
+     empty graph for every session. Surfaced by the live
+     verification in PR #166.
    * Dataset-level `roles/bigquery.dataViewer` on
      **events** — read-only access. The events dataset stays
      effectively read-only per the contract above.
@@ -270,6 +287,134 @@ also fail and the scheduled fire will be reported as failed in
 Cloud Monitoring. Set up an alert on
 `logging.googleapis.com/log_entry_count` with severity `ERROR`.
 
+## Verified Cloud Run deployment evidence
+
+This section documents an end-to-end live verification of the
+deploy path against `test-project-0728-467323` (the
+canonical SDK test project). The verification was the work of
+PR #166 (follow-up to #165) and surfaced four real issues — all
+fixed in `deploy_cloud_run_job.sh` before the evidence below
+was captured. See the PR description for the full discovery
+log.
+
+**Inputs:**
+
+* Events dataset: `migration_v5_idem_43c51d05` (3 demo
+  sessions, 115 events, pre-populated by `run_agent.py` in PR
+  #164).
+* Graph dataset: `migration_v5_graph_verify_500c9f` (auto-
+  created by deploy script).
+* Job name: `bqaa-periodic-verify-500c9f`.
+* Schedule: `0 */6 * * *`.
+* Region: `us-central1`.
+
+**Build + deploy:**
+
+* Cloud Build image:
+  `us-central1-docker.pkg.dev/test-project-0728-467323/cloud-run-source-deploy/bqaa-periodic-verify-500c9f@sha256:d1cd008…`.
+* Built from the vendored `./sdk_src` (PR #165 contract):
+  `Building bigquery-agent-analytics @ file:///workspace/sdk_src` →
+  `Built bigquery-agent-analytics @ file:///workspace/sdk_src`.
+* Build time: ~4 min (Cloud Build + Buildpacks).
+
+**Cloud Scheduler trigger** (`gcloud scheduler jobs describe`):
+
+```yaml
+httpTarget:
+  httpMethod: POST
+  oauthToken:
+    scope: https://www.googleapis.com/auth/cloud-platform
+    serviceAccountEmail: bqaa-periodic-sa@test-project-0728-467323.iam.gserviceaccount.com
+  uri: https://us-central1-run.googleapis.com/apis/run.googleapis.com/v1/namespaces/test-project-0728-467323/jobs/bqaa-periodic-verify-500c9f:run
+schedule: 0 */6 * * *
+state: ENABLED
+```
+
+The OAuth identity matches the runtime SA — same SA serves
+both runtime and scheduler-caller as designed.
+
+**IAM contract** verified post-deploy via the BigQuery client:
+
+```
+Events dataset IAM for SA:
+  role=READER, entity_type=userByEmail, entity_id=bqaa-periodic-sa@…
+  Number of WRITE/OWNER bindings for SA: 0
+
+Graph dataset IAM for SA:
+  role=WRITER, entity_type=userByEmail, entity_id=bqaa-periodic-sa@…
+```
+
+The SA can read events but cannot write — the "events dataset
+read-only" contract holds at the IAM layer.
+
+**Successful execution** (`materialization complete` payload
+from Cloud Logging, after the post-deploy `roles/aiplatform.user`
+grant the verification added to the deploy script):
+
+```json
+{
+  "run_id": "2d52338e16db",
+  "sessions_discovered": 3,
+  "sessions_materialized": 3,
+  "sessions_failed": 0,
+  "rows_materialized": {
+    "DecisionExecution": 3,
+    "DecisionPoint": 3,
+    "Candidate": 11,
+    "SelectionOutcome": 3,
+    "ContextSnapshot": 3,
+    "evaluatesCandidate": 11,
+    "selectedCandidate": 3,
+    "rejectedCandidate": 5,
+    "atContextSnapshot": 3,
+    "executedAtDecisionPoint": 3,
+    "hasSelectionOutcome": 3
+  },
+  "ok": true,
+  "failures": []
+}
+```
+
+All 11 entity/relationship tables populated.
+`cleanup_status=deleted, insert_status=inserted,
+idempotent=true` across the board.
+
+**Scheduler trigger actually fires.** The cron-scheduled fire
+at `2026-05-16T06:00:00Z` produced a third state-table row
+(`run_id 4725ebd79060`) with the same shape — proving the
+end-to-end path from Cloud Scheduler → Cloud Run Job →
+materialization works without manual intervention.
+
+**State table audit log** (`_bqaa_materialization_state` in
+the graph dataset):
+
+```
+run_id          run_started_at         sessions_disc / mat / failed   ok
+ff1e956df8b8    2026-05-16 04:38:59    3 / 3 / 0                       true
+2d52338e16db    2026-05-16 04:48:45    3 / 3 / 0                       true
+4725ebd79060    2026-05-16 06:02:51    3 / 3 / 0                       true
+```
+
+(Row 1 is the deploy script's `--smoke` execution, which ran
+BEFORE the verification added `roles/aiplatform.user` to the
+deploy. AI.GENERATE failed for every session there, but the
+orchestrator still reported `ok=true` with empty
+`rows_materialized` — see the known issue below.)
+
+### Known issue surfaced by this verification
+
+The orchestrator currently reports `sessions_materialized ==
+sessions_discovered` and `ok=true` even when every per-event
+`AI.GENERATE` call failed. The `rows_materialized` dict is
+empty in that case, but `sessions_materialized` doesn't
+reflect the underlying failure. Operators monitoring on
+`jsonPayload.ok` would miss the silent extraction failure.
+
+Tracked for SDK follow-up — out of scope for this example PR.
+Workaround: alert on
+`jsonPayload.rows_materialized == {}` in Cloud Logging /
+Monitoring as a second-line check.
+
 ## Not in scope here
 
 * **Terraform / Pulumi.** A scripted deploy is easier to read