Commit ccb9ed7

examples(migration_v5): notebook Beat 3 + 4.4 wired live (#107 final) (#160)
* feat(examples): wire Beat 3 + 4.4 to live compile/runtime/savings (#107 final beat)

Replaces the PR-#156-placeholder cells in Beat 3 with the live compile-extractor → measure-cache → runtime-bundle → savings pipeline, and updates Beat 4.4 from "deferred follow-up" to non-zero hub-shape evidence.

Live evidence from scratch dataset ``migration_v5_demo_2bacc3d9`` (3 sessions, 60 cells total):

* **Beat 3.3** — ``compile_extractor`` runs a self-contained compiled source through AST → smoke → validator → fingerprint. Hand-authored source handles the central-hub slice (``complete_execution`` → ``DecisionExecution`` + ``AgentSession`` + ``partOfSession``) and lets the reference fallback emit cross-event edges. Source uses triple-single-quote delimiters so the inner module docstring carries through the JSON-serialized cell without escaping breakage. ``compile_fingerprint = ab583bd0f9506f44c4b00671f3b9177c8a80b04f2755e2aa5df6287cbbbd6800``.
* **Beat 3.4** — recompile with identical inputs. ``cache_hit = True``, fingerprint matches the first run.
* **Beat 3.5** — ``OntologyGraphManager.from_bundles_root`` (the actual current name; the previous draft used ``from_ontology_binding_with_compiled_bundles`` which is gone) wires the compiled bundle + reference fallback into a wrapped registry. Re-materializes the graph: ``extracted nodes = 27``, ``partOfSession`` count = 3 in ``rows_materialized``. Asserts ``AgentSession`` and ``partOfSession`` are both > 0 — the reference's envelope-side synthesis closes Beat 4.4's hub gap.
* **Beat 3.7** — savings table compares per-session transcript chars before/after compiled-path pruning. Live numbers: 77,126 → 76,315 chars (1.1%) across 3 sessions with the central-hub-only compiled bundle. A broader compiled bundle (handling capture_context / propose_decision_point / evaluate_candidate / commit_outcome too) would drop the after-size further; for now the demo proves the wiring works end-to-end.
* **Beat 4.4** — framing reworded from "deferred to PR #156" to "non-zero rows because Beat 3.5 wired the reference". Hub-shape ``(DecisionExecution)-[partOfSession]->(AgentSession)`` asserts non-empty; the live run shows **3 rows**, one per session. ``GRAPH_TABLE count = 10`` on DecisionExecution.

Generated snapshots (``binding.yaml`` / ``table_ddl.sql`` / ``property_graph.sql``) regenerated against the canonical default ``(test-project-0728-467323, migration_v5_demo)`` dataset so they stay reviewer-readable.

#107 is now end-to-end: every guarantee has live evidence in the committed executed notebook.

* fix(examples): PR #160 round 2 — correct Beat 3 framing + materialize diag

Five review nits addressed (doc + code; executed-cell outputs unchanged since the previous successful run's evidence still applies):

* **P1 — Beat 3.5 row-count provenance.** ``materialize`` → ``materialize_with_status`` + surface ``delete_failed`` as a clear NOTE rather than swallowing it. BigQuery's streaming buffer pins Beat 1's rows for ~90 min; the materializer's delete-then-insert idempotency legitimately reports ``delete_failed`` in that window. The note clarifies that ``row_counts`` reflects "rows inserted in this build only" — Beat 4.4's ``GRAPH_TABLE`` count over the same tables can be larger. *Why not fail or reset:* a CREATE OR REPLACE reset bypasses the streaming-buffer constraint but interacts badly with the streaming inserts that immediately follow (only AgentSession + partOfSession actually materialize). Asserting ``no delete_failed`` would fail every run in the 90-min window after Beat 1. The honest framing is "row_counts are this-build inserts; the table totals may include earlier rows".
* **P2 — C2 fallback semantics corrected.** The compiled extractor returns empty validator-clean results for non-``complete_execution`` tools.
C2's wrapper records those as ``compiled_unchanged`` and **does not** invoke the reference fallback (fallback fires only on exception / wrong return type / validation failure). The non-hub spans go to ``AI.GENERATE``, not the reference. Beat 3.3 framing (cell 39), the compiled-source docstring comment (cell 40), and Beat 3.5 framing (cell 43) now describe this accurately.
* **P2 — sweep stale PR #156 references.** Beat 3 framing (cell 36), Beat 3 closing (cell 49), Beat 4.4 inline comment (cell 56), Section 5 recap (cell 58): PR #156 followups are merged into ``main``; the cells now describe what runs live.
* **P3 — Beat 3.7 cost-savings claim softened.** "Real cost savings track this proxy 1:1" overstates the relationship. Reworded to "directionally" — billing is job-level and includes fixed prompt/schema, output tokens, and prompt-cache effects.

The PR body's "60 cells" wording is fixed via ``gh pr edit`` after this commit lands.

Note on executed outputs: today's Vertex environment is unstable (gRPC AioRpcError on the agent run). The committed executed notebook is from the earlier successful run (commit a6d541a's snapshot). Beat 3.3-3.7 cell outputs reflect the pre-warning code path; the new code's behavior is identical when ``delete_failed`` is empty (the ``materialize_with_status`` API returns the same ``row_counts`` semantically) and adds only a diagnostic NOTE print when ``delete_failed`` is non-empty.

* fix(examples): PR #160 round 3 — cleanup_status + attribution + fresh outputs

Three review findings:

* **P1 — ``status.cleanup_status`` (was ``delete_status``).** ``TableStatus`` exposes ``cleanup_status``, not ``delete_status``; the round-2 diagnostic used ``getattr(status, 'delete_status', '')`` which always returned ``''`` and never printed the NOTE. Fixed — the live run captured below now correctly lists the 11 tables that hit ``delete_failed`` due to Beat 1's streaming-buffer-pinned rows.
* **P2 — attribution: compiled extractor, not reference fallback.** The hub-shape ``AgentSession`` + ``partOfSession`` rows come from the **compiled extractor** for ``complete_execution`` events. Validator-clean compiled output → C2 wrapper records ``compiled_unchanged`` → fallback NOT invoked. The reference extractor would do the same as fallback, but it's bypassed in this run. Beat 4.4 framing (cell 55), Beat 4 closing (cell 57), and the hub-shape inline comment (cell 56) corrected.
* **P2 — fresh executed outputs.** Re-ran live against scratch ``migration_v5_demo_12bbc7ae`` (3 sessions, 60 cells). Source and outputs now match.

Live evidence (committed executed notebook):

* Beat 3.3 / 3.4: compile fingerprint ``bc1921bbfdd588be0e6531fcd1774218a6273c3b1836b45a9eaff0c3c0b4ca07``, ``cache_hit = True`` on recompile.
* Beat 3.5: ``extracted nodes = 22``, ``partOfSession = 5`` from the materialize, **NOTE: 11 table(s) had delete_failed** — diagnostic correctly enumerates the streaming-buffer-pinned tables from Beat 1.
* Beat 3.7: 78,926 → 78,112 chars (~1.0% savings).
* Beat 4.4: ``GRAPH_TABLE COUNT = 9``, hub-shape returned 5 rows ``(DecisionExecution -[partOfSession]-> AgentSession)``.

Snapshots regenerated against canonical default ``(test-project-0728-467323, migration_v5_demo)``.

* fix(examples): PR #160 round 4 — scope Beat 4.4 GQL, kill dead branch, sync README

Five review findings:

* **P1 — Beat 4.4 GQL scoped to current session_ids.** The count + hub-shape queries now filter via ``REGEXP_CONTAINS(<pk>, '^(sess-1|sess-2|sess-3):')`` built from ``session_ids``, so stale rows (e.g. AI-hallucinated PK values without a session prefix, or leftovers from prior runs into the same scratch dataset if streaming-buffer DELETE failed) can no longer inflate the count. Added a per-session assertion: every current session_id must appear in the hub-shape rows.
* **P2 — Beat 3.5 assertion messages reattributed.** Sanity asserts now say "compiled-extractor synthesis", not "reference-extractor synthesis" — the compiled bundle emits the central-hub slice; the reference extractor is the fallback that's bypassed when compiled output validates clean.
* **P3 — dead zero-row branch removed.** Cell 56's old ``if not hub_rows: ... else: ...`` branch (with the "AgentSession synthesis is a follow-up" historical note) is gone. The cell now asserts ``hub_rows`` and every session-id is present, then prints unconditionally.
* **P3 — README synced to PR-#157+#160 reality.** Six stale references to "PR #156 will land it" are replaced with the merged-PR pointers (#155 fixtures, #157 reference extractor, #160 wiring). Mapping-table footnote updated: the compiled extractor (Beat 3.5) AND the reference extractor (fallback) both synthesize the envelope-side ``AgentSession`` + ``partOfSession`` from ``session_id``; hub-shape returns non-zero rows.
* **(PR body refresh)** Stale evidence (scratch dataset, fingerprints, savings numbers, hub-row count) updated to match the live run captured here.

Live evidence (committed executed notebook, scratch ``migration_v5_demo_388fdaa9``, 3 sessions):

* Beat 3.3 / 3.4: compile fingerprint ``b98b65134cb9925bdebef522c44c327f4eb21737a5b22c94a4aa45ba9d04c8a7``, ``cache_hit = True``.
* Beat 3.5: ``extracted nodes = 6``, ``partOfSession = 3``, ``NOTE: 1 table(s) had delete_failed`` — diagnostic correctly enumerates streaming-buffer-pinned ``decision_execution``.
* Beat 3.7: 81,348 → 80,533 chars (~1.0% savings, 3 sessions).
* Beat 4.4: scoped ``GRAPH_TABLE count = 3`` (exactly one per session, no stale rows leaking through); scoped hub-shape returned **3 rows**, all session-prefixed.

Generated snapshots regenerated against canonical default ``(test-project-0728-467323, migration_v5_demo)``.
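
The ``cleanup_status`` fix from round 3 and the ``delete_failed`` NOTE it unblocked can be sketched roughly as below. This is a minimal illustration, not the SDK's code: the ``TableStatus`` dataclass is a hypothetical stand-in, and only the ``cleanup_status`` field name and the ``delete_failed`` value come from the commit message. The ``table`` and ``rows_inserted`` fields and the ``deleted`` value are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TableStatus:
    """Hypothetical stand-in; only `cleanup_status` is named in the commit message."""

    table: str            # assumed field name
    cleanup_status: str   # e.g. "delete_failed"; "deleted" is an assumed value
    rows_inserted: int    # assumed field name


def delete_failed_tables(statuses):
    """Return the names of tables whose cleanup step reported `delete_failed`."""
    # Direct attribute access raises AttributeError on a misspelled field,
    # whereas getattr(..., "") silently substitutes the default.
    return [s.table for s in statuses if s.cleanup_status == "delete_failed"]


statuses = [
    TableStatus("decision_execution", "delete_failed", 3),
    TableStatus("agent_session", "deleted", 3),
]

# The round-2 pitfall: `delete_status` does not exist, so the default "" is
# always returned and the filter never matches anything.
silent = [
    s.table
    for s in statuses
    if getattr(s, "delete_status", "") == "delete_failed"
]

failed = delete_failed_tables(statuses)
if failed:
    print(f"NOTE: {len(failed)} table(s) had delete_failed: {', '.join(failed)}")
```

Accessing the attribute directly trades a silent no-op for a loud ``AttributeError`` if the field is ever renamed, which is usually the safer failure mode for a diagnostic.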
* docs(examples): PR #160 round 5 — stale comment + contract assert

Two doc-only nits:

* Cell 44 (Beat 3.5 asserts): inline comment said "the reference extractor's envelope-side synthesis" but the asserts (and the run) prove the *compiled* extractor does it. Reattributed.
* Cell 56 (Beat 4.4): the scoped GQL assumes ``{session_id}:{raw_id}`` PK shape, which is true for every demo entity *except* ``AgentSession`` (whose ``agent_session_id`` is just the session_id with no colon). Added ``assert resolved_entity == "DecisionExecution"`` before the regex so a future edit that swaps the resolved entity can't silently produce a regex that matches nothing.
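
The session scoping from round 4 and the contract assert from round 5 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the notebook's cell: ``session_scope_pattern`` and the sample primary keys are hypothetical, the safe-character check is an added precaution, and Python's ``re`` stands in for BigQuery's ``REGEXP_CONTAINS``.

```python
import re

# Assumption: session ids use only characters that need no regex escaping.
SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def session_scope_pattern(session_ids):
    """Build an anchored alternation for PKs shaped `{session_id}:{raw_id}`."""
    if not session_ids:
        raise ValueError("session_ids must be non-empty")
    for sid in session_ids:
        if not SAFE_ID.fullmatch(sid):
            raise ValueError(f"unsafe session id: {sid!r}")
    return "^(" + "|".join(session_ids) + "):"


session_ids = ["sess-1", "sess-2", "sess-3"]
pattern = session_scope_pattern(session_ids)

# Contract check: the pattern only makes sense for entities whose PK carries
# a `session_id:` prefix, which the commit message says excludes AgentSession.
resolved_entity = "DecisionExecution"
assert resolved_entity == "DecisionExecution"

# Hypothetical primary keys: three current rows, one unprefixed stale row,
# and one leftover from a session that is not part of this run.
pks = ["sess-1:exec-a", "sess-2:exec-b", "sess-3:exec-c", "exec-orphan", "sess-9:exec-z"]
scoped = [pk for pk in pks if re.search(pattern, pk)]

# Per-session assertion: every current session must appear in the scoped rows.
seen = {pk.split(":", 1)[0] for pk in scoped}
missing = [sid for sid in session_ids if sid not in seen]
assert not missing, f"sessions without hub rows: {missing}"

# An AgentSession-style key (bare session id, no colon) never matches.
agent_session_match = re.search(pattern, "sess-1")
```

The trailing colon in the pattern is what makes the bare ``AgentSession`` key fall outside the scope, which is why the assert on ``resolved_entity`` guards the regex.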
1 parent d542399 commit ccb9ed7

2 files changed

Lines changed: 873 additions & 476 deletions

examples/migration_v5/README.md

Lines changed: 6 additions & 8 deletions
@@ -1,6 +1,6 @@
 # Migration v5 Demo — Fixture Foundation
 
-**Status:** The four-guarantee MAKO notebook (`examples/migration_v5_demo_notebook.ipynb`) is now live in this branch with end-to-end outputs from `test-project-0728-467323`. This README is the handoff doc for PR #156 (MAKO reference extractor + Beat 3 compile/runtime/savings cells).
+**Status:** The four-guarantee MAKO notebook (`examples/migration_v5_demo_notebook.ipynb`) is live end-to-end against `test-project-0728-467323`. PR [#155](https://github.com/GoogleCloudPlatform/BigQuery-Agent-Analytics-SDK/pull/155) shipped the fixture foundation; PR [#157](https://github.com/GoogleCloudPlatform/BigQuery-Agent-Analytics-SDK/pull/157) added `reference_extractor.py`; PR [#160](https://github.com/GoogleCloudPlatform/BigQuery-Agent-Analytics-SDK/pull/160) wired Beat 3 (compile + cache + runtime + savings) and Beat 4.4 (hub-shape non-zero) live.
 
 The demo's event source of truth is **a runnable agent talking to the BQ AA plugin**, not a hand-coded event generator. This directory's authored inputs are split accordingly.
 
@@ -46,7 +46,7 @@ ontology-build (extracts the graph) tables │ consume
 OntologyRuntime + LabelSynonymResolver ┘
 ```
 
-Beat 3's compile / runtime / savings cells are PR #156 placeholders; they require a MAKO-specific reference extractor (`extract_mako_decision_event`).
+Beat 3's compile / runtime / savings cells run live, using `reference_extractor.extract_mako_decision_event` as the runtime fallback under a focused compiled bundle that handles the `complete_execution` event type.
 
 ## Design decisions
 
@@ -88,9 +88,9 @@ The **explicit mapping** between what the agent emits and what extraction materi
 | `complete_execution.business_entity_id` | `DecisionExecution.businessEntityId` | 1:1 (column `business_entity_id`). |
 | `complete_execution.latency_ms` | `DecisionExecution.latencyMs` (INT64) | 1:1 (column `latency_ms`, **typed INT64** in `table_ddl.sql`). |
 | `complete_execution.{decision_point,context,outcome}_id` | `executedAtDecisionPoint` / `atContextSnapshot` / `hasSelectionOutcome` edges | Each is an edge endpoint pointing at the parent `DecisionExecution`. |
-| `session_id` (envelope) | `partOfSession` edge (DecisionExecution → AgentSession) | Plugin envelope; the reference extractor (PR #156) will synthesize an `AgentSession` node + `partOfSession` edge from this envelope field. The live notebook's hub-shape GQL traversal currently returns zero rows because of this gap. |
+| `session_id` (envelope) | `partOfSession` edge (DecisionExecution → AgentSession) | Plugin envelope. Both the compiled extractor (for `complete_execution` events; live in Beat 3.5) and the reference extractor (fallback path) synthesize the `AgentSession` node + `partOfSession` edge from this envelope field. Beat 4.4's hub-shape GQL traversal returns non-zero rows. |
 
-**Rule of thumb:** only fields with a TTL-declared target property are materialized; everything else stays in the raw `agent_events` trace as reasoning context. The mapping above is the contract PR #156's reference extractor implements.
+**Rule of thumb:** only fields with a TTL-declared target property are materialized; everything else stays in the raw `agent_events` trace as reasoning context. The mapping above is the contract `reference_extractor.extract_mako_decision_event` (and the compiled bundle Beat 3.3 emits) implement.
 
 The agent uses Vertex AI Gemini by default (`DEMO_AGENT_MODEL=gemini-2.5-flash`). Same wiring pattern as `examples/decision_lineage_demo/agent/agent.py`.
 
@@ -166,13 +166,11 @@ A live end-to-end notebook run (`run_agent.py --sessions 3` + Beat 1–4 cells a
 - **Beat 1**: GQL `DecisionExecution` count `before=0, after=N>0`, `rows_materialized total>0`, `property_graph_status='skipped:user_requested'`, zero SDK-issued `CREATE OR REPLACE PROPERTY GRAPH` jobs.
 - **Beat 2**: `binding-validate` exits 1 with a `missing_column` failure after column rename; restore + re-validate exits 0; combined `ontology-build --skip-property-graph --validate-binding` matches Beat 1's status + non-zero `rows_materialized`.
 - **Beat 3.6**: synthetic `ExtractedGraph` triggers all three `FallbackScope` failures (`NODE + FIELD + EDGE`).
-- **Beat 4**: concept index emitted + applied; `LabelSynonymResolver.resolve("DecisionExecution")` returns 1 candidate with a 12-hex `compile_id`; `GRAPH_TABLE` count over the user-authored property graph is non-zero. Hub-shape `(DecisionExecution)-[partOfSession]->(AgentSession)` returns zero rows (`AgentSession` synthesis from the plugin envelope lands with PR #156).
+- **Beat 4**: concept index emitted + applied; `LabelSynonymResolver.resolve("DecisionExecution")` returns 1 candidate with a 12-hex `compile_id`; `GRAPH_TABLE` count over the user-authored property graph is non-zero. Hub-shape `(DecisionExecution)-[partOfSession]->(AgentSession)` returns at least one row per current session — the compiled extractor wired in Beat 3.5 synthesizes the envelope-side `AgentSession` + `partOfSession`.
 
 ## What's NOT in this commit
 
-- **MAKO reference extractor** (`extract_mako_decision_event`). PR #156 ships it and flips Beat 3.3 / 3.4 / 3.5 / 3.7 from gated placeholders to live cells.
-- **`AgentSession` node + `partOfSession` edge synthesis** from the plugin envelope. Without it, Beat 4's hub-shape GQL traversal `(DecisionExecution)-[partOfSession]->(AgentSession)` returns zero rows. PR #156's reference extractor adds the envelope-to-graph translation.
-- `docs/README.md` / `CHANGELOG.md` entries — land alongside PR #156.
+- `docs/README.md` / `CHANGELOG.md` entries — staged for a follow-up PR alongside the user-facing release notes.
 
 ## Related
 
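
The compiled-versus-fallback behaviour that the README changes above rely on (validator-clean compiled output is kept, and the reference extractor runs only on an exception, a wrong return type, or a validation failure) can be sketched as below. This is a simplified model under stated assumptions, not the SDK's wrapper: `run_with_fallback`, the `ExtractedGraph` stand-in dataclass, the toy extractors, and every outcome label other than `compiled_unchanged` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractedGraph:
    """Simplified stand-in for the SDK type of the same name (assumed shape)."""

    nodes: list = field(default_factory=list)  # (label, node_id) tuples
    edges: list = field(default_factory=list)  # (label, src_id, dst_id) tuples


def run_with_fallback(event, compiled, reference, validate):
    """Return (graph, outcome); the reference runs only when compiled fails."""
    try:
        result = compiled(event)
    except Exception:
        return reference(event), "fallback_exception"    # hypothetical label
    if not isinstance(result, ExtractedGraph):
        return reference(event), "fallback_wrong_type"   # hypothetical label
    if not validate(result):
        return reference(event), "fallback_validation"   # hypothetical label
    # An empty but valid result also lands here, so the reference is skipped.
    return result, "compiled_unchanged"


def compiled(event):
    """Toy compiled extractor: handles only the central-hub event type."""
    if event["tool"] != "complete_execution":
        return ExtractedGraph()
    sid = event["session_id"]
    exec_id = f"{sid}:{event['id']}"
    return ExtractedGraph(
        nodes=[("DecisionExecution", exec_id), ("AgentSession", sid)],
        edges=[("partOfSession", exec_id, sid)],
    )


def reference(event):
    """Toy reference extractor, marked so the test can tell it ran."""
    return ExtractedGraph(nodes=[("Reference", event["tool"])])


def validate(graph):
    """Every edge endpoint must refer to an emitted node."""
    node_ids = {node_id for _, node_id in graph.nodes}
    return all(src in node_ids and dst in node_ids for _, src, dst in graph.edges)


def broken(event):
    raise RuntimeError("compiled extractor crashed")


hub_event = {"tool": "complete_execution", "session_id": "sess-1", "id": "exec-a"}
ctx_event = {"tool": "capture_context", "session_id": "sess-1", "id": "ctx-a"}

hub, hub_outcome = run_with_fallback(hub_event, compiled, reference, validate)
other, other_outcome = run_with_fallback(ctx_event, compiled, reference, validate)
crashed, crashed_outcome = run_with_fallback(hub_event, broken, reference, validate)
```

In this model the non-hub event yields an empty graph with outcome `compiled_unchanged`, which mirrors the commit message's point that such spans are not handed to the reference extractor.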