Commit ccb9ed7
* feat(examples): wire Beat 3 + 4.4 to live compile/runtime/savings (#107 final beat)
Replaces the PR-#156-placeholder cells in Beat 3 with the live
compile-extractor → measure-cache → runtime-bundle → savings
pipeline, and updates Beat 4.4 from "deferred follow-up" to
non-zero hub-shape evidence.
Live evidence from scratch dataset
``migration_v5_demo_2bacc3d9`` (3 sessions, 60 cells total):
* **Beat 3.3** — ``compile_extractor`` runs a self-contained
compiled source through AST → smoke → validator →
fingerprint. Hand-authored source handles the central-hub
slice (``complete_execution`` →
``DecisionExecution`` + ``AgentSession`` +
``partOfSession``) and lets the reference fallback emit
cross-event edges. Source uses triple-single-quote
delimiters so the inner module docstring carries through
the JSON-serialized cell without escaping breakage.
``compile_fingerprint =
ab583bd0f9506f44c4b00671f3b9177c8a80b04f2755e2aa5df6287cbbbd6800``.
* **Beat 3.4** — recompile with identical inputs.
``cache_hit = True``, fingerprint matches the first run.
* **Beat 3.5** — ``OntologyGraphManager.from_bundles_root``
(the actual current name; the previous draft used
``from_ontology_binding_with_compiled_bundles``, which has
since been removed) wires the compiled bundle + reference fallback into
a wrapped registry. Re-materializes the graph:
``extracted nodes = 27``, ``partOfSession`` count = 3 in
``rows_materialized``. Asserts ``AgentSession`` and
``partOfSession`` are both > 0 — the reference's
envelope-side synthesis closes Beat 4.4's hub gap.
* **Beat 3.7** — savings table compares per-session
transcript chars before/after compiled-path pruning.
Live numbers: 77,126 → 76,315 chars (1.1%) across 3
sessions with the central-hub-only compiled bundle. A
broader compiled bundle (handling capture_context /
propose_decision_point / evaluate_candidate /
commit_outcome too) would drop the after-size further;
for now the demo proves the wiring works end-to-end.
* **Beat 4.4** — framing reworded from "deferred to PR #156"
to "non-zero rows because Beat 3.5 wired the reference".
Hub-shape ``(DecisionExecution)-[partOfSession]->(AgentSession)``
asserts non-empty; the live run shows **3 rows**, one per
session. ``GRAPH_TABLE count = 10`` on DecisionExecution.
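The Beat 3.3 / 3.4 behaviour (same inputs, same fingerprint, ``cache_hit = True`` on recompile) can be sketched as a fingerprint-keyed cache. This is a minimal illustration, not the project's ``compile_extractor``: the ``fingerprint`` and ``compile_cached`` helpers, the in-memory ``_CACHE``, and the choice of SHA-256 over source plus inputs are all assumptions. It also shows why triple-single-quote delimiters let an inner ``"""`` docstring pass through untouched.

```python
import hashlib
import json

# Hypothetical in-memory cache keyed by fingerprint (assumption: the real
# compile cache and its storage are not shown in this commit).
_CACHE: dict = {}


def fingerprint(source: str, inputs: dict) -> str:
    """Hash the source together with canonically serialized inputs."""
    payload = json.dumps({"source": source, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def compile_cached(source: str, inputs: dict) -> dict:
    """Compile once per fingerprint; later identical calls are cache hits."""
    key = fingerprint(source, inputs)
    if key in _CACHE:
        return {"compile_fingerprint": key, "cache_hit": True}
    # Syntax-level check only; raises SyntaxError on malformed source.
    compile(source, "<extractor>", "exec")
    _CACHE[key] = {"source": source}
    return {"compile_fingerprint": key, "cache_hit": False}


# Outer delimiters are triple single quotes, so the inner module docstring
# can keep its triple double quotes without any escaping.
SOURCE = '''
"""Inner module docstring of the hand-authored extractor."""
def extract(event):
    return []
'''

first = compile_cached(SOURCE, {"tool": "complete_execution"})
second = compile_cached(SOURCE, {"tool": "complete_execution"})
print(first["cache_hit"], second["cache_hit"])
```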
Generated snapshots (``binding.yaml`` / ``table_ddl.sql`` /
``property_graph.sql``) regenerated against the canonical
default ``(test-project-0728-467323, migration_v5_demo)``
dataset so they stay reviewer-readable.
#107 is now end-to-end: every guarantee has live evidence
in the committed executed notebook.
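For reference, the Beat 3.7 percentage is plain relative reduction in transcript characters. The sketch below recomputes it from the before/after totals quoted in this commit message; ``savings_pct`` is a hypothetical helper name, not a function from the notebook.

```python
def savings_pct(before_chars: int, after_chars: int) -> float:
    """Relative reduction in transcript size, as a percentage."""
    if before_chars <= 0:
        raise ValueError("before_chars must be positive")
    return 100.0 * (before_chars - after_chars) / before_chars


# (before, after) character totals quoted for the three live runs.
runs = [(77_126, 76_315), (78_926, 78_112), (81_348, 80_533)]
for before, after in runs:
    print(f"{before:,} -> {after:,}: {savings_pct(before, after):.1f}%")
```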
* fix(examples): PR #160 round 2 — correct Beat 3 framing + materialize diag
Five review nits addressed (doc + code; executed-cell outputs
unchanged since the previous successful run's evidence still
applies):
* **P1 — Beat 3.5 row-count provenance.** ``materialize`` →
``materialize_with_status`` + surface ``delete_failed`` as
a clear NOTE rather than swallowing it. BigQuery's
streaming buffer pins Beat 1's rows for ~90 min; the
materializer's delete-then-insert idempotency
legitimately reports ``delete_failed`` in that window.
The note clarifies that ``row_counts`` reflects "rows
inserted in this build only" — Beat 4.4's
``GRAPH_TABLE`` count over the same tables can be
larger.
*Why not fail or reset:* a CREATE OR REPLACE reset
bypasses the streaming-buffer constraint but interacts
badly with the streaming inserts that immediately
follow (only AgentSession + partOfSession actually
materialize). Asserting that ``delete_failed`` is empty
would fail every run in the ~90-min window after Beat 1.
The honest framing is "row_counts are this-build inserts;
the table totals may include earlier rows".
* **P2 — C2 fallback semantics corrected.** The compiled
extractor returns empty validator-clean results for
non-``complete_execution`` tools. C2's wrapper records
those as ``compiled_unchanged`` and **does not** invoke
the reference fallback (fallback fires only on
exception / wrong return type / validation failure).
The non-hub spans go to ``AI.GENERATE``, not the
reference. Beat 3.3 framing (cell 39), the
compiled-source docstring comment (cell 40), and Beat
3.5 framing (cell 43) now describe this accurately.
* **P2 — sweep stale PR #156 references.** Beat 3
framing (cell 36), Beat 3 closing (cell 49), Beat 4.4
inline comment (cell 56), Section 5 recap (cell 58):
PR #156 followups are merged into ``main``; the cells
now describe what runs live.
* **P3 — Beat 3.7 cost-savings claim softened.**
"Real cost savings track this proxy 1:1" overstates the
relationship. Reworded to "directionally" — billing is
job-level and includes fixed prompt/schema, output
tokens, and prompt-cache effects.
The PR body's "60 cells" wording will be fixed via ``gh pr edit``
after this commit lands.
Note on executed outputs: today's Vertex environment is
unstable (gRPC AioRpcError on the agent run). The committed
executed notebook is from the earlier successful run
(commit a6d541a's snapshot). Beat 3.3-3.7 cell outputs
reflect the pre-warning code path; the new code's behavior
is identical when ``delete_failed`` is empty (the
``materialize_with_status`` API returns the same
``row_counts`` semantically) and adds only a diagnostic
NOTE print when ``delete_failed`` is non-empty.
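The C2 fallback rule described in the P2 item above can be sketched as follows. This is an illustration under stated assumptions, not the real wrapper: ``run_with_fallback``, the ``validate`` callback, and every outcome label other than ``compiled_unchanged`` are hypothetical. The point it demonstrates is that an empty but validator-clean result never reaches the reference extractor.

```python
from typing import Callable, List, Tuple

Rows = List[dict]


def run_with_fallback(
    compiled: Callable[[dict], Rows],
    reference: Callable[[dict], Rows],
    validate: Callable[[Rows], bool],
    event: dict,
) -> Tuple[Rows, str]:
    """Fall back only on exception, wrong return type, or validation failure."""
    try:
        result = compiled(event)
    except Exception:
        return reference(event), "fallback_exception"  # hypothetical label
    if not isinstance(result, list):
        return reference(event), "fallback_wrong_type"  # hypothetical label
    if not validate(result):
        return reference(event), "fallback_validation"  # hypothetical label
    # An empty list is still validator-clean, so no fallback happens here.
    return result, "compiled_unchanged"


reference_calls: List[str] = []


def compiled(event: dict) -> Rows:
    if event["tool"] != "complete_execution":
        return []  # non-hub tools: empty, validator-clean result
    return [{"entity": "AgentSession", "agent_session_id": event["session_id"]}]


def reference(event: dict) -> Rows:
    reference_calls.append(event["tool"])
    return []


def validate(rows: Rows) -> bool:
    return all(isinstance(row, dict) for row in rows)


def broken(event: dict) -> Rows:
    raise RuntimeError("compiled extractor crashed")


_, hub = run_with_fallback(
    compiled, reference, validate,
    {"tool": "complete_execution", "session_id": "sess-1"},
)
_, non_hub = run_with_fallback(
    compiled, reference, validate,
    {"tool": "capture_context", "session_id": "sess-1"},
)
_, failed = run_with_fallback(
    broken, reference, validate,
    {"tool": "capture_context", "session_id": "sess-1"},
)
print(hub, non_hub, failed, reference_calls)
```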
* fix(examples): PR #160 round 3 — cleanup_status + attribution + fresh outputs
Three review findings:
* **P1 — ``status.cleanup_status`` (was ``delete_status``).**
``TableStatus`` exposes ``cleanup_status``, not
``delete_status``; the round-2 diagnostic used
``getattr(status, 'delete_status', '')``, which always
returned ``''``, so the NOTE never printed. Fixed:
the live run captured below now correctly lists the
11 tables that hit ``delete_failed`` due to Beat 1's
streaming-buffer-pinned rows.
* **P2 — attribution: compiled extractor, not reference
fallback.** The hub-shape ``AgentSession`` +
``partOfSession`` rows come from the **compiled
extractor** for ``complete_execution`` events.
Validator-clean compiled output → C2 wrapper records
``compiled_unchanged`` → fallback NOT invoked. The
reference extractor would do the same as fallback,
but it's bypassed in this run. Beat 4.4 framing
(cell 55), Beat 4 closing (cell 57), and the
hub-shape inline comment (cell 56) corrected.
* **P2 — fresh executed outputs.** Re-ran live against
scratch ``migration_v5_demo_12bbc7ae`` (3 sessions, 60
cells). Source and outputs now match.
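The P1 bug is an instance of a general pitfall: ``getattr`` with a default hides a misspelled attribute name. The sketch below reproduces it with a stand-in ``TableStatus`` dataclass; only the ``cleanup_status`` field and the ``delete_failed`` value come from this commit, while the ``table`` field, the ``agent_session`` table name, and the ``deleted`` value are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class TableStatus:
    """Stand-in for the real status object (assumed shape)."""
    table: str
    cleanup_status: str


statuses = [
    TableStatus("decision_execution", "delete_failed"),
    TableStatus("agent_session", "deleted"),  # hypothetical table and value
]

# Round-2 behaviour: the attribute does not exist, so getattr silently
# returns the default and the filter never matches.
buggy = [
    s.table for s in statuses
    if getattr(s, "delete_status", "") == "delete_failed"
]

# Round-3 fix: read the attribute that actually exists. Direct access would
# also raise AttributeError on a typo instead of hiding it.
fixed = [s.table for s in statuses if s.cleanup_status == "delete_failed"]

if fixed:
    print(f"NOTE: {len(fixed)} table(s) had delete_failed: {', '.join(fixed)}")
```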
Live evidence (committed executed notebook):
* Beat 3.3 / 3.4: compile fingerprint
``bc1921bbfdd588be0e6531fcd1774218a6273c3b1836b45a9eaff0c3c0b4ca07``,
``cache_hit = True`` on recompile.
* Beat 3.5: ``extracted nodes = 22``, ``partOfSession =
5`` from the materialize, **NOTE: 11 table(s) had
delete_failed** — diagnostic correctly enumerates the
streaming-buffer-pinned tables from Beat 1.
* Beat 3.7: 78,926 → 78,112 chars (~1.0% savings).
* Beat 4.4: ``GRAPH_TABLE COUNT = 9``, hub-shape returned
5 rows ``(DecisionExecution -[partOfSession]->
AgentSession)``.
Snapshots regenerated against canonical default
``(test-project-0728-467323, migration_v5_demo)``.
* fix(examples): PR #160 round 4 — scope Beat 4.4 GQL, kill dead branch, sync README
Five review findings:
* **P1 — Beat 4.4 GQL scoped to current session_ids.** The
count + hub-shape queries now filter via
``REGEXP_CONTAINS(<pk>, '^(sess-1|sess-2|sess-3):')``
built from ``session_ids``, so stale rows (e.g.
AI-hallucinated PK values without a session prefix, or
leftovers from prior runs into the same scratch dataset
if streaming-buffer DELETE failed) can no longer
inflate the count. Added a per-session assertion: every
current session_id must appear in the hub-shape rows.
* **P2 — Beat 3.5 assertion messages reattributed.** Sanity
asserts now say "compiled-extractor synthesis", not
"reference-extractor synthesis" — the compiled bundle
emits the central-hub slice; the reference extractor
is the fallback that's bypassed when compiled output
validates clean.
* **P3 — dead zero-row branch removed.** Cell 56's old
``if not hub_rows: ... else: ...`` branch (with the
"AgentSession synthesis is a follow-up" historical
note) is gone. The cell now asserts that ``hub_rows`` is
non-empty and that every session_id is present, then prints unconditionally.
* **P3 — README synced to PR-#157+#160 reality.** Six stale
references to "PR #156 will land it" are replaced with
the merged-PR pointers (#155 fixtures, #157 reference
extractor, #160 wiring). Mapping-table footnote updated:
the compiled extractor (Beat 3.5) AND the reference
extractor (fallback) both synthesize the envelope-side
``AgentSession`` + ``partOfSession`` from
``session_id``; hub-shape returns non-zero rows.
* **(PR body refresh)** Stale evidence (scratch dataset,
fingerprints, savings numbers, hub-row count) updated
to match the live run captured here.
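The P1 scoping can be sketched in plain Python: build the anchored alternation from ``session_ids``, filter primary keys with it, and assert that every current session shows up. This is a sketch only; ``scope_pattern``, the sample PK values, and the id character-set check are assumptions, and the notebook applies the pattern inside BigQuery via ``REGEXP_CONTAINS`` rather than Python's ``re``.

```python
import re
from typing import List


def scope_pattern(session_ids: List[str]) -> str:
    """Build '^(id1|id2|...):' from the current session ids."""
    if not session_ids:
        raise ValueError("session_ids must not be empty")
    for sid in session_ids:
        # Assumption: ids use only characters with no regex meaning here.
        if not re.fullmatch(r"[A-Za-z0-9_-]+", sid):
            raise ValueError(f"unexpected session id: {sid!r}")
    return "^(" + "|".join(session_ids) + "):"


session_ids = ["sess-1", "sess-2", "sess-3"]
pattern = scope_pattern(session_ids)

# Hypothetical PK values: three current rows, one PK without a session
# prefix, and one leftover from an earlier run in the same dataset.
pks = [
    "sess-1:exec-a",
    "sess-2:exec-b",
    "sess-3:exec-c",
    "exec-orphan",
    "sess-9:exec-old",
]
hub_rows = [pk for pk in pks if re.search(pattern, pk)]

# Per-session assertion: every current session must appear in the rows.
assert hub_rows, "scoped hub-shape query returned no rows"
for sid in session_ids:
    assert any(pk.startswith(sid + ":") for pk in hub_rows), sid

print(pattern, len(hub_rows))
```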
Live evidence (committed executed notebook, scratch
``migration_v5_demo_388fdaa9``, 3 sessions):
* Beat 3.3 / 3.4: compile fingerprint
``b98b65134cb9925bdebef522c44c327f4eb21737a5b22c94a4aa45ba9d04c8a7``,
``cache_hit = True``.
* Beat 3.5: ``extracted nodes = 6``, ``partOfSession = 3``,
``NOTE: 1 table(s) had delete_failed`` — diagnostic
correctly enumerates streaming-buffer-pinned
``decision_execution``.
* Beat 3.7: 81,348 → 80,533 chars (~1.0% savings, 3 sessions).
* Beat 4.4: scoped ``GRAPH_TABLE count = 3`` (exactly one
per session, no stale rows leaking through); scoped
hub-shape returned **3 rows**, all session-prefixed.
Generated snapshots regenerated against canonical default
``(test-project-0728-467323, migration_v5_demo)``.
* docs(examples): PR #160 round 5 — stale comment + contract assert
Two doc-only nits:
* Cell 44 (Beat 3.5 asserts): inline comment said "the
reference extractor's envelope-side synthesis" but the
asserts (and the run) prove the *compiled* extractor
does it. Reattributed.
* Cell 56 (Beat 4.4): the scoped GQL assumes
``{session_id}:{raw_id}`` PK shape, which is true for
every demo entity *except* ``AgentSession`` (whose
``agent_session_id`` is just the session_id with no
colon). Added ``assert resolved_entity ==
"DecisionExecution"`` before the regex so a future
edit that swaps the resolved entity can't silently
produce a regex that matches nothing.
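A small sketch of why the contract assert matters: the scoped pattern requires a colon after the session prefix, so a bare ``agent_session_id`` never matches it. The PK values below are hypothetical; only the ``{session_id}:{raw_id}`` shape, the colon-free ``AgentSession`` key, and the assert on ``resolved_entity`` come from the commit message.

```python
import re

PATTERN = re.compile(r"^(sess-1|sess-2|sess-3):")

decision_execution_pk = "sess-1:exec-a"  # {session_id}:{raw_id} shape
agent_session_pk = "sess-1"              # bare session_id, no colon

resolved_entity = "DecisionExecution"
# Guard: the regex below is only meaningful for colon-delimited PKs.
assert resolved_entity == "DecisionExecution", (
    "scoped regex assumes a {session_id}:{raw_id} primary key"
)

matches_execution = bool(PATTERN.search(decision_execution_pk))
matches_session = bool(PATTERN.search(agent_session_pk))
print(matches_execution, matches_session)
```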
1 parent d542399 commit ccb9ed7
2 files changed
Lines changed: 873 additions & 476 deletions