Status: Implemented (PR C2.c.2 of issue #75 Phase C / Milestone C2)
Parent epic: issue #75
Builds on: extractor_compilation_runtime_registry.md (PR C2.c.1), extractor_compilation_runtime_fallback.md (PR C2.b), extractor_compilation_bundle_loader.md (PR C2.a)
Working plan: issue #96, Milestone C2 / PR C2.c.2
The actual call-site swap that puts compiled structured extractors on the runtime path. C2.c.1 shipped the registry adapter (build_runtime_extractor_registry); this PR wires that adapter into OntologyGraphManager so the existing run_structured_extractors call inside extract_graph picks up the compiled-with-fallback behavior automatically when the manager is built from a bundle root.
The wire-up is a new classmethod, OntologyGraphManager.from_bundles_root(...). The existing __init__ and from_ontology_binding paths are unchanged — back-compat is preserved by construction, since direct-constructor callers leave manager.runtime_registry = None.
from bigquery_agent_analytics.ontology_graph import OntologyGraphManager
manager = OntologyGraphManager.from_bundles_root(
project_id="my-project",
dataset_id="my-dataset",
ontology=ontology,
binding=binding,
bundles_root=pathlib.Path("/path/to/bundles"),
expected_fingerprint=runtime_fingerprint,
fallback_extractors={"bka_decision": extract_bka_decision_event, ...},
event_type_allowlist=("bka_decision", "tool_call"), # optional
on_outcome=my_telemetry_callback, # optional
)
# Use the manager exactly like a manager built from from_ontology_binding —
# extract_graph internally calls run_structured_extractors(manager.extractors, ...).
graph = manager.extract_graph(session_ids=..., use_ai_generate=True)
# Audit telemetry:
registry = manager.runtime_registry # WrappedRegistry (or None for direct __init__)
registry.bundles_without_fallback # event_types skipped because no fallback
registry.fallbacks_without_bundle # event_types with no usable compiled entry
registry.discovery.failures # underlying load failures for the wider auditmanager.extractors— the dict the runtime actually invokes (the same dictrun_structured_extractorsalready consumes). For event_types with both a compiled bundle and a fallback, this is a wrapper closure; for fallback-only event_types, the original callable; for compiled-only event_types, nothing (fail-closed per C2's safety contract).manager.runtime_registry— the audit handle.WrappedRegistryfrom C2.c.1 when constructed viafrom_bundles_root;Nonewhen constructed via the legacy__init__. Always non-Nonefor managers wired to a bundles root, even if discovery found zero bundles.
The split keeps the mental model intact: extractors is what runtime uses, runtime_registry is the audit object next to it.
The constructor stays simple (one construction mode = one set of inputs). Adding bundles_root / expected_fingerprint to __init__ would mix two construction modes and make future call sites harder to audit — there'd be no syntactic clue from the call site whether the manager is bundle-wired.
from_bundles_root makes the bundle-wired path explicit. The parameter shape mirrors from_ontology_binding so existing callers reach for the parallel pattern; the only differences are the bundle-specific args (bundles_root, expected_fingerprint, event_type_allowlist, on_outcome) and that extractors= becomes fallback_extractors= (since that's what the registry adapter calls them and the safety-contract role is different from "the dict the runtime uses").
A bundle discovered for event_x with no matching fallback is not registered (manager.extractors doesn't contain event_x). The audit handle records it:
manager.runtime_registry.bundles_without_fallback == ("event_x",)
"event_x" not in manager.extractorsThis matches C2.c.1's fail-closed policy — C2.b's safety contract requires a fallback, so registering a compiled-only event_type would invert the C2 guarantees. The audit field surfaces the configuration gap so operators can fix it; the runtime stays safe.
extract_graph runs structured extractors only under:
if self.extractors and use_ai_generate:
raw_events = self._fetch_raw_events(session_ids)
structured_result = run_structured_extractors(raw_events, self.extractors, self.spec)This gate pre-dates C2.c.2 — the bundle-wired path inherits it as-is. When use_ai_generate=False, the compiled extractors do not run, even if manager.extractors is fully populated from from_bundles_root. The non-AI path falls back to _extract_payloads(session_ids) and returns the stub graph that has always been there.
Whether to decouple structured extraction from the AI flag is a separate scope decision. C2.c.2 deliberately does not change this gate so that the call-site swap is a pure substitution: the same conditions that triggered the legacy extractors= path trigger the new from_bundles_root path, no more and no less. A regression test (test_extract_graph_skips_structured_when_use_ai_generate_false) pins this inherited behavior so any future decoupling shows up as a deliberate change.
TestOntologyGraphManagerDirectInit(1) — direct__init__leavesruntime_registry = None;extractorsidentity is preserved.TestFromBundlesRootNoBundles(1) —bundles_rootexists but contains no bundles →manager.extractorsidentity-preserves the fallback;runtime_registry.fallbacks_without_bundlelists the uncovered event_types.TestFromBundlesRootCompiledAndFallback(1) — hand-written bundle + matching fallback → wrapped closure registered (different identity from the original fallback); calling it drivesrun_with_fallbackand theon_outcomecallback fires withdecision="compiled_unchanged".TestFromBundlesRootCompiledOnlyNoFallback(1) — negative case — bundle forevent_xwith emptyfallback_extractors.event_xis NOT registered; surfaced inbundles_without_fallback. Behavioral check: runningrun_structured_extractorsover anevent_xevent yields an empty result (the runtime's "no extractor → skip" path).TestFromBundlesRootEndToEnd(1) — real BKA compiled bundle (driven through the full Phase C compile pipeline) + realextract_bka_decision_eventas fallback, wired into the manager, fed throughrun_structured_extractorsviamanager.extractors. Both BKA sample events producecompiled_unchangedoutcomes; bothmako_DecisionPointnodes appear in the merged result; callback log shows expected per-event traces.TestFromBundlesRootExtractGraphCallSite(3) — the production call site itself, including the merge:- Monkeypatches
_fetch_raw_events+_extract_via_ai_generateand callsmanager.extract_graph(..., use_ai_generate=True); assertson_outcomefired with the compiled path. Provesextract_graphactually invokes the wrapped registry, not justrun_structured_extractorsin isolation. - Same setup with a non-empty compiled bundle that returns one
ExtractedNodeper event; asserts the structured node propagates through the merge into the finalExtractedGraphreturned byextract_graph. Pins the compiled-output-is-merged behavior. - Pins the inherited
use_ai_generate=Falsegate: structured extractors don't run, the fetch hook isn't called,on_outcomedoesn't fire.
- Monkeypatches
- BigQuery-table bundle mirror for cross-process distribution. C2.c.3 — adds support for fetching bundles from a BigQuery table in addition to a filesystem path.
- Revalidation harness (scheduled / on-demand agreement check between compiled and reference outputs). C2.d.
AI.GENERATE-backed fallback adapter that fits theStructuredExtractorsignature so it can be passed as afallback_extractorsvalue. Orchestrator integration's concern.- Other orchestrator entry points. This PR adds
from_bundles_roottoOntologyGraphManager. Other managers / orchestrators that also callrun_structured_extractorscan adopt the same pattern when they need compiled extractors; the registry adapter and decision tree don't change.
extractor_compilation_runtime_registry.md—build_runtime_extractor_registryandWrappedRegistryshape.from_bundles_rootis a thin wrapper around this adapter.extractor_compilation_runtime_fallback.md—run_with_fallbackdecision tree and safety contract.extractor_compilation_bundle_loader.md—discover_bundlesand the trust-boundary failure codes that surface inmanager.runtime_registry.discovery.failures.extractor_compilation_runtime_target.md— the RFC that made client-side Python the Phase 1 runtime target.