| ontology/ontology.md |
Ontology core design — logical ontology spec |
| ontology/binding.md |
Binding design — attaching ontology to physical tables |
| ontology/compilation.md |
Compilation — resolving ontology + binding into backend DDL |
| ontology/cli.md |
CLI design for the gm tool (validate, compile, import-owl) |
| ontology/owl-import.md |
OWL import — converting OWL ontologies to YAML format |
| ontology/ontology-build.md |
bq-agent-sdk ontology-build orchestrator + --skip-property-graph reference |
| ontology/binding-validation.md |
bq-agent-sdk binding-validate pre-flight + ontology-build --validate-binding[-strict] reference |
| ontology/validation.md |
validate_extracted_graph(spec, graph) post-extraction validator with NODE/FIELD/EDGE-scope failure classification |
| extractor_compilation_rollout_guide.md |
Start here for compiled extractors. End-to-end rollout playbook stitching the five Phase C stages together: Compile → Publish → Sync → Wire → Revalidate. Covers when to run each stage, inputs/outputs, failure modes, trust boundaries (four gates across the pipeline: the compile-time smoke check plus three load_bundle runs at publish, sync, and runtime discovery), a worked BKA example with Python snippets + the real bqaa-revalidate-extractors shell invocation, and a failure-recovery playbook. Notes the local/co-located shortcut (Compile → Wire → Revalidate) vs the canonical distributed flow. Per-stage docs are the deep dives. |
| extractor_compilation_runtime_target.md |
Phase 1 runtime-target decision for compiled structured extractors (issue #75 P0.2): client-side Python via the existing run_structured_extractors() hook |
| extractor_compilation_scaffolding.md |
Compile-time scaffolding for compiled structured extractors (issue #75 PR 4b.1): fingerprint, manifest, AST allowlist, smoke-test runner, end-to-end compile_extractor. LLM-driven template fill is PR 4b.2; runtime loading is C2. |
| extractor_compilation_template_renderer.md |
Deterministic source generator for compiled structured extractors (issue #75 PR 4b.2.1): render_extractor_source(plan) turns a ResolvedExtractorPlan into Python source compatible with 4b.1's compile_extractor. LLM step that resolves raw rules into a plan is PR 4b.2.2. |
| extractor_compilation_plan_parser.md |
JSON-to-plan parser for compiled structured extractors (issue #75 PR 4b.2.2.a): parse_resolved_extractor_plan_json(payload) turns LLM-emitted JSON into a ResolvedExtractorPlan with structured PlanParseError codes. The deterministic boundary the LLM step in PR 4b.2.2.b will plug into. |
| extractor_compilation_plan_resolver.md |
LLM-driven plan resolver for compiled structured extractors (issue #75 PR 4b.2.2.b): build_resolution_prompt(rule, schema) produces the prompt; PlanResolver(llm_client).resolve(rule, schema) wires prompt → LLM call → parser. Adapter-free LLMClient Protocol; concrete provider adapters and retry orchestration land separately. |
| extractor_compilation_diagnostics.md |
Diagnostic builders for retry-prompt feedback (issue #75 PR 4b.2.2.c.1): build_plan_parse_diagnostic, build_ast_diagnostic, build_smoke_diagnostic, build_compile_result_diagnostic (covers invalid_identifier / invalid_event_types / load_error plus AST/smoke fall-through), plus a build_gate_diagnostic(kind, payload) dispatcher. Output is actionable, bounded (ten-entry caps; tracebacks reduced to their last line), and deterministic — ready for retry-prompt embedding in PR 4b.2.2.c.2. |
| extractor_compilation_retry_loop.md |
Retry-on-gate-failure orchestrator for compiled structured extractors (issue #75 PR 4b.2.2.c.2): compile_with_llm(rule, schema, llm_client, compile_source, max_attempts) loops resolver → renderer → compile_extractor, feeding build_compile_result_diagnostic / build_plan_parse_diagnostic / synthesized RenderError strings back to the LLM via build_retry_prompt. Returns RetryCompileResult with per-attempt AttemptRecord history (one failure channel populated each: parser / render / compile). LLM exceptions propagate unchanged. |
| extractor_compilation_bka_measurement.md |
Compile-and-measure utility + BKA-decision end-to-end proof (issue #75 PR 4c): measure_compile(...) runs compile_with_llm, loads the compiled bundle, and computes parity against a reference extractor; returns a JSON-serializable CompileMeasurement (loop outcome + per-axis parity counts + audit fields). CI path is deterministic; gated live path (BQAA_RUN_LIVE_LLM_COMPILE_TESTS=1) regenerates the checked-in measurement artifact at tests/fixtures_extractor_compilation/bka_decision_measurement_report.json. |
| extractor_compilation_bundle_loader.md |
Bundle loader + minimal runtime discovery for compiled extractors (issue #75 PR C2.a): load_bundle(bundle_dir, expected_fingerprint, expected_event_types) and discover_bundles(parent_dir, expected_fingerprint, event_type_allowlist). Stable LoadFailure codes (manifest_missing / unreadable, fingerprint_mismatch, event_types_mismatch, module_not_found, import_failed, function_not_found, function_signature_mismatch, event_type_collision); never raises through to the caller. Multi-event bundles register the same callable under each declared event_type; collisions fail closed. Out of scope: fallback wiring, BQ mirror, ontology-graph call-site swap. |
| extractor_compilation_runtime_fallback.md |
Runtime fallback wiring for compiled structured extractors (issue #75 PR C2.b): run_with_fallback(...) returning FallbackOutcome (decision is one of compiled_unchanged / compiled_filtered / fallback_for_event). Validates compiled output via #76; on per-element failures drops just the offending nodes / edges (with orphan cleanup) AND downgrades the event's span from fully_handled to partially_handled so the AI transcript still sees the source span. EVENT-scope, exception, wrong-type, and unpinpointable failures all trigger whole-event fallback. Does not validate fallback output; fallback exceptions propagate. Orchestrator call-site swap is C2.c. |
| extractor_compilation_runtime_registry.md |
Runtime extractor-registry adapter (issue #75 PR C2.c.1): build_runtime_extractor_registry(...) glues C2.a's discover_bundles + C2.b's run_with_fallback into one call, returning a WrappedRegistry with an extractors dict ready for run_structured_extractors plus bundles_without_fallback (compiled-only, skipped) and fallbacks_without_bundle (no usable compiled registry entry — "never built" and "rejected by discovery"; cross-reference discovery.failures for the reason). Compiled-only event_types are skipped and recorded (fail-closed); fallback-only event_types pass through unchanged. Non-callable fallbacks are rejected at build time with TypeError naming the event_type. The on_outcome(event_type, outcome) callback fires on every wrapped invocation (denominator metric); callback exceptions propagate. Out of scope: actual orchestrator call-site swap (C2.c.2), BQ mirror (C2.c.3), revalidation (C2.d). |
| extractor_compilation_orchestrator_swap.md |
Orchestrator call-site swap (issue #75 PR C2.c.2): OntologyGraphManager.from_bundles_root(...) classmethod that builds the runtime registry internally and constructs a manager whose extractors dict is the wrapped registry, so existing run_structured_extractors calls inside extract_graph pick up compiled-with-fallback behavior with no other code changes. Adds `manager.runtime_registry: WrappedRegistry |
| extractor_compilation_bq_bundle_mirror.md |
BigQuery-table bundle mirror (issue #75 PR C2.c.3): publish_bundles_to_bq(bundle_root, store, ...) + sync_bundles_from_bq(store, dest_dir, ...). Mirror is a publish/sync utility, NOT a runtime loader — the runtime path stays sync_bundles_from_bq → discover_bundles → from_bundles_root. Both functions call load_bundle as a gate: publish refuses bundles that wouldn't load at the runtime; sync writes to a side-by-side staging directory and load_bundle-validates the staged copy before performing a staged replace of the target (the rmtree+move pair is not strictly atomic — a crash between the two leaves the bundle absent on disk and is recoverable by re-sync — but the load-bundle-failure direction is atomic, so a bad mirror row never destroys a previously-good local bundle). Strict bundle-shape (exactly manifest.json + the manifest's module_filename) plus shape-check on the manifest's module_filename (bare filename only — no separators, no .., no NUL; otherwise manifest_row_unreadable). Path-safety rejects traversal / absolute / backslash / NUL. duplicate_fingerprint rejects publish-side cases where two subdirs claim the same fingerprint (neither published). duplicate_row rejects two rows sharing the same (fingerprint, bundle_path) at sync. malformed_row shape check. Idempotent republish via DELETE+INSERT in BigQueryBundleStore.publish_rows (NOT a single atomic transaction; a transient INSERT failure is recoverable by re-running publish). publish_rows raises ValueError on duplicate input pairs as defense in depth. BundleStore Protocol for testability; BigQueryBundleStore is the concrete impl. Stable MirrorFailure codes; per-bundle problems accumulate, store exceptions propagate. Out of scope: GCS signed URLs, caching, garbage collection, multi-region. |
| extractor_compilation_revalidate_cli.md |
bqaa-revalidate-extractors CLI (Phase C operationalization): one-shot binary that wraps revalidate_compiled_extractors. Event source flags are mutually exclusive: --events-jsonl for local JSONL files OR --events-bq-query-file for BigQuery (SQL must return one column named event_json STRING per row). --bq-project is optional with ADC fallback. Other flags: --bundles-root, --reference-extractors-module, --thresholds-json (optional), --report-out. Reference module exposes EXTRACTORS dict + RESOLVED_GRAPH (+ optional SPEC) so the CLI doesn't need ontology/binding flags. Fingerprint auto-detected from the first bundle's manifest; mixed fingerprints fail-closed. Exit codes: 0 pass / 1 threshold violation / 2 usage-or-input error. Report JSON includes both the raw RevalidationReport and the ThresholdCheckResult. Out of scope: pagination strategy for ultra-large corpora, scheduled execution, BQ persistence, auto-row-shape inference (explicit non-goal — the event_json contract keeps the CLI predictable). |
| extractor_compilation_revalidation.md |
Revalidation harness (issue #75 PR C2.d): revalidate_compiled_extractors(events, compiled_extractors, reference_extractors, resolved_graph, ...) drives run_with_fallback (with a no-op fallback) over a batch of events AND calls the reference extractor directly, aggregating outcomes into a RevalidationReport with two orthogonal dimensions: runtime decision (compiled_unchanged / compiled_filtered / fallback_for_event, plus compiled_path_faults split out so bundle bugs are distinguishable from ontology drift) and agreement against reference (parity_match / parity_divergence / parity_not_checked). Parity uses three comparators: _compare_nodes and _compare_span_handling from measurement.py plus _compare_edges in revalidation.py (same edge_id set with matching relationship_name / endpoints / property-set per shared edge; duplicate edge_ids on either side reported as a divergence rather than silently collapsed by dict keying). The parity dimension catches schema-valid but semantically wrong outputs the schema-only check would miss. Every failure mode on the reference side becomes a parity divergence, never a batch abort: exceptions, non-StructuredExtractionResult returns (including None), and comparator crashes all funnel into the divergence channel with a descriptive string. check_thresholds(report, RevalidationThresholds(...)) evaluates policy gates; threshold rates are validated to [0, 1] at construction so a typo like =5 (intended as 5%) fails loud. JSON-serializable for persistence; deterministic. Out of scope: scheduled orchestration, BQ persistence, CLI, sampling strategy. |