Commit 128bc78
docs: end-to-end rollout guide for compiled structured extractors (#152)
* docs: end-to-end rollout guide for compiled structured extractors
Phase C wrap-up of issue #75. Operational playbook stitching
the five Phase C stages — Compile, Publish, Sync, Wire,
Revalidate — into one flow. Treats Publish/Sync as the
remote-runtime path; co-located deployments can shortcut to
Compile -> Wire -> Revalidate.
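
As a rough illustration of the two paths described above, the stage selection can be sketched in Python. The `Stage` enum and the `stages_for` / `describe` helpers are hypothetical names introduced for this sketch only; they are not part of the package.

```python
from enum import Enum


class Stage(Enum):
    """The five Phase C stages, in pipeline order."""
    COMPILE = "Compile"
    PUBLISH = "Publish"
    SYNC = "Sync"
    WIRE = "Wire"
    REVALIDATE = "Revalidate"


# Publish and Sync are only needed when the runtime lives on a remote host.
REMOTE_ONLY = {Stage.PUBLISH, Stage.SYNC}


def stages_for(remote_runtime: bool) -> list[Stage]:
    """Return the ordered stages for a deployment (hypothetical helper)."""
    return [s for s in Stage if remote_runtime or s not in REMOTE_ONLY]


def describe(remote_runtime: bool) -> str:
    """Render the selected stages as an arrow-separated flow."""
    return " -> ".join(s.value for s in stages_for(remote_runtime))


print(describe(remote_runtime=True))   # Compile -> Publish -> Sync -> Wire -> Revalidate
print(describe(remote_runtime=False))  # Compile -> Wire -> Revalidate
```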
Structure:
* Overview + ASCII pipeline diagram + cadence table.
* Per-stage section: purpose, API + canonical Python snippet
  (or shell command for the real CLI in Stage 5), failure
  modes by stable code, pointer to the detailed per-PR doc.
* Worked BKA example walked through all five stages: Python
  for compile / publish / sync / wire (no CLIs exist for
  those), shell for ``bqaa-revalidate-extractors`` (real
  CLI). Snippets use the actual call signatures verified
  against the BKA live test.
* Trust-boundaries section documenting the four points where
  ``load_bundle`` runs (compile smoke gate, pre-publish,
  post-sync, runtime-startup discovery), so the trust model
  is one mental model across the pipeline.
* Failure-recovery playbook keyed on the stable failure
  codes each stage emits (``duplicate_fingerprint``,
  ``bundle_load_failed``, ``manifest_row_unreadable``,
  ``invalid_bundle_path``, ``fingerprint_not_in_table``,
  etc.) with the one-line action for each.
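
A playbook keyed on stable failure codes can be sketched as a plain lookup table. The five codes below are the ones named above; every action string is a placeholder for this sketch, not the guide's actual wording, and `action_for` is a hypothetical helper.

```python
# Failure codes named in this commit message. The action strings are
# placeholders; the real one-line actions live in the rollout guide.
PLAYBOOK = {
    "duplicate_fingerprint": "placeholder: action for duplicate_fingerprint",
    "bundle_load_failed": "placeholder: action for bundle_load_failed",
    "manifest_row_unreadable": "placeholder: action for manifest_row_unreadable",
    "invalid_bundle_path": "placeholder: action for invalid_bundle_path",
    "fingerprint_not_in_table": "placeholder: action for fingerprint_not_in_table",
}

# Fallback for codes the table does not cover (an assumption of this sketch).
FALLBACK = "placeholder: unknown code, consult the per-PR doc"


def action_for(code: str) -> str:
    """Look up the recovery action for a stable failure code."""
    return PLAYBOOK.get(code, FALLBACK)


print(action_for("bundle_load_failed"))
print(action_for("some_future_code"))
```

Keying on the code rather than on the message text means the lookup keeps working if the human-readable message changes.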
Index entry in docs/README.md positions the rollout guide as
"Start here for compiled extractors" — the per-PR docs become
deep dives once readers have the pipeline shape in their
heads.
No code changes.
* docs(rollout-guide): four-trust-gates wording + self-contained sync snippet
Addresses PR #152 round-1 reviewer findings.
P2 - Trust-boundaries section previously claimed
``load_bundle`` runs at compile time. It doesn't:
``compile_extractor`` runs ``load_callable_from_source`` +
``run_smoke_test[_in_subprocess]`` and then writes the
manifest. The actual ``load_bundle`` gate exists only at
publish, sync, and runtime discovery — three places, not
four.
Re-framed as "four trust gates" with explicit annotation
that gate 1 is the compile-time smoke check (NOT
``load_bundle`` itself — no manifest exists yet) and gates
2-4 are the real ``load_bundle`` runs. Same edit propagated
to the docs/README.md index entry and the CHANGELOG bullet
so the three places that describe the trust model use the
same words.
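
The corrected model can be written down as data, which makes the "four gates, three ``load_bundle`` runs" distinction explicit. `TrustGate` and `TRUST_GATES` are hypothetical names for this sketch; only the gate descriptions come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustGate:
    number: int
    where: str
    runs_load_bundle: bool


TRUST_GATES = (
    # Gate 1 is the smoke check only: no manifest exists yet at compile time.
    TrustGate(1, "compile-time smoke check", False),
    TrustGate(2, "pre-publish", True),
    TrustGate(3, "post-sync", True),
    TrustGate(4, "runtime-startup discovery", True),
)

load_bundle_gates = [g.number for g in TRUST_GATES if g.runs_load_bundle]
print(load_bundle_gates)  # [2, 3, 4]
```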
P3 - Stage 3's sync snippet reused ``store`` from Stage 2,
which only works in a single process. In a distributed
deployment the sync host is a different process from the
publish host. Both the standalone Stage 3 example and the
worked-BKA Stage 3 example now reconstruct
``BigQueryBundleStore`` against the same ``table_id`` (the
typical pattern: the runtime host uses its own service-
account ADC). A one-line note now explicitly calls out
that the runtime host constructs the same store handle
before calling ``sync_bundles_from_bq``.
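
The pattern behind the P3 fix, where each host builds its own handle from a shared ``table_id`` instead of passing an object between processes, can be sketched with stand-ins. `StubBundleStore`, `publish_host`, `runtime_host`, and the table id are all hypothetical; this does not reproduce the real ``BigQueryBundleStore`` or ``sync_bundles_from_bq`` signatures.

```python
from dataclasses import dataclass

# Stand-in for the shared backing table (a BigQuery table in the real flow).
_FAKE_TABLES: dict[str, dict[str, bytes]] = {}


@dataclass
class StubBundleStore:
    """Hypothetical stand-in; NOT the real BigQueryBundleStore API."""
    table_id: str

    def put(self, fingerprint: str, payload: bytes) -> None:
        _FAKE_TABLES.setdefault(self.table_id, {})[fingerprint] = payload

    def rows(self) -> dict[str, bytes]:
        return dict(_FAKE_TABLES.get(self.table_id, {}))


def publish_host(table_id: str) -> None:
    """Publish side: builds its own handle and writes one bundle."""
    store = StubBundleStore(table_id)
    store.put("fp-demo", b"bundle-bytes")


def runtime_host(table_id: str) -> dict[str, bytes]:
    """Sync side: shares nothing with publish_host except table_id."""
    store = StubBundleStore(table_id)
    return store.rows()


TABLE_ID = "demo-project.demo_dataset.bundles"  # hypothetical table id
publish_host(TABLE_ID)
print(runtime_host(TABLE_ID))  # {'fp-demo': b'bundle-bytes'}
```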
Sanity-checked: every public import used in the rollout
doc's snippets resolves cleanly
(``measure_compile``, ``compile_extractor``,
``compute_fingerprint``, ``BigQueryBundleStore``,
``publish_bundles_to_bq``, ``sync_bundles_from_bq``,
``OntologyGraphManager``, ``extract_bka_decision_event``).
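
A check of this kind can be approximated with `importlib`. The helper name `unresolved_names` is hypothetical, and the demo runs against the standard library's `json` module because the package path exporting the names above is not stated here.

```python
import importlib


def unresolved_names(module_name: str, names: list[str]) -> list[str]:
    """Return the names that are NOT attributes of the imported module."""
    module = importlib.import_module(module_name)
    return [n for n in names if not hasattr(module, n)]


# Demonstration on a stdlib module; swap in the real package and names.
print(unresolved_names("json", ["dumps", "loads", "not_a_real_name"]))
# ['not_a_real_name']
```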
No code changes.
1 parent c0d6eac commit 128bc78
3 files changed
Lines changed: 467 additions & 0 deletions