* observability: phase 6.1 PR-C.2 — proposal 0013 + v0.10.0
Implement spec graph-engine + observability v0.10.0 (proposal
0013): non-detached fan-out instances now synthesize per-instance
dispatch spans nested between the fan-out node span and the
inner-node spans (mirroring the detached path's layout, but
landing in the parent trace rather than a fresh trace_id). The
fan-out node span carries three §5.4 attributes (item_count /
concurrency / error_policy) sourced from the new
``NodeEvent.fan_out_config`` field; the fourth, parent_node_name,
lands on per-instance dispatch spans only.
Mechanism: typed ``FanOutEventConfig`` dataclass on the canonical
event payload (events.py). Engine populates eagerly at fan-out
entry in ``_step_fan_out_node`` — resolves item_count (from
``count`` mode or ``len(items_field)``) + concurrency once,
threads through every dispatch site (started, all four error-path
completed, deferred success-case completed). Pre-resolved values
flow into ``FanOutNode.run_with_context`` via new
``pre_resolved_count`` / ``pre_resolved_concurrency`` kwargs so
callable resolvers fire at most once per fan-out attempt scope
(retry middleware re-uses the outer-scope resolution).
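A minimal sketch of the resolve-once mechanism described above, not the engine code. The four field names come from this message; the field types, the ``resolve`` and ``step_fan_out`` helpers, the ``"fail_fast"`` policy value, and the tuple-shaped events are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, List, Tuple, Union


@dataclass(frozen=True)
class FanOutEventConfig:
    # Field names follow the commit text; the types are assumed.
    item_count: int
    concurrency: int | None  # None means unbounded
    error_policy: str
    parent_node_name: str


Resolver = Union[int, None, Callable[[Any], Any]]


def resolve(value: Resolver, state: Any) -> Any:
    """Hypothetical helper: call a resolver callable, pass literals through."""
    return value(state) if callable(value) else value


def step_fan_out(
    state: Any, count: Resolver, concurrency: Resolver, attempts: int
) -> List[Tuple[str, FanOutEventConfig]]:
    # Resolve once in the outer scope, before any attempt runs.
    config = FanOutEventConfig(
        item_count=resolve(count, state),
        concurrency=resolve(concurrency, state),
        error_policy="fail_fast",  # placeholder value, not from the spec
        parent_node_name="process",
    )
    events: List[Tuple[str, FanOutEventConfig]] = []
    for _ in range(attempts):
        # Every attempt re-uses the same outer-scope config object.
        events.append(("started", config))
        events.append(("completed", config))
    return events
```

With three attempts, a callable ``count`` resolver still fires once, and all six events share one config object.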
The four contract corrections from PR #34's Copilot review:
1. ``concurrency: int | None`` on the canonical payload (positive
int or None for unbounded per pipeline-utilities §9.2). The
``None → 0`` translation lives only at the OTel attribute
layer in ``_node_attrs`` per observability §5.4's bare-int
sentinel convention.
2. All four ``FanOutEventConfig`` fields structurally required
(frozen dataclass; partial resolution would raise on
construction). The implementation-defined corner from
spec-thread 03 (config resolution itself failing) doesn't
surface in current fixtures; natural error propagation
covers it.
3. Retried fan-out attempts carry ``fan_out_config``: the
resolved config is constructed in the outer scope of
``_step_fan_out_node``; retry middleware re-enters
``innermost`` and re-uses the same outer-scope reference. No
per-attempt re-resolution; partition is by node type, not
event category.
4. ``parent_node_name`` is on per-instance spans only; the
observer caches it from the fan-out node's ``started`` event
in ``_InvState.fan_out_parent_node_name`` keyed by namespace
prefix, applies at per-instance dispatch span synthesis, and
clears at the fan-out node's ``completed`` event.
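Corrections 1 and 2 can be sketched as follows. This is an illustration under assumptions: the attribute keys are bare names standing in for whatever observability §5.4 defines, a plain dict stands in for OTel span attributes, and ``node_attrs`` is a simplified stand-in for ``_node_attrs``.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class FanOutEventConfig:
    # No defaults: omitting any field raises TypeError at construction.
    item_count: int
    concurrency: int | None  # canonical payload keeps None for unbounded
    error_policy: str
    parent_node_name: str


def node_attrs(config: Optional[FanOutEventConfig]) -> Dict[str, Union[int, str]]:
    """Build fan-out node span attributes (keys are placeholders)."""
    if config is None:
        return {}
    return {
        "item_count": config.item_count,
        # The None -> 0 translation happens only here, at the attribute layer.
        "concurrency": 0 if config.concurrency is None else config.concurrency,
        "error_policy": config.error_policy,
        # parent_node_name is deliberately absent: per-instance spans only.
    }
```

Keeping ``None`` on the payload and translating at the edge means consumers of the canonical event never have to guess whether ``0`` is a sentinel.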
Observer changes (``observer.py``):
- ``_InvState`` gains ``fan_out_instance_spans`` (per-instance
dispatch span store, keyed by ``prefix + (str(fan_out_index),)``)
and ``fan_out_parent_node_name`` (cache).
- ``_handle_started`` populates the cache when fan-out node
events land.
- ``_handle_completed`` closes per-instance dispatch spans on the
fan-out node's own completion (children-before-parents
ordering) and clears the cache entry.
- ``_sync_subgraph_spans`` routes non-detached fan-out instance
namespaces through the new
``_open_fan_out_instance_dispatch_span`` helper instead of
opening a shared subgraph span at the prefix.
- ``_resolve_parent_context`` finds the per-instance dispatch
span before falling through, so inner-node events parent
under their own per-instance dispatch span (not the shared
fan-out node span).
- ``_node_attrs`` populates the three §5.4 fan-out node span
attributes from ``event.fan_out_config`` when present.
- ``_drain_inv_state`` extended to drain
``fan_out_instance_spans`` in child→parent order.
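The keying and parent lookup in the observer changes above can be sketched like this. Spans are plain strings, namespaces are assumed to be tuples of strings, and ``InvState``, ``open_instance_span`` and ``resolve_parent`` are simplified stand-ins for the real ``_InvState`` and helpers, not their actual signatures.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Namespace = Tuple[str, ...]


@dataclass
class InvState:
    """Stand-in for ``_InvState``; spans are plain strings here."""

    open_spans: Dict[Namespace, str] = field(default_factory=dict)
    fan_out_instance_spans: Dict[Namespace, str] = field(default_factory=dict)
    fan_out_parent_node_name: Dict[Namespace, str] = field(default_factory=dict)


def instance_key(prefix: Namespace, fan_out_index: int) -> Namespace:
    # Key shape from the commit text: prefix + (str(fan_out_index),)
    return prefix + (str(fan_out_index),)


def open_instance_span(state: InvState, prefix: Namespace, fan_out_index: int) -> str:
    # parent_node_name comes from the cache filled at the "started" event.
    parent_name = state.fan_out_parent_node_name[prefix]
    span = f"dispatch:{parent_name}[{fan_out_index}]"
    state.fan_out_instance_spans[instance_key(prefix, fan_out_index)] = span
    return span


def resolve_parent(state: InvState, namespace: Namespace) -> Optional[str]:
    # Walk strict prefixes from longest to shortest; a per-instance
    # dispatch span is checked before the shared open-span store.
    for end in range(len(namespace) - 1, 0, -1):
        prefix = namespace[:end]
        if prefix in state.fan_out_instance_spans:
            return state.fan_out_instance_spans[prefix]
        if prefix in state.open_spans:
            return state.open_spans[prefix]
    return None
```

Under these assumptions an inner node at ``("process", "1", "compute")`` parents under the dispatch span for instance 1, while the dispatch span itself parents under the fan-out node span.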
Pin sites bumped to v0.10.0 (three-place sync per CLAUDE.md):
``openarmature-spec`` submodule pointer to ``ff86945``,
``pyproject.toml`` ``tool.openarmature.spec_version``,
``src/openarmature/__init__.py`` ``__spec_version__``,
``tests/test_smoke.py`` drift-guard literal.
``OTelObserver.spec_version`` updates automatically via PR-A's
``_read_spec_version``.
Tests:
- ``tests/conformance/test_observability.py`` removes
``006-otel-fan-out-instance-attribution`` from
``_DEFERRED_FIXTURES`` and adds ``_run_fixture_006`` driver.
Verifies the fixture's expected tree: 1 fan-out NODE span
with item_count/concurrency/error_policy attributes; 3
per-instance dispatch spans with fan_out_index 0..2 and
parent_node_name="process"; 3 compute spans (one per
instance) parented under their own per-instance dispatch
span.
- ``tests/conformance/adapter.py``:
``_TracingFanOutNode.run_with_context`` accepts and forwards
the new pre-resolved kwargs (pyright override-compatibility).
- Existing fan-out tests (resolution-once-per-entry,
concurrency-callable-once-per-entry, instance-middleware
retry, fan-in ordering, etc.) stay green — pre-resolved-
values plumbing didn't introduce double-resolution.
399 tests pass (was 392; net +7 from new fan-out path
verification + fixture 006 going from skip to pass + harness
work). 3 skipped (was 4; only fixture 010 remains, which is
PR-C.3's scope). Pyright clean.
PR-C.3 (observer ``prepare_sync`` + fixture 010) sits
independently behind its own architectural piece; lands in
either order. Phase 6.1 closes when both merge.
* otel: address PR #26 review
- Move fan_out runtime imports from compiled.py module top into
function scope to break the textual import cycle CodeQL's
py/cyclic-import rule flagged. fan_out has a TYPE_CHECKING
back-reference to compiled (no runtime issue), but the static
analyzer doesn't see the gate. Lazy-imports inside _invoke
(FanOutNode for the isinstance check) and _step_fan_out_node
(_resolve_concurrency / _resolve_count). Type-only FanOutNode
reference moves to TYPE_CHECKING with `from __future__ import
annotations` so signatures resolve without the runtime import.
- Fix _drain_inv_state ordering: per-instance dispatch spans in
fan_out_instance_spans are children of the fan-out NODE span
(which lives in open_spans at depth 1). The previous shape
closed all of open_spans before draining
fan_out_instance_spans, ending the parent before its children
during shutdown/abandon. Now drains open_spans in two phases
(depth >= 2 first, depth == 1 last) with the per-instance drain between,
so children-before-parents holds.
- Remove dead `by_id` map from _run_fixture_006_case —
scaffolding from an earlier draft; current assertions don't
use it.
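The two-phase drain from the second bullet can be sketched as below. It assumes a span's depth equals the length of its namespace key and uses strings in place of real spans; ``drain_order`` is a hypothetical helper that returns only the order in which spans would be ended.

```python
from __future__ import annotations

from typing import Dict, List, Tuple

Namespace = Tuple[str, ...]


def drain_order(
    open_spans: Dict[Namespace, str],
    fan_out_instance_spans: Dict[Namespace, str],
) -> List[str]:
    """Return spans in the order they should be ended."""
    # Phase 1: deep open spans (depth >= 2), deepest first.
    deep = sorted((k for k in open_spans if len(k) >= 2), key=len, reverse=True)
    # Between the phases: per-instance dispatch spans, deepest first.
    instances = sorted(fan_out_instance_spans, key=len, reverse=True)
    # Phase 2: shallow open spans (depth == 1), e.g. the fan-out node span.
    shallow = [k for k in open_spans if len(k) == 1]
    return (
        [open_spans[k] for k in deep]
        + [fan_out_instance_spans[k] for k in instances]
        + [open_spans[k] for k in shallow]
    )
```

Draining all of ``open_spans`` first would end the fan-out node span before its dispatch children; splitting by depth keeps children before parents.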
* codeql: suppress py/unsafe-cyclic-import
Both compiled.py and fan_out.py reference each other's types
only via TYPE_CHECKING blocks. CodeQL's analyzer flags the
textual cycle regardless of the gate; no runtime cycle exists.
Removing either side breaks pyright's type resolution for
generics across the boundary. Pyright's reportImportCycles
already covers genuine runtime cycles at type-check time, so
dropping this CodeQL rule loses no signal.
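For readers unfamiliar with the pattern, here is a small sketch of a ``TYPE_CHECKING``-gated import combined with a function-scope lazy import. The stdlib ``fractions`` module stands in for ``fan_out``; nothing here is taken from ``compiled.py``.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Type-only import: skipped at runtime, visible to the type checker.
    from fractions import Fraction


def halve(value: Fraction) -> Fraction:
    # Annotations stay unevaluated strings thanks to the __future__ import,
    # so this signature needs no runtime import of Fraction.
    return value / 2


def is_fraction(obj: object) -> bool:
    # Lazy import in function scope: runs on first call, not at module load,
    # so no module-level import edge exists at runtime.
    from fractions import Fraction

    return isinstance(obj, Fraction)
```

A static analyzer that does not model the gate still sees the textual import, which is the situation the suppression above addresses.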
* fan-out: surface resolver failures with event pair
PR-C.2 hoisted item_count / concurrency resolution out of
``run_with_context`` to step entry so the eager
``FanOutEventConfig`` could ride on every fan-out node event.
That moved resolver failures (callable raises, ``getattr`` on
malformed state, items_field non-list) outside ``innermost``'s
``except Exception → NodeException`` block, dropping the
started/completed event pair and the NodeException wrap that
the inner path used to provide.
Re-establish the contract by wrapping the resolution block:
any failure now dispatches a started+completed pair with
``fan_out_config=None`` (we never built one) and surfaces as
NodeException. Also drops the silent 0-coercion when
``items_field`` resolves to a non-list — raises with the same
TypeError text ``_build_instance_states`` produces, so
fan_out_config never reports a misleading ``item_count=0``.
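A rough sketch of the wrapped resolution block, assuming dict-shaped events and a ``dispatch`` callback. ``NodeException`` is a local stand-in for the engine's class, and the TypeError message is a placeholder rather than the text ``_build_instance_states`` actually produces.

```python
from __future__ import annotations

from typing import Any, Callable, Dict


class NodeException(Exception):
    """Stand-in for the engine's NodeException."""

    def __init__(self, node_name: str, cause: BaseException) -> None:
        super().__init__(f"{node_name}: {cause!r}")
        self.node_name = node_name


def resolve_item_count(state: Any, items_field: str) -> int:
    items = getattr(state, items_field)  # may raise on malformed state
    if not isinstance(items, list):
        # No silent coercion to 0; the message text is a placeholder.
        raise TypeError(f"{items_field} must be a list")
    return len(items)


def step_fan_out_entry(
    node_name: str,
    state: Any,
    items_field: str,
    dispatch: Callable[[Dict[str, Any]], None],
) -> int:
    try:
        return resolve_item_count(state, items_field)
    except Exception as exc:
        # No config was built, so both events carry fan_out_config=None.
        dispatch({"phase": "started", "node": node_name, "fan_out_config": None})
        dispatch(
            {"phase": "completed", "node": node_name,
             "fan_out_config": None, "error": exc}
        )
        raise NodeException(node_name, exc) from exc
```

The success path in this sketch dispatches nothing; in the engine the normal started and completed events, carrying the resolved config, follow later.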
* tests: drop redundant Protocol assertion
The ``_: ProjectionStrategy[S, Inner] = BoomProjection()`` line was
a defensive belt-and-suspenders structural-conformance check, but
the call-site ``add_subgraph_node(projection=BoomProjection())``
exercises the same Protocol check via its parameter annotation —
pyright catches a mismatch there without needing the explicit
assignment. Drop the line and update the comment to point at the
call site.
* tests: comment LLM step sentinel + parametrize list factories
- Annotate ``step=-1`` in the synthetic LLM-event observer test
to point readers at ``OpenAIProvider._llm_event`` (openai.py:643)
where the same sentinel is minted for production LLM-provider
span events that aren't tied to graph step sequencing.
- ``_ParentState`` in the fan-out gating test now uses
``Field(default_factory=list[int])`` instead of the bare ``[]``
default, matching the parametrized factory shape used in
``test_checkpoint.py``'s ``_ParentState`` and the surrounding
pyright-strict expectation that field types are fully known.
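The parametrized-factory shape can be illustrated with stdlib ``dataclasses`` standing in for the ``Field`` call used in the test; ``ParentState`` here is a hypothetical class, not the test's ``_ParentState``.

```python
from dataclasses import dataclass, field


@dataclass
class ParentState:
    # list[int] is callable at runtime and returns a fresh empty list,
    # while giving the type checker a fully known element type.
    items: list[int] = field(default_factory=list[int])
```

Each instance gets its own list, so appending to one does not leak into another.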