* Add per-invocation observer event drain primitive
CompiledGraph.drain_events_for(invocation_id, *, timeout=5.0)
gives a terminal node a way to block until every event dispatched
for the in-flight invocation has been delivered to every attached
observer. The node then typically reads a queryable observer
accumulator's per-invocation bucket, whose state the drain has
now caught up to. Without this, the deliver loop may still hold
not-yet-delivered events at the moment the terminal node reads,
and the read silently undercounts.
The implementation builds on the existing per-invocation
_DrainCounters: drain_events_for snapshots dispatched at call
time, registers a (target, Future) pair on a new drain_wakers
list, and awaits the Future via asyncio.wait_for. After each
delivered increment, the deliver loop fulfils any waker whose
target has been reached. On timeout the waker is removed from
the list and the partial summary is returned; crucially, the
worker is NOT cancelled. This is the load-bearing difference
from the process-wide drain: drain has shutdown semantics and
cancels its workers, whereas drain_events_for is in-flight
synchronization and leaves them running so the graph keeps
serving other invocations.
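The waker mechanism described above can be sketched in isolation. This is a minimal illustration, not the project's code: `DrainCounters`, `deliver_loop`, the dict-shaped summary, and `demo` are hypothetical stand-ins, and the real `_DrainCounters` and `DrainSummary` are assumed to carry more state. It shows the snapshot at call entry, the `(target, Future)` registration, and a timeout that removes the waker while the delivery worker keeps running.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DrainCounters:
    """Hypothetical stand-in for the per-invocation _DrainCounters."""
    dispatched: int = 0
    delivered: int = 0
    drain_wakers: List[Tuple[int, asyncio.Future]] = field(default_factory=list)


async def drain_events_for(counters: DrainCounters,
                           timeout: Optional[float] = 5.0) -> dict:
    target = counters.dispatched  # snapshot taken at call entry
    if counters.delivered < target:
        fut = asyncio.get_running_loop().create_future()
        counters.drain_wakers.append((target, fut))
        try:
            await asyncio.wait_for(fut, timeout)
        except asyncio.TimeoutError:
            # Remove only our waker; the delivery worker is left untouched.
            counters.drain_wakers[:] = [
                w for w in counters.drain_wakers if w[1] is not fut
            ]
    return {
        "target": target,
        "delivered": min(counters.delivered, target),
        "timed_out": counters.delivered < target,
    }


async def deliver_loop(queue: asyncio.Queue, counters: DrainCounters,
                       log: list, delay: float) -> None:
    while True:
        event = await queue.get()
        await asyncio.sleep(delay)  # simulate a slow observer
        log.append(event)
        counters.delivered += 1
        # Fulfil every waker whose target has now been reached.
        still_waiting = []
        for target, fut in counters.drain_wakers:
            if counters.delivered >= target:
                if not fut.done():
                    fut.set_result(None)
            else:
                still_waiting.append((target, fut))
        counters.drain_wakers[:] = still_waiting


async def demo():
    counters, queue, log = DrainCounters(), asyncio.Queue(), []
    worker = asyncio.create_task(deliver_loop(queue, counters, log, delay=0.05))
    for i in range(3):
        queue.put_nowait(f"event-{i}")
        counters.dispatched += 1
    timed_out = await drain_events_for(counters, timeout=0.01)  # too short
    worker_alive = not worker.done()  # the worker survived the timeout
    completed = await drain_events_for(counters, timeout=2.0)
    worker.cancel()  # demo teardown only
    return timed_out, completed, worker_alive, list(log)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the first drain times out while the worker is still delivering, and the second drain on the same counters completes once all three events have landed, which is the behaviour the worker-not-cancelled test is meant to pin down.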
Subgraph descents share the parent's _DrainCounters, so a drain
on the outermost invocation_id covers fan-out instance events
and parallel-branches branch events for free. Ten unit tests
mirror the spec fixtures' case shapes: basic synchronization
(028), snapshot semantic / no-deadlock on the calling node's
own completed event (029), worker-not-cancelled-on-timeout
(030, the load-bearing divergence), invocation-scope isolation
(031), fan-out coverage (032), parallel-branches coverage (033),
plus zero-timeout non-blocking check, unknown id, negative + NaN
boundary rejection.
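The timeout boundary cases in the last few tests can be pictured as a small validation step. This is a sketch only: `validate_drain_timeout` is a hypothetical helper name, and the rules it encodes (`None` waits indefinitely, `0.0` is a non-blocking check, negative or NaN raises `ValueError`) are taken from the CHANGELOG entry for this change, not from the actual implementation.

```python
import math
from typing import Optional


def validate_drain_timeout(timeout: Optional[float]) -> Optional[float]:
    """Hypothetical API-boundary check for a drain timeout."""
    if timeout is None:
        return None  # wait indefinitely
    if math.isnan(timeout) or timeout < 0:
        raise ValueError(
            f"timeout must be None or a non-negative number, got {timeout!r}"
        )
    return float(timeout)  # 0.0 is allowed: a non-blocking check
```

Rejecting NaN explicitly matters because `float("nan") < 0` is `False`, so a plain sign check would let it through.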
Conformance manifest 0054 flips to implemented as of 0.12.0. The
six graph-engine fixtures (028-033) stay deferred from the
cross-capability parser until the upcoming conformance-adapter
capability spec ratifies the directive vocabulary.
Coord context: discuss-per-invocation-event-drain thread,
02-spec-accepted-as-0054 settling the five spec questions
(section, name, default timeout, summary shape, resume
interaction) and flagging the worker-cancellation divergence
explicitly.
* Tighten drain_events_for docstring + tests
Docstring fixes: the opening sentence said the drain covers events
dispatched "before this call returns", but the snapshot is taken
at call entry — events dispatched between entry and exit are not
in scope. Rewords to "as of this call's entry". The snapshot-
semantic paragraph also claimed the calling node's started AND
completed events were out of scope, but the engine dispatches
started immediately before the node body runs, so started is
already in the snapshot and the drain awaits its delivery
normally. Only completed is guaranteed out of scope.
Test fixes:
- test_drain_events_for_timeout_does_not_cancel_worker now runs
the follow-up invocation on the SAME compiled graph as the
timed-out one, and the decisive contract check is that all
NodeEvents from the originally-pending queue land in the
observer's delivery list AFTER the timed-out drain returned.
The previous version compiled a fresh graph for the follow-up,
so the test would have passed even if the original worker had
been cancelled.
- test_drain_events_for_invocation_scope_isolation now actually
performs two serial invocations (the previous version invoked
once and drained the same id twice, testing stale-id idempotency
rather than per-invocation isolation). Each drain's delivery
log entries are partitioned by invocation_id; the assertion is
strict (no cross-contamination, no entries outside either
partition).
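The strict partition check described for the isolation test might look roughly like the following. The names here (`LoggedEvent`, `partition_by_invocation`, `check_isolation`) are hypothetical and chosen for illustration; the real test works against the observer's delivery list of NodeEvents, whose shape is not shown in this commit.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class LoggedEvent(NamedTuple):
    """Hypothetical delivery-log entry; the real NodeEvent has more fields."""
    invocation_id: str
    node: str


def partition_by_invocation(log: List[LoggedEvent]) -> Dict[str, List[LoggedEvent]]:
    buckets = defaultdict(list)
    for entry in log:
        buckets[entry.invocation_id].append(entry)
    return dict(buckets)


def check_isolation(after_first: List[LoggedEvent],
                    after_second: List[LoggedEvent],
                    first_id: str, second_id: str) -> Dict[str, List[LoggedEvent]]:
    # Everything delivered by the first drain belongs to the first invocation.
    assert all(e.invocation_id == first_id for e in after_first)
    # Everything delivered between the two drains belongs to the second.
    new_entries = after_second[len(after_first):]
    assert all(e.invocation_id == second_id for e in new_entries)
    # Strict: both partitions are present and no entry falls outside them.
    buckets = partition_by_invocation(after_second)
    assert set(buckets) == {first_id, second_id}
    return buckets
```

A test would snapshot the delivery log after each drain and pass both snapshots in, so a stray event from the other invocation fails the check instead of being silently counted.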
CHANGELOG.md: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
### Added
+- **`CompiledGraph.drain_events_for(invocation_id, *, timeout=5.0) -> DrainSummary`** (proposal 0054, spec graph-engine §6 *Per-invocation drain*, v0.46.0). The architectural pair to proposal 0048's §9.4 queryable observer accumulator lifecycle: a terminal node calling `await graph.drain_events_for(state.invocation_id)` blocks until every event dispatched for that invocation has reached every attached observer, typically followed by a read against a queryable observer accumulator whose bucket the drain has now caught up to. Snapshot semantic: the drain awaits the events dispatched as of call time; new emissions after the call are out of scope. Reuses the existing `DrainSummary` shape verbatim, with no new `InvocationDrainSummary` variant. **Load-bearing divergence from `drain()`**: a per-invocation drain timeout MUST NOT cancel the delivery worker, in contrast to `drain()`'s shutdown semantics. The graph stays serving other invocations after the timeout fires; the deliver loop keeps processing the queue. Default timeout is `5.0` seconds; `None` waits indefinitely; `0.0` is a non-blocking check. Negative or `NaN` timeout raises `ValueError` at the API boundary. Unknown `invocation_id` (already drained or never started) returns an empty summary, not an error.
- **`get_invocation_metadata()` read-symmetric API** (proposal 0048, observability §3.4, spec v0.40.0). The canonical spec-idiomatic public name for the §3.4 read access pairs with `set_invocation_metadata()` on the write side: same function object as the historical `current_invocation_metadata`, exposed for callers wishing to use the symmetric `get_/set_` naming. Returns the `MappingProxyType` snapshot of the current async context's view (caller baseline + in-node augments), or the empty mapping outside any active invocation. Read-only: callers MUST NOT mutate it. Both names are now exported from `openarmature.observability`; existing `current_invocation_metadata` callers continue to work unchanged.
- **`docs/concepts/observability.md` §9 *Queryable observer pattern*** documents the convention-only observer-attached read methods that proposal 0048 §9 blesses: how to add a `get_*` read method to a custom observer (§9.1), the async-safety contract for concurrent reads under in-flight delivery (§9.2), the three-channel data-access guidance (typed State / untyped invocation metadata / queryable observer accumulator, §9.3, with a side-by-side table), and the lifecycle / explicit `drop(invocation_id)` discipline (§9.4). No new abstract surface on `Observer` per the spec; the pattern is convention-only and exists to bless the existing observer-state read shape used in production code.
- **Production observability example.** `examples/production-observability/` demonstrates the production-grade observability stack end-to-end: `OTelObserver` + `LangfuseObserver` attached to the same graph (proposal 0031), `trace_input_from_state` / `trace_output_from_state` caller hooks on the Langfuse observer (proposal 0043 §8.4.1) deriving domain dicts from State, the built-in `TimingMiddleware` recording per-node duration via an `on_complete` callback, and `invoke(metadata={...})` carrying multi-tenant identifiers (tenantId / requestId / featureFlag) that both observers pick up at once. `InMemoryLangfuseClient` + `InMemorySpanExporter` capture in-process so the demo prints what each backend would have ingested without needing real production credentials.