Skip to content

Commit e10fc07

Browse files
Add retrieval-provider embedding capability (#196)
* Add retrieval-provider embedding capability Add the embedding surface of the new retrieval-provider capability: an EmbeddingProvider protocol, the EmbeddingResponse / EmbeddingUsage / EmbeddingRuntimeConfig types, and an OpenAIEmbeddingProvider reference impl posting to an OpenAI-compatible /v1/embeddings. Typed EmbeddingEvent / EmbeddingFailedEvent join the observer event union, dispatched on every embed(); the bundled OTel and Langfuse observers safely skip them for now (the embedding span and Embedding observation are a follow-up). The provider rejects malformed responses (missing, empty, or non-numeric vectors and non-permutation indices), guards a trailing /v1 on base_url, and offers a configurable readiness probe defaulting to a universal one-input embed so it works against backends without a /v1/models catalog (e.g. TEI's OpenAI surface). Conformance fixtures 001-005 pass; unit tests cover the guard, the readiness modes, and the observer safe-handling. conformance.toml keeps 0059 not-yet; the observability rendering (074-083), the cross-provider helper cleanups, and the docs are v0.16.0 follow-ups. * Address embedding-capability review findings Fix the issues from the PR #196 re-review: - Build the embeddings request body extras-first so a caller's undeclared runtime-config extra cannot clobber the bound model or the input list. - Reject non-numeric vector values (JSON strings, bools) and bool usage counts strictly rather than coercing them, so a non-numeric vector is treated as malformed. - Dedup the observer event union: correlation and both observers now reference the single ObserverEvent alias instead of repeating the member list, so a new event widens the contract in one place. - Move a stray spec section ref out of a docstring into a comment and reword em-dash substitutes in docstrings. - Strengthen the tests: the Langfuse safe-handling test asserts no trace is created, and the conformance runner tolerates a fixture without a stores_response_in key instead of raising KeyError. - Refresh the observer dispatch docstrings to point at ObserverEvent rather than a hardcoded variant count. * Raise invalid-response on malformed model catalog From the PR #196 review: - _probe_models now raises ProviderInvalidResponse when the /v1/models body is not a JSON object or its data field is missing or not a list, mirroring _parse_response. A malformed catalog was being degraded to an empty list and misreported as a missing model (ProviderInvalidModel); that category is now reserved for a well-formed catalog that genuinely lacks the bound model. Covered by a new unit test. - Align the _classify_embedding_http_error docstring with the code: the catch-all maps every other status (not just 5xx) to unavailable, the same as the sibling classify_http_error. * Fix Observer docstring example annotation The Observer protocol docstring example annotated its log_observer with NodeEvent | MetadataAugmentationEvent, which would be a structural conformance error now that the protocol delivers the full ObserverEvent union (embedding, tool, and provider events included). Annotate the example with ObserverEvent so copied observers type-check.
1 parent 620b010 commit e10fc07

12 files changed

Lines changed: 1263 additions & 135 deletions

File tree

src/openarmature/graph/events.py

Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@
3232
if TYPE_CHECKING:
3333
from openarmature.llm.messages import ToolCall
3434
from openarmature.llm.response import Usage
35+
from openarmature.retrieval.response import EmbeddingUsage
3536

3637
# Sentinel empty metadata mapping for events constructed without a
3738
# live caller-metadata snapshot (test helpers, synthetic events).
@@ -749,6 +750,125 @@ class LlmRetryAttemptEvent:
749750
output_tool_calls: list["ToolCall"] = field(default_factory=list["ToolCall"])
750751

751752

753+
# Spec: realizes graph-engine §6 + observability §5.5.9 -- the typed
754+
# EmbeddingEvent / EmbeddingFailedEvent pair (proposal 0059,
755+
# retrieval-provider capability). Dispatched on the observer delivery
756+
# queue per EmbeddingProvider.embed() call: the success variant after the
757+
# response is parsed + validated, the failure variant alongside a raised
758+
# §7 category exception (mutually exclusive per call). Scalar
759+
# fan_out_index / branch_name only; the lineage chains arrive uniformly
760+
# across the provider events with proposal 0084 (v0.81.0). input_strings /
761+
# request_extras are payload-bearing, populated unconditionally; observer-
762+
# side privacy gates (OTel disable_provider_payload, Langfuse equivalents)
763+
# apply at rendering, symmetric with LlmCompletionEvent.
764+
@dataclass(frozen=True)
765+
class EmbeddingEvent:
766+
"""A typed embedding provider call event delivered to observers.
767+
768+
Carries identity, scoping, and outcome data for a successful
769+
``EmbeddingProvider.embed()`` call. Observer code filters by type
770+
discrimination (``isinstance(event, EmbeddingEvent)``).
771+
772+
The identity / scoping / request-side fields mirror
773+
``LlmCompletionEvent``'s convention; the outcome fields are
774+
embedding-specific:
775+
776+
- ``input_strings``: the input strings the call was made with;
777+
non-nullable, populated unconditionally (privacy gating is
778+
observer-side at rendering).
779+
- ``input_count``: ``len(input_strings)``; a convenience field.
780+
- ``dimensions``: the output vector dimensionality from the response;
781+
``None`` when the response surfaced no determinate dimensionality.
782+
- ``response_model`` / ``response_id``: the provider-returned model
783+
and response identifiers; ``None`` when the provider returned none.
784+
- ``usage``: the embedding token record; ``None`` when the call
785+
returned no usage.
786+
- ``request_params``: the embedding request parameters the caller
787+
supplied (e.g. ``dimensions``). Absence-is-meaningful: only supplied
788+
keys appear; an empty mapping when none.
789+
- ``request_extras``: the runtime-config extras pass-through bag.
790+
- ``active_prompt`` / ``active_prompt_group``: prompt-context
791+
snapshots at embed-call time; ``None`` outside a binding.
792+
- ``call_id``: a per-call disambiguator, always present, freshly
793+
minted per ``embed()`` call.
794+
"""
795+
796+
invocation_id: str
797+
correlation_id: str | None
798+
node_name: str
799+
namespace: tuple[str, ...]
800+
attempt_index: int
801+
fan_out_index: int | None
802+
branch_name: str | None
803+
provider: str
804+
model: str
805+
response_id: str | None
806+
response_model: str | None
807+
# EmbeddingUsage is a string-typed forward reference per the
808+
# TYPE_CHECKING import -- keeps the runtime import direction
809+
# graph -> retrieval off the module-load path.
810+
usage: "EmbeddingUsage | None"
811+
latency_ms: float | None
812+
input_strings: list[str]
813+
input_count: int
814+
dimensions: int | None
815+
request_params: Mapping[str, Any]
816+
request_extras: Mapping[str, Any]
817+
active_prompt: Any
818+
active_prompt_group: Any
819+
call_id: str
820+
caller_invocation_metadata: Mapping[str, AttributeValue] | None = None
821+
822+
823+
# Spec: the failure sibling of EmbeddingEvent (proposal 0059). Dispatched
824+
# whenever EmbeddingProvider.embed() raises a §7 category exception --
825+
# covers both the provider-caught path and the pre-send validation raise
826+
# (provider_invalid_request on an empty input list). Dispatched ALONGSIDE
827+
# the exception, not in place of it; mutually exclusive with EmbeddingEvent
828+
# on the same call. The response-side fields are absent (no response).
829+
@dataclass(frozen=True)
830+
class EmbeddingFailedEvent:
831+
"""A typed embedding provider call failure event delivered to observers.
832+
833+
Carries identity, scoping, and failure-context data for an ``embed()``
834+
call that raised a retrieval-provider category exception. Observer code
835+
filters by type discrimination
836+
(``isinstance(event, EmbeddingFailedEvent)``).
837+
838+
The identity / scoping / request-side field set mirrors
839+
``EmbeddingEvent``; the response-side fields are absent. Failure-
840+
specific fields:
841+
842+
- ``error_category``: the error category the call raised (one of the
843+
embedding-applicable provider categories). Always present.
844+
- ``error_type``: an optional impl-level / vendor-specific type or
845+
code; ``None`` when unavailable.
846+
- ``error_message``: a human-readable message; always present (the
847+
empty string when the exception carried no message).
848+
"""
849+
850+
invocation_id: str
851+
correlation_id: str | None
852+
node_name: str
853+
namespace: tuple[str, ...]
854+
attempt_index: int
855+
fan_out_index: int | None
856+
branch_name: str | None
857+
provider: str
858+
model: str
859+
latency_ms: float | None
860+
input_strings: list[str]
861+
request_params: Mapping[str, Any]
862+
request_extras: Mapping[str, Any]
863+
active_prompt: Any
864+
active_prompt_group: Any
865+
call_id: str
866+
error_category: str
867+
error_message: str
868+
error_type: str | None = None
869+
caller_invocation_metadata: Mapping[str, AttributeValue] | None = None
870+
871+
752872
# Spec: realizes pipeline-utilities §6.3 failure-isolation middleware
753873
# (proposal 0050). Emitted by FailureIsolationMiddleware when it
754874
# catches an exception escaping the inner chain and substitutes a
@@ -905,6 +1025,8 @@ class ToolCallFailedEvent:
9051025

9061026

9071027
__all__ = [
1028+
"EmbeddingEvent",
1029+
"EmbeddingFailedEvent",
9081030
"FailureIsolatedEvent",
9091031
"FanOutEventConfig",
9101032
"InvocationCompletedEvent",

src/openarmature/graph/observer.py

Lines changed: 12 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,8 @@
3535
from typing import Any, Literal, Protocol
3636

3737
from .events import (
38+
EmbeddingEvent,
39+
EmbeddingFailedEvent,
3840
FailureIsolatedEvent,
3941
InvocationCompletedEvent,
4042
InvocationStartedEvent,
@@ -63,7 +65,9 @@
6365
# retry to drive the per-attempt OTel span surface),
6466
# and FailureIsolatedEvent (proposal 0050 §6.3 framework-emitted event,
6567
# dispatched by FailureIsolationMiddleware when it catches an exception
66-
# escaping the inner chain and substitutes a degraded partial update).
68+
# escaping the inner chain and substitutes a degraded partial update);
69+
# and EmbeddingEvent / EmbeddingFailedEvent (proposal 0059 typed embedding
70+
# provider call events, dispatched on every EmbeddingProvider.embed()).
6771
ObserverEvent = (
6872
NodeEvent
6973
| MetadataAugmentationEvent
@@ -75,6 +79,8 @@
7579
| FailureIsolatedEvent
7680
| ToolCallEvent
7781
| ToolCallFailedEvent
82+
| EmbeddingEvent
83+
| EmbeddingFailedEvent
7884
)
7985

8086

@@ -85,7 +91,7 @@ class Observer(Protocol):
8591
signature qualifies, no subclass required. Plain functions, bound
8692
methods, and class instances with `__call__` all work::
8793
88-
async def log_observer(event: NodeEvent | MetadataAugmentationEvent) -> None:
94+
async def log_observer(event: ObserverEvent) -> None:
8995
if isinstance(event, NodeEvent):
9096
print(event.node_name, event.phase)
9197
@@ -104,9 +110,9 @@ async def log_observer(event: NodeEvent | MetadataAugmentationEvent) -> None:
104110
conformance doesn't pin you to that name; any of `event`, `_event`,
105111
`e`, etc. matches.
106112
107-
Seven event variants reach observers. The signature is the union;
108-
observers ``isinstance``-narrow on the first line and choose which
109-
variants they handle.
113+
The variants reaching observers are the :data:`ObserverEvent` members.
114+
The signature is that union; observers ``isinstance``-narrow on the
115+
first line and choose which variants they handle.
110116
111117
- :class:`NodeEvent` — the started/completed/checkpoint phase
112118
events. Subject to the ``phases`` filter on
@@ -778,7 +784,7 @@ def _dispatch(
778784
) -> None:
779785
"""Enqueue an event for the delivery worker.
780786
781-
Handles four event variants:
787+
Handles the :data:`ObserverEvent` variants. The principal ones:
782788
783789
- :class:`NodeEvent`: the started/completed/checkpoint pair model.
784790
For ``"started"``-phase events, also calls any subscribed

src/openarmature/observability/correlation.py

Lines changed: 15 additions & 105 deletions
Original file line numberDiff line numberDiff line change
@@ -36,19 +36,13 @@
3636
from typing import TYPE_CHECKING
3737

3838
if TYPE_CHECKING:
39-
from openarmature.graph.events import (
40-
FailureIsolatedEvent,
41-
InvocationCompletedEvent,
42-
InvocationStartedEvent,
43-
LlmCompletionEvent,
44-
LlmFailedEvent,
45-
LlmRetryAttemptEvent,
46-
MetadataAugmentationEvent,
47-
NodeEvent,
48-
ToolCallEvent,
49-
ToolCallFailedEvent,
50-
)
51-
from openarmature.graph.observer import SubscribedObserver
39+
from openarmature.graph.observer import ObserverEvent, SubscribedObserver
40+
41+
# The dispatch callable accepts the full ObserverEvent union (the single
42+
# source of truth, defined in graph.observer): a new capability event
43+
# widens the contract in one place rather than across every dispatch
44+
# accessor here.
45+
_DispatchFn = Callable[[ObserverEvent], None]
5246

5347

5448
# ---------------------------------------------------------------------------
@@ -221,44 +215,12 @@ def _reset_active_observers(token: Token[tuple[SubscribedObserver, ...]]) -> Non
221215
# ---------------------------------------------------------------------------
222216

223217

224-
_active_dispatch_var: ContextVar[
225-
Callable[
226-
[
227-
NodeEvent
228-
| MetadataAugmentationEvent
229-
| InvocationStartedEvent
230-
| InvocationCompletedEvent
231-
| LlmCompletionEvent
232-
| LlmFailedEvent
233-
| LlmRetryAttemptEvent
234-
| FailureIsolatedEvent
235-
| ToolCallEvent
236-
| ToolCallFailedEvent
237-
],
238-
None,
239-
]
240-
| None
241-
] = ContextVar("openarmature.active_dispatch", default=None)
242-
243-
244-
def current_dispatch() -> (
245-
Callable[
246-
[
247-
NodeEvent
248-
| MetadataAugmentationEvent
249-
| InvocationStartedEvent
250-
| InvocationCompletedEvent
251-
| LlmCompletionEvent
252-
| LlmFailedEvent
253-
| LlmRetryAttemptEvent
254-
| FailureIsolatedEvent
255-
| ToolCallEvent
256-
| ToolCallFailedEvent
257-
],
258-
None,
259-
]
260-
| None
261-
):
218+
_active_dispatch_var: ContextVar[_DispatchFn | None] = ContextVar(
219+
"openarmature.active_dispatch", default=None
220+
)
221+
222+
223+
def current_dispatch() -> _DispatchFn | None:
262224
"""Return the engine's dispatch callable for the current invocation,
263225
or ``None`` outside any invocation.
264226
@@ -272,65 +234,13 @@ def current_dispatch() -> (
272234
return _active_dispatch_var.get()
273235

274236

275-
def _set_active_dispatch(
276-
dispatch: Callable[
277-
[
278-
NodeEvent
279-
| MetadataAugmentationEvent
280-
| InvocationStartedEvent
281-
| InvocationCompletedEvent
282-
| LlmCompletionEvent
283-
| LlmFailedEvent
284-
| LlmRetryAttemptEvent
285-
| FailureIsolatedEvent
286-
| ToolCallEvent
287-
| ToolCallFailedEvent
288-
],
289-
None,
290-
],
291-
) -> Token[
292-
Callable[
293-
[
294-
NodeEvent
295-
| MetadataAugmentationEvent
296-
| InvocationStartedEvent
297-
| InvocationCompletedEvent
298-
| LlmCompletionEvent
299-
| LlmFailedEvent
300-
| LlmRetryAttemptEvent
301-
| FailureIsolatedEvent
302-
| ToolCallEvent
303-
| ToolCallFailedEvent
304-
],
305-
None,
306-
]
307-
| None
308-
]:
237+
def _set_active_dispatch(dispatch: _DispatchFn) -> Token[_DispatchFn | None]:
309238
"""Set the engine's dispatch callable in scope. Internal —
310239
engine-only."""
311240
return _active_dispatch_var.set(dispatch)
312241

313242

314-
def _reset_active_dispatch(
315-
token: Token[
316-
Callable[
317-
[
318-
NodeEvent
319-
| MetadataAugmentationEvent
320-
| InvocationStartedEvent
321-
| InvocationCompletedEvent
322-
| LlmCompletionEvent
323-
| LlmFailedEvent
324-
| LlmRetryAttemptEvent
325-
| FailureIsolatedEvent
326-
| ToolCallEvent
327-
| ToolCallFailedEvent
328-
],
329-
None,
330-
]
331-
| None
332-
],
333-
) -> None:
243+
def _reset_active_dispatch(token: Token[_DispatchFn | None]) -> None:
334244
_active_dispatch_var.reset(token)
335245

336246

src/openarmature/observability/langfuse/observer.py

Lines changed: 10 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,8 @@
3030
from typing import Any, cast
3131

3232
from openarmature.graph.events import (
33+
EmbeddingEvent,
34+
EmbeddingFailedEvent,
3335
FailureIsolatedEvent,
3436
InvocationCompletedEvent,
3537
InvocationStartedEvent,
@@ -41,6 +43,7 @@
4143
ToolCallEvent,
4244
ToolCallFailedEvent,
4345
)
46+
from openarmature.graph.observer import ObserverEvent
4447
from openarmature.observability.lineage import is_strict_prefix
4548

4649
from .client import (
@@ -432,18 +435,7 @@ def __post_init__(self) -> None:
432435

433436
async def __call__(
434437
self,
435-
event: (
436-
NodeEvent
437-
| MetadataAugmentationEvent
438-
| InvocationStartedEvent
439-
| InvocationCompletedEvent
440-
| LlmCompletionEvent
441-
| LlmFailedEvent
442-
| LlmRetryAttemptEvent
443-
| FailureIsolatedEvent
444-
| ToolCallEvent
445-
| ToolCallFailedEvent
446-
),
438+
event: ObserverEvent,
447439
) -> None:
448440
if isinstance(event, InvocationStartedEvent):
449441
self._handle_invocation_started(event)
@@ -484,6 +476,12 @@ async def __call__(
484476
if isinstance(event, MetadataAugmentationEvent):
485477
self._handle_metadata_augmentation(event)
486478
return
479+
# Proposal 0059 embedding events: the bundled Langfuse Embedding
480+
# observation is a follow-up. Until it lands the events are safely
481+
# ignored here rather than falling through to the NodeEvent phase
482+
# dispatch (which would AttributeError on the absent ``phase``).
483+
if isinstance(event, EmbeddingEvent | EmbeddingFailedEvent):
484+
return
487485
if event.phase == "started":
488486
self._open_started_observation(event)
489487
elif event.phase == "completed":

0 commit comments

Comments
 (0)