10 changes: 8 additions & 2 deletions CHANGELOG.md
@@ -6,10 +6,16 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The

## [Unreleased]

### Added

- **`LlmFailedEvent` typed event variant** (proposal 0058, spec v0.53.0). Carves LLM provider failures into a spec-normatively-typed event variant alongside `LlmCompletionEvent`. 17 mirrored identity / scoping / request-side fields + 3 failure-specific fields (`error_category` always-present from the llm-provider §7 normative category enumeration; optional `error_type` for vendor-specific detail or upstream exception class name; always-present `error_message`). `OpenAIProvider.complete()` emits the typed event alongside the §7 exception on both raise paths — adapter-caught provider exceptions AND pre-send validation raises. Caller-side exception flow unchanged; the exception still raises out of `complete()`. Mutually exclusive with `LlmCompletionEvent` on the same call. Both bundled observers (OTel + Langfuse) consume `LlmFailedEvent` directly: same `openarmature.llm.complete` span / Generation shape as the success path with ERROR status / level + `openarmature.error.category` attribute (OTel) / `error_category` as statusMessage (Langfuse), `start_time` back-dated by `latency_ms` so the failure duration reflects the time-to-raise.

### Changed

- **OTel and Langfuse observers drive the `openarmature.llm.complete` span / Generation observation lifecycle from the typed `LlmCompletionEvent`** (proposal 0049 + 0057, observability §5.5.7). Successful LLM-provider calls now open + close the OTel span and the Langfuse Generation in one shot at typed-event arrival, with `start_time` back-dated by `LlmCompletionEvent.latency_ms` so duration reflects the adapter-boundary measurement rather than dispatcher queue delay. Failure paths continue to fire from the sentinel `NodeEvent` (the typed event is success-only per the proposal). The §5.5 attribute set and §8.4 Generation metadata are unchanged.
- **`OpenAIProvider.complete()` no longer emits the sentinel `NodeEvent` pair on the success path** (v0.13.0 cleanup). The bundled OTel and Langfuse observers now consume the typed `LlmCompletionEvent` directly; the sentinel pair was kept on the success path through earlier releases for compatibility with pre-typed-event observers. External custom observers that filtered LLM calls by `event.namespace == LLM_NAMESPACE` MUST migrate to `isinstance(event, LlmCompletionEvent)` to continue seeing successful LLM calls. The sentinel `completed` event still fires on the failure path until the spec extends `LlmCompletionEvent` with error semantics; the sentinel `started` event is no longer emitted on either path.
- **Sentinel-namespace `NodeEvent` emission for LLM events retired entirely from `OpenAIProvider`** (proposal 0058 cleanup). The provider no longer dispatches the `("openarmature.llm.complete",)`-namespaced `NodeEvent`s on either outcome path; both success and failure flow through their respective typed variants exclusively. The `_make_llm_event` helper is removed. External custom observers that filtered LLM calls by `event.namespace == LLM_NAMESPACE` MUST migrate to `isinstance(event, LlmCompletionEvent)` for success and `isinstance(event, LlmFailedEvent)` for failure to keep receiving LLM-call notifications. `LlmEventPayload` and `LLM_NAMESPACE` remain in `openarmature.observability.llm_event` as a documented compatibility surface for custom providers that haven't migrated; neither is referenced by the bundled provider or observers anymore.
- **Pinned spec advances from v0.51.0 to v0.53.0** (absorbs proposals 0023 + 0058). Proposal 0023 (canonical state reducers) ships in spec v0.52.0 but is not implemented this cycle — `conformance.toml` marks 0023 as `not-yet`; fixtures 034–038 stay parser-deferred.
- **OTel and Langfuse observers drive the `openarmature.llm.complete` span / Generation observation lifecycle from the typed `LlmCompletionEvent`** (proposal 0049 + 0057, observability §5.5.7). Successful LLM-provider calls now open + close the OTel span and the Langfuse Generation in one shot at typed-event arrival, with `start_time` back-dated by `LlmCompletionEvent.latency_ms` so duration reflects the adapter-boundary measurement rather than dispatcher queue delay. The §5.5 attribute set and §8.4 Generation metadata are unchanged. (Failure paths land on `LlmFailedEvent` later in the same cycle — see the proposal 0058 entry above.)
- **`OpenAIProvider.complete()` no longer emits the sentinel `NodeEvent` pair on the success path** (v0.13.0 cleanup). The bundled OTel and Langfuse observers now consume the typed `LlmCompletionEvent` directly; the sentinel pair was kept on the success path through earlier releases for compatibility with pre-typed-event observers. External custom observers that filtered LLM calls by `event.namespace == LLM_NAMESPACE` MUST migrate to `isinstance(event, LlmCompletionEvent)` to continue seeing successful LLM calls. (The failure-path sentinel emission is retired entirely later in the same cycle — see the proposal 0058 entry above.)
- **`LangfuseClient` Protocol gains optional `start_time` / `end_time` timestamps** on `generation(...)` and the Generation/Span handles' `end(...)`. The Langfuse observer passes back-dated timestamps on the typed-event success path so the Langfuse UI shows the actual adapter-boundary duration. The SDK adapter handles v4 Langfuse SDK quirks transparently: `Langfuse.start_observation()` does NOT accept `start_time`, so back-dated generations are routed through the private `_otel_tracer.start_span(name=..., start_time=int_ns)` API (mirroring the SDK's own `create_event` precedent) and the resulting OTel span is wrapped in `LangfuseGeneration` directly; the non-back-dated path still uses `start_observation`. `LangfuseSpan.end()` is typed `Optional[int]` (nanoseconds), so the adapter converts the Protocol's `datetime` surface to int nanoseconds before forwarding. The `InMemoryLangfuseClient` stores both fields verbatim on `LangfuseObservation` for test assertions.
- **`OpenAIProvider(populate_caller_metadata=...)` default flipped from `False` to `True`.** The python implementation now populates `LlmCompletionEvent.caller_invocation_metadata` by default so the bundled OTel and Langfuse observers can emit the §5.6 `openarmature.user.<key>` span-attribute family without a separate opt-in. Pass `populate_caller_metadata=False` to suppress the snapshot when no downstream consumer needs it. The spec-defined opt-in mechanism is unchanged; only the python default flips.
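
The observer migration and the `latency_ms` back-dating described in the entries above can be sketched as follows. This is a minimal illustration, not the bundled observers' code: the two dataclasses are hypothetical stand-ins carrying only three of the fields named in this changelog, real code would import `LlmCompletionEvent` and `LlmFailedEvent` from `openarmature.graph`, and the helper names `classify_llm_event` and `backdated_start` as well as the `"provider_unavailable"` category value are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


# Hypothetical stand-ins for the real event classes; in real code, import
# LlmCompletionEvent and LlmFailedEvent from openarmature.graph instead.
@dataclass(frozen=True)
class LlmCompletionEvent:
    latency_ms: float


@dataclass(frozen=True)
class LlmFailedEvent:
    latency_ms: float
    error_category: str
    error_message: str


def classify_llm_event(event: object) -> Optional[str]:
    """Replace the old `event.namespace == LLM_NAMESPACE` filter with isinstance checks."""
    if isinstance(event, LlmCompletionEvent):
        return "success"
    if isinstance(event, LlmFailedEvent):
        return f"failure:{event.error_category}"
    return None  # not an LLM event; ignore it


def backdated_start(arrival: datetime, latency_ms: float) -> datetime:
    """Back-date the start so the duration equals the adapter-boundary latency."""
    return arrival - timedelta(milliseconds=latency_ms)


arrival = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
failed = LlmFailedEvent(
    latency_ms=250.0,
    error_category="provider_unavailable",  # illustrative category value
    error_message="upstream returned 503",
)
print(classify_llm_event(failed))
print(backdated_start(arrival, failed.latency_ms).isoformat())
```

Because the two typed events are mutually exclusive on the same call, an observer written this way sees exactly one of the two branches per LLM call.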

28 changes: 27 additions & 1 deletion conformance.toml
@@ -32,7 +32,7 @@

[manifest]
implementation = "openarmature-python"
spec_pin = "v0.51.0"
spec_pin = "v0.53.0"

# Status values:
# implemented — shipped behavior matches the proposal's contract
@@ -217,6 +217,15 @@ status = "not-yet"
[proposals."0022"]
status = "not-yet"

# Spec v0.52.0 (proposal 0023). Canonical state reducers — three
# new factory-style reducers (``bounded_append``, ``dedupe_append``,
# ``merge_by_key``) extending the graph-engine §2 baseline set.
# Python has not yet shipped the new reducers; v0.13.0 leaves the
# capability not-yet-implemented. Conformance fixtures 034-038
# stay parser-deferred until the implementation lands.
[proposals."0023"]
status = "not-yet"

[proposals."0042"]
status = "implemented"
since = "0.11.0"
@@ -509,3 +518,20 @@ status = "not-yet"
[proposals."0057"]
status = "implemented"
since = "0.13.0"

# Spec v0.53.0 (proposal 0058). Typed LLM failure event — second
# spec-normatively-typed event variant on the observer event union
# alongside LlmCompletionEvent. Field set mirrors LlmCompletionEvent's
# identity / scoping / request-side surface (17 fields) plus three
# failure-specific fields (error_category from the §7 normative
# category enumeration, optional vendor-specific error_type, always-
# present error_message). Dispatched alongside the §7 exception on
# the observer queue — caller-side exception flow unchanged.
# Mutually exclusive with LlmCompletionEvent on the same call. Python
# lands the typed variant + provider emission + OTel/Langfuse
# consumer migration in v0.13.0; same PR also drops sentinel-namespace
# NodeEvent emission for LLM events entirely from the bundled
# OpenAIProvider.
[proposals."0058"]
status = "implemented"
since = "0.13.0"
2 changes: 1 addition & 1 deletion openarmature-spec
Submodule openarmature-spec updated 33 files
+27 −0 CHANGELOG.md
+18 −15 README.md
+4 −0 docs/index.md
+113 −0 docs/javascripts/paginate-tables.js
+5 −4 docs/proposals.md
+1 −0 docs/proposals/0058-typed-llm-failure-event.md
+61 −0 docs/stylesheets/extra.css
+7 −1 mkdocs.yml
+30 −42 proposals/0023-canonical-state-reducers.md
+333 −0 proposals/0058-typed-llm-failure-event.md
+10 −3 spec/conformance-adapter/spec.md
+39 −0 spec/graph-engine/conformance/034-reducer-bounded-append.md
+106 −0 spec/graph-engine/conformance/034-reducer-bounded-append.yaml
+35 −0 spec/graph-engine/conformance/035-reducer-dedupe-append.md
+61 −0 spec/graph-engine/conformance/035-reducer-dedupe-append.yaml
+43 −0 spec/graph-engine/conformance/036-reducer-merge-by-key.md
+138 −0 spec/graph-engine/conformance/036-reducer-merge-by-key.yaml
+42 −0 spec/graph-engine/conformance/037-reducer-configuration-invalid-max-len.md
+53 −0 spec/graph-engine/conformance/037-reducer-configuration-invalid-max-len.yaml
+33 −0 spec/graph-engine/conformance/038-reducer-error-non-list-update.md
+31 −0 spec/graph-engine/conformance/038-reducer-error-non-list-update.yaml
+129 −5 spec/graph-engine/spec.md
+36 −0 spec/observability/conformance/069-llm-failure-event-dispatch-on-provider-unavailable.md
+62 −0 spec/observability/conformance/069-llm-failure-event-dispatch-on-provider-unavailable.yaml
+33 −0 spec/observability/conformance/070-llm-failure-event-dispatch-on-provider-invalid-request.md
+56 −0 spec/observability/conformance/070-llm-failure-event-dispatch-on-provider-invalid-request.yaml
+40 −0 spec/observability/conformance/071-llm-failure-event-call-id-distinct-from-completion-event.md
+72 −0 spec/observability/conformance/071-llm-failure-event-call-id-distinct-from-completion-event.yaml
+31 −0 spec/observability/conformance/072-llm-failure-event-mutual-exclusion-with-completion-event.md
+46 −0 spec/observability/conformance/072-llm-failure-event-mutual-exclusion-with-completion-event.yaml
+39 −0 spec/observability/conformance/073-llm-failure-event-error-type-vendor-specific.md
+130 −0 spec/observability/conformance/073-llm-failure-event-error-type-vendor-specific.yaml
+20 −0 spec/observability/spec.md
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -63,7 +63,7 @@ Specification = "https://github.com/LunarCommand/openarmature-spec"
openarmature = "openarmature.cli:main"

[tool.openarmature]
spec_version = "0.51.0"
spec_version = "0.53.0"

[dependency-groups]
dev = [
6 changes: 5 additions & 1 deletion scripts/build_agents_md.py
@@ -204,7 +204,11 @@ def _capability_summaries(spec_tag: str) -> str:
(
f"_Sourced from openarmature-spec {spec_tag}. Each entry below "
+ "reproduces §1 (Purpose) and §2 (Concepts) of the capability's "
+ "`spec.md`. For the full spec text (execution model, error semantics, "
+ "`spec.md` verbatim — including additions from accepted proposals "
+ "that this Python implementation may not yet ship. For per-proposal "
+ "implementation status (implemented / partial / textual-only / "
+ "not-yet), see the `conformance.toml` manifest at the repo root. "
+ "For the full spec text (execution model, error semantics, "
+ "determinism, observer hooks, etc.) see the linked docs site._"
),
]
74 changes: 67 additions & 7 deletions src/openarmature/AGENTS.md
@@ -1,6 +1,6 @@
# OpenArmature — Agent documentation

*This is the agent guide bundled with the openarmature Python package, version 0.12.0 (spec v0.51.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
*This is the agent guide bundled with the openarmature Python package, version 0.12.0 (spec v0.53.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*

## TL;DR

@@ -10,7 +10,7 @@ OpenArmature is a workflow framework for LLM pipelines and tool-calling agents:

## Capability contracts

_Sourced from openarmature-spec v0.51.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md`. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._
_Sourced from openarmature-spec v0.53.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._

### Capability: `graph-engine`

@@ -46,11 +46,15 @@ engine constant, not a reserved node name, so a user node may happen to be named

**Reducer.** A function that merges a node's partial update into the prior state for a given field. Each state
field has exactly one reducer. The default reducer is _last-write-wins_ (the new value replaces the old).
Implementations MUST provide at least: `last_write_wins`, `append` (for list-typed fields), `merge`
(for mapping-typed fields), `concat_flatten` (for list-typed fields whose updates are lists of lists —
e.g., fan-out target fields collecting list-emitting per-instance values), and `merge_all` (for
mapping-typed fields whose updates are lists of mappings — e.g., fan-out target fields collecting
dict-emitting per-instance values). Users MAY register custom reducers per field.
Implementations MUST provide at least the following eight canonical reducers: `last_write_wins`, `append`
(for list-typed fields), `merge` (for mapping-typed fields), `concat_flatten` (for list-typed fields whose
updates are lists of lists — e.g., fan-out target fields collecting list-emitting per-instance values),
`merge_all` (for mapping-typed fields whose updates are lists of mappings — e.g., fan-out target fields
collecting dict-emitting per-instance values), `bounded_append(max_len)` (factory; `append` capped at
`max_len` entries with front-drop on overflow), `dedupe_append(key=None)` (factory; `append` skipping
items whose key already appears in the existing list), and `merge_by_key(key)` (factory; list-of-records
keyed merge — entries with a key matching an existing entry replace the existing entry in place; entries
with novel keys are appended). Users MAY register custom reducers per field.

**`concat_flatten` semantics.** `concat_flatten(prior, update)` returns the concatenation of `prior` with the
one-level flattening of `update`. Both `prior` and `update` MUST be lists, and every element of `update` MUST
@@ -72,6 +76,57 @@ inside `update` contribute zero keys. Implementations MUST NOT auto-detect wheth
mappings vs. a single mapping — `merge_all` is strictly the list-of-mappings reducer; callers needing both
behaviors on the same field MUST register a custom reducer rather than rely on shape-dependent behavior.

**`bounded_append(max_len)` semantics.** A factory returning a reducer that extends a list with the update's
items and truncates from the front (oldest entries dropped first) if the post-merge length exceeds `max_len`.
`max_len` MUST be a positive integer (≥ 1); a factory call with `max_len ≤ 0` raises
`reducer_configuration_invalid` at field registration time. Behavior: concatenate prior + update, then if
the concatenated list's length exceeds `max_len`, drop entries from the front until the length equals
`max_len`. The bound applies to the post-merge length, not to the update's individual size — an update
larger than `max_len` keeps only the last `max_len` items of the update and the prior list is fully evicted. Both `prior` and `update` MUST be lists;
violations raise `ReducerError` per §4. Empty `update` is a no-op (returns `prior` unchanged) — the bound
applies to merge-time transformations, not as a prior-validation pass; `prior` is returned as-is even if
it somehow already exceeds `max_len` (matching the established `concat_flatten` / `merge_all` empty-update
pattern). Truncation MUST be from the front (oldest-first eviction) for cross-impl consistency; back-drop
is recoverable via a
custom reducer if needed. `bounded_append` is for cases where silent drop of evicted data is acceptable
(recent-events buffers, debug log windows, sliding metric caches); for cases where dropped data must be
summarized or transformed first (the canonical chat-history-with-LLM-summarization shape), use unbounded
`append` plus a separate compaction node or middleware — reducers are pure synchronous functions per the
contract above and cannot perform the IO that real compaction requires.
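
The `bounded_append` rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation this package will ship: `ReducerError` is redefined locally as a stand-in, and `ReducerConfigurationInvalid` is a hypothetical class name for the `reducer_configuration_invalid` category.

```python
from typing import Any, Callable, List


class ReducerError(Exception):
    """Local stand-in for the engine's merge-time reducer error."""


class ReducerConfigurationInvalid(Exception):
    """Hypothetical class name for the `reducer_configuration_invalid` category."""


Reducer = Callable[[List[Any], List[Any]], List[Any]]


def bounded_append(max_len: int) -> Reducer:
    # Validate at factory-call time, before any merge runs.
    if not isinstance(max_len, int) or max_len < 1:
        raise ReducerConfigurationInvalid(
            f"max_len must be an integer >= 1, got {max_len!r}"
        )

    def reducer(prior: List[Any], update: List[Any]) -> List[Any]:
        if not isinstance(prior, list) or not isinstance(update, list):
            raise ReducerError("bounded_append requires list-typed prior and update")
        if not update:
            # Empty update is a no-op: prior comes back as-is, even if oversized.
            return prior
        merged = prior + update
        # Keep the newest max_len entries (front-drop); the slice returns the
        # whole list when the merged length is already within the bound.
        return merged[-max_len:]

    return reducer


recent = bounded_append(3)
print(recent([1, 2], [3, 4]))        # [2, 3, 4]
print(recent([1, 2], [3, 4, 5, 6]))  # [4, 5, 6]
```

The second call shows the oversized-update case: the prior list is fully evicted and only the last `max_len` update items survive.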

**`dedupe_append(key=None)` semantics.** A factory returning a reducer that extends a list with items from
the update that are not already present (by key) in the existing list. The `key` parameter is an optional
callable mapping an item to its dedup key; if omitted, the item itself is used as the key (requires hashable
items). Behavior: initialize a seen-keys set from `prior` (preserving `prior` unchanged in the result),
iterate `update` in order, and for each item compute its key — if the key is NOT yet in seen-keys, append
the item to the result and record its key; otherwise skip. Existing items appear before update items;
within each, original order is maintained. Duplicates within the update itself are filtered alongside
matches against `prior` — first occurrence wins (preserves left-to-right precedence consistent with
`append`). The computed key (the item itself when no `key` callable is supplied, or the value returned by
the callable) MUST be hashable; a non-hashable key raises `ReducerError` per §4 at merge time. A `key`
callable that raises on any item propagates as `ReducerError`. The reducer does NOT mutate existing items
(no in-place dedup of `prior`); only the update is filtered.
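
A minimal Python sketch of the `dedupe_append` behavior described above, assuming a locally defined `ReducerError` stand-in. It is one possible reading of the paragraph, not the shipped reducer.

```python
from typing import Any, Callable, Hashable, List, Optional


class ReducerError(Exception):
    """Local stand-in for the engine's merge-time reducer error."""


def dedupe_append(
    key: Optional[Callable[[Any], Hashable]] = None,
) -> Callable[[List[Any], List[Any]], List[Any]]:
    def key_of(item: Any) -> Any:
        # Without a key callable, the item itself is the dedup key.
        return item if key is None else key(item)

    def reducer(prior: List[Any], update: List[Any]) -> List[Any]:
        try:
            # Seed seen-keys from prior; prior itself is copied unchanged,
            # including any duplicates it already contains.
            seen = {key_of(item) for item in prior}
            result = list(prior)
            for item in update:
                k = key_of(item)
                if k not in seen:  # first occurrence wins
                    seen.add(k)
                    result.append(item)
        except Exception as exc:
            # Covers both unhashable keys (TypeError) and a raising key callable.
            raise ReducerError(f"dedupe_append failed: {exc}") from exc
        return result

    return reducer


print(dedupe_append()(["a", "b"], ["b", "c", "c", "d"]))  # ['a', 'b', 'c', 'd']
by_id = dedupe_append(key=lambda record: record["id"])
print(by_id([{"id": 1}], [{"id": 1, "v": "dup"}, {"id": 2}]))  # [{'id': 1}, {'id': 2}]
```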

**`merge_by_key(key)` semantics.** A factory returning a reducer for list-of-records fields. Items in the
update with a key matching an existing item REPLACE the existing item in place; items with novel keys are
appended at the end of the list in the order they appear in the update. The `key` parameter is a required
callable mapping an item to its merge key — the spec does NOT default this; keyed merge without a key
function is meaningless and a factory call with `key=None` raises `reducer_configuration_invalid` at field
registration time. Behavior: build a `key_to_idx` index from `prior` (when `prior` contains duplicate keys,
the index MUST hold the LAST index for each duplicate key — implementations whose native dict construction
uses first-wins semantics MUST iterate explicitly to enforce last-wins); for each item in `update`, if its
key is in the index, replace the prior entry at that index with the update item; otherwise append the
update item to the result and register its key. Existing entry order MUST be preserved (replacements are
in-place); novel entries are appended in update order. Duplicate keys within the update collapse to
last-occurrence-wins (consistent with how dict updates work for repeated keys). Earlier duplicates in
`prior` are preserved in place — the reducer does NOT in-place dedupe existing entries (parallel to
`dedupe_append`'s "no in-place dedup of existing" rule). The value returned by the `key` callable MUST
be hashable (required by the index-build step); a non-hashable return value raises `ReducerError` per §4
at merge time. The `key` callable raising on any item propagates as `ReducerError`. Empty `update` is a
no-op. `merge_by_key` is NOT a substitute for `merge` — `merge`
operates on dict-typed fields with shallow key-value semantics; `merge_by_key` operates on list-of-records
fields with item-key semantics. The qualifier `_by_key` distinguishes the two shapes.
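
The `merge_by_key` index-build and replace-or-append steps above can be sketched as follows. This is a hedged illustration: `ReducerError` is a local stand-in, and `ReducerConfigurationInvalid` is a hypothetical class name for the `reducer_configuration_invalid` category.

```python
from typing import Any, Callable, Dict, Hashable, List


class ReducerError(Exception):
    """Local stand-in for the engine's merge-time reducer error."""


class ReducerConfigurationInvalid(Exception):
    """Hypothetical class name for the `reducer_configuration_invalid` category."""


def merge_by_key(
    key: Callable[[Any], Hashable],
) -> Callable[[List[Any], List[Any]], List[Any]]:
    if key is None:
        raise ReducerConfigurationInvalid("merge_by_key requires a key callable")

    def reducer(prior: List[Any], update: List[Any]) -> List[Any]:
        if not update:
            return prior  # empty update is a no-op
        try:
            result = list(prior)
            key_to_idx: Dict[Hashable, int] = {}
            # Explicit loop so later duplicates in prior overwrite earlier
            # ones: the index holds the LAST position for each key.
            for idx, item in enumerate(result):
                key_to_idx[key(item)] = idx
            for item in update:
                k = key(item)
                if k in key_to_idx:
                    result[key_to_idx[k]] = item  # replace in place
                else:
                    key_to_idx[k] = len(result)  # register the novel key
                    result.append(item)
        except Exception as exc:
            # Covers unhashable keys (TypeError) and a raising key callable.
            raise ReducerError(f"merge_by_key failed: {exc}") from exc
        return result

    return reducer


upsert = merge_by_key(lambda record: record["id"])
prior = [{"id": "a", "v": 1}, {"id": "b", "v": 2}]
update = [{"id": "b", "v": 20}, {"id": "c", "v": 3}]
print(upsert(prior, update))
# [{'id': 'a', 'v': 1}, {'id': 'b', 'v': 20}, {'id': 'c', 'v': 3}]
```

Registering each novel key as it is appended is what makes duplicate keys within one update collapse to the last occurrence: the second item with the same key replaces the first at the index just recorded.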

**Subgraph.** A compiled graph used as a node inside another graph. A subgraph executes against its own state
schema and produces a partial update that is merged into the parent's state. The merge uses the same reducer
rules as ordinary nodes — parent reducers, applied to parent fields.
@@ -136,6 +191,11 @@ identifiers (as an error class, error code, or tagged discriminant, per the lang
- `conflicting_reducers` — a state field has more than one declared reducer.
- `mapping_references_undeclared_field` — a subgraph-as-node `inputs` or `outputs` mapping names a field
not declared in the relevant state schema.
- `reducer_configuration_invalid` — a reducer factory was supplied invalid construction parameters
(e.g., `bounded_append(max_len=0)`, `merge_by_key(key=None)`). Raised at field registration / graph
compilation time, before any node body runs. Distinct from `conflicting_reducers`, which is about
the reducer-declaration shape across multiple reducers on the same field; `reducer_configuration_invalid`
is about parameters supplied to a single reducer factory.

### Capability: `pipeline-utilities`

2 changes: 1 addition & 1 deletion src/openarmature/__init__.py
@@ -25,7 +25,7 @@
"""

__version__ = "0.12.0"
__spec_version__ = "0.51.0"
__spec_version__ = "0.53.0"
# Proposal 0052 (spec observability §5.1 / §8.4.1): canonical
# package-registry name for this implementation. Surfaces on every
# OTel invocation span as ``openarmature.implementation.name`` and on
2 changes: 2 additions & 0 deletions src/openarmature/graph/__init__.py
@@ -39,6 +39,7 @@
InvocationCompletedEvent,
InvocationStartedEvent,
LlmCompletionEvent,
LlmFailedEvent,
MetadataAugmentationEvent,
NodeEvent,
)
@@ -86,6 +87,7 @@
"InvocationCompletedEvent",
"InvocationStartedEvent",
"LlmCompletionEvent",
"LlmFailedEvent",
"MappingReferencesUndeclaredField",
"MetadataAugmentationEvent",
"Middleware",