Implement LlmFailedEvent typed variant (proposal 0058) #144

Merged: chris-colinsky merged 2 commits into main from feature/0058-llm-failed-event on Jun 9, 2026

@chris-colinsky

Summary

  • Defines LlmFailedEvent typed variant per proposal 0058 (spec v0.53.0). 17 mirrored identity / scoping / request-side fields + 3 failure-specific fields (error_category from the §7 normative category enumeration, optional error_type, always-present error_message).
  • Restructures OpenAIProvider.complete() so pre-send validation lives inside the try-block. Any §7 category exception (adapter-caught provider error OR pre-send validation raise) dispatches LlmFailedEvent alongside the still-raised exception. Caller-side exception flow is unchanged.
  • Both bundled observers (OTel + Langfuse) consume LlmFailedEvent directly. Same openarmature.llm.complete span / Generation shape as the success path, with ERROR status / level and the openarmature.error.category attribute. start_time back-dated by latency_ms so failure duration reflects time-to-raise.
  • Sentinel-namespace NodeEvent emission for LLM events retires entirely from the bundled OpenAIProvider. _make_llm_event removed. External custom observers that filtered LLM calls by event.namespace == LLM_NAMESPACE MUST migrate to isinstance(event, LlmCompletionEvent) for success and isinstance(event, LlmFailedEvent) for failure. LlmEventPayload + LLM_NAMESPACE remain in observability/llm_event.py as a documented compatibility surface for custom providers.
  • Spec pin advances v0.51.0 → v0.53.0. Proposal 0023 (canonical state reducers, spec v0.52.0) marked not-yet; fixtures 034-038 parser-deferred.
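
For external observers, the migration from namespace filtering to type checks might look like the sketch below. The two event classes are hypothetical stand-ins with only a handful of fields (the real classes are exported by openarmature and carry the full field set), and the `classify` helper and the `example_category` value are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins for the typed events. The real classes live in
# openarmature and mirror many more identity / scoping fields.
@dataclass(frozen=True)
class LlmCompletionEvent:
    latency_ms: float


@dataclass(frozen=True)
class LlmFailedEvent:
    latency_ms: float
    error_category: str            # always present
    error_message: str             # always present
    error_type: Optional[str] = None  # optional


def classify(event: object) -> str:
    """Route on the event's type rather than on event.namespace."""
    if isinstance(event, LlmCompletionEvent):
        return "llm-success"
    if isinstance(event, LlmFailedEvent):
        return f"llm-failure:{event.error_category}"
    return "other"  # non-LLM events fall through untouched


failed = LlmFailedEvent(
    latency_ms=3.0,
    error_category="example_category",  # placeholder, not a real category name
    error_message="boom",
)
print(classify(failed))
```

An observer written this way no longer depends on the sentinel namespace, so it keeps working once the bundled provider stops emitting sentinel-namespace NodeEvents.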

Scope

Closes the v0.13.0 typed-event migration cycle. After this PR:

  • Both LLM outcome paths use typed events (success: LlmCompletionEvent; failure: LlmFailedEvent).
  • The bundled provider no longer emits sentinel-namespace NodeEvents for LLM events at all.
  • The discuss-llm-completion-event-failure-path coord thread can close.

Test coverage

  • 9-category field-mapping lockdown (test_build_llm_failed_event_maps_category_and_type_per_exception). Parametrized over every LlmProviderError subclass; verifies error_category / error_type / error_message populate correctly for all 9 §7 categories.
  • Pre-send validation raise (test_complete_pre_send_validation_emits_llm_failed_event_before_propagating). Drives ProviderInvalidRequest from _normalize_response_schema's non-BaseModel-class rejection — bypasses every wire concern.
  • Mutual-exclusion lockdown (test_llm_completion_and_failed_events_are_mutually_exclusive). Asserts the disjoint-count rule on both success and failure paths in a single test.
  • OTel + Langfuse error-path tests migrated from sentinel-completed-payload dispatch to LlmFailedEvent dispatch via the shared make_failed_event helper.
  • White-box parallel-branches Langfuse parent-resolution test (from PR 3c) rewritten to dispatch LlmFailedEvent.
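
The behaviour these tests pin down can be illustrated with a minimal sketch. This is not the actual OpenAIProvider code: the `complete` signature, the `send` / `dispatch` callables, the `category` attribute, and the category strings are all assumptions, and the event classes are reduced stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List


class LlmProviderError(Exception):
    """Stand-in base class; the `category` attribute is assumed."""
    category = "example_category"


class ProviderInvalidRequest(LlmProviderError):
    category = "example_invalid_request"  # placeholder category string


@dataclass(frozen=True)
class LlmCompletionEvent:
    text: str


@dataclass(frozen=True)
class LlmFailedEvent:
    error_category: str
    error_type: str
    error_message: str


def complete(
    prompt: object,
    send: Callable[[str], str],
    dispatch: Callable[[object], None],
) -> str:
    try:
        # Pre-send validation sits inside the try-block, so a validation
        # raise takes the same failure-event path as a provider error.
        if not isinstance(prompt, str):
            raise ProviderInvalidRequest("prompt must be a string")
        text = send(prompt)
    except LlmProviderError as exc:
        dispatch(LlmFailedEvent(exc.category, type(exc).__name__, str(exc)))
        raise  # the caller still sees the original exception
    dispatch(LlmCompletionEvent(text))
    return text


events: List[object] = []
complete("hi", str.upper, events.append)   # success: one completion event
try:
    complete(42, str.upper, events.append)  # fails before anything is sent
except ProviderInvalidRequest:
    pass
print([type(e).__name__ for e in events])
```

Each call dispatches exactly one event, a completion or a failure and never both, which is the disjoint-count rule the mutual-exclusion test asserts.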

Test plan

  • uv run pytest tests/ — 1225 passed.
  • uv run pyright clean.
  • uv run ruff check + ruff format --check clean.
  • Spec pin verify: cat openarmature-spec/VERSION shows 0.53.0.

Out of scope

  • Conformance fixtures 069-073 stay parser-deferred pending the harness's typed_event_collector schema work + the event_counts: list directive (introduced by fixture 071). Follow-on harness PR.
  • LlmEventPayload + LLM_NAMESPACE retention/removal — a future cleanup PR once we're confident no real downstream consumers depend on them.
  • Lifting shared attribute-building between the typed success/failure observer handlers — defer; small future refactor.

Carve LLM provider failures into a spec-normatively-typed event
variant alongside LlmCompletionEvent. Field set mirrors the success
variant's identity / scoping / request-side surface 1:1 (17 fields)
plus three failure-specific fields: error_category (always-present,
from the §7 normative category enumeration), error_type (optional
upstream class name or vendor code), error_message (always-present
human-readable from the raised exception).

OpenAIProvider.complete() restructures around the failure-event
emission: pre-send validation (validate_message_list / validate_tools
/ _normalize_response_schema) moves inside the try-block so any §7
category exception — pre-send OR adapter-caught — flows through the
same LlmFailedEvent path. The exception still raises out of
complete() unchanged; the typed event is dispatched on the observer
queue alongside the exception per proposal 0058's §6 dispatch
contract.

Both bundled observers (OTel + Langfuse) consume LlmFailedEvent
directly with the same openarmature.llm.complete span / Generation
shape as the success path plus ERROR status / level and the
openarmature.error.category attribute. Sentinel-namespace NodeEvent
emission for LLM events retires entirely from the bundled provider;
_make_llm_event is removed. LlmEventPayload + LLM_NAMESPACE remain
in observability/llm_event.py as a documented compatibility surface
for custom providers.
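
The PR summary notes that the failure span's start_time is back-dated by latency_ms so its duration reflects time-to-raise. A minimal sketch of that arithmetic follows; the function name and the nanosecond timestamp unit are assumptions, not taken from the observer code.

```python
def backdated_start_ns(end_time_ns: int, latency_ms: float) -> int:
    """Derive a span start time by subtracting the call latency.

    Assumes timestamps are integer nanoseconds and latency is in
    milliseconds (1 ms = 1_000_000 ns).
    """
    return end_time_ns - int(latency_ms * 1_000_000)


# A failure observed at t = 2.0 s after a 250 ms call would be
# rendered as a span starting at t = 1.75 s.
start_ns = backdated_start_ns(2_000_000_000, 250)
```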

Spec pin advances from v0.51.0 to v0.53.0; proposal 0023 (canonical
state reducers) marked not-yet with fixtures 034-038 parser-
deferred. Fixtures 069-073 (the 0058 conformance set) deferred
pending typed_event_collector schema + the event_counts list
directive in the harness; unit tests pin the contract end-to-end:
9-category field-mapping lockdown, pre-send validation raise,
mutual-exclusion between LlmCompletionEvent and LlmFailedEvent on
the same call.
Copilot AI review requested due to automatic review settings June 9, 2026 21:51

Copilot AI left a comment

Pull request overview

Implements proposal 0058 (spec v0.53.0) by adding a typed LlmFailedEvent for LLM failure observability, migrating the bundled OpenTelemetry and Langfuse observers to consume it, and removing the legacy sentinel-namespace NodeEvent LLM emission path from OpenAIProvider.

Changes:

  • Add LlmFailedEvent typed event and integrate it into the observer event union + exports.
  • Update OpenAIProvider.complete() to emit LlmFailedEvent on §7-category failures (including pre-send validation) while preserving exception flow, and retire sentinel-namespace LLM NodeEvent emission.
  • Update OTel/Langfuse observers and tests to render failure spans/generations from LlmFailedEvent; bump spec pin/versioning to v0.53.0 and defer proposal-0023 fixtures.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/unit/test_observability_otel.py | Updates OTel failure-path test to use LlmFailedEvent helper and validate error-category attribute. |
| tests/unit/test_observability_langfuse.py | Updates Langfuse failure-path tests to use LlmFailedEvent and validates error generation behavior/parenting. |
| tests/unit/test_llm_provider.py | Adds provider-level tests for LlmFailedEvent emission, category/type mapping, pre-send validation emission, and mutual exclusion with completion events. |
| tests/test_smoke.py | Updates asserted `__spec_version__` to 0.53.0. |
| tests/conformance/test_fixture_parsing.py | Defers new 0058 fixtures (069–073) and 0023 fixtures (034–038) with rationale. |
| tests/conformance/test_conformance.py | Marks proposal-0023 runtime fixtures as deferred in the conformance runner. |
| tests/_helpers/typed_event.py | Adds make_failed_event helper for constructing LlmFailedEvent instances in tests. |
| src/openarmature/observability/otel/observer.py | Adds LlmFailedEvent handling to emit ERROR spans; removes sentinel-namespace LLM handling. |
| src/openarmature/observability/langfuse/observer.py | Adds LlmFailedEvent handling to emit ERROR generations; removes sentinel-namespace LLM handling and shares typed metadata helpers. |
| src/openarmature/observability/correlation.py | Extends dispatch/event typing unions to include LlmFailedEvent. |
| src/openarmature/llm/providers/openai.py | Refactors complete() to emit typed failure events on §7 errors and removes _make_llm_event sentinel helper. |
| src/openarmature/graph/observer.py | Extends ObserverEvent union and docs to include LlmFailedEvent. |
| src/openarmature/graph/events.py | Defines the LlmFailedEvent dataclass and exports it. |
| `src/openarmature/graph/__init__.py` | Re-exports LlmFailedEvent. |
| src/openarmature/AGENTS.md | Updates bundled spec excerpt to v0.53.0 (includes proposal-0023 reducer text). |
| `src/openarmature/__init__.py` | Bumps `__spec_version__` to 0.53.0. |
| pyproject.toml | Bumps [tool.openarmature].spec_version to 0.53.0. |
| conformance.toml | Advances spec pin to v0.53.0 and adds proposal-0058 entry (proposal-0023 marked not-yet). |
| CHANGELOG.md | Documents LlmFailedEvent, sentinel retirement, and spec bump (contains now-stale sentinel-failure text that should be corrected). |

Comment thread CHANGELOG.md Outdated
Comment thread src/openarmature/AGENTS.md
Two stale-content fixes flagged by Copilot:

1. CHANGELOG line-17 and line-18 bullets carried "Failure paths
   continue to fire from the sentinel NodeEvent" framing from the
   3b/3c era, which contradicts this PR's LlmFailedEvent migration
   and full sentinel retirement. Trimmed both fragments and added
   a forward-reference to the proposal 0058 entry that documents
   the cycle-final state.

2. AGENTS.md's reducer baseline reproduced proposal 0023's factory
   reducers verbatim from the spec, but Python doesn't ship them
   in this cycle (manifest 0023 = not-yet). The text is auto-
   generated by build_agents_md.py from the pinned spec submodule;
   updated the generator's lead paragraph to flag that capability
   summaries reproduce spec content verbatim — including additions
   from accepted proposals this implementation may not yet ship —
   and point readers at conformance.toml for per-proposal impl
   status. Generalizes to any future not-yet proposal landing in
   spec text before Python catches up. Regenerated AGENTS.md.
@chris-colinsky chris-colinsky merged commit d2d387a into main Jun 9, 2026
6 checks passed
@chris-colinsky chris-colinsky deleted the feature/0058-llm-failed-event branch June 9, 2026 22:08