Implement LlmFailedEvent typed variant (proposal 0058) #144
Merged
Carve LLM provider failures into a spec-normatively-typed event variant alongside LlmCompletionEvent. Field set mirrors the success variant's identity / scoping / request-side surface 1:1 (17 fields) plus three failure-specific fields: error_category (always-present, from the §7 normative category enumeration), error_type (optional upstream class name or vendor code), error_message (always-present human-readable from the raised exception).

OpenAIProvider.complete() restructures around the failure-event emission: pre-send validation (validate_message_list / validate_tools / _normalize_response_schema) moves inside the try-block so any §7 category exception (pre-send OR adapter-caught) flows through the same LlmFailedEvent path. The exception still raises out of complete() unchanged; the typed event is dispatched on the observer queue alongside the exception per proposal 0058's §6 dispatch contract.

Both bundled observers (OTel + Langfuse) consume LlmFailedEvent directly with the same openarmature.llm.complete span / Generation shape as the success path plus ERROR status / level and the openarmature.error.category attribute.

Sentinel-namespace NodeEvent emission for LLM events retires entirely from the bundled provider; _make_llm_event is removed. LlmEventPayload + LLM_NAMESPACE remain in observability/llm_event.py as a documented compatibility surface for custom providers.

Spec pin advances from v0.51.0 to v0.53.0; proposal 0023 (canonical state reducers) marked not-yet with fixtures 034-038 parser-deferred. Fixtures 069-073 (the 0058 conformance set) deferred pending typed_event_collector schema + the event_counts list directive in the harness; unit tests pin the contract end-to-end: 9-category field-mapping lockdown, pre-send validation raise, mutual-exclusion between LlmCompletionEvent and LlmFailedEvent on the same call.
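The restructured control flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual OpenAIProvider code: the complete() signature, the send and dispatch callables, the FailedEventSketch class, the category labels, and the use of `__cause__` to derive error_type are all assumptions made for the example, and the 17 mirrored fields are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class LlmProviderError(Exception):
    """Stand-in base class for the §7 category exceptions."""
    category = "provider_error"  # hypothetical category label


class ProviderInvalidRequest(LlmProviderError):
    category = "invalid_request"  # hypothetical category label


@dataclass(frozen=True)
class FailedEventSketch:
    # Only the three failure-specific fields are modelled here.
    error_category: str
    error_type: Optional[str]
    error_message: str


def validate_message_list(messages: list) -> None:
    # Stand-in for the real pre-send validation.
    if not messages:
        raise ProviderInvalidRequest("message list must not be empty")


def complete(
    messages: list,
    send: Callable[[list], str],
    dispatch: Callable[[object], None],
) -> str:
    try:
        # Validation sits inside the try-block, so a pre-send raise takes
        # the same failure-event path as an adapter-caught provider error.
        validate_message_list(messages)
        return send(messages)
    except LlmProviderError as exc:
        cause = exc.__cause__  # assumption: upstream error chained via "from"
        dispatch(
            FailedEventSketch(
                error_category=exc.category,
                error_type=type(cause).__name__ if cause is not None else None,
                error_message=str(exc),
            )
        )
        raise  # the caller still sees the original exception
```

Calling complete([], ...) with a list-appending dispatch would record one failure event and still raise ProviderInvalidRequest, while a successful call records none.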
Pull request overview
Implements proposal 0058 (spec v0.53.0) by adding a typed LlmFailedEvent for LLM failure observability, migrating the bundled OpenTelemetry and Langfuse observers to consume it, and removing the legacy sentinel-namespace NodeEvent LLM emission path from OpenAIProvider.
Changes:
- Add LlmFailedEvent typed event and integrate it into the observer event union + exports.
- Update OpenAIProvider.complete() to emit LlmFailedEvent on §7-category failures (including pre-send validation) while preserving exception flow, and retire sentinel-namespace LLM NodeEvent emission.
- Update OTel/Langfuse observers and tests to render failure spans/generations from LlmFailedEvent; bump spec pin/versioning to v0.53.0 and defer proposal-0023 fixtures.
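As a rough illustration of how an observer might render a failure span from such an event, the sketch below builds a plain dictionary rather than calling any real OpenTelemetry or Langfuse API. The FailedEventSketch class, its end_time_ns field, the failure_span_fields helper, and the "rate_limit" label in the usage note are hypothetical. The span name, ERROR status, openarmature.error.category attribute, and back-dating of the start time by latency_ms follow this PR's description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailedEventSketch:
    error_category: str
    error_type: Optional[str]
    error_message: str
    latency_ms: float
    end_time_ns: int  # hypothetical: timestamp at which the failure was raised


def failure_span_fields(event: FailedEventSketch) -> dict:
    # Back-date the start so the span duration reflects time-to-raise.
    start_ns = event.end_time_ns - int(event.latency_ms * 1_000_000)
    return {
        "name": "openarmature.llm.complete",  # same span name as the success path
        "status": "ERROR",
        "start_time_ns": start_ns,
        "end_time_ns": event.end_time_ns,
        "attributes": {"openarmature.error.category": event.error_category},
    }
```

For an event with latency_ms=250.0 and end_time_ns=1_000_000_000, the helper yields a start_time_ns of 750_000_000.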
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/unit/test_observability_otel.py | Updates OTel failure-path test to use LlmFailedEvent helper and validate error-category attribute. |
| tests/unit/test_observability_langfuse.py | Updates Langfuse failure-path tests to use LlmFailedEvent and validates error generation behavior/parenting. |
| tests/unit/test_llm_provider.py | Adds provider-level tests for LlmFailedEvent emission, category/type mapping, pre-send validation emission, and mutual exclusion with completion events. |
| tests/test_smoke.py | Updates asserted __spec_version__ to 0.53.0. |
| tests/conformance/test_fixture_parsing.py | Defers new 0058 fixtures (069–073) and 0023 fixtures (034–038) with rationale. |
| tests/conformance/test_conformance.py | Marks proposal-0023 runtime fixtures as deferred in the conformance runner. |
| tests/_helpers/typed_event.py | Adds make_failed_event helper for constructing LlmFailedEvent instances in tests. |
| src/openarmature/observability/otel/observer.py | Adds LlmFailedEvent handling to emit ERROR spans; removes sentinel-namespace LLM handling. |
| src/openarmature/observability/langfuse/observer.py | Adds LlmFailedEvent handling to emit ERROR generations; removes sentinel-namespace LLM handling and shares typed metadata helpers. |
| src/openarmature/observability/correlation.py | Extends dispatch/event typing unions to include LlmFailedEvent. |
| src/openarmature/llm/providers/openai.py | Refactors complete() to emit typed failure events on §7 errors and removes _make_llm_event sentinel helper. |
| src/openarmature/graph/observer.py | Extends ObserverEvent union and docs to include LlmFailedEvent. |
| src/openarmature/graph/events.py | Defines the LlmFailedEvent dataclass and exports it. |
| src/openarmature/graph/__init__.py | Re-exports LlmFailedEvent. |
| src/openarmature/AGENTS.md | Updates bundled spec excerpt to v0.53.0 (includes proposal-0023 reducer text). |
| src/openarmature/__init__.py | Bumps __spec_version__ to 0.53.0. |
| pyproject.toml | Bumps [tool.openarmature].spec_version to 0.53.0. |
| conformance.toml | Advances spec pin to v0.53.0 and adds proposal-0058 entry (proposal-0023 marked not-yet). |
| CHANGELOG.md | Documents LlmFailedEvent, sentinel retirement, and spec bump (contains now-stale sentinel-failure text that should be corrected). |
Two stale-content fixes flagged by Copilot:

1. CHANGELOG line-17 and line-18 bullets carried "Failure paths continue to fire from the sentinel NodeEvent" framing from the 3b/3c era, which contradicts this PR's LlmFailedEvent migration and full sentinel retirement. Trimmed both fragments and added a forward-reference to the proposal 0058 entry that documents the cycle-final state.
2. AGENTS.md's reducer baseline reproduced proposal 0023's factory reducers verbatim from the spec, but Python doesn't ship them in this cycle (manifest 0023 = not-yet). The text is auto-generated by build_agents_md.py from the pinned spec submodule; updated the generator's lead paragraph to flag that capability summaries reproduce spec content verbatim (including additions from accepted proposals this implementation may not yet ship) and to point readers at conformance.toml for per-proposal impl status. This generalizes to any future not-yet proposal landing in spec text before Python catches up. Regenerated AGENTS.md.
Summary
- LlmFailedEvent typed variant per proposal 0058 (spec v0.53.0). 17 mirrored identity / scoping / request-side fields + 3 failure-specific fields (error_category from the §7 normative category enumeration, optional error_type, always-present error_message).
- OpenAIProvider.complete() restructured so pre-send validation lives inside the try-block. Any §7 category exception (adapter-caught provider error OR pre-send validation raise) dispatches LlmFailedEvent alongside the still-raised exception. Caller-side exception flow is unchanged.
- Both bundled observers (OTel + Langfuse) consume LlmFailedEvent directly. Same openarmature.llm.complete span / Generation shape as the success path, with ERROR status / level and the openarmature.error.category attribute. start_time back-dated by latency_ms so failure duration reflects time-to-raise.
- Sentinel-namespace NodeEvent emission for LLM events retires entirely from the bundled OpenAIProvider. _make_llm_event removed. External custom observers that filtered LLM calls by event.namespace == LLM_NAMESPACE MUST migrate to isinstance(event, LlmCompletionEvent) for success and isinstance(event, LlmFailedEvent) for failure. LlmEventPayload + LLM_NAMESPACE remain in observability/llm_event.py as a documented compatibility surface for custom providers.
- Spec pin advances from v0.51.0 to v0.53.0. Proposal 0023 (canonical state reducers) marked not-yet; fixtures 034-038 parser-deferred.

Scope
Closes the v0.13.0 typed-event migration cycle. After this PR:
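- LLM events are typed on both paths (success: LlmCompletionEvent; failure: LlmFailedEvent).
- The bundled provider emits no sentinel-namespace NodeEvents for LLM events at all.
- The discuss-llm-completion-event-failure-path coord thread can close.

Test coverage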
- 9-category field-mapping lockdown (test_build_llm_failed_event_maps_category_and_type_per_exception). Parametrized over every LlmProviderError subclass; verifies error_category / error_type / error_message populate correctly for all 9 §7 categories.
- Pre-send validation raise (test_complete_pre_send_validation_emits_llm_failed_event_before_propagating). Drives ProviderInvalidRequest from _normalize_response_schema's non-BaseModel-class rejection, which bypasses every wire concern.
- Mutual exclusion (test_llm_completion_and_failed_events_are_mutually_exclusive). Asserts the disjoint-count rule on both success and failure paths in a single test.
- OTel observer failure-path test drives LlmFailedEvent dispatch via the shared make_failed_event helper.
- Langfuse observer failure-path tests use LlmFailedEvent.

Test plan
- uv run pytest tests/: 1225 passed.
- uv run pyright clean.
- uv run ruff check + ruff format --check clean.
- cat openarmature-spec/VERSION shows 0.53.0.

Out of scope
- Fixtures 069-073 (the 0058 conformance set): deferred pending typed_event_collector schema work + the event_counts: list directive (introduced by fixture 071). Follow-on harness PR.
- LlmEventPayload + LLM_NAMESPACE retention/removal: a future cleanup PR once we're confident no real downstream consumers depend on them.
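For custom observers affected by the migration note in the Summary, the change from namespace filtering to isinstance checks might look like the sketch below. The two event classes are empty local stand-ins so the example runs on its own (a real observer would import the library's types instead), and the CountingObserver class and its on_event method name are hypothetical.

```python
class LlmCompletionEvent:
    """Local stand-in for the library's success event type."""


class LlmFailedEvent:
    """Local stand-in for the library's failure event type."""


class CountingObserver:
    """Hypothetical custom observer that counts LLM call outcomes."""

    def __init__(self) -> None:
        self.completed = 0
        self.failed = 0

    def on_event(self, event: object) -> None:
        # Before the migration this branch would have tested
        # event.namespace == LLM_NAMESPACE on a sentinel NodeEvent.
        if isinstance(event, LlmCompletionEvent):
            self.completed += 1
        elif isinstance(event, LlmFailedEvent):
            self.failed += 1
        # Any other event type is ignored by this observer.
```

Because a single call produces either a completion event or a failed event, the two counters advance on disjoint calls.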