Prepare v0.13.0 release: reconcile changelog, docs, and examples#146

Merged: chris-colinsky merged 7 commits into main from chore/v0.13.0-release-prep on Jun 10, 2026

chris-colinsky (Member) commented on Jun 10, 2026

Summary

Documentation and changelog prep for the v0.13.0 release. The typed-event migration code (proposals 0047, 0049, 0057, 0058) already landed in PRs 141 through 145; this PR reconciles the release narrative and brings the docs and examples that lagged behind into sync. No changes to the library's runtime code: this is changelog, conformance manifest, docs, examples, and the generated AGENTS.md.

1. Changelog and conformance corrections

Spec flagged three accuracy gaps in a pre-release review of the [Unreleased] section. All three are fixed:

  • 0047 cache-stat attribution. The Response.usage cache fields (PR 136) and the OTel cache attributes (PR 140) were originally credited to v0.12.0. Both actually landed this cycle, so the 0047 entry now records all three pieces (PRs 136, 140, 145) under v0.13.0.
  • Spec pin journey. The advance is recorded as v0.46.0 to v0.53.0; it had been understated as starting at v0.51.0.
  • tool_call.arguments wire change. The switch to sort_keys=True now has its own ### Changed entry, and conformance.toml notes the downstream-observable wire-byte shift for consumers that snapshot request bodies.
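
To illustrate why the sort_keys=True change is functionally equivalent yet byte-different, here is a minimal sketch using only the standard library's json module. The arguments dict is a made-up example, not taken from the library or its tests.

```python
import json

# Hypothetical tool-call arguments; the key order mimics what a model might return.
arguments = {"query": "weather", "city": "Oslo", "units": "metric"}

insertion_order = json.dumps(arguments)
sorted_order = json.dumps(arguments, sort_keys=True)

# Both strings parse back to the same object...
assert json.loads(insertion_order) == json.loads(sorted_order)
# ...but the serialized text differs, which is what a byte-level snapshot sees.
assert insertion_order != sorted_order

print(insertion_order)
print(sorted_order)
```

A consumer that compares parsed JSON is unaffected; one that snapshots raw request bodies would need to refresh its snapshots.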

The two ### Added headings in the [Unreleased] section are consolidated into one.

2. Docs migration to typed-event-first

Three docs still led with the legacy sentinel-namespace pattern for LLM events. They now lead with the typed LlmCompletionEvent / LlmFailedEvent variants and demote the sentinel pattern to a compatibility note:

  • docs/concepts/observability.md: custom-observer consumption shows isinstance discrimination as the primary path, and calls out the success/failure mutual-exclusion contract.
  • docs/model-providers/authoring.md: the custom-provider emission sketch dispatches the typed events on each outcome path.
  • docs/agent/non-obvious-shapes.md: the LLM sentinel is dropped from the sentinel-events list (checkpoint sentinels stay). AGENTS.md is regenerated to match its sources.
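
A rough sketch of the isinstance-based consumption the docs now lead with. The event classes below are stand-ins defined locally for illustration; their fields (invocation_id, total_tokens, error_category) are assumptions, and real code would import the library's own classes instead.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Stand-ins for illustration only. Field names are assumptions, not the
# library's actual event schema.
@dataclass(frozen=True)
class LlmCompletionEvent:
    invocation_id: str
    total_tokens: int


@dataclass(frozen=True)
class LlmFailedEvent:
    invocation_id: str
    error_category: str


@dataclass(frozen=True)
class OtherEvent:
    # Placeholder for every non-LLM variant in the observer event union.
    invocation_id: str


ObserverEvent = Union[LlmCompletionEvent, LlmFailedEvent, OtherEvent]


def describe(event: ObserverEvent) -> Optional[str]:
    """Return a log line for LLM events and None for everything else."""
    if isinstance(event, LlmCompletionEvent):
        return f"{event.invocation_id}: completed, {event.total_tokens} tokens"
    if isinstance(event, LlmFailedEvent):
        return f"{event.invocation_id}: failed ({event.error_category})"
    return None


events = [
    LlmCompletionEvent("inv-1", total_tokens=42),
    LlmFailedEvent("inv-2", error_category="rate_limit"),
    OtherEvent("inv-3"),
]
lines = [line for line in map(describe, events) if line is not None]
print("\n".join(lines))
```

Because a single call produces either the success event or the failure event, the two branches never need to coordinate with each other.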

3. Examples

  • production-observability extended to exercise the v0.13.0 surface end to end: a new LlmFailureTracker observer consumes LlmFailedEvent for per-invocation error-category rollups, LlmUsageAccumulator gains a cache-hit ratio from the new cached_tokens field, and the OTel formatter surfaces the cache-read attribute. The walkthrough doc gains matching "reading the output" commentary.
  • Spec and proposal references removed from all example comments and docstrings. Quoting proposal numbers and spec sections carries no meaning for end users reading the example code; the comments now describe only the implementation behavior. Em dashes were scrubbed from the same files.
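
The cache-hit ratio mentioned above could be accumulated along these lines. This is a sketch under stated assumptions, not the example's actual code: the class name UsageTotals is hypothetical, and defining the ratio as cached tokens divided by prompt tokens is an assumption the PR text does not confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageTotals:
    """Hypothetical running totals; not the example's LlmUsageAccumulator."""

    prompt_tokens: int = 0
    cached_tokens: int = 0

    def add(self, prompt_tokens: int, cached_tokens: Optional[int]) -> None:
        self.prompt_tokens += prompt_tokens
        # Providers that report no cache stats yield None; count that as zero.
        self.cached_tokens += cached_tokens or 0

    def cache_hit_ratio(self) -> float:
        # Assumed definition: cached / prompt. Guard against division by zero.
        if self.prompt_tokens == 0:
            return 0.0
        return self.cached_tokens / self.prompt_tokens


totals = UsageTotals()
totals.add(prompt_tokens=100, cached_tokens=25)
totals.add(prompt_tokens=100, cached_tokens=None)
ratio = totals.cache_hit_ratio()
print(f"cache-hit ratio: {ratio:.3f}")
```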

Test plan

  • uv run pytest: 1244 passed, 355 skipped.
  • uv run ruff check + ruff format --check: clean.
  • uv run pyright: clean.
  • uv run mkdocs build: clean.
  • The production-observability example was run live against a real provider: the new usage line (with the cache segment), the failures line, and the openarmature.llm.cache_read.input_tokens span attribute all render correctly.
  • tests/test_production_observability_accumulators.py (9 tests) locks the example's queryable-observer logic deterministically, covering the paths a happy-path live run cannot reach: the cache-hit ratio math with non-zero cached tokens, cached=None tolerance, failure-category counting and ordering, mutual exclusion between the success and failure events, per-invocation bucket cleanup, the real persist-node output, and the OTel cache-read attribute. The persist check drives the real node offline (an unknown invocation id makes drain_events_for return an empty summary, so no live provider call is needed).
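
The per-invocation counting, ordering, and bucket-cleanup behavior those tests cover can be pictured with a small stand-in. The class FailureTracker and its record/drain methods are hypothetical names, and the ordering rule (most frequent first, ties alphabetical) is an assumption chosen for the sketch.

```python
from collections import Counter
from typing import Dict, List, Tuple


class FailureTracker:
    """Hypothetical stand-in, not the example's LlmFailureTracker."""

    def __init__(self) -> None:
        self._by_invocation: Dict[str, Counter] = {}

    def record(self, invocation_id: str, category: str) -> None:
        bucket = self._by_invocation.setdefault(invocation_id, Counter())
        bucket[category] += 1

    def drain(self, invocation_id: str) -> List[Tuple[str, int]]:
        # pop() removes the bucket, so a finished invocation leaves no state
        # behind; an unknown id simply yields an empty summary.
        counts = self._by_invocation.pop(invocation_id, Counter())
        return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


tracker = FailureTracker()
tracker.record("inv-1", "timeout")
tracker.record("inv-1", "rate_limit")
tracker.record("inv-1", "rate_limit")

first = tracker.drain("inv-1")
second = tracker.drain("inv-1")
unknown = tracker.drain("never-seen")
print(first, second, unknown)
```

Draining an id that was never recorded is what lets a check like this run offline, with no provider call involved.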

Three blocking + three should-fix items spec flagged on the
pre-tag review. All narrative; no code behavior change.

- 0047 CHANGELOG entry mis-attributed pieces 1+2 (Response.usage
  cache fields + OTel cache attributes) to v0.12.0. Verified via
  git: those landed in PRs #136 + #140 post-v0.12.0-tag, so all
  three pieces of 0047 ship in v0.13.0. Reframed.
- conformance.toml [proposals."0047"] leading-comment block had
  the same v0.12.0 mis-attribution. Same correction; added PR
  references for traceability.
- Unreleased section had two ### Added headings with the 0057
  entry orphaned below ### Changed. Consolidated.
- Spec pin advance text undercounted the cycle journey (said
  v0.51.0 → v0.53.0; actual is v0.46.0 → v0.53.0 across three
  hops). Reframed and listed absorbed proposals inline.
- tool_call.arguments JSON encoding now uses sort_keys=True
  (functionally equivalent but byte-different for downstream
  snapshot consumers). Surfaced as its own ### Changed entry
  instead of buried in the 0047 ### Added.
- conformance.toml [proposals."0049"] leading-comment block
  grew the fixture-deferral surface (057-068 + 069-073 parser-
  deferred pending harness directive schema catch-up; behavior
  pinned by unit tests) per spec OQ2.

Three docs still pushed the legacy sentinel-namespace pattern as
the primary path for custom observers consuming LLM events and
custom providers emitting them. After v0.13.0 the bundled provider
emits typed LlmCompletionEvent / LlmFailedEvent variants directly;
the bundled OTel + Langfuse observers consume via isinstance
discrimination. Rewrites:

- docs/concepts/observability.md: "Publishing LLM events for
  custom observers" → "Consuming LLM events in custom observers".
  Typed-event consumption shown as primary (isinstance branch on
  LlmCompletionEvent + LlmFailedEvent with the mutual-exclusion +
  field-set notes). Sentinel pattern demoted to a "Legacy
  sentinel-namespace pattern (compatibility surface)" subsection
  for downstream code interoperating with custom providers that
  haven't migrated.
- docs/model-providers/authoring.md: custom-provider emission
  sketch rewritten — dispatch LlmCompletionEvent on success,
  LlmFailedEvent alongside the §7 exception on failure. Shows the
  current-attempt-index / current-fan-out-index / etc. scoping
  fields the typed events carry. Calls out the mutual-exclusion
  + exception-flow-preservation contracts. Legacy sentinel pattern
  retained as a compatibility-surface callout for older providers.
- docs/agent/non-obvious-shapes.md: "filter openarmature.*-
  namespaced events" tip drops the openarmature.llm.complete
  example (v0.13.0 retired the sentinel pattern for LLM events);
  checkpoint sentinels stay since the tip is still applicable for
  those. Adjusted the follow-on paragraph mentioning LLM events.
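
As a hedged illustration of the emission contract described for custom providers (one typed event per call, with the exception still propagating on failure), consider the sketch below. Everything in it is a local stand-in: the event classes carry assumed fields, and ProviderError, complete, and the emit callback are hypothetical names rather than the library's API.

```python
from dataclasses import dataclass
from typing import Callable, List


# Stand-ins for illustration; the real events carry more fields.
@dataclass(frozen=True)
class LlmCompletionEvent:
    text: str


@dataclass(frozen=True)
class LlmFailedEvent:
    error_category: str


class ProviderError(Exception):
    """Hypothetical provider-level exception."""


def complete(call_model: Callable[[], str], emit: Callable[[object], None]) -> str:
    try:
        text = call_model()
    except ProviderError as exc:
        # Failure path: emit the failure event, then let the exception flow on.
        emit(LlmFailedEvent(error_category=type(exc).__name__))
        raise
    # Success path: reached only when no exception was raised, so the two
    # events are mutually exclusive for a given call.
    emit(LlmCompletionEvent(text=text))
    return text


def failing_call() -> str:
    raise ProviderError("upstream unavailable")


emitted: List[object] = []
result = complete(lambda: "hello", emitted.append)
try:
    complete(failing_call, emitted.append)
except ProviderError:
    propagated = True
else:
    propagated = False
```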

mkdocs strict build clean.

The non-obvious-shapes doc migration changed a generator source
without regenerating the committed AGENTS.md. Bring it back in
sync so the drift guard passes.

Add an LlmFailureTracker observer that consumes the typed
LlmFailedEvent and rolls up per-invocation error-category counts,
and extend LlmUsageAccumulator to track cached_tokens and report a
cache-hit ratio. The persist node now reports both rollups and the
OTel formatter surfaces the cache-read attribute.

Also drop spec/proposal references and em dashes from the
example's comments and walk-through, which carry no meaning for
end users reading the code.

Example comments and docstrings quoted proposal numbers and spec
section refs that have no meaning to end users reading the code.
Reword them to describe only the implementation behavior.

Copilot AI review requested due to automatic review settings June 10, 2026 02:23

Copilot AI left a comment

Pull request overview

Prepares the v0.13.0 release narrative by reconciling changelog + conformance notes and updating docs/examples to the typed LLM event model (LlmCompletionEvent / LlmFailedEvent) and the new cache-stat surface.

Changes:

  • Reconciles release notes and conformance commentary for spec pinning, proposal attribution, and the tool_call.arguments wire-byte change.
  • Updates observability docs and provider-authoring guidance to be typed-event-first, retaining the sentinel-namespace pattern as compatibility-only.
  • Extends the production observability example to track cache-hit ratio and per-invocation LLM failure categories.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/openarmature/AGENTS.md | Regenerated agent docs to reflect typed-event-first LLM observability guidance and updated sentinel filtering notes. |
| examples/tool-use/main.py | Removes spec/proposal references and clarifies tool_call_id round-trip requirement. |
| examples/production-observability/main.py | Adds cache-hit rollup via usage.cached_tokens and a LlmFailedEvent-driven per-invocation failure-category tracker. |
| examples/chat-with-multimodal/main.py | Removes spec/proposal references from the example header commentary. |
| docs/model-providers/authoring.md | Updates provider authoring guidance to emit typed LLM events on success/failure and documents mutual-exclusion/exception-flow contracts. |
| docs/examples/production-observability.md | Updates walkthrough to match the expanded example output (cache-hit ratio + failure attribution) and new versions/spec pin. |
| docs/concepts/observability.md | Migrates custom-observer guidance to typed LLM events and demotes sentinel namespace to a legacy compatibility surface. |
| docs/agent/non-obvious-shapes.md | Updates observer-event filtering guidance to reflect typed LLM events rather than LLM sentinel NodeEvents. |
| conformance.toml | Updates conformance commentary for proposal attribution and documents the downstream-observable tool_call.arguments encoding byte change. |
| CHANGELOG.md | Rewrites v0.13.0 “Unreleased” entries for correctness (attribution, spec pin journey, and tool_call.arguments change) and consolidates headings. |

Comment thread docs/model-providers/authoring.md
Comment thread docs/model-providers/authoring.md
Comment thread docs/concepts/observability.md

The examples smoke test only proves the demo loads and its
build_graph() compiles. Cover the two queryable observers the
production-observability example ships: cache-token accumulation
and the derived cache-hit ratio, failure-category counting,
mutual exclusion between the success and failure events, the
per-invocation bucket cleanup, and the OTel cache-read attribute.
The persist-output check drives the real persist node offline.

The legacy sentinel-namespace observer example accessed
event.namespace / event.pre_state without narrowing to NodeEvent.
A real observer receives the full ObserverEvent union, where
variants like InvocationCompletedEvent have no namespace, so the
snippet would raise AttributeError. Add an isinstance(event,
NodeEvent) guard so the copy-paste example is correct.
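
The failure mode this commit fixes can be reproduced with a couple of local stand-ins. The classes below only mimic the shape described above (NodeEvent has a namespace, InvocationCompletedEvent does not); their fields and the two helper functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


# Stand-ins mirroring the shapes described in the commit message.
@dataclass
class NodeEvent:
    namespace: str
    pre_state: dict = field(default_factory=dict)


@dataclass
class InvocationCompletedEvent:
    invocation_id: str  # note: no namespace attribute


ObserverEvent = Union[NodeEvent, InvocationCompletedEvent]


def namespace_unguarded(event: ObserverEvent) -> str:
    # Raises AttributeError for any variant that lacks .namespace.
    return event.namespace  # type: ignore[union-attr]


def namespace_guarded(event: ObserverEvent) -> Optional[str]:
    # Narrow to NodeEvent first; other variants are skipped.
    if not isinstance(event, NodeEvent):
        return None
    return event.namespace


node = NodeEvent(namespace="checkpoint.saved")
done = InvocationCompletedEvent(invocation_id="inv-1")

try:
    namespace_unguarded(done)
    raised = False
except AttributeError:
    raised = True
```
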
Copilot AI review requested due to automatic review settings June 10, 2026 02:45

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Comment thread tests/test_production_observability_accumulators.py
Comment thread tests/test_production_observability_accumulators.py
@chris-colinsky chris-colinsky merged commit 43a4ddc into main Jun 10, 2026
7 checks passed
@chris-colinsky chris-colinsky deleted the chore/v0.13.0-release-prep branch June 10, 2026 02:54

2 participants