chore(release): v0.10.0-rc1 (#89)

chris-colinsky · web-flow · commit 186bcb1af08c · 2026-05-27T21:52:41.000-07:00
* chore(release): v0.10.0-rc1

Release prep for the v0.10.0 Langfuse observability release.

- CHANGELOG: [0.10.0] entry covering proposals 0031-0036 plus the
  downstream-driven provider / observability hardening.
- conformance.toml: proposals 0031-0036 flipped from not-yet to
  implemented (since = "0.10.0"); spec_pin already at v0.27.1.
- Version → 0.10.0rc1 across pyproject, __init__, smoke test,
  uv.lock; AGENTS.md regenerated with the new version stamp.
- README: added a native-LangfuseObserver bullet — the release
  headline wasn't represented in the feature list.

rc1 publishes to TestPyPI only; the real-release bump to 0.10.0
is a separate commit after rc verification per RELEASING.md.

* Fix two CHANGELOG accuracy nits from spec review

Per spec's v0.10.0 sign-off (coord review-v0-10-0-release msg 02):

- 0033 bullet: replace the vague "typed prompt model-config" with
  the actual surface — Prompt.sampling (a SamplingConfig subclass
  of RuntimeConfig), Prompt.observability_entities, the LabelResolver
  resolution chain, and filesystem layout / sampling-source.
- 0032 bullet: the declared-field promotion is llm-provider §6;
  §8.1 is the OpenAI wire-mapping section. Attribute both correctly.

CHANGELOG-only; no behavior change.
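
For reviewers, a minimal sketch of the fan-out collection semantics the
[0.10.0] entry describes for concat_flatten / merge_all (flatten one
level; fold with last-write-wins per key; strict on shape). This is not
the shipped code: the two-argument (current, update) signature, the
*_sketch names, and the local ReducerError stand-in are assumptions made
for illustration only.

```python
from collections.abc import Mapping
from typing import Any


class ReducerError(Exception):
    """Local stand-in for the framework's ReducerError (hypothetical here)."""


def concat_flatten_sketch(current: list[Any], update: list[Any]) -> list[Any]:
    """Append a list-of-lists update to current, flattening exactly one level."""
    result = list(current)  # copy so the caller's list is not mutated
    for element in update:
        # Strict: every update element must itself be a list.
        if not isinstance(element, list):
            raise ReducerError(f"expected list, got {type(element).__name__}")
        result.extend(element)
    return result


def merge_all_sketch(current: Mapping[str, Any], update: list[Any]) -> dict[str, Any]:
    """Fold a list-of-mappings update into current; later entries win per key."""
    result = dict(current)  # copy so the caller's mapping is not mutated
    for element in update:
        # Strict: every update element must be a mapping.
        if not isinstance(element, Mapping):
            raise ReducerError(f"expected mapping, got {type(element).__name__}")
        result.update(element)
    return result


if __name__ == "__main__":
    # Three fan-out instances' worth of results collapsed into one field.
    print(concat_flatten_sketch([0], [[1, 2], [3]]))  # [0, 1, 2, 3]
    print(merge_all_sketch({"a": 0}, [{"a": 1, "b": 2}, {"b": 3}]))  # {'a': 1, 'b': 3}
```

The real reducers are exported from openarmature.graph; consult the
graph-engine §2 spec text for their actual signatures.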
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,38 @@ All notable changes to `openarmature-python` are documented in this file.
 
 The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The package follows [Semantic Versioning](https://semver.org/); pre-1.0 minor bumps may carry behavioral changes per [spec governance](https://github.com/LunarCommand/openarmature-spec/blob/main/GOVERNANCE.md).
 
+## [0.10.0] — 2026-05-27
+
+Langfuse observability release. The pinned spec advances from v0.22.1 to v0.27.1, absorbing six accepted proposals (0031-0036). The headline is a native Langfuse backend mapping (a sibling to the OTel mapping) driven by a downstream production project integrating OpenArmature with Langfuse; this release also adds caller-supplied invocation metadata, two fan-out collection reducers, and a batch of provider / observability hardening surfaced by that same downstream integration.
+
+### Added
+
+- **`LangfuseObserver` — native Langfuse backend mapping** (proposal 0031, observability §8). An observer that consumes the §6 event stream as a sibling to the OTel observer (both can be attached to one graph; each honors its own opt-out). Maps invocation → Langfuse Trace, node / subgraph / fan-out → Span observation, LLM provider call → Generation observation. Sets the Trace `id` equal to the OA `invocation_id` so cross-system lookup is a direct hit; routes `correlation_id` to `trace.metadata` and every `observation.metadata`. Full subgraph dispatch, per-instance fan-out, and detached-trace-mode parenting (§8.3 / §8.5). Decoupled from the SDK via the `LangfuseClient` Protocol.
+- **`InMemoryLangfuseClient`** — an in-process recorder satisfying `LangfuseClient`, used by the conformance harness and useful for unit tests; captures Traces / Observations verbatim for assertion.
+- **`LangfuseSDKAdapter`** — bridges the real `langfuse>=4.6` SDK to the `LangfuseClient` Protocol (UUID4 → OTel-hex trace-id conversion, `propagate_attributes` on every observation, usage translation). Gated behind the new `[langfuse]` extra (`pip install openarmature[langfuse]`); the observer itself needs no SDK install because the Protocol decouples it.
+- **Public `force_flush(timeout_ms=30_000)` on `OTelObserver` and `LangfuseObserver`** (downstream ask). Wraps the underlying provider / client flush so fast-teardown harnesses (serverless functions, CLI one-shots, FastAPI `TestClient` teardown) can drain the export buffer without reaching into the private `_provider` attribute. Distinct from `CompiledGraph.drain()`, which covers the engine's observer-event queue; `force_flush()` covers the outbound span-export buffer.
+- **Caller-supplied invocation metadata** (proposal 0034, observability §3.4 + §5.6 + §8.4). `invoke(metadata={...})` accepts a per-invocation mapping of `str → AttributeValue` (OTel scalars or homogeneous arrays). The framework propagates every entry to all observability backends: the OTel observer emits each as an `openarmature.user.<key>` cross-cutting span attribute on every span; the Langfuse observer merges each as a top-level key into `trace.metadata` and every `observation.metadata`. `openarmature.observability.set_invocation_metadata(**entries)` augments the in-scope mapping mid-invocation (additive; respects fan-out / parallel-branches per-instance COW scoping); `current_invocation_metadata()` reads it. Boundary validation rejects keys under the reserved `openarmature.*` / `gen_ai.*` prefixes and non-OTel-compatible value types with a synchronous `ValueError`.
+- **`concat_flatten` and `merge_all` reducers** (proposal 0036, graph-engine §2). The fan-out collection analogs of `append` / `merge`: a fan-out subgraph emitting `list[X]` per instance lands `list[list[X]]` at the parent `target_field` (use `concat_flatten` to flatten one level); emitting `dict[str, X]` lands `list[dict]` (use `merge_all` to fold with last-write-wins per key). Both are strict — they raise `ReducerError` (graph-engine §4) when an update element isn't the expected list / mapping shape. Exported from `openarmature.graph`; the required-built-in set grows from three to five.
+- **Three new `RuntimeConfig` declared fields** (proposal 0032, llm-provider §6): `frequency_penalty`, `presence_penalty`, and `stop_sequences`. Surfaced on the OpenAI wire body per §8.1 (with `stop_sequences` renaming to OpenAI's `stop` key) and as `gen_ai.request.*` span attributes. Per the §6 null-skip rule, each declared field with value `None` is omitted from the wire body.
+- **Prompt-management surface refinements** (proposal 0033). `Prompt.sampling` (a `SamplingConfig` subclass of `RuntimeConfig`), `Prompt.observability_entities`, the LabelResolver three-step resolution chain (explicit > resolver > `"production"`), and filesystem layout / sampling-source ergonomics for the prompt-management capability.
+- **Self-hosted vLLM cookbook** at `docs/model-providers/vllm.md` — base-URL contract, the structured-output fallback flag, the `genai_system="vllm"` override, readiness-probe limitations + warm-up pattern, and tool calling.
+- **`conformance.toml` manifest + CI guard.** A machine-readable record of which spec proposals are implemented and since which version, validated against the pinned spec submodule by `scripts/check_conformance_manifest.py` on every PR. Consumed by the spec docs site to render per-implementation status.
+
+### Changed
+
+- **`OpenAIProvider` rejects a `/v1` suffix on `base_url`** (downstream-surfaced bug). httpx joins base URLs by appending, so `base_url="https://host/v1"` plus the provider's `/v1/chat/completions` request produced a doubled `/v1/v1/...` wire path that silently 404/405'd on most backends while the readiness probe stayed green. The provider now raises `ValueError` at construction when `base_url`'s path ends in `/v1` (with or without a trailing slash, and through query strings / fragments). Other non-empty paths (proxy prefixes) are left intact. No existing users were affected; this is the first production integration.
+- **`metadata.subgraph_name` / `openarmature.subgraph.name` carries the compiled-subgraph identity** (proposal 0035 resolution), not the wrapper node name. `SubgraphNode` and `FanOutConfig` gain an optional `subgraph_identity`; the engine threads it through `NodeEvent.subgraph_identities` to the observers. Falls back to the empty string when no identity is tracked (observability §5.3). Distinct from the observation's `name` / namespace, which remain the wrapper node name.
+
+### Fixed
+
+- **`entry_node` / trace name when the outer entry is a `SubgraphNode`.** Subgraph wrappers don't emit their own events, so the first event the observer saw came from inside the subgraph; the Langfuse observer recorded the inner node as the trace's `entry_node`. Now resolves to `event.namespace[0]` (the outer entry).
+- **Detached-mode link observation no longer carries `subgraph_name`.** In detached-trace mode the wrapper role migrates to the detached trace; the parent trace's link observation is the SubgraphNode span (no wrapper role) and must not carry `subgraph_name`.
+
+### Notes
+
+- **Pinned spec version bumped from v0.22.1 to v0.27.1 over the v0.10.0 cycle.** Six proposals absorbed: 0031 (observability Langfuse mapping, v0.23.0), 0032 (RuntimeConfig declared-field expansion, v0.24.0), 0033 (prompt-management surface refinements, v0.25.0), 0034 (caller-supplied invocation metadata, v0.26.0), 0035 (Langfuse graph-topology conformance fixtures, v0.26.1 + v0.27.1 fixture corrections), and 0036 (fan-out collection reducers `concat_flatten` / `merge_all`, v0.27.0). All conformance fixtures pass against the v0.27.1 pin, including the un-deferred Langfuse subgraph / fan-out / detached-trace fixtures and the two new reducer fixtures.
+- **`langfuse>=4.6,<5` is the supported SDK range** for `LangfuseSDKAdapter`, validated end-to-end against Langfuse Cloud. The v4 SDK's `flush()` is synchronous but exposes no timeout parameter, so `LangfuseObserver.force_flush(timeout_ms=...)` accepts the argument for Protocol symmetry but the underlying flush honors the SDK's own deadlines (best-effort).
+
 ## [0.9.0] — 2026-05-25
 
 ### Added
diff --git a/README.md b/README.md
@@ -56,6 +56,9 @@ The OpenTelemetry mapping mandates a private `TracerProvider`. That prevents the
 **LLM spans LLM-aware backends can actually read.**<br>
 Each `provider.complete()` call emits a dedicated `openarmature.llm.complete` span carrying both the framework's `openarmature.llm.*` attributes and the cross-vendor OpenTelemetry GenAI semantic conventions (`gen_ai.system`, `gen_ai.request.*`, `gen_ai.response.*`, `gen_ai.usage.*`). Langfuse, Phoenix, Honeycomb's LLM lens — they render generations correctly out of the box, no per-service attribute-mapping shim required. Input/output payload emission is opt-in (`disable_llm_payload=False`), default-off because the payload may contain PII; image bytes are unconditionally redacted at the provider so they never enter the observability stream.
 
+**Native Langfuse mapping, not just OTLP.**<br>
+Alongside the OpenTelemetry mapping, `LangfuseObserver` (in `openarmature[langfuse]`) maps invocations to Langfuse Traces and Observations directly — subgraph hierarchy, per-instance fan-out, and detached-trace mode included. Both observers can run on one graph. Caller-supplied invocation metadata (`invoke(metadata={"tenantId": ...})`) propagates to every backend at once: `openarmature.user.*` span attributes on the OTel side, top-level `trace.metadata` / `observation.metadata` keys on the Langfuse side.
+
 ## Hello World
 
 About a hundred lines that show the engine in action. Three reducer policies declared on one state class. Three LLM calls each returning typed structured output (Pydantic class on two, raw JSON Schema dict on the third). Conditional routing as a pure function of state, not a hidden state machine. An observer attached at compile time that sees every node boundary the engine emits. Requires Python 3.12 or later and an OpenAI-compatible endpoint (defaults to OpenAI public API; works against any local server too).
diff --git a/conformance.toml b/conformance.toml
@@ -151,25 +151,27 @@ since = "0.9.0"
 note = "Drain snapshot semantic and timeout-input validation already implemented as part of the proposal 0010 impl PR (v0.9.0); no additional module-level work needed."
 
 # Spec v0.23.0-v0.27.1 batch (proposals 0031, 0032, 0033, 0034, 0035,
-# 0036). All six have impl work landing across the v0.10.0 release
-# cycle; status stays `not-yet` until the release PR flips them to
-# `implemented` with `since = "0.10.0"`. The pinned spec submodule
-# advances ahead of the impl status because newer fixtures need to be
-# visible to the conformance harness as each PR lands.
+# 0036), all shipped in the v0.10.0 release.
 [proposals."0031"]
-status = "not-yet"
+status = "implemented"
+since = "0.10.0"
 
 [proposals."0032"]
-status = "not-yet"
+status = "implemented"
+since = "0.10.0"
 
 [proposals."0033"]
-status = "not-yet"
+status = "implemented"
+since = "0.10.0"
 
 [proposals."0034"]
-status = "not-yet"
+status = "implemented"
+since = "0.10.0"
 
 [proposals."0035"]
-status = "not-yet"
+status = "implemented"
+since = "0.10.0"
 
 [proposals."0036"]
-status = "not-yet"
+status = "implemented"
+since = "0.10.0"
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "openarmature"
-version = "0.9.0"
+version = "0.10.0rc1"
 description = "Workflow framework for LLM pipelines and tool-calling agents."
 readme = "README.md"
 requires-python = ">=3.12"
diff --git a/src/openarmature/AGENTS.md b/src/openarmature/AGENTS.md
@@ -1,6 +1,6 @@
 # OpenArmature — Agent documentation
 
-*This is the agent guide bundled with the openarmature Python package, version 0.9.0 (spec v0.27.1). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
+*This is the agent guide bundled with the openarmature Python package, version 0.10.0rc1 (spec v0.27.1). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
 
 ## TL;DR
 
diff --git a/src/openarmature/__init__.py b/src/openarmature/__init__.py
@@ -24,5 +24,5 @@
    sessions opening the project find the bundled docs automatically.
 """
 
-__version__ = "0.9.0"
+__version__ = "0.10.0rc1"
 __spec_version__ = "0.27.1"
diff --git a/tests/test_smoke.py b/tests/test_smoke.py
@@ -8,7 +8,7 @@
 
 
 def test_package_versions() -> None:
-    assert openarmature.__version__ == "0.9.0"
+    assert openarmature.__version__ == "0.10.0rc1"
     assert openarmature.__spec_version__ == "0.27.1"
 
 
diff --git a/uv.lock b/uv.lock