Merged
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -10,10 +10,11 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The

- **Detached-trace invocation span** (proposal 0061, observability §4.4, spec v0.61.0). The OTel observer now synthesizes an `openarmature.invocation` span at the root of each detached trace (a detached subgraph and each detached fan-out instance), carrying the parent's shared `invocation_id` (detached mode is observer-side trace rendering, not a new run) and the detached unit's own `entry_node`; the detached subgraph / instance span nests under it. A raising detached subgraph surfaces ERROR plus the error category and an OTel exception event on both the parent dispatch span and the detached invocation span. This is observer-side only, with no graph-engine change; the Langfuse observer is unchanged (its Trace entity already plays the invocation-level-container role). Conformance fixtures 008 (rewritten) and 058 (newly wired) run in `test_observability`.
- **Per-attempt LLM spans under call-level retry** (proposal 0050, observability §5.5 / llm-provider §7.1). Completes proposal 0050, which shipped `partial` in v0.14.0 (failure-isolation middleware and the `complete(retry=...)` loop landed then; the per-attempt span surface was deferred). Under call-level retry the OTel observer now emits one `openarmature.llm.complete` span per attempt, each carrying `openarmature.llm.attempt_index` (0-based, 0..N-1, and 0 for a no-retry call). An intermediate failed attempt's span carries ERROR status plus its error category and the request-side attributes; the final attempt's span carries the terminal outcome and, on success, the full response surface. A python-internal `LlmRetryAttemptEvent`, dispatched once per attempt, is the sole source of the OTel span; the terminal `LlmCompletionEvent` / `LlmFailedEvent` stay one per call (payload, latency, Langfuse Generation) and no longer drive the OTel span. Langfuse renders one terminal Generation per call, with the per-attempt detail on the OTel span surface only (a spec-side §8 clarification to pin this is tracked, non-blocking). `conformance.toml` flips proposal 0050 to `implemented`; the call-level fixtures 056-058 are driven through the provider plus OTel observer and the single-attempt observability fixture 057 is wired.
+- **Langfuse `trace.userId` / `trace.sessionId` population** (proposal 0064, observability §8.4.1, spec v0.62.0). The Langfuse observer now promotes a recognized `userId` key in the caller-supplied invocation metadata to Langfuse's first-class `trace.userId` field (the Users dashboard), additively: the key also remains at `trace.metadata.userId`. Promotion is automatic and unconditional; an absent key leaves `trace.userId` unset. The `LangfuseClient.trace()` surface (the Protocol, the in-memory client, and the SDK adapter) gains `session_id` / `user_id`. `trace.sessionId` is sourced from `openarmature.session_id`, which the sessions capability (proposal 0020) establishes; that capability is not yet implemented in python, so the `sessionId` plumbing is in place but dormant (no source) and unset in the interim. `conformance.toml` records proposal 0064 `partial` on that basis: fixture 084 cases 2/3/4 (not session-bound, `userId` present additively, `userId` absent) run, and the session-bound cases 1/5 defer until 0020. Langfuse-only: the OTel side already carries `openarmature.session_id` and `openarmature.user.*` as span attributes, and OTel has no trace-level session/user field.

### Changed

-- **Pinned spec advances v0.60.0 → v0.61.0** (proposal 0061, the detached-trace invocation span above). A single step this cycle; `conformance.toml` records proposal 0061 as `implemented`. Proposal 0050 needed no pin bump of its own (it was already within the pin from its v0.42.0 acceptance); its v0.14.0 `partial` entry flips to `implemented` with the per-attempt span surface above.
+- **Pinned spec advances v0.60.0 → v0.62.0** across the v0.15.0 cycle: v0.61.0 (proposal 0061, the detached-trace invocation span above) and v0.62.0 (proposal 0064, the Langfuse session/user population above). `conformance.toml` records 0061 `implemented` and 0064 `partial` (its `sessionId` half is dormant pending the sessions capability). Proposal 0050 needed no pin bump of its own (it was already within the pin from its v0.42.0 acceptance); its v0.14.0 `partial` entry flips to `implemented` with the per-attempt span surface above.

## [0.14.0] — 2026-06-17

7 changes: 7 additions & 0 deletions conformance.toml
@@ -698,3 +698,10 @@ note = "Descriptive catalog of the failure-mock family (flaky + failure_sequence
status = "implemented"
since = "0.15.0"
note = "The OTel observer synthesizes an openarmature.invocation span at the root of each detached trace (a detached subgraph + each detached fan-out instance), carrying the parent's SHARED invocation_id (detached mode is observer-side trace rendering, not a new run) and the detached unit's own entry_node; the detached subgraph / instance span nests under it. A raising detached subgraph surfaces ERROR + the category + an OTel exception event on BOTH the parent dispatch span and the detached invocation span. Observer-side only -- no graph-engine change; the Langfuse observer is unchanged (its Trace entity already plays the invocation-level-container role). Fixtures 008 (rewritten) and 058 (newly wired) run in test_observability."

+# Spec v0.62.0 (proposal 0064). Langfuse trace.sessionId / trace.userId
+# population (observability §8.4.1 / §8.10).
+[proposals."0064"]
+status = "partial"
+since = "0.15.0"
+note = "The Langfuse observer promotes a recognized userId caller-metadata key to the first-class trace.userId (additive: the key also stays in trace.metadata.userId), and sets trace.sessionId from openarmature.session_id when present. trace.userId is LIVE (sourced from 0034 caller metadata): fixture 084 cases 2/3/4 (not-session-bound, userId present additive, userId absent) pass. partial because trace.sessionId is DORMANT -- openarmature.session_id is established by the sessions capability (0020, observability §5.6), unimplemented in python until v0.19.0, so there is no session_id source yet; the trace(session_id=) plumbing is wired end to end but the observer passes None. Fixture 084 session-bound cases 1 + 5 are deferred (per-case) pending 0020. Langfuse-only: no OTel change (the OTel side already carries openarmature.session_id + openarmature.user.* as span attributes; no trace-level OTel equivalent)."
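
The promotion rule recorded in the note above can be illustrated with a small pure function. This is a sketch under stated assumptions, not the observer's code: `promote_trace_fields` and `RECOGNIZED_USER_KEY` are hypothetical names, and ignoring non-string values is a choice made for the sketch only. The behavior taken from the PR is that a `userId` metadata key is copied (not moved) to the trace-level field, and that `session_id` stays `None` while no source exists.

```python
from __future__ import annotations

from typing import Any

# Hypothetical constant: the PR only says a "recognized" userId key is promoted.
RECOGNIZED_USER_KEY = "userId"


def promote_trace_fields(
    metadata: dict[str, Any] | None,
    session_id: str | None = None,
) -> dict[str, Any]:
    """Build keyword arguments for a trace() call (illustrative only)."""
    md = dict(metadata) if metadata is not None else {}
    raw_user = md.get(RECOGNIZED_USER_KEY)
    # Additive promotion: the key is read, never popped, so it stays in metadata.
    # Assumption of this sketch: non-string values are not promoted.
    user_id = raw_user if isinstance(raw_user, str) else None
    return {"metadata": md, "session_id": session_id, "user_id": user_id}
```

Called with `{"userId": "flight-controller-gene"}`, the result carries the value both as `user_id` and under `metadata["userId"]`; called with `None`, both grouping fields are `None`.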
12 changes: 12 additions & 0 deletions docs/concepts/observability.md
@@ -1048,6 +1048,18 @@ for a runnable demo.
 - **Trace name.** Defaults to the entry-node name (spec §8.6
   fallback). Caller-supplied invocation labels land in PR 4
   (proposal 0034).
+- **Session / user grouping (`trace.sessionId` / `trace.userId`).**
+  The observer populates the two cross-trace grouping fields behind
+  Langfuse's Sessions and Users dashboards (spec §8.4.1, proposal
+  0064). `trace.userId` is promoted from a recognized `userId` key in
+  the caller-supplied invocation metadata, automatically and
+  additively (the key also stays at `trace.metadata.userId`); an
+  absent key leaves it unset. `trace.sessionId` is sourced from
+  `openarmature.session_id` (the sessions capability), which is not
+  yet implemented, so it is unset for now. There is no OTel
+  equivalent (an OTel trace has no trace-level session / user field);
+  the same identity already rides as `openarmature.session_id` and the
+  `openarmature.user.*` family on the OTel span side.
 - **Per-observation metadata.** Each Span / Generation carries
   `namespace`, `step`, `attempt_index`, optional `fan_out_index` /
   `branch_name`, and the `correlation_id` cross-cutting join key
17 changes: 15 additions & 2 deletions examples/langfuse-observability/main.py
@@ -16,7 +16,10 @@
 observation picks that up and links back to the entity, which is how
 production Langfuse dashboards thread "this generation came from prompt
 v7 of `mission-briefing`" without you having to wire anything up
-manually.
+manually. It also tags each trace with a ``userId`` (operator identity)
+via invocation metadata; the observer promotes that to Langfuse's
+first-class user dimension, so the Users dashboard groups and filters
+the assistant's traffic by operator.

 The example uses the bundled ``InMemoryLangfuseClient`` recorder so the
 demo runs without a Langfuse account; at the end we print the captured
@@ -193,6 +196,8 @@ def _format_trace(trace: LangfuseTrace) -> str:
     lines: list[str] = []
     lines.append(f"Trace id={trace.id}")
     lines.append(f" name={trace.name!r}")
+    if trace.user_id is not None:
+        lines.append(f" userId={trace.user_id!r} (promoted to the Langfuse Users dimension)")
     lines.append(f" metadata={_format_metadata(trace.metadata)}")
     for obs in trace.children_of(None):
         _format_observation(lines, trace, obs, indent=" ")
@@ -274,7 +279,15 @@ async def main() -> None:
     graph.attach_observer(observer)

     try:
-        final = await graph.invoke(BriefingState(question=question))
+        # metadata={"userId": ...} tags the trace with an operator
+        # identity. The Langfuse observer promotes a recognized ``userId``
+        # key to the first-class trace.userId field so the Users dashboard
+        # can group and filter traces by operator (additive: it also stays
+        # in trace.metadata.userId).
+        final = await graph.invoke(
+            BriefingState(question=question),
+            metadata={"userId": "flight-controller-gene"},
+        )
     finally:
         # Required for short-lived processes: invoke() returns when the
         # graph reaches END regardless of whether the observer queue
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -63,7 +63,7 @@ Specification = "https://github.com/LunarCommand/openarmature-spec"
 openarmature = "openarmature.cli:main"

 [tool.openarmature]
-spec_version = "0.61.0"
+spec_version = "0.62.0"

 [dependency-groups]
 dev = [
4 changes: 2 additions & 2 deletions src/openarmature/AGENTS.md
@@ -1,6 +1,6 @@
# OpenArmature — Agent documentation

-*This is the agent guide bundled with the openarmature Python package, version 0.14.0 (spec v0.61.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
+*This is the agent guide bundled with the openarmature Python package, version 0.14.0 (spec v0.62.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*

## TL;DR

@@ -10,7 +10,7 @@ OpenArmature is a workflow framework for LLM pipelines and tool-calling agents:

## Capability contracts

-_Sourced from openarmature-spec v0.61.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._
+_Sourced from openarmature-spec v0.62.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._

### Capability: `graph-engine`

2 changes: 1 addition & 1 deletion src/openarmature/__init__.py
@@ -25,7 +25,7 @@
 """

 __version__ = "0.14.0"
-__spec_version__ = "0.61.0"
+__spec_version__ = "0.62.0"
 # Proposal 0052 (spec observability §5.1 / §8.4.1): canonical
 # package-registry name for this implementation. Surfaces on every
 # OTel invocation span as ``openarmature.implementation.name`` and on
18 changes: 17 additions & 1 deletion src/openarmature/observability/langfuse/adapter.py
@@ -209,6 +209,8 @@ def trace(
         id: str,
         name: str | None = None,
         metadata: dict[str, Any] | None = None,
+        session_id: str | None = None,
+        user_id: str | None = None,
     ) -> None:
         # v4 has no explicit trace creation; cache the info and apply
         # it via propagate_attributes on every observation under this
@@ -221,7 +223,15 @@
         # reserved (proposal 0041), so no caller metadata collides.
         if not _is_uuid(id):
             md.setdefault("invocation_id", id)
-        self._trace_info[id] = {"name": name, "metadata": md}
+        # Proposal 0064 §8.4.1: cache the session/user grouping fields so
+        # propagate_attributes can apply them around every observation
+        # under this trace_id (v4 has no explicit trace-create call).
+        self._trace_info[id] = {
+            "name": name,
+            "metadata": md,
+            "session_id": session_id,
+            "user_id": user_id,
+        }

     def update_trace(
         self,
@@ -292,6 +302,8 @@ def _emit_trace_output_synthetic(self, trace_id: str, output: Any) -> None:
                 propagate_attributes(
                     trace_name=entry["name"],
                     metadata=_stringify_metadata(entry["metadata"]),
+                    session_id=entry.get("session_id"),
+                    user_id=entry.get("user_id"),
                 )
             )
             obs = cast(
@@ -438,6 +450,8 @@ def _start_back_dated_generation(
                 propagate_attributes(
                     trace_name=trace_entry["name"],
                     metadata=_stringify_metadata(trace_entry["metadata"]),
+                    session_id=trace_entry.get("session_id"),
+                    user_id=trace_entry.get("user_id"),
                 )
             )
             stack.enter_context(otel_trace_api.use_span(remote_parent_span))
@@ -524,6 +538,8 @@ def _start_observation(
                 propagate_attributes(
                     trace_name=trace_entry["name"],
                     metadata=_stringify_metadata(trace_entry["metadata"]),
+                    session_id=trace_entry.get("session_id"),
+                    user_id=trace_entry.get("user_id"),
                 )
             )
             obs = cast("Any", self._client.start_observation(**kwargs))
16 changes: 16 additions & 0 deletions src/openarmature/observability/langfuse/client.py
@@ -104,6 +104,12 @@ class LangfuseTrace:
     # invocation-boundary events; absent when no observer wrote them.
     input: Any | None = None
     output: Any | None = None
+    # Proposal 0064 §8.4.1: Langfuse's two cross-trace grouping fields.
+    # ``session_id`` groups traces sharing a session (Sessions dashboard);
+    # ``user_id`` populates the Users dimension. Each is unset (None) when
+    # its source is absent.
+    session_id: str | None = None
+    user_id: str | None = None
     observations: list[LangfuseObservation] = field(default_factory=list[LangfuseObservation])

     def find_observation(self, observation_id: str) -> LangfuseObservation | None:
@@ -170,12 +176,18 @@ def trace(
         id: str,
         name: str | None = None,
         metadata: dict[str, Any] | None = None,
+        session_id: str | None = None,
+        user_id: str | None = None,
     ) -> None:
         """Create a new Trace.

         The Trace `id` MUST be the OA invocation_id verbatim.
         Implementations track Traces internally; observation calls
         pass `trace_id` to associate.
+
+        `session_id` / `user_id` (proposal 0064 §8.4.1) populate
+        Langfuse's cross-trace grouping fields (the Sessions / Users
+        dashboards); each is unset when its source is absent.
         """
         # Spec §8.4.1: the Trace id is the OA invocation_id verbatim.
         ...
@@ -368,11 +380,15 @@ def trace(
         id: str,
         name: str | None = None,
         metadata: dict[str, Any] | None = None,
+        session_id: str | None = None,
+        user_id: str | None = None,
     ) -> None:
         self.traces[id] = LangfuseTrace(
             id=id,
             name=name,
             metadata=dict(metadata) if metadata is not None else {},
+            session_id=session_id,
+            user_id=user_id,
         )

     def update_trace(
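
A test against the in-memory recorder might assert on the new fields roughly as follows. This is a sketch, not the package's code: `Trace` and `Recorder` are simplified stand-ins for `LangfuseTrace` and `InMemoryLangfuseClient`, reduced to the fields this PR touches, so it runs without openarmature installed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Trace:
    """Simplified stand-in for LangfuseTrace (hypothetical)."""

    id: str
    name: str | None = None
    metadata: dict[str, Any] = field(default_factory=dict)
    session_id: str | None = None  # None while no session source exists
    user_id: str | None = None  # None when no userId key was supplied


class Recorder:
    """Simplified stand-in for the in-memory client (hypothetical)."""

    def __init__(self) -> None:
        self.traces: dict[str, Trace] = {}

    def trace(
        self,
        id: str,
        name: str | None = None,
        metadata: dict[str, Any] | None = None,
        session_id: str | None = None,
        user_id: str | None = None,
    ) -> None:
        # Copy the metadata so later caller-side mutation cannot leak in.
        self.traces[id] = Trace(
            id=id,
            name=name,
            metadata=dict(metadata) if metadata is not None else {},
            session_id=session_id,
            user_id=user_id,
        )


recorder = Recorder()
# userId present: promoted and still present in metadata (additive).
recorder.trace("inv-a", metadata={"userId": "flight-controller-gene"},
               user_id="flight-controller-gene")
# userId absent: both grouping fields stay unset.
recorder.trace("inv-b")
```

Because both new parameters default to `None`, existing callers that pass only `id`, `name`, and `metadata` keep working unchanged.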