Commit 2b4bc1b

Add [langfuse] extras and SDK adapter (PR 3.6) (#82)
* Add [langfuse] extras and SDK adapter

Validates the Langfuse observer against the real langfuse>=4.6 SDK and ships a bridge so production users get the same Protocol-shaped observer surface as the InMemoryLangfuseClient.

[langfuse] optional-dependency group pins langfuse>=4.6,<5. The v4 SDK is structurally different from v2 / v3 — traces are auto-created when the first observation starts, span and generation collapse into start_observation with as_type, trace_id threads through TraceContext, and trace-level metadata sets via propagate_attributes context manager. Per Chris's directive ("no existing-user constraint; do what's right for OA"), the adapter targets v4 only; earlier SDK versions are out of scope.

LangfuseSDKAdapter wraps the v4 client to satisfy the four-method LangfuseClient Protocol. Key translations:

- UUID4 invocation_id -> OTel-hex trace_id (32 chars, no dashes). v4 fails int(uuid, 16) parsing on the dashed form; OA's observer error-isolation pattern swallowed that as a warnings.warn, silently dropping traces.
- propagate_attributes(trace_name=, metadata=) runs on EVERY observation under each trace_id (not just the first). Without this, v4's last-attribute-wins display logic let later observations clobber the trace's display name to whatever the final observation was called.
- usage values translate from the Protocol's LangfuseUsage record to v4's usage_details dict (int values only).
- Returned LangfuseSpan / LangfuseGeneration handles wrap into _SpanHandle to expose the .update() / .end() the observer calls.

Trace-info cache persists per trace_id rather than popping on first observation. Memory is linear in unique trace_ids; a close_trace cleanup hook is deferred to a future PR.

Tests:

- Five unit tests covering Protocol satisfaction, observer construction, trace_info cache lifecycle, update_trace merge, UUID4 -> OTel-hex conversion (with idempotency on already-hex and non-UUID passthrough).
- One opt-in integration test against real Langfuse Cloud, gated by @pytest.mark.integration + LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY env vars. Calls auth_check() to fail loud on bad credentials, client.shutdown() for synchronous batch-exporter drain. Accepts LANGFUSE_HOST or LANGFUSE_BASE_URL for the host.
- New pytest marker config: addopts defaults to "-m not integration" so CI auto-skips integration tests; run with -m integration to include.

Docs:

- docs/concepts/observability.md flips the "no SDK version validated" disclosure to the validated v4 state, shows the LangfuseSDKAdapter wire-up snippet.
- docs/examples/10-langfuse-observability.md same.
- examples/10-langfuse-observability/main.py docstring + inline comment updated with the production swap recipe.
- AGENTS.md regenerated.

Validated end-to-end against Langfuse Cloud (US region) with langfuse 4.7.0: a two-node graph produces one Trace with entry-node-name set as the display name, both nodes as Span observations under it, and the spec §8.4.1 trace metadata (correlation_id, entry_node, spec_version) populated.

Fifth of 6 core PRs in the v0.10.0 batch (PR 3.6).

* Align adapter docs with propagate-on-every behavior

Three stale-doc residues from when the adapter consumed trace_info on the first observation only. The current behavior — propagate on every observation under each trace_id to avoid v4's last-attribute-wins display logic clobbering the trace name — got caught by the integration-test run and the cache+propagation refactored, but the module header comments and one example-doc paragraph still described the original "first observation only" path. Comment / doc text updated; no code change.

Addresses CoPilot PR review feedback on #82.
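Reviewer note: the UUID4 -> OTel-hex translation described in the commit message could look roughly like the sketch below. The function name `to_otel_trace_id` is hypothetical (the adapter's real helper is not shown in this diff); only the behavior is taken from the message: dashed UUIDs become 32 hex characters, already-hex input is returned unchanged, and non-UUID strings pass through.

```python
import uuid

_HEX_DIGITS = set("0123456789abcdefABCDEF")


def to_otel_trace_id(invocation_id: str) -> str:
    """Hypothetical helper: map a dashed UUID string to a 32-char hex trace id."""
    # Already 32 hex characters: return unchanged so the call is idempotent.
    if len(invocation_id) == 32 and set(invocation_id) <= _HEX_DIGITS:
        return invocation_id
    try:
        # uuid.UUID parses the dashed form; .hex drops the dashes.
        return uuid.UUID(invocation_id).hex
    except ValueError:
        # Not a UUID at all: pass the value through untouched.
        return invocation_id


dashed = "01234567-89ab-4def-8123-456789abcdef"
hex_id = to_otel_trace_id(dashed)
```

Passing the dashed form straight to `int(value, 16)` raises `ValueError`, which matches the failure mode the message says was previously swallowed as a warning.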
1 parent 62691fe commit 2b4bc1b

8 files changed

Lines changed: 835 additions & 102 deletions

docs/concepts/observability.md

Lines changed: 45 additions & 16 deletions
@@ -610,26 +610,55 @@ graph.attach_observer(observer)
 The `client` is anything matching the `LangfuseClient` Protocol —
 the bundled `InMemoryLangfuseClient` (used by the conformance
 harness, useful for unit tests), or a real `langfuse.Langfuse()`
-instance from the [Langfuse Python SDK](https://github.com/langfuse/langfuse-python).
-The Protocol declares only the methods the observer calls, so SDK
-versions whose shape matches drop in directly. SDK versions whose
-shape diverges (renamed kwargs, return-type quirks) plug in via a
-small adapter; see
+instance wrapped in `LangfuseSDKAdapter` for production. Install
+the optional extras to bring in the Langfuse SDK:
+
+```bash
+pip install 'openarmature[langfuse]'
+```
+
+Production wire-up:
+
+```python
+from langfuse import Langfuse
+from openarmature.observability.langfuse import (
+    LangfuseObserver,
+    LangfuseSDKAdapter,
+)
+
+langfuse_client = Langfuse(
+    public_key="pk-lf-...",
+    secret_key="sk-lf-...",
+    host="https://cloud.langfuse.com",
+)
+observer = LangfuseObserver(
+    client=LangfuseSDKAdapter(langfuse_client),
+    disable_llm_payload=False,
+)
+```
+
+The adapter bridges `langfuse>=4.6`'s unified `start_observation`
+API onto our `LangfuseClient` Protocol; the observer code is the
+same in tests and production. See
 [`examples/10-langfuse-observability`](../examples/10-langfuse-observability.md)
-for the runnable demo plus the adapter shape.
+for a runnable demo.

 !!! note "Langfuse SDK version compatibility"

-    No specific `langfuse` SDK version is validated in CI as of this
-    release. The Protocol mirrors the SDK's documented low-level
-    `trace` / `span` / `generation` shape, but the SDK has shifted
-    between major versions (v2 → v3 introduced API changes). A
-    follow-on release pins a tested `[langfuse]` extras range and
-    ships a runtime `isinstance` check confirming the SDK satisfies
-    the Protocol. Until then, treat production wire-up as a "verify
-    in your own environment" path: bring the langfuse version your
-    stack already uses, run a smoke trace, and write a thin adapter
-    if any kwargs don't line up.
+    Validated against `langfuse>=4.6,<5`. The v4 SDK introduced an
+    OTel-based architecture with `start_observation` /
+    `propagate_attributes` replacing the v2/v3 `trace` / `span` /
+    `generation` low-level API; the bundled `LangfuseSDKAdapter`
+    handles the bridge so the observer surface is stable across
+    future v4 patches.
+
+    Earlier SDK versions (v2.x, v3.x) are NOT supported. Projects on
+    those versions either upgrade to v4 or supply their own adapter
+    matching the `LangfuseClient` Protocol's four methods.
+
+    A runtime `isinstance(adapter, LangfuseClient)` check ships in
+    the unit suite — if a future v4 patch breaks the Protocol's
+    surface, the test fails loudly.

 ### What Langfuse sees

docs/examples/10-langfuse-observability.md

Lines changed: 29 additions & 19 deletions
@@ -114,35 +114,45 @@ Trace id=01234567-89ab-...

 ## Swapping to a real Langfuse SDK

-The observer's `client` parameter is `LangfuseClient`-Protocol-typed,
-so any structurally-compatible value works:
+Install the optional extras:
+
+```bash
+pip install 'openarmature[langfuse]'
+```
+
+Wrap the SDK client with `LangfuseSDKAdapter` and pass it to the
+observer:

 ```python
 from langfuse import Langfuse
+from openarmature.observability.langfuse import (
+    LangfuseObserver,
+    LangfuseSDKAdapter,
+)

-client = Langfuse(
+langfuse_client = Langfuse(
     public_key="pk-lf-...",
     secret_key="sk-lf-...",
     host="https://cloud.langfuse.com",
 )
-observer = LangfuseObserver(client=client, disable_llm_payload=False)
+observer = LangfuseObserver(
+    client=LangfuseSDKAdapter(langfuse_client),
+    disable_llm_payload=False,
+)
 ```

-If the installed SDK version's `trace` / `span` / `generation` method
-signatures match the Protocol exactly, this is the whole change. If
-they diverge (renamed kwargs, return-type quirks), wrap the SDK in a
-small adapter class that implements `LangfuseClient` and delegates to
-the SDK call-by-call. The Protocol surface is narrow — four methods —
-so the adapter is on the order of 40 lines.
-
-**No specific `langfuse` SDK version is validated in CI as of this
-release.** The Protocol matches the SDK's documented low-level shape,
-but `langfuse` has shifted between major versions (v2 → v3 introduced
-API changes). A follow-on release pins a tested `[langfuse]` extras
-range and a runtime `isinstance(client, LangfuseClient)` check; until
-then, smoke-trace in your own environment with whichever `langfuse`
-version your stack already uses and write a thin adapter if any
-kwargs don't line up.
+The adapter bridges `langfuse>=4.6,<5`'s unified `start_observation`
+API onto OA's four-method `LangfuseClient` Protocol. v4 has no
+explicit trace creation (traces are auto-created from observations);
+the adapter caches trace info from `.trace()` and applies it via
+`propagate_attributes` around EVERY observation under that trace_id.
+Propagating on every observation keeps v4's last-attribute-wins
+display logic from clobbering the trace's display name when later
+observations land without the attribute set.
+
+Validated against `langfuse>=4.6,<5`. v2.x and v3.x are NOT
+supported — supply your own adapter against the same four-method
+Protocol if you need to stay on an older version.

 For prompt linkage: in production, the
 `Prompt.observability_entities['langfuse_prompt']` value is the SDK's
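
Reviewer note: the "propagate on EVERY observation" rationale in this hunk is easier to see with a toy model. The sketch below does not use the Langfuse SDK; `FakeBackend` and `CachingAdapter` are hypothetical stand-ins that only mimic the last-attribute-wins behavior the doc text describes, so treat it as an illustration of the reasoning rather than the adapter's implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeBackend:
    """Toy last-attribute-wins trace display (an assumption, not the SDK)."""

    trace_names: dict = field(default_factory=dict)

    def record(self, trace_id: str, obs_name: str, trace_name: Optional[str]) -> None:
        # Without a propagated trace name, the observation's own name wins.
        self.trace_names[trace_id] = trace_name or obs_name


@dataclass
class CachingAdapter:
    """Caches trace info from trace() and applies it around observations."""

    backend: FakeBackend
    propagate_every: bool = True
    trace_info: dict = field(default_factory=dict)
    seen: set = field(default_factory=set)

    def trace(self, trace_id: str, name: str) -> None:
        # Nothing is sent yet; the name is only cached for later observations.
        self.trace_info[trace_id] = name

    def span(self, trace_id: str, name: str) -> None:
        first = trace_id not in self.seen
        self.seen.add(trace_id)
        apply = self.propagate_every or first
        cached = self.trace_info.get(trace_id) if apply else None
        self.backend.record(trace_id, name, cached)


def final_trace_name(propagate_every: bool) -> str:
    backend = FakeBackend()
    adapter = CachingAdapter(backend, propagate_every=propagate_every)
    adapter.trace("t1", name="entry-node")
    adapter.span("t1", name="entry-node")
    adapter.span("t1", name="second-node")
    return backend.trace_names["t1"]
```

In this model, `final_trace_name(True)` keeps `"entry-node"`, while `final_trace_name(False)` ends up as `"second-node"`, which is the clobbering the earlier "first observation only" path ran into.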

examples/10-langfuse-observability/main.py

Lines changed: 17 additions & 6 deletions
@@ -21,8 +21,12 @@
 The example uses the bundled ``InMemoryLangfuseClient`` recorder so the
 demo runs without a Langfuse account — at the end we print the captured
 Trace + Observation tree. Swapping to a real ``langfuse.Langfuse()``
-client is a one-line constructor change (see the comment near the
-observer build below).
+client is a one-line constructor change via ``LangfuseSDKAdapter`` (see
+the comment near the observer build below). The adapter bridges the
+``langfuse>=4.6`` Python SDK shape onto OA's ``LangfuseClient``
+Protocol. Install with::
+
+    pip install 'openarmature[langfuse]'

 LLM calls go through ``openarmature.llm.OpenAIProvider``.

@@ -244,11 +248,18 @@ async def main() -> None:
     # fields — without needing a Langfuse account. For production:
     #
     # from langfuse import Langfuse
-    # client = Langfuse(public_key=..., secret_key=..., host=...)
+    # from openarmature.observability.langfuse import LangfuseSDKAdapter
+    #
+    # langfuse_client = Langfuse(
+    # public_key="pk-lf-...",
+    # secret_key="sk-lf-...",
+    # host="https://cloud.langfuse.com",
+    # )
+    # client = LangfuseSDKAdapter(langfuse_client)
     #
-    # Replace the InMemoryLangfuseClient construction below with that
-    # client. The observer code doesn't change — the client is
-    # Protocol-typed, so any structurally-compatible value works.
+    # Validated against ``langfuse>=4.6,<5``. The adapter bridges
+    # langfuse v4's unified ``start_observation`` API onto OA's
+    # ``LangfuseClient`` Protocol; the observer code doesn't change.
     client = InMemoryLangfuseClient()

     # disable_llm_payload=False opts in to capturing the input messages
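
Reviewer note: the reason "the observer code doesn't change" is structural typing against a Protocol. The sketch below is a minimal illustration; `ClientProtocol` and both client classes are hypothetical, and the four method names are inferred from the commit message and docs (`trace`, `update_trace`, `span`, `generation`), with assumed signatures that may not match the real `LangfuseClient`.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ClientProtocol(Protocol):
    """Hypothetical four-method shape; real signatures may differ."""

    def trace(self, **kwargs: Any) -> Any: ...
    def update_trace(self, **kwargs: Any) -> Any: ...
    def span(self, **kwargs: Any) -> Any: ...
    def generation(self, **kwargs: Any) -> Any: ...


class RecordingClient:
    """In-memory stand-in: records calls instead of sending them anywhere."""

    def __init__(self) -> None:
        self.calls = []

    def trace(self, **kwargs: Any) -> None:
        self.calls.append(("trace", kwargs))

    def update_trace(self, **kwargs: Any) -> None:
        self.calls.append(("update_trace", kwargs))

    def span(self, **kwargs: Any) -> None:
        self.calls.append(("span", kwargs))

    def generation(self, **kwargs: Any) -> None:
        self.calls.append(("generation", kwargs))


class IncompleteClient:
    """Missing three of the four methods, so it does not satisfy the Protocol."""

    def trace(self, **kwargs: Any) -> None:
        pass


satisfies = isinstance(RecordingClient(), ClientProtocol)
rejects = not isinstance(IncompleteClient(), ClientProtocol)
```

A `runtime_checkable` `isinstance` check only verifies that the methods exist, not that their signatures match, so a check like this catches a removed method but not a renamed keyword argument.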

pyproject.toml

Lines changed: 15 additions & 0 deletions
@@ -42,6 +42,13 @@ otel = [
     # below 1.0; revisit when 1.0 lands.
     "opentelemetry-instrumentation-logging>=0.62.0b1",
 ]
+# Spec observability §8 Langfuse mapping. Optional per charter §3.1
+# principle 5; matches the [otel] extras' shape. Validated against
+# Langfuse Python SDK 4.6.x; bridge from the SDK's v4 API onto the
+# LangfuseClient Protocol lives in ``observability.langfuse.adapter``.
+langfuse = [
+    "langfuse>=4.6,<5",
+]

 [project.urls]
 Repository = "https://github.com/LunarCommand/openarmature-python"
@@ -105,3 +112,11 @@ select = ["E", "F", "I", "B", "UP"]
 [tool.pytest.ini_options]
 testpaths = ["tests"]
 asyncio_mode = "auto"
+# Opt-in markers for tests that hit real external services. CI skips
+# these by default — run with `-m integration` to include them after
+# setting the relevant env vars (LANGFUSE_PUBLIC_KEY / SECRET_KEY for
+# Langfuse Cloud, etc).
+markers = [
+    "integration: tests that exercise real external services (Langfuse Cloud, HyperDX). Skipped by default.",
+]
+addopts = ["-m", "not integration"]
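
Reviewer note: the integration test itself is not part of the hunks shown here, so the following is only a sketch of the environment gating the commit message describes. The helper name `langfuse_env` is hypothetical, and checking `LANGFUSE_HOST` before `LANGFUSE_BASE_URL` is an assumption; the message only says either variable is accepted.

```python
import os
from typing import Mapping, Optional


def langfuse_env(environ: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Hypothetical helper: collect Langfuse settings, or None if keys are missing."""
    public_key = environ.get("LANGFUSE_PUBLIC_KEY")
    secret_key = environ.get("LANGFUSE_SECRET_KEY")
    if not public_key or not secret_key:
        # Signal the caller to skip the integration test rather than fail.
        return None
    settings = {"public_key": public_key, "secret_key": secret_key}
    # Either variable may carry the host; the precedence here is assumed.
    host = environ.get("LANGFUSE_HOST") or environ.get("LANGFUSE_BASE_URL")
    if host:
        settings["host"] = host
    return settings


example = langfuse_env(
    {
        "LANGFUSE_PUBLIC_KEY": "pk-lf-example",
        "LANGFUSE_SECRET_KEY": "sk-lf-example",
        "LANGFUSE_BASE_URL": "https://example.invalid",
    }
)
```

A test carrying `@pytest.mark.integration` could call a helper like this and invoke `pytest.skip(...)` when it returns `None`, so an explicit `-m integration` run without credentials skips instead of erroring.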

src/openarmature/observability/langfuse/__init__.py

Lines changed: 14 additions & 0 deletions
@@ -42,6 +42,18 @@
 )
 from .observer import LangfuseObserver

+# LangfuseSDKAdapter requires the [langfuse] optional dependency.
+# Surface it when available, but don't force the import on consumers
+# who only use the InMemoryLangfuseClient — the adapter module's own
+# guard raises an informative ImportError if anyone tries to use it
+# without the extras installed.
+try:
+    from .adapter import LangfuseSDKAdapter as LangfuseSDKAdapter
+
+    _adapter_available = True
+except ImportError:  # pragma: no cover - exercised by extras-not-installed path
+    _adapter_available = False
+
 __all__ = [
     "InMemoryLangfuseClient",
     "LangfuseClient",
@@ -54,3 +66,5 @@
     "ObservationLevel",
     "ObservationType",
 ]
+if _adapter_available:
+    __all__.append("LangfuseSDKAdapter")
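
Reviewer note: the optional-export guard above can be exercised in isolation with a small stand-alone sketch. The helper `optional_attr` and the module name `oa_missing_extra_xyz` are hypothetical, chosen only to show the "export the name only when the import succeeds" pattern with the standard library.

```python
import importlib
from typing import Any, Optional


def optional_attr(module_name: str, attr: str) -> Optional[Any]:
    """Return module.attr, or None when the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError subclasses ImportError, so a missing extra lands here.
        return None
    return getattr(module, attr)


# Names that are always exported, mirroring a static __all__ list.
exports = ["always_available"]

# Hypothetical optional dependency that is not installed.
adapter = optional_attr("oa_missing_extra_xyz", "Adapter")
if adapter is not None:
    exports.append("Adapter")

# A module that does exist resolves normally.
dumps = optional_attr("json", "dumps")
```

The trade-off of this pattern is that a missing extra surfaces as an absent name at import time, so the informative error has to come from the point of use, which is what the comment in the hunk says the adapter module's own guard does.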
