Add [langfuse] extras and SDK adapter (PR 3.6) (#82)
* Add [langfuse] extras and SDK adapter
Validates the Langfuse observer against the real langfuse>=4.6 SDK
and ships a bridge so production users get the same Protocol-shaped
observer surface as the InMemoryLangfuseClient.
The [langfuse] optional-dependency group pins langfuse>=4.6,<5. The
v4 SDK is structurally different from v2 / v3: traces are
auto-created when the first observation starts, span and generation
collapse into start_observation with as_type, trace_id threads
through TraceContext, and trace-level metadata is set via the
propagate_attributes context manager. Per Chris's directive ("no
existing-user constraint; do what's right for OA"), the adapter
targets v4 only; earlier SDK versions are out of scope.
LangfuseSDKAdapter wraps the v4 client to satisfy the four-method
LangfuseClient Protocol. Key translations:
- UUID4 invocation_id -> OTel-hex trace_id (32 chars, no dashes).
v4 fails int(uuid, 16) parsing on the dashed form; OA's observer
error-isolation pattern swallowed that as a warnings.warn,
silently dropping traces.
- propagate_attributes(trace_name=, metadata=) runs on EVERY
observation under each trace_id (not just the first). Without
this, v4's last-attribute-wins display logic let later
observations clobber the trace's display name to whatever the
final observation was called.
- usage values translate from the Protocol's LangfuseUsage record
to v4's usage_details dict (int values only).
- Returned LangfuseSpan / LangfuseGeneration handles wrap into
_SpanHandle to expose the .update() / .end() the observer calls.
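A minimal sketch of two of the translations above (trace_id
normalization and usage conversion), using only the standard library.
The names to_otel_trace_id and to_usage_details, and the fields on
the LangfuseUsage stand-in, are assumptions for illustration; the
adapter's actual helpers and the real record may differ.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


def to_otel_trace_id(invocation_id: str) -> str:
    """Return the 32-char hex form of a UUID string; pass other strings through."""
    try:
        # uuid.UUID accepts both the dashed and the bare 32-hex form,
        # so an already-converted ID maps to itself.
        return uuid.UUID(invocation_id).hex
    except ValueError:
        # Not a UUID in any accepted form: leave the caller's ID untouched.
        return invocation_id


@dataclass
class LangfuseUsage:
    # Hypothetical stand-in for the Protocol's usage record.
    input: Optional[int] = None
    output: Optional[int] = None
    total: Optional[int] = None


def to_usage_details(usage: LangfuseUsage) -> Dict[str, int]:
    """Build an int-only dict, dropping fields that were never set."""
    raw = {"input": usage.input, "output": usage.output, "total": usage.total}
    return {key: int(value) for key, value in raw.items() if value is not None}
```

For reference, int("123e4567-e89b-42d3-a456-426614174000", 16) raises
ValueError in plain Python, which is the failure mode the first
bullet describes for the dashed form.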
The trace-info cache persists per trace_id rather than popping on the
first observation. Memory grows linearly with the number of unique
trace_ids; a close_trace cleanup hook is deferred to a future PR.
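The cache behavior described above can be sketched roughly as
follows. TraceInfoCache and its method names are hypothetical, and
close_trace is shown only to illustrate what the deferred cleanup
hook might look like; none of this is taken from the adapter's
source.

```python
from typing import Any, Dict, Optional


class TraceInfoCache:
    """Per-trace_id store that is read on every observation, never popped."""

    def __init__(self) -> None:
        self._info: Dict[str, Dict[str, Any]] = {}

    def update_trace(
        self,
        trace_id: str,
        name: Optional[str] = None,
        metadata: Optional[Dict[str, Any]] = None,
    ) -> None:
        # Merge into any existing entry instead of replacing it.
        entry = self._info.setdefault(trace_id, {"name": None, "metadata": {}})
        if name is not None:
            entry["name"] = name
        if metadata:
            entry["metadata"].update(metadata)

    def get(self, trace_id: str) -> Dict[str, Any]:
        # Non-destructive read: later observations see the same values.
        entry = self._info.get(trace_id, {"name": None, "metadata": {}})
        return {"name": entry["name"], "metadata": dict(entry["metadata"])}

    def close_trace(self, trace_id: str) -> None:
        # Hypothetical cleanup hook; without it the cache only grows.
        self._info.pop(trace_id, None)

    def __len__(self) -> int:
        return len(self._info)
```

Because get() does not pop, every observation under a trace_id can
re-apply the same trace name and metadata, at the cost of one cached
entry per unique trace_id until something like close_trace exists.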
Tests:
- Five unit tests covering Protocol satisfaction, observer
construction, trace_info cache lifecycle, update_trace merge, and
UUID4 -> OTel-hex conversion (with idempotency on already-hex and
non-UUID passthrough).
- One opt-in integration test against real Langfuse Cloud, gated by
@pytest.mark.integration + LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY
env vars. Calls auth_check() to fail loud on bad credentials and
client.shutdown() to drain the batch exporter synchronously. Accepts
LANGFUSE_HOST or LANGFUSE_BASE_URL for the host.
- New pytest marker config: addopts defaults to "-m not integration"
so CI auto-skips integration tests; run with -m integration to
include.
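A rough sketch of the environment gating the integration test relies
on, written against the standard library only. resolve_langfuse_env
is a hypothetical helper, and the precedence of LANGFUSE_HOST over
LANGFUSE_BASE_URL is an assumption; the commit only says either is
accepted.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_langfuse_env(
    env: Mapping[str, str] = os.environ,
) -> Optional[Dict[str, Optional[str]]]:
    """Return connection settings, or None when credentials are absent."""
    public_key = env.get("LANGFUSE_PUBLIC_KEY")
    secret_key = env.get("LANGFUSE_SECRET_KEY")
    if not public_key or not secret_key:
        # Caller should skip the integration test in this case.
        return None
    # Assumed precedence: LANGFUSE_HOST first, then LANGFUSE_BASE_URL.
    host = env.get("LANGFUSE_HOST") or env.get("LANGFUSE_BASE_URL")
    return {"public_key": public_key, "secret_key": secret_key, "host": host}
```

A test marked with @pytest.mark.integration could call this helper
and skip when it returns None, so a run with -m integration still
degrades gracefully on machines without credentials.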
Docs:
- docs/concepts/observability.md flips the "no SDK version validated"
disclosure to the validated v4 state and shows the
LangfuseSDKAdapter wire-up snippet.
- docs/examples/10-langfuse-observability.md receives the same update.
- examples/10-langfuse-observability/main.py docstring + inline
comment updated with the production swap recipe.
- AGENTS.md regenerated.
Validated end-to-end against Langfuse Cloud (US region) with
langfuse 4.7.0: a two-node graph produces one Trace with
entry-node-name set as the display name, both nodes as Span
observations under it, and the spec §8.4.1 trace metadata
(correlation_id, entry_node, spec_version) populated.
Fifth of 6 core PRs in the v0.10.0 batch (PR 3.6).
* Align adapter docs with propagate-on-every behavior
Fixes three stale doc passages left over from when the adapter
consumed trace_info on the first observation only. The current
behavior propagates on every observation under each trace_id, so
that v4's last-attribute-wins display logic cannot clobber the trace
name. The integration-test run caught the original problem and the
cache and propagation were refactored, but the module header
comments and one example-doc paragraph still described the original
"first observation only" path.
Comment / doc text updated; no code change.
Addresses Copilot PR review feedback on #82.