Commit 5c4e999
Add Langfuse prompt backend (text prompts) (#93)
* Add Langfuse prompt backend (text prompts)

LangfusePromptBackend fetches prompts from Langfuse's prompt registry through PromptManager, behind the [langfuse] extra. It takes a caller-supplied Langfuse client so it shares one connection with a LangfuseObserver on the same client.

A text prompt maps to Prompt (template, version, label, sampling lifted from config) and sets observability_entities['langfuse_prompt'] so the Generation-to-Prompt link fires with no caller wiring.

Chat prompts raise PromptNotFound: OA's render produces a single user message today, so multi-message support is deferred (tracked for a later release).

Also document the LANGFUSE_* env vars next to the LLM_* ones in the examples config sections.

* Harden Langfuse prompt backend error mapping

From PR #93 review:

- Map httpx.TransportError (connect/read/timeout/network) to PromptStoreUnavailable in LangfusePromptBackend, alongside 503s, so PromptManager falls back on transport failures per the PromptBackend contract. Adds a transport-timeout unit test.
- Align the opt-in Langfuse integration test's host resolution to the SDK's precedence (LANGFUSE_BASE_URL before LANGFUSE_HOST); it had framed LANGFUSE_HOST as canonical with the opposite precedence.
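The split between "not found" and "unavailable" described in the commit message can be illustrated without Langfuse or httpx installed. The sketch below is not the PromptManager implementation: every name in it (`NotFound`, `StoreUnavailable`, `flaky_backend`, `local_backend`, `fetch_with_fallback`) is hypothetical, and it assumes that "falling back" means trying the next source in a list, which this commit does not show.

```python
from __future__ import annotations

from collections.abc import Callable


class NotFound(Exception):
    """Stand-in for PromptNotFound: the store answered, the prompt is absent."""


class StoreUnavailable(Exception):
    """Stand-in for PromptStoreUnavailable: the store could not be reached."""


def flaky_backend(name: str) -> str:
    # Simulates a transport timeout that was already mapped to unavailability.
    raise StoreUnavailable(f"timed out fetching {name!r}")


def local_backend(name: str) -> str:
    # Hypothetical in-memory source used as the fallback.
    templates = {"greeting": "Hello {{ user }}"}
    try:
        return templates[name]
    except KeyError as exc:
        raise NotFound(name) from exc


def fetch_with_fallback(name: str, backends: list[Callable[[str], str]]) -> str:
    """Try each backend in order; only unavailability moves on to the next."""
    last_error: StoreUnavailable | None = None
    for backend in backends:
        try:
            return backend(name)
        except StoreUnavailable as exc:
            last_error = exc  # store is down, so a later backend may still succeed
        # NotFound and any other exception propagate immediately.
    raise last_error if last_error is not None else StoreUnavailable("no backends configured")


template = fetch_with_fallback("greeting", [flaky_backend, local_backend])
```

Under these assumptions a timeout on the first source does not fail the fetch, while a definitive "not found" stops the search at once. That is why mapping transport errors to the unavailability error, rather than letting raw httpx exceptions escape, matters to a caller that relies on fallback.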
1 parent 95b5f14 commit 5c4e999

5 files changed

Lines changed: 355 additions & 6 deletions

docs/examples/index.md

Lines changed: 11 additions & 0 deletions
@@ -55,6 +55,17 @@ The OpenAI public-API defaults are:
 | `LLM_MODEL` | `gpt-4o-mini` | Any model the bound endpoint exposes. |
 | `LLM_API_KEY` | (none) | Required. Pass empty for local servers that don't authenticate. |
 
+The Langfuse observer and the Langfuse prompt backend read the
+standard Langfuse SDK variables when pointed at a live Langfuse
+account; `Langfuse()` picks them up automatically, so no credentials
+appear in the example code:
+
+| Env var | Notes |
+| --------------------- | ---------------------------------------------------------------------------------------- |
+| `LANGFUSE_PUBLIC_KEY` | From your Langfuse project settings. |
+| `LANGFUSE_SECRET_KEY` | From your Langfuse project settings. |
+| `LANGFUSE_BASE_URL` | Langfuse host (e.g. `https://cloud.langfuse.com`). The SDK also accepts `LANGFUSE_HOST`. |
+
 For a local OpenAI-compatible server (vLLM, LM Studio, llama.cpp,
 etc.), point `LLM_BASE_URL` at the host root (e.g.
 `http://localhost:8000`) and set `LLM_API_KEY` to whatever value the
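
The host precedence that this commit's integration test applies (`LANGFUSE_BASE_URL` first, then `LANGFUSE_HOST`, then the cloud default) can be sketched as a small resolver. The function name `resolve_langfuse_host` and the `DEFAULT_HOST` constant are illustrative assumptions; the Langfuse SDK does its own resolution internally and this is not its code.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Default taken from the integration test in this commit.
DEFAULT_HOST = "https://cloud.langfuse.com"


def resolve_langfuse_host(env: Mapping[str, str] | None = None) -> str:
    """Return LANGFUSE_BASE_URL, else LANGFUSE_HOST, else the cloud default."""
    env = os.environ if env is None else env
    # `or` skips unset and empty values alike, as the test's expression does.
    return env.get("LANGFUSE_BASE_URL") or env.get("LANGFUSE_HOST") or DEFAULT_HOST


# Both variables set: the base URL wins.
host = resolve_langfuse_host(
    {"LANGFUSE_BASE_URL": "https://eu.example.test", "LANGFUSE_HOST": "https://old.example.test"}
)
```

Passing the environment as a mapping keeps the sketch testable without mutating `os.environ`; the example hostnames are placeholders.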

examples/README.md

Lines changed: 11 additions & 0 deletions
@@ -113,6 +113,17 @@ defaults shown:
 | `LLM_MODEL` | `gpt-4o-mini` | Any model the bound endpoint exposes. |
 | `LLM_API_KEY` | (none) | Required; pass empty for local servers that don't authenticate. |
 
+The Langfuse observer and the Langfuse prompt backend read the standard
+Langfuse SDK variables when pointed at a live Langfuse account;
+`Langfuse()` reads them automatically, so no credentials appear in the
+example code:
+
+| Env var | Notes |
+| --- | --- |
+| `LANGFUSE_PUBLIC_KEY` | From your Langfuse project settings. |
+| `LANGFUSE_SECRET_KEY` | From your Langfuse project settings. |
+| `LANGFUSE_BASE_URL` | Langfuse host (e.g. `https://cloud.langfuse.com`); the SDK also accepts `LANGFUSE_HOST`. |
+
 ## Running
 
 ```bash
Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
+"""Langfuse-backed PromptBackend (text prompts).
+
+Fetches prompts from Langfuse's prompt registry through OA's
+``PromptManager``. Gated behind the ``[langfuse]`` extra; import this
+module only when ``langfuse`` is installed (``backends/__init__`` does
+not import it, so the base package stays langfuse-free).
+
+v1 supports Langfuse TEXT prompts. A Langfuse CHAT prompt raises
+``PromptNotFound`` because OA's render produces a single user message
+today; multi-message (chat) prompt support is tracked for a later
+release.
+"""
+
+from __future__ import annotations
+
+import asyncio
+from datetime import UTC, datetime
+from typing import Any, Protocol
+
+import httpx
+from langfuse.api import NotFoundError, ServiceUnavailableError
+from langfuse.model import ChatPromptClient, TextPromptClient
+
+from ..errors import PromptNotFound, PromptStoreUnavailable
+from ..hashing import compute_template_hash
+from ..prompt import Prompt, SamplingConfig
+
+
+class LangfusePromptClient(Protocol):
+    """The minimal Langfuse prompt-fetch surface this backend needs.
+
+    ``langfuse.Langfuse`` satisfies it structurally (its ``get_prompt``
+    has additional optional parameters), so callers pass a real client;
+    tests can supply a lightweight fake.
+    """
+
+    def get_prompt(self, name: str, *, label: str = "production") -> TextPromptClient | ChatPromptClient: ...
+
+
+# Langfuse prompt `config` keys that line up with SamplingConfig's
+# declared fields. Only these are lifted into `Prompt.sampling`; the
+# full config is preserved under `Prompt.metadata` so nothing is lost.
+_SAMPLING_FIELDS = (
+    "temperature",
+    "max_tokens",
+    "top_p",
+    "seed",
+    "frequency_penalty",
+    "presence_penalty",
+    "stop_sequences",
+)
+
+
+class LangfusePromptBackend:
+    """Reads prompts from Langfuse's prompt registry.
+
+    Constructed with a caller-supplied ``langfuse.Langfuse`` client, so
+    it shares one client (one connection pool, one flush thread) with a
+    :class:`~openarmature.observability.langfuse.LangfuseObserver` built
+    on the same instance::
+
+        from langfuse import Langfuse
+        from openarmature.prompts import PromptManager
+        from openarmature.prompts.backends.langfuse import LangfusePromptBackend
+
+        client = Langfuse(public_key="pk-lf-...", secret_key="sk-lf-...")
+        manager = PromptManager(LangfusePromptBackend(client))
+
+    ``fetch`` is reentrant and does not render; the manager renders.
+    The returned ``Prompt`` carries the raw Langfuse template (Langfuse
+    ``{{var}}`` placeholders are Jinja2-compatible, so OA's render
+    applies unchanged), plus the Langfuse SDK Prompt object under
+    ``observability_entities['langfuse_prompt']`` so the observability
+    Generation -> Prompt link fires automatically.
+    """
+
+    def __init__(self, client: LangfusePromptClient) -> None:
+        self._client = client
+
+    async def fetch(self, name: str, label: str = "production") -> Prompt:
+        # The Langfuse SDK's get_prompt is synchronous (and does its own
+        # client-side caching); run it off the event loop.
+        result = await asyncio.to_thread(self._get_prompt, name, label)
+
+        if isinstance(result, ChatPromptClient):
+            raise PromptNotFound(
+                f"prompt ({name!r}, {label!r}) is a Langfuse chat prompt; "
+                "the Langfuse backend supports text prompts only in this "
+                "release (multi-message prompt support is planned)",
+                name=name,
+                label=label,
+                backend="langfuse",
+            )
+
+        template = result.prompt
+        template_hash = compute_template_hash(template)
+        return Prompt(
+            name=name,
+            version=str(result.version),
+            label=label,
+            template=template,
+            template_hash=template_hash,
+            fetched_at=datetime.now(UTC),
+            sampling=_sampling_from_config(result.config),
+            observability_entities={"langfuse_prompt": result},
+            metadata=_metadata_from(result),
+        )
+
+    def _get_prompt(self, name: str, label: str) -> TextPromptClient | ChatPromptClient:
+        try:
+            return self._client.get_prompt(name, label=label)
+        except NotFoundError as exc:
+            raise PromptNotFound(
+                f"prompt ({name!r}, {label!r}) not found in Langfuse",
+                name=name,
+                label=label,
+                backend="langfuse",
+            ) from exc
+        except (ServiceUnavailableError, httpx.TransportError) as exc:
+            # 503 plus transport-level failures (connect/read/timeout/
+            # network): the SDK surfaces raw httpx errors when there's no
+            # HTTP response to map to a typed error. Per the PromptBackend
+            # contract these are unavailability, so the manager can fall
+            # back. 4xx auth and other errors still propagate.
+            raise PromptStoreUnavailable(
+                f"Langfuse unavailable fetching ({name!r}, {label!r}): {exc}",
+                name=name,
+                label=label,
+            ) from exc
+
+
+def _sampling_from_config(config: dict[str, Any] | None) -> SamplingConfig | None:
+    if not config:
+        return None
+    declared = {k: config[k] for k in _SAMPLING_FIELDS if k in config}
+    if not declared:
+        return None
+    return SamplingConfig(**declared)
+
+
+def _metadata_from(result: TextPromptClient) -> dict[str, Any]:
+    # Preserve Langfuse-side attribution. `config` is kept whole here
+    # even though sampling fields are also lifted to `Prompt.sampling`,
+    # so non-sampling config keys aren't dropped.
+    meta: dict[str, Any] = {
+        "langfuse_version": result.version,
+        "langfuse_labels": result.labels,
+        "langfuse_tags": result.tags,
+    }
+    if result.config:
+        meta["langfuse_config"] = result.config
+    if result.commit_message is not None:
+        meta["langfuse_commit_message"] = result.commit_message
+    return meta
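
The way `_sampling_from_config` lifts only declared keys, while the whole config survives under metadata, can be tried in isolation. The sketch below assumes a stand-in `SamplingSketch` dataclass with three of the fields; the real `SamplingConfig` class is not shown in this diff, so its constructor and defaults are not reproduced here, and `lift_sampling` is a hypothetical name.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SamplingSketch:
    """Hypothetical stand-in for SamplingConfig; only three fields are modelled."""

    temperature: float | None = None
    max_tokens: int | None = None
    top_p: float | None = None


SAMPLING_FIELDS = ("temperature", "max_tokens", "top_p")


def lift_sampling(config: dict[str, Any] | None) -> SamplingSketch | None:
    """Copy declared sampling keys out of a prompt config, or return None."""
    if not config:
        return None
    # Membership test, not truthiness: temperature=0.0 must still be lifted.
    declared = {k: config[k] for k in SAMPLING_FIELDS if k in config}
    if not declared:
        return None
    return SamplingSketch(**declared)


config = {"temperature": 0.0, "max_tokens": 256, "model": "gpt-4o"}
sampling = lift_sampling(config)
# The full config is kept separately, so the non-sampling "model" key is not lost.
metadata = {"langfuse_config": config}
```

Filtering with `k in config` rather than `config.get(k)` is what lets a temperature of `0.0` through; a truthiness check would silently drop it.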

tests/unit/test_observability_langfuse_adapter.py

Lines changed: 6 additions & 6 deletions
@@ -11,7 +11,7 @@
 
     LANGFUSE_PUBLIC_KEY=pk-lf-... \\
     LANGFUSE_SECRET_KEY=sk-lf-... \\
-    LANGFUSE_HOST=https://cloud.langfuse.com \\
+    LANGFUSE_BASE_URL=https://cloud.langfuse.com \\
     uv run pytest tests/unit/test_observability_langfuse_adapter.py \\
         -m integration -v
 
@@ -160,11 +160,11 @@ async def test_adapter_against_real_langfuse_cloud() -> None:
     if not public_key or not secret_key:
         pytest.skip("LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY not set")
 
-    # LANGFUSE_HOST is the canonical name (matches the SDK's ``host=``
-    # kwarg); LANGFUSE_BASE_URL is the common alias some downstream
-    # configs use. Accept either; LANGFUSE_HOST wins when both set.
+    # Mirror the SDK's precedence: Langfuse() reads LANGFUSE_BASE_URL
+    # first, then LANGFUSE_HOST. Resolve the same order here so this
+    # explicit host matches what a no-arg Langfuse() would pick up.
     host = (
-        os.environ.get("LANGFUSE_HOST") or os.environ.get("LANGFUSE_BASE_URL") or "https://cloud.langfuse.com"
+        os.environ.get("LANGFUSE_BASE_URL") or os.environ.get("LANGFUSE_HOST") or "https://cloud.langfuse.com"
     )
     client = Langfuse(
         public_key=public_key,
@@ -175,7 +175,7 @@ async def test_adapter_against_real_langfuse_cloud() -> None:
     # background export thread is just a logged warning and the test
     # passes while traces vanish.
     assert client.auth_check(), (
-        "Langfuse auth_check failed — verify LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST"
+        "Langfuse auth_check failed — verify LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_BASE_URL"
     )
 
     observer = LangfuseObserver(client=LangfuseSDKAdapter(client))
Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
+"""Unit tests for the Langfuse-backed PromptBackend (text prompts)."""
+
+from __future__ import annotations
+
+from typing import Any, cast
+
+import httpx
+import pytest
+
+pytest.importorskip("langfuse")
+
+from langfuse.api import NotFoundError, ServiceUnavailableError  # noqa: E402
+from langfuse.model import (  # noqa: E402
+    ChatPromptClient,
+    Prompt_Chat,  # pyright: ignore[reportPrivateImportUsage]
+    Prompt_Text,  # pyright: ignore[reportPrivateImportUsage]
+    TextPromptClient,
+)
+
+from openarmature.prompts import PromptManager  # noqa: E402
+from openarmature.prompts.backends.langfuse import LangfusePromptBackend  # noqa: E402
+from openarmature.prompts.errors import (  # noqa: E402
+    PromptNotFound,
+    PromptStoreUnavailable,
+)
+
+pytestmark = pytest.mark.asyncio
+
+
+def _text_client(
+    *,
+    name: str = "greeting",
+    version: int = 3,
+    prompt: str = "Hello {{ user }}",
+    config: dict[str, Any] | None = None,
+    labels: list[str] | None = None,
+    tags: list[str] | None = None,
+) -> TextPromptClient:
+    return TextPromptClient(
+        Prompt_Text(
+            type="text",
+            name=name,
+            version=version,
+            prompt=prompt,
+            config=config or {},
+            labels=["production"] if labels is None else labels,
+            tags=tags or [],
+        )
+    )
+
+
+def _chat_client(*, name: str = "chatty", version: int = 1) -> ChatPromptClient:
+    return ChatPromptClient(
+        Prompt_Chat(
+            type="chat",
+            name=name,
+            version=version,
+            prompt=cast(Any, [{"role": "system", "content": "hi {{ user }}"}]),
+            config={},
+            labels=["production"],
+            tags=[],
+        )
+    )
+
+
+class _FakeClient:
+    """Stands in for ``langfuse.Langfuse`` exposing only ``get_prompt``."""
+
+    def __init__(self, *, result: Any = None, exc: BaseException | None = None) -> None:
+        self._result = result
+        self._exc = exc
+        self.calls: list[tuple[str, str]] = []
+
+    def get_prompt(self, name: str, *, label: str = "production", **_: Any) -> Any:
+        self.calls.append((name, label))
+        if self._exc is not None:
+            raise self._exc
+        return self._result
+
+
+async def test_fetch_text_prompt_maps_to_prompt() -> None:
+    client = _text_client(prompt="Hello {{ user }}", version=7, tags=["greeting"])
+    backend = LangfusePromptBackend(_FakeClient(result=client))
+
+    prompt = await backend.fetch("greeting", "production")
+
+    assert prompt.name == "greeting"
+    assert prompt.version == "7"
+    assert prompt.label == "production"
+    assert prompt.template == "Hello {{ user }}"
+    assert prompt.template_hash.startswith("sha256:")
+    assert prompt.observability_entities is not None
+    assert prompt.observability_entities["langfuse_prompt"] is client
+    assert prompt.metadata is not None
+    assert prompt.metadata["langfuse_version"] == 7
+    assert prompt.metadata["langfuse_tags"] == ["greeting"]
+
+
+async def test_fetch_passes_label_through() -> None:
+    fake = _FakeClient(result=_text_client())
+    backend = LangfusePromptBackend(fake)
+
+    await backend.fetch("greeting", "staging")
+
+    assert fake.calls == [("greeting", "staging")]
+
+
+async def test_chat_prompt_raises_not_found() -> None:
+    backend = LangfusePromptBackend(_FakeClient(result=_chat_client()))
+
+    with pytest.raises(PromptNotFound) as excinfo:
+        await backend.fetch("chatty", "production")
+
+    assert excinfo.value.backend == "langfuse"
+    assert "chat prompt" in str(excinfo.value)
+
+
+async def test_not_found_maps_to_prompt_not_found() -> None:
+    backend = LangfusePromptBackend(_FakeClient(exc=NotFoundError("nope")))
+
+    with pytest.raises(PromptNotFound):
+        await backend.fetch("missing", "production")
+
+
+async def test_service_unavailable_maps_to_store_unavailable() -> None:
+    backend = LangfusePromptBackend(_FakeClient(exc=ServiceUnavailableError()))
+
+    with pytest.raises(PromptStoreUnavailable):
+        await backend.fetch("greeting", "production")
+
+
+async def test_transport_error_maps_to_store_unavailable() -> None:
+    # A connect/read/timeout/network failure surfaces as a raw httpx
+    # TransportError (no HTTP response to map to a typed SDK error); it
+    # must become PromptStoreUnavailable so PromptManager can fall back.
+    backend = LangfusePromptBackend(_FakeClient(exc=httpx.ConnectTimeout("timed out")))
+
+    with pytest.raises(PromptStoreUnavailable):
+        await backend.fetch("greeting", "production")
+
+
+async def test_sampling_extracted_from_config() -> None:
+    client = _text_client(config={"temperature": 0.0, "max_tokens": 256, "model": "gpt-4o"})
+    backend = LangfusePromptBackend(_FakeClient(result=client))
+
+    prompt = await backend.fetch("greeting", "production")
+
+    assert prompt.sampling is not None
+    assert prompt.sampling.temperature == 0.0
+    assert prompt.sampling.max_tokens == 256
+    # Non-sampling config keys are not lifted into sampling, but the
+    # full config is preserved under metadata.
+    assert prompt.metadata is not None
+    assert prompt.metadata["langfuse_config"]["model"] == "gpt-4o"
+
+
+async def test_no_sampling_config_yields_none() -> None:
+    backend = LangfusePromptBackend(_FakeClient(result=_text_client(config={})))
+
+    prompt = await backend.fetch("greeting", "production")
+
+    assert prompt.sampling is None
+
+
+async def test_fetched_prompt_renders_through_manager() -> None:
+    backend = LangfusePromptBackend(_FakeClient(result=_text_client(prompt="Hi {{ user }}")))
+    manager = PromptManager(backend)
+
+    prompt = await manager.fetch("greeting", "production")
+    result = manager.render(prompt, {"user": "Alice"})
+
+    assert len(result.messages) == 1
+    assert result.messages[0].content == "Hi Alice"
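
The testing approach used here, a recording fake that satisfies the client protocol structurally and is called through `asyncio.to_thread`, can be reproduced with the standard library alone. In the sketch below `PromptSource`, `RecordingFake`, and the module-level `fetch` are hypothetical names that return plain strings; they stand in for, and are much simpler than, the real `LangfusePromptClient` protocol and `LangfusePromptBackend.fetch`.

```python
from __future__ import annotations

import asyncio
import threading
from typing import Protocol


class PromptSource(Protocol):
    """Hypothetical minimal surface: one synchronous fetch with a keyword label."""

    def get_prompt(self, name: str, *, label: str = "production") -> str: ...


class RecordingFake:
    """Satisfies PromptSource structurally; it does not inherit from it."""

    def __init__(self, template: str) -> None:
        self._template = template
        self.calls: list[tuple[str, str]] = []
        self.ran_on_main_thread: list[bool] = []

    def get_prompt(self, name: str, *, label: str = "production") -> str:
        self.calls.append((name, label))
        # Record whether this call executed on the main thread.
        self.ran_on_main_thread.append(threading.current_thread() is threading.main_thread())
        return self._template


async def fetch(source: PromptSource, name: str, label: str = "production") -> str:
    # The client call blocks, so hand it to a worker thread off the event loop.
    return await asyncio.to_thread(source.get_prompt, name, label=label)


fake = RecordingFake("Hello {{ user }}")
template = asyncio.run(fetch(fake, "greeting", "staging"))
```

Because the protocol is structural, the fake needs no Langfuse import, and the recorded `calls` list supports the same kind of label pass-through assertion that `test_fetch_passes_label_through` makes.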
