Commit ccb2d29

Implement proposal 0046 (chat-prompt rendering) (#105)
* Implement proposal 0046 (chat-prompt rendering)

Spec proposal 0046 (prompt-management §3.1 + §6, v0.38.0) adds the Chat-prompt variant alongside the existing Text-prompt variant. A Chat prompt carries an ordered list of ChatSegment entries: ContentSegment (role + text-template OR content-blocks-template) and PlaceholderSegment (caller-supplied message-list injection at render). Content-block templates mirror llm-provider §3.1 (TextBlockTemplate, ImageURLBlockTemplate, ImageInlineBlockTemplate).

Type surface: the existing Prompt class is renamed to TextPrompt; ChatPrompt is new; Prompt = TextPrompt | ChatPrompt is the discriminated-union alias. This is a breaking change at the instantiation surface (Prompt(...) callers move to TextPrompt(...)); type annotations using Prompt continue to work via the union alias.

PromptManager.render gains a placeholders kwarg. Chat prompts render segment-by-segment per spec §6 with strict-undefined per segment and per block; the four §11 error categories (empty content segment, empty block list, unfilled placeholder, role-block compatibility) plus duplicate-placeholder-name and placeholder-regex checks fire at render time (spec-normative trigger). Construction-time validators on ContentSegment, PlaceholderSegment, and ChatPrompt are the optional ergonomic-bonus layer per spec msg-07; harnesses building intentionally-invalid fixtures bypass them via model_construct.

Langfuse backend: ChatPromptClient maps to ChatPrompt with one ContentSegment per Langfuse chat message; placeholder markers map to PlaceholderSegment. Malformed placeholder names from Langfuse bypass construction-time validation so the §11 error fires at render time per the spec-normative timing contract.

Security: inline image base64_data is validated via base64.b64decode(..., validate=True) at render time; a malformed substitution surfaces as prompt_render_error at the prompt boundary rather than as a provider-specific decode error.

Conformance fixtures 017-031 activate against an extended prompt-management harness (chat_template + placeholders directives; image-block YAML shape mapping; message-dict structural compare). Spec pin v0.37.0 -> v0.38.0.

* Address PR 105 review: fail-closed on unsupported shapes

Copilot flagged three issues: a doc-comment mismatch on the fixture image-block mapper and two fail-open patterns where unrecognized shapes were silently dropped or coerced instead of surfacing the failure.

Image-block fixture mapper: the doc comment claimed media_type was nested inside source; the actual fixture YAML puts it at the outer block level (per llm-provider §3.1.2 the field lives on the block, conditional on source type). The code was correct; the comment is fixed.

Langfuse backend mapper (_normalized_langfuse_entries): split into _normalized_langfuse_entries (fail-closed) plus _chat_segments_from_normalized. It raises PromptNotFound with a descriptive message naming the unsupported shape when it hits an entry the current mapper doesn't recognize, matching how the backend handles other fetch-side failures. Added two regression tests using a new _chat_client_with_raw_prompt helper that bypasses the Langfuse SDK's own __init__ filter.

Placeholder-injection harness (_message_from_fixture): now handles all four llm-provider §3 roles including tool (the fourth role, intentionally excluded from authored ChatSegment per spec §3.1 but legitimate in caller-supplied placeholder message lists). It raises on unknown or misspelled roles instead of silently coercing to UserMessage.
1 parent fb1ef70 commit ccb2d29

21 files changed

Lines changed: 1272 additions & 130 deletions

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -8,6 +8,8 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The

 ### Added

+- **Multi-message chat-prompt rendering** (proposal 0046, prompt-management §3.1 / §6, spec v0.38.0). The `Prompt` type splits into a discriminated union over `TextPrompt` (existing single-string template) and the new `ChatPrompt` carrying an ordered list of `ChatSegment` entries — `ContentSegment` (role-tagged content; text-template OR content-blocks-template) and `PlaceholderSegment` (caller-supplied message-list injection). Content-block templates mirror llm-provider §3.1 (`TextBlockTemplate`, `ImageURLBlockTemplate`, `ImageInlineBlockTemplate`). `PromptManager.render` accepts a new `placeholders: Mapping[str, Sequence[Message]] | None` kwarg; chat prompts render segment-by-segment with strict-undefined per segment and per block. The Langfuse backend now maps Langfuse `ChatPromptClient` to `ChatPrompt`. Conformance fixtures 017-031 activate against the extended harness. Single-string Text-prompt rendering is unchanged at the call surface — existing `prompt.template` callers continue to work via the `TextPrompt` variant.
+- **Inline image base64 validated at render time.** A chat-prompt content-blocks template with an `ImageInlineBlockTemplate` whose rendered `base64_data` fails `base64.b64decode(..., validate=True)` now raises `prompt_render_error` at the prompt-manager boundary rather than letting the malformed payload reach the LLM provider, where the error would be provider-specific.
 - **Nested-lineage augmentation containment scope** (proposal 0045, observability §3.4, spec v0.37.0). The per-async-context augmentation boundary rewrites as three lineage-aware rules: the augmenter's call-stack ancestor chain MUST update (every strict dispatch ancestor on the path — each outer fan-out instance dispatch span, each outer parallel-branches branch dispatch span, each outer serial subgraph wrapper); siblings at any depth MUST NOT; shared parents (fan-out NODE, parallel-branches NODE, invocation span) MUST NOT. Engine-side: tracks per-depth lineage chains (`fan_out_index_chain` / `branch_name_chain`) parallel to `namespace_prefix`, available on `NodeEvent` and `MetadataAugmentationEvent`. Observer-side: `OTelObserver._collect_augmentation_targets` and `LangfuseObserver._handle_metadata_augmentation` rewrite against the three-step boundary decision tree. Single-level behavior (fixtures 029 / 030 / 034) is unchanged.
 - **`LangfuseObserver` Trace input/output sourcing** (proposal 0043, observability §8.4.1). New observer construction knobs populate `trace.input` and `trace.output` per the three-lever decision tree:
 - **`disable_state_payload: bool = True`** — privacy knob symmetric to `disable_llm_payload`. When ON (default), Trace fields receive the minimal stub `{entry_node, correlation_id}` / `{final_node, status}`; when OFF, the raw state object is serialized.
@@ -26,6 +28,7 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The

 ### Changed (breaking, pre-1.0)

+- **`Prompt` is now a discriminated-union type alias over `TextPrompt | ChatPrompt`** (proposal 0046). The previous `Prompt(...)` class instantiation MUST update to `TextPrompt(...)`; type annotations using `Prompt` as a return / parameter type continue to work (the alias is the union). The Langfuse backend no longer raises on Langfuse chat prompts — it returns `ChatPrompt` instead of `PromptNotFound`. Per spec §6 narrowing, Text prompts render to exactly one `UserMessage`; multi-message / multimodal prompts MUST use the Chat variant.
 - **OTel span attribute `openarmature.branch_name` is renamed to `openarmature.node.branch_name`** to align with the spec §5.7 attribute namespace. Prior python releases emitted `openarmature.branch_name` as a workaround because the spec hadn't defined an OTel attribute carrying `branch_name` yet; proposal 0044 (v0.36.0) formalizes the namespace. **Downstream dashboards, queries, or alerts filtering on the old attribute name MUST update.** Pre-1.0 break; the prior name was python-implementation-only and was never spec-normative.

 ### Changed
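
The inline-image changelog entry names `base64.b64decode(..., validate=True)` as the check. A minimal sketch of how such a render-time guard could look follows; `validate_inline_image` and `PromptRenderError` are hypothetical names for illustration, and only the `base64.b64decode` call is taken from the changelog.

```python
import base64
import binascii


class PromptRenderError(Exception):
    """Hypothetical stand-in for the prompt_render_error category."""


def validate_inline_image(base64_data: str) -> str:
    """Return base64_data unchanged if it decodes strictly, else raise."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        base64.b64decode(base64_data, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise PromptRenderError("inline image base64_data is not valid base64") from exc
    return base64_data
```

Without `validate=True`, `b64decode` discards non-alphabet characters before decoding, so a corrupted template substitution could pass unnoticed and only fail later at the provider.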

conformance.toml

Lines changed: 14 additions & 1 deletion
@@ -32,7 +32,7 @@

 [manifest]
 implementation = "openarmature-python"
-spec_pin = "v0.37.0"
+spec_pin = "v0.38.0"

 # Status values:
 # implemented — shipped behavior matches the proposal's contract
@@ -227,3 +227,16 @@ since = "0.11.0"
 [proposals."0045"]
 status = "implemented"
 since = "0.11.0"
+
+# Spec v0.38.0 (proposal 0046). Multi-message chat-prompt rendering.
+# Adds ``ChatPrompt`` alongside the existing ``TextPrompt`` (renamed
+# from ``Prompt``); ``Prompt`` is now the discriminated-union alias.
+# Chat-prompt segments (content + placeholder) and content-block
+# templates (text + image-URL + image-inline) ship in
+# ``openarmature.prompts``. ``PromptManager.render`` accepts a new
+# ``placeholders`` kwarg for variable-length message-list injection.
+# Conformance fixtures 017-031 activate against the existing
+# prompt-management harness with a chat-template parser extension.
+[proposals."0046"]
+status = "implemented"
+since = "0.11.0"

examples/10-langfuse-observability/main.py

Lines changed: 2 additions & 2 deletions
@@ -61,7 +61,7 @@
     LangfuseObserver,
     LangfuseTrace,
 )
-from openarmature.prompts import Prompt, PromptManager, PromptResult
+from openarmature.prompts import Prompt, PromptManager, PromptResult, TextPrompt
 from openarmature.prompts.context import with_active_prompt

 _provider_instance: OpenAIProvider | None = None
@@ -103,7 +103,7 @@ class _MockLangfusePromptBackend:

     def __init__(self) -> None:
         now = datetime.now(UTC)
-        self._prompt = Prompt(
+        self._prompt = TextPrompt(
             name="mission-briefing",
             version="v7",
             label="production",

openarmature-spec

Submodule openarmature-spec updated 36 files

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ Specification = "https://github.com/LunarCommand/openarmature-spec"
 openarmature = "openarmature.cli:main"

 [tool.openarmature]
-spec_version = "0.37.0"
+spec_version = "0.38.0"

 [dependency-groups]
 dev = [

src/openarmature/AGENTS.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # OpenArmature — Agent documentation

-*This is the agent guide bundled with the openarmature Python package, version 0.10.0 (spec v0.37.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
+*This is the agent guide bundled with the openarmature Python package, version 0.10.0 (spec v0.38.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*

 ## TL;DR

@@ -10,7 +10,7 @@ OpenArmature is a workflow framework for LLM pipelines and tool-calling agents

 ## Capability contracts

-_Sourced from openarmature-spec v0.37.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md`. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._
+_Sourced from openarmature-spec v0.38.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md`. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._

 ### Capability: `graph-engine`


src/openarmature/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -25,4 +25,4 @@
 """

 __version__ = "0.10.0"
-__spec_version__ = "0.37.0"
+__spec_version__ = "0.38.0"

src/openarmature/prompts/__init__.py

Lines changed: 23 additions & 1 deletion
@@ -22,17 +22,37 @@
 from .hashing import compute_rendered_hash, compute_template_hash
 from .label_resolver import SPEC_FALLBACK_LABEL, LabelResolver, MappingLabelResolver
 from .manager import PromptManager
-from .prompt import Prompt, PromptResult, SamplingConfig
+from .prompt import (
+    ChatPrompt,
+    ChatSegment,
+    ContentBlockTemplate,
+    ContentSegment,
+    ImageInlineBlockTemplate,
+    ImageURLBlockTemplate,
+    PlaceholderSegment,
+    Prompt,
+    PromptResult,
+    SamplingConfig,
+    TextBlockTemplate,
+    TextPrompt,
+)

 __all__ = [
     "PROMPT_NOT_FOUND",
     "PROMPT_RENDER_ERROR",
     "PROMPT_STORE_UNAVAILABLE",
     "PROMPT_TRANSIENT_CATEGORIES",
     "SPEC_FALLBACK_LABEL",
+    "ChatPrompt",
+    "ChatSegment",
+    "ContentBlockTemplate",
+    "ContentSegment",
     "FilesystemPromptBackend",
+    "ImageInlineBlockTemplate",
+    "ImageURLBlockTemplate",
     "LabelResolver",
     "MappingLabelResolver",
+    "PlaceholderSegment",
     "Prompt",
     "PromptBackend",
     "PromptError",
@@ -43,6 +63,8 @@
     "PromptResult",
     "PromptStoreUnavailable",
     "SamplingConfig",
+    "TextBlockTemplate",
+    "TextPrompt",
     "compute_rendered_hash",
     "compute_template_hash",
     "current_prompt_group",

src/openarmature/prompts/backends/filesystem.py

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@

 from ..errors import PromptNotFound, PromptStoreUnavailable
 from ..hashing import compute_template_hash
-from ..prompt import Prompt, SamplingConfig
+from ..prompt import Prompt, SamplingConfig, TextPrompt


 class FilesystemPromptBackend:
@@ -171,7 +171,7 @@ async def fetch(self, name: str, label: str = "production") -> Prompt:
         sampling = await asyncio.to_thread(self._resolve_sampling, name, label)
         template_hash = compute_template_hash(template_source)
         version = template_hash.removeprefix("sha256:")[:16]
-        return Prompt(
+        return TextPrompt(
             name=name,
             version=version,
             label=label,
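
The filesystem backend derives `version` from the template hash, so the version changes whenever the template text does. The sketch below illustrates that derivation under one assumption: that `compute_template_hash` returns `"sha256:"` followed by the hex digest of the UTF-8 template bytes. The real function lives in `openarmature.prompts.hashing` and is not shown in this diff, so it may normalize its input differently; `version_from_template` is a hypothetical helper.

```python
import hashlib


def compute_template_hash(template_source: str) -> str:
    # Assumed format only: "sha256:" + hex digest of the UTF-8 bytes.
    digest = hashlib.sha256(template_source.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


def version_from_template(template_source: str) -> str:
    """Mirror the diff: strip the scheme prefix, keep 16 hex characters."""
    return compute_template_hash(template_source).removeprefix("sha256:")[:16]
```

Sixteen hex characters keep 64 bits of the digest, which is short enough to read in logs while making accidental collisions between template revisions very unlikely.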

src/openarmature/prompts/backends/langfuse.py

Lines changed: 117 additions & 14 deletions
@@ -1,29 +1,40 @@
-"""Langfuse-backed PromptBackend (text prompts).
+"""Langfuse-backed PromptBackend (text + chat prompts).

 Fetches prompts from Langfuse's prompt registry through OA's
 ``PromptManager``. Gated behind the ``[langfuse]`` extra; import this
 module only when ``langfuse`` is installed (``backends/__init__`` does
 not import it, so the base package stays langfuse-free).

-v1 supports Langfuse TEXT prompts. A Langfuse CHAT prompt raises
-``PromptNotFound`` because OA's render produces a single user message
-today; multi-message (chat) prompt support is tracked for a later
-release.
+Per proposal 0046 (v0.38.0): both Langfuse TEXT and CHAT prompts are
+supported. Text prompts return a :class:`TextPrompt`; chat prompts
+return a :class:`ChatPrompt` with one :class:`ContentSegment` per
+Langfuse chat message. Langfuse chat placeholders map to
+:class:`PlaceholderSegment` entries.
 """

 from __future__ import annotations

 import asyncio
+import json
+from collections.abc import Iterable
 from datetime import UTC, datetime
-from typing import Any, Protocol
+from typing import Any, Protocol, cast

 import httpx
 from langfuse.api import NotFoundError, ServiceUnavailableError
 from langfuse.model import ChatPromptClient, TextPromptClient

 from ..errors import PromptNotFound, PromptStoreUnavailable
 from ..hashing import compute_template_hash
-from ..prompt import Prompt, SamplingConfig
+from ..prompt import (
+    ChatPrompt,
+    ChatSegment,
+    ContentSegment,
+    PlaceholderSegment,
+    Prompt,
+    SamplingConfig,
+    TextPrompt,
+)


 class LangfusePromptClient(Protocol):
@@ -83,18 +94,34 @@ async def fetch(self, name: str, label: str = "production") -> Prompt:
         result = await asyncio.to_thread(self._get_prompt, name, label)

         if isinstance(result, ChatPromptClient):
-            raise PromptNotFound(
-                f"prompt ({name!r}, {label!r}) is a Langfuse chat prompt; "
-                "the Langfuse backend supports text prompts only in this "
-                "release (multi-message prompt support is planned)",
+            normalized = _normalized_langfuse_entries(result.prompt, name=name, label=label)
+            chat_template = list(_chat_segments_from_normalized(normalized))
+            template_hash = compute_template_hash(json.dumps(normalized, sort_keys=True))
+            # ``ChatPrompt.model_construct`` is required (not the
+            # plain constructor): pydantic re-runs validators on
+            # nested field values when validating the outer model,
+            # so a placeholder name we bypassed at the
+            # ``PlaceholderSegment`` level would still trip the
+            # regex check during ChatPrompt construction. Bypass
+            # the outer validators too so the malformed input
+            # reaches render-time (the spec-normative §11 error
+            # trigger).
+            return ChatPrompt.model_construct(
+                kind="chat",
                 name=name,
+                version=str(result.version),
                 label=label,
-                backend="langfuse",
+                chat_template=chat_template,
+                template_hash=template_hash,
+                fetched_at=datetime.now(UTC),
+                sampling=_sampling_from_config(result.config),
+                observability_entities={"langfuse_prompt": result},
+                metadata=_metadata_from(result),
             )

         template = result.prompt
         template_hash = compute_template_hash(template)
-        return Prompt(
+        return TextPrompt(
             name=name,
             version=str(result.version),
             label=label,
@@ -138,7 +165,83 @@ def _sampling_from_config(config: dict[str, Any] | None) -> SamplingConfig | Non
     return SamplingConfig(**declared)


-def _metadata_from(result: TextPromptClient) -> dict[str, Any]:
+def _normalized_langfuse_entries(raw: Iterable[Any], *, name: str, label: str) -> list[dict[str, Any]]:
+    """Normalize a Langfuse ``ChatPromptClient.prompt`` list to OA
+    canonical entry dicts. Each output entry is either a content
+    message ``{"role": ..., "content": ...}`` or a placeholder
+    marker ``{"type": "placeholder", "name": ...}``.
+
+    Fails closed on any entry whose shape this mapper doesn't
+    recognize. Silent skipping is the wrong posture for a fetch-
+    side mapper: a Langfuse SDK extension (or a malformed entry)
+    would otherwise produce a degraded rendered prompt with zero
+    signal to the caller — exactly the kind of bug that changes
+    model behavior invisibly. ``PromptNotFound`` is the canonical
+    "we got the prompt but couldn't fully deserialize it" signal,
+    matching how the backend handles other fetch-side failures.
+
+    ``name`` and ``label`` are threaded through purely for error
+    context on the ``PromptNotFound`` carriers.
+    """
+    out: list[dict[str, Any]] = []
+    for raw_entry in raw:
+        if not isinstance(raw_entry, dict):
+            raise PromptNotFound(
+                f"Langfuse chat-prompt entry has unsupported shape: "
+                f"expected dict, got {type(raw_entry).__name__}",
+                name=name,
+                label=label,
+                backend="langfuse",
+            )
+        entry = cast("dict[str, Any]", raw_entry)
+        entry_type = entry.get("type")
+        if entry_type == "placeholder":
+            placeholder_name = entry.get("name")
+            if not isinstance(placeholder_name, str):
+                raise PromptNotFound(
+                    f"Langfuse placeholder entry missing or invalid 'name': {entry!r}",
+                    name=name,
+                    label=label,
+                    backend="langfuse",
+                )
+            out.append({"type": "placeholder", "name": placeholder_name})
+            continue
+        role = entry.get("role")
+        content = entry.get("content")
+        if role in {"system", "user", "assistant"} and isinstance(content, str):
+            out.append({"role": role, "content": content})
+            continue
+        raise PromptNotFound(
+            f"Langfuse chat-prompt entry has unsupported role/content shape: {entry!r}",
+            name=name,
+            label=label,
+            backend="langfuse",
+        )
+    return out
+
+
+def _chat_segments_from_normalized(
+    entries: Iterable[dict[str, Any]],
+) -> Iterable[ChatSegment]:
+    """Map a normalized canonical entry list to OA
+    :class:`ChatSegment` entries. Placeholder segments use
+    ``model_construct`` so a Langfuse-stored prompt with a
+    malformed placeholder name (e.g., leading-digit) reaches the
+    render path before raising — the spec-normative §11 error
+    trigger. Content segments go through the normal pydantic
+    constructor since their fields don't carry spec-§11 constraints
+    that hand-built callers would benefit from catching earlier."""
+    for entry in entries:
+        if entry.get("type") == "placeholder":
+            yield PlaceholderSegment.model_construct(
+                type="placeholder",
+                placeholder=entry["name"],
+            )
+        else:
+            yield ContentSegment(role=entry["role"], content=entry["content"])
+
+
+def _metadata_from(result: TextPromptClient | ChatPromptClient) -> dict[str, Any]:
     # Preserve Langfuse-side attribution. `config` is kept whole here
     # even though sampling fields are also lifted to `Prompt.sampling`,
     # so non-sampling config keys aren't dropped.
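
Two properties of the Langfuse mapper above are easy to check in isolation: normalization fails closed on unrecognized entries, and hashing the normalized list with `json.dumps(..., sort_keys=True)` makes the template hash independent of key order and of extra keys in the raw entries. The sketch below is a trimmed, stdlib-only copy for illustration. `normalize_entries` and `chat_template_hash` are hypothetical names, this `PromptNotFound` is a bare stand-in (the real one takes `name`, `label`, and `backend` keyword arguments), and the `"sha256:"` hash format is an assumption.

```python
import hashlib
import json
from collections.abc import Iterable
from typing import Any


class PromptNotFound(Exception):
    """Bare stand-in for openarmature's PromptNotFound."""


def normalize_entries(raw: Iterable[Any]) -> list[dict[str, Any]]:
    """Reduce raw chat entries to canonical dicts, raising on anything unknown."""
    out: list[dict[str, Any]] = []
    for entry in raw:
        if not isinstance(entry, dict):
            raise PromptNotFound(f"expected dict, got {type(entry).__name__}")
        if entry.get("type") == "placeholder":
            name = entry.get("name")
            if not isinstance(name, str):
                raise PromptNotFound(f"placeholder missing 'name': {entry!r}")
            out.append({"type": "placeholder", "name": name})
            continue
        role, content = entry.get("role"), entry.get("content")
        if role in {"system", "user", "assistant"} and isinstance(content, str):
            # Only role and content survive; extra raw keys are dropped.
            out.append({"role": role, "content": content})
            continue
        raise PromptNotFound(f"unsupported role/content shape: {entry!r}")
    return out


def chat_template_hash(normalized: list[dict[str, Any]]) -> str:
    # sort_keys=True gives one canonical serialization per template.
    canonical = json.dumps(normalized, sort_keys=True)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Hashing the normalized form rather than the raw SDK payload means two fetches of the same stored prompt produce the same hash even if the SDK starts attaching extra keys to each message.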
