Skip to content

Commit ad2e896

Browse files
docs: prompts concept page, API reference, changelog
docs/concepts/prompts.md walks through the prompt-management capability: the fetch + render split (and why both, not just get()), Prompt identity fields, strict-by-default variables, composite-backend fallback (PromptStoreUnavailable continues, PromptNotFound stops), the three error categories, PromptGroup for tracing related prompts, observability propagation via with_active_prompt and the six normative openarmature.prompt.* attributes, determinism + content-addressed caching, a minimal example, and what's out of scope (vendor backends, versioning workflows, cache invalidation, multi-message decomposition). docs/reference/prompts.md is an mkdocstrings autodoc page in the same shape as docs/reference/llm.md. mkdocs.yml gains the two new pages in the Concepts and Reference nav sections. CHANGELOG.md adds two entries under [Unreleased]: - the new openarmature.prompts subpackage with PromptManager, the three error categories, FilesystemPromptBackend, and the jinja2>=3.1 runtime dependency. - the observability propagation surface in openarmature.prompts.context plus the OTel observer wiring.
1 parent 853b6d5 commit ad2e896

4 files changed

Lines changed: 298 additions & 0 deletions

File tree

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,8 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
88

99
### Added
1010

11+
- **Prompt-management capability (proposal 0017, introduced in spec v0.15.0).** New `openarmature.prompts` subpackage. `PromptManager` composes one or more `PromptBackend`s, exposes `fetch` / `render` / `get`, applies the §8 fallback semantics (`prompt_store_unavailable` continues to the next backend; `prompt_not_found` stops the chain), and renders templates with Jinja2's `StrictUndefined` per §7. `Prompt` / `PromptResult` / `PromptGroup` are Pydantic models matching spec §3 / §4 / §9. Three error categories (`PromptNotFound`, `PromptRenderError`, `PromptStoreUnavailable`) with `PROMPT_TRANSIENT_CATEGORIES` exported for retry-middleware classifiers. `FilesystemPromptBackend` is the minimum local-filesystem reference backend (layout: `<root>/<label>/<name>.j2`; `version` derived from the first 12 chars of `template_hash`). New runtime dependency: `jinja2>=3.1`.
12+
- **`openarmature.prompts.context` — observability propagation per spec §11.** `with_active_prompt(result)` and `with_active_prompt_group(group)` context managers + `current_prompt_result()` / `current_prompt_group()` inspectors. When the OTel observer is active and an LLM call fires inside `with_active_prompt`, the `openarmature.llm.complete` span carries the normative `openarmature.prompt.*` attributes (`name`, `version`, `label`, `template_hash`, `rendered_hash`, `group_name`). Nesting is innermost-wins.
1113
- **Image content blocks for user messages (proposal 0015, introduced in spec v0.13.0).** `UserMessage.content` now accepts `str | list[ContentBlock]`. The block surface introduces `TextBlock`, `ImageBlock`, `ImageSourceURL`, `ImageSourceInline`, and the `ContentBlock` / `ImageSource` discriminated unions over the block / source `type` field. `ImageBlock` carries a `media_type` (required for inline sources; ignored for URL sources; typed as `str | None` so callers MAY pass any `image/*` type the bound model supports) and an optional `detail` hint (`"auto"` / `"low"` / `"high"`; `None` default omits the field from the wire so providers apply their own default). System, assistant, and tool messages stay text-string-only; image inputs are user-only in v1.
1214
- **`OpenAIProvider` content-array wire mapping.** When `UserMessage.content` is a content-block sequence, the wire body uses OpenAI's `content` array per §8.1.1. `TextBlock → {type: "text", text}`. `ImageBlock` with a URL source maps to `{type: "image_url", image_url: {url, detail?}}`. `ImageBlock` with an inline source constructs an RFC 2397 `data:<media_type>;base64,<base64_data>` URI and goes through the same `image_url` entry shape. Inline bytes pass through unchanged — no inspection, transcoding, or re-encoding.
1315
- **New error category `ProviderUnsupportedContentBlock` (non-transient).** Raised when the bound model rejects a content block type / media variant. Distinct from `ProviderInvalidRequest` (which covers spec-shape malformation): this category surfaces a *capability* mismatch, letting callers route differently (e.g., fall back to a multimodal-capable provider) without overloading the malformed-request category. Carries `block_type` ("image" / "audio" / "video") and `reason` (provider's human-readable message) when those are recoverable from the rejection. `OpenAIProvider` detects content rejection via HTTP 400 bodies — heuristic on `error.code` (known set: `image_content_not_supported`, `unsupported_image_media_type`, `audio_content_not_supported`, etc.), `error.type` (`image_parse_error`), and `error.message` ("does not support" + image/audio/video).

docs/concepts/prompts.md

Lines changed: 287 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,287 @@
1+
# Prompts
2+
3+
Named, versioned, content-addressed prompts. OpenArmature's
4+
prompt-management capability separates *fetching* a template
5+
from *rendering* it, lets you compose multiple backends with
6+
explicit fallback, and propagates prompt identity to your
7+
observability backend so trace UIs can pivot on the prompt
8+
that produced a call.
9+
10+
Skip ahead to [a minimal example](#a-minimal-example) if you
11+
want code first.
12+
13+
## The two halves: fetch and render
14+
15+
A `PromptBackend` knows how to find a template by `name` and
16+
`label`; nothing more. A `PromptManager` composes one or more
17+
backends and adds rendering on top:
18+
19+
```python
20+
from openarmature.prompts import PromptManager, FilesystemPromptBackend
21+
22+
manager = PromptManager(FilesystemPromptBackend("./prompts"))
23+
24+
# Fetch returns a Prompt (the raw template + identity metadata).
25+
prompt = await manager.fetch("greeting", "production")
26+
27+
# Render applies variables and returns a PromptResult (the
28+
# rendered messages plus a content-addressed identity).
29+
result = manager.render(prompt, {"user": "Alice"})
30+
31+
# Or do both in one shot:
32+
result = await manager.get("greeting", "production", {"user": "Alice"})
33+
```
34+
35+
Why two operations instead of one? Three reasons:
36+
37+
- **Inspect templates without binding variables.** Schema
38+
validation, prompt diffing, tooling that walks the prompt
39+
catalogue.
40+
- **Cache templates separately from rendered output.** The
41+
fetch step is the I/O step; rendering is pure local
42+
computation.
43+
- **Render the same template with different variables in
44+
tight loops.** Map-reduce over chunks, batch evaluation,
45+
fan-out fixtures.
46+
47+
The convenience `get()` operation gives you the single-call
48+
shape when you want it without removing the separability.
49+
50+
## Prompt identity
51+
52+
Every `Prompt` carries five identity fields:
53+
54+
- `name` — your stable identifier (`"greeting"`).
55+
- `version` — the backend's version string. Implementation-defined:
56+
a backend MAY use semver, monotonic integers, content
57+
hashes, git short-SHAs, or any stable identifier. The
58+
filesystem backend derives it from the template content
59+
hash.
60+
- `label` — the slot the prompt was fetched from
61+
(`"production"`, `"latest"`, `"variant-a"`). The label is
62+
part of the query.
63+
- `template_hash` — SHA-256 of the raw template source.
64+
Two prompts with different content always have different
65+
hashes.
66+
- `fetched_at` — when the prompt was fetched. Cached
67+
backends preserve the original fetch time, not the
68+
cache-hit time.
69+
70+
The `name + version + label` triple identifies the prompt;
71+
the `template_hash` lets you tell two prompts apart by
72+
*content*, which matters when a vendor backend serves
73+
different content under the same `latest` label over time.
74+
75+
A `PromptResult` propagates all of those, plus:
76+
77+
- `rendered_hash` — SHA-256 over the rendered messages.
78+
Same template + same variables → same hash. This is the
79+
cache-key value a memoization layer wants.
80+
- `messages` — the rendered output as an LLM-ready
81+
`list[Message]`. Directly consumable by
82+
`Provider.complete()`.
83+
- `variables` — what was applied. Audit-trail friendly.
84+
- `rendered_at` — when the render happened. Distinct from
85+
`fetched_at`.
86+
87+
## Strict variables by default
88+
89+
A template that references a variable not in the mapping
90+
raises `PromptRenderError`:
91+
92+
```python
93+
prompt = await manager.fetch("greeting", "production") # "Hello, {{ user }}! Today is {{ day }}."
94+
manager.render(prompt, {"user": "Alice"}) # raises — "day" is undefined
95+
```
96+
97+
This is intentional. Silently substituting empty strings for
98+
missing variables masks bugs: a typo'd variable name produces
99+
a working-but-wrong prompt, often invisibly. If you need
100+
lenient behavior, wrap your variables in your own defaulting
101+
layer before passing them to `render()`.
102+
103+
The Python implementation uses Jinja2's `StrictUndefined`.
104+
105+
## Composite backends and fallback
106+
107+
A manager constructed with multiple backends consults them in
108+
order. The fallback rule distinguishes infrastructure failure
109+
from logical absence:
110+
111+
```python
112+
from openarmature.prompts import PromptManager
113+
from openarmature_langfuse import LangfusePromptBackend # hypothetical sibling
114+
115+
manager = PromptManager(
116+
LangfusePromptBackend(api_key=...),
117+
FilesystemPromptBackend("./prompts"), # local fallback
118+
)
119+
```
120+
121+
- **`PromptStoreUnavailable` from a backend → try the next.**
122+
Network's down, vendor API is 5xx-ing, filesystem hiccupped —
123+
the manager falls back. This is the "Langfuse is degraded,
124+
use the local copy" case.
125+
- **`PromptNotFound` from a backend → STOP the chain.** The
126+
error propagates. This is the "operator deliberately
127+
deleted the prompt from Langfuse to retire it" case —
128+
falling back here would silently resurface a stale local
129+
copy under a name the operator wanted gone.
130+
- **All backends `PromptStoreUnavailable` → manager raises
131+
`PromptStoreUnavailable`.** Everything's down.
132+
133+
The two error categories have different operational
134+
meanings; the manager keeps them separated.
135+
136+
## Errors
137+
138+
Three categories cover every failure mode:
139+
140+
| Error | When | Transient |
141+
| ------------------------- | ------------------------------------------------------------------- | --------- |
142+
| `PromptNotFound` | No prompt matches `(name, label)` in any backend (after §8 rules) | No |
143+
| `PromptRenderError` | Undefined variable, template parse error, coercion failure | No |
144+
| `PromptStoreUnavailable` | Backend infrastructure failure (network, I/O, vendor API) | Yes |
145+
146+
`PROMPT_TRANSIENT_CATEGORIES` is exported as a frozenset for
147+
retry-middleware classifiers — the same pattern
148+
`openarmature.llm` uses with its `TRANSIENT_CATEGORIES`.
149+
150+
## PromptGroup — tracing related prompts together
151+
152+
A `PromptGroup` is a structural grouping of two or more
153+
`PromptResult` instances under a stable `group_name`. The
154+
group itself doesn't execute anything; it gives observability
155+
a shared name to render related calls under.
156+
157+
```python
158+
from openarmature.prompts import PromptGroup, with_active_prompt_group
159+
160+
classify = await manager.get("classify", variables={"input": user_query})
161+
answer = await manager.get("answer", variables={"input": user_query, ...})
162+
163+
group = PromptGroup(group_name="classifier_chain", members=[classify, answer])
164+
with with_active_prompt_group(group):
165+
# Every LLM call in this scope carries
166+
# openarmature.prompt.group_name="classifier_chain".
167+
classification = await provider.complete(classify.messages, ...)
168+
final = await provider.complete(answer.messages, ...)
169+
```
170+
171+
Canonical patterns the primitive covers:
172+
173+
- **Multi-stage classification**`[coarse, fine, answer]`.
174+
- **RAG with reranking**`[query_rewrite, retrieve, rerank, answer]`.
175+
- **Self-correction loops**`[generate, critique, revise]`.
176+
- **Map-reduce over chunks**`[chunk_classify_1..N, synthesize]`.
177+
178+
The N=2 case ("classifier + follow-up") is the simplest;
179+
larger groups work under the same primitive. The group rejects
180+
empty and single-member shapes — single-prompt tagging is
181+
already served by the per-prompt observability attributes
182+
below.
183+
184+
## Observability propagation
185+
186+
When an LLM call fires inside `with_active_prompt(result)` (or
187+
`with_active_prompt_group(group)`), the OTel observer surfaces
188+
six normative attributes on the `openarmature.llm.complete`
189+
span:
190+
191+
- `openarmature.prompt.name`
192+
- `openarmature.prompt.version`
193+
- `openarmature.prompt.label`
194+
- `openarmature.prompt.template_hash`
195+
- `openarmature.prompt.rendered_hash`
196+
- `openarmature.prompt.group_name`
197+
198+
Pattern:
199+
200+
```python
201+
result = await manager.get("greeting", "production", {"user": "Alice"})
202+
with with_active_prompt(result):
203+
response = await provider.complete(result.messages, ...)
204+
```
205+
206+
Trace UIs can then pivot on `prompt.name`, filter on
207+
`prompt.template_hash` to find every call that used a given
208+
template version, or surface `prompt.group_name` to group
209+
related calls into a single workflow view.
210+
211+
Nesting is innermost-wins. If you activate a result inside
212+
another active result, the inner one wins for the duration
213+
of the inner block.
214+
215+
## Determinism and content-addressed caching
216+
217+
`render` is deterministic: same `Prompt`, same `variables`
218+
bytewise-identical `messages` and `rendered_hash` across
219+
calls. This is the cache-key contract — `rendered_hash`
220+
gives a downstream memoization layer the right equivalence
221+
relation for free.
222+
223+
Templates MAY reference user-supplied variables that capture
224+
nondeterministic values (`now=datetime.utcnow()`); the
225+
determinism contract applies to the render operation given
226+
fixed inputs, not to user-supplied variable content.
227+
228+
## A minimal example
229+
230+
```python
231+
import asyncio
232+
from pathlib import Path
233+
234+
from openarmature.prompts import (
235+
FilesystemPromptBackend,
236+
PromptManager,
237+
with_active_prompt,
238+
)
239+
240+
241+
async def main() -> None:
242+
manager = PromptManager(FilesystemPromptBackend(Path("./prompts")))
243+
result = await manager.get(
244+
"greeting",
245+
"production",
246+
variables={"user": "Alice"},
247+
)
248+
print(result.messages[0].content) # rendered text
249+
print(result.rendered_hash) # cache key
250+
# Run an LLM call inside the active-prompt context so the
251+
# OTel observer can surface prompt.* span attributes.
252+
# with with_active_prompt(result):
253+
# response = await provider.complete(result.messages)
254+
_ = with_active_prompt # marker for the snippet above
255+
256+
257+
asyncio.run(main())
258+
```
259+
260+
The filesystem backend layout is
261+
`<root>/<label>/<name>.j2` — for the example above,
262+
`./prompts/production/greeting.j2`.
263+
264+
## What's out of scope (for now)
265+
266+
- **Specific vendor backends** — Langfuse, PromptLayer, etc.,
267+
ship as sibling packages (`openarmature-langfuse`, …). The
268+
core ships the protocol + a filesystem reference.
269+
- **Prompt versioning workflows** — how versions are assigned,
270+
promoted, pinned. Per project. The spec defines the
271+
`version` field; the discipline is yours.
272+
- **Cache invalidation policies**`template_hash` and
273+
`rendered_hash` are the keys; the cache itself is a
274+
separate concern.
275+
- **Prompt linting / evaluation** — quality checks belong to
276+
separate tools (or the future eval capability).
277+
- **Multi-message render decomposition** — v1 emits a single
278+
`UserMessage` carrying the rendered text. If you need
279+
`system + user` splits, construct the messages list
280+
manually outside `render()` for now.
281+
282+
## Where to next
283+
284+
- **[Model Providers](../model-providers/index.md)**
285+
what to pass `result.messages` into.
286+
- **[API reference: `openarmature.prompts`](../reference/prompts.md)**
287+
the full public surface.

docs/reference/prompts.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
# openarmature.prompts
2+
3+
::: openarmature.prompts
4+
options:
5+
show_root_heading: false
6+
show_source: false
7+
heading_level: 2

mkdocs.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -95,6 +95,7 @@ nav:
9595
- Composition: concepts/composition.md
9696
- Fan-out: concepts/fan-out.md
9797
- LLMs: concepts/llms.md
98+
- Prompts: concepts/prompts.md
9899
- Observability: concepts/observability.md
99100
- Checkpointing: concepts/checkpointing.md
100101
- Model Providers:
@@ -104,6 +105,7 @@ nav:
104105
- reference/index.md
105106
- openarmature.graph: reference/graph.md
106107
- openarmature.llm: reference/llm.md
108+
- openarmature.prompts: reference/prompts.md
107109
- openarmature.checkpoint: reference/checkpoint.md
108110
- openarmature.observability: reference/observability.md
109111

0 commit comments

Comments
 (0)