Commit d0bc5fa

Add bundled AGENTS.md generator + drift test
The wheel now ships a generated AGENTS.md at the installed package root (openarmature/AGENTS.md) for AI agents working in code that uses openarmature.

Sections, in order: version-stamped self-reference header, TL;DR, capability summaries (spec §1+§2 of graph-engine / pipeline-utilities / llm-provider / observability / prompt-management), patterns from docs/patterns/*.md, hand-written non-obvious-shapes recipes, example index (one-liners + paths inside the source tree), discovery footer.

Generator at scripts/build_agents_md.py reads from the pinned spec submodule via `git show <sha>:spec/...` (not the working tree) and refuses to regenerate unless the submodule HEAD sits exactly at a v* tag, which closes the "release ships a bundle pinned to draft spec text" failure mode. Spec text and patterns are pulled verbatim; the non-obvious-shapes file (docs/agent/non-obvious-shapes.md) and the TL;DR (docs/agent/tldr.md) are hand-curated.

tests/test_agents_md_drift.py regenerates in-memory and diffs against the committed src/openarmature/AGENTS.md, failing the suite when the bundle is stale relative to its sources. It runs on every PR via the standard pytest invocation.

Bundle is 908 lines / 46KB at v0.22.1: dense enough to be authoritative without ballooning past useful agent context budgets.
1 parent d7be34c commit d0bc5fa

5 files changed

Lines changed: 1275 additions & 0 deletions

docs/agent/non-obvious-shapes.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
## Non-obvious shapes

Recipes that aren't deducible from the API surface alone. The primitives docs tell you what's possible; this section tells you what's smart.

### Use the bundled `FilesystemCheckpointer` or `SQLiteCheckpointer`, not a hand-rolled serializer

The temptation when persisting graph state is to `json.dumps(state.model_dump())` and write to a file. Don't. The shipped Checkpointer backends handle every contract `openarmature.checkpoint.Checkpointer` defines — round-trip integrity, `parent_states` for inner-save resume, fan-out progress tracking, schema-version migration, listing by `correlation_id`, `CheckpointRecordInvalid` on shape drift. A hand-rolled serializer that "works" on the happy path silently fails the moment a fan-out crash leaves an in-flight save record, and you'll be debugging it for hours before realizing the bundled backend exists.

If your storage requirement isn't local disk (`FilesystemCheckpointer`) or local SQLite (`SQLiteCheckpointer` — also supports `:memory:` and arbitrary file paths), implement the `Checkpointer` Protocol against your backend rather than wrapping state serialization yourself. Custom backends inherit the spec's correctness contract for free.

### Subgraphs > conditional-edge spaghetti when branches don't share state

A common shape is "after this LLM call, route to either a JSON-extraction node or a tool-dispatch node depending on `finish_reason`." The naive solution is two conditional edges from the LLM node, one to each downstream. That works for two branches; it scales poorly past three.

When the branches operate on different sub-shapes of state — e.g., one path is "extract JSON, then validate" while another is "dispatch tools, loop until done, then summarize" — encapsulate each as a `SubgraphNode` and route from the LLM node to the right subgraph. Each subgraph has its own state schema (projected from the parent), its own entry node, and its own internal topology. The parent graph becomes a switchboard with a few edges; the complexity lives one layer down where it composes cleanly.

### Be explicit with `tool_choice`; don't trust the provider's default

`Provider.complete(messages, tools, tool_choice=...)` accepts `"auto"`, `"required"`, `"none"`, or a `ForceTool(name=...)` record. When you omit `tool_choice`, the provider's own default applies: usually `"auto"` when `tools` is non-empty, but the exact behavior is documented per provider. A pipeline that wants deterministic tool-calling (a routing node that MUST produce a tool call, a guarded LLM call that MUST NOT call tools) should pin `tool_choice` explicitly rather than relying on the provider default.

Pre-send validation catches the three §5 failure modes (`required` with empty tools, `ForceTool` with empty tools, `ForceTool.name` not in tools) and raises `ProviderInvalidRequest` before the HTTP call. Not all providers honor `tool_choice` — confirm with your provider's docs — but the OpenAI-compatible mapping is in `OpenAIProvider`.

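To make the three pre-send checks concrete, here is a minimal, self-contained sketch. `ForceTool` and `ProviderInvalidRequest` are local stand-ins named after the library's types, and `validate_tool_choice` is a hypothetical helper. None of this is the library's actual implementation; it only illustrates the three failure modes listed above.

```python
from dataclasses import dataclass
from typing import Sequence, Union


class ProviderInvalidRequest(Exception):
    """Local stand-in for the library exception of the same name."""


@dataclass(frozen=True)
class ForceTool:
    """Local stand-in for the library's ForceTool record."""

    name: str


ToolChoice = Union[str, ForceTool]


def validate_tool_choice(tool_names: Sequence[str], tool_choice: ToolChoice) -> None:
    """Hypothetical pre-send check covering the three failure modes."""
    if tool_choice == "required" and not tool_names:
        raise ProviderInvalidRequest("tool_choice='required' needs at least one tool")
    if isinstance(tool_choice, ForceTool):
        if not tool_names:
            raise ProviderInvalidRequest("ForceTool needs at least one tool")
        if tool_choice.name not in tool_names:
            raise ProviderInvalidRequest(
                f"ForceTool.name {tool_choice.name!r} is not in tools"
            )


# A routing node that must call a specific tool pins the choice explicitly.
validate_tool_choice(["search"], ForceTool(name="search"))  # passes silently
```

Failing before the HTTP call keeps a misconfigured routing node from spending a request on a call that could never satisfy its contract.
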
### Always `await graph.drain()` in short-lived processes; supply a `timeout` if observers might hang

`CompiledGraph.invoke()` returns when the graph reaches END or raises; observer events are dispatched onto a per-invocation queue and delivered by a background worker. The graph's execution loop never awaits observer processing. In a long-running service this is invisible — the worker drains naturally. In a CLI, script, or serverless function, the process exits before the worker finishes, and any late observer events (typically the last node's `completed` event plus any `checkpoint_saved` events) get dropped.

Always call `await graph.drain()` before the short-lived process exits. If your observer set includes anything that might hang (a metrics observer with a flaky network endpoint, an OTel exporter behind a slow OTLP collector), supply a `timeout`:

```python
summary = await graph.drain(timeout=5.0)
if summary.timeout_reached:
    log.warning("drain truncated: %d events undelivered", summary.undelivered_count)
```

The compiled graph stays usable for subsequent invocations after a timed-out drain — workers are cancelled cleanly, no partial state leaks.

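The dropped-event failure mode is easy to reproduce with plain `asyncio`. The sketch below is a generic model of a queue plus background worker, not openarmature's implementation; `deliver`, `drain`, and `run` are hypothetical names, and the event strings are illustrative only.

```python
import asyncio


async def deliver(queue: asyncio.Queue, delivered: list) -> None:
    """Background worker: hand each queued event to a (slow) observer."""
    while True:
        event = await queue.get()
        await asyncio.sleep(0.01)  # simulate observer latency
        delivered.append(event)
        queue.task_done()


async def drain(queue: asyncio.Queue, timeout: float) -> bool:
    """Wait until every queued event is delivered; False if the timeout hit."""
    try:
        await asyncio.wait_for(queue.join(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False


async def run(do_drain: bool) -> list:
    queue: asyncio.Queue = asyncio.Queue()
    delivered: list = []
    worker = asyncio.create_task(deliver(queue, delivered))
    for name in ("node_a.completed", "node_b.completed", "checkpoint_saved"):
        queue.put_nowait(name)  # the producer never awaits delivery
    if do_drain:
        await drain(queue, timeout=5.0)
    worker.cancel()  # models process exit: the worker dies with the loop
    return delivered


without_drain = asyncio.run(run(do_drain=False))
with_drain = asyncio.run(run(do_drain=True))
print(len(without_drain), len(with_drain))  # prints: 0 3
```

Without the drain, the worker is cancelled before it delivers anything; with it, all three events arrive. The timeout bounds the wait when an observer never finishes.
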
### Three exception hierarchies; know which one your code catches

`openarmature` exceptions split across three sibling hierarchies:

- `RuntimeGraphError` (in `openarmature.graph`) — node execution failures: `NodeException`, `RoutingError`, `EdgeException`, `ReducerError`, `StateValidationError`. Each has a `category` string matching the spec's canonical error categories.
- `CheckpointError` (in `openarmature.checkpoint`) — persistence failures: `CheckpointNotFound`, `CheckpointSaveFailed`, `CheckpointRecordInvalid`, `CheckpointStateMigrationMissing`, `CheckpointStateMigrationFailed`, `CheckpointStateMigrationChainAmbiguous`.
- `LlmProviderError` (in `openarmature.llm`) — provider call failures: `ProviderAuthentication`, `ProviderInvalidRequest`, `ProviderInvalidResponse`, `ProviderInvalidModel`, `ProviderModelNotLoaded`, `ProviderRateLimit`, `ProviderUnavailable`, `ProviderUnsupportedContentBlock`, `StructuredOutputInvalid`.

Catching `Exception` works but is too broad; catching one hierarchy misses the other two. If you want to branch on category strings (e.g., for retry logic), catch the relevant base — `RuntimeGraphError` covers all five spec runtime categories, `LlmProviderError` covers all nine provider categories, `CheckpointError` covers all six checkpoint categories. The `TRANSIENT_CATEGORIES` frozenset in `openarmature.llm` enumerates which provider categories are retriable.
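
A sketch of branching on the three bases follows. The classes here are local stand-ins so the block runs without the package installed. The category string values, the contents of `TRANSIENT_CATEGORIES`, and the presence of a `category` attribute on provider errors are assumptions for illustration; check the real values in `openarmature.llm` before relying on them. `handle` is a hypothetical helper.

```python
class RuntimeGraphError(Exception):
    category = "runtime"  # assumed value


class NodeException(RuntimeGraphError):
    category = "node_exception"  # assumed value


class CheckpointError(Exception):
    pass


class CheckpointNotFound(CheckpointError):
    pass


class LlmProviderError(Exception):
    category = "provider_error"  # assumed value


class ProviderRateLimit(LlmProviderError):
    category = "provider_rate_limit"  # assumed value


class ProviderAuthentication(LlmProviderError):
    category = "provider_authentication"  # assumed value


# Assumed contents; the real frozenset lives in openarmature.llm.
TRANSIENT_CATEGORIES = frozenset({"provider_rate_limit", "provider_unavailable"})


def handle(call) -> str:
    """Run ``call`` and report which policy applies to its failure."""
    try:
        call()
    except LlmProviderError as exc:
        # Only provider failures carry a retry decision.
        return "retry" if exc.category in TRANSIENT_CATEGORIES else "give up"
    except CheckpointError:
        return "persistence failure"
    except RuntimeGraphError as exc:
        return f"graph failure ({exc.category})"
    return "ok"


def raising(exc: Exception):
    """Build a zero-argument callable that raises ``exc``."""
    def _call():
        raise exc
    return _call


print(handle(raising(ProviderRateLimit())))       # prints: retry
print(handle(raising(ProviderAuthentication())))  # prints: give up
```

Because the three bases are siblings, the order of the `except` clauses does not change which one matches; each failure lands in exactly one branch.
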

docs/agent/tldr.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
OpenArmature is a workflow framework for LLM pipelines and tool-calling agents — typed state, compile-time topology checks, observability, and crash-safe checkpoints baked into a graph engine. The graph layer has no concept of LLMs or tools; the same primitives drive deterministic ETL pipelines and tool-calling agents alike. Nodes return partial updates; the engine merges into a frozen state snapshot. Behavior is defined by [openarmature-spec](https://openarmature.ai/capabilities/) and verified by conformance fixtures; this package is the reference Python implementation.

**What OpenArmature is NOT:** not a chat framework (no built-in messages channel), not an LLM SDK (Provider is the abstraction layer; OpenAIProvider is the canonical impl), not a state-management library (state is per-invocation, not application-wide), not an evaluation framework (deferred to `openarmature-eval`).

scripts/build_agents_md.py

Lines changed: 269 additions & 0 deletions
@@ -0,0 +1,269 @@
"""Generator for the bundled ``src/openarmature/AGENTS.md`` agent docs.

Pulls from canonical sources (pinned spec submodule, patterns docs,
hand-curated agent docs, example program docstrings) and concatenates
into a single agent-discoverable file shipped in the wheel.

Sources, in order of bundle layout:

1. Self-reference header — version-stamped, pointers out to the docs
   site and the spec capabilities page.
2. ``docs/agent/tldr.md`` — hand-written 3-5 sentence orientation.
3. Capability summaries — §1 (Purpose) + §2 (Concepts) of each
   capability spec, read from the pinned ``openarmature-spec``
   submodule via ``git show <sha>:spec/...`` rather than the
   working tree.
4. ``docs/patterns/*.md`` — verbatim concatenation of the patterns
   docs (excluding ``index.md``).
5. ``docs/agent/non-obvious-shapes.md`` — hand-written opinionated
   recipes.
6. Example index — one-line description + path for each
   ``examples/*/main.py`` program.
7. Discovery footer — pointer back out to docs / spec / host
   project conventions.

Build-time invariants (matches proposal-0028 follow-on review's
submodule-pin discipline):

- Submodule HEAD MUST sit exactly at a ``v*`` tag (checked with
  ``git tag --points-at``). The build refuses to read draft
  (untagged) spec text into a release bundle.
- Spec text is read from the pinned commit via ``git show``, NOT
  from the submodule working tree. Closes the "submodule HEAD
  moved but bundle still reads stale tree" failure mode.

Drift between the committed bundle and the regenerated output is
caught by ``tests/test_agents_md_drift.py``.
"""

from __future__ import annotations

import subprocess
import sys
from pathlib import Path

# Make ``openarmature`` importable without requiring an editable install
# pass through ``uv`` — the build script runs locally and on CI.
REPO_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(REPO_ROOT / "src"))

import openarmature  # noqa: E402

SPEC_ROOT = REPO_ROOT / "openarmature-spec"
DOCS = REPO_ROOT / "docs"
EXAMPLES = REPO_ROOT / "examples"
OUTPUT = REPO_ROOT / "src" / "openarmature" / "AGENTS.md"

# Spec capability directory names under ``openarmature-spec/spec/``,
# in the order they appear in the bundle's "Capability contracts"
# section. The order matches the order capabilities were introduced
# (graph-engine first, prompt-management most recent) so an agent
# reading top-down sees the foundational layer before the layers
# built on top.
CAPABILITIES = (
    "graph-engine",
    "pipeline-utilities",
    "llm-provider",
    "observability",
    "prompt-management",
)


def _git_in_spec(*args: str) -> str:
    """Run ``git -C openarmature-spec <args>`` and return stdout stripped."""
    return subprocess.run(
        ["git", "-C", str(SPEC_ROOT), *args],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()


def _assert_pin_at_tag() -> str:
    """Confirm the submodule HEAD is at a ``v*`` tag.

    Returns the tag name (e.g., ``v0.22.1``). Raises ``RuntimeError``
    on a non-tag pin so a release can't accidentally ship a bundle
    pinned to a draft spec commit.
    """
    sha = _git_in_spec("rev-parse", "HEAD")
    # Let git order the tags by version, highest first. A plain string
    # sort would rank ``v0.9.0`` above ``v0.22.1``.
    tags_out = _git_in_spec(
        "tag", "--points-at", sha, "--sort=-version:refname", "--list", "v*"
    )
    if not tags_out:
        raise RuntimeError(
            f"submodule HEAD {sha[:8]} is not at a v* tag; "
            f"bundle build refuses to read draft (untagged) spec text. "
            f"Pin the submodule to a published tag before regenerating."
        )
    # Prefer the highest version tag if multiple point at the same SHA
    # (e.g., during a re-tag); the list is already version-descending.
    return tags_out.splitlines()[0]


def _read_pinned_spec(path_in_spec: str) -> str:
    """Read a file from the pinned spec commit via ``git show``.

    Distinct from reading the working tree: a stale checkout would
    silently produce stale bundle content. ``git show HEAD:<path>``
    always reads from the recorded commit.
    """
    return subprocess.run(
        ["git", "-C", str(SPEC_ROOT), "show", f"HEAD:{path_in_spec}"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout


def _header(version: str, spec_tag: str) -> str:
    return (
        f"# OpenArmature — Agent documentation\n"
        f"\n"
        f"*This is the agent guide bundled with the openarmature Python package, "
        f"version {version} (spec {spec_tag}). For the full docs site see "
        f"[openarmature.ai](https://openarmature.ai). For the canonical spec text see "
        f"[openarmature.ai/capabilities](https://openarmature.ai/capabilities/). "
        f"For project-specific conventions for the code you're editing, see the host "
        f"project's `AGENTS.md` or `CLAUDE.md`.*"
    )


def _tldr() -> str:
    body = (DOCS / "agent" / "tldr.md").read_text().strip()
    return f"## TL;DR\n\n{body}"


def _extract_sections_1_2(spec_text: str) -> str:
    """Extract content between ``## 1.`` and ``## 3.`` (inclusive of §1+§2)."""
    out: list[str] = []
    in_target = False
    for line in spec_text.splitlines():
        if line.startswith("## 1."):
            in_target = True
        elif line.startswith("## 3."):
            break
        if in_target:
            out.append(line)
    if not out:
        raise RuntimeError(
            "spec heading-extraction failed: no `## 1.` heading found. "
            "Spec capability may have renumbered; revisit the build script."
        )
    return "\n".join(out).rstrip()


def _capability_summaries(spec_tag: str) -> str:
    sections = [
        "## Capability contracts",
        "",
        f"_Sourced from openarmature-spec {spec_tag}. Each entry below "
        f"reproduces §1 (Purpose) and §2 (Concepts) of the capability's "
        f"`spec.md`. For the full spec text (execution model, error semantics, "
        f"determinism, observer hooks, etc.) see the linked docs site._",
    ]
    for cap in CAPABILITIES:
        text = _read_pinned_spec(f"spec/{cap}/spec.md")
        sections.append("")
        sections.append(f"### Capability: `{cap}`")
        sections.append("")
        sections.append(_extract_sections_1_2(text))
    return "\n".join(sections)


def _patterns() -> str:
    sections = [
        "## Patterns",
        "",
        "_Recipes that compose the primitives. Not framework contracts — "
        "these are how to do common things idiomatically._",
    ]
    pattern_files = sorted(p for p in (DOCS / "patterns").glob("*.md") if p.name != "index.md")
    for pf in pattern_files:
        sections.append("")
        sections.append(pf.read_text().rstrip())
    return "\n".join(sections)


def _non_obvious_shapes() -> str:
    # The file's own top-level heading is `## Non-obvious shapes`;
    # inlined verbatim with the heading intact.
    return (DOCS / "agent" / "non-obvious-shapes.md").read_text().rstrip()


def _extract_first_docstring_paragraph(source: str) -> str:
    """Extract the first paragraph of a Python module docstring.

    Module docstrings open with a triple-quoted string at line 0.
    The first "paragraph" is the text from the opening quotes to
    the first blank line within the docstring (or to the closing
    quotes if the docstring is one paragraph).
    """
    lines = source.splitlines()
    if not lines or not lines[0].startswith('"""'):
        return ""
    # Text on the first line, after the opening triple-quote
    first_text = lines[0][3:].rstrip()
    if first_text.endswith('"""'):
        return first_text[:-3].rstrip()
    para = [first_text] if first_text else []
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "" or stripped.startswith('"""'):
            break
        if stripped.endswith('"""'):
            # Closing quotes share a line with text: keep the text.
            para.append(stripped[:-3].rstrip())
            break
        para.append(stripped)
    return " ".join(p for p in para if p)


def _example_index() -> str:
    sections = [
        "## Example index",
        "",
        "_Runnable example programs shipped in the source tree at `examples/`. "
        "The full code is not bundled here (each example is 300+ lines); read "
        "the file at the listed path to see the canonical shape for that use case._",
        "",
    ]
    for ex in sorted(EXAMPLES.glob("*/main.py")):
        first_paragraph = _extract_first_docstring_paragraph(ex.read_text())
        rel = ex.relative_to(REPO_ROOT)
        sections.append(f"- **`{rel}`** — {first_paragraph}")
    return "\n".join(sections)


def _discovery_footer() -> str:
    return (
        "## Discovery cross-references\n"
        "\n"
        "If your question isn't covered above, look here:\n"
        "\n"
        "- **Full docs site:** [openarmature.ai](https://openarmature.ai)\n"
        "- **Spec text:** [openarmature.ai/capabilities](https://openarmature.ai/capabilities/)\n"
        "- **API reference:** [openarmature.ai/reference](https://openarmature.ai/reference/)\n"
        "- **Host project conventions:** the project's own `AGENTS.md` / `CLAUDE.md`\n"
    )


def build() -> str:
    spec_tag = _assert_pin_at_tag()
    version = openarmature.__version__
    sections = [
        _header(version, spec_tag),
        _tldr(),
        _capability_summaries(spec_tag),
        _patterns(),
        _non_obvious_shapes(),
        _example_index(),
        _discovery_footer(),
    ]
    return "\n\n".join(sections) + "\n"


def main() -> None:
    content = build()
    OUTPUT.write_text(content)
    line_count = content.count("\n")
    byte_count = len(content.encode("utf-8"))
    print(f"wrote {OUTPUT.relative_to(REPO_ROOT)}: {line_count} lines, {byte_count:,} bytes")


if __name__ == "__main__":
    main()
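
The drift test named in the commit message is not shown on this page. As a rough sketch of the idea only, a regenerate-and-diff check could look like the block below. `drift_diff` and `check_no_drift` are hypothetical names, and the strings passed in stand for the committed file's text and the output of `build()`; the actual `tests/test_agents_md_drift.py` may be structured differently.

```python
import difflib


def drift_diff(committed: str, regenerated: str) -> str:
    """Return a unified diff; an empty string means the bundle is current."""
    return "".join(
        difflib.unified_diff(
            committed.splitlines(keepends=True),
            regenerated.splitlines(keepends=True),
            fromfile="AGENTS.md (committed)",
            tofile="AGENTS.md (regenerated)",
        )
    )


def check_no_drift(committed: str, regenerated: str) -> None:
    """Fail with the diff attached when the committed bundle is stale."""
    diff = drift_diff(committed, regenerated)
    assert not diff, f"AGENTS.md is stale; rerun scripts/build_agents_md.py\n{diff}"


# Identical text: the check passes silently.
check_no_drift("# Title\n\nbody\n", "# Title\n\nbody\n")
```

Attaching the diff to the assertion message means a failing CI run shows which source changed without anyone having to regenerate locally first.
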
