Bump spec to v0.20.1; refresh docs and CHANGELOG

chris-colinsky · chris-colinsky · commit f06ad6257611 · 2026-05-25T16:03:14.000-07:00

Submodule pin advances from v0.19.0 to v0.20.1, absorbing proposal
0025 (tool_choice, v0.20.0) and proposal 0026 (§8.X wire-format
mapping subsection template, v0.20.1). 0026 is purely textual; the
existing OpenAI §8.1 mapping is the template's reference shape so
no Python module-level work is needed. Runtime spec_version pins in
pyproject and __init__ updated to match; smoke test asserts v0.20.1.

CHANGELOG Unreleased section gains tool_choice + ForceTool +
ToolChoice + validate_tool_choice entries under Added; the
Provider.complete() signature extension noted under Changed;
cumulative pin-bump summary updated to v0.17.0 -> v0.20.1 across
six spec versions absorbed this cycle.
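
As a rough illustration of the three pre-send rules the new entries
describe, the sketch below re-implements them in standalone Python.
It is an assumption-laden stand-in, not the shipped
validate_tool_choice: ForceToolSketch, InvalidRequest, and
check_tool_choice are hypothetical names, a plain dataclass replaces
the Pydantic model, and tools are represented only by their names.

```python
from dataclasses import dataclass
from typing import Literal, Optional, Sequence, Union


class InvalidRequest(ValueError):
    """Hypothetical stand-in for ProviderInvalidRequest."""


@dataclass(frozen=True)
class ForceToolSketch:
    """Hypothetical stand-in for ForceTool (a dataclass, not Pydantic)."""

    name: str
    type: Literal["tool"] = "tool"


ToolChoiceSketch = Union[Literal["auto", "required", "none"], ForceToolSketch]


def check_tool_choice(
    tool_choice: Optional[ToolChoiceSketch], tool_names: Sequence[str]
) -> None:
    """Raise InvalidRequest if any of the three pre-send rules is violated."""
    if tool_choice is None:
        return  # default: nothing to validate, provider default applies
    # Rule 1: "required" needs at least one tool to choose from.
    if tool_choice == "required" and not tool_names:
        raise InvalidRequest('"required" needs a non-empty tools list')
    if isinstance(tool_choice, ForceToolSketch):
        # Rule 2: forcing a tool needs a non-empty tools list.
        if not tool_names:
            raise InvalidRequest("ForceTool needs a non-empty tools list")
        # Rule 3: the forced name must be one of the supplied tools.
        if tool_choice.name not in tool_names:
            raise InvalidRequest(f"unknown tool {tool_choice.name!r}")
```

Under these assumptions "auto" and "none" always pass, which matches
the CHANGELOG wording that only "required" and ForceTool demand
non-empty tools.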

docs/concepts/llms.md gains a "Controlling tool-call behavior with
tool_choice" subsection under Tool calling, covering the four modes,
the three pre-send validation rules, and the cross-provider caveat
(not all providers honor tool_choice).
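
To make the cross-provider caveat concrete, here is a hedged sketch
of how one provider might translate the spec value onto a request
payload. It is not OpenAIProvider's code. The CHANGELOG in this
commit only states that ForceTool's type "tool" becomes wire type
"function" and that None omits the field; the nested "function"
object follows OpenAI's Chat Completions request shape as I
understand it. ForceToolSketch and add_tool_choice are hypothetical
names.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass(frozen=True)
class ForceToolSketch:
    """Hypothetical stand-in for ForceTool."""

    name: str
    type: str = "tool"


def add_tool_choice(
    payload: dict[str, Any],
    tool_choice: Optional[Union[str, ForceToolSketch]],
) -> dict[str, Any]:
    """Return a copy of payload with tool_choice mapped onto the wire."""
    out = dict(payload)  # shallow copy; the caller's payload is untouched
    if tool_choice is None:
        return out  # field omitted, so the provider's own default applies
    if isinstance(tool_choice, ForceToolSketch):
        # Spec type "tool" is renamed to wire type "function".
        out["tool_choice"] = {
            "type": "function",
            "function": {"name": tool_choice.name},
        }
    else:
        # "auto", "required", and "none" pass through unchanged.
        out["tool_choice"] = tool_choice
    return out
```

A provider whose wire format has no equivalent field would need a
different branch here, or would drop the value, which is why the docs
tell readers to confirm support with their provider.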
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -8,6 +8,9 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
 
 ### Added
 
+- **`tool_choice` parameter on `Provider.complete()`** (proposal 0025, accepted in spec v0.20.0). Optional discriminated-union value constraining the model's tool-calling behavior — one of `"auto"`, `"required"`, `"none"`, or a `ForceTool(name=...)` record. Validation runs pre-send: `"required"` and `ForceTool` both demand non-empty `tools`, and `ForceTool.name` must appear in the supplied list; violations raise `ProviderInvalidRequest` (§7's existing category — no new error category). When `tool_choice` is `None` (the default) the wire field is omitted and the provider's own default applies, preserving pre-0025 behavior exactly. The `OpenAIProvider` maps the spec shape onto OpenAI's wire shape per §8.1.1 (the `ForceTool.type="tool"` renames to wire `type="function"`).
+- **`ForceTool` and `ToolChoice` public types** at `openarmature.llm.ForceTool` / `openarmature.llm.ToolChoice`. `ForceTool` is a frozen Pydantic model with `type: Literal["tool"] = "tool"` and `name: str`; `ToolChoice = Literal["auto", "required", "none"] | ForceTool` is the type alias used in `Provider.complete()`'s signature.
+- **`validate_tool_choice` public validator** at `openarmature.llm.validate_tool_choice`. Standalone validator covering the three §5 pre-send rules; useful for third-party `Provider` implementations that want to reuse the canonical validation logic.
 - **Bounded drain timeout on `CompiledGraph.drain()`** (proposal 0010, accepted in spec v0.19.0). `drain()` accepts an optional `timeout: float | None = None` parameter (non-negative seconds). When supplied, drain returns no later than the deadline; any observer events still queued or in-flight are reported as undelivered. Workers are cancelled cleanly so the compiled graph remains usable for subsequent invocations — partial delivery state from one drain does NOT leak into the next. Solves the "slow / hung / misbehaving observer blocks process exit" footgun for short-lived processes (CLIs, scripts, serverless functions). Observers SHOULD be cancellation-safe (idempotent writes, `try/finally` cleanup); the spec doesn't mandate it but the docs recommend it.
 - **`DrainSummary` frozen dataclass** at `openarmature.graph.DrainSummary`. Returned from every `drain()` call (with or without `timeout`). Fields: `undelivered_count: int`, `timeout_reached: bool`. The shape is consistent across timed and untimed drains — callers receive the same dataclass whether the timeout was supplied or not. Per the v0.19.0 contract the two declared fields are the spec-mandated minimum; richer diagnostic detail (per-observer counts, sampled event metadata) is reserved for follow-on PRs.
 - **Per-instance fan-out resume contract** (proposal 0009, accepted in spec v0.18.0). The engine now writes a checkpoint record at every `completed` event inside a fan-out instance (in addition to the existing outermost-graph + subgraph-internal + fan-out node completion saves). On resume the engine consults the saved record's `fan_out_progress` field and treats each instance as `completed` (skip, contribution rolls forward), `in_flight` (re-run from subgraph entry), or `not_started` (dispatch normally). The `append` reducer's no-double-merge guarantee holds across resume because `completed` is a one-shot accumulator state.
@@ -17,13 +20,14 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
 
 ### Changed
 
+- **`Provider.complete()` signature** extended with an optional `tool_choice: ToolChoice | None = None` parameter (per proposal 0025 v0.20.0). Backward-compatible: callers that omit the new argument see no wire-shape change. Third-party `Provider` implementations that don't add the parameter still structurally satisfy the `Provider` Protocol; their wire path silently drops `tool_choice`. The OpenAI mapping is in `OpenAIProvider`; future Anthropic / Gemini providers will follow the §8.X template (proposal 0026) and ship their own `tool_choice` wire mapping.
 - **`CompiledGraph.drain()` return type** changed from `None` to `DrainSummary` (pre-1.0; per proposal 0010 v0.19.0 contract). Callers that ignored the return are unaffected — `await graph.drain()` discards the returned dataclass exactly as before. Callers that explicitly typed the return as `None` will need to update their annotation.
 - **Fan-out resume behavior** flipped from atomic restart (0008's v1 contract) to per-instance resume. A crash mid-fan-out used to re-run the entire fan-out on resume; now only the instances that did not complete-and-record their contribution re-run. The economics matter for large fan-outs of expensive work (LLM calls, long extractions): an 80% complete fan-out crash now restores 80% of its results rather than discarding them.
 - **`SQLiteCheckpointer` schema** picks up a new `fan_out_progress_blob` column (added via `ALTER TABLE` for backward compatibility with pre-0009 databases). Pre-0009 rows back-fill as NULL on load and round-trip as the empty-tuple default. Both `pickle` and `json` serialization modes round-trip the new field.
 
 ### Notes
 
-- **Pinned spec version bumped from v0.17.0 to v0.19.0 over this Unreleased cycle.** Four spec versions absorbed: v0.17.1 (proposal 0019, multi-provider wire-format extension; purely textual reframe of llm-provider §8 as a catalog of wire-format mappings, OpenAI-compatible body nested under §8.1, code references updated to §8.1 / §8.1.1 / §8.1.2 / §8.1.3 / §8.1.5.1 / §8.1.1.1), v0.18.0 (proposal 0009, per-instance fan-out resume; pipeline-utilities §10.3 / §10.7 revised, §10.11 added with per-instance state machine plus composition rules plus configurable batching; the `append` reducer no-double-merge invariant from §10.11.1 is the load-bearing correctness story; see Added / Changed above), v0.18.1 (fixture-only patch on `release/v0.18.1` correcting an off-by-one literal in fixture 052's expected `results`), and v0.19.0 (proposal 0010, bounded drain timeout; graph-engine §6 amended with the `timeout` parameter and `DrainSummary` return contract; see Added / Changed above). All existing conformance fixtures continue to pass.
+- **Pinned spec version bumped from v0.17.0 to v0.20.1 over this Unreleased cycle.** Six spec versions absorbed: v0.17.1 (proposal 0019, multi-provider wire-format extension — purely textual reframe of llm-provider §8 as a catalog of wire-format mappings; OpenAI-compatible body nested under §8.1), v0.18.0 (proposal 0009, per-instance fan-out resume — pipeline-utilities §10.3 / §10.7 revised, §10.11 added; the `append` reducer no-double-merge invariant is the load-bearing correctness story), v0.18.1 (fixture-only patch correcting an off-by-one literal in fixture 052's expected `results`), v0.19.0 (proposal 0010, bounded drain timeout — graph-engine §6 amended with the `timeout` parameter and `DrainSummary` return contract), v0.20.0 (proposal 0025, llm-provider `tool_choice` — §5 / §7 / §8.1.1 amended; see Added / Changed above), and v0.20.1 (proposal 0026, llm-provider §8.X wire-format mapping subsection template — purely textual §8 framing paragraph; the existing OpenAI §8.1 mapping is the template's reference shape so no python module-level work was needed). All existing conformance fixtures continue to pass.
 
 ## [0.8.0] — 2026-05-23
 
diff --git a/docs/concepts/llms.md b/docs/concepts/llms.md
@@ -273,6 +273,47 @@ prevents runaway loops on a model that stays in tool-calling forever.
 See [`09 - Tool use`](../examples/09-tool-use.md) for the runnable
 shape.
 
+### Controlling tool-call behavior with `tool_choice`
+
+By default the model decides whether and which tools to call.
+`tool_choice` constrains that decision per call. Four modes:
+
+- `"auto"` — the model decides. Equivalent to omitting the parameter
+  when `tools` is non-empty.
+- `"required"` — the model MUST call at least one tool. Useful for
+  routing nodes that branch on tool selection.
+- `"none"` — the model MUST NOT call tools, even if `tools` is
+  supplied. Useful for guarded LLM calls or for explicitly disabling
+  tool-calling without rebuilding a tools-less request.
+- `ForceTool(name=...)` — the model MUST call exactly the named tool.
+
+Pre-send validation catches the three failure modes (`required` with
+empty tools, `ForceTool` with empty tools, `ForceTool.name` not in
+the supplied list) and raises `ProviderInvalidRequest` before the
+HTTP request is sent.
+
+```python
+from openarmature.llm import ForceTool
+
+# Routing node: model MUST pick one of the supplied tools.
+response = await provider.complete(
+    messages, tools=[search, summarize], tool_choice="required"
+)
+
+# Forced specific tool: useful when the pipeline knows which tool
+# the model should call next (e.g., a `dispatch_search` node).
+response = await provider.complete(
+    messages, tools=[search, summarize], tool_choice=ForceTool(name="search")
+)
+```
+
+Not all providers honor `tool_choice` — confirm with your provider's
+documentation. The `OpenAIProvider` maps the spec shape onto OpenAI's
+wire shape per the §8.1.1 mapping table. Whether the model actually
+honored the constraint is observable from the returned
+`finish_reason` and `tool_calls` fields; the framework does NOT
+re-validate the response against the constraint.
+
 ## Content blocks (multimodal user messages)
 
 User messages carry content in one of two shapes: a plain text string,
diff --git a/openarmature-spec b/openarmature-spec
@@ -1 +1 @@
-Subproject commit 5acaccc9f6432ccb96d6c876e1a5b31fb3b55994
+Subproject commit bd9e62d94a5d7082136cc675362a22c4f5fe31a1
diff --git a/pyproject.toml b/pyproject.toml
@@ -48,7 +48,7 @@ Repository = "https://github.com/LunarCommand/openarmature-python"
 Specification = "https://github.com/LunarCommand/openarmature-spec"
 
 [tool.openarmature]
-spec_version = "0.19.0"
+spec_version = "0.20.1"
 
 [dependency-groups]
 dev = [
diff --git a/src/openarmature/__init__.py b/src/openarmature/__init__.py
@@ -1,4 +1,4 @@
 """OpenArmature: workflow framework for LLM pipelines and tool-calling agents."""
 
 __version__ = "0.8.0"
-__spec_version__ = "0.19.0"
+__spec_version__ = "0.20.1"
diff --git a/tests/test_smoke.py b/tests/test_smoke.py
@@ -9,7 +9,7 @@
 
 def test_package_versions() -> None:
     assert openarmature.__version__ == "0.8.0"
-    assert openarmature.__spec_version__ == "0.19.0"
+    assert openarmature.__spec_version__ == "0.20.1"
 
 
 def test_spec_version_matches_pyproject() -> None: