diff --git a/CHANGELOG.md b/CHANGELOG.md index edea648..f28d750 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -18,7 +18,7 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The - **Observer privacy flag `disable_llm_payload` renamed to `disable_provider_payload`** (proposal 0059, observability §5.5.4, spec v0.54.0). The observer-level flag on both bundled observers (`OTelObserver` and `LangfuseObserver`) is renamed, and its scope broadens from LLM-completion payload to any provider-call payload (LLM completion today; embedding and rerank when those land). This is a breaking change to both observer constructors: config passing `disable_llm_payload=True` (or `False`) updates to `disable_provider_payload=...` with no other change. The default stays `True` (payload suppressed), and the gating behavior for `LlmCompletionEvent` / `LlmFailedEvent` rendering is unchanged at every existing site. The rename is the only part of proposal 0059 adopted this cycle: the retrieval-provider capability itself (the `EmbeddingProvider` protocol, the `EmbeddingEvent` / `EmbeddingFailedEvent` typed variants, and the embedding span / observation mapping) is not yet implemented and rides as `not-yet` in `conformance.toml`. The §5.5.4 rename touches existing LLM-payload gating, so it lands with the pin. - **Fan-out failure-isolation degrade contribution implemented** (proposal 0066, pipeline-utilities §9.3 / §9.8 / §11.7, spec v0.56.0). When `FailureIsolationMiddleware` degrades a fan-out instance, that instance is a success whose contribution is its `degraded_update`, read in subgraph-field-name space and never merged onto the failed instance's pre-failure state. This also fixes a latent bug: an instance `degraded_update`'s `extra_outputs` values were previously looked up by the parent field name and silently dropped (`collect_field` was unaffected). 
A static `degraded_update` that omits the node's `collect_field` is now a compile-time error (`FanOutDegradedUpdateMissingCollectField`); a callable `degraded_update` that omits it yields a graceful null slot rather than raising, preserving one collection slot per item. The parallel-branches counterpart (a branch `degraded_update` omitting a projected `outputs` field skips that field) was already correct as of the parallel-branches fix above and is now pinned by fixture 065. Success-path and resume behavior for correctly-configured fan-outs is unchanged. - **Failure-isolation events carry the full structured cause chain** (proposal 0068, pipeline-utilities §6.3, spec v0.57.0). `FailureIsolatedEvent.caught_exception` gains a `chain`: an ordered list of `CauseLink` records (each carrying `category`, `message`, and a `carrier` flag), from the caught exception (outermost) to the originating raise (innermost), with graph-engine `node_exception` carrier wrappers flagged `carrier=True`. The existing `category` and `message` are retained and redefined as a derivation over the chain: the category of the outermost non-carrier link whose category is a non-empty string (else `category` is `null` and `message` is the outermost non-carrier link's message). This supersedes proposal 0065's single "originating cause" representation, which was ambiguous once the post-carrier chain held more than one non-carrier link; the derivation reproduces 0065's single-carrier values, so fixture 064 is unchanged. A new `CauseLink` type is exported from `openarmature.graph`. The bundled OTel and Langfuse observers continue to render the derived `category`; surfacing the full chain is left to custom observers. The change is additive to the event shape, and catch/degrade behavior is unchanged. Conformance fixture 066 (three cases: an instance-site carrier chain, a node-level single non-carrier link, and an uncategorized null-category cause) passes. 
-- **Pinned spec advances v0.53.0 → v0.58.0 across the v0.14.0 cycle**, in five steps: v0.54.0 (proposal 0059, the observer-flag rename above), v0.55.1 (proposal 0065 above; the v0.55.1 patch also carries an observability §11 span-links text reconciliation that narrows an *Out of scope* bullet, with no python-observable change), v0.56.0 (proposal 0066, the fan-out degrade contribution above), v0.57.0 (proposal 0068, the failure-isolation cause chain above), and v0.58.0 (proposal 0070, conformance-adapter crash-injection and cause-chaining test vocabulary: a `crash_injection` directive and a recursive mock `cause`, with conformance fixtures 067 and 068, no library behavior change). `conformance.toml` records 0065, 0066, 0068, and 0070 as `implemented` and 0059 as `not-yet` (only its cross-spec flag rename was adopted). +- **Pinned spec advances v0.53.0 → v0.59.0 across the v0.14.0 cycle**, in six steps: v0.54.0 (proposal 0059, the observer-flag rename above), v0.55.1 (proposal 0065 above; the v0.55.1 patch also carries an observability §11 span-links text reconciliation that narrows an *Out of scope* bullet, with no python-observable change), v0.56.0 (proposal 0066, the fan-out degrade contribution above), v0.57.0 (proposal 0068, the failure-isolation cause chain above), v0.58.0 (proposal 0070, conformance-adapter crash-injection and cause-chaining test vocabulary: a `crash_injection` directive and a recursive mock `cause`, with conformance fixtures 067 and 068, no library behavior change), and v0.59.0 (proposal 0069, fan-out degrade contribution refinements to 0066: an omitted `extra_outputs` source is a positional null slot, an absent `collect_field` is a null slot the fan-in does not raise on except under a strict-element reducer, and a degraded slot survives resume; python already satisfied these, so the change is conformance coverage via fixture 069 plus a strict-reducer unit test, no library behavior change). 
`conformance.toml` records 0065, 0066, 0068, 0070, and 0069 as `implemented` and 0059 as `not-yet` (only its cross-spec flag rename was adopted). ### Fixed diff --git a/conformance.toml b/conformance.toml index be63bf1..37b5ab1 100644 --- a/conformance.toml +++ b/conformance.toml @@ -32,7 +32,7 @@ [manifest] implementation = "openarmature-python" -spec_pin = "v0.58.0" +spec_pin = "v0.59.0" # Status values: # implemented — shipped behavior matches the proposal's contract @@ -663,3 +663,19 @@ since = "0.14.0" [proposals."0070"] status = "implemented" since = "0.14.0" + +# Spec v0.59.0 (proposal 0069). Fan-out degrade contribution refinements +# (pipeline-utilities §9.3, refining 0066). Three refinements python already +# satisfied: (1) an omitted ``extra_outputs`` source contributes NULL at the +# instance's positional slot (index-aligned with target_field), not "not +# contributed"; (2) an absent ``collect_field`` on any fan-in path is a null +# slot and the fan-in MUST NOT raise -- with the caveat that under a strict- +# element reducer (``concat_flatten`` / ``merge_all``) a null contribution +# still raises ``ReducerError`` (python does not suppress it; the reducer runs +# in the engine merge); (3) a degraded slot survives a checkpoint + resume +# round-trip. No library behavior change. Fixture 069's FI-degrade cases run +# in test_pipeline_utilities, its crash_injection/resume case in +# test_checkpoint; the strict-reducer caveat has a focused unit test. 
+[proposals."0069"] +status = "implemented" +since = "0.14.0" diff --git a/openarmature-spec b/openarmature-spec index e9b2bcc..972abc5 160000 --- a/openarmature-spec +++ b/openarmature-spec @@ -1 +1 @@ -Subproject commit e9b2bcc0ba6906fa441a6411973b2b7bef0f7152 +Subproject commit 972abc54be00465dce3a8573f350ba6b80aa5523 diff --git a/pyproject.toml b/pyproject.toml index 5af8dc7..145d6fe 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -63,7 +63,7 @@ Specification = "https://github.com/LunarCommand/openarmature-spec" openarmature = "openarmature.cli:main" [tool.openarmature] -spec_version = "0.58.0" +spec_version = "0.59.0" [dependency-groups] dev = [ diff --git a/src/openarmature/AGENTS.md b/src/openarmature/AGENTS.md index 8d8ad7e..45ca4fb 100644 --- a/src/openarmature/AGENTS.md +++ b/src/openarmature/AGENTS.md @@ -1,6 +1,6 @@ # OpenArmature — Agent documentation -*This is the agent guide bundled with the openarmature Python package, version 0.13.0 (spec v0.58.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.* +*This is the agent guide bundled with the openarmature Python package, version 0.13.0 (spec v0.59.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.* ## TL;DR @@ -10,7 +10,7 @@ OpenArmature is a workflow framework for LLM pipelines and tool-calling agents: ## Capability contracts -_Sourced from openarmature-spec v0.58.0. 
Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._ +_Sourced from openarmature-spec v0.59.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._ ### Capability: `graph-engine` diff --git a/src/openarmature/__init__.py b/src/openarmature/__init__.py index 6e48503..445220b 100644 --- a/src/openarmature/__init__.py +++ b/src/openarmature/__init__.py @@ -25,7 +25,7 @@ """ __version__ = "0.13.0" -__spec_version__ = "0.58.0" +__spec_version__ = "0.59.0" # Proposal 0052 (spec observability §5.1 / §8.4.1): canonical # package-registry name for this implementation. Surfaces on every # OTel invocation span as ``openarmature.implementation.name`` and on diff --git a/tests/conformance/test_checkpoint.py b/tests/conformance/test_checkpoint.py index 5094b86..516af65 100644 --- a/tests/conformance/test_checkpoint.py +++ b/tests/conformance/test_checkpoint.py @@ -49,6 +49,7 @@ NodePosition, ) from openarmature.graph import ( + FailureIsolationMiddleware, RuntimeGraphError, State, ) @@ -68,8 +69,11 @@ # rather than relying on the test runner's file-glob to filter the # missing fixture out. 
067 (crash-injection fan-out resume, proposal # 0070) is a crash/resume fixture this runner owns; it joined at v0.58.0. +# 069 (fan-out degrade refinements, proposal 0069, v0.59.0) is a mixed +# fixture: this runner drives its crash_injection/resume case and skips the +# plain FI-degrade cases (owned by test_pipeline_utilities.py). _CHECKPOINT_FIXTURE_NUMBERS: frozenset[int] = frozenset( - (set(range(24, 32)) - {28}) | set(range(48, 57)) | {67} + (set(range(24, 32)) - {28}) | set(range(48, 57)) | {67, 69} ) # Fixtures that need resume-aware test seams the conformance adapter @@ -277,12 +281,31 @@ async def test_checkpoint_fixture(fixture_path: Path) -> None: ) spec = _load(fixture_path) if "cases" in spec: + cases_run = 0 for case in cast("list[dict[str, Any]]", spec["cases"]): case_name = case.get("name", "") + # This runner drives the checkpoint cases. A mixed fixture (069) + # interleaves plain FI-degrade cases owned by + # test_pipeline_utilities.py; skip a case with no checkpoint + # concern. The marker is checkpointer / resume / crash_injection — + # NOT resume alone: fixtures like 024 / 026 / 030 / 055 assert + # checkpoint behavior (saves, record shape, not-found, + # schema_version) with a checkpointer but no resume. + if not any(k in case for k in ("checkpointer", "resume", "crash_injection")): + continue + cases_run += 1 try: await _run_one_case(case, top_level=spec) except AssertionError as e: raise AssertionError(f"case {case_name!r}: {e}") from e + # A cases-shaped fixture in this runner's set that drives zero cases + # (all skipped as non-checkpoint) would pass vacuously; fail loudly + # instead so a routing mistake surfaces. + assert cases_run > 0, ( + f"{fixture_id}: cases-shaped fixture drove zero cases in this runner " + f"(all skipped as non-checkpoint). Fix the routing or remove it from " + f"_CHECKPOINT_FIXTURE_NUMBERS." 
+ ) return await _run_one_case(spec, top_level=spec) @@ -367,6 +390,56 @@ def _find_crash_injection(spec: Mapping[str, Any]) -> tuple[int | None, str | No return None, None, None +def _translate_fi_instance_middleware( + spec: Mapping[str, Any], +) -> dict[str, list[FailureIsolationMiddleware]]: + """Translate a fan-out node's ``instance_middleware: [failure_isolation]`` + into FailureIsolationMiddleware instances keyed by node name, for + build_graph's ``fan_out_instance_middleware``. Scoped to the static + ``degraded_update`` mapping form (the only shape the checkpoint fixtures + use, e.g. fixture 069 Case 3's degrade-survives-resume); the callable + forms are owned by test_pipeline_utilities.py, which drives the plain + FI-degrade cases.""" + out: dict[str, list[FailureIsolationMiddleware]] = {} + nodes = cast("dict[str, dict[str, Any]]", spec.get("nodes") or {}) + for node_name, node_spec in nodes.items(): + fan_out = node_spec.get("fan_out") + if not isinstance(fan_out, dict): + continue + entries = cast( + "list[dict[str, Any]]", + cast("Mapping[str, Any]", fan_out).get("instance_middleware") or [], + ) + mws: list[FailureIsolationMiddleware] = [] + for entry in entries: + # Only failure_isolation is translated here. Other instance + # middleware (e.g. fixture 053's retry) is left unwired, as this + # runner did before — those fixtures drive their behavior via + # flaky_per_index seams, not a wired middleware. 
+ if entry.get("type") != "failure_isolation": + continue + if "degraded_update" not in entry: + raise ValueError( + f"fan-out node {node_name!r}: failure_isolation instance middleware " + f"entry is missing the required 'degraded_update'" + ) + degraded = entry["degraded_update"] + if not isinstance(degraded, dict): + raise ValueError( + f"fan-out node {node_name!r}: checkpoint runner supports only the static " + f"degraded_update form for instance middleware" + ) + mws.append( + FailureIsolationMiddleware( + degraded_update=dict(cast("Mapping[str, Any]", degraded)), + event_name=entry.get("event_name", "degraded"), + ) + ) + if mws: + out[node_name] = mws + return out + + def _strip_abort_directive(spec: Mapping[str, Any]) -> Mapping[str, Any]: """Return a fresh spec dict with any ``abort_after_instance`` directive removed from fan-out nodes. The engine doesn't recognize @@ -421,6 +494,7 @@ async def _run_one_case(spec: Mapping[str, Any], *, top_level: Mapping[str, Any] trace=trace, flaky_per_index_attempt_recorders=flaky_per_index_recorders, instance_execution_recorders=instance_execution_recorders, + fan_out_instance_middleware=_translate_fi_instance_middleware(sanitized_spec), ) builder = built.builder diff --git a/tests/conformance/test_pipeline_utilities.py b/tests/conformance/test_pipeline_utilities.py index c755fdc..b488fbf 100644 --- a/tests/conformance/test_pipeline_utilities.py +++ b/tests/conformance/test_pipeline_utilities.py @@ -84,15 +84,16 @@ def _load(path: Path) -> dict[str, Any]: # the `cases:` shape carries seeded-record + migrations + resume blocks. _LAST_DRIVEN_FIXTURE = 38 -# Failure-isolation fixtures (058-066 + 068, proposals 0050 §6.3 / 0065 / -# 0066 / 0068 / 0070) are middleware fixtures this runner handles. 
They sit -# past _LAST_DRIVEN_FIXTURE only because the 039-057 range (state migration / -# checkpoint fan-out) is owned by dedicated runners (test_state_migration.py -# / test_checkpoint.py), not because this runner can't drive them. Fixture -# 066 (cause chain, 0068) joined at v0.57.0; 068 (failure-mock cause chain, -# 0070) at v0.58.0. Fixture 067 (crash-injection fan-out resume) is a -# checkpoint fixture owned by test_checkpoint.py, hence the gap at 67. -_FAILURE_ISOLATION_FIXTURES = frozenset(range(58, 67)) | {68} +# Failure-isolation fixtures (058-066, 068, 069, proposals 0050 §6.3 / 0065 / +# 0066 / 0068 / 0070 / 0069) are middleware fixtures this runner handles. They +# sit past _LAST_DRIVEN_FIXTURE only because the 039-057 range (state migration +# / checkpoint fan-out) is owned by dedicated runners (test_state_migration.py +# / test_checkpoint.py), not because this runner can't drive them. Fixture 066 +# (cause chain, 0068) joined at v0.57.0; 068 (failure-mock cause chain, 0070) +# at v0.58.0; 069 (fan-out degrade refinements, 0069) at v0.59.0 — this runner +# drives its FI-degrade cases and skips its crash_injection/resume case (owned +# by test_checkpoint.py, which also owns fixture 067, hence the gap at 67). +_FAILURE_ISOLATION_FIXTURES = frozenset(range(58, 67)) | {68, 69} def _fixture_paths() -> list[Path]: @@ -541,8 +542,15 @@ async def test_pipeline_utility_fixture( shared_subgraph_blocks = { k: spec[k] for k in ("subgraph", "subgraph_with_idx", "subgraphs") if k in spec } + cases_run = 0 for case in spec["cases"]: case_name = case.get("name", "") + # Checkpoint-concern cases (fixture 069 Case 3) are owned by + # test_checkpoint.py; this runner skips them. The marker mirrors + # that runner's: checkpointer / resume / crash_injection. 
+ if any(k in case for k in ("checkpointer", "resume", "crash_injection")): + continue + cases_run += 1 merged: dict[str, Any] = dict(case) # Compile-error cases (065 Case 2) nest the graph under ``graph:`` # (the graph-engine fixture 007 convention) so it sits beside @@ -557,6 +565,14 @@ async def test_pipeline_utility_fixture( await _run_one(merged, monkeypatch) except AssertionError as e: raise AssertionError(f"case {case_name!r}: {e}") from e + # A cases-shaped fixture in this runner's set that drives zero cases + # (all skipped as checkpoint-owned) would pass vacuously; fail loudly + # instead so a routing mistake surfaces. + assert cases_run > 0, ( + f"{fixture_id}: cases-shaped fixture drove zero cases in this runner " + f"(all skipped as checkpoint-owned). Fix the routing or remove it from " + f"_FAILURE_ISOLATION_FIXTURES." + ) return if (hit := _unsupported_directive(spec)) is not None: diff --git a/tests/test_smoke.py b/tests/test_smoke.py index a121eed..3609962 100644 --- a/tests/test_smoke.py +++ b/tests/test_smoke.py @@ -9,7 +9,7 @@ def test_package_versions() -> None: assert openarmature.__version__ == "0.13.0" - assert openarmature.__spec_version__ == "0.58.0" + assert openarmature.__spec_version__ == "0.59.0" def test_spec_version_matches_pyproject() -> None: diff --git a/tests/unit/test_fan_out.py b/tests/unit/test_fan_out.py index 6a4dbae..bff1688 100644 --- a/tests/unit/test_fan_out.py +++ b/tests/unit/test_fan_out.py @@ -33,14 +33,17 @@ from openarmature.graph import ( END, CompiledGraph, + FailureIsolationMiddleware, FanOutCountModeAmbiguous, FanOutFieldNotList, GraphBuilder, NodeException, + ReducerError, RetryConfig, RetryMiddleware, State, append, + concat_flatten, deterministic_backoff, ) @@ -365,6 +368,55 @@ async def maybe_fail(state: WorkerState) -> Mapping[str, Any]: assert excinfo.value.recoverable_state.results == [] +class _StrictReducerParentState(State): + items: list[int] = Field(default_factory=list[int]) + # concat_flatten 
requires every collected element to be a list; a degrade + # that nulls the slot contributes None, which the reducer rejects. + results: Annotated[list[int], concat_flatten] = Field(default_factory=list[int]) + + +async def test_degrade_null_slot_under_strict_reducer_raises_reducer_error() -> None: + # Proposal 0069 refinement (2) caveat: an absent collect_field is a + # graceful null slot and the fan-in does not raise, but under a + # strict-element reducer (concat_flatten / merge_all) the null contribution + # still raises ReducerError. The degrade-path .get() null is not suppressed + # because the reducer runs in the engine merge, downstream of the fan-in. A + # callable degrade is used because a static degrade omitting collect_field + # is a compile error (proposal 0066). + async def always_fail(_state: WorkerState) -> Mapping[str, Any]: + raise RuntimeError("instance down") + + inner_builder: GraphBuilder[WorkerState] = GraphBuilder(WorkerState) + inner_builder.set_entry("compute") + inner_builder.add_node("compute", always_fail) + inner_builder.add_edge("compute", END) + inner = inner_builder.compile() + + builder: GraphBuilder[_StrictReducerParentState] = GraphBuilder(_StrictReducerParentState) + builder.set_entry("process") + builder.add_fan_out_node( + "process", + subgraph=inner, + items_field="items", + item_field="item", + collect_field="result", + target_field="results", + instance_middleware=( + FailureIsolationMiddleware( + # Callable degrade omitting collect_field -> runtime null slot. + degraded_update=lambda _state: {}, + event_name="degraded", + ), + ), + ) + builder.add_edge("process", END) + compiled = builder.compile() + + with pytest.raises(ReducerError): + await compiled.invoke(_StrictReducerParentState(items=[0])) + await compiled.drain() + + class CollectParentState(State): items: list[int] = Field(default_factory=list[int]) results: Annotated[list[int], append] = Field(default_factory=list[int])
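
Reviewer note: both conformance runners in this diff split the cases of a mixed fixture (069) using the same marker keys, and each fails loudly if it ends up driving zero cases. The sketch below illustrates that routing rule in isolation; it is not the runners' actual code. The marker tuple and the guard's intent come from the diff. The helper names (`is_checkpoint_case`, `route_cases`, `require_cases_run`) and the sample case dictionaries are hypothetical, since the real fixture 069 content does not appear in this diff.

```python
from typing import Any, Mapping, Sequence

# Marker keys taken from the diff: a case carrying any of these has a
# checkpoint concern. A checkpointer with no resume still counts.
CHECKPOINT_MARKERS = ("checkpointer", "resume", "crash_injection")


def is_checkpoint_case(case: Mapping[str, Any]) -> bool:
    """Hypothetical helper: True when the case belongs to the checkpoint runner."""
    return any(key in case for key in CHECKPOINT_MARKERS)


def route_cases(
    cases: Sequence[Mapping[str, Any]],
) -> tuple[list[Mapping[str, Any]], list[Mapping[str, Any]]]:
    """Hypothetical helper: split cases into (checkpoint-owned, pipeline-owned)."""
    checkpoint = [c for c in cases if is_checkpoint_case(c)]
    pipeline = [c for c in cases if not is_checkpoint_case(c)]
    return checkpoint, pipeline


def require_cases_run(fixture_id: str, cases_run: int) -> None:
    """Hypothetical helper mirroring the vacuous-pass guard added in the diff."""
    assert cases_run > 0, (
        f"{fixture_id}: cases-shaped fixture drove zero cases in this runner"
    )


# Illustrative stand-in for a mixed fixture. Case names and values are invented.
sample_cases = [
    {"name": "omitted_extra_outputs_is_null_slot"},
    {"name": "absent_collect_field_is_null_slot"},
    {"name": "degraded_slot_survives_resume", "crash_injection": {}, "resume": True},
]

checkpoint_cases, pipeline_cases = route_cases(sample_cases)
require_cases_run("sample-fixture", len(checkpoint_cases))
require_cases_run("sample-fixture", len(pipeline_cases))
```

Because the two runners apply complementary predicates over the same keys, every case in a mixed fixture is driven by exactly one of them, provided both keep the marker tuple in sync.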