Skip to content

Commit 22fc2fd

Browse files
Sweep docs em dashes and stale refs (#113)
Three definite stale references fixed: - docs/examples/10-langfuse-observability.md: spec_version='0.26.0' in example trace output now reads '0.38.0' (current pin). - docs/concepts/parallel-branches.md: dropped the dangling "v0.16.1" qualifier from the retry attempt_index propagation reference. Behavior is current; the version pin was leftover. - docs/agent/non-obvious-shapes.md: compiled.attach_observer corrected to graph.attach_observer for variable-name consistency with the rest of the docs. Em-dash sweep across the user-facing docs: 130 instances removed across 17 files. Per-instance replacement was contextual (colons, semicolons, restructured asides) to keep prose natural. Reference sections that listed cross-links as "[link] -- description" now use "[link]: description"; inline emphasis dashes drop into the surrounding clause. Regenerated AGENTS.md and the _patterns/ mirror; mkdocs strict build remains clean. Spec section-number references (e.g. observability §5.5, §8.4.1) were intentionally not verified against the current v0.38.0 submodule; deferred to the next spec-bump release prep.
1 parent c1b2f23 commit 22fc2fd

24 files changed

Lines changed: 214 additions & 210 deletions

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,10 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
66

77
## [Unreleased]
88

9+
### Changed
10+
11+
- **Docs sweep: stale references and em-dash normalization.** Fixed three definite stale references (`spec_version='0.26.0'` in the Langfuse example output now reads `'0.38.0'`; the dangling `v0.16.1` qualifier dropped from the parallel-branches concept page; `compiled.attach_observer` corrected to `graph.attach_observer` in `non-obvious-shapes.md` for variable-name consistency with the rest of the docs). Swept em dashes out of the user-facing docs (130 instances across 17 files) per the convention set during the patterns expansion. mkdocs strict build clean; no broken intra-docs links.
12+
913
### Added
1014

1115
- **vLLM production deployment notes.** `docs/model-providers/vllm.md` grows a "Production deployment" section covering the `VLLM_HTTP_TIMEOUT_KEEP_ALIVE` gotcha (vLLM's stock 5s uvicorn keep-alive lapses pooled OA-side httpx connections and surfaces as `ProviderUnavailable`; widen to roughly 300s), a systemd unit skeleton, and the three throughput knobs that interact with OA's shared connection pool (`--max-model-len`, `--max-num-seqs`, `--gpu-memory-utilization`). The existing "Tool calling" section grows a `--tool-call-parser` family table verified against vLLM's docs (Llama 3.x / Llama 4 / Mistral / Hermes / Qwen3 / DeepSeek V3 / GPT-OSS), plus explicit "not supported here" callouts for Anthropic / Gemini (proprietary cloud) and mainstream Gemma (no vLLM parser).

docs/agent/non-obvious-shapes.md

Lines changed: 31 additions & 31 deletions
Large diffs are not rendered by default.

docs/agent/tldr.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,3 @@
1-
OpenArmature is a workflow framework for LLM pipelines and tool-calling agents typed state, compile-time topology checks, observability, and crash-safe checkpoints baked into a graph engine. The graph layer has no concept of LLMs or tools; the same primitives drive deterministic ETL pipelines and tool-calling agents alike. Nodes return partial updates; the engine merges into a frozen state snapshot. Behavior is defined by [openarmature-spec](https://openarmature.org/capabilities/) and verified by conformance fixtures; this package is the reference Python implementation.
1+
OpenArmature is a workflow framework for LLM pipelines and tool-calling agents: typed state, compile-time topology checks, observability, and crash-safe checkpoints baked into a graph engine. The graph layer has no concept of LLMs or tools; the same primitives drive deterministic ETL pipelines and tool-calling agents alike. Nodes return partial updates; the engine merges into a frozen state snapshot. Behavior is defined by [openarmature-spec](https://openarmature.org/capabilities/) and verified by conformance fixtures; this package is the reference Python implementation.
22

33
**What OpenArmature is NOT:** not a chat framework (no built-in messages channel), not an LLM SDK (Provider is the abstraction layer; OpenAIProvider is the canonical impl), not a state-management library (state is per-invocation, not application-wide), not an evaluation framework (deferred to `openarmature-eval`).

docs/concepts/checkpointing.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -117,7 +117,7 @@ Field framing worth getting right:
117117
per-instance entry carries an explicit `result_is_error` boolean
118118
that discriminates success contributions (roll forward into
119119
`target_field`) from `collect`-mode error contributions (roll
120-
forward into `errors_field`) — the engine reads the explicit field
120+
forward into `errors_field`). The engine reads the explicit field
121121
on resume rather than inferring routing from the shape of `result`.
122122
Empty tuple when no fan-outs are in flight. See
123123
[Resume semantics](fan-out.md#resume-semantics) on the fan-out
@@ -222,7 +222,7 @@ deserializes the result into your current state class.
222222

223223
**Canonical source for `schema_version`.** The framework reads
224224
`schema_version` from the state class declared at graph construction
225-
time the class passed to `GraphBuilder(...)`. If you pass a State
225+
time: the class passed to `GraphBuilder(...)`. If you pass a State
226226
subclass instance at runtime whose `schema_version` shadows the
227227
declared class's value, the saved record still carries the declared
228228
class's value. This rule keeps every save site within an invocation

docs/concepts/fan-out.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -111,15 +111,15 @@ into prior state:
111111
declare `append` on `Annotated[list[X], append]`. Each instance's
112112
value is already an `X`; `append` concatenates cleanly.
113113
- Each instance emits a `list[X]` (0..N records per instance) → the
114-
engine lands `list[list[X]]`. Declare `concat_flatten` instead
114+
engine lands `list[list[X]]`. Declare `concat_flatten` instead;
115115
it flattens one level so the parent field stays `list[X]`. Plain
116116
`append` would leave the nesting and fail Pydantic validation.
117117
- Each instance emits a `dict[str, X]` → the engine lands
118118
`list[dict]`. Declare `merge_all`, which folds the mappings into
119119
the parent dict with last-write-wins per key. Plain `merge` can't
120120
consume a `list[dict]`.
121121

122-
`concat_flatten` and `merge_all` are strict they raise
122+
`concat_flatten` and `merge_all` are strict: they raise
123123
`ReducerError` if an update element isn't the expected list/mapping
124124
shape. See [state and reducers](state-and-reducers.md#five-built-in-reducers).
125125

docs/concepts/llms.md

Lines changed: 13 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -99,42 +99,42 @@ async def startup() -> None:
9999
try:
100100
await provider.ready()
101101
except ProviderAuthentication:
102-
# Bad API key fail fast at boot.
102+
# Bad API key: fail fast at boot.
103103
raise
104104
except ProviderInvalidModel:
105-
# Bound model isn't served by this endpoint same.
105+
# Bound model isn't served by this endpoint: same.
106106
raise
107107
except ProviderUnavailable:
108-
# Endpoint is down or unreachable fail fast too.
108+
# Endpoint is down or unreachable: fail fast too.
109109
raise
110110
```
111111

112112
`OpenAIProvider` ships three probe shapes selected via the
113113
`readiness_probe` constructor kwarg:
114114

115-
- **`"chat_completions"`** (default) issues `POST /v1/chat/completions`
115+
- **`"chat_completions"`** (default): issues `POST /v1/chat/completions`
116116
with a `max_tokens=1` body. Actually exercises the inference wire
117117
path. Strongest signal at the cost of one prompt's worth of tokens
118118
on cloud endpoints.
119-
- **`"models"`** issues `GET /v1/models` and verifies the bound
119+
- **`"models"`**: issues `GET /v1/models` and verifies the bound
120120
model appears in the catalog. Cheaper (no completion billing) but
121121
blind to proxy wire-mismatch cases: some OpenAI-compatible proxies
122122
(Bifrost is the motivating example) serve `/v1/models` correctly
123123
while 405'ing the completions endpoint, so a green catalog probe
124124
doesn't prove `complete()` will work.
125-
- **`"both"`** runs the catalog probe first (cheap fail-fast on
125+
- **`"both"`**: runs the catalog probe first (cheap fail-fast on
126126
model-not-in-catalog with the cleaner `seen_ids` diagnostic), then
127127
the chat probe. Strongest signal at double the round-trip cost.
128128

129129
```python
130-
# Local server (LM Studio, vLLM, llama.cpp) chat probe is free.
130+
# Local server (LM Studio, vLLM, llama.cpp): chat probe is free.
131131
provider = OpenAIProvider(
132132
base_url="http://localhost:8000",
133133
model="qwen2.5-coder",
134134
readiness_probe="chat_completions", # default
135135
)
136136

137-
# Cloud endpoint, cost-sensitive opt back into the catalog-only probe.
137+
# Cloud endpoint, cost-sensitive: opt back into the catalog-only probe.
138138
provider = OpenAIProvider(
139139
base_url="https://api.openai.com",
140140
model="gpt-4o-mini",
@@ -342,14 +342,14 @@ shape.
342342
By default the model decides whether and which tools to call.
343343
`tool_choice` constrains that decision per call. Four modes:
344344

345-
- `"auto"` the model decides. Equivalent to omitting the parameter
345+
- `"auto"`: the model decides. Equivalent to omitting the parameter
346346
when `tools` is non-empty.
347-
- `"required"` the model MUST call at least one tool. Useful for
347+
- `"required"`: the model MUST call at least one tool. Useful for
348348
routing nodes that branch on tool selection.
349-
- `"none"` the model MUST NOT call tools, even if `tools` is
349+
- `"none"`: the model MUST NOT call tools, even if `tools` is
350350
supplied. Useful for guarded LLM calls or for explicitly disabling
351351
tool-calling without rebuilding a tools-less request.
352-
- `ForceTool(name=...)` the model MUST call the named tool exactly.
352+
- `ForceTool(name=...)`: the model MUST call the named tool exactly.
353353

354354
Pre-send validation catches the three failure modes (`required` with
355355
empty tools, `ForceTool` with empty tools, `ForceTool.name` not in
@@ -371,7 +371,7 @@ response = await provider.complete(
371371
)
372372
```
373373

374-
Not all providers honor `tool_choice` confirm with your provider's
374+
Not all providers honor `tool_choice`; confirm with your provider's
375375
documentation. The `OpenAIProvider` maps the spec shape onto OpenAI's
376376
wire shape per the §8.1.1 mapping table. Whether the model actually
377377
honored the constraint is observable from the returned

docs/concepts/observability.md

Lines changed: 29 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -244,16 +244,16 @@ or missing log entries.
244244
### Bounded drain (optional timeout)
245245

246246
`drain()` accepts an optional `timeout` parameter (non-negative
247-
seconds) `await compiled.drain(timeout=5.0)` bounds the wait at five
247+
seconds): `await compiled.drain(timeout=5.0)` bounds the wait at five
248248
seconds. When the deadline fires, in-flight workers are cancelled
249-
cleanly so the compiled graph stays usable for subsequent invocations
250-
partial delivery state from one drain does NOT leak into the next.
249+
cleanly so the compiled graph stays usable for subsequent invocations;
250+
partial delivery state from one drain does NOT leak into the next.
251251

252252
The returned `DrainSummary` carries:
253253

254-
- `timeout_reached: bool` `True` only when the timeout actually
254+
- `timeout_reached: bool`: `True` only when the timeout actually
255255
fired. A drain that finishes before the deadline reports `False`.
256-
- `undelivered_count: int` events dispatched but not fully delivered
256+
- `undelivered_count: int`: events dispatched but not fully delivered
257257
to every subscribed observer before the deadline. Always `0` when
258258
`timeout_reached is False`.
259259

@@ -305,8 +305,8 @@ the IDs explicitly.
305305
## Caller-supplied invocation metadata
306306

307307
`correlation_id` is one string; if you also need to attach
308-
business-domain identifiers tenant IDs, request IDs, feature
309-
flags, A/B cohort labels pass them as a structured mapping at
308+
business-domain identifiers (tenant IDs, request IDs, feature
309+
flags, A/B cohort labels), pass them as a structured mapping at
310310
`invoke()` time:
311311

312312
```python
@@ -324,7 +324,7 @@ await compiled.invoke(
324324
Every observability backend picks the entries up:
325325

326326
- **OTel** emits each entry as an `openarmature.user.<key>`
327-
cross-cutting span attribute on every span invocation, node,
327+
cross-cutting span attribute on every span: invocation, node,
328328
subgraph wrapper, fan-out instance, LLM provider, retry attempt.
329329
Backends that consume OTel attributes (Phoenix / Arize, Honeycomb,
330330
Datadog APM, HyperDX, Grafana Tempo, custom collectors) see them
@@ -345,7 +345,7 @@ Two rules:
345345
`int`, `float`, `bool`) or homogeneous arrays of those types.
346346
`None`, nested objects, and mixed-type arrays are rejected.
347347

348-
Violations raise `ValueError` synchronously no spans emitted, no
348+
Violations raise `ValueError` synchronously: no spans emitted, no
349349
work runs.
350350

351351
### Adding entries mid-invocation
@@ -371,7 +371,7 @@ node's `started`, any LLM call inside) pick up the new entries.
371371
**Per-async-context scoping.** The metadata mapping lives in a
372372
`ContextVar`, which Python copies on async-task creation. Fan-out
373373
instances and parallel-branches each receive their own copy at
374-
dispatch time an instance that calls `set_invocation_metadata`
374+
dispatch time; an instance that calls `set_invocation_metadata`
375375
does NOT leak its augmentation to sibling instances. This is the
376376
canonical pattern for per-instance identifiers:
377377

@@ -472,7 +472,7 @@ Cross-vendor attribute names every LLM-aware backend reads
472472
(Langfuse, Phoenix, Honeycomb's LLM lens, OpenInference-aware
473473
tools). Emitted alongside the OA namespace:
474474

475-
- `gen_ai.system` `"openai"` by default; override per provider
475+
- `gen_ai.system`: `"openai"` by default; override per provider
476476
instance to `"vllm"` / `"lm_studio"` / `"llama_cpp"` / etc. when
477477
the OpenAI Chat Completions wire format is hitting a non-OpenAI
478478
endpoint:
@@ -485,16 +485,16 @@ tools). Emitted alongside the OA namespace:
485485
)
486486
```
487487

488-
- `gen_ai.request.model` / `gen_ai.response.model` the bound
488+
- `gen_ai.request.model` / `gen_ai.response.model`: the bound
489489
model and (when the provider returns one) the more-specific
490490
identifier in the response body.
491491
- `gen_ai.request.temperature` / `max_tokens` / `top_p` / `seed` /
492-
`frequency_penalty` / `presence_penalty` / `stop_sequences`
492+
`frequency_penalty` / `presence_penalty` / `stop_sequences`:
493493
only emitted for fields the caller actually set; absence on
494494
the span means "not supplied," distinct from a zero value.
495-
- `gen_ai.usage.input_tokens` / `output_tokens` token counts.
496-
- `gen_ai.response.finish_reasons` single-element string array.
497-
- `gen_ai.response.id` when the provider returns one.
495+
- `gen_ai.usage.input_tokens` / `output_tokens`: token counts.
496+
- `gen_ai.response.finish_reasons`: single-element string array.
497+
- `gen_ai.response.id`: when the provider returns one.
498498

499499
Disable the GenAI semconv set with `OTelObserver(disable_genai_semconv=True)`
500500
when an external auto-instrumentation library (OpenInference,
@@ -515,12 +515,12 @@ observer = OTelObserver(
515515

516516
This surfaces three attributes:
517517

518-
- `openarmature.llm.input.messages` JSON-encoded message array
518+
- `openarmature.llm.input.messages`: JSON-encoded message array
519519
(the spec §3 message shape: `{role, content, tool_calls?, …}`).
520-
- `openarmature.llm.output.content` the assistant's response
520+
- `openarmature.llm.output.content`: the assistant's response
521521
content string verbatim. Omitted for tool-call-only responses
522522
with empty content.
523-
- `openarmature.llm.request.extras` JSON-encoded `RuntimeConfig`
523+
- `openarmature.llm.request.extras`: JSON-encoded `RuntimeConfig`
524524
extras bag (provider-specific pass-through fields like
525525
`repetition_penalty` for vLLM, or `top_k` for HuggingFace
526526
endpoints). Omitted when empty.
@@ -543,7 +543,7 @@ that fits within `cap - len(marker)` bytes followed by the marker:
543543
```
544544

545545
where M is the pre-truncation byte length. The marker is appended
546-
outside any JSON encoding a truncated attribute is *not* parseable
546+
outside any JSON encoding, so a truncated attribute is *not* parseable
547547
JSON, which is the clean signal backend code can use to detect
548548
truncation without a separate flag.
549549

@@ -563,7 +563,7 @@ provider, *before* the payload reaches the observer:
563563

564564
The `media_type` and `detail` fields are preserved at the image-block
565565
level (per llm-provider §3.1.2); only `source` is replaced. URL-form
566-
images pass through unchanged the URL is a short string and is
566+
images pass through unchanged: the URL is a short string and is
567567
informative for trace readers.
568568

569569
Redaction is **not** gated by `disable_llm_payload` and is **not**
@@ -626,7 +626,7 @@ observer = OTelObserver(
626626
```
627627

628628
Each enricher receives the live `Span` plus the `NodeEvent` that
629-
triggered the close (or `None` on synthetic close sites subgraph
629+
triggered the close (or `None` on synthetic close sites: subgraph
630630
dispatch, detached root, fan-out instance, invocation span,
631631
shutdown drain). Setting attributes inside this hook works
632632
correctly; doing it from a `SpanProcessor.on_end` callback does
@@ -668,9 +668,9 @@ full pattern.
668668

669669
`OTelObserver.shutdown()` calls `provider.shutdown()` on the private
670670
`TracerProvider`, which per OTel SDK contract flushes every
671-
registered span processor. Under unusual teardown orderings for
671+
registered span processor. Under unusual teardown orderings (for
672672
example, FastAPI's `TestClient` teardown that closes the event loop
673-
before a `BatchSpanProcessor`'s export thread finishes spans can
673+
before a `BatchSpanProcessor`'s export thread finishes), spans can
674674
appear dropped. Two workarounds:
675675

676676
- Call `observer._provider.force_flush(timeout_millis=...)`
@@ -682,7 +682,7 @@ appear dropped. Two workarounds:
682682
## Langfuse mapping (opt-in)
683683

684684
A second sibling observer maps the same `NodeEvent` stream onto
685-
Langfuse's native Trace + Observation data model Traces at the
685+
Langfuse's native Trace + Observation data model: Traces at the
686686
top, Span observations for graph nodes, Generation observations for
687687
LLM calls. Use it instead of (or alongside) the OTel observer when
688688
your trace UI is Langfuse and you want first-class Generation
@@ -699,7 +699,7 @@ observer = LangfuseObserver(client=client)
699699
graph.attach_observer(observer)
700700
```
701701

702-
The `client` is anything matching the `LangfuseClient` Protocol
702+
The `client` is anything matching the `LangfuseClient` Protocol:
703703
the bundled `InMemoryLangfuseClient` (used by the conformance
704704
harness, useful for unit tests), or a real `langfuse.Langfuse()`
705705
instance wrapped in `LangfuseSDKAdapter` for production. Install
@@ -749,7 +749,7 @@ for a runnable demo.
749749
matching the `LangfuseClient` Protocol's four methods.
750750

751751
A runtime `isinstance(adapter, LangfuseClient)` check ships in
752-
the unit suite if a future v4 patch breaks the Protocol's
752+
the unit suite, so if a future v4 patch breaks the Protocol's
753753
surface, the test fails loudly.
754754

755755
### What Langfuse sees
@@ -772,7 +772,7 @@ for a runnable demo.
772772

773773
### Payload + truncation
774774

775-
`disable_llm_payload` mirrors the OTel observer's flag defaults
775+
`disable_llm_payload` mirrors the OTel observer's flag and defaults
776776
to `True` for the same privacy reason. Flip to `False` to populate
777777
`generation.input` / `output` / `metadata.request_extras` from the
778778
LLM event payload.
@@ -804,7 +804,7 @@ the Generation observation links to that entity natively (spec
804804

805805
The two observers are independent §6 event consumers and can be
806806
attached together. They share the `correlation_id` as the
807-
cross-backend join key find a slow Generation in Langfuse, search
807+
cross-backend join key: find a slow Generation in Langfuse, search
808808
for its `correlation_id` in OTel logs, see the surrounding
809809
infrastructure activity.
810810

docs/concepts/parallel-branches.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -105,7 +105,7 @@ wrap that branch's whole subgraph invocation as a unit. Retry
105105
middleware on a branch retries the **whole branch**: a fresh
106106
subgraph invocation each time, fresh inner-node execution. The
107107
wrapping retry's attempt counter propagates to events emitted from
108-
inner nodes (per graph-engine §6 v0.16.1), so observer events
108+
inner nodes (per graph-engine §6), so observer events
109109
inside the branch correctly show `attempt_index` ticking across
110110
retries.
111111

0 commit comments

Comments
 (0)