Skip to content

Commit 2434876

Browse files
dougborgclaude
andcommitted
docs: add subsystem-local docs for progressive-discovery refactor
Create three new topical docs that will absorb content currently inlined in CLAUDE.md, so agents only pay the context cost for the subsystem they're actually touching: - katana_mcp_server/docs/prefab/README.md — DataTable mustache binding, browser-test wire shape via make_tool_result, register_preview_tool + meta=UI_META contract, help.py drift. - katana_mcp_server/docs/typed_cache/README.md — archive/deleted opt-in flags + derived bools, cross-entity ID collisions, Cached<Name> siblings (don't pollute API spec). - katana_public_api_client/docs/spec-authoring.md — OpenAPI 3.1 conventions, use-site descriptions, ListResponse schema, generator/spec regen lockstep, privacy, fix-at-source rule. Extend two existing docs with content that didn't have a topical home yet: - katana_public_api_client/docs/guide.md — new "Response Handling" section with unwrap helpers, when-to-use table, anti-patterns, and exception hierarchy. Also fix four stale status_code == 200 example blocks in this file to use unwrap_as / unwrap_data / is_success per the rule we're now codifying. - .github/agents/guides/shared/COMMIT_STANDARDS.md — add First-Push Safety (HEAD:refs/heads/<name> form), Lockfile Drift (uv.lock bundling), and Generator/spec regen lockstep sections. No content is removed from CLAUDE.md yet; that move lands in the next commit so the two halves of the refactor are reviewable independently. Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 0c6fde7 commit 2434876

5 files changed

Lines changed: 518 additions & 47 deletions

File tree

.github/agents/guides/shared/COMMIT_STANDARDS.md

Lines changed: 82 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -215,28 +215,27 @@ Before version 1.0.0, breaking changes still bump MINOR (not MAJOR):
215215
## Schema and Generator Changes
216216

217217
Edits to the OpenAPI spec (`docs/katana-openapi.yaml`) or to the generator scripts
218-
(`scripts/generate_pydantic_models.py`, `scripts/regenerate_client.py`) can ripple
219-
into the **public client surface** in ways that break consumers — even when each
220-
individual change reads as a "fix" or "chore". Those ripples must be captured in
221-
the commit message so the auto-generated changelog flags them.
218+
(`scripts/generate_pydantic_models.py`, `scripts/regenerate_client.py`) can ripple into
219+
the **public client surface** in ways that break consumers — even when each individual
220+
change reads as a "fix" or "chore". Those ripples must be captured in the commit message
221+
so the auto-generated changelog flags them.
222222

223223
### What counts as a breaking schema change
224224

225-
Use `feat(client)!:` (or `fix(client)!:`) **with a `BREAKING CHANGE:` footer**
226-
whenever a spec or generator edit causes any of the following in the regenerated
227-
client output:
225+
Use `feat(client)!:` (or `fix(client)!:`) **with a `BREAKING CHANGE:` footer** whenever
226+
a spec or generator edit causes any of the following in the regenerated client output:
228227

229-
- A previously-exported class **disappears** (e.g., a `StrEnum` was deduped into
230-
a structurally identical sibling — see #414 / `OutsourcedRecipeIngredientAvailability`
228+
- A previously-exported class **disappears** (e.g., a `StrEnum` was deduped into a
229+
structurally identical sibling — see #414 / `OutsourcedRecipeIngredientAvailability`
231230
collapsed into `OutsourcedPurchaseOrderIngredientAvailability`)
232-
- A field **type narrows** in a way that makes previously-valid values raise
233-
(e.g., `type: string``$ref` to a `StrEnum` — see #410)
231+
- A field **type narrows** in a way that makes previously-valid values raise (e.g.,
232+
`type: string``$ref` to a `StrEnum` — see #410)
234233
- A field is **renamed** or **removed** on a response or request schema
235234
- An endpoint path is **removed**
236235
- A required field becomes **optional** (changes parse semantics) or vice versa
237236

238-
The footer must list the affected name(s) so a consumer searching the
239-
changelog for the broken import or attribute can find the change:
237+
The footer must list the affected name(s) so a consumer searching the changelog for the
238+
broken import or attribute can find the change:
240239

241240
```
242241
fix(client)!: dedupe collapses OutsourcedRecipeIngredientAvailability
@@ -256,24 +255,79 @@ Refs #409
256255

257256
- Adding a new endpoint or field (purely additive)
258257
- Adding a value to an existing enum (parse stays tolerant; old values still work)
259-
- Generated-file diff is byte-identical (e.g., a docstring tweak in the generator
260-
that doesn't change emitted code)
261-
- Internal-only refactors that don't change the surface (e.g., consolidating
262-
generator config dicts — see #407)
258+
- Generated-file diff is byte-identical (e.g., a docstring tweak in the generator that
259+
doesn't change emitted code)
260+
- Internal-only refactors that don't change the surface (e.g., consolidating generator
261+
config dicts — see #407)
263262

264263
### Why this matters pre-1.0
265264

266-
The package is pre-1.0, so breaking changes bump MINOR (per the
267-
"Pre-1.0 Special Rules" section above) — *not* MAJOR. That's a smaller
268-
version-number signal than a 1.x → 2.x bump, so the **changelog entry** carries
269-
the discoverability for affected consumers. Without the breaking-change footer,
270-
the change shows up in the changelog as a normal `fix:` line and consumers hit
271-
an `ImportError` (or `ValueError`, etc.) without a breadcrumb to follow.
272-
273-
When in doubt, run `uv run poe regenerate-client && uv run poe generate-pydantic`
274-
and inspect `git diff katana_public_api_client/`. If the diff removes any
275-
top-level class, removes any `__all__` entry, or changes a field's type
276-
annotation in a non-additive way, use `!` and `BREAKING CHANGE:`.
265+
The package is pre-1.0, so breaking changes bump MINOR (per the "Pre-1.0 Special Rules"
266+
section above) — *not* MAJOR. That's a smaller version-number signal than a 1.x → 2.x
267+
bump, so the **changelog entry** carries the discoverability for affected consumers.
268+
Without the breaking-change footer, the change shows up in the changelog as a normal
269+
`fix:` line and consumers hit an `ImportError` (or `ValueError`, etc.) without a
270+
breadcrumb to follow.
271+
272+
When in doubt, run `uv run poe regenerate-client && uv run poe generate-pydantic` and
273+
inspect `git diff katana_public_api_client/`. If the diff removes any top-level class,
274+
removes any `__all__` entry, or changes a field's type annotation in a non-additive way,
275+
use `!` and `BREAKING CHANGE:`.
276+
277+
### Generator/spec edits must commit the regen in the same PR
278+
279+
Whenever you edit a generator script or the OpenAPI spec, run the regen, run
280+
`uv run poe check` (or at minimum `agent-check` + `uv run poe test`), and commit the
281+
regenerated output **in the same PR**. The input and its output stay locked together at
282+
every commit so the cause-and-effect chain is reviewable.
283+
284+
- Pushing a generator/spec change **without** its regen leaves CI green-but-stale until
285+
the next time someone runs the generator.
286+
- Pushing regen output **without** the input change drifts in the other direction.
287+
288+
Note the generated-file impact in the PR description (e.g., "byte-identical except X" or
289+
list affected files). Before editing the spec, audit upstream drift via the workflow in
290+
[`docs/upstream-specs/README.md`](../../../docs/upstream-specs/README.md)
291+
(`poe refresh-upstream-spec``poe audit-spec``poe validate-response-examples`
292+
`poe validate-examples`).
293+
294+
## Lockfile Drift
295+
296+
When `uv.lock` shows up modified but you didn't touch dependencies (e.g., a
297+
sibling-package release on `main` bumped a workspace version), `git add uv.lock` and
298+
bundle it into your current commit.
299+
300+
**Don't `git checkout -- uv.lock` to drop it**: pre-commit's auto-stash/restore fights
301+
with the lockfile being regenerated mid-hook (pytest's `uv run` re-syncs it), producing
302+
confusing "files were modified by this hook" failures where nothing was actually wrong
303+
with the staged content. The lockfile must stay in sync with `pyproject.toml` at every
304+
commit anyway, so bundling is always the right call.
305+
306+
## First-Push Safety
307+
308+
When a local branch was created via `git checkout -b <name> origin/main`, its upstream
309+
is set to `origin/main`. A subsequent `git push -u origin <name>` then resolves to its
310+
tracked upstream and **pushes the local tip straight to `main`** — bypassing PR review
311+
and triggering semantic-release.
312+
313+
This actually happened: commit `30f3fd86` reached main + tagged `mcp-v0.44.1` +
314+
published to PyPI before the pipeline could be cancelled.
315+
316+
**Always use the explicit destination ref for first-time pushes:**
317+
318+
```bash
319+
# Wrong — pushes to whatever the local branch tracks (may be main)
320+
git push -u origin chore/foo
321+
322+
# Right — explicit destination, creates the remote branch
323+
git push -u origin HEAD:refs/heads/chore/foo
324+
```
325+
326+
A `pre-push` hook at `.pre-commit-config.yaml`'s `pre-push` stage enforces this for
327+
contributors who run pre-commit; **do not bypass with `--no-verify`**. The
328+
`Protect Main` ruleset has `bypass_mode: always` for the Admin role (required by
329+
semantic-release's PAT-based pushes); tightening that bypass is tracked in #429 (GitHub
330+
App migration). Until #429 lands, the local hook is the only mechanical guardrail.
277331

278332
## Multi-Package Changes
279333

Lines changed: 99 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,99 @@
1+
# Prefab UI — Rendering Pitfalls and Contracts
2+
3+
Prefab cards are emitted by MCP tools as JSON envelopes that the host (Claude Desktop,
4+
Cowork, the browser-render harness) hydrates into a React tree inside an iframe. The
5+
JSON wire is forgiving (it accepts most pydantic shapes); the JS renderer is not. The
6+
rules below come from real production failures — cards that passed unit tests but
7+
crashed in the host, or tools that promised a widget in their docstring and silently
8+
emitted none.
9+
10+
When you change anything that emits a Prefab card, run `uv run poe test-browser` (needs
11+
one-time `uv run playwright install chromium`) — the browser-render harness in
12+
`katana_mcp_server/tests/browser/` is the only thing that exercises the real JS
13+
renderer.
14+
15+
Related tests:
16+
17+
- `katana_mcp_server/tests/test_prefab_ui.py` — unit tests on the wire envelope.
18+
- `katana_mcp_server/tests/browser/` — headless Chromium harness that proves cards
19+
actually mount.
20+
21+
______________________________________________________________________
22+
23+
## `DataTable.rows` requires mustache `{{ key }}`, never a bare string
24+
25+
The Python pydantic field type accepts `rows: str` either way, but the JS renderer
26+
**crashes the entire iframe** with `t.some is not a function` if it sees a bare
27+
state-key string — it treats the string as the rows array itself, calls `.some()` on a
28+
string, and the React tree never mounts.
29+
30+
Always use the mustache form for state-bound rows:
31+
32+
```python
33+
DataTable(rows="{{ items }}")
34+
DataTable(rows="{{ stock.by_location }}") # dotted paths supported
35+
```
36+
37+
`_assert_state_bindings_resolve` in `tests/test_prefab_ui.py` enforces this on every
38+
state-bound `DataTable`. The browser-render harness then proves the rendered card
39+
actually mounts in headless Chromium — the prior unit-test contract (`to_json()` returns
40+
a dict with `$prefab`) was insufficient because the wire envelope can be "valid but
41+
unrenderable."
42+
43+
Discovered while investigating #629; bit every state-bound DataTable in the repo
44+
(search, inventory, verification, batch_recipe, modification card).
45+
46+
______________________________________________________________________
47+
48+
## Browser-test tool stubs must mirror production wire shape via `make_tool_result`
49+
50+
When stubbing an MCP tool in `tests/browser/render_test_server.py`, return
51+
`make_tool_result(response_pydantic, ui=ui_card)` — exactly like real tool code in
52+
`src/katana_mcp/tools/foundation/`. A hand-built
53+
`ToolResult(content="ok", structured_content=raw_dict)` silently passes browser tests
54+
but misses production-shape bugs.
55+
56+
In particular, `$result` in the `on_success` Rx context resolves to the apply tool's
57+
`structured_content` — a `PrefabApp` wire envelope keyed by `$prefab` / `view` / `state`
58+
**not** the raw `ModificationResponse` shape. A stub that returns a raw
59+
`ModificationResponse` therefore exposes an `actions` field that doesn't exist in
60+
production, so the rail's `Rx("$result.actions")` looks like it resolves correctly in
61+
test but doesn't in real life.
62+
63+
Rule: **a stub that doesn't match production wire shape is worse than no stub.**
64+
Discovered via Copilot review on #634; the broken live-tick that "passed" against the
65+
bad stub is tracked for proper fix in #645.
66+
67+
______________________________________________________________________
68+
69+
## A tool's docstring promises must match the UI it actually emits
70+
71+
`register_preview_tool` auto-appends "Preview→apply: ... returns a Prefab card with
72+
Confirm/Cancel buttons" to the tool's docstring. Hosts read that promise and look for a
73+
widget. If the tool is registered with `register_preview_tool` but **without**
74+
`meta=UI_META` (or it returns via `make_simple_result` with no Prefab envelope), the
75+
host's widget-fetch fails — Claude Desktop crashes its internal `read_widget_context`
76+
with `tool_name=undefined` because no widget exists for the tool the host believed was
77+
emitting one. The tool result still returns successfully, but the iframe renders
78+
nothing.
79+
80+
The contract is bidirectional:
81+
82+
- Every `register_preview_tool` call **must** pair `meta=UI_META` with a real Prefab
83+
card (built via `make_tool_result(response, ui=...)`) — never the docstring without
84+
the card.
85+
- Conversely, tools that emit no UI must use plain `mcp.tool(...)`, **not**
86+
`register_preview_tool` — the docstring promise has to match reality.
87+
88+
Caught via a live Claude Desktop session against `create_stock_adjustment` (fixed in
89+
#649); the same misregistration still applies to `create_stock_transfer` — tracked in
90+
#639. (`fulfill_order` is correctly wired with `meta=UI_META`; its remaining work is the
91+
direct-apply rail migration in #638, not this misregistration.)
92+
93+
______________________________________________________________________
94+
95+
## Help resource drift
96+
97+
`katana_mcp_server/.../resources/help.py` contains hardcoded tool documentation. When
98+
adding or modifying tool parameters, also update the help resource content to stay in
99+
sync. The `pr-preparer` agent flags this on PR readiness.
Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,110 @@
1+
# Typed Cache — Patterns and Pitfalls
2+
3+
The typed cache is a SQLite-backed mirror of Katana's wire state, with per-entity tables
4+
(`product`, `material`, `supplier`, `manufacturing_order`, …) generated from the same
5+
pydantic models the API client uses. It exists so MCP tools can serve queries without
6+
round-tripping to Katana for every read, and so FTS5 fuzzy search has something to
7+
index.
8+
9+
The rules below are about how the cache layer relates to the API layer (the spec, the
10+
generated pydantic, the attrs models) — what belongs where, and what surprises bite when
11+
those layers blur.
12+
13+
Related code:
14+
15+
- `katana_mcp_server/src/katana_mcp/typed_cache/sync.py` — attrs → API pydantic → cache
16+
pydantic conversion.
17+
- `scripts/generate_pydantic_models.py` — emits both the API pydantic class and the
18+
sibling `Cached<Name>` class with `table=True` per entity. See
19+
`duplicate_cache_tables_as_cached_siblings`.
20+
21+
______________________________________________________________________
22+
23+
## Archive / deleted state — opt-in flags + derived booleans
24+
25+
Katana represents soft-state as nullable timestamps on the wire: `archived_at` (the
26+
user-toggleable archive lifecycle — exposed on catalog items, inventory rows, and a few
27+
other archivable entities) and `deleted_at` (soft-delete, exposed on most entities
28+
including catalog items *and* transactional entities like POs, SOs, MOs, stock
29+
transfers, stock adjustments). The two are independent — an entity can be both archived
30+
and soft-deleted.
31+
32+
Two MCP-side conventions surface this; **keep them symmetric**:
33+
34+
### Query-param flags for opting into soft-state rows (default `False`)
35+
36+
Items use `include_archived` (`search_items`, catalog cache). After #472 Phase D the
37+
canonical wiring is the typed cache's `CatalogQueries` adapter — `parent_archived_at` is
38+
denormalized onto `CachedVariant` at sync time (via the variant `attrs_postprocess` hook
39+
in `typed_cache/sync.py`) and the adapter's default `include_archived=False` /
40+
`include_deleted=False` filters push the `archived_at IS NULL` / `deleted_at IS NULL`
41+
predicates down to SQL.
42+
43+
Transactional entities use `include_deleted` on `list_purchase_orders` /
44+
`list_sales_orders` / `list_manufacturing_orders` / `list_stock_adjustments`, filtering
45+
at the same typed-cache query layer.
46+
47+
### Response-side derived booleans
48+
49+
Every response model that exposes `archived_at` / `deleted_at` should also expose a
50+
convenience `is_archived` / `is_deleted` bool derived from `<timestamp> is not None`,
51+
saving callers from the timestamp/null inspection.
52+
53+
**Note the asymmetry**: `is_archived` mirrors Katana's *write* convention
54+
(`update_<entity>` request bodies accept `is_archived: bool`, so round-tripping through
55+
`modify_<entity>` with `{"update_header": {"is_archived": false}}` works). `is_deleted`,
56+
by contrast, is *read-side only* — Katana exposes deletion through DELETE endpoints, not
57+
as a writable boolean on update bodies. Items expose `is_archived` on `ItemInfo` and
58+
`ItemDetailsResponse` as of #526.
59+
60+
Don't add a new opt-in flag without the matching derived bool, and vice versa.
61+
62+
**Known gaps** (filed for follow-up):
63+
64+
- `list_stock_transfers` lacks `include_deleted` parity (#484).
65+
- `check_inventory` and the inventory reporting tools lack `include_archived` and the
66+
`is_archived` row field (#539).
67+
- Transactional response models lack the `is_deleted` derived bool (#540).
68+
69+
______________________________________________________________________
70+
71+
## Cache IDs are not globally unique — never merge cross-entity maps by numeric ID alone
72+
73+
The typed cache stores each entity type in its own table (`product`, `material`,
74+
`supplier`, …), so a product with `id=42` and a material with `id=42` are both legal.
75+
When enriching a list of variants with parent context (or any other cross-entity batch
76+
fetch), keep separate per-type maps (`products`, `materials`) and select based on which
77+
ID the variant carries (`v.product_id` vs `v.material_id`).
78+
79+
Merging into a single dict via `{**products, **materials}` mis-attaches parents on
80+
collision — Python dict-unpack iterates left-to-right and later keys win, so the
81+
material entry silently overwrites the product entry on shared IDs. The bug is symmetric
82+
in practice: every product variant whose ID also exists as a material ID looks up the
83+
material's data instead (and vice versa if you reorder the unpack).
84+
85+
Caught in #542 (variant card redesign) by Copilot review; regression test pins the case
86+
in `test_items.py::test_enrich_variants_keeps_product_and_material_maps_separate`.
87+
88+
______________________________________________________________________
89+
90+
## Don't pollute the API spec/models with cache-only fields
91+
92+
The OpenAPI spec at `docs/katana-openapi.yaml` and the generated pydantic models in
93+
`katana_public_api_client/models_pydantic/_generated/*.py` reflect Katana's actual wire
94+
contract. **Never** add fields to the spec or inject fields into the API pydantic
95+
classes to satisfy cache-schema, MCP-tool, or other consumer needs.
96+
97+
Cache schemas live on sibling `Cached<Name>` classes emitted by the same generator pass
98+
— the API class stays pure pydantic, the cache class carries `table=True`, foreign keys,
99+
relationships, JSON columns, and any cache-only fields. See
100+
`scripts/generate_pydantic_models.py::duplicate_cache_tables_as_cached_siblings` and
101+
`katana_mcp_server/src/katana_mcp/typed_cache/sync.py::_attrs_<entity>_to_cached` for
102+
the conversion pattern: attrs → API pydantic (via the registry) → cache pydantic (via
103+
`model_dump`/`model_validate`), with relationship fields set after construction since
104+
SQLModel `Relationship` descriptors don't accept input via `__init__`.
105+
106+
If a bug surfaces in `sync.py` but originates in generated client code (attrs, pydantic,
107+
`from_attrs`, `Cached*` schemas), fix it at the generator/spec layer — not in `sync.py`.
108+
See
109+
[Fix bugs at the client/generator layer](../../../katana_public_api_client/docs/spec-authoring.md#fix-bugs-at-the-clientgenerator-layer-when-the-root-cause-lives-there)
110+
in the spec-authoring guide.

0 commit comments

Comments
 (0)