design(mcp): normalize BLOCK_WARNING_PREFIX in tool responses — operator-prefix leaks into JSON content

## Background

Many tool response models carry a ``warnings: list[str]`` field where some entries are tagged with an internal ``BLOCK:`` sentinel prefix (``BLOCK_WARNING_PREFIX``). The string-level sentinel is double-purpose:

- **Wire signal** — tells programmatic consumers "this is a blocking failure, not advisory."
- **UI signal** — Prefab UI builders substring-match the prefix to gate the Confirm button.
- **Apply-path guard** — same substring-match used as a second defense so direct callers (skipping the UI) get the same refusal.

Now that #567 dropped the markdown rendering layer, JSON ``content`` exposes the raw prefix:

- **Order tools (PO / SO / MO modify, stock-adjustment family)** — previously exposed the prefix in JSON whenever ``format=\"json\"`` was passed. Pre-existing condition; no regression from #567.
- **``create_stock_transfer``** — markdown was the only content path before #567 (the markdown formatter stripped the prefix). Post-#567, ``make_json_result`` exposes the prefix for the first time. **Net new exposure.**

Surfaced in PR #719 ([Copilot review comment](https://github.com/dougborg/katana-openapi-client/pull/719#discussion_r3236916846)).

Beyond the leak, the sentinel-string protocol has bigger problems that have been latent:

1. Consumers wanting to programmatically react to specific failure modes (\"retry with serial_numbers added\") must regex the message text. Brittle.
2. No structured way to tell the UI **which input field** caused the block, so form-level error highlighting isn't possible.
3. Binary block/no-block doesn't distinguish ``advisory`` (proceed-with-warning) from ``info`` (purely contextual) — every non-block warning gets the same treatment.

## Proposal

Replace ``warnings: list[str]`` with ``warnings: list[Warning]`` across every response model that carries warnings (~12 models).

### Structured shape

\`\`\`python
from typing import Any, Literal
from pydantic import BaseModel, Field


class Warning(BaseModel):
    \"\"\"Structured warning on a preview/apply response.

    Replaces the legacy ``list[str]`` shape where the ``BLOCK:`` prefix
    double-served as wire signal and UI cue. Three things are now
    first-class: severity (how the consumer should respond), message
    (human text), field_path (which input caused it, when applicable),
    and code (stable programmatic identifier).
    \"\"\"

    severity: Literal[\"info\", \"advisory\", \"blocking\"] = Field(
        description=(
            \"How a consumer should respond:\\n\"
            \"- 'blocking': apply path will refuse; consumer must fix \"
            \"input or state before retrying.\\n\"
            \"- 'advisory': operation will proceed but the user should know \"
            \"(unusual state, soft warning).\\n\"
            \"- 'info': pure context, no action implied.\"
        ),
    )
    message: str = Field(
        description=(
            \"Human-readable message. Must be self-contained — readable \"
            \"without consulting other fields. Inline values directly \"
            \"(e.g. 'expected 3, got 2.5'); don't reference other Warning \"
            \"fields by name.\"
        ),
    )
    field_path: str | None = Field(
        default=None,
        description=(
            \"Dotted-path to the offending input field, e.g. \"
            \"'rows[0].sales_order_row_id' or 'actual_quantity'. Set when \"
            \"the warning is caused by a specific input value (so the UI \"
            \"can highlight the field and the LLM knows what to correct). \"
            \"None for state-driven warnings (e.g. 'order is already \"
            \"DONE') where no single input is at fault.\"
        ),
    )
    code: str | None = Field(
        default=None,
        description=(
            \"Optional stable identifier for this warning kind \"
            \"(e.g. 'serial_count_mismatch', 'unknown_row_id'). Consumers \"
            \"match on this when they want to react programmatically to a \"
            \"specific failure mode without parsing the message string. \"
            \"Stable across releases; collected in \"
            \"``katana_mcp/tools/warning_codes.py`` to prevent drift.\"
        ),
    )
\`\`\`

### Severity decision table

| Case | Severity |
|---|---|
| Katana will 4xx if we apply | `blocking` |
| Operation is a no-op (already DONE, source=destination, zero rows) | `blocking` |
| Input is internally inconsistent given current state (unknown row, duplicate keys) | `blocking` |
| Input refers to a soft-state row (archived, deleted) and we'll still apply | `advisory` |
| State is unusual but legal (MO is BLOCKED status, PO partially received) | `advisory` |
| Default was auto-selected (warehouse, currency, etc.) and the user might want to override | `info` |
| Operating-mode context (dry-run, sandbox) | `info` |

**Rule of thumb**: set ``severity=\"blocking\"`` only when apply will fail or do nothing useful. If a sensible user might rationally proceed after reading the message, it's ``advisory`` or ``info``.

### Existing call sites → new shape

Every current ``BLOCK_WARNING_PREFIX`` and plain-string warning has clean material for all four fields:

| Existing string (paraphrased) | `severity` | `field_path` | `code` |
|---|---|---|---|
| `BLOCK: MO {n} is already completed. No further action will mark it DONE again.` | `blocking` | `None` (state) | `already_completed` |
| `MO {n} is blocked - review before completing` | `advisory` | `None` | `blocked_status` |
| `BLOCK: MO {n} ({sku}) is serial-tracked but actual_quantity ({qty}) is not a whole number` | `blocking` | `actual_quantity` | `serial_qty_not_integer` |
| `BLOCK: MO {n} ({sku}) is serial-tracked. Pass serial_numbers ...` | `blocking` | `serial_numbers` | `serial_numbers_required` |
| `BLOCK: MO {n} ({sku}): serial_numbers count ({a}) must equal actual_quantity ({b}).` | `blocking` | `serial_numbers` | `serial_count_mismatch` |
| `BLOCK: Row override references unknown sales_order_row_id={x} ...` | `blocking` | `rows[i].sales_order_row_id` | `unknown_row_id` |
| `BLOCK: Multiple overrides for the same sales_order_row_id ({dups}); ...` | `blocking` | `rows` | `duplicate_row_overrides` |
| `BLOCK: Source and destination are the same location ...` | `blocking` | `destination_location_id` | `same_source_destination` |
| `BLOCK: Sales order {n} has no rows to fulfill.` | `blocking` | `None` (state) | `empty_order` |

Each row becomes a single Pydantic construction at the call site. No string concatenation, no sentinel prefix, no substring matching.

### Design decisions worth pinning

**Three severity levels, not binary.** A boolean (`blocked: bool`) is one bit and we'd regret it. Three buckets cover real distinctions: refuse / let-them-decide / FYI. The `info` bucket has no callers today but it's free to reserve — and the first time someone wants a soft \"we picked the default warehouse for you\" surface, they won't have to misuse `advisory`.

**`field_path` as a string, not a tuple.** Pydantic's native error format is `loc: tuple` (e.g. `('rows', 0, 'sales_order_row_id')`). We could mirror that. The rendered string form (`rows[0].sales_order_row_id`) wins because:
- Round-trips through JSON cleaner.
- LLM consumers parse it more reliably than a heterogeneous array.
- UIs splitting on `.` / `[` to highlight a form field have an easy job.
- A Pydantic validator can accept tuple input from internal callers and render to string on serialization.

**`code` for programmatic matching.** A consumer that wants to retry with corrected input shouldn't have to regex the message — they check `code == \"serial_numbers_required\"` and know to add the field. Messages can evolve for humans; codes stay stable for machines. **Acceptance bullet below requires every existing BLOCK call site to pick a code and record it in a shared constants module.**

**No free-form `details: dict[str, Any]`.** Considered and deliberately left out for v1:
- Consumers can't depend on bag keys without a per-warning schema, so it's back to free-text parsing.
- The values that matter for remediation are already in `message` (we write \"sales order has rows [101, 102, 103]\" inline today).
- If a real pattern emerges (e.g. \"every blocking warning should surface valid alternatives\"), add a typed field then. Cheap to extend; expensive to remove.

## Scope (~12 response models)

- ``CreateStockTransferResponse``
- ``ModifyStockTransferResponse``
- ``CreatePurchaseOrderResponse`` / ``ModifyPurchaseOrderResponse``
- ``CreateSalesOrderResponse`` / ``ModifySalesOrderResponse``
- ``CreateManufacturingOrderResponse`` / ``ModifyManufacturingOrderResponse``
- ``CreateStockAdjustmentResponse`` / ``UpdateStockAdjustmentResponse`` / ``DeleteStockAdjustmentResponse``
- ``ReceivePurchaseOrderResponse``
- ``CorrectOrderResponse`` (correct_sales_order / correct_purchase_order / correct_manufacturing_order)
- ``FulfillOrderResponse``

Plus every Prefab UI builder in ``prefab_ui.py`` that substring-matches ``BLOCK:`` to gate the Confirm button — they switch to reading ``w.severity == \"blocking\"`` instead.

Plus every apply-path safety guard in ``orders.py`` (and friends) doing the same substring-match — same swap.

## Migration

`warnings: list[str]` → `warnings: list[Warning]` is breaking on the wire. Three options considered:

1. **Hard swap.** Just change the type.
2. **Parallel field for a release.** Add `warnings_v2: list[Warning]`; deprecate `warnings`; remove next major.
3. **Computed-field bridge.** Keep `warnings: list[str]` as a `@computed_field` that strips `BLOCK:` from new structured `_warnings: list[Warning]`.

**Going with (1).** No external programmatic consumers of these responses live outside the MCP server itself, and the migration cost for any that exist is the same as today's: parse a different shape. PR will land as a `feat!:` with explicit notes.

## Acceptance

- [ ] ``Warning`` model defined in ``katana_mcp.tools.tool_result_utils`` (or new ``warnings.py``) with the four fields above. Field docstrings match the proposal verbatim.
- [ ] ``katana_mcp.tools.warning_codes`` constants module holds the canonical ``code`` values as string constants (e.g. ``ALREADY_COMPLETED = \"already_completed\"``). Every existing BLOCK call site picks a code from this module.
- [ ] All ~12 response models above migrate ``warnings: list[str]`` → ``warnings: list[Warning]``.
- [ ] Every Prefab UI builder gating the Confirm button on ``w.startswith(BLOCK_WARNING_PREFIX)`` switches to ``w.severity == \"blocking\"``.
- [ ] Every apply-path safety guard doing the same substring-match switches to the structured field.
- [ ] ``BLOCK_WARNING_PREFIX`` constant deleted from ``tool_result_utils.py``.
- [ ] Regression tests pin the wire shape: ``json.loads(content)[\"warnings\"][i]`` returns a dict with ``{\"severity\", \"message\", \"field_path\", \"code\"}`` keys; no string content begins with ``\"BLOCK:\"``.
- [ ] Test: each existing BLOCK call site emits a `Warning` with `severity=\"blocking\"` and a non-empty `code`. Pinned per-tool so a future contributor adding a BLOCK can't skip the code.
- [ ] Severity decision table copied into the `Warning` class docstring or a sibling ADR so contributors aren't guessing.

## Out of scope

- The Prefab UI's substring-match in ``_modification.py``'s rendering helpers — separate from response models, will follow the same swap pattern but in a follow-up if not already covered.
- Cross-cutting refactor of ``next_actions`` / other ``list[str]`` fields that don't carry sentinel prefixes — those are pure advisory text, no parallel signal.
- Localization / i18n of warning messages — `code` is intentionally locale-free so a future i18n layer can map codes to localized strings without changing the wire format.

## Discovered in

PR #719 review (2026-05-13). [Original comment](https://github.com/dougborg/katana-openapi-client/pull/719#discussion_r3236916846).

## Design discussion

Full rationale (including alternatives considered for severity buckets, field_path format, and the case against ``details: dict``) walked through in [PR #719 review thread](https://github.com/dougborg/katana-openapi-client/pull/719) on 2026-05-13. Pasted into this issue body as the canonical reference; reviewers don't need to chase the thread.

Existing string (paraphrased)	`severity`	`field_path`	`code`
`BLOCK: MO {n} is already completed. No further action will mark it DONE again.`	`blocking`	`None` (state)	`already_completed`
`MO {n} is blocked - review before completing`	`advisory`	`None`	`blocked_status`
`BLOCK: MO {n} ({sku}) is serial-tracked but actual_quantity ({qty}) is not a whole number`	`blocking`	`actual_quantity`	`serial_qty_not_integer`
`BLOCK: MO {n} ({sku}) is serial-tracked. Pass serial_numbers ...`	`blocking`	`serial_numbers`	`serial_numbers_required`
`BLOCK: MO {n} ({sku}): serial_numbers count ({a}) must equal actual_quantity ({b}).`	`blocking`	`serial_numbers`	`serial_count_mismatch`
`BLOCK: Row override references unknown sales_order_row_id={x} ...`	`blocking`	`rows[i].sales_order_row_id`	`unknown_row_id`
`BLOCK: Multiple overrides for the same sales_order_row_id ({dups}); ...`	`blocking`	`rows`	`duplicate_row_overrides`
`BLOCK: Source and destination are the same location ...`	`blocking`	`destination_location_id`	`same_source_destination`
`BLOCK: Sales order {n} has no rows to fulfill.`	`blocking`	`None` (state)	`empty_order`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

design(mcp): normalize BLOCK_WARNING_PREFIX in tool responses — operator-prefix leaks into JSON content #720

Background

Proposal

Structured shape

Severity decision table

Existing call sites → new shape

Design decisions worth pinning

Scope (~12 response models)

Migration

Acceptance

Out of scope

Discovered in

Design discussion

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Case	Severity
Katana will 4xx if we apply	`blocking`
Operation is a no-op (already DONE, source=destination, zero rows)	`blocking`
Input is internally inconsistent given current state (unknown row, duplicate keys)	`blocking`
Input refers to a soft-state row (archived, deleted) and we'll still apply	`advisory`
State is unusual but legal (MO is BLOCKED status, PO partially received)	`advisory`
Default was auto-selected (warehouse, currency, etc.) and the user might want to override	`info`
Operating-mode context (dry-run, sandbox)	`info`

Uh oh!

design(mcp): normalize BLOCK_WARNING_PREFIX in tool responses — operator-prefix leaks into JSON content #720

Description

Background

Proposal

Structured shape

Severity decision table

Existing call sites → new shape

Design decisions worth pinning

Scope (~12 response models)

Migration

Acceptance

Out of scope

Discovered in

Design discussion

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions