Skip to content

design(mcp): normalize BLOCK_WARNING_PREFIX in tool responses — operator-prefix leaks into JSON content #720

@dougborg

Description

@dougborg

Background

Many tool response models carry a warnings: list[str] field where some entries are tagged with an internal BLOCK: sentinel prefix (BLOCK_WARNING_PREFIX). The string-level sentinel is double-purpose:

  • Wire signal — tells programmatic consumers "this is a blocking failure, not advisory."
  • UI signal — Prefab UI builders substring-match the prefix to gate the Confirm button.
  • Apply-path guard — same substring-match used as a second defense so direct callers (skipping the UI) get the same refusal.

Now that #567 dropped the markdown rendering layer, JSON content exposes the raw prefix:

Surfaced in PR #719 (Copilot review comment).

Beyond the leak, the sentinel-string protocol has bigger problems that have been latent:

  1. Consumers wanting to programmatically react to specific failure modes ("retry with serial_numbers added") must regex the message text. Brittle.
  2. No structured way to tell the UI which input field caused the block, so form-level error highlighting isn't possible.
  3. Binary block/no-block doesn't distinguish advisory (proceed-with-warning) from info (purely contextual) — every non-block warning gets the same treatment.

Proposal

Replace warnings: list[str] with warnings: list[Warning] across every response model that carries warnings (~12 models).

Structured shape

```python
from typing import Any, Literal
from pydantic import BaseModel, Field

class Warning(BaseModel):
"""Structured warning on a preview/apply response.

Replaces the legacy ``list[str]`` shape where the ``BLOCK:`` prefix
double-served as wire signal and UI cue. Three things are now
first-class: severity (how the consumer should respond), message
(human text), field_path (which input caused it, when applicable),
and code (stable programmatic identifier).
\"\"\"

severity: Literal[\"info\", \"advisory\", \"blocking\"] = Field(
    description=(
        \"How a consumer should respond:\\n\"
        \"- 'blocking': apply path will refuse; consumer must fix \"
        \"input or state before retrying.\\n\"
        \"- 'advisory': operation will proceed but the user should know \"
        \"(unusual state, soft warning).\\n\"
        \"- 'info': pure context, no action implied.\"
    ),
)
message: str = Field(
    description=(
        \"Human-readable message. Must be self-contained — readable \"
        \"without consulting other fields. Inline values directly \"
        \"(e.g. 'expected 3, got 2.5'); don't reference other Warning \"
        \"fields by name.\"
    ),
)
field_path: str | None = Field(
    default=None,
    description=(
        \"Dotted-path to the offending input field, e.g. \"
        \"'rows[0].sales_order_row_id' or 'actual_quantity'. Set when \"
        \"the warning is caused by a specific input value (so the UI \"
        \"can highlight the field and the LLM knows what to correct). \"
        \"None for state-driven warnings (e.g. 'order is already \"
        \"DONE') where no single input is at fault.\"
    ),
)
code: str | None = Field(
    default=None,
    description=(
        \"Optional stable identifier for this warning kind \"
        \"(e.g. 'serial_count_mismatch', 'unknown_row_id'). Consumers \"
        \"match on this when they want to react programmatically to a \"
        \"specific failure mode without parsing the message string. \"
        \"Stable across releases; collected in \"
        \"``katana_mcp/tools/warning_codes.py`` to prevent drift.\"
    ),
)

```

Severity decision table

Case Severity
Katana will 4xx if we apply blocking
Operation is a no-op (already DONE, source=destination, zero rows) blocking
Input is internally inconsistent given current state (unknown row, duplicate keys) blocking
Input refers to a soft-state row (archived, deleted) and we'll still apply advisory
State is unusual but legal (MO is BLOCKED status, PO partially received) advisory
Default was auto-selected (warehouse, currency, etc.) and the user might want to override info
Operating-mode context (dry-run, sandbox) info

Rule of thumb: set severity=\"blocking\" only when apply will fail or do nothing useful. If a sensible user might rationally proceed after reading the message, it's advisory or info.

Existing call sites → new shape

Every current BLOCK_WARNING_PREFIX and plain-string warning has clean material for all four fields:

Existing string (paraphrased) severity field_path code
BLOCK: MO {n} is already completed. No further action will mark it DONE again. blocking None (state) already_completed
MO {n} is blocked - review before completing advisory None blocked_status
BLOCK: MO {n} ({sku}) is serial-tracked but actual_quantity ({qty}) is not a whole number blocking actual_quantity serial_qty_not_integer
BLOCK: MO {n} ({sku}) is serial-tracked. Pass serial_numbers ... blocking serial_numbers serial_numbers_required
BLOCK: MO {n} ({sku}): serial_numbers count ({a}) must equal actual_quantity ({b}). blocking serial_numbers serial_count_mismatch
BLOCK: Row override references unknown sales_order_row_id={x} ... blocking rows[i].sales_order_row_id unknown_row_id
BLOCK: Multiple overrides for the same sales_order_row_id ({dups}); ... blocking rows duplicate_row_overrides
BLOCK: Source and destination are the same location ... blocking destination_location_id same_source_destination
BLOCK: Sales order {n} has no rows to fulfill. blocking None (state) empty_order

Each row becomes a single Pydantic construction at the call site. No string concatenation, no sentinel prefix, no substring matching.

Design decisions worth pinning

Three severity levels, not binary. A boolean (blocked: bool) is one bit and we'd regret it. Three buckets cover real distinctions: refuse / let-them-decide / FYI. The info bucket has no callers today but it's free to reserve — and the first time someone wants a soft "we picked the default warehouse for you" surface, they won't have to misuse advisory.

field_path as a string, not a tuple. Pydantic's native error format is loc: tuple (e.g. ('rows', 0, 'sales_order_row_id')). We could mirror that. The rendered string form (rows[0].sales_order_row_id) wins because:

  • Round-trips through JSON cleaner.
  • LLM consumers parse it more reliably than a heterogeneous array.
  • UIs splitting on . / [ to highlight a form field have an easy job.
  • A Pydantic validator can accept tuple input from internal callers and render to string on serialization.

code for programmatic matching. A consumer that wants to retry with corrected input shouldn't have to regex the message — they check code == \"serial_numbers_required\" and know to add the field. Messages can evolve for humans; codes stay stable for machines. Acceptance bullet below requires every existing BLOCK call site to pick a code and record it in a shared constants module.

No free-form details: dict[str, Any]. Considered and deliberately left out for v1:

  • Consumers can't depend on bag keys without a per-warning schema, so it's back to free-text parsing.
  • The values that matter for remediation are already in message (we write "sales order has rows [101, 102, 103]" inline today).
  • If a real pattern emerges (e.g. "every blocking warning should surface valid alternatives"), add a typed field then. Cheap to extend; expensive to remove.

Scope (~12 response models)

  • CreateStockTransferResponse
  • ModifyStockTransferResponse
  • CreatePurchaseOrderResponse / ModifyPurchaseOrderResponse
  • CreateSalesOrderResponse / ModifySalesOrderResponse
  • CreateManufacturingOrderResponse / ModifyManufacturingOrderResponse
  • CreateStockAdjustmentResponse / UpdateStockAdjustmentResponse / DeleteStockAdjustmentResponse
  • ReceivePurchaseOrderResponse
  • CorrectOrderResponse (correct_sales_order / correct_purchase_order / correct_manufacturing_order)
  • FulfillOrderResponse

Plus every Prefab UI builder in prefab_ui.py that substring-matches BLOCK: to gate the Confirm button — they switch to reading w.severity == \"blocking\" instead.

Plus every apply-path safety guard in orders.py (and friends) doing the same substring-match — same swap.

Migration

warnings: list[str]warnings: list[Warning] is breaking on the wire. Three options considered:

  1. Hard swap. Just change the type.
  2. Parallel field for a release. Add warnings_v2: list[Warning]; deprecate warnings; remove next major.
  3. Computed-field bridge. Keep warnings: list[str] as a @computed_field that strips BLOCK: from new structured _warnings: list[Warning].

Going with (1). No external programmatic consumers of these responses live outside the MCP server itself, and the migration cost for any that exist is the same as today's: parse a different shape. PR will land as a feat!: with explicit notes.

Acceptance

  • Warning model defined in katana_mcp.tools.tool_result_utils (or new warnings.py) with the four fields above. Field docstrings match the proposal verbatim.
  • katana_mcp.tools.warning_codes constants module holds the canonical code values as string constants (e.g. ALREADY_COMPLETED = \"already_completed\"). Every existing BLOCK call site picks a code from this module.
  • All ~12 response models above migrate warnings: list[str]warnings: list[Warning].
  • Every Prefab UI builder gating the Confirm button on w.startswith(BLOCK_WARNING_PREFIX) switches to w.severity == \"blocking\".
  • Every apply-path safety guard doing the same substring-match switches to the structured field.
  • BLOCK_WARNING_PREFIX constant deleted from tool_result_utils.py.
  • Regression tests pin the wire shape: json.loads(content)[\"warnings\"][i] returns a dict with {\"severity\", \"message\", \"field_path\", \"code\"} keys; no string content begins with \"BLOCK:\".
  • Test: each existing BLOCK call site emits a Warning with severity=\"blocking\" and a non-empty code. Pinned per-tool so a future contributor adding a BLOCK can't skip the code.
  • Severity decision table copied into the Warning class docstring or a sibling ADR so contributors aren't guessing.

Out of scope

  • The Prefab UI's substring-match in _modification.py's rendering helpers — separate from response models, will follow the same swap pattern but in a follow-up if not already covered.
  • Cross-cutting refactor of next_actions / other list[str] fields that don't carry sentinel prefixes — those are pure advisory text, no parallel signal.
  • Localization / i18n of warning messages — code is intentionally locale-free so a future i18n layer can map codes to localized strings without changing the wire format.

Discovered in

PR #719 review (2026-05-13). Original comment.

Design discussion

Full rationale (including alternatives considered for severity buckets, field_path format, and the case against details: dict) walked through in PR #719 review thread on 2026-05-13. Pasted into this issue body as the canonical reference; reviewers don't need to chase the thread.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions