Commit 4f0f7fa

dougborg and claude authored
feat(mcp): background cache warm-up at lifespan startup (#593) (#680)
Closes #500 (customer-module timeouts) and #593 (the implementation itself); completes the mitigation cluster for #463 alongside the already-shipped #590 (adaptive rate limit), #591 (bulk upsert), and #592 (filtered write-through).

The SQLite-backed typed cache is persistent across MCP sessions, so the expensive case is the very first run, or one after long idleness. Today the first tool call that needs a cold entity pays the full sync RTT, which under Katana's 60/min rate limit can exceed the MCP client's ~60s request timeout — surfacing as opaque `-32001` errors. There is typically a gap between Claude/MCP startup and the first user-facing tool call; this PR uses that gap.

In `server.lifespan`, after `typed_cache.open()`, schedule `_warm_caches_in_background(client, typed_cache)` via `asyncio.create_task` and yield immediately. The task fans out 16 `ensure_*_synced` calls through `asyncio.gather(..., return_exceptions=True)`, so a transient failure on one entity never blocks the others or crashes the server. On lifespan exit the task is cancelled cleanly before `typed_cache.close()`, so writes can't race a closed engine. `manufacturing_order_recipe_row` is pulled implicitly via `MANUFACTURING_ORDER_SPEC.related_specs` and is intentionally absent from the explicit warmup list.

Tool calls that arrive mid-warmup serialize on each entity's `cache.lock_for(...)` and wait no longer than they would have without warmup, with the warmup's incremental progress available on later calls — no extra cost in the worst case.

Set `MCP_DISABLE_CACHE_WARMUP=1` to skip warmup (used by the new conftest autouse fixture so tests that don't intentionally exercise the warmup don't kick off async tasks against a mock client). Tests that want the real warmup path opt in with `@pytest.mark.cache_warmup_enabled`.

Tests:

- `test_disable_env_var_skips_warmup` — pins that the autouse fixture works; future refactors decoupling the env-var check from the task creation will fail this test.
- `test_warmup_task_scheduled_when_enabled` — pins fire-and-forget semantics, name-based task lookup, and clean cancellation on exit.
- `test_warmup_failure_does_not_crash_server` — a raising warmup body doesn't propagate out of the lifespan.
- `test_warm_caches_in_background_swallows_per_entity_errors` — pins the `return_exceptions=True` gather contract directly.

Also bundles the uv.lock sync to mcp v0.69.1 from the auto-release after PR #678 merged.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
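The `return_exceptions=True` contract the warmup leans on can be seen in isolation. A minimal sketch with hypothetical stand-in coroutines (`sync_ok`, `sync_fails` are not the real `ensure_*_synced` helpers):

```python
import asyncio


async def sync_ok(n: int) -> int:
    return n


async def sync_fails() -> int:
    raise RuntimeError("transient API error")


async def warm_all() -> list[object]:
    # With return_exceptions=True, a raising coroutine's exception is
    # captured as a result in its slot; the sibling coroutines still
    # run to completion instead of being cancelled.
    return await asyncio.gather(
        sync_ok(1), sync_fails(), sync_ok(3), return_exceptions=True
    )


results = asyncio.run(warm_all())
assert results[0] == 1 and results[2] == 3
assert isinstance(results[1], RuntimeError)
```

Note that `asyncio.CancelledError` still propagates out of `gather` even with this flag set, which is what makes the clean shutdown path below work.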
1 parent c329345 commit 4f0f7fa

5 files changed

Lines changed: 391 additions & 0 deletions

katana_mcp_server/pyproject.toml

Lines changed: 1 addition & 0 deletions
```diff
@@ -80,6 +80,7 @@ markers = [
     "unit: Unit tests with mocks",
     "integration: Integration tests requiring API key",
     "browser: Browser-render tests via Playwright + fastmcp dev-apps (slow, needs `playwright install chromium`)",
+    "cache_warmup_enabled: opt in to the lifespan cache-warmup task (default-off in tests via autouse fixture)",
 ]

 [tool.semantic_release]
```

katana_mcp_server/src/katana_mcp/server.py

Lines changed: 134 additions & 0 deletions
```diff
@@ -12,14 +12,18 @@
 - Response caching for improved performance (FastMCP 2.13+)
 """

+import asyncio
 import os
+import time
 from collections.abc import AsyncIterator
 from contextlib import asynccontextmanager
 from typing import TYPE_CHECKING, Literal, cast

 if TYPE_CHECKING:
     from fastmcp.server.auth import AuthProvider  # pragma: no cover

+    from katana_mcp.typed_cache import TypedCacheEngine  # pragma: no cover
+
 from dotenv import load_dotenv
 from fastmcp import FastMCP
 from fastmcp.server.middleware.caching import (
@@ -44,6 +48,97 @@
 logger = get_logger(__name__)


+async def _warm_caches_in_background(
+    client: KatanaClient,
+    typed_cache: "TypedCacheEngine",
+) -> None:
+    """Kick off ``ensure_<entity>_synced`` for every cache-backed entity.
+
+    Runs concurrently with the server accepting MCP requests so the gap
+    between Claude/MCP startup and the first user-facing tool call gets
+    used to warm the typed cache. Tool calls that arrive mid-warmup
+    serialize on each entity's ``cache.lock_for(...)`` — they wait the
+    same amount of time they would have waited without warmup, with the
+    warmup's incremental progress available on later calls (closes #500's
+    cold-cache window, the remaining mitigation for #463 after #592 and
+    #591).
+
+    Per-entity failures are swallowed and logged so a transient error on
+    one entity never blocks the others from warming and never crashes
+    the server. ``manufacturing_order_recipe_row`` is pulled implicitly
+    via ``MANUFACTURING_ORDER_SPEC.related_specs`` when MOs are warmed,
+    so it's intentionally absent from this list.
+    """
+    from katana_mcp.typed_cache import (
+        ensure_additional_costs_synced,
+        ensure_customers_synced,
+        ensure_factory_synced,
+        ensure_locations_synced,
+        ensure_manufacturing_orders_synced,
+        ensure_materials_synced,
+        ensure_operators_synced,
+        ensure_products_synced,
+        ensure_purchase_orders_synced,
+        ensure_sales_orders_synced,
+        ensure_services_synced,
+        ensure_stock_adjustments_synced,
+        ensure_stock_transfers_synced,
+        ensure_suppliers_synced,
+        ensure_tax_rates_synced,
+        ensure_variants_synced,
+    )
+
+    helpers = (
+        ensure_sales_orders_synced,
+        ensure_purchase_orders_synced,
+        ensure_manufacturing_orders_synced,
+        ensure_stock_adjustments_synced,
+        ensure_stock_transfers_synced,
+        ensure_customers_synced,
+        ensure_suppliers_synced,
+        ensure_locations_synced,
+        ensure_tax_rates_synced,
+        ensure_operators_synced,
+        ensure_additional_costs_synced,
+        ensure_variants_synced,
+        ensure_products_synced,
+        ensure_materials_synced,
+        ensure_services_synced,
+        ensure_factory_synced,
+    )
+
+    started = time.monotonic()
+    # ``return_exceptions=True`` keeps the gather alive when any one
+    # helper raises, so a transient API hiccup on one entity doesn't
+    # block the others from warming. ``CancelledError`` propagates
+    # through ``gather`` regardless of this flag, which is what we want
+    # on shutdown.
+    results = await asyncio.gather(
+        *(fn(client, typed_cache) for fn in helpers),
+        return_exceptions=True,
+    )
+    elapsed_s = round(time.monotonic() - started, 1)
+    failures = [
+        (fn.__name__, type(r).__name__, str(r))
+        for fn, r in zip(helpers, results, strict=True)
+        if isinstance(r, BaseException)
+    ]
+    if failures:
+        logger.warning(
+            "cache_warmup_partial",
+            elapsed_s=elapsed_s,
+            success_count=len(helpers) - len(failures),
+            failure_count=len(failures),
+            failures=failures,
+        )
+    else:
+        logger.info(
+            "cache_warmup_complete",
+            elapsed_s=elapsed_s,
+            entity_count=len(helpers),
+        )
+
+
 @asynccontextmanager
 async def lifespan(server: FastMCP) -> AsyncIterator[Services]:
     """Manage server lifespan and KatanaClient lifecycle.
@@ -109,6 +204,20 @@ async def lifespan(server: FastMCP) -> AsyncIterator[Services]:
     typed_cache = TypedCacheEngine()
     await typed_cache.open()
     logger.info("typed_cache_initialized", db_path=str(typed_cache.db_path))
+
+    # Fire-and-forget background warm-up so the gap between
+    # MCP startup and the first user-facing tool call gets used
+    # to populate the typed cache. Default ON; set
+    # ``MCP_DISABLE_CACHE_WARMUP=1`` to skip (test runs do this
+    # via the autouse fixture in ``conftest.py``).
+    warmup_task: asyncio.Task[None] | None = None
+    if os.getenv("MCP_DISABLE_CACHE_WARMUP") != "1":
+        warmup_task = asyncio.create_task(
+            _warm_caches_in_background(cast(KatanaClient, client), typed_cache),
+            name="katana_cache_warmup",
+        )
+        logger.info("cache_warmup_started")
+
     try:
         # The generated ``AuthenticatedClient.__aenter__`` is
         # annotated to return its own class, dropping the
@@ -120,6 +229,31 @@ async def lifespan(server: FastMCP) -> AsyncIterator[Services]:
         logger.info("server_ready", version=__version__)
         yield context
     finally:
+        # Always await the warmup task on shutdown, regardless of
+        # done-state, so any exception it raised is consumed
+        # rather than emitted as an asyncio "Task exception was
+        # never retrieved" warning. If still running, cancel
+        # first so the await returns promptly. Awaiting before
+        # ``typed_cache.close()`` also prevents the warmup from
+        # writing against a closed engine.
+        if warmup_task is not None:
+            if not warmup_task.done():
+                warmup_task.cancel()
+            try:
+                await warmup_task
+            except asyncio.CancelledError:
+                logger.info("cache_warmup_cancelled")
+            except Exception as exc:
+                # ``_warm_caches_in_background`` already swallows
+                # per-helper errors via ``return_exceptions=True``,
+                # so reaching this branch means something
+                # unexpected raised at the task scope (import
+                # error, logging failure, programmer bug).
+                logger.warning(
+                    "cache_warmup_task_raised",
+                    error_type=type(exc).__name__,
+                    error=str(exc),
+                )
         await typed_cache.close()
         logger.info("typed_cache_closed")
```
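The shutdown contract above — cancel if still running, then await so the task's outcome is consumed — can be exercised standalone. A minimal sketch where `slow_warmup` is a hypothetical stand-in for the real warmup body:

```python
import asyncio


async def slow_warmup() -> None:
    # Stands in for _warm_caches_in_background: long-running, cancellable.
    await asyncio.sleep(3600)


async def lifespan_exit() -> str:
    task = asyncio.create_task(slow_warmup(), name="katana_cache_warmup")
    await asyncio.sleep(0)  # let the task actually start running
    if not task.done():
        task.cancel()
    try:
        # Awaiting consumes the task's outcome, so asyncio never emits a
        # "Task exception was never retrieved" warning at interpreter exit.
        await task
    except asyncio.CancelledError:
        return "cache_warmup_cancelled"
    except Exception as exc:
        return f"cache_warmup_task_raised: {type(exc).__name__}"
    return "cache_warmup_complete"


outcome = asyncio.run(lifespan_exit())
assert outcome == "cache_warmup_cancelled"
```

Because only the warmup task (not the awaiter) was cancelled, catching `CancelledError` here is safe and does not interfere with the event loop's own cancellation machinery.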

katana_mcp_server/tests/conftest.py

Lines changed: 30 additions & 0 deletions
```diff
@@ -1,12 +1,42 @@
 """Shared pytest fixtures for MCP server tests."""

 import os
+from collections.abc import Iterator
 from unittest.mock import AsyncMock, MagicMock, patch

 import pytest
 import pytest_asyncio


+@pytest.fixture(autouse=True)
+def _disable_cache_warmup_by_default(
+    request: pytest.FixtureRequest,
+) -> Iterator[None]:
+    """Skip the background cache warm-up unless a test explicitly opts in.
+
+    ``server.lifespan()`` schedules an ``asyncio.create_task`` that fans
+    out ``ensure_*_synced`` calls (closes #500 / #593). Tests that don't
+    intentionally exercise warmup don't want it kicking off against a
+    mock client — it produces noisy logs, racy task lifecycles, and
+    unintentional coverage of unrelated code. Tests that want to verify
+    warmup behavior opt in with ``@pytest.mark.cache_warmup_enabled``.
+    """
+    if "cache_warmup_enabled" in request.keywords:
+        # Test wants the real warmup path; let MCP_DISABLE_CACHE_WARMUP
+        # flow through unchanged (or be set by the test itself).
+        yield
+        return
+    prev = os.environ.get("MCP_DISABLE_CACHE_WARMUP")
+    os.environ["MCP_DISABLE_CACHE_WARMUP"] = "1"
+    try:
+        yield
+    finally:
+        if prev is None:
+            os.environ.pop("MCP_DISABLE_CACHE_WARMUP", None)
+        else:
+            os.environ["MCP_DISABLE_CACHE_WARMUP"] = prev
+
+
 @pytest_asyncio.fixture
 async def katana_context():
     """Create a mock context for integration tests that uses real KatanaClient.
```
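The fixture's save/restore discipline generalizes to any environment variable. A minimal sketch of the same pattern as a context manager (`env_override` is a hypothetical helper name, not part of the repo):

```python
import os
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def env_override(name: str, value: str) -> Iterator[None]:
    # Set `name`, then restore its exact prior state on exit,
    # distinguishing "previously unset" from "previously set".
    prev = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if prev is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = prev


os.environ.pop("MCP_DISABLE_CACHE_WARMUP", None)  # start from "unset"
with env_override("MCP_DISABLE_CACHE_WARMUP", "1"):
    inside = os.environ["MCP_DISABLE_CACHE_WARMUP"]
after = os.environ.get("MCP_DISABLE_CACHE_WARMUP")
assert inside == "1" and after is None
```

The try/finally around the yield matters: without it, a test that raises mid-body would leak the override into every subsequent test in the session.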

0 commit comments