
Commit 01443c4

docs(05-04): add v0.2.0 changelog entry for MCP response caps
- New '## v0.2.0 — unreleased' section above v0.1.5.
- Added: [supamem.mcp.caps] config table with all three keys + defaults (max_top_k=25, max_query_chars=250, max_preview_chars=200), plus the additive response-shape fields (Chunk.preview, SearchResult.clamped_to, ⚠️ summary_md clamp line) and the supamem doctor surfacing.
- Changed: query-length enforcement now config-driven at the schema boundary; MAX_QUERY_LEN=4096 internal constant removed.
- ⚠️ Behavior change call-out: default 250 is dramatically lower than the old 4096; documented the .supamem/config.toml override snippet and the rationale (token economy for retrieval keys; long contexts belong on the write path).
- Notes: pyproject.toml not bumped this phase; README + translations intentionally untouched per PUB-05 / Phase 13 bench gate.
1 parent 4fe20d8 commit 01443c4

1 file changed

Lines changed: 69 additions & 0 deletions


CHANGELOG.md

@@ -2,6 +2,75 @@

All notable changes to `supamem` will be documented in this file.

## v0.2.0 — unreleased

First milestone of the v0.2.0 token-economy line. Phase 5 ships server-side
hard caps on every MCP retrieval response so agents can't blow their context
budget by accident, and so callers can detect when the server clamped a
request. No upstream phase dependencies; additive at the schema layer.

### Added

- New `[supamem.mcp.caps]` TOML config table with three keys:
  - `max_top_k` (default: **25**): silently clamps requested `top_k` on every
    retrieval call; the response carries `SearchResult.clamped_to` so callers
    can detect it.
  - `max_query_chars` (default: **250**): enforced via Pydantic
    `Field(max_length=...)` baked into the MCP tool schema at registration
    time; over-cap queries fail at the schema boundary as a structured MCP
    validation error (no silent truncation, no stdout pollution).
  - `max_preview_chars` (default: **200**): display preview cap applied to
    `Chunk.preview` on each hit. The full canonical payload in `Chunk.text`
    is **never** truncated.
- New `Chunk.preview: str` field on MCP search responses: a display-only
  excerpt of `Chunk.text`, capped at `max_preview_chars`. Existing
  `Chunk.text` consumers see byte-identical full payloads (backward-compat).
- New top-level `SearchResult.clamped_to: Optional[int]` field, set to the
  effective cap when the server clamped requested `top_k`; `None` otherwise.
- `summary_md` rendering now includes a `⚠️` warning line on clamp events
  (D-14): `⚠️ Clamped \`top_k\`: {requested} → {N} (raise mcp.caps.max_top_k)`.
- `supamem doctor` surfaces all three cap values in a dedicated **MCP caps**
  section with config-source attribution (`[source: default|user|project]`).
- `qdrant_find` alias inherits identical caps and response shape via shared
  closure-captured locals, so alias drift is impossible by construction (D-17).
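
The clamp and preview behavior described in the list above can be sketched roughly as follows. This is a minimal illustration, not the `supamem` implementation: the `chunks` field name, the `search` and `make_chunk` helpers, and the prefix-slice preview are assumptions made for the example. Only the defaults, the `Chunk.text` / `Chunk.preview` / `SearchResult.clamped_to` names, and the warning-line format come from the entry.

```python
from dataclasses import dataclass
from typing import Optional

# Defaults quoted in the changelog entry.
MAX_TOP_K = 25
MAX_PREVIEW_CHARS = 200


@dataclass
class Chunk:
    text: str     # full canonical payload, never truncated
    preview: str  # display-only excerpt of text


@dataclass
class SearchResult:
    chunks: list[Chunk]               # field name is an assumption
    clamped_to: Optional[int] = None  # effective cap if top_k was clamped


def clamp_top_k(requested: int, cap: int = MAX_TOP_K) -> tuple[int, Optional[int]]:
    """Return (effective_top_k, clamped_to); clamped_to is None without a clamp."""
    if requested > cap:
        return cap, cap
    return requested, None


def make_chunk(text: str, max_preview_chars: int = MAX_PREVIEW_CHARS) -> Chunk:
    """Build a hit whose preview is capped while text stays intact.

    Taking a plain prefix is an assumption; the entry only states the cap.
    """
    return Chunk(text=text, preview=text[:max_preview_chars])


def clamp_warning(requested: int, cap: int) -> str:
    """Warning line in the format quoted in the entry (D-14)."""
    return f"⚠️ Clamped `top_k`: {requested} → {cap} (raise mcp.caps.max_top_k)"


def search(hits: list[str], top_k: int) -> SearchResult:
    """Hypothetical stand-in for a retrieval call over pre-ranked hits."""
    effective, clamped_to = clamp_top_k(top_k)
    return SearchResult(
        chunks=[make_chunk(t) for t in hits[:effective]],
        clamped_to=clamped_to,
    )
```

A caller that asks for `top_k=100` would get at most 25 hits back and could check `result.clamped_to is not None` to learn that the server lowered the request.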

### Changed

- Query-length enforcement is now **config-driven** at the MCP schema
  boundary. The previous internal `MAX_QUERY_LEN = 4096` constant in
  `src/supamem/mcp_server.py` has been removed; the cap lives at
  `cfg.mcp_caps_max_query_chars` and is baked into the tool's JSON Schema
  at registration time so MCP clients (Cursor, Claude Code) see the limit
  at tool-discovery time.
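
To illustrate what "baked into the tool's JSON Schema at registration time" can look like, here is a dependency-free sketch. The entry says the real code uses Pydantic `Field(max_length=...)`; this example instead builds the schema by hand with the standard JSON Schema `maxLength` keyword. The `build_search_schema` and `register_tool` helpers and the error payload shape are hypothetical.

```python
def build_search_schema(max_query_chars: int) -> dict:
    """JSON Schema for a hypothetical search tool, with the cap baked in."""
    return {
        "type": "object",
        "properties": {
            "query": {"type": "string", "maxLength": max_query_chars},
            "top_k": {"type": "integer", "minimum": 1},
        },
        "required": ["query"],
    }


def register_tool(max_query_chars: int):
    """Capture the configured cap once and return (schema, handler).

    Clients that read the schema at discovery time see the same limit
    that the handler enforces, because both come from one value.
    """
    schema = build_search_schema(max_query_chars)
    limit = schema["properties"]["query"]["maxLength"]

    def handler(query: str, top_k: int = 5) -> dict:
        if len(query) > limit:
            # Structured error instead of silent truncation.
            return {
                "ok": False,
                "error": {
                    "field": "query",
                    "max_length": limit,
                    "actual_length": len(query),
                },
            }
        return {"ok": True, "query": query, "top_k": top_k}

    return schema, handler
```

Because the limit is read from config once at registration, changing `max_query_chars` would take effect the next time the tool is registered rather than per request, at least in this sketch.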
### ⚠️ Behavior change — review before upgrading

The default `max_query_chars` is **250**, roughly 16 times lower than the
previous internal `MAX_QUERY_LEN = 4096`. Agents (or callers) submitting queries
longer than 250 characters now receive a structured MCP validation error
instead of succeeding. If your workflow legitimately needs longer queries
(long natural-language prompts, embedded code excerpts, paragraph seeds),
raise the cap explicitly in your project config:

```toml
# .supamem/config.toml
[supamem.mcp.caps]
max_query_chars = 4096 # restore v0.1.x behavior
```

The new default is calibrated for token economy on small focused queries,
which is the intended retrieval-key shape. Long contexts belong in the
ingestion path (write to `dual_memory_write`), not in retrieval queries.

### Notes

- This is one phase of the v0.2.0 milestone; `pyproject.toml` is not bumped
  here. The version bump lands at the milestone Definition-of-Done point.
- README.md and the four translations (`README.{zh-CN,es,ja,ru}.md`) are
  intentionally untouched in this entry per PUB-05: README updates are
  gated on Phase 13 bench validation so the user-facing narrative ships
  with measured numbers, not pre-bench claims.

## v0.1.5 — 2026-04-29

`supamem install --client claude-code` now wires the **SessionStart banner**
