
feature(api): cache proposal data model + service + MCP propose tools #134

Merged: jamby77 merged 58 commits into master from feature/cache-proposal-data-model, May 5, 2026.
Conversation

@jamby77 (Collaborator) commented Apr 27, 2026

Summary

Combined Days 1–3 of the cache intelligence plan — originally split as #134 + #135, now folded into one PR. Specs: spec-cache-proposal-data-model.md, spec-cache-proposal-service-and-propose-tools.md.

Day 1 — data model

  • cache_proposals and cache_proposal_audit tables on Postgres + SQLite + memory adapters with idempotent migrations.
  • Both indexes required by the spec: cache_proposals(connection_id, status, proposed_at DESC) and the partial (connection_id, cache_name, proposal_type) WHERE status = 'pending', plus a partial expires_at index for the expiry cron.
  • Shared types live in packages/shared/src/utils/cache-proposals.ts, derived via z.infer from Zod schemas. Schemas own dialect quirks (BIGINT-as-string from pg, JSON-as-text from SQLite) via preprocessors, so adapter mappers parse rows directly with StoredCacheProposalSchema.parse(row).
  • Storage updateCacheProposalStatus validates any proposal_payload override against the existing row's (cache_type, proposal_type) via the variantPayloadSchemaFor helper — prevents poisoning a row with a payload shape that doesn't match its discriminator.
  • SQLite expiry uses single-statement UPDATE ... RETURNING * for atomic + race-safe expiry.
  • Scoped by connection_id (matches every other table in the schema) instead of the tenant_id mentioned in the issue spec — no tenant concept in this codebase yet.
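The preprocessor idea can be sketched without Zod; the coercions below are a dependency-free stand-in for what the real schemas in packages/shared/src/utils/cache-proposals.ts absorb, so row mappers need no per-dialect branches (field names are illustrative):

```typescript
// Illustrative only: mimics the dialect quirks the Zod preprocessors handle.
interface StoredCacheProposal {
  id: number;                 // pg returns BIGINT as a string
  proposal_payload: unknown;  // SQLite stores JSON as text
  status: string;
}

// BIGINT-as-string (pg) -> number; already-numeric values pass through.
function coerceBigint(v: unknown): number {
  if (typeof v === "number") return v;
  if (typeof v === "string" && /^\d+$/.test(v)) return Number(v);
  throw new Error(`not a BIGINT value: ${String(v)}`);
}

// JSON-as-text (SQLite) -> object; pg's json/jsonb arrives pre-parsed.
function coerceJson(v: unknown): unknown {
  return typeof v === "string" ? JSON.parse(v) : v;
}

// One mapper handles rows from either dialect.
function parseRow(row: Record<string, unknown>): StoredCacheProposal {
  return {
    id: coerceBigint(row.id),
    proposal_payload: coerceJson(row.proposal_payload),
    status: String(row.status),
  };
}

const fromPg = parseRow({ id: "42", proposal_payload: { threshold: 0.8 }, status: "pending" });
const fromSqlite = parseRow({ id: 42, proposal_payload: '{"threshold":0.8}', status: "pending" });
```

In the real code the same normalization lives inside the schema, so `StoredCacheProposalSchema.parse(row)` is the whole mapper.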

Day 2–3 — service + MCP propose tools

  • CacheProposalService validates input per (cache_type, proposal_type), checks duplicate-pending via the storage port, and enforces a per-connection sliding-window 30/hour rate limit. Releases the slot on storage failure so transient DB errors don't permanently consume capacity.
  • CacheResolverService reads HGETALL __betterdb:caches via ConnectionRegistry to look up cache_name → cache_type; it is the first real consumer of the discovery-marker protocol from PRs #127 (semantic-cache, 0.2.0) and #128 (agent-cache, 0.5.0). Results are cached in memory for 30s.
  • Typed domain errors (CacheProposalValidationError, InvalidCacheTypeError, CacheNotFoundError, DuplicatePendingProposalError, RateLimitedError) map to HTTP 400/404/409/429 at the controller. ZodError from service-layer parses also maps to 400 with structured issues.
  • Three new MCP tools (cache_propose_threshold_adjust, cache_propose_tool_ttl_adjust, cache_propose_invalidate) wired in packages/mcp/src/index.ts. Each is a thin wrapper over a new /mcp/instance/:id/cache-proposals/... endpoint on McpController.
  • Validation enforces: threshold 0..2, ttl 10..86400, reasoning ≥20 chars; estimated_affected > 10000 produces a warning, not a rejection.
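A minimal sketch of those validation rules; the function and field names are hypothetical, not the actual service API, but the thresholds match the bullet above:

```typescript
// Sketch: errors reject the proposal, warnings do not.
interface ValidationResult { ok: boolean; errors: string[]; warnings: string[] }

function validateProposalInput(input: {
  threshold?: number;
  ttl_seconds?: number;
  reasoning: string;
  estimated_affected?: number;
}): ValidationResult {
  const errors: string[] = [];
  const warnings: string[] = [];
  if (input.threshold !== undefined && (input.threshold < 0 || input.threshold > 2)) {
    errors.push("threshold must be in 0..2");
  }
  if (input.ttl_seconds !== undefined && (input.ttl_seconds < 10 || input.ttl_seconds > 86400)) {
    errors.push("ttl_seconds must be in 10..86400");
  }
  if (input.reasoning.trim().length < 20) {
    errors.push("reasoning must be at least 20 characters");
  }
  // Large blast radius is surfaced as a warning, never a rejection.
  if ((input.estimated_affected ?? 0) > 10000) {
    warnings.push("estimated_affected exceeds 10000; double-check the filter");
  }
  return { ok: errors.length === 0, errors, warnings };
}
```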

Discovery-marker dependency: end-to-end use needs PRs #127 and #128 merged so caches actually write the marker hash. Service tests mock the resolver.
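The resolver's lookup-plus-cache shape, sketched with a synchronous stand-in for the Valkey client and the shorter negative TTL a later commit introduces (class, parameter names, and the simplified marker value are all illustrative):

```typescript
// Illustrative resolver: cache_name -> cache_type via the discovery-marker
// hash. The hgetall callback stands in for the client obtained through
// ConnectionRegistry; marker values are reduced to the bare cache_type.
type CacheType = "semantic_cache" | "agent_cache";

class CacheResolverSketch {
  private hits = new Map<string, { type: CacheType | null; at: number }>();

  constructor(
    private hgetall: () => Record<string, string>, // HGETALL __betterdb:caches
    private ttlMs = 30_000,        // positive-hit TTL
    private negativeTtlMs = 2_000, // misses expire faster
    private now: () => number = Date.now,
  ) {}

  resolve(cacheName: string): CacheType | null {
    const hit = this.hits.get(cacheName);
    const ttl = hit?.type === null ? this.negativeTtlMs : this.ttlMs;
    if (hit && this.now() - hit.at < ttl) return hit.type;
    const markers = this.hgetall();
    const type = (markers[cacheName] as CacheType | undefined) ?? null;
    this.hits.set(cacheName, { type, at: this.now() });
    return type;
  }
}
```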

Test plan

  • pnpm --filter api test -- --testPathPatterns="cache-proposals|rate-limiter" — 46 tests pass (full 12-case matrix from the propose-tools spec, plus storage CRUD, plus rate-limiter unit tests including release-on-failure)
  • pnpm --filter api exec tsc --noEmit
  • pnpm --filter @betterdb/shared build
  • pnpm --filter @betterdb/mcp build
  • After PRs #127 (semantic-cache discovery marker protocol, 0.2.0) and #128 (agent-cache discovery marker protocol, 0.5.0) merge: integration test that walks propose → list pending → expire against real Valkey + real Postgres
  • Reviewer: confirm connection_id (vs tenant_id per spec) is the right call given Monitor is single-tenant for now

Note

High Risk
Adds new persistent data model and write paths (new cache_proposals/cache_proposal_audit tables plus adapter methods) and wires a new web review/approve flow, so migration and status-transition logic bugs could affect production data and operations.

Overview
Introduces a cache proposal persistence model end-to-end: StoragePort gains CRUD/status/audit methods for proposals, and the Postgres/SQLite/memory adapters implement them with new tables, indexes, uniqueness constraints (including NULL-safe discriminators), payload validation, and expiry updates.

Adds MCP/controller utilities (extracted query parsing/validation helpers) and loads an optional proprietary CacheProposalsModule at runtime.

Extends the web app with a Cache Proposals page (pending + history + detail drawer), approval/reject/edit-and-approve actions via a new cacheProposalsApi, and a sidebar unread badge backed by a polling hook with localStorage “last seen” tracking.

Updates the Python betterdb-agent-cache package to support the workflow via periodic tool-policy refresh (atomic swap + failure metrics) and a discovery marker protocol (registry + heartbeat + collision detection), including new options/types, telemetry counters, examples, and expanded tests.

Reviewed by Cursor Bugbot for commit 9249a97.

Postgres + SQLite migrations for cache_proposals and
cache_proposal_audit, scoped by connection_id. Includes the
two indexes required for tenant-status-proposed ordering and
the partial pending-lookup index, plus an expires_at index for
the expiry cron.

Shared types are derived from Zod schemas (utils/cache-proposals)
with preprocessors for BIGINT and JSON columns, so adapter row
mappers parse rows directly with no dialect handling at the call
site. Storage port methods cover create, get, list, status update,
expiry, and audit append/read; memory adapter mirrors them for
unit tests.

SQLite expiry runs in a transaction with status re-check on UPDATE
to avoid races against concurrent approvals.
- Memory adapter now structuredClones the input on create, update,
  and audit append, so caller mutations after the call don't leak
  into stored state. Read paths also deep-clone via structuredClone
  instead of a shallow spread, matching the deep behavior the rest
  of the codebase assumes.
- Agent invalidate discriminated-union test now throws on narrowing
  failure to match the semantic-invalidate test (was silently green
  if the narrow ever broke).
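The clone-on-write/clone-on-read discipline can be illustrated with a toy store; the class and method names are hypothetical:

```typescript
// Sketch: deep-clone at both boundaries so neither caller mutations after
// the call nor mutations of returned rows can leak into stored state.
class MemoryProposalStore {
  private rows = new Map<number, { id: number; proposal_payload: Record<string, unknown> }>();

  create(row: { id: number; proposal_payload: Record<string, unknown> }): void {
    // Clone on write: the caller keeps its own object.
    this.rows.set(row.id, structuredClone(row));
  }

  get(id: number) {
    const row = this.rows.get(id);
    // Clone on read: a shallow spread would still share proposal_payload.
    return row ? structuredClone(row) : null;
  }
}
```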
jamby77 added a commit that referenced this pull request Apr 27, 2026
CacheProposalService with type-specific validation, duplicate-pending
rejection, and per-connection 30/hour rate limit. Resolves caches via
HGETALL __betterdb:caches (discovery markers from PRs #127/#128).
Typed errors (validation, invalid cache type, not found, duplicate,
rate limited) are mapped to HTTP status codes at the MCP controller.

Three MCP tools registered:
- cache_propose_threshold_adjust (semantic_cache only, 0..2 range)
- cache_propose_tool_ttl_adjust  (agent_cache only, 10..86400 range)
- cache_propose_invalidate       (filter_kind discriminates by type;
                                  warns when estimated_affected > 10000)

All proposals start as 'pending' with a 24h expiry, awaiting human
approval (Day 5 spec). The module imports StorageModule for proposal
CRUD and ConnectionsModule for the Valkey client used to read markers.

Stacks on feature/cache-proposal-data-model (PR #134).
jamby77 added 6 commits April 27, 2026 12:12
- Storage updateCacheProposalStatus now validates any
  proposal_payload override against the existing row's
  (cache_type, proposal_type) via the new
  variantPayloadSchemaFor helper in @betterdb/shared. Prevents
  rows from being poisoned with a payload shape that doesn't
  match the discriminator. Throws ZodError on mismatch (mapped
  to HTTP 400 by the controller).
- Empty options.status array no longer produces "status IN ()"
  invalid SQL; short-circuits to an empty result and skips the
  filter when status is undefined.
- SQLite expireCacheProposalsBefore uses RETURNING * in a single
  UPDATE, eliminating the candidate+update dance that could
  return rows already expired before the call.
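The single-statement shape, alongside an in-memory equivalent of the same atomic select-and-flip (the column list and names are illustrative; the SQL string is a sketch, not the adapter's exact statement):

```typescript
// One statement selects and updates atomically, so the returned rows are
// exactly the ones this call flipped; no candidate query, no race window.
const EXPIRE_SQL = `
  UPDATE cache_proposals
     SET status = 'expired', updated_at = ?
   WHERE status = 'pending' AND expires_at <= ?
  RETURNING *`;

// Memory-adapter equivalent: one pass, inclusive boundary, returns clones
// of only the rows flipped by this call.
interface Row { id: number; status: string; expires_at: number }

function expireBefore(rows: Row[], now: number): Row[] {
  const expired: Row[] = [];
  for (const row of rows) {
    if (row.status === "pending" && row.expires_at <= now) {
      row.status = "expired";
      expired.push(structuredClone(row));
    }
  }
  return expired;
}
```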
- CacheResolverService no longer takes ttlMs/now as constructor
  parameters; NestJS was attempting to resolve them as Number/Function
  injection tokens and crashing on module init. ttlMs and now are now
  fields with defaults, settable via configureForTesting() in tests.
- Rate limiter now exposes a single atomic reserve() that combines
  check+record, and CacheProposalService.persist() calls it before
  the storage write. Previously check() and record() were separated
  by an awaited DB call, allowing concurrent requests to overshoot
  the limit by 1-2.
- ZodError thrown from service-layer schema parses now maps to HTTP
  400 with structured `issues` instead of falling through to a 500.
- Rate limiter check() returned remaining = limit - events - 1,
  understating available slots by one. Corrected to limit - events;
  reserve() still records exactly one event per allowed call.
- "Does not count proposals against other connections" test was
  misnamed: it just retested the limit on the same connection_id.
  Now exhausts the limit on CONNECTION_ID, then proposes against
  OTHER_CONNECTION_ID and asserts the proposal succeeds, verifying
  per-connection isolation.
reserve() previously returned the remaining count from check(),
which is the pre-record count. For an allowed reservation that
records one event, the returned remaining was off by one. Now
decrements remaining by one (clamped at zero) when allowed.
Previously a transient storage failure (DB hiccup, connection loss)
would consume a rate-limit slot without producing a proposal,
prematurely exhausting the per-connection 30/hour budget. The
service now wraps the storage call in try/catch and calls
rateLimiter.release() on failure to free the slot. Concurrency
safety is preserved because the reservation is still made before
the awaited write, so concurrent callers can't overshoot the limit.
@jamby77 jamby77 changed the title from "feature(api): cache proposal data model (Day 1)" to "feature(api): cache proposal data model + service + MCP propose tools (Days 1–3)" Apr 27, 2026
@jamby77 jamby77 requested a review from KIvanow April 27, 2026 10:38
@jamby77 jamby77 changed the title from "feature(api): cache proposal data model + service + MCP propose tools (Days 1–3)" to "feature(api): cache proposal data model + service + MCP propose tools" Apr 27, 2026
- Rate limiter reservations now carry a unique releaseToken; the
  service passes the token to release(key, token) on storage
  failure, freeing the exact reservation rather than whichever
  event was newest. Two concurrent reservations, one of which fails,
  no longer corrupt the bucket.
- Storage updateCacheProposalStatus accepts an optional
  expected_status filter, applied as WHERE status IN (...). If
  the row is no longer in an allowed state the update is a no-op
  and returns null. Prevents a stale approve from resurrecting an
  expired/applied/rejected proposal.
- Unique partial indexes on cache_proposals enforce per-pending
  uniqueness for threshold_adjust (by category, NULL = global)
  and tool_ttl_adjust (by tool_name) at the DB layer, closing the
  TOCTOU between rejectIfDuplicatePending and the insert. The
  service catches the resulting unique-constraint violation
  (Postgres 23505 / SQLite SQLITE_CONSTRAINT*) and surfaces it as
  the same DuplicatePendingProposalError.
- CacheResolverService caches negative lookups for 2s instead of
  30s so a propose call right after marker registration recovers
  quickly instead of waiting out the full positive TTL.
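The limiter's token-tracked reserve/release flow, per the bullets above, can be sketched as follows (all names are illustrative; the real implementation lives in proprietary/cache-proposals/rate-limiter.ts):

```typescript
// Sliding-window limiter sketch. reserve() is an atomic check-and-record
// called before the awaited storage write, so concurrent callers can't
// overshoot; release(key, token) frees the exact reservation on failure.
class SlidingWindowLimiter {
  private events = new Map<string, { at: number; token: string }[]>();
  private seq = 0;

  constructor(
    private limit = 30,
    private windowMs = 3_600_000,
    private now: () => number = Date.now,
  ) {}

  reserve(key: string): { allowed: boolean; remaining: number; token?: string } {
    const cutoff = this.now() - this.windowMs;
    const bucket = (this.events.get(key) ?? []).filter((e) => e.at > cutoff);
    if (bucket.length >= this.limit) {
      this.events.set(key, bucket);
      return { allowed: false, remaining: 0 };
    }
    const token = `t${++this.seq}`;
    bucket.push({ at: this.now(), token });
    this.events.set(key, bucket);
    // remaining reflects the slot just consumed, clamped at zero.
    return { allowed: true, remaining: Math.max(0, this.limit - bucket.length), token };
  }

  // Removes the exact reservation, not whichever event was newest.
  release(key: string, token: string): void {
    const bucket = this.events.get(key);
    if (bucket) this.events.set(key, bucket.filter((e) => e.token !== token));
  }
}
```

The service would refund via `release` only on non-duplicate storage failures, per the later fix that stops 409 spam from consuming zero rate-limit budget.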
jamby77 added 2 commits April 27, 2026 14:13
Adds packages/shared/src/utils/discovery-protocol.ts exporting
REGISTRY_KEY, PROTOCOL_KEY, HEARTBEAT_KEY_PREFIX, the protocol
version, and a heartbeatKeyFor() helper. CacheResolverService now
imports these instead of redefining the literals locally.

The cache packages (@betterdb/semantic-cache, @betterdb/agent-cache)
intentionally keep their own copies — they ship to npm and don't
depend on @betterdb/shared. Names match (REGISTRY_KEY,
HEARTBEAT_KEY_PREFIX) so the duplication is reconcile-by-grep
rather than reconcile-by-rename.
Adds two as-const string constants to @betterdb/shared, exported next
to CacheTypeSchema:

  export const SEMANTIC_CACHE = 'semantic_cache' as const;
  export const AGENT_CACHE   = 'agent_cache'    as const;

Replaces runtime string literals in CacheResolverService and
CacheProposalService with these constants. Type-position usages
(interface fields, type unions) switch to the existing CacheType
alias instead of repeating the literal union inline.

The cache packages keep their own per-package CACHE_TYPE constant
because they don't depend on @betterdb/shared.
jamby77 added 2 commits April 27, 2026 14:50
- updateCacheProposalStatus now treats expected_status: [] the same
  as listCacheProposals does — short-circuits to null instead of
  generating "status IN ()" invalid SQL on Postgres + SQLite or
  silently returning null via [].includes() on the memory adapter.
- rejectIfDuplicatePending no longer caps at the 100-row default
  page size of listCacheProposals. It now pages through pending
  proposals (200 per page, up to 10 pages = 2000) so a connection
  with many pending proposals across distinct categories can't
  hide a real duplicate from the pre-check. The DB unique partial
  indexes still backstop the pre-check on insert.
- isUniqueConstraintViolation no longer matches every SQLite error
  code starting with SQLITE_CONSTRAINT (which includes CHECK,
  NOTNULL, FOREIGNKEY, etc.). Only SQLITE_CONSTRAINT_UNIQUE and
  SQLITE_CONSTRAINT_PRIMARYKEY count as duplicate-pending now;
  other constraint failures bubble up as the original error.
…s duplicate

bugbot flagged that the partial unique indexes on cache_proposals used
JSON-extracted category / tool_name directly. Both Postgres and SQLite
treat NULL as distinct in unique indexes, so two pending threshold-adjust
proposals with category=null on the same (connection, cache) would both
land — bypassing the DB-level dedup.

Wrap the extracts in COALESCE to a sentinel string so NULLs collide.
Drop the old indexes first so existing dev DBs pick up the new shape.

Add a regression test confirming a second pending threshold_adjust with
category=null is rejected by the DB.
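The COALESCE fix in sketch form. The index name and sentinel string are hypothetical; the `dedupKey` function restates the same collision rule the index enforces at the DB layer:

```typescript
// Illustrative DDL (Postgres flavor, _v2 name per the later migration fix).
// Unique indexes treat NULL as distinct, so two pending global
// (category = NULL) proposals would otherwise both insert; COALESCE to a
// sentinel makes the NULLs collide.
const IDX_PENDING_THRESHOLD_V2 = `
  CREATE UNIQUE INDEX IF NOT EXISTS uq_pending_threshold_adjust_v2
  ON cache_proposals (
    connection_id,
    cache_name,
    COALESCE(proposal_payload->>'category', '__global__')
  )
  WHERE status = 'pending' AND proposal_type = 'threshold_adjust'`;

// The same rule as a pure function: NULL categories map to one sentinel key.
function dedupKey(connectionId: string, cacheName: string, category: string | null): string {
  return [connectionId, cacheName, category ?? "__global__"].join("|");
}
```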
…s across adapters

bugbot LOW: Postgres + SQLite parsed proposal_payload via Zod before
checking expected_status, while the memory adapter checked
expected_status first. A stale write that updated both fields and
failed the status precondition would surface as a ZodError on SQL but
return null on memory — inconsistent client-visible behavior.

Read the existing row once when either field is provided, run the
expected_status guard before the variant payload parse.
… on dup-pending

bugbot MEDIUM (1): the previous fix unconditionally ran DROP INDEX +
CREATE UNIQUE INDEX on every API startup, taking ACCESS EXCLUSIVE on
cache_proposals each time and rebuilding the index — a recurring write
outage proportional to row count. Rename the COALESCE-fixed indexes to
_v2 and keep CREATE IF NOT EXISTS, so first-startup migrates dev DBs
once and subsequent startups skip both DROP and CREATE.

bugbot MEDIUM (2): persist() refunded the rate-limit slot on every
storage error, including unique-constraint violations turned into
DuplicatePendingProposalError. A client guessing an existing pending
(cache, scope) could spam the endpoint with 409s at zero rate-limit
cost. Refund only on non-unique storage failures.
The apply dispatcher writes runtime threshold overrides to fields
'threshold' and 'threshold:<category>' on {prefix}:__config. The
reader was only checking the SDK-published baseline fields
(default_threshold and category_thresholds JSON), so after any
applied threshold proposal, subsequent proposals reported the
original baseline as 'current_threshold' instead of the actually
effective value. Read overrides first, fall back to baseline.
Match the structuredClone used in createCacheProposal /
updateCacheProposalStatus so expireCacheProposalsBefore doesn't share
mutable proposal_payload / applied_result references with the
caller's previously-returned copies.
Replace the fall-through after the threshold_adjust / tool_ttl_adjust
branches with a thrown ProposalEditNotAllowedError. Previously, an
unhandled proposal_type would have left newPayload undefined and the
storage call would silently skip the payload update — approving the
proposal without applying the requested edit.
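The exhaustiveness fix can be sketched as a switch whose fall-through throws; payload shapes are simplified and only the error name comes from the commit:

```typescript
// An unhandled proposal_type now fails loudly instead of leaving the edited
// payload undefined and silently approving without the edit.
type ProposalType = "threshold_adjust" | "tool_ttl_adjust" | "invalidate";

class ProposalEditNotAllowedError extends Error {}

function editedPayload(
  type: ProposalType,
  edit: { new_threshold?: number; new_ttl_seconds?: number },
): Record<string, number> {
  switch (type) {
    case "threshold_adjust":
      return { threshold: edit.new_threshold! };
    case "tool_ttl_adjust":
      return { ttl_seconds: edit.new_ttl_seconds! };
    default:
      // Reaching here means a proposal_type with no edit semantics
      // (e.g. invalidate): reject rather than approve-without-edit.
      throw new ProposalEditNotAllowedError(`edit not allowed for ${type}`);
  }
}
```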
jamby77 added 4 commits April 29, 2026 12:02
Memory/SQLite/Postgres expireCacheProposalsBefore all use
expires_at <= now (inclusive), so a proposal at exactly now is
expired by the cron. The service-layer guards in transitionToApproved
and requireFreshPending used strict <, leaving a one-tick window
where the cron treats the proposal as expired but approve/reject
treats it as still valid. Switch the service to <= to match.
…sal-data-model

# Conflicts:
#	apps/api/src/storage/adapters/memory.adapter.ts
#	packages/shared/src/index.ts
…onfig

- check()/checkBatch() now read HGETALL {prefix}:__config (5s in-process
  cache) and honor 'threshold' and 'threshold:{category}' fields as runtime
  overrides on top of constructor categoryThresholds
- Resolution order: options.threshold > runtime category > runtime global
  > constructor categoryThresholds > defaultThreshold
- Read failures fall back to constructor (warn-logged); out-of-range values
  (<0, >2, NaN) are dropped
- Advertise 'threshold_adjust' in the discovery marker's capabilities so
  Monitor's apply dispatcher can write the config hash
- Bump to 0.4.0; CHANGELOG entry flags the {prefix}:__config behavior change
- cache_list_pending_proposals, cache_get_proposal, cache_approve_proposal,
  cache_reject_proposal, cache_edit_and_approve_proposal
- Each wraps the corresponding pre-existing endpoint on McpController
  (apps/api/src/mcp/mcp.controller.ts:688-781), which sets actorSource='mcp'
  on every approval/reject/edit call
- Bumps @betterdb/mcp to 1.2.0
@KIvanow (Member) left a comment:


Move this to the proprietary folder as part of the pro feature set @jamby77

Overall it looks very good! I'll open a PR or two for updating the semantic and agent cache libs

- README adds a 'Cache Intelligence Tools' section covering all 14 cache
  tools (6 read-only, 3 propose, 5 approval) plus two example prompts
- server.json version bumped 1.0.0 -> 1.2.0 to match package.json (the
  registry manifest doesn't enumerate tools; tools/list is exposed via
  the MCP protocol from server.tool registrations)
@jamby77 (Collaborator, Author) commented May 4, 2026

> Move this to the proprietary folder as part of the pro feature set @jamby77
>
> Overall it looks very good! I'll open a PR or two for updating the semantic and agent cache libs

The entire feature?

@KIvanow (Member) commented May 4, 2026

> Move this to the proprietary folder as part of the pro feature set @jamby77
>
> Overall it looks very good! I'll open a PR or two for updating the semantic and agent cache libs
>
> The entire feature?

Yes. The entire tab should be only part of the pro tier. The MCP is basically a thin wrapper, so as long as the monitor has the proper checks and gating the MCP will inherit them

…tier

Per review feedback, cache intelligence becomes a Pro feature gated by a
new CACHE_INTELLIGENCE entitlement.

Backend:
- Move apps/api/src/cache-proposals/ -> proprietary/cache-proposals/
- Module becomes @global and is conditionally loaded via try/catch in
  app.module.ts, matching the inference-latency-pro pattern
- Extract MCP cache routes from mcp.controller.ts into a new proprietary
  CacheProposalMcpController so the routes simply don't exist when the
  proprietary module isn't loaded; community-tier deployments return 404
  on every cache endpoint and the MCP tools surface that to the agent
- HTTP and MCP controllers both gate on @UseGuards(LicenseGuard) +
  @RequiresFeature(Feature.CACHE_INTELLIGENCE), returning 402 when not
  entitled
- Extract shared MCP helpers (ValidateInstanceIdPipe, safeLimit, etc.)
  into apps/api/src/mcp/mcp-helpers.ts so both controllers share them
- Update internal imports in moved files to use @app/* aliases

Shared:
- Add Feature.CACHE_INTELLIGENCE under Tier.pro

Frontend:
- NavItem for /cache-proposals adds requiredFeature so the link locks
  for non-entitled users (matches Anomaly Detection pattern)
- usePendingProposals accepts an enabled flag; useCacheProposalsUnread
  short-circuits on non-entitled licenses so we don't poll every 15s and
  get 402s

Tests pass: api 1228/1235, web 175/175. No regressions vs. pre-move.
@jamby77 (Collaborator, Author) commented May 5, 2026

Done in 5e5d971. Whole feature moved to proprietary/cache-proposals/ and gated by a new Feature.CACHE_INTELLIGENCE under Tier.pro. MCP cache routes extracted from mcp.controller.ts into a separate proprietary controller so they 404 cleanly when the module isn't loaded; sidebar locks for non-entitled users. Tests green.

… mcp CHANGELOG

Closes the test gaps surfaced in the C5 audit:
- HistoryTable: Source column derivation from proposed_by prefix, empty
  state, cache_name filter wiring through to useHistoryProposals
- DetailPanel: full data render (cache header, reasoning, payload, apply
  result, audit trail), empty audit, loading and error branches
- useCacheProposalsUnread: entitlement gate skips polling, count when no
  lastSeenAt, markAllRead persists newest proposed_at and zeroes count

Also: add CHANGELOG.md to packages/mcp documenting 1.2.0 release with the
5 new cache-intelligence approval tools and their Pro tier requirement.

Web suite now: 187/187 (was 175).
#148)

* feat(cache-proposals): runtime config refresh for agent-cache and semantic-cache (TS + Python)

Implements the full propose→approve→apply→pickup loop so BetterDB Monitor
cache proposals take effect in running processes without a restart.

- Periodic refresh of `{name}:__tool_policies` (default 30 s, opt-out via
  `configRefresh: { enabled: false }`). First refresh fires synchronously on
  construction; subsequent ticks run on a `setInterval`.
- `ToolCache.refreshPolicies()` — atomic swap (clear + repopulate), returns
  bool. `loadPolicies()` now delegates to it; stale entries are evicted.
- New Prometheus counter `{prefix}_config_refresh_failed_total`.
- New `ConfigRefreshOptions` type exported from the package root.

- Periodic refresh of `{name}:__config` (same interval/opt-out pattern).
  Fields: `threshold` → `defaultThreshold`; `threshold:{cat}` →
  `categoryThresholds[cat]`. Constructor values are fallbacks when absent.
- `refreshConfig()` public method with per-field range validation (0–2).
- Adds `threshold_adjust` to the discovery capabilities array, unblocking
  `cache_propose_threshold_adjust` in Monitor.
- New `{prefix}_config_refresh_failed_total` counter.
- New `ConfigRefreshOptions` type exported from the package root.
- `escapeTag` exported from the package root (both TS and Python).

- Discovery marker protocol (0.5.0): registers `__betterdb:caches` entry
  and 30 s heartbeat on construction; `shutdown()` removes the heartbeat.
  New `DiscoveryOptions`, `{prefix}_discovery_write_failed_total` counter.
- Config refresh (0.6.0): `asyncio` task loop mirrors TS behaviour —
  first refresh before first sleep. `ToolCache.refresh_policies()` atomic
  swap. New `ConfigRefreshOptions`. `{prefix}_config_refresh_failed_total`.
- New `examples/monitor_proposals/main.py` demonstrating the full loop.
- Missing test coverage added: `refresh_policies()` (6 tests),
  `AgentCache` config refresh (6 tests + counter), `SessionStore.get_all()`,
  `destroy_thread()`, `scan_fields_by_prefix()` (13 tests).
- `aiohttp` declared as `[normalizer]` optional extra in `pyproject.toml`.

- Discovery marker protocol: registers on `initialize()`; capabilities
  include `['invalidate', 'similarity_distribution', 'threshold_adjust']`.
  Cross-type collision raises `SemanticCacheUsageError`. `flush()` stops the
  old manager before dropping the index (matches TS concurrency semantics).
  New `DiscoveryOptions`, `{prefix}_discovery_write_failed_total` counter.
- Config refresh: `asyncio` task loop, `refresh_config()` with field-level
  validation, constructor fallbacks, per-category support.
  New `ConfigRefreshOptions`. `{prefix}_config_refresh_failed_total`.
- New `examples/monitor_proposals/main.py` with deterministic content-word
  mock embedder (stopwords stripped, DJB2 hash, dim=64). Output is
  bit-for-bit identical to the TypeScript equivalent.
- `escape_tag` exported from the package root.
- New `test_config_refresh.py` (14 tests) and `test_discovery.py` (21 tests).

- `CacheApplyDispatcher.applySemanticInvalidate`: corrected FT index name
  from `{prefix}:__index` to `{prefix}:idx` (all semantic invalidation
  proposals were silently deleting 0 entries against a non-existent index).
- Dispatcher test `FakeClient.call()` now captures arguments so index name
  and filter expression can be asserted.
- New dispatcher contract tests: index name, filter forwarding, field format
  agreement between dispatcher writes and library reads.
- `cache-proposal.service.spec.ts`: `readCurrentThreshold` and
  `readCurrentTtl` tested with a fake registry, verifying the
  apply→re-propose cycle reads the dispatcher-written value.

* fix: address roborev findings (High + Medium + Low)

High — ensure_discovery_ready() hung indefinitely
  agent_cache.py: track the discovery registration in a dedicated
  _discovery_task field and await only that task in ensure_discovery_ready(),
  not all _background_tasks. The config-refresh loop is an infinite task that
  never completes on its own; gathering it blocked the caller permanently.

Medium — cache_edit_and_approve_proposal accepted both edit fields at once
  mcp/src/index.ts: add a mutual-exclusion guard that returns an error when
  both new_threshold and new_ttl_seconds are provided. The tool description
  says 'provide exactly one'; now the contract is enforced in code.
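A sketch of that guard; the function name is hypothetical, the argument names come from the tool schema, and only the both-provided case is rejected, matching the commit:

```typescript
// Mutual-exclusion guard: the tool description says 'provide exactly one',
// so supplying both new_threshold and new_ttl_seconds is an error.
function validateEditArgs(args: { new_threshold?: number; new_ttl_seconds?: number }): string | null {
  if (args.new_threshold !== undefined && args.new_ttl_seconds !== undefined) {
    return "provide exactly one of new_threshold or new_ttl_seconds";
  }
  return null; // null means the arguments pass the guard
}
```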

Low — DiscoveryOptions defined in two places (types.py and discovery.py)
  discovery.py: remove the duplicate @DataClass definition and import
  DiscoveryOptions from types.py, the single canonical location already
  re-exported by __init__.py.

Low — dead code in mock_embed()
  semantic-cache-py examples/monitor_proposals/main.py: the first words =
  list({...}) set-comprehension was immediately overwritten by the cleaned
  loop below it. Remove the dead first pass; keep only the strip-then-filter
  loop that produces the correct deduplicated word list.
jamby77 added a commit that referenced this pull request May 5, 2026
PR #134's earlier B3 commit added a 5s-TTL read-time override (HGETALL on
each check()) and PR #148's commit added a 30s background refresh that
mutates defaultThreshold/categoryThresholds in-place. Both read the same
{prefix}:__config hash; running both is duplicated work and the file
even ended up with a duplicate `private readonly configKey: string`
field declaration.

Keep the 30s background-refresh approach (cleaner lifecycle, opt-out
flag, prometheus counter, no per-call overhead) and delete the B3
machinery:

- Removes private fields thresholdOverrides, thresholdOverridesCachedAt,
  thresholdOverridesRefresh and the THRESHOLD_OVERRIDES_TTL_MS constant.
- Removes private helpers resolveThreshold, getThresholdOverrides,
  refreshThresholdOverrides.
- Restores check()/checkBatch() threshold resolution to the simple
  options.threshold > categoryThresholds[category] > defaultThreshold
  chain; refreshConfig() updates those mutable fields.
- Deletes runtime-threshold-overrides.test.ts (covered the deleted helpers).
- Removes the duplicate configKey field declaration and constructor
  assignment.
- CHANGELOG: drop the read-time-overrides bullet, expand the
  periodic-refresh bullet to spell out hash field semantics and the
  synchronous-first-tick guarantee, and reword the Behavior change note.

Tests: 128/128 pass. Trade-off: propagation goes from ~5s to ~30s
worst-case, which is acceptable given the human-approval flow upstream.
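The retained resolution chain and in-place refresh can be sketched as follows; the class name and defaults are illustrative, while the hash field names ('threshold', 'threshold:{cat}') and the 0–2 range check come from the commits above:

```typescript
// check()'s threshold resolution stays the simple chain:
//   options.threshold > categoryThresholds[category] > defaultThreshold
// and the 30s background refresh mutates the two fallback fields in place.
class ThresholdConfig {
  constructor(
    public defaultThreshold = 0.85,
    public categoryThresholds: Record<string, number> = {},
  ) {}

  resolve(category?: string, optionsThreshold?: number): number {
    if (optionsThreshold !== undefined) return optionsThreshold;
    if (category !== undefined && category in this.categoryThresholds) {
      return this.categoryThresholds[category];
    }
    return this.defaultThreshold;
  }

  // Applies one HGETALL snapshot of {prefix}:__config.
  refreshConfig(hash: Record<string, string>): void {
    for (const [field, raw] of Object.entries(hash)) {
      const v = Number(raw);
      if (Number.isNaN(v) || v < 0 || v > 2) continue; // out-of-range values dropped
      if (field === "threshold") this.defaultThreshold = v;
      else if (field.startsWith("threshold:")) {
        this.categoryThresholds[field.slice("threshold:".length)] = v;
      }
    }
  }
}
```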
jamby77 added 2 commits May 5, 2026 17:50
…#151)

… spec

PR #148 added a ConnectionRegistry import to cache-proposal.service.spec.ts
using a relative path that doesn't resolve from proprietary/. Switch to
the @app alias to match every other import in the file. CI api-tests run
was failing TS2307 on this line; nothing else changes.
@jamby77 jamby77 merged commit aa364ad into master May 5, 2026
3 checks passed
@jamby77 jamby77 deleted the feature/cache-proposal-data-model branch May 5, 2026 15:06
@github-actions github-actions Bot locked and limited conversation to collaborators May 5, 2026
