feature(api): cache proposal data model + service + MCP propose tools #134
Postgres + SQLite migrations for cache_proposals and cache_proposal_audit, scoped by connection_id. Includes the two indexes required for tenant-status-proposed ordering and the partial pending-lookup index, plus an expires_at index for the expiry cron. Shared types are derived from Zod schemas (utils/cache-proposals) with preprocessors for BIGINT and JSON columns, so adapter row mappers parse rows directly with no dialect handling at the call site. Storage port methods cover create, get, list, status update, expiry, and audit append/read; memory adapter mirrors them for unit tests. SQLite expiry runs in a transaction with status re-check on UPDATE to avoid races against concurrent approvals.
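The preprocessor idea can be sketched without Zod as plain coercion helpers. This is a hedged sketch with hypothetical names; the real code expresses these as Zod preprocessors in `packages/shared/src/utils/cache-proposals.ts` and the column names here are illustrative.

```typescript
// pg returns BIGINT columns as strings; SQLite returns numbers.
function coerceBigint(value: unknown): number {
  if (typeof value === "number") return value;
  if (typeof value === "string" && /^-?\d+$/.test(value)) return Number(value);
  throw new Error(`expected BIGINT-like value, got ${typeof value}`);
}

// SQLite stores JSON columns as text; pg's json/jsonb driver already parses them.
function coerceJson(value: unknown): unknown {
  return typeof value === "string" ? JSON.parse(value) : value;
}

// A row mapper built on these helpers needs no dialect branching at the call site.
function mapProposalRow(row: Record<string, unknown>) {
  return {
    id: row.id as string,
    expires_at: coerceBigint(row.expires_at), // ms epoch, string on pg
    proposal_payload: coerceJson(row.proposal_payload),
  };
}
```

With Zod, each helper becomes a `z.preprocess` wrapped around the target schema, so `parse(row)` does the coercion and the validation in one pass.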
- Memory adapter now structuredClones the input on create, update, and audit append, so caller mutations after the call don't leak into stored state. Read paths also deep-clone via structuredClone instead of a shallow spread, matching the deep behavior the rest of the codebase assumes.
- Agent invalidate discriminated-union test now throws on narrowing failure to match the semantic-invalidate test (was silently green if the narrow ever broke).
CacheProposalService with type-specific validation, duplicate-pending rejection, and per-connection 30/hour rate limit. Resolves caches via HGETALL __betterdb:caches (discovery markers from PRs #127/#128). Typed errors (validation, invalid cache type, not found, duplicate, rate limited) are mapped to HTTP status codes at the MCP controller. Three MCP tools registered:
- cache_propose_threshold_adjust (semantic_cache only, 0..2 range)
- cache_propose_tool_ttl_adjust (agent_cache only, 10..86400 range)
- cache_propose_invalidate (filter_kind discriminates by type; warns when estimated_affected > 10000)

All proposals start as 'pending' with a 24h expiry, awaiting human approval (Day 5 spec). The module imports StorageModule for proposal CRUD and ConnectionsModule for the Valkey client used to read markers. Stacks on feature/cache-proposal-data-model (PR #134).
- Storage updateCacheProposalStatus now validates any proposal_payload override against the existing row's (cache_type, proposal_type) via the new variantPayloadSchemaFor helper in @betterdb/shared. Prevents rows from being poisoned with a payload shape that doesn't match the discriminator. Throws ZodError on mismatch (mapped to HTTP 400 by the controller).
- Empty options.status array no longer produces "status IN ()" invalid SQL; short-circuits to an empty result and skips the filter when status is undefined.
- SQLite expireCacheProposalsBefore uses RETURNING * in a single UPDATE, eliminating the candidate+update dance that could return rows already expired before the call.
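A minimal sketch of the empty-array guard. The function shape and names here are assumptions; the real logic is inlined in the SQL adapters.

```typescript
type ProposalStatus = "pending" | "approved" | "rejected" | "applied" | "expired";

// Three outcomes: no filter at all, a guaranteed-empty result, or a real IN clause.
function buildStatusClause(
  status?: ProposalStatus[],
): { sql: string; params: string[] } | "empty-result" | "no-filter" {
  if (status === undefined) return "no-filter"; // skip the filter entirely
  if (status.length === 0) return "empty-result"; // never emit "status IN ()"
  const placeholders = status.map((_, i) => `$${i + 1}`).join(", ");
  return { sql: `status IN (${placeholders})`, params: status };
}
```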
- CacheResolverService no longer takes ttlMs/now as constructor parameters; NestJS was attempting to resolve them as Number/Function injection tokens and crashing on module init. ttlMs and now are now fields with defaults, settable via configureForTesting() in tests.
- Rate limiter now exposes a single atomic reserve() that combines check+record, and CacheProposalService.persist() calls it before the storage write. Previously check() and record() were separated by an awaited DB call, allowing concurrent requests to overshoot the limit by 1-2.
- ZodError thrown from service-layer schema parses now maps to HTTP 400 with structured `issues` instead of falling through to a 500.
- Rate limiter check() returned remaining = limit - events - 1, understating available slots by one. Corrected to limit - events; reserve() still records exactly one event per allowed call.
- "Does not count proposals against other connections" test was misnamed: it just retested the limit on the same connection_id. Now exhausts the limit on CONNECTION_ID, then proposes against OTHER_CONNECTION_ID and asserts the proposal succeeds, verifying per-connection isolation.
reserve() previously returned the remaining count from check(), which is the pre-record count. For an allowed reservation that records one event, the returned remaining was off by one. Now decrements remaining by one (clamped at zero) when allowed.
Previously a transient storage failure (DB hiccup, connection loss) would consume a rate-limit slot without producing a proposal, prematurely exhausting the per-connection 30/hour budget. The service now wraps the storage call in try/catch and calls rateLimiter.release() on failure to free the slot. Concurrency safety is preserved because the reservation is still made before the awaited write, so concurrent callers can't overshoot the limit.
- Rate limiter reservations now carry a unique releaseToken; the service passes the token to release(key, token) on storage failure, freeing the exact reservation rather than whichever event was newest. Two concurrent reservations with one failing no longer corrupt the bucket.
- Storage updateCacheProposalStatus accepts an optional expected_status filter, applied as WHERE status IN (...). If the row is no longer in an allowed state the update is a no-op and returns null. Prevents a stale approve from resurrecting an expired/applied/rejected proposal.
- Unique partial indexes on cache_proposals enforce per-pending uniqueness for threshold_adjust (by category, NULL = global) and tool_ttl_adjust (by tool_name) at the DB layer, closing the TOCTOU between rejectIfDuplicatePending and the insert. The service catches the resulting unique-constraint violation (Postgres 23505 / SQLite SQLITE_CONSTRAINT*) and surfaces it as the same DuplicatePendingProposalError.
- CacheResolverService caches negative lookups for 2s instead of 30s so a propose call right after marker registration recovers quickly instead of waiting out the full positive TTL.
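The reserve/release flow described across these rate-limiter fixes can be sketched as a small in-memory sliding-window limiter. Everything here is a hypothetical shape (class and field names are illustrative, not the repo's): reserve() atomically checks and records, returns the post-record remaining clamped at zero, and hands back a token so release() frees that exact reservation.

```typescript
interface Reservation {
  allowed: boolean;
  remaining: number;
  releaseToken?: string;
}

class SlidingWindowLimiter {
  private events = new Map<string, { at: number; token: string }[]>();
  private seq = 0;

  constructor(
    private limit: number,
    private windowMs: number,
    private now: () => number = Date.now,
  ) {}

  // Atomic check + record: no awaited gap for concurrent callers to slip through.
  reserve(key: string): Reservation {
    const cutoff = this.now() - this.windowMs;
    const bucket = (this.events.get(key) ?? []).filter((e) => e.at > cutoff);
    if (bucket.length >= this.limit) {
      this.events.set(key, bucket);
      return { allowed: false, remaining: 0 };
    }
    const token = `t${++this.seq}`;
    bucket.push({ at: this.now(), token });
    this.events.set(key, bucket);
    // remaining is the post-record count, clamped at zero.
    return { allowed: true, remaining: Math.max(0, this.limit - bucket.length), releaseToken: token };
  }

  // Refund one specific reservation (e.g. after a non-duplicate storage failure).
  release(key: string, token: string): void {
    const bucket = this.events.get(key) ?? [];
    this.events.set(key, bucket.filter((e) => e.token !== token));
  }
}
```

The token is what makes concurrent failure safe: releasing "the newest event" could refund someone else's successful reservation, while releasing by token always refunds the caller's own slot.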
Adds packages/shared/src/utils/discovery-protocol.ts exporting REGISTRY_KEY, PROTOCOL_KEY, HEARTBEAT_KEY_PREFIX, the protocol version, and a heartbeatKeyFor() helper. CacheResolverService now imports these instead of redefining the literals locally. The cache packages (@betterdb/semantic-cache, @betterdb/agent-cache) intentionally keep their own copies — they ship to npm and don't depend on @betterdb/shared. Names match (REGISTRY_KEY, HEARTBEAT_KEY_PREFIX) so the duplication is reconcile-by-grep rather than reconcile-by-rename.
Adds two as-const string constants to @betterdb/shared, exported next to CacheTypeSchema: export const SEMANTIC_CACHE = 'semantic_cache' as const; export const AGENT_CACHE = 'agent_cache' as const; Replaces runtime string literals in CacheResolverService and CacheProposalService with these constants. Type-position usages (interface fields, type unions) switch to the existing CacheType alias instead of repeating the literal union inline. The cache packages keep their own per-package CACHE_TYPE constant because they don't depend on @betterdb/shared.
- updateCacheProposalStatus now treats expected_status: [] the same as listCacheProposals does — short-circuits to null instead of generating "status IN ()" invalid SQL on Postgres + SQLite or silently returning null via [].includes() on the memory adapter.
- rejectIfDuplicatePending no longer caps at the 100-row default page size of listCacheProposals. It now pages through pending proposals (200 per page, up to 10 pages = 2000) so a connection with many pending proposals across distinct categories can't hide a real duplicate from the pre-check. The DB unique partial indexes still backstop the pre-check on insert.
- isUniqueConstraintViolation no longer matches every SQLite error code starting with SQLITE_CONSTRAINT (which includes CHECK, NOTNULL, FOREIGNKEY, etc.). Only SQLITE_CONSTRAINT_UNIQUE and SQLITE_CONSTRAINT_PRIMARYKEY count as duplicate-pending now; other constraint failures bubble up as the original error.
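The narrowed check might look like the following hedged sketch. The error codes are the real Postgres (`23505`, unique_violation) and SQLite extended result codes; the function shape and the `code` property access are assumptions about how the drivers surface errors.

```typescript
function isUniqueConstraintViolation(err: unknown): boolean {
  const code = (err as { code?: string })?.code;
  if (code === "23505") return true; // Postgres unique_violation
  // Only the unique/PK extended codes count; CHECK, NOTNULL, FOREIGNKEY bubble up.
  return code === "SQLITE_CONSTRAINT_UNIQUE" || code === "SQLITE_CONSTRAINT_PRIMARYKEY";
}
```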
…s duplicate

bugbot flagged that the partial unique indexes on cache_proposals used JSON-extracted category / tool_name directly. Both Postgres and SQLite treat NULLs as distinct in unique indexes, so two pending threshold-adjust proposals with category=null on the same (connection, cache) would both land — bypassing the DB-level dedup. Wrap the extracts in COALESCE to a sentinel string so NULLs collide. Drop the old indexes first so existing dev DBs pick up the new shape. Add a regression test confirming a second pending threshold_adjust with category=null is rejected by the DB.
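A hedged sketch of the COALESCE-wrapped index in Postgres syntax. The index name, payload column expression, and sentinel value are assumptions, not the repo's actual identifiers:

```sql
-- Hypothetical shape of the NULL-collapsing partial unique index.
-- COALESCE folds category = NULL into a sentinel so two "global" pending
-- threshold_adjust proposals on the same (connection, cache) collide
-- instead of both inserting (unique indexes treat NULLs as distinct).
CREATE UNIQUE INDEX cache_proposals_pending_threshold_uniq
  ON cache_proposals (
    connection_id,
    cache_name,
    COALESCE(proposal_payload->>'category', '__global__')
  )
  WHERE status = 'pending' AND proposal_type = 'threshold_adjust';
```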
…s across adapters

bugbot LOW: Postgres + SQLite parsed proposal_payload via Zod before checking expected_status, while the memory adapter checked expected_status first. A stale write that updated both fields and failed the status precondition would surface as a ZodError on SQL but return null on memory — inconsistent client-visible behavior. Fix: read the existing row once when either field is provided, and run the expected_status guard before the variant payload parse.
… on dup-pending

bugbot MEDIUM (1): the previous fix unconditionally ran DROP INDEX + CREATE UNIQUE INDEX on every API startup, taking ACCESS EXCLUSIVE on cache_proposals each time and rebuilding the index — a recurring write outage proportional to row count. Rename the COALESCE-fixed indexes to _v2 and keep CREATE IF NOT EXISTS, so the first startup migrates dev DBs once and subsequent startups skip both DROP and CREATE.

bugbot MEDIUM (2): persist() refunded the rate-limit slot on every storage error, including unique-constraint violations turned into DuplicatePendingProposalError. A client guessing an existing pending (cache, scope) could spam the endpoint with 409s at zero rate-limit cost. Refund only on non-unique storage failures.
The apply dispatcher writes runtime threshold overrides to fields
'threshold' and 'threshold:<category>' on {prefix}:__config. The
reader was only checking the SDK-published baseline fields
(default_threshold and category_thresholds JSON), so after any
applied threshold proposal, subsequent proposals reported the
original baseline as 'current_threshold' instead of the actually
effective value. Read overrides first, fall back to baseline.
Match the structuredClone used in createCacheProposal / updateCacheProposalStatus so expireCacheProposalsBefore doesn't share mutable proposal_payload / applied_result references with the caller's previously-returned copies.
Replace the fall-through after the threshold_adjust / tool_ttl_adjust branches with a thrown ProposalEditNotAllowedError. Previously, an unhandled proposal_type would have left newPayload undefined and the storage call would silently skip the payload update — approving the proposal without applying the requested edit.
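The no-fall-through guard can be sketched as an exhaustive switch. Names here mirror the description but are hypothetical (`ProposalEditNotAllowedError` stands in for the real error class, and the payload shapes are illustrative):

```typescript
type ProposalType = "threshold_adjust" | "tool_ttl_adjust" | "invalidate";

class ProposalEditNotAllowedError extends Error {}

function buildEditedPayload(
  proposalType: ProposalType,
  edit: { new_threshold?: number; new_ttl_seconds?: number },
): Record<string, unknown> {
  switch (proposalType) {
    case "threshold_adjust":
      return { threshold: edit.new_threshold };
    case "tool_ttl_adjust":
      return { ttl_seconds: edit.new_ttl_seconds };
    default:
      // No silent fall-through: an uneditable type throws instead of leaving
      // the payload undefined and approving without applying the edit.
      throw new ProposalEditNotAllowedError(`edits not allowed for ${proposalType}`);
  }
}
```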
Memory/SQLite/Postgres expireCacheProposalsBefore all use expires_at <= now (inclusive), so a proposal at exactly now is expired by the cron. The service-layer guards in transitionToApproved and requireFreshPending used strict <, leaving a one-tick window where the cron treats the proposal as expired but approve/reject treats it as still valid. Switch the service to <= to match.
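The agreed-upon boundary, as a one-line sketch (hypothetical helper name; the real comparison is inlined in the cron and the service guards):

```typescript
// Both the expiry cron and the service guards now treat expires_at <= now as
// expired (inclusive). The service previously used strict <, leaving a
// one-tick window where cron and approve/reject disagreed.
function isExpired(expiresAt: number, now: number): boolean {
  return expiresAt <= now;
}
```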
…sal-data-model

# Conflicts:
#	apps/api/src/storage/adapters/memory.adapter.ts
#	packages/shared/src/index.ts
…onfig
- check()/checkBatch() now read HGETALL {prefix}:__config (5s in-process
cache) and honor 'threshold' and 'threshold:{category}' fields as runtime
overrides on top of constructor categoryThresholds
- Resolution order: options.threshold > runtime category > runtime global
> constructor categoryThresholds > defaultThreshold
- Read failures fall back to constructor (warn-logged); out-of-range values
(<0, >2, NaN) are dropped
- Advertise 'threshold_adjust' in the discovery marker's capabilities so
Monitor's apply dispatcher can write the config hash
- Bump to 0.4.0; CHANGELOG entry flags the {prefix}:__config behavior change
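The resolution chain above can be sketched as a standalone helper. This is a hypothetical shape (the real logic is inlined in check()/checkBatch()); field and key names are assumptions mirroring the bullets:

```typescript
interface ThresholdSources {
  optionThreshold?: number; // per-call options.threshold
  runtime: Record<string, number>; // from HGETALL {prefix}:__config
  ctorCategory: Record<string, number>; // constructor categoryThresholds
  defaultThreshold: number;
}

// Out-of-range values (<0, >2, NaN) are dropped, falling through to the next source.
const inRange = (v: number | undefined): v is number =>
  typeof v === "number" && !Number.isNaN(v) && v >= 0 && v <= 2;

function resolveThreshold(s: ThresholdSources, category?: string): number {
  if (inRange(s.optionThreshold)) return s.optionThreshold;
  if (category !== undefined && inRange(s.runtime[`threshold:${category}`])) {
    return s.runtime[`threshold:${category}`];
  }
  if (inRange(s.runtime["threshold"])) return s.runtime["threshold"];
  if (category !== undefined && inRange(s.ctorCategory[category])) {
    return s.ctorCategory[category];
  }
  return s.defaultThreshold;
}
```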
- cache_list_pending_proposals, cache_get_proposal, cache_approve_proposal, cache_reject_proposal, cache_edit_and_approve_proposal
- Each wraps the corresponding pre-existing endpoint on McpController (apps/api/src/mcp/mcp.controller.ts:688-781), which sets actorSource='mcp' on every approval/reject/edit call
- Bumps @betterdb/mcp to 1.2.0
- README adds a 'Cache Intelligence Tools' section covering all 14 cache tools (6 read-only, 3 propose, 5 approval) plus two example prompts
- server.json version bumped 1.0.0 -> 1.2.0 to match package.json (the registry manifest doesn't enumerate tools; tools/list is exposed via the MCP protocol from server.tool registrations)
The entire feature?
Yes. The entire tab should be only part of the pro tier. The MCP is basically a thin wrapper, so as long as the monitor has the proper checks and gating, the MCP will inherit them.
…tier

Per review feedback, cache intelligence becomes a Pro feature gated by a new CACHE_INTELLIGENCE entitlement.

Backend:
- Move apps/api/src/cache-proposals/ -> proprietary/cache-proposals/
- Module becomes @global and is conditionally loaded via try/catch in app.module.ts, matching the inference-latency-pro pattern
- Extract MCP cache routes from mcp.controller.ts into a new proprietary CacheProposalMcpController so the routes simply don't exist when the proprietary module isn't loaded; community-tier deployments return 404 on every cache endpoint and the MCP tools surface that to the agent
- HTTP and MCP controllers both gate on @UseGuards(LicenseGuard) + @RequiresFeature(Feature.CACHE_INTELLIGENCE), returning 402 when not entitled
- Extract shared MCP helpers (ValidateInstanceIdPipe, safeLimit, etc.) into apps/api/src/mcp/mcp-helpers.ts so both controllers share them
- Update internal imports in moved files to use @app/* aliases

Shared:
- Add Feature.CACHE_INTELLIGENCE under Tier.pro

Frontend:
- NavItem for /cache-proposals adds requiredFeature so the link locks for non-entitled users (matches Anomaly Detection pattern)
- usePendingProposals accepts an enabled flag; useCacheProposalsUnread short-circuits on non-entitled licenses so we don't poll every 15s and get 402s

Tests pass: api 1228/1235, web 175/175. No regressions vs. pre-move.
Done in 5e5d971. Whole feature moved to …
… mcp CHANGELOG

Closes the test gaps surfaced in the C5 audit:
- HistoryTable: Source column derivation from proposed_by prefix, empty state, cache_name filter wiring through to useHistoryProposals
- DetailPanel: full data render (cache header, reasoning, payload, apply result, audit trail), empty audit, loading and error branches
- useCacheProposalsUnread: entitlement gate skips polling, count when no lastSeenAt, markAllRead persists newest proposed_at and zeroes count

Also: add CHANGELOG.md to packages/mcp documenting the 1.2.0 release with the 5 new cache-intelligence approval tools and their Pro tier requirement.

Web suite now: 187/187 (was 175).
#148)

feat(cache-proposals): runtime config refresh for agent-cache and semantic-cache (TS + Python)

Implements the full propose→approve→apply→pickup loop so BetterDB Monitor cache proposals take effect in running processes without a restart.

- Periodic refresh of `{name}:__tool_policies` (default 30 s, opt-out via `configRefresh: { enabled: false }`). First refresh fires synchronously on construction; subsequent ticks run on a `setInterval`.
- `ToolCache.refreshPolicies()` — atomic swap (clear + repopulate), returns bool. `loadPolicies()` now delegates to it; stale entries are evicted.
- New Prometheus counter `{prefix}_config_refresh_failed_total`.
- New `ConfigRefreshOptions` type exported from the package root.
- Periodic refresh of `{name}:__config` (same interval/opt-out pattern). Fields: `threshold` → `defaultThreshold`; `threshold:{cat}` → `categoryThresholds[cat]`. Constructor values are fallbacks when absent.
- `refreshConfig()` public method with per-field range validation (0–2).
- Adds `threshold_adjust` to the discovery capabilities array, unblocking `cache_propose_threshold_adjust` in Monitor.
- New `{prefix}_config_refresh_failed_total` counter.
- New `ConfigRefreshOptions` type exported from the package root.
- `escapeTag` exported from the package root (both TS and Python).
- Discovery marker protocol (0.5.0): registers `__betterdb:caches` entry and 30 s heartbeat on construction; `shutdown()` removes the heartbeat. New `DiscoveryOptions`, `{prefix}_discovery_write_failed_total` counter.
- Config refresh (0.6.0): `asyncio` task loop mirrors TS behaviour — first refresh before first sleep. `ToolCache.refresh_policies()` atomic swap. New `ConfigRefreshOptions`. `{prefix}_config_refresh_failed_total`.
- New `examples/monitor_proposals/main.py` demonstrating the full loop.
- Missing test coverage added: `refresh_policies()` (6 tests), `AgentCache` config refresh (6 tests + counter), `SessionStore.get_all()`, `destroy_thread()`, `scan_fields_by_prefix()` (13 tests).
- `aiohttp` declared as a `[normalizer]` optional extra in `pyproject.toml`.
- Discovery marker protocol: registers on `initialize()`; capabilities include `['invalidate', 'similarity_distribution', 'threshold_adjust']`. Cross-type collision raises `SemanticCacheUsageError`. `flush()` stops the old manager before dropping the index (matches TS concurrency semantics). New `DiscoveryOptions`, `{prefix}_discovery_write_failed_total` counter.
- Config refresh: `asyncio` task loop, `refresh_config()` with field-level validation, constructor fallbacks, per-category support. New `ConfigRefreshOptions`. `{prefix}_config_refresh_failed_total`.
- New `examples/monitor_proposals/main.py` with deterministic content-word mock embedder (stopwords stripped, DJB2 hash, dim=64). Output is bit-for-bit identical to the TypeScript equivalent.
- `escape_tag` exported from the package root.
- New `test_config_refresh.py` (14 tests) and `test_discovery.py` (21 tests).
- `CacheApplyDispatcher.applySemanticInvalidate`: corrected FT index name from `{prefix}:__index` to `{prefix}:idx` (all semantic invalidation proposals were silently deleting 0 entries against a non-existent index).
- Dispatcher test `FakeClient.call()` now captures arguments so index name and filter expression can be asserted.
- New dispatcher contract tests: index name, filter forwarding, field format agreement between dispatcher writes and library reads.
- `cache-proposal.service.spec.ts`: `readCurrentThreshold` and `readCurrentTtl` tested with a fake registry, verifying the apply→re-propose cycle reads the dispatcher-written value.
fix: address roborev findings (High + Medium + Low)

High — ensure_discovery_ready() hung indefinitely
agent_cache.py: track the discovery registration in a dedicated _discovery_task field and await only that task in ensure_discovery_ready(), not all _background_tasks. The config-refresh loop is an infinite task that never completes on its own; gathering it blocked the caller permanently.

Medium — cache_edit_and_approve_proposal accepted both edit fields at once
mcp/src/index.ts: add a mutual-exclusion guard that returns an error when both new_threshold and new_ttl_seconds are provided. The tool description says 'provide exactly one'; now the contract is enforced in code.

Low — DiscoveryOptions defined in two places (types.py and discovery.py)
discovery.py: remove the duplicate @dataclass definition and import DiscoveryOptions from types.py, the single canonical location already re-exported by __init__.py.

Low — dead code in mock_embed()
semantic-cache-py examples/monitor_proposals/main.py: the first words = list({...}) set comprehension was immediately overwritten by the cleaned loop below it. Remove the dead first pass; keep only the strip-then-filter loop that produces the correct deduplicated word list.
PR #134's earlier B3 commit added a 5s-TTL read-time override (HGETALL on each check()) and PR #148's commit added a 30s background refresh that mutates defaultThreshold/categoryThresholds in place. Both read the same {prefix}:__config hash; running both is duplicated work, and the file even ended up with a duplicate `private readonly configKey: string` field declaration. Keep the 30s background-refresh approach (cleaner lifecycle, opt-out flag, prometheus counter, no per-call overhead) and delete the B3 machinery:

- Removes private fields thresholdOverrides, thresholdOverridesCachedAt, thresholdOverridesRefresh and the THRESHOLD_OVERRIDES_TTL_MS constant.
- Removes private helpers resolveThreshold, getThresholdOverrides, refreshThresholdOverrides.
- Restores check()/checkBatch() threshold resolution to the simple options.threshold > categoryThresholds[category] > defaultThreshold chain; refreshConfig() updates those mutable fields.
- Deletes runtime-threshold-overrides.test.ts (covered the deleted helpers).
- Removes the duplicate configKey field declaration and constructor assignment.
- CHANGELOG: drop the read-time-overrides bullet, expand the periodic-refresh bullet to spell out hash field semantics and the synchronous-first-tick guarantee, and reword the Behavior change note.

Tests: 128/128 pass. Trade-off: propagation goes from ~5s to ~30s worst-case, which is acceptable given the human-approval flow upstream.
Summary
Combined Days 1–3 of the cache intelligence plan — originally split as #134 + #135, now folded into one PR. Specs:
`spec-cache-proposal-data-model.md`, `spec-cache-proposal-service-and-propose-tools.md`.

Day 1 — data model

- `cache_proposals` and `cache_proposal_audit` tables on Postgres + SQLite + memory adapters with idempotent migrations.
- Indexes on `cache_proposals` — `(connection_id, status, proposed_at desc)` and the partial `(connection_id, cache_name, proposal_type) WHERE status = 'pending'` — plus a partial `expires_at` index for the expiry cron.
- Shared types in `packages/shared/src/utils/cache-proposals.ts`, derived via `z.infer` from Zod schemas. Schemas own dialect quirks (BIGINT-as-string from pg, JSON-as-text from SQLite) via preprocessors, so adapter mappers parse rows directly with `StoredCacheProposalSchema.parse(row)`.
- `updateCacheProposalStatus` validates any `proposal_payload` override against the existing row's `(cache_type, proposal_type)` via the `variantPayloadSchemaFor` helper — prevents poisoning a row with a payload shape that doesn't match its discriminator.
- SQLite expiry uses `UPDATE ... RETURNING *` for atomic + race-safe expiry.
- Scoped by `connection_id` (matches every other table in the schema) instead of the `tenant_id` mentioned in the issue spec — no tenant concept in this codebase yet.

Day 2–3 — service + MCP propose tools

- `CacheProposalService` validates input per `(cache_type, proposal_type)`, checks duplicate-pending via the storage port, and enforces a per-connection sliding-window 30/hour rate limit. Releases the slot on storage failure so transient DB errors don't permanently consume capacity.
- `CacheResolverService` reads `HGETALL __betterdb:caches` via `ConnectionRegistry` to look up `cache_name → cache_type` — the first real consumer of the discovery-marker protocol from PRs #127 (semantic-cache, 0.2.0) and #128 (agent-cache, 0.5.0). 30s in-memory cache.
- Typed errors (`CacheProposalValidationError`, `InvalidCacheTypeError`, `CacheNotFoundError`, `DuplicatePendingProposalError`, `RateLimitedError`) map to HTTP 400/404/409/429 at the controller. `ZodError` from service-layer parses also maps to 400 with structured `issues`.
- Three MCP tools (`cache_propose_threshold_adjust`, `cache_propose_tool_ttl_adjust`, `cache_propose_invalidate`) wired in `packages/mcp/src/index.ts`. Each is a thin wrapper over a new `/mcp/instance/:id/cache-proposals/...` endpoint on `McpController`. `estimated_affected > 10000` produces a warning, not a rejection.
- Discovery-marker dependency: end-to-end use needs PRs #127 and #128 merged so caches actually write the marker hash. Service tests mock the resolver.
Test plan
- `pnpm --filter api test -- --testPathPatterns="cache-proposals|rate-limiter"` — 46 tests pass (full 12-case matrix from the propose-tools spec, plus storage CRUD, plus rate-limiter unit tests including release-on-failure)
- `pnpm --filter api exec tsc --noEmit`
- `pnpm --filter @betterdb/shared build`
- `pnpm --filter @betterdb/mcp build`
- `connection_id` (vs `tenant_id` per spec) is the right call given Monitor is single-tenant for now

Note
High Risk
Adds new persistent data model and write paths (new `cache_proposals`/`cache_proposal_audit` tables plus adapter methods) and wires a new web review/approve flow, so migration and status-transition logic bugs could affect production data and operations.

Overview

Introduces a cache proposal persistence model end-to-end: `StoragePort` gains CRUD/status/audit methods for proposals, and the Postgres/SQLite/memory adapters implement them with new tables, indexes, uniqueness constraints (including NULL-safe discriminators), payload validation, and expiry updates.

Adds MCP/controller utilities (extracted query parsing/validation helpers) and loads an optional proprietary `CacheProposalsModule` at runtime.

Extends the web app with a Cache Proposals page (pending + history + detail drawer), approval/reject/edit-and-approve actions via a new `cacheProposalsApi`, and a sidebar unread badge backed by a polling hook with localStorage "last seen" tracking.

Updates the Python `betterdb-agent-cache` package to support the workflow via periodic tool-policy refresh (atomic swap + failure metrics) and a discovery marker protocol (registry + heartbeat + collision detection), including new options/types, telemetry counters, examples, and expanded tests.

Reviewed by Cursor Bugbot for commit 9249a97.