
feat(search): semantic search MCP tools + tiered context mode (B-005 Phase 2) #124

Merged
George-iam merged 3 commits into main from feat/semantic-search-mcp-tool-20260429
Apr 29, 2026
@George-iam
Contributor

Summary

Closes B-005 Phase 2. Adds three new MCP tools + opt-in search context mode for KBs >100 entries.

axme_get_memory(slug)              - full body of one memory
axme_get_decision(id_or_slug)      - full body of one decision
axme_search_kb(query, type?, k?)   - semantic search across both

These work in both modes. The search mode also changes what axme_context emits at session start: a compact catalog (title + 1-line desc + [type/enforce] labels) instead of every body. Token cost drops ~10x at session start.
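
As a rough illustration of the compact catalog described above, the sketch below renders one entry as labels, title, and the first line of its description. The `CatalogEntry` shape, the `formatCatalogLine` name, and the exact line layout are assumptions made for illustration; the real output of axme_context may differ.

```typescript
// Hypothetical shape of a KB entry as seen by a catalog renderer.
interface CatalogEntry {
  slug: string;
  title: string;
  description: string; // may span several lines in the source file
  labels: string[];    // e.g. a type and an enforce level; values are assumed
}

// Render one compact line: "[labels] title - first description line (slug)".
function formatCatalogLine(entry: CatalogEntry): string {
  const label = entry.labels.join("/");
  const firstLine = (entry.description.split("\n")[0] ?? "").trim();
  return `[${label}] ${entry.title} - ${firstLine} (${entry.slug})`;
}
```

Only the first description line is kept, which is where the session-start token saving would come from; the body stays on disk until axme_get_memory or axme_get_decision asks for it.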

Two context modes

| Mode | Loaded at session start | When to use |
|---|---|---|
| full (default) | Every memory + decision body | KBs ≤ 100 entries. Existing behaviour, zero breakage. |
| search (opt-in) | Catalog only (title + desc + labels) | KBs > 100 entries. Bodies fetched on demand. |

User-driven switching (agent never decides):

  • env var: AXME_CONTEXT_MODE=search claude (one-off)
  • CLI: axme-code config set context.mode search
  • manual edit of .axme-code/config.yaml

When the user is on full and the KB exceeds 100 entries, axme_context adds a soft hint suggesting the switch — non-blocking, just informational.
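
A minimal sketch of how the mode could be resolved and the soft hint produced. It assumes the env var takes precedence over config.yaml (suggested by the "one-off" note, but not confirmed here) and that invalid env values are ignored; `resolveContextMode` and `softHint` are hypothetical names.

```typescript
type ContextMode = "full" | "search";

const SEARCH_HINT_THRESHOLD = 100; // entry count taken from the description above

// Assumed precedence: a valid AXME_CONTEXT_MODE value wins over config.yaml.
function resolveContextMode(
  env: Record<string, string | undefined>,
  configMode: ContextMode | undefined,
): ContextMode {
  const fromEnv = env["AXME_CONTEXT_MODE"];
  if (fromEnv === "full" || fromEnv === "search") return fromEnv;
  return configMode ?? "full"; // "full" is the documented default
}

// Informational only: returns a hint string or null and never changes the mode.
function softHint(mode: ContextMode, entryCount: number): string | null {
  if (mode !== "full" || entryCount <= SEARCH_HINT_THRESHOLD) return null;
  return `KB has ${entryCount} entries; consider: axme-code config set context.mode search`;
}
```

Keeping the hint a pure function of mode and entry count makes it easy to confirm that it never fires in search mode and never mutates configuration.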

Lazy install of @huggingface/transformers

Semantic-search runtime (~100MB node_modules + ~30MB MiniLM ONNX model cached at ~/.cache/huggingface/) is not bundled. It installs into ~/.local/share/axme-code/runtime/ on opt-in via:

axme-code config set context.mode search

That CLI command runs atomically:

  1. npm install --prefix runtime @huggingface/transformers@^4.0.1
  2. Reindex every memory + decision into .axme-code/_index/embeddings.json
  3. Write context.mode = search

If any step fails, the config rolls back to full and an error is printed; the user is never left in a half-broken state.
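
The install, reindex, and write sequence with rollback can be sketched roughly as follows. The `SwitchSteps` interface and `switchToSearch` function are hypothetical stand-ins, with the three steps injected so the flow can be exercised without running npm; this is not the actual runConfigSetSearch code.

```typescript
type ContextMode = "full" | "search";

// Hypothetical dependencies, injected so the flow can run without npm or disk access.
interface SwitchSteps {
  install: () => Promise<void>;
  reindex: () => Promise<void>;
  writeMode: (mode: ContextMode) => Promise<void>;
}

async function switchToSearch(
  steps: SwitchSteps,
): Promise<{ ok: boolean; error?: string }> {
  try {
    await steps.install();
    await steps.reindex();
    await steps.writeMode("search"); // written only after both earlier steps succeed
    return { ok: true };
  } catch (err) {
    await steps.writeMode("full"); // roll back so config never points at a missing index
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Writing the mode last means a failure in either earlier step leaves the stored mode untouched; the explicit write of full in the catch branch is a belt-and-braces rollback.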

axme-code reindex is also exposed for manual rebuilds.

Embedding strategy

  • Brute-force cosine over Float32 (no HNSW). At ≤1000 vectors this is <10ms; HNSW only matters at 10K+. Avoids native bindings entirely.
  • Synchronous embed on every axme_save_memory / axme_save_decision when search mode is active. ~50-200ms per save once embedder is warm.
  • Skips silently when mode != search OR runtime missing — never blocks a save the user opted out of.
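
For reference, a brute-force cosine top-k over Float32 vectors can be as small as the sketch below. The `IndexedVector` shape and the function signatures are assumptions for illustration, not the exports of src/storage/embeddings.ts.

```typescript
// Hypothetical in-memory form of one embeddings.json record.
interface IndexedVector {
  id: string;
  type: "memory" | "decision";
  vector: Float32Array;
}

function cosine(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors score 0 instead of NaN
}

// Score every entry, optionally filter by type, and keep the k best.
function topK(
  query: Float32Array,
  entries: IndexedVector[],
  k: number,
  type?: "memory" | "decision",
): { id: string; score: number }[] {
  return entries
    .filter((e) => type === undefined || e.type === type)
    .map((e) => ({ id: e.id, score: cosine(query, e.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The scan is linear in the number of entries and needs no index structure, which is why it stays dependency-free at the KB sizes discussed here.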

What's verified locally

  • npm test: 536/536 pass (was 511; +25 new tests across embeddings.test.ts + kb-search.test.ts)
  • npx tsc --noEmit: clean
  • npm run build: clean
  • New tests cover: cosine math, topK ranking + type filter, JSON round-trip, mtime staleness, runtime detection fallback, all 3 handler not-found / install-hint paths.

What's NOT yet verified (needs combined E2E with PR #123)

These are scheduled for the combined E2E pass (Linux locally + Windows VM) before merge.

Files

New (5 files, ~750 LOC):

  • src/storage/embeddings.ts — cosine, topK, JSON load/save, lazy embedder loader, mtime staleness, embedKbEntry helper
  • src/tools/kb-search.ts — 3 MCP handlers with friendly fallbacks
  • src/tools/search-install.ts — installTransformers + reindexAll + atomic runConfigSetSearch
  • test/embeddings.test.ts, test/kb-search.test.ts

Modified (5 files):

  • src/types.ts - ContextMode type + ProjectConfig.contextMode
  • src/storage/config.ts - nested context.mode parse/format
  • src/server.ts - 3 tool registrations + post-save embed hooks
  • src/tools/context.ts - search-mode branch (catalog + MUST instructions) + >100-entry soft hint
  • src/cli.ts - config get/set + reindex subcommands

Related

Test plan

  • Local unit tests 536/536 pass
  • npx tsc --noEmit clean
  • npm run build clean
  • CI 3-OS matrix green
  • Linux E2E: axme-code config set context.mode search installs the runtime, reindexes, and axme_search_kb returns hits
  • Linux E2E: search mode catalog visible in axme_context output, axme_get_memory fetches body
  • Linux E2E: rollback on install failure (simulate via npm proxy block)
  • Windows VM E2E: same standalone-bundle path as feat(windows): native Windows support — install.ps1, .cmd shims, 3-OS CI matrix #122
  • Reviewer eyes that embedKbEntry skip logic is correct (mode != search → no-op)

🤖 Generated with Claude Code

George-iam and others added 3 commits April 29, 2026 13:44
Adds three new MCP tools and an opt-in tiered context-loading mode:

  axme_get_memory(slug)          - full body of one memory
  axme_get_decision(id_or_slug)  - full body of one decision
  axme_search_kb(query, type?, k?) - semantic search across both

These enable agents on KBs >100 entries to navigate without loading every
body at session start. Available in both modes (full + search) so they
also serve as fuzzy-lookup tools mid-session.

## Two context modes (config.context.mode)

- full   - default. Every memory and decision body is loaded into agent
           context at session start. Existing behaviour, zero breakage.
- search - only a catalog (title + 1-line description, prefixed with
           [type/enforce]) is loaded. Bodies fetched on demand via the
           three new tools.

Switching is user-driven only - the agent never decides:
- env var: AXME_CONTEXT_MODE=search claude (one-off)
- CLI:     axme-code config set context.mode search
- manual:  edit .axme-code/config.yaml

Default config.yaml now ships a commented hint explaining when to switch.
At >100 KB entries the axme_context output also adds a soft suggestion
without changing behaviour.

## Lazy install of @huggingface/transformers

The semantic-search runtime (~100MB node_modules + ~30MB MiniLM ONNX
model cached at ~/.cache/huggingface/) is NOT bundled. It's installed on
demand into ~/.local/share/axme-code/runtime/ by:

  axme-code config set context.mode search

That command atomically:
  1. npm install --prefix runtime @huggingface/transformers@^4.0.1
  2. Reindex every memory + decision into .axme-code/_index/embeddings.json
  3. Write context.mode = search

If any step fails, config rolls back to full and an error is printed.
The user is never left in a half-broken state.

`axme-code reindex` is also exposed for manual rebuilds (e.g. after
hand-editing .md files).
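
One way to decide which entries need re-embedding after a hand edit is an mtime comparison, sketched below. The map-based inputs and the `findStaleSlugs` name are assumptions; the mtimes would typically come from something like `fs.statSync(file).mtimeMs`, and the index format here is hypothetical.

```typescript
// slug -> modification time in milliseconds (e.g. from fs.statSync(file).mtimeMs)
type MtimeMap = Map<string, number>;

// An entry is stale when it is missing from the index or edited after it was indexed.
function findStaleSlugs(onDisk: MtimeMap, indexed: MtimeMap): string[] {
  const stale: string[] = [];
  onDisk.forEach((diskMtime, slug) => {
    const indexedMtime = indexed.get(slug);
    if (indexedMtime === undefined || diskMtime > indexedMtime) {
      stale.push(slug);
    }
  });
  return stale;
}
```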

## Embedding strategy

- Brute-force cosine over Float32 (no HNSW). At <=1000 vectors this is
  <10ms; HNSW only matters at 10K+. Avoids native bindings entirely.
- Synchronous embed on every axme_save_memory / axme_save_decision when
  search mode is active. ~50-200ms per save once the embedder is warm;
  acceptable cost for "search results are immediately consistent".
- Skips silently when mode != search OR runtime missing - never blocks a
  save on something the user opted out of.
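
The skip conditions above amount to two early returns before any embedding work. The sketch below shows that guard with a hypothetical `embedIfEnabled` helper and an injected `Embedder`; the real embedKbEntry signature is not shown in this PR text and may differ.

```typescript
type ContextMode = "full" | "search";

// Hypothetical embedder: null stands for "runtime not installed".
type Embedder = (text: string) => Promise<Float32Array>;

async function embedIfEnabled(
  mode: ContextMode,
  embedder: Embedder | null,
  text: string,
): Promise<Float32Array | null> {
  if (mode !== "search") return null; // user has not opted in: no-op
  if (embedder === null) return null; // runtime missing: skip silently
  return embedder(text);
}
```

Returning null rather than throwing keeps the save path unaffected when search mode is off or the runtime is absent.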

## New files

- src/storage/embeddings.ts     - core: cosine, topK, JSON load/save,
                                   loadEmbedder lazy loader, mtime
                                   staleness, embedKbEntry helper.
- src/tools/kb-search.ts        - 3 MCP handlers (get_memory, get_decision,
                                   search_kb) with friendly fallbacks.
- src/tools/search-install.ts   - installTransformers + reindexAll +
                                   runConfigSetSearch atomic flow.
- test/embeddings.test.ts       - cosine math + topK ranking + JSON
                                   round-trip + staleness + runtime detect.
- test/kb-search.test.ts        - 3 handlers' fallback paths + format
                                   round-trip.

## Modified files

- src/types.ts            - ContextMode type + ProjectConfig.contextMode
                             ("full" default).
- src/storage/config.ts   - nested context.mode parse/format.
- src/server.ts           - 3 new tool registrations; saveMemory/saveDecision
                             handlers now await embedKbEntry post-save.
- src/tools/context.ts    - branch on contextMode: full mode unchanged;
                             search mode emits catalog (title+desc+labels)
                             plus MUST instructions and soft KB-size hint.
- src/cli.ts              - `config get/set <key> [<value>]` and `reindex`
                             subcommands.

## Verification

- npm test  536/536 pass (was 511; +25 new tests for embeddings + kb-search)
- npx tsc --noEmit  clean
- npm run build  clean

Phase 1 (multi-client docs) is open separately as PR #123 and will be
merged together after combined E2E.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the descriptive "consider search mode" hint with an explicit
MUST-style instruction: the agent surfaces the option to the user in
its first response, asks whether to run the install command itself or
let the user run it, and waits for an explicit decision before
continuing the original task.

Matches the existing instruction style used elsewhere (TRUNCATED
OUTPUT RULE in server.ts, Pending Audits Check in CLAUDE.md). The
agent never switches the mode without explicit user confirmation, and
it does not nag again in the same session if the user declines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two fixes for `axme-code config set context.mode search` on Windows:

1. **pathToFileURL on dynamic import.** `await import(absolutePath)`
   throws on Windows because Node treats raw `C:\...` paths as bare
   specifiers. Wrapping the resolved path with `pathToFileURL(...).href`
   produces a `file:///C:/...` URL that imports cleanly on every
   platform.

2. **Actionable error when onnxruntime-node native binding fails to
   load.** Most common Windows failure: `onnxruntime_binding.node`
   throws "specified module could not be found" because Microsoft
   Visual C++ Redistributable is missing. We now detect that
   signature and print a clear hint pointing to the redist installer
   plus the retry command, instead of returning null silently and
   leaving the user to chase a generic "module not loaded" message
   from search-install.ts.
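
Both fixes can be sketched in a few lines. `pathToFileURL` is the real helper from `node:url`; the function names, the regular expression used as the error signature, and the hint wording are assumptions for illustration and not the code that landed.

```typescript
import { pathToFileURL } from "node:url";

// Convert an absolute filesystem path into a specifier that import() accepts on every OS.
function toImportSpecifier(absolutePath: string): string {
  return pathToFileURL(absolutePath).href;
}

// Dynamic import via a file:// URL instead of a raw path.
async function importFromPath(absolutePath: string): Promise<unknown> {
  return import(toImportSpecifier(absolutePath));
}

// Assumed signature: the message fragment quoted in the commit text above.
function vcRedistHint(err: unknown): string | null {
  const message = err instanceof Error ? err.message : String(err);
  if (!/specified module could not be found/i.test(message)) return null;
  return [
    "onnxruntime-node failed to load its native binding.",
    "Install the Microsoft Visual C++ Redistributable:",
    "  https://aka.ms/vs/17/release/vc_redist.x64.exe",
    "then retry: axme-code config set context.mode search",
  ].join("\n");
}
```

Errors that do not match the signature return null, so a caller can fall back to its generic message for unrelated failures.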

Verified on Azure Win11 native:
- Before VC++ Redist install: error message correctly directs the
  user to https://aka.ms/vs/17/release/vc_redist.x64.exe.
- After VC++ Redist install: `axme-code config set context.mode
  search` succeeds end-to-end (31/31 entries indexed in 5s,
  embeddings.json written, config.yaml updated).
- claude --print session calls axme_context (renders search-mode
  catalog) and axme_search_kb (top hits semantically correct: D-004
  + D-002 for "git push force safety" query).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@George-iam George-iam merged commit 754eec7 into main Apr 29, 2026
6 checks passed