Merged
5 changes: 5 additions & 0 deletions .changeset/fix-model-dir-mismatch.md
@@ -0,0 +1,5 @@
---
"@prosdevlab/dev-agent": patch
---

Fix `dev setup` reporting model ready while `dev index` fails with "model not found". The CLI's `hasModel`/`pullModel` used `~/.termite/models` but the running server looked in `~/.antfly/models`. Both now use a shared `--models-dir` pointing at the server's data directory.
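
The mismatch described above can be illustrated with a minimal sketch. It assumes the defaults stated in this PR (`~/.termite/models` for the standalone `antfly termite` commands, `{data-dir}/models` under `~/.antfly` for the swarm server). `resolveModelsDir` and `legacyTermiteModelsDir` are hypothetical names used only for illustration, not functions in the package.

```typescript
import { homedir } from 'node:os';
import { join } from 'node:path';

type Env = Record<string, string | undefined>;

// Hypothetical helper: the directory the swarm server reads models from,
// following the {data-dir}/models layout described in this PR.
function resolveModelsDir(env: Env = process.env, home: string = homedir()): string {
  const dataDir = env.ANTFLY_DATA_DIR ?? join(home, '.antfly');
  return join(dataDir, 'models');
}

// Hypothetical helper: the default that `antfly termite list/pull` fall back
// to when --models-dir is omitted, per the PR description.
function legacyTermiteModelsDir(home: string = homedir()): string {
  return join(home, '.termite', 'models');
}

const serverDir = resolveModelsDir({}, '/home/dev');
const cliDefaultDir = legacyTermiteModelsDir('/home/dev');
console.log(serverDir === cliDefaultDir); // false
```

Because the two defaults never coincide, a model pulled without `--models-dir` lands in a directory the server does not read, which matches the "ready in setup, missing in index" symptom.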
5 changes: 2 additions & 3 deletions README.md
@@ -24,9 +24,8 @@ dev-agent indexes your codebase and provides 6 MCP tools to AI assistants. Inste
```bash
# Install
npm install -g @prosdevlab/dev-agent
-brew install --cask antflydb/antfly/antfly

-# One-time setup
+# One-time setup (installs Antfly, pulls embedding model, starts server)
dev setup

# Index your repository
@@ -138,7 +137,7 @@ Server health, Antfly connectivity, and repository access.
## Prerequisites

- Node.js 22+ (LTS)
-- [Antfly](https://antfly.io) — `brew install --cask antflydb/antfly/antfly`
+- [Antfly](https://antfly.io) — installed automatically by `dev setup`

## Development

92 changes: 92 additions & 0 deletions packages/cli/src/utils/__tests__/antfly.test.ts
@@ -0,0 +1,92 @@
/**
* Tests for antfly utility helpers.
*
* Regression for: hasModel() false positive when antfly termite list defaulted
* to ~/.termite/models (different from the server's ~/.antfly/models), causing
* "Embedding model ready" in `dev setup` but "model not found" in `dev index`.
*/

import { describe, expect, it } from 'vitest';

// modelPresentInOutput is not exported, so these tests exercise a local copy
// that mirrors the implementation exactly. This keeps the tests focused on the
// matching logic without requiring a CLI environment.

function modelPresentInOutput(model: string, output: string): boolean {
  if (output.includes(model)) return true;

  const shortName = model.split('/').pop() ?? model;
  const escaped = shortName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`(?<![\\w/-])${escaped}(?![\\w/-])`).test(output);
}

describe('modelPresentInOutput', () => {
  const FULL_NAME = 'BAAI/bge-small-en-v1.5';
  const SHORT_NAME = 'bge-small-en-v1.5';

  // Simulates `antfly termite list --models-dir ~/.antfly/models` output when
  // the model IS present (full name in NAME column, also in SOURCE column).
  const PRESENT_OUTPUT = `Local models in /Users/dev/.antfly/models:

NAME TYPE SIZE VARIANTS SOURCE
BAAI/bge-small-en-v1.5 embedder 127.8 MB BAAI/bge-small-en-v1.5
`;

  // Output when NO models are installed (the bug scenario: server's models-dir
  // is empty, but ~/.termite/models has the model — the old code read the wrong
  // directory and would never see "No models found").
  const EMPTY_OUTPUT = `Local models in /Users/dev/.antfly/models:

NAME TYPE SIZE VARIANTS SOURCE
No models found locally.

Use 'antfly termite pull <model-name>' to download models.
Use 'antfly termite list --remote' to see available models.
`;

  // Output with a DIFFERENT model that happens to contain the short name as a
  // suffix — the old substring check would incorrectly return true here.
  const OTHER_MODEL_OUTPUT = `Local models in /Users/dev/.antfly/models:

NAME TYPE SIZE VARIANTS SOURCE
vendor/other-bge-small-en-v1.5 embedder 200.0 MB vendor/other-bge-small-en-v1.5
`;

  it('returns true when full model name is present in output', () => {
    expect(modelPresentInOutput(FULL_NAME, PRESENT_OUTPUT)).toBe(true);
  });

  it('returns true when only short name is present as a standalone token', () => {
    const outputWithShortName = `Local models:\n\n${SHORT_NAME} embedder 127 MB\n`;
    expect(modelPresentInOutput(FULL_NAME, outputWithShortName)).toBe(true);
  });

  it('returns false when models directory is empty (server has no models)', () => {
    // This is the core regression: old code checked ~/.termite/models which had
    // the model, new code checks ~/.antfly/models which was empty. When empty,
    // hasModel must return false so pullModel is invoked.
    expect(modelPresentInOutput(FULL_NAME, EMPTY_OUTPUT)).toBe(false);
  });

  it('returns false when a different model shares the short name as a suffix', () => {
    // Old bug: output.includes("bge-small-en-v1.5") matched
    // "vendor/other-bge-small-en-v1.5" — false positive.
    expect(modelPresentInOutput(FULL_NAME, OTHER_MODEL_OUTPUT)).toBe(false);
  });

  it('returns false for completely unrelated output', () => {
    expect(modelPresentInOutput(FULL_NAME, 'No models found locally.')).toBe(false);
  });

  it('handles model names without an org prefix', () => {
    // model = "mxbai-embed-large-v1" (no slash)
    const bareModel = 'mxbai-embed-large-v1';
    const output = `NAME TYPE\nmxbai-embed-large-v1 embedder\n`;
    expect(modelPresentInOutput(bareModel, output)).toBe(true);
  });

  it('handles bare model not present', () => {
    const bareModel = 'mxbai-embed-large-v1';
    expect(modelPresentInOutput(bareModel, EMPTY_OUTPUT)).toBe(false);
  });
});
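
One edge case the tests above do not cover, as far as can be read from the matching function: for a model name without an org prefix, the first check (`output.includes(model)`) is a plain substring test, so the token-boundary protection only takes effect for prefixed names. The sketch below shows a possible stricter variant that applies the same lookarounds to the full name as well. `escapeRegExp` and `modelPresentStrict` are hypothetical names, and this variant is not part of the PR.

```typescript
// Escape regex metacharacters so the model name is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical stricter variant: both the full name and the short name must
// appear as standalone tokens (not preceded or followed by \w, "/" or "-").
function modelPresentStrict(model: string, output: string): boolean {
  const shortName = model.split('/').pop() ?? model;
  const candidates = shortName === model ? [model] : [model, shortName];
  return candidates.some((name) =>
    new RegExp(`(?<![\\w/-])${escapeRegExp(name)}(?![\\w/-])`).test(output),
  );
}

const listing = 'NAME TYPE\nother-mxbai-embed-large-v1 embedder\n';
console.log(listing.includes('mxbai-embed-large-v1')); // true (plain substring check)
console.log(modelPresentStrict('mxbai-embed-large-v1', listing)); // false
```

Whether the stricter behaviour is desirable depends on how `antfly termite list` formats names in practice, which this sketch does not verify.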
67 changes: 60 additions & 7 deletions packages/cli/src/utils/antfly.ts
@@ -5,6 +5,8 @@
*/

import { execSync, spawn } from 'node:child_process';
import { homedir } from 'node:os';
import { join } from 'node:path';
import { logger } from './logger.js';

const DEFAULT_ANTFLY_URL = process.env.ANTFLY_URL ?? 'http://localhost:18080/api/v1';
@@ -14,6 +16,18 @@ const DOCKER_PORT = 18080;
const STARTUP_TIMEOUT_MS = 30_000;
const POLL_INTERVAL_MS = 500;

/**
* The Termite models directory used by the running Antfly swarm server.
*
* `antfly swarm` uses `--data-dir` (default: ~/.antfly) as its root for all
* storage, including Termite models at {data-dir}/models.
* `antfly termite list/pull` defaults to --models-dir ~/.termite/models, which
* is a DIFFERENT path. We must always pass --models-dir explicitly so that
* `pullModel` and `hasModel` operate on the same directory the server uses.
*/
const ANTFLY_DATA_DIR = process.env.ANTFLY_DATA_DIR ?? join(homedir(), '.antfly');
const TERMITE_MODELS_DIR = join(ANTFLY_DATA_DIR, 'models');

/**
* Ensure antfly is running. Auto-starts if needed.
*
@@ -32,10 +46,14 @@ export async function ensureAntfly(options?: { quiet?: boolean }): Promise<strin
if (!options?.quiet) logger.info('Starting Antfly server...');
// Use custom ports to avoid 8080 conflicts (Docker, other services).
// metadata-api on 18080 (our default), store-api on 18381, raft on 19017/19021.
// --data-dir is passed explicitly so the server's embedded Termite node stores
// models in the same directory that pullModel/hasModel use (TERMITE_MODELS_DIR).
const child = spawn(
'antfly',
[
'swarm',
'--data-dir',
ANTFLY_DATA_DIR,
'--metadata-api',
'http://0.0.0.0:18080',
'--store-api',
@@ -187,9 +205,14 @@ export function getNativeVersion(): string | null {

/**
* Pull a Termite embedding model (native binary).
*
* Always targets TERMITE_MODELS_DIR so the model ends up in the same directory
* the running Antfly swarm server uses for its embedded Termite node.
*/
 export function pullModel(model: string): void {
-  execSync(`antfly termite pull ${model}`, { stdio: 'inherit' });
+  execSync(`antfly termite pull --models-dir ${TERMITE_MODELS_DIR} ${model}`, {
+    stdio: 'inherit',
+  });
 }

/**
@@ -201,33 +224,63 @@ export function pullModelDocker(model: string): void {
}

/**
- * Check if a Termite model is available locally (native binary).
+ * Check if a Termite model is available in the directory used by the running
+ * Antfly swarm server (TERMITE_MODELS_DIR = ~/.antfly/models by default).
+ *
+ * Checks for the full model name first (e.g. "BAAI/bge-small-en-v1.5"), then
+ * the short name as a whole token (e.g. "bge-small-en-v1.5"). The previous
+ * version had two problems: it ran `antfly termite list` without --models-dir,
+ * so it inspected ~/.termite/models rather than the directory the server reads,
+ * and it used a plain substring match on the short name, which could match a
+ * different model sharing the same suffix.
*/
 export function hasModel(model: string): boolean {
   try {
-    const output = execSync('antfly termite list', {
+    const output = execSync(`antfly termite list --models-dir ${TERMITE_MODELS_DIR}`, {
       encoding: 'utf-8',
       stdio: ['pipe', 'pipe', 'pipe'],
     });
-    const shortName = model.split('/').pop() ?? model;
-    return output.includes(shortName);
+    return modelPresentInOutput(model, output);
   } catch {
     return false;
   }
 }

/**
* Check if a Termite model is available inside the Docker container.
*
* Checks for the full model name first (e.g. "BAAI/bge-small-en-v1.5"), then
* the short name as a whole word (e.g. "bge-small-en-v1.5"). Simple substring
* matching on the short name was causing false positives when other models or
* partial download records shared the suffix.
*/
 export function hasModelDocker(model: string): boolean {
   try {
     const output = execSync(`docker exec ${CONTAINER_NAME} /antfly termite list`, {
       encoding: 'utf-8',
       stdio: ['pipe', 'pipe', 'pipe'],
     });
-    const shortName = model.split('/').pop() ?? model;
-    return output.includes(shortName);
+    return modelPresentInOutput(model, output);
   } catch {
     return false;
   }
 }

/**
* Return true when the model name is present in `antfly termite list` output.
*
* Strategy (most-specific first):
* 1. Full name exact match — "BAAI/bge-small-en-v1.5" appears verbatim.
* 2. Short name word-boundary — "bge-small-en-v1.5" appears as a whole token
* (not as a suffix of a different model name).
*/
function modelPresentInOutput(model: string, output: string): boolean {
  // Full name check (covers "BAAI/bge-small-en-v1.5" style output)
  if (output.includes(model)) return true;

  // Short name check with word-boundary anchors so "bge-small-en-v1.5" does not
  // match inside "other-bge-small-en-v1.5" or a partial download entry.
  const shortName = model.split('/').pop() ?? model;
  const escaped = shortName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(`(?<![\\w/-])${escaped}(?![\\w/-])`).test(output);
}
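
A note on the command strings above: `pullModel` and `hasModel` interpolate `TERMITE_MODELS_DIR` into a shell command, so a data directory whose path contains spaces (for example a custom `ANTFLY_DATA_DIR`) would be split into several arguments by the shell. One possible way around this, sketched below, is to pass the arguments as an array through Node's `execFileSync`, which does not spawn a shell by default. `buildTermiteArgs` and `pullModelNoShell` are hypothetical names, and the flag order is assumed to be the one used in this PR.

```typescript
import { execFileSync } from 'node:child_process';

// Hypothetical helper: build the argument vector for `antfly termite <action>`,
// using the same flag order as the commands in this PR.
function buildTermiteArgs(action: 'pull' | 'list', modelsDir: string, model?: string): string[] {
  const args = ['termite', action, '--models-dir', modelsDir];
  if (model !== undefined) args.push(model);
  return args;
}

// Hypothetical variant of pullModel: no shell is involved, so a modelsDir
// containing spaces reaches antfly as a single argument.
function pullModelNoShell(model: string, modelsDir: string): void {
  execFileSync('antfly', buildTermiteArgs('pull', modelsDir, model), { stdio: 'inherit' });
}

// Inspect the argument vector without running antfly.
console.log(buildTermiteArgs('pull', '/Users/dev/My Data/models', 'BAAI/bge-small-en-v1.5'));
```

Keeping the argument construction in a pure function also makes it testable without the `antfly` binary installed.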
2 changes: 1 addition & 1 deletion website/content/docs/install.mdx
@@ -3,7 +3,7 @@
## Requirements

- **Node.js 22+** (LTS recommended)
-- **[Antfly](https://antfly.io)** — search backend (`brew install --cask antflydb/antfly/antfly`)
+- **[Antfly](https://antfly.io)** — search backend (installed automatically by `dev setup`)
- **Cursor** or **Claude Code** (for MCP integration)

## Install dev-agent
2 changes: 1 addition & 1 deletion website/content/docs/quickstart.mdx
@@ -5,7 +5,7 @@ Get from zero to semantic search in 5 minutes.
## Prerequisites

- Node.js 22+ installed
-- [Antfly](https://antfly.io) — search backend (`brew install --cask antflydb/antfly/antfly`)
+- [Antfly](https://antfly.io) — search backend (installed automatically by `dev setup`)
- Cursor IDE (or Claude Code)
- A code repository to index

1 change: 0 additions & 1 deletion website/content/index.mdx
@@ -80,7 +80,6 @@ The key difference: semantic search finds code by **meaning**, not text matching

```bash
npm install -g @prosdevlab/dev-agent
-brew install --cask antflydb/antfly/antfly
```

### Setup and index
8 changes: 4 additions & 4 deletions website/content/latest-version.ts
@@ -4,10 +4,10 @@
*/

 export const latestVersion = {
-  version: '0.10.2',
-  title: 'MCP Install Fix & Dependency Cleanup',
+  version: '0.10.3',
+  title: 'Fix Setup/Index Model Directory Mismatch',
   date: 'March 30, 2026',
   summary:
-    'Fixed dev mcp install check, removed dead metrics module and better-sqlite3 dependency.',
-  link: '/updates#v0102--mcp-install-fix--dependency-cleanup',
+    'Fixed dev setup reporting model ready while dev index fails with "model not found" due to mismatched model directories.',
+  link: '/updates#v0103--fix-setupindex-model-directory-mismatch',
 } as const;
12 changes: 12 additions & 0 deletions website/content/updates/index.mdx
@@ -9,6 +9,18 @@ What's new in dev-agent. We ship improvements regularly to help AI assistants un

---

## v0.10.3 — Fix Setup/Index Model Directory Mismatch

*March 30, 2026*

**Fixed `dev setup` reporting model ready while `dev index` fails with "model not found".**

- `hasModel`/`pullModel` used `~/.termite/models` but the running Antfly server looked in `~/.antfly/models` — both now use a shared `--models-dir` pointing at the server's data directory
- Improved model name matching to avoid false positives from substring collisions
- Added unit tests for model detection logic

---

## v0.10.2 — MCP Install Fix & Dependency Cleanup

*March 30, 2026*