docs(plans): complete spike 1.1 — antfly API validated

prosdev · claude · prosdev · commit 3eb7961203a8 · 2026-03-29T16:36:42.000-07:00
Key findings:
- Upsert works (batch insert overwrites existing keys)
- Hybrid search (BM25 + vector + RRF) works beautifully
- Embedding delay ~2s (dev index must wait for completion)
- Lookup by key is direct (replaces O(n) zero-vector hack)
- Table info provides disk_usage for VectorStats
- Auto full-text index on every table (hybrid search for free)
- SDK has CJS/ESM bug — use direct REST API with fetch() instead
- Docker image needs explicit `swarm` command
- i8 model variant 404s — use default (f32) variant
- Updated Part 1.5 to Docker-first with native fallback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/.claude/da-plans/core/phase-1-antfly-migration/1.1-spike-findings.md b/.claude/da-plans/core/phase-1-antfly-migration/1.1-spike-findings.md
@@ -0,0 +1,127 @@
+# Part 1.1 — Spike Findings
+
+**Date:** 2026-03-29
+**Antfly version:** 0.1.0 (native binary, macOS ARM64)
+**SDK version:** @antfly/sdk 0.0.14
+
+## Results
+
+| # | Question | Answer |
+|---|----------|--------|
+| 1 | Does batch insert overwrite existing keys (upsert)? | **Yes.** Re-inserting same key overwrites the document. Confirmed via lookup after upsert. |
+| 2 | How long does background embedding take? | **~2 seconds** for a single document to become searchable. First batch (10 docs) searchable within 5-8s. |
+| 3 | Can we query immediately after insert? | **No — ~2s delay.** Embeddings are generated asynchronously. `dev index` should wait or poll for completion. |
+| 4 | What does `client.tables.get()` return? | Table info including `storage_status.disk_usage` (bytes), index configs, and shard info. **No direct doc count**: counting requires running a query with a `limit` and counting the returned hits. |
+| 5 | Latency of lookup vs vector search? | Lookup is near-instant. Semantic search ~1-2ms for 10 docs. Both fast at this scale. |
+| 6 | Can we full-scan without a query vector? | **Yes** — use the global `/api/v1/query` endpoint with just `table` and `limit`, no `semantic_search`. Returns all docs. |
+| 7 | Does the SDK handle connection errors gracefully? | **SDK has a CJS/ESM interop bug** — `TypeError: (0 , import_openapi_fetch.default) is not a function`. Direct REST API works fine. See SDK issues section below. |
+| 8 | What happens when antfly server is not running? | curl gets `ECONNREFUSED`. Clear and fast failure. |
+| 9 | Does `getAll()` paginate beyond 10000 docs? | Not tested at scale in this spike. The query endpoint accepts `limit` — likely works up to a reasonable size. Need to test with a real repo index. |
+| 10 | Does `dev index` need to wait for embedding completion? | **Yes.** There's a ~2s delay between insert and searchability. For a full index run, we should wait for all embeddings to complete before declaring success. Poll embedding status or add a brief wait. |
+
+## API Endpoint Reference (verified)
+
+| Operation | Method | Endpoint |
+|-----------|--------|----------|
+| Create table | POST | `/api/v1/tables/{name}` |
+| Get table info | GET | `/api/v1/tables/{name}` |
+| Drop table | DELETE | `/api/v1/tables/{name}` |
+| List tables | GET | `/api/v1/tables` |
+| Batch insert/delete | POST | `/api/v1/tables/{name}/batch` |
+| Lookup by key | GET | `/api/v1/tables/{name}/lookup/{key}` |
+| Query (table-specific) | POST | `/api/v1/tables/{name}/query` |
+| Query (global) | POST | `/api/v1/query` |
+
+**Important:** The global query endpoint (`/api/v1/query`) returns results in `responses[0].hits.hits[]` format. Table-specific query (`/api/v1/tables/{name}/query`) returns in `hits.hits[]` format.
+
+## Key Findings
+
+### 1. Table creation auto-creates full-text index
+
+When creating a table with an embeddings index, antfly automatically adds a
+`full_text_index_v0` full-text index. This means **every table gets hybrid search
+for free** — no extra configuration needed.
+
+### 2. Hybrid search with RRF works beautifully
+
+Tested: `semantic_search: "error handling and retry"` + `full_text_search: "retryWithBackoff"`
+
+Result: `func-retryBackoff` ranked #1 with scores from BOTH BM25 and vector similarity.
+The `_index_scores` object shows which indexes contributed. RRF doubled its score vs
+semantic-only results. This is exactly the upgrade we wanted for `dev_search`.
+
+### 3. Document structure is flexible (schemaless)
+
+Documents are JSON objects. No predefined schema required. We can store `text`, `metadata`,
+`type`, `file`, `line` — whatever we want. The embedding index uses the `template` field
+(Handlebars) to know which field(s) to embed.
+
+### 4. Embedding model confirmed: bge-small-en-v1.5, dimension 384
+
+Table info shows `dimension: 384` and `model: BAAI/bge-small-en-v1.5`. That matches the
+384 dimensions of our current all-MiniLM-L6-v2, so vector and result shapes don't change.
+
+Note: the i8 variant returned a 404 during model pull; the f32 variant (127.8 MB) works.
+The plan should use the default variant (no `--variants i8` flag) until i8 is fixed.
+
+### 5. Lookup by key replaces O(n) zero-vector hack
+
+`GET /api/v1/tables/{name}/lookup/{key}` returns the document directly, or 404 if it is
+not found. This is a large improvement over the current `get()` implementation in
+`LanceDBVectorStore`, which does a full vector scan with a zero vector.
+
+### 6. Storage info available
+
+`client.tables.get()` returns `storage_status.disk_usage` in bytes. This can populate
+the `storageSize` field in `VectorStats`, which currently reads the local LanceDB directory.
+
+## SDK Issues
+
+### CJS/ESM interop bug
+
+The `@antfly/sdk` v0.0.14 fails when imported in a CJS context (e.g., via `tsx`):
+
+```
+TypeError: (0 , import_openapi_fetch.default) is not a function
+```
+
+The SDK's CJS bundle (`dist/index.cjs`) doesn't correctly handle the `openapi-fetch`
+default export.
+
+**Workaround options:**
+1. Use ESM imports only (our packages use ESM anyway)
+2. Use the REST API directly with `fetch()` instead of the SDK
+3. Report the bug to the antfly team (the user's friend built it)
+
+**Recommendation:** Start with direct REST API calls via `fetch()`. The SDK is thin
+(just an `openapi-fetch` wrapper) and we need only 6-7 endpoints. Building our own thin
+client gives us full control and avoids SDK version coupling. We can adopt the SDK
+later when it stabilizes (v0.0.x is very early).
+
+## Docker Findings
+
+### `ghcr.io/antflydb/antfly:omni`
+- No ARM64 image available. Runs under Rosetta with `--platform linux/amd64`.
+- Pull succeeded but entrypoint errored: `Error: unknown flag: --api-url`
+
+### `ghcr.io/antflydb/antfly:latest`
+- Pulls successfully on ARM64 (via amd64 emulation)
+- Does NOT auto-start — just shows help. Needs explicit `swarm` command.
+- Would need: `docker run -d ... ghcr.io/antflydb/antfly:latest swarm`
+
+### Port conflict on native
+- `antfly swarm` binds to ports 8080, 9017, 9021, 12380, 11433
+- If any are occupied (e.g., old Docker container), it crashes with `bind: address already in use`
+- Docker is preferred because only published ports (e.g. 8080) bind on the host; the rest stay inside the container
+
+**Recommendation:** Docker-first with `antfly swarm` as the command, native fallback.
+Need to verify Docker image + `swarm` command works end-to-end.
+
+## Impact on Plan
+
+1. **Use direct REST API instead of SDK** — avoids CJS/ESM bug and early SDK instability
+2. **Model pull: use default variant** (not `--variants i8`) until i8 is fixed
+3. **`dev index` must wait for embeddings** — poll or add brief sleep after batch insert
+4. **Table info provides disk_usage** — can populate `VectorStats.storageSize`
+5. **Auto full-text index** — every table gets BM25 for free, simplifies table creation
+6. **Docker needs `swarm` command** — `docker run ... antfly swarm` not just `docker run ... antfly`
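
Items 1 and 3 above (direct REST calls, waiting for embeddings) might be sketched roughly as follows. This is a minimal TypeScript sketch under stated assumptions, not the project's client: `FetchLike`, `normalizeHits`, `lookup`, and `waitUntil` are hypothetical names, and the fetch implementation is injected rather than taken from the global scope. Only the endpoint path and the two response shapes come from the findings above.

```typescript
// Hypothetical thin-client helpers; names and option shapes are illustrative only.

// Minimal structural type for an injected fetch implementation (assumption:
// the caller passes the runtime's fetch or a test double).
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// The global query endpoint nests hits under responses[0].hits.hits, while the
// table-specific endpoint returns hits.hits. Accept either shape.
function normalizeHits(body: unknown): unknown[] {
  const b = body as {
    hits?: { hits?: unknown[] };
    responses?: { hits?: { hits?: unknown[] } }[];
  } | null;
  return b?.responses?.[0]?.hits?.hits ?? b?.hits?.hits ?? [];
}

// Lookup by key: resolves to the document, or null when the server answers 404.
async function lookup(
  fetchImpl: FetchLike,
  baseUrl: string,
  table: string,
  key: string,
): Promise<unknown> {
  const url =
    `${baseUrl}/api/v1/tables/${encodeURIComponent(table)}` +
    `/lookup/${encodeURIComponent(key)}`;
  const res = await fetchImpl(url);
  if (res.status === 404) return null;
  if (!res.ok) throw new Error(`lookup failed: HTTP ${res.status}`);
  return res.json();
}

// Generic polling loop: calls `check` until it resolves to true or the timeout
// would be exceeded by the next sleep. Resolves to whether the check succeeded.
async function waitUntil(
  check: () => Promise<boolean>,
  opts: { timeoutMs: number; intervalMs: number },
): Promise<boolean> {
  const deadline = Date.now() + opts.timeoutMs;
  for (;;) {
    if (await check()) return true;
    if (Date.now() + opts.intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
}
```

In `dev index`, `waitUntil` could wrap a check that queries for a just-inserted document until it shows up in results. The timeout and interval are left to the caller, since the spike only measured roughly 2s for a single document and 5-8s for a batch of 10.
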
diff --git a/.claude/da-plans/core/phase-1-antfly-migration/1.5-dev-setup-command.md b/.claude/da-plans/core/phase-1-antfly-migration/1.5-dev-setup-command.md
@@ -5,17 +5,20 @@
 The user never runs `antfly` directly. `dev setup` handles one-time installation,
 and any command that needs antfly auto-starts it as a background process.
 
+**Docker-first, native fallback.** Prefer Docker (isolated, no port conflicts).
+Fall back to native binary if Docker isn't available.
+
 ## UX
 
-### First time
+### First time (Docker available)
 
 ```bash
 $ dev setup
 
-✓ Antfly v0.4.2 found
-✓ Pulling embedding model...
-  BAAI/bge-small-en-v1.5 (INT8) ready
-✓ Starting Antfly server...
+✓ Docker found
+✓ Pulling antfly image...
+  ghcr.io/antflydb/antfly:omni ready
+✓ Starting container "dev-agent-antfly"...
   Running on http://localhost:8080
 
 ✓ Setup complete!
@@ -25,41 +28,48 @@ $ dev setup
     dev mcp install --cursor       # Connect to Cursor
 ```
 
-### Already set up
+### First time (no Docker, native fallback)
 
 ```bash
 $ dev setup
 
-✓ Antfly v0.4.2 found
-✓ Embedding model ready
-✓ Server already running
+Docker not found. Falling back to native binary.
 
-  Nothing to do — you're all set!
+Antfly is not installed. Install it now? (Y/n) y
+
+Installing via Homebrew...
+✓ Antfly v0.4.2 installed
+✓ Pulling embedding model...
+  BAAI/bge-small-en-v1.5 ready
+✓ Starting Antfly server...
+  Running on http://localhost:8080
+
+✓ Setup complete!
 ```
 
-### Not installed
+### Already set up
 
 ```bash
 $ dev setup
 
-Antfly is not installed. Install it now? (Y/n) y
-
-Installing via Homebrew...
-✓ Antfly v0.4.2 installed
+✓ Container "dev-agent-antfly" already running
+✓ Server healthy on http://localhost:8080
 
-✓ Pulling embedding model...
-✓ Starting server...
-✓ Setup complete!
+  Nothing to do — you're all set!
 ```
 
-If user declines:
+### Neither Docker nor native
 
 ```bash
-Antfly is not installed. Install it now? (Y/n) n
+$ dev setup
+
+✗ No runtime found.
 
-Install manually, then run `dev setup` again:
-  brew install --cask antflydb/antfly/antfly    # macOS
-  curl -fsSL https://releases.antfly.io/antfly/latest/install.sh | sh    # Linux
+  Install one of:
+    Docker Desktop  → https://docker.com/get-started
+    Antfly native   → brew install --cask antflydb/antfly/antfly
+
+  Then run `dev setup` again.
 ```
 
 ### Any command when antfly is down
@@ -83,44 +93,61 @@ No error — it just starts it. Transparent.
 
 New file: `packages/cli/src/utils/antfly.ts`
 
+Docker-first, native fallback:
+
 ```typescript
 export async function ensureAntfly(options?: { quiet?: boolean }): Promise<AntflyClient> {
   const client = new AntflyClient({ baseUrl: getAntflyUrl() });
 
-  // 1. Check if already running
+  // 1. Check if already running (Docker or native)
   try {
     await client.tables.list();
     return client;
   } catch {
     // Not running — try to start
   }
 
-  // 2. Check binary exists
+  // 2. Try Docker first (preferred — isolated, no port conflicts)
+  if (hasDocker()) {
+    if (!options?.quiet) log.info('Starting Antfly via Docker...');
+    execSync(
+      'docker run -d --name dev-agent-antfly -p 8080:8080 ghcr.io/antflydb/antfly:omni swarm',
+      { stdio: 'pipe' }
+    );
+    await waitForServer(getAntflyUrl(), { timeout: 30000 });
+    if (!options?.quiet) log.success('Running on ' + getAntflyUrl());
+    return client;
+  }
+
+  // 3. Fall back to native binary
   try {
     execSync('antfly --version', { stdio: 'pipe' });
   } catch {
     throw new Error(
-      'Antfly is not installed.\n' +
-      'Run `dev setup` or install manually:\n' +
-      '  brew install --cask antflydb/antfly/antfly'
+      'No runtime found. Run `dev setup` or install:\n' +
+      '  Docker Desktop → https://docker.com/get-started\n' +
+      '  Native binary  → brew install --cask antflydb/antfly/antfly'
     );
   }
 
-  // 3. Start in background
   if (!options?.quiet) log.info('Starting Antfly server...');
-  const child = spawn('antfly', ['swarm'], {
-    detached: true,
-    stdio: 'ignore',
-  });
+  const child = spawn('antfly', ['swarm'], { detached: true, stdio: 'ignore' });
   child.unref();
-
-  // 4. Wait for readiness (poll with timeout)
   await waitForServer(getAntflyUrl(), { timeout: 15000 });
   if (!options?.quiet) log.success('Running on ' + getAntflyUrl());
 
   return client;
 }
 
+function hasDocker(): boolean {
+  try {
+    execSync('docker info', { stdio: 'pipe' });
+    return true;
+  } catch {
+    return false;
+  }
+}
+
 function getAntflyUrl(): string {
   return process.env.ANTFLY_URL || 'http://localhost:8080';
 }
@@ -149,7 +176,22 @@ commander
   .description('One-time setup: install search backend and embedding model')
   .option('--model <name>', 'Termite embedding model', 'BAAI/bge-small-en-v1.5')
   .action(async (opts) => {
-    // 1. Check antfly binary — offer to install if missing
+    // 1. Docker path (preferred)
+    if (hasDocker()) {
+      log.success('Docker found');
+      // Pull image if needed
+      execSync('docker pull ghcr.io/antflydb/antfly:omni', { stdio: 'inherit' });
+      // Start container (ensureAntfly handles this)
+      await ensureAntfly();
+      // Save runtime preference
+      saveConfig({ antflyRuntime: 'docker', embeddingModel: opts.model });
+      log.success('Setup complete!');
+      return;
+    }
+
+    // 2. Native fallback
+    log.info('Docker not found. Falling back to native binary.');
+
     try {
       const version = execSync('antfly --version', { stdio: 'pipe' }).toString().trim();
       log.success(`Antfly ${version} found`);
@@ -160,23 +202,22 @@ commander
 
       const answer = await confirm('Antfly is not installed. Install it now?');
       if (answer) {
-        log.info(`Installing via ${process.platform === 'darwin' ? 'Homebrew' : 'install script'}...`);
         execSync(installCmd, { stdio: 'inherit' });
         log.success('Antfly installed');
       } else {
-        log.info('Install manually, then run `dev setup` again:');
-        log.info(`  ${installCmd}`);
+        log.info(`Install manually: ${installCmd}`);
         return;
       }
     }
 
-    // 2. Check/pull embedding model
-    // 3. ensureAntfly() — starts server if needed
-    // 4. Verify connection
+    // 3. Pull embedding model (native only — Docker image bundles models)
+    // 4. ensureAntfly() — starts server
+    // 5. Save config
+    saveConfig({ antflyRuntime: 'native', embeddingModel: opts.model });
   });
 ```
 
-Step 2 (model pull) uses the selected model and saves to config:
+Step 3 (model pull — native path only) uses the selected model:
 
 ```typescript
 const model = opts.model; // default: 'BAAI/bge-small-en-v1.5'