docs(plans): complete spike 1.1 — antfly API validated
Key findings:
- Upsert works (batch insert overwrites existing keys)
- Hybrid search (BM25 + vector + RRF) works beautifully
- Embedding delay ~2s (dev index must wait for completion)
- Lookup by key is direct (replaces O(n) zero-vector hack)
- Table info provides disk_usage for VectorStats
- Auto full-text index on every table (hybrid search for free)
- SDK has CJS/ESM bug — use direct REST API with fetch() instead
- Docker image needs explicit `swarm` command
- i8 model variant 404s — use default (f32) variant
- Updated Part 1.5 to Docker-first with native fallback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| # | Question | Answer |
|---|----------|--------|
| 1 | Does batch insert overwrite existing keys (upsert)? | **Yes.** Re-inserting the same key overwrites the document. Confirmed via lookup after upsert. |
| 2 | How long does background embedding take? | **~2 seconds** for a single document to become searchable. First batch (10 docs) searchable within 5-8s. |
| 3 | Can we query immediately after insert? | **No, there is a ~2s delay.** Embeddings are generated asynchronously. `dev index` should wait or poll for completion. |
| 4 | What does `client.tables.get()` return? | Returns table info including `storage_status.disk_usage` (bytes), index configs, and shard info. **No direct doc count**: counting requires a query with a `limit`. |
| 5 | Latency of lookup vs vector search? | Lookup is near-instant. Semantic search takes ~1-2ms for 10 docs. Both are fast at this scale. |
| 6 | Can we full-scan without a query vector? | **Yes.** Use the global `/api/v1/query` endpoint with just `table` and `limit`, no `semantic_search`. Returns all docs. |
| 7 | Does the SDK handle connection errors gracefully? | **SDK has a CJS/ESM interop bug**: `TypeError: (0 , import_openapi_fetch.default) is not a function`. Direct REST API works fine. See SDK issues section below. |
| 8 | What happens when antfly server is not running? | curl gets `ECONNREFUSED`. Clear and fast failure. |
| 9 | Does `getAll()` paginate beyond 10000 docs? | Not tested at scale in this spike. The query endpoint accepts `limit`, so larger scans likely work, but this needs testing with a real repo index. |
| 10 | Does `dev index` need to wait for embedding completion? | **Yes.** There's a ~2s delay between insert and searchability. For a full index run, we should wait for all embeddings to complete before declaring success. Poll embedding status or add a brief wait. |

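
The wait described in rows 3 and 10 could be sketched as a generic polling helper. This is only a sketch: `pollUntil`, its option names, and its defaults are hypothetical, and the `check` callback (for example, a query that looks for the last inserted key) is left to the caller because the spike did not identify a dedicated embedding-status endpoint.

```typescript
// Hypothetical helper, not part of antfly or its SDK.
interface PollOptions {
  timeoutMs?: number; // give up after this long (assumed default: 10s)
  intervalMs?: number; // pause between attempts (assumed default: 250ms)
}

// Calls `check` repeatedly until it resolves to true or the timeout elapses.
// Resolves to true on success and false on timeout.
async function pollUntil(
  check: () => Promise<boolean>,
  { timeoutMs = 10_000, intervalMs = 250 }: PollOptions = {},
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await check()) return true;
    // Stop if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In `dev index`, `check` might run a small query after the final batch insert and return whether the last document appears in the hits. A fixed sleep would also work, but polling avoids guessing how long larger batches take.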
## API Endpoint Reference (verified)

| Operation | Method | Endpoint |
|-----------|--------|----------|
| Create table | POST | `/api/v1/tables/{name}` |
| Get table info | GET | `/api/v1/tables/{name}` |
| Drop table | DELETE | `/api/v1/tables/{name}` |
| List tables | GET | `/api/v1/tables` |
| Batch insert/delete | POST | `/api/v1/tables/{name}/batch` |
| Lookup by key | GET | `/api/v1/tables/{name}/lookup/{key}` |
| Query (table-specific) | POST | `/api/v1/tables/{name}/query` |
| Query (global) | POST | `/api/v1/query` |

**Important:** The global query endpoint (`/api/v1/query`) returns results in `responses[0].hits.hits[]` format. Table-specific query (`/api/v1/tables/{name}/query`) returns in `hits.hits[]` format.

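
Because the two query endpoints wrap their hits differently, a thin client may want to normalize both shapes in one place. The sketch below assumes only the two envelope paths noted above; `extractHits` and the type names are hypothetical, and the contents of each hit are treated as opaque.

```typescript
// Each hit is kept opaque; its fields were not catalogued in this spike.
type Hit = Record<string, unknown>;

// Covers both envelopes: `hits.hits[]` (table-specific query) and
// `responses[0].hits.hits[]` (global query).
interface QueryEnvelope {
  hits?: { hits?: Hit[] };
  responses?: Array<{ hits?: { hits?: Hit[] } }>;
}

// Returns the hit array from either response shape, or [] if none is present.
function extractHits(body: QueryEnvelope): Hit[] {
  const source = body.responses ? body.responses[0] : body;
  return source?.hits?.hits ?? [];
}
```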
## Key Findings

### 1. Table creation auto-creates full-text index

When creating a table with an embeddings index, antfly automatically adds a
`full_text_index_v0` full-text index. This means **every table gets hybrid search
for free**, with no extra configuration needed.

### 2. Hybrid search with RRF works beautifully

Tested: `semantic_search: "error handling and retry"` + `full_text_search: "retryWithBackoff"`

Result: `func-retryBackoff` ranked #1 with scores from BOTH BM25 and vector similarity.
The `_index_scores` object shows which indexes contributed. Reciprocal Rank Fusion (RRF)
doubled its score vs semantic-only results. This is exactly the upgrade we wanted for `dev_search`.

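
A request body for the query tested above might be assembled as follows. This is a sketch under assumptions: only the field names `table`, `limit`, `semantic_search`, and `full_text_search` come from this spike, the plain-string form of both search fields mirrors the test above, and `buildQuery` is a hypothetical helper. Any other fields the endpoint accepts are not covered.

```typescript
// Body for POST /api/v1/query; only fields observed in the spike are modelled.
interface QueryBody {
  table: string;
  limit: number;
  semantic_search?: string;
  full_text_search?: string;
}

// Includes a search field only when a non-empty string is supplied.
// With neither field set, the body is the full-scan form: just `table` and `limit`.
function buildQuery(
  table: string,
  limit: number,
  search: { semantic?: string; fullText?: string } = {},
): QueryBody {
  const body: QueryBody = { table, limit };
  if (search.semantic) body.semantic_search = search.semantic;
  if (search.fullText) body.full_text_search = search.fullText;
  return body;
}
```

Supplying both `semantic` and `fullText` corresponds to the hybrid case; supplying only one gives semantic-only or BM25-only search.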
### 3. Document structure is flexible (schemaless)

Documents are JSON objects. No predefined schema is required. We can store `text`, `metadata`,
`type`, `file`, `line`, or whatever else we want. The embedding index uses the `template` field
(Handlebars) to know which field(s) to embed.

### 4. Embedding model confirmed: bge-small-en-v1.5, dimension 384

Table info shows `dimension: 384` and `model: BAAI/bge-small-en-v1.5`. This is the same dimension
as our current all-MiniLM-L6-v2 (384), so result structures don't change.

Note: the i8 variant returned a 404 during model pull. The f32 variant (127.8MB) works. The plan should
use the default variant (no `--variants i8` flag) until i8 is fixed.

### 5. Lookup by key replaces O(n) zero-vector hack

`GET /api/v1/tables/{name}/lookup/{key}` returns the document directly. Returns 404 if
not found. This is a massive improvement over the current `get()` implementation in
`LanceDBVectorStore` which does a full vector scan with a zero vector.

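
A replacement `get()` built on the lookup endpoint could look roughly like this. It is a sketch, not the planned implementation: `lookup` and `FetchLike` are hypothetical names, the fetch function is injected so the helper can be exercised without a server, URL-encoding the table name and key is an assumption, and the response body is assumed to be the JSON document.

```typescript
// Minimal structural type for the parts of fetch() this helper uses.
type FetchLike = (
  url: string,
  init?: { method?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Fetches one document by key. Resolves to null on 404, throws on other errors.
async function lookup(
  fetchImpl: FetchLike,
  baseUrl: string,
  table: string,
  key: string,
): Promise<Record<string, unknown> | null> {
  const url =
    `${baseUrl}/api/v1/tables/${encodeURIComponent(table)}` +
    `/lookup/${encodeURIComponent(key)}`;
  const res = await fetchImpl(url, { method: "GET" });
  if (res.status === 404) return null; // documented "not found" case
  if (!res.ok) throw new Error(`lookup failed: HTTP ${res.status}`);
  return (await res.json()) as Record<string, unknown>;
}
```

In practice the global `fetch` would be passed as `fetchImpl`, with `baseUrl` pointing at wherever the antfly API is listening.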
### 6. Storage info available

`client.tables.get()` returns `storage_status.disk_usage` in bytes. This can replace
the `storageSize` field in `VectorStats` (currently reads local LanceDB directory).

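
Mapping the table info onto `VectorStats.storageSize` could be as small as the sketch below. Only the `storage_status.disk_usage` path comes from this spike; `TableInfo` and `storageSizeBytes` are hypothetical names, and falling back to 0 when the field is absent is an assumption about what callers would want.

```typescript
// Only the field used here is modelled; the real table info contains more.
interface TableInfo {
  storage_status?: { disk_usage?: number };
}

// Returns disk usage in bytes, or 0 when the field is missing or not a finite number.
function storageSizeBytes(info: TableInfo): number {
  const bytes = info.storage_status?.disk_usage;
  return typeof bytes === "number" && isFinite(bytes) ? bytes : 0;
}
```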
## SDK Issues

### CJS/ESM interop bug

The `@antfly/sdk` v0.0.14 fails when imported in a CJS context (e.g., via `tsx`):

```
TypeError: (0 , import_openapi_fetch.default) is not a function
```

The SDK's CJS bundle (`dist/index.cjs`) doesn't correctly handle the `openapi-fetch`
default export.

**Workaround options:**
1. Use ESM imports only (our packages use ESM anyway)
2. Use the REST API directly with `fetch()` instead of the SDK
3. Report the bug to antfly team (the user's friend built it)

**Recommendation:** Start with direct REST API calls via `fetch()`. The SDK is thin
(just an openapi-fetch wrapper) and we need only 6-7 endpoints. Building our own thin
client gives us full control and avoids SDK version coupling. We can adopt the SDK
later when it stabilizes (v0.0.x is very early).

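
As a starting point for the thin client recommended above, a single request wrapper can cover the endpoints that need no special handling. This is a sketch under assumptions: `createClient`, `FetchLike`, and the method names are hypothetical, responses are assumed to be JSON, and query bodies are passed through untyped because their full schema was not mapped in this spike. Wrapping network failures gives `dev index` a clear message when the server is not running.

```typescript
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
interface InitLike {
  method?: string;
  headers?: Record<string, string>;
  body?: string;
}
// Minimal structural type for the parts of fetch() the client uses.
type FetchLike = (url: string, init?: InitLike) => Promise<ResponseLike>;

function createClient(fetchImpl: FetchLike, baseUrl: string) {
  async function request(method: string, path: string, body?: unknown): Promise<unknown> {
    const init: InitLike = { method, headers: { "Content-Type": "application/json" } };
    if (body !== undefined) init.body = JSON.stringify(body);
    let res: ResponseLike;
    try {
      res = await fetchImpl(`${baseUrl}${path}`, init);
    } catch (cause) {
      // Network-level failure, e.g. connection refused when antfly is down.
      throw new Error(`antfly unreachable at ${baseUrl}: ${String(cause)}`);
    }
    if (!res.ok) throw new Error(`${method} ${path} failed: HTTP ${res.status}`);
    return res.json();
  }

  const table = (name: string) => `/api/v1/tables/${encodeURIComponent(name)}`;
  return {
    listTables: () => request("GET", "/api/v1/tables"),
    getTable: (name: string) => request("GET", table(name)),
    query: (name: string, body: unknown) => request("POST", `${table(name)}/query`, body),
  };
}
```

Injecting `fetchImpl` rather than calling the global `fetch` directly keeps the client testable without a running server; production code would pass the global `fetch`.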
## Docker Findings

### `ghcr.io/antflydb/antfly:omni`
- No ARM64 image available. Runs under Rosetta with `--platform linux/amd64`.
- Pull succeeded but entrypoint errored: `Error: unknown flag: --api-url`

### `ghcr.io/antflydb/antfly:latest`
- Pulls successfully on ARM64 (via amd64 emulation)
- Does NOT auto-start; it just shows help. Needs explicit `swarm` command.
- Would need: `docker run -d ... ghcr.io/antflydb/antfly:latest swarm`

### Port conflict on native
- `antfly swarm` binds to ports 8080, 9017, 9021, 12380, 11433
- If any are occupied (e.g., old Docker container), it crashes with `bind: address already in use`
- Docker is preferred because it isolates ports inside the container

**Recommendation:** Docker-first with `antfly swarm` as the command, native fallback.
Need to verify Docker image + `swarm` command works end-to-end.
## Impact on Plan

1. **Use direct REST API instead of SDK**: avoids CJS/ESM bug and early SDK instability
2. **Model pull: use default variant** (not `--variants i8`) until i8 is fixed
3. **`dev index` must wait for embeddings**: poll or add brief sleep after batch insert
4. **Table info provides disk_usage**: can populate `VectorStats.storageSize`
5. **Auto full-text index**: every table gets BM25 for free, simplifies table creation
6. **Docker needs `swarm` command**: `docker run ... antfly swarm` not just `docker run ... antfly`